Hi,
I am trying to train an NER model using en_core_web_trf as the base model, and I'm getting this error:
(venv) C:\Users\Asli>python -m prodigy train "C:\Users\Asli\....\NER_modelv0.5\en_core_web_trf_1" --base-model en_core_web_trf --ner corpus_training_emob_data_v0.5_bat0_correct,corpus_training_emob_data_v0.5_bat1_correct --gpu-id 0
ℹ Using GPU: 0
========================= Generating Prodigy config =========================
ℹ Auto-generating config with spaCy
ℹ Using config from base model
✔ Generated training config
=========================== Initializing pipeline ===========================
[2021-09-07 13:33:45,991] [INFO] Set up nlp object from config
Components: ner
Merging training and evaluation data for 1 components
- [ner] Training: 1458 | Evaluation: 364 (20% split)
Training: 743 | Evaluation: 316
Labels: ner (3)
[2021-09-07 13:33:46,282] [INFO] Pipeline: ['transformer', 'tagger', 'parser', 'attribute_ruler', 'lemmatizer', 'ner']
[2021-09-07 13:33:46,283] [INFO] Resuming training for: ['ner', 'transformer']
[2021-09-07 13:33:46,290] [INFO] Created vocabulary
[2021-09-07 13:33:46,292] [INFO] Finished initializing nlp object
[2021-09-07 13:33:46,292] [INFO] Initialized pipeline components: []
✔ Initialized pipeline
============================= Training pipeline =============================
Components: ner
Merging training and evaluation data for 1 components
- [ner] Training: 1458 | Evaluation: 364 (20% split)
Training: 743 | Evaluation: 316
Labels: ner (3)
ℹ Pipeline: ['transformer', 'tagger', 'parser', 'attribute_ruler', 'lemmatizer',
'ner']
ℹ Frozen components: ['tagger', 'parser', 'attribute_ruler', 'lemmatizer']
ℹ Initial learn rate: 0.0
E # LOSS TRANS... LOSS NER ENTS_F ENTS_P ENTS_R SCORE
--- ------ ------------- -------- ------ ------ ------ ------
⚠ Aborting and saving the final best model. Encountered exception:
OutOfMemoryError('Out of memory allocating 5,047,296 bytes (allocated so far:
1,512,513,536 bytes).')
Traceback (most recent call last):
File "C:\Users\Asli\anaconda3\lib\runpy.py", line 194, in _run_module_as_main
return _run_code(code, main_globals, None,
File "C:\Users\Asli\anaconda3\lib\runpy.py", line 87, in _run_code
exec(code, run_globals)
File "C:\Users\Asli\anaconda3\lib\site-packages\prodigy\__main__.py", line 61, in <module>
controller = recipe(*args, use_plac=True)
File "cython_src\prodigy\core.pyx", line 325, in prodigy.core.recipe.recipe_decorator.recipe_proxy
File "C:\Users\Asli\anaconda3\lib\site-packages\plac_core.py", line 367, in call
cmd, result = parser.consume(arglist)
File "C:\Users\Asli\anaconda3\lib\site-packages\plac_core.py", line 232, in consume
return cmd, self.func(*(args + varargs + extraopts), **kwargs)
File "C:\Users\Asli\anaconda3\lib\site-packages\prodigy\recipes\train.py", line 277, in train
return _train(
File "C:\Users\Asli\anaconda3\lib\site-packages\prodigy\recipes\train.py", line 197, in _train
spacy_train(nlp, output_path, use_gpu=gpu_id, stdout=stdout)
File "C:\Users\Asli\anaconda3\lib\site-packages\spacy\training\loop.py", line 122, in train
raise e
File "C:\Users\Asli\anaconda3\lib\site-packages\spacy\training\loop.py", line 105, in train
for batch, info, is_best_checkpoint in training_step_iterator:
File "C:\Users\Asli\anaconda3\lib\site-packages\spacy\training\loop.py", line 224, in train_while_improving
score, other_scores = evaluate()
File "C:\Users\Asli\anaconda3\lib\site-packages\spacy\training\loop.py", line 281, in evaluate
scores = nlp.evaluate(dev_corpus(nlp))
File "C:\Users\Asli\anaconda3\lib\site-packages\spacy\language.py", line 1377, in evaluate
for doc, eg in zip(
File "C:\Users\Asli\anaconda3\lib\site-packages\spacy\util.py", line 1490, in _pipe
yield from proc.pipe(docs, **kwargs)
File "spacy\pipeline\transition_parser.pyx", line 237, in pipe
File "C:\Users\Asli\anaconda3\lib\site-packages\spacy\util.py", line 1509, in raise_error
raise e
File "spacy\pipeline\transition_parser.pyx", line 233, in spacy.pipeline.transition_parser.Parser.pipe
File "spacy\pipeline\transition_parser.pyx", line 247, in spacy.pipeline.transition_parser.Parser.predict
File "spacy\pipeline\transition_parser.pyx", line 262, in spacy.pipeline.transition_parser.Parser.greedy_parse
File "C:\Users\Asli\anaconda3\lib\site-packages\thinc\model.py", line 315, in predict
return self._func(self, X, is_train=False)[0]
File "C:\Users\Asli\anaconda3\lib\site-packages\spacy\ml\tb_framework.py", line 33, in forward
step_model = ParserStepModel(
File "spacy\ml\parser_model.pyx", line 223, in spacy.ml.parser_model.ParserStepModel.__init__
File "spacy\ml\parser_model.pyx", line 362, in spacy.ml.parser_model.precompute_hiddens.__init__
File "C:\Users\Asli\anaconda3\lib\site-packages\thinc\model.py", line 291, in __call__
return self._func(self, X, is_train=is_train)
File "C:\Users\Asli\anaconda3\lib\site-packages\spacy\ml\_precomputable_affine.py", line 27, in forward
Yf = model.ops.xp.vstack((model.get_param("pad"), Yf))
File "C:\Users\Asli\anaconda3\lib\site-packages\cupy\_manipulation\join.py", line 114, in vstack
return concatenate([cupy.atleast_2d(m) for m in tup], 0)
File "C:\Users\Asli\anaconda3\lib\site-packages\cupy\_manipulation\join.py", line 55, in concatenate
return _core.concatenate_method(tup, axis, out)
File "cupy\_core\_routines_manipulation.pyx", line 533, in cupy._core._routines_manipulation.concatenate_method
File "cupy\_core\_routines_manipulation.pyx", line 577, in cupy._core._routines_manipulation.concatenate_method
File "cupy\_core\core.pyx", line 163, in cupy._core.core.ndarray.__init__
File "cupy\cuda\memory.pyx", line 718, in cupy.cuda.memory.alloc
File "cupy\cuda\memory.pyx", line 1395, in cupy.cuda.memory.MemoryPool.malloc
File "cupy\cuda\memory.pyx", line 1416, in cupy.cuda.memory.MemoryPool.malloc
File "cupy\cuda\memory.pyx", line 1096, in cupy.cuda.memory.SingleDeviceMemoryPool.malloc
File "cupy\cuda\memory.pyx", line 1117, in cupy.cuda.memory.SingleDeviceMemoryPool._malloc
File "cupy\cuda\memory.pyx", line 1355, in cupy.cuda.memory.SingleDeviceMemoryPool._try_malloc
cupy.cuda.memory.OutOfMemoryError: Out of memory allocating 5,047,296 bytes (allocated so far: 1,512,513,536 bytes).
From other discussions on this forum, I tried setting the batch size to 128 to reduce memory use (the traceback shows the OOM happens during evaluation, inside nlp.evaluate). This is roughly the override I passed, using the dotted config-override syntax that prodigy train accepts:
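python -m prodigy train "C:\Users\Asli\....\NER_modelv0.5\en_core_web_trf_1" --base-model en_core_web_trf --ner corpus_training_emob_data_v0.5_bat0_correct,corpus_training_emob_data_v0.5_bat1_correct --gpu-id 0 --training.batch_size 128

That then fails with this error: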
ℹ Using GPU: 0
========================= Generating Prodigy config =========================
ℹ Auto-generating config with spaCy
ℹ Using config from base model
✔ Generated training config
=========================== Initializing pipeline ===========================
✘ Config validation error
training -> batch_size extra fields not permitted
{
  'train_corpus': 'corpora.train',
  'dev_corpus': 'corpora.dev',
  'seed': 0,
  'gpu_allocator': None,
  'dropout': 0.1,
  'accumulate_gradient': 3,
  'patience': 5000,
  'max_epochs': 0,
  'max_steps': 20000,
  'eval_frequency': 1000,
  'frozen_components': ['tagger', 'parser', 'attribute_ruler', 'lemmatizer'],
  'before_to_disk': {'@misc': 'prodigy.todisk_cleanup.v1'},
  'annotating_components': [],
  'logger': {'@loggers': 'prodigy.ConsoleLogger.v1'},
  'batch_size': 128,
  'batcher': {'@batchers': 'spacy.batch_by_padded.v1', 'discard_oversize': True, 'get_length': None, 'size': 2000, 'buffer': 256},
  'optimizer': {'@optimizers': 'Adam.v1', 'beta1': 0.9, 'beta2': 0.999, 'L2_is_weight_decay': True, 'L2': 0.01, 'grad_clip': 1.0, 'use_averages': True, 'eps': 1e-08, 'learn_rate': {'@schedules': 'warmup_linear.v1', 'warmup_steps': 250, 'total_steps': 20000, 'initial_rate': 5e-05}},
  'score_weights': {'tag_acc': None, 'dep_uas': None, 'dep_las': None, 'dep_las_per_type': None, 'sents_p': None, 'sents_r': None, 'sents_f': None, 'lemma_acc': None, 'ents_f': 0.16, 'ents_p': 0.0, 'ents_r': 0.0, 'ents_per_type': None}
}
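If I'm reading the spaCy v3 config schema correctly, batch_size is not a valid field directly under [training], which would explain the "extra fields not permitted" error: training batches are configured through the [training.batcher] block (visible as 'batcher' in the dump above, with size = 2000), while nlp.batch_size controls batching at inference and evaluation time, which is where my OOM occurred. So I'm guessing the overrides should target those keys instead, something like this (the values 32 and 500 are just my guesses, not tested):

python -m prodigy train "C:\Users\Asli\....\NER_modelv0.5\en_core_web_trf_1" --base-model en_core_web_trf --ner corpus_training_emob_data_v0.5_bat0_correct,corpus_training_emob_data_v0.5_bat1_correct --gpu-id 0 --nlp.batch_size 32 --training.batcher.size 500

But I haven't been able to verify that these are the right knobs for a transformer pipeline.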
I would really appreciate any help resolving this.
I am using prodigy==1.11.1 and spacy-transformers==1.0.6.
Thanks!