cupy.cuda.memory.OutOfMemoryError problem

Hi,

I am trying to train an NER model using en_core_web_trf as the base model, and I'm getting this error:

(venv) C:\Users\Asli>python -m prodigy train "C:\Users\Asli\....\NER_modelv0.5\en_core_web_trf_1" --base-model en_core_web_trf --ner corpus_training_emob_data_v0.5_bat0_correct,corpus_training_emob_data_v0.5_bat1_correct --gpu-id 0
ℹ Using GPU: 0

========================= Generating Prodigy config =========================
ℹ Auto-generating config with spaCy
ℹ Using config from base model
✔ Generated training config

=========================== Initializing pipeline ===========================
[2021-09-07 13:33:45,991] [INFO] Set up nlp object from config
Components: ner
Merging training and evaluation data for 1 components
  - [ner] Training: 1458 | Evaluation: 364 (20% split)
Training: 743 | Evaluation: 316
Labels: ner (3)
[2021-09-07 13:33:46,282] [INFO] Pipeline: ['transformer', 'tagger', 'parser', 'attribute_ruler', 'lemmatizer', 'ner']
[2021-09-07 13:33:46,283] [INFO] Resuming training for: ['ner', 'transformer']
[2021-09-07 13:33:46,290] [INFO] Created vocabulary
[2021-09-07 13:33:46,292] [INFO] Finished initializing nlp object
[2021-09-07 13:33:46,292] [INFO] Initialized pipeline components: []
✔ Initialized pipeline

============================= Training pipeline =============================
Components: ner
Merging training and evaluation data for 1 components
  - [ner] Training: 1458 | Evaluation: 364 (20% split)
Training: 743 | Evaluation: 316
Labels: ner (3)
ℹ Pipeline: ['transformer', 'tagger', 'parser', 'attribute_ruler', 'lemmatizer',
'ner']
ℹ Frozen components: ['tagger', 'parser', 'attribute_ruler', 'lemmatizer']
ℹ Initial learn rate: 0.0
E    #       LOSS TRANS...  LOSS NER  ENTS_F  ENTS_P  ENTS_R  SCORE
---  ------  -------------  --------  ------  ------  ------  ------
⚠ Aborting and saving the final best model. Encountered exception:
OutOfMemoryError('Out of memory allocating 5,047,296 bytes (allocated so far:
1,512,513,536 bytes).')
Traceback (most recent call last):
  File "C:\Users\Asli\anaconda3\lib\runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "C:\Users\Asli\anaconda3\lib\runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "C:\Users\Asli\anaconda3\lib\site-packages\prodigy\__main__.py", line 61, in <module>
    controller = recipe(*args, use_plac=True)
  File "cython_src\prodigy\core.pyx", line 325, in prodigy.core.recipe.recipe_decorator.recipe_proxy
  File "C:\Users\Asli\anaconda3\lib\site-packages\plac_core.py", line 367, in call
    cmd, result = parser.consume(arglist)
  File "C:\Users\Asli\anaconda3\lib\site-packages\plac_core.py", line 232, in consume
    return cmd, self.func(*(args + varargs + extraopts), **kwargs)
  File "C:\Users\Asli\anaconda3\lib\site-packages\prodigy\recipes\train.py", line 277, in train
    return _train(
  File "C:\Users\Asli\anaconda3\lib\site-packages\prodigy\recipes\train.py", line 197, in _train
    spacy_train(nlp, output_path, use_gpu=gpu_id, stdout=stdout)
  File "C:\Users\Asli\anaconda3\lib\site-packages\spacy\training\loop.py", line 122, in train
    raise e
  File "C:\Users\Asli\anaconda3\lib\site-packages\spacy\training\loop.py", line 105, in train
    for batch, info, is_best_checkpoint in training_step_iterator:
  File "C:\Users\Asli\anaconda3\lib\site-packages\spacy\training\loop.py", line 224, in train_while_improving
    score, other_scores = evaluate()
  File "C:\Users\Asli\anaconda3\lib\site-packages\spacy\training\loop.py", line 281, in evaluate
    scores = nlp.evaluate(dev_corpus(nlp))
  File "C:\Users\Asli\anaconda3\lib\site-packages\spacy\language.py", line 1377, in evaluate
    for doc, eg in zip(
  File "C:\Users\Asli\anaconda3\lib\site-packages\spacy\util.py", line 1490, in _pipe
    yield from proc.pipe(docs, **kwargs)
  File "spacy\pipeline\transition_parser.pyx", line 237, in pipe
  File "C:\Users\Asli\anaconda3\lib\site-packages\spacy\util.py", line 1509, in raise_error
    raise e
  File "spacy\pipeline\transition_parser.pyx", line 233, in spacy.pipeline.transition_parser.Parser.pipe
  File "spacy\pipeline\transition_parser.pyx", line 247, in spacy.pipeline.transition_parser.Parser.predict
  File "spacy\pipeline\transition_parser.pyx", line 262, in spacy.pipeline.transition_parser.Parser.greedy_parse
  File "C:\Users\Asli\anaconda3\lib\site-packages\thinc\model.py", line 315, in predict
    return self._func(self, X, is_train=False)[0]
  File "C:\Users\Asli\anaconda3\lib\site-packages\spacy\ml\tb_framework.py", line 33, in forward
    step_model = ParserStepModel(
  File "spacy\ml\parser_model.pyx", line 223, in spacy.ml.parser_model.ParserStepModel.__init__
  File "spacy\ml\parser_model.pyx", line 362, in spacy.ml.parser_model.precompute_hiddens.__init__
  File "C:\Users\Asli\anaconda3\lib\site-packages\thinc\model.py", line 291, in __call__
    return self._func(self, X, is_train=is_train)
  File "C:\Users\Asli\anaconda3\lib\site-packages\spacy\ml\_precomputable_affine.py", line 27, in forward
    Yf = model.ops.xp.vstack((model.get_param("pad"), Yf))
  File "C:\Users\Asli\anaconda3\lib\site-packages\cupy\_manipulation\join.py", line 114, in vstack
    return concatenate([cupy.atleast_2d(m) for m in tup], 0)
  File "C:\Users\Asli\anaconda3\lib\site-packages\cupy\_manipulation\join.py", line 55, in concatenate
    return _core.concatenate_method(tup, axis, out)
  File "cupy\_core\_routines_manipulation.pyx", line 533, in cupy._core._routines_manipulation.concatenate_method
  File "cupy\_core\_routines_manipulation.pyx", line 577, in cupy._core._routines_manipulation.concatenate_method
  File "cupy\_core\core.pyx", line 163, in cupy._core.core.ndarray.__init__
  File "cupy\cuda\memory.pyx", line 718, in cupy.cuda.memory.alloc
  File "cupy\cuda\memory.pyx", line 1395, in cupy.cuda.memory.MemoryPool.malloc
  File "cupy\cuda\memory.pyx", line 1416, in cupy.cuda.memory.MemoryPool.malloc
  File "cupy\cuda\memory.pyx", line 1096, in cupy.cuda.memory.SingleDeviceMemoryPool.malloc
  File "cupy\cuda\memory.pyx", line 1117, in cupy.cuda.memory.SingleDeviceMemoryPool._malloc
  File "cupy\cuda\memory.pyx", line 1355, in cupy.cuda.memory.SingleDeviceMemoryPool._try_malloc
cupy.cuda.memory.OutOfMemoryError: Out of memory allocating 5,047,296 bytes (allocated so far: 1,512,513,536 bytes).
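
As a quick sanity check on the numbers in the traceback (assuming the usual binary units, where 1 GiB = 2^30 bytes), the failed allocation itself is tiny, and CuPy had only about 1.4 GiB allocated when it ran out:

```python
# Convert the byte counts from the OutOfMemoryError into readable units.
failed_alloc = 5_047_296           # bytes the allocator could not obtain
already_allocated = 1_512_513_536  # bytes CuPy had allocated at that point

MIB = 2 ** 20
GIB = 2 ** 30
print(f"failed allocation: {failed_alloc / MIB:.2f} MiB")   # ~4.81 MiB
print(f"allocated so far:  {already_allocated / GIB:.2f} GiB")  # ~1.41 GiB
```

So the pool died asking for less than 5 MiB on top of ~1.4 GiB, which makes me wonder how much free VRAM the card actually has left once the transformer weights and other processes are loaded.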

Following the forum discussions, I tried setting the batch size to 128, which then gives me this error:

ℹ Using GPU: 0

========================= Generating Prodigy config =========================
ℹ Auto-generating config with spaCy
ℹ Using config from base model
✔ Generated training config

=========================== Initializing pipeline ===========================
✘ Config validation error
training -> batch_size   extra fields not permitted

{'train_corpus': 'corpora.train',
 'dev_corpus': 'corpora.dev',
 'seed': 0,
 'gpu_allocator': None,
 'dropout': 0.1,
 'accumulate_gradient': 3,
 'patience': 5000,
 'max_epochs': 0,
 'max_steps': 20000,
 'eval_frequency': 1000,
 'frozen_components': ['tagger', 'parser', 'attribute_ruler', 'lemmatizer'],
 'before_to_disk': {'@misc': 'prodigy.todisk_cleanup.v1'},
 'annotating_components': [],
 'logger': {'@loggers': 'prodigy.ConsoleLogger.v1'},
 'batch_size': 128,
 'batcher': {'@batchers': 'spacy.batch_by_padded.v1', 'discard_oversize': True, 'get_length': None, 'size': 2000, 'buffer': 256},
 'optimizer': {'@optimizers': 'Adam.v1', 'beta1': 0.9, 'beta2': 0.999, 'L2_is_weight_decay': True, 'L2': 0.01, 'grad_clip': 1.0, 'use_averages': True, 'eps': 1e-08, 'learn_rate': {'@schedules': 'warmup_linear.v1', 'warmup_steps': 250, 'total_steps': 20000, 'initial_rate': 5e-05}},
 'score_weights': {'tag_acc': None, 'dep_uas': None, 'dep_las': None, 'dep_las_per_type': None, 'sents_p': None, 'sents_r': None, 'sents_f': None, 'lemma_acc': None, 'ents_f': 0.16, 'ents_p': 0.0, 'ents_r': 0.0, 'ents_per_type': None}}
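
If I'm reading the validation error and the dumped config correctly, `batch_size` isn't an accepted field under `[training]` at all; the training batch size seems to be controlled by the `[training.batcher]` block instead (the dump shows `spacy.batch_by_padded.v1` with `size = 2000`). So I'm guessing the supported way to shrink batches would be something like this (the value 500 is just an illustration, not a recommendation):

```ini
# Sketch of a config override, assuming spaCy v3's config schema:
# batch size lives under [training.batcher], not as a [training] field.
[training.batcher]
@batchers = "spacy.batch_by_padded.v1"
discard_oversize = true
size = 500        # down from 2000 in the dumped config
buffer = 256
get_length = null
```

Since the traceback shows the OOM happening inside `nlp.evaluate`, I assume the evaluation batching might also matter, but I haven't been able to verify that.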

I would greatly appreciate any help resolving this issue.

I am using prodigy==1.11.1 and spacy-transformers==1.0.6.

Thanks!

Are you sure you're not actually running out of memory on your machine?