I annotated data on an older version of Prodigy, and now when I db-in that data and train on it with a newer version of Prodigy, I get the following error during training:
(spacy221) C:\Users\BNV>python -m prodigy train ner newenglish en_core_web_lg --output D:\latest\modelss
Loaded model 'en_core_web_lg'
C:\Users\BNV\Envs\spacy221\lib\site-packages\prodigy\recipes\train.py:453: UserWarning: [W030] Some entities could not be aligned in the text "chalo hamaray ilawa koi tou balance ki baat kertay..." with entities "[(0, 5, 'IGNORE'), (6, 13, 'IGNORE'), (14, 19, 'IG...". Use spacy.gold.biluo_tags_from_offsets(nlp.make_doc(text), entities)
to check the alignment. Misaligned entities ('-') will be ignored during training.
biluo = biluo_tags_from_offsets(doc, offsets, missing=missing_tag)
C:\Users\BNV\Envs\spacy221\lib\site-packages\prodigy\recipes\train.py:453: UserWarning: [W030] Some entities could not be aligned in the text "haha..."tu mera hero" bhi tha:-P;-)" with entities "[(0, 4, 'IGNORE'), (7, 10, 'IGNORE'), (11, 15, 'IG...". Use spacy.gold.biluo_tags_from_offsets(nlp.make_doc(text), entities)
to check the alignment. Misaligned entities ('-') will be ignored during training.
biluo = biluo_tags_from_offsets(doc, offsets, missing=missing_tag)
C:\Users\BNV\Envs\spacy221\lib\site-packages\prodigy\recipes\train.py:453: UserWarning: [W030] Some entities could not be aligned in the text "Arsal:-Sab kOo apni jan bachane ka haq ga.Jiya :-T..." with entities "[(0, 5, 'PERSON'), (7, 10, 'IGNORE'), (11, 14, 'IG...". Use spacy.gold.biluo_tags_from_offsets(nlp.make_doc(text), entities)
to check the alignment. Misaligned entities ('-') will be ignored during training.
biluo = biluo_tags_from_offsets(doc, offsets, missing=missing_tag)
Created and merged data for 20761 total examples
Using 16609 train / 4152 eval (split 20%)
Component: ner | Batch size: compounding | Dropout: 0.2 | Iterations: 10
Baseline accuracy: 0.665
=========================== Training the model ===========================
Loss Precision Recall F-Score
1: 74%|█████████████████████████████████████████████████████▏ | 12281/16609 [02:44<00:24, 177.92it/s]Traceback (most recent call last):
File "C:\Users\BNV\AppData\Local\Programs\Python\Python36\Lib\runpy.py", line 193, in _run_module_as_main
"__main__", mod_spec)
File "C:\Users\BNV\AppData\Local\Programs\Python\Python36\Lib\runpy.py", line 85, in _run_code
exec(code, run_globals)
File "C:\Users\BNV\Envs\spacy221\lib\site-packages\prodigy\__main__.py", line 60, in <module>
controller = recipe(args, use_plac=True)
File "cython_src\prodigy\core.pyx", line 300, in prodigy.core.recipe.recipe_decorator.recipe_proxy
File "C:\Users\BNV\Envs\spacy221\lib\site-packages\plac_core.py", line 328, in call
cmd, result = parser.consume(arglist)
File "C:\Users\BNV\Envs\spacy221\lib\site-packages\plac_core.py", line 207, in consume
return cmd, self.func((args + varargs + extraopts), **kwargs)
File "C:\Users\BNV\Envs\spacy221\lib\site-packages\prodigy\recipes\train.py", line 163, in train
nlp.update(docs, annots, drop=dropout, losses=losses)
File "C:\Users\BNV\Envs\spacy221\lib\site-packages\spacy\language.py", line 529, in update
proc.update(docs, golds, sgd=get_grads, losses=losses, **kwargs)
File "nn_parser.pyx", line 446, in spacy.syntax.nn_parser.Parser.update
File "nn_parser.pyx", line 551, in spacy.syntax.nn_parser.Parser._init_gold_batch
File "transition_system.pyx", line 102, in spacy.syntax.transition_system.TransitionSystem.get_oracle_sequence
File "transition_system.pyx", line 163, in spacy.syntax.transition_system.TransitionSystem.set_costs
ValueError: [E024] Could not find an optimal move to supervise the parser. Usually, this means that the model can't be updated in a way that's valid and satisfies the correct annotations specified in the GoldParse. For example, are all labels added to the model? If you're training a named entity recognizer, also make sure that none of your annotated entity spans have leading or trailing whitespace or punctuation. You can also use the experimental debug-data
command to validate your JSON-formatted training data. For details, run:
python -m spacy debug-data --help
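
Since E024 mentions entity spans with leading or trailing whitespace or punctuation, one thing that can be checked without touching the model is whether any annotated span starts or ends on such a character. Below is a minimal sketch of that check in plain Python. It assumes the examples are dicts in the usual Prodigy shape, with a "text" key and a "spans" list of "start"/"end"/"label" offsets, for instance as exported with db-out. The helper name find_suspect_spans and the sample offsets are made up for illustration, and this only covers one of the possible causes the error message lists.

```python
import string


def find_suspect_spans(example):
    """Return (span, reason) pairs for spans that are empty or that start
    or end with a whitespace or ASCII punctuation character."""
    text = example["text"]
    suspects = []
    for span in example.get("spans", []):
        span_text = text[span["start"]:span["end"]]
        if not span_text:
            suspects.append((span, "empty span"))
            continue
        # Only the first and last characters matter for this check.
        edges = (span_text[0], span_text[-1])
        if any(ch.isspace() or ch in string.punctuation for ch in edges):
            suspects.append((span, repr(span_text)))
    return suspects


# Hypothetical example: the text is shortened from the log above, and the
# second span's offsets are invented so that it starts on punctuation.
example = {
    "text": "Arsal:-Sab kOo apni jan",
    "spans": [
        {"start": 0, "end": 5, "label": "PERSON"},
        {"start": 5, "end": 10, "label": "IGNORE"},
    ],
}

for span, reason in find_suspect_spans(example):
    print(span["label"], span["start"], span["end"], reason)
```

Running something like this over every exported example should list the spans worth looking at first. The W030 warnings above are a separate check (token alignment), which the warning itself suggests inspecting with spacy.gold.biluo_tags_from_offsets.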