I'm trying to create a new model with ner.manual and then train it further with ner.teach.
I was able to annotate my new labels, for which I used the following command:
prodigy ner.manual new_set en_core_web_sm train.jsonl --label labels.txt
Now I want to improve that dataset with other data by using ner.teach. How do I do this?
I tried to train a new model from the dataset to use in ner.teach:
prodigy ner.batch-train new_set en_core_web_sm --output /tmp/model --eval-split 0.5 --label labels.txt
However, this resulted in the following error:
Traceback (most recent call last):
File "/usr/lib/python3.5/runpy.py", line 184, in _run_module_as_main
"__main__", mod_spec)
File "/usr/lib/python3.5/runpy.py", line 85, in _run_code
exec(code, run_globals)
File "/usr/local/lib/python3.5/dist-packages/prodigy/__main__.py", line 253, in <module>
controller = recipe(*args, use_plac=True)
File "cython_src/prodigy/core.pyx", line 150, in prodigy.core.recipe.recipe_decorator.recipe_proxy
File "/usr/local/lib/python3.5/dist-packages/plac_core.py", line 328, in __call__
cmd, result = parser.consume(arglist)
File "/usr/local/lib/python3.5/dist-packages/plac_core.py", line 207, in consume
return cmd, self.func(*(args + varargs + extraopts), **kwargs)
File "/usr/local/lib/python3.5/dist-packages/prodigy/recipes/ner.py", line 400, in batch_train
drop=dropout, beam_width=beam_width)
File "cython_src/prodigy/models/ner.pyx", line 309, in prodigy.models.ner.EntityRecognizer.batch_train
File "cython_src/prodigy/models/ner.pyx", line 370, in prodigy.models.ner.EntityRecognizer._update
File "cython_src/prodigy/models/ner.pyx", line 364, in prodigy.models.ner.EntityRecognizer._update
File "cython_src/prodigy/models/ner.pyx", line 365, in prodigy.models.ner.EntityRecognizer._update
File "/usr/local/lib/python3.5/dist-packages/spacy/language.py", line 415, in update
proc.update(docs, golds, drop=drop, sgd=get_grads, losses=losses)
File "nn_parser.pyx", line 558, in spacy.syntax.nn_parser.Parser.update
File "nn_parser.pyx", line 676, in spacy.syntax.nn_parser.Parser._init_gold_batch
File "ner.pyx", line 119, in spacy.syntax.ner.BiluoPushDown.preprocess_gold
File "ner.pyx", line 178, in spacy.syntax.ner.BiluoPushDown.lookup_transition
KeyError: 'B-IDENTIFIER'
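
My reading of the traceback is that the lookup fails because the entity recognizer in en_core_web_sm has no transitions registered for my new IDENTIFIER label. The sketch below is only a toy model of that situation, not spaCy's actual implementation: build_transitions, lookup_transition and the pretrained_labels list are hypothetical names I made up for illustration. In spaCy itself the analogous step would presumably be calling add_label on the "ner" pipe before training, but that is an assumption on my part.

```python
# Toy model of a BILUO transition table. This is NOT spaCy's real code;
# all names here are hypothetical and only illustrate the failure mode.

def build_transitions(labels):
    """Map every BILUO action name (e.g. 'B-PERSON') to an integer id."""
    names = ["O"]
    for label in labels:
        for prefix in "BILU":
            names.append("{}-{}".format(prefix, label))
    return {name: i for i, name in enumerate(names)}


def lookup_transition(table, name):
    """Return the action id; raises KeyError for an unregistered label."""
    return table[name]


# Hypothetical subset of labels a pretrained English model might know.
pretrained_labels = ["PERSON", "ORG", "GPE"]
transitions = build_transitions(pretrained_labels)

# Looking up an action for a label the table never saw fails,
# much like the last frame of the traceback above.
missing = None
try:
    lookup_transition(transitions, "B-IDENTIFIER")
except KeyError as err:
    missing = err.args[0]

# Registering the new label before the lookup makes it succeed.
transitions = build_transitions(pretrained_labels + ["IDENTIFIER"])
action_id = lookup_transition(transitions, "B-IDENTIFIER")
```

If this is right, the model needs to learn about the labels in labels.txt before the first update runs, rather than only receiving them as annotations in the dataset.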