Hello,
I trained my model twice with the annotations I got from ner.make-gold, and got a 20% improvement in overall label accuracy.
As a next step, I looked at the precision/recall results for each label using spaCy's Scorer and figured out which labels need to be improved.
For that, I already had some ground-truth label examples stored locally, in the following format:
{"label": "SPORTS", "pattern": [{"lower": "abseiling"}]}
{"label": "SPORTS", "pattern": [{"lower": "adventure racing"}]}
I imported these labels into the database and tried to run ner.batch-train with the model I had trained earlier, but got an error while merging spans:
Traceback (most recent call last):
  File "/Users/lib/python3.7/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/Users/lib/python3.7/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/Users/lib/python3.7/site-packages/prodigy/__main__.py", line 380, in <module>
    controller = recipe(*args, use_plac=True)
  File "cython_src/prodigy/core.pyx", line 212, in prodigy.core.recipe.recipe_decorator.recipe_proxy
  File "/Users/lib/python3.7/site-packages/plac_core.py", line 328, in call
    cmd, result = parser.consume(arglist)
  File "/Users/lib/python3.7/site-packages/plac_core.py", line 207, in consume
    return cmd, self.func(*(args + varargs + extraopts), **kwargs)
  File "/Users/lib/python3.7/site-packages/prodigy/recipes/ner.py", line 552, in batch_train
    examples = merge_spans(DB.get_dataset(dataset))
  File "cython_src/prodigy/models/ner.pyx", line 40, in prodigy.models.ner.merge_spans
KeyError: 'text'
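The KeyError suggests that merge_spans looks up a "text" key on every record in the dataset, and the pattern entries above only have "label" and "pattern". Below is a minimal sketch of checking which records lack that key. It is plain Python and does not use Prodigy itself. The helper name records_missing_key and the annotated example are hypothetical, and the "text"/"spans"/"answer" layout is an assumption about what an annotated record looks like.

```python
import json

# The two pattern entries from the post, as they would appear in a JSONL file.
PATTERN_LINES = [
    '{"label": "SPORTS", "pattern": [{"lower": "abseiling"}]}',
    '{"label": "SPORTS", "pattern": [{"lower": "adventure racing"}]}',
]

# Hypothetical annotated example (assumed layout): raw text plus character-offset spans.
ANNOTATION_LINE = (
    '{"text": "I love abseiling", '
    '"spans": [{"start": 7, "end": 16, "label": "SPORTS"}], '
    '"answer": "accept"}'
)


def records_missing_key(lines, key="text"):
    """Parse JSONL lines and return the records that do not contain `key`."""
    records = [json.loads(line) for line in lines]
    return [rec for rec in records if key not in rec]


missing = records_missing_key(PATTERN_LINES + [ANNOTATION_LINE])
for rec in missing:
    # Each of these would raise KeyError: 'text' if accessed as rec["text"].
    print("no 'text' key:", rec)
```

In this sketch only the two pattern entries are reported, which is consistent with the error appearing once they are stored in the same dataset as the annotations.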
Could you please help me understand how I can use this type of dataset in ner.batch-train with an existing pre-trained model?
Thanks,