I am trying to train a model with 6 labels. I already annotated a training set outside of Prodigy. What is wrong with the following JSONL record?
{"search_term": "authorized announced stock repurchase buyback program", "accept": ["share_repurchase"], "answer": "accept", "text": "VeriSign announced that its Board of Directors has authorized a new $1 billion stock repurchase program."}
Importing it into the dataset runs fine:
prodigy db-in corp_alloc annotated_training_set.jsonl
But when I try to train the model:
prodigy train textcat corp_alloc en_core_web_md -TE
I get the following error:
Loaded model 'en_core_web_md'
Traceback (most recent call last):
File "/home/ploc/anaconda3/envs/corp-capital-allocation/lib/python3.7/runpy.py", line 193, in _run_module_as_main
"__main__", mod_spec)
File "/home/ploc/anaconda3/envs/corp-capital-allocation/lib/python3.7/runpy.py", line 85, in _run_code
exec(code, run_globals)
File "/home/ploc/anaconda3/envs/corp-capital-allocation/lib/python3.7/site-packages/prodigy/__main__.py", line 60, in <module>
controller = recipe(args, use_plac=True)
File "cython_src/prodigy/core.pyx", line 213, in prodigy.core.recipe.recipe_decorator.recipe_proxy
File "/home/ploc/anaconda3/envs/corp-capital-allocation/lib/python3.7/site-packages/plac_core.py", line 328, in call
cmd, result = parser.consume(arglist)
File "/home/ploc/anaconda3/envs/corp-capital-allocation/lib/python3.7/site-packages/plac_core.py", line 207, in consume
return cmd, self.func((args + varargs + extraopts), **kwargs)
File "/home/ploc/anaconda3/envs/corp-capital-allocation/lib/python3.7/site-packages/prodigy/recipes/train.py", line 99, in train
data, labels = merge_data(nlp, **merge_cfg)
File "/home/ploc/anaconda3/envs/corp-capital-allocation/lib/python3.7/site-packages/prodigy/recipes/train.py", line 374, in merge_data
for eg in convert_options_to_cats(textcat_validated, exclusive=textcat_exclusive):
File "cython_src/prodigy/components/preprocess.pyx", line 289, in prodigy.components.preprocess.convert_options_to_cats
KeyError: 'label'
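One possible reading of the traceback, offered as an assumption rather than a confirmed diagnosis: the error is raised inside convert_options_to_cats, and the record above has an "accept" list but no "options" list. Prodigy's choice interface stores multiple-choice annotations with both keys, where each option is a dict with an "id" and a "text". If the missing "options" key is the cause, a minimal sketch for adding it to every record could look like the following. The LABELS list, the add_options and convert_file helpers, and the use of the label name as the option text are all hypothetical, and only one of the 6 labels is named in the post, so the rest would need to be filled in.

```python
import json

# Hypothetical label set: the post mentions 6 labels but names only this one.
LABELS = ["share_repurchase"]


def add_options(record, labels):
    """Return a copy of the record with an "options" list added.

    Each option uses the label as both its "id" and its "text", which is an
    assumption made for illustration. The "accept" list is left unchanged.
    """
    fixed = dict(record)
    fixed["options"] = [{"id": label, "text": label} for label in labels]
    return fixed


def convert_file(src_path, dst_path, labels):
    """Read a JSONL file and write a copy in which every record has "options"."""
    with open(src_path, encoding="utf8") as src, \
            open(dst_path, "w", encoding="utf8") as dst:
        for line in src:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            record = json.loads(line)
            dst.write(json.dumps(add_options(record, labels)) + "\n")
```

Calling convert_file("annotated_training_set.jsonl", "annotated_with_options.jsonl", LABELS) would write the converted copy (the output filename is made up here), which could then be imported with prodigy db-in into a fresh dataset before running the train command again.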