Hi all, beginner user of Spacy here.
I'm trying out an NER task with transformers, specifically with the "en_trf_robertabase_lg" model. I have a few labeled data points and when I start training it throws the following error:
prodigy train ner demo_ner_news_headlines en_trf_robertabase_lg
Output:
✔ Loaded model 'en_trf_robertabase_lg'
Created and merged data for 46 total examples
Using 23 train / 23 eval (split 50%)
Component: ner | Batch size: compounding | Dropout: 0.2 | Iterations: 10
ℹ Baseline accuracy: 0.000
=========================== ✨ Training the model ===========================
# Loss Precision Recall F-Score
-- -------- --------- -------- --------
1: 0%| | 0/23 [00:00<?, ?it/s]Traceback (most recent call last):
File "/anaconda3/envs/nlp-flywheel/lib/python3.7/runpy.py", line 193, in _run_module_as_main
"__main__", mod_spec)
File "/anaconda3/envs/nlp-flywheel/lib/python3.7/runpy.py", line 85, in _run_code
exec(code, run_globals)
File "/anaconda3/envs/nlp-flywheel/lib/python3.7/site-packages/prodigy/__main__.py", line 60, in <module>
controller = recipe(*args, use_plac=True)
File "cython_src/prodigy/core.pyx", line 213, in prodigy.core.recipe.recipe_decorator.recipe_proxy
File "/anaconda3/envs/nlp-flywheel/lib/python3.7/site-packages/plac_core.py", line 367, in call
cmd, result = parser.consume(arglist)
File "/anaconda3/envs/nlp-flywheel/lib/python3.7/site-packages/plac_core.py", line 232, in consume
return cmd, self.func(*(args + varargs + extraopts), **kwargs)
File "/anaconda3/envs/nlp-flywheel/lib/python3.7/site-packages/prodigy/recipes/train.py", line 159, in train
nlp.update(docs, annots, drop=dropout, losses=losses)
File "/anaconda3/envs/nlp-flywheel/lib/python3.7/site-packages/spacy_transformers/language.py", line 81, in update
tok2vec = self.get_pipe(PIPES.tok2vec)
File "/anaconda3/envs/nlp-flywheel/lib/python3.7/site-packages/spacy/language.py", line 281, in get_pipe
raise KeyError(Errors.E001.format(name=name, opts=self.pipe_names))
KeyError: "[E001] No component 'trf_tok2vec' found in pipeline. Available names: ['ner']"
Note training works fine with the "en_core_web_sm" model.
Any help is appreciated!