Help with --binary flag

Hi there,

I'm trying to do binary annotation to classify a new entity type on top of en_core_web_lg, but I'm having trouble using the --binary flag for training. Here's the output I'm getting:

$ prodigy train ner phone_ents_train en_core_web_lg --binary
:heavy_check_mark: Loaded model 'en_core_web_lg'
Using 296 train / 296 eval (split 50%)
Component: ner | Batch size: compounding | Dropout: 0.2 | Iterations: 10
:information_source: Baseline accuracy: 0.000

=========================== :sparkles: Training the model ===========================

Loss Skip Right Wrong Accuracy

Traceback (most recent call last):
File "C:\Users\613629\Anaconda3\lib\", line 194, in _run_module_as_main
return _run_code(code, main_globals, None,
File "C:\Users\613629\Anaconda3\lib\", line 87, in _run_code
exec(code, run_globals)
File "C:\Users\613629\Projects\prodigy\lib\site-packages\prodigy\__main__.py", line 53, in <module>
controller = recipe(*args, use_plac=True)
File "cython_src\prodigy\core.pyx", line 321, in prodigy.core.recipe.recipe_decorator.recipe_proxy
File "C:\Users\613629\Projects\prodigy\lib\site-packages\", line 367, in call
cmd, result = parser.consume(arglist)
File "C:\Users\613629\Projects\prodigy\lib\site-packages\", line 232, in consume
return cmd, self.func(*(args + varargs + extraopts), **kwargs)
File "C:\Users\613629\Projects\prodigy\lib\site-packages\prodigy\recipes\", line 174, in train
losses = annot_model.batch_train(
File "cython_src\prodigy\models\ner.pyx", line 346, in prodigy.models.ner.EntityRecognizer.batch_train
File "cython_src\prodigy\models\ner.pyx", line 438, in prodigy.models.ner.EntityRecognizer._update
File "cython_src\prodigy\models\ner.pyx", line 431, in prodigy.models.ner.EntityRecognizer._update
File "C:\Users\613629\Projects\prodigy\lib\site-packages\spacy\", line 460, in disable_pipes
return DisabledPipes(self, *names)
File "C:\Users\613629\Projects\prodigy\lib\site-packages\spacy\", line 1124, in __init__
self.extend(nlp.remove_pipe(name) for name in names)
File "C:\Users\613629\Projects\prodigy\lib\site-packages\spacy\", line 1124, in <genexpr>
self.extend(nlp.remove_pipe(name) for name in names)
File "C:\Users\613629\Projects\prodigy\lib\site-packages\spacy\", line 418, in remove_pipe
raise ValueError(Errors.E001.format(name=name, opts=self.pipe_names))
ValueError: [E001] No component 'sentencizer' found in pipeline. Available names: ['ner']

I've tried googling the solution, but I'm pretty lost at this point. Thanks!

Hi! This is strange, I don't remember seeing this issue before :thinking: Which version of Prodigy are you using?

Also, here's a quick workaround/test in the meantime: add the missing sentencizer to the pipeline yourself and save the result to disk. After saving out the updated pipeline, try using ./en_core_web_lg_updated as the base model for training:

import spacy

nlp = spacy.load("en_core_web_lg")
nlp.add_pipe(nlp.create_pipe("sentencizer"))  # add the component the error complains about (spaCy v2 API)
nlp.to_disk("./en_core_web_lg_updated")       # save out the updated pipeline

Hi Ines!

Thank you for responding. I've tried your proposed solution, but I get a similar error saying that there's no parser present in the pipeline. I'm using version 1.10.7. I wish I could give you more information, but I'm not really sure how this is operating in the background. Thanks!

Thanks! Could you run pip list and post the output here? :slightly_smiling_face:

Package Version

aiofiles 0.6.0
backcall 0.2.0
blis 0.7.4
cachetools 4.2.1
catalogue 1.0.0
certifi 2020.12.5
chardet 4.0.0
click 7.1.2
colorama 0.4.4
cymem 2.0.5
decorator 4.4.2
en-core-web-lg 2.3.1
fastapi 0.44.1
h11 0.9.0
idna 2.10
ipykernel 5.5.0
ipython 7.21.0
ipython-genutils 0.2.0
jedi 0.18.0
jupyter-client 6.1.11
jupyter-core 4.7.1
murmurhash 1.0.5
numpy 1.20.1
parso 0.8.1
peewee 3.14.1
pickleshare 0.7.5
pip 20.1.1
plac 1.1.3
preshed 3.0.5
prodigy 1.10.7
prompt-toolkit 3.0.16
pydantic 1.8.1
Pygments 2.8.0
PyJWT 1.7.1
python-dateutil 2.8.1
pywin32 300
pyzmq 22.0.3
requests 2.25.1
setuptools 47.1.0
six 1.15.0
spacy 2.3.5
srsly 1.0.5
starlette 0.12.9
thinc 7.4.5
toolz 0.11.1
tornado 6.1
tqdm 4.58.0
traitlets 5.0.5
urllib3 1.26.3
uvicorn 0.11.8
wasabi 0.8.2
wcwidth 0.2.5
websockets 8.1

I'm getting the same error when training NER. Has this already been resolved?

@jpz129 @paulterhorst Sorry about that, this is very strange! I wonder if there's a recent update in spaCy that makes a difference here :thinking: Anyway, as a quick fix, try the following:

  • Find your Prodigy installation (you can run prodigy stats to print the path) and open recipes/
  • Find the following two lines and swap them (so that you're calling disable_pipes before creating the annotation model):
annot_model = get_annot_model(component, nlp, labels) if binary else None
disabled = nlp.disable_pipes([p for p in nlp.pipe_names if p != component])
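To see why the order matters, here's a toy, stdlib-only sketch (the class and the `make_annot_model` helper are hypothetical stand-ins, not the real Prodigy/spaCy internals). The idea: the binary annotation model adds a sentencizer it later tries to disable, so if `disable_pipes` strips everything except the training component *after* that sentencizer is added, the later call fails with exactly the [E001]-style error above:

```python
class Pipeline:
    """Minimal stand-in for a spaCy pipeline (hypothetical, for illustration)."""

    def __init__(self, names):
        self.pipe_names = list(names)

    def add_pipe(self, name):
        self.pipe_names.append(name)

    def remove_pipe(self, name):
        # Mirrors the [E001] behaviour: removing a missing component raises.
        if name not in self.pipe_names:
            raise ValueError(f"[E001] No component '{name}' found in pipeline. "
                             f"Available names: {self.pipe_names}")
        self.pipe_names.remove(name)

    def disable_pipes(self, names):
        for name in names:
            self.remove_pipe(name)


def make_annot_model(nlp):
    # Stand-in for the annotation model: it needs sentence boundaries,
    # so it adds a sentencizer it will disable later.
    nlp.add_pipe("sentencizer")


# Buggy order: create the annotation model first, then disable pipes.
nlp = Pipeline(["tagger", "parser", "ner"])
make_annot_model(nlp)                                         # adds 'sentencizer'
nlp.disable_pipes([p for p in nlp.pipe_names if p != "ner"])  # strips it again
try:
    nlp.disable_pipes(["sentencizer"])                        # already gone -> [E001]
except ValueError as err:
    print(err)

# Fixed order: disable pipes first, then create the annotation model.
nlp = Pipeline(["tagger", "parser", "ner"])
nlp.disable_pipes([p for p in nlp.pipe_names if p != "ner"])
make_annot_model(nlp)                                         # 'sentencizer' survives
nlp.disable_pipes(["sentencizer"])                            # works fine
print(nlp.pipe_names)                                         # ['ner']
```

With the swapped order, the sentencizer is added after the broad disable_pipes call, so it's still present when the annotation model disables it during training.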

This did the trick. Thank you so much!

Just released v1.10.8, which should resolve the underlying issue :slightly_smiling_face: