Help with --binary flag

Hi there,

I'm trying to do binary annotation to classify a new entity type on top of en_core_web_lg, but I'm having trouble using the --binary flag for training. Here's the output I'm getting:

$ prodigy train ner phone_ents_train en_core_web_lg --binary
:heavy_check_mark: Loaded model 'en_core_web_lg'
Using 296 train / 296 eval (split 50%)
Component: ner | Batch size: compounding | Dropout: 0.2 | Iterations: 10
:information_source: Baseline accuracy: 0.000

=========================== :sparkles: Training the model ===========================

Loss Skip Right Wrong Accuracy


Traceback (most recent call last):
  File "C:\Users\613629\Anaconda3\lib\runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "C:\Users\613629\Anaconda3\lib\runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "C:\Users\613629\Projects\prodigy\lib\site-packages\prodigy\__main__.py", line 53, in <module>
    controller = recipe(*args, use_plac=True)
  File "cython_src\prodigy\core.pyx", line 321, in prodigy.core.recipe.recipe_decorator.recipe_proxy
  File "C:\Users\613629\Projects\prodigy\lib\site-packages\plac_core.py", line 367, in call
    cmd, result = parser.consume(arglist)
  File "C:\Users\613629\Projects\prodigy\lib\site-packages\plac_core.py", line 232, in consume
    return cmd, self.func(*(args + varargs + extraopts), **kwargs)
  File "C:\Users\613629\Projects\prodigy\lib\site-packages\prodigy\recipes\train.py", line 174, in train
    losses = annot_model.batch_train(
  File "cython_src\prodigy\models\ner.pyx", line 346, in prodigy.models.ner.EntityRecognizer.batch_train
  File "cython_src\prodigy\models\ner.pyx", line 438, in prodigy.models.ner.EntityRecognizer._update
  File "cython_src\prodigy\models\ner.pyx", line 431, in prodigy.models.ner.EntityRecognizer._update
  File "C:\Users\613629\Projects\prodigy\lib\site-packages\spacy\language.py", line 460, in disable_pipes
    return DisabledPipes(self, *names)
  File "C:\Users\613629\Projects\prodigy\lib\site-packages\spacy\language.py", line 1124, in __init__
    self.extend(nlp.remove_pipe(name) for name in names)
  File "C:\Users\613629\Projects\prodigy\lib\site-packages\spacy\language.py", line 1124, in <genexpr>
    self.extend(nlp.remove_pipe(name) for name in names)
  File "C:\Users\613629\Projects\prodigy\lib\site-packages\spacy\language.py", line 418, in remove_pipe
    raise ValueError(Errors.E001.format(name=name, opts=self.pipe_names))
ValueError: [E001] No component 'sentencizer' found in pipeline. Available names: ['ner']

I've tried googling for a solution, but I'm pretty lost at this point. Thanks!

Hi! This is strange; I don't remember seeing this issue before :thinking: Which version of Prodigy are you using?

Also, here's a quick workaround/test in the meantime: add a sentencizer to the pipeline yourself and save the result to disk. After saving out the updated pipeline, use ./en_core_web_lg_updated as the base model for training:

import spacy

nlp = spacy.load("en_core_web_lg")
nlp.add_pipe(nlp.create_pipe("sentencizer"))
nlp.to_disk("./en_core_web_lg_updated")
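The training command would then look something like this (same dataset name and flags as in your original post, just pointing at the saved directory):

```shell
# Retrain using the patched pipeline directory as the base model
prodigy train ner phone_ents_train ./en_core_web_lg_updated --binary
```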

Hi Ines!

Thank you for responding. I've tried your proposed solution, but I get a similar error saying that there's no parser in the pipeline. I'm using version 1.10.7. I wish I could give you more information, but I'm not really sure what's happening in the background. Thanks!

Thanks! Could you run pip list and post the output here? :slightly_smiling_face:

Package Version


aiofiles 0.6.0
backcall 0.2.0
blis 0.7.4
cachetools 4.2.1
catalogue 1.0.0
certifi 2020.12.5
chardet 4.0.0
click 7.1.2
colorama 0.4.4
cymem 2.0.5
decorator 4.4.2
en-core-web-lg 2.3.1
fastapi 0.44.1
h11 0.9.0
idna 2.10
ipykernel 5.5.0
ipython 7.21.0
ipython-genutils 0.2.0
jedi 0.18.0
jupyter-client 6.1.11
jupyter-core 4.7.1
murmurhash 1.0.5
numpy 1.20.1
parso 0.8.1
peewee 3.14.1
pickleshare 0.7.5
pip 20.1.1
plac 1.1.3
preshed 3.0.5
prodigy 1.10.7
prompt-toolkit 3.0.16
pydantic 1.8.1
Pygments 2.8.0
PyJWT 1.7.1
python-dateutil 2.8.1
pywin32 300
pyzmq 22.0.3
requests 2.25.1
setuptools 47.1.0
six 1.15.0
spacy 2.3.5
srsly 1.0.5
starlette 0.12.9
thinc 7.4.5
toolz 0.11.1
tornado 6.1
tqdm 4.58.0
traitlets 5.0.5
typing-extensions 3.7.4.3
urllib3 1.26.3
uvicorn 0.11.8
wasabi 0.8.2
wcwidth 0.2.5
websockets 8.1

I'm getting the same error when training NER. Has this already been resolved?

@jpz129 @paulterhorst Sorry about that, this is very strange! I wonder if a recent update in spaCy makes a difference here :thinking: Anyway, as a quick fix, try the following:

  • Find your Prodigy installation (you can run prodigy stats to print the path) and open recipes/train.py.
  • Find the following two lines and swap their order, so that disable_pipes is called before the annotation model is created:
annot_model = get_annot_model(component, nlp, labels) if binary else None
disabled = nlp.disable_pipes([p for p in nlp.pipe_names if p != component])
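To see why the order matters, here's a minimal pure-Python sketch (the class and method names are made up for illustration, not Prodigy's actual internals): the annotation model records the non-target pipe names it sees at creation time and later tries to disable them itself. If the outer disable_pipes has already removed those pipes, the inner call fails with exactly the kind of E001 error shown in the traceback above.

```python
# Illustrative sketch of the ordering bug; names are hypothetical.

class Pipeline:
    def __init__(self, names):
        self.pipe_names = list(names)

    def disable_pipes(self, names):
        for name in names:
            if name not in self.pipe_names:
                # Mirrors spaCy's E001: the pipe was already removed.
                raise ValueError(f"[E001] No component '{name}' found in "
                                 f"pipeline. Available: {self.pipe_names}")
            self.pipe_names.remove(name)


class AnnotModel:
    def __init__(self, nlp, component):
        # Captures the *current* non-target pipes at creation time.
        self.nlp = nlp
        self.to_disable = [p for p in nlp.pipe_names if p != component]

    def batch_train(self):
        # Later tries to disable the pipes it captured earlier.
        self.nlp.disable_pipes(self.to_disable)


# Buggy order: the model captures ['sentencizer', 'parser', 'tagger'],
# then the outer disable_pipes removes them, so batch_train() raises E001.
nlp = Pipeline(["sentencizer", "parser", "tagger", "ner"])
annot_model = AnnotModel(nlp, "ner")
nlp.disable_pipes([p for p in nlp.pipe_names if p != "ner"])
try:
    annot_model.batch_train()
except ValueError as e:
    print(e)  # [E001] No component 'sentencizer' found ...

# Fixed order: disable first, then create the model; it now captures an
# empty to_disable list and batch_train() succeeds.
nlp2 = Pipeline(["sentencizer", "parser", "tagger", "ner"])
nlp2.disable_pipes([p for p in nlp2.pipe_names if p != "ner"])
annot_model2 = AnnotModel(nlp2, "ner")
annot_model2.batch_train()
print(nlp2.pipe_names)  # ['ner']
```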

This did the trick. Thank you so much!

Just released v1.10.8, which should resolve the underlying issue :slightly_smiling_face: