ValueError: [E008] Some current components would be lost when restoring previous pipeline state.

Every time I train the model with the --binary flag, I get the following error:

$ prodigy train ner subjects en_core_web_sm --n-iter 1 --dropout 0.2 --binary --output models/company_mentions_sm
✔ Loaded model 'en_core_web_sm'
Using 906 train / 226 eval (split 20%)
Component: ner | Batch size: compounding | Dropout: 0.2 | Iterations: 1
ℹ Baseline accuracy: 0.039

=========================== ✨  Training the model ===========================

#    Loss       Skip    Right   Wrong   Accuracy
--   --------   -----   -----   -----   --------
 1       2.33       0      89     138      0.392

Correct     89
Incorrect   138
Baseline    0.039
Accuracy    0.392

Traceback (most recent call last):
  File "/Users/quetzal/.pyenv/versions/3.7.7/lib/python3.7/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/Users/quetzal/.pyenv/versions/3.7.7/lib/python3.7/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/Users/quetzal/.pyenv/versions/company-mentions-dataset/lib/python3.7/site-packages/prodigy/__main__.py", line 53, in <module>
    controller = recipe(*args, use_plac=True)
  File "cython_src/prodigy/core.pyx", line 321, in prodigy.core.recipe.recipe_decorator.recipe_proxy
  File "/Users/quetzal/.pyenv/versions/company-mentions-dataset/lib/python3.7/site-packages/plac_core.py", line 328, in call
    cmd, result = parser.consume(arglist)
  File "/Users/quetzal/.pyenv/versions/company-mentions-dataset/lib/python3.7/site-packages/plac_core.py", line 207, in consume
    return cmd, self.func(*(args + varargs + extraopts), **kwargs)
  File "/Users/quetzal/.pyenv/versions/company-mentions-dataset/lib/python3.7/site-packages/prodigy/recipes/train.py", line 194, in train
    disabled.restore()
  File "/Users/quetzal/.pyenv/versions/company-mentions-dataset/lib/python3.7/site-packages/spacy/language.py", line 1139, in restore
    raise ValueError(Errors.E008.format(names=unexpected))
ValueError: [E008] Some current components would be lost when restoring previous pipeline state. If you added components after calling `nlp.disable_pipes()`, you should remove them explicitly with `nlp.remove_pipe()` before the pipeline is restored. Names of the new components: ['sentencizer']

Without the binary flag, everything works just fine. I'm using Prodigy 1.10.8. Could you please help me with this issue?

Hi, and sorry about that – that's strange :thinking: Could you try something and see if reverting the following change here solves the problem for you?
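
For background, E008 is raised when a component gets added after nlp.disable_pipes() and is still in the pipeline when the previous state is restored – roughly this pattern (a minimal sketch to illustrate the error, not the actual recipe code):

import spacy

nlp = spacy.load("en_core_web_sm")
# Snapshot the pipeline state, keeping only the NER enabled
disabled = nlp.disable_pipes("tagger", "parser")
# A component added afterwards isn't part of that snapshot...
nlp.add_pipe(nlp.create_pipe("sentencizer"))
# ...so restoring the previous state raises E008 for 'sentencizer'
disabled.restore()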

Hi Ines,
I am using Prodigy version 1.10.7, but when I start to train the model in --binary mode, the error reappears for me.
I have already tried moving the annot_model = ... line above the disabled = ... line in the train recipe, but even that does not solve the error.
Any workarounds for this? I am trying to create an active learning pipeline.

Can you try upgrading to v1.10.8? I think that version includes a change that might be relevant here.
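
(In case it's useful: upgrading is just a reinstall from the download server, with your own license key in place of the placeholder.)

pip install prodigy==1.10.8 -f https://XXXX-XXXX-XXXX-XXXX@download.prodi.gy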

I'm using 1.10.8 but had the same issue; manually adding the sentencizer to the model's pipeline before starting binary training worked, though.
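
For reference, this is roughly what I did (spaCy v2.x, as shipped with Prodigy 1.10.x; the output path is just an example):

import spacy

nlp = spacy.load("en_core_web_sm")
if "sentencizer" not in nlp.pipe_names:
    # Add the sentencizer up front so it's already part of the pipeline
    # before the train recipe disables and restores components
    nlp.add_pipe(nlp.create_pipe("sentencizer"), first=True)
nlp.to_disk("models/en_core_web_sm_sentencizer")

Then point the train command at the saved directory instead of the package name:

prodigy train ner subjects models/en_core_web_sm_sentencizer --n-iter 1 --dropout 0.2 --binary --output models/company_mentions_sm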

Thanks for the update, that's strange – but glad to hear there's a manual workaround.

The upcoming version (currently available as a nightly pre-release) will definitely resolve this problem, since it makes the separate binary training workflow obsolete: it now uses the same process for training from binary and manual annotations, including the ability to train from a mix of binary and manual datasets.
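
In the nightly, the train command takes the output directory first and the datasets per component, so the equivalent call would look roughly like this (a sketch – the exact flags may still change before the stable release):

prodigy train ./models/company_mentions_sm --ner subjects --base-model en_core_web_sm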