tagger -> neg_prefix extra fields not permitted

klopez · March 29, 2022, 1:55pm

Hi, I trained a model based on en_core_web_trf on a GPU using prodigy train. Training was done on a separate machine with GPU access running spaCy 3.2.3 and Prodigy 1.11.7, while the annotation server has Prodigy 1.11.6 and spaCy 3.2.0 (required by Prodigy?). When I load the model on the annotation server, I get the following error:

Annotation bash[1025383]: /usr/local/lib/python3.8/dist-packages/spacy/util.py:833: UserWarning: [W095] Model 'en_pipeline' (0.0.0) was trained with spaCy v3.2 and may not be 100% compatible with the current version (3.2.0). If you see errors or degraded performance, download a newer compatible model or retrain your custom model with the current spaCy version. For more details and available updates, run: python -m spacy validate
Mar 29 13:47:24 Annotation bash[1025383]:   warnings.warn(warn_msg)
Mar 29 13:47:24 Annotation bash[1025383]: Traceback (most recent call last):
Mar 29 13:47:24 Annotation bash[1025383]:   File "/usr/lib/python3.8/runpy.py", line 194, in _run_module_as_main
Mar 29 13:47:24 Annotation bash[1025383]:     return _run_code(code, main_globals, None,
Mar 29 13:47:24 Annotation bash[1025383]:   File "/usr/lib/python3.8/runpy.py", line 87, in _run_code
Mar 29 13:47:24 Annotation bash[1025383]:     exec(code, run_globals)
Mar 29 13:47:24 Annotation bash[1025383]:   File "/usr/local/lib/python3.8/dist-packages/prodigy/__main__.py", line 61, in <module>
Mar 29 13:47:24 Annotation bash[1025383]:     controller = recipe(*args, use_plac=True)
Mar 29 13:47:24 Annotation bash[1025383]:   File "cython_src/prodigy/core.pyx", line 329, in prodigy.core.recipe.recipe_decorator.recipe_proxy
Mar 29 13:47:24 Annotation bash[1025383]:   File "/usr/local/lib/python3.8/dist-packages/plac_core.py", line 367, in call
Mar 29 13:47:24 Annotation bash[1025383]:     cmd, result = parser.consume(arglist)
Mar 29 13:47:24 Annotation bash[1025383]:   File "/usr/local/lib/python3.8/dist-packages/plac_core.py", line 232, in consume
Mar 29 13:47:24 Annotation bash[1025383]:     return cmd, self.func(*(args + varargs + extraopts), **kwargs)
Mar 29 13:47:24 Annotation bash[1025383]:   File "/usr/local/lib/python3.8/dist-packages/prodigy/recipes/ner.py", line 215, in correct
Mar 29 13:47:24 Annotation bash[1025383]:     nlp = spacy.load(spacy_model)
Mar 29 13:47:24 Annotation bash[1025383]:   File "/usr/local/lib/python3.8/dist-packages/spacy/__init__.py", line 51, in load
Mar 29 13:47:24 Annotation bash[1025383]:     return util.load_model(
Mar 29 13:47:24 Annotation bash[1025383]:   File "/usr/local/lib/python3.8/dist-packages/spacy/util.py", line 422, in load_model
Mar 29 13:47:24 Annotation bash[1025383]:     return load_model_from_path(Path(name), **kwargs)  # type: ignore[arg-type]
Mar 29 13:47:24 Annotation bash[1025383]:   File "/usr/local/lib/python3.8/dist-packages/spacy/util.py", line 488, in load_model_from_path
Mar 29 13:47:24 Annotation bash[1025383]:     nlp = load_model_from_config(config, vocab=vocab, disable=disable, exclude=exclude)
Mar 29 13:47:24 Annotation bash[1025383]:   File "/usr/local/lib/python3.8/dist-packages/spacy/util.py", line 525, in load_model_from_config
Mar 29 13:47:24 Annotation bash[1025383]:     nlp = lang_cls.from_config(
Mar 29 13:47:24 Annotation bash[1025383]:   File "/usr/local/lib/python3.8/dist-packages/spacy/language.py", line 1785, in from_config
Mar 29 13:47:24 Annotation bash[1025383]:     nlp.add_pipe(
Mar 29 13:47:24 Annotation bash[1025383]:   File "/usr/local/lib/python3.8/dist-packages/spacy/language.py", line 788, in add_pipe
Mar 29 13:47:24 Annotation bash[1025383]:     pipe_component = self.create_pipe(
Mar 29 13:47:24 Annotation bash[1025383]:   File "/usr/local/lib/python3.8/dist-packages/spacy/language.py", line 671, in create_pipe
Mar 29 13:47:24 Annotation bash[1025383]:     resolved = registry.resolve(cfg, validate=validate)
Mar 29 13:47:24 Annotation bash[1025383]:   File "/usr/local/lib/python3.8/dist-packages/thinc/config.py", line 729, in resolve
Mar 29 13:47:24 Annotation bash[1025383]:     resolved, _ = cls._make(
Mar 29 13:47:24 Annotation bash[1025383]:   File "/usr/local/lib/python3.8/dist-packages/thinc/config.py", line 778, in _make
Mar 29 13:47:24 Annotation bash[1025383]:     filled, _, resolved = cls._fill(
Mar 29 13:47:24 Annotation bash[1025383]:   File "/usr/local/lib/python3.8/dist-packages/thinc/config.py", line 833, in _fill
Mar 29 13:47:24 Annotation bash[1025383]:     filled[key], validation[v_key], final[key] = cls._fill(
Mar 29 13:47:24 Annotation bash[1025383]:   File "/usr/local/lib/python3.8/dist-packages/thinc/config.py", line 899, in _fill
Mar 29 13:47:24 Annotation bash[1025383]:     raise ConfigValidationError(
Mar 29 13:47:24 Annotation bash[1025383]: thinc.config.ConfigValidationError:
Mar 29 13:47:24 Annotation bash[1025383]: Config validation error
Mar 29 13:47:24 Annotation bash[1025383]: tagger -> neg_prefix   extra fields not permitted
Mar 29 13:47:24 Annotation bash[1025383]: {'nlp': <spacy.lang.en.English object at 0x7f945fb614c0>, 'name': 'tagger', 'model': {'@architectures': 'spacy.Tagger.v1', 'nO': None, 'tok2vec': {'@architectures': 'spacy-transformers.Tok2VecTransformer.v3', 'name': 'roberta-base', 'mixed_precision': False, 'pooling': {'@layers': 'reduce_mean.v1'}, 'grad_factor': 1.0, 'get_spans': {'@span_getters': 'spacy-transformers.strided_spans.v1', 'window': 128, 'stride': 96}, 'grad_scaler_config': {}, 'tokenizer_config': {'use_fast': True}, 'transformer_config': {}}}, 'neg_prefix': '!', 'overwrite': False, 'scorer': {'@scorers': 'spacy.tagger_scorer.v1'}, '@factories': 'tagger'}

It trained fine, and I have done this before: trained a model on my GPU machine and transferred it over. What can I do to resolve this?

I am trying to use the prodigy ner.correct recipe.
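The traceback above shows the tagger config carrying a neg_prefix setting that the installed spaCy 3.2.0 rejects, which suggests the pipeline was saved by a newer spaCy than the one loading it. One way to check what a saved pipeline expects is to read its meta.json. This is only a minimal sketch, assuming the usual layout where the model directory contains a meta.json with spacy_version and pipeline keys; read_pipeline_meta is a hypothetical helper name, not part of spaCy or Prodigy.

```python
import json
from pathlib import Path


def read_pipeline_meta(model_dir):
    """Return the spaCy version constraint and pipe names stored in meta.json.

    Assumes model_dir is a saved spaCy pipeline directory (hypothetical helper).
    """
    meta_path = Path(model_dir) / "meta.json"
    meta = json.loads(meta_path.read_text(encoding="utf8"))
    return {
        # Version range the pipeline was saved for; None if the key is absent
        "spacy_version": meta.get("spacy_version"),
        # Ordered component names; empty list if the key is absent
        "pipeline": meta.get("pipeline", []),
    }
```

Calling read_pipeline_meta("ner_75_25/model-last") on each machine and comparing the result against the installed spaCy version should show whether the two are out of step.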

EDIT:
OK, I upgraded Python to 3.9 and installed Prodigy 1.11.7 and spaCy 3.2.3 on my annotation server. When I try to load the model with the ner.correct recipe again, I get: ValueError: Cannot deserialize model: mismatched structure

The model was trained with the same versions (Prodigy 1.11.7 and spaCy 3.2.3). Any idea why this is happening? Maybe it is a config mismatch, but then how can I compare the configs?
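On comparing configs: each saved pipeline directory has a config.cfg, and since that file is INI-like, a rough diff can be done with the standard library alone. The sketch below is an assumption-laden illustration, not how spaCy itself parses configs (it ignores variable interpolation and compares raw string values); flatten_config and diff_configs are hypothetical helper names.

```python
from configparser import ConfigParser


def flatten_config(text):
    """Parse config.cfg-style text into a flat {"section.key": raw_value} dict."""
    # No interpolation, so values like ${paths.train} are kept as plain strings
    parser = ConfigParser(interpolation=None, delimiters=("=",))
    parser.optionxform = str  # keep keys case-sensitive, e.g. "nO"
    parser.read_string(text)
    return {
        f"{section}.{key}": value
        for section in parser.sections()
        for key, value in parser.items(section)
    }


def diff_configs(text_a, text_b):
    """Return {key: (value_in_a, value_in_b)} for every setting that differs."""
    a, b = flatten_config(text_a), flatten_config(text_b)
    return {
        key: (a.get(key), b.get(key))
        for key in sorted(a.keys() | b.keys())
        if a.get(key) != b.get(key)
    }
```

To use it, read the two files (for example ner_75_25/model-last/config.cfg and the config.cfg of the base model) with Path(...).read_text() and pass both strings to diff_configs. A key that appears on only one side shows up with None for the other.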

EDIT2:
I just tried running the ner.correct recipe with the same model on my GPU server. I have not changed any packages on the GPU server, yet I am getting the same mismatched structure error. Why is this happening?

EDIT3:
Just retrained the model and tried running it on ner.correct. same error:

raise ValueError("Cannot deserialize model: mismatched structure")

EDIT4:
Here is how I trained my model:

python -m prodigy train ner_75_25/ --ner ct_images_75_25_subset --base-model en_core_web_trf --gpu-id 0

Once the model finished training I ran this:

python -m prodigy ner.correct ct_images_75_25_subset ner_75_25/model-last nodule_text_dataset_75_25.txt --label ATTENUATION,CALCIFICATION,EDGE,LATERALITY,LOBE,NODULE,QUANTITY,TIMING,SIZE

but this gives an error:

ValueError: Cannot deserialize model: mismatched structure

EDIT5:
here is my spacy-transformers package:

Name: spacy-transformers
Version: 1.1.5

I tried this solution found here

here is what I did:

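import spacy

# Load the trained pipeline without its tagger and parser (exclude takes a list of names)
nlp = spacy.load("ner_75_25/model-best", exclude=["tagger", "parser"])
# Source the tagger and parser back in from the original base model
nlp_orig = spacy.load("en_core_web_trf")
nlp.add_pipe("parser", source=nlp_orig, after="transformer")
nlp.add_pipe("tagger", source=nlp_orig, after="parser")
nlp.to_disk("ner_75_25/model-best-fixed")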

and it worked!

All in all, I had to make sure that the Prodigy, spaCy, and spacy-transformers versions were all correct on both machines. This also required getting both machines onto Python 3.9.
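For anyone checking the same thing, the installed package versions can be listed on each machine and compared side by side. A minimal sketch using only the standard library; the package list is an assumption based on the packages mentioned in this thread (plus thinc, which appears in the traceback), and installed_versions is a hypothetical helper name.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages=("spacy", "prodigy", "spacy-transformers", "thinc")):
    """Map each package name to its installed version, or None if not installed."""
    result = {}
    for name in packages:
        try:
            result[name] = version(name)
        except PackageNotFoundError:
            # Package is missing from this environment
            result[name] = None
    return result


if __name__ == "__main__":
    for name, ver in installed_versions().items():
        print(f"{name}: {ver}")
```

Running this on both the training machine and the annotation server and comparing the output line by line makes a version drift easy to spot.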
