Hi, I trained an en_core_web_trf model on a GPU using prodigy train. Training was done on a separate GPU machine with spaCy 3.2.3 and Prodigy 1.11.7, while the annotation server has Prodigy 1.11.6 and spaCy 3.2.0 (required by Prodigy?). On the annotation server I get the following error:
Annotation bash[1025383]: /usr/local/lib/python3.8/dist-packages/spacy/util.py:833: UserWarning: [W095] Model 'en_pipeline' (0.0.0) was trained with spaCy v3.2 and may not be 100% compatible with the current version (3.2.0). If you see errors or degraded performance, download a newer compatible model or retrain your custom model with the current spaCy version. For more details and available updates, run: python -m spacy validate
Mar 29 13:47:24 Annotation bash[1025383]: warnings.warn(warn_msg)
Mar 29 13:47:24 Annotation bash[1025383]: Traceback (most recent call last):
Mar 29 13:47:24 Annotation bash[1025383]: File "/usr/lib/python3.8/runpy.py", line 194, in _run_module_as_main
Mar 29 13:47:24 Annotation bash[1025383]: return _run_code(code, main_globals, None,
Mar 29 13:47:24 Annotation bash[1025383]: File "/usr/lib/python3.8/runpy.py", line 87, in _run_code
Mar 29 13:47:24 Annotation bash[1025383]: exec(code, run_globals)
Mar 29 13:47:24 Annotation bash[1025383]: File "/usr/local/lib/python3.8/dist-packages/prodigy/__main__.py", line 61, in <module>
Mar 29 13:47:24 Annotation bash[1025383]: controller = recipe(*args, use_plac=True)
Mar 29 13:47:24 Annotation bash[1025383]: File "cython_src/prodigy/core.pyx", line 329, in prodigy.core.recipe.recipe_decorator.recipe_proxy
Mar 29 13:47:24 Annotation bash[1025383]: File "/usr/local/lib/python3.8/dist-packages/plac_core.py", line 367, in call
Mar 29 13:47:24 Annotation bash[1025383]: cmd, result = parser.consume(arglist)
Mar 29 13:47:24 Annotation bash[1025383]: File "/usr/local/lib/python3.8/dist-packages/plac_core.py", line 232, in consume
Mar 29 13:47:24 Annotation bash[1025383]: return cmd, self.func(*(args + varargs + extraopts), **kwargs)
Mar 29 13:47:24 Annotation bash[1025383]: File "/usr/local/lib/python3.8/dist-packages/prodigy/recipes/ner.py", line 215, in correct
Mar 29 13:47:24 Annotation bash[1025383]: nlp = spacy.load(spacy_model)
Mar 29 13:47:24 Annotation bash[1025383]: File "/usr/local/lib/python3.8/dist-packages/spacy/__init__.py", line 51, in load
Mar 29 13:47:24 Annotation bash[1025383]: return util.load_model(
Mar 29 13:47:24 Annotation bash[1025383]: File "/usr/local/lib/python3.8/dist-packages/spacy/util.py", line 422, in load_model
Mar 29 13:47:24 Annotation bash[1025383]: return load_model_from_path(Path(name), **kwargs) # type: ignore[arg-type]
Mar 29 13:47:24 Annotation bash[1025383]: File "/usr/local/lib/python3.8/dist-packages/spacy/util.py", line 488, in load_model_from_path
Mar 29 13:47:24 Annotation bash[1025383]: nlp = load_model_from_config(config, vocab=vocab, disable=disable, exclude=exclude)
Mar 29 13:47:24 Annotation bash[1025383]: File "/usr/local/lib/python3.8/dist-packages/spacy/util.py", line 525, in load_model_from_config
Mar 29 13:47:24 Annotation bash[1025383]: nlp = lang_cls.from_config(
Mar 29 13:47:24 Annotation bash[1025383]: File "/usr/local/lib/python3.8/dist-packages/spacy/language.py", line 1785, in from_config
Mar 29 13:47:24 Annotation bash[1025383]: nlp.add_pipe(
Mar 29 13:47:24 Annotation bash[1025383]: File "/usr/local/lib/python3.8/dist-packages/spacy/language.py", line 788, in add_pipe
Mar 29 13:47:24 Annotation bash[1025383]: pipe_component = self.create_pipe(
Mar 29 13:47:24 Annotation bash[1025383]: File "/usr/local/lib/python3.8/dist-packages/spacy/language.py", line 671, in create_pipe
Mar 29 13:47:24 Annotation bash[1025383]: resolved = registry.resolve(cfg, validate=validate)
Mar 29 13:47:24 Annotation bash[1025383]: File "/usr/local/lib/python3.8/dist-packages/thinc/config.py", line 729, in resolve
Mar 29 13:47:24 Annotation bash[1025383]: resolved, _ = cls._make(
Mar 29 13:47:24 Annotation bash[1025383]: File "/usr/local/lib/python3.8/dist-packages/thinc/config.py", line 778, in _make
Mar 29 13:47:24 Annotation bash[1025383]: filled, _, resolved = cls._fill(
Mar 29 13:47:24 Annotation bash[1025383]: File "/usr/local/lib/python3.8/dist-packages/thinc/config.py", line 833, in _fill
Mar 29 13:47:24 Annotation bash[1025383]: filled[key], validation[v_key], final[key] = cls._fill(
Mar 29 13:47:24 Annotation bash[1025383]: File "/usr/local/lib/python3.8/dist-packages/thinc/config.py", line 899, in _fill
Mar 29 13:47:24 Annotation bash[1025383]: raise ConfigValidationError(
Mar 29 13:47:24 Annotation bash[1025383]: thinc.config.ConfigValidationError:
Mar 29 13:47:24 Annotation bash[1025383]: Config validation error
Mar 29 13:47:24 Annotation bash[1025383]: tagger -> neg_prefix extra fields not permitted
Mar 29 13:47:24 Annotation bash[1025383]: {'nlp': <spacy.lang.en.English object at 0x7f945fb614c0>, 'name': 'tagger', 'model': {'@architectures': 'spacy.Tagger.v1', 'nO': None, 'tok2vec': {'@architectures': 'spacy-transformers.Tok2VecTransformer.v3', 'name': 'roberta-base', 'mixed_precision': False, 'pooling': {'@layers': 'reduce_mean.v1'}, 'grad_factor': 1.0, 'get_spans': {'@span_getters': 'spacy-transformers.strided_spans.v1', 'window': 128, 'stride': 96}, 'grad_scaler_config': {}, 'tokenizer_config': {'use_fast': True}, 'transformer_config': {}}}, 'neg_prefix': '!', 'overwrite': False, 'scorer': {'@scorers': 'spacy.tagger_scorer.v1'}, '@factories': 'tagger'}
Training went fine, and I have done this before: I trained a model on my GPU machine and transferred it over. What can I do to resolve this?
I am trying to use the Prodigy ner.correct recipe.
EDIT:
OK, I upgraded Python to 3.9 and installed Prodigy 1.11.7 and spaCy 3.2.3 on my annotation server. When I try to start the ner.correct recipe again I get: ValueError: Cannot deserialize model: mismatched structure
The model was trained with the same versions, Prodigy 1.11.7 and spaCy 3.2.3. Any idea why this is happening? Maybe it is a config mismatch, but then how can I compare the configs?
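One way to compare the configs is to diff the config.cfg files stored in the two pipeline directories. Below is a minimal sketch, assuming both files are plain INI-style text that the standard-library configparser can read. The helper names parse_cfg and diff_cfg are my own, and the two config fragments are made up for illustration; they are not taken from my real pipelines.

```python
import configparser


def parse_cfg(text):
    """Flatten config.cfg text into {"section.key": raw_value}."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep key case, e.g. "nO"
    parser.read_string(text)
    return {
        f"{section}.{key}": value
        for section in parser.sections()
        for key, value in parser[section].items()
    }


def diff_cfg(old, new):
    """Return {key: (old_value, new_value)} for every key that differs."""
    return {
        key: (old.get(key), new.get(key))
        for key in sorted(set(old) | set(new))
        if old.get(key) != new.get(key)
    }


# Hypothetical fragments standing in for two config.cfg files.
# For real pipelines, read the text from disk instead, e.g.
# Path("ner_75_25/model-last/config.cfg").read_text()
trained = """
[components.tagger]
factory = "tagger"
neg_prefix = "!"
overwrite = false
"""
server = """
[components.tagger]
factory = "tagger"
overwrite = false
"""

for key, (a, b) in diff_cfg(parse_cfg(trained), parse_cfg(server)).items():
    print(f"{key}: {a!r} -> {b!r}")
```

Values are compared as raw strings, so this only shows which settings differ between the two files. It does not validate either config against the installed spaCy version.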
EDIT2:
I just tried running the ner.correct recipe with the same model on my GPU server. I have not changed any packages on the GPU server, yet I get the same mismatched structure error. Why is this happening?
EDIT3:
I just retrained the model and tried running it with ner.correct. Same error:
raise ValueError("Cannot deserialize model: mismatched structure")
EDIT4:
Here is how I trained my model:
python -m prodigy train ner_75_25/ --ner ct_images_75_25_subset --base-model en_core_web_trf --gpu-id 0
Once the model finished training I ran this:
python -m prodigy ner.correct ct_images_75_25_subset ner_75_25/model-last nodule_text_dataset_75_25.txt --label ATTENUATION,CALCIFICATION,EDGE,LATERALITY,LOBE,NODULE,QUANTITY,TIMING,SIZE
But this gives an error:
ValueError: Cannot deserialize model: mismatched structure
EDIT5:
Here is my spacy-transformers package:
Name: spacy-transformers
Version: 1.1.5
I tried the solution found here.
Here is what I did:
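import spacy

# Load the trained pipeline, skipping its tagger and parser.
nlp = spacy.load("ner_75_25/model-best", exclude=["tagger", "parser"])
# Source the original tagger and parser from the base pipeline instead.
nlp_orig = spacy.load("en_core_web_trf")
nlp.add_pipe("parser", source=nlp_orig, after="transformer")
nlp.add_pipe("tagger", source=nlp_orig, after="parser")
nlp.to_disk("ner_75_25/model-best-fixed")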
And it worked!
All in all, I had to make sure that the Prodigy, spaCy, and spacy-transformers versions all matched on both machines. This also required me to get both machines to Python 3.9.
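To check that both machines really have the same versions, something like the following can be run on each one and the output compared. This is only a sketch using the standard-library importlib.metadata. The helper name installed_versions is my own, and including thinc in the list is my assumption, since it appears in the traceback.

```python
from importlib import metadata


def installed_versions(packages=("spacy", "spacy-transformers", "prodigy", "thinc")):
    """Map each package name to its installed version, or None if missing."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions().items():
        print(f"{name}=={version}" if version else f"{name}: not installed")
```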