I am using Ubuntu Linux version 20.14, Python 3.9, Prodigy 1.11.6, and spaCy 3.2.0. I also downloaded the latest version of en_core_web_trf.
I created my model with Prodigy using the following command:
prodigy train ./output -tc verbatim_claims -es .20 --base-model en_core_web_trf --label-stats --verbose --gpu-id 0
Training runs well on my Nvidia RTX-3090 and the final output of the training run is:
759 19000 0.00 5.89 99.07 4894.11 0.99
800 20000 0.00 6.24 99.07 4889.38 0.99
✔ Saved pipeline to output directory
output/model-last
=========================== Textcat F (per label) ===========================
               P       R       F
CLAIM      98.78   99.79   99.28
NO_CLAIM   99.74   98.44   99.08
======================== Textcat ROC AUC (per label) ========================
           ROC AUC
CLAIM         1.00
NO_CLAIM      1.00
I attempt to load the model using:
nlp = spacy.load(name='./output/model-best')
It throws the following exception:
Traceback (most recent call last):
  File "/snap/pycharm-professional/260/plugins/python/helpers/pydev/pydevd.py", line 1483, in _exec
    pydev_imports.execfile(file, globals, locals) # execute the script
  File "/snap/pycharm-professional/260/plugins/python/helpers/pydev/_pydev_imps/_pydev_execfile.py", line 18, in execfile
    exec(compile(contents+"\n", file, 'exec'), glob, loc)
  File "/.../ClaimsModel/main.py", line 161, in <module>
    nlp_claims = spacy.load(name="./verbatim_claims/output/model-last")
  File "/.../ClaimsModel/venv/lib/python3.9/site-packages/spacy/__init__.py", line 51, in load
    return util.load_model(
  File "/.../ClaimsModel/venv/lib/python3.9/site-packages/spacy/util.py", line 422, in load_model
    return load_model_from_path(Path(name), **kwargs) # type: ignore[arg-type]
  File "/.../ClaimsModel/venv/lib/python3.9/site-packages/spacy/util.py", line 489, in load_model_from_path
    return nlp.from_disk(model_path, exclude=exclude, overrides=overrides)
  File "/.../ClaimsModel/venv/lib/python3.9/site-packages/spacy/language.py", line 2043, in from_disk
    util.from_disk(path, deserializers, exclude) # type: ignore[arg-type]
  File "/.../ClaimsModel/venv/lib/python3.9/site-packages/spacy/util.py", line 1300, in from_disk
    reader(path / key)
  File "/.../ClaimsModel/venv/lib/python3.9/site-packages/spacy/language.py", line 2037, in <lambda>
    deserializers[name] = lambda p, proc=proc: proc.from_disk( # type: ignore[misc]
  File "spacy/pipeline/transition_parser.pyx", line 595, in spacy.pipeline.transition_parser.Parser.from_disk
  File "/.../ClaimsModel/venv/lib/python3.9/site-packages/thinc/model.py", line 593, in from_bytes
    return self.from_dict(msg)
  File "/.../ClaimsModel/venv/lib/python3.9/site-packages/thinc/model.py", line 610, in from_dict
    raise ValueError("Cannot deserialize model: mismatched structure")
ValueError: Cannot deserialize model: mismatched structure
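The traceback fails inside spacy.pipeline.transition_parser.Parser.from_disk, the class behind spaCy's parser and ner components, so the error may come from a component carried over from the base model rather than from the textcat I trained. Below is a minimal diagnostic sketch for checking what the saved pipeline actually contains. It uses only the standard library and assumes the output directory holds the usual config.cfg and meta.json files, with a "spacy_version" key in meta.json and "source" or "factory" keys in each [components.*] section. The function name summarize_pipeline is hypothetical and not part of spaCy or Prodigy.

```python
import configparser
import json
from pathlib import Path


def summarize_pipeline(model_dir):
    """Report the recorded spaCy version and the origin of each pipeline component.

    Hypothetical helper: assumes model_dir contains meta.json and config.cfg
    in the layout spaCy normally writes when saving a pipeline.
    """
    model_path = Path(model_dir)

    # meta.json is assumed to record the spaCy version range under "spacy_version".
    meta = json.loads((model_path / "meta.json").read_text(encoding="utf8"))

    # config.cfg is INI-like; interpolation is disabled so "${...}" values stay verbatim.
    cfg = configparser.ConfigParser(interpolation=None, delimiters=("=",))
    cfg.optionxform = str  # keep keys such as "@architectures" case-sensitive
    cfg.read(model_path / "config.cfg", encoding="utf8")

    # Values are JSON-encoded, e.g. pipeline = ["transformer","textcat"]
    pipeline = json.loads(cfg["nlp"]["pipeline"])

    components = {}
    for name in pipeline:
        section = f"components.{name}"
        if not cfg.has_section(section):
            components[name] = "missing from config"
            continue
        block = cfg[section]
        # Report whichever is recorded: a sourced pipeline name or a factory name.
        origin = block.get("source") or block.get("factory") or "unknown"
        components[name] = origin.strip('"')

    return {"spacy_version": meta.get("spacy_version"), "components": components}


if __name__ == "__main__":
    # Path taken from the training output above; adjust as needed.
    print(summarize_pipeline("./output/model-last"))
```

If parser or ner shows up in the component list, loading with spacy.load("./output/model-last", exclude=["parser", "ner"]) might be a way to confirm whether one of those components triggers the error. I have not verified that this resolves it.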
I found some reports of the same problem, but judging from the messages, they appeared to have been resolved.
Any guidance would be greatly appreciated.
Thanks,
Michael Wade