Hi,
I'm running into an issue when I try to load custom FastText vectors into my model. These are the steps (a rough code sketch follows the list):

- Load the base `en_core_web_sm` model.
- Use `spacy.cli.init_model.add_vectors` to load FastText vectors (stored as `.vec.gz`) into the model.
- Disable all pipelines except NER (the tagger & parser in particular).
- Train NER.
- End training and restore the pipelines.
- Get a final evaluation score with `nlp.evaluate()`.
- Save to disk with `nlp.to_disk()`.
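
Roughly, the script looks like this. This is a simplified sketch rather than my exact code: the paths, the training data, and the exact `add_vectors` arguments are placeholders, and I'm calling the 2.2.x API as I understand it:

```python
import spacy
from spacy.cli.init_model import add_vectors
from spacy.util import minibatch

# Placeholder training/dev data in the usual (text, annotations) format
TRAIN_DATA = [("Acme Corp hired Jane Doe.", {"entities": [(0, 9, "ORG"), (16, 24, "PERSON")]})]
DEV_DATA = TRAIN_DATA

nlp = spacy.load("en_core_web_sm")

# Add the custom FastText vectors to the vocab
# (prune_vectors=-1 keeps all rows; the name is what later shows up in meta.json)
add_vectors(nlp, "fasttext_vectors.vec.gz", prune_vectors=-1, name="en_model.vectors")

# Train only NER; the other pipes are restored when the `with` block exits
other_pipes = [p for p in nlp.pipe_names if p != "ner"]
with nlp.disable_pipes(*other_pipes):
    optimizer = nlp.begin_training()
    for epoch in range(10):
        losses = {}
        for batch in minibatch(TRAIN_DATA, size=32):
            texts, annotations = zip(*batch)
            nlp.update(texts, annotations, sgd=optimizer, losses=losses, drop=0.2)

# Final evaluation with all pipes enabled, then save
scorer = nlp.evaluate(DEV_DATA)
nlp.to_disk("/path/to/output_model")

# Later, in a fresh process -- this is the call that blows up:
nlp2 = spacy.load("/path/to/output_model")
```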
This all works fine. However, later, when I try to re-load the model from disk, I get this error:
```
Traceback (most recent call last):
  File "nn_parser.pyx", line 671, in spacy.syntax.nn_parser.Parser.from_disk
  File "/opt/venv/lib/python3.7/site-packages/thinc/neural/_classes/model.py", line 375, in from_bytes
    dest = getattr(layer, name)
AttributeError: 'FunctionLayer' object has no attribute 'vectors'
...
  File "/opt/venv/lib/python3.7/site-packages/spacy/language.py", line 936, in <lambda>
    p, exclude=["vocab"]
  File "nn_parser.pyx", line 673, in spacy.syntax.nn_parser.Parser.from_disk
ValueError: [E149] Error deserializing model. Check that the config used to create the component matches the model being loaded.
```
I'm guessing that loading the parser/tagger pipelines is causing this, because somehow they expect vectors to exist where they don't. My `meta.json` file includes the following vector info:

```
"vectors": {
    "width": 300,
    "vectors": 766082,
    "keys": 766082,
    "name": "en_model.vectors"
},
```
While my parser's `cfg` has the following:

```
{
    "beam_width": 1,
    "beam_density": 0.0,
    "beam_update_prob": 1.0,
    "cnn_maxout_pieces": 3,
    "nr_feature_tokens": 8,
    "deprecation_fixes": {
        "vectors_name": null
    },
    "learn_tokens": false,
    "nr_class": 107,
    "hidden_depth": 1,
    "token_vector_width": 96,
    "hidden_width": 64,
    "maxout_pieces": 2,
    "pretrained_vectors": null,
    "bilstm_depth": 0,
    "self_attn_depth": 0,
    "conv_depth": 4,
    "conv_window": 4,
    "embed_size": 2000
}
```
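
In case it's useful, this is roughly how I compared the two at runtime, continuing from the sketch above (attribute names are my understanding of the 2.2.x API, so treat this as a sketch too):

```python
# `nlp` is the pipeline from the sketch above, after add_vectors() and training

# The vectors that were added to the vocab
print(nlp.vocab.vectors.name)   # -> "en_model.vectors"
print(nlp.vocab.vectors.shape)  # -> (766082, 300)

# Each trainable component keeps its own config dict in `.cfg`
for name, pipe in nlp.pipeline:
    if hasattr(pipe, "cfg"):
        print(name, pipe.cfg.get("pretrained_vectors"))  # parser prints None, matching the cfg above
```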
I'm using spaCy version `2.2.3`. The whole process (load, train, save, load) does work if I do not add any vectors, so it's not a version mismatch issue.
Any idea what's causing this?