Mismatched structure when using a transformer model to train textcat (en_core_web_trf)

I am using Ubuntu Linux 20.14, Python 3.9, Prodigy 1.11.6, and spaCy 3.2.0. I also downloaded the latest version of en_core_web_trf.

I created my model using Prodigy with the following command:

prodigy train ./output -tc verbatim_claims -es .20 --base-model en_core_web_trf --label-stats --verbose --gpu-id 0

Training runs well on my Nvidia RTX 3090, and the final output of the training run is:

759   19000           0.00          5.89       99.07  4894.11    0.99
800   20000           0.00          6.24       99.07  4889.38    0.99
✔ Saved pipeline to output directory
output/model-last

=========================== Textcat F (per label) ===========================

               P       R       F
CLAIM      98.78   99.79   99.28
NO_CLAIM   99.74   98.44   99.08


======================== Textcat ROC AUC (per label) ========================

           ROC AUC
CLAIM         1.00
NO_CLAIM      1.00

I attempt to load the model using:

nlp = spacy.load(name='./output/model-best')

It throws the following exception:

Traceback (most recent call last):
  File "/snap/pycharm-professional/260/plugins/python/helpers/pydev/pydevd.py", line 1483, in _exec
    pydev_imports.execfile(file, globals, locals)  # execute the script
  File "/snap/pycharm-professional/260/plugins/python/helpers/pydev/_pydev_imps/_pydev_execfile.py", line 18, in execfile
    exec(compile(contents+"\n", file, 'exec'), glob, loc)
  File "/.../ClaimsModel/main.py", line 161, in <module>
    nlp_claims = spacy.load(name="./verbatim_claims/output/model-last")
  File "/.../ClaimsModel/venv/lib/python3.9/site-packages/spacy/__init__.py", line 51, in load
    return util.load_model(
  File "/.../ClaimsModel/venv/lib/python3.9/site-packages/spacy/util.py", line 422, in load_model
    return load_model_from_path(Path(name), **kwargs)  # type: ignore[arg-type]
  File "/.../ClaimsModel/venv/lib/python3.9/site-packages/spacy/util.py", line 489, in load_model_from_path
    return nlp.from_disk(model_path, exclude=exclude, overrides=overrides)
  File "/.../ClaimsModel/venv/lib/python3.9/site-packages/spacy/language.py", line 2043, in from_disk
    util.from_disk(path, deserializers, exclude)  # type: ignore[arg-type]
  File "/.../ClaimsModel/venv/lib/python3.9/site-packages/spacy/util.py", line 1300, in from_disk
    reader(path / key)
  File "/.../ClaimsModel/venv/lib/python3.9/site-packages/spacy/language.py", line 2037, in <lambda>
    deserializers[name] = lambda p, proc=proc: proc.from_disk(  # type: ignore[misc]
  File "spacy/pipeline/transition_parser.pyx", line 595, in spacy.pipeline.transition_parser.Parser.from_disk
  File "/.../ClaimsModel/venv/lib/python3.9/site-packages/thinc/model.py", line 593, in from_bytes
    return self.from_dict(msg)
  File "/.../ClaimsModel/venv/lib/python3.9/site-packages/thinc/model.py", line 610, in from_dict
    raise ValueError("Cannot deserialize model: mismatched structure")
ValueError: Cannot deserialize model: mismatched structure

I found some reports of this same problem, but from the messages it appeared they had been resolved.

Any guidance would be greatly appreciated.

Thanks,

Michael Wade

Hi! Sorry you've been running into issues with this.

Which version of spacy-transformers do you have installed?
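If you're not sure, something like this should print the installed versions:

pip show spacy spacy-transformers
python -m spacy info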

Hi,

I am using spacy-transformers 1.1.2 and spaCy 3.2.0. I also have PyTorch installed with cu113 support (torch 1.10.0+cu113).

FYI, I get a good model if I use en_core_web_lg as the base model, and I can categorize with that model without issue.
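For reference, that was essentially the same command with just the base model swapped, i.e. something like:

prodigy train ./output -tc verbatim_claims -es .20 --base-model en_core_web_lg --label-stats --verbose --gpu-id 0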

Thanks,

Michael Wade

I was able to resolve the training issue by re-installing spacy-transformers directly from GitHub as follows:

pip install git+https://github.com/explosion/spacy-transformers

I built the model directly using spaCy (not Prodigy) and I was not able to successfully load the model and use it.

Hi! Happy to hear the issue got resolved by reinstalling.

I built the model directly using spaCy (not Prodigy) and I was not able to successfully load the model and use it.

Can you elaborate a bit further? Was the model you're trying to load built with the same environment? If not, can you summarize the versions (of spacy & spacy-transformers) you used to build it, and the version you're trying to load it with? What is the error that's being given?

Hi @Wade, thanks for your message. I have a question: how did you add metrics like P, R, F, and ROC AUC to your output? Do I need to modify the train recipe? Sorry, I am new to Prodigy. Thanks!

Went on vacation after this happened and forgot to check back for messages! I was just using Prodigy with the --label-stats option to do the training. I did redo all the training, but switched over to using spacy train instead. That doesn't output the same information, but I believe you can use the scorer to output this kind of information if you want.
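For example, if you have a hold-out set saved as a .spacy file, something like this should write out the scores (paths here are placeholders):

python -m spacy evaluate ./output/model-best ./dev.spacy --output metrics.json

The metrics.json it writes includes the per-label textcat precision/recall/F scores (e.g. under cats_f_per_type).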

Hello! I'm experiencing the same issue of not being able to load a trf-trained textcat model, but it was not resolved by reinstalling spacy-transformers via pip.

I created a transformer-based textcat model using the following command:

python -m prodigy train taking_drug_trf_model --textcat-multilabel taking_drug --eval-split .2 --base-model en_core_web_trf --label-stats --verbose 

The training completed with good results, but when I load the model to categorize new data I receive an error:

import spacy
nlp = spacy.load(r'path\taking_drug_trf_model\model-best')

Here is the error msg:

Exception has occurred: ValueError
Cannot deserialize model: mismatched structure

I tried this code in two environments to make sure the problem wasn't an out-of-date package; neither worked. Here is my pip freeze:

blis==0.7.9
catalogue==2.0.8
certifi==2022.9.24
charset-normalizer==2.1.1
click==8.1.3
colorama==0.4.6
confection==0.0.3
cymem==2.0.7
en-core-web-trf @ https://github.com/explosion/spacy-models/releases/download/en_core_web_trf-3.4.1/en_core_web_trf-3.4.1-py3-none-any.whl
filelock==3.8.0
huggingface-hub==0.11.1
idna==3.4
Jinja2==3.1.2
langcodes==3.3.0
MarkupSafe==2.1.1
murmurhash==1.0.9
numpy==1.23.5
packaging==21.3
pandas==1.5.2
pathy==0.10.0
preshed==3.0.8
pydantic==1.10.2
pyparsing==3.0.9
python-dateutil==2.8.2
pytz==2022.6
PyYAML==6.0
regex==2022.10.31
requests==2.28.1
sentencepiece==0.1.97
six==1.16.0
smart-open==5.2.1
spacy==3.4.3
spacy-alignments==0.8.6
spacy-legacy==3.0.10
spacy-loggers==1.0.3
spacy-transformers==1.1.8
srsly==2.4.5
thinc==8.1.5
tokenizers==0.12.1
torch==1.13.0
tqdm==4.64.1
transformers==4.21.3
typer==0.7.0
typing-extensions==4.4.0
Unidecode==1.3.6
urllib3==1.26.13
wasabi==0.10.1

Any help is most appreciated!

hi @clark!

Thanks for your question and welcome to the Prodigy community :wave:

This issue from the spaCy GitHub discussions looks very similar, as it compares spacy train with prodigy train and the effect of reinstalling spacy-transformers. Hopefully it gives you more direction and ideas for how to debug. For more spaCy-specific problems like model training/configuration, you'll likely find more help on the spaCy GitHub discussions too.

As you'll see in that issue, as you begin doing more advanced modeling like transformers, you'll likely want to move away from prodigy train in favor of spacy train and a customized config file. prodigy train is simply a wrapper around spacy train that creates a default config file when run. In Prodigy, you can convert your data to spaCy binary files using data-to-spacy (see the docs for more details, and the sketch below). An additional benefit is that data-to-spacy also creates separate training and dedicated hold-out evaluation sets, which is better than --eval-split, which creates a new hold-out set every time.
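For example, a rough sketch using the dataset and label names from your command above:

# Export the annotations to spaCy's binary format; this also writes a config.cfg
# plus dedicated train.spacy and dev.spacy files into ./corpus
python -m prodigy data-to-spacy ./corpus --textcat-multilabel taking_drug --eval-split 0.2
# Train directly with spaCy using the exported files
python -m spacy train ./corpus/config.cfg --output ./output --paths.train ./corpus/train.spacy --paths.dev ./corpus/dev.spacy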


It's wonderful to be here! These forums are so helpful. I've always been able to find the answer here or on y'all's GitHub (but alas the transformers broke me). Thank you so much for your quick response and helpful links! I haven't read that GitHub discussion yet so I'm excited to see if it helps. Appreciate you!

Hello,

Experiencing the same issue with spaCy 3.5.1 and transformers. The training config was generated by Prodigy 1.11.11. I get the error whether I try to use this model with Prodigy's ner.correct or just load it into spaCy with spacy.load(). This is on the same machine the model was trained on, with Python 3.7.12 (the version it was trained in).
I also tried loading the model in Python 3.10.10.

freeze:

blis==0.7.9
catalogue==2.0.8
certifi==2022.12.7
charset-normalizer==3.1.0
click==8.1.3
confection==0.0.4
cupy-cuda113==10.6.0
cupy-cuda116==10.6.0
cymem==2.0.7
fastrlock==0.8.1
filelock==3.9.0
fr-core-news-sm @ https://github.com/explosion/spacy-models/releases/download/fr_core_news_sm-3.5.0/fr_core_news_sm-3.5.0-py3-none-any.whl
fr-dep-news-trf @ https://github.com/explosion/spacy-models/releases/download/fr_dep_news_trf-3.5.0/fr_dep_news_trf-3.5.0-py3-none-any.whl
huggingface-hub==0.13.1
idna==3.4
importlib-metadata==6.0.0
Jinja2==3.1.2
langcodes==3.3.0
MarkupSafe==2.1.2
murmurhash==1.0.9
numpy==1.21.6
nvidia-cublas-cu11==11.10.3.66
nvidia-cuda-nvrtc-cu11==11.7.99
nvidia-cuda-runtime-cu11==11.7.99
nvidia-cudnn-cu11==8.5.0.96
packaging==23.0
pathy==0.10.1
preshed==3.0.8
protobuf==3.20.3
pydantic==1.10.6
PyYAML==6.0
regex==2022.10.31
requests==2.28.2
sentencepiece==0.1.97
smart-open==6.3.0
spacy==3.5.1
spacy-alignments==0.9.0
spacy-legacy==3.0.12
spacy-loggers==1.0.4
spacy-transformers==1.2.2
srsly==2.4.6
thinc==8.1.9
tokenizers==0.13.2
torch==1.13.1
tqdm==4.65.0
transformers==4.26.1
typer==0.7.0
typing_extensions==4.4.0
urllib3==1.26.15
wasabi==1.1.1
zipp==3.15.0

config:

[paths]
train = null
dev = null
vectors = null
init_tok2vec = null

[system]
gpu_allocator = null
seed = 0

[nlp]
lang = "fr"
pipeline = ["tok2vec","transformer","morphologizer","parser","attribute_ruler","lemmatizer","ner"]
disabled = []
before_creation = null
after_creation = null
after_pipeline_creation = null
batch_size = 64
tokenizer = {"@tokenizers":"spacy.Tokenizer.v1"}

[components]

[components.attribute_ruler]
source = "fr_dep_news_trf"

[components.lemmatizer]
source = "fr_dep_news_trf"

[components.morphologizer]
source = "fr_dep_news_trf"
replace_listeners = ["model.tok2vec"]

[components.ner]
factory = "ner"
incorrect_spans_key = "incorrect_spans"
moves = null
scorer = {"@scorers":"spacy.ner_scorer.v1"}
update_with_oracle_cut_size = 100

[components.ner.model]
@architectures = "spacy.TransitionBasedParser.v2"
state_type = "ner"
extra_state_tokens = false
hidden_width = 64
maxout_pieces = 2
use_upper = true
nO = null

[components.ner.model.tok2vec]
@architectures = "spacy.Tok2VecListener.v1"
width = ${components.tok2vec.model.encode.width}
upstream = "*"

[components.parser]
source = "fr_dep_news_trf"
replace_listeners = ["model.tok2vec"]

[components.tok2vec]
factory = "tok2vec"

[components.tok2vec.model]
@architectures = "spacy.Tok2Vec.v2"

[components.tok2vec.model.embed]
@architectures = "spacy.MultiHashEmbed.v2"
width = ${components.tok2vec.model.encode.width}
attrs = ["NORM","PREFIX","SUFFIX","SHAPE"]
rows = [5000,1000,2500,2500]
include_static_vectors = false

[components.tok2vec.model.encode]
@architectures = "spacy.MaxoutWindowEncoder.v2"
width = 96
depth = 4
window_size = 1
maxout_pieces = 3

[components.transformer]
source = "fr_dep_news_trf"

[corpora]

[corpora.dev]
@readers = "spacy.Corpus.v1"
path = ${paths.dev}
max_length = 0
gold_preproc = false
limit = 0
augmenter = null

[corpora.train]
@readers = "spacy.Corpus.v1"
path = ${paths.train}
max_length = 0
gold_preproc = false
limit = 0
augmenter = null

[training]
train_corpus = "corpora.train"
dev_corpus = "corpora.dev"
seed = ${system:seed}
gpu_allocator = ${system:gpu_allocator}
dropout = 0.1
accumulate_gradient = 3
patience = 5000
max_epochs = 0
max_steps = 20000
eval_frequency = 1000
frozen_components = ["morphologizer","parser","attribute_ruler","lemmatizer"]
before_to_disk = null
annotating_components = []
before_update = null

[training.batcher]
@batchers = "spacy.batch_by_padded.v1"
discard_oversize = true
get_length = null
size = 2000
buffer = 256

[training.logger]
@loggers = "spacy.ConsoleLogger.v1"
progress_bar = false

[training.optimizer]
@optimizers = "Adam.v1"
beta1 = 0.9
beta2 = 0.999
L2_is_weight_decay = true
L2 = 0.01
grad_clip = 1.0
use_averages = true
eps = 0.00000001

[training.optimizer.learn_rate]
@schedules = "warmup_linear.v1"
warmup_steps = 250
total_steps = 20000
initial_rate = 0.00005

[training.score_weights]
pos_acc = null
morph_acc = null
morph_per_feat = null
dep_uas = null
dep_las = null
dep_las_per_type = null
sents_p = null
sents_r = null
sents_f = null
lemma_acc = null
ents_f = 1.0
ents_p = 0.0
ents_r = 0.0
ents_per_type = null
speed = 0.0

[pretraining]

[initialize]
vectors = ${paths.vectors}
init_tok2vec = ${paths.init_tok2vec}
vocab_data = null
lookups = null
before_init = null
after_init = null

[initialize.components]

[initialize.components.morphologizer]

[initialize.components.morphologizer.labels]
@readers = "spacy.read_labels.v1"
path = "spacy_training/labels/morphologizer.json"
require = false

[initialize.components.ner]

[initialize.components.ner.labels]
@readers = "spacy.read_labels.v1"
path = "spacy_training/labels/ner.json"
require = false

[initialize.components.parser]

[initialize.components.parser.labels]
@readers = "spacy.read_labels.v1"
path = "spacy_training/labels/parser.json"
require = false

[initialize.tokenizer]

error:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-6-ddc3bbe33688> in <module>
----> 1 nlp = spacy.load('./')

~/trainings/env/lib/python3.7/site-packages/spacy/__init__.py in load(name, vocab, disable, enable, exclude, config)
     58         enable=enable,
     59         exclude=exclude,
---> 60         config=config,
     61     )
     62 

~/trainings/env/lib/python3.7/site-packages/spacy/util.py in load_model(name, vocab, disable, enable, exclude, config)
    442             return load_model_from_package(name, **kwargs)  # type: ignore[arg-type]
    443         if Path(name).exists():  # path to model data directory
--> 444             return load_model_from_path(Path(name), **kwargs)  # type: ignore[arg-type]
    445     elif hasattr(name, "exists"):  # Path or Path-like to model data
    446         return load_model_from_path(name, **kwargs)  # type: ignore[arg-type]

~/trainings/env/lib/python3.7/site-packages/spacy/util.py in load_model_from_path(model_path, meta, vocab, disable, enable, exclude, config)
    522         meta=meta,
    523     )
--> 524     return nlp.from_disk(model_path, exclude=exclude, overrides=overrides)
    525 
    526 

~/trainings/env/lib/python3.7/site-packages/spacy/language.py in from_disk(self, path, exclude, overrides)
   2123             # Convert to list here in case exclude is (default) tuple
   2124             exclude = list(exclude) + ["vocab"]
-> 2125         util.from_disk(path, deserializers, exclude)  # type: ignore[arg-type]
   2126         self._path = path  # type: ignore[assignment]
   2127         self._link_components()

~/trainings/env/lib/python3.7/site-packages/spacy/util.py in from_disk(path, readers, exclude)
   1367         # Split to support file names like meta.json
   1368         if key.split(".")[0] not in exclude:
-> 1369             reader(path / key)
   1370     return path
   1371 

~/trainings/env/lib/python3.7/site-packages/spacy/language.py in <lambda>(p, proc)
   2118                 continue
   2119             deserializers[name] = lambda p, proc=proc: proc.from_disk(  # type: ignore[misc]
-> 2120                 p, exclude=["vocab"]
   2121             )
   2122         if not (path / "vocab").exists() and "vocab" not in exclude:  # type: ignore[operator]

~/trainings/env/lib/python3.7/site-packages/spacy/pipeline/transition_parser.pyx in spacy.pipeline.transition_parser.Parser.from_disk()

~/trainings/env/lib/python3.7/site-packages/thinc/model.py in from_bytes(self, bytes_data)
    617         msg = srsly.msgpack_loads(bytes_data)
    618         msg = convert_recursive(is_xp_array, self.ops.asarray, msg)
--> 619         return self.from_dict(msg)
    620 
    621     def from_disk(self, path: Union[Path, str]) -> "Model":

~/trainings/env/lib/python3.7/site-packages/thinc/model.py in from_dict(self, msg)
    634         nodes = list(self.walk())
    635         if len(msg["nodes"]) != len(nodes):
--> 636             raise ValueError("Cannot deserialize model: mismatched structure")
    637         for i, node in enumerate(nodes):
    638             info = msg["nodes"][i]

ValueError: Cannot deserialize model: mismatched structure

I couldn't help but notice this in the traceback:

nlp = spacy.load('./')

Could you share the Prodigy command that you ran to train the model, as well as the recipe command you ran afterwards? I want to make sure that this isn't a path-related issue.


The model is trained with spaCy using the Prodigy-generated config.
The command used to train the model is:

python -m spacy train <path_to_prodigy_generated_train_folder>/config.cfg -o spacy_training_output -g 0 --paths.train <path_to_prodigy_generated_train_folder>/train.spacy --paths.dev <path_to_prodigy_generated_train_folder>/dev.spacy

As for loading from ./: in the error example above, the load is done from an IPython session launched inside the folder containing the model, for demonstration purposes. I get the same error if the model is loaded from another location with spaCy or Prodigy.

Ah, gotcha. Let's try and reproduce this then!

I'm starting with an examples.jsonl file that looks like this:

{"text": "je suis vincent"}
{"text": "je suis sok"}
{"text": "je suis noa"}
{"text": "je suis jenny"}
{"text": "je suis timmy"}

Next, I annotate these examples.

python -m prodigy ner.manual issue-5015 blank:fr examples.jsonl --label name

Once annotated, I proceed to train a spaCy model. To do that I ran the following commands.

# Download French model first 
python -m spacy download fr_dep_news_trf
# Generate the config
python -m prodigy spacy-config --ner issue-5015 --base-model fr_dep_news_trf config.cfg
# Train the model
python -m spacy train config.cfg --output trained --training.max-steps=25

This training was very slow because I do not own a GPU, so I made one small change by setting training.max-steps=25 as a command-line override. My model won't be great, but I do have one saved in trained/model-best.

Then I try to run this:

python -m prodigy ner.correct issue-5015-2 trained/model-best examples.jsonl --label name

And indeed, I seem to hit the same error!

Using 1 label(s): name
Traceback (most recent call last):
  File "/usr/lib/python3.8/runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/home/vincent/Development/prodigy-demos/venv/lib/python3.8/site-packages/prodigy/__main__.py", line 62, in <module>
    controller = recipe(*args, use_plac=True)
  File "cython_src/prodigy/core.pyx", line 379, in prodigy.core.recipe.recipe_decorator.recipe_proxy
  File "/home/vincent/Development/prodigy-demos/venv/lib/python3.8/site-packages/plac_core.py", line 367, in call
    cmd, result = parser.consume(arglist)
  File "/home/vincent/Development/prodigy-demos/venv/lib/python3.8/site-packages/plac_core.py", line 232, in consume
    return cmd, self.func(*(args + varargs + extraopts), **kwargs)
  File "/home/vincent/Development/prodigy-demos/venv/lib/python3.8/site-packages/prodigy/recipes/ner.py", line 225, in correct
    nlp = load_model(spacy_model)
  File "cython_src/prodigy/util.pyx", line 634, in prodigy.util.load_model
  File "/home/vincent/Development/prodigy-demos/venv/lib/python3.8/site-packages/spacy/__init__.py", line 54, in load
    return util.load_model(
  File "/home/vincent/Development/prodigy-demos/venv/lib/python3.8/site-packages/spacy/util.py", line 434, in load_model
    return load_model_from_path(Path(name), **kwargs)  # type: ignore[arg-type]
  File "/home/vincent/Development/prodigy-demos/venv/lib/python3.8/site-packages/spacy/util.py", line 514, in load_model_from_path
    return nlp.from_disk(model_path, exclude=exclude, overrides=overrides)
  File "/home/vincent/Development/prodigy-demos/venv/lib/python3.8/site-packages/spacy/language.py", line 2125, in from_disk
    util.from_disk(path, deserializers, exclude)  # type: ignore[arg-type]
  File "/home/vincent/Development/prodigy-demos/venv/lib/python3.8/site-packages/spacy/util.py", line 1352, in from_disk
    reader(path / key)
  File "/home/vincent/Development/prodigy-demos/venv/lib/python3.8/site-packages/spacy/language.py", line 2119, in <lambda>
    deserializers[name] = lambda p, proc=proc: proc.from_disk(  # type: ignore[misc]
  File "spacy/pipeline/transition_parser.pyx", line 602, in spacy.pipeline.transition_parser.Parser.from_disk
  File "/home/vincent/Development/prodigy-demos/venv/lib/python3.8/site-packages/thinc/model.py", line 619, in from_bytes
    return self.from_dict(msg)
  File "/home/vincent/Development/prodigy-demos/venv/lib/python3.8/site-packages/thinc/model.py", line 636, in from_dict
    raise ValueError("Cannot deserialize model: mismatched structure")
ValueError: Cannot deserialize model: mismatched structure

I will log this internally. This certainly seems to be a bug.


Oh! And one more piece of advice. In the meantime, you can still run these commands without the transformer model attached.

# Don't add a transformer base model
python -m prodigy spacy-config --ner issue-5015  config.cfg
# Train the spaCy model in lightweight mode now, much faster! 
python -m spacy train config.cfg --output trained --training.max-steps=25
# This should run fine now! 
python -m prodigy ner.correct issue-5015-2 trained/model-best examples.jsonl --label name

This works on my machine, so you should be able to run this as well in the meantime while we try to figure out what's happening here.


It seems that our method of generating the config does not consider transformers that are passed via --base-model. I'm working to fix this, but for your current problem... I think I've found a workaround for now.
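If you want to check whether a saved model is affected, you can read its config without deserializing any weights (it's spacy.load that fails). A minimal sketch, with a placeholder path:

from spacy.util import load_config

# Read only the saved config; this never touches the model weights.
cfg = load_config("spacy_training_output/model-best/config.cfg")
print(cfg["nlp"]["pipeline"])
# An affected config lists both "tok2vec" and "transformer" here.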

Here's a config.cfg file that seems to run just fine on my machine but still uses fr_dep_news_trf.

[paths]
train = null
dev = null
vectors = null
init_tok2vec = null

[system]
gpu_allocator = null
seed = 0

[nlp]
lang = "fr"
pipeline = ["transformer", "ner"]
disabled = []
before_creation = null
after_creation = null
after_pipeline_creation = null
batch_size = 64
tokenizer = {"@tokenizers":"spacy.Tokenizer.v1"}

[components]

[components.ner]
factory = "ner"
incorrect_spans_key = "incorrect_spans"
moves = null
scorer = {"@scorers":"spacy.ner_scorer.v1"}
update_with_oracle_cut_size = 100

[components.ner.model]
@architectures = "spacy.TransitionBasedParser.v2"
state_type = "ner"
extra_state_tokens = false
hidden_width = 64
maxout_pieces = 2
use_upper = true
nO = null

[components.ner.model.tok2vec]
@architectures = "spacy-transformers.TransformerListener.v1"
grad_factor = 1.0
pooling = {"@layers":"reduce_mean.v1"}
upstream = "*"

[components.transformer]
source = "fr_dep_news_trf"

[corpora]
@readers = "prodigy.MergedCorpus.v1"
eval_split = 0.2
sample_size = 1.0

[corpora.ner]
@readers = "prodigy.NERCorpus.v1"
datasets = ["issue-5015"]
eval_datasets = []
default_fill = "outside"
incorrect_key = "incorrect_spans"

[training]
train_corpus = "corpora.train"
dev_corpus = "corpora.dev"
seed = ${system:seed}
gpu_allocator = ${system:gpu_allocator}
dropout = 0.1
accumulate_gradient = 3
patience = 5000
max_epochs = 0
max_steps = 20000
eval_frequency = 1000
frozen_components = []
before_to_disk = null
annotating_components = []
before_update = null

[training.batcher]
@batchers = "spacy.batch_by_padded.v1"
discard_oversize = true
get_length = null
size = 2000
buffer = 256

[training.optimizer]
@optimizers = "Adam.v1"
beta1 = 0.9
beta2 = 0.999
L2_is_weight_decay = true
L2 = 0.01
grad_clip = 1.0
use_averages = true
eps = 0.00000001

[training.optimizer.learn_rate]
@schedules = "warmup_linear.v1"
warmup_steps = 250
total_steps = 20000
initial_rate = 0.00005

[training.score_weights]
pos_acc = null
morph_acc = null
morph_per_feat = null
dep_uas = null
dep_las = null
dep_las_per_type = null
sents_p = null
sents_r = null
sents_f = null
lemma_acc = null
ents_f = 0.16
ents_p = 0.0
ents_r = 0.0
speed = 0.0

[pretraining]

[initialize]
vectors = ${paths.vectors}
init_tok2vec = ${paths.init_tok2vec}
vocab_data = null
lookups = null
before_init = null
after_init = null

[initialize.components]

[initialize.tokenizer]

You should be able to run this to train the model for a single step:

python -m spacy train config.cfg --output attempt --training.max-steps=1

And from here I've been able to load it in via spaCy:

import spacy 
nlp = spacy.load("attempt/model-last")

Just to check, can you confirm that this works on your machine as well? It should be a temporary workaround to your problem.
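From there, the same model should also load in Prodigy, e.g. with the earlier recipe command (I haven't re-run this part myself):

python -m prodigy ner.correct issue-5015-2 attempt/model-last examples.jsonl --label name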


Thanks, it works :pray: Removing parts of the pipeline seems to do the trick.
I've also tried removing only tok2vec and keeping all the other pipeline components, and that works too.
Here is my modified config file:

[paths]
train = null
dev = null
vectors = null

[system]
gpu_allocator = null
seed = 0

[nlp]
lang = "fr"
pipeline = ["transformer","morphologizer","parser","attribute_ruler","lemmatizer","ner"]
disabled = []
before_creation = null
after_creation = null
after_pipeline_creation = null
batch_size = 64
tokenizer = {"@tokenizers":"spacy.Tokenizer.v1"}

[components]

[components.attribute_ruler]
source = "fr_dep_news_trf"

[components.lemmatizer]
source = "fr_dep_news_trf"

[components.morphologizer]
source = "fr_dep_news_trf"

[components.ner]
factory = "ner"
incorrect_spans_key = "incorrect_spans"
moves = null
scorer = {"@scorers":"spacy.ner_scorer.v1"}
update_with_oracle_cut_size = 100

[components.ner.model]
@architectures = "spacy.TransitionBasedParser.v2"
state_type = "ner"
extra_state_tokens = false
hidden_width = 64
maxout_pieces = 2
use_upper = true
nO = null

[components.parser]
source = "fr_dep_news_trf"

[components.transformer]
source = "fr_dep_news_trf"

[corpora]

[corpora.dev]
@readers = "spacy.Corpus.v1"
path = ${paths.dev}
max_length = 0
gold_preproc = false
limit = 0
augmenter = null

[corpora.train]
@readers = "spacy.Corpus.v1"
path = ${paths.train}
max_length = 0
gold_preproc = false
limit = 0
augmenter = null

[training]
train_corpus = "corpora.train"
dev_corpus = "corpora.dev"
seed = ${system:seed}
gpu_allocator = ${system:gpu_allocator}
dropout = 0.1
accumulate_gradient = 3
patience = 5000
max_epochs = 0
max_steps = 20000
eval_frequency = 1000
frozen_components = ["morphologizer","parser","attribute_ruler","lemmatizer"]
before_to_disk = null
annotating_components = []
before_update = null

[training.batcher]
@batchers = "spacy.batch_by_padded.v1"
discard_oversize = true
get_length = null
size = 2000
buffer = 256

[training.logger]
@loggers = "spacy.ConsoleLogger.v1"
progress_bar = false

[training.optimizer]
@optimizers = "Adam.v1"
beta1 = 0.9
beta2 = 0.999
L2_is_weight_decay = true
L2 = 0.01
grad_clip = 1.0
use_averages = true
eps = 0.00000001

[training.optimizer.learn_rate]
@schedules = "warmup_linear.v1"
warmup_steps = 250
total_steps = 20000
initial_rate = 0.00005

[training.score_weights]
pos_acc = null
morph_acc = null
morph_per_feat = null
dep_uas = null
dep_las = null
dep_las_per_type = null
sents_p = null
sents_r = null
sents_f = null
lemma_acc = null
ents_f = 1.0
ents_p = 0.0
ents_r = 0.0
ents_per_type = null
speed = 0.0

[pretraining]

[initialize]
vectors = ${paths.vectors}
vocab_data = null
lookups = null
before_init = null
after_init = null

[initialize.components]

[initialize.components.morphologizer]

[initialize.components.morphologizer.labels]
@readers = "spacy.read_labels.v1"
path = "spacy_training/labels/morphologizer.json"
require = false

[initialize.components.ner]

[initialize.components.ner.labels]
@readers = "spacy.read_labels.v1"
path = "spacy_training/labels/ner.json"
require = false

[initialize.components.parser]

[initialize.components.parser.labels]
@readers = "spacy.read_labels.v1"
path = "spacy_training/labels/parser.json"
require = false

[initialize.tokenizer]