mismatched structure when loading NER transformer model (en_core_web_trf)

I am using Python 3.8, Prodigy 1.11.8, and spaCy 3.4.0. I also downloaded the latest version of en_core_web_trf.

I created my model 'en_tagger_parser_trf' using Prodigy with the following code:

The model was trained with the following command:

prodigy train --ner gdpr_ner ./tmp_model_7 --eval-split 0.2 --config configAccuracy.cfg --label-stats --gpu-id 0 --base-model en_tagger_parser_trf

I attempt to load the model using:

It throws the following exception:

Hi @ruiyeNLP!

Can you explain what you're doing here?

I don't understand why you're loading en_core_web_lg but calling it a _trf, since it would only have a tok2vec layer, not a transformer. Maybe that's just a typo.
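As a side note, one way to check which embedding component a pipeline actually contains is to inspect `nlp.pipe_names`. This is a minimal sketch using a blank pipeline so it runs without any downloaded models; with a real model you would use `spacy.load(...)` instead:

```python
import spacy

# A blank pipeline stands in for a downloaded model here.
# With a real model: nlp = spacy.load("en_core_web_lg")
nlp = spacy.blank("en")
nlp.add_pipe("tok2vec")

# An _lg model lists "tok2vec"; an _trf model lists "transformer".
print(nlp.pipe_names)
```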

Are you simply trying to reuse a full pipeline but then create your own ner component?

I found this post in the spaCy GitHub Discussions forum with a similar error.

It recommends:

import spacy

nlp = spacy.load("en_core_web_lg", exclude=["ner"])  # could be en_core_web_trf
nlp.add_pipe("ner", source=spacy.load("my_custom_pipeline"))

Could you try this?
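For reference, here's a self-contained sketch of the same sourcing pattern. Blank pipelines stand in for en_core_web_trf and the custom model ("my_custom_pipeline" above is a placeholder name), so this runs without any downloads:

```python
import spacy

# Stands in for spacy.load("en_core_web_trf", exclude=["ner"])
nlp = spacy.blank("en")

# Stands in for spacy.load("my_custom_pipeline") (placeholder name)
source_nlp = spacy.blank("en")
source_nlp.add_pipe("ner")

# Copy the ner component from the source pipeline into nlp.
nlp.add_pipe("ner", source=source_nlp)
print(nlp.pipe_names)  # ["ner"]
```

With real models, the sourced component keeps its trained weights, so you get the stock en_core_web_trf pipeline plus your custom ner.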