Can't find model 'en_vectors_web_lg'. It doesn't seem to be a shortcut link, a Python package or a valid path to a data directory.

Jakub · July 27, 2020, 12:30pm

Hello Prodigy team,

I'm following your tutorial here

and

Could you please help me debug the error in the title? I saw you debugged it once in the past here, but I wasn't able to resolve my problem.

I'm quite sure I have the model installed and linked.

I am able to do

import spacy;
spacy.load('en_core_web_lg');

but not

prodigy train ner my_data en_vectors_web_lg --init-tok2vec ./tok2vec_cd8_model289.bin --output ./tmp_model --eval-split 0.2

I also tried linking to the model directly like this:

 prodigy train ner my_data /Users/Jakub/.pyenv/versions/3.7.7/lib/python3.7/site-packages/spacy/data/en/en_core_web_lg-2.3.1 --init-tok2vec ../tok2vec_cd8_model289.bin --output ./tmp_model --eval-split 0.2
✔ Loaded model
'/Users/Jakub/.pyenv/versions/3.7.7/lib/python3.7/site-packages/spacy/data/en/en_core_web_lg-2.3.1'
Created and merged data for 18 total examples
Using 15 train / 3 eval (split 20%)
Component: ner | Batch size: compounding | Dropout: 0.2 | Iterations: 10
✔ Initializing with tok2vec weights ../tok2vec_cd8_model289.bin

... but then I get:

ValueError: could not broadcast input array from shape (128) into shape (96)

Could you please help?

Best regards,
Jakub

Jakub · July 27, 2020, 2:50pm

The problem was the model path. I had to change it from this:

/Users/Jakub/.pyenv/versions/3.7.7/lib/python3.7/site-packages/spacy/data/en/en_core_web_lg-2.3.1

to this:

/Users/Jakub/.pyenv/versions/3.7.7/lib/python3.7/site-packages/en_core_web_lg/en_core_web_lg-2.3.1

The next issue, unfortunately, is a shape mismatch that raises:

ValueError: could not broadcast input array from shape (128) into shape (96)

I noticed that some people have reported a similar problem on this forum with a common solution to use the LG model instead of SM. But as I'm using the LG model, I'm wondering what I'm doing wrong.
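For readers hitting the same message, here is a minimal NumPy sketch of what this kind of error means in general: an array of one width is being copied into a buffer of another width. The function name `copy_weights` and the widths 96 and 128 are only illustrative (taken from the error text); this is not the code spaCy or Prodigy actually runs when loading tok2vec weights.

```python
import numpy as np


def copy_weights(dest_width, src_width):
    """Try to copy a source vector into a destination buffer in place.

    Returns "ok" on success, or the ValueError message on a width mismatch.
    Hypothetical helper, for illustration only.
    """
    dest = np.zeros(dest_width, dtype="float32")
    src = np.ones(src_width, dtype="float32")
    try:
        # In-place assignment requires the shapes to be compatible.
        dest[:] = src
    except ValueError as err:
        return str(err)
    return "ok"


print(copy_weights(96, 96))    # matching widths copy cleanly
print(copy_weights(96, 128))   # mismatched widths raise a broadcast error
```

In other words, the pretrained weights and the model they are loaded into probably disagree about a layer width, which would be consistent with the weights having been pretrained against a different base model.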

Jakub · July 28, 2020, 6:55am

Solved. I was using the wrong model. Use en_vectors_web_lg instead:

python -m spacy download en_vectors_web_lg

For the team's reference, here is where I think it went sideways:

If you click on the model link to download en_vectors_web_lg, it opens "https://spacy.io/models/en#en_vectors_web_lg" (notice the anchor #en_vectors_web_lg), but the model is not on that page. Since "LG" is the key differentiator between the models, I concentrated on finding an LG model and ignored the rest of the model name.

Maybe this will help somebody.
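A quick way to tell the two model names apart before running `prodigy train` is to check whether each one resolves to an installed Python package. This is a rough sketch using only the standard library; the helper name `is_installed_package` is made up for this example, and it covers only the "Python package" case from the error message, not shortcut links or data directory paths.

```python
import importlib.util


def is_installed_package(name):
    """Return True if `name` can be found as a top-level Python package."""
    return importlib.util.find_spec(name) is not None


# Having en_core_web_lg installed does not mean en_vectors_web_lg is too.
for model in ("en_core_web_lg", "en_vectors_web_lg"):
    print(model, is_installed_package(model))
```

If the second name prints `False`, the model still needs to be downloaded with the command above.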

ines · July 28, 2020, 9:03am

Oh, thanks for the heads-up and sorry about the confusion! We previously had the vectors-only models on the same page as the pretrained core models, but then moved them to the "starter models" page: English · spaCy Models Documentation. This is a better fit, because the vectors are really just vectors you can train on top of and bootstrap your models with – but I guess we forgot to update the link in the README.
