I think the problem here is that you didn't pretrain with the setting to use the word vectors as features. If you didn't add the `--use-vectors` flag, the model won't expect to have the word vectors during training. So for the model you've pretrained, try setting `blank:en` as the model, rather than a package that ships with word vectors.
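As a quick sanity check, a blank pipeline carries no static vectors table, which matches a model pretrained without `--use-vectors`:

```python
import spacy

# A blank pipeline has no pretrained word vectors attached.
nlp = spacy.blank("en")
print(len(nlp.vocab.vectors))  # 0: no rows in the static vectors table
```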
These are different dimensions in the model. Unfortunately there's no elegant terminology here, because the neural network has many layers of activations on the way to finally assigning a vector to each token, and there isn't really a clean way to distinguish those activations from each other. The word vectors you load in, the big static table, are just one feature used to compute the activation for each word individually. The other features are also vectors, representing the word's lower-cased form, prefix, suffix and shape. All of those vectors are mixed together in a feed-forward layer to produce another vector per token, and then a CNN is run to mix in information from the surrounding context, outputting the thing we call "token vectors".
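To make the data flow concrete, here's a toy numpy sketch of that pipeline. All the widths, tables and weights below are made-up stand-ins, not spaCy's actual implementation; it only mirrors the shape of the computation: per-token feature mixing, then a contextual CNN.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative widths only; the real defaults live in the model config.
pretrained_dims = 300      # width of the big static table (word2vec/GloVe)
feature_width = 64         # width of each learned feature embedding (assumed)
token_vector_width = 128   # width of the CNN output, the "token vectors"

n_rows = 1000  # toy vocabulary size for each feature table
tables = {name: rng.normal(size=(n_rows, feature_width))
          for name in ("norm", "prefix", "suffix", "shape")}
static_table = rng.normal(size=(n_rows, pretrained_dims))  # the static vectors

concat_width = 4 * feature_width + pretrained_dims
W = rng.normal(size=(token_vector_width, concat_width))   # feed-forward mixer
b = np.zeros(token_vector_width)
Wc = rng.normal(size=(token_vector_width, 3 * token_vector_width))  # width-3 CNN
bc = np.zeros(token_vector_width)

def embed(token_ids):
    # Each token gets one vector per feature (lower-cased form, prefix,
    # suffix, shape), plus its row from the static table; a feed-forward
    # layer mixes them into a single vector per token.
    out = []
    for norm, pre, suf, shape, static in token_ids:
        feats = np.concatenate([tables["norm"][norm], tables["prefix"][pre],
                                tables["suffix"][suf], tables["shape"][shape],
                                static_table[static]])
        out.append(np.maximum(W @ feats + b, 0.0))
    return np.stack(out)

def contextualize(vecs):
    # A width-3 convolution mixes each token's vector with its neighbours,
    # producing what the post calls "token vectors".
    padded = np.pad(vecs, ((1, 1), (0, 0)))
    windows = np.concatenate([padded[:-2], padded[1:-1], padded[2:]], axis=1)
    return np.maximum(windows @ Wc.T + bc, 0.0)

ids = rng.integers(0, n_rows, size=(5, 5))  # five toy tokens, five ids each
token_vectors = contextualize(embed(ids))
print(token_vectors.shape)  # (5, token_vector_width)
```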
`pretrained_dims` is the width of the big static table, produced by an algorithm like word2vec or GloVe. It was named that before the `spacy pretrain` command was around.
`token_vector_width` is the width of the vectors output by the CNN.
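If it helps, you can read both widths off a loaded pipeline. This sketch assumes a spaCy v2 model package with vectors, e.g. `en_core_web_md` (used here only as an example), and relies on the tok2vec output being stored on `doc.tensor`:

```python
import spacy

nlp = spacy.load("en_core_web_md")
doc = nlp("The quick brown fox")

# Width of the big static table (word2vec/GloVe-style vectors):
print(nlp.vocab.vectors.shape[1])
# Width of the CNN output, i.e. the "token vectors":
print(doc.tensor.shape[1])
```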
Again, I regret how confusing this all is, but it's a naming dilemma that other deep learning libraries face as well.