Problems related to spaCy pretrain with customized vector models

Hello,

I have a couple of questions regarding the pretrain command in spaCy.

  1. Does the dimension of the vector model matter?
    I saw this line in https://github.com/explosion/spaCy/blob/16aa092fb5cffb5ec7079951ea0c04cb96733b3e/spacy/cli/pretrain.py#L318
    where the Maxout layer has an output dimension of 300 (if I understand it correctly). An interesting thing is that for the vector model en_vectors_web_lg, the vector size is also 300. Is this a coincidence or a specific design choice?

  2. If I use nlp = spacy.load('en_core_web_lg'), I think this should also count as using a "pretrained" model, is that right? Since en_core_web_lg and en_vectors_web_lg have the same vectors (both from GloVe, if I understand correctly), what is the difference between using nlp = spacy.load('en_core_web_lg') and using a tok2vec model pretrained with en_vectors_web_lg?

Thanks,

Meanwhile, judging from the settings for the dimension of the approximated vector, the vector dimension predicted by the pretrained tok2vec should be the same as that of the vector model. Will that break the spaCy 2.1 design, in which the token vector dimension is 96?

This is confirmed: I loaded nlp = spacy.load('en_core_web_lg') and nlp.vocab.vectors.shape[1] is 300, which is the dimension of the GloVe embeddings.
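
For reference, the check can be reproduced with a couple of lines (the exact number of vector rows depends on the model version):

```python
import spacy

# Load the large English model, which ships with 300-dimensional GloVe vectors.
nlp = spacy.load("en_core_web_lg")

# The vectors table has shape (n_rows, n_dims); the second entry is the vector width.
print(nlp.vocab.vectors.shape[1])  # 300
```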

Hi,

  1. That Maxout layer is a hidden layer, so its dimension could be anything, as long as the input size of the subsequent Affine layer matches up. The Affine layer does need to have the same dimensionality as the vector model being used as the target; see the sketch after this list.

  2. Yes, you could use en_core_web_lg. The only difference is that the en_vectors_web_lg model has vectors for more words, so you may as well use it instead.
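
To make the dimension constraint concrete, here is a minimal numpy sketch of that output head (not spaCy's actual implementation; the layer widths, piece count, and variable names are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

tok2vec_dim = 96   # width of the tok2vec output in spaCy 2.1
hidden_dim = 300   # Maxout hidden width; any value works here
pieces = 3         # number of Maxout pieces (illustrative)
vector_dim = 300   # must equal nlp.vocab.vectors.shape[1], e.g. 300 for en_vectors_web_lg

# Maxout hidden layer: several affine projections, take the element-wise max.
W_hidden = rng.normal(size=(pieces, hidden_dim, tok2vec_dim))
b_hidden = np.zeros((pieces, hidden_dim))

# Affine output layer: a single linear map y = Wx + b onto the vector space,
# so its output width has to match the target vectors.
W_out = rng.normal(size=(vector_dim, hidden_dim))
b_out = np.zeros(vector_dim)

def predict_vector(token_repr):
    """Map one tok2vec row to a predicted word vector."""
    hidden = np.max(W_hidden @ token_repr + b_hidden, axis=0)  # (hidden_dim,)
    return W_out @ hidden + b_out                               # (vector_dim,)

token_repr = rng.normal(size=tok2vec_dim)
print(predict_vector(token_repr).shape)  # (300,), same as the vector model
```

The hidden width can be changed freely, but changing vector_dim away from the width of the target vectors would make the prediction objective ill-defined, which is why the Affine layer is tied to the vector model.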

Thanks for the nice explanation. One more question: is the Affine layer used here just a simple linear transformation, or did you use an "averaged perceptron" idea here?