Loading gensim word2vec vectors for terms.teach?

beckerfuffle · February 28, 2018, 4:29pm

Awesome, I think this is working! It's still running through my 1 million word vectors, but the first 100 loaded without any obvious errors, so I'm guessing this will work out. Here's the complete recipe:

from gensim import models

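# Load the trained gensim model and export only its word vectors.
word2vec = models.Word2Vec.load('word2vec.model')
# Note: despite the .bin name, this writes the plain-text word2vec format (binary=False is the default).
word2vec.wv.save_word2vec_format('word2vec.bin')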

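import spacy
import numpy as np

nlp = spacy.load("en_core_web_sm", vectors=False)
# Text word2vec format: a "rows cols" header line, then one "word v1 v2 ..." line per vector.
with open('word2vec.bin', 'r', encoding='utf8') as f:
    for i, line in enumerate(f):
        if i == 0:
            # Header: number of vectors and their dimensionality.
            rows, cols = (int(n) for n in line.split())
            nlp.vocab.reset_vectors(shape=(rows, cols))
        else:
            word, *vec = line.split()
            vec = np.array([float(x) for x in vec])
            nlp.vocab.set_vector(word, vec)
            print(word)  # progress output; remove for large vocabularies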

nlp.to_disk('spacy_word2vec')
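
Since I only eyeballed the first 100 rows, here is a rough sanity check you could run on the exported file before the long load. This is just a sketch, not part of the recipe above: check_word2vec_text is a hypothetical helper name I made up, and it assumes the file is the plain-text word2vec format (a "rows cols" header, then one word plus cols floats per line). It uses only the standard library, so it doesn't need spaCy or gensim installed.

```python
import io


def check_word2vec_text(lines):
    """Validate text-format word2vec lines and return (rows, cols).

    Hypothetical helper: raises ValueError if a line does not have
    exactly one token plus `cols` floats, or if the row count is off.
    """
    it = iter(lines)
    header = next(it).split()
    rows, cols = int(header[0]), int(header[1])
    seen = 0
    for lineno, line in enumerate(it, start=2):
        parts = line.split()
        if not parts:
            continue  # ignore blank lines, e.g. a trailing newline
        if len(parts) != cols + 1:
            raise ValueError(
                "line %d: expected %d fields, got %d"
                % (lineno, cols + 1, len(parts))
            )
        for value in parts[1:]:
            float(value)  # raises ValueError on a non-numeric entry
        seen += 1
    if seen != rows:
        raise ValueError("header says %d rows, found %d" % (rows, seen))
    return rows, cols


# Tiny in-memory example standing in for the real file.
sample = io.StringIO("2 3\nhello 0.1 0.2 0.3\nworld 0.4 0.5 0.6\n")
print(check_word2vec_text(sample))
```

To check the real export, pass an open file handle instead of the sample, e.g. open('word2vec.bin', encoding='utf8'). One thing this would catch is a token that contains whitespace: line.split() breaks it into extra fields, so the field count no longer matches the header.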
