Obtain a list of similar words from my own trained model

natu · September 1, 2020, 3:55pm

Hi! I have a question about using my own trained model (trained with the sense2vec.train recipe) with Prodigy for sense2vec, similar to what the demo does here: https://explosion.ai/demos/sense2vec. I've seen (and used) an example like the one below:

import spacy
from sense2vec import Sense2VecComponent

nlp = spacy.load('my_own_model')
s2v = Sense2VecComponent('/path/to/reddit_vectors-1.1.0')
nlp.add_pipe(s2v)

doc = nlp("A sentence about natural language processing.")
assert doc[3].text == 'natural language processing'
freq = doc[3]._.s2v_freq
vector = doc[3]._.s2v_vec
most_similar = doc[3]._.s2v_most_similar(3)

But I'm not quite sure how to use it with just a single word as input (e.g. software), because I have no idea what to put in the doc = nlp("A sentence about natural language processing.") line. I just want something similar to the standalone implementation of sense2vec, but with my own model.

Any ideas?
Thanks a lot

ines · September 3, 2020, 8:37pm

Hi! This forum is mostly intended for questions around Prodigy specifically, so we won't be able to provide in-depth usage help with our other libraries and tools.

The sense2vec component lets you query sense2vec vectors from a spaCy Doc, so you can look up the vector for one or more tokens. So if your doc contains the word software, you can look up the sense2vec vector for it. But you don't need a spaCy model here: you can also query the sense2vec vectors directly. The sense2vec docs have some more details and examples: https://github.com/explosion/sense2vec/
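As a rough sketch of what querying the vectors directly could look like for a single word: the snippet below uses the standalone Sense2Vec class and its get_best_sense, most_similar and split_key methods as described in the sense2vec docs. The helper names load_vectors and similar_words are made up for illustration, and it assumes your trained vectors were saved to a directory that from_disk can read.

```python
def load_vectors(path):
    """Load sense2vec vectors from a directory (hypothetical helper)."""
    # Imported here so similar_words() below does not itself require
    # sense2vec to be installed.
    from sense2vec import Sense2Vec

    return Sense2Vec().from_disk(path)


def similar_words(s2v, word, n=10):
    """Return up to n (phrase, sense, score) tuples for a single word.

    Hypothetical helper: s2v is expected to behave like a loaded
    Sense2Vec object.
    """
    # Keys in the vectors table look like "software|NOUN".
    # get_best_sense() looks up a matching key for the plain word and
    # returns None if the word is not in the vectors.
    key = s2v.get_best_sense(word)
    if key is None:
        return []
    results = []
    for other_key, score in s2v.most_similar(key, n=n):
        # split_key() turns "open_source_software|NOUN" back into
        # ("open source software", "NOUN").
        phrase, sense = s2v.split_key(other_key)
        results.append((phrase, sense, float(score)))
    return results
```

Usage would then be something like s2v = load_vectors("/path/to/your_vectors") followed by similar_words(s2v, "software", n=5), where the path is a placeholder for wherever your own vectors live. No nlp(...) call or example sentence is needed for this.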
