Obtain a list of similar words from my own trained model

Hi! I have a question about using my own trained model (trained with the sense2vec.train recipe) with Prodigy for sense2vec, similar to what is done in the demo at https://explosion.ai/demos/sense2vec. I've seen (and used) an example like the one below:

import spacy
from sense2vec import Sense2VecComponent

nlp = spacy.load('my_own_model')
s2v = Sense2VecComponent('/path/to/reddit_vectors-1.1.0')
nlp.add_pipe(s2v)

doc = nlp("A sentence about natural language processing.")
assert doc[3].text == 'natural language processing'
freq = doc[3]._.s2v_freq
vector = doc[3]._.s2v_vec
most_similar = doc[3]._.s2v_most_similar(3)

But I'm not quite sure how to use it with just a single word as input (e.g. software), because I have no idea what to put in the doc = nlp("A sentence about natural language processing.") line :sweat:. I just want something similar to the standalone implementation of sense2vec, but with my own model.

Any ideas?
Thanks a lot :blush:

Hi! This forum is mostly intended for questions around Prodigy specifically, so we won't be able to provide in-depth usage help with our other libraries and tools.

The sense2vec component lets you query sense2vec vectors from a spaCy Doc, so you can look up the vector for one or more tokens. If your doc contains the word software, you can look up the sense2vec vector for that token. You don't need a spaCy model for this, though: you can also query the sense2vec vectors directly. The sense2vec docs have more details and examples: https://github.com/explosion/sense2vec/
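As a rough illustration of querying the vectors directly, here is a minimal sketch. It assumes sense2vec 1.x, where the standalone Sense2Vec class provides from_disk, get_best_sense and most_similar, and it assumes your exported vectors include the list of senses. The path and the helper names load_vectors and similar_words are hypothetical, so adjust them to your setup.

```python
def load_vectors(path):
    """Load exported sense2vec vectors from a directory (assumes sense2vec 1.x)."""
    # Imported lazily so the helper below can be used without importing
    # sense2vec at module load time.
    from sense2vec import Sense2Vec

    return Sense2Vec().from_disk(path)


def similar_words(s2v, word, n=3):
    """Return up to n (key, score) pairs for the most frequent sense of `word`.

    Keys look like "software|NOUN". An empty list is returned when the word
    is not in the vector table under any sense.
    """
    key = s2v.get_best_sense(word)
    if key is None:
        return []
    return s2v.most_similar(key, n=n)


# Example usage (hypothetical path to your own exported vectors):
# s2v = load_vectors("/path/to/my_s2v_vectors")
# print(similar_words(s2v, "software"))
```

This way there is no need to invent a sentence for nlp(...): the single word is looked up in the vector table directly.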
