custom sense2vec

Hi All,
@ines
I hope you are well and healthy. I have two types of questions: theoretical and implementation.

1- I would like to know your thoughts on how I can use sense2vec on a specific corpus (a book with 7000 sentences) to add more semantic information to my word vectors or my custom NER model.

2- How should I implement it? I followed this

but I got a bit lost. I could run this command on my specific corpus:

python -m prodigy sense2vec.teach data_merged_v22 C:/Users/moha/Documents/Models/s2v_old --seeds "since"

Here I used "since" and was looking for other cue words related to causation. However, I know that you have also used sense2vec to improve NER.

I would be happy if you could give me some hints, mainly on how I can train my own sense2vec vectors and how I can use them to add more semantic information to my model.

I would be very thankful if someone could give me some ideas about my questions. Many thanks.

If you want to train your own vectors, the link you shared is the way to go: you need to follow the steps and run the scripts for preprocessing, and then use either FastText or GloVe to train the vectors: https://github.com/explosion/sense2vec#-training-your-own-sense2vec-vectors

To train meaningful vectors, you typically want to use a lot of text, like 1 billion words. So your 7000 sentences likely won't be enough. Maybe you can find other similar texts from a different source that you can use.
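
To get a feel for the size gap, you could count your corpus before attempting any training. This is only a rough sketch: it splits on whitespace instead of using a real tokenizer, and `count_tokens`, `coverage` and the sample sentences are hypothetical names and data for illustration, not part of sense2vec or Prodigy.

```python
from typing import Iterable

# Rough target from the advice above: about 1 billion words.
TARGET_WORDS = 1_000_000_000


def count_tokens(lines: Iterable[str]) -> int:
    """Count whitespace-separated tokens (a crude stand-in for a tokenizer)."""
    return sum(len(line.split()) for line in lines)


def coverage(n_tokens: int, target: int = TARGET_WORDS) -> float:
    """Return the corpus size as a fraction of the target size."""
    return n_tokens / target


# Hypothetical sample; in practice, pass an open file object instead.
sample = ["Since it rained, the match was cancelled.", "The road flooded."]
n = count_tokens(sample)
print(f"{n} tokens, {coverage(n):.8%} of the target")
```

Passing an open text file to `count_tokens` works the same way, since iterating over a file yields one line at a time.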

Once you have a sense2vec model, you can then use the vectors to find more similar terms. Not sure if "since" is a good seed term here, because there are not that many similar expressions. It works better for things like (proper) nouns.
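
A minimal sketch of that lookup with the sense2vec Python API, assuming the package is installed and you have a vectors directory on disk. The `make_key` and `similar_terms` helpers are hypothetical names for this example, and the path and query phrase in the usage comment are placeholders.

```python
import re
from typing import List, Tuple


def make_key(phrase: str, sense: str) -> str:
    """Build a sense2vec-style key such as 'machine_learning|NOUN'."""
    return re.sub(r"\s+", "_", phrase.strip()) + "|" + sense


def similar_terms(
    vectors_dir: str, phrase: str, sense: str, n: int = 10
) -> List[Tuple[str, float]]:
    """Return up to n (key, score) pairs most similar to phrase|sense."""
    # Imported lazily so make_key can be used without sense2vec installed.
    from sense2vec import Sense2Vec

    s2v = Sense2Vec().from_disk(vectors_dir)
    key = make_key(phrase, sense)
    if key not in s2v:
        # The phrase with this sense is not in the vectors table.
        return []
    return s2v.most_similar(key, n=n)


# Usage (placeholder path and phrase, adjust to your own setup):
# for key, score in similar_terms("/path/to/s2v_old", "machine learning", "NOUN", n=5):
#     print(key, score)
```

Keys combine the phrase and its sense (part-of-speech tag or entity label), which is why noun phrases and named entities tend to give more useful neighbours than a function word like "since".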

As always, very informative! You are right, I only have around 150000 tokens. Many thanks. Suppose I find another corpus: can I use the Prodigy commands instead of the scripts for preprocessing?

Is there any other use of sense2vec that I can combine with NER to expand and improve my entities?

Could you have a look at my other question here?

Many thanks,
Stay healthy