Hi! I have a long list of seeds that I tried to use with `sense2vec.teach`. However, many of my terms weren't found (even in the large Reddit dataset). The current approach stops at the first term that isn't found, so I remove that term from my list and run again. Rinse and repeat until I either get tired and take what I've got or make it to the end of my term list.
It would be very handy if, instead of stopping at the first term not found, all terms were checked and then reported together. For example, if I use `--seeds "A, B, C, D, E, F"` and B through E aren't found, then instead of running 5 times, with 4 of those runs each reporting a single missing seed, I could run just twice. The first time, I'd be told:
✘ Can't find seed terms: 'B', 'C', 'D', 'E'.
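
Here is a rough sketch of the behaviour I have in mind, not the actual recipe code. The names `split_seeds` and `format_missing` are hypothetical, and `lookup` stands in for whatever the recipe uses to resolve a seed to a vector key (I'm assuming it returns `None` when nothing is found). A plain dict plays the role of the vectors so the snippet runs on its own.

```python
from typing import Callable, Iterable, List, Optional, Tuple


def split_seeds(
    seeds: Iterable[str],
    lookup: Callable[[str], Optional[str]],
) -> Tuple[List[str], List[str]]:
    """Check every seed instead of stopping at the first miss.

    `lookup` is an assumed stand-in: it maps a seed to its key,
    or returns None when the seed is not in the vectors.
    """
    found_keys: List[str] = []
    missing: List[str] = []
    for seed in seeds:
        key = lookup(seed)
        if key is None:
            missing.append(seed)  # remember it and keep going
        else:
            found_keys.append(key)
    return found_keys, missing


def format_missing(missing: List[str]) -> str:
    """Build a single message that lists all seeds that were not found."""
    quoted = ", ".join(f"'{seed}'" for seed in missing)
    return f"Can't find seed terms: {quoted}."


# Hypothetical data: only A and F exist in the (fake) vectors.
vectors = {"A": "A|NOUN", "F": "F|NOUN"}
seeds = [s.strip() for s in "A, B, C, D, E, F".split(",")]

found, missing = split_seeds(seeds, vectors.get)
if missing:
    print(format_missing(missing))
```

With something like this, the recipe could report every missing seed in one go and exit once, rather than exiting on each miss.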