sense2vec: updated library, new vectors, tutorial for bootstrapping NER models, more Prodigy recipes & open-source datasets

Cross-posting this here in case you missed it :slightly_smiling_face:

In 2016 we trained a sense2vec model on the 2015 portion of the Reddit comments corpus, leading to a useful library and one of our most popular demos. That work is now due for an update. In this post, we present a new version of the library, new vectors, new evaluation recipes, and a demo NER project that we trained to usable accuracy in just a few hours.

We also open-sourced a few of our example projects and datasets for NER and text classification. The data is in Prodigy's format, and each project includes details on how the data was created, scripts to reproduce our results, and powerful tok2vec weights to initialize your models.
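
If you haven't worked with Prodigy-style data before, here is a minimal sketch of how NER annotations in that format could be read with plain Python. It assumes the usual JSONL layout, with one JSON object per line containing `text`, a list of `spans` with character offsets and a `label`, and an `answer`. The sample records, the `LANGUAGE` label, and the helper name `read_accepted_entities` are made up for illustration and are not taken from the released datasets.

```python
import json

# Hypothetical sample in Prodigy-style JSONL: one JSON object per line.
# The texts and the label are invented for illustration only.
SAMPLE_JSONL = "\n".join([
    json.dumps({
        "text": "I love Python",
        "spans": [{"start": 7, "end": 13, "label": "LANGUAGE"}],
        "answer": "accept",
    }),
    json.dumps({"text": "No entities here", "spans": [], "answer": "reject"}),
])


def read_accepted_entities(jsonl_text):
    """Return (text, [(start, end, label), ...]) for each accepted example."""
    examples = []
    for line in jsonl_text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        # Assumption: records without an "answer" key count as accepted.
        if record.get("answer", "accept") != "accept":
            continue
        spans = [(s["start"], s["end"], s["label"]) for s in record.get("spans", [])]
        examples.append((record["text"], spans))
    return examples


examples = read_accepted_entities(SAMPLE_JSONL)
for text, spans in examples:
    for start, end, label in spans:
        # Character offsets index directly into the original text.
        print(label, repr(text[start:end]))
```

To use this on a real file, you could pass in the file's contents, for example `read_accepted_entities(open(path, encoding="utf8").read())`, where `path` points at one of the downloaded `.jsonl` files.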
