sense2vec recipes for prodigy


We used prodigy to train our own spacy model, and used it to generate a sense2vec model.

We are looking into sense2vec recipes for prodigy and are having trouble with some of them, e.g. sense2vec.teach and sense2vec.eval, whereas sense2vec.eval-most-similar works fine.

Specifically, when using sense2vec.teach, the web server starts, but when we visit the served URL, we see only a blank rectangle. When using sense2vec.eval, the web server starts, but when we visit the URL, it seems to enter an infinite loop.

We are using sense2vec v1.0.2, prodigy 1.9.8 and spacy 2.2.3.

Do you have any idea what might be going wrong? Is it a compatibility issue with the versions we use?

Thanks a lot !

Best regards

Hi! I don't think it's version compatibility, it sounds like you might not be getting any results. Which seed phrases are you using, and did you check that they have a vector assigned? And did you play around with the --threshold to see if maybe a lower similarity threshold gives you results?
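To make the seed check above concrete, here is a minimal sketch. It assumes the loaded sense2vec object supports `key in s2v` and `get_best_sense(word)`, as in sense2vec v1.x; the helper name `check_seeds` and the model path in the comments are made up for illustration.

```python
def check_seeds(s2v, seeds):
    """Map each seed to the vector key it resolves to, or None if it has no vector."""
    report = {}
    for seed in seeds:
        if "|" in seed:
            # Seed is already a full key such as "duck|NOUN": look it up directly.
            report[seed] = seed if seed in s2v else None
        else:
            # Plain phrase: ask sense2vec for its best-matching sense (None if absent).
            report[seed] = s2v.get_best_sense(seed)
    return report


# Usage with a real model (the path is a placeholder, not from the thread):
# from sense2vec import Sense2Vec
# s2v = Sense2Vec().from_disk("/path/to/s2v_model")
# for seed, key in check_seeds(s2v, ["duck", "duck|NOUN"]).items():
#     print(seed, "->", key or "NO VECTOR")
```

Any seed that maps to None has no vector, so the recipe would have nothing to base suggestions on for it.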

Hi and thanks for the answer !

Sorry for the delay in responding, I was training a newer version of the model to try first.

I already tried lowering the threshold, but the invisible suggestions have high similarity values, e.g. "SCORE: 0.68 SENSE: ?", and when I perform an action (ignore, reject, accept...) the suggestion is shown in the sidebar, as seen in the following example.

The model we use is based on a custom spacy model we trained from scratch with labels specific to our domain. Both the spacy and sense2vec models work fine when used directly from Python.

We usually run prodigy through Docker, and it works fine for everything but the sense2vec recipes. We tried different base images with different versions of python and different distributions (python 3.7 or 3.8 in Debian buster/stretch, Alpine and Ubuntu).

When we run it directly in the host OS (Ubuntu 19.10) it works fine. It seems to be a weird side effect from running it via a Docker container, beats me...

Thanks !

Okay, so this is very mysterious, but I think I might have figured out what's happening here: Does the prodigy.json used in your Docker container by any chance specify a "html_template" value?

The sense2vec recipes use a custom "html_template" to define how to display the word and optional sense in the UI. The global or local prodigy.json lets you override defaults defined in recipes (which can make sense for stuff like styling and other behaviours). But if it's overriding something like the HTML template, that can be a problem. And if it's only your Docker setup that does this, it'd explain why it works fine locally.
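One way to check this inside the container is a small script that looks for such keys in a prodigy.json file. This is only a sketch: the function name and the list of keys are assumptions for illustration (the two keys are the ones mentioned in this thread), and the file locations in the comments assume the usual global and local config paths.

```python
import json
from pathlib import Path

# Settings that a recipe typically defines itself. This list is an assumption
# for illustration, not an official list from Prodigy.
RECIPE_OWNED_KEYS = ("html_template", "choice_style")


def find_risky_overrides(config_path, keys=RECIPE_OWNED_KEYS):
    """Return the recipe-owned settings that the given prodigy.json overrides."""
    path = Path(config_path)
    if not path.is_file():
        # No config file at this location, so nothing can be overridden here.
        return {}
    config = json.loads(path.read_text(encoding="utf8"))
    return {key: config[key] for key in keys if key in config}


# Usage (paths assume the default global and local config locations):
# for candidate in (Path.home() / ".prodigy" / "prodigy.json", Path("prodigy.json")):
#     print(candidate, find_risky_overrides(candidate))
```

If the result contains "html_template" in the Docker setup but not on the host, that would match the behaviour described above.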

(On a related note, Prodigy should probably show warnings if the prodigy.json overrides certain settings that are also used by the recipe and are typically not things you'd want to override globally, like the "html_template" or "choice_style". We should be able to detect that internally, so we can warn the user, while still keeping the option and keeping the mechanism consistent.)

Just released Prodigy v1.10, which will show a warning if properties like html_template are overwritten by the global prodigy.json (which is likely unintentional), to prevent problems like this.