Issue with terms.teach recipe wrapper saving to SQLite (custom db setup)

Hello,

We are trying to create a MongoDB wrapper for Prodigy, and we've made decent progress with it.

However, when wrapping the terms.teach recipe in a custom recipe, as far as I can see in the Prodigy source code, some of the data is saved to SQLite even before the Save button is used. The end result is that some of the data ends up in an SQLite file and some of it ends up in MongoDB, which is far from ideal.

I'm not an expert, but there seem to be two different kinds of JSON objects stored in the examples table. The ones that get saved only to SQLite are the initial seed CSV values, written when the prodigy command is run:

And the other ones seem to be the trained data that is written when saving:

First question: What is the reason for those two seemingly different kinds of rows in the same table?

Second question: Couldn't a db be passed to terms.teach, so that it works like, for instance, ner.teach, and everything could be saved in a single place?

Hi! The saving in the web app calls the exact same API endpoint, no matter if you hit "Save" or if Prodigy saves the examples in the background.

But I think what might be happening in your custom recipe is this: When the terms.teach recipe starts, the seed terms are already saved to the database automatically, because they should also be part of the patterns. If you want to use a custom database for the recipe, you should make sure to also use your custom database there and not save to the SQLite database instead. See here:

However you've structured your MongoDB integration, this call should be made to the custom DB as well.
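As a rough illustration, here is a minimal sketch of saving the seed terms to a custom database before starting the recipe. The `InMemoryDB` class and the `save_seeds` helper are hypothetical stand-ins, not part of Prodigy; the sketch only assumes that your MongoDB wrapper exposes an `add_examples(examples, datasets)` method, and it leaves out any hashing of the tasks that the built-in recipe may do.

```python
from typing import Dict, List


class InMemoryDB:
    """Hypothetical stand-in for a custom (e.g. MongoDB-backed) database class."""

    def __init__(self) -> None:
        # Maps a dataset name to the list of examples stored in it.
        self.datasets: Dict[str, List[dict]] = {}

    def add_examples(self, examples: List[dict], datasets: List[str]) -> None:
        # Store a copy of every example in each of the named datasets.
        for name in datasets:
            self.datasets.setdefault(name, []).extend(dict(eg) for eg in examples)


def save_seeds(db, seeds: List[str], dataset: str) -> List[dict]:
    """Save the seed terms as accepted examples to the given (custom) database."""
    # Assumption: seed terms are stored as plain text tasks marked as accepted.
    seed_tasks = [{"text": seed, "answer": "accept"} for seed in seeds]
    db.add_examples(seed_tasks, datasets=[dataset])
    return seed_tasks


# Usage: write the seeds to the custom DB rather than the default SQLite one.
db = InMemoryDB()
save_seeds(db, ["apple", "banana"], "fruit_terms")
```

The point is just that the seed-saving step and the annotation-saving step go through the same database object, so nothing is written to the default SQLite file.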

Sounds good. I will save the seeds prior to calling the teach function, as long as this doesn't matter or screw anything up.

The dataset just holds the saved annotations, so the saving of the seed terms just happens upfront to make sure they're also in the set (and you don't have to click through them again). It shouldn't have any other implications for the recipe :slightly_smiling_face: