My dataset is fairly large, so loading it each time takes quite a bit of memory. Is there a recipe setting to NOT load the existing dataset and only add/append to it?
Hi! Appending is already the default behaviour (via Database.add_examples). However, when the server starts, the dataset is loaded once, mostly to get the count of already annotated examples. I'll see if we can replace this with a more efficient query.
There really shouldn't be a need to load any of the actual examples.
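To illustrate the general idea (this is not Prodigy's actual code, just a minimal sketch using Python's built-in `sqlite3` with a made-up `example` table and hypothetical helper names): getting the count by loading every row pulls all the examples into memory, while a `COUNT(*)` query lets the database do the counting and only returns a single number.

```python
import sqlite3


def build_demo_db(n_examples):
    """Create an in-memory SQLite database with a hypothetical 'example' table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE example (id INTEGER PRIMARY KEY, content TEXT)")
    conn.executemany(
        "INSERT INTO example (content) VALUES (?)",
        (('{"text": "example %d"}' % i,) for i in range(n_examples)),
    )
    conn.commit()
    return conn


def count_by_loading(conn):
    """Fetch every row into Python and count them (memory grows with the dataset)."""
    rows = conn.execute("SELECT content FROM example").fetchall()
    return len(rows)


def count_by_query(conn):
    """Let the database count the rows; only one integer is returned."""
    (count,) = conn.execute("SELECT COUNT(*) FROM example").fetchone()
    return count


conn = build_demo_db(1000)
print(count_by_loading(conn), count_by_query(conn))
```

Both functions return the same number, but only the first one has to hold all the examples in memory at once, which is what hurts with a large dataset.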
Ooh, I get it: it's loaded for the count, makes sense. I'll keep an eye on the changelog.
Just released v1.9.8, which includes a small adjustment to the startup query so Prodigy doesn't load the individual examples anymore. It still makes a database request, but the startup with a large dataset should hopefully be more efficient now 