Disable Dataset Creation on starting prodigy

I noticed that each time I start prodigy it creates a dataset named with a timestamp in the format yyyy-MM-dd_hh-mm-ss, e.g. 2021-11-23_22-16-24. Is there a way to disable this? Thanks!

Hi @pl6306 ,

I'm curious what you're trying to do. If you're starting a new Prodigy session, you can supply a dataset name to store all annotations under. If you want to resume annotating instead, you can load from an existing dataset so that you don't have to create a new one.

I do supply a prodigy.json that points to a PostgreSQL database config. However, each time I start a new prodigy instance, even though I supply a dataset name with the recipe, the code also creates a dataset with a timestamp as its name. This happens automatically somewhere behind the scenes, and it just pollutes the database with empty datasets.

Hi @pl6306 , sorry for taking some time on this.

Prodigy will always create two datasets in the db:

  • The one with the name you chose
  • A session dataset (the timestamped one) that lets you view the annotations made in a given session. Here, a session means the activity that happened between starting and stopping the server.

There's currently no way to disable the latter. On the database level, there shouldn't be any duplication, as the session dataset is just a separate dataset that the examples are linked to.
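
If the main concern is telling the session datasets apart from the named ones when looking at the database, a small sketch like the one below might help. It assumes session names always follow the yyyy-MM-dd_hh-mm-ss pattern from the question. The helper names `is_session_name` and `split_datasets` and the sample list are made up for illustration, and in practice the names would come from your own database.

```python
from datetime import datetime

# Assumed format of session dataset names, e.g. 2021-11-23_22-16-24
SESSION_FORMAT = "%Y-%m-%d_%H-%M-%S"


def is_session_name(name: str) -> bool:
    """Return True if the name parses as a session timestamp."""
    try:
        datetime.strptime(name, SESSION_FORMAT)
    except ValueError:
        return False
    return True


def split_datasets(names):
    """Split dataset names into (named datasets, session datasets)."""
    named = [n for n in names if not is_session_name(n)]
    sessions = [n for n in names if is_session_name(n)]
    return named, sessions


# Hypothetical list of names, standing in for what the database holds
names = ["my_ner_dataset", "2021-11-23_22-16-24", "2021-11-24_09-05-10"]
named, sessions = split_datasets(names)
print("named:", named)
print("sessions:", sessions)
```

This only classifies names and doesn't touch the database, so it is safe to run while deciding what, if anything, to clean up.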