db-out equivalent on postgres

Hi there,

I have a setup where we automatically sample data into a Postgres table, which serves as the input for our Prodigy annotation interface, which in turn saves the annotations to Postgres again. Unfortunately, I do not fully understand the table structure of Prodigy's DB, and I would like to avoid saving data locally with db-out.

Are there any recommendations on how to export data from Prodigy's DB within Postgres, ideally by just creating or appending to a table containing the key variables of the annotation?

Hi @nicolaiberk,

You can interact with the Prodigy DB programmatically. The DB API is documented here.
You can use methods such as get_dataset_examples to retrieve the annotations from a given dataset as a list of Python dictionaries.
It sounds like what you want is a custom loader function that reads data from your "input" table. Since it is your own function, it can be tailored to whatever schema you use in your "input" DB. It just needs to output Python dictionaries in the Prodigy task format required by your UI.
The recipe would then take care of saving the annotations to the Postgres DB, provided Prodigy is configured to use it.
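
To make the loader idea concrete, here is a minimal sketch, not a definitive implementation. It assumes your "input" table has an id column and a text column; the function name `load_tasks_from_table` and the `source_id` meta key are hypothetical names chosen for illustration. `conn` can be any DB-API connection, for example one created with psycopg2.

```python
from typing import Any, Dict, Iterator


def load_tasks_from_table(conn: Any, query: str) -> Iterator[Dict[str, Any]]:
    """Yield Prodigy-style task dicts from a DB-API connection.

    `conn` is any DB-API 2.0 connection (e.g. from psycopg2.connect).
    `query` is assumed to select exactly two columns: a row id and the text.
    """
    cur = conn.cursor()
    try:
        cur.execute(query)
        for row_id, text in cur:
            # "text" is the field that text-based interfaces display.
            # "meta" keeps a link back to the source row (hypothetical key name).
            yield {"text": text, "meta": {"source_id": row_id}}
    finally:
        cur.close()
```

In a custom recipe, the generator returned by this function would be passed as the "stream" component, next to "dataset" and "view_id". Other interfaces may need additional keys in each task, so adjust the dictionary to whatever your UI expects.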

Perhaps I'm not understanding the nature of the required relation between your "input" DB and your "output" DB (the one that stores annotations), because I think you don't really need to make direct queries to the Prodigy DB in this workflow: the API (which is agnostic to the type of DB) should be all you need.
In any case, to learn more about the Prodigy DB structure, it would probably be easiest to inspect an existing DB using a viewer UI such as DB Browser for SQLite, or pgAdmin or similar for PostgreSQL.

Hey,

My apologies for the late answer. I think the reference to prodigy.components.db.connect was already very helpful!

In general, I was hoping to get a bit more control over the storage format in Postgres, as I do not fully understand the relations between the different tables. Another option would be to export the annotated examples into a new, 'clean' table in the DB.

Hi @nicolaiberk,

If you'd rather save data differently from how Prodigy does it, here you can find the docs on how to plug in your custom database implementation.
Alternatively, as you say, you can export the annotated dataset and save it elsewhere, independently of Prodigy.
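
As a rough sketch of that export route (under stated assumptions, not an official recipe): fetch the examples with get_dataset_examples, keep only the keys you care about, and insert them into your own table. The column names, the `COLUMNS` tuple, and the helper names below are hypothetical, and the selected keys (`_input_hash`, `_task_hash`, `text`, `label`, `answer`, `_timestamp`) are ones Prodigy typically adds to annotated tasks, so check them against your own data. The target table is assumed to exist already.

```python
from typing import Any, Dict, List, Tuple

# Hypothetical column layout of the "clean" target table.
COLUMNS = ("input_hash", "task_hash", "text", "label", "answer", "annotated_at")


def flatten_example(eg: Dict[str, Any]) -> Tuple[Any, ...]:
    """Pick the key variables from one annotated task; missing keys become None."""
    return (
        eg.get("_input_hash"),
        eg.get("_task_hash"),
        eg.get("text"),
        eg.get("label"),
        eg.get("answer"),
        eg.get("_timestamp"),
    )


def write_rows(conn: Any, table: str, rows: List[Tuple[Any, ...]],
               placeholder: str = "%s") -> int:
    """Append rows to `table` through any DB-API connection.

    `placeholder` is "%s" for psycopg2 and "?" for sqlite3.
    `table` is interpolated into the SQL, so it must come from trusted code.
    """
    marks = ", ".join([placeholder] * len(COLUMNS))
    sql = f"INSERT INTO {table} ({', '.join(COLUMNS)}) VALUES ({marks})"
    cur = conn.cursor()
    try:
        cur.executemany(sql, rows)
    finally:
        cur.close()
    conn.commit()
    return len(rows)


def export_dataset(dataset: str, conn: Any, table: str) -> int:
    """Copy one Prodigy dataset into the clean table."""
    # Imported lazily so the helpers above work without Prodigy installed.
    from prodigy.components.db import connect

    db = connect()  # picks up the database settings from prodigy.json
    examples = db.get_dataset_examples(dataset) or []
    return write_rows(conn, table, [flatten_example(eg) for eg in examples])
```

Calling `export_dataset("my_dataset", pg_conn, "clean_annotations")` with a psycopg2 connection would then append one row per annotated example, without writing anything to local disk. Running it repeatedly appends duplicates, so you may want a unique constraint on the task hash or a filter on the timestamp.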
