We've been using a custom recipe to connect to our PostgreSQL database, using the PostgresqlExtDatabase class from playhouse, so that we can pass the user, password and host as environment variables and avoid pushing this information to the repository. Like so:
db = PostgresqlExtDatabase(os.getenv('PRODIGY_DB_NAME'),
I was wondering if there's an equivalent to the prodigy db-out command that works from a custom recipe, so that we can continue using this method to connect to our database. Or is there another recommendation for accomplishing this?
Hi! When you pass your custom db into Prodigy's Database class, you get an instance of Prodigy's database object that has various methods for retrieving annotations and hashes, adding examples, adding datasets and so on. You can find the full API documentation in your
If you want to get the annotations in a given dataset, you can do something like this:
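from prodigy.components.db import Database
# db is the PostgresqlExtDatabase instance created above
prodigy_db = Database(db)
examples = prodigy_db.get_dataset("your_dataset_name")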
examples is a list of dicts that you can then save however you like. You might also find our little helper library srsly helpful, especially for writing JSONL. It's also what Prodigy uses under the hood.
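For reference, here's a minimal sketch of saving a list of dicts as JSONL. It uses only the standard library so it runs without extra dependencies; with srsly installed, srsly.write_jsonl(path, examples) should do the same in one call. The sample records, the file name and the write_jsonl / read_jsonl helpers are made up for illustration and stand in for whatever get_dataset returns for your dataset.

```python
import json
from pathlib import Path


def write_jsonl(path, examples):
    """Write one JSON object per line (the JSONL format)."""
    with Path(path).open("w", encoding="utf-8") as f:
        for eg in examples:
            f.write(json.dumps(eg, ensure_ascii=False) + "\n")


def read_jsonl(path):
    """Read a JSONL file back into a list of dicts, skipping blank lines."""
    with Path(path).open("r", encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]


# Hypothetical stand-in for prodigy_db.get_dataset("your_dataset_name")
examples = [
    {"text": "First example", "answer": "accept"},
    {"text": "Second example", "answer": "reject"},
]

# Hypothetical output file name
write_jsonl("annotations.jsonl", examples)
```

Each line of the output file is one annotation, which is roughly the shape db-out produces, so the file can be loaded back into other tools line by line.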