Hi, I would like to export a JSONL file from an existing dataset using the Prodigy library. I want to call the db-out command from within Python. Is this possible?
Yes, that should be very straightforward! Check out the database API (https://prodi.gy/docs/api-database#database). This is pretty much what db-out does under the hood, too. The get_dataset method loads all examples in a given dataset and returns a list of dictionaries. Here's an example:
from prodigy.components.db import connect
db = connect() # uses settings from your prodigy.json
examples = db.get_dataset("my_cool_dataset")
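To get from that list of dictionaries to a JSONL file, you only need to write one JSON object per line. Here's a minimal sketch using just the standard library. The write_jsonl helper and the output path are made up for illustration and are not part of Prodigy, and the sample dicts stand in for whatever get_dataset returns for your dataset:

```python
import json
import tempfile
from pathlib import Path


def write_jsonl(examples, path):
    """Write an iterable of dicts to `path`, one JSON object per line."""
    path = Path(path)
    with path.open("w", encoding="utf-8") as f:
        for eg in examples:
            # ensure_ascii=False keeps non-ASCII text readable in the file
            f.write(json.dumps(eg, ensure_ascii=False) + "\n")
    return path


# Stand-in data (hypothetical); in practice this would be the list
# returned by db.get_dataset("my_cool_dataset").
examples = [
    {"text": "First example", "answer": "accept"},
    {"text": "Second example", "answer": "reject"},
]

# Written to the temp directory here; use any path you like.
out_path = write_jsonl(examples, Path(tempfile.gettempdir()) / "my_cool_dataset.jsonl")
```

In a real script you would replace the stand-in list with the examples variable from the snippet above and point the path wherever you want the export to go.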