programmatic way to get stats


I was wondering if there is any way to programmatically (i.e. in Python) get the kind of stats that prodigy stats -ls and prodigy stats DBNAME give. In particular, is there a way to return a list of all databases and, given a database, return the number of annotations?


Yes, check out the Database API. For example, here's how you would connect, get all dataset names, and get all annotations of a given dataset:

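from prodigy.components.db import connect

# Connect using your configured database settings
db = connect()
# Names of all datasets in the database
all_dataset_names = db.datasets
# All annotated examples in one dataset (assumes "my_dataset" exists)
examples = db.get_dataset("my_dataset")
# Number of annotations in that dataset
num_annotations = len(examples)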

The source of the stats command/recipe is also shipped with Prodigy, so you can always check it out in recipes/ and see how it's done there.
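
If you want something closer to the stats output for every dataset at once, here is a minimal sketch built only on the two calls above. The helper name dataset_stats is made up for illustration, and it assumes each example is a dict that may carry an "answer" key (accept / reject / ignore); it is not how the built-in recipe is implemented.

```python
from collections import Counter


def dataset_stats(db):
    """Return {dataset_name: {"total": n, "answers": {...}}} for every dataset.

    `db` is assumed to be the object returned by
    prodigy.components.db.connect(), i.e. anything that exposes
    `.datasets` and `.get_dataset(name)`.
    """
    stats = {}
    for name in db.datasets:
        # Fall back to an empty list if nothing is returned for this name.
        examples = db.get_dataset(name) or []
        # Tally the "answer" field; examples without one count as "unknown".
        answers = Counter(eg.get("answer", "unknown") for eg in examples)
        stats[name] = {"total": len(examples), "answers": dict(answers)}
    return stats


# Usage, assuming Prodigy is installed and configured:
# from prodigy.components.db import connect
# print(dataset_stats(connect()))
```

Passing the db object in, rather than connecting inside the function, keeps the helper easy to try out against a small stand-in object before pointing it at a real database.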

Ah, I should have found that. Thanks!
