Is it possible to use Prodigy by importing it or via an API?

Hi there,

I'm trying to integrate Prodigy into a larger project, and I would like to interact with Prodigy programmatically (e.g. starting a labeling server, dumping out its db contents, etc.).

Ideally there would be an interface like:

import prodigy

labeling_server = prodigy.textcat(db='task_a', port=9001, ...)
labeling_server.stop()
labeling_server.dbout(db='task_a', out='/tmp/labels.jsonl')

or

import prodigy

app = prodigy.server(port=9001)

app.textcat(db='task_a') # then I go to localhost:9001/task_a to label it...
app.textcat(db='task_b') # then I go to localhost:9001/task_b to label it...

app.dbout(db='task_a', out='/tmp/labels.json')

This way, instead of manually starting and restarting Prodigy servers from the command line every time I want to label something, I can keep one long-running process.

Is it possible to do this?

Hi! The prodigy.serve function can be used to start a recipe script and the Prodigy server on a given host and port: https://prodi.gy/docs/api-components#serve

I've posted about how it works under the hood in this thread. The app is a modern FastAPI app served by uvicorn, and the source of the app.py is shipped with Prodigy. So if you want something custom, you can also always implement your own server.
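For illustration, here is a rough sketch of wrapping prodigy.serve so it can be started from a long-running script. It assumes a recent Prodigy version where serve takes the recipe command as a single string (as you would type it on the command line) plus config overrides such as port, and that the string is split shell-style. The helper names build_command and start_server, the recipe and the file path are made up for the example. Since the serve call blocks, the sketch runs it in a child process.

```python
import shlex
from multiprocessing import Process


def build_command(recipe, dataset, *args, **options):
    """Assemble a recipe command string like the one typed on the command line."""
    parts = [recipe, dataset, *map(str, args)]
    for name, value in options.items():
        # Keyword options become CLI flags, e.g. label="A,B" -> --label A,B
        parts += [f"--{name.replace('_', '-')}", str(value)]
    # Quote each part so values containing spaces survive shell-style splitting
    # (assumption: Prodigy splits the command string that way).
    return " ".join(shlex.quote(part) for part in parts)


def _serve(command, port):
    # Imported lazily so this module still loads where Prodigy isn't installed.
    import prodigy

    prodigy.serve(command, port=port)  # blocks until the server stops


def start_server(command, port):
    """Run the blocking serve call in a child process and return its handle."""
    proc = Process(target=_serve, args=(command, port), daemon=True)
    proc.start()
    return proc


# Hypothetical usage (call this under `if __name__ == "__main__":`):
#   cmd = build_command("textcat.manual", "task_a", "./data.jsonl", label="POSITIVE,NEGATIVE")
#   server = start_server(cmd, port=9001)
#   ...annotate at http://localhost:9001...
#   server.terminate()
```

Each server still handles one recipe and dataset at a time, so labeling task_a and task_b side by side would mean one process per port rather than one app with several routes.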

The recipes are just regular Python functions so there's no need to have "magical" methods like dbout. You can interact with the database programmatically (API docs here) and write your own logic, or call the db_out recipe (in prodigy.recipes.commands) as a regular function. You can also interact with the REST API if you want to.
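As a minimal sketch of the "write your own logic" route, the function below dumps a dataset to a JSONL file. It assumes the database object returned by connect (from prodigy.components.db) exposes get_dataset, which returns a list of annotation dicts or None for an unknown dataset; newer versions may name this method differently, so check the database API docs for your install. The name dump_dataset is just for this example.

```python
import json


def dump_dataset(db, dataset, out_path):
    """Write every annotation in `dataset` to a JSONL file and return the count."""
    # Assumption: get_dataset returns a list of dicts, or None if the dataset is unknown.
    examples = db.get_dataset(dataset)
    if examples is None:
        raise ValueError(f"Can't find dataset {dataset!r} in the database")
    with open(out_path, "w", encoding="utf-8") as f:
        for eg in examples:
            # One JSON object per line, the same shape db-out exports.
            f.write(json.dumps(eg, ensure_ascii=False) + "\n")
    return len(examples)


# Hypothetical usage with a real Prodigy install:
#   from prodigy.components.db import connect
#   db = connect()  # picks up the database settings from prodigy.json
#   dump_dataset(db, "task_a", "/tmp/labels.jsonl")
```

Because the database handle is passed in, the same function can be called from the process that manages the servers, with no need to stop annotation first.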

Awesome, thank you @ines!
