Semi hot-reload without prodigy restart

ines · October 31, 2019, 10:59am

Hi! Streams in Prodigy are regular Python generators – so you can set them up however you like, make them respond to outside state, read from an external source (database, REST API) and so on. For instance, here's a pseudocode example of loading data from something like a paginated API:

def custom_stream():
    page = 0
    while True:
        # get_new_examples is a placeholder for your own API call
        examples = get_new_examples(page)
        yield from examples
        page += 1

You could also use the files in a directory and, after each iteration (once all examples in a file have been sent out), check whether there's a new file to read from. I don't know where your original review data lives – but if you can retrieve it in Python, you could also do it directly in the recipe script, so you can skip the whole export step altogether.
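Here's a rough sketch of what the directory idea could look like. It assumes the files are JSONL with one example per line, and that a file is complete by the time it shows up in the directory. The names `directory_stream`, `poll_seconds` and `max_idle_polls` are made up for illustration and aren't part of Prodigy:

```python
import json
import time
from pathlib import Path


def directory_stream(directory, poll_seconds=5.0, max_idle_polls=None):
    """Yield examples from *.jsonl files, picking up new files as they appear."""
    seen = set()  # files that have already been sent out
    idle_polls = 0
    while True:
        new_files = sorted(
            path for path in Path(directory).glob("*.jsonl") if path not in seen
        )
        if not new_files:
            idle_polls += 1
            # Optional stop condition (an assumption, handy for testing):
            # give up after a number of polls that found nothing new.
            if max_idle_polls is not None and idle_polls >= max_idle_polls:
                return
            time.sleep(poll_seconds)
            continue
        idle_polls = 0
        for path in new_files:
            seen.add(path)
            with path.open(encoding="utf8") as f:
                for line in f:
                    line = line.strip()
                    if line:  # skip blank lines
                        yield json.loads(line)
```

With `max_idle_polls=None` the generator never ends and just keeps waiting for new files, so dropping a new file into the directory is enough to feed more examples without restarting anything. Keep in mind that the generator blocks while it sleeps, so you probably want a short polling interval.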

If you don't want to edit the recipe script (e.g. if you're using a built-in recipe), you can also write a custom loader script that writes to stdout and then pipe that forward. See here for an example.
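A minimal sketch of such a loader script could look like this. `load_examples` is a hypothetical placeholder for however you actually fetch your data, and the script name `load_data.py` used below is an assumption as well:

```python
import json
import sys


def load_examples():
    # Hypothetical placeholder: replace with your database query or API call.
    yield {"text": "First example"}
    yield {"text": "Second example"}


def write_jsonl(examples, out=None):
    """Write one JSON object per line, flushing so the consumer sees it right away."""
    out = sys.stdout if out is None else out
    for example in examples:
        out.write(json.dumps(example) + "\n")
        out.flush()


if __name__ == "__main__":
    write_jsonl(load_examples())
```

You'd then pipe the output forward, with something like `python load_data.py | prodigy ner.manual your_dataset en_core_web_sm - --label PERSON`, where the `-` in place of the source tells the recipe to read from standard input. The recipe name and arguments here are only an example, so adjust them to whatever you're running.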

Prodigy will typically create two datasets: one with the name you've given when you run the recipe and one timestamped dataset per session. In a custom recipe, you can also return a get_session_id callback to customise how the session IDs are generated.
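As a rough sketch, the callback is just a function that returns a string, and it goes into the dictionary of components your recipe returns. The `review-` prefix and the function names are made up for illustration. In a real recipe, `custom_recipe` would also be registered with the `@prodigy.recipe` decorator, which is left out here so the snippet runs on its own:

```python
import time


def get_session_id():
    # Hypothetical naming scheme: a fixed prefix plus the current timestamp.
    return "review-" + time.strftime("%Y-%m-%d_%H-%M-%S")


def custom_recipe(dataset, stream):
    # Components returned by the recipe, including the session ID callback.
    return {
        "dataset": dataset,
        "stream": stream,
        "view_id": "text",
        "get_session_id": get_session_id,
    }
```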

You might also want to check out the named multi-user sessions (see the "Multi-user sessions" section in your PRODIGY_README.html for details). This allows you to append something like ?session=johannes to the URL in the web app and associate all annotations you collect with that session. You can also customise whether all sessions should see the same examples or whether everyone should see different questions.
