Hi,
we have a database of text that gets updated in real time, and I want to connect it to the Prodigy UI for annotation tasks, so that the pool of annotation records is updated in near real time. How can we achieve this?
Hi Meka,
it depends a bit on what you mean by "near real-time", but I'll list some ideas that might help.
Option A: Cron
You could schedule a cronjob that downloads data from your database to the machine that's running Prodigy, and have it restart the Prodigy server at regular intervals so that you're always working with a reasonably up-to-date dataset. This approach is a bit hacky in the sense that each restart causes Prodigy to go down momentarily, but it's relatively quick to set up.
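For example, assuming you have a script that exports the database to a JSONL file and that you run Prodigy as a systemd service (both names below are placeholders, not something Prodigy ships with), the crontab might look roughly like this:

# Hypothetical crontab: re-export the data and restart Prodigy every hour
0 * * * * python /path/to/export_from_db.py > /data/examples.jsonl
5 * * * * systemctl restart prodigy.service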
Option B: Custom Recipe
A neater approach might be to write a custom recipe. To copy the pseudocode from the docs:
import prodigy

@prodigy.recipe(
    "my-custom-recipe",
    dataset=("Dataset to save answers to", "positional", None, str),
    view_id=("Annotation interface", "option", "v", str)
)
def my_custom_recipe(dataset, view_id="text"):
    # Load your own streams from anywhere you want
    stream = load_my_custom_stream()

    def update(examples):
        # This function is triggered when Prodigy receives annotations
        print(f"Received {len(examples)} annotations!")

    return {
        "dataset": dataset,
        "view_id": view_id,
        "stream": stream,
        "update": update
    }
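Assuming you save the recipe above in a file called recipe.py (the filename is just an example), you can start the server with the -F flag, which points Prodigy at the file containing the recipe code:

prodigy my-custom-recipe my_dataset -F recipe.py

Here, my_dataset is the dataset the annotations will be saved to.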
The stream here is a sequence of dictionaries that contain the items to be annotated. Typically these are read in from a file on disk, but nothing is stopping you from writing a Python generator that queries your database for new items. This approach does involve writing a custom recipe for your task, but it feels like the most flexible option.
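To make that last idea concrete, here's a minimal sketch of such a generator. It assumes a SQLite database with a "texts" table that has "id" and "body" columns; the filename, table, and column names are all made up for illustration, so substitute your own database client and query:

import sqlite3
import time

def load_my_custom_stream():
    # Hypothetical example: poll a SQLite database for rows we haven't
    # seen yet. Swap in your own client and query here.
    last_id = 0
    while True:
        conn = sqlite3.connect("my_texts.db")
        rows = conn.execute(
            "SELECT id, body FROM texts WHERE id > ? ORDER BY id",
            (last_id,),
        ).fetchall()
        conn.close()
        for row_id, body in rows:
            last_id = row_id
            # Prodigy tasks are dictionaries; "text" is the key that the
            # "text" view_id renders, and "meta" shows up in the UI
            yield {"text": body, "meta": {"row_id": row_id}}
        if not rows:
            # No new records yet; wait a bit before polling again
            time.sleep(5)

Prodigy consumes the stream lazily in batches, so an infinite generator like this works fine: new database rows will show up for annotation as the annotator works through the queue, without restarting the server.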
Does this help?