Integrating Prodigy with Custom Platform and adding raw data dynamically without restarting Server

ines · March 1, 2022, 8:39pm

Hi! If you're annotating data for different tasks, you typically want to do this in different instances so you can keep state in memory, load a model if needed, save the different annotation types to separate datasets that you can run training experiments with, and so on. We'd also recommend running longer annotation sessions instead of annotating a single example at a time, which often isn't that useful.

So a better approach might be to include a feature to flag an example for annotation, and then periodically annotate the flagged examples in batches.

If you want to implement this, you could have a loader that periodically queries an external source, like an API or your database, for the examples flagged for annotation. Here are some basic examples of loading from a file path or a custom source: Loaders and Input Data · Prodigy · An annotation tool for AI, Machine Learning & NLP. The only difference is that in your case, you'd make a request to your database or similar, and wrap it in a while True: loop so it keeps polling until new data is available. New examples will then be queued up when you refresh the browser.
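To make the idea concrete, here is a minimal sketch of such a polling loader in plain Python. It is only an illustration under some assumptions: `fetch_flagged` is a hypothetical callable you would write yourself to query your database or API, the `"id"` and `"text"` fields are an assumed record shape, and the deduplication by ID is an addition to avoid re-queuing examples that are still flagged. The generator yields dictionaries with a `"text"` key, which is the usual shape for text tasks.

```python
import time
from typing import Callable, Dict, Iterable, Iterator


def polling_stream(
    fetch_flagged: Callable[[], Iterable[Dict]],  # hypothetical: your own DB/API query
    poll_seconds: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Iterator[Dict]:
    """Yield flagged examples forever, polling the source when nothing new is found."""
    seen = set()  # IDs already sent out, so still-flagged records are not repeated
    while True:
        new_count = 0
        for record in fetch_flagged():
            key = record["id"]  # assumed field name
            if key in seen:
                continue
            seen.add(key)
            new_count += 1
            # Keep the source ID in "meta" so annotations can be traced back.
            yield {"text": record["text"], "meta": {"id": key}}
        if new_count == 0:
            # Nothing new this round: wait before querying the source again.
            sleep(poll_seconds)
```

In a custom recipe, a generator like this could be returned as the `"stream"` component. Note that the loop blocks while it waits, so a request for new questions will not return until the source has something new; whether that is acceptable depends on your setup.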
