Find out annotator progress

ines · March 3, 2021, 11:46pm

Hi! Are you storing anything with your individual examples that indicates the document the text belongs to (or the line number, a running index, something like that)? Ultimately, it depends on how you define "progress" and the metrics you're looking for, but let's say your examples look like this and include the number of the document and a running ID (e.g. nth sentence in the document):

{"text": "...", "document_no": 5, "id": 1234}

You could then, for instance, do something like this to find the highest ID available in your dataset for that document – this tells you how far your annotators have come already.

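from prodigy.components.db import connect

def get_progress_for_document(dataset, document_no):
    db = connect()  # connects using your Prodigy database settings
    examples = db.get_dataset(dataset) or []
    # Collect the running IDs of all annotated examples from this document
    ids = [eg["id"] for eg in examples if eg.get("document_no") == document_no]
    if not ids:
        # max() raises a ValueError on an empty list, so handle this case first
        print(f"No annotations yet for document {document_no} in {dataset}")
        return None
    # The highest ID shows how far into the document the annotators have come
    print(f"Progress for {dataset}, document {document_no}:", max(ids))
    return max(ids)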

If you're using named multi-user sessions for your annotators, you could also include the "_session_id" here, which will tell you the session (and user) that annotated the given example. There might also be other metadata that you can include or analyse here, depending on what's in your data. (Pro tip: You could even put together a mini Streamlit app and visualise these stats, so you can have a custom dashboard super specific to your dataset.)
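To make the "_session_id" idea a bit more concrete, here is a minimal sketch that groups progress by session. It assumes the examples carry the "document_no" and "id" fields from the example above, and it takes a plain list of example dicts (for instance, the result of db.get_dataset) so it can be tried without a database. The function name progress_by_session and the "unknown" fallback label are made up for illustration and are not part of Prodigy.

```python
from collections import defaultdict

def progress_by_session(examples, document_no):
    """Return {session_id: {"count": n, "max_id": m}} for one document."""
    stats = defaultdict(lambda: {"count": 0, "max_id": None})
    for eg in examples:
        if eg.get("document_no") != document_no:
            continue
        # "_session_id" is only present when named sessions were used;
        # "unknown" is a hypothetical fallback label for everything else.
        session = eg.get("_session_id") or "unknown"
        entry = stats[session]
        entry["count"] += 1
        if entry["max_id"] is None or eg["id"] > entry["max_id"]:
            entry["max_id"] = eg["id"]
    return dict(stats)

# Usage with a real dataset (assumes Prodigy is installed and configured):
#   from prodigy.components.db import connect
#   examples = connect().get_dataset("my_dataset") or []
#   print(progress_by_session(examples, document_no=5))
```

The count tells you how many examples each session has saved for the document, while the highest ID tells you how far into the document that session has got. The two can differ if annotators skip around or if several sessions share the same document.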
