In your case, wouldn't it be easier to run such a script on the annotated dataset instead?
prodigy db-out dataset > out.jsonl
Here's an example file I have locally that contains annotations from multiple annotators.
{"text":"stroopwafels are great","_input_hash":506862616,"_task_hash":-1495214589,"label":"truthy","_view_id":"classification","answer":"accept","_timestamp":1666777124,"_annotator_id":"issue-6044-vincent","_session_id":"issue-6044-vincent"}
{"text":"apples are healthy","_input_hash":111541500,"_task_hash":1515955516,"label":"truthy","_view_id":"classification","answer":"accept","_timestamp":1666777125,"_annotator_id":"issue-6044-vincent","_session_id":"issue-6044-vincent"}
{"text":"stroopwafels are great","_input_hash":506862616,"_task_hash":-1495214589,"label":"truthy","_view_id":"classification","answer":"accept","_timestamp":1666777134,"_annotator_id":"issue-6044-jimmy","_session_id":"issue-6044-jimmy"}
{"text":"apples are healthy","_input_hash":111541500,"_task_hash":1515955516,"label":"truthy","_view_id":"classification","answer":"accept","_timestamp":1666777134,"_annotator_id":"issue-6044-jimmy","_session_id":"issue-6044-jimmy"}
{"text":"stroopwafels are great","_input_hash":506862616,"_task_hash":-1495214589,"label":"truthy","_view_id":"classification","answer":"reject","_timestamp":1666777142,"_annotator_id":"issue-6044-lechuck","_session_id":"issue-6044-lechuck"}
{"text":"apples are healthy","_input_hash":111541500,"_task_hash":1515955516,"label":"truthy","_view_id":"classification","answer":"reject","_timestamp":1666777143,"_annotator_id":"issue-6044-lechuck","_session_id":"issue-6044-lechuck"}
{"text":"brussel sprouts are amazing","_input_hash":564254940,"_task_hash":-321962903,"label":"truthy","_view_id":"classification","answer":"reject","_timestamp":1666777527,"_annotator_id":"issue-6044-vincent","_session_id":"issue-6044-vincent"}
{"text":"brussel sprouts are amazing","_input_hash":564254940,"_task_hash":-321962903,"label":"truthy","_view_id":"classification","answer":"reject","_timestamp":1666777537,"_annotator_id":"issue-6044-jimmy","_session_id":"issue-6044-jimmy"}
{"text":"brussel sprouts are amazing","_input_hash":564254940,"_task_hash":-321962903,"label":"truthy","_view_id":"classification","answer":"reject","_timestamp":1666777544,"_annotator_id":"issue-6044-lechuck","_session_id":"issue-6044-lechuck"}
{"text":"it is cold today","_input_hash":718077657,"_task_hash":-363462449,"label":"truthy","_view_id":"classification","answer":"accept","_timestamp":1666878566,"_annotator_id":"issue-6044-guybrush","_session_id":"issue-6044-guybrush"}
{"text":"a wood chuck could chuck a lot of wood if a wood chuck could chuck wood","_input_hash":-1690856185,"_task_hash":1885086500,"label":"truthy","_view_id":"classification","answer":"accept","_timestamp":1666878830,"_annotator_id":"issue-6044-guybrush","_session_id":"issue-6044-guybrush"}
Here's a pandas script that takes such a file and groups the annotations per hour.
import pandas as pd

(pd.read_json("out.jsonl", lines=True)
   # Round each Unix timestamp to the nearest hour.
   .assign(dt=lambda d: pd.to_datetime(d["_timestamp"], unit="s").round("H"))
   .groupby("dt")
   .agg(n_text=("_input_hash", "nunique"),      # unique texts annotated
        n_annot=("_annotator_id", "nunique"),   # unique annotators active
        n_examples=("_annotator_id", "size")))  # total annotations made
Here's the output.
                     n_text  n_annot  n_examples
dt
2022-10-26 10:00:00       3        3           9
2022-10-27 14:00:00       2        1           2
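If you also want a breakdown per annotator, a small variation of the same query could look like this. Just a sketch, assuming the same out.jsonl file as before:

import pandas as pd

(pd.read_json("out.jsonl", lines=True)
   .assign(dt=lambda d: pd.to_datetime(d["_timestamp"], unit="s").round("H"))
   # Group by the hour *and* the annotator this time.
   .groupby(["dt", "_annotator_id"])
   .agg(n_examples=("_task_hash", "size")))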
You can customize such a pandas query to your heart's content, but I can imagine that running something like this as a script gives you the most flexibility. You could even turn it into a Streamlit app, along the lines of the sketch below.
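Here's a minimal sketch of what such a Streamlit app might look like; the file name out.jsonl is an assumption, and you'd likely want to add filters and charts of your own:

import pandas as pd
import streamlit as st

st.title("Annotation progress")

# Load the exported annotations; "out.jsonl" is an assumed file name.
df = (pd.read_json("out.jsonl", lines=True)
        .assign(dt=lambda d: pd.to_datetime(d["_timestamp"], unit="s").round("H")))

# Same hourly aggregation as the script above.
stats = (df.groupby("dt")
           .agg(n_text=("_input_hash", "nunique"),
                n_annot=("_annotator_id", "nunique"),
                n_examples=("_annotator_id", "size")))

st.dataframe(stats)                # the hourly table
st.bar_chart(stats["n_examples"])  # annotations made per hour

If you save that as app.py, running streamlit run app.py should give you a small dashboard that you can refresh whenever you re-export the dataset.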