I’m working with a team of annotators to develop an Arabic NER model using Prodigy. I put together some really quick code to set up a multi-annotator Prodigy system. Here it is, in case anyone else finds it useful! https://github.com/ahalterman/multiuser_prodigy
It’s pretty bare-bones and one-off, but comments and issues are welcome.
Wow, this is great – thanks a lot for sharing!
In this case, how would you save the outputs of different threads?
If I read the code correctly, all annotations are saved to the dataset
If you want to save annotations to different sets, or add information about the annotator who worked on them, you might want to start several individual Prodigy sessions with different sets, or put a little service in the middle that farms out the annotation tasks to the annotators and adds the annotator ID to them, for example task['annotator_id'] = 123. You can find different approaches to this on the forum if you search for “multiple annotators”. I think starting multiple sessions on different ports/hosts might be the easiest solution. You can do this with a simple Python or shell script, and even overwrite the session_id (see the PRODIGY_README.html for details) and set the port and host via the PRODIGY_PORT and PRODIGY_HOST environment variables.
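As a rough sketch of both ideas above (tagging tasks with an annotator ID, and starting one session per annotator on its own port), something like the following might work. The helper names, the per-annotator dataset naming scheme, the default host and base port, and the recipe arguments are all assumptions made for illustration; they are not taken from the multiuser_prodigy repo.

```python
import os
import subprocess


def add_annotator_id(stream, annotator_id):
    """Yield a copy of each task dict with the annotator's ID attached."""
    for task in stream:
        tagged = dict(task)  # copy so the original task is not mutated
        tagged["annotator_id"] = annotator_id
        yield tagged


def build_sessions(annotators, base_dataset, recipe_args,
                   host="0.0.0.0", base_port=8080):
    """Return one (command, env) pair per annotator.

    Each annotator gets a separate dataset (hypothetical naming scheme:
    "<base_dataset>_<name>") and a separate port, set through the
    PRODIGY_HOST and PRODIGY_PORT environment variables.
    recipe_args[0] is the recipe name; the rest are passed through as-is.
    """
    sessions = []
    for offset, name in enumerate(annotators):
        env = dict(os.environ)
        env["PRODIGY_HOST"] = host
        env["PRODIGY_PORT"] = str(base_port + offset)
        command = ["prodigy", recipe_args[0], f"{base_dataset}_{name}",
                   *recipe_args[1:]]
        sessions.append((command, env))
    return sessions


def launch(sessions):
    """Start every session as a background process and return the handles."""
    return [subprocess.Popen(command, env=env) for command, env in sessions]
```

Calling build_sessions(["alice", "bob"], "arabic_ner", [...]) with your own recipe name and arguments gives two commands, one for port 8080 and one for port 8081, which launch() then starts. Keeping the command construction separate from the launching makes it easy to print and check the commands before running anything.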
Ok. I was also trying to use different ports.
In case anyone’s using the multiuser setup, I put together some code to generate a daily report that tracks things like coding activity per coder per day, the distribution of time per task, etc. A lot of it is hard-coded, but it should be quite easy to modify for your own DB or figures.
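For anyone who wants a starting point for this kind of report, here is a minimal sketch of the per-coder, per-day count. It is not the code from the repo: it assumes each saved annotation is a dict with a hypothetical "annotator_id" field and a Unix "timestamp" field in seconds, so adjust the field names to whatever your own DB stores.

```python
from collections import Counter
from datetime import datetime, timezone


def daily_counts(examples):
    """Count annotations per (annotator_id, UTC day).

    Assumes each example has "annotator_id" and a Unix "timestamp"
    in seconds; both field names are placeholders for illustration.
    """
    counts = Counter()
    for eg in examples:
        day = datetime.fromtimestamp(eg["timestamp"], tz=timezone.utc)
        counts[(eg["annotator_id"], day.date().isoformat())] += 1
    return counts


def format_report(counts):
    """Render the counts as plain-text lines, sorted by day then coder."""
    rows = sorted(counts.items(), key=lambda item: (item[0][1], item[0][0]))
    return [f"{day}  {coder}: {n}" for (coder, day), n in rows]
```

Days are computed in UTC here, so annotations made late in the evening in another timezone may land on the following day; swap in a local timezone if that matters for your team.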
I just pushed some updates to this, including a more useful readme. There's more example code using a blocks interface, plus a new dashboard using Streamlit that annotators can use to check their progress. This multiuser setup has been working well for me for several annotation projects, and I hope it's useful to others!