Code sharing: multi-annotator setup

I’m working with a team of annotators to develop an Arabic NER model using Prodigy. I put together some really quick code to set up a multi-annotator Prodigy system. Here it is, in case anyone else finds it useful! https://github.com/ahalterman/multiuser_prodigy

It’s pretty bare-bones and one-off, but comments and issues are welcome.

Wow, this is great – thanks a lot for sharing! :heart_eyes:

@andy @ines
In this case, how would you save the outputs of different threads?

If I read the code correctly, all annotations are saved to the dataset 'multiuser_test'.

If you want to save annotations to different datasets, or add information about the annotator who worked on them, there are two main options. You can start several individual Prodigy sessions, each with its own dataset, or you can put a little service in the middle that farms out the annotation tasks to the annotators and adds the annotator ID to each task, for example task['annotator_id'] = 123. You can find different approaches to this on the forum if you search for “multiple annotators”. I think starting multiple sessions on different ports/hosts might be the easiest solution. You can do this with a simple Python or shell script, overwrite the session_id if needed (see the PRODIGY_README.html for details), and set the port and host via the PRODIGY_PORT and PRODIGY_HOST environment variables.
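For anyone who wants a starting point, here is a minimal sketch of the "one session per annotator" idea in Python. It only relies on the PRODIGY_PORT and PRODIGY_HOST environment variables mentioned above. Everything else is an assumption for illustration: the annotator names, the `ner_<name>` dataset naming, the recipe arguments in the command, and the helper names `build_sessions`, `add_annotator_id` and `launch`. Swap in the recipe command you actually use.

```python
import os
import subprocess


def build_sessions(annotators, base_port=8080, host="0.0.0.0"):
    """Return one (command, env) pair per annotator, each on its own port."""
    sessions = []
    for offset, name in enumerate(annotators):
        # Copy the current environment and override host/port for this session.
        env = dict(os.environ)
        env["PRODIGY_HOST"] = host
        env["PRODIGY_PORT"] = str(base_port + offset)
        # Placeholder command: the recipe, model, source file and labels are
        # assumptions. Only the per-annotator dataset name matters here.
        cmd = [
            "prodigy", "ner.manual", f"ner_{name}",
            "your_model", "your_data.jsonl", "--label", "PERSON,ORG",
        ]
        sessions.append((cmd, env))
    return sessions


def add_annotator_id(stream, annotator_id):
    """Tag every task in a stream with the annotator who will work on it."""
    for task in stream:
        task["annotator_id"] = annotator_id
        yield task


def launch(sessions):
    """Start every session as a separate process and wait for all of them."""
    procs = [subprocess.Popen(cmd, env=env) for cmd, env in sessions]
    for proc in procs:
        proc.wait()
```

Calling `launch(build_sessions(["alice", "bob"]))` would start two servers on ports 8080 and 8081, each saving to its own dataset. `add_annotator_id` is the alternative route: wrap the stream in a custom recipe so the ID travels with each task into a shared dataset.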

OK. I was also trying to use different ports.
Thank you. :smile:

In case anyone’s using the multiuser setup, I put together some code to generate a daily report that tracks things like coding activity per coder per day, the distribution of time per task, etc. A lot is hard-coded, but it should be quite easy to modify for your own DB or figures.
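To give a rough idea of the "activity per coder per day" part, here is a small sketch. It is not the code from the report script. It assumes each saved example is a dict with a hypothetical `annotator_id` field and a hypothetical `timestamp` field (Unix seconds), so adjust the field names to whatever your DB actually stores.

```python
from collections import Counter
from datetime import datetime, timezone


def daily_counts(examples):
    """Count annotations per (annotator, day), with days taken in UTC."""
    counts = Counter()
    for eg in examples:
        # "timestamp" and "annotator_id" are assumed field names.
        day = datetime.fromtimestamp(eg["timestamp"], tz=timezone.utc).date()
        counts[(eg["annotator_id"], day.isoformat())] += 1
    return counts
```

The resulting counter can be dumped to a table or plotted, one row per coder per day.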
