Correct. An alternative to multi-user named sessions is to run each annotator on a different port/dataset.
Here's a good pro/con comparison of the two:
Sounds like you want "overlapping" annotations, i.e., you need to set `feed_overlap` to `true`. You can do this in `prodigy.json` or as an override.
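For illustration, here's a minimal sketch of enabling it from a custom recipe's returned `"config"` (the same key works in `prodigy.json`); the recipe name, loader, and label below are just placeholders:

```python
# Minimal sketch: turn on feed_overlap from a custom recipe.
# The same "feed_overlap": true can go straight into prodigy.json.
import prodigy
from prodigy.components.loaders import JSONL

@prodigy.recipe("textcat-overlap-demo")  # hypothetical recipe name
def textcat_overlap_demo(dataset: str, source: str):
    # Assumes a JSONL source where each line has a "text" field;
    # a placeholder label is added so the classification UI has something to show.
    stream = ({"text": eg["text"], "label": "RELEVANT"} for eg in JSONL(source))
    return {
        "dataset": dataset,
        "stream": stream,
        "view_id": "classification",
        "config": {
            "feed_overlap": True,  # every named session gets every example
        },
    }
```

Each annotator would then open their own named session, e.g. `/?session=alice` and `/?session=bob`.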
I just responded to a similar question; see the bottom of that post for the differences with `feed_overlap`:
If you want to autosave, you can either set `batch_size` to 1 or set `instant_submit` to `true` in `prodigy.json` (config). The one downside of these approaches is that they remove undo, since the batch is no longer held on the client before being sent to the database.
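Roughly, the two options look like this (shown as a Python dict for illustration; the same keys go into `prodigy.json` or a recipe's returned `"config"`):

```python
# Two ways to approximate autosave (pick one, not both).
autosave_overrides = {
    "instant_submit": True,  # send each answer to the server the moment it's submitted
    # or, alternatively:
    # "batch_size": 1,       # keep only one example client-side before sending it back
}
```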
I can't remember the specific differences, but this post discusses them in more detail:
Also, the progress bar is not updated in real time; it's only updated when a new batch is retrieved. We've been debating changing this in a future version, but there are unintended consequences in high-latency environments with multiple annotators that can cause issues.
If you want something more real-time, you could check out the `update` callback. @koaning has a great video on using it to track annotator speed:
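If it helps, here's a rough sketch (not the exact code from the video) of an `update` callback that logs how quickly each batch of answers comes back and which sessions they came from; the recipe name and label are placeholders:

```python
# Rough sketch: use the update callback to log annotation throughput per batch.
import time
import prodigy
from prodigy.components.loaders import JSONL

@prodigy.recipe("textcat-timed")  # hypothetical recipe name
def textcat_timed(dataset: str, source: str):
    stream = ({"text": eg["text"], "label": "RELEVANT"} for eg in JSONL(source))
    last_batch_time = {"t": time.time()}

    def update(answers):
        # Called whenever the app sends a batch of answered examples back.
        now = time.time()
        elapsed = now - last_batch_time["t"]
        last_batch_time["t"] = now
        sessions = {eg.get("_session_id", "default") for eg in answers}
        print(f"{len(answers)} answers from {sessions} ({elapsed:.1f}s since last batch)")

    return {
        "dataset": dataset,
        "stream": stream,
        "view_id": "classification",
        "update": update,
        "config": {"feed_overlap": True},
    }
```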
Since you're setting `feed_overlap` to `true`, be sure to update Prodigy today. Yesterday we released v1.11.9, which fixed a lingering bug that produced "duplicated" annotations in high-latency, multi-annotator sessions.
Also, it's worth mentioning the team is working very hard on releasing Prodigy v2 in a few months. This will add even greater customization to `feed_overlap` (e.g., if you want only a certain percentage of examples overlapping across annotators).
For the release of Prodigy v2, we have some exciting new features and a significant redesign of the way examples are sent to different annotators. Specifically, this redesign will eliminate the need for this tradeoff and should eliminate unwanted duplicates entirely. It will also bring more customization to the `feed_overlap` setting, like setting the number of annotations you want for each task, or configuring the percentage of examples that should have overlapping annotations. We're even working on support for registering custom policies to distribute work to different annotators.
Also, since you're looking at having multiple annotators, be sure to check out @pmbaumgartner's great Inter-Annotator Agreement recipes. They can help you "calibrate" how consistent your annotators are (see Peter's wonderful NormConf talk on why calibration is important). We'd love feedback as that project evolves!
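If you just want a quick sanity check before diving into those recipes, here's a back-of-the-envelope sketch (this is not the recipes' API) that computes pairwise percent agreement from a `db-out`-style JSONL export; the file name is hypothetical:

```python
# Quick-and-dirty pairwise agreement on accept/reject answers from a
# `prodigy db-out` JSONL export. Not the IAA recipes' API, just the idea.
import json
from collections import defaultdict
from itertools import combinations

def pairwise_agreement(path):
    # answers[input_hash][session_id] = answer ("accept", "reject", "ignore")
    answers = defaultdict(dict)
    with open(path, encoding="utf8") as f:
        for line in f:
            eg = json.loads(line)
            answers[eg["_input_hash"]][eg.get("_session_id", "default")] = eg["answer"]

    agree, total = 0, 0
    for by_session in answers.values():
        for (_, a1), (_, a2) in combinations(by_session.items(), 2):
            total += 1
            agree += a1 == a2
    return agree / total if total else 0.0

print(pairwise_agreement("annotations.jsonl"))  # hypothetical file name
```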
Thanks again for your questions (and sorry for dumping a ton of links on you). Hopefully this gives you plenty to work with; please post back if you run interesting experiments, or link to a blog post/paper if you're successful!