I just tried reproducing this locally, and while I wasn't able to find any issues, I figured that it couldn't hurt to share my findings and steps taken.
I started with this dataset.
{"text":"1"}
{"text":"2"}
{"text":"3"}
{"text":"4"}
{"text":"5"}
{"text":"6"}
{"text":"7"}
{"text":"8"}
{"text":"9"}
{"text":"10"}
{"text":"11"}
{"text":"12"}
{"text":"13"}
{"text":"14"}
{"text":"15"}
{"text":"16"}
{"text":"17"}
{"text":"18"}
{"text":"19"}
{"text":"20"}
{"text":"21"}
{"text":"22"}
{"text":"23"}
{"text":"24"}
{"text":"25"}
{"text":"26"}
{"text":"27"}
{"text":"28"}
{"text":"29"}
{"text":"30"}
It's really just a dummy dataset that has exactly 30 rows. This dataset was used by this recipe call:
PRODIGY_ALLOWED_SESSIONS="user1,user2,user3,user4,user5" PRODIGY_CONFIG_OVERRIDES='{"annotations_per_task": 3, "allow_work_stealing": false}' PRODIGY_LOGGING=verbose python -m prodigy ner.manual issue-6702 en_core_web_sm examples-30.jsonl --label number
Notice that I'm setting the sessions upfront, assigning 3 annotations per task and disallowing work stealing. I'm also turning on the verbose logs.
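In case you want to recreate the dummy dataset before running the command above, here is a minimal sketch. The helper names `dummy_rows` and `write_jsonl` are made up for illustration; only the file name `examples-30.jsonl` comes from the recipe call.

```python
import json


def dummy_rows(n_rows=30):
    """Build JSONL lines of the form {"text":"1"} ... {"text":"<n_rows>"}."""
    return [
        json.dumps({"text": str(i)}, separators=(",", ":"))
        for i in range(1, n_rows + 1)
    ]


def write_jsonl(path, rows):
    """Write one JSON record per line."""
    with open(path, "w", encoding="utf-8") as f:
        for row in rows:
            f.write(row + "\n")


if __name__ == "__main__":
    # File name taken from the recipe call; adjust it to your setup.
    write_jsonl("examples-30.jsonl", dummy_rows(30))
```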
Next, I started a browser and opened five tabs, one for each user.
As I opened these tabs, logs started to appear. These are the logs from the first time the task router was triggered, for user1.
14:38:57: ROUTER: Routing item with _input_hash=-2045454197 -> ['issue-6702-2-user4', 'issue-6702-2-user5', 'issue-6702-2-user2']
14:38:57: ROUTER: Routing item with _input_hash=-784123405 -> ['issue-6702-2-user1', 'issue-6702-2-user5', 'issue-6702-2-user4']
14:38:57: ROUTER: Routing item with _input_hash=-805513229 -> ['issue-6702-2-user2', 'issue-6702-2-user5', 'issue-6702-2-user3']
14:38:57: ROUTER: Routing item with _input_hash=-1835389134 -> ['issue-6702-2-user2', 'issue-6702-2-user4', 'issue-6702-2-user1']
14:38:57: ROUTER: Routing item with _input_hash=-1991218384 -> ['issue-6702-2-user2', 'issue-6702-2-user1', 'issue-6702-2-user5']
14:38:57: ROUTER: Routing item with _input_hash=-1639312927 -> ['issue-6702-2-user4', 'issue-6702-2-user2', 'issue-6702-2-user5']
14:38:57: ROUTER: Routing item with _input_hash=372028165 -> ['issue-6702-2-user1', 'issue-6702-2-user3', 'issue-6702-2-user4']
14:38:58: ROUTER: Routing item with _input_hash=2018066853 -> ['issue-6702-2-user4', 'issue-6702-2-user2', 'issue-6702-2-user1']
14:38:58: ROUTER: Routing item with _input_hash=42709192 -> ['issue-6702-2-user3', 'issue-6702-2-user1', 'issue-6702-2-user4']
14:38:58: ROUTER: Routing item with _input_hash=1254603855 -> ['issue-6702-2-user1', 'issue-6702-2-user5', 'issue-6702-2-user2']
14:38:58: ROUTER: Routing item with _input_hash=-1789291377 -> ['issue-6702-2-user4', 'issue-6702-2-user5', 'issue-6702-2-user1']
14:38:58: ROUTER: Routing item with _input_hash=1626462944 -> ['issue-6702-2-user5', 'issue-6702-2-user1', 'issue-6702-2-user4']
14:38:58: ROUTER: Routing item with _input_hash=855192632 -> ['issue-6702-2-user3', 'issue-6702-2-user1', 'issue-6702-2-user5']
You'll notice that the log shows 13 routed items. That's because the router keeps pulling from the stream until it has 10 examples for user1; the 3 extra lines are items that were routed to other users only. The tasks are distributed somewhat evenly, but not perfectly, because of the hashing.
I proceeded to annotate all of the examples by simply hitting accept everywhere. Then I stopped the recipe and called db-out. That resulted in the following output:
{"text":"2","_input_hash":-784123405,"_task_hash":985633209,"_is_binary":false,"tokens":[{"text":"2","start":0,"end":1,"id":0,"ws":false}],"_view_id":"ner_manual","answer":"accept","_timestamp":1690288501,"_annotator_id":"issue-6702-user1","_session_id":"issue-6702-user1"}
{"text":"4","_input_hash":-1835389134,"_task_hash":678744456,"_is_binary":false,"tokens":[{"text":"4","start":0,"end":1,"id":0,"ws":false}],"_view_id":"ner_manual","answer":"accept","_timestamp":1690288502,"_annotator_id":"issue-6702-user1","_session_id":"issue-6702-user1"}
{"text":"5","_input_hash":-1991218384,"_task_hash":-1581206528,"_is_binary":false,"tokens":[{"text":"5","start":0,"end":1,"id":0,"ws":false}],"_view_id":"ner_manual","answer":"accept","_timestamp":1690288502,"_annotator_id":"issue-6702-user1","_session_id":"issue-6702-user1"}
{"text":"7","_input_hash":372028165,"_task_hash":1401556973,"_is_binary":false,"tokens":[{"text":"7","start":0,"end":1,"id":0,"ws":false}],"_view_id":"ner_manual","answer":"accept","_timestamp":1690288502,"_annotator_id":"issue-6702-user1","_session_id":"issue-6702-user1"}
{"text":"8","_input_hash":2018066853,"_task_hash":-1154883125,"_is_binary":false,"tokens":[{"text":"8","start":0,"end":1,"id":0,"ws":false}],"_view_id":"ner_manual","answer":"accept","_timestamp":1690288502,"_annotator_id":"issue-6702-user1","_session_id":"issue-6702-user1"}
{"text":"9","_input_hash":42709192,"_task_hash":811274923,"_is_binary":false,"tokens":[{"text":"9","start":0,"end":1,"id":0,"ws":false}],"_view_id":"ner_manual","answer":"accept","_timestamp":1690288503,"_annotator_id":"issue-6702-user1","_session_id":"issue-6702-user1"}
{"text":"10","_input_hash":1254603855,"_task_hash":16089945,"_is_binary":false,"tokens":[{"text":"10","start":0,"end":2,"id":0,"ws":false}],"_view_id":"ner_manual","answer":"accept","_timestamp":1690288503,"_annotator_id":"issue-6702-user1","_session_id":"issue-6702-user1"}
{"text":"11","_input_hash":-1789291377,"_task_hash":-474789062,"_is_binary":false,"tokens":[{"text":"11","start":0,"end":2,"id":0,"ws":false}],"_view_id":"ner_manual","answer":"accept","_timestamp":1690288503,"_annotator_id":"issue-6702-user1","_session_id":"issue-6702-user1"}
{"text":"12","_input_hash":1626462944,"_task_hash":2018616092,"_is_binary":false,"tokens":[{"text":"12","start":0,"end":2,"id":0,"ws":false}],"_view_id":"ner_manual","answer":"accept","_timestamp":1690288503,"_annotator_id":"issue-6702-user1","_session_id":"issue-6702-user1"}
{"text":"13","_input_hash":855192632,"_task_hash":335429472,"_is_binary":false,"tokens":[{"text":"13","start":0,"end":2,"id":0,"ws":false}],"_view_id":"ner_manual","answer":"accept","_timestamp":1690288504,"_annotator_id":"issue-6702-user1","_session_id":"issue-6702-user1"}
{"text":"14","_input_hash":-423927864,"_task_hash":-311834425,"_is_binary":false,"tokens":[{"text":"14","start":0,"end":2,"id":0,"ws":false}],"_view_id":"ner_manual","answer":"accept","_timestamp":1690288504,"_annotator_id":"issue-6702-user1","_session_id":"issue-6702-user1"}
{"text":"15","_input_hash":-902457195,"_task_hash":185203310,"_is_binary":false,"tokens":[{"text":"15","start":0,"end":2,"id":0,"ws":false}],"_view_id":"ner_manual","answer":"accept","_timestamp":1690288504,"_annotator_id":"issue-6702-user1","_session_id":"issue-6702-user1"}
{"text":"17","_input_hash":933542244,"_task_hash":-1066193932,"_is_binary":false,"tokens":[{"text":"17","start":0,"end":2,"id":0,"ws":false}],"_view_id":"ner_manual","answer":"accept","_timestamp":1690288505,"_annotator_id":"issue-6702-user1","_session_id":"issue-6702-user1"}
{"text":"19","_input_hash":2035079888,"_task_hash":30789149,"_is_binary":false,"tokens":[{"text":"19","start":0,"end":2,"id":0,"ws":false}],"_view_id":"ner_manual","answer":"accept","_timestamp":1690288505,"_annotator_id":"issue-6702-user1","_session_id":"issue-6702-user1"}
{"text":"21","_input_hash":-1139507680,"_task_hash":-1258641755,"_is_binary":false,"tokens":[{"text":"21","start":0,"end":2,"id":0,"ws":false}],"_view_id":"ner_manual","answer":"accept","_timestamp":1690288505,"_annotator_id":"issue-6702-user1","_session_id":"issue-6702-user1"}
{"text":"22","_input_hash":1765252975,"_task_hash":-1221112846,"_is_binary":false,"tokens":[{"text":"22","start":0,"end":2,"id":0,"ws":false}],"_view_id":"ner_manual","answer":"accept","_timestamp":1690288505,"_annotator_id":"issue-6702-user1","_session_id":"issue-6702-user1"}
{"text":"25","_input_hash":230261373,"_task_hash":329569934,"_is_binary":false,"tokens":[{"text":"25","start":0,"end":2,"id":0,"ws":false}],"_view_id":"ner_manual","answer":"accept","_timestamp":1690288506,"_annotator_id":"issue-6702-user1","_session_id":"issue-6702-user1"}
{"text":"27","_input_hash":-738557658,"_task_hash":613896265,"_is_binary":false,"tokens":[{"text":"27","start":0,"end":2,"id":0,"ws":false}],"_view_id":"ner_manual","answer":"accept","_timestamp":1690288506,"_annotator_id":"issue-6702-user1","_session_id":"issue-6702-user1"}
{"text":"1","_input_hash":-2045454197,"_task_hash":1182163795,"_is_binary":false,"tokens":[{"text":"1","start":0,"end":1,"id":0,"ws":false}],"_view_id":"ner_manual","answer":"accept","_timestamp":1690288508,"_annotator_id":"issue-6702-user2","_session_id":"issue-6702-user2"}
{"text":"3","_input_hash":-805513229,"_task_hash":1162353785,"_is_binary":false,"tokens":[{"text":"3","start":0,"end":1,"id":0,"ws":false}],"_view_id":"ner_manual","answer":"accept","_timestamp":1690288509,"_annotator_id":"issue-6702-user2","_session_id":"issue-6702-user2"}
{"text":"4","_input_hash":-1835389134,"_task_hash":678744456,"_is_binary":false,"tokens":[{"text":"4","start":0,"end":1,"id":0,"ws":false}],"_view_id":"ner_manual","answer":"accept","_timestamp":1690288509,"_annotator_id":"issue-6702-user2","_session_id":"issue-6702-user2"}
{"text":"5","_input_hash":-1991218384,"_task_hash":-1581206528,"_is_binary":false,"tokens":[{"text":"5","start":0,"end":1,"id":0,"ws":false}],"_view_id":"ner_manual","answer":"accept","_timestamp":1690288509,"_annotator_id":"issue-6702-user2","_session_id":"issue-6702-user2"}
{"text":"6","_input_hash":-1639312927,"_task_hash":-777789492,"_is_binary":false,"tokens":[{"text":"6","start":0,"end":1,"id":0,"ws":false}],"_view_id":"ner_manual","answer":"accept","_timestamp":1690288510,"_annotator_id":"issue-6702-user2","_session_id":"issue-6702-user2"}
{"text":"8","_input_hash":2018066853,"_task_hash":-1154883125,"_is_binary":false,"tokens":[{"text":"8","start":0,"end":1,"id":0,"ws":false}],"_view_id":"ner_manual","answer":"accept","_timestamp":1690288510,"_annotator_id":"issue-6702-user2","_session_id":"issue-6702-user2"}
{"text":"10","_input_hash":1254603855,"_task_hash":16089945,"_is_binary":false,"tokens":[{"text":"10","start":0,"end":2,"id":0,"ws":false}],"_view_id":"ner_manual","answer":"accept","_timestamp":1690288510,"_annotator_id":"issue-6702-user2","_session_id":"issue-6702-user2"}
{"text":"14","_input_hash":-423927864,"_task_hash":-311834425,"_is_binary":false,"tokens":[{"text":"14","start":0,"end":2,"id":0,"ws":false}],"_view_id":"ner_manual","answer":"accept","_timestamp":1690288510,"_annotator_id":"issue-6702-user2","_session_id":"issue-6702-user2"}
{"text":"15","_input_hash":-902457195,"_task_hash":185203310,"_is_binary":false,"tokens":[{"text":"15","start":0,"end":2,"id":0,"ws":false}],"_view_id":"ner_manual","answer":"accept","_timestamp":1690288511,"_annotator_id":"issue-6702-user2","_session_id":"issue-6702-user2"}
{"text":"16","_input_hash":-343056218,"_task_hash":96950103,"_is_binary":false,"tokens":[{"text":"16","start":0,"end":2,"id":0,"ws":false}],"_view_id":"ner_manual","answer":"accept","_timestamp":1690288511,"_annotator_id":"issue-6702-user2","_session_id":"issue-6702-user2"}
{"text":"17","_input_hash":933542244,"_task_hash":-1066193932,"_is_binary":false,"tokens":[{"text":"17","start":0,"end":2,"id":0,"ws":false}],"_view_id":"ner_manual","answer":"accept","_timestamp":1690288511,"_annotator_id":"issue-6702-user2","_session_id":"issue-6702-user2"}
{"text":"18","_input_hash":-1213183927,"_task_hash":-556031871,"_is_binary":false,"tokens":[{"text":"18","start":0,"end":2,"id":0,"ws":false}],"_view_id":"ner_manual","answer":"accept","_timestamp":1690288511,"_annotator_id":"issue-6702-user2","_session_id":"issue-6702-user2"}
{"text":"20","_input_hash":-883828987,"_task_hash":-1493531608,"_is_binary":false,"tokens":[{"text":"20","start":0,"end":2,"id":0,"ws":false}],"_view_id":"ner_manual","answer":"accept","_timestamp":1690288512,"_annotator_id":"issue-6702-user2","_session_id":"issue-6702-user2"}
{"text":"21","_input_hash":-1139507680,"_task_hash":-1258641755,"_is_binary":false,"tokens":[{"text":"21","start":0,"end":2,"id":0,"ws":false}],"_view_id":"ner_manual","answer":"accept","_timestamp":1690288512,"_annotator_id":"issue-6702-user2","_session_id":"issue-6702-user2"}
{"text":"23","_input_hash":348472867,"_task_hash":-1683186191,"_is_binary":false,"tokens":[{"text":"23","start":0,"end":2,"id":0,"ws":false}],"_view_id":"ner_manual","answer":"accept","_timestamp":1690288512,"_annotator_id":"issue-6702-user2","_session_id":"issue-6702-user2"}
{"text":"24","_input_hash":1124906533,"_task_hash":1709963953,"_is_binary":false,"tokens":[{"text":"24","start":0,"end":2,"id":0,"ws":false}],"_view_id":"ner_manual","answer":"accept","_timestamp":1690288512,"_annotator_id":"issue-6702-user2","_session_id":"issue-6702-user2"}
{"text":"25","_input_hash":230261373,"_task_hash":329569934,"_is_binary":false,"tokens":[{"text":"25","start":0,"end":2,"id":0,"ws":false}],"_view_id":"ner_manual","answer":"accept","_timestamp":1690288513,"_annotator_id":"issue-6702-user2","_session_id":"issue-6702-user2"}
{"text":"28","_input_hash":-165836021,"_task_hash":-1541557956,"_is_binary":false,"tokens":[{"text":"28","start":0,"end":2,"id":0,"ws":false}],"_view_id":"ner_manual","answer":"accept","_timestamp":1690288513,"_annotator_id":"issue-6702-user2","_session_id":"issue-6702-user2"}
{"text":"29","_input_hash":-539359274,"_task_hash":1560417868,"_is_binary":false,"tokens":[{"text":"29","start":0,"end":2,"id":0,"ws":false}],"_view_id":"ner_manual","answer":"accept","_timestamp":1690288513,"_annotator_id":"issue-6702-user2","_session_id":"issue-6702-user2"}
{"text":"30","_input_hash":1814743553,"_task_hash":-550346988,"_is_binary":false,"tokens":[{"text":"30","start":0,"end":2,"id":0,"ws":false}],"_view_id":"ner_manual","answer":"accept","_timestamp":1690288514,"_annotator_id":"issue-6702-user2","_session_id":"issue-6702-user2"}
{"text":"3","_input_hash":-805513229,"_task_hash":1162353785,"_is_binary":false,"tokens":[{"text":"3","start":0,"end":1,"id":0,"ws":false}],"_view_id":"ner_manual","answer":"accept","_timestamp":1690288519,"_annotator_id":"issue-6702-user3","_session_id":"issue-6702-user3"}
{"text":"7","_input_hash":372028165,"_task_hash":1401556973,"_is_binary":false,"tokens":[{"text":"7","start":0,"end":1,"id":0,"ws":false}],"_view_id":"ner_manual","answer":"accept","_timestamp":1690288519,"_annotator_id":"issue-6702-user3","_session_id":"issue-6702-user3"}
{"text":"9","_input_hash":42709192,"_task_hash":811274923,"_is_binary":false,"tokens":[{"text":"9","start":0,"end":1,"id":0,"ws":false}],"_view_id":"ner_manual","answer":"accept","_timestamp":1690288520,"_annotator_id":"issue-6702-user3","_session_id":"issue-6702-user3"}
{"text":"13","_input_hash":855192632,"_task_hash":335429472,"_is_binary":false,"tokens":[{"text":"13","start":0,"end":2,"id":0,"ws":false}],"_view_id":"ner_manual","answer":"accept","_timestamp":1690288520,"_annotator_id":"issue-6702-user3","_session_id":"issue-6702-user3"}
{"text":"14","_input_hash":-423927864,"_task_hash":-311834425,"_is_binary":false,"tokens":[{"text":"14","start":0,"end":2,"id":0,"ws":false}],"_view_id":"ner_manual","answer":"accept","_timestamp":1690288520,"_annotator_id":"issue-6702-user3","_session_id":"issue-6702-user3"}
{"text":"15","_input_hash":-902457195,"_task_hash":185203310,"_is_binary":false,"tokens":[{"text":"15","start":0,"end":2,"id":0,"ws":false}],"_view_id":"ner_manual","answer":"accept","_timestamp":1690288520,"_annotator_id":"issue-6702-user3","_session_id":"issue-6702-user3"}
{"text":"16","_input_hash":-343056218,"_task_hash":96950103,"_is_binary":false,"tokens":[{"text":"16","start":0,"end":2,"id":0,"ws":false}],"_view_id":"ner_manual","answer":"accept","_timestamp":1690288521,"_annotator_id":"issue-6702-user3","_session_id":"issue-6702-user3"}
{"text":"22","_input_hash":1765252975,"_task_hash":-1221112846,"_is_binary":false,"tokens":[{"text":"22","start":0,"end":2,"id":0,"ws":false}],"_view_id":"ner_manual","answer":"accept","_timestamp":1690288521,"_annotator_id":"issue-6702-user3","_session_id":"issue-6702-user3"}
{"text":"23","_input_hash":348472867,"_task_hash":-1683186191,"_is_binary":false,"tokens":[{"text":"23","start":0,"end":2,"id":0,"ws":false}],"_view_id":"ner_manual","answer":"accept","_timestamp":1690288521,"_annotator_id":"issue-6702-user3","_session_id":"issue-6702-user3"}
{"text":"24","_input_hash":1124906533,"_task_hash":1709963953,"_is_binary":false,"tokens":[{"text":"24","start":0,"end":2,"id":0,"ws":false}],"_view_id":"ner_manual","answer":"accept","_timestamp":1690288521,"_annotator_id":"issue-6702-user3","_session_id":"issue-6702-user3"}
{"text":"26","_input_hash":1310182142,"_task_hash":266669984,"_is_binary":false,"tokens":[{"text":"26","start":0,"end":2,"id":0,"ws":false}],"_view_id":"ner_manual","answer":"accept","_timestamp":1690288522,"_annotator_id":"issue-6702-user3","_session_id":"issue-6702-user3"}
{"text":"27","_input_hash":-738557658,"_task_hash":613896265,"_is_binary":false,"tokens":[{"text":"27","start":0,"end":2,"id":0,"ws":false}],"_view_id":"ner_manual","answer":"accept","_timestamp":1690288522,"_annotator_id":"issue-6702-user3","_session_id":"issue-6702-user3"}
{"text":"29","_input_hash":-539359274,"_task_hash":1560417868,"_is_binary":false,"tokens":[{"text":"29","start":0,"end":2,"id":0,"ws":false}],"_view_id":"ner_manual","answer":"accept","_timestamp":1690288522,"_annotator_id":"issue-6702-user3","_session_id":"issue-6702-user3"}
{"text":"1","_input_hash":-2045454197,"_task_hash":1182163795,"_is_binary":false,"tokens":[{"text":"1","start":0,"end":1,"id":0,"ws":false}],"_view_id":"ner_manual","answer":"accept","_timestamp":1690288527,"_annotator_id":"issue-6702-user4","_session_id":"issue-6702-user4"}
{"text":"2","_input_hash":-784123405,"_task_hash":985633209,"_is_binary":false,"tokens":[{"text":"2","start":0,"end":1,"id":0,"ws":false}],"_view_id":"ner_manual","answer":"accept","_timestamp":1690288528,"_annotator_id":"issue-6702-user4","_session_id":"issue-6702-user4"}
{"text":"4","_input_hash":-1835389134,"_task_hash":678744456,"_is_binary":false,"tokens":[{"text":"4","start":0,"end":1,"id":0,"ws":false}],"_view_id":"ner_manual","answer":"accept","_timestamp":1690288528,"_annotator_id":"issue-6702-user4","_session_id":"issue-6702-user4"}
{"text":"6","_input_hash":-1639312927,"_task_hash":-777789492,"_is_binary":false,"tokens":[{"text":"6","start":0,"end":1,"id":0,"ws":false}],"_view_id":"ner_manual","answer":"accept","_timestamp":1690288528,"_annotator_id":"issue-6702-user4","_session_id":"issue-6702-user4"}
{"text":"7","_input_hash":372028165,"_task_hash":1401556973,"_is_binary":false,"tokens":[{"text":"7","start":0,"end":1,"id":0,"ws":false}],"_view_id":"ner_manual","answer":"accept","_timestamp":1690288528,"_annotator_id":"issue-6702-user4","_session_id":"issue-6702-user4"}
{"text":"8","_input_hash":2018066853,"_task_hash":-1154883125,"_is_binary":false,"tokens":[{"text":"8","start":0,"end":1,"id":0,"ws":false}],"_view_id":"ner_manual","answer":"accept","_timestamp":1690288529,"_annotator_id":"issue-6702-user4","_session_id":"issue-6702-user4"}
{"text":"9","_input_hash":42709192,"_task_hash":811274923,"_is_binary":false,"tokens":[{"text":"9","start":0,"end":1,"id":0,"ws":false}],"_view_id":"ner_manual","answer":"accept","_timestamp":1690288529,"_annotator_id":"issue-6702-user4","_session_id":"issue-6702-user4"}
{"text":"11","_input_hash":-1789291377,"_task_hash":-474789062,"_is_binary":false,"tokens":[{"text":"11","start":0,"end":2,"id":0,"ws":false}],"_view_id":"ner_manual","answer":"accept","_timestamp":1690288529,"_annotator_id":"issue-6702-user4","_session_id":"issue-6702-user4"}
{"text":"12","_input_hash":1626462944,"_task_hash":2018616092,"_is_binary":false,"tokens":[{"text":"12","start":0,"end":2,"id":0,"ws":false}],"_view_id":"ner_manual","answer":"accept","_timestamp":1690288530,"_annotator_id":"issue-6702-user4","_session_id":"issue-6702-user4"}
{"text":"16","_input_hash":-343056218,"_task_hash":96950103,"_is_binary":false,"tokens":[{"text":"16","start":0,"end":2,"id":0,"ws":false}],"_view_id":"ner_manual","answer":"accept","_timestamp":1690288530,"_annotator_id":"issue-6702-user4","_session_id":"issue-6702-user4"}
{"text":"18","_input_hash":-1213183927,"_task_hash":-556031871,"_is_binary":false,"tokens":[{"text":"18","start":0,"end":2,"id":0,"ws":false}],"_view_id":"ner_manual","answer":"accept","_timestamp":1690288530,"_annotator_id":"issue-6702-user4","_session_id":"issue-6702-user4"}
{"text":"19","_input_hash":2035079888,"_task_hash":30789149,"_is_binary":false,"tokens":[{"text":"19","start":0,"end":2,"id":0,"ws":false}],"_view_id":"ner_manual","answer":"accept","_timestamp":1690288530,"_annotator_id":"issue-6702-user4","_session_id":"issue-6702-user4"}
{"text":"20","_input_hash":-883828987,"_task_hash":-1493531608,"_is_binary":false,"tokens":[{"text":"20","start":0,"end":2,"id":0,"ws":false}],"_view_id":"ner_manual","answer":"accept","_timestamp":1690288531,"_annotator_id":"issue-6702-user4","_session_id":"issue-6702-user4"}
{"text":"24","_input_hash":1124906533,"_task_hash":1709963953,"_is_binary":false,"tokens":[{"text":"24","start":0,"end":2,"id":0,"ws":false}],"_view_id":"ner_manual","answer":"accept","_timestamp":1690288531,"_annotator_id":"issue-6702-user4","_session_id":"issue-6702-user4"}
{"text":"25","_input_hash":230261373,"_task_hash":329569934,"_is_binary":false,"tokens":[{"text":"25","start":0,"end":2,"id":0,"ws":false}],"_view_id":"ner_manual","answer":"accept","_timestamp":1690288531,"_annotator_id":"issue-6702-user4","_session_id":"issue-6702-user4"}
{"text":"26","_input_hash":1310182142,"_task_hash":266669984,"_is_binary":false,"tokens":[{"text":"26","start":0,"end":2,"id":0,"ws":false}],"_view_id":"ner_manual","answer":"accept","_timestamp":1690288531,"_annotator_id":"issue-6702-user4","_session_id":"issue-6702-user4"}
{"text":"27","_input_hash":-738557658,"_task_hash":613896265,"_is_binary":false,"tokens":[{"text":"27","start":0,"end":2,"id":0,"ws":false}],"_view_id":"ner_manual","answer":"accept","_timestamp":1690288532,"_annotator_id":"issue-6702-user4","_session_id":"issue-6702-user4"}
{"text":"28","_input_hash":-165836021,"_task_hash":-1541557956,"_is_binary":false,"tokens":[{"text":"28","start":0,"end":2,"id":0,"ws":false}],"_view_id":"ner_manual","answer":"accept","_timestamp":1690288532,"_annotator_id":"issue-6702-user4","_session_id":"issue-6702-user4"}
{"text":"29","_input_hash":-539359274,"_task_hash":1560417868,"_is_binary":false,"tokens":[{"text":"29","start":0,"end":2,"id":0,"ws":false}],"_view_id":"ner_manual","answer":"accept","_timestamp":1690288532,"_annotator_id":"issue-6702-user4","_session_id":"issue-6702-user4"}
{"text":"30","_input_hash":1814743553,"_task_hash":-550346988,"_is_binary":false,"tokens":[{"text":"30","start":0,"end":2,"id":0,"ws":false}],"_view_id":"ner_manual","answer":"accept","_timestamp":1690288532,"_annotator_id":"issue-6702-user4","_session_id":"issue-6702-user4"}
{"text":"1","_input_hash":-2045454197,"_task_hash":1182163795,"_is_binary":false,"tokens":[{"text":"1","start":0,"end":1,"id":0,"ws":false}],"_view_id":"ner_manual","answer":"accept","_timestamp":1690288535,"_annotator_id":"issue-6702-user5","_session_id":"issue-6702-user5"}
{"text":"2","_input_hash":-784123405,"_task_hash":985633209,"_is_binary":false,"tokens":[{"text":"2","start":0,"end":1,"id":0,"ws":false}],"_view_id":"ner_manual","answer":"accept","_timestamp":1690288536,"_annotator_id":"issue-6702-user5","_session_id":"issue-6702-user5"}
{"text":"3","_input_hash":-805513229,"_task_hash":1162353785,"_is_binary":false,"tokens":[{"text":"3","start":0,"end":1,"id":0,"ws":false}],"_view_id":"ner_manual","answer":"accept","_timestamp":1690288536,"_annotator_id":"issue-6702-user5","_session_id":"issue-6702-user5"}
{"text":"5","_input_hash":-1991218384,"_task_hash":-1581206528,"_is_binary":false,"tokens":[{"text":"5","start":0,"end":1,"id":0,"ws":false}],"_view_id":"ner_manual","answer":"accept","_timestamp":1690288536,"_annotator_id":"issue-6702-user5","_session_id":"issue-6702-user5"}
{"text":"6","_input_hash":-1639312927,"_task_hash":-777789492,"_is_binary":false,"tokens":[{"text":"6","start":0,"end":1,"id":0,"ws":false}],"_view_id":"ner_manual","answer":"accept","_timestamp":1690288536,"_annotator_id":"issue-6702-user5","_session_id":"issue-6702-user5"}
{"text":"10","_input_hash":1254603855,"_task_hash":16089945,"_is_binary":false,"tokens":[{"text":"10","start":0,"end":2,"id":0,"ws":false}],"_view_id":"ner_manual","answer":"accept","_timestamp":1690288537,"_annotator_id":"issue-6702-user5","_session_id":"issue-6702-user5"}
{"text":"11","_input_hash":-1789291377,"_task_hash":-474789062,"_is_binary":false,"tokens":[{"text":"11","start":0,"end":2,"id":0,"ws":false}],"_view_id":"ner_manual","answer":"accept","_timestamp":1690288537,"_annotator_id":"issue-6702-user5","_session_id":"issue-6702-user5"}
{"text":"12","_input_hash":1626462944,"_task_hash":2018616092,"_is_binary":false,"tokens":[{"text":"12","start":0,"end":2,"id":0,"ws":false}],"_view_id":"ner_manual","answer":"accept","_timestamp":1690288537,"_annotator_id":"issue-6702-user5","_session_id":"issue-6702-user5"}
{"text":"13","_input_hash":855192632,"_task_hash":335429472,"_is_binary":false,"tokens":[{"text":"13","start":0,"end":2,"id":0,"ws":false}],"_view_id":"ner_manual","answer":"accept","_timestamp":1690288537,"_annotator_id":"issue-6702-user5","_session_id":"issue-6702-user5"}
{"text":"17","_input_hash":933542244,"_task_hash":-1066193932,"_is_binary":false,"tokens":[{"text":"17","start":0,"end":2,"id":0,"ws":false}],"_view_id":"ner_manual","answer":"accept","_timestamp":1690288538,"_annotator_id":"issue-6702-user5","_session_id":"issue-6702-user5"}
{"text":"18","_input_hash":-1213183927,"_task_hash":-556031871,"_is_binary":false,"tokens":[{"text":"18","start":0,"end":2,"id":0,"ws":false}],"_view_id":"ner_manual","answer":"accept","_timestamp":1690288538,"_annotator_id":"issue-6702-user5","_session_id":"issue-6702-user5"}
{"text":"19","_input_hash":2035079888,"_task_hash":30789149,"_is_binary":false,"tokens":[{"text":"19","start":0,"end":2,"id":0,"ws":false}],"_view_id":"ner_manual","answer":"accept","_timestamp":1690288538,"_annotator_id":"issue-6702-user5","_session_id":"issue-6702-user5"}
{"text":"20","_input_hash":-883828987,"_task_hash":-1493531608,"_is_binary":false,"tokens":[{"text":"20","start":0,"end":2,"id":0,"ws":false}],"_view_id":"ner_manual","answer":"accept","_timestamp":1690288539,"_annotator_id":"issue-6702-user5","_session_id":"issue-6702-user5"}
{"text":"21","_input_hash":-1139507680,"_task_hash":-1258641755,"_is_binary":false,"tokens":[{"text":"21","start":0,"end":2,"id":0,"ws":false}],"_view_id":"ner_manual","answer":"accept","_timestamp":1690288539,"_annotator_id":"issue-6702-user5","_session_id":"issue-6702-user5"}
{"text":"22","_input_hash":1765252975,"_task_hash":-1221112846,"_is_binary":false,"tokens":[{"text":"22","start":0,"end":2,"id":0,"ws":false}],"_view_id":"ner_manual","answer":"accept","_timestamp":1690288539,"_annotator_id":"issue-6702-user5","_session_id":"issue-6702-user5"}
{"text":"23","_input_hash":348472867,"_task_hash":-1683186191,"_is_binary":false,"tokens":[{"text":"23","start":0,"end":2,"id":0,"ws":false}],"_view_id":"ner_manual","answer":"accept","_timestamp":1690288539,"_annotator_id":"issue-6702-user5","_session_id":"issue-6702-user5"}
{"text":"26","_input_hash":1310182142,"_task_hash":266669984,"_is_binary":false,"tokens":[{"text":"26","start":0,"end":2,"id":0,"ws":false}],"_view_id":"ner_manual","answer":"accept","_timestamp":1690288540,"_annotator_id":"issue-6702-user5","_session_id":"issue-6702-user5"}
{"text":"28","_input_hash":-165836021,"_task_hash":-1541557956,"_is_binary":false,"tokens":[{"text":"28","start":0,"end":2,"id":0,"ws":false}],"_view_id":"ner_manual","answer":"accept","_timestamp":1690288540,"_annotator_id":"issue-6702-user5","_session_id":"issue-6702-user5"}
{"text":"30","_input_hash":1814743553,"_task_hash":-550346988,"_is_binary":false,"tokens":[{"text":"30","start":0,"end":2,"id":0,"ws":false}],"_view_id":"ner_manual","answer":"accept","_timestamp":1690288540,"_annotator_id":"issue-6702-user5","_session_id":"issue-6702-user5"}
It's 90 lines, which is exactly what I'd expect given that there are three annotations per example (3 x 30 = 90). Next, I'll perform a count on that dataset.
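The line count alone doesn't prove that every example got exactly three annotations, so a per-task check can help. Below is a rough sketch using only the standard library. The function name `check_annotations` is hypothetical, and it assumes each exported line carries the `_input_hash` and `_annotator_id` keys shown in the output above.

```python
import json
from collections import Counter


def check_annotations(lines, expected_per_task=3):
    """Count annotations per task and per annotator from JSONL lines.

    Raises ValueError if an annotator has the same task twice, or if a
    task does not have exactly `expected_per_task` annotations.
    """
    per_task = Counter()
    per_annotator = Counter()
    seen = set()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        eg = json.loads(line)
        key = (eg["_input_hash"], eg["_annotator_id"])
        if key in seen:
            raise ValueError(f"duplicate annotation: {key}")
        seen.add(key)
        per_task[eg["_input_hash"]] += 1
        per_annotator[eg["_annotator_id"]] += 1
    wrong = {h: n for h, n in per_task.items() if n != expected_per_task}
    if wrong:
        raise ValueError(f"unexpected annotation counts: {wrong}")
    return per_task, per_annotator
```

Assuming the db-out output was saved as issue-6702.jsonl, you could pass the open file handle to `check_annotations`. For a run like the one above, I'd expect 30 distinct `_input_hash` values in the first counter.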
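import polars as pl
# Assumes the db-out output was saved to issue-6702.jsonl.
# Note: recent Polars releases spell this `group_by`; older ones used `groupby`.
pl.read_ndjson("issue-6702.jsonl").group_by("_annotator_id").agg(pl.col("_input_hash").count())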
This yields the following table.
┌──────────────────┬─────────────┐
│ _annotator_id    ┆ _input_hash │
│ ---              ┆ ---         │
│ str              ┆ u32         │
╞══════════════════╪═════════════╡
│ issue-6702-user4 ┆ 20          │
│ issue-6702-user3 ┆ 13          │
│ issue-6702-user1 ┆ 18          │
│ issue-6702-user2 ┆ 20          │
│ issue-6702-user5 ┆ 19          │
└──────────────────┴─────────────┘
The distribution isn't perfect, again because of the hashing, but it doesn't look out of bounds: 90 annotations across 5 users averages out to 18 each, and the counts here range from 13 to 20. The reason we use hashes here instead of round-robin is consistent mapping. Hashing guarantees that a specific hash is always mapped to the same users, even if the order of the stream changes. With a round-robin approach, the allocation might change after the server restarts and new data is added.
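To make the difference concrete, here is a toy sketch of the two strategies. This is not Prodigy's actual router: `route_by_hash` and `route_round_robin` are made-up names, and the hash-based version uses a rendezvous-style scoring that I picked purely for illustration. The point is only that the hash-based choice depends on the task alone, while the round-robin choice depends on where the task sits in the stream.

```python
import hashlib


def route_by_hash(input_hash, sessions, annotations_per_task=3):
    """Pick annotators from the task's hash alone (illustrative only).

    Each session gets a score derived from (input_hash, session); the
    sessions with the lowest scores win. Stream order plays no role.
    Assumes annotations_per_task <= len(sessions).
    """
    def score(session):
        digest = hashlib.sha256(f"{input_hash}:{session}".encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big")

    return sorted(sessions, key=score)[:annotations_per_task]


def route_round_robin(position, sessions, annotations_per_task=3):
    """Pick annotators from the task's position in the stream."""
    n = len(sessions)
    start = (position * annotations_per_task) % n
    return [sessions[(start + i) % n] for i in range(annotations_per_task)]


if __name__ == "__main__":
    sessions = ["user1", "user2", "user3", "user4", "user5"]
    stream = [-2045454197, -784123405, -805513229]  # hashes from the logs above
    reordered = list(reversed(stream))
    for h in stream:
        print(h)
        print("  hash-based:          ", route_by_hash(h, sessions))
        print("  round-robin (before):", route_round_robin(stream.index(h), sessions))
        print("  round-robin (after): ", route_round_robin(reordered.index(h), sessions))
```

Running this prints the same hash-based assignment for a task no matter where it appears, while the round-robin assignment shifts for any task whose position changed.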
Out of curiosity, if you were to follow these steps, do you see something different?