Hi,
We have a simple multiple-choice annotation interface with three pre-defined annotation sessions and an overlap of 1.1 to calculate annotator agreement. We have a list of 3777 image URLs, which are read from a JSON Lines file, and tasks are created from them.
So we are expecting ~4155 total tasks with the overlap. However, all three annotators "ran out of tasks" (they see the "No more tasks" message), but only 3205 tasks have been annotated. Everyone's sessions have been refreshed in the browser multiple times, but tasks are not being routed to anyone. I'm using the default router route_average_per_task. I made sure that nothing existed in the result database before starting the annotation, and manually querying the database also shows 3205 tasks. Work stealing is also enabled to make sure that everything gets annotated. The logs show no errors or anything, just
06:16:02: POST: /get_session_questions
06:16:02: CONTROLLER: Getting batch of questions for session: meta_content_types-sid
06:16:02: RESPONSE: /get_session_questions (0 examples)
06:16:02: {'tasks': [], 'total': 3205, 'progress': None, 'session_id': 'meta_content_types-sid'}
INFO: 10.131.2.2:34134 - "POST /get_session_questions HTTP/1.1" 200 OK
for every annotator with every browser refresh.
What might be going on? One thing I noticed from the logs earlier is that the router would consistently route tasks to an empty list of session IDs, so some tasks were routed to 1 annotator, some to 2, and some to nobody. Is this expected, or can it be related to the problem?
Hi @helmiina,
There are a couple of conditions in which route_average_per_task returns an empty list:
1.
if hash_count >= average:
    return []
For unannotated tasks this of course should be False, so the first thing to check is whether this condition behaves unexpectedly due to annotations already being present in the DB (which I believe you have already ruled out).
2.
if len(pool) == 0: return annot
If len(pool) (pool being the list of currently available and eligible annotators for this specific task) is 0, the function will return whatever annot list it has built so far. If pool is empty at the very beginning of the assignment logic, then annot will be empty.
This might happen if the only active annotators are the ones that have already annotated a given task enough times, because the other annotators haven't registered with the Controller yet. Remember that the task router is called once per task, so if there aren't enough eligible annotators available at that point, the task will receive fewer annotations than expected.
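To make the fractional assignment a bit more concrete, here is a deliberately simplified toy version of the idea (this is not the actual Prodigy source, just an illustration of the behaviour described above): each task gets the whole-number part of the average, a fraction of tasks gets one extra annotator, and an empty pool simply means fewer, possibly zero, assignments.

import math
import random

def route_sketch(task_hash: int, pool: list, average: float = 1.1) -> list:
    # Whole-number part of the average: every task should get this many annotators.
    needed = math.floor(average)
    # Fractional part: a matching share of tasks gets one extra annotator.
    # The modulo check stands in for the hash-based decision mentioned below,
    # which makes the choice deterministic per task.
    if (task_hash % 100) < round((average - needed) * 100):
        needed += 1
    annot = []
    available = list(pool)
    while available and len(annot) < needed:
        annot.append(available.pop(random.randrange(len(available))))
    # If the pool ran dry, annot is shorter than needed - or empty if the pool
    # was empty to begin with.
    return annot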
To dig into it, could you log the state of ctrl.session_ids when these unannotated tasks were being routed?
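For example, you could temporarily wrap the default router in your recipe with something like this (a rough sketch; the import path of route_average_per_task may differ depending on your Prodigy version):

from typing import Dict, List
from prodigy.components.routers import route_average_per_task  # adjust to your version

def logging_router(ctrl, session_id: str, item: Dict) -> List[str]:
    # Print which sessions the Controller knows about at routing time,
    # then delegate to the built-in router and print its decision.
    print("session_ids at routing time:", ctrl.session_ids)
    routed = route_average_per_task(ctrl, session_id, item)
    print("task", item.get("_task_hash"), "routed to:", routed)
    return routed

and return logging_router as the "task_router" component from your recipe instead of relying on the default.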
Finally, two less likely hypotheses:
- if you use custom hash values, they might not have a uniform distribution, which messes up the probabilistic fractional assignment.
- there are duplicates in the input stream (based on either the input or the task hash value, depending on the exclude_by setting) that get filtered out by the stream loading function. That would explain why there are examples that received 0 annotations. You can check that by loading the stream with the get_stream function and the dedup parameter set to False, then converting it to a list and checking the length.
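Something along these lines should do it (a sketch; "tasks.jsonl" stands in for your actual source, and the import path of get_stream differs between Prodigy versions):

from prodigy.components.stream import get_stream  # in older versions: prodigy.components.loaders

# "tasks.jsonl" is a placeholder for your actual input file
raw = list(get_stream("tasks.jsonl", dedup=False))
deduped = list(get_stream("tasks.jsonl", dedup=True))
print(len(raw), len(deduped))  # both should be 3777 if nothing is being filtered out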
This I checked, and no, there were no existing annotations prior to starting the annotation.
I added a print statement to check ctrl.session_ids every time the router fires; it prints all three annotators.
I use the set_hashes() function with no modifications to set hashes on our input tasks (the only input data in the JSONL files is the URL).
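Roughly like this, for reference (simplified; the key name is just an example of how our tasks look):

from prodigy import set_hashes

# Each task only carries the image URL; set_hashes adds _input_hash and _task_hash
task = set_hashes({"image": "https://example.com/some-image.jpg"})
print(task["_input_hash"], task["_task_hash"])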
I did this and the length of the stream before and after deduplicating is the expected 3777.
I watched the whole routing ordeal go through in the logs, and for a while it showed tasks being routed to an empty list (I guess these were mostly the ones already annotated?) and then towards the end, tasks were routed to the annotators, although with many empty routing lists in between as well. So after restarting the instance, tasks that were not previously routed are now routed.
How is it possible that tasks that were not available before are available now, when the state of the database and the input stream is exactly the same? It's really not practical for us to keep restarting the interfaces whenever it looks like there are no more tasks to be done; we really expect everything to be annotated in one go. Or at least there shouldn't be 900 unannotated tasks... It's concerning, since this is the very simplest of our tasks, with a regular, non-infinite input stream, the default recipe and the default router.
Hi @helmiina,
I definitely understand the concern. We'll work on reproducing the problem and we'll get back to you asap. One clarification question: do you use the PRODIGY_ALLOWED_SESSIONS setting? Thank you.
Yes, we have three annotators set with the PRODIGY_ALLOWED_SESSIONS setting.
Thanks a lot for looking into this!
Hi @helmiina,
So far I haven't been able to reproduce the issue. Given the following conditions:
- the use of PRODIGY_ALLOWED_SESSIONS, which makes controller.session_ids a constant
- the lack of duplicates in the input
- the lack of existing annotations in the target datasets
route_average_per_task shouldn't ever return an empty list.
Also, the fact that the behavior was different after the restart suggests it is not the router logic that is at fault. Importantly, the calls to the router are done under the thread lock, so concurrency should not be an issue either.
Having API logs on the backend excludes network-related issues as well, I think.
Since the problem seems to be the combination of the data and the code, it would probably be most efficient if you could send us the input data and the Prodigy settings/command so that we can reproduce it. Alternatively, if it's more convenient, we could also set up a call and look at it together. Thank you!
Hi @magdaaniol,
I think I may have found the issue with this specific recipe.
I didn't see an error in the logs because it was buried quite deep after refreshing the tasks in the browser so many times. Here's what I did just now:
- I restarted the prodigy instance and only created one session for myself
- I was able to annotate hundreds of the missing tasks just fine
- I was keeping an eye on the logs as I was annotating, and again I ran out of tasks earlier than expected. This time I noticed a network error in the logs:
13:11:06: CONTROLLER: Getting batch of questions for session: meta_content_types-helmiina
INFO: 10.131.2.2:55102 - "POST /get_session_questions HTTP/1.1" 500 Internal Server Error
Task exception was never retrieved
future: <Task finished name='Task-985' coro=<RequestResponseCycle.run_asgi() done, defined at /usr/local/lib/python3.10/site-packages/uvicorn/protocols/http/h11_impl.py:401> exception=ProtocolError('Connection broken: IncompleteRead(1094376 bytes read, 96039 more expected)', IncompleteRead(1094376 bytes read, 96039 more expected))>
Traceback (most recent call last):
File "/usr/local/lib/python3.10/site-packages/urllib3/response.py", line 754, in _error_catcher
yield
File "/usr/local/lib/python3.10/site-packages/urllib3/response.py", line 900, in _raw_read
raise IncompleteRead(self._fp_bytes_read, self.length_remaining)
urllib3.exceptions.IncompleteRead: IncompleteRead(1094376 bytes read, 96039 more expected)
So there was a network error reading our input file from S3, and it seems like Prodigy thought that the stream ended there.
Now, I'm wondering how I could prevent this in the future. I'm not sure if a try/except block in our stream will do the trick. It would be better in our case if the Prodigy instance just crashed at this point, so our pod would restart and the annotation could continue. If the error is neither raised nor caught in any way, we will keep having to manually restart the pods whenever this happens, which is not feasible for us.
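Something like this wrapper around our stream generator is roughly what I had in mind, so the process dies loudly instead of silently finishing early (just a sketch; the expected count would come from our task list):

def fail_fast_stream(examples, expected_count=3777):
    # Pass tasks through while counting them; if the underlying reader stops
    # early (e.g. because the S3 connection dropped), raise so the process
    # exits and the pod gets restarted instead of showing "No more tasks".
    seen = 0
    for eg in examples:
        seen += 1
        yield eg
    if seen < expected_count:
        raise RuntimeError(f"Stream ended after {seen} tasks, expected {expected_count}")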
Any ideas? My apologies for saying that there were no errors in the logs; there were, I just didn't scroll far enough and for some reason didn't think to grep.
Unfortunately, I have similar-ish issues with our more complicated annotation pipeline, where we are sending tasks from one interface to another based on how the annotators annotate the examples. Tasks are stuck there as well and not being routed to Prodigy instances as expected, despite me testing everything as thoroughly as possible by myself beforehand. I think it would be most convenient for everyone if we could look at this on a call; I will send you an email about this.
Thanks again for the quick support!
Let's pick up from here during the call next week - thank you!