Some annotated data is missing

Hi,
I'm trying to annotate a dataset with feed_overlap = True. Once Prodigy shows "No tasks available", the users save their data, but when we export the dataset, the number of annotated examples per user is different. Here is an example to illustrate what I mean: I have a dataset with 100 examples and 3 annotators, and each one annotates the dataset under their own user name. At the end, we use prodigy db-out to get the annotated data and we see the following: 100 annotated examples for user_1, 85 for user_2 and 64 for user_3.

Any help would be appreciated.

Thank you

I'm using feed_overlap = True so that each example is annotated multiple times. Normally, at the end, I should get 100 annotated examples for each annotator. This is what feed_overlap is supposed to do. Am I wrong?

Ah, sorry, I think I misread your post. In that case, do you know how many examples were actually shown to the annotators? It sounds like the more likely explanation here is that Prodigy showed "No tasks available" too early (after 85 examples, for instance) – which could happen if the server somehow ends up responding with an empty batch too quickly. So that's the most likely scenario to investigate here. It'd also be interesting to check whether refreshing the browser helps – if so, that also points to the "premature empty batch" theory.
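To check what each annotator actually saved, something like the following rough sketch could help. It assumes the db-out export is a JSONL file in which every record carries an `_input_hash` and a `_session_id` (the session ID usually combines the dataset name and the user name). The function names `hashes_by_session` and `missing_by_session` are made up for this example and are not part of Prodigy.

```python
import json
from collections import defaultdict


def hashes_by_session(lines):
    """Collect the unique _input_hash values saved under each _session_id.

    `lines` is any iterable of JSONL strings, e.g. an open file.
    Assumption: every record has an "_input_hash" key.
    """
    seen = defaultdict(set)
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines in the export
        eg = json.loads(line)
        session = eg.get("_session_id", "unknown")
        seen[session].add(eg["_input_hash"])
    return dict(seen)


def missing_by_session(seen):
    """For each session, return the input hashes that some other session
    annotated but this one did not."""
    all_hashes = set().union(*seen.values()) if seen else set()
    return {session: all_hashes - hashes for session, hashes in seen.items()}


# Example usage (the file name is a placeholder):
# with open("annotations.jsonl", encoding="utf8") as f:
#     seen = hashes_by_session(f)
# for session, missing in missing_by_session(seen).items():
#     print(session, len(seen[session]), "annotated,", len(missing), "missing")
```

If the missing hashes for a user all fall at the end of the input file, that would fit the theory that the session stopped receiving batches early rather than answers being lost on save.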

(Btw, if you have multiple annotators annotating the exact same data, you could also just give them their own dedicated instances and datasets? This makes the whole process much more straightforward.)
