Number of tasks doesn't match number of items in input file

ines · November 15, 2019, 1:29pm

Hi! The default batch size is 10 examples, and each request asks for a single batch – so 110 examples (assuming there are no duplicates and no existing answers in the dataset) would mean that 11 batches were sent out but didn't come back answered.

The underlying problem here is that if you have multiple annotators working on the same data and a batch is sent out, Prodigy has no way of knowing whether it's coming back or not. Maybe someone is still working on it, maybe they're offline – it's not tracking the user in the app. So by default, the server will not re-send a batch to prevent duplicate annotations.

However, the batches are not gone or lost, and Prodigy keeps a very detailed record of the examples and annotations via the hashes it assigns. So if you restart the server, the unannotated examples are added back to the queue. Alternatively, you can also make your stream "infinite": after the first pass over the data, any examples whose hashes are not in the dataset yet are sent out again, and this repeats until all hashes are in the dataset. Here's a code example that shows the idea. This works well for a finite stream that you don't necessarily need to annotate in order.
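As a rough illustration of the "infinite" stream idea, here is a minimal sketch in plain Python. It assumes each example carries a `_task_hash` key, and it uses a hypothetical `get_annotated_hashes` callable as a stand-in for however you look up the hashes already saved in the dataset. The function name `infinite_stream` is also just an illustrative choice, not part of Prodigy.

```python
from typing import Callable, Dict, Iterator, List, Set


def infinite_stream(
    examples: List[Dict],
    get_annotated_hashes: Callable[[], Set[int]],
) -> Iterator[Dict]:
    """Keep re-sending unannotated examples until every hash is in the dataset.

    `get_annotated_hashes` is a hypothetical helper: it should return the set
    of task hashes currently stored in the dataset.
    """
    while True:
        # Re-check the dataset at the start of every pass over the data.
        annotated = get_annotated_hashes()
        pending = [eg for eg in examples if eg["_task_hash"] not in annotated]
        if not pending:
            # Every example has an answer in the dataset, so the stream ends.
            break
        for eg in pending:
            yield eg
```

In a real recipe, the callable would be backed by the database (for example, something along the lines of the `get_task_hashes` method on Prodigy's database object), and `examples` would be the already-hashed input loaded into a list. Note that this only makes sense for a finite input that fits in memory, and that the order of examples is not preserved across passes.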

We'll also be adding a new feature that's a bit more complex and lets you enforce the exact ordering of batches that are sent out – see this thread for a more in-depth discussion. If this setting is enabled, the server would then always respond with the same batch until it has received the answers for it. However, the trade-off is that you can end up with duplicates if two people annotate in the same session (e.g. both accessing the app without a session name appended to the URL). So this way of handling the stream would work best if your annotators are all annotating the same data in their own separate sessions (overlapping feed) and it's important that examples go out in the exact order they're loaded in.
