force_stream_order and exposed stream length

Hello,
We're using Prodigy for human evaluation of neural text generation.
This simply involves textcat.manual with force_stream_order and feed_overlap set to true so we can have multiple annotators each label every example.

However, I was wondering what would be the best way to get the stream length displayed.
If I go the custom recipe route where I define __len__ for my stream generator as described here, what else should I do to keep the behaviour from force_stream_order and feed_overlap?

Thank you,
Andrei

Hi! I think what you describe here is currently a bit tricky. If you run a multi-user session, all sessions share the same controller and instance (which is the point), but Prodigy currently only keeps track of the total counts, not the counts for the individual sessions, which can also change at runtime. This also recently came up in this thread related to a custom progress bar. We still need to find a good solution for this that works across all possible scenarios.

In the meantime, if your goal is to have all examples labelled by all annotators, you could also just start separate instances for each annotator on different ports. It's more instances overall, but it does give you a very clean separation between the sessions and the progress.
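To make the separate-instances idea more concrete, here is a rough sketch that only builds one launch command per annotator, each with its own port and dataset. It assumes the port is set through the PRODIGY_PORT environment variable. The annotator names, the eval_<name> dataset naming, the source file and the labels are all made up for illustration, so adjust them to your setup.

```python
from typing import Dict, List, Tuple


def build_launch_plan(
    annotators: List[str],
    source: str,
    labels: str,
    base_port: int = 8080,
) -> List[Tuple[Dict[str, str], List[str]]]:
    """Return one (env, command) pair per annotator, each on its own port."""
    plan = []
    for offset, name in enumerate(annotators):
        # Assumption: PRODIGY_PORT overrides the port the app is served on.
        env = {"PRODIGY_PORT": str(base_port + offset)}
        # Hypothetical naming scheme: one dataset per annotator.
        dataset = f"eval_{name}"
        cmd = ["prodigy", "textcat.manual", dataset, source, "--label", labels]
        plan.append((env, cmd))
    return plan


# Hypothetical annotators, source file and labels.
plan = build_launch_plan(["alice", "bob"], "generations.jsonl", "FLUENT,COHERENT")
for env, cmd in plan:
    print(env["PRODIGY_PORT"], " ".join(cmd))
```

To actually start the servers, each pair could be passed to something like subprocess.Popen(cmd, env={**os.environ, **env}). Since every instance then serves the whole file to exactly one person, force_stream_order and feed_overlap are no longer needed to get full coverage per annotator.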

Having your stream expose a __len__ would be the easiest solution to tell Prodigy that there's a finite number of examples. (The other option for modifying the progress bar is to have your recipe expose a custom progress function – but this makes more sense if you want to use your own custom metric to determine the annotation progress.)
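As a minimal sketch of the __len__ approach: a plain generator has no length, so the stream can be wrapped in a small class that is both iterable and sized. The class name SizedStream and the example texts are invented here, and the sketch assumes the examples fit in memory as a list. In a custom recipe, the object would be returned under the "stream" key of the recipe's components. Whether the resulting progress is meaningful per annotator in a shared multi-user session is a separate question, as discussed above.

```python
from typing import Dict, Iterator, List


class SizedStream:
    """Wrap a finite list of tasks so the stream reports its length."""

    def __init__(self, examples: List[Dict]) -> None:
        self.examples = examples

    def __len__(self) -> int:
        # Number of examples in the stream.
        return len(self.examples)

    def __iter__(self) -> Iterator[Dict]:
        # Yield tasks one at a time, in their original order.
        for eg in self.examples:
            yield eg


# Hypothetical generated texts to evaluate.
texts = [
    "Generated sentence one.",
    "Generated sentence two.",
    "Generated sentence three.",
]
stream = SizedStream([{"text": text} for text in texts])
print(len(stream))
```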