force_stream_order and exposed stream length

amr · April 7, 2020, 7:37pm

Hello,
We're using Prodigy for human evaluation of neural text generation.
The setup is simply textcat.manual with force_stream_order and feed_overlap both set to true, so that every annotator labels every example.

However, I was wondering what would be the best way to get the stream length displayed.
If I go the custom recipe route where I define __len__ for my stream generator as described here, what else should I do to keep the behaviour from force_stream_order and feed_overlap?

Thank you,
Andrei

ines · April 8, 2020, 9:08am

Hi! I think what you describe here is currently a bit tricky, because if you run a multi-user session, all sessions share the same controller and instance (which is the point). But the controller currently only keeps track of the total counts, not the counts for the individual sessions, which can also change at runtime. This also recently came up in this thread related to a custom progress bar. We still need to find a good solution for this that works across all possible scenarios.

In the meantime, if your goal is to have all examples labelled by all annotators, you could also just start separate instances for each annotator on different ports. It's more instances overall, but it does give you a very clean separation between the sessions and the progress.
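As a rough sketch of the separate-instances idea, the snippet below plans one Prodigy process per annotator, each on its own port via the PRODIGY_PORT environment variable. The annotator names, dataset names, input path, labels and the plan_instances helper are all made up for illustration, and writing to one dataset per annotator is just one possible choice. Nothing is started here; the commands are only printed.

```python
import os


def plan_instances(annotators, base_port=8080):
    """Return one (command, env) pair per annotator, each on its own port.

    Hypothetical helper: dataset names, source path and labels are placeholders.
    """
    plans = []
    for offset, name in enumerate(annotators):
        # Each instance gets its own port through the PRODIGY_PORT variable.
        env = dict(os.environ, PRODIGY_PORT=str(base_port + offset))
        command = [
            "prodigy", "textcat.manual",
            f"eval_{name}",          # placeholder: one dataset per annotator
            "./generations.jsonl",   # placeholder input file
            "--label", "GOOD,BAD",   # placeholder labels
        ]
        plans.append((command, env))
    return plans


for command, env in plan_instances(["alice", "bob"]):
    print(env["PRODIGY_PORT"], " ".join(command))
    # To actually launch an instance: subprocess.Popen(command, env=env)
```

Since every instance reads the full input file on its own, each annotator sees every example and gets their own progress.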

Having your stream expose a __len__ would be the easiest solution to tell Prodigy that there's a finite number of examples. (The other option for modifying the progress bar is to have your recipe expose a custom progress function – but this makes more sense if you want to use your own custom metric to determine the annotation progress.)
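A plain Python generator can't have a `__len__`, so one way to do this is to wrap the examples in a small class. The following is only a minimal sketch in plain Python: the SizedStream name and the example data are made up, and it assumes the examples fit in memory so they can be counted up front.

```python
class SizedStream:
    """Iterable of examples that also reports how many examples it holds."""

    def __init__(self, examples):
        # Materialise the examples once so the length is known in advance.
        self._examples = list(examples)

    def __iter__(self):
        # Yield copies so changes downstream don't modify the stored examples.
        for example in self._examples:
            yield dict(example)

    def __len__(self):
        return len(self._examples)


# Placeholder examples for illustration only.
stream = SizedStream([
    {"text": "Generated sentence one."},
    {"text": "Generated sentence two."},
    {"text": "Generated sentence three."},
])
print(len(stream))
```

In a custom recipe, an object like this would be returned as the stream in place of a plain generator. As noted above, in a shared multi-user session the length only reflects the total number of examples, not each annotator's individual progress.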
