Feed overlap not working as expected

Hi @ines,

Continuing on this one,
I have this scenario: say I have 10 examples to annotate, served on two routes:

  • 127.0.0.1:8080/?session=user1
  • 127.0.0.1:8081/?session=user2

But this creates a separate dataset for each user.
I want user 1 to annotate 5 examples and user 2 to annotate the other 5.

But this does not always work as expected. If I open both sessions at the same time, annotate 5 examples as user1, save, and exit, then when I go to user2 it also shows me the examples that I already annotated as user1.

I don’t want this to happen. Any way?

Yes, this is the default behaviour: user feeds overlap so everyone gets to see the same examples. If you set "feed_overlap": false, the examples will be sent out without an overlap, meaning that every example will only be annotated once overall instead of once per annotator.

Here's the relevant section in the PRODIGY_README.html:

The "feed_overlap" setting in your prodigy.json or recipe config lets you configure how examples should be sent out across multiple sessions. By default (true), each example in the dataset will be sent out once for each session, so you'll end up with overlapping annotations (e.g. one per example per annotator). Setting "feed_overlap" to false will send out each example in the data once to whoever is available. As a result, your data will have each example labelled only once in total.
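As a minimal sketch of what that setting looks like on disk, the Python below builds a config dict and serializes it as JSON. Only "feed_overlap" comes from this thread; the `with_feed_overlap` helper and the "port" entry are placeholders for illustration, and where the resulting text is saved is up to you.

```python
import json


def with_feed_overlap(config: dict, overlap: bool) -> dict:
    """Return a copy of a config dict with "feed_overlap" set."""
    updated = dict(config)
    updated["feed_overlap"] = overlap
    return updated


# Placeholder for whatever settings are already in your prodigy.json.
existing = {"port": 8080}

config = with_feed_overlap(existing, overlap=False)

# JSON text that could be saved as prodigy.json; note the lowercase `false`.
config_text = json.dumps(config, indent=2)
print(config_text)
```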

Still, it does not work as expected.
Should user1 exit before user2 starts working?

(Moved this to a new topic so it’s easier to find and not hidden in the comments of the other thread.)

Where did you set the setting and what are you seeing? If you start two separate sessions, are both sessions still receiving the same questions?

I set "feed_overlap": false in prodigy.json.
Yes, I start two separate sessions at the same time, annotate 4 examples in one session, and save. Then I go to the other session, refresh, and still get the previous questions.

Hi @ines,

It seems to work for two Prodigy sessions running on the same port, but fails on different ports. Is this the expected behavior?

Ah yes, that makes sense, though, since you’re running separate instances. The named multi user sessions are specifically intended for running one single instance of Prodigy on one port.
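For illustration, here is a small sketch of how the per-annotator links could be built so that every session points at the same host and port and differs only in the ?session= parameter. The host, port and session names are taken from the example earlier in the thread; the `session_url` helper is hypothetical.

```python
from urllib.parse import urlencode, urlunsplit


def session_url(host: str, port: int, session: str) -> str:
    """Build a URL for one named session on a single running instance."""
    query = urlencode({"session": session})
    return urlunsplit(("http", f"{host}:{port}", "/", query, ""))


# Same host and port for everyone; only the session name changes.
urls = [session_url("127.0.0.1", 8080, name) for name in ("user1", "user2")]
for url in urls:
    print(url)
```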

Hi @ines,

This doesn't work for me. I run a single Prodigy instance on a single port, with "feed_overlap": false in prodigy.json. Each named session still gets the same feed.

Which version of Prodigy are you running? And are you running one single instance and accessing different sessions via the ?session= parameter?

Hello Ines, I'm having the same issue as the others: feed_overlap = false, one single instance of Prodigy on one single port, and multiple users accessing it in the browser via URLs like example.com/?session=peter and example.com/?session=paul.
It shows the current session, but it still sends the same entries to both sessions.

Prodigy version is 1.10.7

I am still somewhat confused about how to configure this properly. Would it be possible to use a different naming convention, such as "split dataset among sessions" or "repeat dataset per session"? Does feed_overlap = false mean that you split the dataset among the sessions, and feed_overlap = true that you repeat the entire dataset per session? Or is it the other way around?

feed_overlap = false means that the examples are sent out with no overlap, so different annotators will get different examples, whatever is next available in the stream.

feed_overlap = true means that every annotator will get the same examples. This is the same as starting different instances of Prodigy and giving each annotator their own instance with the same stream.
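To make the two modes concrete, here is a toy model in plain Python. It is not Prodigy's implementation (this sketch hands out one example per request, with no batching), and the `simulate` function and the request order are assumptions made purely for illustration.

```python
from collections import defaultdict


def simulate(stream, requests, feed_overlap):
    """Hand out examples to sessions in the order they request them.

    stream: list of examples
    requests: session names, one entry per "give me the next example" request
    """
    received = defaultdict(list)
    if feed_overlap:
        # Every session walks through its own copy of the full stream.
        positions = defaultdict(int)
        for session in requests:
            index = positions[session]
            if index < len(stream):
                received[session].append(stream[index])
                positions[session] += 1
    else:
        # One shared queue: each example goes to whoever asks next.
        shared = iter(stream)
        for session in requests:
            example = next(shared, None)
            if example is not None:
                received[session].append(example)
    return dict(received)


stream = ["a", "b", "c", "d"]
requests = ["user1", "user2", "user1", "user2"]

no_overlap = simulate(stream, requests, feed_overlap=False)
overlap = simulate(stream, requests, feed_overlap=True)
print(no_overlap)
print(overlap)
```

In this model, feed_overlap = false gives user1 and user2 disjoint examples, while feed_overlap = true gives both of them the same examples from the start of the stream.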

If you already know how you want to split up your source data, it's easier to just start separate instances of Prodigy with different input files. This way, you can control exactly which examples the different annotators will see, and you can use your own convention for how to name the datasets.
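A rough sketch of that approach, assuming the source data is a list of JSON-serializable examples and that a simple round-robin split is acceptable. The `split_examples` and `to_jsonl` helpers, the example texts and the annotator names are all hypothetical.

```python
import json


def split_examples(examples, annotators):
    """Assign examples to annotators round-robin, with no overlap."""
    shares = {name: [] for name in annotators}
    for index, example in enumerate(examples):
        shares[annotators[index % len(annotators)]].append(example)
    return shares


def to_jsonl(examples):
    """Serialize examples as newline-delimited JSON, one example per line."""
    return "\n".join(json.dumps(example) for example in examples)


# Made-up data matching the scenario above: 10 examples, two annotators.
examples = [{"text": f"example {i}"} for i in range(10)]
shares = split_examples(examples, ["user1", "user2"])

for name, share in shares.items():
    print(name, len(share))
```

Each share could then be written out with `to_jsonl` to its own input file and used as the source for a separate instance, so each annotator only ever sees their own 5 examples.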

I am experiencing the same situation. Same dataset, different sessions. We have feed_overlap = true. We are on the latest version of Prodigy. The workaround seems to be to break out the dataset for each user, but is there another setting that I'm missing here in addition to feed_overlap? Is this a known issue?

Hi Cheyanneb, has this been addressed by the fix from the thread "Duplicate annotations in output" (#12 by kab)?

As a quick aside, is there a way to set the percentage of overlap instead of having full overlap? This would be a nice feature, if it doesn't exist already! Thanks!

hi @vsocrates!

Thanks for the comment and sorry for the delay.

This functionality doesn't exist currently, but the team is actively looking into adding it as part of Prodigy v2. Thanks again for the feedback!

Looking forward to it, thanks!