Continuing on this one,
I have this scenario: 10 examples to annotate, and two routes:
127.0.0.1:8080/?session=user1
127.0.0.1:8081/?session=user2
But this gives each user their own dataset.
I want user 1 to annotate 5, and user 2 to annotate the other 5.
But this does not always work as expected. If I open both sessions at the same time, annotate 5 examples as user1, save, and exit, then go to user2, it also shows me the examples that I already annotated as user1.
Yes, this is the default behaviour: user feeds overlap so everyone gets to see the same examples. If you set "feed_overlap": false, the examples will be sent out without an overlap, meaning that every example will only be annotated once overall instead of once per annotator.
Here's the relevant section in the PRODIGY_README.html:
The "feed_overlap" setting in your prodigy.json or recipe config lets you configure how examples should be sent out across multiple sessions. By default (true), each example in the dataset will be sent out once for each session, so you'll end up with overlapping annotations (e.g. one per example per annotator). Setting "feed_overlap" to false will send out each example in the data once to whoever is available. As a result, your data will have each example labelled only once in total.
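For anyone who prefers to script the change, here is a minimal sketch of writing the setting into a prodigy.json-style file from Python. The helper name set_feed_overlap and the path you pass in are assumptions for illustration, not part of Prodigy; the only thing taken from the README is the "feed_overlap" key itself.

```python
import json
from pathlib import Path


def set_feed_overlap(config_path, overlap):
    """Write "feed_overlap" into a prodigy.json-style file, keeping other keys.

    Hypothetical helper, not a Prodigy API.
    """
    path = Path(config_path)
    # Start from the existing settings if the file is already there.
    if path.exists():
        config = json.loads(path.read_text(encoding="utf-8"))
    else:
        config = {}
    config["feed_overlap"] = bool(overlap)
    path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config
```

Calling set_feed_overlap("prodigy.json", False) leaves every other key in the file untouched and only adds or updates "feed_overlap".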
I set "feed_overlap": false in prodigy.json.
Yes, I start two separate sessions at the same time, annotate 4 from one session, and save. Then I go to the other session, refresh, and still get the previous questions.
Ah yes, that makes sense, though, since you’re running separate instances. The named multi user sessions are specifically intended for running one single instance of Prodigy on one port.
Doesn't work for me. I run a single prodigy session on a single port, with "feed_overlap": false in prodigy.json. There is still the same feed in each different named session.
Hello Ines, I'm having the same issue as the others: feed_overlap = false, one single instance of prodigy on one single port and multiple users accessing it through browser via url like example.com/?session=peter and example.com/?session=paul
It shows the current session but it still sends the same entries to both sessions.
I am still somewhat confused about how to configure this properly. Is it possible to use a different naming convention, such as "split dataset among sessions" or "repeat dataset per session"? Does feed_overlap = false mean that you split the dataset among the sessions? And does feed_overlap = true mean you repeat the entire dataset per session? Or is it the other way around?
feed_overlap = false means that the examples are sent out with no overlap, so different annotators will get different examples, whatever is next available in the stream.
feed_overlap = true means that every annotator will get the same examples. This is the same as starting different instances of Prodigy and giving each annotator their own instance with the same stream.
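To make the difference concrete, here is a toy simulation of the two modes. It is only a sketch of the behaviour described above, not Prodigy's implementation: the function simulate_feed is made up for illustration, and it assumes the sessions simply take turns asking for the next example.

```python
from collections import defaultdict
from itertools import cycle


def simulate_feed(examples, sessions, feed_overlap):
    """Toy model: return {session: [examples it receives]}.

    Not Prodigy code; sessions here take turns requesting one example.
    """
    received = defaultdict(list)
    if feed_overlap:
        # Every session works through its own copy of the full stream.
        for session in sessions:
            received[session] = list(examples)
    else:
        # One shared stream: each example goes to whoever asks next.
        for example, session in zip(examples, cycle(sessions)):
            received[session].append(example)
    return dict(received)


examples = list(range(10))
print(simulate_feed(examples, ["user1", "user2"], feed_overlap=True))
print(simulate_feed(examples, ["user1", "user2"], feed_overlap=False))
```

With feed_overlap=True both sessions receive all 10 examples; with feed_overlap=False each example lands in exactly one session. In a real server the examples are handed out in batches to whichever session requests them, so the split will generally not alternate one by one as it does in this toy model.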
If you already know how you want to split up your source data, it's easier to just start separate instances of Prodigy with different input files. This way, you can control exactly which examples the different annotators will see, and you can use your own convention for how to name the datasets.
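As a sketch of that approach, the following splits a JSONL source file into one contiguous chunk per annotator. The function name split_jsonl, the output naming scheme, and the assumption that the source is one JSON object per line are all illustrative choices, not anything Prodigy prescribes.

```python
import json
from pathlib import Path


def split_jsonl(source_path, annotators, out_dir):
    """Split a JSONL file into one contiguous chunk per annotator.

    Returns {annotator: Path of the file written for them}.
    """
    if not annotators:
        raise ValueError("need at least one annotator")
    source = Path(source_path)
    lines = [
        line
        for line in source.read_text(encoding="utf-8").splitlines()
        if line.strip()
    ]
    for line in lines:
        json.loads(line)  # fail early if a line is not valid JSON
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    chunk = -(-len(lines) // len(annotators))  # ceiling division
    paths = {}
    for i, name in enumerate(annotators):
        part = lines[i * chunk:(i + 1) * chunk]
        path = out_dir / f"{source.stem}_{name}.jsonl"
        path.write_text("".join(l + "\n" for l in part), encoding="utf-8")
        paths[name] = path
    return paths
```

With 10 examples and two annotators this writes two files of 5 examples each. Each file can then be used as the input for its own Prodigy instance, with whatever dataset name you choose for that annotator.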
I am experiencing the same situation. Same dataset, different sessions. We have feed_overlap = true. We are on the latest version of Prodigy. The workaround seems to be to break out the dataset for each user, but is there another setting that I'm missing here in addition to feed_overlap? Is this a known issue?
As a quick aside, is there a way to set the percentage of overlap instead of having full overlap? This would be a nice feature, if it doesn't exist already! Thanks!
This functionality doesn't exist currently but the team is actively looking into adding the functionality as part of Prodigy v2. Thanks again for the feedback!