hi @kushalrsharma,
When all sessions have the same data, we call that "overlapping" annotations.
However, it seems that you want "non-overlapping" annotations, i.e., each example in the data is sent out once, to whoever is available. As a result, each example in your data will be labelled only once in total.
This is exactly what setting "feed_overlap" to false will do (this is the default value).
But it sounds like you're saying that your "feed_overlap" is still set to false and yet it is not producing that behavior, right?
Try to run:
PRODIGY_CONFIG_OVERRIDES='{"feed_overlap": false}' python -m prodigy ...
This should override everything. I'm concerned you may have accidentally set "feed_overlap": true at some point, or that you are not pointing to the correct prodigy.json. You can technically have one prodigy.json for your project and a separate global one. (You may want to run python -m prodigy stats to verify the path of your global prodigy.json.)
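If it helps to see both files side by side, here is a minimal Python sketch that prints the "feed_overlap" value found in each prodigy.json. It assumes the global file sits in the Prodigy home directory (PRODIGY_HOME if set, otherwise ~/.prodigy) and the project file in the current working directory; the helper names are made up for this example, and it does not look at PRODIGY_CONFIG_OVERRIDES.

```python
import json
import os
from pathlib import Path


def candidate_config_paths(cwd=None):
    """Return the config files to inspect: global first, then project-level.

    Assumption: the global file lives in the Prodigy home directory
    (PRODIGY_HOME if set, else ~/.prodigy) and the project file in `cwd`.
    """
    home = Path(os.environ.get("PRODIGY_HOME", Path.home() / ".prodigy"))
    project_dir = Path(cwd) if cwd is not None else Path.cwd()
    return [home / "prodigy.json", project_dir / "prodigy.json"]


def read_feed_overlap(paths):
    """Map each existing config file to its "feed_overlap" value (None if unset)."""
    found = {}
    for path in paths:
        path = Path(path)
        if not path.is_file():
            continue  # skip files that do not exist
        with path.open(encoding="utf8") as f:
            config = json.load(f)
        found[str(path)] = config.get("feed_overlap")
    return found


if __name__ == "__main__":
    for path, value in read_feed_overlap(candidate_config_paths()).items():
        print(f"{path}: feed_overlap = {value!r}")
```

A value of None here just means the key is not set in that file. If either file shows True, that is a likely suspect.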
Also, you may want to consider resetting your overrides:
export PRODIGY_CONFIG_OVERRIDES="{}"
Last, when providing examples, please make them reproducible for us. This will help you get faster responses. Since we don't have your data, it is impossible for us to confirm what the problem is and that we're talking about the same issue. I've created the example below to show what the difference should look like.
Using this data:
nyt_text_dedup.jsonl (18.5 KB)
feed_overlap: false ("non-overlapping")
PRODIGY_CONFIG_OVERRIDES='{"feed_overlap": false}' python3 -m prodigy ner.manual ner_ex blank:en nyt_text_dedup.jsonl --label ORG
First, open browser for session1: "ryan"
Don't annotate any examples.
Then open a 2nd browser for session2: "kushu"
Notice how this starts "kushu" at record number 10 (since batch_size is 10).
feed_overlap: true ("overlapping")
PRODIGY_CONFIG_OVERRIDES='{"feed_overlap": true}' python3 -m prodigy ner.manual ner_ex blank:en nyt_text_dedup.jsonl --label ORG
First, open browser for session1: "ryan"
Don't annotate any examples.
Then open a 2nd browser for session2: "kushu"
Notice how this starts "kushu" at record number 0 (that is, "ryan" and "kushu" have the same data, hence their annotations "overlap").
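To make the difference concrete without needing the data file, here is a toy Python model of the two modes. It is only a sketch of the behavior described above, not Prodigy's actual feed implementation; the function name and the 30-record stream are made up for illustration.

```python
from itertools import islice


def first_batches(n_examples, batch_size, sessions, feed_overlap):
    """Return the first batch of record numbers each session would receive."""
    shared = iter(range(n_examples))  # one queue for everyone (non-overlapping)
    batches = {}
    for name in sessions:  # sessions connect in this order
        # Overlapping: every session reads its own copy of the stream from the start.
        stream = iter(range(n_examples)) if feed_overlap else shared
        batches[name] = list(islice(stream, batch_size))
    return batches


if __name__ == "__main__":
    for overlap in (False, True):
        result = first_batches(30, 10, ["ryan", "kushu"], overlap)
        starts = {name: batch[0] for name, batch in result.items()}
        print(f"feed_overlap={overlap}: {starts}")
```

With feed_overlap=False, "ryan" starts at record 0 and "kushu" at record 10; with feed_overlap=True, both start at record 0, which matches what you should see in the two browser sessions.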
Hope this helps to clarify and let me know if this clears up any confusion!