We just noticed duplicate annotations. We are using ?session=name_date to track annotations, and our workers never work on the same task at the same time, but we are still getting duplicate data by session. Is --exclude=database not enough for this?
We are using ner.manual and mark, and for now we are excluding all sessions at start.
Please help us with this.
Thanks and best regards!
Just to make sure I understand your goal correctly: You want to annotate your data so that every annotator sees different examples? If so, did you customise the "feed_overlap" setting and set it to false? See here (from the docs):
The "feed_overlap" setting in your prodigy.json or recipe config lets you configure how examples should be sent out across multiple sessions. By default (true), each example in the dataset will be sent out once for each session, so you’ll end up with overlapping annotations (e.g. one per example per annotator). Setting it to false will send out each example in the data once to whoever is available. As a result, your data will have each example labelled only once in total.
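For reference, here is a rough sketch of how the setting could be switched off in a prodigy.json file without touching the other settings. The helper name set_feed_overlap and the file path are made up for illustration; only the "feed_overlap" key comes from the docs quoted above. Editing the file by hand works just as well.

```python
import json
from pathlib import Path


def set_feed_overlap(config_path, overlap):
    """Set "feed_overlap" in a prodigy.json-style file, keeping other settings.

    Hypothetical helper for illustration; not part of Prodigy itself.
    """
    path = Path(config_path)
    # Start from the existing settings if the file is already there.
    if path.exists():
        config = json.loads(path.read_text(encoding="utf-8"))
    else:
        config = {}
    # False: each example goes out once in total, to whoever is available.
    # True: each example goes out once per session (overlapping annotations).
    config["feed_overlap"] = bool(overlap)
    path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config
```

Calling set_feed_overlap("prodigy.json", False) in the directory you start the server from should leave a file containing "feed_overlap": false alongside whatever settings were already there. The server needs a restart to pick up the change.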
And when you say “duplicated data”, do you mean that the annotators are seeing the same example twice in the same session? Across different sessions? Or after you restart the server and they start annotating again with the same session name?
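To help answer that, one option is to export the dataset (for example with prodigy db-out) and check where the repeats come from. The sketch below assumes each exported record is one JSON object per line and carries a "_task_hash" and a "_session_id" field; the function name find_duplicates is just for illustration.

```python
import json
from collections import defaultdict


def find_duplicates(lines, hash_key="_task_hash"):
    """Report examples saved more than once, and by which sessions.

    `lines` is any iterable of JSONL strings, e.g. an open file.
    Assumes records have `hash_key` and "_session_id" fields.
    """
    sessions_by_hash = defaultdict(list)
    for line in lines:
        line = line.strip()
        if not line:
            continue
        example = json.loads(line)
        task_hash = example.get(hash_key)
        if task_hash is None:
            # Skip records without a hash instead of lumping them together.
            continue
        sessions_by_hash[task_hash].append(example.get("_session_id"))

    report = {}
    for task_hash, sessions in sessions_by_hash.items():
        if len(sessions) < 2:
            continue
        report[task_hash] = {
            "sessions": sessions,
            # True if at least one session saved this example more than once.
            "repeated_within_session": len(set(sessions)) < len(sessions),
        }
    return report
```

With the export saved as annotations.jsonl (a placeholder file name), running find_duplicates(open("annotations.jsonl", encoding="utf-8")) gives one entry per repeated example. Entries where "repeated_within_session" is False point to overlap across sessions, which is what the default "feed_overlap" behaviour would produce.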