Feed overlap issue (latest release)

Hi,

I am wondering if anyone else has issues with the feed overlap setting in the latest Prodigy release (1.10.4). It seems that there is always feed overlap, even if you set feed_overlap to false in the config file (prodigy.json).
The feed_overlap functionality worked fine in the previous version we had installed (1.9.9) before upgrading to the latest version.
We use the mark recipe with the ner_manual view.

Best,
Manolis

Hi @mkyriakakis,

I tried to reproduce your overlap problem with the mark recipe and the ner_manual view, but I wasn't able to get any overlapping examples to show up when feed_overlap is false.

Can you provide more details to help reproduce the issue? What command line arguments do you use when launching your task? Can you provide a few examples from your dataset? Are you using named sessions when you open the Prodigy web app (i.e. is there ?session=name in the URL)? Is the problem reproduced every time, and if so, what are the steps you take?

Thanks,
-Justin

Hi @justindujardin,

Thanks for your reply. Of course, I've attached a sample of my dataset.

  • We do use named sessions.
  • The problem is reproduced every time. To confirm, we ran the same setup on the previous Prodigy version (1.9.9) instead of the latest release (1.10.4), and the previous version works fine (no feed overlap).
  • The command we run to start the Prodigy server is the following:
    prodigy mark Test test.jsonl --view-id ner_manual
  • The configuration inside the prodigy.json is the following:
{
  "theme": "basic",
  "batch_size": 10,
  "port": 8080,
  "host": "localhost",
  "cors": true,
  "db": "sqlite",
  "db_settings": {},
  "api_keys": {},
  "validate": true,
  "auto_exclude_current": true,
  "instant_submit": false,
  "feed_overlap": false,
  "show_stats": true,
  "hide_meta": false,
  "show_flag": false,
  "instructions": false,
  "swipe": false,
  "split_sents_threshold": false,
  "diff_style": "words",
  "html_template": false,
  "global_css": null,
  "javascript": null,
  "writing_dir": "ltr",
  "show_whitespace": false,
  "hide_newlines": false,
  "ner_manual_require_click": false,
  "ner_manual_label_style": "list",
  "choice_style": "single",
  "choice_auto_accept": false,
  "darken_image": 0,
  "show_bounding_box_center": false,
  "preview_bounding_boxes": false,
  "shade_bounding_boxes": false
}

test.jsonl (1.9 MB)

Manolis

Thanks for sharing a sample of your dataset.

Using your configuration and data, I couldn't reproduce your overlap problem. One thing I noticed, though, is that the mark recipe now uses a default of force_stream_order=True, which causes the stream to repeat unanswered questions until someone answers them. This can cause some duplicates to be shown if two users annotate at the same moment, but any duplicate answers are filtered out on the server when they're received.

Could you try setting force_stream_order to false in your prodigy.json file and see if that alleviates the problem you're seeing?
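
As a minimal sketch, assuming the rest of your prodigy.json stays exactly as you posted it, the two relevant entries would look like the fragment below. Note that JSON uses lowercase false rather than Python's False.

```json
{
  "feed_overlap": false,
  "force_stream_order": false
}
```

You'd need to restart the Prodigy server after changing the file so the new setting is picked up.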

If that doesn't fix the issue, can you tell me in more detail about how you see duplicates? For example, "I open browser 1 and see 'this text', then I answer and save the question, and open another browser that sees the same thing"

Thanks,
-Justin

Thanks @justindujardin,

Setting force_stream_order to false in prodigy.json did indeed fix the issue.

Manolis
