Auto-accept behavior for binary classification results in accept when all annotators agree on reject

It appears that using --auto-accept with the prodigy review recipe on binary text classification data unexpectedly sets the answer field in the reviewed dataset to accept, even when all sessions agree that the label should be rejected.

To reproduce

  1. Create a simple binary, single-label classification task using prodigy textcat.manual with a single label.
  2. Annotate with multiple user sessions, making sure that the sessions disagree (accept vs. reject) on some items and that all sessions reject at least one item.
  3. Run prodigy review with the --auto-accept flag and resolve the conflicts.
  4. Run prodigy db-out on the reviewed dataset.

The result is that every annotation that all sessions agreed should be rejected is marked as accept in the resulting .jsonl file.
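To spot affected rows in the exported file, something like the following rough sketch may help. It assumes each reviewed example has a top-level "answer" and a "versions" list whose entries carry their own "answer". The helper name find_flipped_rejects and the sample records are made up for illustration and are not part of Prodigy.

```python
import json


def find_flipped_rejects(jsonl_lines):
    """Return reviewed examples whose versions all say 'reject'
    while the top-level answer says 'accept'."""
    flipped = []
    for line in jsonl_lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines in the export
        eg = json.loads(line)
        versions = eg.get("versions", [])
        # Assumption: each version stores the annotators' shared "answer".
        all_rejected = bool(versions) and all(
            v.get("answer") == "reject" for v in versions
        )
        if all_rejected and eg.get("answer") == "accept":
            flipped.append(eg)
    return flipped


# Hypothetical records shaped like lines from the exported .jsonl file.
sample = [
    json.dumps({"text": "a", "answer": "accept",
                "versions": [{"answer": "reject", "sessions": ["s1", "s2"]}]}),
    json.dumps({"text": "b", "answer": "accept",
                "versions": [{"answer": "accept", "sessions": ["s1", "s2"]}]}),
]
print([eg["text"] for eg in find_flipped_rejects(sample)])  # ['a']
```

In practice the lines would come from the exported file, for example by passing an open file handle instead of the sample list.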

I see why this may be happening given that accept and reject have a slightly different meaning for binary classification tasks. I can also appreciate that I could easily work around this by creating my own filter_review_stream function as indicated in this post.

But since this was added as a feature and had some unexpected behavior for my task, I figured I should point this out. Perhaps a quick documentation update could help.

Again, thanks for Prodigy and spaCy in general!

Thanks for the report – this is a really good point and a use case we hadn't considered for this specific feature :+1: We'll definitely fix this for the next release!

In the meantime, you should be able to change this yourself by opening the file recipes/review.py in your Prodigy installation (you can run prodigy stats to find the path) and looking for this line in the filter_auto_accept_stream function:

eg["answer"] = "accept"

Changing it to something like this should work:

eg["answer"] = versions[0]["answer"]

This will take the answer value from the first (and only) version of the merged example, which will be the binary decision for binary annotations, and "accept" for all others.
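As a rough illustration of what the change does, here is a minimal sketch on plain dictionaries. The two helper functions are hypothetical stand-ins for the single line inside filter_auto_accept_stream, not the actual recipe code, and the versions data is made up.

```python
def original_answer(versions):
    # Mirrors the current line: the merged example is always accepted.
    return "accept"


def patched_answer(versions):
    # Mirrors the suggested line: reuse the answer of the first (and only)
    # version, so a unanimous binary "reject" stays "reject".
    return versions[0]["answer"]


# Hypothetical merged versions for two auto-accepted examples.
all_rejected = [{"answer": "reject", "sessions": ["s1", "s2"]}]
all_accepted = [{"answer": "accept", "sessions": ["s1", "s2"]}]

print(original_answer(all_rejected), patched_answer(all_rejected))
print(original_answer(all_accepted), patched_answer(all_accepted))
```

With the patched line, the first example keeps its reject answer, while the second is unaffected.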