Auto-accept behavior for binary classification results in accept when all annotators agree on reject

crscheid · January 5, 2022, 6:54pm

It appears that using --auto-accept with the prodigy review recipe on binary text classification data unexpectedly sets the answer field in the reviewed dataset to accept, even when all sessions agree that the label should be rejected.

To reproduce

Create a simple binary, single-label classification task and run prodigy textcat.manual with a single label.
Annotate with multiple user sessions, making sure that some sessions disagree on whether to accept or reject an example, and that at least one example is rejected by all sessions.
Run prodigy review with the --auto-accept flag and resolve the conflicts.
Run prodigy db-out on the reviewed dataset.

The result is that every annotation where all reviewers agreed on reject is marked as accept in the resulting .jsonl file.
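For anyone who wants to check whether their own data is affected, here is a minimal sketch of the comparison. It assumes that both the source annotations and the reviewed records carry Prodigy's "_input_hash" and "answer" fields, and that each db-out file has already been parsed into a list of dicts (for example with json.loads on each line). The function name and the sample records are hypothetical, not part of Prodigy.

```python
from collections import defaultdict


def find_flipped_rejects(source_examples, reviewed_examples):
    """Return input hashes where every annotator rejected but the review says accept.

    Hypothetical helper: assumes each record is a dict with "_input_hash"
    and "answer" keys, as in parsed db-out output.
    """
    # Collect the set of answers given for each input across all sessions.
    answers_by_hash = defaultdict(set)
    for eg in source_examples:
        answers_by_hash[eg["_input_hash"]].add(eg["answer"])

    flipped = []
    for eg in reviewed_examples:
        unanimous_reject = answers_by_hash.get(eg["_input_hash"]) == {"reject"}
        if unanimous_reject and eg["answer"] == "accept":
            flipped.append(eg["_input_hash"])
    return flipped


# Made-up records standing in for parsed db-out lines.
source = [
    {"_input_hash": 1, "answer": "reject"},  # session A
    {"_input_hash": 1, "answer": "reject"},  # session B, agrees on reject
    {"_input_hash": 2, "answer": "accept"},  # session A
    {"_input_hash": 2, "answer": "reject"},  # session B, disagrees
]
reviewed = [
    {"_input_hash": 1, "answer": "accept"},  # auto-accepted despite unanimous reject
    {"_input_hash": 2, "answer": "accept"},  # conflict resolved manually
]

print(find_flipped_rejects(source, reviewed))
```

Any hash this returns is an example that all sessions rejected but that ended up as accept in the reviewed dataset.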

I see why this may be happening given that accept and reject have a slightly different meaning for binary classification tasks. I can also appreciate that I could easily work around this by creating my own filter_review_stream function as indicated in this post.

But since this was added as a feature and had some unexpected behavior for my task, I figured I should point this out. Perhaps a quick documentation update could help.

Again, thanks for Prodigy and spaCy in general!

ines · January 6, 2022, 10:51am

Thanks for the report – this is a really good point and a use case we hadn't considered for this specific feature. We'll definitely fix this for the next release!

In the meantime, you should be able to change this yourself by opening the file recipes/review.py in your Prodigy installation (you can run prodigy stats to find the path) and looking for this line in the filter_auto_accept_stream function:

eg["answer"] = "accept"

Changing it to something like this should work:

eg["answer"] = versions[0]["answer"]

This will take the answer value from the first (and only) version of the merged example, which will be the binary decision for binary annotations, and "accept" for all others.
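To illustrate the idea outside of the recipe, here is a small standalone sketch. It does not import Prodigy, and the function name and sample data are hypothetical. The only assumption taken from the description above is that an auto-accepted example has a single merged version with an "answer" key.

```python
def auto_accept_answer(versions):
    """Pick the answer for an example whose annotators all agreed.

    Hypothetical helper, not the actual recipe code: `versions` is assumed
    to be the list of merged versions, each a dict with an "answer" key.
    Returns None when the annotators disagreed (more than one version),
    meaning the example still needs manual review.
    """
    if len(versions) != 1:
        return None
    # Reuse the agreed-upon decision instead of hard-coding "accept".
    return versions[0]["answer"]


# All sessions rejected the label: the agreed answer stays "reject".
agreed_reject = [{"answer": "reject", "sessions": ["alice", "bob"]}]
# All sessions accepted: unchanged behavior.
agreed_accept = [{"answer": "accept", "sessions": ["alice", "bob"]}]
# Sessions disagreed: two versions, so nothing is auto-accepted.
conflict = [
    {"answer": "accept", "sessions": ["alice"]},
    {"answer": "reject", "sessions": ["bob"]},
]

print(auto_accept_answer(agreed_reject))
print(auto_accept_answer(agreed_accept))
print(auto_accept_answer(conflict))
```

For non-binary annotations, the single agreed version would normally have been accepted, so the result stays "accept" in those cases.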
