Reviewing Ignored Cases

mwidjaja1 · September 1, 2020, 3:40pm

Hi all!

We are trying to review entries that have been annotated (via textcat) by multiple annotators. We have questions about how Prodigy handles a text entry when an annotator chooses to ignore it. Our impression is that Prodigy will never show a text entry once it's labeled 'Ignore' by at least one annotator.

Is that right? And if so, is there a way to change that behavior? Thanks!

ines · September 2, 2020, 9:35am

Hi! Yes, by default the review recipe excludes all ignored answers. It shouldn't drop an example if another annotator answered something else, but it won't show the ignored answer.

I definitely see the point, though, that if you use the "ignore" action to indicate "don't know the answer", you may still want to see who skipped it. So we should probably at least expose an option that lets you toggle whether to show who ignored that example.

In the meantime, you should also be able to change this yourself pretty easily: you can run prodigy stats to find the location of your Prodigy installation. Then open recipes/review.py and find the following line (should be around line 102) and remove it:

examples = (eg for eg in examples if eg["answer"] != "ignore")
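
To make the effect of that line concrete, here is a minimal standalone sketch of the same filter with a toggle added. It is not the code from recipes/review.py: the function name filter_ignored, the show_ignored parameter, and the "annotator" key in the sample data are all made up for illustration.

```python
from typing import Any, Dict, Iterable, Iterator

Example = Dict[str, Any]


def filter_ignored(
    examples: Iterable[Example], show_ignored: bool = False
) -> Iterator[Example]:
    """Yield examples, dropping answer == "ignore" unless show_ignored is set."""
    for eg in examples:
        if show_ignored or eg.get("answer") != "ignore":
            yield eg


# Illustrative data only; the "annotator" key is a stand-in, not a Prodigy field.
annotations = [
    {"text": "Great product", "answer": "accept", "annotator": "alice"},
    {"text": "Great product", "answer": "ignore", "annotator": "bob"},
    {"text": "Not sure about this", "answer": "ignore", "annotator": "alice"},
]

# Default behaviour: ignored answers are dropped before review.
default_view = list(filter_ignored(annotations))

# With the toggle on, every answer is kept, including the ignored ones.
full_view = list(filter_ignored(annotations, show_ignored=True))
```

With the filter active, the second text disappears entirely because its only answer was "ignore", while the first text survives through the other annotator's "accept".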

mwidjaja1 · September 24, 2020, 6:47pm

Sorry for never responding! For some reason I didn't get a reply notification, and since I've been stomping out other issues, I forgot about this entire conversation. Thanks for the information, that is great to know, and we'll give it a spin!

ines · November 11, 2020, 9:50am

Update: Just released v1.10.5, which introduces a --show-skipped flag in the review recipe that will show examples that would otherwise be skipped (ignored answers or rejected annotations in manual mode).

valentijnnieman · April 22, 2022, 1:28pm

Hey! I found this thread while Googling around. I'm not sure if there's a better way to do this, but I haven't been able to find one yet: I'd like to be able to review only the ignored answers. We came up with a workflow where one person labels and, if they are not sure, ignores the example, and then someone more knowledgeable reviews only the ignored ones. I'm solving this now by running a Python script that filters for the ignored answers. It's just a bit of a hassle: I first have to export the data with db-out, then filter it with the script, then do db-in, review, db-out, and then db-merge (I think!).
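
For readers wondering what such a filter script might look like, here is a rough sketch. It assumes the db-out export is a JSONL file with one annotation per line and an "answer" key on each record. The function names and file paths are hypothetical, and this is not the poster's actual script.

```python
import json
from pathlib import Path
from typing import Any, Dict, Iterable, Iterator


def select_by_answer(
    lines: Iterable[str], answer: str = "ignore"
) -> Iterator[Dict[str, Any]]:
    """Parse JSONL lines and yield only records whose "answer" matches."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        eg = json.loads(line)
        if eg.get("answer") == answer:
            yield eg


def filter_jsonl(in_path: str, out_path: str, answer: str = "ignore") -> int:
    """Write matching records from in_path to out_path; return how many were kept."""
    count = 0
    with Path(in_path).open(encoding="utf8") as f_in, Path(out_path).open(
        "w", encoding="utf8"
    ) as f_out:
        for eg in select_by_answer(f_in, answer):
            f_out.write(json.dumps(eg) + "\n")
            count += 1
    return count


if __name__ == "__main__":
    # Hypothetical file names: the input is whatever db-out produced.
    kept = filter_jsonl("annotations.jsonl", "ignored_only.jsonl")
    print(f"Kept {kept} ignored examples")
```

The output file could then be loaded back with db-in for the second reviewer, as described in the post.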

spothedog1 · July 7, 2022, 3:19pm

Yeah, I agree with @valentijnnieman. I want to use ignore as a way to mark an example so I can come back and review it later. I guess I'm looking for a way to reintroduce ignored examples into the same dataset and have them pop up again so I can give them a second try. A Python script works, but it requires constantly creating new datasets rather than editing existing ones.

I think what would help is some way to scroll through and view annotated examples in the GUI, pretty much exactly like the History section, but covering all of the annotated examples instead of just the ones done in that session.

I suppose what I'm asking for is a way to scroll through and re-label examples that have already been annotated and saved.

ryanwesslen · July 7, 2022, 3:40pm

Hi @spothedog1 !

Are you aware of the :ignore suffix (it could also be :accept or :reject) that you can add to a dataset name, along with the dataset: prefix, to review only ignored examples?

For example:

python -m prodigy rel.manual ner_ignore_data blank:en dataset:ner_data:ignore --label SUBJECT,OBJECT

This will enable you to create a review dataset ner_ignore_data where you only review the ignored records from ner_data. This documentation provides more details.
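
To spell out how the source string in that command is put together, here is a small sketch that splits a string such as dataset:ner_data:ignore into its dataset name and answer filter. This is only an illustration of the naming convention described above, not how Prodigy parses it internally, and parse_dataset_source is a hypothetical helper.

```python
from typing import Optional, Tuple

VALID_ANSWERS = {"accept", "reject", "ignore"}


def parse_dataset_source(source: str) -> Optional[Tuple[str, Optional[str]]]:
    """Split "dataset:NAME[:ANSWER]" into (NAME, ANSWER or None).

    Returns None if the string does not start with the "dataset:" prefix.
    """
    prefix = "dataset:"
    if not source.startswith(prefix):
        return None
    rest = source[len(prefix):]
    # Only treat the last segment as a filter if it is a known answer value.
    name, sep, answer = rest.rpartition(":")
    if sep and answer in VALID_ANSWERS:
        return name, answer
    return rest, None


with_filter = parse_dataset_source("dataset:ner_data:ignore")
without_filter = parse_dataset_source("dataset:ner_data")
not_a_dataset = parse_dataset_source("./data.jsonl")
```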

Have you tried modifying "history_size" and/or "batch_size" in the configuration file? Be aware that raising them can cause memory issues if you have a lot of annotations. It also increases the risk of losing work if you forget to click save, since many records will be sitting in your browser and not yet in the database. That is the main reason both default to 10 records.

spothedog1 · July 7, 2022, 3:47pm

I actually did not know about that, it is very helpful thank you. I think that helps solve my problem.

lauvil · July 21, 2023, 7:30pm

Has something changed regarding this usage? I am using 1.12.0. I am getting an error message:

Dataset: 'ugs_dates:ignore' not found in the currently configured
Prodigy Database: sqlite"

I am using this in the python code:

stream = get_stream(source)

I am executing a custom recipe with this

prodigy my-dates ugs_dates_ignore en_core_web_lg dataset:ugs_dates:ignore -l DATE -F ./Recipe_Dates.py

Any suggestions?

ryanwesslen · July 21, 2023, 7:37pm

I'll need to verify, but I know a few bugs were found and fixed after v1.12.0. I know of one specifically related to get_stream and its handling of dataset: sources.

Can you try to install v1.12.4 and let us know if you still have the problem?

lauvil · July 21, 2023, 8:03pm

I upgraded to v1.12.4 and have the same error.

ryanwesslen · July 21, 2023, 8:33pm

Gotcha. Just curious - does running dataset:ugs_dates work fine? That seems to be the case for me. I can see the issue with either :ignore, :accept, or :reject.

lauvil · July 24, 2023, 6:10pm

Sorry for not answering sooner. Yes, dataset:ugs_dates works, but the answer suffix does not.

ryanwesslen · July 24, 2023, 6:42pm

Yes, we've confirmed it's a small bug that was introduced in our recent v1.12.0 release. We're planning to ship a fix in v1.12.5 in a day or so. We'll post back when it's available.

ryanwesslen · July 28, 2023, 3:36pm

Hi @lauvil,

Prodigy v1.12.5 was just released on PyPI including a fix for your problem. Thanks again for reporting!
