Reviewing Ignored Cases

Hi all!

We are trying to review entries which have been annotated (via textcat) by multiple annotators. We have questions about how Prodigy handles text entries when an annotator chooses to ignore them. Our impression is that Prodigy will never show a text entry once it's labeled 'Ignore' by at least one annotator.

Is that right? And if so, is there a way to change that behavior? Thanks!

Hi! By default, the review recipe will exclude all ignored answers, yes - it shouldn't ignore an example if another annotator answered something else, but it wouldn't show the ignored answer.

I definitely see the point, though, that if you use the "ignore" action to indicate "don't know the answer", you may still want to see who skipped it. So we should probably at least expose an option that lets you toggle whether to show who ignored that example.

In the meantime, you should also be able to change this yourself pretty easily: you can run prodigy stats to find the location of your Prodigy installation. Then open recipes/review.py and find the following line (should be around line 102) and remove it:

examples = (eg for eg in examples if eg["answer"] != "ignore")
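To illustrate what that line does, here is a minimal sketch of the same filter with a switch added. The function name filter_ignored and the show_ignored parameter are made up for this example and are not part of Prodigy; the only assumption taken from the line above is that each example is a dict with an "answer" key.

```python
from typing import Dict, Iterable, Iterator


def filter_ignored(examples: Iterable[Dict], show_ignored: bool = False) -> Iterator[Dict]:
    """Yield examples, dropping ignored ones unless show_ignored is True.

    Hypothetical helper for illustration only, not Prodigy's own code.
    """
    for eg in examples:
        # .get() is used so examples without an "answer" key pass through
        if show_ignored or eg.get("answer") != "ignore":
            yield eg


sample = [
    {"text": "first", "answer": "accept"},
    {"text": "second", "answer": "ignore"},
    {"text": "third", "answer": "reject"},
]

default_view = list(filter_ignored(sample))
full_view = list(filter_ignored(sample, show_ignored=True))
```

With the default, the ignored example is dropped, which mirrors the current behavior; passing show_ignored=True has the same effect as deleting the line.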

Sorry for never responding back! For some reason, I didn't get a reply notification at all, and since I've been stomping out other issues, I forgot about this entire conversation. Thanks for the information, that is great to know, and we'll give it a spin!

Update: Just released v1.10.5, which introduces a --show-skipped flag in the review recipe that will show examples that would otherwise be skipped (ignored answers or rejected annotations in manual mode).

Hey! I found this thread while Googling around. I'm not sure if there's a better way to do this, but I haven't been able to find one yet: I'd like to be able to review only the ignored answers. We came up with a workflow where one person labels and, if they are not sure, ignores the example, and then someone more knowledgeable reviews only the ignored ones. I'm solving this now by running a Python script that keeps only the answers that were ignored. It's just a bit of a hassle to first export the data with db-out, then filter it with the script, then do db-in, review, db-out, and then db-merge (I think!).
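For anyone who wants to try the same approach, the filtering step could look roughly like the sketch below. It assumes the db-out export is a JSONL file with one example per line and an "answer" key on each example; the function name keep_ignored and the file names are placeholders, not anything Prodigy provides.

```python
import json
from pathlib import Path


def keep_ignored(in_path: str, out_path: str) -> int:
    """Copy only the ignored examples from one JSONL file to another.

    Returns the number of examples written. Hypothetical helper.
    """
    kept = 0
    with Path(in_path).open(encoding="utf8") as src, \
            Path(out_path).open("w", encoding="utf8") as dst:
        for line in src:
            line = line.strip()
            if not line:
                continue  # skip blank lines in the export
            eg = json.loads(line)
            if eg.get("answer") == "ignore":
                dst.write(json.dumps(eg) + "\n")
                kept += 1
    return kept


if __name__ == "__main__":
    import sys

    # Placeholder usage: python filter_ignored.py export.jsonl ignored.jsonl
    if len(sys.argv) == 3:
        print(keep_ignored(sys.argv[1], sys.argv[2]))
```

The output file would then be the input for the db-in step of the workflow described above.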

Yeah, I agree with @valentijnnieman. I want to use ignore as an option to come back and review an example later. I guess I'm looking for something where I can reintroduce ignored examples into the same dataset and have them pop up again so I can give them a second try. A Python script works, but it requires constantly creating new datasets rather than editing existing ones.

I think some way to just scroll through and view annotated examples in the GUI would help, pretty much exactly like the History section, but with all of the annotated examples instead of just the ones done in that session.

I suppose what I'm asking for is a way to scroll through and re-label examples that have already been annotated and saved.

Hi @spothedog1 !

Are you aware of the :ignore suffix (it could also be :accept or :reject) that you can add to a dataset name, along with the dataset: prefix, to review only ignored examples?

For example:

python -m prodigy rel.manual ner_ignore_data blank:en dataset:ner_data:ignore --label SUBJECT,OBJECT

This will enable you to create a review dataset ner_ignore_data where you only review the ignored records from ner_data. This documentation provides more details.
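To make the naming convention concrete, here is a rough sketch of how a source string like the one above can be read as a dataset name plus an answer filter. This is only an illustration of the convention, not Prodigy's actual implementation; parse_dataset_source is a hypothetical helper.

```python
from typing import Optional, Tuple

# The three answer suffixes mentioned above
VALID_ANSWERS = {"accept", "reject", "ignore"}


def parse_dataset_source(source: str) -> Tuple[str, Optional[str]]:
    """Split 'dataset:NAME[:ANSWER]' into (NAME, ANSWER or None).

    Hypothetical helper for illustration only.
    """
    prefix = "dataset:"
    if not source.startswith(prefix):
        raise ValueError(f"Expected a source starting with {prefix!r}: {source!r}")
    rest = source[len(prefix):]
    # Only treat the last segment as an answer filter if it is a known answer
    name, sep, suffix = rest.rpartition(":")
    if sep and suffix in VALID_ANSWERS:
        return name, suffix
    return rest, None


with_filter = parse_dataset_source("dataset:ner_data:ignore")
without_filter = parse_dataset_source("dataset:ner_data")
```

Read this way, dataset:ner_data:ignore means "the examples in ner_data whose answer is ignore", while dataset:ner_data means all of its examples.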

Have you tried modifying the "history_size" and/or "batch_size" settings in the configuration file? Be aware that larger values may cause memory issues if you have a lot of annotations, and further problems if you forget to click save, since many records will then be in your browser and not yet in the database. This is the main reason why we set both to 10 records by default.

I actually did not know about that. It is very helpful, thank you. I think that helps solve my problem.

Has something changed regarding this usage? I am using 1.12.0. I am getting an error message:

Dataset: 'ugs_dates:ignore' not found in the currently configured
Prodigy Database: sqlite"

I am using this in the Python code:

stream = get_stream(source)

I am executing a custom recipe with this:

prodigy my-dates ugs_dates_ignore en_core_web_lg dataset:ugs_dates:ignore -l DATE -F ./Recipe_Dates.py

Any suggestions?

I'll need to verify, but I know there were a few bugs found and fixed after v1.12.0. I know of one specifically related to get_stream and how it handles dataset:.

Can you try to install v1.12.4 and let us know if you still have the problem?

I upgraded to v1.12.4 and have the same error.

Gotcha. Just curious - does running dataset:ugs_dates work fine? That seems to be the case for me. I can see the issue with either :ignore, :accept, or :reject.

Sorry for not answering back. Yes, dataset:ugs_dates works, but the answer suffix part does not.

Yes, we've found it's a small bug. We're looking to make an update for v1.12.5 in a day or so. This came up from our recent v1.12.0 release. We'll post back when it's available.

Hi @lauvil,

Prodigy v1.12.5 was just released on PyPI including a fix for your problem. Thanks again for reporting!