Hi, I'm trying to run the built-in review recipe on my annotated dataset. Unfortunately I'm getting an error in the web app: TypeError: Cannot read properties of undefined (reading 'forEach').
PRODIGY_LOGGING=basic doesn't give me any errors related to this, only the UI.
It seems that enabling --auto-accept is what causes this error, as the recipe works fine when I disable that option. Of course, it massively speeds up the review process to not have to click through all the non-conflicting annotations, so that option seems quite useful. For info, the dataset was generated using a slight adaptation of the textcat.correct recipe from GitHub. Thanks.
Thanks for the information and welcome to the Prodigy community!
You may have found a bug. We really appreciate the info.
Could you provide one example of a record from your database so I can try to recreate the error on my end? I suspect it'll be in the standard format for the classification UI, but I want to be sure.
You can provide the example like this:
from prodigy.components.db import connect
db = connect()
examples = db.get_dataset("my_dataset")
examples[0] # if first record is causing this issue
Feel free to change the example if there are any proprietary details in your example.
We'll try to provide an update later next week if we're able to confirm it's a bug or if we can find a solution.
Thanks for the very quick response! I did a bit of further digging on my side and I think it's in fact not related to the --auto-accept option, but instead due to a quirk of my dataset (though one that Prodigy should perhaps handle more gracefully).
I think it's due to the following:
When setting up my annotation task, I started by using the built-in recipe for textcat.manual.
We annotated some examples, then trained a model and continued from there with the textcat.correct recipe.
...except I had to use a custom version of textcat.correct from the GitHub examples page (I think it was because we needed to support active learning with multiple labels, which the built-in recipe doesn't do).
It seems this GitHub version of textcat.correct records an additional output field. I see this in the .jsonl file when I do prodigy db-out and when I output the data with your script, Ryan.
It looks like "config":{"choice_style":"multiple"} is present in some of my examples, but not in others. I assume this is because the built-in recipe did not include that field by default.
It looks like the web app falls over when it sees duplicate entries like this (which are exactly the ones I would want to review): one with and one without the above field included.
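To check whether a given export actually contains this mix, here is a minimal sketch in plain Python that works on the db-out file rather than through Prodigy itself. The function names are hypothetical, and grouping duplicates by the "text" field is an assumption; a different key can be passed if your records are better identified by another field.

```python
import json
from collections import defaultdict


def load_jsonl(path):
    """Read one JSON object per non-empty line of a .jsonl file."""
    with open(path, encoding="utf8") as f:
        return [json.loads(line) for line in f if line.strip()]


def find_mixed_config(examples, key="text"):
    """Return the key values whose records disagree on having a "config" field.

    Assumption: records with the same value under `key` are duplicates of
    the same input.
    """
    seen = defaultdict(set)
    for eg in examples:
        # Record whether this copy of the input carries a "config" field.
        seen[eg.get(key)].add("config" in eg)
    # A group is mixed when it contains both True and False.
    return [k for k, flags in seen.items() if len(flags) == 2]
```

Calling find_mixed_config(load_jsonl("debug.jsonl")) should then list the inputs that appear both with and without the field.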
I uploaded an example debug.jsonl file with the above quirk, and you should be able to reproduce the error with the commands below. For info, I'm on Prodigy 1.11.8:
# Import to a dummy dataset
prodigy db-in debug_data ./debug.jsonl
# Launch the annotation server with the review 'recipe'
prodigy review debug_review debug_data
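One way to test this hypothesis (not a confirmed fix) might be to make every record consistent before importing. The sketch below drops the "config" field from each record and writes a new file; the function names and the output file name are hypothetical, and it is an untested assumption that the review recipe then loads cleanly. Keep the original export, since this discards the field.

```python
import json


def strip_config(examples):
    """Return copies of the records without the top-level "config" field."""
    return [{k: v for k, v in eg.items() if k != "config"} for eg in examples]


def rewrite_jsonl(src, dst):
    """Copy a .jsonl file, dropping "config" from every record."""
    with open(src, encoding="utf8") as fin:
        examples = [json.loads(line) for line in fin if line.strip()]
    with open(dst, "w", encoding="utf8") as fout:
        for eg in strip_config(examples):
            fout.write(json.dumps(eg) + "\n")
```

For example, rewrite_jsonl("debug.jsonl", "debug_stripped.jsonl") followed by importing debug_stripped.jsonl into a fresh dataset with prodigy db-in would show whether the error still occurs once the field is gone.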