Hello,
I am using Prodigy 1.8.3. I have annotated a dataset, but unfortunately I made a few mistakes that I want to correct. How can I review/edit the labels? When I launch Prodigy again, I can see it loading the annotations, but at the end of the process the web page says "No tasks available.".
Clearly it remembers what has been annotated previously and skips over these examples.
I am using a recipe with "ner.manual" and everything is stored in the SQLite database. I have tried both excluding and not excluding the dataset I previously annotated, but nothing changes.
What is the trick to tell Prodigy to review the previous examples with the previous annotations?
Hi! Prodigy's datasets are append-only by design, so if you want to edit or correct existing annotations, you'd conceptually add a new data point, or a new dataset, so you always keep a reference to the annotations you (or someone else) previously created. By default, Prodigy filters out examples that are already in the dataset, so you're not asked a question twice.
Prodigy's input and output formats are the same, so an easy way to re-annotate your dataset is to export the annotations using db-out and then load the JSONL file back in as the input data for ner.manual and save the results to a new dataset.
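As a rough illustration of that workflow: the export step would look something like prodigy db-out my_dataset > annotations.jsonl (my_dataset and the file name are placeholders). Before loading the file back in, it can be worth checking what the export contains. The sketch below is plain Python, not part of Prodigy. It assumes each exported line is a JSON object that may carry an "answer" key and a "spans" list, and summarize_export is a hypothetical helper name.

```python
import json
from collections import Counter


def summarize_export(lines):
    """Summarize a db-out JSONL export (hypothetical helper, not a Prodigy API).

    Returns a dict of answer counts and the number of examples that
    already carry at least one span.
    """
    answers = Counter()
    with_spans = 0
    for line in lines:
        line = line.strip()
        if not line:
            # Skip blank lines, e.g. a trailing newline at the end of the file
            continue
        example = json.loads(line)
        # Assumption: the answer is stored under "answer";
        # examples without that key are counted as "missing"
        answers[example.get("answer", "missing")] += 1
        if example.get("spans"):
            with_spans += 1
    return dict(answers), with_spans


# Usage with an exported file (placeholder file name):
# with open("annotations.jsonl", encoding="utf8") as f:
#     print(summarize_export(f))
```

If the counts look right, the same file can be passed as the source to ner.manual together with a new dataset name, so that the corrected annotations are saved separately from the originals.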
In Prodigy v1.10+, you can also use the dataset: shorthand in the source argument, so dataset:my_cool_dataset:accept would only load examples from my_cool_dataset that were answered with "accept".
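On a version older than v1.10 (such as the 1.8.3 mentioned in the question), a similar effect can probably be had by filtering the exported JSONL yourself before loading it back in. This is only a sketch in plain Python, assuming the exported examples store their answer under an "answer" key. filter_by_answer and the file names are made up for illustration and are not part of Prodigy.

```python
import json


def filter_by_answer(lines, answer="accept"):
    """Yield only the examples whose "answer" matches (hypothetical helper)."""
    for line in lines:
        line = line.strip()
        if not line:
            # Ignore blank lines in the export
            continue
        example = json.loads(line)
        if example.get("answer") == answer:
            yield example


# Usage (placeholder file names): keep only accepted examples
# with open("annotations.jsonl", encoding="utf8") as src, \
#         open("accepted.jsonl", "w", encoding="utf8") as dst:
#     for example in filter_by_answer(src, "accept"):
#         dst.write(json.dumps(example) + "\n")
```

The filtered file can then be used as the input source in the same way as the full export.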
If you have multiple annotations on the same input with potential conflicts (e.g. by multiple annotators who disagree), you can also use the review workflow: https://prodi.gy/docs/recipes#review This lets you see all variations side-by-side and create one final correct answer. The original answers are preserved in the data, so you always know what led to the final decision.