NER: Pass an annotated dataset to Prodigy for validation / small corrections


For NER, I have a data set with annotations. I need a quick way to load each sentence with the associated named entities and review these annotations, potentially making small corrections.

What is the best way to do this with Prodigy? Is there a way to pass in a dataset with "hints" about the named entity labels?


Hi! Prodigy's input and output formats are the same, so you can always export a dataset to a JSONL file using db-out, and then load that back into Prodigy as the text source. Recipes like ner.manual will respect pre-set spans, so you'll be able to go through your existing annotations again and correct them if needed. Just make sure to use a new dataset for the reviewed annotations, so you're not mixing old and new data and can easily start over if something goes wrong.
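If your annotations aren't in a Prodigy dataset yet, here's a rough sketch of how you could convert them to JSONL with pre-set spans. It assumes your data is available as text plus (start, end, label) character offsets; the helper names `to_prodigy_task` and `to_jsonl` and the example sentence are made up for illustration and aren't part of Prodigy.

```python
import json


def to_prodigy_task(text, entities):
    """Build one task dict with a "text" and a list of "spans".

    `entities` is assumed to be a list of (start, end, label) tuples
    using character offsets into `text` (hypothetical input format).
    """
    spans = [{"start": start, "end": end, "label": label}
             for start, end, label in entities]
    return {"text": text, "spans": spans}


def to_jsonl(tasks):
    """Serialize tasks as newline-delimited JSON, one task per line."""
    return "".join(json.dumps(task, ensure_ascii=False) + "\n" for task in tasks)


# Made-up example data, only here to show the shape of the records.
examples = [
    ("Apple hired Tim Cook.", [(0, 5, "ORG"), (12, 20, "PERSON")]),
]
tasks = [to_prodigy_task(text, ents) for text, ents in examples]
print(to_jsonl(tasks), end="")
```

You could then save that output to a file and load it with something like `prodigy ner.manual reviewed_dataset blank:en annotations.jsonl --label ORG,PERSON`, where the dataset name, file name and labels are placeholders for your own. If the annotations already live in a Prodigy dataset, `prodigy db-out old_dataset > annotations.jsonl` gives you the same kind of file. One thing to watch out for: the span offsets should line up with the token boundaries of the tokenizer you load, otherwise the spans may not be displayed as expected.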

If your data includes duplicates, e.g. annotations on the same text created by multiple annotators, and you want to merge them and resolve conflicts, you can also use the review recipe. This will show you all annotations on the same text and let you create one final "master annotation". But this really only makes sense if you have multiple annotations on the same data. Otherwise it's overkill.
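To check whether review is worth it for your data, a quick script along these lines can tell you which texts have conflicting annotations. This is only a sketch: it groups examples by their exact "text" value and compares the spans, and the function name `find_conflicts` and the sample records are invented for illustration.

```python
from collections import defaultdict


def find_conflicts(examples):
    """Return the texts that occur with more than one distinct set of spans."""
    versions_by_text = defaultdict(set)
    for eg in examples:
        # Reduce each annotation to a hashable set of (start, end, label).
        spans = frozenset(
            (span["start"], span["end"], span["label"])
            for span in eg.get("spans", [])
        )
        versions_by_text[eg["text"]].add(spans)
    return [text for text, versions in versions_by_text.items()
            if len(versions) > 1]


# Made-up records: two annotators disagree on the first text only.
examples = [
    {"text": "Apple hired Tim Cook.",
     "spans": [{"start": 0, "end": 5, "label": "ORG"}]},
    {"text": "Apple hired Tim Cook.",
     "spans": [{"start": 0, "end": 5, "label": "ORG"},
               {"start": 12, "end": 20, "label": "PERSON"}]},
    {"text": "Berlin is nice.",
     "spans": [{"start": 0, "end": 6, "label": "GPE"}]},
    {"text": "Berlin is nice.",
     "spans": [{"start": 0, "end": 6, "label": "GPE"}]},
]
conflicts = find_conflicts(examples)
print(conflicts)
```

If that list comes back non-empty, running something like `prodigy review final_dataset annotator_a,annotator_b` (dataset names are placeholders) lets you resolve the differences. If it's empty, going through the data once with ner.manual is probably all you need.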