Hi Prodigy Team,
New user here so apologies if this is a duplicate question. Is there a recipe that handles correcting the existing annotations (ner.correct) while at the same time giving the option to add a new entity type?
My actual use case is to extend the NER model with a new entity type, but I'd also like the option to correct mistakes in a few of the existing labels as I go through the data.
What's the best approach to go about achieving this?
The ner.correct recipe lets you pass in labels that are already in the model, as well as new labels. So you can set something like --label PERSON,ORG,MY_NEW_LABEL and Prodigy will show you the model's predictions for PERSON and ORG, and let you add new annotations for MY_NEW_LABEL.
This is a very good approach for adding a new label btw because it means that your data includes examples of what the model already predicts plus the new annotations, and it can help prevent forgetting effects etc.
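To make the label setup above concrete, here is a minimal Python sketch that splits a requested label list into labels the model already predicts and labels that are new, then assembles the matching ner.correct command line. The helper names, the dataset name my_dataset, the model en_core_web_sm, its assumed label set and the source file ./data.jsonl are all hypothetical placeholders, not something from this thread, so adjust them to your own setup.

```python
import shlex


def split_labels(requested, model_labels):
    """Split requested labels into those the model already predicts and new ones."""
    known = [label for label in requested if label in model_labels]
    new = [label for label in requested if label not in model_labels]
    return known, new


def build_command(dataset, model, source, labels):
    """Assemble a ner.correct call with a comma-separated --label value."""
    args = ["prodigy", "ner.correct", dataset, model, source,
            "--label", ",".join(labels)]
    return shlex.join(args)


# Assumption: the labels the existing model can predict (placeholder values).
model_labels = {"PERSON", "ORG", "GPE"}
requested = ["PERSON", "ORG", "MY_NEW_LABEL"]

known, new = split_labels(requested, model_labels)
# Placeholder dataset, model and source names.
command = build_command("my_dataset", "en_core_web_sm", "./data.jsonl", requested)

print("Pre-filled from model predictions:", known)
print("Annotated manually from scratch:", new)
print(command)
```

Labels in the first group should show up as pre-highlighted suggestions you can correct, while labels in the second group only appear when you add them by hand.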
Hi Ines, Thanks for getting back to me. Looks like ner.correct does not have an active learning component. What would be the proper recipe to add a new entity and have active learning assist me in labelling?
ner.correct will show you all examples exactly as they come in. It doesn't do any example selection using an updated model in the loop, and it doesn't skip less relevant examples in favour of better ones. It just lets you correct the model's suggestions and use the model to help you annotate faster.
I'm not sure if example selection with active learning really makes sense in your use case, since you want to add a new label from scratch. That can be pretty difficult if you start out with a model that knows nothing about the new entity type, and you need to make sure it gets to see enough examples so it can learn about it. So I would at least collect some gold-standard annotations first for all of your entity types so you can pretrain a model. You can always improve it by having the model select examples to annotate later.
Btw, on the original question, it's definitely possible to update the model in the loop in a manual workflow. It just comes down to experimentation; see the thread "Combining ner.teach with patterns file and manual correction of spans".