I'd like to extend the existing NER model

thalish · September 24, 2020, 7:31am

Hi Prodigy Team,

New user here so apologies if this is a duplicate question. Is there a recipe that handles correcting the existing annotations (ner.correct) while at the same time giving the option to add a new entity type?

My actual use case is to extend the NER model with a new entity type, but I'd also like the option to correct mistakes in a few of the labels as I go through the data.

What's the best approach to go about achieving this?

ines · September 24, 2020, 8:56am

Hi! The ner.correct recipe lets you pass in labels that are already in the model, as well as new labels. So you can set something like --label PERSON,ORG,MY_NEW_LABEL and Prodigy will show you the model's predictions for PERSON and ORG, and let you add new annotations for MY_NEW_LABEL.

This is a very good approach for adding a new label btw because it means that your data includes examples of what the model already predicts plus the new annotations, and it can help prevent forgetting effects etc.
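To make the label behaviour above concrete, here is a rough sketch in plain Python of what the resulting data looks like. It is not Prodigy's implementation: `build_task` is a hypothetical helper, the example text and the `PRODUCT` prediction are made up, and the only thing assumed about Prodigy is the usual task shape with a `"text"` and a list of `"spans"` carrying `start`, `end` and `label`.

```python
def build_task(text, predicted_spans, model_labels, new_label_spans=()):
    """Hypothetical helper: keep model predictions for the selected existing
    labels and add manually annotated spans for the new label."""
    # Only predictions whose label was requested are kept.
    kept = [s for s in predicted_spans if s["label"] in model_labels]
    # Merge with the manual spans and order everything by character offset.
    spans = sorted(kept + list(new_label_spans), key=lambda s: s["start"])
    return {"text": text, "spans": spans}


text = "Alice Smith works at Acme on Widget."
predicted = [
    {"start": 0, "end": 11, "label": "PERSON"},
    {"start": 21, "end": 25, "label": "ORG"},
    {"start": 29, "end": 35, "label": "PRODUCT"},  # not requested, so dropped
]
# Span added by hand for the label the model does not know yet.
manual = [{"start": 29, "end": 35, "label": "MY_NEW_LABEL"}]

task = build_task(text, predicted, {"PERSON", "ORG"}, manual)
for span in task["spans"]:
    print(span["label"], repr(text[span["start"]:span["end"]]))
```

Each saved example then carries both the existing labels and the new one, which is what helps against forgetting when you train on it.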

thalish · September 25, 2020, 3:10am

Hi Ines, Thanks for getting back to me. Looks like ner.correct does not have an active learning component. What would be the proper recipe to add a new entity and have active learning assist me in labelling?

ines · September 25, 2020, 7:49am

Yes, ner.correct shows you all examples exactly as they come in. It doesn't select examples using an updated model in the loop, and it doesn't skip less relevant examples in favour of better ones. It just lets you correct the model's suggestions and use the model to help you annotate faster.
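As a rough illustration of the difference, the sketch below contrasts a stream that passes everything through with one that only keeps examples the model is unsure about. This is plain Python and not Prodigy's actual sorter: `stream_all`, `stream_uncertain`, the `margin` parameter and the scores are all made up for the example.

```python
def stream_all(scored_examples):
    """Pass every example through in its original order."""
    for _score, example in scored_examples:
        yield example


def stream_uncertain(scored_examples, margin=0.2):
    """Hypothetical selection: keep only examples whose score is close
    to 0.5, i.e. where the model is least certain."""
    for score, example in scored_examples:
        if abs(score - 0.5) <= margin:
            yield example


# (score, example) pairs; the scores are invented for illustration.
scored = [(0.95, "a"), (0.55, "b"), (0.10, "c"), (0.40, "d")]

all_examples = list(stream_all(scored))
uncertain = list(stream_uncertain(scored))
print(all_examples)
print(uncertain)
```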

I'm not sure if example selection with active learning really makes sense in your use case, since you want to add a new label from scratch. That can be pretty difficult if you start out with a model that knows nothing about the new entity type, and you need to make sure it gets to see enough examples so it can learn about it. So I would at least collect some gold-standard annotations first for all of your entity types so you can pretrain a model. You can always improve it by having the model select examples to annotate later.
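One simple way to check whether the model will get to see enough examples is to count the annotated spans per label before training. The sketch below assumes examples in the `"text"` / `"spans"` task shape; `label_counts`, `underrepresented` and the `minimum` threshold are hypothetical, and a sensible threshold depends on your data.

```python
from collections import Counter


def label_counts(examples):
    """Count annotated spans per entity label."""
    counts = Counter()
    for eg in examples:
        for span in eg.get("spans", []):
            counts[span["label"]] += 1
    return counts


def underrepresented(counts, labels, minimum):
    """Return the labels that have fewer than `minimum` annotated spans."""
    return [label for label in labels if counts[label] < minimum]


# Tiny made-up dataset; the last example has no annotations at all.
examples = [
    {"text": "...", "spans": [{"start": 0, "end": 3, "label": "PERSON"}]},
    {"text": "...", "spans": [{"start": 0, "end": 3, "label": "PERSON"},
                              {"start": 4, "end": 7, "label": "ORG"}]},
    {"text": "...", "spans": [{"start": 0, "end": 3, "label": "MY_NEW_LABEL"}]},
    {"text": "..."},
]

counts = label_counts(examples)
needs_more = underrepresented(counts, ["PERSON", "ORG", "MY_NEW_LABEL"], minimum=2)
print(dict(counts))
print(needs_more)
```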

Btw, on the original question, it's definitely possible to update the model in the loop in a manual workflow. It just comes down to experimentation; see the thread "Combining ner.teach with patterns file and manual correction of spans".
