Following up on last time's experiments, I have successfully trained a high-accuracy NER model for my use case. Once I figured out what went wrong, Prodigy made annotating data and training models smooth.
Now, before I use this model in production, I want to make sure it only outputs the entity types it was trained on (PERSON, ORG), and not any other types (DATE, etc.).
I could rely on the catastrophic forgetting effect -- the model seems to have already forgotten the other types in most cases. Or I could simply ignore any other NER types it outputs.
But I want to be 100% sure that it will never assign a higher probability to another NER type than to the types I care about.
So, how can I remove a predefined NER type from a model?
The answer to this should be quite simple as well: when you run ner.batch-train, just make sure you’re training from a blank NER model, rather than one with pre-defined types.
If you want to use pre-trained vectors, you can start with the en_vectors_web_lg model, instead of e.g. en_core_web_md. Otherwise, just make a blank model like this:
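For instance, with spaCy (which Prodigy's `ner.batch-train` builds on), a blank English model can be created and saved roughly like this — the output path `blank_en_model` is just an illustrative name:

```python
import spacy

# A blank "en" pipeline has no trained components at all, so there
# are no pre-defined entity types (DATE, GPE, etc.) that could ever
# be predicted -- only the labels you train on will exist.
nlp = spacy.blank("en")

# Save the blank model so it can be passed to ner.batch-train as
# the base model ("blank_en_model" is an example directory name).
nlp.to_disk("blank_en_model")
```

You would then point the recipe at that directory as the base model when training (the dataset and output names in your own command are up to you).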