How to remove a predefined NER type?

Following my earlier experiments, I have successfully trained a high-accuracy NER model specifically for my use case. Once I figured out what went wrong, Prodigy has been great and smooth for annotating data and training models.

Now, before I use this model in production, I want to make sure it only outputs the trained NER types (PERSON, ORG), and not any other types (DATE, etc.).

I could rely on the catastrophic forgetting effect: it seems the model has already forgotten the other types in most cases. Or I could simply ignore any other NER types it outputs.

But I want to be 100% sure that it won't at some point decide that another NER type has a higher probability than the NER types I care about.

So, how can I remove a predefined NER type from a model?


The answer to this should be quite simple as well: when you run ner.batch-train, just make sure you’re training from a blank NER model, rather than one with pre-defined types.

If you want to use pre-trained vectors, you can start with the en_vectors_web_lg model, instead of e.g. en_core_web_md. Otherwise, just make a blank model like this:

import spacy
spacy.blank("en").to_disk("/path/to/blank_model")  # placeholder path; "en" is assumed

Then you can pass that directory to Prodigy.
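
As a slightly fuller sketch of the blank-model route, the snippet below creates a blank English pipeline, saves it to a directory, and reloads it to confirm that no predefined entity recognizer came along with it. The helper name `make_blank_model`, the temporary output directory, and the choice of "en" are assumptions for illustration, not anything Prodigy requires.

```python
import tempfile
from pathlib import Path

import spacy


def make_blank_model(output_dir, lang="en"):
    """Create a blank pipeline for `lang` and save it to `output_dir`.

    Hypothetical helper: the name and signature are illustrative only.
    """
    nlp = spacy.blank(lang)  # tokenizer and vocab only, no trained components
    nlp.to_disk(output_dir)
    return nlp


# Assumed location: a fresh temporary directory, so nothing is overwritten.
out = Path(tempfile.mkdtemp()) / "blank_en_model"
make_blank_model(out)

# Reload from disk and inspect the pipeline. A blank model should have
# no "ner" component, and therefore no predefined entity types.
reloaded = spacy.load(out)
print("Saved to:", out)
print("Pipeline components:", reloaded.pipe_names)
```

The directory printed at the end is what you would then hand to Prodigy as the model to train from. Since the starting point has no entity recognizer, the trained model should only know the labels present in your annotations.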