Following up on last time's experiments, I have successfully trained a high-accuracy NER model for my use case. Once I figured out what went wrong, Prodigy made annotating data and training models smooth.
Now, before I use this model in production, I want to make sure it only outputs the entity types it was trained on (PERSON, ORG), and not any other types (DATE, etc.).
I could rely on the catastrophic forgetting effect -- the model seems to have already forgotten the other types in most cases. Or I could simply ignore any other NER types it outputs.
But I want to be 100% sure that it will never assign a higher probability to another NER type than to the types I care about.
So, how can I remove a predefined NER type from a model?
The answer to this should be quite simple as well: when you run ner.batch-train, just make sure you’re training from a blank NER model, rather than one with pre-defined types.
If you want to use pre-trained vectors, you can start with the en_vectors_web_lg model, instead of e.g. en_core_web_md. Otherwise, just make a blank model like this:
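For instance, with spaCy (which Prodigy's `ner.batch-train` builds on), a blank English model can be created and saved roughly like this — the output path `blank_en_model` is just an illustrative name:

```python
import spacy

# A blank "en" pipeline has no trained components at all, so there
# are no pre-defined entity types (DATE, GPE, etc.) that could ever
# be predicted -- only the labels you train on will exist.
nlp = spacy.blank("en")

# Save the blank model so it can be passed to ner.batch-train as
# the base model ("blank_en_model" is an example directory name).
nlp.to_disk("blank_en_model")
```

You would then point the recipe at that directory as the base model when training (the dataset and output names in your own command are up to you).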