Remove some labels from existing model?

ryanweaver718 · December 22, 2018, 8:47pm

Is there a way you can remove labels and their data from an existing model while keeping other labels present in the model?

ines · December 27, 2018, 11:17am

No, there's no easy way to just stop a pre-trained model from predicting a label. spaCy's pre-trained English models were trained on ~2 million words, and their weights reflect all of the labels seen during training, so the labels influence one another. For example, if a model is only trained with examples of numbers and not dates, it might predict dates as numbers, but if it also sees examples of dates, this will significantly change the overall analysis of a text containing dates.

If you just don't want to see certain labels, it makes more sense to filter doc.ents so that you only keep the entities whose labels you care about. For example, in your own code, you could do something like this (or, more elegantly, expose it as a custom extension attribute like doc._.filtered_ents):

excluded_labels = ("PERSON", "ORG")
ents = [ent for ent in doc.ents if ent.label_ not in excluded_labels]
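The custom extension attribute mentioned above could look roughly like the sketch below. It registers a getter with Doc.set_extension so that doc._.filtered_ents is computed on access. The names EXCLUDED_LABELS and get_filtered_ents are made up for illustration, and the demo uses a blank pipeline with hand-set entities (an assumption, so that it runs without downloading a model) rather than real predictions.

```python
import spacy
from spacy.tokens import Doc, Span

# Labels to hide from the filtered view (illustrative; adjust as needed).
EXCLUDED_LABELS = ("PERSON", "ORG")


def get_filtered_ents(doc):
    """Return the doc's entities, minus those with an excluded label."""
    return [ent for ent in doc.ents if ent.label_ not in EXCLUDED_LABELS]


# Register the getter once. force=True avoids an error if this code is re-run.
Doc.set_extension("filtered_ents", getter=get_filtered_ents, force=True)

# Demo: a blank English pipeline with entities set by hand.
# With a pre-trained model, doc.ents would come from the model's predictions.
nlp = spacy.blank("en")
doc = nlp("Ines works at Explosion in Berlin")
doc.ents = [
    Span(doc, 0, 1, label="PERSON"),
    Span(doc, 3, 4, label="ORG"),
    Span(doc, 5, 6, label="GPE"),
]

filtered = [(ent.text, ent.label_) for ent in doc._.filtered_ents]
print(filtered)
```

Note that doc.ents itself is left untouched, so the model's full predictions are still available if you need them later.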

If you want to train a model with your own examples, you'd have to decide whether it makes sense to update an existing pre-trained model, or if you're better off starting with a fresh model. My comment on this thread explains the two approaches and the trade-offs in more detail.
