Is there a way you can remove labels and their data from an existing model while keeping other labels present in the model?
No, there's no easy way to just stop a pre-trained model from predicting a label. spaCy's pre-trained English models were trained on ~2 million words, and their weights were learned jointly across all of the labels, so the predictions for one label depend on the others. For example, if a model is only trained with examples of numbers and not dates, it might predict dates as numbers – but if it also sees examples of dates, this will significantly change the overall analysis of a text containing dates.
If you just don't want to see certain labels, it makes more sense to add a filter around doc.ents that only returns the entities whose labels you want to keep. For example, in your own code, you could do something like this (it would be even more elegant as a custom extension attribute):
excluded_labels = ("PERSON", "ORG")
ents = [ent for ent in doc.ents if ent.label_ not in excluded_labels]
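The custom extension attribute idea could look roughly like the sketch below. The attribute name `filtered_ents`, the constant `EXCLUDED_LABELS`, and the hand-set entities are assumptions made for illustration. A blank pipeline is used so the snippet runs without downloading a model; with a pre-trained model, `doc.ents` would be filled in by the entity recognizer instead.

```python
import spacy
from spacy.tokens import Doc, Span

# Hypothetical choice of labels to hide; adjust to your use case.
EXCLUDED_LABELS = ("PERSON", "ORG")


def get_filtered_ents(doc):
    """Return the doc's entities, minus those with an excluded label."""
    return [ent for ent in doc.ents if ent.label_ not in EXCLUDED_LABELS]


# Register a getter-based extension, available as doc._.filtered_ents.
# force=True only makes the snippet safe to re-run in the same session.
Doc.set_extension("filtered_ents", getter=get_filtered_ents, force=True)

# Blank pipeline with manually assigned entities (illustration only).
nlp = spacy.blank("en")
doc = nlp("Alice joined Acme in 2019")
doc.ents = [
    Span(doc, 0, 1, label="PERSON"),  # "Alice"
    Span(doc, 2, 3, label="ORG"),     # "Acme"
    Span(doc, 4, 5, label="DATE"),    # "2019"
]

filtered = [(ent.text, ent.label_) for ent in doc._.filtered_ents]
print(filtered)
```

Note that this only hides entities after the fact. The model still predicts the excluded labels, and `doc.ents` itself is left unchanged.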
If you want to train a model with your own examples, you'd have to decide whether it makes sense to update an existing pre-trained model, or whether you're better off starting with a fresh model. My comment on this thread explains the two approaches and the trade-offs in more detail: