I have successfully trained a couple of NER models for new classes using Prodigy on top of the default spaCy 2 models. Each of these needed around 2,000 to 5,000 annotations to achieve good accuracy (above 90%). Now I want to combine these mutually exclusive classes into a single model.
The naive approach was to merge all the datasets used for training the individual models and run a batch train over all labels. But even after experimenting with hyperparameters and adding thousands of additional annotations to the dataset, the combined model still performs nowhere near as well as the individual models.
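Concretely, the merge step looked roughly like this sketch using Prodigy's Python database API (the dataset names here are placeholders):

```python
# Rough sketch of the dataset merge (dataset names are placeholders).
from prodigy.components.db import connect

db = connect()  # uses the database settings from prodigy.json

merged = []
for name in ("ner_person", "ner_org", "ner_product"):
    merged.extend(db.get_dataset(name))

# Store everything in a new dataset and batch-train on that.
db.add_dataset("ner_combined")
db.add_examples(merged, datasets=["ner_combined"])
```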
Are there some common pitfalls when combining multiple labels that I might not have considered?
One possible explanation would be that the data is being merged incorrectly, so I want to try merging all the spans myself and then using spaCy directly instead of the Prodigy commands.
While investigating how the spans are merged from Prodigy datasets, I wanted to convert the annotations from Prodigy to spaCy's format. I found this post (Mixing in gold data to avoid catastrophic forgetting) in the support forums.
From the code in that post, it looks like only "accept" answers are used and all other answers are dropped. This seems odd to me, since I had the impression that "reject" annotations were helpful when training the individual NER models.
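To make sure I understand, here is a minimal sketch of the kind of conversion I mean, keeping only accepted examples (the field names follow Prodigy's JSONL task format; the dataset name is a placeholder):

```python
# Minimal sketch: convert Prodigy NER annotations to spaCy 2 training
# tuples, keeping only "accept" answers (as the forum post seems to do).
from prodigy.components.db import connect

db = connect()
examples = db.get_dataset("ner_combined")  # placeholder dataset name

train_data = []
for eg in examples:
    if eg.get("answer") != "accept":
        continue  # "reject" and "ignore" answers are dropped here
    entities = [
        (span["start"], span["end"], span["label"])
        for span in eg.get("spans", [])
    ]
    train_data.append((eg["text"], {"entities": entities}))
```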
How does Prodigy handle "reject" annotations internally, and how can I convert reject annotations to a format that spaCy can use?
I think I understand what the problem might be. Unfortunately this is one of the workflows we’re least satisfied with.
Let’s say you annotate all PERSON entities in document 1, and all ORG entities in document 2. Then you make a new dataset, with both documents.
The problem is, there's no way to tell the model that each document is only annotated for one entity type. This means the model has to assume that any text where no entities are marked might actually contain missing entities. If you only had one entity type, you'd be able to use the --no-missing flag, which tells the model that unannotated tokens are definitely not part of an entity.
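To make the distinction concrete, here's a small sketch using spaCy 2's GoldParse. The "-" tag marks a token whose annotation is unknown, while "O" asserts the token is definitely outside any entity, which is what --no-missing assumes:

```python
import spacy
from spacy.gold import GoldParse

nlp = spacy.blank("en")
doc = nlp("Alice works at Acme")  # tokens: Alice, works, at, Acme

# If this text came from your PERSON-only dataset, you only know about
# "Alice". Treating the annotation as complete marks "Acme" as "O",
# i.e. definitely not an entity -- which is wrong here:
complete = GoldParse(doc, entities=["U-PERSON", "O", "O", "O"])

# Marking the unannotated tokens as missing ("-") tells the model not to
# treat the absence of an ORG span as evidence against one:
partial = GoldParse(doc, entities=["U-PERSON", "-", "-", "-"])
```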
The best solution is probably to get the same texts annotated with all of your entity types. You can probably train models from your current annotations and use them to bootstrap. You'll want to get predictions from each of your models on the texts, and then merge those predictions into one dataset. Finally, you can run ner.manual on the result to clean up any errors.
You'll probably find it useful to do the prediction and merging in a separate script, as it's pretty much a one-off task that doesn't need Prodigy.
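Something along these lines should work as a starting point (the model paths and file names are placeholders):

```python
# Sketch: run each single-label model over the texts and merge the
# predicted spans into one JSONL file that Prodigy can read.
import json
import spacy

texts = [line.strip() for line in open("texts.txt", encoding="utf8")]

# One model per entity type, trained from your individual datasets.
models = [spacy.load(name) for name in ("person_model", "org_model")]

with open("merged_predictions.jsonl", "w", encoding="utf8") as out_file:
    for text in texts:
        spans = []
        for nlp in models:
            doc = nlp(text)
            for ent in doc.ents:
                spans.append({
                    "start": ent.start_char,
                    "end": ent.end_char,
                    "label": ent.label_,
                })
        # Sort by offset; if two models predict overlapping spans, you'll
        # resolve the conflict manually in ner.manual.
        spans.sort(key=lambda s: (s["start"], s["end"]))
        out_file.write(json.dumps({"text": text, "spans": spans}) + "\n")
```

You can then use the resulting JSONL file as the source for ner.manual, and the pre-set spans should show up pre-highlighted in the interface for correction.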