Merging single-label models into one multi-label model

Hi! I think your experience and intuition make sense here – we've also found that annotating all labels together (especially for large label sets) can often make things more difficult, because you constantly have to think about every label and it becomes harder to adjust the label scheme during the development phase.

Ideally, you'd always want to be training from scratch using all annotations. The presence or absence of one label can have an impact on all other labels, so it makes sense to train your model on all labels combined. So whenever you have a new label, you add the annotations for it and then retrain on all datasets.
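As a rough sketch, assuming two hypothetical datasets ner_org and ner_product and the v1.9/v1.10 train recipe syntax (check prodigy train --help for the exact arguments on your version), retraining on everything at once could look like this:

```bash
# Retrain a single NER model from scratch on all annotation datasets at once
# (dataset names and base model are placeholders – adjust for your project)
prodigy train ner ner_org,ner_product en_core_web_sm --output ./model-multi --eval-split 0.2
```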

This shouldn't be more work either – if you're using Prodigy v1.9+, the train and data-to-spacy recipes will take care of merging annotations from multiple datasets automatically. The merged data will only contain each example once, and all annotations referring to that example will be merged together (data-to-spacy even merges annotations of different types, like text classification and NER).
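For instance, a data-to-spacy call combining NER and text classification datasets could look roughly like the sketch below. The dataset names and output path are placeholders, and the positional arguments changed a bit across Prodigy versions, so check prodigy data-to-spacy --help for your install:

```bash
# Export merged training data for spaCy, combining NER and text classification
# annotations – examples shared across datasets are merged into one record
prodigy data-to-spacy ./corpus --lang en --ner ner_org,ner_product --textcat my_textcat_dataset
```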

If you're annotating a new label and you're worried that there might be overlap with other labels, you could also re-annotate an existing dataset with another label. So if you have a dataset with ORG annotations and you want to add annotations for PRODUCT, export the data with db-out and re-annotate it with --label ORG,PRODUCT. You'll see the existing annotations and can adjust them if needed, and you'll be able to add new annotations for PRODUCT. Later on, you can then train with your new ORG/PRODUCT dataset instead of the ORG dataset. A sketch of that workflow is below.
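Concretely, with hypothetical dataset and file names, that could look something like this (on some versions db-out takes an output directory instead of writing to stdout):

```bash
# Export the existing ORG annotations to a JSONL file
prodigy db-out ner_org > ner_org.jsonl

# Re-annotate the exported examples with both labels – existing ORG spans are
# shown pre-highlighted, and you can add PRODUCT spans on top
prodigy ner.manual ner_org_product en_core_web_sm ner_org.jsonl --label ORG,PRODUCT
```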