Merging annotation models?

datawizard · August 1, 2019, 11:00am

Hi all, we are a team of 4-5 members planning to annotate smaller subsets of a larger dataset on separate Prodigy instances. We'd like to check whether it's feasible, and what options are available, to merge the annotation models from these subsets into a model for the entire dataset. Any help or insights would be great.

ines · August 1, 2019, 11:33am

Hi! When you annotate with Prodigy, you can specify the name of a dataset to save the annotations to. So when you annotate, you can save your annotations to a dataset like ner_project_datawizard. Once you're all done annotating, you can use the db-merge command to create one "master dataset" with all annotations and then train your model from that set.
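To make the idea concrete, here's a rough sketch in plain Python of what combining per-annotator datasets into one set amounts to. This is not how db-merge is implemented, and it doesn't call Prodigy at all: the `merge_datasets` helper, the `source_dataset` key and the second dataset name are made up for illustration, and the records only assume the usual "text" / "spans" / "answer" layout of an NER annotation.

```python
from typing import Dict, List


def merge_datasets(datasets: Dict[str, List[dict]]) -> List[dict]:
    """Concatenate several annotation datasets into one list of records.

    Each record is copied and tagged with the dataset it came from, so you
    can still tell later who annotated what. "source_dataset" is a
    hypothetical key used only in this sketch.
    """
    merged = []
    for name, examples in datasets.items():
        for eg in examples:
            record = dict(eg)  # shallow copy, the input lists stay untouched
            record["source_dataset"] = name
            merged.append(record)
    return merged


# Toy data: one example per annotator (dataset names are illustrative).
datasets = {
    "ner_project_datawizard": [
        {
            "text": "Dr Smith called.",
            "spans": [{"start": 3, "end": 8, "label": "PERSON"}],
            "answer": "accept",
        }
    ],
    "ner_project_colleague": [
        {
            "text": "Anna lives in Berlin.",
            "spans": [{"start": 0, "end": 4, "label": "PERSON"}],
            "answer": "accept",
        }
    ],
}

merged = merge_datasets(datasets)
print(len(merged), "examples in the merged set")
```

In practice you'd let db-merge do this inside the Prodigy database and train from the resulting set; the sketch is only meant to show that the merge happens at the level of annotations, not trained models.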

There are different approaches for dividing up the work, but it's often a good idea to have a little bit of overlap so you can compare the decisions and make sure everyone's following the same annotation strategy. (For instance, if you're annotating person names and one team member always includes titles like "Dr" in the entity while everyone else doesn't, you want to find out about this asap and adjust. Otherwise, your model might end up significantly worse because it has to learn from inconsistent data.)
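One possible way to set up that overlap, sketched in plain Python: put a small shared batch into everyone's queue and distribute the rest round-robin. The `split_with_overlap` function, the 10% overlap and the annotator names are all assumptions for the sake of the example, not a Prodigy feature or a recommended setting.

```python
import random
from typing import Dict, List


def split_with_overlap(
    texts: List[str],
    annotators: List[str],
    overlap_fraction: float = 0.1,
    seed: int = 0,
) -> Dict[str, List[str]]:
    """Give every annotator the same shared batch plus a unique share of the rest."""
    rng = random.Random(seed)
    shuffled = list(texts)
    rng.shuffle(shuffled)

    # The first chunk is seen by everyone, so decisions can be compared later.
    n_shared = int(len(shuffled) * overlap_fraction)
    shared, rest = shuffled[:n_shared], shuffled[n_shared:]

    batches = {name: list(shared) for name in annotators}
    # The remaining texts are dealt out round-robin, one annotator each.
    for i, text in enumerate(rest):
        batches[annotators[i % len(annotators)]].append(text)
    return batches


texts = [f"Example sentence {i}" for i in range(100)]
annotators = ["anna", "ben", "chen", "dana"]  # hypothetical team members
batches = split_with_overlap(texts, annotators, overlap_fraction=0.1)

for name, batch in batches.items():
    print(name, len(batch))
```

Each batch could then be written to its own source file and annotated into its own dataset, and the shared portion gives you something to compare across annotators.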

If you do end up with conflicting annotations, Prodigy obviously can't just solve that for you – but it can help you resolve the conflicts and create a final corrected dataset using the new review recipe and interface. I posted a little screen recording of it on Twitter a while ago.

All examples you annotate receive hashes, so Prodigy is able to tell which annotations relate to the same example. It can then show them to you in a condensed interface and ask you to have the "final word".
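Here's a minimal sketch of the grouping idea in plain Python. It's a stand-in, not Prodigy's actual hashing: Prodigy assigns its own hashes to each example, whereas this sketch just uses an MD5 of the text, and `input_hash` and `find_conflicts` are names invented for the example.

```python
import hashlib
import json
from collections import defaultdict
from typing import Dict, List


def input_hash(eg: dict) -> str:
    """Stand-in hash of the input text (NOT Prodigy's own hash function)."""
    return hashlib.md5(eg["text"].encode("utf8")).hexdigest()


def find_conflicts(examples: List[dict]) -> Dict[str, List[dict]]:
    """Group annotations by input and keep the groups whose spans disagree."""
    by_input = defaultdict(list)
    for eg in examples:
        by_input[input_hash(eg)].append(eg)

    conflicts = {}
    for key, group in by_input.items():
        # Serialize the spans so that different annotations compare as different strings.
        versions = {json.dumps(eg.get("spans", []), sort_keys=True) for eg in group}
        if len(versions) > 1:
            conflicts[key] = group
    return conflicts


# Two annotators saw the same sentence; one included the title "Dr".
examples = [
    {"text": "Dr Smith called.", "spans": [{"start": 3, "end": 8, "label": "PERSON"}]},
    {"text": "Dr Smith called.", "spans": [{"start": 0, "end": 8, "label": "PERSON"}]},
    {"text": "Anna lives in Berlin.", "spans": [{"start": 0, "end": 4, "label": "PERSON"}]},
]

conflicts = find_conflicts(examples)
for group in conflicts.values():
    print(group[0]["text"], "->", len(group), "differing annotations")
```

The two versions of the "Dr Smith" sentence end up in one group because they share the same input, and that group is what a reviewer would be asked to settle.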

datawizard · August 4, 2019, 7:13am

Thanks, appreciate the inputs. We will apply those and get back if further help is needed.
