Improve custom NER model performance for different input texts

magdaaniol · February 19, 2024, 3:40pm

Whether training separate NER models per data type will be more effective than training a single model depends on how different the data types are and how much data you have available for each type. Honestly, I think it's hard to say upfront, and you'll get the best answer through experimentation.

For option 1), i.e. one NER model per data type, I think you'd need a custom spaCy pipeline for each type (along the lines of the example from the spaCy board) and then another component that implements some logic for choosing the final prediction, probably by picking the prediction from the model with the highest confidence.
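To make the "choose the final prediction" step a bit more concrete, here is a minimal sketch in plain Python. It assumes each per-data-type model has already produced its spans plus a single confidence value for the text. As far as I know, spaCy's default ner component doesn't give you such a score out of the box, so where the score comes from is left open. The `Prediction` class, the `choose_final_prediction` function and the COURSE / PROGRAM labels are all hypothetical names made up for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

# (start_char, end_char, label) for one predicted entity
Span = Tuple[int, int, str]


@dataclass
class Prediction:
    """Output of one per-data-type model for a single text (hypothetical structure)."""
    spans: List[Span]
    confidence: float  # assumption: supplied by your own scoring logic


def choose_final_prediction(candidates: Dict[str, Prediction]) -> Tuple[str, Prediction]:
    """Return the (model_name, prediction) pair with the highest confidence."""
    if not candidates:
        raise ValueError("need at least one candidate prediction")
    best_name = max(candidates, key=lambda name: candidates[name].confidence)
    return best_name, candidates[best_name]


# Made-up predictions from two per-data-type models for the same text
candidates = {
    "syllabi": Prediction(spans=[(0, 8, "COURSE")], confidence=0.82),
    "curricula": Prediction(spans=[(0, 8, "PROGRAM")], confidence=0.64),
}
best_model, best_prediction = choose_final_prediction(candidates)
print(best_model, best_prediction.spans)
```

In a real setup this logic would probably live in the extra pipeline component mentioned above, and you may want a fallback for ties or for cases where every model is unsure.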

For option 2), I think the best strategy would be to add the new data to the dataset and annotate it with Prodigy's ner.teach recipe, which will serve the examples that the model is most unsure of first.
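The idea of serving the most uncertain examples first can be sketched roughly like this. It is a simplified illustration of uncertainty sampling, not Prodigy's actual implementation: it assumes you already have a score between 0 and 1 for each example and simply puts the ones closest to 0.5 at the front. The function name `most_uncertain_first` and the example data are made up.

```python
from typing import Dict, Iterable, List, Tuple


def most_uncertain_first(scored: Iterable[Tuple[float, Dict]]) -> List[Dict]:
    """Order examples so that scores closest to 0.5 (least certain) come first."""
    ranked = sorted(scored, key=lambda pair: abs(pair[0] - 0.5))
    return [example for _score, example in ranked]


# Made-up (score, example) pairs; in practice the score comes from the model
scored_examples = [
    (0.95, {"text": "a"}),
    (0.52, {"text": "b"}),
    (0.10, {"text": "c"}),
]
queue = most_uncertain_first(scored_examples)
print([eg["text"] for eg in queue])
```

Sorting the whole batch like this only works for a finite list. With a stream of incoming examples you would need a different strategy, so treat this as a picture of the principle only.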

When you add samples of the new data types (syllabi and curricula), it's probably best to add a data type identifier to the meta of each example so that you can easily run your experiments.
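Tagging the examples could look something like the sketch below. It assumes the examples are dictionaries with a "text" key and an optional "meta" dictionary. The `data_type` key and both helper functions are names chosen here for illustration, so use whatever fits your setup.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def tag_with_data_type(examples: Iterable[Dict], data_type: str) -> List[Dict]:
    """Return copies of the examples with meta["data_type"] set (hypothetical key)."""
    return [
        {**eg, "meta": {**eg.get("meta", {}), "data_type": data_type}}
        for eg in examples
    ]


def group_by_data_type(examples: Iterable[Dict]) -> Dict[str, List[Dict]]:
    """Split examples by their data type so each group can be evaluated separately."""
    groups = defaultdict(list)
    for eg in examples:
        groups[eg.get("meta", {}).get("data_type", "unknown")].append(eg)
    return dict(groups)


# Made-up examples for the two new data types
syllabi = tag_with_data_type([{"text": "Week 1: Introduction"}], "syllabus")
curricula = tag_with_data_type(
    [{"text": "Year 2 core modules", "meta": {"source": "pdf"}}], "curriculum"
)
groups = group_by_data_type(syllabi + curricula)
print({name: len(items) for name, items in groups.items()})
```

With the identifier in place you can, for example, hold out one data type for evaluation or compare per-type scores between the single-model and multi-model setups.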
