A model with multiple labels or multiple models with a single label?

pl6306 · February 7, 2020, 8:18pm

I am new to spaCy and Prodigy. I am trying to classify sentences into 5 distinct categories, plus a 6th category to be ignored. So far I seem to get better results from 5 individually trained models than from 1 model trained with 5 labels. The 5 categories are very distinct, so I try each model one by one until one returns a probability of 70% or higher. The problem with this approach is that loading 5 individual models takes a lot of memory. Which approach do you recommend? Also, how much annotated data would I need for 1 model with 5 labels versus 5 models with 1 label each?
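For readers following along, the "try each model until one is confident" approach described above might look roughly like the sketch below. The scorer callables are hypothetical stand-ins for the five trained single-label models (they are not spaCy or Prodigy APIs), and the function name is made up for illustration.

```python
from typing import Callable, Dict, Optional, Tuple

# Hypothetical stand-in for one trained single-label model: it takes a
# sentence and returns the probability that the sentence has that label.
Scorer = Callable[[str], float]


def first_confident_label(
    text: str,
    scorers: Dict[str, Scorer],
    threshold: float = 0.7,
) -> Optional[Tuple[str, float]]:
    """Try each single-label scorer in order and stop at the first one
    whose probability reaches the threshold (70% in the post above)."""
    for label, score_fn in scorers.items():
        score = score_fn(text)
        if score >= threshold:
            return label, score
    # No model was confident enough: treat the sentence as "ignored".
    return None
```

Note that the result depends on the order in which the models are tried, and that all five models have to stay loaded, which is the memory cost mentioned in the question.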

pl6306 · February 8, 2020, 4:46pm

Forget this question. It is probably a rookie mistake. The proper approach is probably to make 6 category labels, with the 6th being unknown, and then train textcat with -TE?
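As a rough illustration of what mutually exclusive labels mean for the data, each example gets exactly one of the six labels set to 1.0 and the rest set to 0.0, and the prediction is simply the highest-scoring label. The label names and helper functions below are placeholders invented for this sketch, not names from the thread.

```python
from typing import Dict, List

# Placeholder label names (assumption): five real categories plus "UNKNOWN".
LABELS: List[str] = ["CAT_1", "CAT_2", "CAT_3", "CAT_4", "CAT_5", "UNKNOWN"]


def to_exclusive_cats(gold_label: str, labels: List[str] = LABELS) -> Dict[str, float]:
    """Build a category dict in which exactly one label is 1.0."""
    if gold_label not in labels:
        raise ValueError(f"Unknown label: {gold_label!r}")
    return {label: 1.0 if label == gold_label else 0.0 for label in labels}


def predicted_label(scores: Dict[str, float]) -> str:
    """With exclusive classes, the prediction is the top-scoring label."""
    return max(scores, key=scores.get)
```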

ines · February 10, 2020, 11:43pm

Yes, that sounds like a reasonable plan.

(Not sure what the 6th ignored category is, but if it's something like "noise" or "not relevant", it can sometimes help to use two text classifiers: one to filter out the noise, and one to predict the actual category, which is only trained on pre-filtered relevant examples. However, I would only recommend experimenting with that if the classifier with 6 categories struggles with examples from the ignored category.)
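A minimal sketch of the two-classifier idea, assuming each classifier is available as a plain Python callable. Here `relevance_fn`, `category_fn`, the `"IGNORE"` label and the 0.5 cutoff are all hypothetical choices for illustration and would need to be replaced by the actual trained models and a tuned threshold.

```python
from typing import Callable, Dict


def two_stage_predict(
    text: str,
    relevance_fn: Callable[[str], float],
    category_fn: Callable[[str], Dict[str, float]],
    noise_label: str = "IGNORE",
    relevance_threshold: float = 0.5,
) -> str:
    """Stage 1 filters out noise; stage 2 picks one of the real categories."""
    # Stage 1: probability that the text is relevant at all.
    if relevance_fn(text) < relevance_threshold:
        return noise_label
    # Stage 2: a classifier trained only on relevant examples.
    scores = category_fn(text)
    return max(scores, key=scores.get)
```

The second classifier never sees the ignored category during training, which is the point of the pre-filtering step.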
