Two levels of classification for text classification

marcaspers · October 12, 2020, 11:27am

Hi there,

I'm looking for the best way to give two levels of classifications to each sentence, using the Text Classification (or similar) recipe. What I want to achieve is that I end up with a dataset where each sentence has a main category label and then a subcategory label.

As far as I can see, the only way to do this in Prodigy is to first create a dataset with the main category labels, and then annotate those same sentences again with a subcategory. Are there alternative/better ways to achieve my goal?

ines · October 13, 2020, 8:27am

Hi! This is definitely a solution and something we often recommend for hierarchical label schemes. You can read more about the idea and reasoning in the docs (Text Classification · Prodigy · An annotation tool for AI, Machine Learning & NLP). It does mean you have to make a second pass over the data, but in that pass you'll be able to focus on only the subcategories (which can be much more efficient because it's easier to focus) or use automation specific to the top-level categories (to pre-select labels). And it helps while you're developing, because it makes it easier to iterate.
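To make the second pass more concrete, here is a minimal sketch of how the input for it could be prepared. It assumes the first pass was done with a choice-style interface, so that each annotated example stores the selected top-level label IDs under "accept" and the decision under "answer". The SUBCATEGORIES mapping, the second_pass_tasks helper and the "category" key are all hypothetical names used for illustration, not part of Prodigy.

```python
from typing import Dict, Iterable, Iterator, List

# Hypothetical label scheme: top-level category -> its subcategories.
SUBCATEGORIES: Dict[str, List[str]] = {
    "SPORTS": ["FOOTBALL", "TENNIS"],
    "POLITICS": ["ELECTIONS", "POLICY"],
}


def second_pass_tasks(
    first_pass: Iterable[dict], subcategories: Dict[str, List[str]]
) -> Iterator[dict]:
    """Turn first-pass annotations into tasks for the subcategory pass."""
    for eg in first_pass:
        # Only continue with examples the annotator actually accepted.
        if eg.get("answer") != "accept":
            continue
        for category in eg.get("accept", []):
            subs = subcategories.get(category, [])
            if not subs:
                continue  # nothing to ask for categories without subcategories
            yield {
                "text": eg["text"],
                # Custom key (an assumption) to remember the first-pass label.
                "category": category,
                # Only show the subcategories that belong to this category.
                "options": [{"id": sub, "text": sub} for sub in subs],
            }
```

The resulting dictionaries could be written out as JSONL and loaded as the source for the second annotation session, so each sentence is only shown with the handful of subcategories that apply to it.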

Alternatively, you could also just generate a list of "options" with text like CATEGORY > SUBCATEGORY and list all possibilities in the same task.
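As a rough sketch of that single-pass variant: the snippet below builds one flat list of options from a two-level scheme and attaches it to every incoming text. The "options" list of {"id", "text"} dictionaries follows the format used by choice-style tasks; the HIERARCHY mapping and the helper names are hypothetical and only there for illustration.

```python
from typing import Dict, Iterable, Iterator, List, Tuple

# Hypothetical two-level label scheme; replace with your own.
HIERARCHY: Dict[str, List[str]] = {
    "SPORTS": ["FOOTBALL", "TENNIS"],
    "POLITICS": ["ELECTIONS", "POLICY"],
}


def build_flat_options(hierarchy: Dict[str, List[str]]) -> List[dict]:
    """Create one option per CATEGORY > SUBCATEGORY combination."""
    options = []
    for category, subcategories in hierarchy.items():
        for sub in subcategories:
            options.append(
                {"id": f"{category}>{sub}", "text": f"{category} > {sub}"}
            )
    return options


def add_options(texts: Iterable[str], hierarchy: Dict[str, List[str]]) -> Iterator[dict]:
    """Attach the same flat option list to every text."""
    options = build_flat_options(hierarchy)
    for text in texts:
        yield {"text": text, "options": options}


def split_label(option_id: str) -> Tuple[str, str]:
    """Recover (category, subcategory) from a selected option ID."""
    category, sub = option_id.split(">", 1)
    return category, sub
```

After annotation, split_label can be applied to each selected option ID to get the main category and subcategory back as separate fields. Note that the list grows quickly with larger schemes, which is one reason the two-pass approach can be easier on annotators.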

Finally, you could also do something more custom and add your own checkboxes/radio buttons with an "accordion"-type UI that pops out additional options, depending on what you select and how you want the hierarchy to work. See here for details: Adding Accordion to Choices - #2 by ines

marcaspers · October 20, 2020, 5:17am

Thanks for your explanation! I'll look into it, but the 'focus' argument really makes me prefer to go for a second pass instead of doing two layers of annotation simultaneously.
