Sure! The documentation on text classification is a good place to start. There's also a section on using hierarchical label schemes in the Prodigy text classification docs.
You can definitely train multiple text classification components in spaCy and combine them into a single pipeline, as long as each component has a different name. The components will all write their predicted scores to the doc.cats property. So if your top-level text classifier predicts a high score for a label at the top level of your hierarchy, you can then look at the more fine-grained labels predicted by the other components.
You will end up with multiple components this way, but not multiple pipelines, so the runtime pipeline can still be pretty lightweight and efficient. (In spaCy v3, you can also share the same embeddings with multiple components, so even if you're using large transformer embeddings, you'll only need to load them once for all text classifiers instead of once per classifier.)
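
To make the lookup step above a bit more concrete, here's a rough sketch in plain Python of how you might read the combined scores once all classifiers have written to doc.cats. Everything here is an assumption for illustration: the label names, the `HIERARCHY` mapping, the `resolve_labels` helper and the 0.5 threshold are all made up, and the function just takes a dict shaped like doc.cats (label to score) rather than a real Doc.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical hierarchy: top-level label -> its fine-grained labels.
# These names are invented for the example, not taken from any real model.
HIERARCHY: Dict[str, List[str]] = {
    "SPORTS": ["SPORTS_FOOTBALL", "SPORTS_TENNIS"],
    "POLITICS": ["POLITICS_ELECTIONS", "POLITICS_POLICY"],
}


def resolve_labels(
    cats: Dict[str, float],
    hierarchy: Dict[str, List[str]] = HIERARCHY,
    threshold: float = 0.5,
) -> Tuple[Optional[str], Optional[str]]:
    """Pick the best top-level label, then the best fine-grained label under it.

    `cats` is a label -> score dict, i.e. the same shape as doc.cats.
    Returns (None, None) if no top-level label reaches the threshold.
    """
    # Scores for the top-level labels only (missing labels count as 0.0).
    top_scores = {label: cats.get(label, 0.0) for label in hierarchy}
    top_label = max(top_scores, key=lambda label: top_scores[label])
    if top_scores[top_label] < threshold:
        return None, None

    # Only consider the fine-grained labels that belong to the chosen branch.
    fine_scores = {
        label: cats[label] for label in hierarchy[top_label] if label in cats
    }
    if not fine_scores:
        return top_label, None
    fine_label = max(fine_scores, key=lambda label: fine_scores[label])
    return top_label, fine_label


# Example scores, as they might look after all components have run.
example_cats = {
    "SPORTS": 0.9,
    "POLITICS": 0.1,
    "SPORTS_FOOTBALL": 0.2,
    "SPORTS_TENNIS": 0.7,
    "POLITICS_ELECTIONS": 0.8,
    "POLITICS_POLICY": 0.1,
}
result = resolve_labels(example_cats)
```

With a real pipeline you'd call something like resolve_labels(doc.cats) after processing a text. Since all components write into the same doc.cats dict, it's probably safest to give each classifier distinct label names (for example with a prefix per branch, as above) so the scores don't overwrite each other.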