Is there any out-of-box recipe to achieve multi-label text classification? Any pointer is appreciated.
Do you mean there are multiple possible labels, but each instance has exactly one, or that there are multiple labels, and each instance may have zero or more?
You can specify multiple labels on the command line by separating them with commas.
The model also supports non-mutually exclusive labelling. For this I would recommend annotating one label at a time, into different datasets, and then merging the datasets together.
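The merge step can be done however you like once the per-label datasets are exported to JSONL. Here is a minimal sketch, assuming each export contains "text", "label", and "answer" ("accept"/"reject") fields as produced for a binary text-classification dataset (the function name is mine, not a Prodigy API):

```python
import json
from collections import defaultdict

def merge_binary_annotations(paths):
    """Merge several single-label JSONL exports into one multi-label
    dataset, keyed on the example text. Each input line is assumed to
    have "text", "label", and "answer" ("accept"/"reject") fields."""
    merged = defaultdict(dict)  # text -> {label: True/False}
    for path in paths:
        with open(path, encoding="utf8") as f:
            for line in f:
                eg = json.loads(line)
                merged[eg["text"]][eg["label"]] = eg["answer"] == "accept"
    # One record per unique text, with all labels collected together
    return [{"text": text, "cats": cats} for text, cats in merged.items()]
```

Any examples that were only annotated in some of the datasets will simply have those labels missing from "cats", so you can decide downstream whether to treat missing labels as negative or as unknown.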
My question was about non-mutually exclusive labelling. Thank you so much for the response and for resolving the issue.
What if each instance has zero or more labels? In my case, the labels are hierarchical: Category > Sub-category > Style > Sub-style. How can I label the data in a single pass with multiple labels across multiple classes?
You could use the choice interface to do this, although I’m not sure it’s really better. You might want to try making multiple passes over the data instead. It sounds worse, but it’s really much faster to annotate each label individually. The decisions are very quick to make, and you don’t have to spend any time in the interface, as you can just click accept or reject. The annotations are also usually more accurate, as the decisions are individually easier.
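If you do go with the choice interface, each saved task carries an "accept" list of the selected option ids. A small sketch for turning that into a per-label dictionary you could train from (the field names follow Prodigy's choice task format; the helper name is mine):

```python
def choice_to_cats(example, all_labels):
    """Convert one choice-interface task into a multi-label dict.

    Assumes the task has an "accept" list of selected option ids,
    and that option ids are the label names. Returns 1.0 for
    selected labels and 0.0 for the rest."""
    selected = set(example.get("accept", []))
    return {label: float(label in selected) for label in all_labels}
```

For the hierarchical case you could use the flattened label names (e.g. "Category/Sub-category") as option ids and split them back apart after annotation.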
We are building a multi-label (non-exclusive) text classification model for which the data has been annotated using the Prodigy UI. There are almost 4K examples for 10 labels, of which 1000 are positive. Now our questions are:
- Is the size of the dataset (25% are +ve examples) enough to train a model for this task?
- We've used the spaCy CLI to train, where the default metric is macro-averaged ROC AUC. We want to know how to get the threshold score for classifying an example into each class.
- Is there a separate threshold score for every class? If yes, how can we get that?
In general there's no real way to guess how many examples are needed to reach a certain accuracy. You could have one problem where a single word's occurrence linearly separates each class, in which case the model just has to learn that one feature; such a problem could be learned from extremely few examples. You can also have other problems that the model will not learn at all, no matter how much data you have. So it's not something I can really comment on. A learning curve might help you see whether you need more examples. I would say 1k positive examples is probably a bit too few; you should definitely use word vectors in your model at that data size, and possibly also pretraining.
spaCy doesn't currently support setting a per-label threshold. Instead, you can handle this yourself in code that interprets the results. The scores are provided in a dictionary, doc.cats, so you can implement your own mapping from scores to positive or negative classifications, based on your cost sensitivity and your calibration on the development data.
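Since doc.cats is a plain dict of label to score, the thresholding can live entirely outside the pipeline. A minimal sketch, including one simple way to calibrate a separate threshold per label on development data (the helper names are mine, not part of spaCy's API):

```python
def apply_thresholds(cats, thresholds, default=0.5):
    """Map a dict of scores (like doc.cats) to the set of predicted
    labels, using a separate threshold per label."""
    return {label for label, score in cats.items()
            if score >= thresholds.get(label, default)}

def best_threshold(scores, golds, candidates=None):
    """Pick the threshold that maximises F1 for one label, given the
    model's scores and boolean gold labels on development data."""
    if candidates is None:
        candidates = [i / 100 for i in range(1, 100)]
    def f1(t):
        tp = sum(s >= t and g for s, g in zip(scores, golds))
        fp = sum(s >= t and not g for s, g in zip(scores, golds))
        fn = sum(s < t and g for s, g in zip(scores, golds))
        return 2 * tp / (2 * tp + fp + fn) if tp else 0.0
    return max(candidates, key=f1)
```

You would run best_threshold once per label over your dev set to build the thresholds dict, then call apply_thresholds on each doc.cats at inference time. Whether F1 is the right objective depends on your cost sensitivity; if false positives are more expensive for some labels, optimise a weighted metric there instead.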