Best practise for multi-label and textcat.teach

ihave4thumbs · November 18, 2017, 2:42pm

Hi all,

Congrats on such a great product. You won’t believe it, but my beta invite came through just as I was on the couch manually labelling a spreadsheet of 100k+ rows. Needless to say, I was excited!

I’m just after some best practise/workflow advice.

What’s the best way to classify multiple labels (150+) into a single model?

I work in the chatbot space and need to run intent detection on inbound messages and respond with the correct label (to link to the correct answer from a different app).

Consider a QnA bot with 150 topics, so, 150 labels.

Should I run textcat.teach for every label one at a time, and then run textcat.batch-train for each label? If so, should I output each batch-train run to the same model, then start again with textcat.teach for the next label?

I totally get the workflow for one label on one model, just after advice on how to add many more labels.

Any tips/insight appreciated.

honnibal · November 18, 2017, 3:24pm

Glad it came at a good time for you!

You’ve probably seen this, but in case you haven’t, the insults classification tutorial is probably pretty relevant for you: https://www.youtube.com/watch?v=5di0KlKl0fE

I think you’ll do well creating terminology lists to bootstrap your categories. Start off with a couple of seed terms, and then build out the word list using the terms.teach recipe, as shown at the start of the insults classifier video. This will help you create initial models for each of your terms.

You want to get to the point where you have a single dataset with at least a few positive and negative examples for each of your labels. Prodigy assumes labels are not mutually exclusive, i.e. that each text can have multiple labels. If that’s not true for your domain, then you know that all examples that are positive for one class will be negative examples for the other classes. To take advantage of this knowledge, you can create ‘reject’ examples for the other classes once one class has been accepted. This logic is left for you to implement because label schemes can have complicated dependencies, e.g. some labels may be mutually exclusive, others not.
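As a rough illustration of the "create reject examples" idea for mutually exclusive labels, here is a minimal sketch. It assumes annotations are plain dicts with `text`, `label` and `answer` keys; the helper name `expand_exclusive` and the label names are made up for this example and are not part of Prodigy.

```python
def expand_exclusive(example, all_labels):
    """Yield the example itself and, if it was accepted, a 'reject'
    example for every other label (assumes labels are mutually exclusive)."""
    yield example
    if example.get("answer") != "accept":
        # A rejected example tells us nothing about the other labels.
        return
    for label in all_labels:
        if label != example["label"]:
            yield {"text": example["text"], "label": label, "answer": "reject"}


# Hypothetical labels and annotation, for illustration only.
labels = ["GREETING", "BILLING", "OPENING_HOURS"]
accepted = {"text": "how are you today", "label": "GREETING", "answer": "accept"}
expanded = list(expand_exclusive(accepted, labels))

rejected = {"text": "how are you today", "label": "BILLING", "answer": "reject"}
unchanged = list(expand_exclusive(rejected, labels))
```

If only some of your labels exclude each other, you could pass in just that subset of labels instead of the full list.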

Overall I suggest you let your workflow evolve as you go. It’s a boot-strapping process: hopefully every bit of knowledge you’re adding can be used to make the knowledge collection easier. The optimal procedure for this will be different for every problem, so we’ve tried to give you a variety of tools that compose well.

You’ll find yourself moving text in and out of the database, merging records, etc. This is all by design. Similarly, you’ll want to write little bits of Python (or shell, if you’re perverse enough to prefer it) as you go. This is also by design. We wanted to avoid a problem we often find with developer tools, especially hosted ones: they often end up creating this parallel language of scripts and configurations that’s actually just worse than Python. We assume you know at least one programming language pretty well, so we wanted to make sure we let you use it, instead of creating more arbitrary stuff.

List of built-in recipes: https://prodi.gy/docs/recipes
Text classification workflow: https://prodi.gy/docs/workflow-text-classification
Example training spaCy’s text classifier directly: https://spacy.io/usage/training#section-textcat
Also check out the custom recipes section of the docs. You can also view the source of the recipes within your Prodigy installation, as examples.

ihave4thumbs · November 20, 2017, 9:22am

Thank you for the comprehensive reply.

I think your last para was just what I needed: the realisation that it’s designed for us to manipulate with our own scripts and code, and to use as a tool rather than an off-the-shelf solution (which, yes, you’re right, is much better).

Follow up question (shout if you want me to make a new thread).

How much of a hack is it, and is it even possible, to use short sentences or phrases rather than single words in a seed list for terms.teach and textcat.teach? Something like:

how are you getting on
how is your morning so far
how do you feel
how is your day going
how is it going
how is your evening
is everything all right
how are the things going
I’m fine and you
how has your day been going
how is your day being
how are you
how are you today
how have you been

ines · November 21, 2017, 2:42pm

The easiest way for now would be to simply pre-train your model with the examples you already have, so it doesn't start off at zero and has at least some concept of your labels. See this spaCy example for an end-to-end text classification training script. The model you save to disk can be directly loaded into Prodigy:

prodigy textcat.teach my_dataset /path/to/model my_data.jsonl
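
To get from a list of seed phrases to something a training script can consume, a small conversion step along these lines might help. This is only a sketch: it assumes the `(text, {"cats": {...}})` tuple format used by spaCy's text classification example, it treats each phrase as negative for every other label (which only holds if your labels are mutually exclusive), and the function name and label names are invented for illustration.

```python
def to_textcat_examples(phrases_by_label):
    """Turn {label: [phrases]} into (text, {"cats": {label: bool}}) tuples.
    Assumes each phrase belongs to exactly one label."""
    labels = sorted(phrases_by_label)
    examples = []
    for label, phrases in phrases_by_label.items():
        for text in phrases:
            # True for the phrase's own label, False for all others.
            cats = {other: other == label for other in labels}
            examples.append((text, {"cats": cats}))
    return examples


# Hypothetical seed phrases, for illustration only.
phrases = {
    "SMALLTALK": ["how are you", "how is it going"],
    "BILLING": ["where is my invoice"],
}
train_data = to_textcat_examples(phrases)
```

The resulting list can then be fed to whatever training loop you use to produce the model you load into textcat.teach.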

Alternatively, if you want to use terms.teach for phrases, you'll need a model with vocab and vectors containing multi-word tokens. This is a little more complicated, though, because you'll need to retokenize the text so phrases are one token. If you're bootstrapping the text classification with terms.teach, the model you later use for textcat.teach needs access to the same vectors. So you'll have to either write a wrapper for textcat.teach that adds your custom merging/tokenization logic, or package that with your spaCy model. The best way to achieve this would be to use a custom pipeline component.

ronnie · March 22, 2018, 9:17am

I'm preparing to train a multi-label classifier with a little more than 20 labels and would like some input on how best to do that with regard to the annotation process.

For starters, I plan to annotate ~500 positive examples for each label to see where that gets me. Do you think it would be best to create a multiple-choice-style task with all labels, or to run a separate session for each label?

If I ran separate sessions, then I would probably stream in training examples at random so as not to annotate the same text over and over. But would that create a bias in the model? The bias would come from having examples in my training data that haven't been annotated with all possible labels.

I really like the one-decision-at-a-time design of Prodigy, but at the same time it seems a bit impractical to annotate the same examples over and over again. And it seems overwhelming to decide for more than 20 labels at a time.

How would you run the annotation process?

Thanks in advance

ronteo · May 2, 2019, 3:54pm

I have the same question, but I am not using “.teach”. I am using manual annotation with the choice view because this is for collecting gold data from SMEs. I plan to use “.teach” when creating training data. Do we have a response for this?

Thanks,
Ron

ines · May 2, 2019, 5:09pm

Using the choice interface for manual annotation is a good solution here – if you don’t want to go through one example at a time, you can use the multiple choice view to select one or more labels in one go.
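
For reference, a minimal sketch of what the data handling around a multiple choice task could look like. It assumes the choice interface's task format, where each task carries an `options` list of `{"id", "text"}` dicts and the annotated task comes back with the selected ids under `accept`. The two helper functions and the label names are hypothetical and not part of Prodigy.

```python
def add_options(stream, labels):
    """Attach the same multiple choice options to every incoming task."""
    for task in stream:
        task = dict(task)  # copy, so the input is not modified
        task["options"] = [{"id": label, "text": label} for label in labels]
        yield task


def choice_to_binary(annotated, labels):
    """Convert one annotated choice task into one binary example per label:
    selected labels become 'accept', all others 'reject'."""
    if annotated.get("answer") != "accept":
        return  # skip tasks that were rejected or ignored as a whole
    selected = set(annotated.get("accept", []))
    for label in labels:
        answer = "accept" if label in selected else "reject"
        yield {"text": annotated["text"], "label": label, "answer": answer}


# Hypothetical labels and data, for illustration only.
labels = ["GREETING", "BILLING", "COMPLAINT"]
tasks = list(add_options([{"text": "hi, my invoice is wrong"}], labels))

annotated = dict(tasks[0], accept=["BILLING", "COMPLAINT"], answer="accept")
binary = list(choice_to_binary(annotated, labels))
```

Because every text gets a decision for every label this way, you avoid ending up with examples that were only ever judged against some of the labels.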

If you want to use a recipe like teach and improve the model’s suggestions in the loop, that’s a bit different – at least, for a workflow like this, the idea is that you don’t want to be looking at all labels for all examples, and instead, focus on the most relevant ones (e.g. the most uncertain predictions). In that case, it also makes more sense to look at the labels separately – the most relevant examples and corrections for label A might be completely different from those for label B.

So a possible workflow could be this:

1. Collect an initial dataset of gold-standard annotations using a multiple-choice interface. (Don’t forget to collect enough for a separate evaluation set!)
2. Pre-train a model and evaluate it. Here, you can also look at the mistakes and the labels that are most problematic. Maybe there are some labels the model mostly gets right, and others it struggles with.
3. Run textcat.teach for the labels that need improvement and give feedback on the model’s suggestions. If you feel like you need to fine-tune the example selection (e.g. to skip more examples), you can always write your own sorter like prefer_uncertain that takes a stream of (score, example) tuples and decides whether to send out an example for annotation, based on its score.
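
A custom sorter for the last step can be very small. The sketch below is a simplified stand-in, not the built-in prefer_uncertain: it uses a fixed threshold around 0.5 rather than any adaptive logic, and the function name and the `margin` parameter are assumptions made for this example.

```python
def prefer_uncertain_with_margin(scored_stream, margin=0.2):
    """Take a stream of (score, example) tuples and yield only the examples
    whose score lies within `margin` of 0.5, i.e. the uncertain ones."""
    for score, example in scored_stream:
        if abs(score - 0.5) <= margin:
            yield example


# Hypothetical scored stream, for illustration only.
scored = [
    (0.95, {"text": "how are you"}),
    (0.55, {"text": "is anyone there"}),
    (0.10, {"text": "cancel my order"}),
    (0.40, {"text": "hello, I have a question"}),
]
to_annotate = list(prefer_uncertain_with_margin(scored))
```

A smaller `margin` skips more examples and a larger one lets more through, so it gives you one knob for tuning how selective the stream is.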
