Use textcat and textcat_multilabel in the same model


I have a model that works with NER and TextCategorizer.

Can TextCategorizer be trained using both textcat and textcat_multilabel?
I need to evaluate the relevance of a text (textcat) and add tags (textcat_multilabel).
Is it possible to do this using a single model?

Thank you!

It depends slightly on what your definition of "model" is. If you mean a single spaCy pipeline, then yes!

When you run the train command, you can tell it to add two classification components, one for each dataset.

prodigy train <output_dir> --textcat <dataset1> --textcat-multilabel <dataset2>

Internally, this trains a spaCy pipeline with two classification components. They don't directly influence each other, but both make their predictions from the same input text. There's a slight caveat to this: if you're using transformers, the components can share the transformer layer, so they may influence each other during training, but that's a bit of a detail.

You could say "this is one model, because it's a single pipeline", but I might phrase it as "it's a single pipeline that has multiple trained model components in it".

Similarly, you could also add a NER component.

prodigy train <output_dir> --textcat <dataset1> --textcat-multilabel <dataset2> --ner <dataset3>

The end result will still be a single spaCy pipeline, just with more components. You can find more information on components and pipelines in the spaCy docs.
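
One practical detail when using such a pipeline: both textcat and textcat_multilabel write their scores into the same doc.cats dictionary, so it helps to give the two components distinct label names and separate the scores afterwards. Below is a minimal sketch of that separation in plain Python. The label names, the scores, the 0.5 threshold and the split_cats helper are all made up for illustration and are not part of spaCy or Prodigy.

```python
from typing import Dict, List, Optional, Set, Tuple


def split_cats(
    cats: Dict[str, float],
    exclusive_labels: Set[str],
    threshold: float = 0.5,
) -> Tuple[Optional[str], List[str]]:
    """Split a doc.cats-style dict into one exclusive label and a list of tags.

    Hypothetical helper: exclusive_labels are the labels assumed to come
    from the textcat component; every other label is treated as a
    textcat_multilabel tag.
    """
    # Scores for the mutually exclusive labels (e.g. relevance).
    exclusive = {label: score for label, score in cats.items()
                 if label in exclusive_labels}
    # The highest-scoring exclusive label wins; None if none are present.
    best = max(exclusive, key=exclusive.get) if exclusive else None
    # Every remaining label at or above the threshold counts as a tag.
    tags = sorted(label for label, score in cats.items()
                  if label not in exclusive_labels and score >= threshold)
    return best, tags


# Made-up scores standing in for what doc.cats might contain.
example_cats = {
    "RELEVANT": 0.91,
    "IRRELEVANT": 0.09,
    "BILLING": 0.80,
    "SHIPPING": 0.35,
    "REFUND": 0.62,
}

relevance, tags = split_cats(example_cats, {"RELEVANT", "IRRELEVANT"})
print(relevance, tags)  # RELEVANT ['BILLING', 'REFUND']
```

With a trained pipeline you would pass in the real scores instead, for example split_cats(nlp(text).cats, {"RELEVANT", "IRRELEVANT"}) after loading the pipeline from your output directory with spacy.load.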