Hi, I am following this tutorial to explore textcat. When I run the command below, I got a "Can't find recipe or command 'textcat.batch-train'" error. I was wondering if this is the correct recipe to use? Thanks!
Batch-train recipes like ner.batch-train and textcat.batch-train were deprecated and removed as of v1.11. That video was created a while ago, and that's the one downside of video tutorials: we can't update them when the code changes later.
Instead, use the train recipe. It's a general-purpose command where you now specify what type of model to train along with your training dataset: train --textcat [training_dataset], where [training_dataset] is the name of your training dataset.
Thanks for pointing me to the updated docs. I used --textcat-multilabel for my two-label annotation data and obtained the results attached below.
I am thinking of a few things to try next, e.g., revising my schema (adding more labels at the same level, or creating a nested schema with more labels) and then applying the model to a large chat dataset for real categorization. Could you recommend a few next steps to get me on track? Thanks again!
I would recommend using textcat.correct to identify weak spots in your model.
The idea is to use your current model to look at examples and understand what problems it is still having (i.e., whether there is a pattern in the incorrect examples).
Hopefully doing this will answer your question about next steps. For example, if you find the model struggling on examples that don't seem to fit either of your two labels, that may suggest expanding to three labels. Alternatively, if there appear to be sub-categories within your existing categories, that's more indicative that you may want to create nested labels.
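If you also want to do that kind of error analysis offline, as a rough complement to the textcat.correct UI, here's a minimal sketch of collecting disagreements between the model's top-scoring label and your gold annotations. The `disagreements` helper and the (text, gold, scores) triples are my own illustration, not a Prodigy API; the score dicts just follow the shape of spaCy's `doc.cats`:

```python
def disagreements(examples):
    """Collect cases where the model's top-scoring label differs from the
    gold annotation, so they can be reviewed for patterns.

    `examples` is an iterable of (text, gold_label, scores) triples, where
    `scores` is a {label: probability} dict shaped like spaCy's `doc.cats`.
    """
    errors = []
    for text, gold, scores in examples:
        predicted = max(scores, key=scores.get)
        if predicted != gold:
            errors.append({"text": text, "gold": gold,
                           "predicted": predicted,
                           "score": scores[predicted]})
    return errors
```

Skimming the returned examples should show whether the errors cluster, e.g. texts that fit neither label, or a sub-category that keeps getting confused.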
Hi Ryan, I have a follow-up question about using textcat. Let's say I have obtained a satisfactory model that categorizes my chat data as Research or Nonsearch. How can I use the model to process my roughly 1 million chats?
I am also thinking of adding more categories, with two potential approaches: one level or two levels. How can I compare those two models down the road?
As you probably realized, if you have different categorization schemes, it's hard to compare accuracies directly. One resource, though, is @koaning's spacy-report package, which lets you compare precision/recall trade-offs by category. That can help you evaluate each model's performance per category and find where either model may have weaknesses.
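If you want a quick look at those per-category numbers without extra tooling, per-label precision and recall can also be computed directly from gold and predicted label sets on a shared evaluation set. This is a minimal sketch of my own (the function name is a placeholder, and it is not part of spacy-report); it handles both exclusive and multilabel schemes, so the one-level and two-level variants can be compared label by label on the same texts:

```python
from collections import defaultdict

def per_label_precision_recall(pairs):
    """Compute (precision, recall) per label from (gold, predicted) pairs,
    where each element is a set of label strings."""
    tp, fp, fn = defaultdict(int), defaultdict(int), defaultdict(int)
    for gold, predicted in pairs:
        for label in gold & predicted:   # correctly predicted labels
            tp[label] += 1
        for label in predicted - gold:   # predicted but not annotated
            fp[label] += 1
        for label in gold - predicted:   # annotated but missed
            fn[label] += 1
    scores = {}
    for label in set(tp) | set(fp) | set(fn):
        precision = tp[label] / (tp[label] + fp[label]) if tp[label] + fp[label] else 0.0
        recall = tp[label] / (tp[label] + fn[label]) if tp[label] + fn[label] else 0.0
        scores[label] = (precision, recall)
    return scores
```

Low precision on a label suggests it is absorbing examples that belong elsewhere; low recall suggests its examples are leaking into other labels, which is useful evidence when deciding between the two schemas.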
Alternatively, you could take a more qualitative approach and compare examples. One idea is to use spacy-streamlit to load either model and apply it to input text. It doesn't compare models side by side (you'd need to load one at a time), but it can still be informative.
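On your first question (scoring the roughly 1 million chats): once training has produced a pipeline directory, you can load it with spaCy and stream texts through `nlp.pipe`, which batches documents for throughput. This is a sketch under assumptions: the model path, batch size, and threshold are placeholders you'd tune, and `labels_above` is my own helper for reading a multilabel `doc.cats`, not a spaCy API:

```python
def labels_above(cats, threshold=0.5):
    """Return labels from a doc.cats-style {label: score} dict that clear
    the threshold, highest score first (useful for multilabel output)."""
    ranked = sorted(cats.items(), key=lambda item: -item[1])
    return [label for label, score in ranked if score >= threshold]

def classify_chats(model_dir, texts, batch_size=256, threshold=0.5):
    """Stream texts through a trained spaCy textcat pipeline.

    `model_dir` is a placeholder for wherever training saved the pipeline
    (e.g. the output directory you passed to the train recipe)."""
    import spacy  # imported lazily so labels_above works without spaCy installed

    nlp = spacy.load(model_dir)
    for doc in nlp.pipe(texts, batch_size=batch_size):
        yield doc.text, labels_above(doc.cats, threshold)
```

For a million chats you'd likely read the texts lazily (e.g., line by line from a file) and write results out as you go, rather than holding everything in memory.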
Hope these help and let us know if you have further questions!
With regard to implementing the second level of my schema, I am looking at this post. The explanation makes a lot of sense to me, but I am not sure how to implement what it suggests. I was wondering if there are any related code examples I can refer to? Thank you!