textcat.batch-train versus spacy classification example

nix411 · March 25, 2019, 10:22am

I labelled ~2600 documents for binary classification. Then I trained the model using textcat.batch-train with default parameters and achieved ~98% accuracy. I also tried training using the textcat script from spaCy. That yields around 90%. I am just wondering why I see this big difference.

I am using prodigy 1.6.1 and spacy 2.0.18.

honnibal · March 25, 2019, 11:08am

Hmm! Hard to say. 98% is very suspicious. Have you checked that the evaluation data is definitely different from the training data? Since the two experiments will involve some data export and conversion, I guess there’s the chance for the experiment to be different.
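
One way to run that check is to compare the exported training and evaluation texts directly. This is only a minimal sketch: `normalize` and `find_overlap` are hypothetical helper names, the example texts are made up, and it assumes you have already exported both splits as plain lists of strings.

```python
def normalize(text):
    # Lowercase and collapse whitespace so near-identical copies still match.
    return " ".join(text.lower().split())


def find_overlap(train_texts, eval_texts):
    # Return the evaluation texts that also occur in the training texts.
    train_seen = {normalize(text) for text in train_texts}
    return [text for text in eval_texts if normalize(text) in train_seen]


# Made-up example data; replace with the texts from your own export.
train = ["Great product, works well.", "Terrible support."]
evaluation = ["great product,  works well.", "Arrived late."]

leaked = find_overlap(train, evaluation)
print(len(leaked), "of", len(evaluation), "evaluation texts also appear in training")
```

If anything shows up in `leaked`, the evaluation score is probably inflated. Exact matching after normalization will not catch paraphrases or documents that differ only slightly, so an empty result is a necessary check rather than proof that the split is clean.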

There are some hyper-parameter values that could be getting set differently between the two. But I’d say 98% is likely to be an incorrect result, so I don’t think it’ll be a problem of trying to get the spaCy script to produce the same result as the Prodigy one. I think it’s more likely to be a case of trying to find out what’s going wrong with the Prodigy experiment.

nix411 · March 25, 2019, 12:49pm

I think I disagree since the classification is expected to be a quite easy task actually. This was also evident by looking at the score in textcat.teach.

I have checked that I have only unique documents in my annotated data and I let prodigy split it into training and evaluation. I’d expect the difference to lie in hyperparameters but they look similar to me though. Have you published the script being used in prodigy?

nix411 · March 29, 2019, 8:45am

Just in case this got lost in a pile of work, I'll allow myself to ask again. The thing is, I have an information extraction engine running spacy 2.1 and a classification engine running spacy 2.0 (due to the better training), but I'd like the engines to run on the same machine/session.

honnibal · March 30, 2019, 12:16pm

Prodigy does include the source for its textcat.batch-train script. Have a look in your installation, in prodigy/recipes/textcat.py. You can also look on the recipes repo, but I think for certainty you probably want to look at the script you’re running.

The main things to look at would be the batch size, dropout rates, and making sure that if you’re using pretrained vectors, you’re using them in both models. You can also look at the cfg file that gets written out into each model’s folder to check whether there are any hyper-parameters that look different.
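
A rough sketch of that cfg comparison is below. It assumes the cfg files are JSON, and the two paths in the usage comment are placeholders for wherever your models were saved. `load_cfg` and `diff_cfg` are hypothetical helper names, and the keys in the example dictionaries are made up for illustration, not the actual contents of a cfg file.

```python
import json
from pathlib import Path


def load_cfg(path):
    # Assumes the cfg file is JSON; adjust if your version writes another format.
    with Path(path).open(encoding="utf8") as f:
        return json.load(f)


def diff_cfg(cfg_a, cfg_b):
    # Map each differing key to its (value in a, value in b) pair.
    missing = "<missing>"
    diff = {}
    for key in sorted(set(cfg_a) | set(cfg_b)):
        a, b = cfg_a.get(key, missing), cfg_b.get(key, missing)
        if a != b:
            diff[key] = (a, b)
    return diff


# Illustrative, made-up settings standing in for two loaded cfg files.
prodigy_cfg = {"labels": ["POSITIVE"], "dropout": 0.2, "vectors": "en_vectors"}
spacy_cfg = {"labels": ["POSITIVE"], "dropout": 0.5}

for key, (a, b) in diff_cfg(prodigy_cfg, spacy_cfg).items():
    print(key, "prodigy:", a, "spacy:", b)

# With real models, something like (placeholder paths):
# diff_cfg(load_cfg("prodigy-model/textcat/cfg"), load_cfg("spacy-model/textcat/cfg"))
```

Settings such as batch size and dropout may only be passed as arguments to the training script and never written to the cfg, so it is still worth comparing the two scripts by hand.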
