textcat F1 score goes up and down, up and down

I'm training a textcat model on emails, each with at most 150 tokens. I have a separate evaluation set with 5 positive and 5 negative examples. During training the loss goes down, but the F-score bounces up and down, which is different from what I normally see (F1 going up as the loss goes down). After 30 iterations, the model's F1 score is 0.63, well below the 0.8 baseline. What could be wrong?

Here is the training output:

Baseline accuracy: 0.800

#   Loss    F-Score
1 16.84 0.500
2 12.83 0.367
3 8.44 0.667
4 6.25 0.800
5 7.50 0.667
6 4.91 0.633
7 4.32 0.900
8 4.07 0.800
9 2.79 0.800
10 2.56 0.733
11 0.98 0.767
12 0.19 0.700
13 0.02 0.700
14 0.00 0.667
15 0.10 0.633
16 0.05 0.633
17 0.01 0.633
18 0.01 0.600
19 0.03 0.600
20 0.02 0.567
21 0.00 0.533
22 0.10 0.533
23 0.03 0.500
24 0.00 0.533
25 0.00 0.533
26 0.02 0.533
27 0.06 0.533
28 0.00 0.567
29 0.00 0.633
30 0.00 0.633

============================= ✨ Results summary =============================

Label   ROC AUC
mnpi    0.900

Best ROC AUC 0.900
Baseline 0.800

How many examples are you training with in total? And does this mean your evaluation set only contains 10 examples?

If you are in fact only evaluating on 10 examples, that could explain a lot: it's really difficult to draw any conclusive results from a dataset that small, and you would expect the accuracy to jump around like this. To put this into perspective: if you're training a binary classification model and evaluating on 10 examples, a small difference that leads to a single extra mistake costs you 10% in accuracy (see the sketch below).
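To make that concrete, here is a minimal sketch (using scikit-learn's `f1_score`, which is my own addition and not part of your setup) of how a single extra mistake moves F1 on a hypothetical 10-example evaluation set with 5 positives and 5 negatives, matching the setup you describe:

```python
# Minimal sketch: F1 sensitivity on a tiny evaluation set.
# The labels below are hypothetical, chosen to mirror a
# 5-positive / 5-negative binary eval set.
from sklearn.metrics import f1_score

y_true = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]  # gold labels: 5 positive, 5 negative

# Run A: the model makes one mistake (one false negative).
pred_a = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]

# Run B: identical except for one additional false negative.
pred_b = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]

print(f1_score(y_true, pred_a))  # 0.889  (precision 1.0, recall 0.8)
print(f1_score(y_true, pred_b))  # 0.750  (precision 1.0, recall 0.6)
```

One flipped prediction out of ten moves F1 by roughly 0.14 here, which is the same order of magnitude as the swings in your training log.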

Thank you very much for the reply. You are absolutely right.
After I added more data to the evaluation set, the F1 score now goes up as the loss goes down.