`train-curve textcat` - display AUC for each classification label

JBunr · June 18, 2020, 8:13am

I would like to see training curves for each classification label, using AUC as the metric. It doesn't seem like spaCy supports this out of the box. If that's the case, where would I start if I wanted to add reporting that looks something like the output below?

=============================== ✨  Train curve ===============================
%      POSITIVE  NEGATIVE  NEUTRAL
----   --------  --------  -------
  0%       0.40      0.50     0.50
 10%       0.88      0.49     0.65
 20%       0.86      0.52     0.71
 30%       0.80      0.56     0.73
...

This page seems to provide part of the answer, but it only describes how to change the return value.

ines · June 18, 2020, 9:16am

Hi! The new train recipe function returns a (best_scores, baseline) tuple – best_scores is an instance of spaCy's Scorer, which includes overall accuracy scores, as well as scores per label.

The train-curve recipe mostly just runs train with different portions of the data and then outputs the best score for each training run at the end. So you could write your own version of the recipe that prints best_scores.textcats_per_cat instead of using the default results printer. (You can find the recipe in recipes/train.py in your Prodigy installation – it should be pretty straightforward to copy, because it mostly just calls train() with arguments. To find the location of your Prodigy installation, you can run prodigy stats.)
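
As a rough illustration of the printing side only, here is a minimal sketch of a per-label results printer. It doesn't call Prodigy or spaCy at all: `format_curve_table` and the `ScoresByPortion` input shape are hypothetical names made up for this example, and the sample numbers are just the mock values from the question. It also assumes that each label maps to a single float per run, so if the per-label entry you collect from best_scores.textcats_per_cat holds several metrics, you'd pick the one you want before passing it in.

```python
from typing import Dict, List

# Hypothetical input shape: for each portion of the training data (in percent),
# the best score per label from that training run. In a custom recipe these
# values would be collected after each call to train().
ScoresByPortion = Dict[int, Dict[str, float]]


def format_curve_table(curve: ScoresByPortion, labels: List[str]) -> str:
    """Render one row per data portion and one column per label."""
    widths = [max(len(label), 6) for label in labels]
    rows = [
        "%     " + "  ".join(label.ljust(w) for label, w in zip(labels, widths)),
        "----  " + "  ".join("-" * w for w in widths),
    ]
    for portion in sorted(curve):
        cells = []
        for label, w in zip(labels, widths):
            score = curve[portion].get(label)
            # Show a dash when a run produced no score for a label.
            text = "-" if score is None else f"{score:.2f}"
            cells.append(text.rjust(w))
        rows.append(f"{portion:>3}%  " + "  ".join(cells))
    return "\n".join(rows)


# Mock values taken from the example table in the question, not real results.
example = {
    0: {"POSITIVE": 0.40, "NEGATIVE": 0.50, "NEUTRAL": 0.50},
    10: {"POSITIVE": 0.88, "NEGATIVE": 0.49, "NEUTRAL": 0.65},
}
table = format_curve_table(example, ["POSITIVE", "NEGATIVE", "NEUTRAL"])
print(table)
```

In a copied version of the recipe, you'd fill the dictionary inside the loop over data portions and call the printer once at the end, in place of the default results printer.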

JBunr · June 18, 2020, 9:39am

Thank you - will get to it!
