Understanding outputs for new textcat model

Hi, I recently updated to the newest version of Prodigy and I'm confused about how the models' results are presented.

From a previous entry I know that:

  • "E" refers to the number of epochs that you've trained, with one epoch representing one pass over all the data.

But I don't understand what # is, then. I haven't been able to find documentation on how to read the results. Is "score" the F-score? How do I get the usual accuracy, precision, recall or F1? If I could be directed to documentation on this new format, it'd be very helpful.

Thanks.

Hi! The E column refers to the epochs and # is the number of steps (iterations), i.e. one optimizer update per batch. Also see here: Unclear what the column 'E' is outputting in the console output during training · Discussion #7731 · explosion/spaCy · GitHub
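For intuition, these counters correspond to settings in the training config. A minimal sketch, assuming spaCy v3's [training] block (the setting names are real options, the values are just examples):

```ini
[training]
max_epochs = 10        # upper bound for the E column (full passes over the data)
max_steps = 20000      # upper bound for the # column (one step = one batch update)
eval_frequency = 200   # evaluate and print a new row every 200 steps
```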

The score is the combined weighted score and you can read more about it here: https://spacy.io/usage/training#metrics If you're only training one component, this is typically identical to the main score reported by the component (e.g. F-score). If you're training multiple components, you can use the score weights to define what to prioritise (e.g. best text classifier + NER recall combination, and so on).
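As an illustration of that multi-component case, a [training.score_weights] block prioritising the textcat score and NER recall might look like this (the weights are made up for the example, not defaults):

```ini
[training.score_weights]
cats_score = 0.5   # the text classifier's main metric
ents_r = 0.5       # NER recall
ents_p = 0.0       # zero out the other NER metrics so only recall counts
ents_f = 0.0
```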

You can also use the [training.score_weights] block to define the scores shown in the table. The default score weights set by the individual components are defined in the code via the @Language.factory decorator: https://spacy.io/api/textcategorizer
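For a custom component, those defaults are declared on the factory itself. A minimal sketch, assuming a hypothetical component and score name (my_component and my_custom_score are invented for illustration):

```python
from spacy.language import Language

@Language.factory(
    "my_component",  # hypothetical name, for illustration only
    default_score_weights={"my_custom_score": 1.0},  # weight of this score in the combined score
)
def create_my_component(nlp, name):
    # A no-op component; a real one would set annotations on the doc
    def my_component(doc):
        return doc
    return my_component
```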


Hi @ines,

In the last answer you told me the score metric was equal to the F1 when training for one component. However, as you can see in the attached image, the score is actually equal to the ROC AUC instead of the F1. I'm still confused about how to read the output, because the spaCy documentation only explains what P, R and F mean. There's no explanation there of "#", "LOSS TEXTCAT" and "CATS_SCORE".

Is there any documentation on how to read the output?

Thanks,
Veronica

Hi, Ines mentioned F1 as an example, but it's not necessarily the default score for all components and settings.

The default textcat scoring, which is a bit complicated due to all the possible model configurations, is explained here:

Based on the config settings, the scorer tries to pick the most useful score to show in the "CATS_SCORE" column in the training output. All the possible scores are included in meta.json in the saved model, and this should include a text description of which score was selected for cats_score under cats_score_desc.
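To see which score your own model ended up with, you can inspect the saved meta directly. A minimal sketch, assuming the pipeline was saved to ./output/model-best (the path is just an example):

```python
import json
from pathlib import Path

# Path to a trained pipeline directory; adjust to your own output location
model_dir = Path("./output/model-best")
meta = json.loads((model_dir / "meta.json").read_text())

performance = meta["performance"]          # all scores recorded for the saved model
print(performance.get("cats_score"))       # the value shown in the CATS_SCORE column
print(performance.get("cats_score_desc"))  # text description of which metric was selected
```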