Additional metrics (recall, precision, accuracy, F1) in textcat.train-curve

cbjrobertson · September 26, 2019, 6:08pm

Hi--

I was wondering whether it is possible (and whether you have plans) to include additional performance metrics in the output of textcat.train-curve. Accuracy is not always the most useful metric when dealing with unbalanced classes (as I am). Are additional metrics in the pipeline for textcat.train-curve, and do you suggest any workarounds in the meantime? (I guess apart from manually splitting the data into subsets of various sizes and then running textcat.batch-train on each of them.)

Cheers!

ines · September 27, 2019, 9:18am

That's a nice idea! The textcat.train-curve recipe currently uses the number returned by the textcat.batch-train recipe function. If you take a look at the code, you'll see that this is best_acc["accuracy"]. The full stats returned by model.evaluate are the following:

stats = {
    "tp": tp,
    "fp": fp,
    "fn": fn,
    "tn": tn,
    "avg_score": total_score / total,
    "precision": precision,
    "recall": recall,
    "fscore": 2 * ((precision * recall) / (precision + recall + 1e-8)),
    "loss": loss / (len(examples) + 1e-8),
    "accuracy": (tp + tn) / (tp + tn + fp + fn + 1e-8),
    "baseline": baseline,
}

So if you want the train curve recipe to compare the recall instead, the easiest way would be to change the batch-train recipe in recipes/textcat.py and make it return best_acc["recall"].
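To illustrate why recall can tell a different story than accuracy on unbalanced data, here is a minimal standalone sketch that recomputes some of the stats above from raw counts. It is not Prodigy's code: `binary_stats` is a hypothetical helper, the counts are made up, and the precision and recall formulas are the standard definitions (with the same small epsilon as above), which I'm assuming match what model.evaluate does.

```python
def binary_stats(tp, fp, fn, tn, eps=1e-8):
    """Recompute a subset of the stats dict from confusion-matrix counts.

    Hypothetical helper for illustration only, not part of Prodigy.
    Precision and recall use the standard definitions (an assumption
    about what model.evaluate computes internally).
    """
    precision = tp / (tp + fp + eps)
    recall = tp / (tp + fn + eps)
    return {
        "tp": tp,
        "fp": fp,
        "fn": fn,
        "tn": tn,
        "precision": precision,
        "recall": recall,
        "fscore": 2 * ((precision * recall) / (precision + recall + eps)),
        "accuracy": (tp + tn) / (tp + tn + fp + fn + eps),
    }


# Made-up unbalanced example: 1000 texts, only 20 of them positive.
stats = binary_stats(tp=5, fp=5, fn=15, tn=975)

# Pick the metric to compare by key, the same way the recipe picks "accuracy".
metric = "recall"
print(f"accuracy={stats['accuracy']:.2f} {metric}={stats[metric]:.2f}")
```

With these counts the accuracy comes out at roughly 0.98 while the recall is only about 0.25, which is the kind of gap that accuracy alone hides.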

Btw, you can run the following to find the location of your Prodigy installation:

python -c "import prodigy; print(prodigy.__file__)"
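If you'd rather get the path of the recipe file directly, something like the following sketch should work. `module_file_in_package` is a hypothetical helper name; it uses only the standard library, looks the package up without importing it, and assumes the recipes/textcat.py location mentioned above sits relative to the package directory.

```python
from importlib.util import find_spec
from pathlib import Path


def module_file_in_package(package, relative_path):
    """Return the expected path of a file inside an installed package.

    Hypothetical helper for illustration. Returns None if the package
    can't be found. The path is built, not checked, so the file may
    still not exist at that location.
    """
    spec = find_spec(package)
    if spec is None or spec.origin is None:
        return None
    # spec.origin points at the package's __init__ file.
    return Path(spec.origin).parent / relative_path


# Prints None if Prodigy isn't installed in the current environment.
print(module_file_in_package("prodigy", "recipes/textcat.py"))
```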

sofiejb · January 18, 2023, 8:30pm

Hi, I cannot find the batch-train recipe in textcat.py anymore, so I assume handle_scores_per_type is used for evaluation now? Where would you recommend making these changes?

ryanwesslen · January 18, 2023, 9:20pm

Yes, batch-train was replaced with train in v1.9 (Dec 2019).

Does this answer your question?
