Is it possible to include additional performance metrics in the output of textcat.train-curve, and do you have plans to add any? Accuracy is not always the most useful metric when dealing with unbalanced classes, which is my situation. In the meantime, do you suggest any workarounds, apart from manually splitting the data into various sizes and running textcat.batch-train on each split?
That's a nice idea! The textcat.train-curve recipe currently uses the number returned by the textcat.batch-train recipe function. If you take a look at the code, you'll see that this is best_acc["accuracy"]. The full stats returned by model.evaluate look roughly like this (exact key names may differ between versions):
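```python
# Approximate shape of the stats dict returned by model.evaluate.
# "accuracy" and "recall" are the keys discussed in this thread; the
# "precision" and "fscore" names are my assumption, and the numbers
# are examples only -- check the output of your own version:
best_acc = {
    "accuracy": 0.89,    # what train-curve reports by default
    "precision": 0.84,
    "recall": 0.71,
    "fscore": 0.77,
}
```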
So if you want the train-curve recipe to compare recall instead, the easiest way would be to change the batch-train recipe in recipes/textcat.py and make it return best_acc["recall"].
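Here's a minimal sketch of that edit, assuming the recipe still ends by returning a value from the stats dict above (the real batch_train function takes many more arguments and trains over several iterations):

```python
# Simplified sketch of the end of the batch-train recipe in
# recipes/textcat.py; only the returned key needs to change:
def batch_train(model, examples):
    best_acc = model.evaluate(examples)  # stats dict as shown above
    # train-curve plots whatever number batch-train returns, so
    # returning the recall here makes the curve track recall instead:
    return best_acc["recall"]            # was: best_acc["accuracy"]
```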
Btw, you can run something like the following to find the location of your Prodigy installation:
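```python
# Any snippet that imports the package and prints its path will do;
# run it in the same Python environment Prodigy is installed in:
import prodigy
print(prodigy.__file__)  # location of the installed package
```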
Hi, I can't find the batch-train recipe in textcat.py anymore. I assume handle_scores_per_type is used for evaluation now? Where would you recommend making these changes?