Custom objective for textcat

So I'm writing a custom textcat batch train recipe that will persist the model to a cloud data source.

To accomplish this, I copied the textcat.batch_train recipe in the repo and added some calls to our persistence client.

While doing this, I noticed that batch_train uses the model with the highest accuracy as the 'best' model. My dataset is very sparse in passages that should be labeled, so a model could maximize accuracy by always predicting negative for the label. I've changed my recipe to use fscore instead of accuracy -- are there any downsides to this approach?
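To illustrate the sparsity problem: on a dataset with very few positive examples, an always-negative classifier gets high accuracy but an F-score of zero. A minimal sketch (plain Python, no library assumptions; the label counts are made up):

```python
def accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def f_score(y_true, y_pred):
    # Standard F1: harmonic mean of precision and recall,
    # defined as 0.0 when there are no positive predictions.
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# 95 negatives, 5 positives; the model predicts negative everywhere.
y_true = [0] * 95 + [1] * 5
y_pred = [0] * 100

print(accuracy(y_true, y_pred))  # 0.95 -- looks great
print(f_score(y_true, y_pred))   # 0.0  -- the model learned nothing
```

So selecting the "best" model by accuracy on this kind of data can happily keep the degenerate classifier.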

Maybe such an option could be included in the batch_train recipes?

Sorry for the delay getting to this --- I missed the thread before somehow.

You're right that the default metric might not be the best choice for all situations. v2.2 of spaCy actually has some improvements in the textcat evaluation that I hope we'll be able to take advantage of in Prodigy.

In the meantime, I think implementing a custom recipe to choose the model under whatever criterion you need should be a good solution.
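The model-selection change is small: instead of keeping the model with the best accuracy across epochs, key the comparison on whatever metric you prefer. A sketch of that loop, where `train_epoch` and `evaluate` are hypothetical stand-ins for your actual training and scoring code (here they just return dummy scores so the example runs):

```python
import random

def train_epoch(model, epoch):
    # Hypothetical stand-in: one pass over the training data.
    return model

def evaluate(model, epoch):
    # Hypothetical stand-in: score the model on the eval set.
    # A real version would return the scores your scorer produces.
    random.seed(epoch)
    return {"accuracy": random.random(), "fscore": random.random()}

def best_model_by(n_epochs, metric="fscore"):
    # Keep whichever epoch's model scores highest on `metric`.
    best_score, best_epoch = -1.0, None
    model = None
    for epoch in range(n_epochs):
        model = train_epoch(model, epoch)
        scores = evaluate(model, epoch)
        if scores[metric] > best_score:
            best_score, best_epoch = scores[metric], epoch
    return best_epoch, best_score
```

The only part that matters is the `scores[metric] > best_score` comparison; everything else follows the structure the batch-train recipes already use, so swapping the metric shouldn't have downsides beyond the usual caveats of the metric itself.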