So I'm writing a custom textcat batch train recipe that will persist the model to a cloud data source.
To accomplish this, I copied the textcat.batch_train recipe in the repo and added some calls to our persistence client.
While doing this, I noticed that batch_train selects the model with the highest accuracy as the 'best' model. My dataset is very sparse in passages that should receive the label, so a model could maximize accuracy simply by always predicting negative for the label. I've changed my recipe to select the best model by F-score instead of accuracy -- are there any downsides to this approach?
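For context, here's a toy sketch of the failure mode I mean (pure Python, not the actual recipe code; the numbers and helper names are made up for illustration):

```python
# On a sparse dataset (e.g. 5% positives), a model that always predicts
# "negative" scores high accuracy but zero F-score.
def scores(y_true, y_pred):
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    acc = sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)
    prec = tp / (tp + fp) if tp + fp else 0.0
    rec = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
    return acc, f1

y_true = [1] * 5 + [0] * 95     # 5% of passages carry the label
always_neg = [0] * 100          # degenerate accuracy-maximizing model
acc, f1 = scores(y_true, always_neg)
print(acc, f1)                  # 0.95 0.0
```

Selecting by accuracy would happily keep this degenerate model; selecting by F-score rejects it outright.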
Maybe such an option could be included in the batch_train recipes?