train textcat baseline accuracy

Where can I find a description of the "baseline accuracy" when using "prodigy train textcat"? Is it just the naive approach of classifying everything as the success class or something else?

Hi! The score reported as the baseline accuracy in the regular train recipe is the result of evaluating the base model on the evaluation set. If you're using a blank model, this is the accuracy with randomly initialized weights. Or, expressed in code, the equivalent of this:

import spacy

nlp = spacy.blank("en")  # blank English pipeline, no trained components
textcat = nlp.create_pipe("textcat")
textcat.add_label("LABEL_A")  # one add_label call per category
nlp.add_pipe(textcat)
nlp.begin_training()  # initializes the weights randomly
scores = nlp.evaluate(eval_data)  # eval_data: list of (text, annotations) pairs

I think in the previous textcat.batch-train recipe, Prodigy was actually calculating a majority-class baseline, which is probably a more useful metric here and something we should add back (at least as an option).
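For reference, a majority-class baseline is just the accuracy you'd get by always predicting the most frequent label. A minimal sketch of that calculation, assuming eval_labels is a list of gold label strings (the name is made up for illustration):

from collections import Counter

def majority_class_accuracy(eval_labels):
    # Accuracy of always predicting the most frequent gold label
    majority_count = Counter(eval_labels).most_common(1)[0][1]
    return majority_count / len(eval_labels)

# e.g. majority_class_accuracy(["POS", "POS", "NEG"]) == 2 / 3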


@ines just to clarify, is the baseline accuracy the predictive accuracy on the validation set? I'm a little confused because mine doesn't change after training... does this mean my model isn't actually improving, despite a ROC score of .9 (i.e., that it's basically just overfitting)?

Yes, in this case, it's the result of evaluating the model on the evaluation data before training. If you start with a blank model, it'll be the accuracy of a model with randomly initialized weights. So basically, the accuracy if you did nothing.

Do you mean the accuracy after training is lower than the baseline accuracy? If that's the case, that would indicate that something is wrong (either in the training or the evaluation), because the weights you trained ended up performing worse than the randomly initialized weights.
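To make that concrete, here's a rough sketch of the before/after comparison, using the same spaCy v2-style API as the snippet above. train_data and eval_data are placeholders for your own lists of (text, annotations) pairs, and the training loop details are illustrative, not Prodigy's exact internals:

import spacy
from spacy.util import minibatch

nlp = spacy.blank("en")
textcat = nlp.create_pipe("textcat")
textcat.add_label("LABEL_A")
nlp.add_pipe(textcat)
optimizer = nlp.begin_training()

baseline = nlp.evaluate(eval_data)  # baseline: randomly initialized weights

for epoch in range(10):
    for batch in minibatch(train_data, size=8):
        texts, annotations = zip(*batch)
        nlp.update(texts, annotations, sgd=optimizer)

trained = nlp.evaluate(eval_data)
# trained scores should beat the baseline; if they're lower,
# something is off in the training or the evaluation setup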