Can't improve textcat model performance

I have a text classification model that I've trained on sentences. The model is binary: the label is either present or not. I've got about 1750 annotated sentences across two datasets, although many of those are 'ignores' and only 722 are used in training. I can't seem to get the accuracy of the model past 0.75, and the train-test curve shows accuracy dropping in the last stage. Loss is very close to 0 after 10 iterations of training. Is there anything else I should be looking at to improve the performance of the model?

Hi! It's always difficult to give a definitive answer because there could be many explanations. But here are a few questions and ideas to help you debug:

  • Which versions of Prodigy and spaCy are you running?
  • Are you using a dedicated evaluation set, or are you using the default evaluation split that just holds back a certain percentage? If you don't have a separate evaluation set, your evaluation may be unstable, because each run takes a different random 20% of the already small set of 722 training examples, and that random split may happen to give you an unrepresentative selection of the data.
  • If your loss hits 0 after 10 iterations while evaluation accuracy stays at 0.75, your model is likely overfitting the training data.
  • How were your annotations created, and are they internally consistent? Is the distinction you're trying to train something the model can learn?
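On the evaluation-split point: one way to get a stable, dedicated evaluation set is to shuffle your examples once with a fixed seed, split them, and then reuse the same held-out set for every training run. A minimal sketch in plain Python (the `examples` list here is hypothetical stand-in data; in practice you'd load the annotated sentences from your Prodigy dataset):

```python
import random

# Hypothetical stand-in for the 722 annotated (text, label) examples;
# in practice these would come out of your Prodigy dataset.
examples = [(f"sentence {i}", i % 2 == 0) for i in range(722)]

# Shuffle once with a fixed seed, then split 80/20. Saving and reusing
# the evaluation portion makes accuracy numbers comparable across
# experiments, unlike a fresh random split on every run.
rng = random.Random(42)
shuffled = examples[:]
rng.shuffle(shuffled)

split = int(0.8 * len(shuffled))
train, evaluation = shuffled[:split], shuffled[split:]

print(len(train), len(evaluation))  # 577 145
```

With such a small dataset it's also worth checking that the label distribution in the held-out portion roughly matches the training portion, since a skewed split can move accuracy by several points on its own.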

Thanks for the reply!

  1. Using spaCy 2.2.4 and Prodigy 1.9.9
  2. I have a dedicated evaluation set
  3. I created the annotations myself, based on a spec I wrote, so this is the part that worries me most. If I set the threshold high, the model identifies positives pretty reliably, but recall at that point is poor: it would let lots of positives slip through.
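That precision/recall trade-off can be made explicit by sweeping the decision threshold over the model's scores on the evaluation set. A small sketch with hypothetical `(score, is_positive)` pairs (real scores would come from `doc.cats` after running the trained pipeline):

```python
# Hypothetical (score, is_positive) pairs standing in for textcat
# predictions on a held-out evaluation set.
preds = [(0.95, True), (0.90, True), (0.80, False), (0.70, True),
         (0.60, False), (0.55, True), (0.40, False), (0.30, True),
         (0.20, False), (0.10, False)]

def precision_recall(preds, threshold):
    """Precision and recall when scores >= threshold count as positive."""
    tp = sum(1 for s, y in preds if s >= threshold and y)
    fp = sum(1 for s, y in preds if s >= threshold and not y)
    fn = sum(1 for s, y in preds if s < threshold and y)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# A high cutoff gives high precision but misses positives (low recall),
# which matches the behaviour described above.
for t in (0.9, 0.7, 0.5, 0.3):
    p, r = precision_recall(preds, t)
    print(f"threshold={t:.1f} precision={p:.2f} recall={r:.2f}")
```

Plotting precision and recall against the threshold like this often makes it easier to pick an operating point, and to see whether the problem is the threshold or the underlying model.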