Hi! In that case, you could use the model you trained on your previously collected 500 annotations as the input model when you run textcat.teach, and save the annotations to a new dataset. Based on the model's predictions, the recipe will then select the most uncertain examples and let you accept or reject them.
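As a rough sketch (the dataset name, model path, and labels here are placeholders for your own), the command could look like this:

```shell
# "textcat_new" is a hypothetical new dataset to save annotations to,
# "./textcat_model" is the model trained on your first 500 annotations,
# "./unlabeled.jsonl" is your stream of raw examples.
prodigy textcat.teach textcat_new ./textcat_model ./unlabeled.jsonl --label LABEL_A,LABEL_B
```

Saving to a new dataset (rather than the original one) makes it easier to compare the impact of the new annotations later.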
Depending on the number of labels you have, it might make sense to start with a subset of labels, especially those that need the most improvement. If you run training with --label-stats, you'll get a breakdown of the accuracy per label, so you can see if there's one label in particular that stands out.
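For example, assuming your existing annotations live in a dataset called "textcat_dataset" (a placeholder name), you could run:

```shell
# --label-stats prints per-label scores after training,
# so you can spot labels that perform worse than the rest.
prodigy train --textcat textcat_dataset --label-stats
```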
After you've annotated with textcat.teach, you can then train a new model from scratch using the previously collected annotations plus the new annotations.
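For the final training run, you can pass multiple datasets as a comma-separated list so the model is trained on both the original and the new annotations together (dataset names and output path here are placeholders):

```shell
# Train from scratch on the original 500 annotations plus the
# new textcat.teach annotations, saving the model to ./output_model.
prodigy train ./output_model --textcat textcat_dataset,textcat_new --label-stats
```

Training from scratch on the combined data is usually more reliable than updating the intermediate model, since it avoids compounding any biases from the active-learning loop.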