Best use of `textcat.teach`

I am experimenting with `textcat.teach` to see if it would be better to use this approach as opposed to `textcat.manual`.

I run the following:

```
prodigy textcat.teach my_test ja_core_news_lg text.jsonl --label Pos,Neg,Neut
```

`text.jsonl` contains 1060 lines, each with no preassigned classification.
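For reference, each line is a plain text record in Prodigy's JSONL input format, along these lines (the texts here are made-up placeholders, not my actual data):

```
{"text": "配達がとても早くて助かりました。"}
{"text": "商品説明と違うものが届きました。"}
```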

I am expecting Prodigy to load each sentence in the interface and present the annotator with one of my labels, so the annotator can assign an accept/reject decision to each sentence/label pair.

However, after 60 sentences, I am presented with a "No tasks available." message.

Am I using this recipe in error, or is this behaviour expected in teach recipes?
Also, what is the termination criterion that results in "No tasks available." being displayed?

Looking at other support inquiries, it seems Prodigy does not load all sentences for annotation. I ran training based on the 60 cases that I annotated and got the results below.

They are disappointing, though given the limited number of training examples that's unsurprising. But why does Prodigy conclude that 60 annotations are sufficient?

```
$ prodigy train textcat my_test ja_core_news_lg
✔ Loaded model 'ja_core_news_lg'
Created and merged data for 60 total examples
Using 30 train / 30 eval (split 50%)
Component: textcat | Batch size: compounding | Dropout: 0.2 | Iterations: 10
ℹ Baseline accuracy: -1.000

=========================== ✨  Training the model ===========================

#    Loss       F-Score
--   --------   --------
1    27.03      -1.000
...
10   4.34       -1.000

============================= ✨  Results summary =============================

Label    ROC AUC
------   -------
Other      0.528
Pos       -1.000
Neg       -1.000
Neut      -1.000
```

The idea of the `textcat.teach` recipe is that it uses the model in the loop to select the most relevant examples for annotation, based on the score (e.g. prioritising the examples with a score closest to 0.5, as those may be the most "uncertain" predictions). This also means that the recipe will skip examples with high and low scores, so you're not going to see all examples in your dataset. The recipe uses an exponential moving average to decide which scores to consider. This prevents Prodigy from getting stuck if the model ends up in a state where it produces mostly high/low scores etc.
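Just to illustrate the idea, here's a simplified sketch of uncertainty sampling with an exponential moving average. This is not Prodigy's actual implementation; the function name, smoothing factor and thresholds are made up for illustration:

```
def prefer_uncertain_stream(scored_examples, smoothing=0.9):
    """Yield examples whose uncertainty is above a moving average.

    scored_examples: iterable of (score, example) pairs, where score is
    the model's probability for a label (0.0 to 1.0).
    """
    ema = 0.5  # running average of recent uncertainty
    for score, example in scored_examples:
        # Uncertainty is highest when the score is closest to 0.5
        uncertainty = 1.0 - abs(score - 0.5) * 2.0
        # Update the exponential moving average of uncertainty
        ema = smoothing * ema + (1.0 - smoothing) * uncertainty
        # Only emit examples that are at least as uncertain as the recent
        # average: confident (very high/low score) examples are skipped,
        # but the stream adapts instead of getting stuck if all scores
        # drift towards one end.
        if uncertainty >= ema:
            yield example
```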

If you're starting completely from scratch with a new model and you're annotating labels that might not be equally distributed, this workflow can be less effective because your model knows nothing yet, and it would take very long to collect enough examples of all labels to teach it something meaningful so it can actually "participate" properly.

So it might make sense to start with a manual workflow like `textcat.manual` and annotate a small sample from scratch. You can then pretrain your model on that to give it a head start. It can also help to use `--patterns` on `textcat.teach` to make sure that pattern matches are always shown if they occur (e.g. to show examples that may be part of rarer classes).
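For example, a sequence along these lines could work (the dataset names, output path and pattern words below are just placeholders to show the shape of the workflow):

```
# 1) Annotate a small sample from scratch
prodigy textcat.manual my_test_manual text.jsonl --label Pos,Neg,Neut

# 2) Pretrain a model on those manual annotations to give it a head start
prodigy train textcat my_test_manual ja_core_news_lg --output ./textcat_initial

# 3) Use the pretrained model in the loop, with patterns to surface rarer classes
prodigy textcat.teach my_test ./textcat_initial text.jsonl --label Pos,Neg,Neut --patterns patterns.jsonl
```

Here `patterns.jsonl` contains one match pattern per line, either as a string or as a token pattern, e.g.:

```
{"label": "Pos", "pattern": "素晴らしい"}
{"label": "Neg", "pattern": [{"lower": "最悪"}]}
```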
