Two Questions on Teach Recipes

Hello! I have two questions on how to use ner.teach and textcat.teach.

We have trained an NER model using ~1000 messages and 20 entity labels. We're hoping to use ner.teach to improve the model, but I can't quite figure out how to use that recipe. When I export the results from ner.teach, each message only has the single entity I validated, not all of the entities in the message. Should I then run these sentences through ner.correct, or is that not necessary? How can I best use these results?

And finally, I also built a textcat model using a similar number of messages. But when I run textcat.teach, I get a message saying there are no tasks. I trained the model outside of Prodigy, directly in spaCy – could that be why?

Thanks!!

That's the idea, yes – instead of labelling all entities in the text by hand or in a semi-automated way, ner.teach uses the model to suggest entities (based on all possible analyses of the text) and asks you whether each suggestion is correct or not. Even if you don't know the answer for every single token (and in some cases only know that a certain analysis is not correct), you can still update the model proportionally and move it in a more correct direction.

Prodigy's training recipes implement a mechanism to update from this type of binary, incomplete data (ner.batch-train or the new train with --binary). So after collecting annotations with ner.teach, you'd then update your base model with those annotations, and the updated model will hopefully produce better predictions than before.
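As a rough sketch, the workflow looks something like the commands below – the dataset name, model path, input file and labels are all placeholders, and the exact train arguments depend on your Prodigy version:

```bash
# Collect binary yes/no annotations with the existing model in the loop.
# "ner_binary", "./base_model", "messages.jsonl" and the labels are placeholders.
prodigy ner.teach ner_binary ./base_model messages.jsonl --label ORG,PRODUCT

# Update the base model from those binary annotations
# (v1.10-style train arguments; older versions use ner.batch-train instead).
prodigy train ner ner_binary ./base_model --output ./improved_model --binary
```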

Do the labels you've set on the command line match the labels in the model? Otherwise, Prodigy can't find any suggestions. Also, are you using a new dataset? If you've already annotated those examples before and they're in the same dataset, Prodigy will skip them (so you don't get asked the same question twice) and you'll end up with an empty stream.
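If you want to double-check both of these quickly, something along these lines should work (the model path and dataset name are placeholders):

```bash
# Print the labels the textcat component in your model was trained with –
# these need to match what you pass to --label on the command line.
python -c "import spacy; print(spacy.load('./textcat_model').get_pipe('textcat').labels)"

# Check whether the dataset you're writing to already contains annotations for
# these examples (an existing dataset plus the same input can explain an empty stream).
prodigy stats my_dataset
```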

(Whether you train the model with spaCy directly or in Prodigy won't make a difference for things like this – in both cases, you're doing the same thing and calling nlp.update with examples.)

Thank you so much!! I used ner.batch-train, but when I evaluate the new model against our existing test set, the F1 score drops by about 10% (precision stayed the same, but recall went way down). Is training only using the single entity I marked as correct in ner.teach, and thus "teaching" the model to only return one entity per message?

If you're using ner.batch-train or train with --binary then no – the training process here was specifically designed for incomplete and binary yes/no answers. Before training, Prodigy will merge all annotations on the same texts, and all unannotated tokens will be considered missing values. (Btw, if you're interested in how the updating process works for binary annotations, my slides here show an example.)

Depending on the annotations and specific texts and entities, it can always happen that the binary annotations don't move the needle very much. If training the model on the binary annotations makes the model significantly worse, you might also want to double-check that your data is consistent (e.g. review a random sample using the review recipe). The dataset you're training from should only contain binary annotations, and shouldn't label any partial suggestions as accepted (see here for background on this).

Hi Ines!! After looking through our data, I definitely agree that we have some inconsistently tagged data in our training set. Do you have any advice on how to identify possibly mis-tagged data in Prodigy?

The review recipe and interface are a good way to re-annotate data: you can see the original annotation and "overrule" it if it's incorrect. The original example is saved with the new annotation, so you don't lose the reference. The recipe also groups annotations together if there are overlaps (e.g. the same data annotated by multiple people).
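For example (dataset names are placeholders), the recipe takes a new output dataset followed by the dataset(s) you want to re-annotate:

```bash
# Re-annotate the examples in ner_dataset; the reviewed versions are saved
# to a new dataset, so the original annotations stay untouched.
prodigy review ner_dataset_reviewed ner_dataset
```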

If you can identify patterns that indicate that an annotation might be incorrect, that's very helpful, too, because it lets you write a script to pre-select candidates (so you don't have to go through all the examples again). You can then queue those up for annotation first, and periodically re-run your training experiments to see if the corrected data makes a difference.
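One possible way to set that up, sketched below with placeholder names and a made-up heuristic: export the dataset with db-out, filter it with your own script, and then queue up only the suspicious examples for re-annotation.

```bash
# Export the raw annotations so you can filter them however you like.
prodigy db-out ner_dataset > ner_dataset.jsonl

# Made-up heuristic: keep accepted examples containing a span labelled PRODUCT.
# Replace this with whatever pattern indicates a suspicious annotation in your data.
jq -c 'select(.answer == "accept" and ((.spans // []) | any(.label == "PRODUCT")))' \
    ner_dataset.jsonl > suspects.jsonl

# Re-annotate only the suspects by hand, saving the corrections to a new dataset.
prodigy ner.manual ner_corrections ./base_model suspects.jsonl --label ORG,PRODUCT
```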