From Choice annotations to binary annotations with Teach

Hi Ines! Thanks a lot for your help. I managed to follow your suggestion step by step to the end.

I converted the data, trained a multilabel model with spaCy, saved the model to disk, and used it as a pre-trained model to help me annotate one of the labels with textcat.teach.

I have two follow-up questions:

  1. If I generate binary annotations with textcat.teach and then run textcat.batch-train to train the multilabel model on those binary annotations, what exactly is the model being trained on? Does Prodigy assume that all other labels are False?

  2. While doing binary annotation for LABEL_ONE, all the documents that textcat.teach suggests are negative examples (none of them are LABEL_ONE). How can I make textcat.teach suggest the examples with the highest probability of belonging to LABEL_ONE, instead of the ones with the most uncertain scores, so that I can generate more positive annotations? I know there is a lot of value in choosing the uncertain ones, but it doesn’t seem like the best approach when you have an unbalanced multilabel dataset and you’re just getting started.
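
To make question 2 more concrete, here is a rough sketch in plain Python of the two orderings I have in mind. The scores, document names, and function names are all made up for illustration; this is not Prodigy's API, just the behaviour I'm asking about.

```python
# Hypothetical (score, text) pairs, where score stands in for the model's
# predicted probability that the text belongs to LABEL_ONE.
scored = [
    (0.05, "doc a"),
    (0.48, "doc b"),
    (0.91, "doc c"),
    (0.30, "doc d"),
]


def by_uncertainty(examples):
    # What I understand the default to be, roughly: scores closest to 0.5 first.
    return sorted(examples, key=lambda ex: abs(ex[0] - 0.5))


def by_high_score(examples):
    # What I would like instead: highest LABEL_ONE probability first.
    return sorted(examples, key=lambda ex: ex[0], reverse=True)


uncertain_first = [text for _, text in by_uncertainty(scored)]
high_first = [text for _, text in by_high_score(scored)]

print(uncertain_first)
print(high_first)
```

With an unbalanced dataset, the second ordering should surface the few likely positives early, which is what I'm hoping to get out of textcat.teach.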

Thanks!