Train binary textcat in Prodigy Nightly

Hello community, I am new to Prodigy and currently I am using the nightly version.

I can't figure out how to train a textcat model with a single binary label (https://prodi.gy/docs/text-classification#manual-binary). The score is always zero. If I annotate with two labels (Yes/No) instead, training proceeds correctly.

I feel I must have missed something obvious. Thanks for your help!

============================= Training pipeline =============================
Components: textcat
Merging training and evaluation data for 1 components
  - [textcat] Training: 20 | Evaluation: 4 (20% split)
Training: 20 | Evaluation: 4
Labels: textcat (1)
ℹ Pipeline: []
ℹ Initial learn rate: 0.001
E    #       SCORE
---  ------  ------
  0       0    0.00
141     200    0.00
341     400    0.00
541     600    0.00
741     800    0.00
941    1000    0.00
1141    1200    0.00
1341    1400    0.00
1541    1600    0.00

Hi! The problem here is that the textcat component in spaCy v3 expects at least 2 labels for binary categories (e.g. LABEL and NOT_LABEL). If you're training a pipeline with only one label, you can use the textcat_multilabel component instead.
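
To illustrate why a single label is degenerate for a classifier with mutually exclusive classes, here is a minimal sketch in plain Python. It assumes the exclusive component normalizes its scores across labels (softmax-style) while the multilabel component scores each label independently (sigmoid-style). The function names are only for illustration and are not part of spaCy or Prodigy.

```python
import math


def softmax(logits):
    """Normalize scores so they sum to 1 (mutually exclusive classes)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(logit):
    """Score one label independently of all other labels."""
    return 1.0 / (1.0 + math.exp(-logit))


raw_scores = (-5.0, 0.0, 5.0)

# With a single label there is nothing to normalize against:
# every raw score maps to probability 1.0, so the model can never say "no".
single_label_scores = [softmax([logit])[0] for logit in raw_scores]

# An independent score per label can still separate accept from reject.
independent_scores = [sigmoid(logit) for logit in raw_scores]

print(single_label_scores)
print(independent_scores)
```

With two labels (e.g. LABEL and NOT_LABEL) the normalized scores can move in opposite directions, which is why the Yes/No setup trains as expected.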

A new version of the v1.11 nightly is coming that introduces a --textcat-multilabel option for training binary classifiers. In the meantime, you could just export your data with data-to-spacy and then train with a config using textcat_multilabel instead of textcat.
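
As a rough sketch of the config change, the snippet below rewrites the text of a config so the component uses textcat_multilabel. The helper name `to_multilabel` and the sample config are hypothetical, and the sketch assumes the exported config contains a `factory = "textcat"` line and an `exclusive_classes = true` setting. Check your own config.cfg, since its layout may differ.

```python
import re


def to_multilabel(config_text: str) -> str:
    """Swap the textcat factory for textcat_multilabel in config text."""
    # Switch the component factory. The closing quote in the pattern keeps
    # an already converted "textcat_multilabel" line from matching again.
    text = re.sub(
        r'factory\s*=\s*"textcat"',
        'factory = "textcat_multilabel"',
        config_text,
    )
    # Labels are no longer treated as mutually exclusive.
    text = re.sub(
        r"exclusive_classes\s*=\s*true",
        "exclusive_classes = false",
        text,
    )
    return text


# Hypothetical excerpt, not a complete training config.
sample = """[components.textcat]
factory = "textcat"

[components.textcat.model]
exclusive_classes = true
"""

converted = to_multilabel(sample)
print(converted)
```

The converted text would be saved back to the config file and passed to spaCy's train command together with the exported training and evaluation data.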

Thanks for the clarification! I am using the data-to-spacy approach now.

Just released a new update to the nightly that now lets you provide --textcat-multilabel datasets separately 🙂