Train doesn't use rejected text for binary classification

Maybe I still don't understand the Accept and Reject concept.

I have a binary classification problem. I used the Match recipe to annotate text with the label "x". In the result, all the texts have the label "x". Some of them have the "answer" "accept" and others have "reject". My expectation was that both the "accept" and "reject" texts would be used for training: "accept" as the positive cases and "reject" as the negative cases. However, when I use the "train" recipe, only the "accept" cases are used in training. My questions are:

  • Is this working as designed?
  • How can I convert the "reject" cases to negative cases?

Thank you,

Thanks for the report! This looks like a regression that was introduced in the latest v1.9.8, which causes the binary annotations to be filtered incorrectly in the general train recipe. Sorry about that! (The change that introduced it was actually supposed to fix a problem that could lead to incorrect totals being reported.)

You can work around this by editing the train.py file and adding accept_only=False to the example loading for text classification:

textcat_examples = load_examples(DB, textcat_datasets, accept_only=False)
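For reference, here's a rough sketch of the behaviour you'd expect for binary text classification: "accept" becomes a positive example for the label and "reject" becomes a negative one. This is not Prodigy's internal code. The function name to_textcat_examples is hypothetical, and the record shape (the "text", "label" and "answer" keys) is assumed from what you described above.

```python
def to_textcat_examples(annotations, accept_only=False):
    """Hypothetical helper: turn binary annotations into
    (text, {"cats": {label: score}}) pairs for a text classifier."""
    examples = []
    for eg in annotations:
        answer = eg.get("answer")
        # Skip ignored or unanswered records entirely.
        if answer not in ("accept", "reject"):
            continue
        # With accept_only=True, rejected records are dropped,
        # which is the behaviour described in the question.
        if accept_only and answer != "accept":
            continue
        # accept -> positive case (1.0), reject -> negative case (0.0)
        score = 1.0 if answer == "accept" else 0.0
        examples.append((eg["text"], {"cats": {eg["label"]: score}}))
    return examples


# Made-up records, shaped like the annotations described in the question.
annotations = [
    {"text": "first text", "label": "x", "answer": "accept"},
    {"text": "second text", "label": "x", "answer": "reject"},
    {"text": "third text", "label": "x", "answer": "ignore"},
]

examples = to_textcat_examples(annotations)
accepted_only = to_textcat_examples(annotations, accept_only=True)
```

With accept_only=False, both the accepted and the rejected text end up in the training data, one as a positive and one as a negative case. With accept_only=True, only the accepted text is left, so the model never sees a negative example.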

I'm just building the wheels for a small update that fixes this internally and improves a few other related things around how the examples are filtered.

Also, we just released v1.9.9, which should solve the underlying problem and correctly handle the different annotation types (and the meanings of accept and reject depending on the data type) :)

Yes, your solution works! Thanks.