Making the right selection for multi-label text categorization

Hi all,

I'm working with textcat.manual to apply multiple labels to text. I don't quite understand the differences between the following:

  • Selecting "Reject"
  • Selecting "Ignore"
  • Not applying any labels and selecting "Accept"

From my understanding, "Ignore" just removes the example from the training set. Will "Reject" count against all labels?

Thanks for any guidance!
Tom

Hi @Pragma_Tom ,

For a multi-label problem, here is a good interpretation for each of these actions:

  1. Selecting REJECT: a less common use case. You can interpret it as "the selected label DOESN'T apply".
  2. Selecting IGNORE: skips the example. It is still saved to the database but is never used during training. This is useful when there are weird HTML markup errors, or when you really don't know what the labels are and want to move on.
  3. Not applying any labels and selecting ACCEPT: the example is saved to the database without any labels attached to it.
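
To make the three cases concrete, here is a minimal sketch of how you could turn saved annotations into per-label training targets yourself. It assumes each saved record is a dict with an "answer" key ("accept", "reject" or "ignore") and an "accept" list holding the selected label IDs. The label names and the `to_cats` helper are made up for illustration, and the handling of REJECT follows the interpretation above rather than anything Prodigy's own training recipes are guaranteed to do.

```python
from typing import Dict, List, Optional

# Hypothetical label set, for illustration only.
LABELS = ["SPORTS", "POLITICS", "TECH"]


def to_cats(example: dict, labels: List[str]) -> Optional[Dict[str, float]]:
    """Map one saved annotation to per-label targets (1.0 = applies, 0.0 = doesn't)."""
    answer = example.get("answer")
    selected = set(example.get("accept", []))

    if answer == "ignore":
        # Saved in the database, but skipped entirely for training.
        return None
    if answer == "accept":
        # Selected labels apply; with nothing selected, every label is 0.0.
        return {label: (1.0 if label in selected else 0.0) for label in labels}
    if answer == "reject":
        # Assumption: only the selected labels are known NOT to apply;
        # the remaining labels stay unknown and are left out.
        return {label: 0.0 for label in labels if label in selected}
    raise ValueError(f"Unexpected answer: {answer!r}")


examples = [
    {"text": "New phone released", "accept": ["TECH"], "answer": "accept"},
    {"text": "Nothing relevant here", "accept": [], "answer": "accept"},
    {"text": "Election results", "accept": ["SPORTS"], "answer": "reject"},
    {"text": "<div><div>", "accept": [], "answer": "ignore"},
]

for eg in examples:
    print(eg["answer"], to_cats(eg, LABELS))
```

With this reading, an empty ACCEPT is an explicit negative for every label, while IGNORE contributes nothing at all.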

Hope it helps!