I remember reading somewhere that "textcat.batch-train" is only effective for binary classification. If that is true, it would explain the problem, since I have 3 classes here. Is there any workaround for that?
I'm still confused about the classification model.
- The classification model doesn't care about the highlighted entity while classifying a sentence, right?
So if one sentence has 2 entities with two different classes, I'm going to label that sentence 2 times.
- Will that confuse the model? Is there any way to classify a sentence based on the highlighted entity?
Yes, that's correct.
You could always make a dataset where you strip away all the words other than the highlighted ones. This would basically learn something like a lookup table. I think an explicitly rule-based approach might do better than that: it's fine to have some rules that say that if particular phrases occur, a particular category applies. The spaCy Matcher component can help you with that.
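To make the rule-based idea concrete, here is a minimal sketch in plain Python using regular expressions. The phrases, the labels and the `categorize` helper are all made up for illustration; in a real pipeline you would probably express the same rules as token patterns for the spaCy Matcher instead.

```python
import re

# Hypothetical phrase-to-category rules; these phrases and labels are
# invented for the example and are not taken from the thread.
RULES = {
    "POSITIVE": [r"\bworks great\b", r"\bhighly recommend\b"],
    "NEGATIVE": [r"\bstopped working\b", r"\bwaste of money\b"],
}


def categorize(text):
    """Return the sorted list of categories whose phrases occur in the text."""
    found = set()
    for label, patterns in RULES.items():
        # A category applies as soon as any one of its phrases matches.
        if any(re.search(p, text, flags=re.IGNORECASE) for p in patterns):
            found.add(label)
    return sorted(found)


print(categorize("The charger stopped working, but the case works great."))
# prints ['NEGATIVE', 'POSITIVE']
```

Note that one sentence can trigger rules for more than one category, which matches the situation of a sentence containing two entities with different classes.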
That's not quite right, and it might be the source of some confusion here!
The text classifier supports two types of problem definition: mutually exclusive classes, or non-mutually-exclusive classes. It seems that your model above was trained as non-mutually-exclusive, leaving it able to predict all three classes on a single example. This may or may not be what you want: it depends on the problem.
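The difference between the two problem definitions can be sketched in a few lines of plain Python. This is only an illustration, assuming the common setup where mutually exclusive classes are scored with a softmax and non-exclusive classes with an independent sigmoid per label; the labels and raw scores below are invented.

```python
import math


def softmax(scores):
    """Mutually exclusive: scores compete and sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(scores):
    """Non-exclusive: each label is scored independently in (0, 1)."""
    return [1 / (1 + math.exp(-s)) for s in scores]


labels = ["A", "B", "C"]   # hypothetical class names
raw = [2.0, 1.5, -3.0]     # hypothetical raw model outputs

exclusive = dict(zip(labels, softmax(raw)))
independent = dict(zip(labels, sigmoid(raw)))

print(exclusive)    # probabilities that sum to 1, so one label "wins"
print(independent)  # A and B can both score above 0.5 at the same time
```

With the independent scoring, several labels can be predicted for the same example, which is the behaviour described above.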