I have managed to annotate enough data for each of my labels to reach an F-score of around 0.8 from batch-train.
Next, I would like to merge all the label-specific datasets, train a multi-label classifier (non-exclusive labels), and evaluate it against a gold-standard evaluation set.
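To illustrate what I mean by merging, here is a rough sketch (not the Prodigy API; the example structure and label names are just assumptions) of combining per-label accept/reject annotations into one multi-label example per unique text:

```python
def merge_datasets(datasets):
    """Combine label-specific binary annotations into one
    multi-label example per unique text (non-exclusive labels).

    datasets: dict mapping a label name to a list of examples,
    each shaped like {"text": ..., "answer": "accept"/"reject"}.
    """
    merged = {}
    for label, examples in datasets.items():
        for eg in examples:
            entry = merged.setdefault(eg["text"], {"text": eg["text"], "labels": []})
            if eg["answer"] == "accept":
                entry["labels"].append(label)
    return list(merged.values())

# Hypothetical label-specific datasets for illustration only
datasets = {
    "SPORTS": [{"text": "Great match last night", "answer": "accept"}],
    "POLITICS": [
        {"text": "Great match last night", "answer": "reject"},
        {"text": "The vote passed", "answer": "accept"},
    ],
}
merged = merge_datasets(datasets)
# Each text now carries the list of labels that were accepted for it.
```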
1] According to both the docs and this post:
the mark recipe is recommended for creating evaluation datasets.
I wonder why the mark recipe is preferred over the textcat.manual recipe? What is the actual difference between the two?
2] Is there any way I could use textcat.eval for multi-label classification on data annotated with the choice view, or does textcat.eval only work for single-label evaluation? I only care whether the correct label is among those assigned to a text...
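In case it helps clarify the metric I have in mind, here is a minimal sketch (plain Python, nothing Prodigy-specific; the label names are made up) of the lenient accuracy I describe, where a prediction counts as correct if at least one gold label appears among the predicted labels:

```python
def any_match_accuracy(gold, predicted):
    """Fraction of examples where at least one gold label appears
    among the predicted labels (lenient multi-label accuracy)."""
    hits = sum(1 for g, p in zip(gold, predicted) if set(g) & set(p))
    return hits / len(gold)

# Hypothetical gold and predicted label lists, one entry per text
gold = [["SPORTS"], ["POLITICS"], ["SPORTS", "TECH"]]
pred = [["SPORTS", "TECH"], ["SPORTS"], ["TECH"]]
score = any_match_accuracy(gold, pred)  # 2 of 3 examples have a match
```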