I have two textcat models. One was trained on manually annotated data, the other via active learning. The actively trained model only used a subset of the full dataset. Now I want to compare the performance of the two models.
`prodigy train` won't do it, because it only evaluates against its own data (or a held-out split of it). In my case that would just be comparing apples with oranges.
It would make more sense to evaluate both models on the same unseen dataset, but I haven't seen how to do that in Prodigy. It looks like I'd have to export the two models to spaCy and evaluate them in spaCy against the unseen evaluation set. Am I right?
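For reference, here is the workflow I'm imagining, as a rough sketch. The dataset name `eval_set` and the output paths are placeholders, and I'm assuming `data-to-spacy` and `spacy evaluate` work the way I think they do:

```shell
# Export the held-out annotations from Prodigy into spaCy's binary format.
# (eval_set is a hypothetical Prodigy dataset holding the unseen examples.)
prodigy data-to-spacy ./eval_export --textcat eval_set

# Evaluate each trained pipeline against the same exported eval data.
python -m spacy evaluate ./model_manual ./eval_export/train.spacy
python -m spacy evaluate ./model_active ./eval_export/train.spacy
```

Since both `spacy evaluate` runs score against the identical `.spacy` file, the resulting metrics should be directly comparable.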