Hi! Under the hood, Prodigy's train recipe calls into spaCy and then runs nlp.evaluate (which returns a Scorer object with the scores).
If you want to use the train command to train your model, the easiest way to report results on your own test data is to use a custom evaluation set (instead of just --eval-split 0.2, which holds back a random 20%). The train command accepts an --eval-id argument, which lets you point to a Prodigy dataset to use for evaluation. So if you have test data in Prodigy's format, you can import it into a new dataset and then use --eval-id name_of_your_test_dataset to evaluate on that data and report those results. This approach is also very useful if you're using Prodigy to create both your training and test data.
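If your test data isn't in Prodigy's format yet, here is a minimal sketch of how it could be converted to a JSONL file for import. It assumes NER-style annotations stored as character offsets. The helper names, the file name test_data.jsonl and the example sentence are made up for illustration, and the commands in the trailing comments assume a Prodigy version whose train recipe takes --eval-id, so please check prodigy train --help for the exact argument order in your install.

```python
import json


def to_prodigy_tasks(examples):
    """Convert (text, [(start, end, label), ...]) pairs into Prodigy-style task dicts."""
    tasks = []
    for text, entities in examples:
        spans = [{"start": start, "end": end, "label": label} for start, end, label in entities]
        # "answer": "accept" marks the example as a correct (gold) annotation
        tasks.append({"text": text, "spans": spans, "answer": "accept"})
    return tasks


def write_jsonl(tasks, path):
    """Write one JSON object per line, which is the layout JSONL files use."""
    with open(path, "w", encoding="utf8") as f:
        for task in tasks:
            f.write(json.dumps(task) + "\n")


# Hypothetical test example with character offsets into the text
test_examples = [
    ("Apple is based in Cupertino.", [(0, 5, "ORG"), (18, 27, "GPE")]),
]
tasks = to_prodigy_tasks(test_examples)
write_jsonl(tasks, "test_data.jsonl")

# Afterwards, on the command line (dataset and model names are placeholders):
#   prodigy db-in name_of_your_test_dataset test_data.jsonl
#   prodigy train ner your_training_dataset en_core_web_sm --eval-id name_of_your_test_dataset
```

Keeping the test set in its own dataset means the same examples are used for every run, so the reported scores stay comparable across experiments, which a random split does not guarantee.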
Alternatively, if you already have your own evaluation pipeline set up in your spaCy code, you could also export your Prodigy annotations with db-out and use them to train your model.
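For the export route, here's a rough sketch of turning the exported JSONL into something your own spaCy code can consume. It assumes NER annotations exported with something like prodigy db-out your_dataset > annotations.jsonl, and it targets the (text, {"entities": [...]}) tuple format used by spaCy v2-style training loops. The function name and the sample lines are hypothetical, and in practice you'd read the lines from the exported file.

```python
import json


def prodigy_to_spacy(lines):
    """Turn exported Prodigy JSONL lines into (text, {"entities": [...]}) tuples."""
    examples = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        task = json.loads(line)
        # Keep only accepted annotations; rejected or ignored ones are skipped
        if task.get("answer") != "accept":
            continue
        entities = [(s["start"], s["end"], s["label"]) for s in task.get("spans", [])]
        examples.append((task["text"], {"entities": entities}))
    return examples


# Hypothetical sample standing in for the contents of the exported file
sample_export = [
    '{"text": "Apple is based in Cupertino.", "spans": [{"start": 0, "end": 5, "label": "ORG"}], "answer": "accept"}',
    '{"text": "Maybe later.", "spans": [], "answer": "reject"}',
]
train_data = prodigy_to_spacy(sample_export)
```

Note that this sketch simply drops rejected examples. If you collected binary accept/reject decisions, you may want to handle those differently, since a reject also carries information.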