I use Prodigy to annotate data, but I generally use spaCy to train the model. What confuses me is that when I run `train-curve` on my dataset, the model actually reaches a higher score than when I train the model on the same dataset, despite using the same config (I'm providing a base model to `train-curve`).
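For context, these are roughly the two invocations I'm comparing. Dataset, config, and model names are placeholders, and the exact flags are from memory, so they may not match my setup precisely:

```shell
# Prodigy's train-curve recipe, given the same base model and config
# ("my_dataset", "config.cfg", and "en_core_web_lg" are placeholder names)
prodigy train-curve --ner my_dataset --config config.cfg --base-model en_core_web_lg

# Plain spaCy training with the same config, after exporting the
# annotations with Prodigy's data-to-spacy recipe
python -m spacy train config.cfg --output ./output \
    --paths.train ./corpus/train.spacy --paths.dev ./corpus/dev.spacy
```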
What's behind this? Is this expected?
Any pointers would be much appreciated!
PS: `train-curve` is very helpful, thank you for including a recipe for this!