Evaluation data for NER model

hi @kaiser!

Thanks for your question.

The Precision/Accuracy/F1 scores are computed on the evaluation data.
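To make the scores a bit more concrete, here's a minimal sketch of how entity-level precision, recall and F1 can be computed on held-out evaluation examples. It assumes exact-match scoring on (start, end, label) spans, which is my understanding of how NER is usually scored; the function name `score_entities` and the sample spans are made up for illustration and aren't part of spaCy or Prodigy.

```python
def score_entities(gold, predicted):
    """Entity-level precision/recall/F1 using exact (start, end, label) matches.

    gold and predicted are lists (one item per evaluation text) of sets of
    (start_char, end_char, label) tuples.
    """
    tp = fp = fn = 0
    for gold_spans, pred_spans in zip(gold, predicted):
        tp += len(gold_spans & pred_spans)   # predicted and correct
        fp += len(pred_spans - gold_spans)   # predicted but not in gold
        fn += len(gold_spans - pred_spans)   # in gold but missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"p": precision, "r": recall, "f": f1}


# Hypothetical evaluation set with two texts; the second span in the first
# text has the right boundaries but the wrong label, so it counts as a miss.
gold = [{(0, 5, "PERSON"), (10, 16, "ORG")}, {(3, 9, "GPE")}]
predicted = [{(0, 5, "PERSON"), (10, 16, "GPE")}, {(3, 9, "GPE")}]
scores = score_entities(gold, predicted)
```

The key point is that `gold` only ever contains examples the model didn't see during training, which is why the scores say something about how well it generalizes.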

Here's a good post that describes more:

That includes this post on understanding the output:

It's important to mention that prodigy train is just a wrapper for spacy train. In fact, it's a quick-and-dirty way to train using spaCy with smart defaults; however, it obscures a lot of important concepts, like creating spaCy binary files (i.e., dedicated training / evaluation datasets) and using spaCy config files.

This post describes a bit more (and why you get best-model and last-model folders):

In general, I recommend moving to data-to-spacy followed by spacy train, rather than prodigy train, once you start developing a more serious pipeline.

This post shows that either approach results in the same model if set up correctly:

You can use dedicated test/eval data with prodigy train, but the eval examples need to be in their own dataset, and you'll need to call prodigy train with the eval: prefix like this:

prodigy train --ner train_data,eval:eval_data ...
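If all of your annotations currently live in a single dataset, you'd first need to split them in two. Here's a rough sketch of a reproducible split over exported annotation records; `split_examples`, the 20% eval fraction and the sample records are assumptions for illustration, and loading from or saving back to Prodigy datasets is left out.

```python
import random


def split_examples(examples, eval_fraction=0.2, seed=0):
    """Shuffle a copy of the examples and split off a held-out evaluation part."""
    shuffled = list(examples)
    # A seeded Random instance keeps the split identical between runs
    random.Random(seed).shuffle(shuffled)
    n_eval = max(1, int(len(shuffled) * eval_fraction))
    return shuffled[n_eval:], shuffled[:n_eval]


# Hypothetical annotation records (one dict per annotated text)
examples = [{"text": f"example {i}", "spans": []} for i in range(10)]
train_data, eval_data = split_examples(examples)
```

The two lists would then go into separate datasets, so that the eval: prefix can point at the held-out one. Keeping the seed fixed matters because you want to compare models against the same evaluation examples every time.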

See this post for more context:

Hope this at least gets you started! Sorry if I included things you already knew or didn't directly ask for, but I'm trying my best to give you a few extra resources that'll answer future questions before you ask :slight_smile:
