How is the evaluation dataset used?

I'm not sure how the evaluation dataset is used. I want to "freeze" a dataset that my models are never allowed to see when training and developing/tuning. Does eval: qualify for that?

See my related question in the spaCy discussions.

Yes, the eval: prefix lets you specify a dataset that's used only for evaluation, i.e. to calculate the accuracy scores. If you're serious about training, you typically want to use a fixed dataset here that never changes, so you can meaningfully compare your results across experiments.
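As a quick sketch (the dataset names and output directory are just placeholders here):

```bash
# train an NER model, using my_eval_set *only* for evaluation
prodigy train ./output --ner my_annotations,eval:my_eval_set
```

If no eval: dataset is given, Prodigy holds back a portion of the training examples for evaluation instead, which is less stable across experiments than a fixed, dedicated evaluation set.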

(Of course, you still need to make sure that no training examples are present in your evaluation data. One way to double-check this is to export your data with data-to-spacy and then run spaCy's debug data on the result, which will tell you if you ended up with duplicates.)
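A minimal sketch of that check, assuming the same placeholder dataset names and an output directory ./corpus:

```bash
# export the annotations to spaCy's binary format, plus a training config
prodigy data-to-spacy ./corpus --ner my_annotations,eval:my_eval_set

# debug data warns about problems in the corpus, including training examples
# that also appear in the evaluation data
python -m spacy debug data ./corpus/config.cfg --paths.train ./corpus/train.spacy --paths.dev ./corpus/dev.spacy
```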


But currently the eval data is actually used to determine when to stop training if patience is set, right? So the eval data is implicitly being used during training. That might be okay, but have I understood it correctly?

The evaluation data is used to calculate the accuracy: the model's predictions are compared against the correct answers in the (unseen) evaluation examples. In the default configuration, that accuracy is then also used to decide when to stop training, i.e. training stops early once the accuracy stops improving.
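If you want to change or disable that behaviour, the relevant setting is patience in the [training] block of the config. As a sketch, reusing the ./corpus export from above (the 0 value, which I believe disables early stopping, and the paths are assumptions for illustration):

```bash
# train with a config override; patience 0 should disable early stopping,
# so the eval data is only used for reporting scores
python -m spacy train ./corpus/config.cfg --paths.train ./corpus/train.spacy --paths.dev ./corpus/dev.spacy --training.patience 0
```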
