Train eval split

nix411 · March 25, 2019, 8:37am

Let's say I have a bunch of annotated data for classification, dataset A. Now I’d like to split this into A-train and A-eval in order to evaluate a couple of different models. Is there an easy way to do this in Prodigy? I am aware of textcat.eval, but that would require me to create additional annotations.

ines · March 25, 2019, 10:56am

The easiest way would probably be to export the examples, shuffle them (!), split them however you want and then re-add them as separate datasets. You probably also want to filter out examples with "answer": "ignore", since those won’t be used during training or evaluation anyway.
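
For illustration, here is a minimal sketch of the shuffle, filter and split steps described above, using only the Python standard library. It assumes the dataset was exported to a JSONL file with one example per line and an "answer" key on each example. The file names, the 20% eval fraction, the fixed seed and the helper names are all hypothetical choices, not something Prodigy prescribes.

```python
import json
import random


def read_jsonl(path):
    """Read one JSON object per non-empty line."""
    with open(path, encoding="utf8") as f:
        return [json.loads(line) for line in f if line.strip()]


def write_jsonl(path, examples):
    """Write one JSON object per line."""
    with open(path, "w", encoding="utf8") as f:
        for eg in examples:
            f.write(json.dumps(eg) + "\n")


def split_examples(examples, eval_fraction=0.2, seed=0):
    """Drop ignored examples, shuffle, and return (train, eval) lists."""
    # Examples answered with "ignore" are not used for training or evaluation.
    kept = [eg for eg in examples if eg.get("answer") != "ignore"]
    # Shuffle with a fixed seed so the split is reproducible.
    random.Random(seed).shuffle(kept)
    n_eval = int(len(kept) * eval_fraction)
    return kept[n_eval:], kept[:n_eval]


# Hypothetical usage, assuming A.jsonl is the exported dataset:
# train, evaluation = split_examples(read_jsonl("A.jsonl"))
# write_jsonl("A-train.jsonl", train)
# write_jsonl("A-eval.jsonl", evaluation)
```

The export and re-import would presumably be done with `prodigy db-out A > A.jsonl` beforehand and `prodigy db-in A-train A-train.jsonl` (and likewise for A-eval) afterwards. Check the exact arguments against the docs for your Prodigy version.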
