Let's say I have a bunch of annotated data for classification, dataset `A`. Now I'd actually like to split this into `A-train` and `A-eval` in order to evaluate a couple of different models. Is there an easy way to do this in Prodigy? I am aware of `textcat.eval`, but that would require me to create additional annotations.
The easiest way would probably be to export the examples, shuffle them (!), split them however you want and then re-add them as separate datasets. You probably also want to filter out examples with `"answer": "ignore"`, since those won't be used during training or evaluation anyway.
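The steps above could be sketched roughly like this. It assumes the dataset has already been exported to a JSONL file (one annotation per line, for example via `prodigy db-out A > A.jsonl`). The file names, the helper functions and the 80/20 split are all just illustrative choices, not anything built into Prodigy.

```python
import json
import random


def read_jsonl(path):
    """Read one JSON object per line, skipping blank lines."""
    with open(path, encoding="utf8") as f:
        return [json.loads(line) for line in f if line.strip()]


def write_jsonl(path, examples):
    """Write one JSON object per line."""
    with open(path, "w", encoding="utf8") as f:
        for eg in examples:
            f.write(json.dumps(eg) + "\n")


def split_examples(examples, eval_fraction=0.2, seed=0):
    """Drop ignored examples, shuffle, and return (train, eval) lists."""
    # Ignored examples aren't used for training or evaluation
    kept = [eg for eg in examples if eg.get("answer") != "ignore"]
    # Seeded shuffle so the split is reproducible across runs
    random.Random(seed).shuffle(kept)
    n_eval = int(len(kept) * eval_fraction)
    return kept[n_eval:], kept[:n_eval]


def split_file(in_path, train_path, eval_path, eval_fraction=0.2, seed=0):
    """Hypothetical convenience wrapper: read, split and write both files."""
    train, evaluation = split_examples(read_jsonl(in_path), eval_fraction, seed)
    write_jsonl(train_path, train)
    write_jsonl(eval_path, evaluation)
```

Calling `split_file("A.jsonl", "A-train.jsonl", "A-eval.jsonl")` should then give you two files that can be re-added as separate datasets, e.g. with `prodigy db-in A-train A-train.jsonl` and the same for `A-eval`. Using a fixed seed means every model you compare is evaluated on the same held-out examples.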