Feature request: more details on train-curve recipe results

bayethiernodiop · May 16, 2020, 7:00pm

I was using the train-curve recipe on an NER dataset with 8 entity types, but the recipe only reports how accuracy evolves across all entities combined, which makes it hard to know which labels need more examples.
In my case, when I run the train recipe, I can see that some entities are predicted better than others.
Thanks.

ines · May 19, 2020, 9:33am

The train-curve recipe runs the training several times with different portions of the data and is intended to give you a rough idea of whether you're on the right track by simulating training with different dataset sizes. Outputting results per label would make the output a lot more verbose and I'm not sure it'd be very useful and conclusive. The results could be pretty arbitrary because you're holding back large portions of the data (e.g. 75% in the first run).

That said, under the hood, all the train-curve recipe does is call into train with different values for --factor (by default, 0.25, 0.5, 0.75 and 1.0). So if you want the detailed results for each experiment, you could just call into train directly.
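
As a rough sketch of what that could look like, the snippet below builds one train command per factor so each run can be started separately and its detailed output inspected. Only train and --factor come from the description above; the helper name `build_factor_runs` and everything in the base command (task, dataset name, model) are hypothetical placeholders, so check the recipe's own help output for the exact arguments in your version.

```python
def build_factor_runs(base_command, factors=(0.25, 0.5, 0.75, 1.0)):
    """Return one command (as a list of arguments) per factor value.

    base_command: the train invocation you would normally run, as a list
    of strings. Its contents are NOT defined here; pass in your own.
    """
    return [list(base_command) + ["--factor", str(f)] for f in factors]


# ASSUMPTION: placeholder base command. Replace the dataset and model
# names (and argument order, if it differs) with what your setup uses.
base = ["prodigy", "train", "ner", "my_ner_dataset", "en_core_web_sm"]

for cmd in build_factor_runs(base):
    # Print the commands rather than executing them, so they can be
    # reviewed first or passed to subprocess.run(cmd) one at a time.
    print(" ".join(cmd))
```

Printing instead of executing keeps the sketch side-effect free; each printed line can be run in a shell to get that experiment's full results.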

bayethiernodiop · May 19, 2020, 8:18pm

I see. I just wanted to know whether adding more annotations for my worst-predicted entity type could help. I think I'll have to do that myself, with stratified sampling and a varying factor.
Thanks
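
One possible reading of the stratified sampling idea, sketched in plain Python: keep every example that does not contain the weak label, and keep only a given fraction of the examples that do. Training on the result for several factors and comparing that label's score would hint at whether more annotations for it help. This assumes examples are dicts with a "spans" list whose items carry a "label" key; the function name `subsample_label` and the label "ORG" are made up for illustration.

```python
import random


def has_label(example, label):
    """True if any span in the example carries the given label."""
    return any(span.get("label") == label for span in example.get("spans", []))


def subsample_label(examples, label, factor, seed=0):
    """Keep all examples without `label` and a `factor` share of those with it."""
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    with_label = [eg for eg in examples if has_label(eg, label)]
    without_label = [eg for eg in examples if not has_label(eg, label)]
    n_keep = round(len(with_label) * factor)
    return without_label + rng.sample(with_label, n_keep)


# Toy data: 8 examples containing the hypothetical label "ORG", 4 without it.
data = [{"text": f"org {i}", "spans": [{"label": "ORG"}]} for i in range(8)]
data += [{"text": f"other {i}", "spans": [{"label": "PERSON"}]} for i in range(4)]

for factor in (0.25, 0.5, 0.75, 1.0):
    subset = subsample_label(data, "ORG", factor)
    n_org = sum(has_label(eg, "ORG") for eg in subset)
    print(factor, len(subset), n_org)
```

Each subset would then be saved and passed to train as usual, with the evaluation data kept identical across runs so the scores stay comparable.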
