Hi!
I have a dataset with 4007 annotations:
Dataset 'energy_patterns'
Dataset energy_patterns
Annotations 4007
Accept 517
Reject 3423
Ignore 67
but when I train a model with
ner.batch-train energy_patterns en_core_web_lg --output model_energy --eval-split 0.2
it uses only 309 examples for training and 50 for evaluation:
Loaded model en_core_web_lg
Using 20% of accept/reject examples (50) for evaluation
Using 100% of remaining examples (309) for training
Dropout: 0.2 Batch size: 10 Iterations: 10
Why is it not using all the annotations in the dataset? I cannot see where the difference comes from. Could you help me with that?
Thanks!