I have a dataset with 4007 annotations:

    Dataset 'energy_patterns'

    Dataset       energy_patterns
    Annotations   4007
    Accept        517
    Reject        3423
    Ignore        67

but when I train a model with

    ner.batch-train energy_patterns en_core_web_lg --output model_energy --eval-split 0.2

it uses only 309 examples for training and 50 for evaluation:

    Loaded model en_core_web_lg
    Using 20% of accept/reject examples (50) for evaluation
    Using 100% of remaining examples (309) for training
    Dropout: 0.2  Batch size: 10  Iterations: 10

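Why is it not using all the annotations in the dataset? Even leaving out the 67 ignored ones, there are 3940 accept/reject annotations, but only 359 examples (309 + 50) are used. I cannot see where the difference comes from. Could you help me with that?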