I am training an NER model to extract the names of injured body parts. For example, if I have the text
The employee dropped the hammer from their left hand and it fell on their foot.
I want to annotate “foot” but not “hand” as INJURED_BODY_PART, so the model has to learn both lexical and contextual information. I am only training this new label and not trying to retain the model’s accuracy on other labels, so I don’t think catastrophic forgetting should be an issue here.
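To make the labelling scheme concrete, here is a hedged sketch of what one annotation for this example might look like in a JSONL-style span format with character offsets (the offsets are computed here for clarity; in practice the annotations come from the Prodigy UI):

```python
# One example sentence, with only the injured body part annotated.
text = ("The employee dropped the hammer from their left hand "
        "and it fell on their foot.")

# Compute character offsets for the span to annotate.
start = text.index("foot")
end = start + len("foot")

example = {
    "text": text,
    "spans": [{"start": start, "end": end, "label": "INJURED_BODY_PART"}],
}

# "hand" is deliberately left unannotated: the model must learn from
# context that only the body part receiving the impact gets the label.
print(example["spans"])
```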
At the end of my annotation session the progress-to-zero-error bar was reading about 90%, and I was accepting most of the suggestions Prodigy was making. Then I ran ner.batch-train and saw the following.
prodigy ner.batch-train safety_new safety2.model --output safety3.model --label INJURED_BODY_PART
Loaded model safety2.model
Using 50% of accept/reject examples (291) for evaluation
Using 100% of remaining examples (643) for training
Dropout: 0.2 Batch size: 32 Iterations: 10
BEFORE 0.010
Correct 7
Incorrect 722
Entities 2273
Unknown 8
LOSS RIGHT WRONG ENTS SKIP ACCURACY
01 7.707 87 642 1344 0 0.119
02 5.680 91 638 245 0 0.125
03 4.709 127 602 360 0 0.174
04 3.040 160 569 418 0 0.219
05 3.283 165 564 375 0 0.226
06 2.980 159 570 402 0 0.218
07 2.381 170 559 448 0 0.233
08 2.665 170 559 706 0 0.233
09 2.734 169 560 777 0 0.232
10 1.581 164 565 798 0 0.225
Correct 170
Incorrect 559
Baseline 0.010
Accuracy 0.233
I would expect the model to perform better on this task. The flat accuracy makes me think I don’t have enough training data. This seems strange, though, because the active-learning model in the Prodigy UI appeared to be doing extremely well, and I would expect that model to underestimate the true accuracy. Yet the accuracy from ner.batch-train is about 20%, while towards the end of the annotation session I was hitting Accept far more than 20% of the time.
My intuition says this is strange: if I am hitting Accept far more than 20% of the time during active learning, I should see much better than 20% cross-validation accuracy. Is this intuition correct, or is there something I’m overlooking?
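One way I have tried to frame the question to myself (with made-up numbers, purely illustrative) is that the Accept rate and the batch-train accuracy are not the same metric: the Accept rate only scores the binary decisions on spans the model chose to show me, whereas the batch evaluation scores the model against every gold entity in the held-out set.

```python
# Toy illustration (hypothetical numbers) of how a high Accept rate
# during a binary annotation session can coexist with low entity-level
# accuracy on held-out data.

# During the session, suppose the model showed 100 suggestions and I
# accepted 85 of them: an 85% Accept rate.
suggestions_shown = 100
accepted = 85
accept_rate = accepted / suggestions_shown

# But suppose the held-out data contains 500 gold entities, and the
# model only predicts correctly on a fraction of them, missing or
# mislabelling the rest that it never surfaced as suggestions.
gold_entities = 500
correct_entities = 110  # hypothetical
entity_accuracy = correct_entities / gold_entities

print(f"Accept rate: {accept_rate:.0%}, entity accuracy: {entity_accuracy:.0%}")
```

If that framing is right, the two numbers measure different populations (suggested spans vs. all gold entities), but I am not sure whether that alone explains a gap this large.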