Adding more gold annotations decreases the accuracy in gold-model

EFgit · January 30, 2019, 10:45am

I’m trying to build a model using only annotations from ner.make-gold. The logic was:

1. Annotate some examples with ner.make-gold.
2. Train a model with ner.batch-train using the --no-missing argument.
3. Repeat the first step with the new model, whose suggestions should get better (so less manual correction is needed).

Commands and output:

…

> prodigy ner.batch-train personal_info_gold_new prodigy_models/personal_info_gold_new2 -o prodigy_models/personal_info_gold_new3 --n-iter 10 --eval-split 0.2 --dropout 0.2 --no-missing

        BEFORE     0.652     
        Correct    15
        Incorrect  8
        Entities   17        
        Unknown    0                                                                                           

        AFTER
        Correct    20
        Incorrect  6
        Baseline   0.652     
        Accuracy   0.769

Added 650 annotations

> prodigy ner.batch-train personal_info_gold_new prodigy_models/personal_info_gold_new3 -o prodigy_models/personal_info_gold_new4 --n-iter 10 --eval-split 0.2 --dropout 0.2 --no-missing

       BEFORE     0.794     
       Correct    50
       Incorrect  13
       Entities   55        
       Unknown    0   

       AFTER
       Correct    46
       Incorrect  23
       Baseline   0.794     
       Accuracy   0.667

So, up to this iteration the results were getting better, and then they suddenly started to get worse.
What could be the problem here?
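
A quick sanity check on the numbers above, as a minimal sketch. It assumes the reported accuracy is simply Correct / (Correct + Incorrect); that formula is inferred from the figures in this post, not taken from the Prodigy documentation. The `accuracy` helper and the run labels are hypothetical names used only for illustration.

```python
def accuracy(correct: int, incorrect: int) -> float:
    """Assumed scoring rule: share of correct predictions among all scored ones."""
    total = correct + incorrect
    return correct / total if total else 0.0


# (Correct, Incorrect) pairs copied from the two training runs above.
runs = {
    "new2 -> new3, BEFORE": (15, 8),
    "new2 -> new3, AFTER": (20, 6),
    "new3 -> new4, BEFORE": (50, 13),
    "new3 -> new4, AFTER": (46, 23),
}

for name, (correct, incorrect) in runs.items():
    scored = correct + incorrect
    print(f"{name}: {accuracy(correct, incorrect):.3f} over {scored} scored predictions")
```

Under that assumption the four values match the reported 0.652, 0.769, 0.794 and 0.667. The number of scored predictions also differs from run to run (23 and 26 in the first run, 63 and 69 in the second), so the two runs were not evaluated on the same examples and their accuracies are not directly comparable.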

ines · January 30, 2019, 12:15pm

Just to confirm: it looks like you’re using one dataset, personal_info_gold_new, and keep updating the model artifact produced in the previous step, right? Can you reproduce the same results if you always update the base model (e.g. en_core_web_sm or whatever else you used)?

EFgit · January 30, 2019, 1:28pm

Yes, personal_info_gold_new is the dataset I created to save the gold data I annotate in each iteration...

I started annotating using the model that I generated using terms.train-vectors (prodigy_models/resumes_model1) like this:

 prodigy ner.make-gold personal_info_gold_new prodigy_models/resumes_model1 data/jsonl/en_complete_316.jsonl --label "PERSON, EMAIL, BIRTH_DATE, PHONE_NUMBER, SOCIAL_MEDIA"

Then for batch-train I used the same model again (from which I think only the tokenizer is used):

prodigy ner.batch-train personal_info_gold_new prodigy_models/resumes_model1 -o prodigy_models/personal_info_gold_new --n-iter 10 --eval-split 0.2 --dropout 0.2

So, if I understand correctly, you are asking whether I can reproduce the same results if I stay with the first model (in this case prodigy_models/resumes_model1) in all the iterations to come?

ines · January 30, 2019, 1:30pm

Yes, exactly. Since you're always updating with the full gold dataset, the result should be the same.
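
One thing that makes runs hard to compare is that --eval-split 0.2 holds back a portion of whatever is currently in the dataset, so the evaluation examples change as annotations are added. A minimal sketch of one way to keep the held-out set stable is below. It is not how Prodigy splits data internally; `is_eval_example` and `split_examples` are hypothetical helpers, and the sketch assumes each example is a dict with a "text" key. The resulting evaluation examples could then be stored in their own dataset and passed to ner.batch-train, which, as far as I know, accepts a dedicated evaluation set via --eval-id (please check the recipe's help output to confirm).

```python
import hashlib


def is_eval_example(text: str, eval_fraction: float = 0.2) -> bool:
    """Map a text to a stable bucket in [0, 1) and compare it to the fraction."""
    digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
    bucket = int(digest[:8], 16) / 0x100000000  # first 32 bits as a fraction
    return bucket < eval_fraction


def split_examples(examples, eval_fraction: float = 0.2):
    """Split examples into (train, evaluation) based only on each text's hash."""
    train, evaluation = [], []
    for eg in examples:
        if is_eval_example(eg["text"], eval_fraction):
            evaluation.append(eg)
        else:
            train.append(eg)
    return train, evaluation
```

Because membership depends only on the text itself, an example that lands in the evaluation set once stays there after more annotations are added, and it never ends up in the training data of a later iteration.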

EFgit · January 30, 2019, 1:47pm

Yes, that is what I thought too, but I got this:

prodigy ner.batch-train personal_info_gold_new prodigy_models/resumes_model1 -o prodigy_models/personal_info_gold_new4 --n-iter 10 --eval-split 0.2 --dropout 0.2 --no-missing

BEFORE 0.004
Correct 9
Incorrect 2534
Entities 2494
Unknown 0

Correct 35
Incorrect 29
Baseline 0.004
Accuracy 0.547

and now I am confused...
