accuracy differs with and without --no-missing

I have a training corpus of sentences tagged with 10 labels. I created a model using ner.batch-train, and then used ner.teach to create new examples with that model in the loop. When I retrained on the extended training corpus, I observed the following:

  1. The accuracy drops when I retrain on the extended corpus. Why does this happen?
  2. I trained the model both with and without the --no-missing parameter. Training without the parameter always gives better accuracy. Why?
  3. I only have those 10 labels, and the initial training corpus was created with ner.manual, so every sentence was fully reviewed. It was then extended with ner.teach. Is it better to use --no-missing despite the reduced accuracy, or to skip it?

I also notice a difference in the data formats. My previously tagged corpus, bootstrapped with a dictionary and manually corrected, is stored in one dataset (say: 'XXXX') and has "token_start" and "token_end" in its spans. This is the dataset I created the initial model with. Keeping that model in the loop, I have ner.teach running and storing the corrected examples in another dataset (say: 'YYYY'). YYYY does not have "token_start" and "token_end" (although spaCy training only needs "start" and "end"). Does this make a difference?

I also notice that the text ID is not stored during ner.teach. (It isn't used in training either, but it does make a difference in the format when I combine the two datasets.)
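To work around format differences like these when merging datasets, one option is to normalize each example down to the fields training actually uses before combining. A minimal sketch, assuming JSONL-style task dicts as described above (the example texts and dataset contents are illustrative, not from the real data):

```python
def normalize_example(eg):
    """Keep only the fields training uses: text plus start/end/label spans."""
    spans = [
        {"start": s["start"], "end": s["end"], "label": s["label"]}
        for s in eg.get("spans", [])
    ]
    return {"text": eg["text"], "spans": spans}


# An example from a manual dataset: spans carry token_start/token_end.
manual_eg = {
    "text": "Apple is in Cupertino.",
    "spans": [{"start": 0, "end": 5, "label": "ORG",
               "token_start": 0, "token_end": 0}],
}
# An example from ner.teach: no token offsets, just character offsets.
teach_eg = {
    "text": "Berlin is nice.",
    "spans": [{"start": 0, "end": 6, "label": "GPE"}],
}

# Both examples now share one consistent shape.
combined = [normalize_example(eg) for eg in (manual_eg, teach_eg)]
```

This drops extras like "token_start", "token_end", or a text ID on one side only, so the combined corpus has a uniform format.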

Yes, this makes sense – the way the training data is interpreted can have a very significant impact on accuracy. The results you’re seeing also likely come down to the evaluation: If you’re training without --no-missing, all unannotated values are treated as missing (i.e. as values we don’t know anything about). If you train from complete annotations, we assume that we know everything about the data and everything that’s annotated is the truth – which lets you run a much more fine-grained evaluation. Do you have a dedicated evaluation set?

If you’ve collected annotations with a manual recipe and have annotated everything that’s in it, set the --no-missing flag during training. If you’ve collected binary annotations with a recipe like ner.teach, run the training without the flag.

The reason for this is that the manual recipes pre-tokenize the text so your selection can “snap” to the token boundaries and you can label things manually faster. So the annotations you create here will have the tokens present, which is not required for other interfaces. This shouldn’t be relevant for training or anything, since those properties aren’t used. Prodigy really only looks at the text and the span offsets.

Thank you for the detailed explanation.
I don't have a dedicated evaluation set yet. I plan to create one, but I'm not there yet.
So my issue is now this: I have the initial corpus annotated with the manual recipe, which I used to create a model, and my next set of annotations came from ner.teach. How do I combine them? One has complete annotations, the other is binary. If I combine the datasets, training runs, but the accuracy drops compared to training on the manually corrected dataset alone.

One option would be to pre-train the model using the initial manually created examples and --no-missing, and then update that base model with binary annotations.
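With Prodigy's v1 CLI, that two-step workflow could look roughly like this (the dataset names and output paths here are placeholders for your own setup):

```shell
# Step 1: train a base model from the complete manual annotations.
# "manual_dataset" is a placeholder for your fully annotated dataset.
prodigy ner.batch-train manual_dataset en_core_web_sm \
    --output /tmp/base-model --no-missing

# Step 2: update that base model with the binary ner.teach annotations,
# this time without --no-missing, since the binary data is incomplete.
prodigy ner.batch-train teach_dataset /tmp/base-model \
    --output /tmp/updated-model
```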

Alternatively, once you know that your data is suitable and your approach is working, you can also transform the “silver-standard” binary data to “gold-standard” full data. Here’s an example recipe that shows how this can look:
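The core of the silver-to-gold idea can be sketched in plain Python (this is a simplified, self-contained illustration, not the full Prodigy recipe): group the binary answers by text, keep the accepted spans as pre-filled annotations, and record the rejected spans as constraints for the review pass. The function and field names are assumptions for the sketch:

```python
from collections import defaultdict


def silver_to_gold_candidates(binary_examples):
    """Merge binary accept/reject answers into one candidate per text.

    Accepted spans become the pre-filled annotation; rejected spans are
    kept so a review pass doesn't need to re-ask about them.
    """
    by_text = defaultdict(lambda: {"spans": [], "rejected": []})
    for eg in binary_examples:
        entry = by_text[eg["text"]]
        for span in eg.get("spans", []):
            span = {"start": span["start"], "end": span["end"],
                    "label": span["label"]}
            if eg["answer"] == "accept":
                entry["spans"].append(span)
            elif eg["answer"] == "reject":
                entry["rejected"].append(span)
    return [{"text": text, **data} for text, data in by_text.items()]


# Two binary decisions about the same sentence collapse into one candidate.
binary = [
    {"text": "Berlin is nice.", "answer": "accept",
     "spans": [{"start": 0, "end": 6, "label": "GPE"}]},
    {"text": "Berlin is nice.", "answer": "reject",
     "spans": [{"start": 10, "end": 14, "label": "GPE"}]},
]
gold = silver_to_gold_candidates(binary)
```

The merged candidates can then be queued up for one final manual pass, where you only fill in what the binary annotations didn't already decide.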

I tried using this on the binary-tagged data, which was annotated with ner.teach. That means I have already gone through the entire dataset with the model in the loop, accepting or rejecting its suggestions. Now this would mean validating everything manually again with the trained model, which seems redundant. Is there a way to combine the data without going through it again?

Ultimately, if you want gold-standard data, you need to find a way to label everything that's in the data. If you already have binary annotations, you can use them as constraints to help you, so you don't have to start from scratch. That's exactly what the silver-to-gold workflow does.