Only one entity per example in evaluation dataset

Hi, I've got a question.

I collected annotations with ner.teach and updated the en_core_web_lg model with them. When I then evaluated the dataset using the scorer, I got really bad precision and recall. I think I know the problem, but maybe you can confirm whether I'm missing something:

When ner.teach asks me its binary questions, only one entity is ever annotated per example. So every example in the evaluation.jsonl file contains just that one entity, and is missing the other entities that are probably also in the text but were never asked about. Right?
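For example, a line in my evaluation.jsonl looks roughly like this (text and offsets made up here, but this is the shape ner.teach saves, with a single span per task):

```json
{"text": "Apple opened a new office in Berlin.", "spans": [{"start": 0, "end": 5, "label": "ORG"}], "answer": "accept"}
```

So "Berlin" never appears as a span anywhere, even though it's presumably also an entity.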

How do I get an evaluation dataset that contains all the entities? Do I have to create ner.make-gold data anyway, even if I work with ner.teach, in order to evaluate my dataset? If so, does the scorer know which data to use (the binary annotations from ner.teach or the data from ner.make-gold), or do I have to create a new Prodigy dataset and run the evaluation on that one?

Thank you for your help!

Hi,

I think you've hit on exactly the right issue here. The ner.teach recipe gives you binary questions, so the model has to guess about the entities that aren't annotated: it doesn't know whether an unannotated span is correct or incorrect.

Now, the binary questions still provide enough information to correct some errors, if the model is already quite good at the entity types you're working on. So it's a way to go from, say, 85% to 90% accuracy with less annotation effort. But if you're starting at low accuracy (because you're working on a new entity type), the ner.teach recipe isn't so helpful.

For a new entity type, you should probably start with the ner.manual recipe, to just start annotating in a fairly simple way. Then, once you have enough data, you can use ner.batch-train with the --no-missing flag. That flag tells the model that the annotations are complete, i.e. there are no missing entities. That way, if the model predicts an incorrect entity during training, the loss can be calculated to penalise it.
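In practice the workflow could look something like this (dataset and file names here are made up, and you should check `prodigy ner.batch-train --help` for the exact options in your version; if I remember right, `--eval-id` lets you point the evaluation at a separate gold dataset, which also answers your question about which data the scorer uses):

```shell
# Create a complete, gold-standard annotation set by hand
prodigy ner.manual gold_eval_set en_core_web_lg my_texts.jsonl --label "PERSON,ORG,GPE"

# Train on your annotations, treating them as complete,
# and evaluate against the separate gold dataset
prodigy ner.batch-train my_annotations en_core_web_lg --no-missing --eval-id gold_eval_set
```

Keeping the gold evaluation data in its own dataset is the important part, so your binary ner.teach answers never get mixed into the evaluation.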

Let me know if it's still not clear, but I think from the sound of it your thinking is definitely on the right track :slight_smile: