Model tagging all texts as labels

jsnleong · July 15, 2019, 6:14am

Hi,

I have begun my training on a blank ‘ner’ spaCy model, and have done some pattern bootstrapping, as well as ner.teach.

After that, I batch-trained my model and got a very low accuracy of 31%. When I tested the output model against some generic text, I realised that it tags almost ALL of the text (sometimes a single token, sometimes a span of 2 words) with the label.

What seems to be the problem here? Thanks!

honnibal · July 16, 2019, 11:20am

If you’re starting from scratch, the ner.teach recipe isn’t always the most efficient, as it does take some time for the model to learn. You might be better off using the --no-missing flag for the initial training. This tells the model that there are no entities in the data that weren’t included in your annotation. By default, the ner.teach recipe doesn’t let the model assume that, because you’re only saying yes or no to specific suggestions.
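To illustrate the difference this assumption makes, here is a minimal sketch in plain Python. It is not Prodigy's or spaCy's actual implementation: the `token_labels` helper, the "?" marker for unknown tokens, and the example sentence are all made up for illustration.

```python
def token_labels(tokens, spans, no_missing):
    """Assign one label per token from a list of annotated spans.

    spans: list of (start_token, end_token_exclusive, label) tuples.
    Tokens outside every span become "O" (known not to be an entity)
    when no_missing is True, and "?" (unknown) otherwise.
    """
    fill = "O" if no_missing else "?"
    labels = [fill] * len(tokens)
    for start, end, label in spans:
        for i in range(start, end):
            labels[i] = label
    return labels


# Hypothetical example: one annotated ORG span, nothing else marked.
tokens = ["Apple", "opened", "a", "new", "store", "today"]
spans = [(0, 1, "ORG")]

# Sparse annotations: the model only knows about "Apple".
sparse = token_labels(tokens, spans, no_missing=False)

# Complete annotations: every other token is a known negative.
complete = token_labels(tokens, spans, no_missing=True)

print(sparse)
print(complete)
```

In the sparse case the model gets no signal about the unannotated tokens, so nothing stops it from predicting the label everywhere. In the complete case every unannotated token counts as evidence of what is not an entity.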

I think what’s happening is that you haven’t said “reject” to many of the suggestions, since they’re all from your patterns file. This doesn’t give the model much to learn from: it isn’t learning what isn’t an entity. If you keep annotating, you should get some more suggestions from the model, which will let you tell it what isn’t an entity. But it might still take longer to learn this way than with a slightly different approach, where you annotate manually and train with the --no-missing flag.
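One quick way to check whether this is the issue is to count how many of your annotations were rejects. The snippet below is only a sketch: it assumes each annotated example is a dict with an "answer" field holding values such as "accept" or "reject", and the in-memory `examples` list is invented data standing in for whatever you export from your own dataset.

```python
from collections import Counter


def answer_counts(examples):
    """Count how often each answer occurs in a list of annotated examples."""
    return Counter(eg.get("answer", "unknown") for eg in examples)


# Hypothetical annotations; real ones would come from your dataset.
examples = [
    {"text": "first suggestion", "answer": "accept"},
    {"text": "second suggestion", "answer": "accept"},
    {"text": "third suggestion", "answer": "accept"},
    {"text": "fourth suggestion", "answer": "reject"},
]

counts = answer_counts(examples)
reject_share = counts["reject"] / sum(counts.values())

print(counts)
print(f"share of rejects: {reject_share:.0%}")
```

If the share of rejects is very small, the model has seen almost no negative examples, which would fit the behaviour you describe.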
