ner.teach suggests spaces as entities?

That’s right, you can keep running the ner.teach / ner.batch-train loop to keep fixing those mislabelings. (ner.teach does some updating as you annotate, but apparently not enough to stop it asking about whitespace and periods.) You’ll probably want to just overwrite each output model unless you’re interested in how accuracy changes with the new annotations. And make sure you use de_core_news_sm as the base model every time you run ner.batch-train, not the previously saved model.
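Concretely, the loop could look something like this. This is just a rough sketch: the dataset name, label, source file, and output directory are placeholders, and I’m assuming the usual Prodigy 1.x recipe arguments.

```
# Round 1: annotate with the base model, then train from the base model
prodigy ner.teach disease_ner de_core_news_sm ./texts.jsonl --label DISEASE
prodigy ner.batch-train disease_ner de_core_news_sm --output ./disease-model --label DISEASE

# Round 2: annotate with the updated model, but still train from de_core_news_sm,
# overwriting the same output directory each time
prodigy ner.teach disease_ner ./disease-model ./texts.jsonl --label DISEASE
prodigy ner.batch-train disease_ner de_core_news_sm --output ./disease-model --label DISEASE
```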

My NER work has just been on updating existing labels, so I’m not sure how many annotations you’ll need; it also depends on the accuracy you’re aiming for. Maybe 1000 accepts?

If most of your examples are rejects, you can try shifting the bias parameter upward so the stream favors higher-probability suggestions. See question 73.

Finally, if you’d like the model to also remember how to label people, locations, etc., you’ll need to look into strategies for overcoming the “catastrophic forgetting” problem that comes from it only seeing your disease label for a while. See the blog post or search the support forum for some ideas. But if you just need disease labels, don’t worry about it.
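In case you do need the other labels, one common approach (sometimes called pseudo-rehearsal) is to have the original model annotate some raw text and mix those predictions back into your training data as accepted examples, so the model keeps seeing PER/LOC/ORG while it learns the disease label. Here’s a minimal sketch of that idea; the file names are placeholders.

```python
# Sketch: generate "revision" examples from the original model's own predictions
# and save them in Prodigy's JSONL task format.
import json
import spacy

nlp = spacy.load("de_core_news_sm")  # the original base model

with open("raw_sentences.txt", encoding="utf8") as f:
    raw_texts = [line.strip() for line in f if line.strip()]

with open("revision_examples.jsonl", "w", encoding="utf8") as out:
    for doc in nlp.pipe(raw_texts):
        # Keep the model's existing entity predictions as accepted spans
        spans = [{"start": ent.start_char, "end": ent.end_char, "label": ent.label_}
                 for ent in doc.ents]
        task = {"text": doc.text, "spans": spans, "answer": "accept"}
        out.write(json.dumps(task) + "\n")
```

You could then add these to your dataset with something like `prodigy db-in disease_ner revision_examples.jsonl` before the next ner.batch-train run, so the old labels stay represented in the training mix.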