ner.teach suggests spaces as entities?

That’s right, you can keep running the ner.teach / ner.batch-train loop to keep fixing those mislabelings. (ner.teach does some updating as you annotate, but apparently not enough to stop it asking about whitespace and periods.) You’ll probably want to just overwrite each output model unless you’re interested in how accuracy changes with the new annotations. And make sure you use de_core_news_sm as the base model every time you run ner.batch-train, not the previously saved model.
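Concretely, the loop could look something like this. This is just a rough sketch: the dataset name, label, source file, and output directory are placeholders, and I’m assuming the usual Prodigy 1.x recipe arguments.

```
# Round 1: annotate with the base model, then train from the base model
prodigy ner.teach disease_ner de_core_news_sm ./texts.jsonl --label DISEASE
prodigy ner.batch-train disease_ner de_core_news_sm --output ./disease-model --label DISEASE

# Round 2: annotate with the updated model, but still train from de_core_news_sm,
# overwriting the same output directory each time
prodigy ner.teach disease_ner ./disease-model ./texts.jsonl --label DISEASE
prodigy ner.batch-train disease_ner de_core_news_sm --output ./disease-model --label DISEASE
```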

My NER work has just been on updating existing labels, so I’m not sure how many annotations you’ll need; it also depends on the accuracy you’re aiming for. Maybe 1000 accepts?

If most of your examples are rejects, you can try shifting the bias parameter upward so the stream favors higher-probability suggestions. See question 73.

Finally, if you’d like the model to also remember how to label people, locations, etc., you’ll need to look into strategies for overcoming the “catastrophic forgetting” problem that comes from it only seeing your disease label for a while. See the blog post or search the support forum for some ideas. But if you just need disease labels, don’t worry about it.
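In case you do need the other labels, one common approach (sometimes called pseudo-rehearsal) is to have the original model annotate some raw text and mix those predictions back into your training data as accepted examples, so the model keeps seeing PER/LOC/ORG while it learns the disease label. Here’s a minimal sketch of that idea; the file names are placeholders.

```python
# Sketch: generate "revision" examples from the original model's own predictions
# and save them in Prodigy's JSONL task format.
import json
import spacy

nlp = spacy.load("de_core_news_sm")  # the original base model

with open("raw_sentences.txt", encoding="utf8") as f:
    raw_texts = [line.strip() for line in f if line.strip()]

with open("revision_examples.jsonl", "w", encoding="utf8") as out:
    for doc in nlp.pipe(raw_texts):
        # Keep the model's existing entity predictions as accepted spans
        spans = [{"start": ent.start_char, "end": ent.end_char, "label": ent.label_}
                 for ent in doc.ents]
        task = {"text": doc.text, "spans": spans, "answer": "accept"}
        out.write(json.dumps(task) + "\n")
```

You could then add these to your dataset with something like `prodigy db-in disease_ner revision_examples.jsonl` before the next ner.batch-train run, so the old labels stay represented in the training mix.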