Let's say you want to split GPE into COUNTRY, CITY and MISC. You can at least pre-tag a bunch of text with the initial model, and initially only annotate the examples it has labelled as GPE. You could do this in the textcat interface. You'll probably also want to group the examples, so that you only have to annotate "America" once. If some of your phrases are ambiguous, you could flag them. Alternatively, if you do want to annotate every instance rather than every type, it'll be efficient to order the queue so that you do all the "America" instances at once. That way you can click through quickly.
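As a rough sketch of the grouping and ordering idea (the mention list here is made up; in practice you'd collect these spans from the initial model's GPE predictions):

```python
from collections import defaultdict

# Hypothetical pre-tagged mentions the initial model labelled as GPE,
# as (span text, containing sentence) pairs.
gpe_mentions = [
    ("America", "She moved to America in 2010."),
    ("Berlin", "Berlin has great museums."),
    ("America", "America won the relay."),
    ("Paris", "He flew to Paris."),
    ("Berlin", "The wall fell in Berlin."),
]

def group_by_surface_form(mentions):
    """Group mentions by their exact text, so each unique string
    only needs one annotation decision."""
    groups = defaultdict(list)
    for span, sentence in mentions:
        groups[span].append(sentence)
    return groups

# Annotate per type: one decision covers all identical spans.
groups = group_by_surface_form(gpe_mentions)

# Or, if annotating per instance, order the queue so identical
# spans are adjacent and you can click through them quickly.
queue = sorted(gpe_mentions, key=lambda mention: mention[0])
```

Whether you annotate per type or per instance, the win is the same: you make the "America is a COUNTRY" decision once and reuse it.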
Of course, there will still be countries and cities the model didn't initially tag as GPE. But doing this first correction step will give you a lot of examples quickly, so you can get the initial model trained. Once it's working, you can use either the ner.teach or the ner.make-gold interface to fill in the missing entities.
If you're only interested in entity types that aren't in the initial model, there's not much to gain from resuming training: updates for the new types can degrade the existing weights, so it's probably going to hurt more than it helps.
You can still start teaching the model entity types it wasn't trained with, using the ner.teach interface. But to do that, you need to specify a patterns file. The patterns file is used to start suggesting some entities of the new type. Once you've accepted some of these suggestions, they're used as training examples, so the model can start making its own suggestions.