I have around 15 entities to recognise in a document, 13 of which are custom (the other two are ORG and DATE).
These are the steps I followed:
- I ran ner.manual on the 13 custom entities and, separately for each, ran ner.batch-train on top of en_core_web_sm. We got the required accuracy for each entity with the --no-missing option, and each of the models performs well on unseen data.
- To manage them more easily in a single model, I tried merge_spans and ran ner.batch-train to create a consolidated model on top of en_core_web_sm. The overall accuracy reported is approximately equal to the lowest of the 15, which I assume is expected. What is confusing, however, is that an entity recognised by its standalone model in step 1 is not detected by the combined model.
I might be misunderstanding your question slightly; if so, apologies! Are you just wondering why the combined model might make different mistakes (or even more mistakes) than the separate models? If so, I’m afraid there probably isn’t an interesting answer for why the predictions come out different on a particular text. Even just training a model twice with a different shuffling of the data can lead to a slightly different model, which makes slightly different mistakes.
On the other hand, if a whole entity type is being recognised very poorly in the combined model, that might indicate something more significant is going wrong. Have you checked that the entity annotations don’t overlap? For instance, it’s easy to first annotate an entity like “University of Chicago” as an ORG, and then in the second pass annotate “Chicago” as a LOC. The annotations overlap here, so the model won’t be able to recover both. If there are a lot of these confusions between the annotations, that could explain why the combined model performs worse on some entity type.
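As a rough way to check for this, here is a minimal sketch that flags overlapping spans within a single merged example. It assumes each example is a dict with a "text" string and a "spans" list of character offsets ("start", "end", "label"); the function name find_overlaps and the sample sentence are made up for illustration and are not part of Prodigy.

```python
from itertools import combinations


def find_overlaps(example):
    """Return pairs of spans in one example whose character ranges intersect.

    Assumes the example is a dict with a "spans" list, where every span has
    "start", "end" (exclusive) and "label" keys.
    """
    spans = sorted(example.get("spans", []), key=lambda s: (s["start"], s["end"]))
    overlaps = []
    for first, second in combinations(spans, 2):
        # After sorting, `first` starts no later than `second`, so the two
        # overlap exactly when `second` starts before `first` ends.
        if second["start"] < first["end"]:
            overlaps.append((first, second))
    return overlaps


# Hypothetical merged example reproducing the ORG/LOC clash described above.
example = {
    "text": "She studied at the University of Chicago.",
    "spans": [
        {"start": 19, "end": 40, "label": "ORG"},  # "University of Chicago"
        {"start": 33, "end": 40, "label": "LOC"},  # "Chicago"
    ],
}

for first, second in find_overlaps(example):
    text = example["text"]
    print(
        first["label"], repr(text[first["start"]:first["end"]]),
        "overlaps",
        second["label"], repr(text[second["start"]:second["end"]]),
    )
```

Running something like this over every example in the merged data should give a quick idea of whether overlaps are rare or systematic for a particular pair of labels.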
More generally, you might want to have a closer look at the combined data, just to make sure there are no obvious problems. The ner.print-dataset recipe is useful for that.
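Alongside reading through the data, a quick per-label count can show whether an entity type lost annotations during the merge. The sketch below assumes the combined examples are available as a list of dicts with "spans" and an "answer" field (treated as "accept" if missing); count_labels and the sample records are hypothetical.

```python
from collections import Counter


def count_labels(examples):
    """Count spans per label across accepted examples.

    Assumes each example is a dict with an optional "answer" field and a
    "spans" list whose items carry a "label" key.
    """
    counts = Counter()
    for eg in examples:
        # Skip anything that was explicitly rejected or ignored.
        if eg.get("answer", "accept") != "accept":
            continue
        for span in eg.get("spans", []):
            counts[span["label"]] += 1
    return counts


# Hypothetical records standing in for the merged dataset.
examples = [
    {"text": "Acme hired her in May.", "answer": "accept",
     "spans": [{"start": 0, "end": 4, "label": "ORG"},
               {"start": 18, "end": 21, "label": "DATE"}]},
    {"text": "Acme again.", "answer": "reject",
     "spans": [{"start": 0, "end": 4, "label": "ORG"}]},
    {"text": "No entities here.", "answer": "accept", "spans": []},
]

for label, n in sorted(count_labels(examples).items()):
    print(label, n)
```

If a label's count in the combined data is much lower than in its standalone dataset, that is a hint that the merge step, rather than training, is where the entity went missing.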