Annotating a few labels plus a new label, but training on all labels

Hi,

I have annotated a good number of examples for labels like PERSON, ORG, PRODUCT, CARDINAL, DATE and TECH (a new label I created using terms.teach). For batch training the model, I am passing all labels as below.
--label [PERSON,DATE,ORG,GPE,LOC,TECH,CARDINAL,ORDINAL,LAW,WORK_OF_ART,EVENT,PRODUCT,FACILITY,NORP,LANGUAGE,TIME,PERCENT,MONEY,QUANTITY]
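
For context, the full command looks roughly like this (assuming the ner.batch-train recipe; the dataset name, base model and output path here are placeholders for my setup):

```bash
prodigy ner.batch-train my_ner_dataset en_core_web_lg \
  --output /tmp/model \
  --label "[PERSON,DATE,ORG,GPE,LOC,TECH,CARDINAL,ORDINAL,LAW,WORK_OF_ART,EVENT,PRODUCT,FACILITY,NORP,LANGUAGE,TIME,PERCENT,MONEY,QUANTITY]" \
  --n-iter 10 --eval-split 0.2
```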

Is there any side effect on the accuracy of the other labels by not having annotations (even rejects) for some of these labels?

I’ve tried to design the training algorithm to minimise this problem, so hopefully there won’t be much side effect on the accuracy of the other labels. However, the answer is ultimately empirical: you’ll need to run a test and see what’s happening on your data.
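
Something like this sketch works for that test (model paths and sample texts are placeholders; it counts, per label, where the retrained model's entity predictions diverge from the original model's on held-out unannotated text):

```python
from collections import Counter

import spacy

# Placeholder paths: the original base model and the batch-trained output.
base_nlp = spacy.load("en_core_web_lg")
new_nlp = spacy.load("/tmp/model")

sample_texts = [
    "Apple hired Jane Doe in March 2018 for $1,000,000.",
    # ... more held-out, unannotated texts from your corpus
]

def entity_set(nlp, text):
    # Represent each prediction as (start, end, label) so the sets are comparable.
    return {(ent.start_char, ent.end_char, ent.label_) for ent in nlp(text).ents}

lost = Counter()    # entities the base model predicted that the new model dropped
gained = Counter()  # entities only the new model predicts

for text in sample_texts:
    before = entity_set(base_nlp, text)
    after = entity_set(new_nlp, text)
    for _, _, label in before - after:
        lost[label] += 1
    for _, _, label in after - before:
        gained[label] += 1

print("Dropped per label:", dict(lost))
print("Gained per label:", dict(gained))
```

If the "dropped" counts pile up on labels you never annotated, that's the side effect you're asking about.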

If you’re seeing that the accuracy does decline on these unlabelled entity types, you might try including some text for which you have no annotations. This should stabilise the training somewhat, by encouraging the model to stick to its original behaviour. If that still doesn’t work, you could try adding the initial annotations to those unlabelled sentences as gold annotations.
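
If you go that route, one way to generate those gold annotations is to run the original model over the unlabelled sentences and save its predictions in Prodigy's JSONL task format (a sketch; the file names are placeholders, and I'm assuming tasks with "text", "spans" and "answer" keys):

```python
import json

import spacy

base_nlp = spacy.load("en_core_web_lg")

# Placeholder input: one plain-text sentence per line.
with open("unlabelled_sentences.txt") as f:
    texts = [line.strip() for line in f if line.strip()]

with open("gold_annotations.jsonl", "w") as out:
    for text in texts:
        doc = base_nlp(text)
        spans = [
            {"start": ent.start_char, "end": ent.end_char, "label": ent.label_}
            for ent in doc.ents
        ]
        # Treat the original model's predictions as accepted gold annotations.
        task = {"text": text, "spans": spans, "answer": "accept"}
        out.write(json.dumps(task) + "\n")
```

You can then import the file into a dataset with db-in and include it in the batch training.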
