Hi,
I am a bit confused by the ner.make-gold workflow. If the model gives me a suggestion where the labels are incorrect, am I supposed to correct the suggestions and then click accept or should I just reject it?
The goal of the ner.make-gold workflow is to produce gold-standard data – i.e. annotations that are complete and “perfect”. In ner.teach, you just give the model binary feedback on different analyses of the text – but in ner.make-gold, the idea is that you correct the entities until the example is complete and all entities are labelled, and then accept it. If you come across a sentence that includes no entities, you would simply accept the unlabelled sentence.
I normally use the “reject” action to explicitly mark examples that are wrong for other reasons – for example, if the tokenization is bad or if it includes bad markup etc.
(Btw, you could also create gold-standard data by hand using ner.manual, but correcting the model’s predictions is often faster, because there’s always a chance that the model gets at least some of the entities right.)
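To make the accept/reject distinction concrete, here is a rough sketch of how the resulting annotations could be separated afterwards. It assumes the exported examples are dictionaries with "text", "spans" and "answer" keys; the split_by_answer helper and the sample records are made up for illustration and are not part of Prodigy.

```python
def split_by_answer(examples):
    """Separate accepted (gold) examples from explicitly rejected ones.

    Assumes each example is a dict with an "answer" key set to
    "accept" or "reject"; anything else is skipped.
    """
    gold, rejected = [], []
    for eg in examples:
        answer = eg.get("answer")
        if answer == "accept":
            # Accepted means: the spans are complete and correct,
            # even if that means there are no spans at all.
            gold.append(eg)
        elif answer == "reject":
            # Rejected means: the example itself is unusable,
            # e.g. bad tokenization or leftover markup.
            rejected.append(eg)
    return gold, rejected


# Hypothetical records, shaped like the assumed export format.
examples = [
    {
        "text": "Apple hired Tim Cook.",
        "spans": [
            {"start": 0, "end": 5, "label": "ORG"},
            {"start": 12, "end": 20, "label": "PERSON"},
        ],
        "answer": "accept",  # corrected until complete, then accepted
    },
    {
        "text": "It rained all day.",
        "spans": [],
        "answer": "accept",  # no entities, accepted as-is
    },
    {
        "text": "<div>Bro ken</div>",
        "spans": [],
        "answer": "reject",  # bad markup, not usable as gold data
    },
]

gold, rejected = split_by_answer(examples)
```

The second record is the important case: an accepted example with an empty "spans" list still counts as gold data, because it tells the model that the sentence contains no entities.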