teach versus silver versus gold

I am very confused about what exactly is a regular dataset versus silver versus gold. How does ner.teach differ from ner.make_gold and what is ner.silver_to_gold? Can you please explain or point me to the documentation where this is explained?


In layman's terms, ner.teach is for labelling while training your model in the loop. You are shown the NER suggestions that the model is least certain about, and by accepting or rejecting each one you train the model efficiently. This creates silver-standard labels: your sentences are not fully annotated, but individual spans carry binary (yes/no) decisions for particular entity classes.
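To make "least certain" a bit more concrete, here is a rough sketch in plain Python. It is not Prodigy's actual code: the field names (span, score, answer) and the most_uncertain helper are made up for illustration, and I'm assuming "uncertain" means a score close to 0.5.

```python
# Toy illustration of uncertainty-based selection and binary (silver) labels.
# Nothing here calls Prodigy; all names are hypothetical.

def most_uncertain(candidates, n=2):
    """Return the n candidates whose score is closest to 0.5."""
    return sorted(candidates, key=lambda c: abs(c["score"] - 0.5))[:n]

# Entity suggestions with made-up model scores.
candidates = [
    {"text": "Apple opened a store in Paris.", "span": "Apple", "label": "ORG", "score": 0.97},
    {"text": "Apple opened a store in Paris.", "span": "Paris", "label": "GPE", "score": 0.55},
    {"text": "I met Jordan in May.", "span": "Jordan", "label": "PERSON", "score": 0.48},
    {"text": "I met Jordan in May.", "span": "May", "label": "DATE", "score": 0.10},
]

# The annotator only sees the suggestions the model is unsure about.
queue = most_uncertain(candidates)

# Simulated yes/no decisions: each record covers ONE span, not the whole sentence.
silver = [dict(c, answer=a) for c, a in zip(queue, ["accept", "reject"])]

for record in silver:
    print(record["span"], record["label"], record["answer"])
```

Note that after this step "Apple" and "May" have no decision at all, which is why the result is only silver standard.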

With ner.make_gold, you are shown entire sentences, labelled as the model has predicted (the better your model, the more accurate these labels will be). Your task is to turn these predicted labels into gold-standard annotations by manually correcting them.

ner.silver_to_gold takes the binary annotations created with ner.teach and uses them to predict the most likely full set of labels for each of those sentences, which you can then correct to 'gold standard' as in ner.make_gold. The difference is the starting point: make_gold starts from the model's predictions alone, while silver_to_gold builds on your quickly created binary labels and turns them into fully annotated data.
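As a rough picture of how binary decisions can narrow down a full annotation, here is a simplified sketch. It is an assumption on my part about the general idea, not how Prodigy implements it: the real recipe uses the model to find the best analysis under these constraints, while this toy version only filters a list of predicted spans. The suggest_gold helper and the (start, end, label) tuples are hypothetical.

```python
# Toy sketch: use accept/reject decisions as constraints on predicted spans.
# Spans are (start, end, label) character offsets into TEXT.

TEXT = "I met Jordan in Paris in May."

def overlaps(a, b):
    """True if two (start, end, label) spans share at least one character."""
    return a[0] < b[1] and b[0] < a[1]

def suggest_gold(predicted, binary):
    """Combine model predictions with binary decisions into one suggestion."""
    accepted = {b["span"] for b in binary if b["answer"] == "accept"}
    rejected = {b["span"] for b in binary if b["answer"] == "reject"}
    # Drop anything the annotator explicitly said no to.
    kept = set(predicted) - rejected
    # Drop predictions that conflict with a span the annotator said yes to.
    kept = {k for k in kept if not any(overlaps(k, a) and k != a for a in accepted)}
    return sorted(kept | accepted)

# Made-up model output for the whole sentence.
predicted = [(6, 12, "GPE"), (16, 21, "GPE"), (25, 28, "PERSON")]

# Binary decisions collected earlier (silver standard).
binary = [
    {"span": (6, 12, "PERSON"), "answer": "accept"},   # "Jordan" is a PERSON
    {"span": (25, 28, "PERSON"), "answer": "reject"},  # "May" is not a PERSON
]

suggestion = suggest_gold(predicted, binary)
for start, end, label in suggestion:
    print(TEXT[start:end], label)
```

The suggestion still has no label for "May", so this is where the manual correction step comes in: you would add or fix whatever is missing before saving the sentence as gold.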

I'm not sure if everything is clear, but I'm happy to try and answer more questions, or to be corrected if I myself have anything wrong here.
