After many experiments, I found the problem!
I was using ner.make-gold the wrong way.
I thought I needed to reject the model's incorrect NER predictions even after I had already corrected them.
After viewing the contents of the Prodigy SQLite database, it turns out Prodigy doesn't record the model's original prediction, only the corrected final result.
So my dataset basically shows that I rejected ~80% of what were (after my corrections) correct answers, and accepted only the ~20% of predictions the model got right on its own. The model struggles because it's training on contradictory data.
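For anyone who wants to check this on their own data, here's a minimal sketch of how I inspected the stored answers through Prodigy's Python database API (the dataset name ner_data is just a placeholder for your own):

```python
from collections import Counter

from prodigy.components.db import connect

# Connect to the default Prodigy database (an SQLite file under ~/.prodigy)
db = connect()

# "ner_data" is a placeholder -- substitute your own dataset name
examples = db.get_dataset("ner_data")

# Each stored example keeps the final (corrected) spans plus the answer
# that was clicked; the model's original prediction is not stored anywhere.
print(Counter(eg["answer"] for eg in examples))
# A large majority of "reject" answers was the giveaway in my case.
```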
I deleted the entire SQLite database, added a text preprocessing step, and started over with ner.manual. It now reaches 90% accuracy after 100 training iterations.
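My actual preprocessing is specific to my texts, but as a rough sketch of the idea (the file names and the whitespace cleanup here are placeholders, not my exact pipeline):

```python
import json
import re
from pathlib import Path

# Hypothetical file names -- adjust to your own corpus
RAW_FILE = Path("raw_texts.txt")
OUT_FILE = Path("cleaned.jsonl")

def clean(text: str) -> str:
    """Collapse runs of whitespace and strip the line."""
    return re.sub(r"\s+", " ", text).strip()

with RAW_FILE.open(encoding="utf8") as fin, OUT_FILE.open("w", encoding="utf8") as fout:
    for line in fin:
        cleaned = clean(line)
        if cleaned:
            # Prodigy loads JSONL where each line has a "text" field
            fout.write(json.dumps({"text": cleaned}) + "\n")
```

After that I annotated into a fresh dataset with something like `prodigy ner.manual my_new_dataset en_core_web_sm cleaned.jsonl --label PERSON,ORG` (the dataset name, model, and labels here are just examples).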
It seems the reject button serves no real purpose in ner.make-gold or ner.manual; maybe you could consider disabling it there to avoid confusion.