Gold notation, Test/Eval set for already trained model

honnibal · May 14, 2019, 4:34pm

We had a built-in recipe that did something similar to that during an early beta, but we had too many NER recipes so we consolidated things to avoid confusion.

If you want to get a quick readable summary, you might find the prodigy.components.printers.pretty_print_ner function useful. If you mark the spans with an answer key, which should have a value in "accept", "reject" and "ignore", the spans will be coloured by correctness. I would set the correct predictions to have accept, and false predictions to have reject. You could list the false negatives at the end of the text as well (these might overlap with the predicted annotations, so you can’t easily show them in-line).

The loop to run the model and compare against the gold-standard should be pretty simple. You can have a look at my sample code for calculating precision/recall/F-score evaluation figures in this thread for reference: Recall and Precision (TN, TP, FN, FP)

Topic		Replies	Views
Create baseline metrics based on manual NER annotations usage , ner , solved	3	670	June 8, 2020
Prodigy NER model evaluation and custom evaluation scripts ner , spacy	5	2132	February 1, 2023
feature request: pre-trained model evaluation recipe enhancement	2	737	March 27, 2019
"Gold Standard" dataset as evaluation for ner.batch-train with binary annotation? usage , ner	2	788	May 15, 2019
Recall and Precision (TN, TP, FN, FP) ner , spacy	8	2413	May 17, 2019

Gold notation, Test/Eval set for already trained model

Related topics