Recall and Precision (TN, TP, FN, FP)

The prodigy.models.ner.EntityRecognizer.evaluate() method will tell you the model's accuracy, but it doesn't currently return P/R/F scores. The method supports the use-case where the gold standard contains only entities known to be correct, without necessarily containing all of the correct entities, i.e. a gold standard with missing values. If your gold standard has no missing values, you should pass the flag no_missing=True.

Here’s some code to return P/R/F, assuming you have no missing values in your gold standard:


# Assumes `nlp` is a loaded spaCy pipeline and `test_examples` is a list of
# dicts with a "text" key and gold-standard "spans" (start/end/label).
tp = 0.0
fp = 0.0
fn = 0.0
for eg in test_examples:
    doc = nlp(eg["text"])
    guesses = set((ent.start_char, ent.end_char, ent.label_) for ent in doc.ents)
    truths = set((span["start"], span["end"], span["label"]) for span in eg["spans"])
    tp += len(guesses.intersection(truths))  # exact span + label matches
    fn += len(truths - guesses)              # gold spans the model missed
    fp += len(guesses - truths)              # predicted spans not in the gold
precision = tp / (tp + fp + 1e-10)
recall = tp / (tp + fn + 1e-10)
fscore = (2 * precision * recall) / (precision + recall + 1e-10)
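If you'd rather reuse the arithmetic, the counting logic above can be factored into a small helper that works on any pair of span sets. This is just a sketch with a made-up function name (`span_prf`) and toy spans, not part of Prodigy's API:

```python
def span_prf(truths, guesses, eps=1e-10):
    """Compute precision/recall/F1 over sets of (start, end, label) tuples."""
    tp = len(guesses & truths)   # spans the model got exactly right
    fp = len(guesses - truths)   # spans the model predicted but the gold lacks
    fn = len(truths - guesses)   # gold spans the model missed
    precision = tp / (tp + fp + eps)
    recall = tp / (tp + fn + eps)
    fscore = (2 * precision * recall) / (precision + recall + eps)
    return precision, recall, fscore

# Toy example: one correct span, one spurious prediction, one missed span.
gold = {(0, 5, "PERSON"), (10, 15, "ORG")}
pred = {(0, 5, "PERSON"), (20, 25, "GPE")}
p, r, f = span_prf(gold, pred)  # each is approximately 0.5
```

Because matching is done on exact (start, end, label) tuples, a prediction that overlaps a gold span but has slightly different boundaries counts as both a false positive and a false negative.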