Easy way to evaluate the output of ner.correct against its initial input

Given the following situation, what is the easiest strategy to get evaluation metrics (precision, recall, F-measure) for NER performance?

  • We have some preannotated material A (not produced by a spaCy or prodigy model)
  • We use ner.manual recipe to produce the corrected version B
  • How can we measure the performance of A against gold B without applying too many conversion steps?

Thanks for any hints.

Hi! I think the easiest way to do that would be to import your preannotated dataset into a separate Prodigy dataset and then run two training experiments with prodigy train and the same settings: one for the preannotated dataset and one for the corrected dataset. You'll then be able to compare the reported scores for both experiments and see how the model trained on your corrections compares.
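If you want raw precision/recall/F-measure of A against gold B without training anything, another option is to score the spans directly. The sketch below is not part of the Prodigy API — it's a minimal, self-contained example assuming you've exported both datasets (e.g. with prodigy db-out) and reduced each annotation to (doc_id, start, end, label) tuples; the strict exact-match criterion is the usual one for NER evaluation, and the example data at the bottom is hypothetical.

```python
# Minimal sketch (not a Prodigy API): strict entity-level
# precision/recall/F1 for preannotated spans A against gold spans B.
# A span counts as a true positive only on exact
# (doc_id, start, end, label) agreement.

def prf(pred_spans, gold_spans):
    """Each argument is an iterable of (doc_id, start, end, label) tuples."""
    pred, gold = set(pred_spans), set(gold_spans)
    tp = len(pred & gold)  # exact matches between A and gold B
    precision = tp / len(pred) if pred else 0.0
    recall = tp / len(gold) if gold else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

# Hypothetical example: A and gold B each have two spans on one
# document and agree on exactly one of them.
a = {("doc1", 0, 5, "ORG"), ("doc1", 10, 15, "PERSON")}
b = {("doc1", 0, 5, "ORG"), ("doc1", 20, 25, "GPE")}
print(prf(a, b))  # (0.5, 0.5, 0.5)
```

This avoids the variance of training runs entirely, at the cost of only measuring annotation agreement rather than downstream model quality.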