Create baseline metrics based on manual NER annotations

svenski · June 8, 2020, 9:44am

I'm taking my first steps with Prodigy and have annotated a test set with ORG labels (only).

My intention is to see how well the existing spaCy models, and eventually other NER models, perform on this labelled data set out of the box, before I start any training. I haven't been able to find a simple way to do this. In essence, I'm looking for an evaluate.model recipe that takes a test/validation set and a model and outputs evaluation metrics.

I tried passing in no training data to the train recipe but it didn't want to play ball.

Currently I'm trying to transform the output from the data-to-spacy recipe to fit the GoldParse input, so I can get some metrics. It seems to me like this could be a standard use case, so I wanted to check whether there is a simpler way to achieve what I want.

svenski · June 8, 2020, 12:19pm

It seems like the following post is dealing with the same question:

I'm trying this now.

ines · June 8, 2020, 12:21pm

Hi! I think what you're looking for is spacy evaluate?

This takes data in spaCy's format and will perform an evaluation. Prodigy's training recipes are really designed for quick experiments with Prodigy datasets, and not necessarily to replace the training or evaluation process of whichever library you're using (e.g. spaCy).

The above post is a bit more abstract and about a custom evaluation, including evaluation of binary data (I think), so I'm not sure that's the right approach if you just want to output scores.
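For readers who want to see what such scores amount to, here is a minimal sketch of exact-match entity scoring in plain Python. It is not the code spacy evaluate runs. The function name score_entities and the example spans are made up for illustration, and it assumes gold and predicted entities are available as (start_char, end_char, label) tuples, for instance built from the manual annotations and from a model's doc.ents. Since only ORG was annotated here, predictions with other labels would need to be filtered out first, otherwise they all count as false positives.

```python
from typing import Dict, List, Set, Tuple

# One entity span: (start_char, end_char, label)
Span = Tuple[int, int, str]


def score_entities(gold: List[Set[Span]], pred: List[Set[Span]]) -> Dict[str, float]:
    """Exact-match precision, recall and F1 over two parallel lists of documents."""
    tp = fp = fn = 0
    for gold_spans, pred_spans in zip(gold, pred):
        tp += len(gold_spans & pred_spans)   # predicted and annotated
        fp += len(pred_spans - gold_spans)   # predicted but not annotated
        fn += len(gold_spans - pred_spans)   # annotated but not predicted
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


# Hypothetical data: two documents, ORG spans only.
gold = [{(0, 5, "ORG")}, {(10, 16, "ORG"), (30, 36, "ORG")}]
pred = [{(0, 5, "ORG")}, {(10, 16, "ORG"), (40, 45, "ORG")}]
scores = score_entities(gold, pred)
print(scores)
```

A span only counts as correct if start, end and label all match, so a prediction that overlaps an annotation but has different boundaries is scored as one false positive plus one false negative.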

svenski · June 8, 2020, 12:41pm

Thank you for the prompt reply, I didn't know about spacy evaluate! However, I will log the results to a database, so using the code from the example above fits the bill quite well.
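As a rough idea of what the logging step could look like, here is a small sketch using Python's built-in sqlite3 module. The thread does not say which database is used, so SQLite, the table name ner_eval, the helper log_scores and the score values below are all assumptions and placeholders, not real results.

```python
import sqlite3
from datetime import datetime, timezone
from typing import Dict


def log_scores(conn: sqlite3.Connection, model_name: str, scores: Dict[str, float]) -> None:
    """Store one row per metric so runs of different models can be compared later."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS ner_eval "
        "(model TEXT, metric TEXT, value REAL, logged_at TEXT)"
    )
    now = datetime.now(timezone.utc).isoformat()
    conn.executemany(
        "INSERT INTO ner_eval VALUES (?, ?, ?, ?)",
        [(model_name, metric, float(value), now) for metric, value in scores.items()],
    )
    conn.commit()


# In-memory database and placeholder numbers, purely for illustration.
conn = sqlite3.connect(":memory:")
log_scores(conn, "en_core_web_sm", {"precision": 0.5, "recall": 0.25, "f1": 1 / 3})
rows = conn.execute("SELECT metric, value FROM ner_eval ORDER BY metric").fetchall()
print(rows)
```

Storing one row per metric keeps the table layout unchanged if further metrics, such as per-label scores, are added later.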
