After annotating my data with ner.make-gold, I trained a model using spaCy's "training a new entity type" example. If the two approaches are equivalent, I'd like to know whether there is an existing recipe that reports stats (like the accuracy, FP and FN that ner.batch-train prints) for a model produced by spaCy's example script.
For now, the only numbers I get from my code are the losses:
import random
from spacy.util import minibatch, compounding

sizes = compounding(1.0, 1000.0, 1.001)
for itn in range(n_iter):
    random.shuffle(TRAIN_DATA)
    # batch up the examples using spaCy's minibatch
    batches = minibatch(TRAIN_DATA, size=sizes)
    losses = {}
    for batch in batches:
        texts, annotations = zip(*batch)
        nlp.update(texts, annotations, sgd=optimizer, drop=0.20, losses=losses)
    print("Losses", losses)
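(As an aside on what that compounding(1.0, 1000.0, 1.001) call does: it yields an infinite stream of batch sizes that grow by the given factor each step, capped at the stop value. A minimal pure-Python sketch of the idea, for intuition only – spaCy ships its own version in spacy.util:)

```python
from itertools import islice

def compounding(start, stop, compound):
    """Yield an infinite series: start, start*compound, start*compound**2, ...
    capped at stop."""
    curr = float(start)
    while True:
        yield min(curr, stop)
        curr *= compound

# The first few batch sizes for compounding(1.0, 1000.0, 1.001)
# are 1.0, 1.001, ~1.002, ... slowly growing toward 1000.0:
print(list(islice(compounding(1.0, 1000.0, 1.001), 4)))
```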
You can use a model's evaluate() function to calculate performance on a dataset. If you have DEV_DATA in the same format as TRAIN_DATA, you can use:
scorer = nlp.evaluate(DEV_DATA)
The relevant scores for NER are: scorer.ents_p, scorer.ents_r, scorer.ents_f, scorer.ents_per_type
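(For reference, those entity scores are standard precision/recall/F-measure computed over predicted entity spans. Given true-positive/false-positive/false-negative counts, they reduce to simple arithmetic – this is a hedged pure-Python sketch of the definitions, not spaCy's actual Scorer implementation:)

```python
def prf(tp, fp, fn):
    """Precision, recall and F1 from entity-level TP/FP/FN counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return precision, recall, f1

# e.g. 80 correctly predicted entities, 10 spurious, 20 missed:
p, r, f = prf(80, 10, 20)
print(f"p={p:.3f} r={r:.3f} f={f:.3f}")
# -> p=0.889 r=0.800 f=0.842
```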
As an alternative, if you have data in a format that you can easily convert to spaCy's training format (typically with python -m spacy convert), you can convert it and use the train CLI and/or the evaluate CLI for training and evaluating new models. See: https://spacy.io/api/cli
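(A rough sketch of that CLI workflow; the file names are placeholders and the exact arguments vary by spaCy version, so check https://spacy.io/api/cli for your version:)

```shell
# Convert annotations (e.g. CoNLL) into spaCy's JSON training format:
python -m spacy convert train.conll ./converted --converter conll

# Train a blank English model with an NER pipeline:
python -m spacy train en ./model ./converted/train.json ./converted/dev.json --pipeline ner

# Evaluate the best model on held-out data:
python -m spacy evaluate ./model/model-best ./converted/dev.json
```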
Adriane's answer above should give you everything you need – but just to address this specific point: Yes, under the hood, ner.batch-train also loops over the examples and calls into nlp.update, just like all the example scripts we provide.
Prodigy's built-in training recipes are basically wrappers around spaCy's training methods that are optimised for quick experiments and give you nicely-formatted output. They take care of loading and converting Prodigy's data format, merging annotations on the same text (e.g. if you've accepted/rejected multiple entities in the same sentence) and handling both gold-standard and incomplete annotations (if you only know that one span is wrong, but don't know the answer for all tokens in the text).