SpaCy3 models evaluation on a custom dataset

ines · July 6, 2021, 1:26am

Hi! You can use the data-to-spacy recipe to export your annotations. In Prodigy v1.10, this gives you a corpus in spaCy v2's JSON format. If you're using spaCy v3, you can then run spacy convert to turn that JSON into the binary format used by spacy train (see https://spacy.io/api/cli#convert). After that, you'll be able to train and evaluate your model using a transformer-based config.
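To make the order of steps concrete, here's a rough sketch that just assembles the commands in Python so the paths live in one place. The dataset name, directories and config filename are all placeholders I made up, and the data-to-spacy arguments are written from memory of the v1.10 CLI, so please double-check them against prodigy data-to-spacy --help before running anything.

```python
import shlex


def pipeline_commands(dataset, lang="en", workdir="corpus",
                      config="config.cfg", output="output"):
    """Return the export -> convert -> train -> evaluate commands as argument lists.

    All names and paths are hypothetical placeholders.
    """
    train_json = f"{workdir}/train.json"
    eval_json = f"{workdir}/eval.json"
    return [
        # 1. Export annotations from Prodigy v1.10 in spaCy v2's JSON format
        #    (argument order assumed; verify with --help).
        ["prodigy", "data-to-spacy", train_json, eval_json,
         "--lang", lang, "--ner", dataset, "--eval-split", "0.2"],
        # 2. Convert both JSON files to spaCy v3's binary .spacy format.
        #    The output file keeps the input's stem, e.g. corpus/train.spacy.
        ["python", "-m", "spacy", "convert", train_json, workdir],
        ["python", "-m", "spacy", "convert", eval_json, workdir],
        # 3. Generate a config; --gpu selects the transformer-based template.
        ["python", "-m", "spacy", "init", "config", config,
         "--lang", lang, "--pipeline", "ner", "--gpu"],
        # 4. Train on the converted corpus.
        ["python", "-m", "spacy", "train", config, "--output", output,
         "--paths.train", f"{workdir}/train.spacy",
         "--paths.dev", f"{workdir}/eval.spacy"],
        # 5. Evaluate the best checkpoint on the held-out data.
        ["python", "-m", "spacy", "evaluate", f"{output}/model-best",
         f"{workdir}/eval.spacy", "--output", f"{output}/metrics.json"],
    ]


# Print the commands for review instead of executing them.
for cmd in pipeline_commands("my_ner_dataset"):
    print(shlex.join(cmd))
```

Once the printed commands look right for your setup, you can paste them into a terminal or pass each list to subprocess.run(cmd, check=True).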

Btw, under the hood, spaCy v3's binary format is just a collection of annotated Doc objects (which now also makes it much easier to generate it programmatically): https://spacy.io/api/data-formats#binary-training
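For example, if you already have texts with character-offset entity spans, a minimal sketch of building the binary format yourself could look like this. The sample texts, labels and the build_docbin helper are made up for illustration; only spacy.blank, Doc.char_span and DocBin are actual spaCy v3 APIs.

```python
import spacy
from spacy.tokens import DocBin

# Hypothetical sample data: (text, [(start_char, end_char, label), ...])
EXAMPLES = [
    ("Apple is opening an office in Berlin.", [(0, 5, "ORG"), (30, 36, "GPE")]),
    ("Prodigy exports annotations.", [(0, 7, "PRODUCT")]),
]


def build_docbin(examples, lang="en"):
    """Turn (text, spans) pairs into a DocBin of annotated Doc objects."""
    nlp = spacy.blank(lang)  # tokenizer only, no trained components
    doc_bin = DocBin()
    for text, spans in examples:
        doc = nlp.make_doc(text)
        ents = []
        for start, end, label in spans:
            span = doc.char_span(start, end, label=label)
            # char_span returns None if the offsets don't align with
            # token boundaries, so skip those instead of crashing.
            if span is not None:
                ents.append(span)
        doc.ents = ents
        doc_bin.add(doc)
    return doc_bin


doc_bin = build_docbin(EXAMPLES)
# To use it with spacy train, write it to disk:
# doc_bin.to_disk("./train.spacy")
```

Silently skipping misaligned spans keeps the sketch short, but in practice you'd probably want to log them, since they usually point to a tokenization mismatch in the annotations.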

The upcoming Prodigy v1.11, currently available as a nightly pre-release, will allow you to export your data in spaCy's .spacy format out-of-the-box.
