Using the output of ner.gold-to-spacy to train a new model

honnibal · April 4, 2018, 1:18am

Hi,

Sorry I missed this thread before. I've been writing about the same sort of question in this thread: Remarkable Difference Between Prodigy and Custom Training Times - #5 by wpm

There can be a problem here, yes, but there are steps you can take to solve it. For a start, you can use the prodigy.models.ner.merge_spans() function to group annotations made on the same sentence into a single example. Concatenate your datasets and pass them through this function, then use the ner.print-dataset recipe to check that the results are correct. Next, pass your annotations through the ner.make-gold recipe, so that you can manually add any entities that are still missing. This should give you a dataset you can use in spaCy or another NER tool.
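
To make the grouping step more concrete, here's a rough sketch in plain Python. It doesn't call merge_spans() and isn't Prodigy's implementation. The function name merge_accepted_spans is made up for illustration, and it assumes each annotation is a dict with "text", "spans" and "answer" keys, with one accept/reject decision per record. The output uses spaCy's simple (text, {"entities": [...]}) training format.

```python
from collections import defaultdict


def merge_accepted_spans(examples):
    """Group annotations by text and keep the unique accepted spans.

    Hypothetical helper for illustration only, not Prodigy's merge_spans().
    """
    spans_by_text = defaultdict(set)
    for eg in examples:
        text = eg["text"]
        # Touch the key so texts with no accepted spans are still kept.
        spans_by_text[text]
        if eg.get("answer") != "accept":
            continue
        for span in eg.get("spans", []):
            spans_by_text[text].add((span["start"], span["end"], span["label"]))
    # One (text, annotations) pair per unique text, spans sorted by offset.
    return [
        (text, {"entities": sorted(spans)})
        for text, spans in spans_by_text.items()
    ]


# Three separate decisions about the same sentence (made-up data).
examples = [
    {"text": "Apple hired Tim Cook.",
     "spans": [{"start": 0, "end": 5, "label": "ORG"}],
     "answer": "accept"},
    {"text": "Apple hired Tim Cook.",
     "spans": [{"start": 12, "end": 20, "label": "PERSON"}],
     "answer": "accept"},
    {"text": "Apple hired Tim Cook.",
     "spans": [{"start": 6, "end": 11, "label": "ORG"}],
     "answer": "reject"},
]

merged = merge_accepted_spans(examples)
for text, annotations in merged:
    print(text, annotations)
```

Note that this only keeps what was explicitly accepted. Any entity nobody was asked about is simply absent, and a trainer that reads the result as complete will treat it as "not an entity". That's why the manual pass with ner.make-gold is still worth doing before training.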
