ner.batch-train vs spaCy nlp.begin_training

Since Prodigy focuses a lot on usage as a developer tool, the built-in batch-train commands were also designed with development in mind. They’re optimised to train from Prodigy-style annotations and smaller datasets, include more complex logic for handling evaluation sets, and output more detailed training statistics.

Prodigy’s ner.batch-train workflow was also created under the assumption that annotations would be collected using ner.teach – i.e. a selection of examples biased by the score, with binary decisions only. There’s not really an easy way to train spaCy from the sparse data format created by the active learning workflow – at least not out of the box.

The ner.manual recipe is still pretty new, and we haven’t yet trained models entirely from annotations collected with this workflow ourselves. But there shouldn’t be a problem converting them to spaCy’s training format, and we’re thinking about including a recipe in a future version of Prodigy that takes care of this. (See this thread for a discussion of the topic.)
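In case it helps in the meantime, here’s a minimal sketch of that conversion. It assumes the usual Prodigy JSONL fields (`"text"`, `"spans"` with `"start"`/`"end"`/`"label"`, and `"answer"`) and spaCy’s `(text, {"entities": [...]})` training tuple format – the helper name `prodigy_to_spacy` and the example records are just for illustration:

```python
def prodigy_to_spacy(examples):
    """Convert Prodigy-style annotation dicts to spaCy training tuples."""
    train_data = []
    for eg in examples:
        # Only keep examples the annotator accepted
        if eg.get("answer") != "accept":
            continue
        # Prodigy spans use character offsets, same as spaCy expects
        entities = [
            (span["start"], span["end"], span["label"])
            for span in eg.get("spans", [])
        ]
        train_data.append((eg["text"], {"entities": entities}))
    return train_data


# Hypothetical records, shaped like Prodigy's JSONL output
examples = [
    {
        "text": "Apple is based in Cupertino.",
        "spans": [
            {"start": 0, "end": 5, "label": "ORG"},
            {"start": 18, "end": 27, "label": "GPE"},
        ],
        "answer": "accept",
    },
    {"text": "No entities here.", "spans": [], "answer": "reject"},
]

train_data = prodigy_to_spacy(examples)
print(train_data)
```

From there you should be able to feed `train_data` into a standard spaCy training loop with `nlp.update`. Note this only makes sense for complete annotations like ner.manual’s – for the binary ner.teach data, the sparsity problem described above still applies.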
