Using binary accept/reject from NER teach in spaCy

justin.epstein · February 4, 2020, 4:51pm

Hi everyone! We are using the ner.teach recipe to generate binary accept/reject data from our texts. Is there any way to train on this data using spaCy? We currently train our models in Databricks using spaCy, so if there is a way to import that data and leverage our existing pipelines, that would be great!

If not, is there a way to run the prodigy train command directly in a notebook or Python script?

Thanks so much!

ines · February 5, 2020, 9:05am

You can use the data-to-spacy command to convert the annotations to spaCy's training format, and set --ner-missing to treat all unannotated tokens as missing values (so you can represent partial annotations). However, for the binary annotations collected with the active learning recipes, it does mean you're losing some information from the rejected examples. I've explained the difference between the two update mechanisms in more detail here:

spaCy does let you represent "negative" labels with an exclamation mark, though – for instance, !B-PERSON for a token that's not the beginning of a PERSON entity. So you could experiment with that to incorporate some of the rejected answers, if your results end up looking worse.
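
To make the two ideas above concrete (missing values for unannotated tokens, and a "negative" tag for rejected suggestions), here is a rough sketch in plain Python. It assumes the usual Prodigy task layout with "tokens", "spans" (using inclusive "token_start"/"token_end" indices) and "answer". The function names, the example sentence and the use of "-" as the missing-value placeholder are illustrative assumptions, not something taken from Prodigy or spaCy. Whether your spaCy version accepts these exact tag strings is also something to verify before training on them.

```python
# Sketch only: turn Prodigy-style binary tasks into per-token tag lists.
# "-" as the missing-value placeholder is an assumption; the "!B-LABEL"
# notation follows the example given in the post above.
MISSING = "-"


def biluo_for_span(length, label):
    """Return BILUO tags for an accepted span covering `length` tokens."""
    if length == 1:
        return [f"U-{label}"]
    return [f"B-{label}"] + [f"I-{label}"] * (length - 2) + [f"L-{label}"]


def task_to_tags(task):
    """Build one tag per token; tokens without any feedback stay missing."""
    tags = [MISSING] * len(task["tokens"])
    for span in task.get("spans", []):
        start, end = span["token_start"], span["token_end"]  # inclusive
        if task["answer"] == "accept":
            tags[start:end + 1] = biluo_for_span(end - start + 1, span["label"])
        elif task["answer"] == "reject":
            # Simplification: only the first token gets a negative tag.
            tags[start] = f"!B-{span['label']}"
    return tags


# Hypothetical example tasks, shaped like ner.teach output.
tokens = [{"text": t, "id": i} for i, t in enumerate("Alice Smith visited Berlin".split())]
accepted = {
    "tokens": tokens,
    "spans": [{"token_start": 0, "token_end": 1, "label": "PERSON"}],
    "answer": "accept",
}
rejected = {
    "tokens": tokens,
    "spans": [{"token_start": 3, "token_end": 3, "label": "PERSON"}],
    "answer": "reject",
}

print(task_to_tags(accepted))  # ['B-PERSON', 'L-PERSON', '-', '-']
print(task_to_tags(rejected))  # ['-', '-', '-', '!B-PERSON']
```

Note that tagging the first token of a rejected span as "!B-LABEL" is a simplification: a rejection can also mean that only the span boundaries were wrong, so it's worth checking whether this actually improves results on your data.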
