Using the output of ner.gold-to-spacy to train a new model

Hi,

Sorry I missed this thread earlier. I've been answering a similar question in this thread: Remarkable Difference Between Prodigy and Custom Training Times - #5 by wpm

There can be a problem here, yes, but there are steps you can take to solve it. First, you can use the prodigy.models.ner.merge_spans() function to group annotations on the same sentence together. Concatenate your datasets and pass them through this function, then use the ner.print-dataset recipe to check that the results look correct. Next, run your annotations through the ner.make-gold recipe, so that you can manually add any missing entities. This should give you a dataset you can use to train a model in spaCy or another NER tool.
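
To make the "group the annotations onto the same sentence" idea concrete, here's a rough pure-Python sketch. It is not Prodigy's merge_spans() implementation, which may behave differently. merge_accepted_spans is a hypothetical helper, and I'm assuming each annotation is a dict with "text", "spans" and "answer" keys, with one record per accept/reject decision.

```python
from collections import defaultdict


def merge_accepted_spans(examples):
    """Hypothetical helper: group records by text and collect accepted spans.

    Assumes each record looks like
    {"text": str, "spans": [{"start": int, "end": int, "label": str}], "answer": str}.
    Rejected spans are dropped; duplicates are collapsed.
    """
    spans_by_text = defaultdict(set)
    order = []  # remember the order in which texts were first seen
    for eg in examples:
        text = eg["text"]
        if text not in spans_by_text:
            order.append(text)
        bucket = spans_by_text[text]
        if eg.get("answer") != "accept":
            continue
        for span in eg.get("spans", []):
            bucket.add((span["start"], span["end"], span["label"]))

    merged = []
    for text in order:
        spans = [
            {"start": start, "end": end, "label": label}
            for start, end, label in sorted(spans_by_text[text])
        ]
        merged.append({"text": text, "spans": spans})
    return merged


# Made-up records for illustration: three decisions on one sentence, one on another.
examples = [
    {"text": "Apple hired Tim Cook.",
     "spans": [{"start": 0, "end": 5, "label": "ORG"}], "answer": "accept"},
    {"text": "Apple hired Tim Cook.",
     "spans": [{"start": 12, "end": 20, "label": "PERSON"}], "answer": "accept"},
    {"text": "Apple hired Tim Cook.",
     "spans": [{"start": 6, "end": 11, "label": "ORG"}], "answer": "reject"},
    {"text": "Berlin is a city.",
     "spans": [{"start": 0, "end": 6, "label": "GPE"}], "answer": "accept"},
]

merged = merge_accepted_spans(examples)
for eg in merged:
    print(eg["text"], eg["spans"])
```

Note that a merged record like this only contains the entities somebody accepted, so it can still be missing entities nobody was asked about. That's why the manual pass with ner.make-gold is still worth doing afterwards.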
