Newbie here. I am trying to do an NER labelling task in which I want to correct an already labelled dataset.
I have a dataframe that I have already labelled (using regexes, lists, etc.), and I want to use this labelled data in Prodigy to start correcting some of those labels, since the initial labelling is of course not perfect. I have a huge "pre-labelled" set built this way.
To give an idea of what my dataframe looks like (some records have only O as a label):
Now I know I have to convert it to JSONL format; however, I am not sure what the content should look like so that I can load it into Prodigy and start correcting the labels (with ner.manual). I know I don't need ner.correct, since that is for when I already have a trained model (if I am correct on this).
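As a starting point, here is a minimal sketch of the kind of conversion I have in mind. It assumes that ner.manual will pre-highlight entities when each JSONL line has a "text" field plus a "spans" list of character offsets ("start", "end", "label"). The row layout (sentence id, token, label), the sample sentences and the INGRED label are all hypothetical stand-ins, since the real dataframe may be organised differently. It also assumes plain labels without a B-/I- prefix, so adjacent tokens with the same label are merged into one span.

```python
import json
from itertools import groupby

# Hypothetical rows, one per token: (sentence_id, token, label).
# With pandas, similar tuples could come from df.itertuples(index=False).
rows = [
    (0, "Add", "O"),
    (0, "olive", "INGRED"),
    (0, "oil", "INGRED"),
    (0, "and", "O"),
    (0, "salt", "INGRED"),
    (1, "Stir", "O"),
    (1, "well", "O"),
]


def to_task(tokens, labels):
    """Build one task dict with character-offset spans from token labels."""
    # Assumption: joining tokens with single spaces is an acceptable
    # reconstruction of the original text.
    text = " ".join(tokens)
    spans = []
    current = None  # the span currently being extended, if any
    offset = 0
    for token, label in zip(tokens, labels):
        start, end = offset, offset + len(token)
        if label == "O":
            current = None
        elif current is not None and current["label"] == label:
            current["end"] = end  # extend the running span over this token
        else:
            current = {"start": start, "end": end, "label": label}
            spans.append(current)
        offset = end + 1  # skip the joining space
    return {"text": text, "spans": spans}


# Group consecutive rows by sentence id and convert each group to a task.
tasks = []
for _, group in groupby(rows, key=lambda row: row[0]):
    group = list(group)
    tasks.append(to_task([r[1] for r in group], [r[2] for r in group]))

# One JSON object per line is the JSONL layout.
jsonl = "\n".join(json.dumps(task) for task in tasks)
print(jsonl)
```

If that format is right, writing `jsonl` to a file (say data.jsonl, a placeholder name) and running something like `prodigy ner.manual my_dataset blank:en data.jsonl --label INGRED` should show the existing labels ready for correction. Two caveats I am unsure about: the spans have to line up with the tokenizer's token boundaries, and merging neighbouring tokens would wrongly fuse two separate entities that happen to sit next to each other.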
As an analogy: I don't want to label based on a pattern file, but based on an already pre-labelled file (the analogy comes from projects/ner-food-ingredients at master · explosion/projects · GitHub, which is a tutorial on NER for food ingredients).
Can someone point me in the right direction?