Convert spaCy training json file to prodigy jsonl format for db-in command


Following up on the previous post from Jul 18, I am wondering if there is any update on this request.

I have a spaCy training JSON file that I would like to import into Prodigy to check and display the annotations. I am wondering whether a function for this has been written yet, or whether I should write it myself.


Hi! We don't have a built-in method for this at the moment, but the conversion hopefully shouldn't be too difficult: you can just loop over the examples, use spaCy to create a Doc and then extract the annotations you need. For NER annotations, you can also use spaCy's helper functions like gold.offsets_from_biluo_tags to convert the token-based tags into character offsets.
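
To make the loop concrete, here is a minimal sketch for the NER case. It makes a few assumptions that are not confirmed in this thread: the file follows the spaCy v2 JSON training layout (`paragraphs` with a `raw` text, `sentences`, and `tokens` carrying `orth` and a BILUO `ner` tag), and the target is Prodigy's usual task shape with `"text"` and `"spans"`. To keep it runnable without spaCy, it aligns the tokens to the raw text by hand rather than creating a Doc and calling gold.offsets_from_biluo_tags; the function names `biluo_to_spans`, `convert` and `to_jsonl` are made up for this example.

```python
import json


def biluo_to_spans(text, tokens):
    """Align tokens to `text` and turn BILUO tags into character-offset spans."""
    spans = []
    cursor = 0    # position in the raw text where the next search starts
    start = None  # character start of the entity currently being built
    for token in tokens:
        begin = text.find(token["orth"], cursor)
        if begin == -1:
            raise ValueError(f"token {token['orth']!r} not found in text")
        end = begin + len(token["orth"])
        cursor = end
        # Tags look like "B-ORG", "L-ORG", "U-ORG", "I-ORG", "O" or "-" (missing).
        prefix, _, label = token.get("ner", "O").partition("-")
        if prefix in ("B", "U"):
            start = begin
        if prefix in ("L", "U") and start is not None:
            spans.append({"start": start, "end": end, "label": label})
            start = None
        elif prefix in ("O", ""):
            start = None
    return spans


def convert(training_data):
    """Yield one task dict per paragraph (assumed spaCy v2 JSON layout)."""
    for doc in training_data:
        for paragraph in doc["paragraphs"]:
            tokens = [t for s in paragraph["sentences"] for t in s["tokens"]]
            text = paragraph.get("raw")
            if text is None:
                # Assumption: without a raw text, joining on spaces is good enough.
                text = " ".join(t["orth"] for t in tokens)
            yield {"text": text, "spans": biluo_to_spans(text, tokens)}


def to_jsonl(tasks):
    """Serialize tasks as newline-delimited JSON, one task per line."""
    return "\n".join(json.dumps(task) for task in tasks)


# Tiny hand-written example in the assumed input layout.
example = [{
    "id": 0,
    "paragraphs": [{
        "raw": "Apple hired Tim Cook.",
        "sentences": [{"tokens": [
            {"orth": "Apple", "ner": "U-ORG"},
            {"orth": "hired", "ner": "O"},
            {"orth": "Tim", "ner": "B-PERSON"},
            {"orth": "Cook", "ner": "L-PERSON"},
            {"orth": ".", "ner": "O"},
        ]}],
    }],
}]

tasks = list(convert(example))
jsonl = to_jsonl(tasks)
```

For a real file, you would load it with `json.load`, pass the result to `convert`, write the `to_jsonl` output to a `.jsonl` file and then import that file with `prodigy db-in` into a dataset of your choice. The string search used for alignment is the fragile part: if the raw text and the token texts disagree, creating a Doc with spaCy and using its helper is the more robust route.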