Convert spaCy training JSON file to Prodigy JSONL format for db-in command

Hi,

Following up on the previous post from Jul 18, https://support.prodi.gy/t/converting-spacy-training-json-file-to-prodigy-jsonl-format/687, I am wondering if there has been any update on this request.

I have a spaCy training JSON file that I would like to import into Prodigy to check and display the annotations. I am wondering if a function for this has been written yet, or whether I should write it myself.

Thanks!

Hi! We don't have a built-in method for this at the moment, but the conversion shouldn't be too difficult: you can loop over the examples, use spaCy to create a Doc and then extract the annotations you need. For NER annotations, you can also use spaCy's helper functions like gold.offsets_from_biluo_tags to convert the token-based tags into character offsets.
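
As a rough illustration of that loop, here is a minimal sketch in plain Python. It is not a built-in Prodigy or spaCy function, and the names paragraph_to_task, convert and to_jsonl are made up for this example. It assumes the file follows the spaCy v2 JSON training layout (a list of docs, each with "paragraphs" that hold a "raw" text and "sentences" of "tokens" with "orth" and BILUO "ner" values), and that each task should have a "text" and a list of "spans" with "start", "end" and "label". Instead of creating a Doc and calling spaCy's helper, it finds each token in the raw text by string search, so it only works if every token's "orth" appears verbatim in "raw".

```python
import json


def paragraph_to_task(paragraph):
    """Turn one paragraph of spaCy v2 training JSON into a Prodigy-style task.

    Assumption: every token's "orth" occurs verbatim, in order, in "raw".
    """
    text = paragraph["raw"]
    spans = []
    cursor = 0          # position in the raw text after the previous token
    span_start = None   # character offset where the open entity began
    for sentence in paragraph["sentences"]:
        for token in sentence["tokens"]:
            # Locate the token in the raw text to get its character offsets.
            tok_start = text.index(token["orth"], cursor)
            tok_end = tok_start + len(token["orth"])
            cursor = tok_end
            # BILUO tags look like "B-ORG", "L-ORG", "U-GPE", "I-ORG" or "O".
            prefix, _, label = token.get("ner", "O").partition("-")
            if prefix in ("B", "U"):
                span_start = tok_start
            if prefix in ("L", "U") and span_start is not None:
                spans.append({"start": span_start, "end": tok_end, "label": label})
                span_start = None
    return {"text": text, "spans": spans}


def convert(docs):
    """Yield one task per paragraph across all docs."""
    for doc in docs:
        for paragraph in doc["paragraphs"]:
            yield paragraph_to_task(paragraph)


def to_jsonl(docs):
    """Serialize the tasks as JSONL: one JSON object per line."""
    return "\n".join(json.dumps(task) for task in convert(docs))


# Tiny hand-written example in the assumed training format.
example_docs = [{
    "id": 0,
    "paragraphs": [{
        "raw": "Apple opened an office in New York.",
        "sentences": [{
            "tokens": [
                {"id": 0, "orth": "Apple", "ner": "U-ORG"},
                {"id": 1, "orth": "opened", "ner": "O"},
                {"id": 2, "orth": "an", "ner": "O"},
                {"id": 3, "orth": "office", "ner": "O"},
                {"id": 4, "orth": "in", "ner": "O"},
                {"id": 5, "orth": "New", "ner": "B-GPE"},
                {"id": 6, "orth": "York", "ner": "L-GPE"},
                {"id": 7, "orth": ".", "ner": "O"},
            ]
        }]
    }]
}]

print(to_jsonl(example_docs))
```

Writing the output of to_jsonl to a .jsonl file should give you something you can pass to db-in. If your tokens don't line up with the raw text this way, creating a Doc from the tokens and using spaCy's own helper for the offsets is the safer route.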