produce jsonl from json source file?

Hi - complete newbie here :)
I have some really nice clean source files I would like to label that are in a nested json format.

I would like to prepare these files to be labelled using ner.manual.
There are certain fields in the json that I would like to include in the labelling process.

Can anyone help me with a way to format my files?

Thanks

BD

Hi! You can find the expected JSON format for NER tasks here: https://prodi.gy/docs/api-interfaces#ner

At a minimum, each entry needs to have a key "text" containing the text to annotate. You can optionally include your own tokenization and pre-labelled spans, but that's not a requirement.
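To make that concrete, here is a small sketch of what two such entries could look like when built in Python. The example sentence and the labels "ORG" and "GPE" are made up for illustration, and make_span is a hypothetical helper, not part of Prodigy. Spans are given as character offsets into the text.

```python
import json

text = "Apple opened an office in Berlin."

# Minimal task: only the "text" key is required.
minimal_task = {"text": text}


def make_span(text, phrase, label):
    """Hypothetical helper: locate the first occurrence of `phrase` in
    `text` and describe it as a span with character offsets."""
    start = text.index(phrase)
    return {"start": start, "end": start + len(phrase), "label": label}


# Optional: the same text with pre-labelled spans (placeholder labels).
prelabelled_task = {
    "text": text,
    "spans": [
        make_span(text, "Apple", "ORG"),
        make_span(text, "Berlin", "GPE"),
    ],
}

# JSONL is simply one JSON object per line.
jsonl = "\n".join(json.dumps(task) for task in (minimal_task, prelabelled_task))
print(jsonl)
```

If you don't have any existing annotations, the first form with just "text" is all you need.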

The JSON format also lets you pass through any custom keys in the data, so you can use that to include additional meta information that's then saved with the examples in the database.
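For the conversion itself, a rough sketch in plain Python might look like the following. I don't know what your files look like, so the nested structure and the field names ("id", "body.title", "body.content") are assumptions you would need to replace with your own, and get_path and to_tasks are hypothetical helpers. Putting the extra fields under a "meta" key is one possible choice for carrying them along with each example.

```python
import json


def get_path(obj, dotted):
    """Follow a dotted path such as 'body.content' through nested dicts."""
    for key in dotted.split("."):
        obj = obj[key]
    return obj


def to_tasks(records, text_field, meta_fields=()):
    """Turn nested records into flat tasks with a "text" key.

    Any fields listed in `meta_fields` are copied under "meta".
    """
    for record in records:
        task = {"text": get_path(record, text_field)}
        meta = {field: get_path(record, field) for field in meta_fields}
        if meta:
            task["meta"] = meta
        yield task


# Stand-in for json.load(open("your_file.json")): assumed nested input.
records = [
    {"id": 1, "body": {"title": "First", "content": "Some text to label."}},
    {"id": 2, "body": {"title": "Second", "content": "More text to label."}},
]

tasks = list(to_tasks(records, "body.content", meta_fields=("id", "body.title")))

# One JSON object per line, each line terminated by a newline.
jsonl_text = "".join(json.dumps(t, ensure_ascii=False) + "\n" for t in tasks)
print(jsonl_text, end="")

# To save it, write jsonl_text to a file ending in .jsonl, e.g.:
# with open("tasks.jsonl", "w", encoding="utf-8") as f:
#     f.write(jsonl_text)
```

The resulting .jsonl file can then be passed as the source argument when you run ner.manual.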