If you look at the “Annotation task formats” section in your PRODIGY_README.html
, you’ll find the exact JSON format that Prodigy expects for pre-annotated data for the different annotation types (NER, text classification etc.). The format should be pretty straightforward: for each example, you usually have a "text"
and then either a "label"
or "spans"
, depending on what you’re annotating. You can then convert your pre-annotated data accordingly. For example, for named entity recognition, you’ll need the text and the start/end character offsets and labels for the entities in that text.
1 Like