We have an existing tool that uses Stanford’s CoreNLP for entity recognition. I would like to use Prodigy to generate the annotations and then train a model in CoreNLP. How can I export the annotations Prodigy collects for a training set so they can be used to build models in another tool?
Yes, you can use the db-out command to export an existing dataset in Prodigy’s JSON format:
prodigy db-out your_dataset > your_dataset.jsonl
Each annotation in the exported data will have the following format.
{
"text": "Apple updates its analytics service with new metrics",
"spans": [
{"start": 0, "end": 5, "label": "ORG"}
]
}
The "start" and "end" of a span are character offsets into the text, so in the example above, 0 and 5 mark "Apple". You can find more details and examples in the “Annotation task formats” section of your PRODIGY_README.html.
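To get from that export to something CoreNLP can train on, here is a minimal sketch that turns the character-offset spans into one token and label per line, separated by a tab, with a blank line between examples. This assumes the tab-separated column layout commonly used for training Stanford's CRF-based NER, so check it against your own training properties. The function names and the regex tokenizer are hypothetical stand-ins, not part of Prodigy or CoreNLP; for real training data you would want tokens that match CoreNLP's own tokenizer. The sketch also skips any example whose "answer" field is not "accept".

```python
import json
import re


def spans_to_token_labels(text, spans, background="O"):
    """Return (token, label) pairs for one exported Prodigy example.

    Tokenization is a naive regex (an assumption for illustration only):
    runs of word characters, or single punctuation characters.
    """
    pairs = []
    for match in re.finditer(r"\w+|[^\w\s]", text):
        label = background
        for span in spans:
            # A token takes a span's label if it lies fully inside the span.
            if match.start() >= span["start"] and match.end() <= span["end"]:
                label = span["label"]
                break
        pairs.append((match.group(), label))
    return pairs


def jsonl_to_tsv(jsonl_lines):
    """Convert lines of a db-out export into token<TAB>label text."""
    blocks = []
    for line in jsonl_lines:
        line = line.strip()
        if not line:
            continue
        example = json.loads(line)
        # Keep only accepted annotations; examples without an "answer"
        # field are treated as accepted here (an assumption).
        if example.get("answer", "accept") != "accept":
            continue
        spans = example.get("spans") or []
        pairs = spans_to_token_labels(example["text"], spans)
        blocks.append("\n".join(f"{tok}\t{lab}" for tok, lab in pairs))
    # Blank line between examples, trailing newline at the end.
    return "\n\n".join(blocks) + "\n"


example = {
    "text": "Apple updates its analytics service with new metrics",
    "spans": [{"start": 0, "end": 5, "label": "ORG"}],
}
print(jsonl_to_tsv([json.dumps(example)]))
```

For a real export you would pass an open file handle for your_dataset.jsonl instead of the one-item list. Note that every token outside a span is labeled "O", which is only safe if each example was annotated exhaustively; if the annotations came from a binary accept/reject workflow, unmarked tokens may still be entities.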