Using Prodigy annotations in Stanford CoreNLP

We have an existing tool that uses Stanford’s CoreNLP for entity recognition. I would like to use Prodigy to generate the annotations and then train a model in CoreNLP. How does one just dump out the annotations generated by Prodigy for a training set so they can be used to create models in another tool?

Yes, you can use the db-out command to export an existing dataset in Prodigy’s JSONL format (newline-delimited JSON, one annotation task per line):

prodigy db-out your_dataset > your_dataset.jsonl

Each annotation in the exported data will have the following format.

{
    "text": "Apple updates its analytics service with new metrics",
    "spans": [
        {"start": 0, "end": 5, "label": "ORG"}
    ]
}

The "start" and "end" of a span are character offsets into the text, with "end" exclusive: in the example above, characters 0 through 4 spell "Apple". You can find more details and examples in the “Annotation task formats” section of your PRODIGY_README.html.
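
To get from those character offsets to something the Stanford NER trainer can read, you would typically convert each record into tab-separated token/label lines, with a background label such as O for tokens outside any span. Here is a minimal sketch of that conversion, not an official exporter. It makes several assumptions: the regex tokenizer is a simple stand-in and will not match CoreNLP's own tokenization exactly, the helper names to_token_labels and convert are made up for this example, and records are filtered on an "answer" field, with records that lack it treated as accepted.

```python
import json
import re

# Assumption: a simple stand-in tokenizer (runs of word characters, or single
# punctuation marks). CoreNLP's tokenizer may split text differently.
TOKEN_RE = re.compile(r"\w+|[^\w\s]")


def to_token_labels(record, background="O"):
    """Return (token, label) pairs for one Prodigy record (hypothetical helper)."""
    spans = record.get("spans", [])
    pairs = []
    for match in TOKEN_RE.finditer(record["text"]):
        label = background
        for span in spans:
            # A token takes a span's label if it lies fully inside the span.
            # Span "end" offsets are exclusive, like Python slices.
            if match.start() >= span["start"] and match.end() <= span["end"]:
                label = span["label"]
                break
        pairs.append((match.group(), label))
    return pairs


def convert(jsonl_lines):
    """Convert JSONL lines to tab-separated training text (hypothetical helper)."""
    docs = []
    for line in jsonl_lines:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        # Assumption: keep only accepted annotations; records without an
        # "answer" field are treated as accepted.
        if record.get("answer", "accept") != "accept":
            continue
        rows = to_token_labels(record)
        docs.append("\n".join(f"{token}\t{label}" for token, label in rows))
    # Blank line between documents, trailing newline at the end.
    return "\n\n".join(docs) + "\n"


example = (
    '{"text": "Apple updates its analytics service with new metrics", '
    '"spans": [{"start": 0, "end": 5, "label": "ORG"}]}'
)
print(convert([example]))
```

To run it on a real export, you could pass an open file handle instead of the example list, e.g. convert(open("your_dataset.jsonl", encoding="utf8")), and write the result to the file you point the CoreNLP trainer at. If your tokens need to line up exactly with CoreNLP's, it is probably safer to tokenize with CoreNLP itself and map the spans onto those token offsets.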
