I'm trying to get annotated documents vetted by experts using ner.match. I want to get the output of the ner.match session as a parsable document. Is there a way to write a custom recipe, or edit the ner.match recipe, to get the output as a .json file in the following format:
[{'text': text1, 'annotations': {'annotation term': term, 'span': (n1, n2)}, 'positive': True/False}, .....]
Prodigy's db-out command (or the Python database API – see the PRODIGY_README.html for details) lets you download the annotations as a JSONL file (or a list of dicts). You can then convert that to any format you need using a custom script.
Each example will have an "_input_hash" property, which makes it easy to find different annotations on the same text, so you can combine all examples with the same input hash. Examples also include a list of "spans" (the highlighted entities) and an "answer" (whether you accepted or rejected the suggestion). So creating the format you need should be pretty straightforward in a few lines of code.
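
For illustration, here is a minimal sketch of such a conversion script. It assumes the annotations were already exported to a JSONL file with db-out, that each example has a "text" field, and that each span has "start" and "end" character offsets. The function names and file paths are hypothetical, and taking the "annotation term" to be the highlighted text is a guess at what the requested format means. JSON has no tuples, so the span is written as a two-item list.

```python
import json
from collections import defaultdict


def load_jsonl(path):
    """Read one JSON object per non-empty line."""
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]


def group_by_input_hash(examples):
    """Collect all annotations made on the same input text."""
    grouped = defaultdict(list)
    for eg in examples:
        grouped[eg["_input_hash"]].append(eg)
    return dict(grouped)


def convert_examples(examples):
    """Turn exported examples into the flat format asked for above."""
    records = []
    for eg in examples:
        # Keep only explicit decisions; skip e.g. ignored examples.
        if eg.get("answer") not in ("accept", "reject"):
            continue
        for span in eg.get("spans", []):
            start, end = span["start"], span["end"]
            records.append({
                "text": eg["text"],
                "annotations": {
                    # Assumption: the "term" is the highlighted text.
                    "annotation term": eg["text"][start:end],
                    "span": [start, end],
                },
                "positive": eg["answer"] == "accept",
            })
    return records


def export(in_path, out_path):
    """Convert a JSONL export into a single .json file."""
    records = convert_examples(load_jsonl(in_path))
    with open(out_path, "w", encoding="utf-8") as f:
        json.dump(records, f, indent=2, ensure_ascii=False)
```

With a file exported from your dataset, calling export("annotations.jsonl", "annotations.json") would write the converted list. group_by_input_hash is there in case you would rather have one entry per text with all of its accepted and rejected spans together.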