Storing external IDs with Annotations

The text items I am annotating have IDs that link them to other information. When Prodigy writes an annotation to its DB, is it possible to adjust a recipe so that it stores that ID? The JSONL consumed by the recipe would include the ID, something like this:

{"text": "some candidate text", "External_ID": 12345}

and get something like:

{"text": "some candidate text", "External_ID": 12345, "label": "Negative", "answer": "accept"}

I imagine I can tweak an existing recipe to take more than the "text" field and strip the "External_ID" field before sending it to a stream, but I'm not sure how I would get the "External_ID" back into the JSONL before it gets written to the prodigy.db. Any thoughts welcome.

What you describe here should already be possible out-of-the-box. If you load in data with additional fields, those will simply get passed through and then stored with the annotations.

Is there a specific reason you want to strip the field and then add it back? The additional fields will have no impact during annotation (unless you use them in a custom HTML template or something), so you can just leave them in the data.
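
Since the extra field is stored with each annotation, you can use it to link the results back to your other data after exporting the dataset (for example with `prodigy db-out`). Here is a minimal sketch in plain Python, not part of Prodigy itself. The function name `index_by_external_id` is hypothetical, and it assumes the exported records look like the example in the question:

```python
import json
from typing import Any, Dict, Iterable, Optional, Tuple


def index_by_external_id(
    lines: Iterable[str], id_key: str = "External_ID"
) -> Dict[Any, Tuple[Optional[str], Optional[str]]]:
    """Map each external ID to the (label, answer) pair stored with it.

    `lines` is any iterable of JSONL strings, e.g. an open file handle.
    `id_key` is the name of the pass-through field (assumed from the question).
    """
    index = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        # The extra field travels with the record, so it can be read back directly.
        index[record[id_key]] = (record.get("label"), record.get("answer"))
    return index


# Example record shaped like the one in the question.
exported = [
    '{"text": "some candidate text", "External_ID": 12345, '
    '"label": "Negative", "answer": "accept"}'
]
print(index_by_external_id(exported))
```

If the same ID is annotated more than once, this sketch keeps only the last record for it, so adjust that if you need every answer.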

Apologies. I tried that, got an error, and assumed I had to send only a text field. I just rechecked the trace, and the cause was a badly formatted file. As you expected, it works perfectly out of the box. Thank you.
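
For anyone who hits the same kind of error, a quick sanity check of the input file before loading it can save some time. This is a rough sketch in plain Python rather than anything Prodigy provides. The function name `find_bad_jsonl_lines` is made up, and requiring a "text" key is an assumption based on the example above:

```python
import json
from typing import Iterable, List, Tuple


def find_bad_jsonl_lines(
    lines: Iterable[str], required_key: str = "text"
) -> List[Tuple[int, str]]:
    """Return (line_number, problem) pairs for lines that are not usable records."""
    problems = []
    for number, line in enumerate(lines, start=1):
        if not line.strip():
            continue  # ignore blank lines
        try:
            record = json.loads(line)
        except json.JSONDecodeError as err:
            problems.append((number, f"invalid JSON: {err.msg}"))
            continue
        if not isinstance(record, dict):
            problems.append((number, "not a JSON object"))
        elif required_key not in record:
            problems.append((number, f"missing '{required_key}' key"))
    return problems


sample = [
    '{"text": "some candidate text", "External_ID": 12345}',
    "{'text': 'single quotes are not valid JSON'}",
    '{"External_ID": 67890}',
]
for number, problem in find_bad_jsonl_lines(sample):
    print(f"line {number}: {problem}")
```

To check a real file, pass an open file handle instead of the `sample` list.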
