Displaying Span/Token Metadata

Mindful · February 24, 2021, 1:57am

I've not used Prodigy yet so I apologize if there's a straightforward answer to this question that I missed somewhere; I couldn't find an obvious answer in the documentation.

I'm hoping to use Prodigy to annotate relation data in sentences with grammatical errors. In terms of actual annotation functionality, it looks like the relations interface will work perfectly. The issue is that I would like to display information about corrections made to the sentence to help inform how these relations are annotated. I'm hoping for an end result where the parts of the sentence that were corrected are highlighted or otherwise visually distinct, and the content of each correction is visible either directly under the corrected text or in a popup on mouseover. More generally, I'm hoping to assign metadata in the form of text to specific tokens/spans and have it visible while annotation is being performed.

Basically I'd like to be able to pass in something like this:

{
  "text": "My friend is always to run",
  "tokens": [
    {"text": "My", "start": 0, "end": 2, "id": 0, "ws": true},
    {"text": "friend", "start": 3, "end": 9, "id": 1, "ws": true},
    {"text": "is", "start": 10, "end": 12, "id": 2, "ws": true},
    {"text": "always", "start": 13, "end": 19, "id": 3, "ws": true},
    {"text": "to", "start": 20, "end": 22, "id": 4, "ws": true},
    {"text": "run", "start": 23, "end": 26, "id": 5, "ws": false}
  ],
  "spans": [
    {"start": 20, "end": 26, "token_start": 4, "token_end": 5, "correction": "running"}
  ],
  ...
}

And have some way to display "running" as the correction for "to run", without it being an actual span label for the purposes of training or being changeable/interactive in any way during annotation. The actual annotation component is straightforward span/relation annotation that seems like it should be easy with Prodigy's existing features.

Is there an easy way to do something like this? I see that there's a metadata field, but it appears to be for passing in metadata about the entire sentence/document that gets displayed in the lower right-hand corner. I'm hoping there's a relatively straightforward way to attach metadata to tokens and/or spans.

ines · February 24, 2021, 12:42pm

Hi! From what you describe, using the "label" key here could actually work and might be the easiest solution. It would display the text nicely below the token(s) it refers to. If you're not annotating relations and spans jointly, the span labels will be static and not editable, so they'll just function as a visual guide. And you don't have to actually use this for training later on – you could just strip out the spans or span labels afterwards or just use the "relations" data.
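As a rough sketch of the "label" approach, the snippet below copies each span's custom "correction" value into its "label" key before the task is sent out for annotation. The "correction" key comes from the example in the question, and the helper name corrections_to_labels is made up for illustration; neither is part of Prodigy itself.

```python
import copy


def corrections_to_labels(task):
    """Return a copy of the task where each span's "correction" is also set as its "label".

    "correction" is the custom key from the question's example, not a Prodigy field.
    """
    task = copy.deepcopy(task)  # leave the original task untouched
    for span in task.get("spans", []):
        if "correction" in span:
            span["label"] = span["correction"]
    return task


example = {
    "text": "My friend is always to run",
    "spans": [
        {"start": 20, "end": 26, "token_start": 4, "token_end": 5, "correction": "running"}
    ],
}

labelled = corrections_to_labels(example)
print(labelled["spans"][0]["label"])
```

Because the original "correction" key is kept alongside "label", the labels can be stripped again after annotation without losing the correction data.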

Alternatively, the "meta" field could be an option as well – but it will be more separate. Anything you put in "meta" will be displayed in the bottom right corner, as static meta information. So you could set up custom keys in it programmatically, like "run": "running".
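A minimal sketch of the "meta" variant, assuming the same custom "correction" key on spans as in the question: it looks up the original text of each corrected span and adds an entry mapping that text to its correction. The helper name corrections_to_meta is hypothetical, and using the span text as the meta key is just one possible choice.

```python
def corrections_to_meta(task):
    """Return a copy of the task with one "meta" entry per corrected span.

    Each entry maps the original span text to its "correction" value
    ("correction" is the custom key from the question, not a Prodigy field).
    """
    text = task["text"]
    meta = dict(task.get("meta", {}))  # keep any meta that is already there
    for span in task.get("spans", []):
        if "correction" in span:
            original = text[span["start"]:span["end"]]
            meta[original] = span["correction"]
    return {**task, "meta": meta}


example = {
    "text": "My friend is always to run",
    "spans": [
        {"start": 20, "end": 26, "token_start": 4, "token_end": 5, "correction": "running"}
    ],
}

with_meta = corrections_to_meta(example)
print(with_meta["meta"])
```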

Mindful · February 24, 2021, 1:46pm

I've honestly not figured out how much span annotating I'll need to be doing (definitely relations though), but in any case it sounds like the reasonable first step is to try this using the label key. Thanks very much!

Edit: Just in case anyone else has a similar issue with GEC-related work, I ultimately resolved this by just using the diff view. The labels idea would have been perfect if I didn't occasionally need to annotate span labels as well (i.e. if I were just doing relations), but it turned out that I do, so the diff view was a better way to communicate the same information.
