Hello,
I am a bit confused about the format for NER annotations.
In the documentation I see:
{
  "text": "Hello Apple",
  "tokens": [
    {"text": "Hello", "start": 0, "end": 5, "id": 0},
    {"text": "Apple", "start": 6, "end": 11, "id": 1}
  ],
  "spans": [
    {"start": 6, "end": 11, "label": "ORG", "token_start": 1, "token_end": 1}
  ]
}
Why do token_start and token_end have the same value? I thought this was just a quirk of the example, but no: I have to set token_end to my_span.end - 1 for the spans to display correctly in the ner.manual interface.
Could anyone explain the reason? Basically, I pre-annotate my sentences and then use ner.manual to check them and add other labels.
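
To make the question concrete, here is a minimal sketch of the conversion described above. It assumes the pre-annotations use an exclusive end token index (as a spaCy Span's .end does), while token_end in the documented format appears to be inclusive, i.e. the index of the last token in the span. The helper name to_prodigy_span is hypothetical and not part of any library.

```python
def to_prodigy_span(tokens, start, end, label):
    """Build a span dict from a token range [start, end).

    Assumption: `end` is exclusive (one past the last token), whereas
    "token_end" in the output is the index of the last token itself.
    """
    if not 0 <= start < end <= len(tokens):
        raise ValueError("expected 0 <= start < end <= len(tokens)")
    first, last = tokens[start], tokens[end - 1]
    return {
        "start": first["start"],     # character offset of the first token
        "end": last["end"],          # character offset after the last token
        "label": label,
        "token_start": first["id"],  # index of the first token in the span
        "token_end": last["id"],     # index of the last token (end - 1)
    }


tokens = [
    {"text": "Hello", "start": 0, "end": 5, "id": 0},
    {"text": "Apple", "start": 6, "end": 11, "id": 1},
]

# "Apple" is the token range [1, 2) when the end index is exclusive.
span = to_prodigy_span(tokens, 1, 2, "ORG")
print(span)
```

With a single-token span like "Apple", this gives token_start and token_end both equal to 1, which matches the example from the documentation.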