Annotation spans are displayed incorrectly in the browser when re-annotating

Hi! I have a JSONL file with annotations that I created with ner.make_gold. After some time, I realized that some annotations were missing, so I want to add them with ner.manual while keeping the annotations that are already there. However, the annotated spans I see in the browser differ from what is in the JSONL file.
For example, I have these JSON objects:

{"text": "UK based operator Hurricane Energy has announced that the vessel destined for it 's West of Shetland development has arrived in Dubai . The floating production and storage offloading vessel ( FPSO ) Aoka Mizu is now at the Drydocks World Dubai Shipyard . The vessel is set to be used for the firm 's flagship Lancaster field . Drydocks World is [ ... ] The post Hurricane 's FPSO arrives in Dubai appeared first on Energy Voice .", "spans": [{"start": 18, "end": 34, "token_start": 3, "token_end": 5, "label": "ORG"}, {"start": 362, "end": 371, "token_start": 67, "token_end": 68, "label": "ORG"}], "answer": "accept"}
{"text": "In the latest issue of EGR Intel , Global Gaming CEO Stefan Olsson provided his thoughts on the company 's recent progress , as well as giving readers an insight into its future plans . The Q&A sees our CEO talk at length about its unique PayNPlay solution , which has revolutionised the online casino world with its non - registration platform and instant payouts , plus how the recent NASDAQ listing has impacted the company , as well as the ambition to expand Global Gaming 's international operations . http://news.cision.com/global-gaming-555-ab/r/global-gaming-ceo-discusses-long-term-ambitions-with-egr-intel,c2421917 http://mb.cision.com/Public/16275/2421917/8fd0e3f0df4f6a93.pdf", "spans": [{"start": 27, "end": 32, "token_start": 6, "token_end": 7, "label": "ORG"}, {"start": 35, "end": 48, "token_start": 8, "token_end": 10, "label": "ORG"}, {"start": 463, "end": 476, "token_start": 84, "token_end": 86, "label": "ORG"}], "answer": "accept"}

In the browser, the annotated spans are incorrect:


But if I accept them without changing anything and then run ner.print-database, the annotated spans are actually correct:

Do you have any idea why this is happening? Am I missing something about tokenization?

Thanks for the report and the example. This is very strange indeed! I just tried it and you're right: the character offsets are correct, but there seems to be an off-by-one error in the start and end token indices.

The good news: as a very quick fix, remove "token_start" and "token_end" from the spans, and it should work as expected again, at least assuming you're tokenizing with the same model. Prodigy will then use only the character offsets to map the spans to tokens, which should give the correct indices.
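
A minimal sketch of that quick fix in Python, assuming the annotations are stored as one JSON object per line, as in the examples above. The function names and the file names in the usage comment are hypothetical, and this is plain JSON handling rather than a Prodigy API:

```python
import json

def strip_token_indices(example):
    """Return a copy of the task with token_start/token_end dropped from every span.

    If the task has no "spans" key, the copy gets an empty list.
    """
    spans = [
        {key: value for key, value in span.items()
         if key not in ("token_start", "token_end")}
        for span in example.get("spans", [])
    ]
    return {**example, "spans": spans}

def strip_jsonl_lines(lines):
    """Yield cleaned JSONL lines; blank lines are skipped."""
    for line in lines:
        if line.strip():
            yield json.dumps(strip_token_indices(json.loads(line)))

# Usage with hypothetical file names:
# with open("annotations.jsonl", encoding="utf8") as src, \
#         open("annotations_clean.jsonl", "w", encoding="utf8") as dst:
#     for cleaned in strip_jsonl_lines(src):
#         dst.write(cleaned + "\n")
```

The character offsets, labels and every other key are left untouched, so the cleaned file should load the same way as the original one, just without the stored token indices.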

To help me debug this: Do you still have the original token mappings, i.e. the "tokens" key that was present in the examples? And do you know which version of Prodigy you used when you created the examples?

Thank you for the solution! It works perfectly when token_start and token_end are not present :)
The Prodigy version I used is 1.6.1.
These are the examples with "tokens":

{"text":"UK based operator Hurricane Energy has announced that the vessel destined for it 's West of Shetland development has arrived in Dubai . The floating production and storage offloading vessel ( FPSO ) Aoka Mizu is now at the Drydocks World Dubai Shipyard . The vessel is set to be used for the firm 's flagship Lancaster field . Drydocks World is [ ... ] \n The post Hurricane 's FPSO arrives in Dubai appeared first on Energy Voice .","spans":[{"start":18,"end":34,"label":"ORG","text":"Hurricane Energy","answer":"accept","token_start":3,"token_end":4},{"start":364,"end":373,"label":"ORG","text":"Hurricane","answer":"accept","token_start":68,"token_end":68}],"_input_hash":-1997449806,"_task_hash":401810758,"tokens":[{"text":"UK","start":0,"end":2,"id":0},{"text":"based","start":3,"end":8,"id":1},{"text":"operator","start":9,"end":17,"id":2},{"text":"Hurricane","start":18,"end":27,"id":3},{"text":"Energy","start":28,"end":34,"id":4},{"text":"has","start":35,"end":38,"id":5},{"text":"announced","start":39,"end":48,"id":6},{"text":"that","start":49,"end":53,"id":7},{"text":"the","start":54,"end":57,"id":8},{"text":"vessel","start":58,"end":64,"id":9},{"text":"destined","start":65,"end":73,"id":10},{"text":"for","start":74,"end":77,"id":11},{"text":"it","start":78,"end":80,"id":12},{"text":"'s","start":81,"end":83,"id":13},{"text":"West","start":84,"end":88,"id":14},{"text":"of","start":89,"end":91,"id":15},{"text":"Shetland","start":92,"end":100,"id":16},{"text":"development","start":101,"end":112,"id":17},{"text":"has","start":113,"end":116,"id":18},{"text":"arrived","start":117,"end":124,"id":19},{"text":"in","start":125,"end":127,"id":20},{"text":"Dubai","start":128,"end":133,"id":21},{"text":".","start":134,"end":135,"id":22},{"text":"The","start":136,"end":139,"id":23},{"text":"floating","start":140,"end":148,"id":24},{"text":"production","start":149,"end":159,"id":25},{"text":"and","start":160,"end":163,"id":26},{"text":"storage","start":164,"end":171,"id":27}
,{"text":"offloading","start":172,"end":182,"id":28},{"text":"vessel","start":183,"end":189,"id":29},{"text":"(","start":190,"end":191,"id":30},{"text":"FPSO","start":192,"end":196,"id":31},{"text":")","start":197,"end":198,"id":32},{"text":"Aoka","start":199,"end":203,"id":33},{"text":"Mizu","start":204,"end":208,"id":34},{"text":"is","start":209,"end":211,"id":35},{"text":"now","start":212,"end":215,"id":36},{"text":"at","start":216,"end":218,"id":37},{"text":"the","start":219,"end":222,"id":38},{"text":"Drydocks","start":223,"end":231,"id":39},{"text":"World","start":232,"end":237,"id":40},{"text":"Dubai","start":238,"end":243,"id":41},{"text":"Shipyard","start":244,"end":252,"id":42},{"text":".","start":253,"end":254,"id":43},{"text":"The","start":255,"end":258,"id":44},{"text":"vessel","start":259,"end":265,"id":45},{"text":"is","start":266,"end":268,"id":46},{"text":"set","start":269,"end":272,"id":47},{"text":"to","start":273,"end":275,"id":48},{"text":"be","start":276,"end":278,"id":49},{"text":"used","start":279,"end":283,"id":50},{"text":"for","start":284,"end":287,"id":51},{"text":"the","start":288,"end":291,"id":52},{"text":"firm","start":292,"end":296,"id":53},{"text":"'s","start":297,"end":299,"id":54},{"text":"flagship","start":300,"end":308,"id":55},{"text":"Lancaster","start":309,"end":318,"id":56},{"text":"field","start":319,"end":324,"id":57},{"text":".","start":325,"end":326,"id":58},{"text":"Drydocks","start":327,"end":335,"id":59},{"text":"World","start":336,"end":341,"id":60},{"text":"is","start":342,"end":344,"id":61},{"text":"[","start":345,"end":346,"id":62},{"text":"...","start":347,"end":350,"id":63},{"text":"]","start":351,"end":352,"id":64},{"text":"\n 
","start":353,"end":355,"id":65},{"text":"The","start":355,"end":358,"id":66},{"text":"post","start":359,"end":363,"id":67},{"text":"Hurricane","start":364,"end":373,"id":68},{"text":"'s","start":374,"end":376,"id":69},{"text":"FPSO","start":377,"end":381,"id":70},{"text":"arrives","start":382,"end":389,"id":71},{"text":"in","start":390,"end":392,"id":72},{"text":"Dubai","start":393,"end":398,"id":73},{"text":"appeared","start":399,"end":407,"id":74},{"text":"first","start":408,"end":413,"id":75},{"text":"on","start":414,"end":416,"id":76},{"text":"Energy","start":417,"end":423,"id":77},{"text":"Voice","start":424,"end":429,"id":78},{"text":".","start":430,"end":431,"id":79}],"answer":"accept"}
{"text":"In the latest issue of EGR Intel , Global Gaming CEO Stefan Olsson provided his thoughts on the company 's recent progress , as well as giving readers an insight into its future plans . The Q&A sees our CEO talk at length about its unique PayNPlay solution , which has revolutionised the online casino world with its non - registration platform and instant payouts , plus how the recent NASDAQ listing has impacted the company , as well as the ambition to expand Global Gaming 's international operations . \n http://news.cision.com/global-gaming-555-ab/r/global-gaming-ceo-discusses-long-term-ambitions-with-egr-intel,c2421917 \n http://mb.cision.com/Public/16275/2421917/8fd0e3f0df4f6a93.pdf","spans":[{"start":35,"end":48,"token_start":8,"token_end":9,"label":"ORG"},{"start":463,"end":476,"token_start":84,"token_end":85,"label":"ORG"}],"_input_hash":-56640068,"_task_hash":-218024407,"tokens":[{"text":"In","start":0,"end":2,"id":0},{"text":"the","start":3,"end":6,"id":1},{"text":"latest","start":7,"end":13,"id":2},{"text":"issue","start":14,"end":19,"id":3},{"text":"of","start":20,"end":22,"id":4},{"text":"EGR","start":23,"end":26,"id":5},{"text":"Intel","start":27,"end":32,"id":6},{"text":",","start":33,"end":34,"id":7},{"text":"Global","start":35,"end":41,"id":8},{"text":"Gaming","start":42,"end":48,"id":9},{"text":"CEO","start":49,"end":52,"id":10},{"text":"Stefan","start":53,"end":59,"id":11},{"text":"Olsson","start":60,"end":66,"id":12},{"text":"provided","start":67,"end":75,"id":13},{"text":"his","start":76,"end":79,"id":14},{"text":"thoughts","start":80,"end":88,"id":15},{"text":"on","start":89,"end":91,"id":16},{"text":"the","start":92,"end":95,"id":17},{"text":"company","start":96,"end":103,"id":18},{"text":"'s","start":104,"end":106,"id":19},{"text":"recent","start":107,"end":113,"id":20},{"text":"progress","start":114,"end":122,"id":21},{"text":",","start":123,"end":124,"id":22},{"text":"as","start":125,"end":127,"id":23},{"text":"well","start":128,"end"
:132,"id":24},{"text":"as","start":133,"end":135,"id":25},{"text":"giving","start":136,"end":142,"id":26},{"text":"readers","start":143,"end":150,"id":27},{"text":"an","start":151,"end":153,"id":28},{"text":"insight","start":154,"end":161,"id":29},{"text":"into","start":162,"end":166,"id":30},{"text":"its","start":167,"end":170,"id":31},{"text":"future","start":171,"end":177,"id":32},{"text":"plans","start":178,"end":183,"id":33},{"text":".","start":184,"end":185,"id":34},{"text":"The","start":186,"end":189,"id":35},{"text":"Q&A","start":190,"end":193,"id":36},{"text":"sees","start":194,"end":198,"id":37},{"text":"our","start":199,"end":202,"id":38},{"text":"CEO","start":203,"end":206,"id":39},{"text":"talk","start":207,"end":211,"id":40},{"text":"at","start":212,"end":214,"id":41},{"text":"length","start":215,"end":221,"id":42},{"text":"about","start":222,"end":227,"id":43},{"text":"its","start":228,"end":231,"id":44},{"text":"unique","start":232,"end":238,"id":45},{"text":"PayNPlay","start":239,"end":247,"id":46},{"text":"solution","start":248,"end":256,"id":47},{"text":",","start":257,"end":258,"id":48},{"text":"which","start":259,"end":264,"id":49},{"text":"has","start":265,"end":268,"id":50},{"text":"revolutionised","start":269,"end":283,"id":51},{"text":"the","start":284,"end":287,"id":52},{"text":"online","start":288,"end":294,"id":53},{"text":"casino","start":295,"end":301,"id":54},{"text":"world","start":302,"end":307,"id":55},{"text":"with","start":308,"end":312,"id":56},{"text":"its","start":313,"end":316,"id":57},{"text":"non","start":317,"end":320,"id":58},{"text":"-","start":321,"end":322,"id":59},{"text":"registration","start":323,"end":335,"id":60},{"text":"platform","start":336,"end":344,"id":61},{"text":"and","start":345,"end":348,"id":62},{"text":"instant","start":349,"end":356,"id":63},{"text":"payouts","start":357,"end":364,"id":64},{"text":",","start":365,"end":366,"id":65},{"text":"plus","start":367,"end":371,"id":66},{"text":"how","start":372
,"end":375,"id":67},{"text":"the","start":376,"end":379,"id":68},{"text":"recent","start":380,"end":386,"id":69},{"text":"NASDAQ","start":387,"end":393,"id":70},{"text":"listing","start":394,"end":401,"id":71},{"text":"has","start":402,"end":405,"id":72},{"text":"impacted","start":406,"end":414,"id":73},{"text":"the","start":415,"end":418,"id":74},{"text":"company","start":419,"end":426,"id":75},{"text":",","start":427,"end":428,"id":76},{"text":"as","start":429,"end":431,"id":77},{"text":"well","start":432,"end":436,"id":78},{"text":"as","start":437,"end":439,"id":79},{"text":"the","start":440,"end":443,"id":80},{"text":"ambition","start":444,"end":452,"id":81},{"text":"to","start":453,"end":455,"id":82},{"text":"expand","start":456,"end":462,"id":83},{"text":"Global","start":463,"end":469,"id":84},{"text":"Gaming","start":470,"end":476,"id":85},{"text":"'s","start":477,"end":479,"id":86},{"text":"international","start":480,"end":493,"id":87},{"text":"operations","start":494,"end":504,"id":88},{"text":".","start":505,"end":506,"id":89},{"text":"\n ","start":507,"end":509,"id":90},{"text":"http://news.cision.com/global-gaming-555-ab/r/global-gaming-ceo-discusses-long-term-ambitions-with-egr-intel,c2421917","start":509,"end":626,"id":91},{"text":"\n ","start":627,"end":629,"id":92},{"text":"http://mb.cision.com/Public/16275/2421917/8fd0e3f0df4f6a93.pdf","start":629,"end":691,"id":93}],"answer":"accept"}

Note that the token and character indices here are not necessarily the same as in my post above. I just tried these two examples as well. As they are, the annotations show up differently in the browser, but if I remove token_start and token_end, they are displayed correctly.
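
For anyone who wants to check their own data, here is a rough sketch that recomputes the token indices from the character offsets and lists the spans whose stored values disagree. It assumes each example still has its "tokens" key and that "token_end" is inclusive, as in the two examples with tokens above. The function names are made up for illustration and are not part of Prodigy:

```python
def token_indices_from_offsets(span, tokens):
    """Look up the ids of the tokens that start and end exactly at the span's offsets.

    Returns (token_start, token_end) with an inclusive end; a value is None
    if no token boundary matches the character offset.
    """
    start_ids = {token["start"]: token["id"] for token in tokens}
    end_ids = {token["end"]: token["id"] for token in tokens}
    return start_ids.get(span["start"]), end_ids.get(span["end"])

def find_mismatched_spans(example):
    """Return (span, expected_indices) pairs whose stored token indices differ."""
    mismatches = []
    for span in example.get("spans", []):
        expected = token_indices_from_offsets(span, example["tokens"])
        stored = (span.get("token_start"), span.get("token_end"))
        if stored != expected:
            mismatches.append((span, expected))
    return mismatches

# First six tokens of the first example above.
tokens = [
    {"text": "UK", "start": 0, "end": 2, "id": 0},
    {"text": "based", "start": 3, "end": 8, "id": 1},
    {"text": "operator", "start": 9, "end": 17, "id": 2},
    {"text": "Hurricane", "start": 18, "end": 27, "id": 3},
    {"text": "Energy", "start": 28, "end": 34, "id": 4},
    {"text": "has", "start": 35, "end": 38, "id": 5},
]
# Span as it appears in the first post: same offsets, but token_end is 5.
example = {
    "tokens": tokens,
    "spans": [{"start": 18, "end": 34, "token_start": 3, "token_end": 5, "label": "ORG"}],
}
for span, expected in find_mismatched_spans(example):
    stored = (span["token_start"], span["token_end"])
    print(span["start"], span["end"], "stored:", stored, "expected:", expected)
```

Spans reported by this check are the ones where the stored token indices do not match the character offsets, so they are candidates for having "token_start" and "token_end" removed.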