Display issue when NER tag order doesn't match word order

When the order of NER tags in spans doesn’t match the order of words in the sentence, the UI jumps back in the sentence and repeats some text (see picture). It’s really easy to fix by just ensuring the spans list is in sentence order, but I wanted to pass it along in case it produces other unexpected behavior.

{'spans': [{'end': 151, 'label': 'LOCATION', 'start': 142},
           {'end': 114, 'label': 'VERB', 'start': 108}],
 'text': 'If the Syrian Arab Army successfully overtakes the city from the Islamic State, they will be in position to attack the Jirah Airbase near the Euphrates River for the first time in 4 years.',
 'word': 'Euphrates'}

Thanks for the report – this sucks, sorry about that!

I just had a look at the implementation and I think the problem here is that the spans in the manual NER interface are currently sorted by "start" when they’re added or modified, but not actually on the very first render. So if the incoming, pre-defined spans are not in order, the highlighting algorithm messes up the text, as it iterates over the spans in order and slices the text accordingly.
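
To illustrate the failure mode, here is a minimal sketch of a slice-based highlighter. This is not the actual interface code: `render_segments`, the bracket markup, and the shortened example sentence are all made up for illustration. It only assumes that the renderer walks the spans in list order and slices the text between them.

```python
def render_segments(text, spans):
    """Hypothetical highlighter: walk the spans in list order and slice the text."""
    segments = []
    offset = 0
    for span in spans:
        # Plain text between the end of the previous span and the start of this one.
        # If this span starts before `offset`, the slice is simply empty.
        segments.append(text[offset:span["start"]])
        # The highlighted span itself, marked up with its label.
        highlighted = text[span["start"]:span["end"]]
        segments.append("[" + highlighted + " " + span["label"] + "]")
        offset = span["end"]
    # Whatever is left after the last span in the list.
    segments.append(text[offset:])
    return "".join(segments)


text = "They will attack the airbase near the Euphrates River."
# Same situation as in the report: the later span comes first in the list.
spans = [
    {"start": 38, "end": 47, "label": "LOCATION"},
    {"start": 10, "end": 16, "label": "VERB"},
]

print(render_segments(text, spans))
print(render_segments(text, sorted(spans, key=lambda span: span["start"])))
```

With the unsorted list, the second span resets the offset to an earlier position, so everything after "attack" is emitted a second time. With the sorted list, each slice picks up where the previous one ended and the sentence is rendered once.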

Will fix this for the next release, thanks! :+1:


No worries! Not a big issue and it was really easy to work around.

(Specifically, like this: sorted(span_list, key=lambda k: k['start']))
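
Applied to a whole stream of tasks, the same workaround might look like the sketch below. `sort_spans` is a hypothetical helper name, and it assumes each task is a dict with an optional 'spans' list shaped like the example in the report.

```python
def sort_spans(stream):
    """Yield copies of the incoming tasks with their spans in sentence order."""
    for task in stream:
        # Shallow copy so the caller's original task dict is left untouched.
        task = dict(task)
        task["spans"] = sorted(task.get("spans", []), key=lambda span: span["start"])
        yield task


# The spans from the report, in their original (out-of-order) sequence.
tasks = [
    {
        "text": "If the Syrian Arab Army successfully overtakes the city from the Islamic State, they will be in position to attack the Jirah Airbase near the Euphrates River for the first time in 4 years.",
        "spans": [
            {"end": 151, "label": "LOCATION", "start": 142},
            {"end": 114, "label": "VERB", "start": 108},
        ],
    }
]

for task in sort_spans(tasks):
    print([(span["start"], span["label"]) for span in task["spans"]])
```

Writing it as a generator means it can wrap any iterable of tasks without loading everything into memory first.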