Fully manual NER annotations without tokeniser

Hello, I am new to Prodigy.

My task is about annotating specific characters in text with categories. I don't want to use spaCy for modelling, so I am struggling with the requirement to tokenize the text for ner.manual.

I would like to get annotations with respect to original text rather than the token annotations coming out of ner.manual. I've tried searching for the answer in the forum but failed so far.

Any ideas if what I want is possible?

Hi! Your question is timed well, because v1.10 will actually have a mode for this out-of-the-box that just lets you highlight characters :slightly_smiling_face:

In the meantime, you could also achieve something similar by making each character an entry in the "tokens". They're called "tokens", but in reality they're mostly just a highlightable unit. You'd then probably also want to adjust the margin of the .prodigy-content span elements (the characters) so they're not spaced as far apart.
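A rough sketch of what the character-per-token idea could look like in Python. The helper names `char_tokens` and `make_task` are made up for illustration, and the token fields ("text", "start", "end", "id", "ws") assume the usual pre-tokenized task format, so double-check them against the docs for your version:

```python
import json


def char_tokens(text):
    """Build one token dict per character of `text` (hypothetical helper)."""
    tokens = []
    for i, ch in enumerate(text):
        tokens.append({
            "text": ch,       # the single character
            "start": i,       # character offset where the token begins
            "end": i + 1,     # character offset where the token ends
            "id": i,          # token index
            # Assumption: spaces are tokens themselves, so no token
            # is marked as being followed by extra whitespace.
            "ws": False,
        })
    return tokens


def make_task(text):
    """Wrap the text and its character-level tokens in one task dict."""
    return {"text": text, "tokens": char_tokens(text)}


# Print one JSONL line for a small example text.
print(json.dumps(make_task("AB-12")))
```

If you write one such line per text to a JSONL file, the recipe should be able to use the provided "tokens" as the highlightable units.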

Hi Ines. Thanks for the response and I am happy my request has been in the works already. I tried separating by character but didn't know about the span control. I will try that out while you finalise the 1.10 release.

Just released Prodigy v1.10, which includes a --highlight-chars flag that lets you highlight characters instead. Also see here for details and examples.
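Since the original question was about getting annotations relative to the original text: a small sketch of reading them back, assuming each saved example has a "spans" list whose entries carry "start" and "end" character offsets plus a "label". The function name `span_texts` and the label "PREFIX" are made up for the example:

```python
def span_texts(example):
    """Return (label, substring) pairs using the character offsets of each span."""
    text = example["text"]
    return [
        (span["label"], text[span["start"]:span["end"]])
        for span in example.get("spans", [])
    ]


# Hypothetical annotated example; "PREFIX" is an illustrative label.
example = {
    "text": "AB-12",
    "spans": [{"start": 0, "end": 2, "label": "PREFIX"}],
}
print(span_texts(example))  # [('PREFIX', 'AB')]
```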