I want to make a precise selection. When I try to select, for example, "Name" in "Name\abc" or "Name[abc", the whole string gets selected instead of only "Name". I want to be able to correct this, but I don't know how: I don't see an option anywhere to change the tagged text.
Thanks for your question and welcome to the Prodigy community!
Are you looking for character-based highlighting?
Per the docs, you can set the --highlight-chars flag to allow highlighting individual characters instead of only tokens. This will only store the character offsets of your annotation and won’t add a "tokens" property to the saved task.
But it's important to highlight this from the docs:
When using character-based highlighting, annotation may be slower and there’s no guarantee that the spans you annotate map to actual tokens later on. If your goal is to train a named entity recognizer, you should consider using the same tokenizer during annotation, to make sure that your data can be used.
The key point: make sure the tokenizer you use during annotation is the same one you plan to use for training. If you're not careful, you could run into tokenization alignment problems, where an annotated span doesn't line up with token boundaries.
If you can find ways to modify your tokenizer early on, you may save yourself from headaches down the road due to mismatched tokens.
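To make the alignment issue concrete, here's a rough sketch in plain Python. It does not use Prodigy or spaCy: `tokenize` and `span_aligns` are hypothetical helpers, and the whitespace tokenizer is only a stand-in for whatever tokenizer you actually train with. It shows why a character span over "Name" can't map to a token when "Name\abc" is a single token, and how splitting on `\` and `[` fixes that.

```python
import re


def tokenize(text, split_chars=""):
    """Return (start, end) character offsets for each token.

    Stand-in tokenizer (an assumption for illustration): splits on
    whitespace, and every character in `split_chars` becomes its own token.
    """
    if split_chars:
        escaped = re.escape(split_chars)
        pattern = rf"[{escaped}]|[^\s{escaped}]+"
    else:
        pattern = r"\S+"
    return [m.span() for m in re.finditer(pattern, text)]


def span_aligns(span, token_offsets):
    """True if the span starts at a token start and ends at a token end."""
    start, end = span
    starts = {s for s, _ in token_offsets}
    ends = {e for _, e in token_offsets}
    return start in starts and end in ends


text = "Name\\abc is here"  # the string Name\abc is here
name_span = (0, 4)          # character offsets of "Name"

# Whitespace-only tokenization keeps Name\abc as one token,
# so the "Name" span ends in the middle of a token.
plain = tokenize(text)
print(span_aligns(name_span, plain))

# Splitting on backslash and "[" gives "Name" its own token.
custom = tokenize(text, split_chars="\\[")
print(span_aligns(name_span, custom))
```

With a real pipeline, the equivalent step would be to customize the tokenizer rules of the model you annotate and train with, then run a check like this over your saved spans before training.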
Hope this helps!