I was doing NER + relation annotations with a similar interface to your example:
prodigy rel.manual rel_bio en_core_web_sm ./bio_events.jsonl --label Theme,Cause --span-label GGP,Gene_Expr,Transcr,Prot_Cat,Phosph,Loc,Bind,Reg,Reg+,Reg- --wrap
It would be quite useful to be able to label spans at the character level, as you can do in the ner.manual recipe with the --highlight-chars flag. However, the rel.manual recipe doesn't seem to accept --highlight-chars as an argument. Is there any workaround for this?
Thanks a lot in advance
Hi! One option to do this would be to provide data with "tokens" that map to individual characters, or at least to more fine-grained chunks you want to be able to highlight individually.
The relations UI is more complex than the manual NER interface, so being able to connect every character to possibly every other character adds even more complexity on top and can make it a lot harder to navigate. So if possible, I would recommend choosing an in-between scheme that gives you smaller units for tokens where it matters and leaves others intact (so you're not splitting a n d for no reason). For instance, if you're working with biomedical texts, you could have more rigorous splitting rules for punctuation like hyphens or maybe even numbers.
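As a rough illustration of that in-between scheme, here is a minimal sketch that pre-computes "tokens" for each task before loading the data. The splitting rule (letter runs, digit runs, and every other non-space character on its own) and the function names fine_tokens and add_tokens are assumptions made up for this example, not part of Prodigy. The token dicts use the "text", "start", "end", "id" and "ws" keys of Prodigy's token format; it's worth checking on a few examples that the recipe keeps the tokens you provide.

```python
import json
import re

# Hypothetical splitting rule: a run of letters, a run of digits, or any other
# single non-space character (hyphen, slash, bracket, ...) becomes one token.
TOKEN_RE = re.compile(r"[^\W\d_]+|\d+|\S")


def fine_tokens(text):
    """Return token dicts with character offsets into `text`."""
    tokens = []
    for i, match in enumerate(TOKEN_RE.finditer(text)):
        end = match.end()
        tokens.append({
            "text": match.group(),
            "start": match.start(),
            "end": end,
            "id": i,
            # "ws": whether the token is followed by whitespace
            "ws": end < len(text) and text[end].isspace(),
        })
    return tokens


def add_tokens(task):
    """Return a copy of a task dict (with a "text" key) plus its "tokens"."""
    return {**task, "tokens": fine_tokens(task["text"])}


if __name__ == "__main__":
    # Write one JSON object per line, e.g. to build a tokenized .jsonl file.
    print(json.dumps(add_tokens({"text": "IL-2 up-regulates NF-kappaB"})))
```

With this rule, "IL-2" becomes three tokens ("IL", "-", "2"), so each part can be highlighted separately, while ordinary words stay whole.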
Thanks for the answer, that makes a lot of sense. I'll use a tokenizer that splits the text into smaller units.