Label annotations: common word highlight feature

I have a proposal that I think would make annotating faster.
I use Prodigy regularly, and the spacy.llm feature has really come in handy.
Even so, I'd like to suggest a feature for Prodigy. Say you're annotating a label such as METHOD and the text contains words like sampling, trawling and so on. Once you click on one occurrence of sampling, Prodigy would automatically highlight every other occurrence of sampling in that text.
The same applies to trawling. This would speed up annotation, since every occurrence of a word gets picked up as soon as you label the first one.

For example, in the view below, the first click on trawl should trigger Prodigy to highlight all the other occurrences of trawl.
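
A rough sketch of the behaviour I have in mind, written as plain Python rather than anything Prodigy provides today. The function name `propagate_label` is hypothetical, and I'm assuming whole-word, case-insensitive matching and spans expressed as character offsets with `start`, `end` and `label` keys. A real task would probably also need token indices.

```python
import re


def propagate_label(text, clicked_word, label):
    """Return one span dict per whole-word occurrence of clicked_word in text.

    Hypothetical helper: matching is case-insensitive and limited to
    whole words, so clicking "trawl" does not also select "trawling".
    """
    pattern = re.compile(r"\b" + re.escape(clicked_word) + r"\b", re.IGNORECASE)
    return [
        {"start": m.start(), "end": m.end(), "label": label}
        for m in pattern.finditer(text)
    ]


text = "Sampling was done by trawling. A second sampling round used a trawl net."

# Clicking the first "sampling" would yield a span for every occurrence.
sampling_spans = propagate_label(text, "sampling", "METHOD")
# Clicking "trawl" only matches the standalone word, not "trawling".
trawl_spans = propagate_label(text, "trawl", "METHOD")

for span in sampling_spans + trawl_spans:
    print(span, text[span["start"]:span["end"]])
```

Whole-word matching is just one possible choice here. Matching on the lemma instead would also catch trawl and trawling together, at the cost of more false positives.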


Hi @foscraft ,
Thanks so much :slight_smile: This is a really neat suggestion!
One reason we haven't had such functionality until now is that NER is generally context-specific, so indiscriminately applying a label to every occurrence of a phrase could produce many false positives.
I do agree, though, that there are also many datasets with very little ambiguity where this would speed up annotation a lot (I recently had exactly this experience when working with financial data).
I've definitely put it on my list to think about :slight_smile:

Just as a reminder, you could also leverage Prodigy patterns to pre-highlight tokens that you've identified as entities. You'd need to restart the annotation server for that, but it's probably worth going through a sample of the data first to get an idea of possible patterns, and then proceeding with the rest of the data using ner.manual + patterns.
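
For illustration, here is a minimal sketch of how such a patterns file could be generated from a list of terms collected during a first pass. The helper names `build_patterns` and `to_jsonl`, the term list, and the file and dataset names in the comments are placeholders, not part of Prodigy. The sketch assumes token-based patterns that match on the lowercase form of each token.

```python
import json


def build_patterns(terms, label):
    """Build one token-based match pattern per term (hypothetical helper).

    Each whitespace-separated word becomes a {"lower": ...} token
    description, so matching ignores case.
    """
    return [
        {"label": label, "pattern": [{"lower": word} for word in term.lower().split()]}
        for term in terms
    ]


def to_jsonl(rows):
    """Serialize a list of dicts as JSONL: one JSON object per line."""
    return "\n".join(json.dumps(row) for row in rows)


# Example terms; in practice these would come from your first annotation pass.
method_terms = ["sampling", "trawling", "bottom trawl"]
patterns = build_patterns(method_terms, "METHOD")
jsonl_text = to_jsonl(patterns)
print(jsonl_text)

# Save jsonl_text to a file (e.g. method_patterns.jsonl), then restart the
# server with something like the following (dataset and paths are placeholders):
#   prodigy ner.manual my_dataset blank:en ./data.jsonl --label METHOD --patterns ./method_patterns.jsonl
```

Because the patterns match on tokens, trawl and trawling would need separate entries, or a pattern on the lemma if the pipeline you load provides one.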

Hello @magdaaniol
You're welcome. Hoping to see this implemented soon. I was using the Prodigy ANN recipe earlier, and it got me thinking about how this combination could speed things up!