I have a proposal that I think would help speed up annotation.
I use Prodigy on a regular basis, and the spacy.llm feature has really come in handy.
Even so, I'd like to suggest a feature for Prodigy. Say you're annotating with a label such as METHOD and the text contains words like sampling, trawling and so on. It would be useful if, once you click on sampling, Prodigy automatically highlighted every other occurrence of sampling in that text.
The same would apply to trawling. This would speed up annotation, because the word is picked up wherever it appears in the text as soon as you click on its first occurrence.
For example, in a view like the one below, the first click on trawl should trigger Prodigy to highlight all the other occurrences of trawl. This would speed up annotation as well.
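As a rough illustration of the requested behavior (this is not an existing Prodigy feature), the propagation step could be sketched in plain Python. The function name `propagate_label` and the sample text are hypothetical; the output uses character-offset dicts with `start`, `end` and `label` keys, similar in shape to the spans Prodigy stores.

```python
import re


def propagate_label(text, term, label):
    """Return one span dict per whole-word, case-insensitive match of `term`."""
    # \b keeps "trawl" from matching inside "trawling".
    pattern = re.compile(rf"\b{re.escape(term)}\b", flags=re.IGNORECASE)
    return [
        {"start": m.start(), "end": m.end(), "label": label}
        for m in pattern.finditer(text)
    ]


# Hypothetical example text: one "click" on "sampling" labels every occurrence.
text = "Sampling was done by trawling. A second sampling used a beam trawl."
spans = propagate_label(text, "sampling", "METHOD")
matched = [text[s["start"]:s["end"]] for s in spans]
```

Matching on whole words is a deliberate choice here: it avoids labelling substrings, but it also means that trawl and trawling are treated as separate terms.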
Hi @foscraft,
Thanks so much! This is a really neat suggestion!
One reason we haven't had such functionality until now is that, in general, NER is context-specific, so indiscriminately applying a label to every occurrence of a phrase can produce many false positives.
I do agree, though, that there are also many datasets where there's very little ambiguity and this would speed up annotation a lot (I recently had exactly the same experience when working with financial data).
I've definitely put it on my list to think about.
Just as a reminder, you could also leverage Prodigy patterns to pre-highlight tokens that you've already identified as entities. You'd need to restart the annotation server for that, but it's probably worth going through a sample of the data first to get an idea of possible patterns, and then proceeding with the rest of the data using ner.manual + patterns.
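To make the patterns workflow concrete, here is a minimal sketch that writes a patterns file from a list of terms. The term list, the METHOD label, the helper name `build_patterns` and the file name `method_patterns.jsonl` are all placeholders for illustration; each line follows the token-based pattern format (`label` plus a list of token attribute dicts), and the `lower` attribute makes the match case-insensitive.

```python
import json


def build_patterns(terms, label):
    """Build one token-based match pattern per term (case-insensitive)."""
    patterns = []
    for term in terms:
        # One dict per whitespace-separated token, matched on its lowercase form.
        token_patterns = [{"lower": tok} for tok in term.lower().split()]
        patterns.append({"label": label, "pattern": token_patterns})
    return patterns


# Placeholder terms collected while annotating a first sample of the data.
terms = ["sampling", "trawling", "trawl"]
patterns = build_patterns(terms, "METHOD")

# Patterns files are JSONL: one JSON object per line.
with open("method_patterns.jsonl", "w", encoding="utf8") as f:
    for pattern in patterns:
        f.write(json.dumps(pattern) + "\n")
```

After restarting the server, the file can be passed to the recipe, for example `prodigy ner.manual method_dataset blank:en ./texts.jsonl --label METHOD --patterns ./method_patterns.jsonl`, where the dataset name and input path are again placeholders. The pre-highlighted matches are only suggestions, so context-dependent false positives can still be corrected by hand.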
You're welcome. Hoping to see this implemented soon. I was using the Prodigy ANN recipe earlier, and it got me thinking about how much this combination could speed things up!