Hi @bev.manz!
Thanks for your question.
Yes, check out this post:
Hi! This is definitely a cool idea! And yes, for that to work you should implement your own function that adds annotated spans to the matcher as patterns. The PatternMatcher currently only sets the pattern-based annotations and is updated with the accept/reject information to update the scores assigned to the patterns (so you can use it in an active learning context where you filter based on certain/uncertain predictions). It doesn't add annotated spans to the patterns, since this isn't alw…
The key is using an update callback to add new spans as patterns:
def update(answers):
    # Dicts aren't hashable, so collect the patterns in a list rather than a set
    patterns = []
    for eg in answers:
        for span in eg.get("spans", []):
            # Get the text of each annotated span given its offsets
            span_text = eg["text"][span["start"]:span["end"]]
            patterns.append({"pattern": span_text, "label": span["label"]})
    matcher.add_patterns(patterns)
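One detail to watch with a callback like this: pattern dicts aren't hashable, so they can't be deduplicated with a set directly. If you want to avoid adding the same term over and over, here is a minimal plain-Python sketch that dedupes on a (text, label) tuple instead. The helper name `spans_to_patterns` is hypothetical, and skipping anything that wasn't accepted is my assumption about what you'd want:

```python
def spans_to_patterns(answers, seen=None):
    """Turn annotated spans into pattern dicts, skipping duplicates."""
    # `seen` holds (text, label) tuples; pass the same set in on every call
    # to dedupe across batches, or leave it out to dedupe within one batch
    seen = set() if seen is None else seen
    patterns = []
    for eg in answers:
        # Assumption: only accepted answers should produce new patterns
        if eg.get("answer", "accept") != "accept":
            continue
        for span in eg.get("spans", []):
            # Get the text of each annotated span given its offsets
            span_text = eg["text"][span["start"]:span["end"]]
            key = (span_text, span["label"])
            if key in seen:
                continue
            seen.add(key)
            patterns.append({"pattern": span_text, "label": span["label"]})
    return patterns


# Made-up example answers, just to show the shape of the data
answers = [
    {"text": "Acme Corp hired Dan.", "answer": "accept",
     "spans": [{"start": 0, "end": 9, "label": "ORG"}]},
    {"text": "Dan left Acme Corp.", "answer": "accept",
     "spans": [{"start": 9, "end": 18, "label": "ORG"}]},
    {"text": "Ignore Foo Inc here.", "answer": "reject",
     "spans": [{"start": 7, "end": 14, "label": "ORG"}]},
]
new_patterns = spans_to_patterns(answers)
print(new_patterns)
```

Inside the update callback you'd then pass the returned list to matcher.add_patterns.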
There are a few more recent posts that build off of this idea:
Hi,
I am doing NER annotation and the data I am annotating has a lot of recurring terms. I want to annotate these terms and then I want to update the entity ruler to include a pattern. I have two questions about this. Can I just call ruler.add_patterns(pattern) at the end of my custom recipe? Also, is this the best way to do this? Should I be doing a session with ner.manual and then another session with ner.correct or something like that?
Cheers,
Dan
Hi! There's no built-in workflow for this, but you should be able to implement something like it by adapting a recipe like ner.manual and adding an update callback that updates your matcher from spans annotated in the data.
See this thread for a pretty similar approach (and some considerations for how to handle the batching):
Instead of going via the PatternMatcher, you might want to use the Matcher or PhraseMatcher directly, which removes one layer of abstraction. When setting the "spans" o…
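To make the batching consideration a bit more concrete: the stream is a generator, so patterns added by an update callback can only affect examples that haven't been generated yet. Here is a rough plain-Python sketch of that behaviour. A simple string search stands in for the Matcher or PhraseMatcher, and `known_terms` and `add_spans` are hypothetical names, so treat it as an illustration rather than a recipe:

```python
known_terms = {}  # hypothetical shared store: term text -> label


def add_spans(stream):
    """Lazily pre-annotate each example with the terms known at that moment."""
    for eg in stream:
        spans = []
        for term, label in known_terms.items():
            # Plain substring search, standing in for a real matcher
            start = eg["text"].find(term)
            while start != -1:
                end = start + len(term)
                spans.append({"start": start, "end": end, "label": label})
                start = eg["text"].find(term, end)
        yield {**eg, "spans": sorted(spans, key=lambda s: s["start"])}


def update(answers):
    """Stand-in for the update callback: remember every annotated span."""
    for eg in answers:
        for span in eg.get("spans", []):
            term = eg["text"][span["start"]:span["end"]]
            known_terms[term] = span["label"]


stream = add_spans([{"text": "Acme Corp hired Dan."}, {"text": "Dan sued Acme Corp."}])
first = next(stream)   # generated before any terms are known
update([{**first, "spans": [{"start": 0, "end": 9, "label": "ORG"}]}])
second = next(stream)  # generated after the update, so it picks up the new term
print(first["spans"], second["spans"])
```

In a real session, examples are requested in batches, so anything already queued before the update won't reflect the newer patterns. A smaller batch size should reduce that lag.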