Wildcard like Patterns in Custom Named Entities

rkeyvani · July 22, 2018, 1:01pm

I have a corpus of text in which phrases like the following can occur:

inside forgery and theft
forgery or theft premise inside
theft given forgery and (premise inside)
premise inside forgery and theft

Is there some pattern I can create so that, if I see any permutation of the above based on
"inside", "forgery", "theft" and "premise" in a phrase, I can tag it as one entity?

honnibal · July 23, 2018, 11:48am

There are a few ways you might do that, depending on the exact boundaries of the phrases you’re interested in. It’s hard to give a specific pattern without knowing what sort of phrases should be excluded.

Are there any instances of the words “forgery” or “theft” that you don’t want the pattern to match?

One option to consider is to use the dependency parse, after you’ve identified the key term. This can help you move from a single word to a longer phrase you’re interested in. You can plug different sentences or fragments into the parser here: https://explosion.ai/demos/displacy?text=suspicion%20of%20forgery%20or%20theft%20inside%20the%20premise&model=en_core_web_sm&cpu=0&cph=0 . You can read more about how to use the parser in the spaCy docs: https://spacy.io/usage/linguistic-features#section-dependency-parse
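
To make the boundary question concrete, here is a minimal sketch in plain Python of one possible order-independent matcher. It is not spaCy's Matcher or parser API. The function name `find_keyword_spans`, the connector list, and the `min_terms` threshold are all assumptions made up for illustration. It treats any run of key terms and connector words as a candidate, and keeps the run only if it contains enough distinct key terms.

```python
import re

# Key terms taken from the question; the connector words are an assumption.
KEY_TERMS = {"inside", "forgery", "theft", "premise"}
CONNECTORS = {"and", "or", "given", "of", "the"}


def find_keyword_spans(text, key_terms=KEY_TERMS, connectors=CONNECTORS, min_terms=3):
    """Return (start_char, end_char, phrase) for each run of key terms and
    connectors that contains at least `min_terms` distinct key terms."""
    # Simple word tokenization; punctuation such as parentheses is skipped.
    tokens = [(m.group().lower(), m.start(), m.end())
              for m in re.finditer(r"\w+", text)]
    spans = []
    i = 0
    while i < len(tokens):
        if tokens[i][0] not in key_terms:
            i += 1
            continue
        # A run starts at a key term and extends over key terms and connectors.
        j = i
        last_key = i
        seen = set()
        while j < len(tokens) and (tokens[j][0] in key_terms
                                   or tokens[j][0] in connectors):
            if tokens[j][0] in key_terms:
                seen.add(tokens[j][0])
                last_key = j
            j += 1
        if len(seen) >= min_terms:
            # Trim trailing connectors so the span ends on a key term.
            start, end = tokens[i][1], tokens[last_key][2]
            spans.append((start, end, text[start:end]))
        i = j  # continue scanning after the run
    return spans


if __name__ == "__main__":
    sentence = "He was held on suspicion of forgery or theft inside the premise yesterday."
    for start, end, phrase in find_keyword_spans(sentence):
        print(start, end, phrase)
```

The character offsets it returns could then be turned into entity spans. Whether a lone "theft" or "forgery" should also match depends on the answer to the exclusion question above, which is why `min_terms` is left as a parameter.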
