I have a corpus of text in which the following phrases can occur:
inside forgery and theft
forgery or theft premise inside
theft given forgery and (premise inside)
premise inside forgery and theft
Is there a pattern I can create so that, if I see any permutation of the above built from
"inside", "forgery", "theft" and "premise" in a phrase, I can tag it as one entity?
There are a few ways you might do that, depending on the exact boundaries of the phrases you’re interested in. It’s hard to give a specific pattern without knowing what sort of phrases should be excluded.
Are there any instances of the words “forgery” or “theft” that you don’t want the pattern to match?
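In the meantime, here is a rough sketch of one possible approach, using only Python's standard `re` module rather than any particular NLP library's matcher. It rests on several assumptions that are mine, not yours, so adjust them to your data: the connector words ("and", "or", "given") are guessed from your examples, a phrase only counts if it contains both "forgery" and "theft", and words may be separated only by whitespace, commas or parentheses. The function name `find_entity_spans` is hypothetical.

```python
import re

# Assumed vocabulary, taken from the example phrases in the question.
KEYWORDS = {"inside", "forgery", "theft", "premise"}
CONNECTORS = {"and", "or", "given"}   # assumption: words allowed between keywords
REQUIRED = {"forgery", "theft"}       # assumption: both must appear in a match
ALLOWED = KEYWORDS | CONNECTORS
GAP = re.compile(r"[\s(),]+")         # assumption: separators allowed between words


def find_entity_spans(text):
    """Return (start, end, phrase) for each run of keywords and connectors
    that starts and ends on a keyword and contains every REQUIRED word."""
    tokens = [(m.group().lower(), m.start(), m.end())
              for m in re.finditer(r"[A-Za-z]+", text)]
    spans = []
    i = 0
    while i < len(tokens):
        if tokens[i][0] not in KEYWORDS:
            i += 1
            continue
        # Extend the run while the next word is allowed and the gap is simple.
        j = i + 1
        last_kw = i
        while (j < len(tokens)
               and tokens[j][0] in ALLOWED
               and GAP.fullmatch(text[tokens[j - 1][2]:tokens[j][1]])):
            if tokens[j][0] in KEYWORDS:
                last_kw = j
            j += 1
        # Trailing connectors are dropped by ending the span at last_kw.
        words = {tok[0] for tok in tokens[i:last_kw + 1]}
        if REQUIRED <= words:
            start, end = tokens[i][1], tokens[last_kw][2]
            spans.append((start, end, text[start:end]))
        i = j
    return spans


sample = "He was charged with forgery or theft premise inside yesterday."
print([phrase for _, _, phrase in find_entity_spans(sample)])
# → ['forgery or theft premise inside']
```

Because the check is on the set of words in a run rather than their order, any permutation of the four keywords is picked up. The character offsets it returns could then be handed to whatever tool you use to attach the entity label. Note that a span ends at its last keyword, so a closing parenthesis as in "(premise inside)" is left out of the match.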