NER Tagging with patterns

gnileew88 · May 9, 2019, 2:51am

Hi there!

I am wondering if I can combine the patterns file with ner.manual, so that I can refine the tags that are suggested by the model?

I was previously trying to use ner.teach with a patterns file to tag a ‘duration’ entity. For instance:

-> has been experiencing for the past 2 days (I want my model to recognize ‘past 2 days’, but as my patterns file only contains ‘days’, the model was only able to pick up the word ‘days’, and it doesn't seem feasible to include every possible variation of the number of days in the patterns file).

Thanks, hopefully I've been clear in describing the issue.

ines · May 9, 2019, 8:47am

Hi! You can try using the other token attributes – that's what's so cool about the token-based patterns. For example, the like_num attribute will match all tokens whose value resembles a number. That could be "2", but also "two" or "2.5". You can find more details on this in the spaCy docs.

So one pattern could look like this:

{"label": "DURATION", "pattern": [{"like_num": true}, {"lower": "days"}]}

Instead of "lower": "days", you could also try "lemma": "day" – this would match all tokens whose base form is "day", so both "day" and "days".
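If you want to cover more units than just days, you don't have to write each pattern by hand. Here's a minimal sketch that generates the entries of a patterns file using only Python's json module. The UNITS list and the function names are made up for illustration, so adjust them to your data:

```python
import json

# Hypothetical list of time units; extend it to fit your data.
UNITS = ["day", "days", "week", "weeks", "month", "months"]


def build_duration_patterns(units, label="DURATION"):
    """Return one token pattern per unit: a number-like token, then the unit."""
    return [
        {"label": label, "pattern": [{"like_num": True}, {"lower": unit}]}
        for unit in units
    ]


def to_jsonl(patterns):
    """Serialize the patterns as JSONL, one JSON object per line."""
    return "\n".join(json.dumps(p) for p in patterns)


patterns = build_duration_patterns(UNITS)
jsonl = to_jsonl(patterns)
print(jsonl)
```

Each printed line has the same shape as the pattern above, so you could write the output to a .jsonl file and pass it in as your patterns file.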

If you're using ner.teach with an existing model, keep in mind that it can be difficult for the model to learn new definitions of entities that "clash" with existing labels. For instance, the pre-trained English model might already classify some of the durations as DATE or numbers as CARDINAL. Trying to teach it a completely new definition can be tricky and would require a lot more data. So it might make sense for you to start off with a blank model instead.

There's no out-of-the-box way to do this in ner.manual – but you could build your own custom recipe that does it. The only difficult part here is that you'll likely want all matches in the example, and you'll have to handle overlapping matches etc.
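To give you an idea of the overlap handling, here's a rough sketch in plain Python. It assumes your matches are (start, end, label) tuples of token offsets with an exclusive end, and it simply keeps the longest match wherever two matches overlap. The function name is hypothetical and this isn't part of Prodigy, just one possible strategy:

```python
def filter_overlapping(matches):
    """Keep the longest match wherever matches overlap.

    Each match is a (start, end, label) tuple of token offsets (end exclusive).
    """
    # Consider longer matches first; ties go to the earlier start.
    ordered = sorted(matches, key=lambda m: (-(m[1] - m[0]), m[0]))
    kept = []
    taken = set()  # token indices already covered by a kept match
    for start, end, label in ordered:
        if not any(i in taken for i in range(start, end)):
            kept.append((start, end, label))
            taken.update(range(start, end))
    # Return the surviving matches in document order.
    return sorted(kept)


# Tokens: has(0) been(1) experiencing(2) for(3) the(4) past(5) 2(6) days(7)
matches = [(7, 8, "DURATION"), (6, 8, "DURATION"), (5, 8, "DURATION")]
print(filter_overlapping(matches))
```

For the example sentence, only the longest candidate (tokens 5 to 8, "past 2 days") survives. Recent spaCy versions also include spacy.util.filter_spans, which applies a similar longest-first rule to Span objects.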
