How to use a spaCy pattern in Prodigy

ines · May 22, 2019, 11:01am

Hi! Sorry if this wasn't fully clear – I'll see if we can add the pattern file details more prominently in the docs.

The good news is, spaCy patterns are fully compatible with Prodigy. So in order to use your existing patterns, all you have to do is create a file like patterns.jsonl containing one JSON object per line, each with the keys "label" and "pattern". For example:

{"label": "YOUR_LABEL", "pattern": [{"IS_ASCII": true}, {"ORTH": "-"}, {"IS_ASCII": true}]}

This is also the same format used by spaCy's new EntityRuler btw – so if you've been working with that, you can reuse the exact same patterns files.
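If your patterns currently live in Python code, here is a minimal sketch of how you could write them out in that format. The helper name write_patterns and the key check are just illustrative assumptions, not something Prodigy or spaCy provides:

```python
import json


def write_patterns(patterns, path):
    """Write one JSON object per line (JSONL), each with "label" and "pattern"."""
    with open(path, "w", encoding="utf8") as f:
        for entry in patterns:
            # Hypothetical sanity check: every entry needs both keys.
            if "label" not in entry or "pattern" not in entry:
                raise ValueError("Missing 'label' or 'pattern' in: %r" % (entry,))
            f.write(json.dumps(entry) + "\n")


# The same token pattern as in the example above, as a Python dict.
patterns = [
    {
        "label": "YOUR_LABEL",
        "pattern": [{"IS_ASCII": True}, {"ORTH": "-"}, {"IS_ASCII": True}],
    },
]
```

Calling write_patterns(patterns, "patterns.jsonl") should then give you a file you can pass to Prodigy.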

To test your patterns, you can use the ner.match recipe, which will show you all matches in the data and ask you to accept / reject them. For example:

prodigy ner.match your_dataset en_core_web_sm /path/to/your_data.jsonl /path/to/patterns.jsonl --label YOUR_LABEL

The ner.make-gold workflow currently doesn't have a --patterns argument – it really only goes through the doc.ents set by a spaCy model, pre-highlights them in the texts and lets you correct those entities manually. However, thanks to spaCy v2.1 and the new EntityRuler, you can still make this work:

1. Create a new EntityRuler and add your patterns to it (see the spaCy EntityRuler docs for more info).
2. Load a pre-trained model and add the entity ruler to the pipeline.
3. Save the modified model with the entity ruler to disk using nlp.to_disk – the entity ruler and its patterns will be serialized automatically and loaded back in when you load the model. The doc.ents set by that model will now include the pattern matches.
4. Load the saved model into ner.make-gold and annotate entity predictions plus pattern matches.

prodigy ner.make-gold your_dataset /path/to/saved-model /path/to/your_data.jsonl --label YOUR_LABEL
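The model-building steps above could be sketched roughly like this. The function names are made up for illustration. The first branch uses the spaCy v2.1 API this post refers to, and the second branch is an assumption for anyone on spaCy v3 or later, where components are added by their registered name:

```python
import spacy


def add_entity_ruler(nlp, patterns):
    """Attach an entity ruler holding `patterns` to an existing pipeline."""
    if spacy.__version__.startswith("2."):
        # spaCy v2.1+: create the component and add the object itself.
        from spacy.pipeline import EntityRuler

        ruler = EntityRuler(nlp)
        ruler.add_patterns(patterns)
        nlp.add_pipe(ruler)
    else:
        # spaCy v3+: add the built-in component by its string name.
        ruler = nlp.add_pipe("entity_ruler")
        ruler.add_patterns(patterns)
    return nlp


def save_model_with_patterns(nlp, patterns, output_dir):
    """Add the patterns and save the pipeline so Prodigy can load it from disk."""
    add_entity_ruler(nlp, patterns)
    # The entity ruler and its patterns are serialized with the model.
    nlp.to_disk(output_dir)
```

You'd call this with something like spacy.load("en_core_web_sm") and the list of pattern dicts, then point ner.make-gold at the output directory. Note that this adds the entity ruler at the end of the pipeline, so by default it only adds matches that don't overlap with entities the model already predicted. If you want the patterns to take priority, you could add the ruler before the "ner" component instead.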
