How to automatically perform NER annotation based on patterns?

I would like to perform NER annotation based on patterns (and store it in a Prodigy dataset) without manual review.

Any how-to suggestions?

Hi! In that case, you can just go directly via spaCy, for example, using the entity ruler: https://spacy.io/usage/rule-based-matching#entityruler
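
For example, a minimal setup could look something like this (a sketch assuming spaCy v3's add_pipe API; the labels and patterns are just placeholders to replace with your own):

import spacy

# Start from a blank English pipeline (or load an existing one) and add an entity ruler
nlp = spacy.blank("en")
ruler = nlp.add_pipe("entity_ruler")

# Placeholder patterns: each entry maps a label to a phrase or token pattern
patterns = [
    {"label": "ORG", "pattern": "Prodigy"},
    {"label": "GPE", "pattern": [{"LOWER": "new"}, {"LOWER": "york"}]},
]
ruler.add_patterns(patterns)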

The ruler adds all matches to doc.ents, just like a trained entity recognizer. You can then use that nlp object to process your texts and extract the pattern-based NER annotations. In theory, you don't even have to go through Prodigy at all; you could just export the data and train with spaCy directly. But if you want to mix these annotations with annotations you've created manually, you can create data in Prodigy's format pretty easily from the processed doc:

doc = nlp("This is a text")
# ent.label_ is the string label (ent.label would give the integer hash)
spans = [{"start": ent.start_char, "end": ent.end_char, "label": ent.label_} for ent in doc.ents]
example = {"text": doc.text, "spans": spans}
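
If you have many texts, you can run them through nlp.pipe and collect the examples in a batch. A rough sketch, reusing the nlp object from above (the file name and texts are placeholders, and srsly ships with spaCy):

import srsly

texts = ["First example text", "Second example text"]  # your own raw texts
examples = []
for doc in nlp.pipe(texts):
    spans = [{"start": ent.start_char, "end": ent.end_char, "label": ent.label_} for ent in doc.ents]
    examples.append({"text": doc.text, "spans": spans})

# Write JSONL you can import later with db-in, or use directly for training
srsly.write_jsonl("pattern_annotations.jsonl", examples)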

You can then add the examples to a dataset using Prodigy's database API (https://prodi.gy/docs/api-database#database), or alternatively using the db-in command. I'd recommend setting up a separate dataset for your automatically generated annotations – if there's a bug in your code or a pattern you want to improve, you can just drop that dataset and re-add the data, which is much harder if you've mixed it into the same set as your manual annotations.
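
Via the Python API, that could look roughly like this (a sketch; the dataset name is just an example, and set_hashes adds the hashes Prodigy uses to identify examples):

from prodigy import set_hashes
from prodigy.components.db import connect

db = connect()  # connects to the database configured in your prodigy.json
db.add_dataset("ner_patterns_auto")  # separate dataset for the automatic annotations

# Add Prodigy's hashes, then store the examples in the dataset
examples = [set_hashes(eg) for eg in examples]
db.add_examples(examples, datasets=["ner_patterns_auto"])

Or, using the JSONL file from the sketch above: prodigy db-in ner_patterns_auto pattern_annotations.jsonl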