There are some NER candidates Prodigy suggests that I know a priori are wrong. For example, ones that consist entirely of whitespace.
The ner.teach recipe should have an “antipatterns” option that lets you specify a set of patterns that will always be marked “reject” without being shown to the user.
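To make the idea concrete, here is a minimal sketch of such a filter in plain Python. It assumes each task in the stream is a dict with a "text" key and a "spans" list whose entries carry "start" and "end" character offsets. The names ANTIPATTERNS, is_antipattern and filter_antipatterns are hypothetical and not part of Prodigy. Note that this sketch simply drops matching suggestions instead of recording them as “reject” answers.

```python
import re

# Hypothetical antipatterns: regexes matched against the full text of a
# suggested span. This one matches empty or whitespace-only spans.
ANTIPATTERNS = [re.compile(r"\s*")]


def is_antipattern(span_text, patterns=ANTIPATTERNS):
    """Return True if the whole span text matches any antipattern."""
    return any(p.fullmatch(span_text) for p in patterns)


def filter_antipatterns(stream, patterns=ANTIPATTERNS):
    """Yield only the tasks whose suggested spans match no antipattern.

    Assumes each task is a dict with "text" and an optional "spans" list
    of dicts holding "start" and "end" character offsets.
    """
    for task in stream:
        text = task["text"]
        spans = task.get("spans", [])
        if any(is_antipattern(text[s["start"]:s["end"]], patterns) for s in spans):
            continue  # known-bad suggestion: never shown to the annotator
        yield task
```

Something like this could wrap the stream inside a custom copy of the recipe, before the suggestions reach the annotation interface.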
This happens on some structured texts with the default models shipped with spaCy, and therefore also in Prodigy, which uses spaCy. I wrote a filter component that filters out such patterns in spaCy. For Prodigy, however, I think the ner.teach recipe needs some specialization. Maybe @honnibal can clear this up much better.
@wpm: This thread discusses a similar problem, and has some code you might find useful: patterns using regex or shape
We’ll think about whether there should be explicit support for something like this, thanks.