I was using ner.teach to learn to recognize date entities in particular contexts, with date-matching patterns as my seed. What I’ve seen in the past is that Prodigy starts by proposing candidates that exactly match the date patterns, and only after you have worked through a lot of those does it start making suggestions from the model, which is what Prodigy is supposed to do.
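For reference, my setup is roughly this (the dataset name, model, and file names below are placeholders, not my exact ones):

```
prodigy ner.teach date_entities en_core_web_sm corpus.jsonl --label DATE --patterns date_patterns.jsonl
```

where date_patterns.jsonl contains spaCy Matcher patterns along the lines of:

```
{"label": "DATE", "pattern": [{"SHAPE": "dd/dd/dddd"}]}
{"label": "DATE", "pattern": [{"LOWER": "january"}, {"IS_DIGIT": true}]}
```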
I tried this on a new corpus, and the candidates Prodigy proposes come from the model right from the start. Since the model is untrained, they’re essentially random. It looks like the patterns are never used.
I don’t understand how this could be happening. I’m running everything exactly the same way as before; the only difference is the corpus. The one odd thing about it is that this new corpus is tiny, on the order of 100 candidate entities.
Does this sound like a bug, or is there some corner case for small corpora that would compel Prodigy to use the model instead of the patterns?