Hey, thanks a lot for the super detailed report!
Before looking into this in more detail, I can definitely confirm the intended behaviour: `ner.teach` excludes by task hash, so two examples with the same task hash are considered duplicates and you should never be asked about them twice. The task hash is based on the text and the span/label, so you may be asked about different suggestions on the same text, but never about the same text + span + label combination. If an incoming example has the same task hash as an example in one of the excluded sets (via `--exclude` or the current dataset) and it's still presented to you, that's a bug.
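To illustrate the idea, here's a toy sketch of task-hash exclusion. The `task_hash` helper and the md5-over-JSON scheme are just illustrative stand-ins, not Prodigy's actual hashing, but they show why a duplicate text + span + label gets filtered while a new suggestion on the same text still comes through:

```python
import hashlib
import json

def task_hash(example):
    """Illustrative stand-in for a task hash: text plus span/label."""
    key = json.dumps(
        {"text": example["text"], "spans": example.get("spans", [])},
        sort_keys=True,
    )
    return hashlib.md5(key.encode("utf8")).hexdigest()

def filter_seen(stream, excluded_examples):
    """Skip incoming examples whose task hash was already annotated."""
    seen = {task_hash(eg) for eg in excluded_examples}
    for eg in stream:
        h = task_hash(eg)
        if h not in seen:
            seen.add(h)
            yield eg

annotated = [
    {"text": "Apple is great", "spans": [{"start": 0, "end": 5, "label": "ORG"}]},
]
incoming = [
    # same text + same span/label -> duplicate, filtered out
    {"text": "Apple is great", "spans": [{"start": 0, "end": 5, "label": "ORG"}]},
    # same text, different label -> different task hash, still shown
    {"text": "Apple is great", "spans": [{"start": 0, "end": 5, "label": "FRUIT"}]},
]
remaining = list(filter_seen(incoming, annotated))
print(len(remaining))  # 1
```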
Other recipes, mostly the manual ones like `ner.manual`, exclude by input (via the `"exclude_by": "input"` config setting), because the assumption here is that you want to create one gold-standard annotation for each text and don't want to see the same text again, even if it comes in later with different pre-highlighted suggestions.
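For contrast, here's the same toy setup with input-style exclusion. Again, the md5 hashing is just an illustrative sketch, but it shows why, when only the raw text is hashed, a re-appearance of the same text with a different pre-highlighted suggestion is still filtered out:

```python
import hashlib

def input_hash(example):
    """Illustrative stand-in for an input hash: only the raw text matters."""
    return hashlib.md5(example["text"].encode("utf8")).hexdigest()

stream = [
    {"text": "Apple is great", "spans": [{"start": 0, "end": 5, "label": "ORG"}]},
    # same text with a different suggestion: same input hash, so excluded
    {"text": "Apple is great", "spans": [{"start": 0, "end": 5, "label": "FRUIT"}]},
]
seen = set()
unique = []
for eg in stream:
    h = input_hash(eg)
    if h not in seen:
        seen.add(h)
        unique.append(eg)
print(len(unique))  # 1
```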
This thread made me a little suspicious about the `--exclude` option with a separate dataset. Nothing has really changed around this, so I'm not entirely sure where the problem would be, but it's probably the first thing we should double-check.
`db-merge` currently only appends and doesn't do any hashing/filtering/combining (that's currently only done during training and when you run `data-to-spacy`). So if you have 4 examples + 4 duplicates, you'll end up with one set of 8 examples.
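If you do want a deduplicated set after merging, you could filter by task hash yourself. This sketch reuses an illustrative md5 hash (not Prodigy's real one) and hypothetical in-memory datasets to show the append-then-dedupe behaviour:

```python
import hashlib
import json

def task_hash(example):
    """Illustrative stand-in for a task hash (text + spans)."""
    key = json.dumps(
        {"text": example["text"], "spans": example.get("spans", [])},
        sort_keys=True,
    )
    return hashlib.md5(key.encode("utf8")).hexdigest()

# hypothetical datasets: the second is a copy of the first
dataset_a = [{"text": f"example {i}", "spans": []} for i in range(4)]
dataset_b = [{"text": f"example {i}", "spans": []} for i in range(4)]

merged = dataset_a + dataset_b  # what db-merge does: a plain append
print(len(merged))              # 8

# keep the first example per task hash
deduped = list({task_hash(eg): eg for eg in merged}.values())
print(len(deduped))             # 4
```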