"Gold Standard" dataset as evaluation for ner.batch-train with binary annotation?

Hi there,

So I have a dataset of “gold standard” annotations (i.e. all relevant entities are annotated, nothing else is, and there are no ‘rejects’). I used it as the evaluation set when training my model on another gold standard dataset (so, using the --no-missing flag).

After my model got good enough to make decent suggestions for binary annotation, I made a bunch of those as well (in a new dataset) and then tried to further train the model on that dataset (without the --no-missing flag).

My question is, can I use the same evaluation set that I used when training with the first batch of “gold standard” annotations? That evaluation set only contains accepts, and without the --no-missing flag, there is no way for the model to know that whatever is not marked as an entity is not an entity.

The recipe arguments are a bit awkward around this at the moment, sorry. They don’t support the full granularity of options, and you’ve hit one of the reasonable cases that aren’t supported directly.

The underlying routines do support this though — it’s just not surfaced well by the arguments to the built-in recipes. What you need to do is make sure the task objects have an entry "no_missing": True. When the evaluation is run, this tells the model to treat that example as having complete annotations. This should let you train with the binary annotations, while evaluating with the gold-standard annotations.
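As a rough sketch of what that could look like: assuming the gold-standard evaluation set has been exported to a JSONL file with one task per line (for example with something like the db-out command), the snippet below adds the "no_missing" entry to every task and writes the result to a new file. The helper names and file paths are made up for illustration and are not part of Prodigy; it only uses the standard library.

```python
import json


def mark_complete(task):
    """Return a copy of a task dict flagged as having complete annotations."""
    flagged = dict(task)          # shallow copy, so the input is left untouched
    flagged["no_missing"] = True  # the entry described above
    return flagged


def mark_jsonl_complete(in_path, out_path):
    """Read tasks from a JSONL file, flag each one, and write them back out.

    Both paths are hypothetical; returns the number of tasks written.
    """
    count = 0
    with open(in_path, encoding="utf8") as src, \
            open(out_path, "w", encoding="utf8") as dst:
        for line in src:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            task = json.loads(line)
            dst.write(json.dumps(mark_complete(task)) + "\n")
            count += 1
    return count


if __name__ == "__main__":
    # Hypothetical file names: replace with your own export.
    n = mark_jsonl_complete("eval_gold.jsonl", "eval_gold_no_missing.jsonl")
    print(f"Flagged {n} tasks")
```

The flagged file can then be loaded into a fresh dataset and used as the evaluation set, while the binary dataset is used for training without --no-missing. Note that Python's True is written as true in the JSON output.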

Thanks, it works. I also just saw your reply to the similar question in “Training a model on both gold and binary data”. It might make sense to add this as a note in the documentation, as it’s not very clear at the moment :)