Training a model on both gold and binary data

The --binary flag uses a more complex update mechanism that lets you take advantage of both the accepted and rejected suggestions for single entity spans – essentially the default behaviour of ner.batch-train. I've written some more about it in this thread:
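To make the idea concrete, here's a minimal sketch of what binary annotations look like: each record carries a single suggested span plus an "answer" marking it accepted or rejected, so the model can learn from both signals. The exact record fields follow Prodigy's task format, but the `split_by_answer` helper is hypothetical, just to illustrate the two streams of feedback.

```python
# Binary NER annotations: one suggested span per example, with an "answer"
# recording whether the annotator accepted or rejected that span.
examples = [
    {"text": "Apple is based in Cupertino.",
     "spans": [{"start": 0, "end": 5, "label": "ORG"}],
     "answer": "accept"},
    {"text": "Apple is based in Cupertino.",
     "spans": [{"start": 18, "end": 27, "label": "ORG"}],
     "answer": "reject"},
]

def split_by_answer(examples):
    """Group binary annotations into accepted and rejected spans.
    Accepted spans are positive evidence; rejected spans tell the model
    which analyses to rule out (hypothetical helper, for illustration)."""
    accepted = [eg for eg in examples if eg["answer"] == "accept"]
    rejected = [eg for eg in examples if eg["answer"] == "reject"]
    return accepted, rejected

accepted, rejected = split_by_answer(examples)
```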

The --ner-missing flag lets you specify that unannotated tokens should be treated as missing values, not explicitly as "not an entity". This allows you to still train from incomplete annotations, e.g. if you only have annotations for one or two labels.
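A minimal sketch of that distinction at the token level, using BILUO-style tags where None stands for a missing value (spaCy uses "-" or None for this) and "O" explicitly asserts "outside any entity". The function and example labels are illustrative, not Prodigy internals:

```python
# Missing vs. "not an entity": None means "no information about this token",
# while "O" is an explicit assertion that the token is outside any entity.

def tags_from_partial_annotations(tokens, known_entities):
    """Tag only the tokens we have annotations for; everything else stays
    None (missing), so the model isn't penalised for predicting entities
    on tokens we simply never labelled."""
    tags = [None] * len(tokens)
    for i, label in known_entities.items():
        tags[i] = label
    return tags

tokens = ["Apple", "hired", "Tim", "Cook"]
# We only annotated PERSON – nothing is known about "Apple", so it stays
# missing rather than being treated as a confirmed non-entity.
tags = tags_from_partial_annotations(tokens, {2: "B-PERSON", 3: "L-PERSON"})
```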

The new train command was designed to harmonise training between Prodigy and spaCy, and use spaCy's regular update mechanism to train from gold-standard data (with and without missing values). This also makes it easier to ensure that results are consistent and reproducible.
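For reference, a typical invocation might look like the sketch below. The dataset and model names are placeholders, and flag details can vary between Prodigy versions, so check `prodigy train --help` on your install:

```shell
# Train an NER model from a binary dataset, treating unannotated
# tokens as missing values rather than "not an entity"
prodigy train ner my_binary_dataset en_core_web_sm \
    --output ./trained_model --binary --ner-missing
```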