How to merge data from ner.correct and ner.teach?

Hi, I hope I understand your question correctly! When you run prodigy train, the examples in the dataset will be merged to reflect the unique examples, and all annotations that are available for a given example will be combined to create the final training example.

However, mixing annotations of different types (binary and manual) in the same dataset can lead to unexpected results, and it means you won't be able to update the model as effectively. To train from binary yes/no questions, you want to update the model differently: the rejected answers need to be taken into account, and all unannotated tokens are treated as unknown. This is what happens when you set --binary on prodigy train. If you train from complete gold-standard annotations created with ner.correct, you typically want to treat all unannotated tokens as non-entity tokens, which makes it easier for the model to learn. So we typically recommend keeping those two types of annotations separate.

So one option would be to just use the metadata of the exported annotations to separate them into two sets and then re-import the data. Also see this thread for more details:
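
To make the splitting step a bit more concrete, here is a minimal sketch of how the separation could look. It assumes the exported examples carry a "_view_id" key, and that its value is "ner_manual" for ner.correct annotations and "ner" for ner.teach annotations. Those values are assumptions, so check a few lines of your own export first. The function names and file names are just placeholders for illustration, not part of Prodigy.

```python
import json
from pathlib import Path


def read_jsonl(path):
    """Yield one annotation dict per non-empty line of a JSONL export."""
    with Path(path).open(encoding="utf8") as f:
        for line in f:
            line = line.strip()
            if line:
                yield json.loads(line)


def write_jsonl(path, examples):
    """Write annotation dicts to a JSONL file, one per line."""
    with Path(path).open("w", encoding="utf8") as f:
        for eg in examples:
            f.write(json.dumps(eg) + "\n")


def split_by_view_id(examples, manual_view_id="ner_manual", binary_view_id="ner"):
    """Group examples by the interface they were annotated in.

    ASSUMPTION: each example has a "_view_id" key, and the two default
    values above match your export. Anything else lands in "other" so it
    can be inspected by hand instead of being silently misfiled.
    """
    groups = {"manual": [], "binary": [], "other": []}
    for eg in examples:
        view_id = eg.get("_view_id")
        if view_id == manual_view_id:
            groups["manual"].append(eg)
        elif view_id == binary_view_id:
            groups["binary"].append(eg)
        else:
            groups["other"].append(eg)
    return groups


# Example usage (hypothetical file names):
# groups = split_by_view_id(read_jsonl("annotations.jsonl"))
# write_jsonl("manual.jsonl", groups["manual"])
# write_jsonl("binary.jsonl", groups["binary"])
```

You'd export the mixed dataset with prodigy db-out, run something like the above, and then load the two resulting files into two new datasets with prodigy db-in. After that, you can train from the binary dataset with --binary and from the manual dataset without it.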