Unfortunately, I still can't resolve the issue. I ran data-to-spacy without an evaluation split and then did the splitting with my own script, but debug data
still reports that some training examples also appear in the evaluation data.
Is it possible that the problem is related to the issue described in the thread "Duplicate annotations in output"?
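
To make the question concrete, here is a minimal sketch of a split that cannot put the same text on both sides. It is not my actual script. It assumes each example is a dict with a "text" key and that the overlap is judged by the raw text, so duplicate annotations of one text would count as overlap if they were split apart. The names split_by_text and overlapping_texts are hypothetical helpers made up for this illustration.

```python
import random
from collections import defaultdict


def split_by_text(examples, eval_fraction=0.2, seed=0):
    """Split examples so that all examples sharing a text land on the same side.

    Assumption: each example is a dict with a "text" key.
    """
    # Group examples by text, so duplicates of a text stay together.
    groups = defaultdict(list)
    for eg in examples:
        groups[eg["text"]].append(eg)

    # Shuffle the unique texts (not the examples) reproducibly.
    texts = sorted(groups)
    random.Random(seed).shuffle(texts)

    # The first n_eval unique texts go to evaluation, the rest to training.
    n_eval = int(len(texts) * eval_fraction)
    dev = [eg for text in texts[:n_eval] for eg in groups[text]]
    train = [eg for text in texts[n_eval:] for eg in groups[text]]
    return train, dev


def overlapping_texts(train, dev):
    """Return the set of texts that occur in both splits."""
    return {eg["text"] for eg in train} & {eg["text"] for eg in dev}
```

If overlapping_texts returns an empty set for the two splits and the overlap is still reported, the duplicates would have to come from somewhere other than the split itself.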