I'm trying to relate some pre-annotated entities. However, when I feed a dataset with them to rel.manual:
prodigy rel.manual test_dataset_rel en_core_web_sm dataset:test_dataset -l SUBJECT
it prints a number of warnings like this:
⚠ Skipped 27 span(s) that were already present in the input data because the tokenization didn't match.
And indeed, it drops most of the labels, especially (but not only) the ones that are several tokens long, e.g.:
I wonder what might be the cause.
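To narrow it down, a check along these lines could list the spans whose character offsets do not fall on token boundaries, which is what the warning seems to be about. This is only a minimal sketch: it assumes each example carries "tokens" and "spans" as lists of dicts with "start" and "end" character offsets, and the helper name find_misaligned_spans and the sample text are made up for illustration. In practice the tokens would have to come from the same pipeline that is passed to rel.manual (spaCy's doc.char_span, which returns None for offsets that do not match token boundaries, should give an equivalent check).

```python
def find_misaligned_spans(tokens, spans):
    """Return the spans whose start or end offset is not a token boundary."""
    token_starts = {t["start"] for t in tokens}
    token_ends = {t["end"] for t in tokens}
    return [
        s for s in spans
        if s["start"] not in token_starts or s["end"] not in token_ends
    ]


# Hypothetical example: the tokenizer keeps "Dr.Smith" as a single token,
# but the pre-annotated span covers only "Smith".
text = "Dr.Smith visited New York"
tokens = [
    {"text": "Dr.Smith", "start": 0, "end": 8},
    {"text": "visited", "start": 9, "end": 16},
    {"text": "New", "start": 17, "end": 20},
    {"text": "York", "start": 21, "end": 25},
]
spans = [
    {"start": 3, "end": 8, "label": "SUBJECT"},    # "Smith": starts mid-token
    {"start": 17, "end": 25, "label": "SUBJECT"},  # "New York": aligned
]

misaligned = find_misaligned_spans(tokens, spans)
for s in misaligned:
    # Show the text of each span that would presumably be skipped
    print(repr(text[s["start"]:s["end"]]), s)
```

If the spans reported this way match the ones being dropped, the annotations were probably created with a different tokenizer than the one used in rel.manual.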