ValueError: A Token can only be part of one entity [...]

Hi! The language is definitely not the problem here. What the error message is trying to tell you is that you somehow ended up with two entity annotations that overlap - or, in this case, are identical.
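
To make "overlap" a bit more concrete, here's a rough sketch of the kind of check involved. It assumes spans are dicts with character offsets `start` and `end` plus a `label`, and it uses character offsets as a stand-in for tokens. `find_overlaps` is a hypothetical helper I made up for illustration, not something that ships with Prodigy or spaCy.

```python
def find_overlaps(spans):
    """Return pairs of spans that share at least one character.

    Assumes each span is a dict with integer "start" and "end" offsets
    (end exclusive). This is an illustrative helper, not a library API.
    """
    ordered = sorted(spans, key=lambda s: (s["start"], s["end"]))
    overlaps = []
    for i, first in enumerate(ordered):
        for second in ordered[i + 1:]:
            if second["start"] >= first["end"]:
                # Sorted by start, so no later span can overlap `first`.
                break
            overlaps.append((first, second))
    return overlaps


# Made-up example spans: the first two are exact duplicates.
spans = [
    {"start": 0, "end": 5, "label": "ORG"},
    {"start": 0, "end": 5, "label": "ORG"},
    {"start": 10, "end": 15, "label": "PERSON"},
]
print(find_overlaps(spans))
```

Two identical spans count as overlapping here, which matches the situation the error is complaining about. Spans that merely touch (one ends where the next starts) are not reported.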

I'm a bit confused how this could have happened – normally, Prodigy should only ever show you the same text once, so there's not really a way to generate exact duplicates, because you should never be asked the same thing twice.

When did this error occur? During annotation with ner.teach, or during training with ner.batch-train? Are you using the latest version of Prodigy? And could you run the db-out command to export your dataset and try to find the sentence (e.g. in your editor)? Is it in there twice, or only once?
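
If searching by hand gets tedious, something like the sketch below could do the counting for you. It assumes the export is JSONL with one task per line and a `"text"` key on each task. `find_duplicate_texts` is a hypothetical helper name, and the sample lines are made up so the snippet runs on its own.

```python
import json
from collections import Counter


def find_duplicate_texts(lines):
    """Return {text: count} for every "text" value seen more than once.

    `lines` is any iterable of JSONL strings, e.g. an open file handle.
    Assumes each record has a "text" key.
    """
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        counts[json.loads(line)["text"]] += 1
    return {text: n for text, n in counts.items() if n > 1}


# Made-up sample standing in for the exported file.
sample = [
    '{"text": "Apple is a company", "answer": "accept"}',
    '{"text": "Apple is a company", "answer": "accept"}',
    '{"text": "Berlin is a city", "answer": "reject"}',
]
print(find_duplicate_texts(sample))
```

With a real export you'd pass an open file handle instead of `sample`, for example `open("my_dataset.jsonl", encoding="utf8")`, where the filename is just a placeholder for whatever you exported to.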

Edit: Can you check if you're using spaCy v2.2? Prodigy isn't officially compatible with the latest version yet, which introduces backwards-incompatible stricter handling of overlapping entities. If you're installing from the Prodigy wheel, it should auto-install the compatible spaCy version. Also see here:
