ValueError: A Token can only be part of one entity [...]

Hey everyone! I was following the "Training a new Entity Type" YouTube tutorial and suddenly got this error:

ValueError: [E103] Trying to set conflicting doc.ents: '(94, 98, 'CONDITION')' and '(94, 98, 'CONDITION')'. A token can only be part of one entity, so make sure the entities you're setting don't overlap.

I can't really tell what's the cause for this :frowning:

The sentence which gave the Error was:
"Resolvi usar novamente a carnitina, depois de ler que ele resolve mtos problemas."

Maybe I got the error because the text is in a different language?

Thank you for your help and Greetings from Berlin City! :smiley:

Hi! The language is definitely not the problem here. What the error message is trying to tell you is that you somehow ended up with two entity annotations that overlap or, in this case, are identical.
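
To make the "overlap" condition concrete, here is a minimal sketch in plain Python (not spaCy's or Prodigy's actual check). It assumes each entity is a `(start, end, label)` tuple with an exclusive `end`, like the tuples printed in the error message; `find_conflicts` is a hypothetical helper name.

```python
from itertools import combinations


def find_conflicts(spans):
    """Return every pair of (start, end, label) spans that share a position."""
    conflicts = []
    for a, b in combinations(spans, 2):
        # Two half-open ranges [start, end) overlap if each one
        # starts before the other one ends. Identical spans also match.
        if a[0] < b[1] and b[0] < a[1]:
            conflicts.append((a, b))
    return conflicts


# The first two spans are the identical pair from the error message;
# the third one is made up for illustration and does not conflict.
spans = [(94, 98, "CONDITION"), (94, 98, "CONDITION"), (10, 12, "CONDITION")]
print(find_conflicts(spans))
```

Adjacent spans such as `(0, 2)` and `(2, 4)` are not reported, because with an exclusive end they don't share a token.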

I'm a bit confused how this could have happened – normally, Prodigy should only ever show you the same text once, so there's not really a way to generate exact duplicates, because you should never be asked the same thing twice.

When did this error occur? During annotation with ner.teach, or during training with ner.batch-train? Are you using the latest version of Prodigy? And could you run the db-out command to export your dataset and try to find the sentence (e.g. in your editor)? Is it in there twice, or only once?
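
If searching by hand is tedious, a rough sketch like the following could count how often each text appears in the export. It assumes the export is JSONL (one JSON object per line) and that each record stores the sentence under a `"text"` key; `duplicate_texts` is a hypothetical helper and the sample records are made up for illustration.

```python
import json
from collections import Counter


def duplicate_texts(lines):
    """Map each text that appears more than once to its number of occurrences."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines in the export
        task = json.loads(line)
        # Assumption: the sentence is stored under the "text" key.
        counts[task.get("text", "")] += 1
    return {text: n for text, n in counts.items() if n > 1}


# Illustrative records only, not real exported data.
sample = [
    '{"text": "Resolvi usar novamente a carnitina.", "answer": "accept"}',
    '{"text": "Outro exemplo.", "answer": "reject"}',
    '{"text": "Resolvi usar novamente a carnitina.", "answer": "accept"}',
]
print(duplicate_texts(sample))
```

To run it on a real export, you could pass an open file handle instead of `sample`, for example one opened on the file that db-out wrote (the file name is up to you).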

Edit: Can you check if you're using spaCy v2.2? Prodigy isn't officially compatible with the latest version yet, which introduces backwards-incompatible stricter handling of overlapping entities. If you're installing from the Prodigy wheel, it should auto-install the compatible spaCy version. Also see here:

@ines thanks for the tip, I noticed this after upgrading to spaCy v2.2.
