Hey everyone! I was following the "Training a new Entity Type" YouTube tutorial and suddenly got this error:
ValueError: [E103] Trying to set conflicting doc.ents: '(94, 98, 'CONDITION')' and '(94, 98, 'CONDITION')'. A token can only be part of one entity, so make sure the entities you're setting don't overlap.
I can't really tell what's causing this.
The sentence that triggered the error was:
"Resolvi usar novamente a carnitina, depois de ler que ele resolve mtos problemas."
Maybe I got the error because the text is in a different language?
Thank you for your help and Greetings from Berlin City!
Hi! The language is definitely not the problem here. What the error message is trying to tell you is that you somehow ended up with two entity annotations that overlap - or, in this case, are identical.
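To make "overlapping or identical" concrete, here is a minimal sketch in plain Python. It assumes spans are (start, end, label) tuples with an exclusive end, like the ones printed in the error message; find_conflicts is a hypothetical helper for illustration, not part of spaCy or Prodigy.

```python
def find_conflicts(spans):
    """Return pairs of (start, end, label) spans that share at least one position.

    Assumes `end` is exclusive. Identical spans are reported as a conflict too.
    """
    ordered = sorted(spans)
    conflicts = []
    for i, first in enumerate(ordered):
        for second in ordered[i + 1:]:
            if second[0] >= first[1]:
                # Sorted by start, so no later span can overlap `first`.
                break
            conflicts.append((first, second))
    return conflicts


# The duplicate from the error message, plus one unrelated (made-up) span.
spans = [(94, 98, "CONDITION"), (94, 98, "CONDITION"), (10, 12, "CONDITION")]
print(find_conflicts(spans))
```

Two spans that merely touch, such as (0, 2) and (2, 4), are not reported, since they don't share a token.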
I'm a bit confused how this could have happened – normally, Prodigy should only ever show you the same text once, so there's not really a way to generate exact duplicates, because you should never be asked the same thing twice.
When did this error occur? During annotation with ner.teach, or during training with ner.batch-train? Are you using the latest version of Prodigy? And could you run the db-out command to export your dataset and try to find the sentence (e.g. in your editor)? Is it in there twice, or only once?
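If searching by hand is tedious, a rough sketch like the one below could scan the export for you. It assumes the db-out output is JSONL with one task per line, each carrying a "text" and an optional "spans" list whose entries have "start", "end" and "label" keys. check_export is a hypothetical helper, and the sample data is made up for illustration.

```python
import json
from collections import Counter


def check_export(lines):
    """Scan JSONL lines and report repeated texts and duplicated spans.

    Returns (repeated_texts, duplicate_spans), where duplicate_spans holds
    (text, (start, end, label)) for every span seen more than once in a task.
    """
    text_counts = Counter()
    duplicate_spans = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        task = json.loads(line)
        text = task.get("text", "")
        text_counts[text] += 1
        seen = set()
        for span in task.get("spans") or []:
            key = (span["start"], span["end"], span.get("label"))
            if key in seen:
                duplicate_spans.append((text, key))
            seen.add(key)
    repeated_texts = [t for t, n in text_counts.items() if n > 1]
    return repeated_texts, duplicate_spans


# Made-up sample: the second task contains the same span twice.
sample = [
    '{"text": "first example", "spans": [{"start": 0, "end": 5, "label": "CONDITION"}]}',
    '{"text": "second example", "spans": [{"start": 0, "end": 6, "label": "CONDITION"}, '
    '{"start": 0, "end": 6, "label": "CONDITION"}]}',
    '{"text": "first example", "spans": []}',
]
repeated, duplicates = check_export(sample)
print(repeated)
print(duplicates)
```

To run it on a real export, pass an open file instead of the sample list, e.g. `check_export(open("export.jsonl", encoding="utf8"))`, where export.jsonl is just a placeholder name for wherever you saved the db-out output.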
Edit: Can you check if you're using spaCy v2.2? Prodigy isn't officially compatible with the latest version yet, which introduces backwards-incompatible stricter handling of overlapping entities. If you're installing from the Prodigy wheel, it should auto-install the compatible spaCy version. Also see here: