Mapping relationships between named entities and unlabeled spans

SofieVL · November 12, 2021, 10:11am

Hi!

If you're dealing with overlapping spans, spancat is definitely the way forward. How you want to predict the entities here depends a bit on the data, though. While "knee injection" can be seen as two entities, "dental treatment" would be more awkward to split up. Then again, for sentences like the one around "buttocks", where the treatment and body part are not mentioned in one contiguous span, you probably have to split them up into two entities anyway. There's no good solution otherwise.
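To make the difference between the schemes a bit more concrete, here is a minimal plain-Python sketch (no spaCy required). The example sentence, the token offsets and the labels `TREATMENT` and `BODY_PART` are all made up for illustration and not taken from your data. It shows "knee injection" annotated as one entity, as two separate entities, and as overlapping spans, plus a small check for whether a scheme contains overlaps. That check matters because `doc.ents` in spaCy cannot hold overlapping spans, whereas the spancat stores its predictions in `doc.spans`, which can.

```python
# Hypothetical example sentence, already split into tokens.
tokens = ["She", "had", "a", "knee", "injection", "last", "week"]

# A span is (start_token, end_token, label), with end_token exclusive.
# Scheme A: the whole phrase is one entity.
scheme_a = [(3, 5, "TREATMENT")]
# Scheme B: body part and treatment are two separate, adjacent entities.
scheme_b = [(3, 4, "BODY_PART"), (4, 5, "TREATMENT")]
# Scheme C: the whole phrase plus the body part nested inside it.
scheme_c = [(3, 5, "TREATMENT"), (3, 4, "BODY_PART")]


def has_overlap(spans):
    """Return True if any two spans share at least one token."""
    ordered = sorted(spans, key=lambda s: (s[0], s[1]))
    # After sorting by start, checking neighbours is enough to find overlaps.
    for (_, prev_end, _), (next_start, _, _) in zip(ordered, ordered[1:]):
        if next_start < prev_end:
            return True
    return False


def span_text(tokens, span):
    """Return the surface text covered by a span."""
    start, end, _ = span
    return " ".join(tokens[start:end])


for name, scheme in [("A", scheme_a), ("B", scheme_b), ("C", scheme_c)]:
    covered = [(span_text(tokens, s), s[2]) for s in scheme]
    print(name, covered, "overlapping:", has_overlap(scheme))
```

Schemes A and B have no overlaps, so either could in principle be handled by a regular NER component. Scheme C only works with a component that allows overlapping spans, such as the spancat.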

Ultimately, what it always boils down to is: what is the "easiest" way for a model to learn the information? If the entities are typically mentioned together and used as one phrase within the sentence, the model might find it easier to recognize them as one. A good proxy for what is "easiest" is to do some of the annotation yourself and try out both schemes. Which of the two feels more natural and is easier to do? And which feels more intuitive to you as a human, interpreting language as we do? Chances are high that this will correlate with what is easier for a model to learn, too (and thus, eventually, with higher accuracy).

I might have strayed a bit from the original question - let me know if this helps or not!
