Annotation of non-contiguous entities

ines · February 10, 2021, 11:56pm

Hi! Non-contiguous and potentially overlapping spans can definitely be a little tricky, especially if you want the annotation process to be clear and efficient.

One option could be to just model this as a relation annotation task instead of span annotation and only focus on connecting tokens with the given entity label. For example, in your case, you would annotate "Type" + "1" → "diabetes" and "Type" + "2" → "diabetes". This way, a token or entity fragment can be part of multiple spans of different types and it'll be clearly visualised in the UI with different colours.

Sorry if this sounds a bit abstract – but I made a quick example. Initially, your sentence would look like this:

To merge expressions like "Type 1" that are part of the non-contiguous span, you can use the span highlighting mode. I used the label X here because the label doesn't matter – we'll be annotating that at the relation level so that fragments can be part of multiple, potentially different entity types if needed:

You can now connect the fragments using the given entity label. In theory, a fragment can be part of multiple entity types.

The resulting JSON data (see here for an example) will include each annotated relation and the two fragments it connects, with their token indices and offsets into the text, as well as the label. This should make it pretty easy to export the information in the format you're looking for.
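To make the export step a bit more concrete, here's a minimal sketch of how the relations could be turned back into non-contiguous entities. It assumes each relation carries a "head_span" and a "child_span" with "start" and "end" character offsets, plus a "label". Those field names, the example sentence, the DISEASE label and the relations_to_entities helper are all assumptions for illustration, so check them against your own exported data. It also assumes every entity consists of exactly two fragments:

```python
def relations_to_entities(example):
    """Turn each annotated relation into one two-fragment entity.

    Assumes every relation has "head_span" and "child_span" dicts with
    "start"/"end" character offsets, and a "label" (assumed field names).
    """
    text = example["text"]
    entities = []
    for rel in example.get("relations", []):
        # Put the two fragments in the order they appear in the text.
        fragments = sorted(
            [rel["head_span"], rel["child_span"]], key=lambda span: span["start"]
        )
        entities.append(
            {
                "label": rel["label"],
                "offsets": [(span["start"], span["end"]) for span in fragments],
                "text": " ".join(text[span["start"]:span["end"]] for span in fragments),
            }
        )
    return entities


# Hypothetical annotated example: both "Type 1" and "Type 2" are connected
# to the shared fragment "diabetes" using the entity label DISEASE.
example = {
    "text": "Type 1 and Type 2 diabetes",
    "relations": [
        {
            "head_span": {"start": 0, "end": 6, "label": "X"},
            "child_span": {"start": 18, "end": 26, "label": "X"},
            "label": "DISEASE",
        },
        {
            "head_span": {"start": 11, "end": 17, "label": "X"},
            "child_span": {"start": 18, "end": 26, "label": "X"},
            "label": "DISEASE",
        },
    ],
}

entities = relations_to_entities(example)
for entity in entities:
    print(entity["label"], entity["offsets"], entity["text"])
```

If an entity can span more than two fragments, you'd need to group relations that share a fragment instead of treating each relation as its own entity.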

The only constructions that would be difficult to express with this approach are cases where you have nested expressions that you want to treat as separate entities (e.g. "Type 2 diabetes" and "Type 2 diabetes research") – but I'm not even sure that this actually makes sense conceptually.
