BILUO/IOB tags - yes / no?

Hello guys,

First of all, thanks for the great tool! Let me know if there is already a similar topic, and whether this forum is the right place to discuss this.

Inspired by this post: Proposal: Non-NER span categorizer · Discussion #3961 · explosion/spaCy · GitHub, I hoped I could use the forum to share some thoughts on the BILUO / BIO tags and a possible reason not to use them.

From my understanding, tagging schemes such as BILUO and BIO give the classifier a richer set of labels than plain IO, while adding only a few more parameters. The IOB scheme was introduced in 1995 by Ramshaw and Marcus in the context of noun phrase chunking (NP chunking). For example:

“The little yellow dog barked at the cat.” NP: the little yellow dog; the cat

This representation is helpful if you work with rules and also for feature-based machine learning, because there is some logic you can use. (If the previous token was classified as B-X, the next token should be classified as I-X.)
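To make the tagging concrete, here is a minimal sketch in plain Python that turns the two noun-phrase chunks from the example into BIO tags. This is not how spaCy does it internally; the helper name `spans_to_bio` and the (start, end, label) token-index format are just assumptions I made for illustration.

```python
def spans_to_bio(tokens, spans):
    """Convert (start, end, label) token spans (end exclusive) to BIO tags."""
    tags = ["O"] * len(tokens)
    for start, end, label in spans:
        tags[start] = f"B-{label}"          # first token of the chunk
        for i in range(start + 1, end):
            tags[i] = f"I-{label}"          # remaining tokens of the chunk
    return tags


tokens = ["The", "little", "yellow", "dog", "barked", "at", "the", "cat", "."]
spans = [(0, 4, "NP"), (6, 8, "NP")]        # "The little yellow dog", "the cat"

bio_tags = spans_to_bio(tokens, spans)
for token, tag in zip(tokens, bio_tags):
    print(f"{token}\t{tag}")
```

The "B-X must be followed by I-X or a new chunk" logic mentioned above falls out of this encoding: an I-NP tag can only appear directly after a B-NP or another I-NP.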

Furthermore, in NER the BIO / BILUO tagging is helpful because it puts special emphasis on the boundaries of the chunk / entity. Let’s consider "Ford Motor Company" (B-Org, I-Org, I-Org in BIO notation). Detecting only "Motor Company" (O, B-Org, I-Org) is not helpful, and the scheme punishes the model for two misclassifications (predicted --> gold: O --> B-Org, B-Org --> I-Org). For other sequence / token classification problems, on the other hand, this could be counterproductive. Let’s stay in the domain of cars and consider this example:

“The automotive industry is involved in the design, development, manufacturing, marketing, and selling of motor vehicles.”

In this example, we might want to classify spans related to cars more generally (label: Car). The spans you would like to detect are probably “automotive industry” and “motor vehicles”.

With BILUO tagging, “motor vehicles” would become [B-Car, L-Car]. Let’s assume the model predicts [O, U-Car]. This would result in a misclassification for both tokens, punishing the model even though it correctly predicted “vehicles” as a term in the generic field of “cars” in the context of the sentence. So in this example, the BILUO tagging actually hurts more than it helps, and a simple IO tagging would be more useful, because you would rather detect some part of the span than miss it completely.
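Here is a rough sketch of that comparison in plain Python. Under a strict BILUO scheme a two-token span is tagged B-Car, L-Car, and a single-token span U-Car. The helper names are hypothetical, and counting mismatched tokens is only my simplified stand-in for "punishing the model"; it is not how any particular library computes its training loss.

```python
def spans_to_biluo(n_tokens, spans):
    """Encode (start, end, label) token spans (end exclusive) as BILUO tags."""
    tags = ["O"] * n_tokens
    for start, end, label in spans:
        if end - start == 1:
            tags[start] = f"U-{label}"      # single-token span
        else:
            tags[start] = f"B-{label}"
            for i in range(start + 1, end - 1):
                tags[i] = f"I-{label}"
            tags[end - 1] = f"L-{label}"
    return tags


def spans_to_io(n_tokens, spans):
    """Encode the same spans as IO tags (no boundary information)."""
    tags = ["O"] * n_tokens
    for start, end, label in spans:
        for i in range(start, end):
            tags[i] = f"I-{label}"
    return tags


def count_token_errors(gold, pred):
    """Number of positions where the predicted tag differs from the gold tag."""
    return sum(g != p for g, p in zip(gold, pred))


tokens = ["motor", "vehicles"]
gold_spans = [(0, 2, "Car")]   # gold: "motor vehicles"
pred_spans = [(1, 2, "Car")]   # model only found "vehicles"

gold_biluo = spans_to_biluo(len(tokens), gold_spans)
pred_biluo = spans_to_biluo(len(tokens), pred_spans)
gold_io = spans_to_io(len(tokens), gold_spans)
pred_io = spans_to_io(len(tokens), pred_spans)

biluo_errors = count_token_errors(gold_biluo, pred_biluo)
io_errors = count_token_errors(gold_io, pred_io)
print("BILUO:", gold_biluo, pred_biluo, "errors:", biluo_errors)
print("IO:   ", gold_io, pred_io, "errors:", io_errors)
```

With BILUO both tokens count as wrong, while with IO only "motor" does, so the partial match on "vehicles" is still credited.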

As I mentioned before, let me know if this forum is the right place for discussions like this. I would love to hear some input / experiences / thoughts on this. I would consider myself still fairly new to NLP and wanted to check that I am not getting this completely wrong.