Relax matching criteria in NER scoring?


Are there any references for ways to relax the matching criteria when scoring NER models during training in Prodigy? E.g. one of our entity types is EQUIPMENT, one of the gold labels is "Truck Unit 1714", and if the model predicted "Truck" or "Unit 1714", that would count as an acceptable match rather than requiring the exact "Truck Unit 1714" span.



I'm not sure this is something you'd want to do in general. Typically, entities as extracted by an NER model do have strict boundaries. In your example, "Truck Unit 1714" seems like a clear Named Entity to me, but "Truck" by itself just doesn't carry the same kind of semantic information.

During annotation and prediction, ideally you'd apply a consistent set of guidelines to make sure the entities are always annotated the same way. This consistency is what helps the model make confident predictions. If you instead let the model off the hook for "fuzzy" boundaries (by not backpropagating those errors), it would have a harder time learning what is actually correct.
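To make the distinction concrete: strict NER scoring counts a prediction as correct only when the span boundaries and the label both match exactly, while a "relaxed" scorer might credit any overlap with the same label. A minimal sketch in plain Python (this is an illustration, not a Prodigy or spaCy API; spans are hypothetical `(start, end, label)` tuples with exclusive ends):

```python
def strict_match(pred, gold):
    # Exact match: boundaries and label must all agree.
    return pred == gold

def relaxed_match(pred, gold):
    # Relaxed match: same label and any character overlap counts.
    p_start, p_end, p_label = pred
    g_start, g_end, g_label = gold
    return p_label == g_label and p_start < g_end and g_start < p_end

gold = (0, 15, "EQUIPMENT")   # "Truck Unit 1714"
pred = (6, 15, "EQUIPMENT")   # "Unit 1714"

print(strict_match(pred, gold))   # False: boundaries differ
print(relaxed_match(pred, gold))  # True: spans overlap, labels agree
```

The point of the answer above is that if the relaxed predicate is what drives the training signal, both `(0, 15)` and `(6, 15)` look equally "correct" to the model, so it never learns which boundary you actually want.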

Think about this from a human perspective. Let's say I'm a human annotator, and one day you tell me that "Truck" is just fine as an entity, while the next day you're very happy with "Truck Unit 1714". Now when I see "Truck Unit 89" in a sentence, I'm left confused as to whether I should give you "Truck" or "Truck Unit 89" as the entity.

In short: I think doing this would make the NER task more difficult, not easier.