NER Model Features

mitch · May 31, 2018, 10:41am

Hi,

I’m training an NER model to recognize a custom entity that is specific to my domain. I have a lot of example documents where the first word in the document is an example of the entity I’m after. However, in real data the entity is likely to appear anywhere in the document.

What features does the NER model use to detect entities? Does it use word position, word context, or something else?

Basically, I want to avoid training a model that is biased towards selecting the first word(s) as being the entity that I am interested in.

Thanks.

mitch · May 31, 2018, 11:04am

I think I found my answer here. Basically, I want to find a better dataset that is representative of real-world data…
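
One rough way to check how skewed a dataset is would be to measure how often the entity starts at the very beginning of the text. The sketch below is only an illustration: it assumes annotations stored as `(text, {"entities": [(start, end, label)]})` tuples with character offsets, and the function name, the `PRODUCT` label, and the example texts are all made up.

```python
from typing import Dict, List, Tuple

# Assumed annotation format: (text, {"entities": [(start_char, end_char, label)]})
Example = Tuple[str, Dict[str, List[Tuple[int, int, str]]]]


def fraction_starting_at_zero(examples: List[Example], label: str) -> float:
    """Share of `label` entities whose span begins at character offset 0."""
    starts = [
        start
        for _text, annotations in examples
        for start, _end, ent_label in annotations["entities"]
        if ent_label == label
    ]
    if not starts:
        return 0.0
    return sum(1 for s in starts if s == 0) / len(starts)


# Hypothetical toy data: two of the three entities open the document.
examples = [
    ("Widget-X shipped on Monday.", {"entities": [(0, 8, "PRODUCT")]}),
    ("Gizmo-Y was recalled.", {"entities": [(0, 7, "PRODUCT")]}),
    ("We returned the Widget-X today.", {"entities": [(16, 24, "PRODUCT")]}),
]
share = fraction_starting_at_zero(examples, "PRODUCT")
print(f"{share:.0%} of PRODUCT entities start the document")
```

A share close to 100% would suggest the data does not reflect documents where the entity can appear anywhere.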

honnibal · June 1, 2018, 1:57pm

The model will definitely pay attention to whether the word is the first one in the document. The features include a window of up to four words on either side of the target word, subword features, the previously tagged entity, and the currently open entity.
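
To make the window idea concrete, here is a toy sketch in plain Python. It is not spaCy's actual feature extraction; the `context_window` helper and the `<pad>` placeholder are invented for illustration. It shows why a context window alone is enough to reveal that a word sits at the start of a document: the positions to its left have nothing in them.

```python
from typing import List

PAD = "<pad>"  # hypothetical placeholder for positions outside the document


def context_window(tokens: List[str], i: int, size: int = 4) -> List[str]:
    """Return the target token plus `size` tokens on either side, padded at the edges."""
    window = []
    for j in range(i - size, i + size + 1):
        window.append(tokens[j] if 0 <= j < len(tokens) else PAD)
    return window


tokens = "Acme announced a new product line in Berlin yesterday".split()

first = context_window(tokens, 0)   # target is the first word of the document
middle = context_window(tokens, 4)  # target is in the middle of the document

print(first)
print(middle)
```

The window for the first word has padding in all four left-hand slots, while the window for a mid-document word has none. If every training example has the entity in that first position, this pattern is an easy signal for a model to latch onto.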
