NER Annotation Highlights nothing at beginning of sentence

manganc · October 21, 2018, 5:43pm

Hi guys,

I am training my NER model and, from time to time, I see a highlight at the beginning of the sentence that is not actually highlighting anything. For instance, below is text I copy-pasted from the web UI for the CURRENT_EMPLOYER_ORG label I am labelling.

Could it be that a ‘space’ (\s) is being considered as a match for my label? In this case, I should reject, right?

" CURRENT_EMPLOYER_ORGInterested in Nordstrom."

Second question:

The same thing is happening with the period at the end of the sentence being highlighted. I get about 10 in a row with the period highlighted (all of which I reject), then finally a few good examples (most of which I reject as well, but at least the highlights are on words or phrases), then another 10 or so highlighting the last character (usually a period). The scores are around .40 for these as well. Thoughts?

Last question: when Prodigy is serving up tasks, does it ALWAYS make a predicted NER tag? What if there is really no match, or the match has super low confidence (I assume that is a low SCORE in the UI)? Will it just pick the tagged tokens that have the greatest score, even if that score is intolerably low and below any threshold we would consider a confident prediction?

Thanks for your help

honnibal · October 22, 2018, 10:14pm

I’ll answer the last question first, since I think it sheds light on the other issue:

Yes, even if the predictions are very low, the model will eventually ask something. The reason is that we can’t be sure the model is well calibrated: it might be predicting scores without much relationship to the truth. In this situation, we don’t want it to get “stuck”. We want to always have a way to get out of the problem and into a better situation.
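To make the "never get stuck" idea concrete, here is a minimal sketch in plain Python. It is not Prodigy's actual sorting logic: the function names, the batch size, and the fixed threshold are all assumptions made for illustration. It shows a selector that prefers confident candidates but still asks about the best one in a batch when every score falls below the threshold.

```python
from typing import Iterable, Iterator, List, Tuple

# (score, highlighted text); a hypothetical shape chosen for this sketch
Scored = Tuple[float, str]


def pick_from_batch(batch: List[Scored], threshold: float) -> List[Scored]:
    """Return the confident candidates, or the single best one if none qualify."""
    confident = [item for item in batch if item[0] >= threshold]
    if confident:
        return confident
    # Nothing passed the threshold: still ask about the top-scoring candidate,
    # because the scores themselves may be poorly calibrated.
    return [max(batch, key=lambda item: item[0])]


def ask_something(
    scored: Iterable[Scored], batch_size: int = 4, threshold: float = 0.5
) -> Iterator[Scored]:
    """Yield at least one question per batch, however low the scores are."""
    batch: List[Scored] = []
    for item in scored:
        batch.append(item)
        if len(batch) == batch_size:
            yield from pick_from_batch(batch, threshold)
            batch = []
    if batch:
        yield from pick_from_batch(batch, threshold)


candidates = [(0.1, "a"), (0.4, "."), (0.2, "b"), (0.3, "c"), (0.9, "Nordstrom"), (0.1, "d")]
questions = list(ask_something(candidates))
```

With these made-up scores, the first batch has nothing above the threshold, so the period with a score of 0.4 is asked about anyway. That is the same pattern as a run of low-scoring punctuation suggestions.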

That’s why the predictions are sometimes strange, including predictions of punctuation, spaces etc. The model doesn’t start off knowing that these things are never entities, so you’ll sometimes get asked about them.
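If those suggestions become tedious, one possible workaround is to filter them out of the stream before they reach the annotator. The sketch below is plain Python and makes some assumptions: each task is a dict with a "text" key and an optional "spans" list whose entries carry "start" and "end" character offsets, and the helper names are hypothetical. Note the trade-off: a filtered suggestion never receives a reject, so the model gets no explicit feedback that spaces and periods are not entities.

```python
import string
from typing import Any, Dict, Iterable, Iterator

Task = Dict[str, Any]


def is_blank_or_punct(text: str) -> bool:
    """True if the text is empty, whitespace only, or ASCII punctuation only."""
    stripped = text.strip()
    return not stripped or all(ch in string.punctuation for ch in stripped)


def drop_trivial_suggestions(stream: Iterable[Task]) -> Iterator[Task]:
    """Skip tasks whose suggested spans cover only whitespace or punctuation."""
    for task in stream:
        text = task["text"]
        spans = task.get("spans", [])
        if spans and all(
            is_blank_or_punct(text[span["start"]:span["end"]]) for span in spans
        ):
            continue  # e.g. a highlighted leading space or trailing period
        yield task


text = " Interested in Nordstrom."
start = text.index("Nordstrom")
tasks = [
    {"text": text, "spans": [{"start": 0, "end": 1, "label": "CURRENT_EMPLOYER_ORG"}]},
    {"text": text, "spans": [{"start": start, "end": start + 9, "label": "CURRENT_EMPLOYER_ORG"}]},
]
kept = list(drop_trivial_suggestions(tasks))  # only the "Nordstrom" task remains
```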
