Why does the spaCy NER model fail to predict amount or total amount?

Hi, sorry if this is a duplicate thread.
I trained a spaCy NER model on my custom data, which consists of plain text containing many numbers, such as tables of numbers and more.
The model predicts the other fields pretty well, but it sometimes fails to predict the amount or total amount.
The accuracy for total amount is also lower than for the other fields.
What is the problem with numbers for the spaCy NER model?
Thanks in advance.
Sagor

Hi! The problem likely isn't the numbers themselves but the data and surrounding context. If one entity type does especially badly, there's likely something about it that the model struggles to learn. The NER model implementation is generally optimized for regular text, because that's what named entity recognition is typically about: recognising names and concepts in text, based on the surrounding context. A model like this may not perform well if you're just feeding it tables of numbers with little or nothing useful in the surrounding context.

In general, useful questions to ask yourself are: Is the data annotated consistently, and is the data representative? Can the entity types you're annotating be predicted based on the surrounding context? Do you have a representative evaluation, and how does your model compare to other approaches (e.g. rule-based extraction using the table headers)?
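
To make the rule-based comparison concrete, here is a minimal sketch of such a baseline using only Python's standard `re` module. It assumes the total sits on a line that starts with a label such as "Total" or "Total Amount", which may not match your documents. The names `TOTAL_LINE` and `extract_total` and the list of labels are made up for illustration and are not part of spaCy.

```python
import re

# Hypothetical pattern: a line that starts with a "total"-like label,
# followed on the same line by a number such as 1,320.50.
# The label list is an assumption; adapt it to your own table headers.
TOTAL_LINE = re.compile(
    r"^\s*(?:grand\s+total|total\s+amount|amount\s+due|total)\b"
    r"[^\d\-\n]*"                  # skip separators like ": $" on the same line
    r"(-?\d[\d,]*(?:\.\d+)?)",     # the amount, with optional thousands commas
    re.IGNORECASE | re.MULTILINE,
)


def extract_total(text):
    """Return the last 'total'-labelled amount in text as a float, or None."""
    matches = TOTAL_LINE.findall(text)
    if not matches:
        return None
    # Take the last match, assuming the final total comes after any others.
    return float(matches[-1].replace(",", ""))


sample = (
    "Item     Qty   Price\n"
    "Widget   2     600.00\n"
    "Subtotal       1,200.00\n"
    "Tax            120.50\n"
    "Total Amount:  $1,320.50\n"
)
print(extract_total(sample))
```

Running something like this over the same evaluation set as the NER model gives you a baseline to compare against. If the rules already do better on total amount, that suggests the label or table position carries the signal, rather than the kind of surrounding context the model learns from.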

Hi! Thank you so much. Now it is clear to me that the context in my data (tables and the like) is not good enough.
Yes, it has tables of data, and I am trying to find the total amount in them.
I will follow your suggestion to try a rule-based approach on this data and compare it with other models.

Thanks again and Regards
Sagor