Repetitive Text in Dataset


We have some repetitive text in our email chains, and when we used ner.teach, we received suggestions for the repeating text. For instance:

Regards, Rachel ABC Services Center 423-132-5423 On 11/8/2018 8: 13 AM, Jane Braxton wrote: Good morning, the exterminator will treat this unit today.

The ner.teach recipe gives us repeated suggestions for the phone entity in text like the example above. I was wondering whether these repeated annotations in our dataset will negatively affect the model.

I would say this won’t be a problem, no. The model should learn the correct label for that text fairly quickly. It’s just that initially, you’ll have to work through the examples that the model had already predicted before the update was made.
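
If you want a rough idea of how often the same entity text recurs before you start annotating, something like the sketch below might help. It is plain Python and not part of ner.teach or any recipe; the `PHONE_RE` pattern, the `count_repeated_phones` helper and the second and third sample texts are made up for illustration, so adjust them to your own data.

```python
import re
from collections import Counter

# Hypothetical pattern for phone numbers shaped like 423-132-5423.
PHONE_RE = re.compile(r"\b\d{3}-\d{3}-\d{4}\b")


def count_repeated_phones(texts):
    """Count in how many texts each phone number string appears."""
    counts = Counter()
    for text in texts:
        # A set ensures a number repeated within one text is counted once.
        counts.update(set(PHONE_RE.findall(text)))
    return counts


# The first text is the example from the question; the others are invented.
texts = [
    "Regards, Rachel ABC Services Center 423-132-5423 On 11/8/2018 8: 13 AM, "
    "Jane Braxton wrote: Good morning, the exterminator will treat this unit today.",
    "Thanks! Regards, Rachel ABC Services Center 423-132-5423",
    "Please call 555-010-2000 to reschedule.",
]

counts = count_repeated_phones(texts)
for phone, n in counts.most_common():
    print(phone, n)
```

Numbers that show up in many texts are the ones you can expect ner.teach to keep suggesting at the start, until the model has been updated with your answers.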