Expanding NER to include neighbouring tokens

Hi @ines

Thanks a lot for the helpful answers you always seem to deliver!

Alright. Good to know. I also read this post where you suggest training a new NER model from scratch. I might take that approach as well, at least to test and compare. Is there a way to omit some pretrained labels but keep others?

That is exactly what I need to do and I think I will go for the attributes approach indeed.

Sure. Let me know if I can be of any help. I have lots of data. My data consists of HTML reports, each roughly the length of an A4 page. So far I have chopped them up by parsing the HTML and then adding the parsed content to spaCy. Is it possible to give the whole report to spaCy and then do some preprocessing to keep track of the origin of the tokens, or should I implement that as a combination of custom logic and spaCy?
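To make the question concrete, here is a rough sketch of the "custom logic" variant I have in mind. It uses only the Python standard library. The names `OriginTrackingParser`, `origin_at` and `parse_report` are made up for illustration, and the list of void tags is deliberately incomplete. The idea is to join all text chunks into one string while recording the character offset and the tag path of each chunk, so that any character offset can later be mapped back to where it came from.

```python
from bisect import bisect_right
from html.parser import HTMLParser

# Tags that never get a closing tag (illustrative, not exhaustive).
VOID_TAGS = {"br", "hr", "img", "input", "meta", "link"}


class OriginTrackingParser(HTMLParser):
    """Collects text chunks and remembers which tag path each came from."""

    def __init__(self):
        super().__init__()
        self.stack = []    # currently open tags
        self.parts = []    # text chunks in document order
        self.starts = []   # start offset of each chunk in the joined text
        self.origins = []  # tag path per chunk, e.g. "html/body/p"
        self.length = 0    # length of the joined text so far

    def handle_starttag(self, tag, attrs):
        if tag not in VOID_TAGS:
            self.stack.append(tag)

    def handle_endtag(self, tag):
        # Pop back to the matching open tag; ignore stray closing tags.
        if tag in self.stack:
            while self.stack.pop() != tag:
                pass

    def handle_data(self, data):
        chunk = data.strip()
        if not chunk:
            return
        self.starts.append(self.length)
        self.origins.append("/".join(self.stack))
        self.parts.append(chunk)
        self.length += len(chunk) + 1  # +1 for the joining newline

    @property
    def text(self):
        # One string for the whole report, chunks separated by newlines.
        return "\n".join(self.parts)

    def origin_at(self, char_offset):
        """Return the tag path of the chunk that contains char_offset."""
        i = bisect_right(self.starts, char_offset) - 1
        return self.origins[i]


def parse_report(html):
    """Parse one HTML report and return the populated parser."""
    parser = OriginTrackingParser()
    parser.feed(html)
    parser.close()
    return parser
```

With something like this I could build a single `Doc` from `parser.text` and look up `parser.origin_at(token.idx)` for each token, since `token.idx` is the token's character offset in the text. Whether that is better than something built into spaCy is exactly what I am unsure about.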

It doesn't get more relevant than that!
