Will NER improve Text Categorization?

valentijnnieman · April 28, 2022, 2:31pm

Hello!

I have a question that is maybe more about NLP in general than about Prodigy or spaCy. I thought I'd try here since you folks seem so responsive and helpful, but please let me know if this isn't the right place to ask.

I was wondering: if I'm doing text categorization with spaCy, using textcat-multi for example, will the results improve if an NER component comes before it in the pipeline? My thinking is this: suppose a sentence like "Senior Javascript Developer" is categorized as, say, "A" (or any other category), and Javascript is tagged as a "Programming Language" entity or similar. Would the textcat pick that up and use it to decide that a sentence like "Python Engineer" is similar because of that entity? This assumes Python is also tagged as a "Programming Language" entity, of course.

My understanding is that the textcat component takes the tok2vec vectors and looks for similarity there. But will the vectors be similar in one or more dimensions if NER finds the same type of entity in both texts? Am I thinking about this the right way? If it's at all possible, how would that work with spaCy and/or Prodigy?
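To make the idea concrete, here is a minimal sketch of one possible workaround, not a description of how spaCy's textcat works internally. It assumes entity spans are already available as character offsets (with spaCy these could be read from `doc.ents` via `ent.start_char`, `ent.end_char` and `ent.label_`) and replaces each entity with its label before the text reaches a classifier. The function name `mask_entities` and the label `PROGRAMMING_LANGUAGE` are hypothetical, chosen only for illustration.

```python
from typing import List, Tuple

# (start_char, end_char, label): hypothetical span format for this sketch
Span = Tuple[int, int, str]


def mask_entities(text: str, spans: List[Span]) -> str:
    """Replace each entity span with its label.

    Different surface forms of the same entity type ("Javascript",
    "Python") then look identical to a downstream text classifier.
    """
    pieces = []
    cursor = 0
    for start, end, label in sorted(spans):
        if start < cursor:
            # Skip spans that overlap one that was already replaced.
            continue
        pieces.append(text[cursor:start])  # text before the entity
        pieces.append(label)               # the entity, replaced by its label
        cursor = end
    pieces.append(text[cursor:])           # remaining text after the last entity
    return "".join(pieces)


masked_a = mask_entities(
    "Senior Javascript Developer", [(7, 17, "PROGRAMMING_LANGUAGE")]
)
masked_b = mask_entities("Python Engineer", [(0, 6, "PROGRAMMING_LANGUAGE")])
print(masked_a)  # Senior PROGRAMMING_LANGUAGE Developer
print(masked_b)  # PROGRAMMING_LANGUAGE Engineer
```

After masking, both job titles share the token `PROGRAMMING_LANGUAGE`, so a classifier trained on the masked text can treat them alike regardless of which language was named. Whether this actually improves accuracy depends on the data and on how reliable the NER predictions are, so it would need to be evaluated rather than assumed.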

Thanks a bunch in advance, and do let me know if this isn't the right space to ask these questions!

koaning · April 29, 2022, 8:12am

Hi Valentijn,

spaCy questions are better asked on our GitHub Discussions board. The spaCy contributors also keep an eye on that forum, which is why I recommend going there.

In fact, your question seems partially answered there.

pavelklymenko · July 18, 2022, 7:41pm

@valentijnnieman, just curious whether you were able to get answers to your questions?

I'll likely have the same or a similar question in the near future.

@koaning, thanks for pointing out the topic on GitHub. It’s a bit hard for me to completely understand the answer at this point. I need to do more homework, as I’m just starting to dig into the topic.

Thank you.
