Combining NER with text classification


I’m quite new to NLP and I’m trying to do chatbot intent detection. So I’m able to do text classification and NER using Prodigy, but can I combine the two into a single model which I can load into spacy? Does that make sense?

This is what I’ve done:

  • Created a dataset
  • Used ner.manual to identify entities in a set of input data (which is basically a bunch of email transcripts)
  • Used textcat.teach to associate labels to text (such as SUPPORT_REQUEST, ACCESS_REQUEST etc)

I then tried to export a model for use in spacy, and I get the error below.

$ prodigy textcat.batch-train mytest_systems_2 --output mytest_model --eval-split 0.2

Loaded blank model
Traceback (most recent call last):
  File "/usr/local/Cellar/python3/3.6.4_2/Frameworks/Python.framework/Versions/3.6/lib/python3.6/", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/usr/local/Cellar/python3/3.6.4_2/Frameworks/Python.framework/Versions/3.6/lib/python3.6/", line 85, in _run_code
    exec(code, run_globals)
  File "/Users/dev/prodigy_test/lib/python3.6/site-packages/prodigy/", line 253, in <module>
    controller = recipe(*args, use_plac=True)
  File "cython_src/prodigy/core.pyx", line 150, in prodigy.core.recipe.recipe_decorator.recipe_proxy
  File "/Users/dev/prodigy_test/lib/python3.6/site-packages/", line 328, in call
    cmd, result = parser.consume(arglist)
  File "/Users/dev/prodigy_test/lib/python3.6/site-packages/", line 207, in consume
    return cmd, self.func(*(args + varargs + extraopts), **kwargs)
  File "/Users/dev/prodigy_test/lib/python3.6/site-packages/prodigy/recipes/", line 110, in batch_train
    labels = {eg['label'] for eg in examples}
  File "/Users/dev/prodigy_test/lib/python3.6/site-packages/prodigy/recipes/", line 110, in <setcomp>
    labels = {eg['label'] for eg in examples}
KeyError: 'label'

Yes, absolutely! Both ner.batch-train and textcat.batch-train export loadable spaCy models, so you could start off with a blank or default spaCy model, train a model on your NER annotations and then use it as the input model for textcat.teach. For example:

prodigy textcat.batch-train my_textcat_dataset /path/to/ner-model ...

Ideally, you would create two separate datasets – one for your NER annotations and one for your text classifier annotations. You can use the same input data for both sets.

To achieve better NER accuracy, you might also want to try training your model with ner.teach – especially if you're training new entity types from scratch. ner.manual is great to create gold-standard data and evaluation sets, but in order to properly train a new type, you need a lot of manual annotations – ideally thousands or more. Using ner.teach and a patterns file with examples of the entities you're looking for can speed up the process, because the model in the loop can help you collect more relevant annotations.

In case you haven't seen it yet, here's our video tutorial on training a new entity type. I also wrote more detailed comments about training NER from scratch here and here.

textcat.batch-train expects all annotations in the dataset to have a "label" field containing the category label. Maybe your set contains examples without a label set? As I mentioned above, annotations you collect for different tasks (NER, textcat) should ideally have their own datasets. So a possible explanation for the error could be that your set contains both text classification and NER annotations (which don't have a label set).

You can use the db-out command to preview or export your dataset and check:

prodigy db-out mytest_systems_2 | less  # preview dataset
prodigy db-out mytest_systems_2 /tmp    # export dataset to a file

If it turns out that your set contains examples you want to exclude, you can edit the JSONL file manually and use db-in to import it to a new dataset. Each annotation session is also available in the database as a session dataset (named after the timestamp) – so you can also view and export individual sessions. To see a list of all datasets and session sets, you can use the prodigy stats -ls command.
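For example, here's a minimal sketch of that check. The example lines below are made up, but they show the difference: textcat annotations carry a top-level "label", while NER annotations store their labels on the "spans" instead, which is exactly what would trip up textcat.batch-train.

```python
import json

# Hypothetical lines as exported by `prodigy db-out mytest_systems_2`:
# two textcat examples (top-level "label") and one NER example ("spans").
lines = [
    '{"text": "Please reset my password", "label": "SUPPORT_REQUEST", "answer": "accept"}',
    '{"text": "Grant me access to the wiki", "label": "ACCESS_REQUEST", "answer": "accept"}',
    '{"text": "Email from John Smith", "spans": [{"start": 11, "end": 21, "label": "PERSON"}], "answer": "accept"}',
]

with_label, without_label = [], []
for line in lines:
    eg = json.loads(line)
    # Examples without a top-level "label" are the ones that would
    # raise KeyError: 'label' in textcat.batch-train.
    (with_label if "label" in eg else without_label).append(eg)

print(len(with_label), "examples with a top-level label")
print(len(without_label), "examples without one")
```

If `without_label` is non-empty, those are the examples to move out into a separate NER dataset.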

@ines thank you for the detailed reply.

This is the bit I don't get. Why should I create two separate datasets? And if I do create them as separate datasets, how do I combine them into a single model? Or do I generate a model from each dataset and load both into spaCy? It's quite possible that I'm speaking complete nonsense.

No worries :slightly_smiling_face: We're definitely introducing a lot of new concepts in Prodigy, so it's totally fine if you have questions. Answers below!

The main reason is that the annotation data produced by the ner and textcat recipes is specific to the training task. Prodigy comes with separate training recipes: ner.batch-train updates or adds the 'ner' component to the model, and textcat.batch-train updates or adds the 'textcat' component. There are various subtle differences in how we've optimised the updating of the different components, and the recipes also output slightly different statistics.

So if you want to train both components, you need to update the same model twice with the respective annotations. For example:

prodigy ner.batch-train your_ner_set --output /path/to/ner-model
prodigy textcat.batch-train your_textcat_set /path/to/ner-model --output /path/to/ner-textcat-model

The model exported to /path/to/ner-textcat-model should then include weights for both the entity recognizer and text classifier. You can also verify this by looking at the directories within the model.
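For example, you could load the exported model and inspect its pipeline. This sketch uses the spaCy v3 `add_pipe` API, with a blank pipeline standing in for the exported model, since the path above is only a placeholder:

```python
import spacy

# Stand-in for nlp = spacy.load("/path/to/ner-textcat-model"):
# a blank English pipeline with both components added, to show
# what you'd expect to see after both training steps.
nlp = spacy.blank("en")
nlp.add_pipe("ner")
nlp.add_pipe("textcat")

# A model trained on both datasets should list both components.
print(nlp.pipe_names)  # ['ner', 'textcat']
```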

In theory, you could write a custom recipe that trains both at the same time. As far as spaCy is concerned, this is definitely possible. But this also means that if the results are not satisfying, or there is some problem with your data, it'll make it a lot harder to debug and find out what's wrong.

Another reason is that you'll likely end up wanting slightly different annotations for training the two components. A big advantage of Prodigy is that it lets you iterate quickly and try out new ideas to see if they can improve your model. So as you keep experimenting, you might also want to experiment with different datasets.

If you're using the active learning-powered recipes like ner.teach or textcat.teach, Prodigy will use the model's predictions to suggest what to annotate next. This means that the selection of examples will always be biased (in a good way, though!). But it also means that the examples the NER model selects to improve its predictions aren't necessarily the best examples to annotate for text classification, and vice versa.

I’m also interested in training NER and text classification on the same corpus so this thread is relevant to me. However, I may have a different use case than @cmtru because I want to do joint learning of these tasks.

As I’ve already described in other posts, I’m trying to do NER on documents, but the documents are tens of pages long. The context is likely too long for CNN- or LSTM-type methods to be effective, so I need to find a way to segment the documents into smaller pieces.

Luckily the entities I’m trying to extract appear in contexts about a paragraph in length. So if I can find the right “paragraph of interest” I can do a good job of extracting entities from it. These paragraphs of interest are themselves variable in form, so it’s a machine learning task to distinguish them from the other paragraphs in the document.

I’ve been framing this as a two stage process. First a binary text categorization model identifies the likely paragraphs of interest, then an NER model extracts the entities from those paragraphs. Both the text categorization and NER models are trained using Prodigy’s standard active learning techniques. (You and @honnibal have been helping me find a way to seed this process with phrases instead of just words.) And if I want two separate models, the reasons you give earlier in this post for training them separately make sense.

However, it seems like joint learning might be more effective. Instead of using the text classifier to make a hard decision about whether to examine a particular paragraph, it should merely contribute a probability. Likewise, the presence of the named entities I’m looking for can be a clue that the paragraph that contains them is one I care about. Basically I have two separate but related kinds of signal, and I want to combine them, both at runtime and during Prodigy’s active learning loop.

I don’t think Prodigy/spaCy is set up to do this kind of joint learning out of the box. Even if you have both NER and text categorization pipelines in the same model, the NER model doesn’t incorporate the textcat labels (as far as I can tell from watching @honnibal’s video tutorial about the NER model), and the text categorization model doesn’t take labeled NER spans as features. Am I correct about this?

I think if I want to do this kind of joint learning I have to write the model myself. Maybe use spaCy to extract features and then write my own CNN or LSTM in Keras that does NER with an additional paragraph-of-interest feature. Or maybe find a way to reframe paragraph detection as an attention mechanism. This seems doable, and because spaCy/Prodigy is a pluggable architecture, I’d be able to incorporate it, but it still seems like a lot of work, so I’m wondering if there’s some easier way to accomplish the task already built into these tools. (Like if I just attached a contained-in-a-paragraph-of-interest probability to each token as a feature, would that fold the segmentation signal into an NER model? Or is this a job for a multitask objective?)

Do I have to roll my own joint learning system, or is this capability already built into spaCy/Prodigy in a way that I’m just overlooking?


@wpm If you segment the document into paragraphs, you can run the NER over all the paragraphs though, right?

I think it makes good sense to use the text classifier during training to find paragraphs with a high enough density of entities to make your annotation effort productive. But at runtime, where you just want the tool to extract entities, you may as well run it over the whole text.

As far as doing joint learning goes: there are a few ways you can do this. One solution would be to share the CNN layer between the NER component and the text classifier. This may or may not help: it does help a little to share the weights between the POS tagger and parser, but the disadvantage is you have to train the two together, which is a pain.

Another way to do joint NER and textcat would be to condition the NER labels on the type label applied to the text. For instance, you might jointly learn role labels for movie reviews with a scheme like POSITIVE_ACTOR and NEGATIVE_ACTOR.
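To make that concrete, here's a toy sketch (all names hypothetical, not a Prodigy API) of deriving joint labels by combining the document-level category with each span's label before training:

```python
# Sketch: prefix each span's NER label with the document-level
# category predicted by the text classifier, so ACTOR becomes
# POSITIVE_ACTOR or NEGATIVE_ACTOR. Hypothetical helper, not part
# of Prodigy or spaCy.
def join_labels(example):
    doc_label = example["label"]  # e.g. "POSITIVE" from the text classifier
    for span in example.get("spans", []):
        span["label"] = f"{doc_label}_{span['label']}"
    return example

example = {
    "text": "Tom Hanks was wonderful in this film.",
    "label": "POSITIVE",
    "spans": [{"start": 0, "end": 9, "label": "ACTOR"}],
}
print(join_labels(example)["spans"][0]["label"])  # POSITIVE_ACTOR
```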

While it’s not a joint strategy, a cheap way of including text classification labels as features would be to add the label as a token in the sentence (likely the first token). I doubt this would be very effective, though.
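A toy sketch of that idea, purely illustrative:

```python
# Sketch: prepend the predicted category as a pseudo-token so the NER
# model can "see" it. As noted above, this is cheap but probably not
# very effective. Hypothetical helper, not a Prodigy/spaCy API.
def add_label_token(text, label):
    return f"__{label}__ {text}"

print(add_label_token("Tom Hanks was wonderful.", "POSITIVE"))
# __POSITIVE__ Tom Hanks was wonderful.
```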


Good point. Maybe I should try just doing a standard NER training recipe except with a phrase matcher prepending likely paragraphs to the head of the stream, like in the current textcat recipe.

That, however, means I'm back to being blocked on my other question about extracting a set of examples from a stream.

I want to run an NER training task over a stream of paragraphs, and I want to move the paragraphs that are likely to contain named entities to the head of the stream. I can recognize these paragraphs because they also contain particular phrases. So I want to write a stream filter that moves paragraphs containing those phrases to the front of the stream. I'm back to wanting a function like find_with_terms(stream, seeds, at_least=10, at_most=1000, give_up_after=10000) except it would be find_with_phrases. The problem is I'm still not sure how to write a find_with_phrases that doesn't exhaust the original stream.

In the other thread you gave me an example recipe that did a combine_models on a text categorization model and a phrase matcher. That got around the "exhaust the stream" problem by having the combined model rank a single stream.

I'm playing with cloning the generator stream right now, but any guidance you could give me would help here. Maybe just a thumbnail sketch of how find_with_terms works, so I could write my own modification of it.
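For what it's worth, here's the kind of sketch I'm playing with: a find_with_phrases that tees the generator so that scanning for matches doesn't exhaust the original stream. Plain substring matching stands in for a real PhraseMatcher, and the signature just mirrors the find_with_terms signature above; none of this is Prodigy's actual implementation.

```python
from itertools import chain, tee

def find_with_phrases(stream, phrases, at_most=1000, give_up_after=10000):
    """Collect up to `at_most` tasks whose text contains one of the
    phrases, scanning at most `give_up_after` tasks, then replay the
    untouched copy of the stream after them. tee() buffers what the
    scan consumes, so the second copy still yields every task."""
    stream_a, stream_b = tee(stream, 2)
    matched = []
    for i, task in enumerate(stream_a):
        if i >= give_up_after or len(matched) >= at_most:
            break
        if any(p in task["text"] for p in phrases):
            matched.append(task)
    # Matched tasks come first, then the full replayed stream. The
    # matched tasks appear twice, so duplicates still need filtering
    # downstream (e.g. Prodigy's dedup hashing).
    return chain(matched, stream_b)

tasks = ({"text": t} for t in ["alpha beta", "gamma", "delta beta"])
reordered = list(find_with_phrases(tasks, ["beta"], at_most=10))
print([t["text"] for t in reordered])
```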

I Figured It Out

I pass in a graf_patterns option to the ner.teach recipe and use it to make the following modifications to the task stream.

# Inside the modified ner.teach recipe. Assumed imports (Prodigy v1-era
# module paths; concat may come from toolz or cytoolz depending on install):
from itertools import tee
from cytoolz import concat
from spacy.matcher import PhraseMatcher
from prodigy.components.loaders import get_stream
from prodigy.util import log

if graf_patterns:
    # Build a phrase matcher from one phrase per line in the patterns file.
    matcher = PhraseMatcher(nlp.vocab)
    with open(graf_patterns) as f:
        matcher.add("Paragraph", None, *nlp.pipe(line.strip() for line in f))
    # Clone the stream so scanning for matches doesn't exhaust it.
    stream, stream_a, stream_b = tee(stream, 3)
    tasks = zip(nlp.pipe(task["text"] for task in stream_a), stream_b)
    likely_paragraphs = [task for document, task in tasks if matcher(document)]
    for task in likely_paragraphs:
        task["meta"]["source"] = "graf-match"
        log("GRAF MATCH: {}".format(task))
    # Put the likely paragraphs first; rehup and dedup filter out the
    # copies replayed by the tee'd stream.
    stream = concat([likely_paragraphs, stream])
    stream = get_stream(stream, rehash=True, dedup=True)

This seems to do the trick. I'm still curious how you implement find_with_terms though.

That's a great and helpful discussion. I know this is an old post, but I was wondering if there has been any update on tackling this issue?
I am trying to do the same thing: use NER and text categorization in the same model, so the model can look at the NER component and then leverage the identified entities for text categorization.
I have used this recipe in spaCy before
python -m spacy init config configs/config_trf.cfg --lang en --pipeline ner,textcat
However, after reading @wpm's comment I'm not sure whether this pipeline actually leverages the NER component for text categorization, and if it does, whether there is an equivalent recipe for Prodigy.

Hi @shahinshirazi ,

python -m spacy init config configs/config_trf.cfg --lang en --pipeline ner,textcat

This config will, indeed, update the ner and textcat components in isolation, i.e. you wouldn't get the effect of joint learning.

As of spaCy 3.1 it is possible to propagate predictions between components, so that labels obtained from NER can be used as features in textcat. The default textcat architectures don't use NER features, though, so you'd need to provide a custom model that can leverage the annotating_components feature for textcat.
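For reference, a minimal sketch of how that setting looks in a training config. This is only a fragment; the surrounding config is assumed to come from `spacy init config` as above, and a custom textcat architecture is still needed to actually use the entities:

```ini
[nlp]
lang = "en"
pipeline = ["ner","textcat"]

[training]
# Components listed here write their predictions to the Doc during
# training, so components later in the pipeline (here: textcat) can
# see them as annotations.
annotating_components = ["ner"]
```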