Combining NER and Classification

ivan · March 1, 2022, 11:23pm

Hi,

There are some great posts on combining NER with Classification. I have made good progress, but would like to make sure I am doing this correctly.

I have an NER model called lpo_ner_model. I would like to use the entities this model can identify for training the classification model.

What does this command do?
prodigy textcat.teach lpo_cat lpo_ner_model data/comments.jsonl --label labels.txt

How is it using the NER model to teach the classification model?
Is it the best way to use the NER Model to start categorizing the data in comments.jsonl?

Many Thanks.

ines · March 2, 2022, 7:12pm

Hi! In spaCy, the ner and textcat (or textcat_multilabel) components are separate, so the textcat predictions don't depend on the named entities, and vice versa. However, the textcat component will still take the whole text into account, so the words mentioned will have an impact on the predictions – but it won't rely on the entities and entity labels for that.

So you can collect your NER and textcat annotations in separate datasets and then use prodigy train with --ner pointing at your NER dataset(s) and --textcat or --textcat-multilabel pointing at your textcat datasets. This will train a model containing both components, trained on the respective data.

ivan · March 3, 2022, 12:38am

Thank you, this makes sense.

What is the best approach for trying to use the NER model predictions to train the Classification model?

I found this reference. Is this basically the approach to use?
[Using predictions from preceding components (v3.1)](https://spacy.io/usage/training#annotating-components)

Sorry if these are elementary questions, I am still learning what the out-of-the-box capabilities are. I love the annotation interface, but I'm not super familiar with spaCy yet.

ines · March 4, 2022, 11:36am

In theory, you could build your own text classifier implementation that uses the named entities as features, but this is likely overkill. I'm really not sure that's necessary, or that it will actually give you an advantage over just training the default text classifier separately. The text classifier will still get to see the same texts including the entity mentions, so it can still take this information into account without the predicted entities being explicit features of the model.
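
For illustration only: if you still want to experiment, a much lighter alternative to a custom classifier implementation is to mark the predicted entity labels in the text before it goes to the text classifier. This is a rough sketch, not the custom-architecture approach described above, and there is no guarantee it helps. It assumes the entities are available as character-offset spans with "start", "end" and "label" keys; the function name, the example text and the SUBJECT and CAUSE labels are all made up for the example.

```python
def add_entity_markers(example):
    """Return the text with a label token appended after each entity span.

    `example` is assumed to be a dict with a "text" string and an optional
    "spans" list of non-overlapping character-offset entities.
    """
    text = example["text"]
    pieces = []
    cursor = 0
    # Walk the spans left to right so the offsets stay valid.
    for span in sorted(example.get("spans", []), key=lambda s: s["start"]):
        pieces.append(text[cursor:span["end"]])
        pieces.append(f" [{span['label']}]")
        cursor = span["end"]
    pieces.append(text[cursor:])
    return "".join(pieces)


# Hypothetical example with made-up labels.
example = {
    "text": "pump failed due to worn seal",
    "spans": [
        {"start": 0, "end": 4, "label": "SUBJECT"},
        {"start": 19, "end": 28, "label": "CAUSE"},
    ],
}
marked = add_entity_markers(example)
print(marked)
```

The marked texts could then be annotated and trained on like any other textcat data, which makes it easy to compare against a classifier trained on the plain texts.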

ivan · March 4, 2022, 2:05pm

OVERKILL!

My data is human-entered cause-and-effect data. I am trying to extract cause, subject and effect from this data.

The NER model seems to be pretty good at extracting entities from these comments, however, since the comments are from humans there are lots of synonyms for the same cause, subject, and effect. It may be that I just need to generate a synonym list rather than a classification model.
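
A minimal sketch of what the synonym-list idea might look like, assuming the entity texts have already been extracted by the NER model. The mapping, the entity strings and the function name are invented for the example and would need to be built from the real data.

```python
# Hypothetical synonym list: maps each variant to one canonical term.
SYNONYMS = {
    "gasket": "seal",
    "o-ring": "seal",
    "leaking": "leak",
    "leakage": "leak",
}


def normalize(entity_text, synonyms=SYNONYMS):
    """Lowercase and trim an entity, then map it to its canonical term.

    Entities without an entry in the synonym list are returned unchanged
    (apart from the lowercasing and trimming).
    """
    key = entity_text.strip().lower()
    return synonyms.get(key, key)


# Made-up entity texts as they might come out of the NER model.
entities = ["Gasket", "o-ring", "Leakage", "pump"]
normalized = [normalize(e) for e in entities]
print(normalized)
```

Grouping the extracted entities by their normalized form would then give counts per cause, subject and effect without needing a classifier at all.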

I can imagine that a classification model may have trouble trying to figure out cause and effect when different nouns are used sometimes as a cause and sometimes as an effect, and the comments are very terse in many cases.

Thanks for your help, got lots of things to try!

senol · July 31, 2022, 1:10pm

Hi, the same problem here. I also want to use predicted entities for text classification. Have you found any solution?

ryanwesslen · August 1, 2022, 3:28pm

hi @senol!

Thanks for the question. Check out this recent post, "How to train a TextCategorizer using the entities matched by NER or EntityRuler", on the spaCy GitHub community site. Hope this helps!

senol · August 5, 2022, 6:06pm

Thanks for the reply. Will check it out.
