I've read a few threads relevant to this and I think I am heading in the right direction. However, I have a few specific questions about implementation and it would be wonderful to benefit from anyone's insights.
I need to design an annotation and measurement pipeline for target-dependent sentiment analysis. For my use case, the measurement problem has two components:
- Detect targets
- Measure sentiment expressed toward targets
To detect targets, I am planning on training an NER model. I'll probably have about 12 target types.
If an eligible target is detected in a document, I then want to feed it into a sentiment model. This model will need to predict multiple distinct sentiment categories for that target.
For example, assume fruit is an entity type and I want to estimate valence, purchase intent, and previous experience. In the document:
Ex 1. "I'm going to get some mangos. I've eaten them since I was a child and I absolutely love them."
Mangos would be tagged as "fruit" and the correct sentiment predictions would be: valence: 1, purchase: 1, experience: 1.
However, this gets more tricky when a document has multiple entities:
Ex 2. "I'm going to get some mangos. I've eaten them since I was a child and I absolutely love them. But, I'm not bringing back any spinach...I've always hated it!"
Obviously, the example prediction scheme above won't work. The model needs to make predictions for each entity. To accomplish this, I've thought of three options:
1. Have a sentiment category for each entity type. So, if there are 10 entity types and 10 category types, predict 100 labels.
2. Prepend the entity token(s) to the sentence. I am not sure how well it would work, but the idea is that Ex 2. would be sent to the sentiment classifier twice: the first time with "mangos" prepended and the second time with "spinach" prepended. With enough training data, my thought is that the model will learn to focus on the target.
3. A recent paper reported good target-dependent sentiment results using BERT. Rather than classify the whole sentence, they just took the embeddings for the target token(s) and pushed them through a few layers. So, for Ex 2., I would use the NER spans to extract the embeddings for "mangos" and send those to the classifier. Then I'd do the same for "spinach".
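To make options 2 and 3 a bit more concrete, here is a rough sketch of what I have in mind. Everything in it is illustrative: the helper names `prepend_target` and `pool_target` are made up, the `[SEP]` separator is just a BERT-style convention I'm assuming, and the toy 2-d vectors stand in for real contextual embeddings. Spans are (start, end) token offsets with an exclusive end, as an NER step might return them.

```python
from typing import List, Tuple

# Hypothetical span type: (start_token, end_token), end exclusive.
Span = Tuple[int, int]


def prepend_target(tokens: List[str], span: Span, sep: str = "[SEP]") -> str:
    """Option 2: build one classifier input per entity by prepending the target."""
    start, end = span
    target = " ".join(tokens[start:end])
    return f"{target} {sep} {' '.join(tokens)}"


def pool_target(vectors: List[List[float]], span: Span) -> List[float]:
    """Option 3: mean-pool the token vectors that fall inside the target span."""
    start, end = span
    rows = vectors[start:end]
    if not rows:
        raise ValueError("empty span")
    # zip(*rows) iterates over embedding dimensions (columns).
    return [sum(col) / len(rows) for col in zip(*rows)]


tokens = ["I", "love", "mangos", "but", "hate", "spinach"]
spans = [(2, 3), (5, 6)]  # assumed NER output for "mangos" and "spinach"

# Option 2: the same document is sent to the classifier once per target.
inputs = [prepend_target(tokens, s) for s in spans]

# Option 3: toy vectors standing in for per-token contextual embeddings.
vectors = [[float(i), float(i) * 2] for i in range(len(tokens))]
pooled = [pool_target(vectors, s) for s in spans]
```

In both cases the document yields one example per detected entity, so the sentiment labels stay shared across entity types instead of being multiplied out as in option 1.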
My questions are:
For option 3, I assume I would need to use a custom spaCy model. Anything else I should be aware of?
For option 2, I could indicate which entity to annotate for in a given document. I read a previous thread about highlighting spans in textcat. However, it might be easier to just append the target to the sentence (as described) and then annotate that. Any thoughts?
Tbh, I don't think option 1 is a great idea. There will be a ton of sparsity and most of it will be unnecessary as I expect that semantic and syntactic patterns indicating sentiment will be quite similar across entity types.
Finally, I will ideally be detecting three levels of sentiment for some sentiment categories. Something like: not present, moderate, and extreme. That is, I'd like to distinguish between "I hate mangos" and "Mangos don't really do it for me".
It seems like Prodigy would want me to treat these as two binary tasks, rather than a 3-way classification. But it would be nice if someone could weigh in on this.
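For what it's worth, this is how I imagine the two-binary-task version mapping onto the three levels. It is only a sketch under my own assumptions: the label names `present` and `extreme` and the helpers `to_binary` / `from_binary` are hypothetical, not anything Prodigy defines.

```python
# Ordered from weakest to strongest; the order is what makes the encoding work.
LEVELS = ("not_present", "moderate", "extreme")


def to_binary(level: str) -> dict:
    """Encode a 3-level label as two yes/no questions."""
    rank = LEVELS.index(level)
    return {"present": rank >= 1, "extreme": rank >= 2}


def from_binary(present: bool, extreme: bool) -> str:
    """Decode two binary answers back into one of the three levels."""
    if not present:
        # An 'extreme' answer without 'present' is inconsistent; treat as absent.
        return "not_present"
    return "extreme" if extreme else "moderate"


encoded = {level: to_binary(level) for level in LEVELS}
decoded = [from_binary(**encoded[level]) for level in LEVELS]
```

So "I hate mangos" would be annotated as present and extreme, while "Mangos don't really do it for me" would be present but not extreme.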
Thanks so much!