Label NER, sentiment, and intent at the same time

Apologies if this has been asked before (a cursory search came up with nada), but I was wondering if it's possible to have an interface that lets me perform NER, sentiment, and intent classification in a single pass, rather than having to visit the same data 3 times. As you may have intuited, this is for a chatbot scenario.


We don't have an interface option for that, no. This is one of those times where we can make some nice improvements to the software by enforcing an assumption about how the tasks work. Specifically, we can keep the UI simpler and more consistent if we only allow one annotation layer at a time.

The constraint also guides users to more effective usage patterns in most situations. We would pretty much always recommend doing the NER and text classification annotations separately. Doing smaller units of work in each annotation pass allows you to more easily maintain inter-annotator consistency, as you can judge the work separately, and the annotators only have to think about a smaller annotation scheme at once.

Keeping the annotation scheme smaller for each pass also minimises the number of clicks per annotation, so even though the same text needs to be visited multiple times, it's usually faster to do the annotations in multiple passes, one per annotation type, rather than doing the annotations together.

Of course, there can always be situations where the normal finding (that it's faster and more accurate to do it in separate passes) doesn't hold for a particular use-case. If you're really sure a single pass would be better for your use-case, I'd be glad to hear why that's the case. But unfortunately we don't plan to support multiple annotation layers in a single session at this point.

Thanks for the reply. I confess I definitely don't have as much experience as you all with the processes you describe, so it's possible I am being naive in my expectation that different passes over the same text would be annoying.

More specifically, the task at hand is to annotate customer service chats for making a chatbot. I am building models to do (or at least attempt to) NER for rather finicky things like addresses (extract down to suburb/postcode level), but also more straightforward things like intent (for chatbot flow) and sentiment (just useful to have).

Were the task simply +ve or -ve sentiment I would definitely agree that doing it separately makes sense, because you can set up some sweet keyboard shortcuts and you're away. But in this case the intent and the extracted entities are likely quite closely tied, so the annotator has to sort of mentally parse those aspects together anyway. For example, if someone is asking for help with a specific booking, that tells you the intent, and you might also have to tag a booking number. If the intent is to get a quote, you're probably tagging locations as well.

(Just 'for completeness': I did try using existing NER classifiers, including spaCy, for the location tagging, but unfortunately the results weren't very consistent. It doesn't help that a lot of our inputs come across as uncased, which understandably seems to hurt the location tags a lot. I've had decent success with a plain CRF with some domain-specific features, though.)

For the workflow you are describing here, I would suggest first labelling the intents, and then labelling entities, one intent at a time. So for example once you've labelled all the bookings, you would run a second session where you were only tagging booking numbers; this means that you won't have to switch between annotation types during the labelling session, reducing cognitive load on the labellers.
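
As a rough illustration of that two-pass workflow, here is a minimal sketch in plain Python of how the output of the intent pass could be split into one batch per intent for the entity pass. The record layout (`text`, `label`, `answer` keys), the intent names, and the `group_by_intent` helper are all assumptions made for the example, not something the tool is confirmed to produce or provide, so adjust them to whatever your exported annotations actually look like.

```python
import json
from collections import defaultdict


def group_by_intent(records):
    """Group accepted intent annotations by label for a later entity-only pass.

    Assumes each record is a dict with "text", "label" and "answer" keys
    (a hypothetical layout chosen for this example).
    """
    by_intent = defaultdict(list)
    for rec in records:
        # Skip anything the annotator did not accept in the intent pass.
        if rec.get("answer") != "accept":
            continue
        # Carry over only the text, plus the intent as metadata for reference.
        task = {"text": rec["text"], "meta": {"intent": rec["label"]}}
        by_intent[rec["label"]].append(task)
    return dict(by_intent)


# Hypothetical output of the first (intent) pass.
intent_annotations = [
    {"text": "can you help with booking 48213", "label": "BOOKING_HELP", "answer": "accept"},
    {"text": "quote for delivery to newtown 2042", "label": "GET_QUOTE", "answer": "accept"},
    {"text": "thanks bye", "label": "GET_QUOTE", "answer": "reject"},
]

batches = group_by_intent(intent_annotations)

for intent, tasks in batches.items():
    # Each batch could be written out as one JSON object per line and used
    # as the input for a session that only tags that intent's entities.
    lines = [json.dumps(task) for task in tasks]
    print(intent, len(lines))
```

Each batch then becomes the input for its own session, for example tagging only booking numbers in the `BOOKING_HELP` batch and only locations in the `GET_QUOTE` batch.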