I think this task presents a number of challenges for annotation, because it’s difficult to design the schema precisely. I would encourage you to limit the target topics to a smallish set, e.g. delivery, customer service, price and quality. Then you can do the annotation as a text classification task, with multiple labels allowed per text.
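To make that concrete, here is a minimal sketch of what such a scheme could look like as data. Everything in it is an assumption for illustration: the topic list comes from the examples above, combining each topic with a positive/negative sentiment into one label is just one possible design, the example sentence is made up, and `make_options` and `make_task` are hypothetical helpers. The `text` plus `options` (each with an `id` and a `text`) layout is meant to resemble a multiple-choice annotation task, so check the Prodigy docs for the exact format and settings before relying on it.

```python
import json

# Assumed topic set, taken from the examples in this reply.
TOPICS = ["delivery", "customer service", "price", "quality"]
# Assumed sentiment values; a real scheme might also want "neutral" or "mixed".
SENTIMENTS = ["positive", "negative"]


def make_options():
    """Build one option per (topic, sentiment) pair, e.g. DELIVERY_NEGATIVE."""
    options = []
    for topic in TOPICS:
        for sentiment in SENTIMENTS:
            label = f"{topic.upper().replace(' ', '_')}_{sentiment.upper()}"
            options.append({"id": label, "text": f"{topic}: {sentiment}"})
    return options


def make_task(text):
    """Wrap a text in a task dict carrying the full, fixed set of options."""
    return {"text": text, "options": make_options()}


if __name__ == "__main__":
    # Made-up example text, only here to show the shape of one task.
    task = make_task("The parcel arrived late, but support sorted it out quickly.")
    print(json.dumps(task, indent=2))
```

Because every task carries the same fixed options, the annotator can select several of them for one text, and nobody has to mark which words express the target.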
Modelling the problem as text classification has the advantage that you don’t run into problems identifying which words in a text constitute the target. As we see in your second example, the phrase “customer service” never occurs. Even the noun phrase that does anchor the reference (support) doesn’t have a direct syntactic relationship with the word that ascribes the sentiment (better).
The disadvantage of the text classification approach is that the set of sentiment targets is fixed: it has to be designed into the annotation scheme up front. Once you’re using the resulting sentiment analysis system, you can’t discover negative sentiment about something new, e.g. the online booking system.
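A small sketch of what "fixed" means in practice, assuming the same illustrative topic set as above. `ALLOWED_TARGETS` and `record_annotation` are hypothetical names, not part of any library: the point is only that a target outside the scheme has nowhere to go.

```python
# Assumed, fixed set of targets designed into the annotation scheme.
ALLOWED_TARGETS = {"delivery", "customer service", "price", "quality"}


def record_annotation(annotations, target, sentiment):
    """Append (target, sentiment) if the target is part of the scheme."""
    if target not in ALLOWED_TARGETS:
        # A new topic cannot be represented without changing the scheme.
        raise ValueError(f"'{target}' is not in the annotation scheme")
    annotations.append((target, sentiment))
    return annotations


if __name__ == "__main__":
    collected = record_annotation([], "price", "negative")
    print(collected)
    try:
        record_annotation(collected, "online booking system", "negative")
    except ValueError as err:
        print(err)
```

Covering a new target means extending the scheme and annotating more data for it, which is the trade-off against approaches that try to extract the target from the text itself.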
I’d say that the questions around how to do this best with Prodigy are the same questions you’ll face doing the task with other tools. Prodigy makes you confront some of these questions more explicitly, which we see as a big advantage. But no matter what tooling you use, the same issues will be there.