Hi! The highlighted text is the matched pattern that was used to select that example. (When I recorded my video, Prodigy didn't yet highlight the pattern that was actually matched, which people found a bit confusing. The recipe now does that to make it more transparent that the example was selected based on a specific match in the text.) You're still annotating the text plus label, and when you train your model, you'll be training on the text plus label, too. The highlight is just there so you know what the suggestion is based on.
The pattern matcher currently just yields every match, so if multiple matches occur in the same text, you'll see that text once for each match. We do want to change this in the next update that allows us to break backwards compatibility. In the meantime, you can find more details and code for a filter function in this thread: textcat.teach presents same annotation task if text snippet contains multiple patterns - #2 by ines
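For illustration, here's a rough sketch of what a filter like that could look like. This isn't the code from the linked thread, just a minimal version under a couple of assumptions: the stream is an iterable of task dicts with a "text" key, and the function name `filter_duplicate_texts` is made up for this example. If the tasks already carry an "_input_hash", it's used as the key, otherwise the raw text is.

```python
def filter_duplicate_texts(stream):
    """Yield each text only once, even if several patterns matched it.

    Hypothetical helper: assumes every task is a dict with a "text" key
    and optionally an "_input_hash" key.
    """
    seen = set()
    for eg in stream:
        # Prefer the input hash if the task has one, else fall back to the text
        key = eg["_input_hash"] if "_input_hash" in eg else eg.get("text")
        if key in seen:
            continue  # same text already sent out for an earlier match
        seen.add(key)
        yield eg


# Two matches in the same text produce two tasks; only the first is kept
examples = [
    {"text": "We sell cheap laptops", "label": "SALES"},
    {"text": "We sell cheap laptops", "label": "SALES"},
    {"text": "The weather is nice", "label": "SALES"},
]
filtered = list(filter_duplicate_texts(examples))
```

In a custom recipe, you'd wrap the stream in a function like this before returning it. Note that it keeps whichever match comes first, so the highlight you see is the first pattern that matched.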
That's interesting, because I always feel like writing abstract patterns is actually much more useful for text classification than it is for NER. For entities, you often have a pretty specific idea of what the spans should be, so the main token attributes you'd probably want to use are the token text and maybe the lowercase form (to make the match case-insensitive). But if you're assigning labels to the whole text, the "trigger words" or phrases are often much more vague and can be stuff like "word with the lemma 'sell'" or "this noun with optional adjective X, Y or Z". That's where token-based patterns make a lot more sense than just more or less exact string matches. But I guess it really depends on the use case.
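To make the two examples more concrete, here's a sketch of how they could be written as token-based patterns and dumped to a JSONL patterns file. The label SALES, the noun "discount" and the adjectives are placeholders I picked for illustration. The second pattern assumes a spaCy version that supports the IN set operator, and both the lemma and the pos attribute depend on what the loaded model predicts.

```python
import json

# Placeholder label and words, chosen only for this example
patterns = [
    # "word with the lemma sell": sell, sells, selling, sold
    {"label": "SALES", "pattern": [{"lemma": "sell"}]},
    # "this noun with optional adjective X, Y or Z"
    {
        "label": "SALES",
        "pattern": [
            # optional adjective from a fixed set (assumes IN is supported)
            {"lower": {"IN": ["big", "huge", "special"]}, "op": "?"},
            # the noun itself, matched case-insensitively
            {"lower": "discount", "pos": "NOUN"},
        ],
    },
]


def to_jsonl(entries):
    """Serialize one pattern entry per line."""
    return "\n".join(json.dumps(entry) for entry in entries)


patterns_jsonl = to_jsonl(patterns)
print(patterns_jsonl)
```

An exact string match would need a separate entry for every inflection and every adjective/noun combination, which is why the token attributes pay off more here than for most entity types.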