I have been trying to follow your insult video using patterns, but the accuracy is very low (20-25%). I'm also not able to use the textcat.teach recipe. Can you please explain it in detail?
Even that link is not clear about labelling or about combining patterns with the Reddit dataset. Please add a sample example for it.
Selecting the examples to annotate isn't something we'll be able to help with. What data you choose (and how you define self deprecation) will depend on what you want to do with the model. The training text needs to be a functional approximation of the text you want to use the model on at runtime, so it's really an application-specific question.
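On the patterns side, here is a minimal sketch of how a patterns file could be generated. It assumes the JSONL match-pattern format, with one `{"label": ..., "pattern": [...]}` object per line and token attributes such as `lower`. The label `INSULT`, the seed terms, and the file name `insult_patterns.jsonl` are placeholders for illustration, not taken from the video. Multi-word terms are split on whitespace here, which is only an approximation of how the tokenizer splits text.

```python
import json


def build_patterns(label, terms):
    """Turn seed terms into token-based match patterns for one label."""
    patterns = []
    for term in terms:
        # One token dict per whitespace-separated word, matched case-insensitively.
        tokens = [{"lower": word} for word in term.lower().split()]
        patterns.append({"label": label, "pattern": tokens})
    return patterns


def write_jsonl(path, records):
    """Write one JSON object per line (JSONL)."""
    with open(path, "w", encoding="utf8") as f:
        for record in records:
            f.write(json.dumps(record) + "\n")


# Hypothetical seed terms; replace them with terms that fit your own data.
patterns = build_patterns("INSULT", ["idiot", "shut up"])
write_jsonl("insult_patterns.jsonl", patterns)
```

The resulting file can then be passed to `textcat.teach` with the `--patterns` option, together with the same label. The exact argument order may differ between versions, so it's worth checking the recipe's help output for your installation.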