I want to use Prodigy to identify side effects in social media posts, using the classification recipe. My data is unbalanced, so I decided to add a pattern list with some side effect terms for bootstrapping. During annotation, Prodigy now highlights the terms from the list wherever they appear in the posts. But when the highlighted term is a symptom in that context, and a different term that is not in the pattern list is the actual side effect, Prodigy does not recognize that the post should still be labeled as a side effect mention.
E.g. the list contains the word "headache". The post looks like this: "To avoid having a headache i take aspirin. But after taking aspirin i always have a flu". The word "headache" is not a side effect in this context, but the post itself should be classified as "side effect" because the user is talking about having a flu.
Do you have any tips on how I can proceed?
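To make the example concrete, here is a minimal sketch of my setup. The label name `SIDE_EFFECT` and the helper `matched_terms` are made up for illustration, and the matching is a simplified single-token stand-in, not what Prodigy does internally. The patterns are written one JSON object per line, which is how I understand the patterns file format.

```python
import json
import re

# Hypothetical label name; only "headache" comes from my example above.
LABEL = "SIDE_EFFECT"
terms = ["headache"]

# One JSON object per line: a label plus a list of token descriptions.
patterns = [{"label": LABEL, "pattern": [{"lower": term}]} for term in terms]
patterns_jsonl = "\n".join(json.dumps(p) for p in patterns)
print(patterns_jsonl)


def matched_terms(text, patterns):
    """Simplified stand-in for pattern matching: lowercase single-token lookup."""
    tokens = re.findall(r"\w+", text.lower())
    wanted = {p["pattern"][0]["lower"] for p in patterns}
    return [tok for tok in tokens if tok in wanted]


post = (
    "To avoid having a headache i take aspirin. "
    "But after taking aspirin i always have a flu"
)

# Only "headache" is found; "flu" is not in the list, so nothing points to
# the part of the post that actually describes the side effect.
print(matched_terms(post, patterns))
```

So the only highlighted term is the one that is not a side effect here, while the real side effect goes unmarked.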
And three more questions:
- Does Prodigy save the pattern list within the generated model?
- How does the classification work internally?
- I have a predefined list of drugs. How can I use the classification recipe so that Prodigy learns that a post only contains a side effect when one of the predefined drugs is also mentioned?
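For the drug list question, this is roughly what I have in mind: a sketch that filters the posts before annotation so that only posts mentioning a listed drug are shown. The drug names (apart from "aspirin" from my example), the `{"text": ...}` record shape, and the function names are assumptions for illustration. I don't know whether pre-filtering like this is the intended approach, or whether the recipe can handle it directly.

```python
import re

# Hypothetical drug list; only "aspirin" appears in my example post.
DRUGS = {"aspirin", "ibuprofen"}


def mentions_drug(text, drugs=DRUGS):
    """Return True if any listed drug occurs as a whole token in the text."""
    tokens = set(re.findall(r"\w+", text.lower()))
    return bool(tokens & drugs)


def filter_posts(posts):
    """Yield only the posts that mention at least one listed drug."""
    for post in posts:
        if mentions_drug(post["text"]):
            yield post


posts = [
    {"text": "After taking aspirin i always have a flu"},
    {"text": "I have had a headache all day"},
]

kept = list(filter_posts(posts))
for post in kept:
    print(post["text"])
```

This only checks whether a drug is mentioned at all; it does not check whether the side effect is actually attributed to that drug, which is the part I am unsure how to handle.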