Hi,
I've a small data set of 10 data points on "Lipitor 5mg" (same as Atorvastatin, but branded), 5 data points on "Atorvastatin 10mg", and 5 on "Atorvastatin 20mg". When I do a regular term.teach they are all identified as drug, but not as a statin since I have too few data points.
{"label": "DRUG", "patterns": [{lower: "Lipitor"}, {"lower": 5mg}]}
{"label": "DRUG", "patterns": [{lower: "Atorvastatin"}, {"lower": 10mg}]}
{"label": "DRUG", "patterns": [{lower: "Atorvastatin"}, {"lower": 20mg}]}
I like to map all drugs to the single entry Atorvastatin in the vocabulary that is used further in the analysis. I tried
{"name": "Atorvastatin", "label": "DRUG", "patterns":
[
[{lower: "Lipitor"}, {"lower": 5mg}],
[{lower: "Atorvastatin"}, {"lower": 10mg}],
[{lower: "Atorvastatin"}, {"lower": 20mg}],
]}
but I don't know if I get what I need. I could not find a smart way to verify it. In the ideal case I have a sentence where I have only the drug Atorvastatin, nothing else
suggestions?
thanks
Andreas