Med7 — an information extraction model for clinical natural language processing (built with spaCy & Prodigy)

mefrem · October 29, 2020, 8:39pm

Andrey! This is a great project. I've been using it for my own work and have a question.

Did you or the team ever try to link the other Med7 entity types to each 'DRUG' entity? DOSAGE, STRENGTH, FREQUENCY, etc. each seem to describe a specific DRUG mention, so they are logically "downstream" of that DRUG entity.

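What I wanted to do was create a custom pipeline component that, for each DRUG entity, sets an extension attribute holding all the other entities that relate to it, so they can be accessed directly from the DRUG span. Something like this, where drug_attributes is the extension I have in mind: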

sent = med7('She was prescribed Ibuprofen 200 mg daily for two weeks.')

for ent in sent.ents:
    if ent.label_ == 'DRUG':
        # drug_attributes is the custom extension I would like to populate
        print(ent._.drug_attributes)
# desired output:
# (('200 mg', 'STRENGTH'), ('daily', 'FREQUENCY'), ('for two weeks', 'DURATION'))

But the logic is hard. I've tried dependency parsing, but the attributes can attach to a DRUG through just about every dependency relation imaginable, so a rule-based approach is tough.

I figured you might have some experience with this, or may even have pursued this functionality yourselves. I've had a lot of trouble because there are so many possible linguistic relationships between the drug_attributes and each DRUG.
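
As a point of comparison, a naive baseline that ignores the parse entirely would attach each attribute to the nearest preceding DRUG in the sentence. The following is only a minimal sketch of that heuristic under stated assumptions: `link_attributes` and the `(text, label, start_offset)` tuple layout are hypothetical names for illustration, not part of Med7 or spaCy, and the entities are typed in by hand rather than produced by the model.

```python
from typing import Dict, List, Tuple

# Hypothetical flat representation of an entity: (text, label, start_offset).
Entity = Tuple[str, str, int]


def link_attributes(entities: List[Entity]) -> Dict[str, List[Tuple[str, str]]]:
    """Attach every non-DRUG entity to the nearest preceding DRUG.

    Naive heuristic: attributes that appear before any DRUG are dropped,
    and repeated mentions of the same drug text share one entry.
    """
    linked: Dict[str, List[Tuple[str, str]]] = {}
    current_drug = None
    # Walk the entities in document order.
    for text, label, _start in sorted(entities, key=lambda e: e[2]):
        if label == "DRUG":
            current_drug = text
            linked.setdefault(text, [])
        elif current_drug is not None:
            linked[current_drug].append((text, label))
    return linked


# Entities written out by hand for the example sentence
# 'She was prescribed Ibuprofen 200 mg daily for two weeks.'
entities = [
    ("Ibuprofen", "DRUG", 19),
    ("200 mg", "STRENGTH", 29),
    ("daily", "FREQUENCY", 36),
    ("for two weeks", "DURATION", 42),
]
result = link_attributes(entities)
print(result)
```

With spaCy, the tuples could be built from `doc.ents` using `ent.text`, `ent.label_` and `ent.start_char`, and the result stored on each DRUG span through a custom extension registered with `Span.set_extension("drug_attributes", default=None)`. The heuristic breaks down as soon as an attribute precedes its drug ("200 mg of Ibuprofen") or several drugs share one attribute, which is exactly where the rule-based approach gets hard.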

@honnibal What would be the best approach? Do you think associating these Med7 entities I'm calling "drug_attributes" (DOSAGE, STRENGTH, etc.) with a specific DRUG in each sentence is a task well suited to a rule set or pattern matching? Or would it be better handled by a statistical model?
