Demographics Entity Extraction from clinical trial eligibility criteria

umer · April 17, 2020, 1:41am

Hi,
I recently joined the Prodigy community. I've started working on a project to extract demographic data, medical conditions, and drugs from clinical trial eligibility criteria in order to build a semantic relationship graph. I was able to extract medical conditions, drugs, and their attributes using the scispaCy model. However, I now also want to extract demographic data. Could you please suggest an available pre-trained model or other ways to extract it?
In addition, I'd also like to categorize the disease type based on attributes such as level (mild, moderate, or severe) and duration (chronic or short-term). It would be helpful if anyone could guide me.
Thanks in advance!

ines · April 17, 2020, 9:47am

Hi! It's difficult to give a definitive answer because the approach that works best will depend on your data, how you break down the demographics you want to extract into categories, etc. In this case, you may want to experiment with some manual annotation first (possibly with patterns that pre-select entities for you), and then train a separate entity recognizer. The usage guide on NER should be a good place to start.

If you haven't seen it already, also check out the medical tag on the forum for discussion related to training models for biomedical use cases: Topics tagged medical

Also, this is a recent project published by researchers at Oxford, and it's built on top of spaCy and trained on data annotated with Prodigy. They published a detailed blog post and a paper on the approaches they chose and the different considerations. So if you haven't seen this yet, it's definitely an interesting read and should be pretty relevant to you.
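
To make the pattern idea a bit more concrete, here is a minimal sketch of how a patterns file for pre-selecting demographic entities could be generated. The `AGE` and `GENDER` labels, the example patterns, and the `to_jsonl` helper are assumptions for illustration only; the right label scheme depends on your data. Each entry follows the `{"label": ..., "pattern": [...]}` JSONL format that Prodigy reads, with spaCy token attributes inside the pattern.

```python
import json

# Hypothetical label scheme and patterns; adapt these to your own data.
# Each pattern is a list of token descriptions using spaCy token attributes.
PATTERNS = [
    # A number followed by "years", e.g. "18 years"
    {"label": "AGE", "pattern": [{"like_num": True}, {"lower": "years"}]},
    # "aged" followed by a number, e.g. "aged 65"
    {"label": "AGE", "pattern": [{"lower": "aged"}, {"like_num": True}]},
    # Single-token gender mentions, matched case-insensitively
    {"label": "GENDER", "pattern": [{"lower": "male"}]},
    {"label": "GENDER", "pattern": [{"lower": "female"}]},
]


def to_jsonl(patterns):
    """Serialize the patterns as JSONL: one JSON object per line."""
    return "\n".join(json.dumps(pattern) for pattern in patterns)


if __name__ == "__main__":
    # Save the output as e.g. patterns.jsonl and pass it to a manual
    # annotation recipe via its --patterns option.
    print(to_jsonl(PATTERNS))
```

The patterns only pre-highlight candidate spans during annotation, so they don't need to be exhaustive. You would still correct the suggestions by hand and then train the entity recognizer on the resulting annotations.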

umer · April 17, 2020, 11:53pm

@ines Thanks for your prompt reply. I'll try this and get back to you.
