A joint project of the Graduate School, Peabody College, and the Jean & Alexander Heard Library

Title page for ETD etd-07022012-120906

Type of Document Master's Thesis
Author Smith, Joshua Carl
Author's Email Address joshua.smith@vanderbilt.edu
URN etd-07022012-120906
Title Exploring Adverse Drug Effect Discovery from Data Mining of Clinical Notes
Degree Master of Science
Department Biomedical Informatics
Advisory Committee
  Advisor Name                  Title
  Randolph A. Miller            Committee Chair
  Joshua C. Denny               Committee Member
  S. Trent Rosenbloom           Committee Member
  W. Anderson Spickard, III     Committee Member
Keywords
  • pharmacovigilance
  • adverse drug effects
  • natural language processing
  • data mining
Date of Defense 2012-05-17
Availability unrestricted
Abstract
Many medications have potentially serious adverse effects that are detected only after FDA approval. After 80 million people worldwide received prescriptions for the drug rofecoxib (Vioxx), its manufacturer withdrew it from the marketplace in 2004. Epidemiological data showed that it increases the risk of heart attack and stroke. Recently, the FDA warned that the commonly prescribed statin drug class (e.g., Lipitor, Zocor, Crestor) may increase the risk of memory loss and Type 2 diabetes. These incidents illustrate the difficulty of identifying adverse effects of prescription medications during premarketing trials. Only postmarketing surveillance can detect some types of adverse effects (e.g., those requiring years of exposure).

We explored the use of data mining on clinical notes to detect novel adverse drug effects. We constructed a knowledge base using UMLS and other data sources that could classify drug-finding pairs as “currently known adverse effects” (drug causes finding), “known indications” (drug treats/prevents finding), or “unknown relationship”. We used natural language processing (NLP) to extract current medications and clinical findings (including diseases) from 360,000 de-identified history and physical examination (H&P) notes. We identified 35,000 “interesting” co-occurrences of medication-finding concepts that exceeded threshold probabilities of appearance. These involved ~600 drugs and ~2,000 findings.

Among the identified pairs are several that the FDA recognized as harmful in postmarketing surveillance, including rofecoxib and heart attack, rofecoxib and stroke, statins and diabetes, and statins and memory loss. Our preliminary results illustrate both the problems and the potential of using data mining of clinical notes for adverse drug effect discovery.
Filename    Size      Approximate Download Time (Hours:Minutes:Seconds)
                      28.8 Modem   56K Modem   ISDN (64 Kb)   ISDN (128 Kb)   Higher-speed Access
Smith.pdf   1.33 Mb   00:06:09     00:03:09    00:02:46       00:01:23        00:00:07
