A joint project of the Graduate School, Peabody College, and the Jean & Alexander Heard Library

Title page for ETD etd-12022013-152619


Type of Document Master's Thesis
Author VanHouten, Jacob Paul
Author's Email Address jacob.p.vanhouten@vanderbilt.edu
URN etd-12022013-152619
Title Random Forest Classification of Acute Coronary Syndrome
Degree Master of Science
Department Biomedical Informatics
Advisory Committee
Advisor Name Title
Thomas A. Lasko, MD, PhD Committee Chair
David J. Maron, MD Committee Member
John M. Starmer, MD Committee Member
Nancy M. Lorenzi Committee Member
Keywords
  • prediction
  • machine learning
  • cardiovascular
Date of Defense 2013-09-06
Availability unrestricted
Abstract
Coronary artery disease (CAD) is the leading cause of death worldwide. Acute coronary syndromes (ACS), a subset of CAD, account for 1.4 million hospitalizations and $165 billion in costs in the United States alone. A major challenge for the physician diagnosing and treating patients with suspected ACS is the significant overlap between patients with and without ACS. There is a high cost to missing a diagnosis of ACS, but also a high cost to inappropriate treatment of patients without ACS. American College of Cardiology/American Heart Association guidelines recommend early risk stratification of patients to determine their likelihood of major adverse events, but many individual tests and prognostic indices lack sufficient performance characteristics for use in clinical practice. Prognostic indices in particular are often not representative of the population on which they are used, and they rely on complete and accurate data. We explored the use of two state-of-the-art machine learning techniques, random forest and elastic net, on 23,576 records from the Synthetic Derivative to develop models with better performance characteristics than previously established prognostic indices in determining the risk of ACS for patients presenting with suspicious symptoms. We bootstrapped the process of model creation and found that the random forest significantly outperformed elastic net, L2-regularized regression, and the previously developed TIMI and GRACE scores. We also assessed the model calibration for the random forest and explored methods of correction. Our preliminary findings suggest that machine learning applied to noisy and largely missing data can still perform as well as or better than previously developed scoring metrics.
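The bootstrapped comparison mentioned in the abstract can be illustrated with a minimal sketch. It is not the thesis's code: the abstract does not name the performance metric, so the area under the ROC curve (AUC) is an assumption here, and the function names `auc` and `bootstrap_auc_difference` are hypothetical. The sketch also resamples only the evaluation records for two fixed sets of risk scores, whereas the thesis bootstrapped the whole model-creation process.

```python
import random


def auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_auc_difference(labels, scores_a, scores_b, n_boot=1000, seed=0):
    """Approximate 95% percentile interval for AUC(a) - AUC(b).

    Records are resampled with replacement; both models are scored on the
    same resample so that the difference is paired.
    """
    if len(set(labels)) < 2:
        raise ValueError("labels must contain both classes")
    rng = random.Random(seed)
    n = len(labels)
    diffs = []
    while len(diffs) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        y = [labels[i] for i in idx]
        if len(set(y)) < 2:
            continue  # this resample lost one class, so AUC is undefined
        a = auc(y, [scores_a[i] for i in idx])
        b = auc(y, [scores_b[i] for i in idx])
        diffs.append(a - b)
    diffs.sort()
    lower = diffs[int(0.025 * n_boot)]
    upper = diffs[int(0.975 * n_boot) - 1]
    return lower, upper
```

If the resulting interval excludes zero, the two sets of scores differ in AUC on the resampled data; whether that matches the significance test used in the thesis is not stated in the abstract.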
Files
  Filename complete_masters_thesis_VanHouten.pdf
  Size 1.44 Mb
  Approximate Download Time (Hours:Minutes:Seconds)
    28.8 Modem 00:06:40
    56K Modem 00:03:25
    ISDN (64 Kb) 00:03:00
    ISDN (128 Kb) 00:01:30
    Higher-speed Access 00:00:07
