A joint project of the Graduate School, Peabody College, and the Jean & Alexander Heard Library

Title page for ETD etd-11222014-155557

Type of Document Dissertation
Author Carroll, Robert James
Author's Email Address Robert.J.Carroll@Vanderbilt.edu
URN etd-11222014-155557
Title Defining Phenotypes, Predicting Drug Response, and Discovering Genetic Associations in the Electronic Health Record with Applications in Rheumatoid Arthritis
Degree PhD
Department Biomedical Informatics
Advisory Committee
Advisor Name          Title
Josh Denny            Committee Chair
Digna Velez-Edwards   Committee Member
Hua Xu                Committee Member
Jeremy Warner         Committee Member
Tom Lasko             Committee Member
Keywords
  • Secondary use
  • data analysis
  • informatics
Date of Defense 2014-11-11
Availability unrestricted
Abstract
Electronic health records (EHRs) allow for the digital capture of patient information and have proven to be a valuable tool for patient treatment. In this dissertation, I explore reuse of EHR data for clinical and genomic research with a focus on rheumatoid arthritis (RA). RA is a chronic autoimmune disorder that primarily affects the joints, causing swelling, stiffness, and pain, and, if left untreated, can lead to permanent joint damage. Phenome-wide association studies (PheWAS) leverage the breadth of codified diagnostic information about patients in the EHR to find disease associations. A package for the R statistical language is presented here that includes the tools needed to perform EHR-based or observational trial PheWAS, from ICD-9 code translation to association testing and meta-analysis. It includes a versatile plotting system for phenotype-related information following the Manhattan plot paradigm. This methodology is applied in conjunction with genetic risk scores (GRS) to assess pleiotropy and shared genetic risk among phenotypes. Investigations of 99 known risk variants for RA and three formulations of GRS show that the GRS is more specific to RA than the individual single nucleotide polymorphisms, but the GRSs also had clinically interesting associations with hypothyroidism. Presented next is the development of an algorithm to retrospectively identify drug response to etanercept in the EHR. Using chart reviews and a variety of input data, including billing codes, processed free text, and medication entries, a support vector machine and a random forest classifier were created that discriminate between drug responders and non-responders with areas under the receiver operating characteristic curve of 0.939 and 0.923, respectively. The drug response algorithm was applied to create a case-control cohort. Using these records, the final study identifies phenotypes associated with etanercept response, including fibromyalgia and several axial skeleton disease phenotypes: intervertebral disc disorders, degeneration of intervertebral disc, and spinal stenosis. Taken together, these studies demonstrate that EHR data can be an important tool for clinical and genomic research and offer particular promise for the study of RA.
Filename      Size      Approximate Download Time (Hours:Minutes:Seconds)
                        28.8 Modem   56K Modem   ISDN (64 Kb)   ISDN (128 Kb)   Higher-speed Access
Carroll.pdf   1.53 Mb   00:07:06     00:03:39    00:03:11       00:01:35        00:00:08

