A joint project of the Graduate School, Peabody College, and the Jean & Alexander Heard Library

Title page for ETD etd-06062014-110802


Type of Document Master's Thesis
Author Teixeira, Pedro Luis, Jr.
Author's Email Address pedro.l.teixeira@vanderbilt.edu
URN etd-06062014-110802
Title Using Evolutionarily-Based Correlation Measures and Machine Learning to Improve Protein Structure Prediction in BCL::Fold
Degree Master of Science
Department Biomedical Informatics
Advisory Committee
  Advisor Name                       Title
  Jens Meiler, Ph.D.                 Committee Chair
  Terry P. Lybrand, Ph.D.            Committee Member
  Thomas A. Lasko, M.D., Ph.D.       Committee Member
Keywords
  • direct information
  • artificial neural networks
  • computational structural biology
  • correlation
  • protein folding
  • computational biology
  • machine learning
  • decision trees
Date of Defense 2013-03-14
Availability unrestricted
Abstract
De novo protein structure prediction is a challenge due to the sheer size of the search space. One can limit the set of potential models with long-range contact restraints (positions distant in the primary sequence but known to be in close proximity within the tertiary structure). Most available contact prediction methods achieve accuracies insufficient for de novo protein folding. Direct Information (DI), which finds the minimal set of correlations that explains all global correlation, is a notable exception. DI has been used to determine the structures of some membrane and soluble proteins with large numbers of homologous sequences compiled into large sequence alignments. However, DI has many limitations.
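The definition of a long-range contact given above can be illustrated with a minimal Python sketch. It is not taken from the thesis or from the BCL: the function name `long_range_contacts`, the minimum sequence separation of 12 residues, and the 8 Angstrom distance cutoff are all assumptions chosen for illustration, and the coordinates are a made-up toy hairpin rather than a real structure.

```python
import math


def long_range_contacts(coords, min_separation=12, cutoff=8.0):
    """Return residue index pairs (i, j), i < j, that are at least
    `min_separation` positions apart in the sequence but closer than
    `cutoff` Angstroms in space.

    `coords` holds one (x, y, z) point per residue (for example, C-alpha
    atoms). Both thresholds are illustrative assumptions.
    """
    contacts = []
    n = len(coords)
    for i in range(n):
        # Only consider partners far enough along the primary sequence.
        for j in range(i + min_separation, n):
            if math.dist(coords[i], coords[j]) < cutoff:
                contacts.append((i, j))
    return contacts


# Toy hairpin: seven residues along x, then seven running back 5 A away.
toy = [(3.8 * i, 0.0, 0.0) for i in range(7)]
toy += [(3.8 * (6 - k), 5.0, 0.0) for k in range(7)]

print(long_range_contacts(toy))
```

In this toy chain only the two ends of the hairpin are both far apart in sequence and close in space, so only those pairs are reported. A predicted contact list of this form is the kind of restraint that can narrow the conformational search.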

I have leveraged machine learning methods to predict contacts more accurately by combining DI with sequence information, thereby improving protein structure prediction accuracy in the Biochemical Library (BCL), a C++ library developed in the Meiler lab. This resource will aid the elucidation of traditionally challenging membrane protein structures, specifically larger proteins, which are computationally difficult to address.

Files
  Filename Masters_Thesis_Pedro_Teixeira_v7.0_FINAL_NO_ABSTRACT_v2_FINAL.pdf
  Size 3.15 Mb
  Approximate Download Time (Hours:Minutes:Seconds)
    28.8 Modem            00:14:34
    56K Modem             00:07:29
    ISDN (64 Kb)          00:06:33
    ISDN (128 Kb)         00:03:16
    Higher-speed Access   00:00:16
