A joint project of the Graduate School, Peabody College, and the Jean & Alexander Heard Library

Title page for ETD etd-03212018-133009


Type of Document Dissertation
Author Li, Bian
Author's Email Address bian.li@vanderbilt.edu
URN etd-03212018-133009
Title Structure prediction and variant interpretation of membrane proteins aided by machine learning algorithms
Degree PhD
Department Chemistry
Advisory Committee
  Advisor Name         Title
  Jens Meiler          Committee Chair
  Clare McCabe         Committee Member
  Terry Lybrand        Committee Member
  Terunaga Nakagawa    Committee Member
Keywords
  • machine learning
  • variant interpretation
  • membrane protein docking
  • protein structure prediction
  • protein folding
Date of Defense 2018-01-31
Availability unrestricted
Abstract
Helical membrane proteins (HMPs) play essential roles in various biological processes. Despite their prevalence in the genome, only a very small portion (~2%) of the structures in the Protein Data Bank are HMPs, partly because of the experimental difficulty of determining the structures of HMPs and their complexes. Accurate computational methods for predicting the structures and interpreting the variants of HMPs are therefore particularly desirable.

We developed a method, using state-of-the-art machine learning techniques, that accurately predicts residue weighted contact numbers (WCNs) from amino acid sequences. We demonstrated that the WCNs predicted by this method are not only effective restraints for improving the fraction of native contacts in tertiary structure prediction of HMPs but can also be used to derive a powerful score for selecting native-like docking candidates of HMP complexes. We also developed a machine learning-based, protein-specific method capable of accurately predicting the functional consequences of variants of the KCNQ1 potassium channel.

The success of our methods suggests that using structural properties predicted by machine learning algorithms as restraints can be an effective approach to improving sampling and scoring in membrane protein structure prediction. It also suggests a promising pipeline in which a machine learning model is tailored to a specific protein target and trained with a functionally validated data set to calibrate informatics tools.

Files
  Filename   Size      Approximate Download Time (Hours:Minutes:Seconds)
                       28.8 Modem   56K Modem   ISDN (64 Kb)   ISDN (128 Kb)   Higher-speed Access
  Li.pdf     7.37 MB   00:34:08     00:17:33    00:15:21       00:07:40        00:00:39
