A joint project of the Graduate School, Peabody College, and the Jean & Alexander Heard Library

Title page for ETD etd-08052016-114746

Type of Document Master's Thesis
Author Chambers, Matthew Chase
Author's Email Address matt.chambers42@gmail.com
URN etd-08052016-114746
Title Omicron: a Galaxy for reproducible proteogenomics
Degree Master of Science
Department Biomedical Informatics
Advisory Committee
Advisor Name     Title
Bing Zhang       Committee Chair
Daniel Liebler   Committee Member
David Tabb       Committee Member
Keywords
  • galaxy
  • proteogenomics
  • reproducibility
Date of Defense 2016-07-13
Availability unrestricted
Abstract
Proteomics reveals post-translational modifications and expression patterns that cannot be seen with genomics and transcriptomics alone. By itself, proteomics has limited sensitivity for detecting genetic variation (e.g., single-nucleotide polymorphisms and insertion/deletion mutations), but that sensitivity improves with access to genomic data, an approach known as proteogenomics. As in many of the -omics fields, reproducibility of proteogenomic results is a problem. Since 2005, the web application “Galaxy” has been available to improve the transparency and reproducibility of -omic analyses. However, a Galaxy server is not easy to set up, and to work around that, investigators have sometimes distributed their customizations as virtual machines (VMs). In recent years, “containers,” a more efficient approach to software isolation, have become popular. Omicron, a proteogenomics “flavor” of Galaxy, was created to simplify the reproduction of proteogenomic workflows. An easy way for anyone to launch Omicron on Amazon Web Services, paired with a scalable compute cluster, was also created. Using Omicron, results from a 2014 Nature paper were partially reproduced. Because the online reference data had changed, and possibly because tool versions differed, the previous results could not be reproduced exactly. However, other investigators can easily reproduce the Omicron results without digging through methods and supplemental data, and can then apply the same workflow to their own data.
Filename             Size     Approximate Download Time (Hours:Minutes:Seconds)
                              28.8 Modem  56K Modem  ISDN (64 Kb)  ISDN (128 Kb)  Higher-speed Access
MatthewChambers.pdf  2.84 Mb  00:13:08    00:06:45   00:05:54      00:02:57       00:00:15
