Quantifying phylogenetic incongruence and identifying contributing factors in a yeast model clade
Salichos, Leonidas
2014-08-20
Abstract
Phylogenetic incongruence, the topological conflict between different gene trees, continues to confound phylogeneticists and makes it challenging to determine the major branches of the tree of life. In my thesis, I introduce four novel measures of internode and tree support (Internode Certainty, Internode Certainty All, Tree Certainty and Tree Certainty All) that quantify incongruence by considering the most prevalent conflicting bipartitions for each internode. Applying these measures to a dataset of 1,070 high-quality orthogroups from the yeast clade that I assembled, I discovered extreme levels of gene-tree incongruence, identified several ambiguous internodes previously considered resolved, and showed that most standard practices aimed at decreasing incongruence have little or no significant effect, or even a negative one, misleading investigators and leading to overconfidence. I obtained similar results on two additional datasets, consisting of vertebrate and metazoan taxa, respectively, whose orthology I defined using clustering-Reciprocal Best Hit, an ortholog-prediction clustering algorithm that I developed. I also found that approximately 67% of the total variance in gene-tree incongruence can be attributed to the short length of internodes and their location near the root of the phylogeny, and approximately 17% to several functional gene factors such as GC content, codon bias and the number of variable sites per gene. Finally, I show that selecting highly supported gene trees or bipartitions can significantly reduce gene-tree incongruence.
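
As an illustration of how an internode-level measure of this kind can be computed, the sketch below assumes the entropy-style formulation under which Internode Certainty compares the two most prevalent conflicting bipartitions and Internode Certainty All compares every supplied conflicting bipartition. The abstract itself does not state the formula, so this is a minimal sketch under that assumption rather than the thesis implementation. The function names and the input format (a list of gene-tree counts, one per bipartition) are hypothetical, and any frequency threshold for deciding which bipartitions to include is left to the caller.

```python
import math


def _certainty(counts):
    """Return 1 + sum(p * log_n(p)) over n competing bipartitions.

    Assumed formulation: 1.0 means no conflict, 0.0 means the competing
    bipartitions are equally frequent.
    """
    counts = [c for c in counts if c > 0]
    n = len(counts)
    if n < 2:
        # A single supported bipartition has no competitor: full certainty.
        return 1.0
    total = sum(counts)
    return 1.0 + sum((c / total) * math.log(c / total, n) for c in counts)


def internode_certainty(support_counts):
    """IC sketch: use only the two most prevalent conflicting bipartitions."""
    top_two = sorted(support_counts, reverse=True)[:2]
    return _certainty(top_two)


def internode_certainty_all(support_counts):
    """ICA sketch: use every bipartition the caller supplies."""
    return _certainty(list(support_counts))


if __name__ == "__main__":
    # Hypothetical counts: number of gene trees supporting each bipartition.
    print(internode_certainty([75, 25]))
    print(internode_certainty_all([60, 25, 15]))
```

Under this formulation, values near 1 indicate that one bipartition dominates an internode, while values near 0 indicate that the competing bipartitions are supported about equally often.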