Structure-aided detection of functional innovation in protein phylogenies
Adams, Jeremy Bruce
Detection of positive selection in proteins is a common and powerful approach for investigating the molecular basis of adaptation. In this thesis, I explore the use of protein three-dimensional (3D) structure to assist in the prediction of historical adaptations in proteins. Building on a method first introduced by Wagner (Genetics, 2007, 176: 2451–2463), I present a novel framework called Adaptation3D for detecting positive selection by integrating sequence, structural, and phylogenetic information for protein families. Adaptation3D identifies possible instances of positive selection by reconstructing historical substitutions along a phylogenetic tree and detecting branch-specific cases of spatially clustered substitution. The Adaptation3D method was capable of identifying previously characterized cases of positive selection in proteins, as demonstrated through an analysis of the pathogenesis-related protein 5 (PR-5) phylogeny. It was then applied at a phylogenomic scale to thousands of vertebrate protein phylogenetic trees from the Selectome database. Adaptation3D’s reconstruction of historical mutations in vertebrate protein families revealed several evolutionary phenomena. First, clustered mutation is widespread and occurs significantly more often than expected by chance. Second, numerous top-scoring cases of predicted positive selection are consistent with existing literature on vertebrate protein adaptation. Third, in the vertebrate lineage, clustered mutation has occurred disproportionately in proteins from certain families and functional categories, such as zinc-finger transcription factors (TFs). Finally, when paralogous and orthologous lineages were analyzed separately, TF paralogs displayed significantly elevated levels of clustered mutation in their DNA-binding sites compared to orthologs, consistent with historical divergence of DNA-binding specificity in newly duplicated TFs.
Ultimately, Adaptation3D is a powerful framework for reconstructing structural patterns of historical mutation, and provides important insights into the nature of protein adaptation.
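
As a rough illustration of what detecting "spatially clustered substitution" on a single branch can involve, the Python sketch below scores a set of substituted residues by their mean pairwise 3D distance and compares that score against random residue sets of the same size. This is not the Adaptation3D scoring scheme, which the abstract does not specify. The function names, the choice of mean pairwise distance as the statistic, the permutation test, and the toy coordinates are all assumptions made for illustration.

```python
import itertools
import math
import random


def mean_pairwise_distance(coords, sites):
    """Mean Euclidean distance over all pairs of the given residue indices."""
    pairs = list(itertools.combinations(sites, 2))
    return sum(math.dist(coords[i], coords[j]) for i, j in pairs) / len(pairs)


def clustering_p_value(coords, substituted, n_perm=2000, seed=0):
    """Permutation test for spatial clustering (hypothetical helper).

    coords      : list of (x, y, z) tuples, one per residue (e.g. C-alpha atoms)
    substituted : residue indices inferred to have changed on one branch
    Returns the observed statistic and the estimated probability that a random
    residue set of the same size is at least as tightly packed.
    """
    if len(substituted) < 2:
        raise ValueError("need at least two substituted sites")
    rng = random.Random(seed)
    observed = mean_pairwise_distance(coords, substituted)
    residues = list(range(len(coords)))
    hits = 0
    for _ in range(n_perm):
        sample = rng.sample(residues, len(substituted))
        if mean_pairwise_distance(coords, sample) <= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return observed, (hits + 1) / (n_perm + 1)


# Toy structure: 30 residues spaced 3.8 angstroms apart along a straight line.
toy_coords = [(3.8 * i, 0.0, 0.0) for i in range(30)]

clustered_stat, clustered_p = clustering_p_value(toy_coords, [10, 11, 12])
scattered_stat, scattered_p = clustering_p_value(toy_coords, [0, 15, 29])
print(f"adjacent sites:  mean distance {clustered_stat:.2f}, p = {clustered_p:.4f}")
print(f"scattered sites: mean distance {scattered_stat:.2f}, p = {scattered_p:.4f}")
```

In a real analysis the coordinates would come from a solved or modeled structure and the substituted sites from ancestral sequence reconstruction on each branch. Testing many branches and proteins would also call for a multiple-testing correction, which this sketch omits.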