A Data Mining Approach for Detecting Evolutionary Divergence in Transcriptomic Data
Date
2019-11-19
Authors
Woody, Owen
Advisor
McConkey, Brendan
Publisher
University of Waterloo
Abstract
It has become common to produce genome sequences for organisms of scientific or popular interest. Although these genome projects provide insight into the gene and protein complements of a species, including their evolutionary relationships, it remains challenging to determine gene regulatory behavior from genome sequence alone. It has also become common to produce “expression atlas” transcriptomic data sets. These atlases employ high-throughput transcript assays to survey an assortment of tissues, developmental states, and responses to stimuli, each of which may elicit or inhibit the transcription of genes.
Although genomic and transcriptomic data sets are both routinely collected, they are seldom analyzed in tandem. Here I present a novel approach to combining these complementary data with a software package called BranchOut. BranchOut uses genomic information to construct gene family phylogenies, and then attempts to map gene expression activity onto these phylogenies to allow estimation of ancestral expression states. This makes it possible to identify specific innovations, arising from gene duplications, that fundamentally diversified the roles of otherwise closely related genes.
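The idea of mapping expression onto a gene family phylogeny can be illustrated with a minimal sketch. This is not BranchOut's actual algorithm, which the abstract does not specify; it swaps in a standard bottom-up Fitch parsimony pass over a binary tree, with expression in a single tissue reduced to expressed or not expressed. The gene names, the tree shape, and the `Node`, `fitch_up`, and `divergent_nodes` helpers are all hypothetical and exist only for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Node:
    """A node in a binary gene family tree (hypothetical structure)."""
    name: str
    children: list[Node] = field(default_factory=list)
    expressed: Optional[bool] = None      # observed only at leaves
    state_set: set = field(default_factory=set)


def fitch_up(node: Node) -> set:
    """Bottom-up Fitch pass: assign each node its candidate ancestral states."""
    if not node.children:
        node.state_set = {node.expressed}
        return node.state_set
    child_sets = [fitch_up(child) for child in node.children]
    common = set.intersection(*child_sets)
    # Keep the shared states if any exist; otherwise keep all candidates.
    node.state_set = common if common else set.union(*child_sets)
    return node.state_set


def divergent_nodes(node: Node) -> list[str]:
    """Names of internal nodes whose children share no candidate state."""
    found = []
    if node.children:
        if not set.intersection(*(c.state_set for c in node.children)):
            found.append(node.name)
        for child in node.children:
            found.extend(divergent_nodes(child))
    return found


# Assumed toy family: one duplication producing geneA1 and geneA2, plus geneB.
a1 = Node("geneA1", expressed=True)
a2 = Node("geneA2", expressed=False)
b = Node("geneB", expressed=True)
dup_a = Node("dupA", [a1, a2])
root = Node("root", [dup_a, b])

fitch_up(root)
flagged = divergent_nodes(root)
print(root.state_set, flagged)
```

In this toy case the root is inferred as expressed, and the duplication node is flagged because its two descendant copies disagree, which is the kind of post-duplication divergence the abstract describes. A real analysis would work with quantitative expression across many tissues rather than a single binary state.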
As a proof of concept, the BranchOut technique is first applied to a tangible small-scale example in Apis mellifera. Subsequently, the power of BranchOut to analyze complete genomes is shown for two mammalian genomes, Sus scrofa and Bos taurus. The transcriptomic data sets for these two mammals employ microarray and RNA-seq platforms, respectively, for expression analysis, demonstrating BranchOut’s applicability to both historic and future expression atlases. Potential refinements to the approach are also discussed.
Keywords
evolution, gene expression, bioinformatics, data mining, phylogenetics