A Data Mining Approach for Detecting Evolutionary Divergence in Transcriptomic Data

Woody, Owen

Files

Woody_Owen.pdf (2.33 MB)

Date

2019-11-19

Authors

Woody, Owen

Publisher

University of Waterloo

Abstract

It has become common to produce genome sequences for organisms of scientific or popular interest. Although these genome projects provide insight into the gene and protein complements of a species, including their evolutionary relationships, it remains challenging to determine gene regulatory behavior from genome sequence alone. It has also become common to produce “expression atlas” transcriptomic data sets. These atlases employ high-throughput transcript assays to survey an assortment of tissues, developmental states, and responses to stimuli, each of which may elicit or inhibit the transcription of genes. Although genomic and transcriptomic data sets are both routinely collected, they are seldom analyzed in tandem. Here I present a novel approach to combining these complementary data with a software package called BranchOut. BranchOut uses genomic information to construct gene family phylogenies, and then attempts to map gene expression activity onto these phylogenies to allow estimation of ancestral expression states. This allows the identification of specific innovations due to gene duplications that resulted in fundamental diversification in the roles of otherwise closely related genes. As a proof of concept, the BranchOut technique is first applied to a tangible small-scale example in Apis mellifera. Subsequently, the power of BranchOut to analyze complete genomes is shown for two mammalian genomes, Sus scrofa and Bos taurus. The transcriptomic data sets for these two mammals employ microarray and RNA-seq platforms, respectively, for expression analysis, demonstrating BranchOut’s applicability to both future and historic expression atlases. Potential refinements to the approach are also discussed.
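
The general idea in the abstract, mapping expression profiles onto a gene family tree and estimating ancestral states, can be illustrated with a minimal sketch. This is not BranchOut's actual algorithm, which the abstract does not specify. It assumes a toy binary gene tree, takes each ancestral profile as the unweighted mean of its children's profiles, and scores each internal node by the Euclidean distance between its two child profiles. The `Node` class, the function names, the gene names, and the expression values are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Node:
    """A node in a toy gene family tree (hypothetical structure)."""
    name: str
    children: List["Node"] = field(default_factory=list)
    # One expression value per tissue/condition; None until estimated.
    expression: Optional[List[float]] = None


def estimate_ancestral(node: Node) -> List[float]:
    """Post-order pass: an internal node's profile is the element-wise
    mean of its children's profiles (an assumed, simplistic estimator)."""
    if not node.children:
        return node.expression
    profiles = [estimate_ancestral(child) for child in node.children]
    node.expression = [sum(vals) / len(vals) for vals in zip(*profiles)]
    return node.expression


def divergence(node: Node) -> float:
    """Euclidean distance between the profiles of a node's two children."""
    left, right = node.children
    return sum((x - y) ** 2
               for x, y in zip(left.expression, right.expression)) ** 0.5


def rank_divergent_nodes(root: Node) -> List[Tuple[float, str]]:
    """Return (score, node name) pairs for binary internal nodes,
    most divergent first."""
    estimate_ancestral(root)
    scored = []
    stack = [root]
    while stack:
        current = stack.pop()
        if len(current.children) == 2:
            scored.append((divergence(current), current.name))
        stack.extend(current.children)
    return sorted(scored, reverse=True)


# Made-up expression values across three tissues for three related genes.
gene_a = Node("geneA", expression=[8.0, 1.0, 1.0])
gene_b = Node("geneB", expression=[8.0, 1.0, 2.0])
gene_c = Node("geneC", expression=[1.0, 9.0, 1.0])
dup1 = Node("dup1", children=[gene_a, gene_b])
root = Node("root", children=[dup1, gene_c])

ranked = rank_divergent_nodes(root)
for score, name in ranked:
    print(f"{name}: {score:.2f}")
```

In this toy tree the split at `root` scores far higher than the split at `dup1`, because geneC's profile differs sharply from the other two. A realistic analysis would account for branch lengths and distinguish duplication from speciation nodes, neither of which this sketch attempts.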

Keywords

evolution, gene expression, bioinformatics, data mining, phylogenetics

URI

http://hdl.handle.net/10012/15257

Collections

Theses
Biology
