RNA Homology Searches Using Pair Seeding
MetadataShow full item record
Due to increasing numbers of non-coding RNA (ncRNA) being discovered recently, there is interest in identifying homologs of a given structured RNA sequence. Exhaustive homology searching for structured RNA molecules using covariance models is infeasible on genome-length sequences. Hence, heuristic methods are employed, but they largely ignore structural information in the query. We present a novel method, which uses secondary structure information, to perform homology searches for a structured RNA molecule. We define the concept of a <em>pair seed</em> and theoretically model alignments of random and related paired regions to compute expected sensitivity and specificity. We show that our method gives theoretical gains in sensitivity and specificity compared to a BLAST-based heuristic approach. We provide experimental verification of this gain. <br /><br /> We also show that pair seeds can be effectively combined with the spaced seeds approach to nucleotide homology search. The hybrid search method has theoretical specificity superior to that of the BLAST seed. We provide experimental evaluation of our hypotheses. Finally, we note that our method is easily modified to process pseudo-knotted regions in the query, something outside the scope of covariance model based methods.