dc.contributor.author: Liang, Chengzhi [en]
14:21:14 (GMT)
14:21:14 (GMT)
dc.description.abstract: The consensus pattern problem (CPP) aims at finding conserved regions, or motifs, in unaligned sequences. This problem is NP-hard under various scoring schemes. To solve this problem for protein sequences more efficiently, a new scoring scheme and a randomized algorithm based on a substitution matrix are proposed here. Any practical solution to a bioinformatics problem must observe two principles: (1) the problem that it solves accurately describes the real problem; in CPP, this requires that the scoring scheme be able to distinguish a real motif from background; (2) it provides an efficient algorithm to solve the mathematical problem. A key question in protein motif-finding is how to determine the motif length. One problem in using EM algorithms to solve CPP is how to find good starting points to reach the global optimum. These two questions were both well addressed under this scoring scheme, which made the randomized algorithm both fast and accurate in practice. A software program, COPIA (COnsensus Pattern Identification and Analysis), has been developed implementing this algorithm. Experiments using sequences from the von Willebrand factor (vWF) family showed that it worked well at finding multiple motifs and repeats. COPIA's ability to find repeats also makes it useful in illustrating the internal structures of multidomain proteins. Comparative studies using several groups of protein sequences demonstrated that COPIA performed better than the commonly used motif-finding programs. [en]
dc.format.extent: 439052 bytes
dc.publisher: University of Waterloo [en]
dc.rights: Copyright: 2001, Liang, Chengzhi. All rights reserved. [en]
dc.subject: Computer Science [en]
dc.subject: bioinformatics software [en]
dc.subject: multiple alignment [en]
dc.subject: consensus pattern problem [en]
dc.title: COPIA: A New Software for Finding Consensus Patterns in Unaligned Protein Sequences [en]
dc.type: Master Thesis [en]
dc.pending: false [en]
of Computer Science [en]
uws-etd.degree: Master of Mathematics [en]

University of Waterloo Library
200 University Avenue West
Waterloo, Ontario, Canada N2L 3G1
519 888 4883

All items in UWSpace are protected by copyright, with all rights reserved.
