Show simple item record

dc.contributor.author: Wang, Tiancong
dc.date.accessioned: 2017-09-21 16:07:15 (GMT)
dc.date.available: 2017-09-21 16:07:15 (GMT)
dc.date.issued: 2017-09-21
dc.date.submitted: 2017-09-19
dc.identifier.uri: http://hdl.handle.net/10012/12421
dc.description.abstract: Bioinformaticians have been working on peptide sequencing with tandem mass spectrometry (MS/MS) for decades, yet the results are still not perfect. Much research has been carried out on two peptide sequencing methods, database search and de novo sequencing. However, because of the quality of the spectra and the inherent difficulty of the problem, both methods have had difficulty improving their results further. The release of the NIST peptide library in May 2014 brought fresh ideas to this long-standing problem. This peptide library contains a large number of MS/MS spectra and their corresponding peptide sequences. Taking advantage of this high-quality dataset, more and more researchers have since begun to look for internal patterns in MS/MS spectra. In this thesis, we examine this peptide library more closely and use statistical and machine learning ideas to find new features that help improve peptide sequencing results. Two main contributions are made. First, a general scoring feature is presented that can be incorporated into the scoring functions of other peptide sequencing software. The scoring feature is based on the intensity ratios between two adjacent y-ions in the spectrum. A method is proposed to obtain the probability distributions of such ratios and to calculate the scoring feature from those distributions. To demonstrate the performance of the method, the new feature is incorporated into X!Tandem and Novor, and it significantly improves the performance of each on test data. Second, a machine learning model that predicts the appearance of internal fragment ions in MS/MS spectra is presented. Although this is, to the best of our knowledge, the first model on this topic, it achieves fairly good results. Several possible applications of the model are also discussed to show that this topic is valuable for peptide sequencing and thus worth further research. [en]
dc.language.iso: en [en]
dc.publisher: University of Waterloo [en]
dc.title: Discovery of New Features for Peptide Sequencing with Mass Spectrometry [en]
dc.type: Master Thesis [en]
dc.pending: false
uws-etd.degree.department: David R. Cheriton School of Computer Science [en]
uws-etd.degree.discipline: Computer Science [en]
uws-etd.degree.grantor: University of Waterloo [en]
uws-etd.degree: Master of Mathematics [en]
uws.contributor.advisor: Ma, Bin
uws.contributor.affiliation1: Faculty of Mathematics [en]
uws.published.city: Waterloo [en]
uws.published.country: Canada [en]
uws.published.province: Ontario [en]
uws.typeOfResource: Text [en]
uws.peerReviewStatus: Unreviewed [en]
uws.scholarLevel: Graduate [en]




UWSpace

University of Waterloo Library
200 University Avenue West
Waterloo, Ontario, Canada N2L 3G1
519 888 4883

All items in UWSpace are protected by copyright, with all rights reserved.
