dc.contributor.author | Qiao, Rui
dc.date.accessioned | 2020-09-30 18:10:42 (GMT)
dc.date.available | 2020-09-30 18:10:42 (GMT)
dc.date.issued | 2020-09-30
dc.date.submitted | 2020-09-16
dc.identifier.uri | http://hdl.handle.net/10012/16412
dc.description.abstract | In shotgun proteomics, de novo peptide sequencing from tandem mass spectrometry data is the key technology for finding new peptide or protein sequences. It has been applied successfully to assembling monoclonal antibody sequences and has great potential for identifying neoantigens for personalized cancer vaccines. In this thesis, I propose a novel deep neural network-based de novo peptide sequencing model: PointNovo. The proposed PointNovo model not only outperforms the previous state-of-the-art model by a significant margin but also solves the long-standing accuracy–speed/memory trade-off problem that exists in previous de novo peptide sequencing tools. Further, our experimental results show that even though PointNovo is not trained to distinguish between true and false peptide-spectrum matches, its resulting log probability score can be used as a scoring function to perform database searching. On several different datasets, we show that PointNovo, when used as a database search engine, can achieve an identification rate that is at least comparable to that of existing popular database search software. We also extend and adapt an existing model to process Data Independent Acquisition (DIA) data and propose the first de novo peptide sequencing algorithm for DIA tandem mass spectra. Finally, we develop a workflow that can identify tumor-specific antigens directly and purely from mass spectrometry data of tumor tissues and test it on a published dataset of tumor samples from melanoma patients. Our workflow applies de novo peptide sequencing to detect mutated endogenous peptides, in contrast to the prevalent indirect approach in existing methods of combining exome sequencing, somatic mutation calling, and epitope prediction. More importantly, we develop machine learning models that are tailored to each patient based on their own MS data. Such a personalized approach enables accurate identification of neoantigens for the development of personalized cancer vaccines. We applied the workflow to datasets of five melanoma patients and expanded their immunopeptidomes by 5% to 15%. Subsequently, we discovered 17 neoantigens of both HLA–I and HLA–II, including those with validated T cell responses and novel neoantigens that had not been reported in previous studies. | en
dc.language.iso | en | en
dc.publisher | University of Waterloo | en
dc.subject | de novo peptide sequencing | en
dc.subject | deep learning | en
dc.subject | mass spectrometry | en
dc.subject | neoantigen identification | en
dc.title | Peptide Sequencing with Deep Learning | en
dc.type | Doctoral Thesis | en
dc.pending | false
uws-etd.degree.department | Statistics and Actuarial Science | en
uws-etd.degree.discipline | Statistics | en
uws-etd.degree.grantor | University of Waterloo | en
uws-etd.degree | Doctor of Philosophy | en
uws.contributor.advisor | Ghodsi, Ali
uws.contributor.affiliation1 | Faculty of Mathematics | en
uws.published.city | Waterloo | en
uws.published.country | Canada | en
uws.published.province | Ontario | en
uws.typeOfResource | Text | en
uws.peerReviewStatus | Unreviewed | en
uws.scholarLevel | Graduate | en


UWSpace


All items in UWSpace are protected by copyright, with all rights reserved.