Predicting the Spectrum Quality and Digestive Enzyme for Shotgun Proteomics
Date
2022-05-03
Authors
Gholamizoj, Soroosh
Advisor
Ma, Bin
Publisher
University of Waterloo
Abstract
In proteomics, database search programs are routinely used to identify peptides from tandem mass spectrometry data. However, many low-quality spectra cannot be interpreted by any program. Meanwhile, some high-quality spectra go unidentified because the database is incomplete, the software fails, or the search parameters are sub-optimal. Spectrum quality assessment tools are therefore useful: they can eliminate poor-quality spectra before the database search and highlight high-quality spectra that were not identified in the initial search. These unidentified high-quality spectra may be valuable candidates for further analysis.
We propose SPEQ, a spectrum quality assessment tool that uses a deep neural network to classify spectra as either high-quality, making them worthy candidates for interpretation, or low-quality, lacking sufficient information for identification. SPEQ was compared with a few other prediction models and demonstrated improved prediction accuracy.
Furthermore, we propose a statistical model that automatically detects the enzyme used for digestion in a proteomics experiment by analyzing the distribution of amino acids in peptides that were de novo sequenced with a nonspecific enzyme setting. Results demonstrate that this algorithm can accurately identify the correct enzyme.
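The idea of inferring the enzyme from amino acid distributions can be illustrated with a minimal sketch. This is not the statistical model from the thesis, whose details the abstract does not give. It assumes a simple log-likelihood ratio that compares the residues at peptide termini against the background composition. The `ENZYME_RULES` table, the `detect_enzyme` function, and the `eps` and `pseudocount` parameters are hypothetical names introduced only for this example.

```python
import math
from collections import Counter

# Illustrative (assumed) cleavage rules: which peptide terminus carries the
# signal, and which residues are expected there. Real rule sets are richer.
ENZYME_RULES = {
    "Trypsin": ("C", set("KR")),
    "Lys-C": ("C", set("K")),
    "Chymotrypsin": ("C", set("FWYL")),
    "Glu-C": ("C", set("DE")),
    "Asp-N": ("N", set("D")),
}


def detect_enzyme(peptides, eps=0.05, pseudocount=1.0):
    """Score each candidate enzyme by a log-likelihood ratio of the observed
    terminal residues against the background amino acid composition.

    eps is the assumed probability that a terminus does not follow the
    enzyme's rule (missed or nonspecific cleavage, sequencing error).
    """
    peptides = [p for p in peptides if p]
    if not peptides:
        raise ValueError("at least one non-empty peptide is required")

    background = Counter("".join(peptides))
    total = sum(background.values())
    c_term = Counter(p[-1] for p in peptides)
    n_term = Counter(p[0] for p in peptides)

    scores = {}
    for name, (side, residues) in ENZYME_RULES.items():
        termini = c_term if side == "C" else n_term
        hits = sum(termini[r] for r in residues)
        misses = len(peptides) - hits
        # Smoothed background probability of drawing a residue from the set
        # (Laplace smoothing over the 20 standard amino acids).
        q = (sum(background[r] for r in residues) + pseudocount * len(residues)) / (
            total + 20 * pseudocount
        )
        scores[name] = hits * math.log((1 - eps) / q) + misses * math.log(eps / (1 - q))

    best = max(scores, key=scores.get)
    return best, scores


# Toy peptides (made up for illustration) that all end in K or R.
best, scores = detect_enzyme(["ACDEFGHK", "LMNPQSTR", "VWYAGSTK", "GASPLVIR"])
print(best)
```

Penalizing termini that fall outside an enzyme's residue set is what lets this sketch separate enzymes with overlapping rules, such as the assumed Trypsin (K, R) and Lys-C (K only) entries.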