A semi-empirical approach to predict unobserved peptide MS/MS spectra from spectral libraries

Authors Hu, Yingwei
Issue Date 2011
Summary Proteomics, the large-scale, system-wide study of proteins, is a powerful approach for answering diverse and challenging questions in biological and medical research. The technological foundation of proteomics is high-throughput, accurate peptide sequencing by mass spectrometry, which allows one to identify and quantify hundreds of proteins in a complex sample simultaneously. In this method, protein samples are first digested into peptides, which are then separated by chromatography and mass spectrometry, fragmented, and analyzed by tandem mass spectrometry (MS/MS). The output MS/MS spectra are then computationally assigned to peptides, and hence to proteins. The traditional method of peptide assignment, sequence database searching, tends to be slow and error-prone, especially when applied to identifying point mutations or multiple post-translational modifications (PTMs). Spectral library searching is a promising alternative to sequence database searching for peptide identification (sequencing) from MS/MS spectra. It utilizes more features of both the experimental query data and the reference spectra in the library to achieve higher sensitivity and specificity. However, the generation of high-quality, reliable reference spectra is limited by current experimental or computational methods, which restricts the applicability of spectral library searching. We developed an alternative approach that expands the coverage of spectral libraries with semi-empirical spectra predicted by perturbing known spectra of similar sequences. We hypothesized that peptides of similar sequences should fragment in similar patterns, at least in most cases. The results confirmed our hypothesis and showed that this semi-empirical approach can transfer most of the informative features to the predicted spectra.
The extremely high similarity between the real spectra from previous experiments and our predictions suggests that spectral searching is readily extendable to these semi-empirical spectra. The semi-empirical spectral library is especially suitable for searching for single nucleotide polymorphisms (SNPs) and PTMs. We searched three human datasets against a library of semi-empirical spectra predicted according to pre-defined SNPs and identified hundreds of related peptide-spectrum matches that were missed by typical sequence database searching.
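The idea of predicting a spectrum by perturbing a known spectrum of a similar sequence can be illustrated with a minimal sketch. It is not the thesis's implementation. It assumes a library spectrum whose peaks are already annotated as b or y ions, and it handles only a single amino acid substitution: fragments that contain the substituted residue are shifted by the residue mass difference divided by the fragment charge, and intensities are carried over unchanged. The names `Peak` and `perturb_spectrum` are hypothetical, and neutral losses, unannotated peaks, and intensity changes are ignored.

```python
from dataclasses import dataclass, replace

# Standard monoisotopic residue masses in daltons.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}


@dataclass(frozen=True)
class Peak:
    """One annotated fragment peak (hypothetical structure for illustration)."""
    ion_type: str     # "b" (N-terminal fragment) or "y" (C-terminal fragment)
    index: int        # number of residues contained in the fragment
    charge: int       # fragment charge state
    mz: float         # mass-to-charge ratio
    intensity: float  # peak intensity, copied unchanged to the prediction


def perturb_spectrum(sequence, peaks, position, new_residue):
    """Predict peaks for `sequence` with one substitution at 0-based `position`.

    Returns the substituted sequence and a new list of peaks in which every
    fragment containing the substituted residue is shifted in m/z.
    """
    n = len(sequence)
    delta = RESIDUE_MASS[new_residue] - RESIDUE_MASS[sequence[position]]
    new_sequence = sequence[:position] + new_residue + sequence[position + 1:]

    shifted = []
    for peak in peaks:
        if peak.ion_type == "b":
            # b_i covers residues 0 .. i-1
            contains_site = peak.index > position
        else:
            # y_j covers residues n-j .. n-1
            contains_site = position >= n - peak.index
        mz = peak.mz + delta / peak.charge if contains_site else peak.mz
        shifted.append(replace(peak, mz=mz))
    return new_sequence, shifted
```

In this sketch the predicted spectrum keeps the fragmentation pattern of the source spectrum and moves only the affected fragment masses, which mirrors the stated hypothesis that similar sequences fragment in similar patterns.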
Note Thesis (M.Phil.)--Hong Kong University of Science and Technology, 2011
Language English
Format Thesis
Access View full-text via DOI
Copyrighted to the author. Reproduction is prohibited without the author’s prior written consent.