Algorithms for accurate and sensitive interpretation of mass spectra against arbitrarily large peptide search spaces
Abstract/Contents
- Abstract
- Tandem mass spectrometry (MS/MS) enables the high-throughput identification and characterization of complex protein mixtures, and depends critically on bioinformatics tools to interpret mass spectra as peptide sequences. There are two general techniques for interpreting mass spectra: de novo sequencing and database search. In de novo sequencing, a mass spectrum is directly interpreted as a protein sequence. In database search, a mass spectrum is identified from its best match in an existing sequence or spectrum database. Though less biased and less restrictive than database search algorithms, de novo sequencing algorithms are less popular because of their lower accuracy and lack of automated statistical validation tools. Database search algorithms, however, suffer greatly in both speed and sensitivity as their search spaces grow through the addition of protein sequences and post-translational modifications. To apply MS/MS to more diverse systems, I developed the de novo sequencing algorithm Label Assisted De novo Sequencing (LADS). LADS uses chemical strategies to introduce signatures into mass spectra that improve sequencing accuracy, and employs a support vector machine-based model to discriminate true from false identifications. I also developed a method to empirically estimate false discovery rates (FDRs) for any de novo sequencing algorithm. In the last stage of my PhD, I developed TagGraph, an unrestricted database search tool able to match peptides to mass spectra from sequence databases without assuming any protease specificity or requiring a user-specified set of modifications. I demonstrate the utility of TagGraph on the recently published human proteome dataset, matching over four million spectra to modified peptides, and identifying new functional roles and disease associations for protein hydroxylation.
Both TagGraph and the de novo FDR calibration technology described herein have the potential to greatly extend the scope and depth of tandem MS analyses.
Description
Type of resource | text |
---|---|
Form | electronic; electronic resource; remote |
Extent | 1 online resource. |
Publication date | 2016 |
Issuance | monographic |
Language | English |
Creators/Contributors
Associated with | Devabhaktuni, Arun | |
---|---|---|
Associated with | Stanford University, Department of Chemical and Systems Biology. | |
Primary advisor | Elias, Joshua | |
Thesis advisor | Elias, Joshua | |
Thesis advisor | Dill, David L | |
Thesis advisor | Mallick, Parag, 1976- | |
Thesis advisor | Meyer, Tobias | |
Subjects
Genre | Theses |
---|
Bibliographic information
Statement of responsibility | Arun Devabhaktuni. |
---|---|
Note | Submitted to the Department of Chemical and Systems Biology. |
Thesis | Thesis (Ph.D.)--Stanford University, 2016. |
Location | electronic resource |
Access conditions
- Copyright
- © 2016 by Arun Devabhaktuni
- License
- This work is licensed under a Creative Commons Attribution-NonCommercial 3.0 Unported license (CC BY-NC).