Systematic Evaluation of Protein Sequence Filtering Algorithms for Proteoform Identification Using Top-Down Mass Spectrometry

dc.contributor.author: Kou, Qiang
dc.contributor.author: Wu, Si
dc.contributor.author: Liu, Xiaowen
dc.contributor.department: BioHealth Informatics, School of Informatics and Computing (en_US)
dc.date.accessioned: 2018-02-08T21:09:49Z
dc.date.available: 2018-02-08T21:09:49Z
dc.date.issued: 2018
dc.description.abstract: Complex proteoforms contain various primary structural alterations resulting from variations in genes, RNA, and proteins. Top-down mass spectrometry is commonly used for analyzing complex proteoforms because it provides whole sequence information of the proteoforms. Proteoform identification by top-down mass spectral database search is a challenging computational problem because the types and/or locations of some alterations in target proteoforms are in general unknown. Although spectral alignment and mass graph alignment algorithms have been proposed for identifying proteoforms with unknown alterations, they are extremely slow to align millions of spectra against tens of thousands of protein sequences in high throughput proteome level analyses. Many software tools in this area combine efficient protein sequence filtering algorithms and spectral alignment algorithms to speed up database search. As a result, the performance of these tools heavily relies on the sensitivity and efficiency of their filtering algorithms. Here, we propose two efficient approximate spectrum-based filtering algorithms for proteoform identification. We evaluated the performances of the proposed algorithms and four existing ones on simulated and real top-down mass spectrometry data sets. Experiments showed that the proposed algorithms outperformed the existing ones for complex proteoform identification. In addition, combining the proposed filtering algorithms and mass graph alignment algorithms identified many proteoforms missed by ProSightPC in proteome-level proteoform analyses. (en_US)
dc.eprint.version: Author's manuscript (en_US)
dc.identifier.citation: Kou, Q., Wu, S. and Liu, X. (2018), Systematic Evaluation of Protein Sequence Filtering Algorithms for Proteoform Identification Using Top-Down Mass Spectrometry. Proteomics, 1700306. Accepted Author Manuscript. http://dx.doi.org/10.1002/pmic.201700306 (en_US)
dc.identifier.uri: https://hdl.handle.net/1805/15162
dc.language.iso: en (en_US)
dc.publisher: Wiley (en_US)
dc.relation.isversionof: 10.1002/pmic.201700306 (en_US)
dc.relation.journal: Proteomics (en_US)
dc.rights: Publisher Policy (en_US)
dc.source: Author (en_US)
dc.subject: top-down mass spectrometry (en_US)
dc.subject: spectral identification (en_US)
dc.subject: filtering algorithms (en_US)
dc.title: Systematic Evaluation of Protein Sequence Filtering Algorithms for Proteoform Identification Using Top-Down Mass Spectrometry (en_US)
dc.type: Article (en_US)
Files
Original bundle
Kou_2018_systematic.pdf (1013.7 KB, Adobe Portable Document Format)