PEPPI: a peptidomic database of human protein isoforms for proteomics experiments

dc.contributor.author: Zhou, Ao
dc.contributor.author: Zhang, Fan
dc.contributor.author: Chen, Jake Yue
dc.contributor.department: BioHealth Informatics, School of Informatics and Computing [en_US]
dc.date.accessioned: 2020-05-04T16:36:37Z
dc.date.available: 2020-05-04T16:36:37Z
dc.date.issued: 2010-10-07
dc.description.abstract: Background: Protein isoform generation, which may derive from alternative splicing, genetic polymorphism, and posttranslational modification, is an essential source of molecular diversity in eukaryotic cells. Previous studies have shown that protein isoforms play critical roles in disease diagnosis, risk assessment, sub-typing, prognosis, and treatment outcome predictions. Understanding the types, presence, and abundance of different protein isoforms in different cellular and physiological conditions is a major task in functional proteomics, and may pave the way to molecular biomarker discovery for human diseases. In tandem mass spectrometry (MS/MS) based proteomics analysis, peptide peaks with exact matches to protein sequence records in the proteomics database may be identified with mass spectrometry (MS) search software. However, due to limited annotation and poor coverage of protein isoforms in proteomics databases, high throughput protein isoform identifications, particularly those arising from alternative splicing and genetic polymorphism, have not been possible.
Results: Therefore, we present the PEPtidomics Protein Isoform Database (PEPPI, http://bio.informatics.iupui.edu/peppi), a comprehensive database of computationally-synthesized human peptides that can identify protein isoforms derived from either alternatively spliced mRNA transcripts or SNP variations. We collected genome, pre-mRNA alternative splicing and SNP information from Ensembl. We synthesized in silico isoform transcripts that cover all exons and theoretically possible junctions of exons and introns, as well as all their variations derived from known SNPs. With three case studies, we further demonstrated that the database can help researchers discover and characterize new protein isoform biomarkers from experimental proteomics data.
Conclusions: We developed a new tool for the proteomics community to characterize protein isoforms from MS-based proteomics experiments. By cataloguing each peptide configuration in the PEPPI database, users can study genetic variations and alternative splicing events at the proteome level. They can also batch-download peptide sequences in FASTA format to search for MS/MS spectra derived from human samples. The database can help generate novel hypotheses on molecular risk factors and molecular mechanisms of complex diseases, leading to identification of potentially highly specific protein isoform biomarkers. [en_US]
dc.eprint.version: Final published version [en_US]
dc.identifier.citation: Zhou, A., Zhang, F. & Chen, J.Y. PEPPI: a peptidomic database of human protein isoforms for proteomics experiments. BMC Bioinformatics 11, S7 (2010). https://doi.org/10.1186/1471-2105-11-S6-S7 [en_US]
dc.identifier.uri: https://hdl.handle.net/1805/22691
dc.language.iso: en_US [en_US]
dc.publisher: BMC [en_US]
dc.relation.isversionof: 10.1186/1471-2105-11-S6-S7 [en_US]
dc.relation.journal: BMC Bioinformatics [en_US]
dc.rights: Attribution 4.0 International [*]
dc.rights.uri: http://creativecommons.org/licenses/by/4.0/ [*]
dc.source: Publisher [en_US]
dc.subject: Protein Isoforms [en_US]
dc.subject: Peptide Region [en_US]
dc.subject: Alternative Splice Event [en_US]
dc.subject: Human Fetal Liver [en_US]
dc.subject: Type Peptide [en_US]
dc.title: PEPPI: a peptidomic database of human protein isoforms for proteomics experiments [en_US]
dc.type: Article [en_US]
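
The abstract describes synthesizing sequences in silico that span exon junctions and carry known SNP variants, then exporting peptides in FASTA format for MS/MS searches. The Python sketch below illustrates that general idea only; it is not PEPPI's actual pipeline. The function names (`junction_sequences`, `apply_snp`, `translate`, `to_fasta`), the toy exon sequences, and the simplification of translating every junction in reading frame 0 are all assumptions made for illustration.

```python
from itertools import combinations, product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(nt):
    """Translate a DNA string in frame 0, stopping at the first stop codon."""
    peptide = []
    for i in range(0, len(nt) - 2, 3):
        aa = CODON_TABLE[nt[i:i + 3]]
        if aa == "*":
            break
        peptide.append(aa)
    return "".join(peptide)


def junction_sequences(exons):
    """Join every ordered exon pair (i < j) into a hypothetical junction sequence."""
    return {
        f"exon{i + 1}-exon{j + 1}": exons[i] + exons[j]
        for i, j in combinations(range(len(exons)), 2)
    }


def apply_snp(nt, pos, alt):
    """Return the sequence with the base at 0-based position pos replaced by alt."""
    return nt[:pos] + alt + nt[pos + 1:]


def to_fasta(records):
    """Format a name -> sequence mapping as FASTA text."""
    return "".join(f">{name}\n{seq}\n" for name, seq in records.items())


if __name__ == "__main__":
    # Toy exon sequences, invented for this example (not real Ensembl data).
    exons = ["ATGGCC", "AAAGGT", "TGGTAA"]
    peptides = {name: translate(seq)
                for name, seq in junction_sequences(exons).items()}
    # Add one SNP variant of the first junction (assumed G>A at position 3).
    variant = apply_snp(exons[0] + exons[1], 3, "A")
    peptides["exon1-exon2|snp3G>A"] = translate(variant)
    print(to_fasta(peptides), end="")
```

A real workflow would also need to track reading frames across exon boundaries, include exon-intron junctions, and digest the translated products into peptides before searching; those steps are omitted here to keep the sketch short.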
Files
Original bundle
Name: 1471-2105-11-S6-S7.pdf
Size: 3.17 MB
Format: Adobe Portable Document Format
Description: Main article
License bundle
Name: license.txt
Size: 1.99 KB
Format: Item-specific license agreed upon to submission