Yan, Jingwen; Chen, Wenrong; Wang, Juexin; Wan, Jun; Zang, Yong; Luo, Xiao; Liu, Xiaowen
2024-01-10
2024-01-10
2023-12
https://hdl.handle.net/1805/37921
Indiana University-Purdue University Indianapolis (IUPUI)
Proteoforms, distinct molecular forms of proteins, arise from factors such as genetic mutations, differential gene expression, alternative splicing, and a range of other biological processes. They are often characterized by primary structural variations such as amino acid substitutions, terminal truncations, and post-translational modifications (PTMs). Proteoforms of the same protein can exhibit different functional behaviors depending on the specific alterations. This complexity has elevated the significance of top-down mass spectrometry (MS), which provides detailed sequence information for intact proteoforms. In a typical top-down MS experiment, intact proteoforms are separated by liquid chromatography (LC) or capillary zone electrophoresis (CZE) prior to tandem mass spectrometry (MS/MS) analysis. Despite advances in instruments and protocols for top-down MS, computational challenges persist, and software tool development is still in its early stages. This dissertation pursues three primary goals, all aimed at refining proteoform characterization. First, we bridge RNA-Seq with top-down MS for better proteoform identification. We propose TopPG, a proteogenomic tool that, in contrast to traditional approaches, generates proteoform sequence databases from genetic and splicing variations explicitly for top-down MS. Second, to boost the accuracy of proteoform detection, we use machine learning methods to predict proteoform retention and migration times in top-down MS, critically evaluating models in an area traditionally dominated by bottom-up MS methodologies.
Lastly, recognizing the indispensable role of post-translational modifications (PTMs) in cellular functions, we introduce PTM-TBA. This tool integrates the complementary strengths of top-down and bottom-up MS, augmented with annotations, to build a comprehensive strategy for precise PTM identification and localization.
en-US
Mass spectrometry; Post-translational modification; Proteoform; Top-down proteomics
Computational Methods for Proteoform Identification and Characterization Using Top-Down Mass Spectrometry
Dissertation