Browsing by Author "Wang, Zhe"
Item Deep Intact Proteoform Characterization in Human Cell Lysate using High-pH and Low-pH Reversed-Phase Liquid Chromatography (American Chemical Society, 2019-12)
Yu, Dahang; Wang, Zhe; Sutton, Kellye A.; Liu, Xiaowen; Wu, Si; Computer and Information Science, School of Science
Post-translational modifications (PTMs) play critical roles in biological processes and have significant effects on the structures and dynamics of proteins. Top-down proteomics methods were developed for and applied to the study of intact proteins and their PTMs in human samples. However, the large dynamic range and complexity of human samples makes the study of human proteins challenging. To address these challenges, we developed a 2D pH RP/RPLC-MS/MS technique that fuses high-resolution separation and intact protein characterization to study the human proteins in HeLa cell lysate. Our results provide a deep coverage of soluble proteins in human cancer cells. Compared to 225 proteoforms from 124 proteins identified when 1D separation was used, 2778 proteoforms from 628 proteins were detected and characterized using our 2D separation method. Many proteoforms with critically functional PTMs including phosphorylation were characterized. Additionally, we present the first detection of intact human GcvH proteoforms with rare modifications such as octanoylation and lipoylation. Overall, the increase in the number of proteoforms identified using 2DLC separation is largely due to the reduction in sample complexity through improved separation resolution, which enables the detection of low abundance PTM modified proteoforms. We demonstrate here that 2D pH RP/RPLC is an effective technique to analyze complex protein samples using top-down proteomics.

Item Development of an Online 2D Ultrahigh-Pressure Nano-LC System for High-pH and Low-pH Reversed Phase Separation in Top-Down Proteomics (American Chemical Society, 2020-08-28)
Wang, Zhe; Yu, Dahang; Cupp-Sutton, Kellye A.; Liu, Xiaowen; Smith, Kenneth; Wu, Si; Computer and Information Science, School of Science
The development of novel high-resolution separation techniques is crucial for advancing the complex sample analysis necessary for high-throughput top-down proteomics. Recently, our group developed an offline 2D high-pH RPLC/low-pH RPLC separation method and demonstrated good orthogonality between these two RPLC formats. Specifically, ultrahigh-pressure long capillary column RPLC separation has been applied as the second dimensional low-pH RPLC separation for the improvement of separation resolution. To further improve the throughput and sensitivity of the offline approach, we developed an online 2D ultrahigh-pressure nano-LC system for high-pH and low-pH RPLC separations in top-down proteomics. An online microtrap column with a dilution setup was used to collect eluted proteins from the first dimension high-pH separation and inject the fractions for ultrahigh-pressure long capillary column low-pH RPLC separation in the second dimension. This automatic platform enables the characterization of 1000+ intact proteoforms from 5 μg of intact E. coli cell lysate in 10 online-collected fractions. Here, we have demonstrated that our online 2D pH RP/RPLC system coupled with top-down proteomics holds the potential for deep proteome characterization of mass-limited samples because it allows the identification of hundreds of intact proteoforms from complex biological samples at low microgram sample amounts.

Item Epidemiology of Fracture Nonunion in 18 Human Bones (JAMA, 2016-11)
Zura, Robert; Xiong, Ze; Einhorn, Thomas; Watson, J.
Tracy; Ostrum, Robert F.; Prayson, Michael J.; Della Rocca, Gregory J.; Mehta, Samir; McKinley, Todd; Wang, Zhe; Steen, R. Grant; Department of Orthopaedic Surgery, School of Medicine
Importance: Failure of bone fracture healing occurs in 5% to 10% of all patients. Nonunion risk is associated with the severity of injury and with the surgical treatment technique, yet progression to nonunion is not fully explained by these risk factors. Objective: To test a hypothesis that fracture characteristics and patient-related risk factors assessable by the clinician at patient presentation can indicate the probability of fracture nonunion. Design, Setting, and Participants: An inception cohort study in a large payer database of patients with fracture in the United States was conducted using patient-level health claims for medical and drug expenses compiled for approximately 90.1 million patients in calendar year 2011. The final database collated demographic descriptors; treatment procedures as per Current Procedural Terminology codes; comorbidities as per International Classification of Diseases, Ninth Revision codes; and drug prescriptions as per National Drug Code Directory codes. Logistic regression was used to calculate odds ratios (ORs) for variables associated with nonunion. Data analysis was performed from January 1, 2011, to December 31, 2012. Exposures: Continuous enrollment in the database was required for 12 months after fracture to allow sufficient time to capture a nonunion diagnosis. Results: The final analysis of 309 330 fractures in 18 bones included 178 952 women (57.9%); mean (SD) age was 44.48 (13.68) years. The nonunion rate was 4.9%. Elevated nonunion risk was associated with severe fracture (eg, open fracture, multiple fractures), high body mass index, smoking, and alcoholism. Women experienced more fractures, but men were more prone to nonunion.
The nonunion rate also varied with fracture location: scaphoid, tibia plus fibula, and femur were most likely to be nonunion. The ORs for nonunion fractures were significantly increased for risk factors, including number of fractures (OR, 2.65; 95% CI, 2.34-2.99), use of nonsteroidal anti-inflammatory drugs plus opioids (OR, 1.84; 95% CI, 1.73-1.95), operative treatment (OR, 1.78; 95% CI, 1.69-1.86), open fracture (OR, 1.66; 95% CI, 1.55-1.77), anticoagulant use (OR, 1.58; 95% CI, 1.51-1.66), osteoarthritis with rheumatoid arthritis (OR, 1.58; 95% CI, 1.38-1.82), anticonvulsant use with benzodiazepines (OR, 1.49; 95% CI, 1.36-1.62), opioid use (OR, 1.43; 95% CI, 1.34-1.52), diabetes (OR, 1.40; 95% CI, 1.21-1.61), high-energy injury (OR, 1.38; 95% CI, 1.27-1.49), anticonvulsant use (OR, 1.37; 95% CI, 1.31-1.43), osteoporosis (OR, 1.24; 95% CI, 1.14-1.34), male gender (OR, 1.21; 95% CI, 1.16-1.25), insulin use (OR, 1.21; 95% CI, 1.10-1.31), smoking (OR, 1.20; 95% CI, 1.14-1.26), benzodiazepine use (OR, 1.20; 95% CI, 1.10-1.31), obesity (OR, 1.19; 95% CI, 1.12-1.25), antibiotic use (OR, 1.17; 95% CI, 1.13-1.21), osteoporosis medication use (OR, 1.17; 95% CI, 1.08-1.26), vitamin D deficiency (OR, 1.14; 95% CI, 1.05-1.22), diuretic use (OR, 1.13; 95% CI, 1.07-1.18), and renal insufficiency (OR, 1.11; 95% CI, 1.04-1.17) (multivariate P < .001 for all). Conclusions and Relevance: The probability of fracture nonunion can be based on patient-specific risk factors at presentation.
Risk of nonunion is a function of fracture severity, fracture location, disease comorbidity, and medication use.

Item Evaluation of top-down mass spectral identification with homologous protein sequences (Biomed Central, 2018-12-28)
Li, Ziwei; He, Bo; Kou, Qiang; Wang, Zhe; Wu, Si; Liu, Yunlong; Feng, Weixing; Liu, Xiaowen; Medical and Molecular Genetics, School of Medicine
BACKGROUND: Top-down mass spectrometry has unique advantages in identifying proteoforms with multiple post-translational modifications and/or unknown alterations. Most software tools in this area search top-down mass spectra against a protein sequence database for proteoform identification. When the species studied in a mass spectrometry experiment lacks its proteome sequence database, a homologous protein sequence database can be used for proteoform identification. The accuracy of homologous protein sequences affects the sensitivity of proteoform identification and the accuracy of mass shift localization. RESULTS: We tested TopPIC, a commonly used software tool for top-down mass spectral identification, on a top-down mass spectrometry data set of Escherichia coli K12 MG1655, and evaluated its performance using an Escherichia coli K12 MG1655 proteome database and a homologous protein database. The number of identified spectra with the homologous database was about half of that with the Escherichia coli K12 MG1655 database. We also tested TopPIC on a top-down mass spectrometry data set of human MCF-7 cells and obtained similar results.
CONCLUSIONS: Experimental results demonstrated that TopPIC is capable of identifying many proteoform spectrum matches and localizing unknown alterations using homologous protein sequences containing no more than 2 mutations.

Item Identification and Quantification of Proteoforms by Mass Spectrometry (Wiley, 2019-05)
Schaffer, Leah V.; Millikin, Robert J.; Miller, Rachel M.; Anderson, Lissa C.; Fellers, Ryan T.; Ge, Ying; Kelleher, Neil L.; LeDuc, Richard D.; Liu, Xiaowen; Payne, Samuel H.; Sun, Liangliang; Thomas, Paul M.; Tucholski, Trisha; Wang, Zhe; Wu, Si; Wu, Zhijie; Yu, Dahang; Shortreed, Michael R.; Smith, Lloyd M.; BioHealth Informatics, School of Informatics and Computing
A proteoform is a defined form of a protein derived from a given gene with a specific amino acid sequence and localized post-translational modifications. In top-down proteomic analyses, proteoforms are identified and quantified through mass spectrometric analysis of intact proteins. Recent technological developments have enabled comprehensive proteoform analyses in complex samples, and an increasing number of laboratories are adopting top-down proteomic workflows. In this review, we outline some recent advances and discuss current challenges and future directions for the field.

Item A Markov chain Monte Carlo method for estimating the statistical significance of proteoform identifications by top-down mass spectrometry (ACS, 2019-03)
Kou, Qiang; Wang, Zhe; Lubeckyj, Rachele A.; Wu, Si; Liu, Xiaowen; BioHealth Informatics, School of Informatics and Computing
Top-down mass spectrometry is capable of identifying whole proteoform sequences with multiple post-translational modifications because it generates tandem mass spectra directly from intact proteoforms. Many software tools, such as ProSightPC, MSPathFinder, and TopMG, have been proposed for identifying proteoforms with modifications. In these tools, various methods are employed to estimate the statistical significance of identifications.
However, most existing methods are designed for proteoform identifications without modifications, and the challenge remains for accurately estimating the statistical significance of proteoform identifications with modifications. Here we propose TopMCMC, a method that combines a Markov chain random walk algorithm and a greedy algorithm for assigning statistical significance to matches between spectra and protein sequences with variable modifications. Experimental results showed that TopMCMC achieved high accuracy in estimating E-values and false discovery rates of identifications in top-down mass spectrometry. Coupled with TopMG, TopMCMC identified more spectra than the generating function method from an MCF-7 top-down mass spectrometry data set.

Item Quantitative Top-Down Proteomics in Complex Samples Using Protein-Level Tandem Mass Tag Labeling (American Chemical Society, 2021-06-02)
Yu, Dahang; Wang, Zhe; Cupp-Sutton, Kellye A.; Guo, Yanting; Kou, Qiang; Smith, Kenneth; Liu, Xiaowen; Wu, Si; BioHealth Informatics, School of Informatics and Computing
Labeling approaches using isobaric chemical tags (e.g., isobaric tagging for relative and absolute quantification, iTRAQ and tandem mass tag, TMT) have been widely applied for the quantification of peptides and proteins in bottom-up MS. However, until recently, successful applications of these approaches to top-down proteomics have been limited because proteins tend to precipitate and “crash” out of solution during TMT labeling of complex samples making the quantification of such samples difficult. In this study, we report a top-down TMT MS platform for confidently identifying and quantifying low molecular weight intact proteoforms in complex biological samples. To reduce the sample complexity and remove large proteins from complex samples, we developed a filter-SEC technique that combines a molecular weight cutoff filtration step with high-performance size exclusion chromatography (SEC) separation.
No protein precipitation was observed in filtered samples under the intact protein-level TMT labeling conditions. The proposed top-down TMT MS platform enables high-throughput analysis of intact proteoforms, allowing for the identification and quantification of hundreds of intact proteoforms from Escherichia coli cell lysates. To our knowledge, this represents the first high-throughput TMT labeling-based, quantitative, top-down MS analysis suitable for complex biological samples.

Item Rare variants in long non-coding RNAs are associated with blood lipid levels in the TOPMed Whole Genome Sequencing Study (medRxiv, 2023-06-29)
Wang, Yuxuan; Selvaraj, Margaret Sunitha; Li, Xihao; Li, Zilin; Holdcraft, Jacob A.; Arnett, Donna K.; Bis, Joshua C.; Blangero, John; Boerwinkle, Eric; Bowden, Donald W.; Cade, Brian E.; Carlson, Jenna C.; Carson, April P.; Chen, Yii-Der Ida; Curran, Joanne E.; de Vries, Paul S.; Dutcher, Susan K.; Ellinor, Patrick T.; Floyd, James S.; Fornage, Myriam; Freedman, Barry I.; Gabriel, Stacey; Germer, Soren; Gibbs, Richard A.; Guo, Xiuqing; He, Jiang; Heard-Costa, Nancy; Hildalgo, Bertha; Hou, Lifang; Irvin, Marguerite R.; Joehanes, Roby; Kaplan, Robert C.; Kardia, Sharon Lr.; Kelly, Tanika N.; Kim, Ryan; Kooperberg, Charles; Kral, Brian G.; Levy, Daniel; Li, Changwei; Liu, Chunyu; Lloyd-Jone, Don; Loos, Ruth Jf.; Mahaney, Michael C.; Martin, Lisa W.; Mathias, Rasika A.; Minster, Ryan L.; Mitchell, Braxton D.; Montasser, May E.; Morrison, Alanna C.; Murabito, Joanne M.; Naseri, Take; O'Connell, Jeffrey R.; Palmer, Nicholette D.; Preuss, Michael H.; Psaty, Bruce M.; Raffield, Laura M.; Rao, Dabeeru C.; Redline, Susan; Reiner, Alexander P.; Rich, Stephen S.; Ruepena, Muagututi'a Sefuiva; Sheu, Wayne H-H; Smith, Jennifer A.; Smith, Albert; Tiwari, Hemant K.; Tsai, Michael Y.; Viaud-Martinez, Karine A.; Wang, Zhe; Yanek, Lisa R.; Zhao, Wei; NHLBI Trans-Omics for Precision Medicine (TOPMed) Consortium; Rotter, Jerome I.; Lin, Xihong; Natarajan, Pradeep;
Peloso, Gina M.; Biostatistics and Health Data Science, School of Medicine
Long non-coding RNAs (lncRNAs) are known to perform important regulatory functions. Large-scale whole genome sequencing (WGS) studies and new statistical methods for variant set tests now provide an opportunity to assess the associations between rare variants in lncRNA genes and complex traits across the genome. In this study, we used high-coverage WGS from 66,329 participants of diverse ancestries with blood lipid levels (LDL-C, HDL-C, TC, and TG) in the National Heart, Lung, and Blood Institute (NHLBI) Trans-Omics for Precision Medicine (TOPMed) program to investigate the role of lncRNAs in lipid variability. We aggregated rare variants for 165,375 lncRNA genes based on their genomic locations and conducted rare variant aggregate association tests using the STAAR (variant-Set Test for Association using Annotation infoRmation) framework. We performed STAAR conditional analysis adjusting for common variants in known lipid GWAS loci and rare coding variants in nearby protein coding genes. Our analyses revealed 83 rare lncRNA variant sets significantly associated with blood lipid levels, all of which were located in known lipid GWAS loci (in a ±500 kb window of a Global Lipids Genetics Consortium index variant). Notably, 61 out of 83 signals (73%) were conditionally independent of common regulatory variations and rare protein coding variations at the same loci. We replicated 34 out of 61 (56%) conditionally independent associations using the independent UK Biobank WGS data.
Our results expand the genetic architecture of blood lipids to rare variants in lncRNA, implicating new therapeutic opportunities.

Item Top-down Mass Spectrometry Analysis of Human Serum Autoantibody Antigen-Binding Fragments (Springer Nature, 2019-02-20)
Wang, Zhe; Liu, Xiaowen; Muther, Jennifer; James, Judith A.; Smith, Kenneth; Wu, Si; BioHealth Informatics, School of Informatics and Computing
Detecting autoimmune diseases at an early stage is crucial for effective treatment and disease management to slow disease progression and prevent irreversible organ damage. In many autoimmune diseases, disease-specific autoantibodies are produced by B cells in response to soluble autoantigens due to defects in B cell tolerance mechanisms. Autoantibodies accrue early in disease development, and several are so disease-specific they serve as classification criteria. In this study, we established a high-throughput, sensitive, intact serum autoantibody analysis platform based on the optimization of a one dimensional ultra-high-pressure liquid chromatography top-down mass spectrometry platform (1D UPLC-TDMS). This approach has been successfully applied to a 12 standard monoclonal antibody antigen-binding fragment (Fab) mixture, demonstrating the feasibility to separate and sequence intact antibodies with high sequence coverage and high sensitivity. We then applied the optimized platform to characterize total serum antibody Fabs in a systemic lupus erythematosus (SLE) patient sample and compared it to healthy control samples. From this analysis, we show that the SLE sample has many dominant antibody Fab-related mass features unlike the healthy controls. To our knowledge, this is the first top-down demonstration of serum autoantibody pool analysis.
Our proposed approach holds great promise for discovering novel serum autoantibody biomarkers that are of interest for diagnosis, prognosis, and tolerance induction, as well as improving our understanding of pathogenic autoimmune processes.

Item Whole Genome Sequencing Analysis of Body Mass Index Identifies Novel African Ancestry-Specific Risk Allele (medRxiv, 2023-08-22)
Zhang, Xinruo; Brody, Jennifer A.; Graff, Mariaelisa; Highland, Heather M.; Chami, Nathalie; Xu, Hanfei; Wang, Zhe; Ferrier, Kendra; Chittoor, Geetha; Josyula, Navya S.; Li, Xihao; Li, Zilin; Allison, Matthew A.; Becker, Diane M.; Bielak, Lawrence F.; Bis, Joshua C.; Boorgula, Meher Preethi; Bowden, Donald W.; Broome, Jai G.; Buth, Erin J.; Carlson, Christopher S.; Chang, Kyong-Mi; Chavan, Sameer; Chiu, Yen-Feng; Chuang, Lee-Ming; Conomos, Matthew P.; DeMeo, Dawn L.; Du, Margaret; Duggirala, Ravindranath; Eng, Celeste; Fohner, Alison E.; Freedman, Barry I.; Garrett, Melanie E.; Guo, Xiuqing; Haiman, Chris; Heavner, Benjamin D.; Hidalgo, Bertha; Hixson, James E.; Ho, Yuk-Lam; Hobbs, Brian D.; Hu, Donglei; Hui, Qin; Hwu, Chii-Min; Jackson, Rebecca D.; Jain, Deepti; Kalyani, Rita R.; Kardia, Sharon L. R.; Kelly, Tanika N.; Lange, Ethan M.; LeNoir, Michael; Li, Changwei; Marchand, Loic Le; McDonald, Merry-Lynn N.; McHugh, Caitlin P.; Morrison, Alanna C.; Naseri, Take; NHLBI Trans-Omics for Precision Medicine (TOPMed) Consortium; O'Connell, Jeffrey; O'Donnell, Christopher J.; Palmer, Nicholette D.; Pankow, James S.; Perry, James A.; Peters, Ulrike; Preuss, Michael H.; Rao, D. C.; Regan, Elizabeth A.; Reupena, Sefuiva M.; Roden, Dan M.; Rodriguez-Santana, Jose; Sitlani, Colleen M.; Smith, Jennifer A.; Tiwari, Hemant K.; Vasan, Ramachandran S.; Wang, Zeyuan; Weeks, Daniel E.; Wessel, Jennifer; Wiggins, Kerri L.; Wilkens, Lynne R.; Wilson, Peter W.
F.; Yanek, Lisa R.; Yoneda, Zachary T.; Zhao, Wei; Zöllner, Sebastian; Arnett, Donna K.; Ashley-Koch, Allison E.; Barnes, Kathleen C.; Blangero, John; Boerwinkle, Eric; Burchard, Esteban G.; Carson, April P.; Chasman, Daniel I.; Chen, Yii-Der Ida; Curran, Joanne E.; Fornage, Myriam; Gordeuk, Victor R.; He, Jiang; Heckbert, Susan R.; Hou, Lifang; Irvin, Marguerite R.; Kooperberg, Charles; Minster, Ryan L.; Mitchell, Braxton D.; Nouraie, Mehdi; Psaty, Bruce M.; Raffield, Laura M.; Reiner, Alexander P.; Rich, Stephen S.; Rotter, Jerome I.; Shoemaker, M. Benjamin; Smith, Nicholas L.; Taylor, Kent D.; Telen, Marilyn J.; Weiss, Scott T.; Zhang, Yingze; Heard-Costa, Nancy; Sun, Yan V.; Lin, Xihong; Cupples, L. Adrienne; Lange, Leslie A.; Liu, Ching-Ti; Loos, Ruth J. F.; North, Kari E.; Justice, Anne E.; Biostatistics and Health Data Science, School of Medicine
Obesity is a major public health crisis associated with high mortality rates. Previous genome-wide association studies (GWAS) investigating body mass index (BMI) have largely relied on imputed data from European individuals. This study leveraged whole-genome sequencing (WGS) data from 88,873 participants from the Trans-Omics for Precision Medicine (TOPMed) Program, of which 51% were of non-European population groups. We discovered 18 BMI-associated signals (P < 5 × 10⁻⁹). Notably, we identified and replicated a novel low frequency single nucleotide polymorphism (SNP) in MTMR3 that was common in individuals of African descent. Using a diverse study population, we further identified two novel secondary signals in known BMI loci and pinpointed two likely causal variants in the POC5 and DMD loci. Our work demonstrates the benefits of combining WGS and diverse cohorts in expanding the current catalog of variants and genes that confer risk for obesity, bringing us one step closer to personalized medicine.