Browsing by Author "Wu, Si"
Item 25th Annual Computational Neuroscience Meeting: CNS-2016(BioMed Central, 2016-08-18) Sharpee, Tatyana O.; Destexhe, Alain; Kawato, Mitsuo; Sekulić, Vladisla; Skinner, Frances K.; Wójcik, Daniel K.; Chintaluri, Chaitanya; Cserpán, Dorottya; Somogyvári, Zoltán; Kim, Jae Kyoung; Kilpatrick, Zachary P.; Bennett, Matthew R.; Josić, Kresimir; Elices, Irene; Arroyo, David; Levi, Rafael; Rodriguez, Francisco B.; Varona, Pablo; Hwang, Eunjin; Kim, Bowon; Han, Hio-Been; Kim, Tae; McKenna, James T.; Brown, Ritchie E.; McCarley, Robert W.; Choi, Jee Hyun; Rankin, James; Popp, Pamela Osborn; Rinzel, John; Tabas, Alejandro; Rupp, André; Balaguer‑Ballester, Emili; Maturana, Matias I.; Grayden, David B.; Cloherty, Shaun L.; Kameneva, Tatiana; Ibbotson, Michael R.; Meffin, Hamish; Koren, Veronika; Lochmann, Timm; Dragoi, Valentin; Obermayer, Klaus; Psarrou, Maria; Schilstra, Maria; Davey, Neil; Ju, Huiwen; Hines, Michael L.; Chen, Liang; Kim, Jimin; Leahy, Will; Shlizerman, Eli; Birgolias, Justas; Gerkin, Richard C.; Crook, Sharon M.; Viriyopase, Atthaphon; Memmeshei, Raol-Martin; Dabaghian, Yuri; DeVuti, Justin; Perotti, Luca; Kim, Ammo J.; Fenk, Lisa M.; Lyu, Cheng; Malmon, Gabby; Zhao, Chang; Widmer, Yves; Sprecher, Simon; Halnes, Geir; Tuomo, Maki-Martun; Keller, Daniel; Petterson, Klas H.; Andreassen, Ole A.; Elnevoll, Gaute T.; Yamada, Yasnori; Steyn-Ross, Moira L.; Steyn-Ross, D. 
Alistair; Meijas, Jorge F.; Murray, John D.; Kennedy, Henry; Kruscha, Alexandra; Grewe, Jan; Lidner, Benjamin; Badel, Laurent; Kasumi, Ohta; Tsuchimoto, Yoshiko; Kazama, Hokto; Kahng, B.; Tam, Nicoladie D.; Pollonini, Luca; Zouridakis, George; Soh, Jaehyun; Kim, DaeEun; Yoo, Minsu; Palmer, S.E.; Culmone, Viviana; Bojak, Ingo; Ferrario, Andrea; Merriosn-Hort, Robert; Borisyuk, Roman; Kim, Chang Sub; Tezuka, Taro; Joo, Pangyu; Young-Ah, Rho; Burton, Shawn D.; Bard, G.; Marsalek, Petr; Kim, Hoon-Hee; Moon, Seok-hun; Lee, Do-won; Molkov, Yaroslav I.; Hamade, Khaldoun; Teka, Wondimu; Barnett, William H.; Kim, Taegyo; Markin, Sergey; Rybak, Ilya A.; Forrow, Csaba; Demutz, Harald; Demkó, László; Vörös, János; Dabaghian, Yuri; Babichev, Andrey; Huang, Haiping; Metzner, Christoph; Schwikard, Achim; Zurowski, Bartosz; Roach, James P.; Sander, Leonard M.; Zochowski, Michal R.; Skilling, Quinton M.; Ognjanovski, Nicolette; Aton, Sara J.; Zochowski, Michal; Wang, Sheng-Ju; Ouyang, Guang; Zhang, Mingsha; Wong, Michael; Zhou, Changsong; Robinson, Peter A.; Sanz-Leon, Paula; Drysdale, Peter M.; Fung, Felix; Abeysuriya, Romesh G.; Rennle, Chris J.; Zhao, Xuelong; Choe, Yoonsuck; Yang, Huei-Fang; Mi, Yuanyuan; Lin, Xiahoan; Wu, Si; Liedtke, Joscha; Schottdorf, Manual; Wolf, Fred; Yamamura, Yorkio; Wickens, Jeffery R.; Rumbell, Timothy; Ramsey, Julia; Reyes, Amy; Draguljić, Daniel; Hof, Patrick R.; Luebke, Jennifer; Weaver, Christina M.; He, Hu; Yang, Xu; Ma, Hailin; Xu, Zhiheng; Wang, Yuzhe; Baek, Kwangyeol; Morris, Laurel S.; Kundu, Prantik; Voon, Valerie; Agnes, Everton J.; Vogels, Tim P.; Giese, Martin; Kuravi, Pradeep; Vogels, Rufin; Seeholzer, Alexander; Podlaski, William; Ranjan, Rajnish; Vogels, Tim; Torres, Joaquin J.; Baroni, Fabiano; Latorre, Roberto; Varona, Pablo; Gips, Bart; Lowet, Eric; Roberts, Mark J.; de Weerd, Peter; Jensen, Ole; van der Eerden, Jan; Goodarzinic, Abdorreza; Niry, Mohammad; Valizadeh, Alireza; Pariz, Aref; Parsi, Shervin S.; Valizadeh, Alireza; 
Warburton, Julia M.; Marucci, Lucia; Tamagnini, Francesco; Brown, John; Tsaneva‑Atanasova, Krasimira; Kleberg, Florence I.; Triesch, Jochen; Moezzi, Bahar; Iannella, Nicolangelo; Schaworonkow, Natalie; Plogmacher, Lukas; Goldsworthy, Mitchell R.; Hordacre, Brenton; McDonnell, Mark D.; Ridding, Michael C.; Trisch, Jochen; Zaptocky, Martin; Smit, Daniel; Fouquet, Coralie; Trembleau, Alain; Dasgupta, Sakyasingha; Nishikawa, Isao; Aihara, Kazuyuki; Toyoizumi, Taro; Robb, Daniel T.; Mellen, Nick; Toporikova, Natalia; Tang, Rongxiang; Tang, Yi-Yuan; Kiser, Seth A.; Howard Jr., James H.; Tang, Yi-Yuan; Goncharenko, Julia; Davey, Neil; Schilstra, Marla; Steuber, Volker; Voronenko, Sergej O.; Linder, Benjamin; Ahamed, Tosif; Stephens, Greg; Yger, Pierre; Lefebvre, Baptiste; Spampinato, Giulia Lia Beatrice; Esposito, Elric; Stimberg, Marcel; Marre, Olivier; Choi, Hansol; Song, Min-Ho; Chung, SueYeon; Lee, Dan D.; Sompolinsky, Haim; Phillips, Ryan S.; Smith, Jeffrey; Chatzikalymniou, Alexandra Pierri; Ferguson, Katie; Skinner, Frances K.; Gajic, N. Alex Cayco; Clopath, Claudia; Silver, R. 
Angus; Gleeson, Padraig; Marin, Boris; Sadeh, Sadra; Quintana, Adrian; Cantarelli, Matteo; Dura‑Bernal, Salvador; Lytton, William W.; Davison, Andrew; Silver, Angus; Li, Luozheng; Zhang, Wenhao; Mi, Yuanyuan; Wang, Dahui; Wu, Sl; Song, Youngjo; Park, Sol; Choi, Ilhwan; Jeong, Jaeseung; Shin, Hee‑sup; Choi, Hannah; Pasupathy, Anitha; Shea-Brown, Eric; Huh, Dongsung; Sejnowski, Terrence J.; Vogt, Simon M.; Kumar, Arvind; Schmidt, Robert; Werdt, Stephen Van; Schiff, Steven J.; Veale, Richard; Scheutz, Matthias; Lee, Sang Wan; Gallinaro, Júlia; Rotter, Stefan; Sanz‑Leon, Paula; Robinson, Peter A.; Rubchinsky, Leonid L.; Cheung, Chung Ching; Ratnadurai‑Giridharan, Shivakeshavan; Shomali, Safura Rashid; Ahmadabadi, Majid Nili; Shimazaki, Hideaki; Rasuli, Nader; Zhao, Xiaochen; Rasch, Malte J.; Witting, Jens; Priesemann, Viola; Levina, Anna; Priesemann, Viola; Lizler, Joseph T.; Spinney, Richard E.; Rubinov, Mikail; Wibral, Michael; Bak, Ji Hyun; Pillow, Jonathan; Zaho, Yuan; Park, Memming; Kang, Jiyoung; Park, Hae‑Jeong; Jang, Jaeson; Paik, Se-Bum; Choi, Woochul; Lee, Changju; Jang, Jaeson; Paik, Se‑Bum; Song, Min; Lee, Hyeonsu; Yilmaz, Ergin; Baysal, Velt; Ozer, Mahmut; Koren, Veronika; Obermayer, Klaus; Saska, Daniel; Nowotny, Thomas; Chan, Ho Ka; Diamond, Alan; Hermann, Christoph S.; Murray, Micha M.; Ionta, Silvlo; Hutt, Axel; Lefebvre, Jérémie; Weidel, Philipp; Duarte, Renato; Morrison, Abigail; Iyer, Ramakrishnan; Mihalas, Stefan; Petrovici, Mihai A.; Leng, Luziwei; Breitwieser, Oliver; Stöckel, David; Bytschok, Ilja; Martel, Roman; Bill, Johannes; Schemmel, Johannes; Meier, Karlheinz; Esler, Timothy B.; Burkitt, Anthony N.; Grayden, David B.; Kerr, Robert R.; Tahayori, Bahman; Meffin, Hamish; Moezzi, Bahar; Iannella, Nicolangelo; McDonnell, Mark D.; Nolte, Max; Reimann, Michael W.; Muler, Eilif; Markram, Henry; Parziale, Antonio; Senatore, Rosa; Marcelli, Angelo; Maouene, M.; Skiker, K.; Neymotin, Samuel A.; Dura‑Bernal, Salvador; Seidenstein, Alexandra; Lakatos, 
Peter; Sanger, Terence D.; Lytton, William W.; Dura‑Bernal, Salvador; Menzies, Rosemary J.; McLauchlan, Campbell; van Albada, Sacha J.; Kedziora, David J.; Neymotin, Samuel; Kerr, Cliff C.; Ryu, Juhyoung; Lee, Sang-Hun; Lee, Joonwon; Lee, Hyang Jung; Lim, Daeseob; Lee, Jung H.; Wang, Jisung; Lee, Heonsoo; Jung, Nam; Quang, Le Anh; Maeng, Seung Eu; Lee, Tae Ho; Lee, Jae Woo; Park, Chang-hyun; Ahn, Sora; Moon, Jangsup; Choi, Yun Seo; Kim, Juhee; Jun, Sang Beom; Lee, Seungjun; Lee, Hyang Woon; Jo, Sumin; Jun, Eunji; Yu, Suin; Goetze, Felix; Lai, Pik‑Yin; Kwag, Jeehyun; Liang, Guangsheng; Jang, Hyun Jae; Filipovi, Marko; Reig, Ramon; Aertsen, Ad; Silberberg, Gilad; Kumar, Arvind; Bachmann, Claudia; Buttler, Simone; Jacobs, Heidi; Dillen, Kim; Fink, Gereon R.; Kukolja, Juraj; Kepple, Daniel; Giaffar, Hamza; Rinberg, Dima; Shea, Steven; Koulakov, Alex; Bahuguna, Jyotika; Tetzlaff, Tom; Kotaleski, Jeanette Hellgren; Kunze, Tim; Peterson, Andre; Knösche, Thomas; Kim, Minjung; Kim, Hojeong; Park, Ji Sung; Yeon, Ji Won; Kim, Sung-Phil; Lee, Chungho; Kim, Sung-Phil; Spiegler, Andreas; Petkoski, Spase; Palva, Matias J.; Jirsa, Viktor K.; Saggio, Maria L.; Siep, Silvan F.; Stacey, William C.; Bernard, Christophe; Choung, Oh‑hyeon; Jeong, Yong; Lee, Yong‑il; Jeong, Jaesung; Kim, Su Hyun; Lee, Jeungmin; Kwon, Jaehyung; Kralik, Jerald D.; Hwang, Dong‑Uk; Park, Sang-Min; Kim, Seongkyun; Kim, Hyoungkyu; Lim, Sewoong; Yoon, Sangsup; Park, Choongseok; Miller, Thomas; Clements, Katie; Hye Jr., Eoon; Issa, Fadi A.; Baek, JeongHun; Oba, Shigeyuki; Yoshimoto, Junichiro; Doya, Kenji; Ishii, Shin; Mosqueiro, Thiago S; Strube‑Bloss, Martin F.; Smith, Brian; Huerta, Ramon; Hadrava, Michal; Hlinka, Jaroslav; Bos, Hannah; Helias, Moritz; Welzig, Charles M.; Harper, Zachary J.; Kim, Won Sup; Shin, In-Seob; Baek, Hyeon-Man; Han, Seung Kee; Richter, René; Vitay, Julien; Beuth, Frederick; Hamker, Fred H.; Kameneva, Tatiana; Graham, Bruce P.; Kale, Penelope J.; Gollo, Leonardo L.; Stern, Merav; 
Abbott, L.F.; Fedorov, Leonid A.; Giese, Martin A.; Ardestani, Mohammad Hovaidi; Giese, Martin; Chakravarthy, V.Srinivasa; Chhabria, Karishma; Philips, Ryan T.; Ardestani, Mohammad Hovaidi; Faraji, Mohammad Java; Preuschoff, Kerstin; Gerstner, Wulfram; Briaire`, Jeroen J.; Kalkman, Randy K.; Frijns, Johan H. M.; Lee, Won Hee; Frangou, Sophia; Fulcher, Ben D.; Tran, Patricia H. P.; Fornito, Alex; Gliske, Stephen V.; Stacey, William C.; Holman, Katherine A.; Fink, Christian G.; Kim, Jinseop; Mu, Shang; Briggman, Kevin L; Seung, H. Sebastian; Wegener, Detlef; Bohnenkamp, Lisa; Ernst, Udo A.; Mäki‑Marttunen, Tuomo; Halnes, Geir; Devor, Anna; Dale, Anders M.; Andreassen, Ole A.; Einevoll, Gaute T.; Hagen, Espen; Lines, Glenn T.; Edwards, Andy; Tveito, Aslak; Senk, Johanna; van Albada, Sacha J; Diesmann, Markus; Schmidt, Maximilian; Bakker, Rembrandt; Shen, Kelly; Bezgin`, Gleb; Hilgetag`, Claus‑Christian; Sun, Haoqi; Sourina, Olga; Huang, Guang-Bin; Klanner, Felix; Denk, Cornelia; Glomb, Katharina; Ponce‑Alvarez, Adrián; Gilson, Matthieu; Ritter, Petra; Deco, Gustavo; Witek, Maria A. G.; Clarke, Eric F.; Hansen, Mads; Wallentin, Mikkel; Kringelbach, Morten L.; Vuust, Peter; Klingbeil, Guido; Schutter, Erik De; Chen, Weiliang; Hong, Sungho; Takashima, Akira; Zamora, Criseida; Gallimore, Andrew R.; Karoly, Philippa J.; Freestone, Dean R.; Soundry, Daniel; Kuhlmann, Levin; Paninski, Liam; Cook, Mark; Lee, Jaejin; Fishman, Yonatan I.; Cohen, Yale E.; Cocchi, Luca; Sweeney, Yann; Lee, Soohyun; Jung, Woo-Sung; Kim, Bowon; Kim, Youngsoo; Jung, Younginha; Rankin, James; Chavane, Frédéric; Soman, Karthik; Muralidharan, Vignesh; Shivkumar, Sabyasach; Mandall, Alekhya; Priyadharsini, B. 
Praga; Mehta, Hima; Brinkman, Braden A.; Kekona, Tyler; Rieke, Fred; Shea‑Brown, Eric; Buice, Michael; Pittà, Maurizio De; Berry, Hugues; Brunel, Nicolas; Breakspear, Michael; Marsat, Gary; Drew, Jordan; Chapman, Phillip D.; Daly, Kevin C.; Bradley, Samual P.; Seo, Sat Byul; Su, Jianzhong; Kavalali, Enge T.; Blackwell, Justin; Shiau, LieJune; Buhry, Laure; Basnayake, Kanishka; Lee, Sue-Hyun; Levy, Brandon A.; Baker, Chris I.; Leleu, Timothée; Aihara, Kazuyuki; Department of Mathematical Sciences, School of Science

Item Characterization of proteoforms with unknown post-translational modifications using the MIScore (ACS, 2016) Kou, Qiang; Zhu, Binhai; Wu, Si; Ansong, Charles; Tolić, Nikola; Paša-Tolić, Ljiljana; Liu, Xiaowen; Department of Biohealth Informatics, School of Informatics and Computing
Various proteoforms may be generated from a single gene due to primary structure alterations (PSAs) such as genetic variations, alternative splicing, and post-translational modifications (PTMs). Top-down mass spectrometry is capable of analyzing intact proteins and identifying patterns of multiple PSAs, making it the method of choice for studying complex proteoforms. In top-down proteomics, proteoform identification is often performed by searching tandem mass spectra against a protein sequence database that contains only one reference protein sequence for each gene or transcript variant in a proteome. Because of the incompleteness of the protein database, an identified proteoform may contain unknown PSAs compared with the reference sequence. Proteoform characterization aims to identify and localize PSAs in a proteoform. Although many software tools have been proposed for proteoform identification by top-down mass spectrometry, the characterization of proteoforms in identified proteoform–spectrum matches still relies mainly on manual annotation.
We propose to use the Modification Identification Score (MIScore), which is based on Bayesian models, to automatically identify and localize PTMs in proteoforms. Experiments showed that the MIScore is accurate in identifying and localizing one or two modifications.

Item Deep Intact Proteoform Characterization in Human Cell Lysate using High-pH and Low-pH Reversed-Phase Liquid Chromatography (American Chemical Society, 2019-12) Yu, Dahang; Wang, Zhe; Sutton, Kellye A.; Liu, Xiaowen; Wu, Si; Computer and Information Science, School of Science
Post-translational modifications (PTMs) play critical roles in biological processes and have significant effects on the structures and dynamics of proteins. Top-down proteomics methods were developed for and applied to the study of intact proteins and their PTMs in human samples. However, the large dynamic range and complexity of human samples make the study of human proteins challenging. To address these challenges, we developed a 2D pH RP/RPLC-MS/MS technique that fuses high-resolution separation and intact protein characterization to study the human proteins in HeLa cell lysate. Our results provide deep coverage of soluble proteins in human cancer cells. Compared to 225 proteoforms from 124 proteins identified when 1D separation was used, 2778 proteoforms from 628 proteins were detected and characterized using our 2D separation method. Many proteoforms with critically functional PTMs including phosphorylation were characterized. Additionally, we present the first detection of intact human GcvH proteoforms with rare modifications such as octanoylation and lipoylation. Overall, the increase in the number of proteoforms identified using 2DLC separation is largely due to the reduction in sample complexity through improved separation resolution, which enables the detection of low-abundance, PTM-modified proteoforms.
We demonstrate here that 2D pH RP/RPLC is an effective technique to analyze complex protein samples using top-down proteomics.

Item Development of an Online 2D Ultrahigh-Pressure Nano-LC System for High-pH and Low-pH Reversed Phase Separation in Top-Down Proteomics (American Chemical Society, 2020-08-28) Wang, Zhe; Yu, Dahang; Cupp-Sutton, Kellye A.; Liu, Xiaowen; Smith, Kenneth; Wu, Si; Computer and Information Science, School of Science
The development of novel high-resolution separation techniques is crucial for advancing the complex sample analysis necessary for high-throughput top-down proteomics. Recently, our group developed an offline 2D high-pH RPLC/low-pH RPLC separation method and demonstrated good orthogonality between these two RPLC formats. Specifically, ultrahigh-pressure long capillary column RPLC separation has been applied as the second dimensional low-pH RPLC separation for the improvement of separation resolution. To further improve the throughput and sensitivity of the offline approach, we developed an online 2D ultrahigh-pressure nano-LC system for high-pH and low-pH RPLC separations in top-down proteomics. An online microtrap column with a dilution setup was used to collect eluted proteins from the first dimension high-pH separation and inject the fractions for ultrahigh-pressure long capillary column low-pH RPLC separation in the second dimension. This automatic platform enables the characterization of 1000+ intact proteoforms from 5 μg of intact E. coli cell lysate in 10 online-collected fractions.
Here, we have demonstrated that our online 2D pH RP/RPLC system coupled with top-down proteomics holds the potential for deep proteome characterization of mass-limited samples because it allows the identification of hundreds of intact proteoforms from complex biological samples at low microgram sample amounts.

Item Evaluation of top-down mass spectral identification with homologous protein sequences (BioMed Central, 2018-12-28) Li, Ziwei; He, Bo; Kou, Qiang; Wang, Zhe; Wu, Si; Liu, Yunlong; Feng, Weixing; Liu, Xiaowen; Medical and Molecular Genetics, School of Medicine
BACKGROUND: Top-down mass spectrometry has unique advantages in identifying proteoforms with multiple post-translational modifications and/or unknown alterations. Most software tools in this area search top-down mass spectra against a protein sequence database for proteoform identification. When the species studied in a mass spectrometry experiment lacks its proteome sequence database, a homologous protein sequence database can be used for proteoform identification. The accuracy of homologous protein sequences affects the sensitivity of proteoform identification and the accuracy of mass shift localization. RESULTS: We tested TopPIC, a commonly used software tool for top-down mass spectral identification, on a top-down mass spectrometry data set of Escherichia coli K12 MG1655, and evaluated its performance using an Escherichia coli K12 MG1655 proteome database and a homologous protein database. The number of identified spectra with the homologous database was about half of that with the Escherichia coli K12 MG1655 database. We also tested TopPIC on a top-down mass spectrometry data set of human MCF-7 cells and obtained similar results.
CONCLUSIONS: Experimental results demonstrated that TopPIC is capable of identifying many proteoform–spectrum matches and localizing unknown alterations using homologous protein sequences containing no more than 2 mutations.

Item Identification and Quantification of Proteoforms by Mass Spectrometry (Wiley, 2019-05) Schaffer, Leah V.; Millikin, Robert J.; Miller, Rachel M.; Anderson, Lissa C.; Fellers, Ryan T.; Ge, Ying; Kelleher, Neil L.; LeDuc, Richard D.; Liu, Xiaowen; Payne, Samuel H.; Sun, Liangliang; Thomas, Paul M.; Tucholski, Trisha; Wang, Zhe; Wu, Si; Wu, Zhijie; Yu, Dahang; Shortreed, Michael R.; Smith, Lloyd M.; BioHealth Informatics, School of Informatics and Computing
A proteoform is a defined form of a protein derived from a given gene with a specific amino acid sequence and localized post-translational modifications. In top-down proteomic analyses, proteoforms are identified and quantified through mass spectrometric analysis of intact proteins. Recent technological developments have enabled comprehensive proteoform analyses in complex samples, and an increasing number of laboratories are adopting top-down proteomic workflows. In this review, we outline some recent advances and discuss current challenges and future directions for the field.

Item Identification of ultramodified proteins using top-down tandem mass spectra (American Chemical Society, 2013-12-06) Liu, Xiaowen; Hengel, Shawna; Wu, Si; Tolić, Nikola; Paša-Tolić, Ljiljana; Pevzner, Pavel A.; Department of BioHealth Informatics, IU School of Informatics and Computing
Post-translational modifications (PTMs) play an important role in various biological processes through changing protein structure and function. Some ultramodified proteins (like histones) have multiple PTMs forming PTM patterns that define the functionality of a protein.
While bottom-up mass spectrometry (MS) has been successful in identifying individual PTMs within short peptides, it is unable to identify PTM patterns spreading along entire proteins in a coordinated fashion. In contrast, top-down MS analyzes intact proteins and reveals PTM patterns along the entire proteins. However, while recent advances in instrumentation have made top-down MS accessible to many laboratories, most computational tools for top-down MS focus on proteins with few PTMs and are unable to identify complex PTM patterns. We propose a new algorithm, MS-Align-E, that identifies both expected and unexpected PTMs in ultramodified proteins. We demonstrate that MS-Align-E identifies many proteoforms of histone H4 and benchmark it against the currently accepted software tools.

Item A Markov chain Monte Carlo method for estimating the statistical significance of proteoform identifications by top-down mass spectrometry (ACS, 2019-03) Kou, Qiang; Wang, Zhe; Lubeckyj, Rachele A.; Wu, Si; Liu, Xiaowen; BioHealth Informatics, School of Informatics and Computing
Top-down mass spectrometry is capable of identifying whole proteoform sequences with multiple post-translational modifications because it generates tandem mass spectra directly from intact proteoforms. Many software tools, such as ProSightPC, MSPathFinder, and TopMG, have been proposed for identifying proteoforms with modifications. In these tools, various methods are employed to estimate the statistical significance of identifications. However, most existing methods are designed for proteoform identifications without modifications, and the challenge remains for accurately estimating the statistical significance of proteoform identifications with modifications. Here we propose TopMCMC, a method that combines a Markov chain random walk algorithm and a greedy algorithm for assigning statistical significance to matches between spectra and protein sequences with variable modifications.
Experimental results showed that TopMCMC achieved high accuracy in estimating E-values and false discovery rates of identifications in top-down mass spectrometry. Coupled with TopMG, TopMCMC identified more spectra than the generating function method from an MCF-7 top-down mass spectrometry data set.

Item A mass graph-based approach for the identification of modified proteoforms using top-down tandem mass spectra (Oxford, 2017-05-01) Kou, Qiang; Wu, Si; Tolić, Nikola; Paša-Tolić, Ljiljana; Liu, Yunlong; Liu, Xiaowen; BioHealth Informatics, School of Informatics and Computing
Motivation: Although proteomics has rapidly developed in the past decade, researchers are still in the early stage of exploring the world of complex proteoforms, which are protein products with various primary structure alterations resulting from gene mutations, alternative splicing, post-translational modifications, and other biological processes. Proteoform identification is essential to mapping proteoforms to their biological functions as well as discovering novel proteoforms and new protein functions. Top-down mass spectrometry is the method of choice for identifying complex proteoforms because it provides a 'bird's eye view' of intact proteoforms. The combinatorial explosion of various alterations on a protein may result in billions of possible proteoforms, making proteoform identification a challenging computational problem. Results: We propose a new data structure, called the mass graph, for efficient representation of proteoforms and design mass graph alignment algorithms. We developed TopMG, a mass graph-based software tool for proteoform identification by top-down mass spectrometry.
Experiments on top-down mass spectrometry datasets showed that TopMG outperformed existing methods in identifying complex proteoforms.

Item Mass graphs and their applications in top-down proteomics (2015) Kou, Qiang; Wu, Si; Tolić, Nikola; Paša-Tolić, Ljiljana; Liu, Xiaowen; Department of Biohealth Informatics, School of Informatics and Computing
Although proteomics has made rapid progress in the past decade, researchers are still in the early stage of exploring the world of complex proteoforms, which are protein products with various primary structure alterations resulting from gene mutations, alternative splicing, post-translational modifications, and other biological processes. Proteoform identification is essential to mapping proteoforms to their biological functions as well as discovering novel proteoforms and new protein functions. Top-down mass spectrometry is the method of choice for identifying complex proteoforms because it provides a "bird's-eye view" of intact proteoforms. The combinatorial explosion of alterations, which may result in billions of possible proteoforms for one protein, makes proteoform identification a challenging computational problem. Here we propose a new data structure, called the mass graph, for efficiently representing proteoforms. In addition, we design mass graph alignment algorithms for proteoform identification by top-down mass spectrometry. Experiments on a histone H4 mass spectrometry data set showed that the proposed methods outperformed MS-Align-E in identifying complex proteoforms.