Jake Chen

Permanent URI for this collection

https://hdl.handle.net/1805/6688

Computational Analysis of Drought Stress-Associated miRNAs and miRNA Co-Regulation Network in Physcomitrella patens.
(Elsevier, 2011-04) Wan, Ping; Wu, Jun; Zhou, Yuan; Xiao, Junshu; Feng, Jie; Zhao, Weizhong; Xiang, Shen; Jiang, Guanglong; Chen, Jake Yue; Department of Biohealth Informatics, IU School of Informatics and Computing
miRNAs are small non-coding RNAs that are involved in diverse biological processes. Until now, little has been known about their roles in plant drought resistance. Physcomitrella patens is highly tolerant to drought; however, the basic biology of the traits that give P. patens this important characteristic is not clear. In this work, we discovered 16 drought stress-associated miRNA (DsAmR) families in P. patens through computational analysis. Because of possible discrepancies in expression periods and tissue distributions between potential DsAmRs and their target genes, and the existence of false positives in computational identification, the prediction results should be examined with further experimental validation. We also constructed an miRNA co-regulation network and identified two network hubs, miR902a-5p and miR414, which may play important roles in regulating drought-resistance traits. We distributed our results through an online database named ppt-miRBase, which can be accessed at http://bioinfor.cnu.edu.cn/ppt_miRBase/index.php. Our methods for finding DsAmRs and the miRNA co-regulation network show a new direction for identifying miRNA functions.
Computational Biomarker Discovery: From Systems Biology to Predictive and Personalized Medicine Applications
(Office of the Vice Chancellor for Research, 2010-04-09) Chen, Jake Yue; Wu, Xiaogang; Zhang, Fan; Pandey, Ragini; Huang, Hui; Huan, Tianxiao
With the advent of genome-based medicine, there is an escalating need to discover how the modifications of biological molecules, either individually or as an ensemble, can be uniquely associated with human physiological states. This knowledge could lead to breakthroughs in the development of clinical tests known as "biomarker tests" to assess disease risks, early onset, prognosis, and treatment outcome predictions. Therefore, development of molecular biomarkers is a key agenda in the next 5-10 years to take full advantage of the human genome to improve human well-being. However, the complexity of human biological systems and the imperfections of high-throughput biological instruments and their results have created significant hurdles in biomarker development. Only recently have computational methods become an important player in this research area, where conventional molecular biomarker development has been both extremely slow and cost-ineffective. At the Indiana Center for Systems Biology and Personalized Medicine, we are developing several computational systems biology strategies to address these challenges. We will show examples of how we approach the problem using a variety of computational techniques, including data mining, algorithm development that takes biological contexts into account, biological knowledge integration, and information visualization. Finally, we outline how research in this direction to derive more robust molecular biomarkers may lead to predictive and personalized medicine. The Indiana Center for Systems Biology and Personalized Medicine (CSBPM) was founded in 2007 as an IUPUI signature center by Dr. Jake Chen and his colleagues in the Indiana University School of Informatics, School of Medicine, and School of Science. CSBPM is the only research center in the State of Indiana with the primary goal of pursuing predictive and personalized medicine.
CSBPM currently consists of eleven faculty members from the School of Medicine, School of Science, School of Engineering, School of Informatics, and Indiana University Simon Cancer Center. The primary mission of the center is to foster the development and use of systems biology and computational modeling techniques to address challenges in future genome-based medicine. The ultimate goal of the center is to shorten the discovery-to-practice gap between integrative "Omics" biology studies (including genomics, transcriptomics, proteomics, and metabolomics) and predictive and personalized medicine applications.
Discovery of pathway biomarkers from coupled proteomics and systems biology methods
(BMC, 2010-11-02) Zhang, Fan; Chen, Jake Yue; BioHealth Informatics, School of Informatics and Computing
Background: Breast cancer is the second most common type of cancer worldwide, after lung cancer. Plasma proteome profiling may have a higher chance of identifying protein changes between plasma samples, such as normal and breast cancer samples. Breast cancer cell lines have long been used by researchers as a model system for identifying protein biomarkers. Comparing the set of proteins that change in plasma with previously published findings from proteomic analysis of human breast cancer cell lines may identify a subset of candidate protein biomarkers with higher confidence. Results: In this study, we analyzed a liquid chromatography (LC) coupled tandem mass spectrometry (MS/MS) proteomics dataset from plasma samples of 40 healthy women and 40 women diagnosed with breast cancer. Using two-sample t-statistics and a permutation procedure, we identified 254 statistically significant, differentially expressed proteins, among which 208 are over-expressed and 46 are under-expressed in breast cancer plasma. We validated this result against previously published proteomic results of human breast cancer cell lines and signaling pathways to derive 25 candidate protein biomarkers in a panel. Using pathway analysis, we observed that the 25 “activated” plasma proteins were present in several cancer pathways, including ‘Complement and coagulation cascades’, ‘Regulation of actin cytoskeleton’, and ‘Focal adhesion’, and match well with previously reported studies. Additional gene ontology analysis of the 25 proteins also showed that cellular metabolic process and response to external stimulus (especially proteolysis and acute inflammatory response) were enriched functional annotations of the proteins identified in the breast cancer plasma samples.
By cross-validation using two additional proteomics studies, we obtained 86% and 83% similarities in the pathway-protein matrix between the first study and the two testing studies, which is much better than the similarity we measured with proteins. Conclusions: We presented a ‘systems biology’ method to identify, characterize, analyze and validate panel biomarkers in breast cancer proteomics data, which includes 1) t-statistics and a permutation process, 2) network, pathway and function annotation analysis, and 3) cross-validation of multiple studies. Our results showed that the systems biology approach is essential to understanding the molecular mechanisms of panel protein biomarkers.
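The abstract above names a two-sample t-statistic combined with a permutation procedure for finding differentially expressed proteins. The following is a minimal sketch of that general technique for a single protein, not the authors' implementation: the function names, the choice of the Welch form of the t-statistic, the number of permutations, and the add-one p-value estimate are all assumptions made for illustration.

```python
import random
from math import sqrt
from statistics import mean, variance


def t_statistic(a, b):
    """Welch two-sample t statistic (assumed form; the abstract does not specify one)."""
    se = sqrt(variance(a) / len(a) + variance(b) / len(b))
    if se == 0:
        return 0.0  # degenerate case: no variation in either group
    return (mean(a) - mean(b)) / se


def permutation_p_value(a, b, n_perm=2000, seed=0):
    """Two-sided permutation p-value for one protein's abundance values.

    Group labels are shuffled n_perm times; the p-value is the share of
    shuffles whose |t| is at least as large as the observed |t|.
    """
    rng = random.Random(seed)
    observed = abs(t_statistic(a, b))
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        perm_a, perm_b = pooled[:len(a)], pooled[len(a):]
        if abs(t_statistic(perm_a, perm_b)) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_perm + 1)


# Hypothetical abundance values for one protein in two groups.
healthy = [1.0, 1.2, 0.9, 1.1, 1.05, 0.95, 1.15, 0.85]
cancer = [2.0, 2.2, 1.9, 2.1, 2.05, 1.95, 2.15, 1.85]
p_value = permutation_p_value(healthy, cancer)
```

In a proteome-wide screen this test would be repeated for every protein, so some form of multiple-testing control would normally follow; the abstract does not say which, so none is shown here.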
DMAP: a connectivity map database to enable identification of novel drug repositioning candidates
(BioMed Central, 2015-09-25) Huang, Hui; Nguyen, Thanh; Ibrahim, Sara; Shantharam, Sandeep; Yue, Zongliang; Chen, Jake Yue; Department of Computer & Information Science, School of Science
BACKGROUND: Drug repositioning is a cost-efficient and time-saving approach to drug development compared to traditional techniques. A systematic method for drug repositioning is to identify candidate drugs' gene expression profiles on target disease models and determine how similar these profiles are to those of approved drugs. Databases such as the CMAP have been developed recently to help with systematic drug repositioning. METHODS: To overcome the limitation of connectivity maps on data coverage, we constructed a comprehensive in silico drug-protein connectivity map called DMAP, which contains directed drug-to-protein effects and effect scores. The drug-to-protein effect scores are compiled from all database entries in which an effect between the drug and the protein has been previously observed, and they provide a confidence measure on the quality of such drug-to-protein effects. RESULTS: In DMAP, we have compiled the direct effects between 24,121 PubChem Compound IDs (CIDs), which were mapped from 289,571 chemical entities recognized from public literature, and 5,196 reviewed UniProt proteins. DMAP compiles a total of 438,004 chemical-to-protein effect relationships. Compared to CMAP, DMAP shows a 221-fold increase in the number of chemicals and a 1.92-fold increase in the number of ATC codes. Furthermore, by overlapping DMAP chemicals with the approved drugs with known indications from the TTD database and literature, we obtained 982 drugs and 622 diseases; meanwhile, we obtained only 394 drugs with known indications from CMAP. To validate the feasibility of applying the new DMAP for systematic drug repositioning, we compared the performance of DMAP and the well-known CMAP database on two popular computational techniques: a drug-drug-similarity-based method with leave-one-out validation and a Kolmogorov-Smirnov scoring-based method. In the drug-drug-similarity-based method, the drug repositioning prediction using DMAP achieved an Area-Under-Curve (AUC) score of 0.82, compared with an AUC of 0.64 using CMAP.
For the Kolmogorov-Smirnov scoring-based method, with DMAP, we were able to retrieve several drug indications which could not be retrieved using CMAP. DMAP data can be queried using the existing C2MAP server or downloaded freely at: http://bio.informatics.iupui.edu/cmaps CONCLUSIONS: Reliable measurements of how drugs affect disease-related proteins are critical to ongoing drug development in the genome medicine era. We demonstrated that DMAP can help drug development professionals assess drug-to-protein relationship data and improve chances of success for systematic drug repositioning efforts.
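The abstract above mentions a Kolmogorov-Smirnov scoring-based method without giving its formula. As a rough illustration only, the sketch below computes a signed KS-style enrichment score of the kind commonly used with connectivity maps: it asks whether a query set of items sits near the top or the bottom of a ranked list. The function name, the 1-based rank convention, and the example numbers are assumptions, and this is not claimed to be the scoring used in DMAP.

```python
def ks_enrichment(ranks, n_total):
    """Signed Kolmogorov-Smirnov-style enrichment score.

    ranks   -- 1-based positions of the query items in a ranked list
    n_total -- total number of items in the ranked list
    Returns a positive score when the query items cluster near the top
    of the list and a negative score when they cluster near the bottom.
    """
    v = sorted(ranks)
    t = len(v)
    # Largest amount by which the query's cumulative fraction runs ahead
    # of the position fraction, and the largest amount it lags behind.
    ahead = max((j + 1) / t - v[j] / n_total for j in range(t))
    behind = max(v[j] / n_total - j / t for j in range(t))
    return ahead if ahead > behind else -behind


# Hypothetical example: three query proteins in a ranked list of 100.
top_score = ks_enrichment([1, 2, 3], 100)
bottom_score = ks_enrichment([98, 99, 100], 100)
```

Scores close to +1 or -1 indicate strong clustering at one end of the ranking, while scores near 0 indicate that the query items are spread throughout the list.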
Graft-Versus-Host Disease-Free Antitumoral Signature After Allogeneic Donor Lymphocyte Injection Identified by Proteomics and Systems Biology
(American Society of Clinical Oncology, 2019) Liu, Xiaowen; Yue, Zongliang; Cao, Yimou; Taylor, Lauren; Zhang, Qing; Choi, Sung W.; Hanash, Samir; Ito, Sawa; Chen, Jake Yue; Wu, Huanmei; Paczesny, Sophie; Pediatrics, School of Medicine
PURPOSE: As a tumor immunotherapy, allogeneic hematopoietic cell transplantation with subsequent donor lymphocyte injection (DLI) aims to induce the graft-versus-tumor (GVT) effect but often also leads to acute graft-versus-host disease (GVHD). Plasma tests that can predict the likelihood of GVT without GVHD are still needed. PATIENTS AND METHODS: We first used an intact-protein analysis system to profile the plasma proteome post-DLI of patients who experienced GVT and acute GVHD for comparison with the proteome of patients who experienced GVT without GVHD in a training set. Our novel six-step systems biology analysis involved removing common proteins and GVHD-specific proteins, creating a protein-protein interaction network, calculating relevance and penalty scores, and visualizing candidate biomarkers in gene networks. We then performed a second proteomics experiment in a validation set of patients who experienced GVT without acute GVHD after DLI for comparison with the proteome of patients before DLI. We next combined the two experiments to define a biologically relevant signature of GVT without GVHD. An independent experiment with single-cell profiling in tumor antigen-activated T cells from a patient with post-hematopoietic cell transplantation relapse was performed. RESULTS: The approach provided a list of 46 proteins in the training set, and 30 proteins in the validation set were associated with GVT without GVHD. The combination of the two experiments defined a unique 61-protein signature of GVT without GVHD. Finally, the single-cell profiling in activated T cells found 43 of the 61 genes. Novel markers, such as RPL23, ILF2, CD58, and CRTAM, were identified and could be extended to other antitumoral responses. CONCLUSION: Our multiomic analysis provides, to our knowledge, the first human plasma signature for GVT without GVHD. Risk stratification on the basis of this signature would allow for customized treatment plans.
HAPPI-2: a Comprehensive and High-quality Map of Human Annotated and Predicted Protein Interactions
(BioMed Central, 2017-02-17) Chen, Jake Yue; Pandey, Ragini; Nguyen, Thanh M.; Department of Biohealth Informatics, School of Informatics and Computing
BACKGROUND: Human protein-protein interaction (PPI) data is essential to network and systems biology studies. PPI data can help biochemists hypothesize how proteins form complexes by binding to each other, how extracellular signals propagate through post-translational modification of de-activated signaling molecules, and how chemical reactions are coupled by enzymes involved in a complex biological process. Our capability to develop good public database resources for human PPI data has a direct impact on the quality of future research on genome biology and medicine. RESULTS: The database of Human Annotated and Predicted Protein Interactions (HAPPI) version 2.0 is a major update to the original HAPPI 1.0 database. It contains 2,922,202 unique protein-protein interactions (PPI) linked by 23,060 human proteins, making it the most comprehensive database covering human PPI data today. These PPIs contain both physical/direct interactions and high-quality functional/indirect interactions. Compared with the HAPPI 1.0 database release, HAPPI database version 2.0 (HAPPI-2) represents a 485% increase in human PPI data coverage and a 73% increase in protein coverage. The revamped HAPPI web portal provides users with a friendly search, curation, and data retrieval interface, allowing them to retrieve human PPIs and available annotation information on the interaction type, interaction quality, interacting partner drug targeting data, and disease information. The updated HAPPI-2 can be freely accessed by academic users at http://discovery.informatics.uab.edu/HAPPI . CONCLUSIONS: While the underlying data for HAPPI-2 are integrated from diverse data sources, the new HAPPI-2 release represents a good balance between data coverage and data quality of human PPIs, making it ideally suited for network biology.
An integrated proteomics analysis of bone tissues in response to mechanical stimulation
(2010-07) Li, Jillian; Zhang, Fan; Chen, Jake Yue
Bone cells can sense physical forces and convert mechanical stimulation conditions into biochemical signals that lead to expression of mechanically sensitive genes and proteins. However, it is still poorly understood how genes and proteins in bone cells are orchestrated to respond to mechanical stimulation. In this research, we applied integrated proteomics, statistical, and network biology techniques to study proteome-level changes to bone tissue cells in response to two different conditions, normal loading and fatigue loading. We harvested ulna midshafts and isolated proteins from the control, loaded, and fatigue-loaded rats. Using a label-free liquid chromatography tandem mass spectrometry (LC-MS/MS) experimental proteomics technique, we derived a comprehensive list of 1,058 proteins that are differentially expressed among normal loading, fatigue loading, and controls. By carefully developing protein selection filters and statistical models, we were able to identify 42 proteins representing 21 rat genes that were significantly associated with bone cells' response to quantitative changes between normal loading and fatigue loading conditions. We further applied network biology techniques by building a fatigue loading activated protein-protein interaction subnetwork involving 9 of the human-homolog counterparts of the 21 rat genes in a large connected network component. Our study shows that the combination of a decreased anti-apoptotic factor, Raf1, and an increased pro-apoptotic factor, PDCD8, results in a significant increase in the number of apoptotic osteocytes following fatigue loading. We believe controlling osteoblast differentiation/proliferation and osteocyte apoptosis could be promising directions for developing future therapeutic solutions for related bone diseases.
A method for identifying discriminative isoform-specific peptides for clinical proteomics application
(BioMed Central, 2016-08-22) Zhang, Fan; Chen, Jake Yue; Department of Biohealth Informatics, IU School of Informatics and Computing
BACKGROUND: Clinical proteomics application aims at solving a specific clinical problem within the context of a clinical study. It has been growing rapidly in the field of biomarker discovery, especially in the area of cancer diagnostics. Until recently, protein isoforms have not been viewed as a new class of early diagnostic biomarkers for clinical proteomics. A protein isoform is one of several different forms of the same protein. Different forms of a protein may be produced from single-nucleotide polymorphisms (SNPs), alternative splicing, or post-translational modifications (PTMs). Previous studies have shown that protein isoforms play critical roles in tumorigenesis, disease diagnosis, and prognosis. Identifying and characterizing protein isoforms are essential to the study of molecular mechanisms and early detection of complex diseases such as breast cancer. However, traditional methods used for protein isoform determination, such as EST sequencing, microarray profiling (exon array, exon-exon junction array), and mRNA next-generation sequencing, have limitations: 1) they do not work at the protein level, 2) they give no information about the connectivity of nonadjacent exons, 3) they do not capture SNPs and PTMs, and 4) they have low reproducibility. Moreover, clinical proteomics studies face computational challenges: 1) low sensitivity of instruments, 2) high data noise, and 3) high variability and low repeatability, although a recent advance in clinical proteomics technology, LC-MS/MS proteomics, has been used to identify candidate molecular biomarkers in a diverse range of samples, including cells, tissues, serum/plasma, and other types of body fluids. RESULTS: In this paper, we presented a peptidomics method for identifying cancer-related and isoform-specific peptides for clinical proteomics application from LC-MS/MS.
First, we built a Peptidomic Database of Human Protein Isoforms, then created a peptidomics approach to perform a large-scale screen of breast cancer-associated alternative splicing isoform markers in clinical proteomics, and lastly performed four kinds of validations: biological validation (explainable index), exon array, statistical validation of independent samples, and extensive pathway analysis. CONCLUSIONS: Our results showed that alternative splicing isoform markers can act as independent markers of breast cancer and that the method for identifying cancer-specific protein isoform biomarkers from clinical proteomics application is an effective one for increasing the number of identified alternative splicing isoform markers in clinical proteomics.
MicroRNA Expression Profiling of Human Respiratory Epithelium Affected by Invasive Candida Infection
(Public Library of Science, 2015) Muhammad, Syed Aun; Fatima, Nighat; Syed, Nawazish-I.-Husain; Wu, Xiaogang; Yang, X. Frank; Chen, Jake Yue; IU School of Informatics and Computing
Invasive candidiasis is a potentially life-threatening systemic fungal infection caused by Candida albicans (C. albicans). Candida enters the bloodstream and disseminates throughout the body, and the infection is often observed in hospitalized patients, immunocompromised individuals or those with chronic diseases. This infection is opportunistic, and the risk starts with the colonization of C. albicans on mucocutaneous surfaces and respiratory epithelium. MicroRNAs (miRNAs) are small non-coding RNAs which are involved in the regulation of virtually every cellular process. They regulate and control the levels of mRNA stability and post-transcriptional gene expression. Aberrant expression of miRNAs has been associated with many disease states, and miRNA-based therapies are in progress. In this study, we investigated possible variations of miRNA expression profiles of respiratory epithelial cells infected by invasive Candida species. For this purpose, respiratory epithelial tissues of infected individuals from a hospital laboratory were accessed before their treatment. Invasive Candida infection was confirmed by isolation of Candida albicans from the blood cultures of the same infected individuals. The purity of epithelial tissues was assessed by flow cytometry (FACSCalibur cytometer; BD Biosciences, Heidelberg, Germany) using statin antibody (S-44). TaqMan quantitative real-time PCR (in a TaqMan Low Density Array format) was used for miRNA expression profiling. Of the miRNAs investigated, the expression levels of 55 miRNAs were significantly altered in infected tissues. Some miRNAs showed a dramatic increase (miR-16-1) or decrease (miR-17-3p) in expression as compared to control. Gene ontology enrichment analysis of these miRNA-targeted genes suggests that Candidal infection affects many important biological pathways.
In summary, disturbance in miRNA expression levels indicated changes in the cascade of pathological processes and the regulation of respiratory epithelial functions following invasive Candidal infection. These findings contribute to our understanding of host cell response to Candidal systemic infections.
A new approach to construct pathway connected networks and its application in dose responsive gene expression profiles of rat liver regulated by 2,4DNT
(BMC, 2010-12-01) Chowbina, Sudhir; Deng, Youping; Ai, Junmei; Wu, Xiaogang; Guan, Xin; Wilbanks, Mitchell S.; Escalon, Barbara Lynn; Meyer, Sharon A.; Perkins, Edward J.; Chen, Jake Yue; BioHealth Informatics, School of Informatics and Computing
Military and industrial activities have led to reported release of 2,4-dinitrotoluene (2,4DNT) into soil, groundwater or surface water. It has been reported that 2,4DNT can induce toxic effects on humans and other organisms. However, the mechanism of 2,4DNT-induced toxicity is still unclear. Although a series of methods for gene network construction have been developed, few instances of applying such technology to generate pathway connected networks have been reported. Results: Microarray analyses were conducted using liver tissue of rats collected 24 h after exposure to a single oral gavage with one of five concentrations of 2,4DNT. We observed a strong dose response of differentially expressed genes after 2,4DNT treatment. The most affected pathways included long term depression, breast cancer regulation by stathmin1, WNT signaling, and PI3K signaling pathways. In addition, we propose a new approach to construct pathway connected networks regulated by 2,4DNT. We also observed clear dose response pathway networks regulated by 2,4DNT. Conclusions: We developed a new method for constructing pathway connected networks. This new method was successfully applied to microarray data from liver tissue of 2,4DNT-exposed animals and resulted in the identification of unique dose responsive biomarkers with regard to affected pathways.