Browsing by Subject "Multi-omics"
Item Artificial intelligence framework identifies candidate targets for drug repurposing in Alzheimer’s disease (BMC, 2022-01-10) Fang, Jiansong; Zhang, Pengyue; Wang, Quan; Chiang, Chien‑Wei; Zhou, Yadi; Hou, Yuan; Xu, Jielin; Chen, Rui; Zhang, Bin; Lewis, Stephen J.; Leverenz, James B.; Pieper, Andrew A.; Li, Bingshan; Li, Lang; Cummings, Jeffrey; Cheng, Feixiong; Biostatistics and Health Data Science, School of Medicine
Background: Genome-wide association studies (GWAS) have identified numerous susceptibility loci for Alzheimer's disease (AD). However, utilizing GWAS and multi-omics data to identify high-confidence AD risk genes (ARGs) and druggable targets that can guide development of new therapeutics for patients suffering from AD has heretofore not been successful. Methods: To address this critical problem in the field, we have developed a network-based artificial intelligence framework that is capable of integrating multi-omics data along with human protein-protein interactome networks to accurately infer drug targets impacted by GWAS-identified variants and thereby identify new therapeutics. When applied to AD, this approach integrates GWAS findings, multi-omics data from brain samples of AD patients and AD transgenic animal models, drug-target networks, and the human protein-protein interactome, along with large-scale patient database validation and in vitro mechanistic observations in human microglia cells. Results: Through this approach, we identified 103 ARGs validated by various levels of pathobiological evidence in AD. Via network-based prediction and population-based validation, we then showed that three drugs (pioglitazone, febuxostat, and atenolol) are significantly associated with decreased risk of AD compared with matched control populations. Pioglitazone usage is significantly associated with decreased risk of AD (hazard ratio (HR) = 0.916, 95% confidence interval [CI] 0.861-0.974, P = 0.005) in a retrospective case-control validation.
Pioglitazone is a peroxisome proliferator-activated receptor (PPAR) agonist used to treat type 2 diabetes, and propensity score matching cohort studies confirmed its association with reduced risk of AD in comparison to glipizide (HR = 0.921, 95% CI 0.862-0.984, P = 0.0159), an insulin secretagogue that is also used to treat type 2 diabetes. In vitro experiments showed that pioglitazone downregulated glycogen synthase kinase 3 beta (GSK3β) and cyclin-dependent kinase 5 (CDK5) in human microglia cells, supporting a possible mechanism-of-action for its beneficial effect in AD. Conclusions: In summary, we present an integrated, network-based artificial intelligence methodology to rapidly translate GWAS findings and multi-omics data to genotype-informed therapeutic discovery in AD.

Item Cox-sMBPLS: An Algorithm for Disease Survival Prediction and Multi-Omics Module Discovery Incorporating Cis-Regulatory Quantitative Effects (Frontiers Media, 2021-08-02) Vahabi, Nasim; McDonough, Caitrin W.; Desai, Ankit A.; Cavallari, Larisa H.; Duarte, Julio D.; Michailidis, George; Medicine, School of Medicine
Background: The development of high-throughput techniques has enabled profiling a large number of biomolecules across a number of molecular compartments. The challenge then becomes to integrate such multimodal Omics data to gain insights into biological processes and disease onset and progression mechanisms. Further, given the high dimensionality of such data, incorporating prior biological information on interactions between molecular compartments when developing statistical models for data integration is beneficial, especially in settings involving a small number of samples. Results: We develop a supervised model for time to event data (e.g., death, biochemical recurrence) that simultaneously accounts for redundant information within Omics profiles and leverages prior biological associations between them through a multi-block PLS framework.
The interactions between data from different molecular compartments (e.g., epigenome, transcriptome, methylome) were captured by using cis-regulatory quantitative effects in the proposed model. The model, coined Cox-sMBPLS, exhibits superior prediction performance and improved feature selection based on both simulation studies and analysis of data from heart failure patients. Conclusion: The proposed supervised Cox-sMBPLS model can effectively incorporate prior biological information in the survival prediction system, leading to improved prediction performance and feature selection. It also enables the identification of multi-Omics modules of biomolecules that impact the patients' survival probability and provides insights into potentially relevant risk factors that merit further investigation.

Item Integrated Correlation Analysis of Proteomics and Transcriptomics Data in Alzheimer's Disease (2020-12) Modekurty, Suneeta; Liu, Xiaowen; Wan, Jun; Zheng, Jiaping
We wanted to see whether significant correlations exist between two -omics layers. To that end, we performed a correlation analysis to study the disease. The pipeline first performed differential expression analysis on two datasets (proteomics and transcriptomics) individually. An in-depth analysis of the proteomics data was performed, followed by differential expression analysis of RNA seq data and then a correlation analysis of the differentially expressed proteins (from proteomics data) and genes (from RNA seq data). From our analysis, we found informative correlations between proteins and genes in AD. We performed a correlation analysis of AD (N = 84), Control (N = 31), and PSP (N = 85) samples for proteomics data and got 114 differentially expressed proteins (DEPs = 114). The RNA seq data had AD (N = 82), Control (N = 31) and PSP (N = 84) samples, which gave us 61 differentially expressed genes (DEGs = 61).
A correlation analysis using Spearman’s correlation coefficient method between proteins involved in AD revealed 192 very significant correlations with p-value <= 0.00000000000005. The mean correlation coefficient was quite high (r = 0.52). A correlation analysis using Spearman’s correlation coefficient method between genes involved in AD revealed 208 very significant correlations with p-value <= 0.00000000000005. The mean correlation coefficient was quite high (r = 0.52). A correlation analysis using Spearman’s correlation coefficient method between proteins and genes involved in AD revealed 395 significant correlations with p-value <= 0.0001. The correlation coefficient was quite high (+0.53); these correlations might help in understanding the molecular pathways behind the disease and could open new prospects for understanding the disease and designing treatments. We observed that different genes interact with different proteins (correlation coefficient r >= 0.5, p-value < 0.05). We also observed that a single protein interacts with multiple genes, and a single gene is associated with multiple proteins. The patterns of correlations also differ, in that a protein/gene correlates positively with some proteins/genes and negatively with others. We hope that this observation proves useful. However, understanding how these proteins and genes interact with each other requires further assessment at the molecular level.

Item Integrative analysis of multi-omics and imaging data with incorporation of biological information via structural Bayesian factor analysis (Oxford University Press, 2023) Bao, Jingxuan; Chang, Changgee; Zhang, Qiyiwen; Saykin, Andrew J.; Shen, Li; Long, Qi; Alzheimer’s Disease Neuroimaging Initiative; Radiology and Imaging Sciences, School of Medicine
Motivation: With the rapid development of modern technologies, massive data are available for the systematic study of Alzheimer's disease (AD).
Though many existing AD studies mainly focus on single-modality omics data, multi-omics datasets can provide a more comprehensive understanding of AD. To bridge this gap, we proposed a novel structural Bayesian factor analysis framework (SBFA) to extract the information shared by multi-omics data through the aggregation of genotyping data, gene expression data, neuroimaging phenotypes and prior biological network knowledge. Our approach can extract common information shared by different modalities and encourage biologically related features to be selected, guiding future AD research in a biologically meaningful way. Method: Our SBFA model decomposes the mean parameters of the data into a sparse factor loading matrix and a factor matrix, where the factor matrix represents the common information extracted from multi-omics and imaging data. Our framework is designed to incorporate prior biological network information. Our simulation study demonstrated that our proposed SBFA framework could achieve the best performance compared with the other state-of-the-art factor-analysis-based integrative analysis methods. Results: We apply our proposed SBFA model together with several state-of-the-art factor analysis models to extract the latent common information from genotyping, gene expression and brain imaging data simultaneously from the ADNI biobank database. The latent information is then used to predict the functional activities questionnaire score, an important measurement for diagnosis of AD quantifying subjects' abilities in daily life. Our SBFA model shows the best prediction performance compared with the other factor analysis models. Availability: Code is publicly available at https://github.com/JingxuanBao/SBFA.

Item Multi-omics cannot replace sample size in genome-wide association studies (Wiley, 2023) Baranger, David A. A.; Hatoum, Alexander S.; Polimanti, Renato; Gelernter, Joel; Edenberg, Howard J.; Bogdan, Ryan; Agrawal, Arpana; Biochemistry and Molecular Biology, School of Medicine
The integration of multi-omics information (e.g., epigenetics and transcriptomics) can be useful for interpreting findings from genome-wide association studies (GWAS). It has been suggested that multi-omics could circumvent or greatly reduce the need to increase GWAS sample sizes for novel variant discovery. We tested whether incorporating multi-omics information in earlier and smaller-sized GWAS boosts true-positive discovery of genes that were later revealed by larger GWAS of the same/similar traits. We applied 10 different analytic approaches to integrating multi-omics data from 12 sources (e.g., Genotype-Tissue Expression project) to test whether earlier and smaller GWAS of 4 brain-related traits (alcohol use disorder/problematic alcohol use, major depression/depression, schizophrenia, and intracranial volume/brain volume) could detect genes that were revealed by a later and larger GWAS. Multi-omics data did not reliably identify novel genes in earlier less-powered GWAS (PPV < 0.2; 80% false-positive associations). Machine learning predictions marginally increased the number of identified novel genes, correctly identifying 1-8 additional genes, but only for well-powered early GWAS of highly heritable traits (i.e., intracranial volume and schizophrenia). Although multi-omics, particularly positional mapping (i.e., fastBAT, MAGMA, and H-MAGMA), can help to prioritize genes within genome-wide significant loci (PPVs = 0.5-1.0) and translate them into information about disease biology, it does not reliably increase novel gene discovery in brain-related GWAS.
To increase power for discovery of novel genes and loci, increasing sample size is required.

Item SALMON: Survival Analysis Learning With Multi-Omics Neural Networks on Breast Cancer (Frontiers Media, 2019-03-08) Huang, Zhi; Zhan, Xiaohui; Xiang, Shunian; Johnson, Travis S.; Helm, Bryan; Yu, Christina Y.; Zhang, Jie; Salama, Paul; Rizkalla, Maher; Han, Zhi; Huang, Kun; Department of Medicine, Indiana University School of Medicine
Improved cancer prognosis is a central goal for precision health medicine. Though many models can predict differential survival from data, there is a strong need for sophisticated algorithms that can aggregate and filter relevant predictors from increasingly complex data inputs. In turn, these models should provide deeper insight into which types of data are most relevant to improve prognosis. Deep Learning-based neural networks offer a potential solution for both problems because they are highly flexible and account for data complexity in a non-linear fashion. In this study, we implement Deep Learning-based networks to determine how gene expression data predicts Cox regression survival in breast cancer. We accomplish this through an algorithm called SALMON (Survival Analysis Learning with Multi-Omics Neural Networks), which aggregates and simplifies gene expression data and cancer biomarkers to enable prognosis prediction. The results revealed improved performance when more omics data were used in model construction. Rather than use raw gene expression values as model inputs, we innovatively use eigengene modules from the result of gene co-expression network analysis. The corresponding high-impact co-expression modules and other omics data are identified by a feature selection technique and then examined through enrichment analysis of their biological functions, which elevates the interpretation of input features from the gene level to the co-expression module level.
Our study shows the feasibility of discovering breast cancer related co-expression modules and sketches a blueprint for future endeavors on Deep Learning-based survival analysis. SALMON source code is available at https://github.com/huangzhii/SALMON/.

Item Unraveling the Multi-omic Network and Pathway Alterations in Alzheimer's Disease (2024-08) Xie, Linhui; Salama, Paul; Yan, Jingwen; Rizkalla, Maher; Ben Miled, Zina; Saykin, Andrew J.
Multi-omic studies, ranging from genomics and transcriptomics (e.g., gene expression) to proteomics data exploration, have been widely applied to interpret findings from genome-wide association studies (GWAS) of Alzheimer's disease (AD). However, previous studies examine each -omics data type individually, and the functional interactions between genetic variations, genes and proteins are only used after discovery to interpret the findings, but not beforehand. In this case, multi-omic findings are likely not functionally related, which makes the results challenging to interpret. To handle this challenge, we present a new modularity-constrained least absolute shrinkage and selection operator (M-LASSO), a new modularity-constrained logistic regression (M-Logistic), a new interpretable multi-omic graph fusion neural network model (MoFNet) and a new transfer learning framework integrated graph fusion neural network model (TransFuse) to integrate prior biological knowledge to model the functional interactions of multi-omic data. These approaches aim to identify functionally connected sub-networks predictive of AD. In this thesis, the interpretable models MoFNet and TransFuse incorporate a prior biologically connected multi-omics network and, for the first time, model the dynamic information flow from deoxyribonucleic acid (DNA) to ribonucleic acid (RNA) and proteins.
When applied to multi-omic data from the Religious Orders Study/Memory and Aging Project (ROS/MAP) cohort, MoFNet and TransFuse outperformed all other state-of-the-art classifiers. Instead of targeting individual markers, the proposed methods identified multi-omic sub-networks associated with AD. MoFNet and TransFuse produced sub-network and pathway findings that were robustly validated in another independent cohort. These identified gene/protein networks highlight potential pathways involved in AD pathogenesis and could offer a systematic overview for understanding the molecular mechanisms of the disease. Investigating these identified pathways in more detail could help uncover the mechanisms causing synaptic dysfunction in AD and guide future research into potential therapeutic targets.