Browsing by Author "Yao, Xiaohui"
Associating Multi-modal Brain Imaging Phenotypes and Genetic Risk Factors via A Dirty Multi-task Learning Method
(IEEE, 2020) Du, Lei; Liu, Fang; Liu, Kefei; Yao, Xiaohui; Risacher, Shannon L.; Han, Junwei; Saykin, Andrew J.; Shen, Li; Radiology and Imaging Sciences, School of Medicine
Brain imaging genetics, which integrates genetic variations and brain structures or functions to study the genetic basis of brain disorders, has become increasingly important in brain science. The multi-modal imaging data collected by different technologies, which measure the same brain in distinct ways, might carry complementary information. Unfortunately, we do not know the extent to which the phenotypic variance is shared among multiple imaging modalities, which might further trace back to the complex genetic mechanism. In this paper, we propose a novel dirty multi-task sparse canonical correlation analysis (SCCA) to study imaging genetic problems involving multi-modal brain imaging quantitative traits (QTs). The proposed method takes advantage of multi-task learning and parameter decomposition. It can identify not only the shared imaging QTs and genetic loci across multiple modalities but also the modality-specific imaging QTs and genetic loci, exhibiting a flexible capability of identifying complex multi-SNP-multi-QT associations. Compared with the state-of-the-art multi-view SCCA and multi-task SCCA, the proposed method shows better or comparable canonical correlation coefficients and canonical weights on both synthetic and real neuroimaging genetic data.
In addition, the identified modality-consistent biomarkers, as well as the modality-specific biomarkers, provide meaningful and interesting information, demonstrating that dirty multi-task SCCA could be a powerful alternative method in multi-modal brain imaging genetics.

Detecting genetic associations with brain imaging phenotypes in Alzheimer’s disease via a novel structured SCCA approach
(Elsevier, 2020-04) Du, Lei; Liu, Kefei; Yao, Xiaohui; Risacher, Shannon L.; Han, Junwei; Saykin, Andrew J.; Guo, Lei; Shen, Li; Radiology and Imaging Sciences, School of Medicine
Brain imaging genetics has become an important research topic because it can reveal complex associations between genetic factors and the structures or functions of the human brain. Sparse canonical correlation analysis (SCCA) is a popular bi-multivariate association identification method. To mine the complex genetic basis of brain imaging phenotypes, many SCCA methods have arisen with a variety of norms for incorporating different structures of interest. They often use the group lasso penalty, the fused lasso penalty, or the graph/network guided fused lasso penalty. However, the group lasso methods have limited capability because prior knowledge is often incomplete or unavailable in real applications. The fused lasso and graph/network guided methods are sensitive to the sign of the sample correlation, which may be incorrectly estimated. In this paper, we introduce two new penalties to improve the fused lasso and the graph/network guided lasso penalties in structured sparse learning. We impose both penalties on the SCCA model and propose an optimization algorithm to solve it. The proposed SCCA method has a strong upper bound of grouping effects for both positively and negatively highly correlated variables. We show that, on both synthetic and real neuroimaging genetics data, the proposed SCCA method performs better than or comparably to the conventional methods using fused lasso or graph/network guided fused lasso.
In particular, the proposed method identifies higher canonical correlation coefficients and captures clearer canonical weight patterns, demonstrating its promising capability in revealing biologically meaningful imaging genetic associations.

Diagnosis-guided method for identifying multi-modality neuroimaging biomarkers associated with genetic risk factors in Alzheimer’s disease
(eProceedings, 2016) Hao, Xiaoke; Yan, Jingwen; Yao, Xiaohui; Risacher, Shannon L.; Saykin, Andrew J.; Zhang, Daoqiang; Shen, Li; Department of Radiology and Imaging Sciences, IU School of Medicine
Many recent imaging genetic studies focus on detecting the associations between genetic markers such as single nucleotide polymorphisms (SNPs) and quantitative traits (QTs). Although there exist a large number of generalized multivariate regression analysis methods, few of them have used subjects' diagnosis information to enhance the analysis performance. In addition, few models in traditional methods have investigated the identification of multi-modality phenotypic patterns associated with interesting genotype groups. To reveal disease-relevant imaging genetic associations, we propose a novel diagnosis-guided multi-modality (DGMM) framework to discover multi-modality imaging QTs that are associated with both Alzheimer's disease (AD) and its top genetic risk factor (i.e., APOE SNP rs429358). The strength of our proposed method is that it explicitly models the prior diagnosis information among subjects in the objective function for selecting the disease-relevant and robust multi-modality QTs associated with the SNP. We evaluate our method on two modalities of imaging phenotypes, i.e., those extracted from structural magnetic resonance imaging (MRI) data and fluorodeoxyglucose positron emission tomography (FDG-PET) data in the Alzheimer's Disease Neuroimaging Initiative (ADNI) database.
The experimental results demonstrate that, compared with other reference methods, our proposed method not only achieves better performance under the metrics of root mean squared error and correlation coefficient but also identifies common informative regions of interest (ROIs) across multiple modalities to guide the disease-induced biological interpretation.

Fast Multi-Task SCCA Learning with Feature Selection for Multi-Modal Brain Imaging Genetics
(IEEE Xplore, 2019-01-24) Du, Lei; Liu, Kefei; Yao, Xiaohui; Risacher, Shannon L.; Han, Junwei; Guo, Lei; Saykin, Andrew J.; Shen, Li; Radiology and Imaging Sciences, School of Medicine
Brain imaging genetics studies the genetic basis of brain structures and functions by integrating genotypic data such as single nucleotide polymorphisms (SNPs) with imaging quantitative traits (QTs). In this area, both multi-task learning (MTL) and sparse canonical correlation analysis (SCCA) methods are widely used since they are superior to independent and pairwise univariate analyses. MTL methods generally incorporate only a few QTs and are not designed for feature selection from a large number of QTs, while existing SCCA methods typically employ only one modality of QTs to study its association with SNPs. Both MTL and SCCA encounter computational challenges as the number of SNPs increases. In this paper, combining the merits of MTL and SCCA, we propose a novel multi-task SCCA (MTSCCA) learning framework to identify bi-multivariate associations between SNPs and multi-modal imaging QTs. MTSCCA can make use of the complementary information carried by different imaging modalities. Using the G2,1-norm regularization, MTSCCA treats all SNPs in the same group together to enforce sparsity at the group level. The l2,1-norm penalty is used to jointly select features across multiple tasks for SNPs, and across multiple modalities for QTs. A fast optimization algorithm is proposed using the grouping information of SNPs.
Compared with conventional SCCA methods, MTSCCA obtains improved performance regarding both correlation coefficients and canonical weight patterns. In addition, our method runs very fast and is easy to implement, and thus could provide a powerful tool for genome-wide brain-wide imaging genetic studies.

Genetic Influence Underlying Brain Connectivity Phenotype: A Study on Two Age-Specific Cohorts
(Frontiers Media, 2022-02-07) Cong, Shan; Yao, Xiaohui; Xie, Linhui; Yan, Jingwen; Shen, Li; Alzheimer’s Disease Neuroimaging Initiative; Biomedical Engineering and Informatics, Luddy School of Informatics, Computing, and Engineering
Background: Human brain structural connectivity is an important imaging quantitative trait for brain development and aging. Mapping the network connectivity to the phenotypic variation provides fundamental insights into the relationship between detailed brain topological architecture, function, and dysfunction. However, the underlying neurobiological mechanism from gene to brain connectome, and to phenotypic outcomes, and whether this mechanism changes over time, remain unclear. Methods: This study analyzes diffusion-weighted imaging data from two age-specific neuroimaging cohorts, extracts structural connectome topological network measures, performs genome-wide association studies of the measures, and examines the causality of genetic influences on phenotypic outcomes mediated via connectivity measures. Results: Our empirical study has yielded several significant findings: 1) It identified genetic makeup underlying structural connectivity changes in the human brain connectome for both age groups. Specifically, it revealed a novel association between the minor allele (G) of rs7937515 and the decreased network segregation measures of the left middle temporal gyrus across young and elderly adults, indicating a consistent genetic effect on brain connectivity across the lifespan.
2) It revealed rs7937515 as a genetic marker for body mass index in young adults but not in elderly adults. 3) It discovered brain network segregation alterations as a potential neuroimaging biomarker for obesity. 4) It demonstrated the hemispheric asymmetry of structural network organization in genetic association analyses and outcome-relevant studies. Discussion: These imaging genetic findings underlying brain connectome warrant further investigation for exploring their potential influences on brain-related complex diseases, given the significant involvement of altered connectivity in neurological, psychiatric and physical disorders.

Genetic studies of quantitative MCI and AD phenotypes in ADNI: Progress, opportunities, and plans
(Elsevier, 2015-07) Saykin, Andrew J.; Shen, Li; Yao, Xiaohui; Kim, Sungeun; Nho, Kwangsik; Risacher, Shannon L.; Ramanan, Vijay K.; Foroud, Tatiana M.; Faber, Kelly M.; Sarwar, Nadeem; Munsie, Leanne M.; Hu, Xiaolan; Soares, Holly D.; Potkin, Steven G.; Thompson, Paul M.; Kauwe, John S. K.; Kaddurah-Daouk, Rima; Green, Robert C.; Toga, Arthur W.; Weiner, Michael W.; Alzheimer's Disease Neuroimaging Initiative; Department of Radiology and Imaging Sciences, IU School of Medicine
INTRODUCTION: Genetic data from the Alzheimer's Disease Neuroimaging Initiative (ADNI) have been crucial in advancing the understanding of Alzheimer's disease (AD) pathophysiology. Here, we provide an update on sample collection, scientific progress and opportunities, conceptual issues, and future plans. METHODS: Lymphoblastoid cell lines and DNA and RNA samples from blood have been collected and banked, and data and biosamples have been widely disseminated. To date, APOE genotyping, genome-wide association study (GWAS), and whole exome and whole genome sequencing data have been obtained and disseminated.
RESULTS: ADNI genetic data have been downloaded thousands of times, and >300 publications have resulted, including reports of large-scale GWAS by consortia to which ADNI contributed. Many of the first applications of quantitative endophenotype association studies used ADNI data, including some of the earliest GWAS and pathway-based studies of biospecimen and imaging biomarkers, as well as memory and other clinical/cognitive variables. Other contributions include some of the first whole exome and whole genome sequencing data sets and reports in healthy controls, mild cognitive impairment, and AD. DISCUSSION: Numerous genetic susceptibility and protective markers for AD and disease biomarkers have been identified and replicated using ADNI data and have heavily implicated immune, mitochondrial, cell cycle/fate, and other biological processes. Early sequencing studies suggest that rare and structural variants are likely to account for significant additional phenotypic variation. Longitudinal analyses of transcriptomic, proteomic, metabolomic, and epigenomic changes will also further elucidate dynamic processes underlying preclinical and prodromal stages of disease. Integration of this unique collection of multiomics data within a systems biology framework will help to separate truly informative markers of early disease mechanisms and potential novel therapeutic targets from the vast background of less relevant biological processes. 
Fortunately, a broad swath of the scientific community has accepted this grand challenge.

Genome-wide Network-assisted Association and Enrichment Study of Amyloid Imaging Phenotype in Alzheimer's Disease
(Bentham Science, 2019) Li, Jin; Chen, Feng; Zhang, Qiushi; Meng, Xianglian; Yao, Xiaohui; Risacher, Shannon L.; Yan, Jingwen; Saykin, Andrew J.; Liang, Hong; Shen, Li; Radiology and Imaging Sciences, School of Medicine
Background: The etiology of Alzheimer's disease remains poorly understood at the mechanistic level, and genome-wide network-based genetics have the potential to provide new insights into the disease mechanisms. Objective: The study aimed to explore the collective effects of multiple genetic association signals on an AV-45 PET measure, which is a well-known Alzheimer's disease biomarker, by employing a network-assisted strategy. Methods: First, we took advantage of a dense module search algorithm to identify modules enriched by genetic association signals in a protein-protein interaction network. Next, we performed statistical evaluation of the modules identified by dense module search, including a normalization process to adjust for the topological bias in the network, a replication test to ensure the modules were not found by chance, and a permutation test to evaluate unbiased associations between the modules and the amyloid imaging phenotype. Finally, topological analysis, module similarity tests and functional enrichment analysis were performed for the identified modules. Results: We identified 24 consensus modules enriched by robust genetic signals in a genome-wide association analysis. The results not only validated several previously reported AD genes (APOE, APP, TOMM40, DDAH1, PARK2, ATP5C1, PVRL2, ELAVL1, ACTN1 and NRF1), but also nominated a few novel genes (ABL1, ABLIM2) that have not been studied in Alzheimer's disease but have shown associations with other neurodegenerative diseases.
Conclusion: The identified genes, consensus modules and enriched pathways may provide important clues to future research on the neurobiology of Alzheimer's disease and suggest potential therapeutic targets.

Genome-wide network-based pathway analysis of CSF t-tau/Aβ1-42 ratio in the ADNI cohort
(Springer Nature, 2017-05-30) Cong, Wang; Meng, Xianglian; Li, Jin; Zhang, Qiushi; Chen, Feng; Liu, Wenjie; Wang, Ying; Cheng, Sipu; Yao, Xiaohui; Yan, Jingwen; Kim, Sungeun; Saykin, Andrew J.; Liang, Hong; Shen, Li; Alzheimer’s Disease Neuroimaging Initiative; Radiology and Imaging Sciences, School of Medicine
BACKGROUND: The cerebrospinal fluid (CSF) levels of total tau (t-tau) and Aβ1-42 are potential early diagnostic markers for probable Alzheimer's disease (AD). The influence of genetic variation on these CSF biomarkers has been investigated in candidate or genome-wide association studies (GWAS). However, the investigation of statistically modest associations in GWAS in the context of biological networks is still an under-explored topic in AD studies. The main objective of this study is to gain further biological insights via the integration of statistical gene associations in AD with physical protein interaction networks. RESULTS: The CSF and genotyping data of 843 study subjects (199 CN, 85 SMC, 239 EMCI, 207 LMCI, 113 AD) from the Alzheimer's Disease Neuroimaging Initiative (ADNI) were analyzed. PLINK was used to perform GWAS on the t-tau/Aβ1-42 ratio using quality-controlled genotype data, including 563,980 single nucleotide polymorphisms (SNPs), with age, sex and diagnosis as covariates. Gene-level p-values were obtained by VEGAS2. Genes with p-value ≤ 0.05 were mapped onto a protein-protein interaction (PPI) network (9,617 nodes, 39,240 edges, from the HPRD Database). We integrated a consensus model strategy into the iPINBPA network analysis framework, and named it CM-iPINBPA.
Four consensus modules (CMs) were discovered by CM-iPINBPA, and were functionally annotated using the pathway analysis tool Enrichr. The intersection of the four CMs forms a common subnetwork of 29 genes, including those related to tau phosphorylation (GSK3B, SUMO1, AKAP5, CALM1 and DLG4), amyloid beta production (CASP8, PIK3R1, PPA1, PARP1, CSNK2A1, NGFR, and RHOA), and AD (BCL3, CFLAR, SMAD1, and HIF1A). CONCLUSIONS: This study coupled a consensus module (CM) strategy with the iPINBPA network analysis framework, and applied it to the GWAS of CSF t-tau/Aβ1-42 ratio in an AD study. The genome-wide network analysis yielded 4 enriched CMs that share not only genes related to tau phosphorylation or amyloid beta production but also multiple genes enriching several KEGG pathways such as Alzheimer's disease, colorectal cancer, gliomas, renal cell carcinoma, Huntington's disease, and others. This study demonstrated that integration of gene-level associations with CMs could yield statistically significant findings to offer valuable biological insights (e.g., functional interaction among the protein products of these genes) and suggest high-confidence candidates for subsequent analyses.

GPU Accelerated Browser for Neuroimaging Genomics
(Springer, 2018-10) Zigon, Bob; Li, Huang; Yao, Xiaohui; Fang, Shiaofen; Hasan, Mohammad Al; Yan, Jingwen; Moore, Jason H.; Saykin, Andrew J.; Shen, Li; Alzheimer’s Disease Neuroimaging Initiative; Computer and Information Science, School of Science
Neuroimaging genomics is an emerging field that provides exciting opportunities to understand the genetic basis of brain structure and function. The unprecedented scale and complexity of the imaging and genomics data, however, have presented critical computational bottlenecks. In this work we present our initial efforts towards building an interactive visual exploratory system for mining big data in neuroimaging genomics.
A GPU-accelerated browsing tool for neuroimaging genomics is created that implements the ANOVA algorithm for single nucleotide polymorphism (SNP) based analysis and the VEGAS algorithm for gene-based analysis, and executes them at interactive rates. The ANOVA algorithm is 110 times faster than the 4-core OpenMP version, while the VEGAS algorithm is 375 times faster than its 4-core OpenMP counterpart. This approach lays a solid foundation for researchers to address the challenges of mining large-scale imaging genomics datasets via interactive visual exploration.

Identification of associations between genotypes and longitudinal phenotypes via temporally-constrained group sparse canonical correlation analysis
(Oxford, 2017-07) Hao, Xiaoke; Li, Chanxiu; Yan, Jingwen; Yao, Xiaohui; Risacher, Shannon L.; Saykin, Andrew J.; Shen, Li; Zhang, Daoqiang; Radiology and Imaging Sciences, School of Medicine
Motivation: Neuroimaging genetics identifies the relationships between genetic variants (i.e., the single nucleotide polymorphisms) and brain imaging data to reveal the associations from genotypes to phenotypes. So far, most existing machine-learning approaches detect associations between genetic variants and brain imaging data at a single time point. However, those associations are based on static phenotypes and ignore the temporal dynamics of the phenotypical changes. The phenotypes across multiple time points may exhibit temporal patterns that can be used to facilitate the understanding of the degenerative process. In this article, we propose a novel temporally constrained group sparse canonical correlation analysis (TGSCCA) framework to identify genetic associations with longitudinal phenotypic markers.
Results: The proposed TGSCCA method is able to capture the temporal changes in the brain from longitudinal phenotypes by incorporating the fused penalty, which requires that the differences between two consecutive canonical weight vectors from adjacent time points be small. A new efficient optimization algorithm is designed to solve the objective function. Furthermore, we demonstrate the effectiveness of our algorithm on both synthetic and real data (i.e., the Alzheimer’s Disease Neuroimaging Initiative cohort, including progressive mild cognitive impairment, stable MCI and normal control participants). In comparison with conventional SCCA, our proposed method can achieve strong associations and discover phenotypic biomarkers across multiple time points to guide disease-progressive interpretation.
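
The fused penalty mentioned in the TGSCCA abstract can be illustrated with a small sketch. The abstract does not state which norm the authors use, so the L1 distance below is an assumption, and the function name `temporal_fused_penalty` and the example weight vectors are hypothetical, chosen only for illustration. The idea shown is simply that the penalty grows when canonical weight vectors at adjacent time points differ more.

```python
def temporal_fused_penalty(weights):
    """Sum of L1 distances between weight vectors at adjacent time points.

    `weights` is a list of equal-length lists, one weight vector per time
    point. The L1 norm is an assumption; the abstract does not specify it.
    """
    total = 0.0
    for prev, curr in zip(weights, weights[1:]):
        if len(prev) != len(curr):
            raise ValueError("all weight vectors must have the same length")
        total += sum(abs(c - p) for p, c in zip(prev, curr))
    return total


# Hypothetical weight vectors for three time points (illustrative values only).
smooth = [[0.5, 0.0, 1.0], [0.5, 0.0, 1.0], [0.5, 0.25, 1.0]]
jumpy = [[0.5, 0.0, 1.0], [-0.5, 1.0, 0.0], [0.5, 0.0, 1.0]]

print(temporal_fused_penalty(smooth))  # small: weights barely change over time
print(temporal_fused_penalty(jumpy))   # large: weights flip between time points
```

In a penalized objective, a term like this would be weighted by a tuning parameter and added to the loss, so that solutions with smoothly varying weights across time points are preferred.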