ScholarWorksIndianapolis
  • Communities & Collections
  • Browse ScholarWorks
  • English
  • Català
  • Čeština
  • Deutsch
  • Español
  • Français
  • Gàidhlig
  • Italiano
  • Latviešu
  • Magyar
  • Nederlands
  • Polski
  • Português
  • Português do Brasil
  • Suomi
  • Svenska
  • Türkçe
  • Tiếng Việt
  • Қазақ
  • বাংলা
  • हिंदी
  • Ελληνικά
  • Yкраї́нська
  • Log In
    or
    New user? Click here to register.Have you forgotten your password?
  1. Home
  2. Browse by Author

Browsing by Author "Fan, Kun"

Now showing 1 - 3 of 3
Results Per Page
Sort Options
  • Loading...
    Thumbnail Image
    Item
    Identifying Gene–Environment Interactions With Robust Marginal Bayesian Variable Selection
    (Frontiers Media, 2021-12-08) Lu, Xi; Fan, Kun; Ren, Jie; Wu, Cen; Biostatistics & Health Data Science, School of Medicine
    In high-throughput genetics studies, an important aim is to identify gene-environment interactions associated with the clinical outcomes. Recently, multiple marginal penalization methods have been developed and shown to be effective in G×E studies. However, within the Bayesian framework, marginal variable selection has not received much attention. In this study, we propose a novel marginal Bayesian variable selection method for G×E studies. In particular, our marginal Bayesian method is robust to data contamination and outliers in the outcome variables. With the incorporation of spike-and-slab priors, we have implemented the Gibbs sampler based on Markov Chain Monte Carlo (MCMC). The proposed method outperforms a number of alternatives in extensive simulation studies. The utility of the marginal robust Bayesian variable selection method has been further demonstrated in the case studies using data from the Nurse Health Study (NHS). Some of the identified main and interaction effects from the real data analysis have important biological implications.
  • Loading...
    Thumbnail Image
    Item
    Is Seeing Believing? A Practitioner’s Perspective on High-Dimensional Statistical Inference in Cancer Genomics Studies
    (MDPI, 2024-09-16) Fan, Kun; Subedi, Srijana; Yang, Gongshun; Lu, Xi; Ren, Jie; Wu, Cen; Biostatistics and Health Data Science, Richard M. Fairbanks School of Public Health
    Variable selection methods have been extensively developed for and applied to cancer genomics data to identify important omics features associated with complex disease traits, including cancer outcomes. However, the reliability and reproducibility of the findings are in question if valid inferential procedures are not available to quantify the uncertainty of the findings. In this article, we provide a gentle but systematic review of high-dimensional frequentist and Bayesian inferential tools under sparse models which can yield uncertainty quantification measures, including confidence (or Bayesian credible) intervals, p values and false discovery rates (FDR). Connections in high-dimensional inferences between the two realms have been fully exploited under the "unpenalized loss function + penalty term" formulation for regularization methods and the "likelihood function × shrinkage prior" framework for regularized Bayesian analysis. In particular, we advocate for robust Bayesian variable selection in cancer genomics studies due to its ability to accommodate disease heterogeneity in the form of heavy-tailed errors and structured sparsity while providing valid statistical inference. The numerical results show that robust Bayesian analysis incorporating exact sparsity has yielded not only superior estimation and identification results but also valid Bayesian credible intervals under nominal coverage probabilities compared with alternative methods, especially in the presence of heavy-tailed model errors and outliers.
  • Loading...
    Thumbnail Image
    Item
    Sparse group variable selection for gene-environment interactions in the longitudinal study
    (Wiley, 2022) Zhou, Fei; Lu, Xi; Ren, Jie; Fan, Kun; Ma, Shuangge; Wu, Cen; Biostatistics and Health Data Science, School of Medicine
    Penalized variable selection for high dimensional longitudinal data has received much attention as it can account for the correlation among repeated measurements while providing additional and essential information for improved identification and prediction performance. Despite the success, in longitudinal studies, the potential of penalization methods is far from fully understood for accommodating structured sparsity. In this article, we develop a sparse group penalization method to conduct the bi-level gene-environment (G×E) interaction study under the repeatedly measured phenotype. Within the quadratic inference function (QIF) framework, the proposed method can achieve simultaneous identification of main and interaction effects on both the group and individual level. Simulation studies have shown that the proposed method outperforms major competitors. In the case study of asthma data from the Childhood Asthma Management Program (CAMP), we conduct G×E study by using high dimensional SNP data as genetic factors and the longitudinal trait, forced expiratory volume in one second (FEV1), as the phenotype. Our method leads to improved prediction and identification of main and interaction effects with important implications.
About IU Indianapolis ScholarWorks
  • Accessibility
  • Privacy Notice
  • Copyright © 2025 The Trustees of Indiana University