A novel two-sample Mendelian randomization framework integrating common and rare variants: application to assess the effect of HDL-C on preeclampsia risk
Date
Language
Embargo Lift Date
Committee Members
Degree
Degree Year
Department
Grantor
Journal Title
Journal ISSN
Volume Title
Found At
Abstract
Mendelian randomization (MR) has become an important technique for establishing causal relationships between risk factors and health outcomes. By using genetic variants as instrumental variables, it can mitigate bias due to confounding and reverse causation in observational studies. Current MR analyses have predominantly used common genetic variants as instruments, which represent only part of the genetic architecture of complex traits. Rare variants, which can have larger effect sizes and provide unique biological insights, have been understudied due to statistical and methodological challenges. We introduce MR-common and annotation-informed rare variants (MR-CARV), a novel framework integrating common and rare genetic variants in two-sample MR. This method leverages comprehensive genetic data made available by high-throughput sequencing technologies and large-scale consortia. Rare variants are aggregated into functional categories, such as gene-coding, gene-noncoding, and nongene regions, by leveraging variant annotations and biological impact as weights. The effects of rare variant sets are then estimated with STAARpipeline and combined with the estimated effects of common variants by the existing MR methods. Simulation studies demonstrate that MR-CARV maintains robust type I error and achieves higher statistical power, with up to a 66.3% relative increase compared with existing methods only based on common variants. Consistent with these findings, application to real data on high-density lipoprotein cholesterol (HDL-C) and preeclampsia showed that MR-CARV [inverse variance weighted (IVW)] yielded a more precise and statistically significant effect estimate (-0.020, SE = 0.0102,
