Browsing by Author "Li, Zilin"
Item
A framework for detecting noncoding rare-variant associations of large-scale whole-genome sequencing studies (Springer Nature, 2022)
Li, Zilin; Li, Xihao; Zhou, Hufeng; Gaynor, Sheila M.; Selvaraj, Margaret Sunitha; Arapoglou, Theodore; Quick, Corbin; Liu, Yaowu; Chen, Han; Sun, Ryan; Dey, Rounak; Arnett, Donna K.; Auer, Paul L.; Bielak, Lawrence F.; Bis, Joshua C.; Blackwell, Thomas W.; Blangero, John; Boerwinkle, Eric; Bowden, Donald W.; Brody, Jennifer A.; Cade, Brian E.; Conomos, Matthew P.; Correa, Adolfo; Cupples, L. Adrienne; Curran, Joanne E.; de Vries, Paul S.; Duggirala, Ravindranath; Franceschini, Nora; Freedman, Barry I.; Göring, Harald H. H.; Guo, Xiuqing; Kalyani, Rita R.; Kooperberg, Charles; Kral, Brian G.; Lange, Leslie A.; Lin, Bridget M.; Manichaikul, Ani; Manning, Alisa K.; Martin, Lisa W.; Mathias, Rasika A.; Meigs, James B.; Mitchell, Braxton D.; Montasser, May E.; Morrison, Alanna C.; Naseri, Take; O'Connell, Jeffrey R.; Palmer, Nicholette D.; Peyser, Patricia A.; Psaty, Bruce M.; Raffield, Laura M.; Redline, Susan; Reiner, Alexander P.; Reupena, Muagututi'a Sefuiva; Rice, Kenneth M.; Rich, Stephen S.; Smith, Jennifer A.; Taylor, Kent D.; Taub, Margaret A.; Vasan, Ramachandran S.; Weeks, Daniel E.; Wilson, James G.; Yanek, Lisa R.; Zhao, Wei; NHLBI Trans-Omics for Precision Medicine (TOPMed) Consortium; TOPMed Lipids Working Group; Rotter, Jerome I.; Willer, Cristen J.; Natarajan, Pradeep; Peloso, Gina M.; Lin, Xihong
Biostatistics and Health Data Science, School of Medicine
Large-scale whole-genome sequencing studies have enabled analysis of noncoding rare-variant (RV) associations with complex human diseases and traits. Variant-set analysis is a powerful approach to study RV association. However, existing methods have limited ability in analyzing the noncoding genome.
We propose a computationally efficient and robust noncoding RV association detection framework, STAARpipeline, to automatically annotate a whole-genome sequencing study and perform flexible noncoding RV association analysis, including gene-centric analysis and fixed window-based and dynamic window-based non-gene-centric analysis by incorporating variant functional annotations. In gene-centric analysis, STAARpipeline uses STAAR to group noncoding variants based on functional categories of genes and incorporate multiple functional annotations. In non-gene-centric analysis, STAARpipeline uses SCANG-STAAR to incorporate dynamic window sizes and multiple functional annotations. We apply STAARpipeline to identify noncoding RV sets associated with four lipid traits in 21,015 discovery samples from the Trans-Omics for Precision Medicine (TOPMed) program and replicate several of them in an additional 9,123 TOPMed samples. We also analyze five non-lipid TOPMed traits.

Item
Author Correction: Whole-Genome Sequencing Analysis of Human Metabolome in Multi-Ethnic Populations (Springer Nature, 2023-10-19)
Feofanova, Elena V.; Brown, Michael R.; Alkis, Taryn; Manuel, Astrid M.; Li, Xihao; Tahir, Usman A.; Li, Zilin; Mendez, Kevin M.; Kelly, Rachel S.; Qi, Qibin; Chen, Han; Larson, Martin G.; Lemaitre, Rozenn N.; Morrison, Alanna C.; Grieser, Charles; Wong, Kari E.; Gerszten, Robert E.; Zhao, Zhongming; Lasky-Su, Jessica; NHLBI Trans-Omics for Precision Medicine (TOPMed); Yu, Bing
Biostatistics and Health Data Science, Richard M. Fairbanks School of Public Health
Correction to: Nature Communications 10.1038/s41467-023-38800-2, published online 30 May 2023. In this article, the author name Robert E. Gerszten was incorrectly written as Robert E. Gersztern.
The original article has been corrected.

Item
FAVOR: functional annotation of variants online resource and annotator for variation across the human genome (Oxford University Press, 2023)
Zhou, Hufeng; Arapoglou, Theodore; Li, Xihao; Li, Zilin; Zheng, Xiuwen; Moore, Jill; Asok, Abhijith; Kumar, Sushant; Blue, Elizabeth E.; Buyske, Steven; Cox, Nancy; Felsenfeld, Adam; Gerstein, Mark; Kenny, Eimear; Li, Bingshan; Matise, Tara; Philippakis, Anthony; Rehm, Heidi L.; Sofia, Heidi J.; Snyder, Grace; NHGRI Genome Sequencing Program Variant Functional Annotation Working Group; Weng, Zhiping; Neale, Benjamin; Sunyaev, Shamil R.; Lin, Xihong
Biostatistics, School of Public Health
Large biobank-scale whole genome sequencing (WGS) studies are rapidly identifying a multitude of coding and non-coding variants. They provide an unprecedented resource for illuminating the genetic basis of human diseases. Variant functional annotations play a critical role in WGS analysis, result interpretation, and prioritization of disease- or trait-associated causal variants. Existing functional annotation databases have limited scope to perform online queries and functionally annotate the genotype data of large biobank-scale WGS studies. We develop the Functional Annotation of Variants Online Resources (FAVOR) to meet these pressing needs. FAVOR provides a comprehensive multi-faceted variant functional annotation online portal that summarizes and visualizes findings of all possible nine billion single nucleotide variants (SNVs) across the genome. It allows for rapid variant-, gene- and region-level queries of variant functional annotations. FAVOR integrates variant functional information from multiple sources to describe the functional characteristics of variants and facilitates prioritizing plausible causal variants influencing human phenotypes.
Furthermore, we provide a scalable annotation tool, FAVORannotator, to functionally annotate large-scale WGS studies and efficiently store the genotype and their variant functional annotation data in a single file using the annotated Genomic Data Structure (aGDS) format, making downstream analysis more convenient. FAVOR and FAVORannotator are available at https://favor.genohub.org.

Item
Genome-wide cross-trait analysis and Mendelian randomization reveal a shared genetic etiology and causality between COVID-19 and venous thromboembolism (Springer Nature, 2023-04-21)
Huang, Xin; Yao, Minhao; Tian, Peixin; Wong, Jason Y. Y.; Li, Zilin; Liu, Zhonghua; Zhao, Jie V.
Biostatistics and Health Data Science, School of Medicine
Venous thromboembolism occurs in up to one-third of patients with COVID-19. Venous thromboembolism and COVID-19 may share a common genetic architecture, which has not been clarified. To fill this gap, we leverage summary-level genetic data from the latest COVID-19 host genetics consortium and UK Biobank and examine the shared genetic etiology and causal relationship between COVID-19 and venous thromboembolism. The cross-trait and co-localization analyses identify 2, 3, and 4 shared loci between venous thromboembolism and severe COVID-19, COVID-19 hospitalization, and SARS-CoV-2 infection, respectively, which are mapped to the ABO, ADAMTS13, and FUT2 genes involved in coagulation functions. Enrichment analysis supports shared biological processes between COVID-19 and venous thromboembolism related to coagulation and immunity. Bi-directional Mendelian randomization suggests that venous thromboembolism was associated with a higher risk of three COVID-19 traits, and SARS-CoV-2 infection was associated with a higher risk of venous thromboembolism. Our study provides timely evidence for the shared genetic etiology between COVID-19 and venous thromboembolism (VTE).
Our findings contribute to the understanding of COVID-19 and VTE etiology and provide insights into the prevention and comorbidity management of COVID-19.

Item
Powerful, scalable and resource-efficient meta-analysis of rare variant associations in large whole genome sequencing studies (Springer Nature, 2023)
Li, Xihao; Quick, Corbin; Zhou, Hufeng; Gaynor, Sheila M.; Liu, Yaowu; Chen, Han; Selvaraj, Margaret Sunitha; Sun, Ryan; Dey, Rounak; Arnett, Donna K.; Bielak, Lawrence F.; Bis, Joshua C.; Blangero, John; Boerwinkle, Eric; Bowden, Donald W.; Brody, Jennifer A.; Cade, Brian E.; Correa, Adolfo; Cupples, L. Adrienne; Curran, Joanne E.; de Vries, Paul S.; Duggirala, Ravindranath; Freedman, Barry I.; Göring, Harald H. H.; Guo, Xiuqing; Haessler, Jeffrey; Kalyani, Rita R.; Kooperberg, Charles; Kral, Brian G.; Lange, Leslie A.; Manichaikul, Ani; Martin, Lisa W.; McGarvey, Stephen T.; Mitchell, Braxton D.; Montasser, May E.; Morrison, Alanna C.; Naseri, Take; O'Connell, Jeffrey R.; Palmer, Nicholette D.; Peyser, Patricia A.; Psaty, Bruce M.; Raffield, Laura M.; Redline, Susan; Reiner, Alexander P.; Reupena, Muagututi'a Sefuiva; Rice, Kenneth M.; Rich, Stephen S.; Sitlani, Colleen M.; Smith, Jennifer A.; Taylor, Kent D.; Vasan, Ramachandran S.; Willer, Cristen J.; Wilson, James G.; Yanek, Lisa R.; Zhao, Wei; NHLBI Trans-Omics for Precision Medicine (TOPMed) Consortium; TOPMed Lipids Working Group; Rotter, Jerome I.; Natarajan, Pradeep; Peloso, Gina M.; Li, Zilin; Lin, Xihong
Biostatistics and Health Data Science, School of Medicine
Meta-analysis of whole genome sequencing/whole exome sequencing (WGS/WES) studies provides an attractive solution to the problem of collecting large sample sizes for discovering rare variants associated with complex phenotypes. Existing rare variant meta-analysis approaches are not scalable to biobank-scale WGS data.
Here we present MetaSTAAR, a powerful and resource-efficient rare variant meta-analysis framework for large-scale WGS/WES studies. MetaSTAAR accounts for relatedness and population structure, can analyze both quantitative and dichotomous traits and boosts the power of rare variant tests by incorporating multiple variant functional annotations. Through meta-analysis of four lipid traits in 30,138 ancestrally diverse samples from 14 studies of the Trans-Omics for Precision Medicine (TOPMed) Program, we show that MetaSTAAR performs rare variant meta-analysis at scale and produces results comparable to using pooled data. Additionally, we identified several conditionally significant rare variant associations with lipid traits. We further demonstrate that MetaSTAAR is scalable to biobank-scale cohorts through meta-analysis of TOPMed WGS data and UK Biobank WES data of ~200,000 samples.

Item
Rare variants in long non-coding RNAs are associated with blood lipid levels in the TOPMed Whole Genome Sequencing Study (medRxiv, 2023-06-29)
Wang, Yuxuan; Selvaraj, Margaret Sunitha; Li, Xihao; Li, Zilin; Holdcraft, Jacob A.; Arnett, Donna K.; Bis, Joshua C.; Blangero, John; Boerwinkle, Eric; Bowden, Donald W.; Cade, Brian E.; Carlson, Jenna C.; Carson, April P.; Chen, Yii-Der Ida; Curran, Joanne E.; de Vries, Paul S.; Dutcher, Susan K.; Ellinor, Patrick T.; Floyd, James S.; Fornage, Myriam; Freedman, Barry I.; Gabriel, Stacey; Germer, Soren; Gibbs, Richard A.; Guo, Xiuqing; He, Jiang; Heard-Costa, Nancy; Hidalgo, Bertha; Hou, Lifang; Irvin, Marguerite R.; Joehanes, Roby; Kaplan, Robert C.; Kardia, Sharon L. R.; Kelly, Tanika N.; Kim, Ryan; Kooperberg, Charles; Kral, Brian G.; Levy, Daniel; Li, Changwei; Liu, Chunyu; Lloyd-Jones, Don; Loos, Ruth J. F.; Mahaney, Michael C.; Martin, Lisa W.; Mathias, Rasika A.; Minster, Ryan L.; Mitchell, Braxton D.; Montasser, May E.; Morrison, Alanna C.; Murabito, Joanne M.; Naseri, Take; O'Connell, Jeffrey R.; Palmer, Nicholette D.; Preuss, Michael H.; Psaty, Bruce M.; Raffield, Laura M.; Rao, Dabeeru C.; Redline, Susan; Reiner, Alexander P.; Rich, Stephen S.; Reupena, Muagututi'a Sefuiva; Sheu, Wayne H-H; Smith, Jennifer A.; Smith, Albert; Tiwari, Hemant K.; Tsai, Michael Y.; Viaud-Martinez, Karine A.; Wang, Zhe; Yanek, Lisa R.; Zhao, Wei; NHLBI Trans-Omics for Precision Medicine (TOPMed) Consortium; Rotter, Jerome I.; Lin, Xihong; Natarajan, Pradeep; Peloso, Gina M.
Biostatistics and Health Data Science, School of Medicine
Long non-coding RNAs (lncRNAs) are known to perform important regulatory functions. Large-scale whole genome sequencing (WGS) studies and new statistical methods for variant set tests now provide an opportunity to assess the associations between rare variants in lncRNA genes and complex traits across the genome. In this study, we used high-coverage WGS from 66,329 participants of diverse ancestries with blood lipid levels (LDL-C, HDL-C, TC, and TG) in the National Heart, Lung, and Blood Institute (NHLBI) Trans-Omics for Precision Medicine (TOPMed) program to investigate the role of lncRNAs in lipid variability. We aggregated rare variants for 165,375 lncRNA genes based on their genomic locations and conducted rare variant aggregate association tests using the STAAR (variant-Set Test for Association using Annotation infoRmation) framework. We performed STAAR conditional analysis adjusting for common variants in known lipid GWAS loci and rare coding variants in nearby protein coding genes. Our analyses revealed 83 rare lncRNA variant sets significantly associated with blood lipid levels, all of which were located in known lipid GWAS loci (in a ±500 kb window of a Global Lipids Genetics Consortium index variant). Notably, 61 out of 83 signals (73%) were conditionally independent of common regulatory variations and rare protein coding variations at the same loci. We replicated 34 out of 61 (56%) conditionally independent associations using the independent UK Biobank WGS data.
Our results expand the genetic architecture of blood lipids to rare variants in lncRNA, implicating new therapeutic opportunities.

Item
STAAR workflow: a cloud-based workflow for scalable and reproducible rare variant analysis (Oxford University Press, 2022)
Gaynor, Sheila M.; Westerman, Kenneth E.; Ackovic, Lea L.; Li, Xihao; Li, Zilin; Manning, Alisa K.; Philippakis, Anthony; Lin, Xihong
Biostatistics and Health Data Science, Richard M. Fairbanks School of Public Health
Summary: We developed the variant-Set Test for Association using Annotation infoRmation (STAAR) workflow description language (WDL) workflow to facilitate the analysis of rare variants in whole genome sequencing association studies. The open-access STAAR workflow written in the WDL allows a user to perform rare variant testing for both gene-centric and genetic region approaches, enabling genome-wide, candidate and conditional analyses. It incorporates functional annotations into the workflow as introduced in the STAAR method in order to boost the rare variant analysis power. This tool was specifically developed and optimized to be implemented on cloud-based platforms such as BioData Catalyst Powered by Terra. It provides easy-to-use functionality for rare variant analysis that can be incorporated into an exhaustive whole genome sequencing analysis pipeline.
Availability and implementation: The workflow is freely available from https://dockstore.org/workflows/github.com/sheilagaynor/STAAR_workflow.

Item
Whole Genome Sequencing Analysis of Body Mass Index Identifies Novel African Ancestry-Specific Risk Allele (medRxiv, 2023-08-22)
Zhang, Xinruo; Brody, Jennifer A.; Graff, Mariaelisa; Highland, Heather M.; Chami, Nathalie; Xu, Hanfei; Wang, Zhe; Ferrier, Kendra; Chittoor, Geetha; Josyula, Navya S.; Li, Xihao; Li, Zilin; Allison, Matthew A.; Becker, Diane M.; Bielak, Lawrence F.; Bis, Joshua C.; Boorgula, Meher Preethi; Bowden, Donald W.; Broome, Jai G.; Buth, Erin J.; Carlson, Christopher S.; Chang, Kyong-Mi; Chavan, Sameer; Chiu, Yen-Feng; Chuang, Lee-Ming; Conomos, Matthew P.; DeMeo, Dawn L.; Du, Margaret; Duggirala, Ravindranath; Eng, Celeste; Fohner, Alison E.; Freedman, Barry I.; Garrett, Melanie E.; Guo, Xiuqing; Haiman, Chris; Heavner, Benjamin D.; Hidalgo, Bertha; Hixson, James E.; Ho, Yuk-Lam; Hobbs, Brian D.; Hu, Donglei; Hui, Qin; Hwu, Chii-Min; Jackson, Rebecca D.; Jain, Deepti; Kalyani, Rita R.; Kardia, Sharon L. R.; Kelly, Tanika N.; Lange, Ethan M.; LeNoir, Michael; Li, Changwei; Marchand, Loic Le; McDonald, Merry-Lynn N.; McHugh, Caitlin P.; Morrison, Alanna C.; Naseri, Take; NHLBI Trans-Omics for Precision Medicine (TOPMed) Consortium; O'Connell, Jeffrey; O'Donnell, Christopher J.; Palmer, Nicholette D.; Pankow, James S.; Perry, James A.; Peters, Ulrike; Preuss, Michael H.; Rao, D. C.; Regan, Elizabeth A.; Reupena, Sefuiva M.; Roden, Dan M.; Rodriguez-Santana, Jose; Sitlani, Colleen M.; Smith, Jennifer A.; Tiwari, Hemant K.; Vasan, Ramachandran S.; Wang, Zeyuan; Weeks, Daniel E.; Wessel, Jennifer; Wiggins, Kerri L.; Wilkens, Lynne R.; Wilson, Peter W. F.; Yanek, Lisa R.; Yoneda, Zachary T.; Zhao, Wei; Zöllner, Sebastian; Arnett, Donna K.; Ashley-Koch, Allison E.; Barnes, Kathleen C.; Blangero, John; Boerwinkle, Eric; Burchard, Esteban G.; Carson, April P.; Chasman, Daniel I.; Chen, Yii-Der Ida; Curran, Joanne E.; Fornage, Myriam; Gordeuk, Victor R.; He, Jiang; Heckbert, Susan R.; Hou, Lifang; Irvin, Marguerite R.; Kooperberg, Charles; Minster, Ryan L.; Mitchell, Braxton D.; Nouraie, Mehdi; Psaty, Bruce M.; Raffield, Laura M.; Reiner, Alexander P.; Rich, Stephen S.; Rotter, Jerome I.; Shoemaker, M. Benjamin; Smith, Nicholas L.; Taylor, Kent D.; Telen, Marilyn J.; Weiss, Scott T.; Zhang, Yingze; Heard-Costa, Nancy; Sun, Yan V.; Lin, Xihong; Cupples, L. Adrienne; Lange, Leslie A.; Liu, Ching-Ti; Loos, Ruth J. F.; North, Kari E.; Justice, Anne E.
Biostatistics and Health Data Science, School of Medicine
Obesity is a major public health crisis associated with high mortality rates. Previous genome-wide association studies (GWAS) investigating body mass index (BMI) have largely relied on imputed data from European individuals. This study leveraged whole-genome sequencing (WGS) data from 88,873 participants from the Trans-Omics for Precision Medicine (TOPMed) Program, of which 51% were of non-European population groups. We discovered 18 BMI-associated signals (P < 5 × 10⁻⁹). Notably, we identified and replicated a novel low frequency single nucleotide polymorphism (SNP) in MTMR3 that was common in individuals of African descent. Using a diverse study population, we further identified two novel secondary signals in known BMI loci and pinpointed two likely causal variants in the POC5 and DMD loci.
Our work demonstrates the benefits of combining WGS and diverse cohorts in expanding the current catalog of variants and genes that confer risk for obesity, bringing us one step closer to personalized medicine.

Item
Whole-Genome Sequencing Analysis of Human Metabolome in Multi-Ethnic Populations (Springer Nature, 2023-05-30)
Feofanova, Elena V.; Brown, Michael R.; Alkis, Taryn; Manuel, Astrid M.; Li, Xihao; Tahir, Usman A.; Li, Zilin; Mendez, Kevin M.; Kelly, Rachel S.; Qi, Qibin; Chen, Han; Larson, Martin G.; Lemaitre, Rozenn N.; Morrison, Alanna C.; Grieser, Charles; Wong, Kari E.; Gerszten, Robert E.; Zhao, Zhongming; Lasky-Su, Jessica; NHLBI Trans-Omics for Precision Medicine (TOPMed); Yu, Bing
Biostatistics and Health Data Science, School of Medicine
Circulating metabolite levels may reflect the state of the human organism in health and disease; however, the genetic architecture of metabolites is not fully understood. We have performed a whole-genome sequencing association analysis of both common and rare variants in up to 11,840 multi-ethnic participants from five studies with up to 1666 circulating metabolites. We have discovered 1985 novel variant-metabolite associations, and validated 761 locus-metabolite associations reported previously. Seventy-nine novel variant-metabolite associations have been replicated, including three genetic loci located on the X chromosome, demonstrating its involvement in metabolic regulation. Gene-based analyses have provided further support for seven metabolite-replicated loci pairs and their biologically plausible genes. Among those novel replicated variant-metabolite pairs, follow-up analyses have revealed that 26 metabolites have colocalized with 21 tissues, seven metabolite-disease outcome associations have been putatively causal, and seven metabolites might be regulated by plasma protein levels.
Our results have depicted the genetic contribution to circulating metabolite levels, providing additional insights into understanding human disease.