Multi-omics cannot replace sample size in genome-wide association studies
Abstract
The integration of multi-omics information (e.g., epigenetics and transcriptomics) can be useful for interpreting findings from genome-wide association studies (GWAS). It has been suggested that multi-omics could circumvent or greatly reduce the need to increase GWAS sample sizes for novel variant discovery. We tested whether incorporating multi-omics information in earlier and smaller GWAS boosts true-positive discovery of genes that were later revealed by larger GWAS of the same or similar traits. We applied 10 different analytic approaches to integrating multi-omics data from 12 sources (e.g., the Genotype-Tissue Expression project) to test whether earlier and smaller GWAS of 4 brain-related traits (alcohol use disorder/problematic alcohol use, major depression/depression, schizophrenia, and intracranial volume/brain volume) could detect genes that were revealed by a later and larger GWAS. Multi-omics data did not reliably identify novel genes in earlier, less-powered GWAS (positive predictive value [PPV] < 0.2, i.e., more than 80% of associations were false positives). Machine learning predictions marginally increased the number of identified novel genes, correctly identifying 1-8 additional genes, but only for well-powered early GWAS of highly heritable traits (i.e., intracranial volume and schizophrenia). Although multi-omics, particularly positional mapping (i.e., fastBAT, MAGMA, and H-MAGMA), can help to prioritize genes within genome-wide significant loci (PPVs = 0.5-1.0) and translate them into information about disease biology, it does not reliably increase novel gene discovery in brain-related GWAS. Increasing sample size is required to increase power for the discovery of novel genes and loci.
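
The PPV figures quoted in the abstract can be read as the share of genes prioritized from an earlier GWAS that a later, larger GWAS also implicated. The Python sketch below illustrates that arithmetic only. It assumes the standard definition PPV = TP / (TP + FP); the function name and the gene labels are hypothetical placeholders, not data or code from the study.

```python
def positive_predictive_value(prioritized, confirmed):
    """Return TP / (TP + FP) for a set of prioritized genes.

    prioritized: genes flagged from the earlier, smaller GWAS (hypothetical input)
    confirmed:   genes implicated by the later, larger GWAS (hypothetical input)
    """
    prioritized = set(prioritized)
    if not prioritized:
        raise ValueError("no genes were prioritized, so PPV is undefined")
    # True positives are prioritized genes that the later GWAS also found.
    true_positives = prioritized & set(confirmed)
    return len(true_positives) / len(prioritized)


# Placeholder labels, chosen only to make the arithmetic visible.
early_hits = ["GENE_A", "GENE_B", "GENE_C", "GENE_D", "GENE_E"]
later_gwas_genes = ["GENE_A", "GENE_X", "GENE_Y"]

ppv = positive_predictive_value(early_hits, later_gwas_genes)
false_positive_share = 1 - ppv
print(f"PPV = {ppv:.2f}, false-positive share = {false_positive_share:.2f}")
```

In this toy case one of five prioritized genes is confirmed, so the PPV is 0.2 and the remaining 80% of associations are false positives. A PPV below 0.2 therefore corresponds to a false-positive share above 80%.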