Browsing by Author "Jin, Weijia"
Item: In silico generation and augmentation of regulatory variants from massively parallel reporter assay using conditional variational autoencoder (bioRxiv, 2024-06-29)
Jin, Weijia; Xia, Yi; Thela, Sai Ritesh; Liu, Yunlong; Chen, Li; Medical and Molecular Genetics, School of Medicine
Predicting the functional consequences of genetic variants in non-coding regions is a challenging problem. Massively parallel reporter assays (MPRAs), an in vitro high-throughput method, can test thousands of variants simultaneously by evaluating whether they show allele-specific regulatory activity. Nevertheless, the labelled variants identified by MPRAs, which show differential allelic regulatory effects on gene expression, usually number only in the hundreds, limiting their potential as a training set for robust genome-wide prediction. To address this limitation, we propose a deep generative model, MpraVAE, to generate labelled variants in silico and augment the training sample size. By benchmarking on several MPRA datasets, we demonstrate that MpraVAE significantly improves prediction performance for MPRA regulatory variants compared to the baseline method, conventional data augmentation approaches, and existing variant scoring methods. Taking autoimmune diseases as an example, we apply MpraVAE to perform a genome-wide prediction of regulatory variants and find that predicted regulatory variants are more enriched than background variants in enhancers, active histone marks, open chromatin regions in immune-related cell types, and chromatin states associated with promoter and enhancer activity and with binding sites of c-Myc and Pol II, which regulate gene expression. Importantly, predicted regulatory variants are found to link to immune-related genes by leveraging chromatin loops and accessible chromatin, demonstrating the importance of MpraVAE for genetic and gene discovery in complex traits.

Item: MPRAVarDB: an online database and web server for exploring regulatory effects of genetic variants (Oxford University Press, 2024)
Jin, Weijia; Xia, Yi; Nizomov, Javlon; Liu, Yunlong; Li, Zhigang; Lu, Qing; Chen, Li; Medical and Molecular Genetics, School of Medicine
Summary: Massively parallel reporter assay (MPRA) is an important technology for evaluating the impact of genetic variants on gene regulation. Here, we present MPRAVarDB, an online database and web server for exploring regulatory effects of genetic variants. MPRAVarDB harbors 18 MPRA experiments designed to assess the regulatory effects of genetic variants associated with GWAS loci, eQTLs, and genomic features, totaling 242,818 variants tested across more than 30 cell lines and 30 human diseases or traits. MPRAVarDB enables users to query MPRA variants by genomic region, disease, and cell line, or any combination of these parameters. Notably, MPRAVarDB offers a suite of pretrained machine-learning models tailored to the specific disease and cell line, facilitating the prediction of regulatory variants. The user-friendly interface allows users to receive query and prediction results with just a few clicks.
Availability and implementation: https://mpravardb.rc.ufl.edu.

Item: MPRAVarDB: an online database and web server for exploring regulatory effects of genetic variants (bioRxiv, 2024-04-03)
Nizomov, Javlon; Jin, Weijia; Xia, Yi; Liu, Yunlong; Li, Zhigang; Chen, Li; Medical and Molecular Genetics, School of Medicine
Massively parallel reporter assay (MPRA) is an important technology for evaluating the impact of genetic variants on gene regulation. Here, we present MPRAVarDB, an online database and web server for exploring regulatory effects of genetic variants. MPRAVarDB harbors 18 MPRA experiments designed to assess the regulatory effects of genetic variants associated with GWAS loci, eQTLs, and various genomic features, resulting in a total of 242,818 variants tested across more than 30 cell lines and 30 human diseases or traits. MPRAVarDB supports querying MPRA variants by genomic region, disease, and cell line, or by any combination of these query terms. Notably, MPRAVarDB offers a suite of pretrained machine learning models tailored to the specific disease and cell line, facilitating the genome-wide prediction of regulatory variants. MPRAVarDB is easy to use, and users need only a few clicks to receive query and prediction results.