Exploiting deep transfer learning for the prediction of functional non-coding variants using genomic sequence
dc.contributor.author | Chen, Li | |
dc.contributor.author | Wang, Ye | |
dc.contributor.author | Zhao, Fengdi | |
dc.contributor.department | Biostatistics, School of Public Health | |
dc.date.accessioned | 2023-10-25T14:12:36Z | |
dc.date.available | 2023-10-25T14:12:36Z | |
dc.date.issued | 2022 | |
dc.description.abstract | Motivation: Genome-wide association studies have identified tens of thousands of variants associated with complex traits, and most of them fall within non-coding regions, but these variants may not be the causal ones. The development of high-throughput functional assays has led to the discovery of experimentally validated functional non-coding variants. However, these validated variants are rare because of technical difficulty and financial cost. The small sample size of validated variants makes it less reliable to develop a supervised machine learning model for genome-wide prediction of causal non-coding variants. Results: We exploit a deep transfer learning model, based on a convolutional neural network, to improve the prediction of functional non-coding variants (NCVs). To address the challenge of small sample size, the transfer learning model leverages both large-scale generic functional NCVs, to improve the learning of low-level features, and context-specific functional NCVs, to learn high-level features for the context-specific prediction task. By evaluating the deep transfer learning model on three MPRA datasets and 16 GWAS datasets, we demonstrate that the proposed model outperforms deep learning models without pretraining or retraining. In addition, the deep transfer learning model outperforms 18 existing computational methods on both MPRA and GWAS datasets. Availability and implementation: https://github.com/lichen-lab/TLVar. | |
dc.eprint.version | Final published version | |
dc.identifier.citation | Chen L, Wang Y, Zhao F. Exploiting deep transfer learning for the prediction of functional non-coding variants using genomic sequence. Bioinformatics. 2022;38(12):3164-3172. doi:10.1093/bioinformatics/btac214 | |
dc.identifier.uri | https://hdl.handle.net/1805/36654 | |
dc.language.iso | en_US | |
dc.publisher | Oxford University Press | |
dc.relation.isversionof | 10.1093/bioinformatics/btac214 | |
dc.relation.journal | Bioinformatics | |
dc.rights | Publisher Policy | |
dc.source | PMC | |
dc.subject | Genome-Wide Association Study | |
dc.subject | Genomics | |
dc.subject | Machine Learning | |
dc.subject | Computer Neural Networks | |
dc.title | Exploiting deep transfer learning for the prediction of functional non-coding variants using genomic sequence | |
dc.type | Article | |
ul.alternative.fulltext | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9890318/ |
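
The two-stage idea in the abstract (learn low-level features from a large generic set of NCVs, then retrain on a small context-specific set) can be illustrated with a minimal sketch. This is not the TLVar implementation: the tiny one-hidden-layer network stands in for the convolutional model, freezing the hidden layer during retraining is one common way to do transfer learning and is an assumption here, and the sequences, labels, and names (`one_hot`, `TinyNet`, `freeze_hidden`) are made-up toy values for illustration only.

```python
import math
import random

BASES = "ACGT"

def one_hot(seq):
    """Flatten a DNA sequence into a 4-per-base one-hot vector."""
    return [1.0 if base == b else 0.0 for base in seq for b in BASES]

class TinyNet:
    """One tanh hidden layer (stand-in for low-level features) plus a logistic output."""

    def __init__(self, n_in, n_hidden, seed=0):
        rng = random.Random(seed)
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]
        self.w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]

    def forward(self, x):
        h = [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in self.w1]
        z = sum(w * hj for w, hj in zip(self.w2, h))
        return h, 1.0 / (1.0 + math.exp(-z))

    def train(self, data, epochs, lr, freeze_hidden=False):
        """Plain SGD on log loss; freeze_hidden=True updates the output layer only."""
        for _ in range(epochs):
            for x, y in data:
                h, p = self.forward(x)
                err = p - y  # gradient of the log loss w.r.t. the output logit
                if not freeze_hidden:
                    # Update hidden weights first, using the current output weights.
                    for j, row in enumerate(self.w1):
                        g = err * self.w2[j] * (1.0 - h[j] ** 2)
                        for i, xi in enumerate(x):
                            row[i] -= lr * g * xi
                for j, hj in enumerate(h):
                    self.w2[j] -= lr * err * hj

# Toy data (hypothetical): a larger "generic" set and a small "context-specific" set.
generic = [(one_hot(s), y) for s, y in
           [("ACGT", 1), ("AAGT", 0), ("ACGA", 1), ("TTGT", 0), ("CCGT", 1), ("GAGT", 0)]]
specific = [(one_hot(s), y) for s, y in [("ACGT", 1), ("AAGT", 0)]]

net = TinyNet(n_in=16, n_hidden=3)

# Stage 1: pretraining on the generic set updates every layer.
net.train(generic, epochs=50, lr=0.1)
hidden_after_pretraining = [row[:] for row in net.w1]
output_after_pretraining = net.w2[:]

# Stage 2: retraining on the small set updates only the output layer.
net.train(specific, epochs=50, lr=0.1, freeze_hidden=True)
```

After the second stage, `net.w1` is identical to `hidden_after_pretraining` while `net.w2` has moved, which is the sense in which the small dataset only has to fit the high-level part of the model.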