Explicit Modeling of Ancestry Improves Polygenic Risk Scores and BLUP Prediction

dc.contributor.author: Chen, Chia-Yen
dc.contributor.author: Han, Jiali
dc.contributor.author: Hunter, David J.
dc.contributor.author: Kraft, Peter
dc.contributor.author: Price, Alkes L.
dc.contributor.department: Department of Epidemiology, Richard M. Fairbanks School of Public Health
dc.date.accessioned: 2017-05-11T16:14:27Z
dc.date.available: 2017-05-11T16:14:27Z
dc.date.issued: 2015-09
dc.description.abstract: Polygenic prediction using genome-wide SNPs can provide high prediction accuracy for complex traits. Here, we investigate the question of how to account for genetic ancestry when conducting polygenic prediction. We show that the accuracy of polygenic prediction in structured populations may be partly due to genetic ancestry. However, we hypothesized that explicitly modeling ancestry could improve polygenic prediction accuracy. We analyzed three GWAS of hair color (HC), tanning ability (TA), and basal cell carcinoma (BCC) in European Americans (sample size from 7,440 to 9,822) and considered two widely used polygenic prediction approaches: polygenic risk scores (PRSs) and best linear unbiased prediction (BLUP). We compared polygenic prediction without correction for ancestry to polygenic prediction with ancestry as a separate component in the model. In 10-fold cross-validation using the PRS approach, the R² for HC increased by 66% (0.0456 to 0.0755; P < 10⁻¹⁶), the R² for TA increased by 123% (0.0154 to 0.0344; P < 10⁻¹⁶), and the liability-scale R² for BCC increased by 68% (0.0138 to 0.0232; P < 10⁻¹⁶) when explicitly modeling ancestry, which prevents ancestry effects from entering into each SNP effect and being overweighted. Surprisingly, explicitly modeling ancestry produces a similar improvement when using the BLUP approach, which fits all SNPs simultaneously in a single variance component and causes ancestry to be underweighted. We validate our findings via simulations, which show that the differences in prediction accuracy will increase in magnitude as sample sizes increase. In summary, our results show that explicitly modeling ancestry can be important in both PRS and BLUP prediction.
dc.eprint.version: Author's manuscript
dc.identifier.citation: Chen, C.-Y., Han, J., Hunter, D. J., Kraft, P., & Price, A. L. (2015). Explicit modeling of ancestry improves polygenic risk scores and BLUP prediction. Genetic Epidemiology, 39(6), 427–438. http://doi.org/10.1002/gepi.21906
dc.identifier.uri: https://hdl.handle.net/1805/12500
dc.language.iso: en_US
dc.publisher: Wiley
dc.relation.isversionof: 10.1002/gepi.21906
dc.relation.journal: Genetic Epidemiology
dc.rights: Publisher Policy
dc.source: PMC
dc.subject: Basal cell carcinoma
dc.subject: Genome-wide association study
dc.subject: Pigmentation
dc.subject: Polygenic prediction
dc.subject: Principal component analysis
dc.title: Explicit Modeling of Ancestry Improves Polygenic Risk Scores and BLUP Prediction
dc.type: Article
Files
Original bundle
Name: nihms754012-2.pdf
Size: 793.92 KB
Format: Adobe Portable Document Format
Description: Main Article