Title: Addressing overfitting bias due to sample overlap in polygenic risk scoring
Authors: Jeong, Seokho; Shivakumar, Manu; Jung, Sang-Hyuk; Won, Hong-Hee; Nho, Kwangsik; Huang, Heng; Davatzikos, Christos; Saykin, Andrew J.; Thompson, Paul M.; Shen, Li; Kim, Young Jin; Kim, Bong-Jo; Lee, Seunggeun; Kim, Dokyoon
Date: 2025-05-20
Year: 2025
Citation: Jeong S, Shivakumar M, Jung SH, et al. Addressing overfitting bias due to sample overlap in polygenic risk scoring. Alzheimers Dement. 2025;21(4):e70109. doi:10.1002/alz.70109
URI: https://hdl.handle.net/1805/48253
Language: en-US
License: Attribution 4.0 International
Keywords: Alzheimer's disease; Genetic risk factor; Polygenic risk scores; Precision medicine; Sample overlap
Type: Article

Abstract

Introduction: Numerous studies on Alzheimer's disease polygenic risk scores (PRSs) overlook sample overlap between International Genomics of Alzheimer's Project (IGAP) and target datasets like Alzheimer's Disease Neuroimaging Initiative (ADNI).

Methods: To address this, we developed overlap-adjusted PRS (OA PRS) and tested it on simulated data to assess biases from different scenarios by varying training, testing, and overlap proportions. OA PRS was used to adjust for sample bias in simulations; then, we applied OA PRS to IGAP and ADNI datasets and validated through visual diagnosis.

Results: OA PRS effectively adjusted for sample overlap in all simulation scenarios, as well as for IGAP and ADNI. The original IGAP PRS showed an inflated area under the receiver operating characteristic (AUROC: 0.915) on overlapping samples. OA PRS reduced the AUROC to 0.726, closely aligning with the AUROC of non-overlapping samples (0.712). Further, visual diagnostics confirmed the effectiveness of our adjustments.

Discussion: With OA PRS, we were able to adjust the IGAP summary-based PRS for the overlapped ADNI samples, allowing the dataset to be fully used without the risk of overfitting.

Highlights: Sample overlap between large Alzheimer's disease (AD) cohorts poses overfitting bias when using AD polygenic risk scores (PRSs). This study highlighted the effectiveness of overlap-adjusted PRS (OA PRS) in mitigating overfitting and improving the accuracy of PRS estimations. New PRSs based on adjusted effect sizes showed increased power in association with clinical features.
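
The comparison reported in the Results (AUROC on overlapping versus non-overlapping samples) can be illustrated with a minimal Python sketch. It does not implement the OA PRS adjustment itself, which the abstract does not specify. It only shows the diagnostic: score each individual as a weighted sum of allele dosages, then compute AUROC separately for samples inside and outside the overlap. The function names, effect sizes, genotypes, labels, and overlap flags are all hypothetical toy values, not data from the study.

```python
from typing import List, Sequence, Tuple


def polygenic_score(dosages: Sequence[int], effect_sizes: Sequence[float]) -> float:
    """Weighted sum of allele dosages (0, 1, or 2) by per-variant effect sizes."""
    if len(dosages) != len(effect_sizes):
        raise ValueError("dosages and effect_sizes must have the same length")
    return sum(d * b for d, b in zip(dosages, effect_sizes))


def auroc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Probability that a random case outscores a random control (ties count 0.5)."""
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = sum(
        1.0 if c > k else 0.5 if c == k else 0.0
        for c in cases
        for k in controls
    )
    return wins / (len(cases) * len(controls))


def auroc_by_overlap(
    scores: Sequence[float], labels: Sequence[int], overlaps: Sequence[bool]
) -> Tuple[float, float]:
    """Return (AUROC on overlapping samples, AUROC on non-overlapping samples)."""
    inside = [(s, y) for s, y, o in zip(scores, labels, overlaps) if o]
    outside = [(s, y) for s, y, o in zip(scores, labels, overlaps) if not o]
    return (
        auroc([s for s, _ in inside], [y for _, y in inside]),
        auroc([s for s, _ in outside], [y for _, y in outside]),
    )


# Hypothetical toy inputs: three variants, eight individuals.
effect_sizes: List[float] = [0.5, -0.25, 1.0]
genotypes: List[List[int]] = [
    [2, 0, 1], [1, 1, 1], [0, 2, 0], [1, 2, 0],  # also in the training data
    [2, 1, 0], [0, 0, 0], [1, 0, 0], [0, 1, 0],  # independent of the training data
]
labels: List[int] = [1, 1, 0, 0, 1, 1, 0, 0]     # 1 = case, 0 = control
overlaps: List[bool] = [True] * 4 + [False] * 4

scores = [polygenic_score(g, effect_sizes) for g in genotypes]
overlap_auc, independent_auc = auroc_by_overlap(scores, labels, overlaps)
print(overlap_auc, independent_auc)  # 1.0 0.75
```

In this toy setup the overlapping subset separates cases from controls perfectly while the independent subset does not, which is the kind of gap the abstract describes between the unadjusted AUROC on overlapping samples and the AUROC on non-overlapping samples.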