A comprehensive and bias-free machine learning approach for risk prediction of preeclampsia with severe features in a nulliparous study cohort

dc.contributor.author: Lin, Yun C.
dc.contributor.author: Mallia, Daniel
dc.contributor.author: Clark‑Sevilla, Andrea O.
dc.contributor.author: Catto, Adam
dc.contributor.author: Leshchenko, Alisa
dc.contributor.author: Yan, Qi
dc.contributor.author: Haas, David M.
dc.contributor.author: Wapner, Ronald
dc.contributor.author: Pe’er, Itsik
dc.contributor.author: Raja, Anita
dc.contributor.author: Salleb‑Aouissi, Ansaf
dc.contributor.department: Obstetrics and Gynecology, School of Medicine
dc.date.accessioned: 2025-01-27T15:47:59Z
dc.date.available: 2025-01-27T15:47:59Z
dc.date.issued: 2024-12-24
dc.description.abstract: Preeclampsia is one of the leading causes of maternal morbidity, with consequences during and after pregnancy. Because of its diverse clinical presentation, preeclampsia is an adverse pregnancy outcome that is uniquely challenging to predict and manage. In this paper, we developed racial bias-free machine learning models that predict the onset of preeclampsia with severe features or eclampsia at discrete time points in a nulliparous pregnant study cohort. To focus on those most at risk, we selected probands with severe PE (sPE). Those with mild preeclampsia, superimposed preeclampsia, and new onset hypertension were excluded. The prospective study cohort to which we applied machine learning is the Nulliparous Pregnancy Outcomes Study: Monitoring Mothers-to-be (nuMoM2b) study, which contains information from eight clinical sites across the US. Maternal serum samples were collected for 1,857 individuals between the first and second trimesters. The patients with collected serum samples were selected as the final cohort. Our prediction models achieved an AUROC of 0.72 (95% CI, 0.69-0.76), 0.75 (95% CI, 0.71-0.79), and 0.77 (95% CI, 0.74-0.80), respectively, for the three visits. Our initial models were biased toward non-Hispanic black participants with a high predictive equality ratio of 1.31. We corrected this bias and reduced this ratio to 1.14. This lowers the rate of false positives in our predictive model for the non-Hispanic black participants. The exact cause of the bias is still under investigation, but previous studies have recognized PLGF as a potential bias-inducing factor. However, since our model includes various factors that exhibit a positive correlation with PLGF, such as blood pressure measurements and BMI, we have employed an algorithmic approach to disentangle this bias from the model. The top features of our built model stress the importance of using several tests, particularly for biomarkers (BMI and blood pressure measurements) and ultrasound measurements. Placental analytes (PLGF and Endoglin) were strong predictors for screening for the early onset of preeclampsia with severe features in the first two trimesters.
dc.eprint.version: Final published version
dc.identifier.citation: Lin YC, Mallia D, Clark-Sevilla AO, et al. A comprehensive and bias-free machine learning approach for risk prediction of preeclampsia with severe features in a nulliparous study cohort. BMC Pregnancy Childbirth. 2024;24(1):853. Published 2024 Dec 24. doi:10.1186/s12884-024-06988-w
dc.identifier.uri: https://hdl.handle.net/1805/45498
dc.language.iso: en_US
dc.publisher: Springer Nature
dc.relation.isversionof: 10.1186/s12884-024-06988-w
dc.relation.journal: BMC Pregnancy and Childbirth
dc.rights: Attribution-NonCommercial-NoDerivatives 4.0 International
dc.rights.uri: https://creativecommons.org/licenses/by-nc-nd/4.0
dc.source: PMC
dc.subject: Ensemble model
dc.subject: Fairness in machine learning
dc.subject: Machine learning
dc.subject: Preeclampsia
dc.subject: Preeclampsia with severe features
dc.title: A comprehensive and bias-free machine learning approach for risk prediction of preeclampsia with severe features in a nulliparous study cohort
dc.type: Article
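
Note on the predictive equality ratio: the abstract reports this ratio falling from 1.31 to 1.14 and ties it to the false positive rate for non-Hispanic black participants. The record does not state the exact formula the authors used. The sketch below assumes the common definition, the false positive rate (FPR) of one group divided by the FPR of a reference group, so that a value of 1.0 means both groups are falsely flagged at the same rate. The function names, the group labels "A" and "B", and the toy data are hypothetical and are not taken from the paper or from nuMoM2b.

```python
from typing import Sequence


def false_positive_rate(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """FPR = FP / (FP + TN), computed over the truly negative cases only."""
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    if fp + tn == 0:
        raise ValueError("no negative cases; FPR is undefined")
    return fp / (fp + tn)


def predictive_equality_ratio(
    y_true: Sequence[int],
    y_pred: Sequence[int],
    groups: Sequence[str],
    group: str,
    reference: str,
) -> float:
    """Assumed definition: FPR(group) / FPR(reference)."""

    def fpr_for(label: str) -> float:
        # Restrict labels and predictions to members of one group.
        idx = [i for i, g in enumerate(groups) if g == label]
        return false_positive_rate(
            [y_true[i] for i in idx], [y_pred[i] for i in idx]
        )

    reference_fpr = fpr_for(reference)
    if reference_fpr == 0:
        raise ValueError("reference group FPR is zero; ratio is undefined")
    return fpr_for(group) / reference_fpr


# Toy data (hypothetical): 1 = predicted or observed sPE, 0 = no sPE.
y_true = [0, 0, 0, 0, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 1, 1, 0, 0, 0, 0]
groups = ["A"] * 5 + ["B"] * 5

ratio = predictive_equality_ratio(y_true, y_pred, groups, "A", "B")
print(ratio)
```

In this toy data, group A has 2 false positives among 4 negatives and group B has 1 among 4, so group A is falsely flagged twice as often. A bias-mitigation step of the kind the abstract describes would aim to move this ratio toward 1.0.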
Files
Original bundle
Name: Lin2024Comprehensive-CCBYNCND.pdf
Size: 1.93 MB
Format: Adobe Portable Document Format