Prediction of hypertension and diabetes in twin pregnancy using machine learning model based on characteristics at first prenatal visit: national registry study
Abstract
Objective: To develop a prediction model for hypertensive disorders of pregnancy (HDP) and gestational diabetes mellitus (GDM) in twin pregnancy using characteristics obtained at the first prenatal visit.
Methods: This was a cross-sectional study using national live-birth data in the USA between 2016 and 2021. The association of all prenatal candidate variables with HDP and GDM was tested using univariable and multivariable logistic regression analyses. Prediction models were built with generalized linear models using the logit link function and with a gradient-boosted classification and regression tree (XGBoost) machine learning algorithm. Performance was assessed with repeated 2-fold cross-validation, and the area under the receiver-operating-characteristics curve (AUC) was calculated. A P value < 0.001 was considered statistically significant.
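The evaluation scheme described in the Methods (repeated 2-fold cross-validation summarized as a mean ± SD AUC) can be sketched as follows. This is a minimal illustration, not the study's pipeline: it uses only the Python standard library, synthetic data, three hypothetical binary predictors and a plain gradient-descent logistic regression. XGBoost is not used here, and the names `auc`, `fit_logistic`, `predict` and `repeated_two_fold_auc`, as well as all coefficients, are assumptions made for the example.

```python
import math
import random
import statistics

def auc(scores, labels):
    """AUC via the rank-sum (Mann-Whitney) identity; tied scores get average ranks."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1  # average 1-based rank of the tie group
        i = j + 1
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    rank_sum = sum(r for r, y in zip(ranks, labels) if y == 1)
    return (rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

def predict(w, x):
    """Predicted probability from a logit-link linear model; w[0] is the intercept."""
    z = w[0] + sum(wj * xj for wj, xj in zip(w[1:], x))
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(X, y, lr=1.0, epochs=200):
    """Plain batch gradient descent on the logistic log-loss (illustrative only)."""
    w = [0.0] * (len(X[0]) + 1)
    n = len(X)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            err = predict(w, xi) - yi
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj - lr * g / n for wj, g in zip(w, grad)]
    return w

def repeated_two_fold_auc(X, y, repeats=3, seed=0):
    """Shuffle, split in half, train on one half and test on the other, then swap."""
    rng = random.Random(seed)
    idx = list(range(len(X)))
    aucs = []
    for _ in range(repeats):
        rng.shuffle(idx)
        half = len(idx) // 2
        folds = [idx[:half], idx[half:]]
        for train, test in ((folds[0], folds[1]), (folds[1], folds[0])):
            w = fit_logistic([X[i] for i in train], [y[i] for i in train])
            scores = [predict(w, X[i]) for i in test]
            aucs.append(auc(scores, [y[i] for i in test]))
    return statistics.mean(aucs), statistics.stdev(aucs), aucs

# Synthetic cohort: three hypothetical binary risk factors with made-up effects.
rng = random.Random(42)
X = [[int(rng.random() < 0.3) for _ in range(3)] for _ in range(1000)]
true_w = [-2.0, 0.8, 0.9, 0.5]  # assumed log-odds coefficients, not from the study
y = [int(rng.random() < predict(true_w, x)) for x in X]

mean_auc, sd_auc, fold_aucs = repeated_two_fold_auc(X, y)
print(f"AUC = {mean_auc:.3f} ± {sd_auc:.3f} over {len(fold_aucs)} folds")
```

Each repeat contributes two AUC values (one per held-out half), so three repeats give six estimates whose mean and SD are reported. The same loop would accept any other model, including a gradient-boosted one, in place of `fit_logistic`.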
Results: A total of 707 198 twin pregnancies were included in the HDP analysis and 723 882 twin pregnancies were included in the GDM analysis. The incidence of HDP and GDM increased significantly from 12.6% and 8.1%, respectively, in 2016 to 16.0% and 10.7%, respectively, in 2021. Factors associated with increased odds of HDP in twin pregnancy were maternal age < 20 years or ≥ 35 years, infertility treatment, prepregnancy diabetes mellitus, non-Hispanic Black race, overweight prepregnancy BMI, prepregnancy obesity and Medicaid as the payment source for delivery (P < 0.001 for all). Obesity Class II and III more than doubled the odds of HDP. Factors associated with increased odds of GDM in twin pregnancy were maternal age ≤ 24 years or ≥ 30 years, infertility treatment, prepregnancy hypertension, non-Hispanic Asian race, maternal birthplace outside the USA and prepregnancy obesity (P < 0.001 for all). Maternal age ≥ 30 years, non-Hispanic Asian race and obesity Class I, II and III more than doubled the odds of GDM. For both HDP and GDM, the performances of the machine learning model and logistic regression model were mostly similar, with negligible differences in the performance domains tested. The mean ± SD AUCs of the final machine learning models for HDP and GDM were 0.620 ± 0.001 and 0.671 ± 0.001, respectively.
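Because the Results report effects on the odds scale, it may help to see how a doubling of the odds translates into absolute risk. The sketch below is a worked illustration under stated assumptions: the function name `risk_from_odds_ratio` is hypothetical, the odds ratio of 2.0 is a round number standing in for "doubled the odds", and the 16.0% baseline borrows the overall 2021 HDP incidence purely for illustration (the true reference-group risk is not given in the abstract).

```python
def risk_from_odds_ratio(baseline_risk, odds_ratio):
    """Convert a baseline risk and an odds ratio into the implied risk in the exposed group."""
    baseline_odds = baseline_risk / (1.0 - baseline_risk)
    exposed_odds = baseline_odds * odds_ratio
    return exposed_odds / (1.0 + exposed_odds)

# Illustrative inputs only: 16.0% baseline risk, odds ratio of 2.0.
exposed_risk = risk_from_odds_ratio(0.16, 2.0)
print(f"{exposed_risk:.3f}")  # 0.276
```

Under these assumptions, doubling the odds raises the risk from 16.0% to about 27.6%, which is less than a doubling of the risk itself. The gap between odds ratio and risk ratio widens as the baseline risk grows.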
Conclusions: The incidence of HDP and GDM in twin pregnancies in the USA is increasing. The predictive accuracy of the machine learning models for HDP and GDM in twin pregnancies was similar to that of the logistic regression models. The models for HDP and GDM had modest predictive performance, were well calibrated and showed no evidence of poor fit.