TY - JOUR
T1 - Development of prognostic model for preterm birth using machine learning in a population-based cohort of Western Australia births between 1980 and 2015
AU - Wong, Kingsley
AU - Tessema, Gizachew A.
AU - Chai, Kevin
AU - Pereira, Gavin
N1 - Funding Information:
The authors wish to thank the Linkage, Data Outputs and Research Data Services Teams at the Western Australian Data Linkage Branch, the WA Registry of Births, Deaths and Marriages, as well as the data custodians for the Midwives Notification System, the Hospital Morbidity Data Collection, the Cancer Registry, and the WA Register of Developmental Anomalies—Birth Defects. K.W. was supported through an Australian Government Research Training Program scholarship. G.P. was supported with funding from the National Health and Medical Research Council Project and Investigator Grants #1099655 and #1173991, institutional funding for the WA Health and Artificial Intelligence Consortium, and the Research Council of Norway through its Centres of Excellence funding scheme #262700. G.T. was supported with funding from the National Health and Medical Research Council Investigator Grant #1195716.
Publisher Copyright:
© 2022, The Author(s).
PY - 2022/12
Y1 - 2022/12
AB - Preterm birth is a global public health problem with a significant burden on the individuals affected. The study aimed to extend current research on preterm birth prognostic model development by developing and internally validating models using machine learning classification algorithms and population-based routinely collected data in Western Australia. The longitudinal retrospective cohort study involved all births in Western Australia between 1980 and 2015, and the analytic sample contained 81,974 (8.6%) preterm births (<37 weeks of gestation). Prediction models for preterm birth were developed using regularised logistic regression, decision trees, Random Forests, extreme gradient boosting, and multi-layer perceptron (MLP). Predictors included maternal socio-demographics and medical conditions, current and past pregnancy complications, and family history. Class weights were applied to handle imbalanced outcomes, and stratified tenfold cross-validation was used to reduce overfitting. Close to half of the preterm births (49.1% at 5% FPR, 95% CI 48.9%, 49.5%) were correctly classified by the best-performing classifier (MLP) for all women when current pregnancy information was available. Sensitivity rose to 52.7% (95% CI 52.1%, 53.3%) after including past obstetric history in a sub-population of births from multiparous women. Around half of preterm births can be identified antenatally at high specificity using population-based routinely collected maternal and pregnancy data. The performance of the prediction models depends on the available predictor pool, which is individual and time specific.
UR - http://www.scopus.com/inward/record.url?scp=85141449081&partnerID=8YFLogxK
U2 - 10.1038/s41598-022-23782-w
DO - 10.1038/s41598-022-23782-w
M3 - Article
C2 - 36352095
AN - SCOPUS:85141449081
VL - 12
JO - Scientific Reports
JF - Scientific Reports
SN - 2045-2322
IS - 1
M1 - 19153
ER -