Machine Learning Techniques for Prediction of Early Childhood Obesity

This paper aims to predict childhood obesity after age two, using only data collected prior to the second birthday by a clinical decision support system called CHICA.

Methods

Analyses of six different machine learning methods (RandomTree, RandomForest, J48, ID3, Naïve Bayes, and Bayes), each trained on CHICA data, show that an accurate, sensitive model can be created.

Results

Of the methods analyzed, the ID3 model trained on the CHICA dataset showed the best overall performance, with an accuracy of 85% and a sensitivity of 89%. Additionally, the ID3 model had a positive predictive value of 84% and a negative predictive value of 88%. The structure of the tree also gives insight into the strongest predictors of future obesity in children. Many of the strongest predictors seen in the ID3 modeling of the CHICA dataset have been independently validated in the literature as correlated with obesity, thereby supporting the validity of the model.

Conclusions

This study demonstrated that data from a production clinical decision support system can be used to build an accurate machine learning model to predict obesity in children after age two.
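
For readers less familiar with the four metrics reported in the results (accuracy, sensitivity, positive predictive value, and negative predictive value), the following is a minimal sketch of how each is derived from a binary confusion matrix. The class name `ConfusionCounts`, the function `summarize`, and the counts used are all hypothetical and chosen only for illustration; they are not the study's data, and the study's own evaluation procedure is not described in this abstract.

```python
from dataclasses import dataclass


@dataclass
class ConfusionCounts:
    """Counts from a binary classifier, with 'obese' as the positive class."""
    tp: int  # obese children predicted obese
    fp: int  # non-obese children predicted obese
    tn: int  # non-obese children predicted non-obese
    fn: int  # obese children predicted non-obese


def summarize(c: ConfusionCounts) -> dict:
    """Return accuracy, sensitivity, PPV, and NPV as fractions in [0, 1]."""
    total = c.tp + c.fp + c.tn + c.fn
    return {
        "accuracy": (c.tp + c.tn) / total,    # share of all predictions that are correct
        "sensitivity": c.tp / (c.tp + c.fn),  # share of truly obese children detected
        "ppv": c.tp / (c.tp + c.fp),          # share of 'obese' predictions that are correct
        "npv": c.tn / (c.tn + c.fn),          # share of 'non-obese' predictions that are correct
    }


# Hypothetical counts for illustration only (not taken from the CHICA dataset).
counts = ConfusionCounts(tp=30, fp=10, tn=50, fn=10)
metrics = summarize(counts)

for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```

Sensitivity and the predictive values answer different questions: sensitivity describes how many obese children the model finds, while the predictive values describe how far an individual prediction can be trusted, and the latter depend on how common obesity is in the population being evaluated.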

Keywords

Bayes theorem, Obesity, Artificial intelligence, Decision trees, Predictive analytics

Cite As

Dugan, T. M., Mukhopadhyay, S., Carroll, A., & Downs, S. (2015). Machine Learning Techniques for Prediction of Early Childhood Obesity. Applied Clinical Informatics, 6(3), 506–520. http://doi.org/10.4338/ACI-2015-03-RA-0036

Journal

Applied Clinical Informatics

Rights

Publisher Policy

Source

PMC

Type

Article

Permanent Link

https://hdl.handle.net/1805/12952

DOI

https://doi.org/10.4338/ACI-2015-03-RA-0036

Full Text Available at

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4586339/

Collections

Open Access Policy Articles
Department of Computer and Information Science Works
