Data-driven clustering identifies features distinguishing multisystem inflammatory syndrome from acute COVID-19 in children and adolescents
Abstract
Background Multisystem inflammatory syndrome in children (MIS-C) consensus criteria were designed for maximal sensitivity and therefore capture patients with acute COVID-19 pneumonia.
Methods We performed unsupervised clustering on data from 1,526 patients younger than 21 years (684 labeled MIS-C by clinicians) who were hospitalized with COVID-19-related illness between 15 March 2020 and 31 December 2020. We compared the prevalence of assigned MIS-C labels and clinical features among clusters, then used recursive feature elimination to identify characteristics of potentially misclassified MIS-C-labeled patients.
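The two-step analysis described in the Methods can be illustrated with a minimal sketch. The abstract does not name the clustering algorithm or the classifier used for recursive feature elimination, so k-means and logistic regression from scikit-learn are assumptions made here purely for illustration. The data are synthetic, and the feature indices are hypothetical stand-ins for clinical variables.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Synthetic stand-in data (NOT study data): 3 groups of 100 hypothetical
# patients with 6 hypothetical features. Features 0-3 differ between
# groups; features 4-5 are pure noise.
n_per_group = 100
centers = np.array([
    [3.0, 3.0, 0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 3.0, 3.0, 0.0, 0.0],
    [-3.0, -3.0, -3.0, -3.0, 0.0, 0.0],
])
X = np.vstack([c + rng.normal(size=(n_per_group, 6)) for c in centers])
X_scaled = StandardScaler().fit_transform(X)

# Step 1: unsupervised clustering. k-means with k=3 is an assumption;
# no outcome labels are used at this stage.
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0)
cluster_ids = kmeans.fit_predict(X_scaled)

# Step 2: recursive feature elimination to find the features that best
# separate two of the clusters. Logistic regression is an assumed estimator.
mask = cluster_ids < 2  # keep clusters 0 and 1 only
rfe = RFE(LogisticRegression(max_iter=1000), n_features_to_select=2)
rfe.fit(X_scaled[mask], cluster_ids[mask])

selected = np.flatnonzero(rfe.support_)
print("Features retained by RFE:", selected.tolist())
```

In the study itself, the second step compared MIS-C-labeled patients across clusters rather than whole clusters; the sketch only shows the mechanics of eliminating the least informative features one at a time.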
Findings Of 94 clinical features tested, 46 were retained for clustering. Cluster 1 patients (N = 498; 92% labeled MIS-C) were mostly previously healthy (71%), with a mean age of 7·2 ± 0·4 years, predominant cardiovascular (77%) and/or mucocutaneous (82%) involvement, and high inflammatory biomarkers, and were mostly SARS-CoV-2 PCR negative (60%). Cluster 2 patients (N = 445; 27% labeled MIS-C) frequently had pre-existing conditions (79%, with 39% respiratory), were of similar age (7·4 ± 2·1 years), and commonly had chest radiograph infiltrates (79%) and positive PCR testing (90%). Cluster 3 patients (N = 583; 19% labeled MIS-C) were younger (2·8 ± 2·0 years), were mostly PCR positive (86%), and had less inflammation. Radiographic findings of pulmonary infiltrates and positive SARS-CoV-2 PCR accurately distinguished cluster 2 MIS-C-labeled patients from cluster 1 patients.
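As a rough consistency check on the figures above, the cluster sizes and MIS-C label percentages can be converted into approximate patient counts. The sketch below uses only numbers reported in the abstract. Because the percentages are rounded, the implied counts are approximate and their sum is expected to fall within a few patients of the reported 684, not to match it exactly.

```python
# Cluster size and fraction labeled MIS-C, as reported in the abstract.
clusters = {
    "cluster 1": (498, 0.92),
    "cluster 2": (445, 0.27),
    "cluster 3": (583, 0.19),
}

# The three cluster sizes should add up to the full cohort.
total_patients = sum(n for n, _ in clusters.values())

# Approximate number of MIS-C-labeled patients implied by each rounded percentage.
implied = {name: round(n * frac) for name, (n, frac) in clusters.items()}
implied_total = sum(implied.values())

print("Total patients:", total_patients)
for name, count in implied.items():
    print(f"{name}: ~{count} labeled MIS-C")
print("Implied MIS-C-labeled total:", implied_total, "(reported: 684)")
```

The counts make the contrast concrete: most MIS-C labels fall in cluster 1, while the roughly 120 MIS-C-labeled patients in cluster 2 are the group the authors flag as potentially misclassified.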
Interpretation Using a data-driven, unsupervised approach, we identified features that cluster patients into a group with a high likelihood of having MIS-C. Other features identified a cluster of patients more likely to have acute severe COVID-19 pulmonary disease; patients in this cluster whom clinicians labeled as MIS-C may be misclassified. These data-driven phenotypes may help refine the diagnosis of MIS-C.