Application of unsupervised deep learning algorithms for identification of specific clusters of chronic cough patients from EMR data

dc.contributor.author: Shao, Wei
dc.contributor.author: Luo, Xiao
dc.contributor.author: Zhang, Zuoyi
dc.contributor.author: Han, Zhi
dc.contributor.author: Chandrasekaran, Vasu
dc.contributor.author: Turzhitsky, Vladimir
dc.contributor.author: Bali, Vishal
dc.contributor.author: Roberts, Anna R.
dc.contributor.author: Metzger, Megan
dc.contributor.author: Baker, Jarod
dc.contributor.author: La Rosa, Carmen
dc.contributor.author: Weaver, Jessica
dc.contributor.author: Dexter, Paul
dc.contributor.author: Huang, Kun
dc.contributor.department: Biostatistics and Health Data Science, School of Medicine [en_US]
dc.date.accessioned: 2023-06-02T11:40:40Z
dc.date.available: 2023-06-02T11:40:40Z
dc.date.issued: 2022-04-19
dc.description.abstract: Background: Chronic cough affects approximately 10% of adults. The lack of ICD codes for chronic cough makes it challenging to apply supervised learning methods to predict the characteristics of chronic cough patients, thereby requiring the identification of chronic cough patients by other mechanisms. We developed a deep clustering algorithm with auto-encoder embedding (DCAE) to identify clusters of chronic cough patients based on data from a large cohort of 264,146 patients from the Electronic Medical Records (EMR) system. We constructed features using the diagnosis within the EMR, then built a clustering-oriented loss function directly on embedded features of the deep autoencoder to jointly perform feature refinement and cluster assignment. Lastly, we performed statistical analysis on the identified clusters to characterize the chronic cough patients compared to the non-chronic cough patients. Results: The experimental results show that the DCAE model generated three chronic cough clusters and one non-chronic cough patient cluster. We found various diagnoses, medications, and lab tests highly associated with chronic cough patients by comparing the chronic cough cluster with the non-chronic cough cluster. Comparison of chronic cough clusters demonstrated that certain combinations of medications and diagnoses characterize some chronic cough clusters. Conclusions: To the best of our knowledge, this study is the first to test the potential of unsupervised deep learning methods for chronic cough investigation, which also shows a great advantage over existing algorithms for patient data clustering. [en_US]
dc.eprint.version: Final published version [en_US]
dc.identifier.citation: Shao W, Luo X, Zhang Z, et al. Application of unsupervised deep learning algorithms for identification of specific clusters of chronic cough patients from EMR data. BMC Bioinformatics. 2022;23(Suppl 3):140. Published 2022 Apr 19. doi:10.1186/s12859-022-04680-4 [en_US]
dc.identifier.uri: https://hdl.handle.net/1805/33419
dc.language.iso: en_US [en_US]
dc.publisher: BMC [en_US]
dc.relation.isversionof: 10.1186/s12859-022-04680-4 [en_US]
dc.relation.journal: BMC Bioinformatics [en_US]
dc.rights: Attribution 4.0 International [*]
dc.rights.uri: http://creativecommons.org/licenses/by/4.0/ [*]
dc.source: PMC [en_US]
dc.subject: Chronic cough [en_US]
dc.subject: Deep clustering [en_US]
dc.subject: EMR data [en_US]
dc.subject: Unsupervised learning [en_US]
dc.title: Application of unsupervised deep learning algorithms for identification of specific clusters of chronic cough patients from EMR data [en_US]
dc.type: Article [en_US]
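The abstract mentions a clustering-oriented loss built directly on the embedded features of an autoencoder, but this record does not give its form. The sketch below is illustrative only: it assumes a loss in the style of Deep Embedded Clustering (a Student's t soft assignment sharpened into a target distribution and compared with a KL divergence), which may differ from the DCAE loss in the paper. The function names, the toy embeddings, and the centroids are all hypothetical.

```python
import math

# Assumption: a DEC-style clustering loss, not necessarily the paper's DCAE loss.

def soft_assign(z, centroids, alpha=1.0):
    """Soft cluster assignment q[i][j] using a Student's t kernel on embeddings."""
    q = []
    for point in z:
        row = []
        for c in centroids:
            d2 = sum((a - b) ** 2 for a, b in zip(point, c))  # squared distance
            row.append((1.0 + d2 / alpha) ** (-(alpha + 1.0) / 2.0))
        total = sum(row)
        q.append([v / total for v in row])  # normalize over clusters
    return q

def target_distribution(q):
    """Sharpened targets p[i][j]: square q and divide by the soft cluster size."""
    k = len(q[0])
    freq = [sum(row[j] for row in q) for j in range(k)]
    p = []
    for row in q:
        w = [row[j] ** 2 / freq[j] for j in range(k)]
        total = sum(w)
        p.append([v / total for v in w])
    return p

def clustering_loss(p, q):
    """KL(P || Q), summed over all points and clusters."""
    return sum(
        pij * math.log(pij / qij)
        for prow, qrow in zip(p, q)
        for pij, qij in zip(prow, qrow)
        if pij > 0
    )

# Hypothetical 2-D embeddings (stand-ins for autoencoder outputs) and centroids.
z = [[-2.1, -1.9], [-1.8, -2.2], [2.0, 2.1], [1.9, 1.7]]
centroids = [[-1.0, -1.0], [1.0, 1.0]]

q = soft_assign(z, centroids)
p = target_distribution(q)
loss = clustering_loss(p, q)
labels = [row.index(max(row)) for row in q]  # hard assignment per point
```

In a joint training setup of this kind, the gradient of the loss would update both the encoder weights and the centroids, so that feature refinement and cluster assignment proceed together; the sketch only evaluates the loss once on fixed embeddings.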
Files
Original bundle
Name: 12859_2022_Article_4680.pdf
Size: 1.26 MB
Format: Adobe Portable Document Format
License bundle
Name: license.txt
Size: 1.99 KB
Format: Item-specific license agreed upon to submission