Improving the Performance of Clinical Prediction Tasks by Using Structured and Unstructured Data Combined with a Patient Network

dc.contributor.advisor: Luo, Xiao
dc.contributor.advisor: King, Brian
dc.contributor.author: Nouri Golmaei, Sara
dc.contributor.other: Zhang, Qingxue
dc.date.accessioned: 2021-08-09T17:39:29Z
dc.date.available: 2021-08-09T17:39:29Z
dc.date.issued: 2021-08
dc.degree.date: 2021 [en_US]
dc.degree.discipline: Electrical & Computer Engineering [en]
dc.degree.grantor: Purdue University [en_US]
dc.degree.level: M.S. [en_US]
dc.description: Indiana University-Purdue University Indianapolis (IUPUI) [en_US]
dc.description.abstract: With the increasing availability of Electronic Health Records (EHRs) and advances in deep learning techniques, developing deep predictive models that use EHR data to solve healthcare problems has gained momentum in recent years. The majority of clinical predictive models benefit from structured data in EHRs (e.g., lab measurements and medications). Still, learning clinical outcomes from all possible information sources is one of the main challenges when building predictive models. This work focuses mainly on two sources of information that have been underused by researchers: unstructured data (e.g., clinical notes) and a patient network. We propose a novel hybrid deep learning model, DeepNote-GNN, that integrates clinical notes information and patient network topological structure to improve 30-day hospital readmission prediction. DeepNote-GNN is a robust deep learning framework consisting of two modules: DeepNote and a patient network. DeepNote extracts deep representations of clinical notes using a feature aggregation unit on top of a state-of-the-art Natural Language Processing (NLP) technique, BERT. These deep representations are used to build a patient network, and a Graph Neural Network (GNN) is trained on that network to predict hospital readmission. Performance evaluation on the MIMIC-III dataset demonstrates that DeepNote-GNN achieves superior results compared to the state-of-the-art baselines on the 30-day hospital readmission task. We extensively analyze the DeepNote-GNN model to illustrate the effectiveness and contribution of each of its components. The model analysis shows that the patient network contributes significantly to the overall performance, and that DeepNote-GNN is robust and performs consistently well on the 30-day readmission prediction task.
To evaluate how well the DeepNote and patient network modules generalize to new prediction tasks, we create a multimodal model and train it on structured and unstructured data from the MIMIC-III dataset to predict patient mortality and Length of Stay (LOS). Our proposed multimodal model consists of four components: DeepNote, patient network, DeepTemporal, and score aggregation. While DeepNote keeps its functionality and extracts representations of clinical notes, we build a DeepTemporal module using a fully connected layer stacked on top of a one-layer Gated Recurrent Unit (GRU) to extract the deep representations of temporal signals. Independently of DeepTemporal, we extract feature vectors of temporal signals and use them to build a patient network. Finally, the DeepNote, DeepTemporal, and patient network scores are linearly aggregated to fit the multimodal model on downstream prediction tasks. Our results are competitive with the baseline model. The multimodal model analysis reveals that unstructured text data contribute more to the predictions than temporal signals. Moreover, there is no limitation in applying a patient network to structured data. Compared to the other modules, the patient network makes a more significant contribution to the prediction tasks. We believe that our efforts in this work have opened up a new study area that can be used to enhance the performance of clinical predictive models. [en_US]
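The abstract does not specify how the patient network is constructed or how the aggregation weights are obtained. As a minimal sketch only, the following assumes a cosine-similarity k-nearest-neighbour graph over per-patient feature vectors and a fixed weighted sum of the three module scores. The function names, the choice of cosine similarity, the value of k, and the weights are all illustrative assumptions, not the thesis implementation.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def knn_patient_graph(features, k):
    """Hypothetical patient network: link each patient to its k most
    similar patients and store the edges as an undirected adjacency map."""
    n = len(features)
    neighbors = {i: set() for i in range(n)}
    for i in range(n):
        ranked = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: cosine(features[i], features[j]),
            reverse=True,
        )
        for j in ranked[:k]:
            neighbors[i].add(j)
            neighbors[j].add(i)
    return neighbors


def aggregate_scores(note, temporal, network, weights, bias=0.0):
    """Linear aggregation of the three module scores.
    The weights are assumed here; in practice they would be learned."""
    w_note, w_temporal, w_network = weights
    return bias + w_note * note + w_temporal * temporal + w_network * network


# Toy feature vectors for four patients (made-up values for illustration).
features = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
graph = knn_patient_graph(features, k=1)

# Toy module scores for one patient, combined with assumed weights.
final = aggregate_scores(0.8, 0.4, 0.6, weights=(0.5, 0.2, 0.3))
print(graph, final)
```

In a full pipeline, the graph would be passed to a GNN that produces the patient network score, and the aggregation weights would be fitted on the downstream task rather than fixed by hand.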
dc.identifier.uri: https://hdl.handle.net/1805/26384
dc.identifier.uri: http://dx.doi.org/10.7912/C2/41
dc.language.iso: en_US [en_US]
dc.subject: Electronic Health Record (EHR) [en_US]
dc.subject: Clinical Note [en_US]
dc.subject: Clinical Predictive Models [en_US]
dc.subject: Multimodal Model [en_US]
dc.subject: Natural Language Processing (NLP) [en_US]
dc.subject: Patient Network [en_US]
dc.subject: Graph Neural Network (GNN) [en_US]
dc.subject: Structured Temporal Data [en_US]
dc.subject: Unstructured Data [en_US]
dc.subject: Data Fusion [en_US]
dc.subject: Deep Learning [en_US]
dc.subject: Feature Aggregation [en_US]
dc.subject: Readmission Prediction [en_US]
dc.subject: Mortality Prediction [en_US]
dc.subject: Length of Stay (LOS) prediction [en_US]
dc.title: Improving the Performance of Clinical Prediction Tasks by Using Structured and Unstructured Data Combined with a Patient Network [en_US]
dc.type: Thesis [en]
Files
Original bundle
Name: Sara_Nouri_Thesis (updated).pdf
Size: 1.45 MB
Format: Adobe Portable Document Format
Description: Thesis