Identifying clinical feature clusters toward predicting stroke in patients with asymptomatic carotid stenosis

Date
2025
Language
American English
Found At
Springer Nature
Abstract

Machine learning models and feature selection methods are widely used to identify important clinical features in electronic health records (EHRs) for disease prediction, yet the use of graph neural networks (GNNs) to uncover clinical features associated with a disease remains largely unexplored. In this investigation, we developed a computational method that uses EHR data from Indiana University Medical Hospital to predict stroke in patients with asymptomatic carotid stenosis. We first constructed a clinical feature graph for each patient based on the co-occurrence of features (medications, diagnoses, and laboratory test results) in the EHR data within a predefined timeframe (e.g., 6 months before the detection of the disease). We then applied an unsupervised GNN-based clustering approach, together with our selection algorithm, to identify the clinical feature clusters most important for stroke prediction. The selected clinical features served as the basis for constructing patient representations, on which we evaluated several supervised learning models. Unlike conventional feature selection methods, our GNN-based feature selection approach relies solely on positive cases. Compared against baseline models for stroke prediction, our method achieved an AUC of 0.87 and an F1 score of 0.80, surpassing all baselines. We also conducted an ablation study on the amount of EHR data, measured in months, to determine the most effective timeframe for generating patient clinical feature graphs. By using the graph model to capture inherent relationships between clinical features, our approach offers a promising avenue for advancing disease prediction, particularly when few positive cases are available. Our code is available on GitHub (https://github.com/xudav001/Identifying-Phenotype-Clusters).
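
As an illustration of the graph construction step described in the abstract, the following is a minimal sketch and not the authors' implementation (see the linked repository for that). It assumes that two features "co-occur" when they are recorded on the same date, that the timeframe is a fixed number of days before an index date, and that edge weights count co-occurrence days. The function name `build_cooccurrence_graph`, the `window_days` parameter, and the feature labels are all hypothetical.

```python
from collections import Counter
from datetime import date, timedelta
from itertools import combinations


def build_cooccurrence_graph(events, index_date, window_days=180):
    """Build a weighted co-occurrence graph for one patient.

    events: iterable of (event_date, feature) pairs (hypothetical format).
    index_date: date the condition was detected; only events in the
        window_days before it are used.
    Returns a Counter mapping a sorted feature pair to the number of
    dates on which both features were recorded (assumed definition).
    """
    start = index_date - timedelta(days=window_days)

    # Group the features recorded on each date inside the window.
    features_by_day = {}
    for event_date, feature in events:
        if start <= event_date < index_date:
            features_by_day.setdefault(event_date, set()).add(feature)

    # Every pair of features seen on the same date adds weight to an edge.
    edges = Counter()
    for features in features_by_day.values():
        for a, b in combinations(sorted(features), 2):
            edges[(a, b)] += 1
    return edges


# Made-up example records for a single patient.
example_events = [
    (date(2020, 1, 10), "dx:hypertension"),
    (date(2020, 1, 10), "med:aspirin"),
    (date(2020, 3, 5), "dx:hypertension"),
    (date(2020, 3, 5), "med:aspirin"),
    (date(2020, 3, 5), "lab:ldl_high"),
    (date(2019, 1, 1), "med:statin"),       # outside the window, ignored
    (date(2019, 1, 1), "dx:hypertension"),  # outside the window, ignored
]
graph = build_cooccurrence_graph(example_events, index_date=date(2020, 6, 1))
```

In this sketch the resulting edge list could be passed to a graph library for clustering; how the published method weights edges and defines the timeframe may differ.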

Cite As
Xu D, Matinmehr S, Sawchuk A, Luo X. Identifying clinical feature clusters toward predicting stroke in patients with asymptomatic carotid stenosis. Int J Data Sci Anal. 2025;20(3):2511-2524. doi:10.1007/s41060-024-00597-8
Journal
International Journal of Data Science and Analytics
Source
PMC
Type
Article
Version
Author's manuscript