Development and validation of computable social phenotypes for health-related social needs

dc.contributor.author: Gregory, Megan E.
dc.contributor.author: Kasthurirathne, Suranga N.
dc.contributor.author: Magoc, Tanja
dc.contributor.author: McNamee, Cassidy
dc.contributor.author: Harle, Christopher A.
dc.contributor.author: Vest, Joshua R.
dc.contributor.department: Health Policy and Management, Richard M. Fairbanks School of Public Health
dc.date.accessioned: 2025-02-19T15:38:18Z
dc.date.available: 2025-02-19T15:38:18Z
dc.date.issued: 2025-01-07
dc.description.abstract: Objective: Measurement of health-related social needs (HRSNs) is complex. We sought to develop and validate computable phenotypes (CPs) using structured electronic health record (EHR) data for food insecurity, housing instability, financial insecurity, transportation barriers, and a composite-type measure of these, using human-defined rule-based and machine learning (ML) classifier approaches. Materials and methods: We collected HRSN surveys as the reference standard and obtained EHR data from 1550 patients in 3 health systems from 2 states. We followed a Delphi-like approach to develop the human-defined rule-based CP. For the ML classifier approach, we trained supervised ML (XGBoost) models using 78 features. Using surveys as the reference standard, we calculated sensitivity, specificity, positive predictive values, and area under the curve (AUC). We compared AUCs using the DeLong test and other performance measures using McNemar's test, and checked for differential performance. Results: Most patients (63%) reported at least one HRSN on the reference standard survey. Human-defined rule-based CPs exhibited poor performance (AUCs = .52 to .68). ML classifier CPs performed significantly better, but still poor-to-fair (AUCs = .68 to .75). Significant differences for race/ethnicity were found for ML classifier CPs (higher AUCs for White non-Hispanic patients). Important features included number of encounters and Medicaid insurance. Discussion: Using a supervised ML classifier approach, HRSN CPs approached thresholds of fair performance, but exhibited differential performance by race/ethnicity. Conclusion: CPs may help to identify patients who may benefit from additional social needs screening. Future work should explore the use of area-level features via geospatial data and natural language processing to improve model performance.
dc.eprint.version: Final published version
dc.identifier.citation: Gregory ME, Kasthurirathne SN, Magoc T, McNamee C, Harle CA, Vest JR. Development and validation of computable social phenotypes for health-related social needs. JAMIA Open. 2025;8(1):ooae150. Published 2025 Jan 7. doi:10.1093/jamiaopen/ooae150
dc.identifier.uri: https://hdl.handle.net/1805/45836
dc.language.iso: en_US
dc.publisher: Oxford University Press
dc.relation.isversionof: 10.1093/jamiaopen/ooae150
dc.relation.journal: JAMIA Open
dc.rights: Attribution-NonCommercial 4.0 International
dc.rights.uri: https://creativecommons.org/licenses/by-nc/4.0
dc.source: PMC
dc.subject: Social determinants of health
dc.subject: Electronic health records
dc.subject: Machine learning
dc.title: Development and validation of computable social phenotypes for health-related social needs
dc.type: Article
Files
Original bundle
Name: Gregory2025Development-CCBYNC.pdf
Size: 749.87 KB
Format: Adobe Portable Document Format
License bundle
Name: license.txt
Size: 2.04 KB
Format: Item-specific license agreed upon to submission