Browsing by Author "Toh, Sengwee"
Data-driven automated classification algorithms for acute health conditions: applying PheNorm to COVID-19 disease (American Medical Informatics Association, 2024)
Smith, Joshua C.; Williamson, Brian D.; Cronkite, David J.; Park, Daniel; Whitaker, Jill M.; McLemore, Michael F.; Osmanski, Joshua T.; Winter, Robert; Ramaprasan, Arvind; Kelley, Ann; Shea, Mary; Wittayanukorn, Saranrat; Stojanovic, Danijela; Zhao, Yueqin; Toh, Sengwee; Johnson, Kevin B.; Aronoff, David M.; Carrell, David S.; Medicine, School of Medicine
Objectives: Automated phenotyping algorithms can reduce development time and operator dependence compared to manually developed algorithms. One such approach, PheNorm, has performed well for identifying chronic health conditions, but its performance for acute conditions is largely unknown. Herein, we implement and evaluate PheNorm applied to symptomatic COVID-19 disease to investigate its potential feasibility for rapid phenotyping of acute health conditions. Materials and methods: PheNorm is a general-purpose automated approach to creating computable phenotype algorithms based on natural language processing, machine learning, and low-cost silver-standard training labels. We applied PheNorm to cohorts of potential COVID-19 patients from 2 institutions and used gold-standard manual chart review data to investigate how performance is affected by alternative feature engineering options and by implementing externally trained models without local retraining. Results: At quantiles of model-predicted risk that maximize F1, models achieved AUC, sensitivity, and positive predictive value of 0.853, 0.879, and 0.851 at one institution and 0.804, 0.976, and 0.885 at the other. We report performance metrics for all combinations of silver labels, feature engineering options, and models trained internally versus externally. Discussion: Phenotyping algorithms developed using PheNorm performed well at both institutions.
Performance varied with different silver-standard labels and feature engineering options. Models developed locally at one site also worked well when implemented externally at the other site. Conclusion: PheNorm models successfully identified an acute health condition, symptomatic COVID-19. The simplicity of the PheNorm approach allows it to be applied at multiple study sites with substantially reduced overhead compared to traditional approaches.

Establishing a framework for privacy-preserving record linkage among electronic health record and administrative claims databases within PCORnet®, the National Patient-Centered Clinical Research Network (BMC, 2022-10-31)
Kiernan, Daniel; Carton, Thomas; Toh, Sengwee; Phua, Jasmin; Zirkle, Maryan; Louzao, Darcy; Haynes, Kevin; Weiner, Mark; Angulo, Francisco; Bailey, Charles; Bian, Jiang; Fort, Daniel; Grannis, Shaun; Krishnamurthy, Ashok Kumar; Nair, Vinit; Rivera, Pedro; Silverstein, Jonathan; Marsolo, Keith; Medicine, School of Medicine
Objective: The aim of this study was to determine whether a secure, privacy-preserving record linkage (PPRL) methodology can be implemented in a scalable manner for use in a large national clinical research network. Results: We established the governance and technical capacity to support the use of PPRL across the National Patient-Centered Clinical Research Network (PCORnet®). As a pilot, four sites used the Datavant software to transform patient personally identifiable information (PII) into de-identified tokens. We queried the sites for patients with a clinical encounter in 2018 or 2019 and matched their tokens to determine whether overlap existed. We described patient overlap among the sites and generated a "deduplicated" table of patient demographic characteristics. Overlapping patients were found in 3 of the 6 site-pairs.
Following deduplication, the total patient count was 3,108,515 (0.11% reduction), with the largest reduction in count for patients with an "Other/Missing" value for Sex, from 198 to 163 (17.6% reduction). The PPRL solution successfully links patients across data sources using distributed queries without directly accessing patient PII. The overlap queries and analysis performed in this pilot are being replicated across the full network to provide additional insight into patient linkages among a distributed research network.
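
The token-matching idea described in this abstract can be illustrated with a minimal sketch. It is not the Datavant software or the study's actual method: it assumes a keyed hash (HMAC-SHA256) over normalized name and date-of-birth fields as a stand-in tokenizer, and the function names, the shared key, and the toy patient records are all hypothetical. It shows how four sites yield six site-pairs to compare and how a deduplicated count follows from the union of tokens.

```python
import hashlib
import hmac
from itertools import combinations


def make_token(first, last, dob, secret):
    """Turn PII fields into a de-identified token (illustrative stand-in only)."""
    # Normalize so that trivial formatting differences still yield the same token.
    normalized = "|".join(part.strip().lower() for part in (first, last, dob))
    return hmac.new(secret, normalized.encode("utf-8"), hashlib.sha256).hexdigest()


def site_tokens(records, secret):
    """Tokenize every record held by one site; only tokens leave the site."""
    return {make_token(first, last, dob, secret) for first, last, dob in records}


def pairwise_overlap(tokens_by_site):
    """Count shared tokens for every unordered pair of sites."""
    return {
        (a, b): len(tokens_by_site[a] & tokens_by_site[b])
        for a, b in combinations(sorted(tokens_by_site), 2)
    }


def deduplicated_count(tokens_by_site):
    """Number of distinct patients across all sites, judged by token equality."""
    return len(set().union(*tokens_by_site.values()))


# Hypothetical key and toy records, invented purely for illustration.
SECRET = b"hypothetical-shared-key"
sites = {
    "site_a": site_tokens([("Ann", "Lee", "1980-01-02"), ("Bo", "Chan", "1975-05-06")], SECRET),
    "site_b": site_tokens([(" ann ", "LEE", "1980-01-02")], SECRET),
    "site_c": site_tokens([("Cy", "Diaz", "1990-09-09")], SECRET),
    "site_d": site_tokens([("Bo", "Chan", "1975-05-06"), ("Di", "Evans", "1966-03-03")], SECRET),
}

overlap = pairwise_overlap(sites)
total_before = sum(len(tokens) for tokens in sites.values())
total_after = deduplicated_count(sites)
```

In this toy data, `overlap` has one entry per site-pair, and the difference between `total_before` and `total_after` is the number of duplicate patient records removed by matching tokens rather than raw PII.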