Machine Learning Approaches to Identify Nicknames from A Statewide Health Information Exchange
Abstract
Patient matching is essential to minimize fragmentation of patient data, yet existing patient matching efforts often do not account for nickname use. We sought to develop decision models that could identify true nicknames using features representing the phonetic and structural similarity of name pairs. We identified potential male and female name pairs from the Indiana Network for Patient Care (INPC) and developed a series of features that represented their phonetic and structural similarities. Next, we used the XGBoost classifier with hyperparameter tuning to build decision models that identify nicknames from these feature sets, trained and evaluated against a manually reviewed gold standard. The decision models achieved high Precision (Positive Predictive Value) and Accuracy scores for both male and female name pairs despite the low number of true nickname matches in the datasets under study. Ours is one of the first efforts to identify patient nicknames using machine learning approaches.
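To make the idea of phonetic and structural similarity features concrete, the following is a minimal Python sketch of how a name pair might be turned into a feature vector. The abstract does not list the features actually used in the study, so every feature here (Soundex agreement, Levenshtein edit distance, length difference, shared first letter, prefix relation) and the function name `name_pair_features` are illustrative assumptions, not the authors' feature set.

```python
# Illustrative sketch only: the feature choices below are assumptions,
# not the feature set used in the study.

# Standard American Soundex consonant codes.
_SOUNDEX_CODES = {
    **dict.fromkeys("BFPV", "1"),
    **dict.fromkeys("CGJKQSXZ", "2"),
    **dict.fromkeys("DT", "3"),
    "L": "4",
    **dict.fromkeys("MN", "5"),
    "R": "6",
}


def soundex(name: str) -> str:
    """Return the 4-character American Soundex code of a name."""
    letters = [c for c in name.upper() if c.isalpha()]
    if not letters:
        return ""
    first = letters[0]
    encoded = []
    prev = _SOUNDEX_CODES.get(first, "")
    for ch in letters[1:]:
        if ch in "HW":
            continue  # H and W do not separate letters with the same code
        code = _SOUNDEX_CODES.get(ch, "")
        if code and code != prev:
            encoded.append(code)
        prev = code  # vowels reset prev, so repeated codes across a vowel count
    return (first + "".join(encoded) + "000")[:4]


def levenshtein(a: str, b: str) -> int:
    """Return the edit distance (insert, delete, substitute) between a and b."""
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def name_pair_features(name_a: str, name_b: str) -> dict:
    """Build hypothetical phonetic and structural features for a name pair."""
    a, b = name_a.strip().lower(), name_b.strip().lower()
    distance = levenshtein(a, b)
    longest = max(len(a), len(b)) or 1  # avoid division by zero
    return {
        # Phonetic similarity (assumed feature)
        "soundex_match": int(soundex(a) == soundex(b)),
        # Structural similarity (assumed features)
        "edit_distance": distance,
        "normalized_edit_similarity": 1 - distance / longest,
        "length_difference": abs(len(a) - len(b)),
        "same_first_letter": int(a[:1] == b[:1]),
        "is_prefix": int(a.startswith(b) or b.startswith(a)),
    }


if __name__ == "__main__":
    for pair in [("William", "Will"), ("Katherine", "Catherine")]:
        print(pair, name_pair_features(*pair))
```

Feature dictionaries of this kind, paired with gold-standard labels, could then be passed to a gradient-boosted classifier such as XGBoost. The classifier step is left out of the sketch because the abstract does not specify the hyperparameters or the tuning procedure.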