Weiner, Michael; Dexter, Paul R.; Heithoff, Kim; Roberts, Anna R.; Liu, Ziyue; Griffith, Ashley; Hui, Siu; Schelfhout, Jonathan; Dicpinigaitis, Peter; Doshi, Ishita; Weaver, Jessica P.

2022-03-24
2022-03-24
2021-06

Weiner, M., Dexter, P. R., Heithoff, K., Roberts, A. R., Liu, Z., Griffith, A., Hui, S., Schelfhout, J., Dicpinigaitis, P., Doshi, I., & Weaver, J. P. (2021). Identifying and Characterizing a Chronic Cough Cohort Through Electronic Health Records. Chest, 159(6), 2346–2355. https://doi.org/10.1016/j.chest.2020.12.011

https://hdl.handle.net/1805/28293

Background: Chronic cough (CC) of 8 weeks or more affects about 10% of adults and may lead to expensive treatments and reduced quality of life. Incomplete diagnostic coding complicates identifying CC in electronic health records (EHRs). Natural language processing (NLP) of EHR text could improve detection.

Research Question: Can NLP be used to identify cough in EHRs, and to characterize adults and encounters with CC?

Study Design and Methods: A Midwestern EHR system identified patients aged 18 to 85 years during 2005 to 2015. NLP was used to evaluate text notes, except prescriptions and instructions, for mentions of cough. Two physicians and a biostatistician reviewed 12 sets of 50 encounters each, with iterative refinements, until the positive predictive value for cough encounters exceeded 90%. NLP, International Classification of Diseases, 10th revision, or medication was used to identify cough. Three encounters spanning 56 to 120 days defined CC. Descriptive statistics summarized patients and encounters, including referrals.

Results: Optimizing NLP required identifying and eliminating cough denials, instructions, and historical references. Of 235,457 cough encounters, 23% had a relevant diagnostic code or medication. Applying chronicity to cough encounters identified 23,371 patients (61% women) with CC. NLP alone identified 74% of these patients; diagnoses or medications alone identified 15%. The positive predictive value of NLP in the reviewed sample was 97%. Referrals for cough occurred for 3.0% of patients; pulmonary medicine was most common initially (64% of referrals).

Limitations: Some patients with diagnosis codes for cough, encounters at intervals greater than 4 months, or multiple acute cough episodes may have been misclassified.

Interpretation: NLP successfully identified a large cohort with CC. Most patients were identified through NLP alone, rather than diagnoses or medications. NLP improved detection of patients nearly sevenfold, addressing the gap in ability to identify and characterize CC disease burden. Nearly all cases appeared to be managed in primary care. Identifying these patients is important for characterizing treatment and unmet needs.

en

Publisher Policy

chronic cough; electronic health records; natural language processing

Identifying and Characterizing a Chronic Cough Cohort Through Electronic Health Records

Article
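
The chronicity rule stated in the abstract ("Three encounters spanning 56 to 120 days defined CC") can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say exactly how the window was applied, so the code assumes that a patient qualifies when any three cough encounters on distinct dates have a first-to-last span of 56 to 120 days, inclusive. The function name `has_chronic_cough` and both constants are hypothetical.

```python
from datetime import date

# Assumed bounds taken from the abstract's wording; inclusivity is an assumption.
MIN_SPAN_DAYS = 56   # 8 weeks
MAX_SPAN_DAYS = 120


def has_chronic_cough(cough_dates):
    """Return True if any three cough encounters on distinct dates have a
    first-to-last span of MIN_SPAN_DAYS to MAX_SPAN_DAYS (inclusive).

    cough_dates: iterable of datetime.date, one per cough encounter.
    """
    days = sorted(set(cough_dates))  # collapse same-day encounters (assumption)
    for i in range(len(days)):
        # k >= i + 2 guarantees at least one encounter between first and last.
        for k in range(i + 2, len(days)):
            span = (days[k] - days[i]).days
            if span > MAX_SPAN_DAYS:
                break  # later dates only widen the span
            if span >= MIN_SPAN_DAYS:
                return True
    return False
```

For example, encounters on 2010-01-01, 2010-02-01, and 2010-03-15 span 73 days and would qualify under these assumptions, whereas three encounters within a single month would not.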