A Deep Language Model for Symptom Extraction From Clinical Text and its Application to Extract COVID-19 Symptoms From Social Media

Luo, Xiao; Gandhi, Priyanka; Storey, Susan; Huang, Kun

dc.contributor.author	Luo, Xiao
dc.contributor.author	Gandhi, Priyanka
dc.contributor.author	Storey, Susan
dc.contributor.author	Huang, Kun
dc.contributor.department	Biostatistics and Health Data Science, School of Medicine
dc.date.accessioned	2023-11-29T11:26:01Z
dc.date.available	2023-11-29T11:26:01Z
dc.date.issued	2022
dc.description.abstract	Patients experience various symptoms when they have acute or chronic diseases or undergo treatment for them. Symptoms are often indicators of the severity of the disease and the need for hospitalization. They are usually described in free text in the clinical notes of Electronic Health Records (EHR) and are not integrated with other clinical factors for disease prediction and healthcare outcome management. In this research, we propose a novel deep language model to extract patient-reported symptoms from clinical text. The deep language model integrates syntactic and semantic analysis for symptom extraction and identifies both the actual symptoms reported by patients and conditional or negated symptoms. It can extract both complex and straightforward symptom expressions. We evaluated our model on a real-world clinical notes dataset and demonstrated that it achieves superior performance compared to three other state-of-the-art symptom extraction models. We analyzed our model extensively, examining each component's contribution to illustrate its effectiveness. Finally, we applied our model to a COVID-19 tweet dataset to extract COVID-19 symptoms. The results show that our model can identify all the symptoms suggested by the CDC, ahead of the CDC's timeline, as well as many rare symptoms.
dc.eprint.version	Author's manuscript
dc.identifier.citation	Luo X, Gandhi P, Storey S, Huang K. A Deep Language Model for Symptom Extraction From Clinical Text and its Application to Extract COVID-19 Symptoms From Social Media. IEEE J Biomed Health Inform. 2022;26(4):1737-1748. doi:10.1109/JBHI.2021.3123192
dc.identifier.uri	https://hdl.handle.net/1805/37200
dc.language.iso	en_US
dc.publisher	IEEE
dc.relation.isversionof	10.1109/JBHI.2021.3123192
dc.relation.journal	IEEE Journal of Biomedical and Health Informatics
dc.rights	Publisher Policy
dc.source	PMC
dc.subject	Natural Language Processing
dc.subject	Symptom Extraction
dc.subject	Deep Language Model
dc.subject	COVID-19
dc.subject	Social Media
dc.title	A Deep Language Model for Symptom Extraction From Clinical Text and its Application to Extract COVID-19 Symptoms From Social Media
dc.type	Article

Files

Original bundle

Name: nihms-1798540.pdf
Size: 1.66 MB
Format: Adobe Portable Document Format


Collections

Open Access Policy Articles
Biostatistics and Health Data Science Works