Dementia Risk Prediction Using Decision-Focused Content Selection from Medical Notes

Date
2024
Language
American English
Found At
Elsevier
Abstract

Several general-purpose language model (LM) architectures have been proposed with demonstrated improvements in text summarization and classification. Adapting these architectures to the medical domain requires additional considerations. For instance, the medical history of a patient is documented in the Electronic Health Record (EHR), which includes many medical notes drafted by healthcare providers. Direct processing of these notes may not be possible because the computational complexity of LMs imposes a limit on the length of the input text. Therefore, previous applications resorted to content selection using truncation or summarization of the text. Unfortunately, these text processing techniques may lead to information loss, redundancy, or irrelevance. In the present paper, a decision-focused content selection technique is proposed. The objective of this technique is to select a subset of sentences from the medical notes of a patient that are relevant to the target outcome over a predefined observation period. This decision-focused content selection methodology is then used to develop a dementia risk prediction model based on the Longformer LM architecture. The results show that the proposed framework delivers an AUC of 78.43 when the summary is restricted to 1024 tokens, outperforming previously proposed content selection techniques. This performance is notable given that the model estimates dementia risk with a one-year prediction horizon, relies on an observation period of only one year, and uses only medical notes without other EHR data modalities. Moreover, the proposed techniques overcome the limitation of machine learning models that use a tabular representation of the text by preserving contextual content, enable feature engineering from raw text, and circumvent the computational complexity of language models.
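
The general idea of selecting sentences under a fixed token budget can be illustrated with a minimal sketch. The abstract does not specify how sentence relevance is scored or how tokens are counted, so everything below is an assumption made for illustration only: the function `select_sentences`, the keyword-based `keyword_score`, the `KEYWORDS` set, and the whitespace token count are hypothetical and are not the authors' method.

```python
from typing import Callable, List


def select_sentences(
    sentences: List[str],
    score: Callable[[str], float],
    max_tokens: int = 1024,
) -> List[str]:
    """Greedily keep the highest-scoring sentences that fit in the token budget.

    Token counts are approximated by whitespace splitting (an assumption; a
    real LM tokenizer would count differently). Selected sentences are
    returned in their original order so that local context is preserved.
    """
    # Rank sentence indices from most to least relevant.
    ranked = sorted(
        range(len(sentences)),
        key=lambda i: score(sentences[i]),
        reverse=True,
    )
    chosen, used = [], 0
    for i in ranked:
        n_tokens = len(sentences[i].split())
        if used + n_tokens <= max_tokens:
            chosen.append(i)
            used += n_tokens
    return [sentences[i] for i in sorted(chosen)]


# Hypothetical relevance score: number of outcome-related keywords present.
KEYWORDS = {"memory", "confusion", "cognitive"}


def keyword_score(sentence: str) -> float:
    words = {w.strip(".,;:").lower() for w in sentence.split()}
    return float(len(words & KEYWORDS))


notes = [
    "Patient reports memory loss and confusion.",
    "Blood pressure is stable.",
    "Family notes cognitive decline over six months.",
]
# A small budget is used here so that one sentence has to be dropped.
summary = select_sentences(notes, keyword_score, max_tokens=13)
```

In this toy example the budget of 13 whitespace tokens admits the two keyword-bearing sentences and drops the unrelated one. In practice the scoring function would be replaced by whatever relevance model the task calls for, and the token count by the tokenizer of the downstream LM.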

Cite As
Li S, Dexter P, Ben-Miled Z, Boustani M. Dementia risk prediction using decision-focused content selection from medical notes. Comput Biol Med. 2024;182:109144. doi:10.1016/j.compbiomed.2024.109144
Journal
Computers in Biology and Medicine
Source
PMC
Type
Article
Version
Author's manuscript