An Assessment of ChatGPT’s Performance as a Patient Counseling Tool: Exploring the Potential Integration of Large Language Model-based ChatBots into Online Patient Portals

Date
2024-04-26
Language
American English
Abstract

BACKGROUND: With the advancement of online patient portals, patients now have unprecedented access to their healthcare providers. This access has increased physician burden in the form of electronic inbox overload [1]. Recent developments in artificial intelligence, specifically Large Language Model-based chatbots such as ChatGPT, may prove useful in reducing that burden. Can ChatGPT be used reliably as a patient counseling tool? ChatGPT has been described as “an advanced language model that uses deep learning techniques to produce human-like responses to natural language inputs” [5]. Despite concerns surrounding this technology (e.g., spread of misinformation, inconsistent reproducibility, “hallucination” phenomena), several studies have demonstrated ChatGPT’s clinical acumen. One study examined ChatGPT’s ability to answer frequently asked fertility-related questions and found the model’s responses comparable to the CDC’s published answers with respect to length, factual content, and sentiment [6]. ChatGPT has also achieved a passing score on the Step 1 licensing examination, a benchmark set for third-year medical students [7].

OBJECTIVE: This study further evaluates the clinical decision making of ChatGPT, specifically its ability to provide accurate medical counseling in response to frequently asked patient questions within the field of cardiology.

METHODS: Thirty-five frequently asked cardiovascular questions (FAQs) published by the OHSU Knight Cardiovascular Institute were processed through ChatGPT 4 (Classic Version) by OpenAI. ChatGPT’s answers and those provided by the OHSU Knight Cardiovascular Institute were assessed with respect to length, factual content, sentiment, and the presence of incorrect or false statements.

RESULTS: Compared with the published OHSU responses to the 35 FAQs, ChatGPT’s responses were significantly longer (295.4 vs. 112.5 words per response) and included more factual statements per response (7.2 vs. 3.5). ChatGPT produced responses of similar sentiment polarity (0.10 vs. 0.11 on a scale from -1, negative, to 1, positive) and subjectivity (0.46 vs. 0.43 on a scale from 0, objective, to 1, subjective). None of ChatGPT’s factual statements were found to be false or harmful.

CONCLUSIONS: These results provide valuable insight into the clinical “knowledge” and fluency of ChatGPT, demonstrating its ability to produce accurate and effective responses to frequently asked cardiovascular questions. Larger-scale studies with an additional focus on ChatGPT’s reproducibility and consistency may carry important implications for the future of patient education. Integrating AI-based chatbots into online patient portals may assist physicians by alleviating the growing burden of electronic inbox volume.
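The polarity (-1 to 1) and subjectivity (0 to 1) scales reported in RESULTS match those of the TextBlob Python library. The abstract does not name the sentiment tool used, so TextBlob is an assumption here; the sketch below shows how the per-response metrics (word count, polarity, subjectivity) could plausibly be computed, with hypothetical answer text standing in for the actual FAQ responses.

# Minimal sketch of the per-response metrics described in METHODS.
# TextBlob is assumed (not confirmed by the abstract); answer text is hypothetical.
from textblob import TextBlob

def summarize(response: str) -> dict:
    """Word count plus TextBlob sentiment for one FAQ response."""
    blob = TextBlob(response)
    return {
        "words": len(response.split()),
        "polarity": blob.sentiment.polarity,          # -1 (negative) to 1 (positive)
        "subjectivity": blob.sentiment.subjectivity,  # 0 (objective) to 1 (subjective)
    }

# Hypothetical paired answers to the same cardiovascular FAQ:
chatgpt_answer = "Statins lower LDL cholesterol, which reduces cardiovascular risk..."
ohsu_answer = "Statins are medications that lower cholesterol levels..."

for source, text in [("ChatGPT", chatgpt_answer), ("OHSU", ohsu_answer)]:
    print(source, summarize(text))

Averaging these per-response values over the 35 FAQ pairs would yield group means of the kind reported in RESULTS; counting factual statements per response, by contrast, required manual review and is not reproduced here.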

Cite As
Price CG, Brougham AJ, Burton KA, Dexter PR. An Assessment of ChatGPT’s Performance as a Patient Counseling Tool: Exploring the Potential Integration of Large Language Model-based ChatBots into Online Patient Portals. Poster presented at: Indiana University School of Medicine Education Day; April 26, 2024; Indianapolis, IN.
Type
Poster