Is ChatGPT 3.5 smarter than Otolaryngology trainees? A comparison study of board style exam questions

dc.contributor.author: Patel, Jaimin
dc.contributor.author: Robinson, Peyton
dc.contributor.author: Illing, Elisa
dc.contributor.author: Anthony, Benjamin
dc.contributor.department: Otolaryngology -- Head and Neck Surgery, School of Medicine
dc.date.accessioned: 2024-10-31T09:52:28Z
dc.date.available: 2024-10-31T09:52:28Z
dc.date.issued: 2024-09-26
dc.description.abstract: Objectives: This study compares the performance of the artificial intelligence (AI) platform Chat Generative Pre-Trained Transformer (ChatGPT) to Otolaryngology trainees on board-style exam questions. Methods: We administered a set of 30 Otolaryngology board-style questions to medical students (MS) and Otolaryngology residents (OR); 31 MSs and 17 ORs completed the questionnaire. The same test was administered to ChatGPT version 3.5 five times. Performance was compared using a one-way ANOVA with a Tukey post hoc test, along with a regression analysis to explore the relationship between education level and performance. Results: Average scores increased with each year of training from MS1 to PGY5. The one-way ANOVA revealed that ChatGPT outperformed trainee years MS1, MS2, and MS3 (p < 0.001, p = 0.003, and p = 0.019, respectively). PGY4 and PGY5 Otolaryngology residents outperformed ChatGPT (p = 0.033 and p = 0.002, respectively). For years MS4, PGY1, PGY2, and PGY3, there was no statistically significant difference between trainee scores and ChatGPT (p = 0.104, 0.996, and 1.000, respectively). Conclusion: ChatGPT can outperform lower-level medical trainees on Otolaryngology board-style exam questions but cannot yet outperform higher-level trainees. These questions primarily test rote memorization of medical facts; in contrast, the art of practicing medicine is predicated on synthesizing complex presentations of disease and applying layered knowledge of the healing process. Given that upper-level trainees outperform ChatGPT, it is unlikely that ChatGPT, in its current form, will provide significant clinical utility over an Otolaryngologist.
dc.eprint.version: Final published version
dc.identifier.citation: Patel J, Robinson P, Illing E, Anthony B. Is ChatGPT 3.5 smarter than Otolaryngology trainees? A comparison study of board style exam questions. PLoS One. 2024;19(9):e0306233. Published 2024 Sep 26. doi:10.1371/journal.pone.0306233
dc.identifier.uri: https://hdl.handle.net/1805/44380
dc.language.iso: en_US
dc.publisher: Public Library of Science
dc.relation.isversionof: 10.1371/journal.pone.0306233
dc.relation.journal: PLoS One
dc.rights: Attribution 4.0 International
dc.rights.uri: http://creativecommons.org/licenses/by/4.0/
dc.source: PMC
dc.subject: Artificial intelligence
dc.subject: Clinical competence
dc.subject: Educational measurement
dc.subject: Otolaryngology
dc.title: Is ChatGPT 3.5 smarter than Otolaryngology trainees? A comparison study of board style exam questions
dc.type: Article
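For context on the statistical comparison described in the abstract, the following is a minimal Python sketch of a one-way ANOVA followed by a Tukey HSD post hoc test, using SciPy and statsmodels. The group labels and score values below are hypothetical placeholders for illustration only; they are not the study's data or the authors' code.

```python
# Illustrative sketch: one-way ANOVA plus Tukey HSD post hoc test on
# hypothetical exam scores (made-up values, not the study's data).
from scipy import stats
from statsmodels.stats.multicomp import pairwise_tukeyhsd

# Hypothetical percent-correct scores on a 30-question exam
scores = {
    "MS1":     [40, 43, 47, 50],
    "PGY5":    [80, 83, 87, 90],
    "ChatGPT": [63, 67, 70, 73, 77],  # five repeated administrations
}

# One-way ANOVA across all groups
f_stat, p_value = stats.f_oneway(*scores.values())
print(f"ANOVA: F = {f_stat:.2f}, p = {p_value:.4f}")

# Tukey HSD post hoc test for pairwise group comparisons
values = [v for group in scores.values() for v in group]
labels = [name for name, group in scores.items() for _ in group]
print(pairwise_tukeyhsd(values, labels, alpha=0.05))
```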
Files
Original bundle
Name: Patel2024Chat-CCBY.pdf
Size: 453.1 KB
Format: Adobe Portable Document Format
License bundle
Name: license.txt
Size: 2.04 KB
Format: Item-specific license agreed upon to submission