Gender Bias in Artificial Intelligence-Written Letters of Reference for Otolaryngology Residency Candidates

Young, Grace; Abouyared, Marianne; Kejner, Alexandra; Patel, Rusha; Edwards, Heather; Yin, Linda; Farlow, Janice

dc.contributor.author	Young, Grace
dc.contributor.author	Abouyared, Marianne
dc.contributor.author	Kejner, Alexandra
dc.contributor.author	Patel, Rusha
dc.contributor.author	Edwards, Heather
dc.contributor.author	Yin, Linda
dc.contributor.author	Farlow, Janice
dc.date.accessioned	2025-04-21T21:20:56Z
dc.date.available	2025-04-21T21:20:56Z
dc.date.issued	2025-04-25
dc.description.abstract	Introduction/Background: Written letters of reference (LORs) are an important component of the residency application process, and human-written LORs have been shown to contain gender bias. Given that AI tools such as ChatGPT are increasingly used to draft LORs, it is important to understand how bias may be perpetuated by these tools. Study objective/Hypothesis: In a previous study, we identified gender bias in AI-written LORs when using prompts with randomly generated resume variables. We sought to investigate whether this bias persisted when using real applicant experiences, and how it compared with the bias in LORs written by otolaryngology faculty. Methods: We obtained 46 LORs for otolaryngology residency applicants written by faculty from 5 different institutions who regularly compose LORs. Prompts describing each candidate’s experiences, using the exact phrasing of the letter writers, were provided to ChatGPT4.0 in individual sessions. The writer-generated and AI-generated letters were compared using a gender-bias calculator (https://slowe.github.io/genderbias/), which reports the ratio of male-associated ‘ability’ words to female-associated ‘grindstone’ words. Results: Both the writer-generated and AI-generated letters exhibited male bias on average (18.7% and 37.2%, respectively). A paired t-test showed that the AI-generated letters exhibited significantly higher male bias (t-statistic: -4.27, p-value: 0.0001). Independent t-tests did not reveal a significant difference between male and female applicants for either writer-generated (t-statistic: 1.54, p-value: 0.131) or AI-generated letters (t-statistic: 0.14, p-value: 0.892). However, Levene’s test comparing variation in scores indicated that AI-generated letters had significantly lower variability than writer-generated letters (Levene’s statistic: 11.38, p-value: 0.0011), and notably, every single AI-generated letter was male biased. Of the LORs, 54.3% were written for male candidates.
Conclusions: While the use of AI for letter drafting resulted in overall male bias, there was no significant difference between letters using male versus female names, and the results did not vary as much as those for human-written letters. This suggests that AI drafts could help reduce gender discrepancies. Further research is necessary to explore the broader implications of AI-assisted letter writing in residency selection, particularly in non-technical contexts.
dc.identifier.citation	Young G, Abouyared M, Kejner A, Patel R, Edwards HA, Yin L, Farlow JL. Gender Bias in Artificial Intelligence-Written Letters of Reference for Otolaryngology Residency Candidates. Poster presented at: Indiana University School of Medicine Education Day; April 25, 2025; Indianapolis, IN.
dc.identifier.uri	https://hdl.handle.net/1805/47252
dc.language.iso	en_US
dc.title	Gender Bias in Artificial Intelligence-Written Letters of Reference for Otolaryngology Residency Candidates
dc.type	Poster
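
The abstract compares writer-generated and AI-generated letters with a paired t-test. As a rough illustration of how such a statistic is formed, the sketch below computes the paired t-statistic from per-letter bias scores using only the Python standard library. The function name `paired_t_statistic` and the example scores are hypothetical and are not the study's data or code; the p-value step is omitted because it requires the t distribution, which the standard library does not provide.

```python
import math
import statistics


def paired_t_statistic(first, second):
    """Return (t, degrees_of_freedom) for a paired t-test.

    The statistic is mean(d) / (sd(d) / sqrt(n)), where d holds the
    pairwise differences first[i] - second[i].
    """
    if len(first) != len(second):
        raise ValueError("paired samples must have the same length")
    diffs = [a - b for a, b in zip(first, second)]
    n = len(diffs)
    if n < 2:
        raise ValueError("need at least two pairs")
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample SD, n - 1 denominator
    if sd_diff == 0:
        raise ValueError("all differences are identical; t is undefined")
    t = mean_diff / (sd_diff / math.sqrt(n))
    return t, n - 1


# Hypothetical bias scores (percent), one pair per letter.
# These values are made up for illustration only.
writer_scores = [10.0, 25.0, -5.0, 30.0, 15.0]
ai_scores = [35.0, 40.0, 30.0, 38.0, 36.0]

t_stat, dof = paired_t_statistic(writer_scores, ai_scores)
print(f"t = {t_stat:.2f} with {dof} degrees of freedom")
```

With the differences taken as writer minus AI, a negative statistic indicates higher scores for the AI-generated letters, which matches the sign of the t-statistic reported in the abstract.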

Files

Original bundle

Name: LOR.Poster.copy.pdf
Size: 481.57 KB
Format: Adobe Portable Document Format

Collections

2025 IUSM Education Day