Gender Bias in Artificial Intelligence-Written Letters of Reference for Otolaryngology Residency Candidates

dc.contributor.author: Young, Grace
dc.contributor.author: Abouyared, Marianne
dc.contributor.author: Kejner, Alexandra
dc.contributor.author: Patel, Rusha
dc.contributor.author: Edwards, Heather
dc.contributor.author: Yin, Linda
dc.contributor.author: Farlow, Janice
dc.date.accessioned: 2025-04-21T21:20:56Z
dc.date.available: 2025-04-21T21:20:56Z
dc.date.issued: 2025-04-25
dc.description.abstract: Introduction/Background: Written letters of reference (LORs) are an important component of the residency application process, and human-written LORs have been shown to contain gender bias. Given that AI tools such as ChatGPT are increasingly used to draft LORs, it is important to understand how bias may be perpetuated by these tools.
Study objective/Hypothesis: In a previous study, we identified gender bias in AI-written LORs when using prompts with randomly generated resume variables. We sought to investigate whether this bias persisted when using real applicant experiences, and how it compared with the bias in LORs written by otolaryngology faculty.
Methods: We obtained 46 LORs for otolaryngology residency applicants written by faculty from 5 different institutions who regularly compose LORs. Prompts describing each candidate’s experiences, using the same phrasing as the letter writers, were provided to ChatGPT4.0 in individual sessions. The writer-generated and AI-generated letters were compared using a gender-bias calculator (https://slowe.github.io/genderbias/), which reports the ratio of male-associated ‘ability’ words to female-associated ‘grindstone’ words.
Results: Both the writer-generated and AI-generated letters exhibited male bias on average (18.7% and 37.2%, respectively). A paired t-test showed that the AI-generated letters exhibited significantly higher male bias (t-statistic: -4.27, p-value: 0.0001). Of the LORs, 54.3% were written for male candidates. Independent t-tests did not reveal a significant difference between male and female applicants for either writer-generated letters (t-statistic: 1.54, p-value: 0.131) or AI-generated letters (t-statistic: 0.14, p-value: 0.892). However, Levene’s test comparing variation in scores indicated that AI-generated letters had significantly lower variability than writer-generated letters (Levene’s statistic: 11.38, p-value: 0.0011), and notably, every single AI-generated letter was male biased.
Conclusions: While the use of AI for letter drafting resulted in overall male bias, there was no significant difference between letters using male versus female names, and the scores varied less than those of human-written letters. This suggests that AI drafts could help reduce gender discrepancies. Further research is necessary to explore the broader implications of AI-assisted letter writing in residency selection, particularly in non-technical contexts.
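The paired comparison reported in the Results can be sketched as follows. This is a minimal illustration, not the authors' analysis code: the abstract does not say which software was used, the function name `paired_t_statistic` is hypothetical, the scores are invented for illustration, and the subtraction order (writer minus AI) is an assumption chosen so that a negative statistic means the AI-generated letters scored higher.

```python
import math
import statistics


def paired_t_statistic(first, second):
    """Return (t, degrees_of_freedom) for a paired t-test on two matched samples."""
    if len(first) != len(second):
        raise ValueError("paired samples must have the same length")
    if len(first) < 2:
        raise ValueError("need at least two pairs")
    # A paired test works on the per-letter differences, not on the raw scores.
    diffs = [a - b for a, b in zip(first, second)]
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample standard deviation (n - 1)
    if sd_diff == 0:
        raise ValueError("differences have zero variance")
    n = len(diffs)
    # t = mean difference divided by its standard error.
    return mean_diff / (sd_diff / math.sqrt(n)), n - 1


# Hypothetical bias scores in percent. These are NOT data from the study.
writer_scores = [10.0, 25.0, -5.0, 30.0, 15.0]
ai_scores = [35.0, 40.0, 30.0, 45.0, 38.0]

# Assumed order: writer minus AI, so a negative t means AI scores are higher.
t_stat, dof = paired_t_statistic(writer_scores, ai_scores)
print(f"t = {t_stat:.2f}, df = {dof}")
```

Each AI-generated letter is paired with the writer-generated letter for the same applicant, which is why the test uses per-letter differences. Converting the statistic to a p-value requires the t distribution with the returned degrees of freedom, which a statistics package would supply.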
dc.identifier.citation: Young G, Abouyared M, Kejner A, Patel R, Edwards HA, Yin L, Farlow JL. Gender Bias in Artificial Intelligence-Written Letters of Reference for Otolaryngology Residency Candidates. Poster presented at: Indiana University School of Medicine Education Day; April 25, 2025; Indianapolis, IN.
dc.identifier.uri: https://hdl.handle.net/1805/47252
dc.language.iso: en_US
dc.title: Gender Bias in Artificial Intelligence-Written Letters of Reference for Otolaryngology Residency Candidates
dc.type: Poster
Files
Original bundle
Name: LOR.Poster.copy.pdf
Size: 481.57 KB
Format: Adobe Portable Document Format
License bundle
Name: license.txt
Size: 2.04 KB
Format: Item-specific license agreed upon to submission
Description: