BACKGROUND Letters of recommendation (LoRs) play an important role in resident selection. Author language varies implicitly toward male and female applicants. We examined gender bias in LoRs written for surgical residency candidates across three decades at one institution.
METHODS Retrospective analysis of LoRs written for general surgery residency candidates between 1980 and 2011 using artificial intelligence (AI) to conduct natural language processing (NLP) and sentiment analysis, and computer-based algorithms to detect gender bias. Applicants were grouped by scaled clerkship grades and USMLE scores. Data were analyzed among groups with t-tests, ANOVA, and non-parametric tests, as appropriate.
RESULTS A total of 611 LoRs were analyzed for 171 applicants (16.4% female), and 95.3% of letter authors were male. Scaled USMLE scores and clerkship grades (SCG) were similar for both genders (p > 0.05 for both). Average word count for all letters was 290 words and was not significantly different between genders (p = 0.18). LoRs written before 2000 were significantly shorter than those written after, among applicants of both genders (female p = 0.004; male p < 0.001). Gender bias analysis of female LoRs revealed more gendered wording compared to male LoRs (p = 0.04) and was most prominent among females with lower SCG (9.5 vs 5.1, p = 0.01). Sentiment analysis revealed male LoRs with female authors had significantly more positive sentiment compared to female LoRs (p = 0.02), and males with higher SCG had more positive sentiment compared to those with lower SCG (9.4 vs 8.2, p = 0.03). NLP detected more "fear" in male LoRs with lower SCGs (0.11 vs 0.09, p = 0.02). Female LoRs with higher SCGs had more positive sentiment (0.78 vs 0.83, p = 0.03) and "joy" (0.60 vs 0.63, p = 0.02), although those written before 2000 had less "joy" (0.5 vs 0.63, p = 0.006).
CONCLUSION AI and computer-based algorithms detected linguistic differences and gender bias in LoRs written for general surgery residency applicants, even following stratification by clerkship grades and when analyzed by decade.