Residency Application Selection Committee Discriminatory Ability in Identifying Artificial Intelligence-Generated Personal Statements.

Artificial intelligence; Personal statement; Residency application

Journal

Journal of surgical education

ISSN: 1878-7452

Abbreviated title: J Surg Educ

Country: United States

NLM ID: 101303204

Publication information

Publication date:
27 Apr 2024

History:

received: 14 Oct 2023

revised: 20 Jan 2024

accepted: 19 Feb 2024

medline: 29 Apr 2024

pubmed: 29 Apr 2024

entrez: 28 Apr 2024

Status: aheadofprint

Abstract

Advances in artificial intelligence (AI) have given rise to sophisticated algorithms capable of generating human-like text. The goal of this study was to evaluate the ability of human reviewers to reliably differentiate personal statements (PS) written by human authors from those generated by AI software.
Four personal statements from the archives of two surgical program directors were de-identified and used as the human samples. Two AI platforms were used to generate nine additional PS. Four surgeons from the residency selection committees of two surgical residency programs of a large multihospital system served as blinded reviewers. AI was also asked to evaluate each PS sample for authorship. Sensitivity, specificity, and accuracy of the reviewers in identifying the PS author were calculated. The kappa statistic for agreement between the hypothesized author and the true author was calculated. Inter-rater reliability was calculated using the kappa statistic with Light's modification, given more than two reviewers in a fully crossed design. Logistic regression was performed to model the impact of perceived creativity, writing quality, and authorship on the likelihood of offering an interview.
Human reviewer sensitivity for identifying an AI-generated PS was 0.87, with a specificity of 0.37 and an overall accuracy of 0.55. The level of agreement by kappa statistic between the reviewers' estimate of authorship and the true authorship was 0.19 (slight agreement). The reviewers themselves had an inter-rater reliability of 0.067 (poor), with complete agreement (four out of four reviewers) on only two PS, both authored by humans. The odds of offering an interview (compared to a composite of "backup" status or no interview) to a perceived human author were 7 times those for a perceived AI author (95% confidence interval 1.5276 to 32.0758, p=0.0144). AI hypothesized human authorship for twelve of the PS and was "unsure" about the remaining one.
The increasing pervasiveness of AI will have far-reaching effects, including on the resident application and recruitment process. Identifying AI-generated personal statements is exceedingly difficult. With the decreasing availability of objective data to assess applicants, a review and potential restructuring of the approach to resident recruitment may be warranted.
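
The reported odds ratio and its confidence interval can be sanity-checked with a few lines of arithmetic. The sketch below assumes the interval is a Wald-type interval computed on the log-odds scale, which is the usual output of logistic regression software but is not stated in the abstract. The function name `log_or_summary` is hypothetical and used only for illustration. Under that assumption, the geometric mean of the two bounds should reproduce the point estimate, and the log-scale half-width divided by the normal critical value gives the implied standard error of ln(OR).

```python
import math


def log_or_summary(ci_low, ci_high, z_crit=1.959964):
    """Recover log-scale quantities from a reported 95% CI for an odds ratio.

    Assumes a Wald interval: exp(ln(OR) +/- z_crit * SE).
    """
    log_low = math.log(ci_low)
    log_high = math.log(ci_high)
    # The interval is symmetric around ln(OR) on the log scale, so the
    # midpoint of the logged bounds is the implied point estimate.
    implied_or = math.exp((log_low + log_high) / 2)
    # Half the log-scale width divided by the critical value is the SE.
    se_log_or = (log_high - log_low) / (2 * z_crit)
    return implied_or, se_log_or


# Bounds taken from the abstract: 95% CI 1.5276 to 32.0758.
implied_or, se_log_or = log_or_summary(1.5276, 32.0758)
print(f"implied OR = {implied_or:.2f}, SE of ln(OR) = {se_log_or:.3f}")
```

The implied point estimate agrees with the reported value of 7, which is consistent with the assumed Wald construction. The width of the interval reflects the small number of ratings behind the estimate.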

Identifiers

DOI: 10.1016/j.jsurg.2024.02.009 PMID: 38679494

pubmed: 38679494

pii: S1931-7204(24)00107-7

doi: 10.1016/j.jsurg.2024.02.009

Publication types

Journal Article

Languages

eng

Authors

Issam Koleilat (I)

Advaith Bongu (A)

Sumy Chang (S)

Dylan Nieman (D)

Steven Priolo (S)

Nell Maloney Patel (NM)

MeSH classifications