Development and Validation of a Machine Learning-Based Decision Support Tool for Residency Applicant Screening and Review.

Decision Support Techniques; Humans; Internship and Residency; Machine Learning; Personnel Selection / methods; School Admission Criteria; United States

Journal

Academic medicine : journal of the Association of American Medical Colleges

ISSN: 1938-808X

Abbreviated title: Acad Med

Country: United States

NLM ID: 8904605

Publication information

Publication date:
1 November 2021

History:

pubmed: 5 August 2021

medline: 9 November 2021

entrez: 4 August 2021

Status: ppublish

Abstract

Residency programs face overwhelming numbers of residency applications, limiting holistic review. Artificial intelligence techniques have been proposed to address this challenge, but such tools have not yet been developed. Here, a multidisciplinary team sought to develop and validate a machine learning (ML)-based decision support tool (DST) for residency applicant screening and review.

Categorical applicant data from the 2018, 2019, and 2020 residency application cycles (n = 8,243 applicants) at one large internal medicine residency program were downloaded from the Electronic Residency Application Service and linked to the outcome measure: interview invitation by human reviewers (n = 1,235 invites). An ML model using gradient boosting was designed using training data (80% of applicants) with over 60 applicant features (e.g., demographics, experiences, academic metrics). Model performance was validated on held-out data (20% of applicants). A sensitivity analysis was conducted without United States Medical Licensing Examination (USMLE) scores. An interactive DST incorporating the ML model was designed and deployed that provided applicant- and cohort-level visualizations.

The ML model's areas under the receiver operating characteristic and precision-recall curves were 0.95 and 0.76, respectively; these changed to 0.94 and 0.72, respectively, with removal of USMLE scores. Applicants' medical school information was an important driver of predictions, which had face validity based on the local selection process, but numerous predictors contributed. Program directors used the DST in the 2021 application cycle to select 20 applicants for interview who had been initially screened out during human review.

The authors developed and validated an ML algorithm for predicting residency interview offers from numerous application elements with high performance, even when USMLE scores were removed. Model deployment in a DST highlighted its potential for screening candidates and helped quantify and mitigate biases existing in the selection process. Further work will incorporate unstructured textual data through natural language processing methods.
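The two reported metrics are read against different chance levels when the outcome is imbalanced. The sketch below is a rough illustration only, not the authors' code. It computes the area under the ROC curve and the average precision (a step-wise estimate of the area under the precision-recall curve) in plain Python. The held-out set, the score distributions, and the function names are assumptions made for the example; only the invite prevalence (1,235 of 8,243) comes from the abstract.

```python
import random


def auroc(labels, scores):
    """Area under the ROC curve: chance that a random positive outranks a random negative."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    # Count pairwise wins; ties count as half a win.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Step-wise area under the precision-recall curve (average precision)."""
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits, total = 0, 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank  # precision at the rank of each retrieved positive
    return total / hits


# Invite prevalence from the abstract: 1,235 invites among 8,243 applicants.
prevalence = 1235 / 8243

# Synthetic held-out set (assumption): 1,649 applicants, roughly 20% of 8,243.
rng = random.Random(0)
labels = [1 if rng.random() < prevalence else 0 for _ in range(1649)]
# Hypothetical model scores: invited applicants tend to score higher.
model_scores = [rng.gauss(2.0 if y else 0.0, 1.0) for y in labels]
# Uninformative scores, used as the chance-level baseline.
random_scores = [rng.random() for _ in labels]

print(f"prevalence      : {prevalence:.3f}")
print(f"model  AUROC/AP : {auroc(labels, model_scores):.2f} / "
      f"{average_precision(labels, model_scores):.2f}")
print(f"random AUROC/AP : {auroc(labels, random_scores):.2f} / "
      f"{average_precision(labels, random_scores):.2f}")
```

For uninformative scores, AUROC is expected to sit near 0.5, whereas average precision is expected to sit near the prevalence (about 0.15 here). On that reading, the reported precision-recall value of 0.76 is best compared with roughly 0.15 rather than with 0.5.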

Identifiers

DOI: 10.1097/ACM.0000000000004317

PMID: 34348383

PII: 00001888-202111001-00013

Publication types

Journal Article; Validation Study

Languages

eng

Pagination

S54-S61

Authors

Jesse Burk-Rafel (J)

Ilan Reinstein (I)

James Feng (J)

Moosun Brad Kim (MB)

Louis H Miller (LH)

Patrick M Cocks (PM)

Marina Marin (M)

Yindalon Aphinyanaphongs (Y)
