Development and Validation of a Machine Learning-Based Decision Support Tool for Residency Applicant Screening and Review.


Journal

Academic medicine : journal of the Association of American Medical Colleges
ISSN: 1938-808X
Abbreviated title: Acad Med
Country: United States
NLM ID: 8904605

Publication information

Publication date:
1 November 2021
History:
pubmed: 5 August 2021
medline: 9 November 2021
entrez: 4 August 2021
Status: ppublish

Abstract

Residency programs face overwhelming numbers of applications, which limits holistic review. Artificial intelligence techniques have been proposed to address this challenge, but no such tool has yet been created. Here, a multidisciplinary team sought to develop and validate a machine learning (ML)-based decision support tool (DST) for residency applicant screening and review.

Categorical applicant data from the 2018, 2019, and 2020 residency application cycles (n = 8,243 applicants) at one large internal medicine residency program were downloaded from the Electronic Residency Application Service and linked to the outcome measure: interview invitation by human reviewers (n = 1,235 invites). An ML model using gradient boosting was trained on 80% of applicants with more than 60 applicant features (e.g., demographics, experiences, academic metrics). Model performance was validated on held-out data (the remaining 20% of applicants). A sensitivity analysis was conducted without United States Medical Licensing Examination (USMLE) scores. An interactive DST incorporating the ML model was designed and deployed; it provided applicant- and cohort-level visualizations.

The ML model's areas under the receiver operating characteristic and precision-recall curves were 0.95 and 0.76, respectively; these changed to 0.94 and 0.72, respectively, when USMLE scores were removed. Applicants' medical school information was an important driver of predictions, which had face validity based on the local selection process, but numerous predictors contributed. Program directors used the DST in the 2021 application cycle to select 20 applicants for interview who had initially been screened out during human review.

The authors developed and validated an ML algorithm that predicts residency interview offers from numerous application elements with high performance, even when USMLE scores were removed. Model deployment in a DST highlighted its potential for screening candidates and helped quantify and mitigate biases existing in the selection process. Further work will incorporate unstructured textual data through natural language processing methods.
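
The two performance figures above are easier to read against their no-skill baselines. With 1,235 invitations among 8,243 applicants (about 15%), a ranker with no skill would score about 0.5 on the area under the receiver operating characteristic curve but only about 0.15 on the area under the precision-recall curve. The minimal Python sketch below illustrates this on synthetic data. It is not the authors' code: the function names, the held-out size of 1,649 (roughly 20% of 8,243), the Gaussian score distributions, and the use of average precision as the precision-recall estimator are all assumptions made for illustration.

```python
import random

# Figures quoted in the abstract.
N_APPLICANTS = 8243
N_INVITES = 1235
PREVALENCE = N_INVITES / N_APPLICANTS  # about 0.15


def roc_auc(labels, scores):
    """AUROC as the probability that a random positive outranks a random negative.

    Tied scores count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Average precision: mean of the precision at the rank of each positive.

    One common estimator of the area under the precision-recall curve.
    Tied scores keep their input order; at least one positive is required.
    """
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    hits, total = 0, 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank
    return total / hits


def synthetic_holdout(n=1649, seed=0):
    """Hypothetical held-out set with made-up scores (not real applicant data)."""
    rng = random.Random(seed)
    labels = [1 if rng.random() < PREVALENCE else 0 for _ in range(n)]
    # Assumed score model: invited applicants score higher on average.
    informative = [rng.gauss(2.0 if y else 0.0, 1.0) for y in labels]
    # A no-skill ranker: scores unrelated to the label.
    no_skill = [rng.random() for _ in labels]
    return labels, informative, no_skill


if __name__ == "__main__":
    labels, informative, no_skill = synthetic_holdout()
    print(f"prevalence (no-skill precision-recall baseline): {PREVALENCE:.3f}")
    for name, scores in (("informative", informative), ("no-skill", no_skill)):
        print(f"{name}: AUROC={roc_auc(labels, scores):.2f}, "
              f"AP={average_precision(labels, scores):.2f}")
```

Read this way, the reported precision-recall value of 0.76 sits far above the roughly 0.15 expected by chance, which is why both metrics are informative when only a minority of applicants are invited.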

Identifiers

pubmed: 34348383
doi: 10.1097/ACM.0000000000004317
pii: 00001888-202111001-00013

Publication types

Journal Article; Validation Study

Languages

eng

Citation subsets

IM

Pagination

S54-S61

Copyright information

Copyright © 2021 by the Association of American Medical Colleges.

Authors

Jesse Burk-Rafel (J)

J. Burk-Rafel is assistant professor of medicine and assistant director of UME-GME innovation, Institute for Innovations in Medical Education, NYU Grossman School of Medicine, New York, New York. At the time this work was completed, he was an internal medicine resident at NYU Langone Health, New York, New York; ORCID: https://orcid.org/0000-0003-3785-2154 .

Ilan Reinstein (I)

I. Reinstein is a research scientist, Institute for Innovations in Medical Education, NYU Grossman School of Medicine, New York, New York.

James Feng (J)

J. Feng is an orthopedic surgery resident, Beaumont Health, Royal Oak, Michigan. At the time this work was completed, he was a master's student in biomedical informatics, NYU Grossman School of Medicine Vilcek Institute of Graduate Biomedical Sciences, New York, New York.

Moosun Brad Kim (MB)

M.B. Kim is a biostatistician at Aprogen, Seongnam, Republic of Korea. At the time this work was completed, he was a master's student in biomedical informatics, NYU Grossman School of Medicine Vilcek Institute of Graduate Biomedical Sciences, New York, New York.

Louis H Miller (LH)

L.H. Miller is assistant professor of cardiology and assistant dean for career advisement, Zucker School of Medicine at Hofstra/Northwell, New York, New York.

Patrick M Cocks (PM)

P.M. Cocks is the Abraham Sunshine Assistant Professor of Medicine, and program director of the internal medicine residency program, NYU Langone Health, New York, New York.

Marina Marin (M)

M. Marin is director of the division of educational analytics, Institute for Innovations in Medical Education, NYU Grossman School of Medicine, New York, New York.

Yindalon Aphinyanaphongs (Y)

Y. Aphinyanaphongs is director of operational data science and machine learning, NYU Langone Health, New York, New York.
