Considerations in the reliability and fairness audits of predictive models for advance care planning.

advance care planning artificial intelligence audit electronic health record fairness model reporting guideline

Journal

Frontiers in digital health
ISSN: 2673-253X
Abbreviated title: Front Digit Health
Country: Switzerland
NLM ID: 101771889

Publication information

Publication date:
2022
History:
received: 14 05 2022
accepted: 17 08 2022
entrez: 7 11 2022
pubmed: 8 11 2022
medline: 8 11 2022
Status: epublish

Abstract

Multiple reporting guidelines for artificial intelligence (AI) models in healthcare recommend that models be audited for reliability and fairness. However, there is little operational guidance for performing reliability and fairness audits in practice. Following guideline recommendations, we conducted a reliability audit of two models based on model performance and calibration, as well as a fairness audit based on summary statistics, subgroup performance and subgroup calibration. We assessed the Epic End-of-Life (EOL) Index model and an internally developed Stanford Hospital Medicine (HM) Advance Care Planning (ACP) model in three practice settings: Primary Care, Inpatient Oncology and Hospital Medicine, using clinicians' answers to the surprise question ("Would you be surprised if [patient X] passed away in [Y years]?") as a surrogate outcome. For performance, the models had positive predictive value (PPV) at or above 0.76 in all settings. In Hospital Medicine and Inpatient Oncology, the Stanford HM ACP model had higher sensitivity (0.69, 0.89 respectively) than the EOL model (0.20, 0.27), and better calibration (O/E 1.5, 1.7) than the EOL model (O/E 2.5, 3.0). The Epic EOL model flagged fewer patients (11%, 21% respectively) than the Stanford HM ACP model (38%, 75%). There were no differences in performance and calibration by sex. Both models had lower sensitivity in Hispanic/Latino male patients with Race listed as "Other." Ten clinicians were surveyed after a presentation summarizing the audit. 10/10 reported that summary statistics, overall performance, and subgroup performance would affect their decision to use the model to guide care; 9/10 said the same for overall and subgroup calibration. The most commonly identified barriers to routinely conducting such reliability and fairness audits were poor demographic data quality and lack of data access. This audit required 115 person-hours across 8-10 months.
Our recommendations for performing reliability and fairness audits include verifying data validity, analyzing model performance on intersectional subgroups, and collecting clinician-patient linkages as necessary for label generation by clinicians. Those responsible for AI models should require such audits before model deployment and mediate between model auditors and impacted stakeholders.
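
The metrics named in the abstract (PPV, sensitivity, proportion of patients flagged, and the O/E calibration ratio) can be illustrated with a minimal sketch. This is not the authors' code. It assumes binary labels derived from the surprise question, predicted probabilities as model scores, a hypothetical flagging threshold of 0.5, and that O/E means the number of observed positive labels divided by the sum of predicted probabilities. The function names `audit_metrics` and `audit_by_subgroup` and the record layout are illustrative only.

```python
from collections import defaultdict


def audit_metrics(labels, scores, threshold=0.5):
    """Compute reliability metrics for one group of patients.

    labels: 0/1 surrogate outcomes (assumed: 1 = clinician would not be surprised).
    scores: predicted probabilities from the model.
    threshold: hypothetical cutoff above which a patient is flagged.
    """
    flagged = [s >= threshold for s in scores]
    tp = sum(1 for y, f in zip(labels, flagged) if y == 1 and f)
    fp = sum(1 for y, f in zip(labels, flagged) if y == 0 and f)
    fn = sum(1 for y, f in zip(labels, flagged) if y == 1 and not f)
    n = len(labels)
    observed = sum(labels)   # observed positive labels
    expected = sum(scores)   # expected positives under the model
    return {
        "n": n,
        "ppv": tp / (tp + fp) if (tp + fp) else None,
        "sensitivity": tp / (tp + fn) if (tp + fn) else None,
        "flag_rate": (tp + fp) / n if n else None,
        "o_e": observed / expected if expected else None,
    }


def audit_by_subgroup(records, threshold=0.5):
    """Group (subgroup, label, score) records and compute metrics per subgroup."""
    groups = defaultdict(lambda: ([], []))
    for group, label, score in records:
        groups[group][0].append(label)
        groups[group][1].append(score)
    return {g: audit_metrics(ys, ss, threshold) for g, (ys, ss) in groups.items()}


# Toy, made-up records: (subgroup, label, predicted probability).
toy_records = [
    ("A", 1, 0.9), ("A", 0, 0.8), ("A", 1, 0.2), ("A", 0, 0.1),
    ("B", 1, 0.6), ("B", 1, 0.4),
]
toy_results = audit_by_subgroup(toy_records)
```

Under this definition an O/E ratio above 1 indicates that the model predicts fewer positive outcomes than were observed, and values closer to 1 indicate better calibration. For an intersectional analysis of the kind the abstract recommends, the subgroup key could be a tuple such as (ethnicity, sex, race); small subgroups will yield unstable estimates, so reporting `n` alongside each metric is advisable.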

Identifiers

pubmed: 36339512
doi: 10.3389/fdgth.2022.943768
pmc: PMC9634737

Publication types

Journal Article

Languages

eng

Pagination

943768

Copyright information

© 2022 Lu, Sattler, Wang, Khaki, Callahan, Fleming, Fong, Ehlert, Li, Shieh, Ramchandran, Gensheimer, Chobot, Pfohl, Li, Shum, Parikh, Desai, Seevaratnam, Hanson, Smith, Xu, Gokhale, Lin, Pfeffer, Teuteberg and Shah.

Conflict of interest statement

SP is currently employed by Google, with contributions to this work made while at Stanford. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Authors

Jonathan Lu (J)

Center for Biomedical Informatics Research, Department of Medicine, Stanford University School of Medicine, Palo Alto, United States.

Amelia Sattler (A)

Stanford Healthcare AI Applied Research Team, Division of Primary Care and Population Health, Department of Medicine, Stanford University School of Medicine, Palo Alto, United States.

Samantha Wang (S)

Division of Hospital Medicine, Department of Medicine, Stanford University School of Medicine, Palo Alto, United States.

Ali Raza Khaki (AR)

Division of Oncology, Department of Medicine, Stanford University School of Medicine, Palo Alto, United States.

Alison Callahan (A)

Center for Biomedical Informatics Research, Department of Medicine, Stanford University School of Medicine, Palo Alto, United States.

Scott Fleming (S)

Center for Biomedical Informatics Research, Department of Medicine, Stanford University School of Medicine, Palo Alto, United States.

Rebecca Fong (R)

Serious Illness Care Program, Department of Medicine, Stanford University School of Medicine, Palo Alto, United States.

Benjamin Ehlert (B)

Center for Biomedical Informatics Research, Department of Medicine, Stanford University School of Medicine, Palo Alto, United States.

Ron C Li (RC)

Division of Hospital Medicine, Department of Medicine, Stanford University School of Medicine, Palo Alto, United States.

Lisa Shieh (L)

Division of Hospital Medicine, Department of Medicine, Stanford University School of Medicine, Palo Alto, United States.

Kavitha Ramchandran (K)

Division of Oncology, Department of Medicine, Stanford University School of Medicine, Palo Alto, United States.

Michael F Gensheimer (MF)

Department of Radiation Oncology, Stanford University School of Medicine, Palo Alto, United States.

Sarah Chobot (S)

Inpatient Palliative Care, Stanford Health Care, Palo Alto, United States.

Stephen Pfohl (S)

Center for Biomedical Informatics Research, Department of Medicine, Stanford University School of Medicine, Palo Alto, United States.

Siyun Li (S)

Center for Biomedical Informatics Research, Department of Medicine, Stanford University School of Medicine, Palo Alto, United States.

Kenny Shum (K)

Technology / Digital Solutions, Stanford Health Care and Stanford University School of Medicine, Palo Alto, United States.

Nitin Parikh (N)

Technology / Digital Solutions, Stanford Health Care and Stanford University School of Medicine, Palo Alto, United States.

Priya Desai (P)

Technology / Digital Solutions, Stanford Health Care and Stanford University School of Medicine, Palo Alto, United States.

Briththa Seevaratnam (B)

Serious Illness Care Program, Department of Medicine, Stanford University School of Medicine, Palo Alto, United States.

Melanie Hanson (M)

Serious Illness Care Program, Department of Medicine, Stanford University School of Medicine, Palo Alto, United States.

Margaret Smith (M)

Stanford Healthcare AI Applied Research Team, Division of Primary Care and Population Health, Department of Medicine, Stanford University School of Medicine, Palo Alto, United States.

Yizhe Xu (Y)

Center for Biomedical Informatics Research, Department of Medicine, Stanford University School of Medicine, Palo Alto, United States.

Arjun Gokhale (A)

Center for Biomedical Informatics Research, Department of Medicine, Stanford University School of Medicine, Palo Alto, United States.

Steven Lin (S)

Stanford Healthcare AI Applied Research Team, Division of Primary Care and Population Health, Department of Medicine, Stanford University School of Medicine, Palo Alto, United States.

Michael A Pfeffer (MA)

Division of Hospital Medicine, Department of Medicine, Stanford University School of Medicine, Palo Alto, United States.
Technology / Digital Solutions, Stanford Health Care and Stanford University School of Medicine, Palo Alto, United States.

Winifred Teuteberg (W)

Serious Illness Care Program, Department of Medicine, Stanford University School of Medicine, Palo Alto, United States.

Nigam H Shah (NH)

Center for Biomedical Informatics Research, Department of Medicine, Stanford University School of Medicine, Palo Alto, United States.
Technology / Digital Solutions, Stanford Health Care and Stanford University School of Medicine, Palo Alto, United States.
Clinical Excellence Research Center, Stanford University School of Medicine, Palo Alto, United States.

MeSH classifications