Considerations in the reliability and fairness audits of predictive models for advance care planning.

advance care planning, artificial intelligence, audit, electronic health record, fairness, model reporting guideline

Journal

Frontiers in Digital Health

ISSN: 2673-253X

Abbreviated title: Front Digit Health

Country: Switzerland

NLM ID: 101771889

Publication information

Publication date:
2022

History:

received: 14 May 2022

accepted: 17 Aug 2022

entrez: 7 Nov 2022

pubmed: 8 Nov 2022

medline: 8 Nov 2022

Status: epublish

Abstract

Multiple reporting guidelines for artificial intelligence (AI) models in healthcare recommend that models be audited for reliability and fairness. However, operational guidance for performing such audits in practice is lacking. Following guideline recommendations, we conducted a reliability audit of two models based on model performance and calibration, as well as a fairness audit based on summary statistics, subgroup performance, and subgroup calibration. We assessed the Epic End-of-Life (EOL) Index model and an internally developed Stanford Hospital Medicine (HM) Advance Care Planning (ACP) model in three practice settings (Primary Care, Inpatient Oncology, and Hospital Medicine), using clinicians' answers to the surprise question ("Would you be surprised if [patient X] passed away in [Y years]?") as a surrogate outcome. For performance, the models had positive predictive value (PPV) at or above 0.76 in all settings. In Hospital Medicine and Inpatient Oncology, the Stanford HM ACP model had higher sensitivity (0.69 and 0.89, respectively) than the Epic EOL model (0.20 and 0.27) and better calibration (O/E 1.5 and 1.7) than the Epic EOL model (O/E 2.5 and 3.0). The Epic EOL model flagged fewer patients (11% and 21%, respectively) than the Stanford HM ACP model (38% and 75%). There were no differences in performance and calibration by sex. Both models had lower sensitivity in Hispanic/Latino male patients with Race listed as "Other." Ten clinicians were surveyed after a presentation summarizing the audit. All ten (10/10) reported that summary statistics, overall performance, and subgroup performance would affect their decision to use the model to guide care; 9/10 said the same for overall and subgroup calibration. The most commonly identified barriers to routinely conducting such reliability and fairness audits were poor demographic data quality and lack of data access. This audit required 115 person-hours across 8-10 months.
Our recommendations for performing reliability and fairness audits include verifying data validity, analyzing model performance on intersectional subgroups, and collecting clinician-patient linkages as necessary for label generation by clinicians. Those responsible for AI models should require such audits before model deployment and mediate between model auditors and impacted stakeholders.
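
As an illustration of the quantities named in the abstract, the following is a minimal Python sketch of how PPV, sensitivity, the fraction of patients flagged, and an observed/expected (O/E) ratio might be computed overall and per intersectional subgroup. It is not the authors' code. The function names, the record fields `label` and `prob`, the demographic keys, and the flagging threshold are all hypothetical, and O/E is assumed here to be the number of observed positive labels divided by the sum of predicted probabilities, which may differ from the computation used in the paper.

```python
from collections import defaultdict


def audit_metrics(y_true, y_prob, threshold):
    """Compute PPV, sensitivity, flag rate and O/E for binary labels.

    y_true: list of 0/1 labels (e.g. derived from the surprise question).
    y_prob: list of predicted probabilities in [0, 1].
    threshold: hypothetical cutoff at or above which a patient is flagged.
    """
    flagged = [p >= threshold for p in y_prob]
    tp = sum(1 for y, f in zip(y_true, flagged) if y == 1 and f)
    fp = sum(1 for y, f in zip(y_true, flagged) if y == 0 and f)
    fn = sum(1 for y, f in zip(y_true, flagged) if y == 1 and not f)
    n = len(y_true)
    expected = sum(y_prob)  # expected number of positives under the model
    return {
        "n": n,
        "ppv": tp / (tp + fp) if (tp + fp) else None,
        "sensitivity": tp / (tp + fn) if (tp + fn) else None,
        "flag_rate": (tp + fp) / n if n else None,
        # Assumed definition: observed positives / expected positives.
        "o_e": sum(y_true) / expected if expected else None,
    }


def audit_by_subgroup(records, threshold, keys):
    """Group records by the given demographic keys and audit each group.

    records: list of dicts with hypothetical fields "label", "prob",
             plus one entry per key in `keys` (e.g. "sex", "ethnicity").
    Passing several keys yields intersectional subgroups.
    """
    groups = defaultdict(list)
    for r in records:
        groups[tuple(r[k] for k in keys)].append(r)
    return {
        g: audit_metrics(
            [r["label"] for r in rows], [r["prob"] for r in rows], threshold
        )
        for g, rows in groups.items()
    }
```

Calling `audit_by_subgroup(records, 0.5, ["sex", "ethnicity"])` would return one metrics dictionary per (sex, ethnicity) combination. Metrics are returned as `None` when a denominator is zero, and the `n` field makes small subgroups easy to spot, which matters because estimates on small intersectional subgroups are noisy.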

Identifiers

DOI: 10.3389/fdgth.2022.943768 PMID: 36339512 PMC: PMC9634737

Publication types

Journal Article

Languages

eng

Pagination

943768

Copyright information

© 2022 Lu, Sattler, Wang, Khaki, Callahan, Fleming, Fong, Ehlert, Li, Shieh, Ramchandran, Gensheimer, Chobot, Pfohl, Li, Shum, Parikh, Desai, Seevaratnam, Hanson, Smith, Xu, Gokhale, Lin, Pfeffer, Teuteberg and Shah.

Conflict of interest statement

SP is currently employed by Google, with contributions to this work made while at Stanford. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Authors

Jonathan Lu (J)

Amelia Sattler (A)

Samantha Wang (S)

Ali Raza Khaki (AR)

Alison Callahan (A)

Scott Fleming (S)

Rebecca Fong (R)

Benjamin Ehlert (B)

Ron C Li (RC)

Lisa Shieh (L)

Kavitha Ramchandran (K)

Michael F Gensheimer (MF)

Sarah Chobot (S)

Stephen Pfohl (S)

Siyun Li (S)

Kenny Shum (K)

Nitin Parikh (N)

Priya Desai (P)

Briththa Seevaratnam (B)

Melanie Hanson (M)

Margaret Smith (M)

Yizhe Xu (Y)

Arjun Gokhale (A)

Steven Lin (S)

Michael A Pfeffer (MA)

Winifred Teuteberg (W)

Nigam H Shah (NH)

MeSH classifications