Do no harm: a roadmap for responsible machine learning for health care.


Journal

Nature medicine
ISSN: 1546-170X
Abbreviated title: Nat Med
Country: United States
NLM ID: 9502015

Publication information

Publication date:
09 2019
History:
received: 10 07 2019
accepted: 17 07 2019
pubmed: 21 8 2019
medline: 13 11 2019
entrez: 21 8 2019
Status: ppublish

Abstract

Interest in machine-learning applications within medicine has been growing, but few studies have progressed to deployment in patient care. We present a framework, context and ultimately guidelines for accelerating the translation of machine-learning-based interventions in health care. To be successful, translation will require a team of engaged stakeholders and a systematic process from beginning (problem formulation) to end (widespread deployment).

Identifiers

pubmed: 31427808
doi: 10.1038/s41591-019-0548-6
pii: 10.1038/s41591-019-0548-6

Publication types

Journal Article
Review

Languages

eng

Citation subsets

IM

Pagination

1337-1340

Comments and corrections

Type: ErratumIn

References

Lazer, D., Kennedy, R., King, G. & Vespignani, A. Big data. The parable of Google Flu: traps in big data analysis. Science 343, 1203–1205 (2014).
doi: 10.1126/science.1248506
Hutson, M. Even artificial intelligence can acquire biases against race and gender. Science https://doi.org/10.1126/science.aal1053 (2017).
He, J. et al. The practical implementation of artificial intelligence technologies in medicine. Nat. Med. 25, 30–36 (2019).
doi: 10.1038/s41591-018-0307-0
Silva, I., Moody, G., Scott, D. J., Celi, L. A. & Mark, R. G. Predicting in-hospital mortality of ICU patients: the Physionet/Computing in Cardiology Challenge 2012. Comput. Cardiol. 39, 245–248 (2012).
Luo, Y., Cai, X., Zhang, Y. & Xu, J. Multivariate time series imputation with generative adversarial networks. in Advances in Neural Information Processing Systems 1596–1607 (NeurIPS, 2018).
O’Malley, K. J. et al. Measuring diagnoses: ICD code accuracy. Health Serv. Res. 40, 1620–1639 (2005).
doi: 10.1111/j.1475-6773.2005.00444.x
Saria, S. & Subbaswamy, A. Tutorial: safe and reliable machine learning. Preprint at https://arxiv.org/abs/1904.07204 (2019).
Chen, I. Y., Szolovits, P. & Ghassemi, M. Can AI help reduce disparities in general medical and mental health care? AMA J. Ethics 21, E167–E179 (2019).
doi: 10.1001/amajethics.2019.167
Schulam, P. & Saria, S. Reliable decision support using counterfactual models. in Advances in Neural Information Processing Systems 1697–1708 (NeurIPS, 2017).
O’Neil, C. Weapons of Math Destruction: How Big Data Increases Inequality and Threatens Democracy (Broadway Books, 2016).
Williams, D. R., Mohammed, S. A., Leavell, J. & Collins, C. Race, socioeconomic status, and health: complexities, ongoing challenges, and research opportunities. Ann. NY Acad. Sci. 1186, 69–101 (2010).
doi: 10.1111/j.1749-6632.2009.05339.x
Rajpurkar, P. et al. CheXNet: radiologist-level pneumonia detection on chest X-rays with deep learning. Preprint at https://arxiv.org/abs/1711.05225 (2017).
Liu, V.X., Bates, D.W., Wiens, J. & Shah, N.H. The number needed to benefit: estimating the value of predictive analytics in healthcare. J. Am. Med. Inform. Assoc. https://doi.org/10.1093/jamia/ocz088 (2019).
Oh, J. et al. A generalizable, data-driven approach to predict daily risk of Clostridium difficile infection at two large academic health centers. Infect. Control Hosp. Epidemiol. 39, 425–433 (2018).
doi: 10.1017/ice.2018.16
Schulam, P. & Saria, S. Can you trust this prediction? Auditing pointwise reliability after learning. in The 22nd International Conference on Artificial Intelligence and Statistics 1022–1031 (PMLR, 2019).
Henderson, P. et al. Deep reinforcement learning that matters. in Thirty-second AAAI Conference on Artificial Intelligence (AAAI, 2018).
Nestor, B. et al. Rethinking clinical prediction: why machine learning must consider year of care and feature aggregation. Preprint at https://arxiv.org/abs/1811.12583 (2018).
Henry, K. E., Hager, D. N., Pronovost, P. J. & Saria, S. A targeted real-time early warning score (TREWScore) for septic shock. Sci. Transl. Med. 7, 299ra122 (2015).
doi: 10.1126/scitranslmed.aab3719
Hemming, K., Haines, T. P., Chilton, P. J., Girling, A. J. & Lilford, R. J. The stepped wedge cluster randomised trial: rationale, design, analysis, and reporting. Br. Med. J. 350, h391 (2015).
doi: 10.1136/bmj.h391
Evans, B. & Ossorio, P. The challenge of regulating clinical decision support software after 21
doi: 10.1177/0098858818789418
Okoro, A. O. Preface: The 21
doi: 10.1177/0098858818793388
Proposed Regulatory Framework for Modifications to Artificial Intelligence/Machine Learning (AI/ML)-Based Software as a Medical Device (SaMD) (U.S. Food & Drug Administration, 2019); https://www.fda.gov/media/122535/download
Massachusetts Institute of Technology. Self-driving cars, robots: identifying AI ‘blind spots’. ScienceDaily (25 January 2019).
Chien, S. & Wagstaff, K. L. Robotic space exploration agents. Sci. Robot. 2, eaan4831 (2017).
doi: 10.1126/scirobotics.aan4831

Authors

Jenna Wiens (J)

Department of Electrical Engineering and Computer Science, University of Michigan, Ann Arbor, MI, USA. wiensj@umich.edu.

Suchi Saria (S)

Departments of Computer Science and Statistics, Whiting School of Engineering, Johns Hopkins University, Baltimore, MD, USA.
Department of Health Policy and Management, Bloomberg School of Public Health, Johns Hopkins University, Baltimore, MD, USA.
Bayesian Health, New York, NY, USA.

Mark Sendak (M)

Duke Institute for Health Innovation, Duke University School of Medicine, Durham, NC, USA.

Marzyeh Ghassemi (M)

Department of Computer Science, University of Toronto, Toronto, Ontario, Canada.
Department of Medicine, University of Toronto, Toronto, Ontario, Canada.
Vector Institute, Toronto, Ontario, Canada.

Vincent X Liu (VX)

Kaiser Permanente Division of Research, Oakland, CA, USA.

Finale Doshi-Velez (F)

School of Engineering and Applied Science, Harvard University, Cambridge, MA, USA.

Kenneth Jung (K)

Center for Biomedical Informatics Research, Stanford University, Stanford, CA, USA.

Katherine Heller (K)

Google Inc., Mountain View, CA, USA.
Department of Statistical Science, Duke University, Durham, NC, USA.

David Kale (D)

Information Sciences Institute, University of Southern California, Los Angeles, CA, USA.

Mohammed Saeed (M)

Department of Internal Medicine, University of Michigan, Ann Arbor, MI, USA.

Pilar N Ossorio (PN)

Law School, University of Wisconsin-Madison, Madison, WI, USA.

Sonoo Thadaney-Israni (S)

Presence and Program in Bedside Medicine, Stanford University School of Medicine, Stanford, CA, USA.

Anna Goldenberg (A)

Department of Computer Science, University of Toronto, Toronto, Ontario, Canada. anna.goldenberg@utoronto.ca.
Vector Institute, Toronto, Ontario, Canada. anna.goldenberg@utoronto.ca.
SickKids Research Institute, Toronto, Ontario, Canada. anna.goldenberg@utoronto.ca.
Child and Brain Development Program, CIFAR, Toronto, Ontario, Canada. anna.goldenberg@utoronto.ca.
