The value of standards for health datasets in artificial intelligence-based applications.


Journal

Nature Medicine
ISSN: 1546-170X
Abbreviated title: Nat Med
Country: United States
NLM ID: 9502015

Publication information

Publication date:
Nov 2023
History:
received: 14 03 2023
accepted: 22 09 2023
medline: 27 11 2023
pubmed: 27 10 2023
entrez: 26 10 2023
Status: ppublish

Abstract

Artificial intelligence as a medical device is increasingly being applied to healthcare for diagnosis, risk stratification and resource allocation. However, a growing body of evidence has highlighted the risk of algorithmic bias, which may perpetuate existing health inequity. This problem arises in part because of systemic inequalities in dataset curation, unequal opportunity to participate in research and inequalities of access. This study aims to explore existing standards, frameworks and best practices for ensuring adequate data diversity in health datasets. Exploring the existing literature and expert views is an important step towards the development of consensus-based guidelines. The study comprises two parts: a systematic review of existing standards, frameworks and best practices for healthcare datasets; and a survey and thematic analysis of stakeholder views of bias, health equity and best practices for artificial intelligence as a medical device. We found that the need for dataset diversity was well described in the literature and that experts generally favored the development of a robust set of guidelines, but views were mixed on how these could be implemented in practice. The outputs of this study will be used to inform the development of standards for transparency of data diversity in health datasets (the STANDING Together initiative).

Identifiers

pubmed: 37884627
doi: 10.1038/s41591-023-02608-w
pii: 10.1038/s41591-023-02608-w
pmc: PMC10667100

Publication types

Journal Article

Languages

eng

Citation subsets

IM

Pagination

2929-2938

Grants

Agency: Medical Research Council
ID: MC_PC_21015
Country: United Kingdom
Agency: Medical Research Council
ID: MC_PC_21055
Country: United Kingdom

Copyright information

© 2023. The Author(s).

Authors

Anmol Arora (A)

School of Clinical Medicine, University of Cambridge, Cambridge, UK.

Joseph E Alderman (JE)

Institute of Inflammation and Ageing, College of Medical and Dental Sciences, University of Birmingham, Birmingham, UK.
University Hospitals Birmingham NHS Foundation Trust, Birmingham, UK.
National Institute for Health and Care Research Birmingham Biomedical Research Centre, University of Birmingham, Birmingham, UK.

Joanne Palmer (J)

University Hospitals Birmingham NHS Foundation Trust, Birmingham, UK.
National Institute for Health and Care Research Birmingham Biomedical Research Centre, University of Birmingham, Birmingham, UK.

Shaswath Ganapathi (S)

Sandwell and West Birmingham Hospitals NHS Trust, Birmingham, UK.

Elinor Laws (E)

Institute of Inflammation and Ageing, College of Medical and Dental Sciences, University of Birmingham, Birmingham, UK.
University Hospitals Birmingham NHS Foundation Trust, Birmingham, UK.
National Institute for Health and Care Research Birmingham Biomedical Research Centre, University of Birmingham, Birmingham, UK.

Melissa D McCradden (MD)

Department of Bioethics, The Hospital for Sick Children, Toronto, Ontario, Canada.
Genetics and Genome Biology, Peter Gilgan Centre for Research and Learning, Toronto, Ontario, Canada.
Dalla Lana School of Public Health, Toronto, Ontario, Canada.

Lauren Oakden-Rayner (L)

The Australian Institute for Machine Learning, University of Adelaide, Adelaide, South Australia, Australia.

Stephen R Pfohl (SR)

Google Research, Mountain View, CA, USA.

Marzyeh Ghassemi (M)

Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology, Cambridge, MA, USA.
Institute for Medical Engineering & Science, Massachusetts Institute of Technology, Cambridge, MA, USA.
Vector Institute, Toronto, Ontario, Canada.

Francis McKay (F)

The Ethox Centre and the Wellcome Centre for Ethics and Humanities, Nuffield Department of Population Health, University of Oxford, Oxford, UK.

Darren Treanor (D)

Leeds Teaching Hospitals NHS Trust, Leeds, UK.
University of Leeds, Leeds, UK.
Department of Clinical Pathology and Department of Clinical and Experimental Medicine, Linköping University, Linköping, Sweden.
Center for Medical Image Science and Visualization, Linköping University, Linköping, Sweden.

Negar Rostamzadeh (N)

Google Research, Montreal, Quebec, Canada.

Bilal Mateen (B)

Institute for Health Informatics, University College London, London, UK.
Wellcome Trust, London, UK.

Jacqui Gath (J)

Patient and Public Involvement and Engagement (PPIE) Group, STANDING Together, Birmingham, UK.

Adewole O Adebajo (AO)

Patient and Public Involvement and Engagement (PPIE) Group, STANDING Together, Birmingham, UK.

Stephanie Kuku (S)

Institute of Women's Health, UCL, London, UK.

Rubeta Matin (R)

Oxford University Hospitals NHS Foundation Trust, Oxford, UK.

Katherine Heller (K)

Google Research, Mountain View, CA, USA.

Elizabeth Sapey (E)

Institute of Inflammation and Ageing, College of Medical and Dental Sciences, University of Birmingham, Birmingham, UK.
University Hospitals Birmingham NHS Foundation Trust, Birmingham, UK.
National Institute for Health and Care Research Birmingham Biomedical Research Centre, University of Birmingham, Birmingham, UK.
PIONEER, HDR UK Hub in Acute Care, Institute of Inflammation and Ageing, University of Birmingham, Birmingham, UK.

Neil J Sebire (NJ)

National Institute for Health and Care Research, Great Ormond Street Hospital Biomedical Research Centre, London, UK.
Great Ormond Street Institute of Child Health, University Hospital London, London, UK.

Heather Cole-Lewis (H)

Google Research, Mountain View, CA, USA.

Melanie Calvert (M)

National Institute for Health and Care Research Birmingham Biomedical Research Centre, University of Birmingham, Birmingham, UK.
Birmingham Health Partners Centre for Regulatory Science and Innovation, University of Birmingham, Birmingham, UK.
Centre for Patient Reported Outcomes Research, Institute of Applied Health Research, College of Medical and Dental Sciences, University of Birmingham, Birmingham, UK.
National Institute for Health and Care Research Applied Research Collaboration West Midlands, University of Birmingham, Birmingham, UK.
National Institute for Health and Care Research Birmingham-Oxford Blood and Transplant Research Unit in Precision Transplant and Cellular Therapeutics, University of Birmingham, Birmingham, UK.
DEMAND Hub, University of Birmingham, Birmingham, UK.
UK SPINE, University of Birmingham, Birmingham, UK.

Alastair Denniston (A)

Institute of Inflammation and Ageing, College of Medical and Dental Sciences, University of Birmingham, Birmingham, UK.
University Hospitals Birmingham NHS Foundation Trust, Birmingham, UK.
National Institute for Health and Care Research Birmingham Biomedical Research Centre, University of Birmingham, Birmingham, UK.
Birmingham Health Partners Centre for Regulatory Science and Innovation, University of Birmingham, Birmingham, UK.
National Institute for Health and Care Research Biomedical Research Centre, Moorfields Eye Hospital/University College London, London, UK.

Xiaoxuan Liu (X)

Institute of Inflammation and Ageing, College of Medical and Dental Sciences, University of Birmingham, Birmingham, UK. x.liu.8@bham.ac.uk.
University Hospitals Birmingham NHS Foundation Trust, Birmingham, UK.
National Institute for Health and Care Research Birmingham Biomedical Research Centre, University of Birmingham, Birmingham, UK.
