The value of standards for health datasets in artificial intelligence-based applications.

Humans Artificial Intelligence Consensus Delivery of Health Care Systematic Reviews as Topic

Journal

Nature medicine

ISSN: 1546-170X

Titre abrégé: Nat Med

Pays: United States

ID NLM: 9502015

Informations de publication

Date de publication:
Nov 2023

Historique:

received: 14 03 2023

accepted: 22 09 2023

medline: 27 11 2023

pubmed: 27 10 2023

entrez: 26 10 2023

Statut: ppublish

Résumé

Artificial intelligence as a medical device is increasingly being applied to healthcare for diagnosis, risk stratification and resource allocation. However, a growing body of evidence has highlighted the risk of algorithmic bias, which may perpetuate existing health inequity. This problem arises in part because of systemic inequalities in dataset curation, unequal opportunity to participate in research and inequalities of access. This study aims to explore existing standards, frameworks and best practices for ensuring adequate data diversity in health datasets. Exploring the body of existing literature and expert views is an important step towards the development of consensus-based guidelines. The study comprises two parts: a systematic review of existing standards, frameworks and best practices for healthcare datasets; and a survey and thematic analysis of stakeholder views of bias, health equity and best practices for artificial intelligence as a medical device. We found that the need for dataset diversity was well described in literature, and experts generally favored the development of a robust set of guidelines, but there were mixed views about how these could be implemented practically. The outputs of this study will be used to inform the development of standards for transparency of data diversity in health datasets (the STANDING Together initiative).

Identifiants

DOI: 10.1038/s41591-023-02608-w PMID: 37884627 PMC: PMC10667100

pubmed: 37884627

doi: 10.1038/s41591-023-02608-w

pii: 10.1038/s41591-023-02608-w

pmc: PMC10667100

doi:

Types de publication

Journal Article

Langues

eng

Sous-ensembles de citation

Pagination

2929-2938

Subventions

Organisme : Medical Research Council

ID : MC_PC_21015

Pays : United Kingdom

Organisme : Medical Research Council

ID : MC_PC_21055

Pays : United Kingdom

Informations de copyright

Références

Sidey-Gibbons, J. A. M. & Sidey-Gibbons, C. J. Machine learning in medicine: a practical introduction. BMC Med. Res. Methodol. 19, 64 (2019).

doi: 10.1186/s12874-019-0681-4 pubmed: 30890124 pmcid: 6425557

Ibrahim, H., Liu, X., Zariffa, N., Morris, A. D. & Denniston, A. K. Health data poverty: an assailable barrier to equitable digital health care. Lancet Digit. Health 3, e260–e265 (2021).

doi: 10.1016/S2589-7500(20)30317-4 pubmed: 33678589

Kuhlman, C., Jackson, L. & Chunara, R. No computation without representation: avoiding data and algorithm biases through diversity. In Proc. 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining (KDD '20) 3593 (ACM, 2020); https://doi.org/10.1145/3394486.3411074

Courbier, S., Dimond, R. & Bros-Facer, V. Share and protect our health data: an evidence based approach to rare disease patients’ perspectives on data sharing and data protection - quantitative survey and recommendations. Orphanet J. Rare Dis. 14, 175 (2019).

doi: 10.1186/s13023-019-1123-4 pubmed: 31300010 pmcid: 6625078

Chen, I. Y. et al. Ethical machine learning in healthcare. Annu Rev. Biomed. Data Sci. 4, 123–44. (2021).

doi: 10.1146/annurev-biodatasci-092820-114757 pubmed: 34396058 pmcid: 8362902

Khan, S. M. et al. A global review of publicly available datasets for ophthalmological imaging: barriers to access, usability, and generalisability. Lancet Digit. Health 3, e51–e66 (2021).

doi: 10.1016/S2589-7500(20)30240-5 pubmed: 33735069

Wen, D. et al. Characteristics of publicly available skin cancer image datasets: a systematic review. Lancet Digit. Health 4, e64–e74 (2022).

doi: 10.1016/S2589-7500(21)00252-1 pubmed: 34772649

Kaushal, A., Altman, R. & Langlotz, C. Geographic distribution of US cohorts used to train deep learning algorithms. JAMA 324, 1212–1213 (2020).

doi: 10.1001/jama.2020.12067 pubmed: 32960230 pmcid: 7509620

Gichoya, J. W. et al. AI recognition of patient race in medical imaging: a modelling study. Lancet Digit. Health 4, e406–e414 (2022).

doi: 10.1016/S2589-7500(22)00063-2 pubmed: 35568690 pmcid: 9650160

Glocker, B., Jones, C., Bernhardt, M. & Winzeck, S. Risk of bias in chest radiography foundation models. Radiol. Artif. Intell. https://doi.org/10.1148/ryai.230060 (2023).

Zou, J. & Schiebinger, L. Ensuring that biomedical AI benefits diverse populations. eBioMedicine https://doi.org/10.1016/j.ebiom.2021.103358 (2021).

Jobin, A., Ienca, M. & Vayena, E. The global landscape of AI ethics guidelines. Nat. Mach. Intell. 1, 389–399. (2019).

doi: 10.1038/s42256-019-0088-2

Ethics and Governance of Artificial Intelligence for Health (WHO 2021); https://www.who.int/publications-detail-redirect/9789240029200

Block, R. G. et al. Recommendations for improving national clinical datasets for health equity research. J. Am. Med. Inform. Assoc. 27, 1802–1807 (2020).

doi: 10.1093/jamia/ocaa144 pubmed: 32885240 pmcid: 7671626

DeVoe, J. E. et al. The ADVANCE network: accelerating data value across a national community health center network. J. Am. Med. Inform. Assoc. 21, 591–595 (2014).

doi: 10.1136/amiajnl-2014-002744 pubmed: 24821740 pmcid: 4078289

Hasnain-Wynia, R. & Baker, D. W. Obtaining data on patient race, ethnicity, and primary language in health care organizations: current challenges and proposed solutions. Health Serv. Res. 411, 1501–1518 (2006).

doi: 10.1111/j.1475-6773.2006.00552.x

Computer-Assisted Detection Devices Applied to Radiology Images and Radiology Device Data - Premarket Notification [510(k)] Submissions. (FDA, 2022); https://www.fda.gov/regulatory-information/search-fda-guidance-documents/computer-assisted-detection-devices-applied-radiology-images-and-radiology-device-data-premarket

Ganapathi, S. et al. Tackling bias in AI health datasets through the STANDING Together initiative. Nat. Med. 28, 2232–2233 (2022).

doi: 10.1038/s41591-022-01987-w pubmed: 36163296

Vollmer, S. et al. Machine learning and artificial intelligence research for patient benefit: 20 critical questions on transparency, replicability, ethics, and effectiveness. Br. Med. J. 368, l6927 (2020).

doi: 10.1136/bmj.l6927

Challen, R. et al. Artificial intelligence, bias and clinical safety. BMJ Qual. Saf. 28, 231–237 (2019).

doi: 10.1136/bmjqs-2018-008370 pubmed: 30636200 pmcid: 6560460

Wiens, J. et al. Do no harm: a roadmap for responsible machine learning for health care. Nat. Med. 25, 1337–40. (2019).

doi: 10.1038/s41591-019-0548-6 pubmed: 31427808

Saleh, S., Boag, W., Erdman, L. & Naumann, T. Clinical collabsheets: 53 questions to guide a clinical collaboration. In Proc. 5th Machine Learning for Healthcare Conference (eds Doshi-Velez, F. et al.) 783–812 (PMLR, 2022); https://proceedings.mlr.press/v126/saleh20a.html

Ferryman, K. Addressing health disparities in the Food and Drug Administration’s artificial intelligence and machine learning regulatory framework. J. Am. Med. Inform. Assoc. 27, 2016–2019 (2020).

doi: 10.1093/jamia/ocaa133 pubmed: 32951036 pmcid: 7727393

Suresh, H. & Guttag, J. A framework for understanding sources of harm throughout the machine learning life cycle. In Proc. Equity and Access in Algorithms, Mechanisms, and Optimization 1–9 (ACM, 2021); https://dl.acm.org/doi/10.1145/3465416.3483305

Lysaght, T., Lim, H. Y., Xafis, V. & Ngiam, K. Y. AI-assisted decision-making in healthcare: the application of an ethics framework for big data in health and research. Asian Bioeth. Rev. 11, 299–314 (2019).

doi: 10.1007/s41649-019-00096-0 pubmed: 33717318 pmcid: 7747260

Flanagin, A., Frey, T., Christiansen, S. L. & Bauchner, H. The reporting of race and ethnicity in medical and science journals: comments invited. JAMA 325, 1049–1052 (2021).

doi: 10.1001/jama.2021.2104 pubmed: 33616604

Cerdeña, J. P., Grubbs, V. & Non, A. L. Racialising genetic risk: assumptions, realities, and recommendations. Lancet 400, 2147–2154. (2022).

doi: 10.1016/S0140-6736(22)02040-2 pubmed: 36502852

Elias, J. Google contractor reportedly tricked homeless people into face scans. CNBC https://www.cnbc.com/2019/10/03/google-contractor-reportedly-tricked-homeless-people-into-face-scans.html (2019).

Equality Act 2010. Statute Law Database (UK Government, 2010); https://www.legislation.gov.uk/ukpga/2010/15/section/4

Declaration of the High-Level Meeting of the General Assembly on the Rule of Law at the National and International Levels (UN General Assembly, 2012); https://digitallibrary.un.org/record/734369

Article 21 - Non-Discrimination (European Union Agency for Fundamental Rights, 2007); https://fra.europa.eu/en/eu-charter/article/21-non-discrimination

Gebru, T. et al. Datasheets for datasets. Preprint at http://arxiv.org/abs/1803.09010 (2021).

Rostamzadeh, N. et al. Healthsheet: development of a transparency artifact for health datasets. In Proc. 2022 ACM Conference on Fairness, Accountability, and Transparency 1943–1961 (ACM, 2022); https://doi.org/10.1145/3531146.3533239

Smeaton, J. & Christie, L. AI and healthcare. UK Parliament POSTnote https://post.parliament.uk/research-briefings/post-pn-0637/ (2021).

Human bias and discrimination in AI systems. ICO https://webarchive.nationalarchives.gov.uk/ukgwa/20211004162239/https://ico.org.uk/about-the-ico/news-and-events/ai-blog-human-bias-and-discrimination-in-ai-systems/ (2019).

Artificial Intelligence and Machine Learning in Software as a Medical Device (FDA, 2021); https://www.fda.gov/medical-devices/software-medical-device-samd/artificial-intelligence-and-machine-learning-software-medical-device

A Governance Framework for Algorithmic Accountability and Transparency (European Parliament, Directorate-General for Parliamentary Research Services, 2019); https://data.europa.eu/doi/10.2861/59990

WHO Issues First Global Report on Artificial Intelligence (AI) in Health and Six Guiding Principles for Its Design and Use (WHO, 2021); https://www.who.int/news/item/28-06-2021-who-issues-first-global-report-on-ai-in-health-and-six-guiding-principles-for-its-design-and-use

Regulatory Horizons Council: The Regulation of Artificial Intelligence as a Medical Device. (UK Government, 2022); https://www.gov.uk/government/publications/regulatory-horizons-council-the-regulation-of-artificial-intelligence-as-a-medical-device

Arora, A. & Arora, A. Generative adversarial networks and synthetic patient data: current challenges and future perspectives. Future Healthc. J. 9, 190–193 (2022).

doi: 10.7861/fhj.2022-0013 pubmed: 35928184 pmcid: 9345230

Burlina, P., Joshi, N., Paul, W., Pacheco, K. D. & Bressler, N. M. Addressing artificial intelligence bias in retinal diagnostics. Transl. Vis. Sci. Technol. 10, 13 (2021).

doi: 10.1167/tvst.10.2.13 pubmed: 34003898 pmcid: 7884292

Koivu, A., Sairanen, M., Airola, A. & Pahikkala, T. Synthetic minority oversampling of vital statistics data with generative adversarial networks. J. Am. Med. Inform. Assoc. 27, 1667–74. (2020).

doi: 10.1093/jamia/ocaa127 pubmed: 32885818 pmcid: 7750982

Murphy, K. et al. Artificial intelligence for good health: a scoping review of the ethics literature. BMC Med. Ethics 22, 14 (2021).

doi: 10.1186/s12910-021-00577-8 pubmed: 33588803 pmcid: 7885243

Wilkinson, M. D. et al. The FAIR guiding principles for scientific data management and stewardship. Sci. Data 3, 160018 (2016).

doi: 10.1038/sdata.2016.18 pubmed: 26978244 pmcid: 4792175

Liu, X., Cruz Rivera, S., Moher, D., Calvert, M. J. & Denniston, A. K. Reporting guidelines for clinical trial reports for interventions involving artificial intelligence: the CONSORT-AI extension. Nat. Med. 26, 1364–74. (2020).

doi: 10.1038/s41591-020-1034-x pubmed: 32908283 pmcid: 7598943

Villarroel, N., Davidson, E., Pereyra-Zamora, P., Krasnik, A. & Bhopal, R. S. Heterogeneity/granularity in ethnicity classifications project: the need for refining assessment of health status. Eur. J. Public Health 29, 260–266 (2019).

doi: 10.1093/eurpub/cky191 pubmed: 30260371

Denton, E. et al. Bringing the people back in: contesting benchmark machine learning datasets. Preprint at http://arxiv.org/abs/2007.07399 (2020).

Holland, S., Hosny, A., Newman, S., Joseph, J. & Chmielinski, K. The dataset nutrition label: a framework to drive higher data quality standards. Preprint at http://arxiv.org/abs/1805.03677 (2018).

Floridi, L., Cowls, J., King, T. C. & Taddeo, M. How to design AI for social good: seven essential factors. Sci. Eng. Ethics 26, 1771–1796. (2020).

doi: 10.1007/s11948-020-00213-5 pubmed: 32246245 pmcid: 7286860

Char, D. S., Abràmoff, M. D. & Feudtner, C. Identifying ethical considerations for machine learning healthcare applications. Am. J. Bioeth. 20, 7–17 (2020).

doi: 10.1080/15265161.2020.1819469 pubmed: 33103967 pmcid: 7737650

Griffiths, K. E., Blain, J., Vajdic, C. M. & Jorm, L. Indigenous and tribal peoples data governance in health research: a systematic review. Int. J. Environ. Res. Public Health 18, 10318 (2021).

doi: 10.3390/ijerph181910318 pubmed: 34639617 pmcid: 8508308

Hernandez-Boussard, T., Bozkurt, S., Ioannidis, J. P. A. & Shah, N. H. MINIMAR (MINimum Information for Medical AI Reporting): developing reporting standards for artificial intelligence in health care. J. Am. Med. Inform. Assoc. 27, 2011–2015 (2020).

doi: 10.1093/jamia/ocaa088 pubmed: 32594179 pmcid: 7727333

Paulus, J. K. & Kent, D. M. Predictably unequal: understanding and addressing concerns that algorithmic clinical prediction may increase health disparities. NPJ Digit. Med. 3, 1–8 (2020).

doi: 10.1038/s41746-020-0304-9

McCradden, M. D., Joshi, S., Mazwi, M. & Anderson, J. A. Ethical limitations of algorithmic fairness solutions in health care machine learning. Lancet Digit. Health 2, e221–e223 (2020).

doi: 10.1016/S2589-7500(20)30065-0 pubmed: 33328054

Douglas, M. D., Dawes, D. E., Holden, K. B. & Mack, D. Missed policy opportunities to advance health equity by recording demographic data in electronic health records. Am. J. Public Health 105, S380–S388 (2015).

doi: 10.2105/AJPH.2014.302384 pubmed: 25905840 pmcid: 4455508

Mitchell, M. et al. Model cards for model reporting. In Proc. Conference on Fairness, Accountability, and Transparency (FAT* '19) 220–229 (ACM, 2019); https://doi.org/10.1145/3287560.3287596

Mörch, C. M., Gupta, A. & Mishara, B. L. Canada protocol: an ethical checklist for the use of artificial Intelligence in suicide prevention and mental health. Artif. Intell. Med. 108, 101934 (2020).

doi: 10.1016/j.artmed.2020.101934 pubmed: 32972663

Saleiro, P. et al. Aequitas: a bias and fairness audit toolkit. Preprint at http://arxiv.org/abs/1811.05577 (2019).

Xafis, V. et al. An ethics framework for big data in health and research. Asian Bioeth. Rev. 11, 227–254. (2019).

doi: 10.1007/s41649-019-00099-x pubmed: 33717314 pmcid: 7747261

Abstracts from the 53rd European Society of Human Genetics (ESHG) conference: e-posters. Eur. J. Hum. Genet. 28, 798–1016 (2020).

Zhang, X. et al. Big data science: opportunities and challenges to address minority health and health disparities in the 21st century. Ethn. Dis. 27, 95–106 (2017).

doi: 10.18865/ed.27.2.95 pubmed: 28439179 pmcid: 5398183

Rajkomar, A., Hardt, M., Howell, M. D., Corrado, G. & Chin, M. H. Ensuring fairness in machine learning to advance health equity. Ann. Intern. Med. 169, 866–872 (2018).

doi: 10.7326/M18-1990 pubmed: 30508424 pmcid: 6594166

Fletcher, R. R., Nakeshimana, A. & Olubeko, O. Addressing fairness, bias, and appropriate use of artificial intelligence and machine learning in global health. Front. Artif. Intell. https://doi.org/10.3389/frai.2020.561802 (2021).

The PRISMA 2020 statement: an updated guideline for reporting systematic reviews. BMJ 372, n71 (2021).

Braun, V. & Clarke, V. Using thematic analysis in psychology. Qual. Res. Psychol. 3, 77–101 (2006).

doi: 10.1191/1478088706qp063oa