Revealing transparency gaps in publicly available COVID-19 datasets used for medical artificial intelligence development: a systematic review.
Journal
The Lancet. Digital health
ISSN: 2589-7500
Abbreviated title: Lancet Digit Health
Country: England
NLM ID: 101751302
Publication information
Publication date:
Nov 2024
History:
received: 08 Feb 2024
revised: 21 Jun 2024
accepted: 27 Jun 2024
medline: 26 Oct 2024
pubmed: 26 Oct 2024
entrez: 25 Oct 2024
Status:
ppublish
Abstract
During the COVID-19 pandemic, artificial intelligence (AI) models were created to address health-care resource constraints. Previous research shows that health-care datasets often have limitations, leading to biased AI technologies. This systematic review assessed datasets used for AI development during the pandemic, identifying several deficiencies. Datasets were identified by screening articles from MEDLINE and using Google Dataset Search. 192 datasets were analysed for metadata completeness, composition, data accessibility, and ethical considerations. Findings revealed substantial gaps: only 48% of datasets documented individuals' country of origin, 43% reported age, and under 25% included sex, gender, race, or ethnicity. Information on data labelling, ethical review, or consent was frequently missing. Many datasets reused data with inadequate traceability. Notably, historical paediatric chest x-rays appeared in some datasets without acknowledgment. These deficiencies highlight the need for better data quality and transparent documentation to lessen the risk that biased AI models are developed in future health emergencies.
Identifiers
pubmed: 39455195
pii: S2589-7500(24)00146-8
doi: 10.1016/S2589-7500(24)00146-8
Publication types
Systematic Review
Journal Article
Review
Languages
eng
Citation subsets
IM
Pagination
e827-e847
Copyright information
Copyright © 2024 The Author(s). Published by Elsevier Ltd. This is an Open Access article under the CC BY 4.0 license. All rights reserved.
Conflict of interest statement
Declaration of interests: JEA is a named researcher on grants from the Medical Research Council (MRC) and the Engineering and Physical Sciences Research Council with payments made to his institution for the delivery of other research projects; and is the co-organiser of the Alan Turing Institute Clinical AI interest group (unpaid). ELe has received grants from the NHS AI Lab. MCa has received grants from the National Institute for Health and Care Research (NIHR), Health Data Research UK (HDR-UK), Innovate UK, Macmillan Cancer Support, GlaxoSmithKline, UCB Pharma, Research England as part of United Kingdom Research and Innovation (UKRI), European Commission, European Federation of Pharmaceutical Industries and Associations, Brain Tumour Charity, Gilead, Janssen, UKRI, and Merck for the delivery of other research projects; has received payment for delivering lectures from the University of Maastricht; has received a speaker fee from Cochrane Portugal; has received payments for reviewing from the South-Eastern Norway Regional Health Authority and Singapore National Medical Research Council; has received consulting fees from Aparito, CIS Oncology, Takeda, Merck, Daiichi Sankyo, Glaukos, GlaxoSmithKline, Patient-Centered Outcomes Research Institute, Genentech, Vertex, ICON, Halfloop, and Pfizer; and is associated with the Proteus Consortium. MG has received grants from the Canadian Institute for Advanced Research, Helmsley Trust, Wellcome Trust, Moore Foundation, Volkswagen Foundation, I-Clinic, IBM-AI, Janssen Research and Development, Takeda, Quanta Computing, and Microsoft Research; and has acted as an advisor for the Symposium on Artificial Intelligence for Learning Health Systems and Conference on Health, Inference, and Learning. MDM has received grants from the Canadian Institutes of Health Research, AMS Healthcare, and the SickKids Foundation.
JO is or has previously been an unpaid member of the AdvaMed AI Framework Group, the AdvaMed Software Working Group, the MedTech Europe AI Working Group, and the MedTech Europe Digital Health Committee. CS has received grants from NIHR, UKRI, MRC, and NIHR Cambridge Biomedical Research Centre. RNM has received grants from MRC, British Heart Foundation, United States Agency for International Development, and HDR-UK. AKD has received grants from MRC and NIHR. XL has received grants from NIHR, Wellcome Trust, Research England, and Moorfields Eye Hospital Charity; has received consulting fees from Hardian Health; has received payment or honoraria from the University of Turku; and is employed by Apple as a health scientist. All other authors declare no competing interests.