Improved clinical data imputation via classical and quantum determinantal point processes.

clinical; computational biology; critical care unit; human; survival; systems biology

Journal

eLife
ISSN: 2050-084X
Abbreviated title: Elife
Country: England
NLM ID: 101579614

Publication information

Publication date:
09 May 2024
History:
medline: 9 5 2024
pubmed: 9 5 2024
entrez: 9 5 2024
Status: epublish

Abstract

Imputing data is a critical issue for machine learning practitioners, including in the life sciences domain, where missing clinical data is a typical situation and the reliability of the imputation is of great importance. Currently, there is no canonical approach for imputation of clinical data and widely used algorithms introduce variance in the downstream classification. Here we propose novel imputation methods based on determinantal point processes (DPP) that enhance popular techniques such as the multivariate imputation by chained equations and MissForest. Their advantages are twofold: improving the quality of the imputed data demonstrated by increased accuracy of the downstream classification and providing deterministic and reliable imputations that remove the variance from the classification results. We experimentally demonstrate the advantages of our methods by performing extensive imputations on synthetic and real clinical data. We also perform quantum hardware experiments by applying the quantum circuits for DPP sampling since such quantum algorithms provide a computational advantage with respect to classical ones. We demonstrate competitive results with up to 10 qubits for small-scale imputation tasks on a state-of-the-art IBM quantum processor. Our classical and quantum methods improve the effectiveness and robustness of clinical data prediction modeling by providing better and more reliable data imputations. These improvements can add significant value in settings demanding high precision, such as in pharmaceutical drug trials where our approach can provide higher confidence in the predictions made.
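
To illustrate the sampling idea behind a determinantal point process, the following is a minimal sketch, not the authors' implementation. It assumes a Gram kernel L = X Xᵀ over the data rows and enumerates every size-k subset, which is only feasible for very small inputs. The function names (`k_dpp_probabilities`, `sample_k_dpp`, `most_diverse_subset`) and the toy matrix are hypothetical. Each subset is weighted by the determinant of its kernel submatrix, so subsets containing near-duplicate rows get a probability close to zero. Picking the highest-determinant subset is shown as one simple way to make the choice deterministic.

```python
import itertools
import numpy as np


def k_dpp_probabilities(X, k):
    """Return all size-k row subsets of X with their k-DPP probabilities."""
    L = X @ X.T  # Gram kernel: similarity between rows (an assumed choice)
    subsets = list(itertools.combinations(range(X.shape[0]), k))
    # Unnormalized weight of a subset is the determinant of its submatrix.
    weights = np.array([np.linalg.det(L[np.ix_(s, s)]) for s in subsets])
    weights = np.clip(weights, 0.0, None)  # guard against negative round-off
    return subsets, weights / weights.sum()


def sample_k_dpp(X, k, rng):
    """Draw one subset at random, proportionally to its determinant."""
    subsets, probs = k_dpp_probabilities(X, k)
    return subsets[rng.choice(len(subsets), p=probs)]


def most_diverse_subset(X, k):
    """Deterministic variant: return a subset with the largest determinant."""
    subsets, probs = k_dpp_probabilities(X, k)
    return subsets[int(np.argmax(probs))]


# Toy data: rows 0 and 1 are identical, so they should never be paired.
X = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.6, 0.8]])
rng = np.random.default_rng(0)
subsets, probs = k_dpp_probabilities(X, 2)
drawn = sample_k_dpp(X, 2, rng)
best = most_diverse_subset(X, 2)
```

In this toy case the pair of identical rows has determinant zero and is never drawn, while orthogonal rows form the highest-weight pairs. In an imputation setting, such a sampler could be used to select diverse rows or features for training each model, which is the role the abstract describes only at a high level.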

Identifiers

pubmed: 38722146
doi: 10.7554/eLife.89947
pii: 89947

Publication types

Journal Article; Research Support, Non-U.S. Gov't

Languages

eng

Citation subsets

IM

Copyright information

© 2023, Kazdaghli et al.

Conflict of interest statement

SK, IK: No competing interests declared. JK, PT are employees of AstraZeneca. The authors declare that no other competing interests exist.

Authors

Iordanis Kerenidis (I)

QC Ware, Paris, France.
Universite de Paris, CNRS, IRIF, Paris, France.

Jens Kieckbusch (J)

Emerging Innovations Unit, Discovery Sciences, BioPharmaceuticals R&D, AstraZeneca, Cambridge, United Kingdom.

Philip Teare (P)

Centre for AI, Data Science & AI, BioPharmaceuticals R&D, AstraZeneca, Cambridge, United Kingdom.

MeSH classifications