Development, evaluation, and validation of machine learning models for COVID-19 detection based on routine blood tests.


Journal

Clinical chemistry and laboratory medicine
ISSN: 1437-4331
Abbreviated title: Clin Chem Lab Med
Country: Germany
NLM ID: 9806306

Publication information

Publication date:
21 10 2020
History:
received: 25 08 2020
accepted: 07 10 2020
pubmed: 21 10 2020
medline: 16 2 2021
entrez: 20 10 2020
Status: epublish

Abstract

The rRT-PCR test, the current gold standard for the detection of coronavirus disease (COVID-19), has known shortcomings, such as long turnaround time, potential shortage of reagents, false-negative rates around 15-20%, and expensive equipment. The hematochemical values of routine blood exams could represent a faster and less expensive alternative. Three different training data sets of hematochemical values from 1,624 patients (52% COVID-19 positive), admitted to San Raphael Hospital (OSR) from February to May 2020, were used for developing machine learning (ML) models: the complete OSR dataset (72 features: complete blood count (CBC), biochemical, coagulation, hemogasanalysis and CO-Oxymetry values, age, sex and specific symptoms at triage) and two sub-datasets (the COVID-specific and CBC datasets, with 32 and 21 features respectively). A further 58 cases (50% COVID-19 positive) from another hospital, and 54 negative patients collected in 2018 at OSR, were used for internal-external and external validation. We developed five ML models: for the complete OSR dataset, the area under the receiver operating characteristic curve (AUC) for the algorithms ranged from 0.83 to 0.90; for the COVID-specific dataset, from 0.83 to 0.87; and for the CBC dataset, from 0.74 to 0.86. The validations also achieved good results: AUC ranged from 0.75 to 0.78 on the 58 cases from the other hospital, and specificity ranged from 0.92 to 0.96 on the 54 negative patients from 2018. ML can be applied to blood tests as both an adjunct and an alternative method to rRT-PCR for the fast and cost-effective identification of COVID-19-positive patients. This is especially useful in developing countries, or in countries facing an increase in contagions.
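
The two metrics reported in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the function names, the 0.5 decision threshold, and the toy labels and scores below are all assumptions chosen for illustration. It shows why AUC needs both positive and negative cases, while an all-negative validation set (such as the 54 patients from 2018) can only yield specificity.

```python
def auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs in which the
    positive case receives the higher score; ties count as one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def specificity(labels, scores, threshold=0.5):
    """Share of true negatives among all negative cases.
    The threshold of 0.5 is an illustrative assumption."""
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not neg:
        raise ValueError("specificity needs at least one negative case")
    return sum(1 for s in neg if s < threshold) / len(neg)


# Hypothetical toy data: 1 = COVID-19 positive, 0 = negative.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.1]

print(auc(labels, scores))          # 8 of 9 pairs ranked correctly
print(specificity(labels, scores))  # 2 of 3 negatives below the threshold
```

On a set containing only negative cases, `auc` raises an error because there are no positive-negative pairs to rank, whereas `specificity` is still well defined.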

Identifiers

pubmed: 33079698
doi: 10.1515/cclm-2020-1294
pii: cclm-2020-1294

Publication types

Journal Article

Languages

eng

Citation subsets

IM

Pagination

421-431

Authors

Federico Cabitza (F)

DISCo, Università degli Studi di Milano-Bicocca, Milan, Italy.

Andrea Campagner (A)

IRCCS Istituto Ortopedico Galeazzi, Laboratory of Clinical Chemistry and Microbiology, Milan, Italy.

Davide Ferrari (D)

SCVSA Department, University of Parma, Parma, Italy.

Chiara Di Resta (C)

Vita-Salute San Raffaele University; Unit of Genomics for Human Disease Diagnosis, Division of Genetics and Cell Biology, Milan, Italy.

Daniele Ceriotti (D)

Laboratory Medicine, IRCCS San Raffaele Scientific Institute, Milan, Italy.

Eleonora Sabetta (E)

Laboratory Medicine, IRCCS San Raffaele Scientific Institute, Milan, Italy.

Alessandra Colombini (A)

IRCCS Istituto Ortopedico Galeazzi, Laboratory of Clinical Chemistry and Microbiology, Milan, Italy.

Elena De Vecchi (E)

IRCCS Istituto Ortopedico Galeazzi, Laboratory of Clinical Chemistry and Microbiology, Milan, Italy.

Giuseppe Banfi (G)

IRCCS Istituto Ortopedico Galeazzi, Laboratory of Clinical Chemistry and Microbiology, Milan, Italy.

Massimo Locatelli (M)

Laboratory Medicine, IRCCS San Raffaele Scientific Institute, Milan, Italy.

Anna Carobene (A)

Laboratory Medicine, IRCCS San Raffaele Scientific Institute, Milan, Italy.

MeSH classifications