Development, evaluation, and validation of machine learning models for COVID-19 detection based on routine blood tests.
COVID-19
SARS-CoV-2
blood laboratory tests
complete blood count
gradient boosted decision tree
machine learning
Journal
Clinical chemistry and laboratory medicine
ISSN: 1437-4331
Abbreviated title: Clin Chem Lab Med
Country: Germany
NLM ID: 9806306
Publication information
Publication date:
21 10 2020
History:
received: 25 08 2020
accepted: 07 10 2020
pubmed: 21 10 2020
medline: 16 2 2021
entrez: 20 10 2020
Status:
epublish
Abstract
The rRT-PCR test, the current gold standard for the detection of coronavirus disease (COVID-19), has known shortcomings: long turnaround time, potential shortage of reagents, false-negative rates around 15-20%, and the need for expensive equipment. The hematochemical values of routine blood exams could represent a faster and less expensive alternative. Three different training datasets of hematochemical values from 1,624 patients (52% COVID-19 positive), admitted at San Raphael Hospital (OSR) from February to May 2020, were used to develop machine learning (ML) models: the complete OSR dataset (72 features: complete blood count (CBC), biochemical, coagulation, blood gas analysis and CO-oximetry values, age, sex and specific symptoms at triage) and two sub-datasets (the COVID-specific and CBC datasets, with 32 and 21 features respectively). For internal-external and external validation, we used 58 cases (50% COVID-19 positive) from another hospital and 54 negative patients collected in 2018 at OSR. We developed five ML models: for the complete OSR dataset, the area under the receiver operating characteristic curve (AUC) for the algorithms ranged from 0.83 to 0.90; for the COVID-specific dataset, from 0.83 to 0.87; and for the CBC dataset, from 0.74 to 0.86. The validations also achieved good results: AUC from 0.75 to 0.78 on the internal-external validation set, and specificity from 0.92 to 0.96 on the external validation set. ML can be applied to blood tests as both an adjunct and an alternative to rRT-PCR for the fast and cost-effective identification of COVID-19-positive patients. This is especially useful in developing countries, or in countries facing an increase in contagions.
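The two metrics reported in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the function names, the 0.5 decision threshold, and the toy scores are all assumptions made for illustration. It also shows why a cohort made up only of negative patients, such as the 2018 OSR set, supports a specificity estimate but not an AUC.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs in which the
    positive case receives the higher score; ties count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def specificity(labels: Sequence[int], scores: Sequence[float],
                threshold: float = 0.5) -> float:
    """Share of truly negative cases scored below the threshold.
    The 0.5 default is an illustrative assumption, not the study's cutoff."""
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not neg:
        raise ValueError("specificity needs at least one negative case")
    return sum(1 for s in neg if s < threshold) / len(neg)


# Invented toy data (1 = COVID-19 positive, 0 = negative); not from the study.
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2]

# An all-negative cohort, analogous in shape to a pre-pandemic control set.
control_labels = [0, 0, 0, 0]
control_scores = [0.1, 0.2, 0.7, 0.3]

if __name__ == "__main__":
    print("toy AUC:", roc_auc(toy_labels, toy_scores))
    print("control specificity:", specificity(control_labels, control_scores))
```

Calling `roc_auc` on the all-negative control cohort raises `ValueError`, since no positive-negative pairs exist to rank.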
Identifiers
pubmed: 33079698
doi: 10.1515/cclm-2020-1294
pii: cclm-2020-1294
Publication types
Journal Article
Languages
eng
Citation subsets
IM
Pagination
421-431