Creating symptom-based criteria for diagnostic testing: a case study based on a multivariate analysis of data collected during the first wave of the COVID-19 pandemic in New Zealand.
COVID-19
Epidemiology
Machine learning
Symptoms
Triaging
Journal
BMC Infectious Diseases
ISSN: 1471-2334
Abbreviated title: BMC Infect Dis
Country: England
NLM ID: 100968551
Publication information
Publication date:
30 Oct 2021
History:
received: 2 Jun 2021
accepted: 20 Oct 2021
entrez: 30 Oct 2021
pubmed: 31 Oct 2021
medline: 3 Nov 2021
Status:
epublish
Abstract
BACKGROUND
Diagnostic testing using PCR is a fundamental component of COVID-19 pandemic control. Criteria for determining who should be tested by PCR vary between countries, and ultimately depend on resource constraints and public health objectives. Decisions are often based on sets of symptoms in individuals presenting to health services, together with travel history and demographic variables such as age. The objective of this study was to determine the sensitivity and specificity of sets of symptoms used for triaging individuals for confirmatory testing, with the aim of optimising public health decision making under different scenarios.
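As a minimal sketch of the two quantities the study sets out to estimate, the Python below scores an "any of these symptoms" triage rule against PCR results. The record layout, the function name and the five toy records are hypothetical and are not taken from the study data.

```python
from typing import Iterable, List, Set, Tuple

# Hypothetical record layout: (set of reported symptoms, PCR-positive?).
Record = Tuple[Set[str], bool]


def sensitivity_specificity(records: List[Record],
                            rule: Iterable[str]) -> Tuple[float, float]:
    """Score an any-of symptom rule, treating PCR as the reference test."""
    rule_set = set(rule)
    tp = fn = tn = fp = 0
    for symptoms, pcr_positive in records:
        flagged = bool(symptoms & rule_set)  # at least one rule symptom reported
        if pcr_positive and flagged:
            tp += 1
        elif pcr_positive:
            fn += 1
        elif flagged:
            fp += 1
        else:
            tn += 1
    # Assumes at least one PCR-positive and one PCR-negative record.
    return tp / (tp + fn), tn / (tn + fp)


# Toy data, invented purely for illustration.
toy_records: List[Record] = [
    ({"cough", "anosmia"}, True),
    ({"headache"}, True),
    ({"cough"}, False),
    ({"sore throat"}, False),
    ({"weakness"}, False),
]
respiratory_rule = {"cough", "sore throat"}

sens, spec = sensitivity_specificity(toy_records, respiratory_rule)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

Sensitivity here is the share of PCR-positive individuals the rule would send for testing, and specificity is the share of PCR-negative individuals it would leave untested.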
METHODS
Data from the first wave of COVID-19 in New Zealand were analysed, comprising 1153 PCR-confirmed and 4750 symptomatic PCR-negative individuals. The data were analysed using Multiple Correspondence Analysis (MCA), automated search algorithms, Bayesian Latent Class Analysis, Decision Tree Analysis and Random Forest (RF) machine learning.
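The abstract does not say which automated search algorithm was used. Purely as an illustration of the idea, the following Python sketch enumerates every symptom subset up to a fixed size and ranks the resulting any-of rules by Youden's J (sensitivity + specificity - 1). The ranking criterion, the function names and the toy records are all assumptions.

```python
from itertools import combinations
from typing import FrozenSet, Iterable, List, Tuple

# Hypothetical record layout: (reported symptoms, PCR-positive?).
Record = Tuple[FrozenSet[str], bool]


def score(records: List[Record], rule: FrozenSet[str]) -> Tuple[float, float]:
    """Return (sensitivity, specificity) of flagging anyone with a rule symptom."""
    pos = [s for s, pcr in records if pcr]
    neg = [s for s, pcr in records if not pcr]
    # Assumes both groups are non-empty.
    sens = sum(1 for s in pos if s & rule) / len(pos)
    spec = sum(1 for s in neg if not (s & rule)) / len(neg)
    return sens, spec


def search_rules(records: List[Record], symptoms: Iterable[str],
                 max_size: int = 2):
    """Exhaustively score all any-of rules with up to max_size symptoms."""
    results = []
    for k in range(1, max_size + 1):
        for combo in combinations(sorted(symptoms), k):
            sens, spec = score(records, frozenset(combo))
            results.append((sens + spec - 1, combo, sens, spec))
    # Best Youden's J first; prefer shorter rules on ties.
    results.sort(key=lambda r: (-r[0], len(r[1]), r[1]))
    return results


# Toy data, invented purely for illustration.
TOY: List[Record] = [
    (frozenset({"cough", "anosmia"}), True),
    (frozenset({"anosmia", "headache"}), True),
    (frozenset({"cough"}), False),
    (frozenset({"sore throat"}), False),
    (frozenset({"headache"}), False),
]

ranked = search_rules(TOY, {"cough", "anosmia", "headache", "sore throat"})
best_j, best_rule, best_sens, best_spec = ranked[0]
print(best_rule, best_sens, best_spec)
```

Exhaustive enumeration is only practical for small symptom lists and small rule sizes; with 15 symptoms, a greedy or heuristic search would be the more likely choice.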
RESULTS
The clinical criteria used to guide who should be tested by PCR were based on a set of mostly respiratory symptoms: a new or worsening cough, sore throat, shortness of breath, coryza, anosmia, with or without fever. This set has relatively high sensitivity (> 90%) but low specificity (< 10%), using PCR as a quasi-gold standard. In contrast, a group of mostly non-respiratory symptoms, including weakness, muscle pain, joint pain, headache, anosmia and ageusia, explained more variance in the MCA and was associated with higher specificity, at the cost of reduced sensitivity. RF models that incorporated 15 common symptoms, age, sex and prioritised ethnicity were both sensitive and specific (> 85% for both) in predicting PCR outcomes.
CONCLUSIONS
If predominantly respiratory symptoms are used for test-triaging, a large proportion of the individuals being tested may not have COVID-19. This could overwhelm testing capacity and hinder attempts to trace and eliminate infection. Specificity can be increased using alternative rules based on sets of symptoms informed by multivariate analysis and automated search algorithms, albeit at the cost of sensitivity. Machine learning algorithms that incorporate symptom and demographic data can improve both sensitivity and specificity, and hence may provide an alternative approach to test-triaging that can be optimised according to prevailing conditions.
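To make the capacity argument concrete, the sketch below works through the arithmetic for a cohort of the size analysed here (1153 PCR-positive and 4750 PCR-negative individuals). The abstract reports only bounds for the respiratory rule (sensitivity > 90%, specificity < 10%), so the values 0.90 and 0.10 are illustrative assumptions, as are the function and variable names. The resulting proportions apply to this case mix only and would differ at another prevalence.

```python
from typing import Dict


def triage_yield(n_pos: int, n_neg: int,
                 sensitivity: float, specificity: float) -> Dict[str, float]:
    """Expected testing workload and yield when a triage rule gates PCR testing."""
    true_pos = sensitivity * n_pos            # infected and sent for PCR
    false_pos = (1.0 - specificity) * n_neg   # uninfected but sent for PCR
    tested = true_pos + false_pos
    return {
        "tested": tested,
        "true_positives": true_pos,
        "false_positives": false_pos,
        "missed_cases": n_pos - true_pos,
        # Share of tested individuals who are PCR-positive.
        "ppv": true_pos / tested if tested else float("nan"),
    }


# Cohort sizes from the abstract; 0.90 and 0.10 are assumed boundary values.
respiratory = triage_yield(n_pos=1153, n_neg=4750,
                           sensitivity=0.90, specificity=0.10)
print({k: round(v, 3) for k, v in respiratory.items()})
```

Under these assumed values roughly four in five people referred for PCR would test negative, which is the workload problem the conclusions describe.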
Identifiers
pubmed: 34715802
doi: 10.1186/s12879-021-06810-4
pii: 10.1186/s12879-021-06810-4
pmc: PMC8556148
Publication types
Journal Article
Languages
eng
Citation subsets
IM
Pagination
1119
Copyright information
© 2021. The Author(s).