Seek COVER: using a disease proxy to rapidly develop and validate a personalized risk calculator for COVID-19 outcomes in an international network.


Journal

BMC Medical Research Methodology
ISSN: 1471-2288
Abbreviated title: BMC Med Res Methodol
Country: England
NLM ID: 100968545

Publication information

Publication date:
30 01 2022
History:
received: 06 01 2021
accepted: 03 01 2022
entrez: 31 1 2022
pubmed: 1 2 2022
medline: 3 2 2022
Status: epublish

Abstract

BACKGROUND
We investigated whether influenza data could be used to develop prediction models for COVID-19, so that such models can be developed and validated quickly and reliably early in a pandemic. We developed COVID-19 Estimated Risk (COVER) scores that quantify a patient's risk of hospital admission with pneumonia (COVER-H), hospitalization with pneumonia requiring intensive services or death (COVER-I), or fatality (COVER-F) in the 30 days following COVID-19 diagnosis. The scores were developed using historical data from patients with influenza or flu-like symptoms and then tested in COVID-19 patients.
METHODS
We analyzed a federated network of electronic medical records and administrative claims data from 14 data sources in 6 countries, containing data collected on or before April 27, 2020. We used a 2-step process to develop 3 scores using historical data from patients with influenza or flu-like symptoms at any time prior to 2020. In the first step, we created a data-driven model using LASSO regularized logistic regression. The covariates of that model were then used to develop aggregate covariates for the second step, in which the COVER scores were developed using a smaller set of features. These 3 COVER scores were then externally validated on patients with 1) influenza or flu-like symptoms and 2) confirmed or suspected COVID-19 diagnosis across 5 databases from South Korea, Spain, and the United States. Outcomes included i) hospitalization with pneumonia, ii) hospitalization with pneumonia requiring intensive services or death, and iii) death in the 30 days after index date.
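The two-step idea described in the methods can be illustrated with a minimal Python sketch. It is not the authors' implementation: the toy data, the column meanings, the penalty strength `lam`, and the helper names are all hypothetical, and the second step is simplified to an unpenalised refit on the selected columns rather than the construction of aggregate covariates.

```python
import math

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def soft_threshold(value, threshold):
    """Proximal operator of the L1 penalty: shrink value toward zero."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0

def fit_logistic_l1(X, y, lam, lr=0.5, n_iter=2000):
    """Proximal gradient descent for L1-penalised logistic regression.

    lam is the penalty strength; the intercept is left unpenalised.
    Returns (weights, intercept).
    """
    n, p = len(X), len(X[0])
    w, b = [0.0] * p, 0.0
    for _ in range(n_iter):
        grad_w, grad_b = [0.0] * p, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            grad_b += err / n
            for j in range(p):
                grad_w[j] += err * xi[j] / n
        b -= lr * grad_b
        w = [soft_threshold(wj - lr * gj, lr * lam) for wj, gj in zip(w, grad_w)]
    return w, b

# Toy data (hypothetical): column 0 drives the outcome, column 1 is unrelated.
X, y = [], []
for x0 in (0, 1):
    for x1 in (0, 1):
        events = 8 if x0 == 1 else 2  # 80% vs 20% event rate
        for k in range(10):
            X.append([x0, x1])
            y.append(1 if k < events else 0)

# Step 1: data-driven selection with the L1 penalty.
w_lasso, b_lasso = fit_logistic_l1(X, y, lam=0.1)
selected = [j for j, wj in enumerate(w_lasso) if wj != 0.0]

# Step 2 (simplified): refit a smaller, unpenalised model on the selected columns.
X_small = [[row[j] for j in selected] for row in X]
w_refit, b_refit = fit_logistic_l1(X_small, y, lam=0.0)

print("selected columns:", selected)
print("refit weights:", [round(wj, 3) for wj in w_refit])
print("refit intercept:", round(b_refit, 3))
```

On this toy data the penalty sets the weight of the unrelated column to exactly zero, so only the informative column reaches the second step. A real analysis would use an established library and choose the penalty by cross-validation.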
RESULTS
Overall, 44,507 COVID-19 patients were included for model validation. We identified 7 predictors (history of cancer, chronic obstructive pulmonary disease, diabetes, heart disease, hypertension, hyperlipidemia, kidney disease) which, combined with age and sex, discriminated which patients would experience any of our three outcomes. The models achieved good performance in influenza and COVID-19 cohorts. For COVID-19, the AUC ranges were COVER-H: 0.69-0.81, COVER-I: 0.73-0.91, and COVER-F: 0.72-0.90. Calibration varied across the validations, with some of the COVID-19 validations being less well calibrated than the influenza validations.
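For readers unfamiliar with the discrimination metric reported here, the AUC can be read as the probability that a randomly chosen patient with the outcome receives a higher predicted risk than a randomly chosen patient without it. The following Python sketch computes it by direct pairwise comparison; the function name `auc` and the example values are illustrative and are not taken from the study.

```python
def auc(y_true, y_score):
    """Area under the ROC curve via pairwise comparison (ties count as half)."""
    pos = [s for s, y in zip(y_score, y_true) if y == 1]
    neg = [s for s, y in zip(y_score, y_true) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one event and one non-event")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Illustrative values only: two non-events followed by two events.
outcomes = [0, 0, 1, 1]
risks = [0.1, 0.4, 0.35, 0.8]
print(auc(outcomes, risks))  # 3 of 4 event/non-event pairs are ordered correctly: 0.75
```

The pairwise loop is quadratic in the number of patients, which is fine for illustration; rank-based implementations are preferable at the scale of the cohorts described above.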
CONCLUSIONS
This research demonstrated the utility of using a proxy disease to develop a prediction model. The 3 COVER models, each with 9 predictors and developed using influenza data, perform well in COVID-19 patients for predicting hospitalization, intensive services, and fatality. The scores showed good discriminatory performance, which transferred well to the COVID-19 population. There was some miscalibration in the COVID-19 validations, potentially due to the difference in symptom severity between the two diseases. A possible solution is to recalibrate the models in each location before use.
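The abstract does not say how local recalibration would be carried out. One common option, assumed here purely for illustration, is recalibration-in-the-large: every predicted risk is shifted by the same amount on the log-odds scale so that the mean predicted risk matches the locally observed event rate. The Python sketch below finds that shift by bisection; the predictions and outcomes are made-up values, and the function name `intercept_shift` is hypothetical.

```python
import math

def logit(p):
    return math.log(p / (1.0 - p))

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def intercept_shift(predicted, observed, lo=-10.0, hi=10.0, tol=1e-10):
    """Find the log-odds shift that makes mean predicted risk equal the event rate.

    Assumes predictions lie strictly between 0 and 1 and that the observed
    event rate is neither 0 nor 1.
    """
    target = sum(observed) / len(observed)
    log_odds = [logit(p) for p in predicted]

    def mean_risk(shift):
        return sum(sigmoid(lp + shift) for lp in log_odds) / len(log_odds)

    # mean_risk is strictly increasing in shift, so bisection converges.
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if mean_risk(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

# Made-up local validation data: the model under-predicts risk on average.
predicted = [0.05, 0.10, 0.20, 0.40, 0.60, 0.30, 0.15, 0.08]
observed = [0, 0, 0, 1, 1, 1, 0, 0]

shift = intercept_shift(predicted, observed)
recalibrated = [sigmoid(logit(p) + shift) for p in predicted]
print("shift on log-odds scale:", round(shift, 3))
print("mean risk before:", round(sum(predicted) / len(predicted), 3))
print("mean risk after:", round(sum(recalibrated) / len(recalibrated), 3))
```

Because the shift is the same for every patient, the ranking of patients is unchanged, so discrimination is unaffected. Correcting a miscalibrated slope would require a further step, such as refitting both an intercept and a slope on the log-odds.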

Identifiers

pubmed: 35094685
doi: 10.1186/s12874-022-01505-z
pii: 10.1186/s12874-022-01505-z
pmc: PMC8801189

Publication types

Journal Article; Research Support, Non-U.S. Gov't

Languages

eng

Citation subsets

IM

Pagination

35

Grants

Agency: NLM NIH HHS
ID: T15 LM007079
Country: United States

Copyright information

© 2022. The Author(s).

Authors

Ross D Williams (RD)

Department of Medical Informatics, Erasmus University Medical Center, Doctor Molewaterplein 40, 3015 GD Rotterdam, The Netherlands.

Aniek F Markus (AF)

Department of Medical Informatics, Erasmus University Medical Center, Doctor Molewaterplein 40, 3015 GD Rotterdam, The Netherlands.

Cynthia Yang (C)

Department of Medical Informatics, Erasmus University Medical Center, Doctor Molewaterplein 40, 3015 GD Rotterdam, The Netherlands.

Talita Duarte-Salles (T)

Fundacio Institut Universitari per a la recerca a l'Atencio Primaria de Salut Jordi Gol i Gurina (IDIAPJGol), Barcelona, Spain.

Scott L DuVall (SL)

Department of Veterans Affairs, University of Utah, Salt Lake City, UT, USA.

Thomas Falconer (T)

Department of Biomedical Informatics, Columbia University, New York, NY, USA.

Jitendra Jonnagaddala (J)

School of Public Health and Community Medicine, UNSW, Sydney, Australia.

Chungsoo Kim (C)

Department of Biomedical Sciences, Ajou University Graduate School of Medicine, Suwon, Republic of Korea.

Yeunsook Rho (Y)

Department of Big Data Strategy, National Health Insurance Service, Wonju, Republic of Korea.

Andrew E Williams (AE)

Tufts University School of Medicine, Institute for Clinical Research and Health Policy Studies, Boston, MA, 02111, USA.

Amanda Alberga Machado (AA)

Independent Epidemiologist, OHDSI, Rotterdam, The Netherlands.

Min Ho An (MH)

So Ahn Public Health Center, Wando County Health Center and Hospital, Wando, Republic of Korea.

María Aragón (M)

Fundacio Institut Universitari per a la recerca a l'Atencio Primaria de Salut Jordi Gol i Gurina (IDIAPJGol), Barcelona, Spain.

Carlos Areia (C)

Nuffield Department of Clinical Neurosciences, University of Oxford, Oxford, UK.

Edward Burn (E)

Fundacio Institut Universitari per a la recerca a l'Atencio Primaria de Salut Jordi Gol i Gurina (IDIAPJGol), Barcelona, Spain.
Centre for Statistics in Medicine, NDORMS, University of Oxford, Oxford, UK.

Young Hwa Choi (YH)

Department of Infectious Diseases, Ajou University School of Medicine, Suwon, Republic of Korea.

Iannis Drakos (I)

Center for Surgical Science, Koege, Denmark.

Maria Tereza Fernandes Abrahão (MTF)

Faculty of Medicine, University of Sao Paulo, Sao Paulo, Brazil.

Sergio Fernández-Bertolín (S)

Fundacio Institut Universitari per a la recerca a l'Atencio Primaria de Salut Jordi Gol i Gurina (IDIAPJGol), Barcelona, Spain.

George Hripcsak (G)

Department of Biomedical Informatics, Columbia University, New York, NY, USA.

Benjamin Skov Kaas-Hansen (BS)

Clinical Pharmacology Unit, Zealand University Hospital, Roskilde, Denmark.
NNF Centre for Protein Research, University of Copenhagen, Copenhagen, Denmark.

Prasanna L Kandukuri (PL)

Abbvie, Chicago, USA.

Jan A Kors (JA)

Department of Medical Informatics, Erasmus University Medical Center, Doctor Molewaterplein 40, 3015 GD Rotterdam, The Netherlands.

Kristin Kostka (K)

Real World Solutions, IQVIA, Cambridge, MA, USA.

Siaw-Teng Liaw (ST)

School of Public Health and Community Medicine, UNSW, Sydney, Australia.

Kristine E Lynch (KE)

Department of Veterans Affairs, University of Utah, Salt Lake City, UT, USA.

Gerardo Machnicki (G)

Janssen Latin America, Buenos Aires, Argentina.

Michael E Matheny (ME)

Department of Veterans Affairs, Washington, D.C., USA.
Vanderbilt University, Nashville, USA.

Daniel Morales (D)

Division of Population Health and Genomics, University of Dundee, Dundee, UK.

Fredrik Nyberg (F)

School of Public Health and Community Medicine, Institute of Medicine, Sahlgrenska Academy, University of Gothenburg, Gothenburg, Sweden.

Rae Woong Park (RW)

Department of Biomedical Informatics, Ajou University School of Medicine, Suwon, Republic of Korea.

Albert Prats-Uribe (A)

Centre for Statistics in Medicine, NDORMS, University of Oxford, Oxford, UK.

Nicole Pratt (N)

Quality Use of Medicines and Pharmacy Research Centre, University of South Australia, Adelaide, Australia.

Gowtham Rao (G)

Janssen Research & Development, Titusville, NJ, USA.

Christian G Reich (CG)

Real World Solutions, IQVIA, Cambridge, MA, USA.

Marcela Rivera (M)

Bayer Pharmaceuticals, Bayer Hispania, S.L., Barcelona, Spain.

Tom Seinen (T)

Department of Medical Informatics, Erasmus University Medical Center, Doctor Molewaterplein 40, 3015 GD Rotterdam, The Netherlands.

Azza Shoaibi (A)

Janssen Research & Development, Titusville, NJ, USA.

Matthew E Spotnitz (ME)

Department of Biomedical Informatics, Columbia University, New York, NY, USA.

Ewout W Steyerberg (EW)

Department of Public Health, Erasmus University Medical Center, Rotterdam, The Netherlands.
Department of Biomedical Data Sciences, Leiden University Medical Center, Leiden, The Netherlands.

Marc A Suchard (MA)

Department of Biostatistics, UCLA Fielding School of Public Health, University of California, Los Angeles, CA, USA.

Seng Chan You (SC)

Department of Biomedical Informatics, Ajou University School of Medicine, Suwon, Republic of Korea.

Lin Zhang (L)

School of Public Health, Peking Union Medical College, Beijing, China.
Melbourne School of Public Health, The University of Melbourne, Melbourne, Victoria, Australia.

Lili Zhou (L)

Abbvie, Chicago, USA.

Patrick B Ryan (PB)

Janssen Research & Development, Titusville, NJ, USA.

Daniel Prieto-Alhambra (D)

Centre for Statistics in Medicine, NDORMS, University of Oxford, Oxford, UK.

Jenna M Reps (JM)

Janssen Research & Development, Titusville, NJ, USA.

Peter R Rijnbeek (PR)

Department of Medical Informatics, Erasmus University Medical Center, Doctor Molewaterplein 40, 3015 GD Rotterdam, The Netherlands. p.rijnbeek@erasmusmc.nl.
