Seek COVER: using a disease proxy to rapidly develop and validate a personalized risk calculator for COVID-19 outcomes in an international network.

COVID-19; COVID-19 Testing; Humans; Influenza, Human / epidemiology; Pneumonia; SARS-CoV-2; United States

COVID-19; Patient-level prediction modelling; Risk score

Journal

BMC medical research methodology

ISSN: 1471-2288

Abbreviated title: BMC Med Res Methodol

Country: England

NLM ID: 100968545

Publication information

Publication date:
30 Jan 2022

History:

received: 6 Jan 2021

accepted: 3 Jan 2022

entrez: 31 Jan 2022

pubmed: 1 Feb 2022

medline: 3 Feb 2022

Status: epublish

Abstract

BACKGROUND

We investigated whether influenza data could be used to develop prediction models for COVID-19, to increase the speed at which prediction models can reliably be developed and validated early in a pandemic. We developed COVID-19 Estimated Risk (COVER) scores that quantify a patient's risk of hospital admission with pneumonia (COVER-H), hospitalization with pneumonia requiring intensive services or death (COVER-I), or fatality (COVER-F) in the 30 days following COVID-19 diagnosis, using historical data from patients with influenza or flu-like symptoms, and tested the scores in COVID-19 patients.

METHODS

We analyzed a federated network of electronic medical records and administrative claims data from 14 data sources and 6 countries containing data collected on or before April 27, 2020. We used a 2-step process to develop 3 scores using historical data from patients with influenza or flu-like symptoms any time prior to 2020. The first step was to create a data-driven model using LASSO-regularized logistic regression. The covariates selected by that model were used to define aggregate covariates for the second step, in which the COVER scores were developed using a smaller set of features. These 3 COVER scores were then externally validated on patients with 1) influenza or flu-like symptoms and 2) confirmed or suspected COVID-19 diagnosis across 5 databases from South Korea, Spain, and the United States. Outcomes included i) hospitalization with pneumonia, ii) hospitalization with pneumonia requiring intensive services or death, and iii) death in the 30 days after index date.

RESULTS

Overall, 44,507 COVID-19 patients were included for model validation. We identified 7 predictors (history of cancer, chronic obstructive pulmonary disease, diabetes, heart disease, hypertension, hyperlipidemia, kidney disease) which, combined with age and sex, discriminated which patients would experience any of our three outcomes. The models achieved good performance in influenza and COVID-19 cohorts. For COVID-19 the AUC ranges were COVER-H: 0.69-0.81, COVER-I: 0.73-0.91, and COVER-F: 0.72-0.90. Calibration varied across the validations, with some of the COVID-19 validations being less well calibrated than the influenza validations.

CONCLUSIONS

This research demonstrated the utility of using a proxy disease to develop a prediction model. The 3 COVER models with 9 predictors that were developed using influenza data perform well for COVID-19 patients for predicting hospitalization, intensive services, and fatality. The scores showed good discriminatory performance, which transferred well to the COVID-19 population. There was some miscalibration in the COVID-19 validations, which is potentially due to the difference in symptom severity between the two diseases. A possible solution is to recalibrate the models in each location before use.
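The abstract suggests recalibrating the models in each location but does not say how. One common approach is logistic recalibration: keep the original score and refit only an intercept and a slope on the logit of the predicted risk, using local outcome data. The following is a minimal sketch of that idea, not the authors' procedure. The data are synthetic, and the names `fit_recalibration`, `auc`, and the simulated miscalibration (intercept -0.7, slope 0.8) are assumptions chosen purely for illustration.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def logit(p):
    return math.log(p / (1.0 - p))


def fit_recalibration(pred_risk, outcome, n_iter=25):
    """Fit outcome ~ sigmoid(a + b * logit(pred_risk)) by Newton-Raphson.

    Returns (a, b), the calibration intercept and slope.
    """
    x = [logit(p) for p in pred_risk]
    a, b = 0.0, 1.0  # start from "no recalibration"
    for _ in range(n_iter):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, outcome):
            mu = sigmoid(a + b * xi)
            w = mu * (1.0 - mu)
            g0 += yi - mu          # gradient w.r.t. intercept
            g1 += (yi - mu) * xi   # gradient w.r.t. slope
            h00 += w
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system H * delta = g.
        a += (h11 * g0 - h01 * g1) / det
        b += (h00 * g1 - h01 * g0) / det
    return a, b


def auc(scores, outcome):
    """Rank-based (Mann-Whitney) AUC with average ranks for ties."""
    n = len(scores)
    order = sorted(range(n), key=lambda i: scores[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    n_pos = sum(outcome)
    n_neg = n - n_pos
    rank_sum = sum(r for r, y in zip(ranks, outcome) if y == 1)
    return (rank_sum - n_pos * (n_pos + 1) / 2.0) / (n_pos * n_neg)


# Synthetic "local" validation data (illustrative only, not study data):
# the original score overestimates risk in this population.
rng = random.Random(0)
n = 10000
lp = [rng.gauss(-2.0, 1.0) for _ in range(n)]
pred_risk = [sigmoid(v) for v in lp]
outcome = [1 if rng.random() < sigmoid(-0.7 + 0.8 * v) else 0 for v in lp]

a, b = fit_recalibration(pred_risk, outcome)
recal_risk = [sigmoid(a + b * logit(p)) for p in pred_risk]

obs_rate = sum(outcome) / n
mean_pred = sum(pred_risk) / n
mean_recal = sum(recal_risk) / n
auc_before = auc(pred_risk, outcome)
auc_after = auc(recal_risk, outcome)

print(f"intercept={a:.3f} slope={b:.3f}")
print(f"observed={obs_rate:.4f} mean original={mean_pred:.4f} "
      f"mean recalibrated={mean_recal:.4f}")
print(f"AUC before={auc_before:.4f} after={auc_after:.4f}")
```

Because the recalibration is a monotone transform of the original score (for a positive slope), it changes calibration but leaves discrimination, and therefore the AUC, essentially unchanged. This matches the abstract's separation of good discriminatory performance from location-specific miscalibration.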

Identifiers

DOI: 10.1186/s12874-022-01505-z PMID: 35094685 PMC: PMC8801189

pubmed: 35094685

doi: 10.1186/s12874-022-01505-z

pii: 10.1186/s12874-022-01505-z

pmc: PMC8801189

Publication types

Journal Article; Research Support, Non-U.S. Gov't

Languages

eng

Grants

Agency: NLM NIH HHS

ID: T15 LM007079

Country: United States
