Covid-19 risk factors: statistical learning from German healthcare claims data.
SARS-CoV-2
machine learning
prediction
prioritization
routine data
Journal
Infectious diseases (London, England)
ISSN: 2374-4243
Abbreviated title: Infect Dis (Lond)
Country: England
NLM ID: 101650235
Publication information
Publication date:
Feb 2022
History:
pubmed: 28/9/2021
medline: 8/1/2022
entrez: 27/9/2021
Status:
ppublish
Abstract
BACKGROUND
Precise individual risk quantification of severe courses of Covid-19 is needed to prioritize protective measures and to assess population risks in a phase of increased immunization. So far, results for the German population are lacking. Furthermore, existing studies pre-specify comorbidity risks by broad categories rather than deriving them from the data using statistical learning techniques.
METHODS
Risk factors for severe, critical and lethal courses of Covid-19 are identified from a large German claims dataset covering more than 4 million individuals. To avoid prior grouping and pre-selection of risk factors, fine-grained hierarchical information from medical classification systems for diagnoses, pharmaceuticals and procedures is used, resulting in more than 33,000 covariates. These are processed using a LASSO approach.
RESULTS
We identify relevant risk factors, among which hypertensive diseases, heart disease and the corresponding medications are most relevant at population level. Prior use of diuretics is the strongest single medical predictor for a severe course (e.g. Torasemide, odds ratio (OR) 1.801), but also for a critical course (OR 2.304) and death (OR 2.523). To assess risk profiles at the individual level, our approach sums up many such factors and has better predictive ability than using pre-specified morbidity groups (AUC for predicting critical course 0.875 versus AUC ≤ 0.865).
CONCLUSIONS
The proposed method can help to identify risk factors and assess risk at the individual level for other infectious diseases. The results can be used by administrative data holders to guide protective policies, while a risk index can be applied in clinical studies with a narrower focus.
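The abstract describes an individual risk index that sums many factors and is evaluated by AUC. The following is a minimal sketch of that idea, not the authors' implementation. It assumes a logistic model in which each factor contributes its log odds ratio to an additive score. Only the torasemide odds ratio for a critical course (2.304) is taken from the abstract; the other factor names, their odds ratios, the intercept and the toy cohort are hypothetical.

```python
import math

# Odds ratios per risk factor for a critical course. Only the torasemide
# value (2.304) comes from the abstract; the other entries are hypothetical.
ODDS_RATIOS = {
    "torasemide": 2.304,
    "hypothetical_factor_a": 1.5,
    "hypothetical_factor_b": 1.2,
}
BASELINE_LOG_ODDS = -5.0  # hypothetical intercept, not from the paper


def linear_score(factors):
    """Baseline log-odds plus the log odds ratio of every factor present."""
    return BASELINE_LOG_ODDS + sum(math.log(ODDS_RATIOS[f]) for f in factors)


def predicted_risk(factors):
    """Map the additive log-odds score to a probability (logistic link)."""
    return 1.0 / (1.0 + math.exp(-linear_score(factors)))


def auc(scores, labels):
    """Share of (case, non-case) pairs ranked correctly; ties count as 0.5."""
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((c > n) + 0.5 * (c == n) for c in cases for n in controls)
    return wins / (len(cases) * len(controls))


# Hypothetical toy cohort: (factors present, 1 = critical course observed).
cohort = [
    (["torasemide", "hypothetical_factor_a"], 1),
    (["torasemide"], 0),
    (["hypothetical_factor_b"], 1),
    ([], 0),
    (["hypothetical_factor_a"], 0),
]
scores = [predicted_risk(factors) for factors, _ in cohort]
labels = [outcome for _, outcome in cohort]
print(f"AUC on toy cohort: {auc(scores, labels):.3f}")
```

Because the score is additive on the log-odds scale, a person with several moderate factors can rank above a person with one strong factor, which is the behaviour a sum-of-factors index is meant to capture. The pairwise AUC shown here is quadratic in cohort size and is only meant for illustration on small samples.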
Identifiers
pubmed: 34569423
doi: 10.1080/23744235.2021.1982141
Publication types
Journal Article
Languages
eng
Citation subsets
IM