Predictive modeling of nontuberculous mycobacterial pulmonary disease epidemiology using German health claims data.
Adolescent
Adult
Aged
Aged, 80 and over
Case-Control Studies
Comorbidity
Female
Germany
/ epidemiology
Humans
Incidence
Insurance Claim Review
Lung Diseases
/ epidemiology
Machine Learning
Male
Middle Aged
Models, Statistical
Mycobacterium Infections, Nontuberculous
/ epidemiology
Nontuberculous Mycobacteria
Prevalence
Retrospective Studies
Risk Factors
Young Adult
Epidemiology
Insurance claims analysis
Machine learning
Nontuberculous mycobacteria
Nontuberculous mycobacterium infections
Probability learning
Journal
International journal of infectious diseases : IJID : official publication of the International Society for Infectious Diseases
ISSN: 1878-3511
Abbreviated title: Int J Infect Dis
Country: Canada
NLM ID: 9610933
Publication information
Publication date:
Mar 2021
History:
received: 2020-11-01
revised: 2021-01-04
accepted: 2021-01-04
pubmed: 2021-01-15
medline: 2021-04-23
entrez: 2021-01-14
Status:
ppublish
Abstract
Administrative claims data are prone to underestimate the burden of non-tuberculous mycobacterial pulmonary disease (NTM-PD). We developed machine learning-based algorithms using historical claims data from cases with NTM-PD to identify patients with a high probability of having previously undiagnosed NTM-PD and to assess actual prevalence and incidence. Adults with incident NTM-PD were identified in a representative 5% sample of the German population covered by statutory health insurance during 2011-2016 using the International Classification of Diseases, 10th revision code A31.0. Pre-diagnosis characteristics (patient demographics, comorbidities, diagnostic and therapeutic procedures, and medications) were extracted and compared with those of a control group without NTM-PD to identify risk factors. Applying a random forest model (area under the curve 0.847; total error 19.4%) and a risk threshold of >99%, prevalence and incidence rates in 2016 increased 5-fold and 9-fold to 19 and 15 cases/100,000 population, respectively, when both coded and non-coded cases were counted compared with coded cases alone. The use of a machine learning-based algorithm applied to German statutory health insurance claims data predicted a considerable number of previously unreported NTM-PD cases with high probability.
Identifiers
pubmed: 33444748
pii: S1201-9712(21)00006-0
doi: 10.1016/j.ijid.2021.01.003
Publication types
Journal Article
Observational Study
Languages
eng
Citation subsets
IM
Pagination
398-406
Copyright information
Copyright © 2021 The Author(s). Published by Elsevier Ltd. All rights reserved.