Linking Provider Specialty and Outpatient Diagnoses in Medicare Claims Data: Data Quality Implications.


Journal

Applied Clinical Informatics
ISSN: 1869-0327
Abbreviated title: Appl Clin Inform
Country: Germany
NLM ID: 101537732

Publication information

Publication date:
August 2021
History:
entrez: 4 August 2021
pubmed: 5 August 2021
medline: 26 November 2021
Status: ppublish

Abstract

With the increasing use of real-world data in observational health care research, assessing the quality of these data is becoming equally important. Electronic health record (EHR) and claims datasets can differ significantly in the spectrum of care they cover. In our study, we link provider specialty with diagnoses (encoded in the International Classification of Diseases) to characterize data completeness. We develop a set of measures that determine the diagnostic span of a specialty (how many distinct diagnosis codes a specialty generates) and the specialty span of a diagnosis (how many specialties diagnose a given condition). We also analyze ranked lists for both measures. As a use case, we apply these measures to outpatient Medicare claims data from 2016 (3.5 billion diagnosis-specialty pairs). We analyze 82 distinct specialties present in Medicare claims (using the Medicare list of specialties derived from level III Healthcare Provider Taxonomy Codes). On average, a specialty generates 4,046 distinct diagnosis codes, ranging from 33 codes for medical toxicology to 25,475 codes for internal medicine. Specialties with large visit volume tend to have a large diagnostic span. The median specialty span of a diagnosis code is 8 specialties, with a range from 1 to 82 specialties. In total, 13.5% of all observed diagnoses are generated exclusively by a single specialty. Quantitative cumulative rankings reveal that some diagnosis codes can be dominated by a few specialties. Using such diagnoses in cohort or outcome definitions may thus be vulnerable to incomplete specialty coverage of a given dataset. We propose specialty fingerprinting as a method to assess the data completeness component of data quality. Datasets covering a full spectrum of care can be used to generate reference benchmark data that quantify the relative importance of a specialty in constructing the diagnostic history elements of computable phenotype definitions.
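
The two span measures and the cumulative ranking described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (span_measures, cumulative_share), the input layout (one (specialty, diagnosis code) tuple per claim diagnosis), and the toy records are all assumptions made for illustration.

```python
from collections import Counter, defaultdict
from statistics import median


def span_measures(pairs):
    """Compute both span measures from (specialty, diagnosis_code) tuples.

    The tuple layout is an assumption; real claims data would need to be
    reshaped into this form first.
    """
    codes_by_specialty = defaultdict(set)   # specialty -> distinct codes
    counts_by_code = defaultdict(Counter)   # code -> specialty -> pair count
    for specialty, code in pairs:
        codes_by_specialty[specialty].add(code)
        counts_by_code[code][specialty] += 1
    # Diagnostic span: number of distinct codes generated by a specialty.
    diagnostic_span = {s: len(c) for s, c in codes_by_specialty.items()}
    # Specialty span: number of specialties that generate a given code.
    specialty_span = {c: len(cnt) for c, cnt in counts_by_code.items()}
    return diagnostic_span, specialty_span, counts_by_code


def cumulative_share(counter):
    """Rank specialties for one code and return cumulative shares of pairs."""
    total = sum(counter.values())
    running = 0
    ranked = []
    for specialty, n in counter.most_common():
        running += n
        ranked.append((specialty, running / total))
    return ranked


# Hypothetical toy records, not taken from the Medicare data.
pairs = [
    ("internal medicine", "I10"),
    ("internal medicine", "I10"),
    ("internal medicine", "E11.9"),
    ("cardiology", "I10"),
    ("cardiology", "I48.91"),
    ("ophthalmology", "H25.9"),
    ("ophthalmology", "E11.9"),
    ("internal medicine", "I10"),
]

diagnostic_span, specialty_span, counts_by_code = span_measures(pairs)
median_specialty_span = median(specialty_span.values())
# Share of codes generated exclusively by a single specialty.
single_specialty_share = (
    sum(1 for n in specialty_span.values() if n == 1) / len(specialty_span)
)
i10_ranking = cumulative_share(counts_by_code["I10"])
```

In this toy input, a code such as I48.91 has a specialty span of 1, so a dataset lacking cardiology claims would miss it entirely; the cumulative ranking for I10 shows how much of a code's volume the top-ranked specialties account for.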

Identifiers

pubmed: 34348410
doi: 10.1055/s-0041-1732404
pmc: PMC8354353

Publication types

Journal Article

Languages

eng

Citation subsets

IM

Pagination

729-736

Copyright information

Thieme. All rights reserved.

Conflict of interest statement

None declared.

Authors

Vojtech Huser (V)

Lister Hill National Center for Biomedical Communications, National Library of Medicine, National Institutes of Health, Bethesda, Maryland, United States.

Nick D Williams (ND)

Lister Hill National Center for Biomedical Communications, National Library of Medicine, National Institutes of Health, Bethesda, Maryland, United States.

Craig S Mayer (CS)

Lister Hill National Center for Biomedical Communications, National Library of Medicine, National Institutes of Health, Bethesda, Maryland, United States.
