Linking Provider Specialty and Outpatient Diagnoses in Medicare Claims Data: Data Quality Implications.

MeSH terms: Aged; Data Accuracy; Humans; International Classification of Diseases; Medicare; Medicine; Outpatients; United States

Journal

Applied clinical informatics

ISSN: 1869-0327

Abbreviated title: Appl Clin Inform

Country: Germany

NLM ID: 101537732

Publication information

Publication date:
August 2021

History:

entrez: 4 August 2021

pubmed: 5 August 2021

medline: 26 November 2021

Status: ppublish

Abstract

BACKGROUND

With the increasing use of real-world data in observational health care research, assessing the quality of these data is likewise gaining in importance. Electronic health record (EHR) and claims datasets can differ significantly in the spectrum of care they cover.

OBJECTIVE

We link provider specialty with diagnoses (encoded in the International Classification of Diseases) in order to characterize data completeness.

METHODS

We develop a set of measures that determine the diagnostic span of a specialty (how many distinct diagnosis codes a specialty generates) and the specialty span of a diagnosis (how many specialties diagnose a given condition). We also analyze ranked lists for both measures. As a use case, we apply these measures to outpatient Medicare claims data from 2016 (3.5 billion diagnosis-specialty pairs). We analyze 82 distinct specialties present in Medicare claims (using the Medicare list of specialties derived from level III Healthcare Provider Taxonomy Codes).
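The two span measures can be illustrated with a minimal Python sketch. It assumes the claims have already been reduced to (specialty, diagnosis code) pairs; the function name `compute_spans` and the toy pairs are hypothetical and are not taken from the study.

```python
from collections import defaultdict


def compute_spans(pairs):
    """Return (diagnostic_span, specialty_span) from (specialty, code) pairs.

    diagnostic_span: specialty -> number of distinct diagnosis codes it generates
    specialty_span:  diagnosis code -> number of distinct specialties using it
    """
    codes_by_specialty = defaultdict(set)
    specialties_by_code = defaultdict(set)
    for specialty, code in pairs:
        # Sets make repeated pairs count only once toward each span.
        codes_by_specialty[specialty].add(code)
        specialties_by_code[code].add(specialty)
    diagnostic_span = {s: len(codes) for s, codes in codes_by_specialty.items()}
    specialty_span = {c: len(specs) for c, specs in specialties_by_code.items()}
    return diagnostic_span, specialty_span


# Toy, made-up pairs for illustration only (not study data).
pairs = [
    ("internal medicine", "I10"),
    ("internal medicine", "E11.9"),
    ("internal medicine", "H25.9"),
    ("cardiology", "I10"),
    ("ophthalmology", "H25.9"),
    ("ophthalmology", "H40.9"),
    ("internal medicine", "I10"),  # repeated pair, does not change either span
]

diagnostic_span, specialty_span = compute_spans(pairs)
print(diagnostic_span)  # internal medicine: 3, cardiology: 1, ophthalmology: 2
print(specialty_span)   # I10: 2, E11.9: 1, H25.9: 2, H40.9: 1
```

At the scale reported here (3.5 billion pairs), an in-memory loop like this would likely be replaced by a grouped distinct count in a database or distributed engine; the sketch only shows what is being counted.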

RESULTS

On average, a specialty generates 4,046 distinct diagnosis codes, ranging from 33 codes for medical toxicology to 25,475 codes for internal medicine. Specialties with a large visit volume tend to have a large diagnostic span. The median specialty span of a diagnosis code is 8 specialties, with a range from 1 to 82 specialties. In total, 13.5% of all observed diagnoses are generated exclusively by a single specialty. Quantitative cumulative rankings reveal that some diagnosis codes are dominated by a small number of specialties. Cohort or outcome definitions that use such diagnoses may thus be vulnerable to incomplete specialty coverage in a given dataset.
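The summary statistics above can be sketched in Python as follows. The sketch assumes a precomputed mapping from diagnosis code to specialty span and, for one code, a mapping from specialty to number of diagnosis-specialty pairs. The function names and all numbers are hypothetical illustrations, not study results.

```python
from itertools import accumulate
from statistics import median


def single_specialty_share(specialty_span):
    """Fraction of observed codes generated by exactly one specialty."""
    exclusive = sum(1 for n in specialty_span.values() if n == 1)
    return exclusive / len(specialty_span)


def cumulative_ranking(counts):
    """Rank specialties by pair count for one diagnosis code.

    Returns a list of (specialty, cumulative share of all pairs), largest
    contributor first, so a steep early rise signals a dominated code.
    """
    ranked = sorted(counts.items(), key=lambda kv: kv[1], reverse=True)
    total = sum(counts.values())
    running = accumulate(n for _, n in ranked)
    return [(specialty, r / total) for (specialty, _), r in zip(ranked, running)]


# Toy, made-up inputs for illustration only.
toy_span = {"I10": 2, "E11.9": 1, "H25.9": 2, "H40.9": 1}
toy_counts = {"internal medicine": 2, "ophthalmology": 90, "optometry": 8}

print(single_specialty_share(toy_span))  # 0.5
print(median(toy_span.values()))         # 1.5
for specialty, share in cumulative_ranking(toy_counts):
    print(f"{specialty}: {share:.2f}")
```

In this toy example the top-ranked specialty alone accounts for 90% of the pairs for the code, which is the kind of dominance that would make a cohort definition sensitive to that specialty being absent from a dataset.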

CONCLUSION

We propose specialty fingerprinting as a method to assess the data completeness component of data quality. Datasets covering a full spectrum of care can be used to generate reference benchmark data that quantify the relative importance of a specialty in constructing the diagnostic history elements of computable phenotype definitions.
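One way such reference benchmark data might be applied is sketched below in Python. This is an assumption-laden illustration, not the authors' fingerprinting procedure: it supposes that a reference dataset yields, for a given diagnosis code, each specialty's share of diagnosis-specialty pairs, and it sums the shares of the specialties that a second dataset actually contains. The function name `reference_coverage` and all values are hypothetical.

```python
def reference_coverage(reference_shares, dataset_specialties):
    """Sum the reference shares of the specialties present in a dataset.

    reference_shares:    specialty -> share of pairs for one diagnosis code
                         in a full-spectrum reference dataset (sums to 1)
    dataset_specialties: set of specialties observed in the dataset under review
    """
    return sum(
        share
        for specialty, share in reference_shares.items()
        if specialty in dataset_specialties
    )


# Toy, made-up reference shares for one diagnosis code (not study data).
reference_shares = {
    "ophthalmology": 0.90,
    "optometry": 0.08,
    "internal medicine": 0.02,
}

# A hypothetical dataset that lacks eye-care specialties.
dataset_specialties = {"internal medicine", "cardiology"}

coverage = reference_coverage(reference_shares, dataset_specialties)
print(f"{coverage:.2f}")  # 0.02
```

A low value would suggest that the dataset is likely to miss most occurrences of that diagnosis, so a phenotype definition relying on it deserves extra scrutiny.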

Identifiers

DOI: 10.1055/s-0041-1732404 PMID: 34348410 PMC: PMC8354353

Publication types

Journal Article

Languages

English

Pagination

729-736

Conflict of interest statement

None declared.

Authors

Vojtech Huser (V)

Nick D Williams (ND)

Craig S Mayer (CS)
