Procedure code overutilization detection from healthcare claims using unsupervised deep learning methods.

Keywords: Deep autoencoder; Feature-weighted loss function; Fraud, waste, and abuse; Procedure code overutilization; Unsupervised learning

Journal

BMC Medical Informatics and Decision Making
ISSN: 1472-6947
Abbreviated title: BMC Med Inform Decis Mak
Country: England
NLM ID: 101088682

Publication information

Publication date:
2023-09-28
History:
received: 2022-10-17
accepted: 2023-08-17
medline: 2023-10-02
pubmed: 2023-09-29
entrez: 2023-09-28
Status: epublish

Abstract sections

BACKGROUND
Fraud, Waste, and Abuse (FWA) in medical claims have a negative impact on the quality and cost of healthcare. A major component of FWA in claims is procedure code overutilization, where one or more prescribed procedures may not be relevant to a given diagnosis and patient profile, resulting in unnecessary and unwarranted treatments and medical payments. This study aims to identify such unwarranted procedures from millions of healthcare claims. In the absence of labeled examples of unwarranted procedures, the study focused on the application of unsupervised machine learning techniques.
METHODS
Experiments were conducted with deep autoencoders to find claims containing anomalous procedure codes indicative of FWA, and the results were compared against a baseline density-based clustering model. Diagnoses, procedures, and demographic data associated with healthcare claims were used as features for the models. A dataset of one hundred thousand claims sampled from a larger claims database was used to initially train and tune the models, followed by experiments on a dataset of thirty-three million claims. Experimental results show that the autoencoder model, when trained with a novel feature-weighted loss function, outperforms the density-based clustering approach in finding potential outlier procedure codes.
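The abstract does not give the formula for the feature-weighted loss, so the following is only a minimal sketch of one plausible form: a weighted mean squared reconstruction error in which procedure-code features are up-weighted relative to diagnosis and demographic features. The function name `feature_weighted_mse`, the six-feature layout, and the weight values are all hypothetical and are not taken from the study.

```python
from typing import Sequence


def feature_weighted_mse(
    x: Sequence[float], x_hat: Sequence[float], weights: Sequence[float]
) -> float:
    """Weighted mean squared reconstruction error for one encoded claim.

    Each squared error is scaled by its feature weight and the sum is
    normalized by the total weight, so uniform weights give plain MSE.
    """
    if not (len(x) == len(x_hat) == len(weights)):
        raise ValueError("x, x_hat and weights must have the same length")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    weighted_sq_err = sum(w * (a - b) ** 2 for a, b, w in zip(x, x_hat, weights))
    return weighted_sq_err / total_weight


# Hypothetical layout: 2 diagnosis features, 3 procedure features, 1 demographic.
# Assumption: procedure features get weight 3.0, all others weight 1.0.
weights = [1.0, 1.0, 3.0, 3.0, 3.0, 1.0]
claim = [1.0, 0.0, 1.0, 0.0, 1.0, 0.4]
# Illustrative autoencoder output that fails to reconstruct the third procedure.
reconstruction = [0.9, 0.1, 0.9, 0.1, 0.2, 0.4]

plain_error = feature_weighted_mse(claim, reconstruction, [1.0] * len(claim))
weighted_error = feature_weighted_mse(claim, reconstruction, weights)
print(f"unweighted: {plain_error:.4f}, feature-weighted: {weighted_error:.4f}")
```

Under these assumed weights, a poorly reconstructed procedure code raises the error more than it would under a uniform loss, which is one way such a loss could steer the model toward procedure-level anomalies. A claim would then be flagged when its error exceeds a chosen threshold; how that threshold is set is not stated in the abstract.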
RESULTS
Given the unsupervised nature of our experiments, model performance was evaluated using a synthetic outlier test dataset and a manually annotated outlier test dataset. Precision, recall, and F1-scores on the synthetic outlier test dataset for the autoencoder model trained on one hundred thousand claims were 0.87, 1.0, and 0.93, respectively, while the results for these metrics on the manually annotated outlier test dataset were 0.36, 0.86, and 0.51, respectively. The model performance on the manually annotated outlier test dataset improved further when trained on the larger thirty-three million claims dataset, with precision, recall, and F1-scores of 0.48, 0.90, and 0.63, respectively.
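The reported F1-scores can be checked against the reported precision and recall, since F1 is the harmonic mean of the two. The short sketch below recomputes F1 for the three reported settings; the dictionary labels are shorthand introduced here, not names used in the study.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) as stated in the abstract.
reported = {
    "synthetic outliers, 100k-claim model": (0.87, 1.0, 0.93),
    "annotated outliers, 100k-claim model": (0.36, 0.86, 0.51),
    "annotated outliers, 33M-claim model": (0.48, 0.90, 0.63),
}

for setting, (precision, recall, reported_f1) in reported.items():
    computed = f1_score(precision, recall)
    print(f"{setting}: computed F1 = {computed:.2f}, reported F1 = {reported_f1}")
```

In each case the recomputed value rounds to the reported figure, so the three metric triples are internally consistent.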
CONCLUSIONS
This study demonstrates the feasibility of leveraging unsupervised, deep-learning methods to identify potential procedure overutilization from healthcare claims.

Identifiers

pubmed: 37770866
doi: 10.1186/s12911-023-02268-3
pii: 10.1186/s12911-023-02268-3
pmc: PMC10536726

Publication types

Journal Article; Research Support, Non-U.S. Gov't

Languages

eng

Citation subsets

IM

Pagination

196

Copyright information

© 2023. BioMed Central Ltd., part of Springer Nature.

Authors

Michael Suesserman

AI Center of Excellence, Deloitte & Touche LLP, New York, NY, USA.

Samantha Gorny

Program Integrity, Deloitte & Touche LLP, New York, NY, USA.

Daniel Lasaga

Program Integrity, Deloitte & Touche LLP, New York, NY, USA.

John Helms

AI Center of Excellence, Deloitte & Touche LLP, New York, NY, USA.

Dan Olson

Program Integrity, Deloitte & Touche LLP, New York, NY, USA.

Edward Bowen

AI Center of Excellence, Deloitte & Touche LLP, New York, NY, USA.

Sanmitra Bhattacharya

AI Center of Excellence, Deloitte & Touche LLP, New York, NY, USA. sanmbhattacharya@deloitte.com.
