Identify the most appropriate imputation method for handling missing values in clinical structured datasets: a systematic review.

Keywords: Clinical dataset; Imputation methods; Mechanism of missingness; Missing ratio; Missing values; Pattern of missingness; Simulation study

Journal

BMC Medical Research Methodology
ISSN: 1471-2288
Abbreviated title: BMC Med Res Methodol
Country: England
NLM ID: 100968545

Publication information

Publication date:
28 Aug 2024
History:
received: 06 04 2024
accepted: 19 08 2024
medline: 31 8 2024
pubmed: 31 8 2024
entrez: 28 8 2024
Status: epublish

Abstract

Background and objectives: Reliable and valid results depend on a thorough understanding of the research dataset. Health analysts who understand the data they analyze are better placed to propose practical solutions for handling missing data in a clinical data source. Accurate handling of missing values is critical for producing precise estimates and making informed decisions, especially in clinical research. As data have grown more diverse and complex, researchers have developed a wide range of imputation techniques. To help identify the most appropriate imputation methods in healthcare, we conducted a systematic review that organizes imputation techniques by the characteristics of the tabular dataset: the mechanism, pattern, and ratio of missingness.
Methods: We searched four databases (PubMed, Web of Science, Scopus, and IEEE Xplore) for articles published up to September 20, 2023, that discussed imputation methods for addressing missing values in structured clinical datasets. Our analysis of the selected articles focused on four aspects: the mechanism of missingness, the pattern of missingness, the ratio of missingness, and the imputation strategy used. By synthesizing these perspectives, we constructed an evidence map to recommend suitable imputation methods for handling missing values in a tabular dataset.
Results: Of 2955 articles, 58 were included in the analysis. The evidence map, built from the structure of the missing values and the types of imputation methods reported in these studies, showed that 45% of the studies employed conventional statistical methods, 31% used machine learning and deep learning methods, and 24% applied hybrid imputation techniques.
Conclusions: Considering the structure and characteristics of missing values in a clinical dataset is essential for choosing the most appropriate imputation technique, especially among conventional statistical methods. Estimating missing values so that they reflect reality increases the likelihood of obtaining high-quality, reusable data and contributes to precise medical decision-making. This review provides a guideline for choosing appropriate imputation methods during data preprocessing, before analysis of structured clinical datasets.
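
As an illustration of two of the dataset characteristics the abstract names, the following minimal Python sketch computes the ratio and the pattern of missingness for a small table and then applies one conventional statistical method. Everything in it is an assumption made for illustration: the toy records, the column names (`age`, `sbp`, `hba1c`), the helper functions, and the choice of mean imputation, which the review does not single out. The mechanism of missingness is not computed, since it generally has to be judged from domain knowledge rather than read off the observed data.

```python
from statistics import mean

# Hypothetical toy table: one dict per patient record, None marks a missing value.
rows = [
    {"age": 63, "sbp": 142.0, "hba1c": None},
    {"age": 58, "sbp": None, "hba1c": None},
    {"age": 71, "sbp": 130.0, "hba1c": 6.9},
    {"age": 49, "sbp": 118.0, "hba1c": 5.6},
]
columns = ["age", "sbp", "hba1c"]


def missing_ratio(rows, column):
    """Fraction of rows in which `column` is missing."""
    return sum(r[column] is None for r in rows) / len(rows)


def missing_patterns(rows, columns):
    """Count each distinct row-wise pattern (1 = missing, 0 = observed)."""
    counts = {}
    for r in rows:
        pattern = tuple(int(r[c] is None) for c in columns)
        counts[pattern] = counts.get(pattern, 0) + 1
    return counts


def mean_impute(rows, column):
    """Single imputation: fill missing entries with the observed mean.

    Returns new row dicts; the input rows are left unchanged.
    """
    fill = mean(r[column] for r in rows if r[column] is not None)
    return [
        {**r, column: fill if r[column] is None else r[column]}
        for r in rows
    ]


ratios = {c: missing_ratio(rows, c) for c in columns}
patterns = missing_patterns(rows, columns)
imputed = mean_impute(rows, "sbp")

print(ratios)
print(patterns)
print(imputed[1]["sbp"])
```

Profiling the ratio and pattern first, and only then picking a method, mirrors the order of decisions the abstract argues for; in practice the mean imputer here would be swapped for whichever method suits the observed structure.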

Identifiers

pubmed: 39198744
doi: 10.1186/s12874-024-02310-6
pii: 10.1186/s12874-024-02310-6

Publication types

Journal Article; Systematic Review

Languages

eng

Citation subsets

IM

Pagination

188

Copyright information

© 2024. The Author(s).

Authors

Marziyeh Afkanpour (M)

Department of Medical Informatics, Faculty of Medicine, Mashhad University of Medical Sciences, Mashhad, Iran.

Elham Hosseinzadeh (E)

Department of Medical Informatics, Faculty of Medicine, Mashhad University of Medical Sciences, Mashhad, Iran.

Hamed Tabesh (H)

Department of Medical Informatics, Faculty of Medicine, Mashhad University of Medical Sciences, Mashhad, Iran. tabesh79@gmail.com.
