Machine learning for identifying relevant publications in updates of systematic reviews of diagnostic test studies.


Journal

Research synthesis methods
ISSN: 1759-2887
Abbreviated title: Res Synth Methods
Country: England
NLM ID: 101543738

Publication information

Publication date:
Jul 2021
History:
received: 17 06 2020
revised: 11 01 2021
accepted: 13 02 2021
pubmed: 16 3 2021
medline: 29 10 2021
entrez: 15 3 2021
Status: ppublish

Abstract

Updating systematic reviews is often a time-consuming process that involves a lot of human effort and is therefore not conducted as often as it should be. The aim of our research project was to explore the potential of machine learning methods to reduce human workload. Furthermore, we evaluated the performance of deep learning methods in comparison to more established machine learning methods. We used three available reviews of diagnostic test studies as the data set. In order to identify relevant publications, we used typical text preprocessing methods. The reference standard for the evaluation was the human consensus on a binary classification (inclusion or exclusion). For the evaluation of the models, various scenarios were generated using a grid of combinations of data preprocessing steps. Moreover, we evaluated each machine learning approach with an approach-specific predefined grid of tuning parameters using the Brier score metric. The best performance was obtained with an ensemble method for two of the reviews, and by a deep learning approach for the other review. Yet, the final performance of approaches strongly depends on data preparation. Overall, machine learning methods provided reasonable classification. It seems possible to reduce human workload in updating systematic reviews by using machine learning methods. Yet, as the influence of data preprocessing on the final performance seems to be at least as important as choosing the specific machine learning approach, users should not blindly expect good performance solely from using approaches from a popular class, such as deep learning.
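
For readers unfamiliar with the Brier score named in the abstract, the following is a minimal sketch of how it is commonly computed for a binary include/exclude decision: the mean squared difference between the predicted inclusion probability and the 0/1 reference label, with lower values indicating better probabilistic predictions. The function name `brier_score` and all labels and probabilities below are hypothetical illustrations and are not taken from the study, which reports using R rather than Python.

```python
from typing import Sequence


def brier_score(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Mean squared difference between predicted probabilities and 0/1 labels."""
    if len(y_true) != len(y_prob):
        raise ValueError("y_true and y_prob must have the same length")
    if len(y_true) == 0:
        raise ValueError("at least one prediction is required")
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


# Hypothetical reference decisions (1 = include, 0 = exclude) and
# hypothetical predicted inclusion probabilities from two classifiers.
labels = [1, 0, 0, 1, 0]
model_a = [0.9, 0.2, 0.1, 0.7, 0.3]  # fairly confident and mostly correct
model_b = [0.6, 0.5, 0.4, 0.5, 0.5]  # hedges close to 0.5 everywhere

score_a = brier_score(labels, model_a)
score_b = brier_score(labels, model_b)

# The classifier with the lower score would be preferred under this metric.
print(f"model_a: {score_a:.3f}, model_b: {score_b:.3f}")
```

Under this sketch, a classifier that always predicts 0.5 scores 0.25, so values well below that suggest the predicted probabilities carry real information about inclusion.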

Identifiers

pubmed: 33720520
doi: 10.1002/jrsm.1486

Publication types

Journal Article

Languages

eng

Citation subsets

IM

Pagination

506-515

Copyright information

© 2021 The Authors. Research Synthesis Methods published by John Wiley & Sons Ltd.

Authors

Toni Lange (T)

Center for Evidence-based Healthcare, University Hospital Carl Gustav Carus and Faculty of Medicine Carl Gustav Carus, Technische Universität Dresden, Germany.

Guido Schwarzer (G)

Faculty of Medicine and Medical Center, Institute of Medical Biometry and Statistics, University of Freiburg, Freiburg, Germany.

Thomas Datzmann (T)

Center for Evidence-based Healthcare, University Hospital Carl Gustav Carus and Faculty of Medicine Carl Gustav Carus, Technische Universität Dresden, Germany.

Harald Binder (H)

Faculty of Medicine and Medical Center, Institute of Medical Biometry and Statistics, University of Freiburg, Freiburg, Germany.
