A question of trust: can we build an evidence base to gain trust in systematic review automation technologies?
Artificial intelligence
Automation
Data extraction
Machine learning
Screening
Journal
Systematic reviews
ISSN: 2046-4053
Abbreviated title: Syst Rev
Country: England
NLM ID: 101580575
Publication information
Published: 18 June 2019
History:
Received: 23 May 2018
Accepted: 5 June 2019
Entrez: 20 June 2019
PubMed: 20 June 2019
MEDLINE: 28 July 2020
Status: epublish
Abstract
Although many aspects of systematic reviews use computational tools, systematic reviewers have been reluctant to adopt machine learning tools. We argue that the reasons for the slow adoption of machine learning tools into systematic reviews are multifactorial. We focus on the current absence of trust in automation and on set-up challenges as major barriers to adoption. It is important that reviews produced using automation tools are considered non-inferior or superior to current practice. However, this standard alone will likely not be sufficient to lead to widespread adoption. As with many technologies, it is important that reviewers see "others" in the review community using automation tools. Adoption will also be slow if the automation tools are not compatible with the workflows and tasks currently used to produce reviews. Many automation tools being developed for systematic reviews address classification problems. Therefore, the evidence that these automation tools are non-inferior or superior can be presented using methods similar to diagnostic test evaluations, i.e., precision and recall compared to a human reviewer. However, the assessment of automation tools does present unique challenges for investigators and systematic reviewers, including the need to clarify which metrics are of interest to the systematic review community and the unique documentation challenges of reproducible software experiments. We discuss adoption barriers with the goal of providing tool developers with guidance on how to design and report such evaluations, and of helping end users assess their validity. Further, we discuss approaches to formatting and announcing publicly available datasets suitable for assessment of automation technologies and tools. Making these resources available will increase trust that tools are non-inferior or superior to current practice.
Finally, we note that, even with evidence that automation tools are non-inferior or superior to current practice, substantial set-up challenges remain for mainstream integration of automation into the systematic review process.
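The diagnostic-test-style evaluation the abstract describes can be illustrated with a minimal sketch (not from the article): an automated screening tool's include/exclude decisions are scored against human reviewer decisions treated as the reference standard, yielding precision and recall. The labels and helper function below are hypothetical.

```python
# Hedged sketch: scoring an automated screening tool against human
# reviewer decisions, with the human labels as the reference standard.

def precision_recall(human_labels, tool_labels):
    """Return (precision, recall) for binary include/exclude decisions.

    human_labels, tool_labels: sequences of 1 (include) / 0 (exclude).
    """
    pairs = list(zip(human_labels, tool_labels))
    tp = sum(1 for h, t in pairs if h == 1 and t == 1)  # both include
    fp = sum(1 for h, t in pairs if h == 0 and t == 1)  # tool over-includes
    fn = sum(1 for h, t in pairs if h == 1 and t == 0)  # tool misses a study
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall

# Hypothetical screening decisions for eight citations:
human = [1, 0, 1, 1, 0, 0, 1, 0]
tool  = [1, 0, 1, 0, 1, 0, 1, 0]
p, r = precision_recall(human, tool)
# tp=3, fp=1, fn=1 -> precision 0.75, recall 0.75
```

For screening specifically, recall (not missing relevant studies) is often weighted more heavily than precision, since a missed study can bias the review's conclusions.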
Identifiers
pubmed: 31215463
doi: 10.1186/s13643-019-1062-0
pii: 10.1186/s13643-019-1062-0
pmc: PMC6582554
Publication types
Journal Article
Languages
eng
Citation subsets
IM
Pagination
143
Grants
Agency: Medical Research Council
ID: MR/J005037/1
Country: United Kingdom