A question of trust: can we build an evidence base to gain trust in systematic review automation technologies?
Artificial intelligence
Automation
Data extraction
Machine learning
Screening
Journal
Systematic reviews
ISSN: 2046-4053
Abbreviated title: Syst Rev
Country: England
NLM ID: 101580575
Publication information
Published: 18 June 2019
History:
Received: 23 May 2018
Accepted: 5 June 2019
Entrez: 20 June 2019
PubMed: 20 June 2019
MEDLINE: 28 July 2020
Status: epublish
Abstract
Although many aspects of systematic reviews use computational tools, systematic reviewers have been reluctant to adopt machine learning tools. We argue that the reasons for the slow adoption of machine learning tools into systematic reviews are multifactorial. We focus on the current absence of trust in automation and on set-up challenges as major barriers to adoption. It is important that reviews produced using automation tools are considered non-inferior or superior to current practice. However, this standard alone will likely not be sufficient to lead to widespread adoption. As with many technologies, it is important that reviewers see "others" in the review community using automation tools. Adoption will also be slow if the automation tools are not compatible with the workflows and tasks currently used to produce reviews. Many automation tools being developed for systematic reviews address classification problems. Therefore, the evidence that these automation tools are non-inferior or superior can be presented using methods similar to diagnostic test evaluations, i.e., precision and recall compared to a human reviewer. However, the assessment of automation tools does present unique challenges for investigators and systematic reviewers, including the need to clarify which metrics are of interest to the systematic review community and the unique documentation challenges of reproducible software experiments. We discuss adoption barriers with the goal of providing tool developers with guidance on how to design and report such evaluations, and of helping end users assess their validity. Further, we discuss approaches to formatting and announcing publicly available datasets suitable for assessment of automation technologies and tools. Making these resources available will increase trust that tools are non-inferior or superior to current practice.
Finally, we note that, even with evidence that automation tools are non-inferior or superior to current practice, substantial set-up challenges remain for mainstream integration of automation into the systematic review process.
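The diagnostic-test-style evaluation the abstract describes can be illustrated with a minimal sketch (not from the article): an automated screening tool's include/exclude decisions are scored against human reviewer decisions treated as the reference standard, yielding precision and recall. The labels and helper function below are hypothetical.

```python
# Hedged sketch: scoring an automated screening tool against human
# reviewer decisions, with the human labels as the reference standard.

def precision_recall(human_labels, tool_labels):
    """Return (precision, recall) for binary include/exclude decisions.

    human_labels, tool_labels: sequences of 1 (include) / 0 (exclude).
    """
    pairs = list(zip(human_labels, tool_labels))
    tp = sum(1 for h, t in pairs if h == 1 and t == 1)  # both include
    fp = sum(1 for h, t in pairs if h == 0 and t == 1)  # tool over-includes
    fn = sum(1 for h, t in pairs if h == 1 and t == 0)  # tool misses a study
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall

# Hypothetical screening decisions for eight citations:
human = [1, 0, 1, 1, 0, 0, 1, 0]
tool  = [1, 0, 1, 0, 1, 0, 1, 0]
p, r = precision_recall(human, tool)
# tp=3, fp=1, fn=1 -> precision 0.75, recall 0.75
```

For screening specifically, recall (not missing relevant studies) is often weighted more heavily than precision, since a missed study can bias the review's conclusions.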
Identifiers
pubmed: 31215463
doi: 10.1186/s13643-019-1062-0
pii: 10.1186/s13643-019-1062-0
pmc: PMC6582554
Publication types
Journal Article
Languages
eng
Citation subsets
IM
Pagination
143
Grants
Agency: Medical Research Council
ID: MR/J005037/1
Country: United Kingdom