Argument mining as rapid screening tool of COVID-19 literature quality: Preliminary evidence.

Artificial Intelligence; COVID-19 / diagnosis; Humans; Pandemics; Reproducibility of Results; Research

COVID-19; argument mining; artificial intelligence; inter-rater agreement; scientific literature quality assessment

Journal

Frontiers in public health

ISSN: 2296-2565

Abbreviated title: Front Public Health

Country: Switzerland

NLM ID: 101616579

Publication information

Publication date:
2022

History:

received: 16 May 2022

accepted: 27 June 2022

entrez: 4 August 2022

pubmed: 5 August 2022

medline: 6 August 2022

Status: epublish

Abstract

Background

The COVID-19 pandemic prompted the scientific community to share timely evidence, including in the form of preprints that had not yet been peer reviewed.

Purpose

To develop an artificial intelligence system for analyzing the scientific literature by leveraging recent developments in the field of Argument Mining.

Methodology

Scientific quality criteria were borrowed from two selected Cochrane systematic reviews. Four independent reviewers blindly scored 40 papers for each review on a 1-5 scale. These scores were compared with the automatic analysis performed by an Argument Mining (AM) system named MARGOT, which detected claims and supporting evidence in the cited papers. Outcomes were evaluated with inter-rater indices (Cohen's Kappa, Krippendorff's Alpha, s

Results

MARGOT performs differently on the two selected Cochrane reviews: the inter-rater indices show fair-to-moderate agreement of the most relevant MARGOT metrics with both the Cochrane and the skilled interval scores, with larger values for one of the two reviews.

Discussion and conclusions

The noted discrepancy could stem from a limitation of the MARGOT system that can be improved; yet the level of agreement between human reviewers also suggests that the two reviews differ in complexity when debating controversial arguments. These preliminary results encourage expanding and deepening the investigation to other topics and to a larger number of highly specialized reviewers, in order to reduce uncertainty in the evaluation process and thus support the retraining of AM systems.

Identifiers

DOI: 10.3389/fpubh.2022.945181 PMID: 35923956 PMC: PMC9339778

Publication types

Journal Article

Languages

eng

Pagination

945181

Authors

Gianfranco Brambilla

Antonella Rosi

Francesco Antici

Andrea Galassi

Daniele Giansanti

Fabio Magurano

Federico Ruggeri

Paolo Torroni

Evaristo Cisbani

Marco Lippi
