Top 10 Reviewer Critiques of Radiology Artificial Intelligence (AI) Articles: Qualitative Thematic Analysis of Reviewer Critiques of Machine Learning/Deep Learning Manuscripts Submitted to JMRI.
artificial intelligence
machine learning
thematic analysis
Journal
Journal of magnetic resonance imaging : JMRI
ISSN: 1522-2586
Abbreviated title: J Magn Reson Imaging
Country: United States
NLM ID: 9105850
Publication information
Publication date: July 2020
History:
Received: September 21, 2019
Revised: December 9, 2019
Accepted: December 11, 2019
PubMed: January 17, 2020
MEDLINE: May 11, 2021
Entrez: January 17, 2020
Status: ppublish
Abstract sections
BACKGROUND
Classical machine learning (ML) and deep learning (DL) articles have rapidly captured the attention of the radiology research community and comprise an increasing proportion of articles submitted to JMRI, of variable reporting and methodological quality.
PURPOSE
To identify the most frequent reviewer critiques of classical ML and DL articles submitted to JMRI.
STUDY TYPE
Qualitative thematic analysis.
POPULATION
In all, 1,396 manuscripts were submitted to JMRI for consideration in 2018; thematic analysis was performed on reviewer critiques of 38 artificial intelligence (AI) articles, comprising 24 ML and 14 DL articles, submitted from January 9, 2018 to June 2, 2018.
FIELD STRENGTH/SEQUENCE
N/A.
ASSESSMENT
After identifying and sampling ML and DL articles, and collecting all reviews, qualitative thematic analysis was performed to identify major and minor themes of reviewer critiques.
STATISTICAL TESTS
Descriptive statistics were provided for article characteristics, along with thematic review of major and minor themes.
RESULTS
Thirty-eight articles were sampled for thematic review: 24 (63.2%) focused on classical ML and 14 (36.8%) on DL. The overall acceptance rate of classical ML/DL articles was 28.9%, similar to the overall 2017-2019 acceptance rate of 23.1-28.1%. These articles generated 72 reviews, yielding a total of 713 critiques that underwent formal thematic-analysis consensus encoding. Ten major themes of critiques were identified, with Theme 1, Lack of Information, the most frequent, comprising 268 (37.6%) of all critiques. Frequent minor themes of ML/DL-specific critiques included performing basic clinical statistics, such as ensuring similarity of training and test groups (N = 26); emphasizing strong clinical gold standards as the basis for training labels (N = 19); and ensuring strong radiological relevance of the topic and task performed (N = 16).
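As a quick sanity check, the percentages reported above follow directly from the stated counts; a minimal sketch (all counts taken from the abstract, rounding to one decimal as in the article):

```python
# Reproduce the descriptive percentages reported in RESULTS.
def pct(part, whole):
    """Percentage of `part` in `whole`, rounded to one decimal place."""
    return round(100 * part / whole, 1)

total_articles = 38
ml_articles = 24        # classical machine learning articles
dl_articles = 14        # deep learning articles
total_critiques = 713
lack_of_info = 268      # critiques under Theme 1, "Lack of Information"

print(pct(ml_articles, total_articles))    # 63.2
print(pct(dl_articles, total_articles))    # 36.8
print(pct(lack_of_info, total_critiques))  # 37.6
```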
DATA CONCLUSION
Standardized reporting of ML and DL methods could help address nearly one-third of all reviewer critiques.
LEVEL OF EVIDENCE
4. Technical Efficacy: Stage 1. J Magn Reson Imaging 2020;52:248-254.
Publication types
Journal Article
Languages
eng
Citation subsets
IM
Pagination
248-254
Comments and corrections
Type: CommentIn
Copyright information
© 2020 International Society for Magnetic Resonance in Medicine.