Fully Automatic Deep Learning in Bi-institutional Prostate Magnetic Resonance Imaging: Effects of Cohort Size and Heterogeneity.


Journal

Investigative radiology
ISSN: 1536-0210
Abbreviated title: Invest Radiol
Country: United States
NLM ID: 0045377

Publication information

Publication date:
2021-12-01
History:
pubmed: 2021-05-29
medline: 2022-04-15
entrez: 2021-05-28
Status: ppublish

Abstract sections

BACKGROUND
The potential of deep learning to support radiologist prostate magnetic resonance imaging (MRI) interpretation has been demonstrated.
PURPOSE
The aim of this study was to evaluate the effects of increased and diversified training data (TD) on deep learning performance for detection and segmentation of clinically significant prostate cancer-suspicious lesions.
MATERIALS AND METHODS
In this retrospective study, biparametric (T2-weighted and diffusion-weighted) prostate MRI acquired with multiple 1.5-T and 3.0-T MRI scanners in consecutive men was used for training and testing of prostate segmentation and lesion detection networks. Ground truth was the combination of targeted and extended systematic MRI-transrectal ultrasound fusion biopsies, with significant prostate cancer defined as International Society of Urological Pathology grade group greater than or equal to 2. U-Nets were internally validated on full, reduced, and PROSTATEx-enhanced training sets and subsequently externally validated on the institutional test set and the PROSTATEx test set. U-Net segmentation was calibrated to clinically desired levels in cross-validation, and test performance was subsequently compared using sensitivities, specificities, predictive values, and Dice coefficient.
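The Dice coefficient named above as the segmentation metric can be illustrated with a minimal sketch. It assumes both segmentations are available as flattened binary masks of equal length. The function name, the toy masks, and the convention for two empty masks are illustrative assumptions, not details taken from the study.

```python
from typing import Sequence


def dice_coefficient(pred: Sequence[int], truth: Sequence[int]) -> float:
    """Dice = 2 * |A intersect B| / (|A| + |B|) for two flattened binary masks."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of voxels")
    # Voxels marked as foreground in both masks.
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    # Sum of the foreground sizes of the two masks.
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    # Assumed convention: two empty masks count as perfect agreement.
    return 1.0 if total == 0 else 2.0 * intersection / total


# Hypothetical toy masks (not study data): 3 foreground voxels each, 2 shared.
pred_mask = [0, 1, 1, 1, 0, 0]
truth_mask = [0, 0, 1, 1, 1, 0]
example_dice = dice_coefficient(pred_mask, truth_mask)
print(f"Dice = {example_dice:.2f}")
```

In this toy case the masks share 2 of their 3 foreground voxels each, so the coefficient is 2 * 2 / (3 + 3), about 0.67.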
RESULTS
One thousand four hundred eighty-eight institutional examinations (median age, 64 years; interquartile range, 58-70 years) were temporally split into training (2014-2017, 806 examinations, supplemented by 204 PROSTATEx examinations) and test (2018-2020, 682 examinations) sets. In the test set, Prostate Imaging-Reporting and Data System (PI-RADS) cutoffs greater than or equal to 3 and greater than or equal to 4 on a per-patient basis had sensitivity of 97% (241/249) and 90% (223/249) at specificity of 19% (82/433) and 56% (242/433), respectively. The full U-Net had corresponding sensitivity of 97% (241/249) and 88% (219/249) with specificity of 20% (86/433) and 59% (254/433), not statistically different from PI-RADS (P > 0.3 for all comparisons). U-Net trained using a reduced set of 171 consecutive examinations achieved inferior performance (P < 0.001). PROSTATEx training enhancement did not improve performance. Dice coefficients were 0.90 for prostate and 0.42/0.53 for MRI lesion segmentation at PI-RADS category 3/4 equivalents.
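The per-patient percentages above follow directly from the reported counts, as the short sketch below shows. The counts are copied from the results; the dictionary layout, labels, and helper name are illustrative assumptions. Predictive values are not stated in the abstract and are left out here.

```python
def percent(numerator: int, denominator: int) -> int:
    """Return numerator/denominator as a percentage rounded to a whole number."""
    return round(100 * numerator / denominator)


# Counts as reported for the 682-examination test set:
# (true positives, patients with significant cancer,
#  true negatives, patients without significant cancer)
reported = {
    "PI-RADS >= 3": (241, 249, 82, 433),
    "PI-RADS >= 4": (223, 249, 242, 433),
    "U-Net, PI-RADS >= 3 equivalent": (241, 249, 86, 433),
    "U-Net, PI-RADS >= 4 equivalent": (219, 249, 254, 433),
}

results = {}
for label, (tp, n_pos, tn, n_neg) in reported.items():
    # Sanity check: positives plus negatives should equal the test set size.
    assert n_pos + n_neg == 682
    results[label] = {
        "sensitivity": percent(tp, n_pos),
        "specificity": percent(tn, n_neg),
    }
    print(f"{label}: sensitivity {results[label]['sensitivity']}%, "
          f"specificity {results[label]['specificity']}%")
```

The recomputed values match the rounded percentages in the text, and the positive and negative counts add up to the 682 test examinations.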
CONCLUSIONS
In a large institutional test set, U-Net confirms similar performance to clinical PI-RADS assessment and benefits from more TD, with neither institutional nor PROSTATEx performance improved by adding multiscanner or bi-institutional TD.

Identifiers

pubmed: 34049336
doi: 10.1097/RLI.0000000000000791
pii: 00004424-202112000-00003

Publication types

Journal Article; Research Support, Non-U.S. Gov't

Languages

eng

Citation subsets

IM

Pagination

799-808

Copyright information

Copyright © 2021 Wolters Kluwer Health, Inc. All rights reserved.

Conflict of interest statement

Conflicts of interest and sources of funding: N.N., C.W., P.S., X.W., X.Q., M.G., V.S., T.H., C.S., T.A.K., R.G., M.H., and K.H.M.-H. have nothing to declare. J.P.R. declares payment for consultant work from Saegeling Medizintechnik, Siemens Healthineers, and for development of educational presentations from Saegeling Medizintechnik. A.S. declares being part of advisory board/speaker's bureau of AstraZeneca, Bayer, Bristol Myers Squibb, Eli Lilly, Illumina, Janssen, MSD, Novartis, Pfizer, Roche, Seattle Genetics, Thermo Fisher Scientific; and declares receiving grants from Bayer, Bristol Myers Squibb, and Chugai. H.-P.S. declares receiving consulting fee or honorarium from Siemens, Curagita, Profound, and Bayer; declares receiving travel support from Siemens, Curagita, Profound, and Bayer; is a board member of Curagita; provides consultancy for Curagita and Bayer; declares receiving grants/grants pending from BMBF, Deutsche Krebshilfe, Dietmar-Hopp-Stiftung, and Roland-Ernst-Stiftung; and declares receiving payment for lectures from Siemens, Curagita, Profound, and Bayer. D.B. declares receiving payment for lectures from Bayer Vital.


Authors

Magdalena Görtz (M)

Department of Urology, University of Heidelberg Medical Center.

Viktoria Schütz (V)

Department of Urology, University of Heidelberg Medical Center.

Thomas Hielscher (T)

Division of Biostatistics, German Cancer Research Center.

Constantin Schwab (C)

Institute of Pathology, University of Heidelberg Medical Center.

Albrecht Stenzinger (A)

Institute of Pathology, University of Heidelberg Medical Center.

Tristan Anselm Kuder (TA)

Division of Medical Physics in Radiology.

Regula Gnirs (R)

Division of Radiology, German Cancer Research Center.

Markus Hohenfellner (M)

Department of Urology, University of Heidelberg Medical Center.
