Application of a validated prostate MRI deep learning system to independent same-vendor multi-institutional data: demonstration of transferability.
Deep learning
Magnetic resonance imaging
Prostatic neoplasms
Validation study
Journal
European radiology
ISSN: 1432-1084
Titre abrégé: Eur Radiol
Pays: Germany
ID NLM: 9114774
Informations de publication
Date de publication:
Nov 2023
Nov 2023
Historique:
received:
24
03
2023
accepted:
27
04
2023
revised:
24
04
2023
medline:
27
10
2023
pubmed:
29
7
2023
entrez:
28
7
2023
Statut:
ppublish
Résumé
To evaluate a fully automatic deep learning system to detect and segment clinically significant prostate cancer (csPCa) on same-vendor prostate MRI from two different institutions not contributing to training of the system. In this retrospective study, a previously bi-institutionally validated deep learning system (UNETM) was applied to bi-parametric prostate MRI data from one external institution (A), a PI-RADS distribution-matched internal cohort (B), and a csPCa stratified subset of single-institution external public challenge data (C). csPCa was defined as ISUP Grade Group ≥ 2 determined from combined targeted and extended systematic MRI/transrectal US-fusion biopsy. Performance of UNETM was evaluated by comparing ROC AUC and specificity at typical PI-RADS sensitivity levels. Lesion-level analysis between UNETM segmentations and radiologist-delineated segmentations was performed using Dice coefficient, free-response operating characteristic (FROC), and weighted alternative (waFROC). The influence of using different diffusion sequences was analyzed in cohort A. In 250/250/140 exams in cohorts A/B/C, differences in ROC AUC were insignificant with 0.80 (95% CI: 0.74-0.85)/0.87 (95% CI: 0.83-0.92)/0.82 (95% CI: 0.75-0.89). At sensitivities of 95% and 90%, UNETM achieved specificity of 30%/50% in A, 44%/71% in B, and 43%/49% in C, respectively. Dice coefficient of UNETM and radiologist-delineated lesions was 0.36 in A and 0.49 in B. The waFROC AUC was 0.67 (95% CI: 0.60-0.83) in A and 0.7 (95% CI: 0.64-0.78) in B. UNETM performed marginally better on readout-segmented than on single-shot echo-planar-imaging. For same-vendor examinations, deep learning provided comparable discrimination of csPCa and non-csPCa lesions and examinations between local and two independent external data sets, demonstrating the applicability of the system to institutions not participating in model training. A previously bi-institutionally validated fully automatic deep learning system maintained acceptable exam-level diagnostic performance in two independent external data sets, indicating the potential of deploying AI models without retraining or fine-tuning, and corroborating evidence that AI models extract a substantial amount of transferable domain knowledge about MRI-based prostate cancer assessment. • A previously bi-institutionally validated fully automatic deep learning system maintained acceptable exam-level diagnostic performance in two independent external data sets. • Lesion detection performance and segmentation congruence was similar on the institutional and an external data set, as measured by the weighted alternative FROC AUC and Dice coefficient. • Although the system generalized to two external institutions without re-training, achieving expected sensitivity and specificity levels using the deep learning system requires probability thresholds to be adjusted, underlining the importance of institution-specific calibration and quality control.
Identifiants
pubmed: 37507610
doi: 10.1007/s00330-023-09882-9
pii: 10.1007/s00330-023-09882-9
pmc: PMC10598076
doi:
Types de publication
Journal Article
Langues
eng
Sous-ensembles de citation
IM
Pagination
7463-7476Commentaires et corrections
Type : CommentIn
Informations de copyright
© 2023. German Cancer Research Center.
Références
Turkbey B, Rosenkrantz AB, Haider MA et al (2019) Prostate Imaging Reporting and Data System Version 2.1: 2019 Update of Prostate Imaging Reporting and Data System Version 2. Eur Urol 76:340–351
doi: 10.1016/j.eururo.2019.02.033
pubmed: 30898406
Schelb P, Kohl S, Radtke JP et al (2019) Classification of cancer at prostate MRI: deep learning versus clinical PI-RADS assessment. Radiology 293:607–617
doi: 10.1148/radiol.2019190938
pubmed: 31592731
Schelb P, Wang X, Radtke JP et al (2020) Simulated clinical deployment of fully automatic deep learning for clinical prostate MRI assessment. Eur Radiol. https://doi.org/10.1007/s00330-020-07086-z
doi: 10.1007/s00330-020-07086-z
pubmed: 32767102
pmcid: 7755653
Zhong X, Cao R, Shakeri S et al (2019) Deep transfer learning-based prostate cancer classification using 3 Tesla multi-parametric MRI. Abdom Radiol (NY) 44:2030–2039
doi: 10.1007/s00261-018-1824-5
pubmed: 30460529
Ronneberger O, Fischer P, Brox T (2015) U-Net: Convolutional Networks for Biomedical Image Segmentation
Netzer N, Weisser C, Schelb P et al (2021) Fully automatic deep learning in bi-institutional prostate Magnetic Resonance Imaging: Effects of Cohort Size and Heterogeneity. Invest Radiol 56(12):799–808. https://doi.org/10.1097/RLI.0000000000000791
Armato SG 3rd, Huisman H, Drukker K et al (2018) PROSTATEx challenges for computerized classification of prostate lesions from multiparametric magnetic resonance images. J Med Imaging (Bellingham) 5:044501
pubmed: 30840739
Clark K, Vendt B, Smith K et al (2013) The Cancer Imaging Archive (TCIA): maintaining and operating a public information repository. J Digit Imaging 26:1045–1057
doi: 10.1007/s10278-013-9622-7
pubmed: 23884657
pmcid: 3824915
Kann BH, Hicks DF, Payabvash S et al (2020) Multi-institutional validation of deep learning for pretreatment identification of extranodal extension in head and neck squamous cell carcinoma. J Clin Oncol 38:1304–1311
doi: 10.1200/JCO.19.02031
pubmed: 31815574
Chen KT, Schurer M, Ouyang J et al (2020) Generalization of deep learning models for ultra-low-count amyloid PET/MRI using transfer learning. Eur J Nucl Med Mol Imaging 47:2998–3007
doi: 10.1007/s00259-020-04897-6
pubmed: 32535655
pmcid: 7680289
AlBadawy EA, Saha A, Mazurowski MA (2018) Deep learning for segmentation of brain tumors: impact of cross-institutional training and testing. Med Phys 45:1150–1158
doi: 10.1002/mp.12752
pubmed: 29356028
Egevad L, Delahunt B, Srigley JR, Samaratunga H (2016) International Society of Urological Pathology (ISUP) grading of prostate cancer - an ISUP consensus on contemporary grading. APMIS 124:433–435
doi: 10.1111/apm.12533
pubmed: 27150257
Nolden M, Zelzer S, Seitel A et al (2013) The Medical Imaging Interaction Toolkit: challenges and advances : 10 years of open-source development. Int J Comput Assist Radiol Surg 8:607–620
doi: 10.1007/s11548-013-0840-8
pubmed: 23588509
Fritzsche KH, Neher PF, Reicht I et al (2012) MITK diffusion imaging. Methods Inf Med 51:441–448
doi: 10.3414/ME11-02-0031
pubmed: 23038239
Kuru TH, Wadhwa K, Chang RT et al (2013) Definitions of terms, processes and a minimum dataset for transperineal prostate biopsies: a standardization approach of the Ginsburg Study Group for Enhanced Prostate Diagnostics. BJU Int 112:568–577
doi: 10.1111/bju.12132
pubmed: 23773772
Isensee F, Jaeger PF, Kohl SAA, Petersen J, Maier-Hein KH (2021) nnU-Net: a self-configuring method for deep learning-based biomedical image segmentation. Nat Methods 18:203–211
doi: 10.1038/s41592-020-01008-z
pubmed: 33288961
Delong ER, Delong DM, Clarkepearson DI (1988) Comparing the areas under 2 or more correlated receiver operating characteristic curves - a nonparametric approach. Biometrics 44:837–845
doi: 10.2307/2531595
pubmed: 3203132
Fisher R (1934) Statistical methods for research workers. 5th edn Oliver & Boyd
McNemar Q (1947) Note on the sampling error of the difference between correlated proportions or percentages. Psychometrika 12:153–157
doi: 10.1007/BF02295996
pubmed: 20254758
Holm S (1979) A simple sequentially rejective multiple test procedure. Scand J Stat 6:65–70. https://www.jstor.org/stable/4615733
D. C, Phillips P, Z. X (2020) RJafroc: artificial intelligence systems and observer performance, 2.0.1
Chakraborty D, Zhai X (2023) RJafroc: artificial intelligence systems and observer performance. R package version 2.1.2.9000. https://dpc10ster.github.io/RJafroc/
Ahmed HU, El-Shater Bosaily A, Brown LC et al (2017) Diagnostic accuracy of multi-parametric MRI and TRUS biopsy in prostate cancer (PROMIS): a paired validating confirmatory study. Lancet 389:815–822
doi: 10.1016/S0140-6736(16)32401-1
pubmed: 28110982
R Core Team (2020) R: a language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. https://www.R-project.org/
Dice LR (1945) Measures of the amount of ecologic association between species. Ecology 26:297–302
doi: 10.2307/1932409
Wei JW, Suriawinata AA, Vaickus LJ et al (2020) Evaluation of a deep neural network for automated classification of colorectal polyps on histopathologic slides. JAMA Netw Open 3:e203398
doi: 10.1001/jamanetworkopen.2020.3398
pubmed: 32324237
pmcid: 7180424
Park KJ, Choi SH, Kim MH, Kim JK, Jeong IG (2021) Performance of prostate imaging reporting and data system Version 2.1 for diagnosis of prostate cancer: a systematic review and meta-analysis. J Magn Reson Imaging 54:103–112
doi: 10.1002/jmri.27546
pubmed: 33576169
Penzkofer T, Padhani AR, Turkbey B, Ahmed HU (2022) Assessing the clinical performance of artificial intelligence software for prostate cancer detection on MRI. Eur Radiol 32:2221–2223
doi: 10.1007/s00330-022-08609-6
pubmed: 35195746
Schelb P, Tavakoli AA, Tubtawee T et al (2020) Comparison of prostate MRI lesion segmentation agreement between multiple radiologists and a fully automatic deep learning system. Rofo. https://doi.org/10.1055/a-1290-8070
doi: 10.1055/a-1290-8070
pubmed: 33212541
Duran A, Dussert G, Rouviere O, Jaouen T, Jodoin PM, Lartizien C (2022) ProstAttention-Net: a deep attention model for prostate cancer segmentation by aggressiveness in MRI scans. Med Image Anal 77:102347
doi: 10.1016/j.media.2021.102347
pubmed: 35085952
Seetharaman A, Bhattacharya I, Chen LC et al (2021) Automated detection of aggressive and indolent prostate cancer on magnetic resonance imaging. Med Phys 48:2960–2972
doi: 10.1002/mp.14855
pubmed: 33760269
Hosseinzadeh M, Saha A, Brand P, Slootweg I, de Rooij M, Huisman H (2022) Deep learning-assisted prostate cancer detection on bi-parametric MRI: minimum training data size requirements and effect of prior knowledge. Eur Radiol 32:2224–2234
doi: 10.1007/s00330-021-08320-y
pubmed: 34786615
Saha A, Hosseinzadeh M, Huisman H (2021) End-to-end prostate cancer detection in bpMRI via 3D CNNs: effects of attention mechanisms, clinical priori and decoupled false positive reduction. Med Image Anal 73:102155
doi: 10.1016/j.media.2021.102155
pubmed: 34245943
Klingebiel M, Ullrich T, Quentin M et al (2020) Advanced diffusion weighted imaging of the prostate: comparison of readout-segmented multi-shot, parallel-transmit and single-shot echo-planar imaging. Eur J Radiol 130:109161
doi: 10.1016/j.ejrad.2020.109161
pubmed: 32650128
Plodeck V, Radosa CG, Hubner HM et al (2020) Rectal gas-induced susceptibility artefacts on prostate diffusion-weighted MRI with epi read-out at 3.0 T: does a preparatory micro-enema improve image quality? Abdom Radiol (NY) 45:4244–4251
doi: 10.1007/s00261-020-02600-9
pubmed: 32500236
Cuocolo R, Stanzione A, Ponsiglione A et al (2019) Prostate MRI technical parameters standardization: a systematic review on adherence to PI-RADSv2 acquisition protocol. Eur J Radiol 120:108662
doi: 10.1016/j.ejrad.2019.108662
pubmed: 31539790
Coşkun M, Sarp AF, Karasu Ş, Gelal MF, Türkbey B (2019) Assessment of the compliance with minimum acceptable technical parameters proposed by PI-RADS v2 guidelines in multiparametric prostate MRI acquisition in tertiary referral hospitals in the Republic of Turkey. Diagn Interv Radiol 25:421
doi: 10.5152/dir.2019.18537
pubmed: 31650967
pmcid: 6837294
Giganti F, Kirkham A, Kasivisvanathan V et al (2021) Understanding PI-QUAL for prostate MRI quality: a practical primer for radiologists. Insights Into Imaging 12:1–19
doi: 10.1186/s13244-021-00996-6
Venderink W, van Luijtelaar A, van der Leest M et al (2019) Multiparametric magnetic resonance imaging and follow-up to avoid prostate biopsy in 4259 men. BJU Int 124:775–784
doi: 10.1111/bju.14853
pubmed: 31237388
Cuocolo R, Stanzione A, Castaldo A, De Lucia DR, Imbriaco M (2021) Quality control and whole-gland, zonal and lesion annotations for the PROSTATEx challenge public dataset. Eur J Radiol 138:109647
doi: 10.1016/j.ejrad.2021.109647
pubmed: 33721767
GigantiAllenEmbertonMooreKasivisvanathanGroup FCMCMVPS (2020) Prostate imaging quality (PI-QUAL): a new quality control scoring system for multiparametric magnetic resonance imaging of the prostate from the PRECISION trial. Eur Urol Oncol 3:615–619
doi: 10.1016/j.euo.2020.06.007