AI-Generated Annotations Dataset for Diverse Cancer Radiology Collections in NCI Image Data Commons.


Journal

Scientific Data
ISSN: 2052-4463
Abbreviated title: Sci Data
Country: England
NLM ID: 101640192

Publication information

Publication date:
23 Oct 2024
History:
received: 29 Dec 2023
accepted: 07 Oct 2024
medline: 24 Oct 2024
pubmed: 24 Oct 2024
entrez: 23 Oct 2024
Status: epublish

Abstract

The National Cancer Institute (NCI) Image Data Commons (IDC) offers publicly available cancer radiology collections for cloud computing, which are crucial for developing advanced imaging tools and algorithms. Despite their potential, these collections are minimally annotated; only 4% of the DICOM studies in the collections considered in this project had existing segmentation annotations. This project increases the quantity of segmentations in various IDC collections. We produced a high-quality dataset of AI-generated imaging annotations of tissues, organs, and/or cancers for 11 distinct IDC image collections. These collections contain images from a variety of modalities, including computed tomography (CT), magnetic resonance imaging (MRI), and positron emission tomography (PET). The collections cover various body parts, such as the chest, breast, kidneys, prostate, and liver. A portion of the AI annotations was reviewed and corrected by a radiologist to assess the performance of the AI models. Both the AI's and the radiologist's annotations were encoded in conformance with the Digital Imaging and Communications in Medicine (DICOM) standard, allowing for seamless integration into the IDC collections as third-party analysis collections. All the models, images, and annotations are publicly accessible.
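
The abstract does not name the metric used to compare the AI output with the radiologist's corrections. As an illustrative assumption, the sketch below uses the Dice similarity coefficient, a common overlap measure for segmentations that is also cited in the reference list (Dice, 1945). The function name `dice_coefficient`, the flat 0/1 mask representation, and the toy masks are hypothetical and are not taken from the paper.

```python
def dice_coefficient(mask_a, mask_b):
    """Dice similarity between two binary masks.

    Both masks are equal-length flat sequences of 0/1 values
    (a simplification; real segmentations are 3D voxel arrays).
    """
    if len(mask_a) != len(mask_b):
        raise ValueError("masks must have the same number of voxels")

    # Voxels labelled as foreground in both masks.
    intersection = sum(1 for a, b in zip(mask_a, mask_b) if a and b)
    # Total foreground voxels across the two masks.
    total = sum(1 for a in mask_a if a) + sum(1 for b in mask_b if b)

    if total == 0:
        # Both masks empty: treated as perfect agreement (an assumed convention).
        return 1.0
    return 2.0 * intersection / total


# Toy example: hypothetical AI mask vs. radiologist-corrected mask.
ai_mask = [0, 1, 1, 1, 0, 0]
radiologist_mask = [0, 0, 1, 1, 1, 0]
dice = dice_coefficient(ai_mask, radiologist_mask)
print(f"Dice = {dice:.3f}")
```

A value of 1 means the two masks overlap exactly and 0 means they share no foreground voxels, so a higher score would indicate that the radiologist changed less of the AI annotation.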

Identifiers

pubmed: 39443503
doi: 10.1038/s41597-024-03977-8
pii: 10.1038/s41597-024-03977-8

Publication types

Dataset Journal Article

Languages

eng

Citation subsets

IM

Pagination

1165

Copyright information

© 2024. The Author(s).

References

Fedorov, A. et al. National Cancer Institute Imaging Data Commons: Toward Transparency, Reproducibility, and Scalability in Imaging Artificial Intelligence. RadioGraphics 43 (2023).
Clark, K. et al. The Cancer Imaging Archive (TCIA): Maintaining and Operating a Public Information Repository. https://doi.org/10.1007/s10278-013-9622-7 .
Albertina, B. et al. The Cancer Genome Atlas Lung Adenocarcinoma Collection (TCGA-LUAD) (Version 4) [Data set]. The Cancer Imaging Archive https://doi.org/10.7937/K9/TCIA.2016.JGNIHEP5 (2016).
Kirk, S. et al. The Cancer Genome Atlas Lung Squamous Cell Carcinoma Collection (TCGA-LUSC) (Version 4) [Data set]. The Cancer Imaging Archive https://doi.org/10.7937/K9/TCIA.2016.TYGKKFMQ (2016).
Li, P. et al. A Large-Scale CT and PET/CT Dataset for Lung Cancer Diagnosis (Lung-PET-CT-Dx) [Data set]. The Cancer Imaging Archive https://doi.org/10.7937/TCIA.2020.NNC2-0461 (2020).
Madhavi, P., Patel, S. & Tsao, A. S. Data from Anti-PD-1 Immunotherapy Lung [Data set]. The Cancer Imaging Archive https://doi.org/10.7937/tcia.2019.zjjwb9ip (2019).
Muzi, P., Wanner, M. & Kinahan, P. Data From RIDER Lung PET-CT. The Cancer Imaging Archive https://doi.org/10.7937/k9/tcia.2015.ofip7tvm (2015).
Gevaert, O. et al. Non–Small Cell Lung Cancer: Identifying Prognostic Imaging Biomarkers by Leveraging Public Gene Expression Microarray Data—Methods and Preliminary Results. Radiology 264, 387–396 (2012).
doi: 10.1148/radiol.12111607 pubmed: 22723499 pmcid: 3401348
Bakr, S. et al. Data for NSCLC Radiogenomics (Version 4) [Data set]. The Cancer Imaging Archive (2017).
Bakr, S. et al. A radiogenomic dataset of non-small cell lung cancer. Sci Data 5, 180202 (2018).
doi: 10.1038/sdata.2018.202 pubmed: 30325352 pmcid: 6190740
Kinahan, P., Muzi, M., Bialecki, B., Herman, B. & Coombs, L. Data from the ACRIN 6668 Trial NSCLC-FDG-PET (Version 2) [Data set]. The Cancer Imaging Archive https://doi.org/10.7937/tcia.2019.30ilqfcl (2019).
Machtay, M. et al. Prediction of Survival by [18F]Fluorodeoxyglucose Positron Emission Tomography in Patients With Locally Advanced Non–Small-Cell Lung Cancer Undergoing Definitive Chemoradiation Therapy: Results of the ACRIN 6668/RTOG 0235 Trial. Journal of Clinical Oncology 31, 3823–3830 (2013).
doi: 10.1200/JCO.2012.47.5947 pubmed: 24043740 pmcid: 3795891
Li, X. et al. Data From QIN-Breast (Version 2) [Data set]. The Cancer Imaging Archive https://doi.org/10.7937/K9/TCIA.2016.21JUEBH0 (2016).
Li, X. et al. Multiparametric Magnetic Resonance Imaging for Predicting Pathological Response After the First Cycle of Neoadjuvant Chemotherapy in Breast Cancer. Invest Radiol 50, 195–204 (2015).
doi: 10.1097/RLI.0000000000000100 pubmed: 25360603 pmcid: 4471951
Akin, O. et al. The Cancer Genome Atlas Kidney Renal Clear Cell Carcinoma Collection (TCGA-KIRC) (Version 3) [Data set]. The Cancer Imaging Archive https://doi.org/10.7937/K9/TCIA.2016.V6PBVTDR (2016).
Litjens, J. B., Debats, O., Barentsz, J., Karssemeijer, N. & Huisman, H. SPIE-AAPM-NCI PROSTATEx Challenges. The Cancer Imaging Archive https://doi.org/10.7937/K9TCIA.2017.MURS5CL (2017).
Litjens, G., Debats, O., Barentsz, J., Karssemeijer, N. & Huisman, H. Computer-Aided Detection of Prostate Cancer in MRI. IEEE Trans Med Imaging 33, 1083–1092 (2014).
doi: 10.1109/TMI.2014.2303821 pubmed: 24770913
Erickson, B. J. et al. The Cancer Genome Atlas Liver Hepatocellular Carcinoma Collection (TCGA-LIHC) (Version 5) [Data set]. The Cancer Imaging Archive https://doi.org/10.7937/K9/TCIA.2016.IMMQW8UQ (2016).
Digital Imaging and Communications in Medicine (DICOM). in NEMA Publications PS 3.1-PS 3.12. (The National Electrical Manufacturers Association, Rosslyn, VA, 1992).
Isensee, F., Jaeger, P. F., Kohl, S. A. A., Petersen, J. & Maier-Hein, K. H. nnU-Net: a self-configuring method for deep learning-based biomedical image segmentation. Nature Methods 18, 203–211 (2020).
Murugesan, G. K. et al. Evaluating the Effect of Multilabel and Single Label Models on Prostate Cancer Lesion Segmentation in Ga-68 PSMA-11 PET/CT. (2023).
Wasserthal, J. et al. TotalSegmentator: Robust Segmentation of 104 Anatomic Structures in CT Images. Radiol Artif Intell 5, https://doi.org/10.1148/ryai.230024 (2023).
Gatidis, S. & Kuestner, T. A whole-body FDG-PET/CT dataset with manually annotated tumor lesions (FDG-PET-CT-Lesions) [Dataset]. The Cancer Imaging Archive https://doi.org/10.7937/gkr0-xv29 (2022).
Gatidis, S. et al. A whole-body FDG-PET/CT Dataset with manually annotated Tumor Lesions. Sci Data 9, 601 (2022).
doi: 10.1038/s41597-022-01718-3 pubmed: 36195599 pmcid: 9532417
Gatidis, S., Kustner, T., Ingrisch, M., Cyran, C. & Kleesiek, J. Automated Lesion Segmentation in Whole-Body FDG-PET/CT - Domain Generalization. Preprint at https://doi.org/10.5281/zenodo.7845727 (2023).
Murugesan, G. K. et al. Automatic Whole Body FDG PET/CT Lesion Segmentation using Residual UNet and Adaptive Ensemble. bioRxiv 2023.02.06.525233 https://doi.org/10.1101/2023.02.06.525233 (2023).
Wasserthal, J. et al. TotalSegmentator: Robust Segmentation of 104 Anatomic Structures in CT Images. Radiol Artif Intell 5 (2023).
Pretrained model for 3D semantic image segmentation of the FDG-avid lesions from PT/CT scans. https://doi.org/10.5281/ZENODO.8290055 .
Fedorov, A. et al. Standardized representation of the TCIA LIDC-IDRI annotations using DICOM. The Cancer Imaging Archive https://doi.org/10.7937/TCIA.2018.h7umfurq (2018).
Armato, S. G. et al. The Lung Image Database Consortium (LIDC) and Image Database Resource Initiative (IDRI): A completed reference database of lung nodules on CT scans. Med Phys 38, 915–931 (2011).
doi: 10.1118/1.3528204 pubmed: 21452728 pmcid: 3041807
Fedorov, A. et al. DICOM re‐encoding of volumetrically annotated Lung Imaging Database Consortium (LIDC) nodules. Med Phys 47, 5953–5965 (2020).
doi: 10.1002/mp.14445 pubmed: 32772385
Pretrained model for 3D semantic image segmentation of the lung from ct scan. https://doi.org/10.5281/ZENODO.8290168 .
Pretrained model for 3D semantic image segmentation of the lung nodules from CT scans. https://doi.org/10.5281/ZENODO.8290146 .
Aerts, H. J. W. L. et al. Data From NSCLC-Radiomics (version 4) [Data set]. The Cancer Imaging Archive https://doi.org/10.7937/K9/TCIA.2015.PF0M9REI (2014).
Aerts, H. J. W. L. et al. Decoding tumour phenotype by noninvasive imaging using a quantitative radiomics approach. Nat Commun 5, 4006 (2014).
doi: 10.1038/ncomms5006 pubmed: 24892406
Bakr, S. et al. Data descriptor: A radiogenomic dataset of non-small cell lung cancer. Sci Data 5, 180202 (2018).
Heller, N. et al. The KiTS21 Challenge: Automatic segmentation of kidneys, renal tumors, and renal cysts in corticomedullary-phase CT. Preprint at (2023).
Heller, N. et al. The state of the art in kidney and kidney tumor segmentation in contrast-enhanced CT imaging: Results of the KiTS19 challenge. Med Image Anal 67 (2021).
Heller, N. et al. The KiTS19 Challenge Data: 300 Kidney Tumor Cases with Clinical Context, CT Semantic Segmentations, and Surgical Outcomes. (2019).
Pretrained model for 3D semantic image segmentation of the kidney from CT scans. https://doi.org/10.5281/ZENODO.8277846 .
Schindele, D. et al. High Resolution Prostate Segmentations for the ProstateX-Challenge [Data set]. The Cancer Imaging Archive https://doi.org/10.7937/TCIA.2019.DEG7ZG1U (2020).
Meyer, A. et al. Anisotropic 3D Multi-Stream CNN for Accurate Prostate Segmentation from Multi-Planar MRI. Comput Methods Programs Biomed 200, 105821 (2021).
doi: 10.1016/j.cmpb.2020.105821 pubmed: 33218704
Meyer, A. et al. PROSTATEx Zone Segmentations [Data set]. The Cancer Imaging Archive https://doi.org/10.7937/TCIA.NBB4-4655 (2020).
Meyer, A. et al. Towards Patient-Individual PI-Rads v2 Sector Map: Cnn for Automatic Segmentation of Prostatic Zones From T2-Weighted MRI. in 2019 IEEE 16th International Symposium on Biomedical Imaging (ISBI 2019) 696–700, https://doi.org/10.1109/ISBI.2019.8759572 (IEEE, 2019).
Saha, A. et al. The PI-CAI Challenge: Public Training and Development Dataset. https://doi.org/10.5281/ZENODO.6624726 (2022).
Cuocolo, R., Stanzione, A., Castaldo, A., De Lucia, D. R. & Imbriaco, M. Quality control and whole-gland, zonal and lesion annotations for the PROSTATEx challenge public dataset. Eur J Radiol 138, 109647 (2021).
doi: 10.1016/j.ejrad.2021.109647 pubmed: 33721767
Cuocolo, R. et al. Deep Learning Whole-Gland and Zonal Prostate Segmentation on a Public MRI Dataset. Journal of Magnetic Resonance Imaging 54, 452–459 (2021).
doi: 10.1002/jmri.27585 pubmed: 33634932
Bressem, K., Adams, L. & Engel, G. Prostate158 - Training data (version 1) [Data set]. In Computers in Biology and Medicine 148, 105817 (2022).
Bloch, N. et al. NCI-ISBI 2013 Challenge: Automated Segmentation of Prostate Structures. The Cancer Imaging Archive https://doi.org/10.7937/K9/TCIA.2015.zF0vlOPv (2015).
Pretrained model for 3D semantic image segmentation of the prostate from T2 MRI scans. https://doi.org/10.5281/ZENODO.8290093 .
Fedorov, A. et al. Data From QIN-PROSTATE-Repeatability. The Cancer Imaging Archive https://doi.org/10.7937/K9/TCIA.2018.MR1CKGND (2018).
Fedorov, A., Vangel, M. G., Tempany, C. M. & Fennessy, F. M. Multiparametric Magnetic Resonance Imaging of the Prostate. Invest Radiol 52, 538–546 (2017).
doi: 10.1097/RLI.0000000000000382 pubmed: 28463931 pmcid: 5544576
Fedorov, A. et al. An annotated test-retest collection of prostate multiparametric MRI. Sci Data 5, 180281 (2018).
doi: 10.1038/sdata.2018.281 pubmed: 30512014 pmcid: 6278692
Peled, S. et al. Selection of Fitting Model and Arterial Input Function for Repeatability in Dynamic Contrast-Enhanced Prostate MRI. Acad Radiol 26, e241–e251 (2019).
doi: 10.1016/j.acra.2018.10.018 pubmed: 30467073
Schwier, M. et al. Repeatability of Multiparametric Prostate MRI Radiomics Features. Sci Rep 9, 9441 (2019).
doi: 10.1038/s41598-019-45766-z pubmed: 31263116 pmcid: 6602944
Litjens, G. et al. PROMISE12: Data from the MICCAI Grand Challenge: Prostate MR Image Segmentation 2012. https://doi.org/10.5281/ZENODO.8026660 (2023).
Antonelli, M. et al. The Medical Segmentation Decathlon. Nat Commun 13, 4128 (2022).
doi: 10.1038/s41467-022-30695-9 pubmed: 35840566 pmcid: 9287542
Ji, Y. et al. AMOS: A Large-Scale Abdominal Multi-Organ Benchmark for Versatile Medical Image Segmentation. (2022).
Macdonald, J. A. et al. Duke Liver Dataset: A Publicly Available Liver MRI Dataset with Liver Segmentation Masks and Series Labels. Radiol Artif Intell 5, https://doi.org/10.1148/ryai.220275 (2023).
Pretrained model for 3D semantic image segmentation of the liver from T1 MRI scans. https://doi.org/10.5281/ZENODO.8290124 .
Ma, J. et al. Fast and Low-GPU-memory abdomen CT organ segmentation: The FLARE challenge. Med Image Anal 82, 102616 (2022).
doi: 10.1016/j.media.2022.102616 pubmed: 36179380
Ma, J. et al. AbdomenCT-1K: Is Abdominal Organ Segmentation A Solved Problem? IEEE Transactions on Pattern Analysis and Machine Intelligence https://doi.org/10.1109/TPAMI.2021.3100536 (2021).
Pretrained model for 3D semantic image segmentation of the liver from CT scans. https://doi.org/10.5281/ZENODO.8274976 .
VanOss, J., Murugesan, G. K., McCrumb, D. & Soni, R. Image segmentations produced by BAMF under the AIMI Annotations initiative. Zenodo https://doi.org/10.5281/zenodo.13244892 (2024).
Dice, L. R. Measures of the Amount of Ecologic Association Between Species. Ecology 26, 297–302 (1945).
doi: 10.2307/1932409
Nikolov, S. et al. Clinically Applicable Segmentation of Head and Neck Anatomy for Radiotherapy: Deep Learning Algorithm Development and Validation Study. J Med Internet Res 23, e26151 (2021).
doi: 10.2196/26151 pubmed: 34255661 pmcid: 8314151

Authors

Gowtham Krishnan Murugesan (GK)

BAMF Health, Grand Rapids, MI, USA. gowtham.murugesan@bamfhealth.com.

Diana McCrumb (D)

BAMF Health, Grand Rapids, MI, USA.

Mariam Aboian (M)

Yale School of Medicine, New Haven, CT, USA.

Tej Verma (T)

Yale School of Medicine, New Haven, CT, USA.

Rahul Soni (R)

BAMF Health, Grand Rapids, MI, USA.

Fatima Memon (F)

Yale School of Medicine, New Haven, CT, USA.

Keyvan Farahani (K)

National Institutes of Health, Bethesda, MD, USA.

Linmin Pei (L)

Frederick National Laboratory for Cancer Research, Frederick, MD, USA.

Ulrike Wagner (U)

Frederick National Laboratory for Cancer Research, Frederick, MD, USA.

Andrey Y Fedorov (AY)

Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA.

David Clunie (D)

PixelMed Publishing, Bangor, PA, USA.

Stephen Moore (S)

BAMF Health, Grand Rapids, MI, USA.

Jeff Van Oss (J)

BAMF Health, Grand Rapids, MI, USA.
