Dimension reduction and outlier detection of 3-D shapes derived from multi-organ CT images.
Bagplots
CT scans
Dimension reduction
Multiple co-inertia analysis
Outlier detection
Journal
BMC medical informatics and decision making
ISSN: 1472-6947
Abbreviated title: BMC Med Inform Decis Mak
Country: England
NLM ID: 101088682
Publication information
Publication date:
14 Feb 2024
History:
received: 24 Mar 2023
accepted: 08 Feb 2024
medline: 15 Feb 2024
pubmed: 15 Feb 2024
entrez: 14 Feb 2024
Status:
epublish
Abstract
BACKGROUND
Unsupervised clustering and outlier detection are important in medical research for understanding the distributional composition of a patient cohort. Many clustering methods exist, including methods for high-dimensional data after dimension reduction. Clustering and outlier detection may, however, become less robust or yield contradictory results if multiple high-dimensional data sets exist per patient. Such a scenario arises when the focus is on 3-D data of multiple organs per patient and a high-dimensional feature matrix is extracted per organ.
METHODS
We use principal component analysis (PCA), t-distributed stochastic neighbor embedding (t-SNE) and multiple co-inertia analysis (MCIA) combined with bagplots to study the distribution of multi-organ 3-D data acquired by computed tomography (CT) scans. After point-set registration of multiple organs from two public data sets, several hundred shape features are extracted per organ. While PCA and t-SNE can only be applied to each organ individually, MCIA can project the data of all organs into the same low-dimensional space.
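The contrast between per-organ and joint projection can be illustrated with a minimal sketch. The organ names, matrix sizes, and random feature values below are hypothetical placeholders, not the study's data. The `joint_scores` function is only a simplified stand-in (block-scaled, concatenated PCA) that shows what a shared low-dimensional space looks like; it is not an implementation of MCIA.

```python
import numpy as np

def pca_scores(X, n_components=2):
    """Project the rows of X onto the first principal components (via SVD)."""
    Xc = X - X.mean(axis=0)                          # center each feature
    U, s, _ = np.linalg.svd(Xc, full_matrices=False)
    return U[:, :n_components] * s[:n_components]    # patient scores

def joint_scores(blocks, n_components=2):
    """Simplified stand-in for a joint projection (NOT MCIA itself):
    scale each centered block by its largest singular value so that no
    organ dominates, concatenate the blocks column-wise, then run PCA."""
    scaled = []
    for X in blocks:
        Xc = X - X.mean(axis=0)
        scaled.append(Xc / np.linalg.svd(Xc, compute_uv=False)[0])
    return pca_scores(np.hstack(scaled), n_components)

rng = np.random.default_rng(0)
n_patients = 40
# Hypothetical feature matrices: one (patients x shape features) block per organ.
organs = {name: rng.normal(size=(n_patients, 300))
          for name in ("liver", "kidney", "bladder")}

# Per-organ analysis: one separate 2-D map per organ.
per_organ = {name: pca_scores(X) for name, X in organs.items()}
# Joint analysis: a single 2-D map in which each patient appears once.
shared = joint_scores(list(organs.values()))
```

With per-organ maps, a patient has one location per organ, and these locations may disagree; a joint projection gives one location per patient, which is the property the abstract attributes to MCIA.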
RESULTS
Of the approaches considered here, MCIA is the only one that can project the data of all organs into the same low-dimensional space. We studied how frequently (i.e., by how many organs) a patient was classified as belonging to the inner or outer 50% of the population, or as an outlier. Outliers could only be detected with MCIA and PCA. MCIA and t-SNE were more robust than PCA in judging the distributional location of a patient.
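The per-organ counting described above can be sketched as follows. This is a rough approximation of a bagplot, not the authors' procedure: halfspace depth is estimated from a finite grid of directions, the "bag" is taken as the deepest half of the points, and the fence is the bag inflated by the conventional factor of 3 about the approximate depth median. The organ names, the random 2-D maps, and the planted extreme patient are all hypothetical.

```python
import numpy as np

def classify_bagplot(P, factor=3.0, n_dir=360):
    """Label each 2-D point as 'inner', 'outer' or 'outlier' (approximate bagplot)."""
    n = len(P)
    ang = np.linspace(0.0, np.pi, n_dir, endpoint=False)
    U = np.column_stack([np.cos(ang), np.sin(ang)])      # unit directions
    proj = P @ U.T                                       # (n_points, n_dir)
    ranks = proj.argsort(axis=0).argsort(axis=0)         # 0-based rank per direction
    # Approximate Tukey depth: smallest one-sided point count over all directions.
    depth = np.minimum(ranks + 1, n - ranks).min(axis=1)
    center = P[depth == depth.max()].mean(axis=0)        # approximate depth median
    order = np.argsort(-depth, kind="stable")
    in_bag = np.zeros(n, dtype=bool)
    in_bag[order[: (n + 1) // 2]] = True                 # deepest half of the points
    # Fence: bag inflated by `factor` about the center, checked direction by direction.
    rel = (P - center) @ U.T
    hi = rel[in_bag].max(axis=0) * factor
    lo = rel[in_bag].min(axis=0) * factor
    inside_fence = ((rel <= hi) & (rel >= lo)).all(axis=1)
    return np.where(in_bag, "inner", np.where(inside_fence, "outer", "outlier"))

rng = np.random.default_rng(1)
n_patients = 60
# Hypothetical 2-D maps, one per organ (e.g. the output of a per-organ projection).
maps = {organ: rng.normal(size=(n_patients, 2))
        for organ in ("liver", "kidney", "bladder")}
maps["liver"][0] = [12.0, 12.0]       # plant one extreme patient in one organ map

# Rows are patients, columns are organs.
labels = np.column_stack([classify_bagplot(P) for P in maps.values()])
# For each patient: by how many organs was it placed in each class?
counts = {c: (labels == c).sum(axis=1) for c in ("inner", "outer", "outlier")}
```

A patient whose counts are concentrated in one class is judged consistently across organs, while counts spread over several classes indicate that the per-organ maps disagree about that patient's location.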
CONCLUSIONS
MCIA is more appropriate and robust for judging the distributional location of a patient when multiple high-dimensional data sets exist per patient. It is still advisable to apply PCA or t-SNE in parallel to MCIA to study the location of individual organs.
Identifiers
pubmed: 38355504
doi: 10.1186/s12911-024-02457-8
pii: 10.1186/s12911-024-02457-8
Publication types
Journal Article
Languages
eng
Citation subsets
IM
Pagination
49
Grants
Agency: Bundesministerium für Ernährung und Landwirtschaft
ID: 28DK104B20
Copyright information
© 2024. The Author(s).