Dimension reduction and outlier detection of 3-D shapes derived from multi-organ CT images.

Bagplots; CT scans; Dimension reduction; Multiple co-inertia analysis; Outlier detection

Journal

BMC medical informatics and decision making
ISSN: 1472-6947
Abbreviated title: BMC Med Inform Decis Mak
Country: England
NLM ID: 101088682

Publication information

Publication date:
14 Feb 2024
History:
received: 24 03 2023
accepted: 08 02 2024
medline: 15 2 2024
pubmed: 15 2 2024
entrez: 14 2 2024
Status: epublish

Abstract

BACKGROUND
Unsupervised clustering and outlier detection are important in medical research for understanding the distributional composition of a patient cohort. A number of clustering methods exist, including methods for high-dimensional data after dimension reduction. Clustering and outlier detection may, however, become less robust or yield contradictory results if multiple high-dimensional data sets exist per patient. Such a scenario arises when the focus is on 3-D data of multiple organs per patient and a high-dimensional feature matrix is extracted per organ.
METHODS
We use principal component analysis (PCA), t-distributed stochastic neighbor embedding (t-SNE) and multiple co-inertia analysis (MCIA), combined with bagplots, to study the distribution of multi-organ 3-D data acquired by computed tomography scans. After point-set registration of multiple organs from two public data sets, several hundred shape features are extracted per organ. While PCA and t-SNE can only be applied to each organ individually, MCIA can project the data of all organs into the same low-dimensional space.
RESULTS
MCIA was the only approach considered here that could project the data of all organs into the same low-dimensional space. We studied how frequently (i.e., by how many organs) a patient was classified as belonging to the inner or outer 50% of the population, or as an outlier. Outliers could only be detected with MCIA and PCA. MCIA and t-SNE judged the distributional location of a patient more robustly than PCA.
CONCLUSIONS
MCIA is more appropriate and robust for judging the distributional location of a patient when multiple high-dimensional data sets exist per patient. It is still advisable to apply PCA or t-SNE in parallel to MCIA to study the location of individual organs.
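
The per-organ counting described in the results (how many organs place a patient in the inner 50%, the outer 50%, or among the outliers) can be illustrated with a minimal Python sketch. It is not the authors' pipeline, and several parts are assumptions: the data are synthetic, the function names (`pca_scores`, `classify_points`, `count_labels_per_patient`) are hypothetical, and the bagplot is replaced by a simpler elliptical stand-in. A true bagplot uses halfspace depth, with a bag holding the deepest 50% of points and a fence obtained by inflating the bag by a factor of 3. Here the Mahalanobis distance in the two-dimensional PCA space plays the role of depth, the median distance acts as the bag boundary, and three times that distance acts as the fence. MCIA is not covered.

```python
import numpy as np

LABELS = ("inner", "outer", "outlier")  # column order of the count table


def pca_scores(X, n_components=2):
    """Project the rows of X onto the first principal components (via SVD)."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T


def classify_points(scores, fence_factor=3.0):
    """Return 0 (inner), 1 (outer) or 2 (outlier) for each 2-D point.

    ASSUMPTION: elliptical stand-in for a bagplot. Mahalanobis distance
    replaces halfspace depth; mean and covariance are not robust estimates.
    """
    diff = scores - scores.mean(axis=0)
    cov_inv = np.linalg.inv(np.cov(scores, rowvar=False))
    d = np.sqrt(np.einsum("ij,jk,ik->i", diff, cov_inv, diff))
    r50 = np.median(d)                    # "bag": the closest half of the points
    labels = np.full(len(d), 1)           # default: outer 50%
    labels[d <= r50] = 0                  # inner 50%
    labels[d > fence_factor * r50] = 2    # beyond the "fence": outlier
    return labels


def count_labels_per_patient(organ_matrices):
    """Count, per patient, how many organs give each of the three labels."""
    n = organ_matrices[0].shape[0]
    counts = np.zeros((n, len(LABELS)), dtype=int)
    for X in organ_matrices:
        labels = classify_points(pca_scores(X))
        counts[np.arange(n), labels] += 1
    return counts


# Synthetic example: 40 patients, 3 organs, 50 shape features per organ.
rng = np.random.default_rng(0)
n_patients, n_features = 40, 50
organs = [rng.normal(size=(n_patients, n_features)) for _ in range(3)]
for X in organs:
    X[0] += 10.0  # patient 0 is made atypical in every organ

counts = count_labels_per_patient(organs)
print(LABELS)
print(counts[:5])  # one row per patient: organs voting inner / outer / outlier
```

Each row of `counts` sums to the number of organs. A patient whose row is split across columns is judged inconsistently by the individual organs, which is the kind of disagreement that a joint projection is meant to reduce.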

Identifiers

pubmed: 38355504
doi: 10.1186/s12911-024-02457-8
pii: 10.1186/s12911-024-02457-8

Publication types

Journal Article

Languages

eng

Citation subsets

IM

Pagination

49

Grants

Agency: Bundesministerium für Ernährung und Landwirtschaft
ID: 28DK104B20

Copyright information

© 2024. The Author(s).

Authors

Michael Selle

Institute of Animal Genomics, University of Veterinary Medicine Hannover, Hannover, Germany. michael.selle@tiho-hannover.de.

Magdalena Kircher

Institute of Animal Genomics, University of Veterinary Medicine Hannover, Hannover, Germany.

Cornelia Schwennen

Institute for Animal Nutrition, University of Veterinary Medicine Hannover, Hannover, Germany.

Christian Visscher

Institute for Animal Nutrition, University of Veterinary Medicine Hannover, Hannover, Germany.

Klaus Jung

Institute of Animal Genomics, University of Veterinary Medicine Hannover, Hannover, Germany. klaus.jung@tiho-hannover.de.

MeSH classifications