A robust clustering strategy for stratification unveils unique patient subgroups in acutely decompensated cirrhosis.
ACLF
Cirrhosis
Clustering
Complex diseases
Patient heterogeneity
Stratification
Unsupervised learning
Journal
Journal of translational medicine
ISSN: 1479-5876
Abbreviated title: J Transl Med
Country: England
NLM ID: 101190741
Publication information
Publication date: 27 Jun 2024
History:
received: 01 Feb 2024
accepted: 10 Jun 2024
medline: 28 Jun 2024
pubmed: 28 Jun 2024
entrez: 27 Jun 2024
Status: epublish
Abstract
Abstract sections
BACKGROUND
Patient heterogeneity poses significant challenges for managing individuals and designing clinical trials, especially in complex diseases. Existing classifications rely on outcome-predicting scores, potentially overlooking crucial elements contributing to heterogeneity without necessarily impacting prognosis.
METHODS
To address patient heterogeneity, we developed ClustALL, a computational pipeline that simultaneously faces diverse clinical data challenges like mixed types, missing values, and collinearity. ClustALL enables the unsupervised identification of patient stratifications while filtering for stratifications that are robust against minor variations in the population (population-based) and against limited adjustments in the algorithm's parameters (parameter-based).
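The population-based robustness check described above can be illustrated with a small resampling experiment. The following is a minimal sketch, not the ClustALL implementation: it assumes plain k-means on numeric toy data, a simplified bootstrap in which repeated draws are collapsed to unique patients, and the Jaccard index as the agreement measure. The function names (`kmeans`, `bootstrap_stability`) and the toy data are hypothetical and chosen only for illustration.

```python
import random
from statistics import mean


def sqdist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, iters=100):
    """Lloyd's k-means with a deterministic farthest-point start."""
    centers = [points[0]]
    while len(centers) < k:
        # Next center: the point farthest from all centers chosen so far.
        centers.append(max(points, key=lambda p: min(sqdist(p, c) for c in centers)))
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: sqdist(p, centers[c])) for p in points]
        updated = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            updated.append(tuple(mean(d) for d in zip(*members)) if members else centers[c])
        if updated == centers:
            break
        centers = updated
    return labels


def cluster_sets(indices, labels):
    """Group patient indices by cluster label."""
    groups = {}
    for i, lab in zip(indices, labels):
        groups.setdefault(lab, set()).add(i)
    return list(groups.values())


def jaccard(a, b):
    """Jaccard index of two index sets."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def bootstrap_stability(points, k, n_boot=50, seed=0):
    """Mean best-match Jaccard index per reference cluster over resamples."""
    rng = random.Random(seed)
    all_idx = list(range(len(points)))
    reference = cluster_sets(all_idx, kmeans(points, k))
    scores = [[] for _ in reference]
    for _ in range(n_boot):
        # Simplification: draw with replacement, then keep unique patients.
        drawn = sorted(set(rng.choices(all_idx, k=len(points))))
        boot = cluster_sets(drawn, kmeans([points[i] for i in drawn], k))
        present = set(drawn)
        for j, ref in enumerate(reference):
            scores[j].append(max(jaccard(ref & present, b) for b in boot))
    return [mean(s) for s in scores]


# Toy data (assumption): three well-separated groups versus unstructured noise.
rng = random.Random(1)
blobs = [(cx + rng.uniform(-1, 1), cy + rng.uniform(-1, 1))
         for cx, cy in [(0, 0), (10, 0), (0, 10)] for _ in range(30)]
noise = [(rng.uniform(0, 10), rng.uniform(0, 10)) for _ in range(90)]

stable = bootstrap_stability(blobs, k=3)
noisy = bootstrap_stability(noise, k=3)
print("separated groups:", [round(s, 2) for s in stable])
print("uniform noise:   ", [round(s, 2) for s in noisy])
```

A stratification whose clusters keep a Jaccard index close to 1 across resamples would count as robust to small changes in the population. The cut-off below which a stratification is discarded is a design choice that this sketch does not fix. The parameter-based check could be sketched the same way by varying the algorithm's settings instead of the sample.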
RESULTS
Applied to a European cohort of patients with acutely decompensated cirrhosis (n = 766), ClustALL identified five robust stratifications, using only data at hospital admission. All stratifications included markers of impaired liver function and number of organ dysfunction or failure, and most included precipitating events. When focusing on one of these stratifications, patients were categorized into three clusters characterized by typical clinical features; notably, the 3-cluster stratification showed a prognostic value. Re-assessment of patient stratification during follow-up delineated patients' outcomes, with further improvement of the prognostic value of the stratification. We validated these findings in an independent prospective multicentre cohort of patients from Latin America (n = 580).
CONCLUSIONS
By applying ClustALL to patients with acutely decompensated cirrhosis, we identified three patient clusters. Following these clusters over time offers insights that could guide future clinical trial design. ClustALL is a novel and robust stratification method capable of addressing the multiple challenges of patient stratification in most complex diseases.
Identifiers
pubmed: 38937846
doi: 10.1186/s12967-024-05386-2
pii: 10.1186/s12967-024-05386-2
Publication types
Journal Article
Languages
eng
Citation subsets
IM
Pagination
599
Grants
Agency: Ministerio de Ciencia e Innovación
ID: RYC2021-032197-I
Agency: Horizon 2020 Framework Programme
ID: 847949
Agency: German Research Foundation
ID: 403224013 - SFB 1382 (A09)
Agency: Foundation pour la Recherche Médicale
ID: EQU202303016287
Agency: Agence Nationale pour la Recherche
ID: ANR-18-CE14-0006-01, RHU QUID-NASH, ANR-18-IDEX-0001, ANR-22-CE14-0002
Copyright information
© 2024. The Author(s).