Developing an ensemble machine learning study: Insights from a multi-center proof-of-concept study.


Journal

PloS one
ISSN: 1932-6203
Abbreviated title: PLoS One
Country: United States
NLM ID: 101285081

Publication information

Publication date:
2024
History:
received: 2023-11-15
accepted: 2024-04-21
medline: 2024-09-10
pubmed: 2024-09-10
entrez: 2024-09-10
Status: epublish

Abstract sections

BACKGROUND
To address the numerous unmet clinical needs, several machine learning models applied to medical images and clinical data have been introduced and developed in recent years. Even when they achieve encouraging results, they are rarely developed further and so remain isolated, stand-alone tools. We postulated that different algorithms proposed in the literature for the same diagnostic task can be aggregated to enhance classification performance. We present a proof of concept that defines an ensemble approach for integrating different algorithms proposed to solve the same clinical task.
METHODS
The proposed approach was developed starting from a public database of radiomic features extracted from CT images of 535 patients with lung cancer. Seven algorithms were trained independently by participants in the AI4MP working group on Artificial Intelligence of the Italian Association of Physics in Medicine to discriminate metastatic from non-metastatic patients. The classification scores generated by these algorithms were used to train a support vector machine (SVM) classifier. An Explainable Artificial Intelligence approach was applied to the final model. The ensemble model was validated following an 80-20 hold-out and leave-one-out scheme on the training set.
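The stacking step described in the methods can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the SVM kernel or hyperparameters, so a linear SVM trained by subgradient descent on the hinge loss stands in here, the scores are synthetic, and the names `train_linear_svm` and `predict` are hypothetical.

```python
import random

def train_linear_svm(X, y, lam=0.01, lr=0.05, epochs=100, seed=0):
    """Fit a linear SVM by stochastic subgradient descent on the
    L2-regularised hinge loss. Labels in y must be -1 or +1."""
    rng = random.Random(seed)
    w, b = [0.0] * len(X[0]), 0.0
    order = list(range(len(X)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            margin = y[i] * (sum(wj * xj for wj, xj in zip(w, X[i])) + b)
            if margin < 1:
                # Sample violates the margin: regulariser and hinge term both act.
                w = [wj - lr * (lam * wj - y[i] * xj) for wj, xj in zip(w, X[i])]
                b += lr * y[i]
            else:
                # Margin satisfied: only the regulariser shrinks the weights.
                w = [wj - lr * lam * wj for wj in w]
    return w, b

def predict(w, b, x):
    """Return +1 or -1 depending on the side of the hyperplane."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0 else -1

# Synthetic stand-in (assumption) for the base algorithms' scores:
# 7 scores per patient, loosely higher for the positive class.
rng = random.Random(42)
N_PATIENTS, N_ALGORITHMS = 100, 7
labels = [1 if rng.random() < 0.3 else -1 for _ in range(N_PATIENTS)]
scores = [
    [min(1.0, max(0.0, rng.gauss(0.65 if lab == 1 else 0.35, 0.15)))
     for _ in range(N_ALGORITHMS)]
    for lab in labels
]

# 80-20 hold-out split of the patients.
idx = list(range(N_PATIENTS))
rng.shuffle(idx)
cut = int(0.8 * N_PATIENTS)
train_idx, test_idx = idx[:cut], idx[cut:]

# The meta-classifier is trained on the base scores, not on the radiomic features.
w, b = train_linear_svm([scores[i] for i in train_idx],
                        [labels[i] for i in train_idx])
preds = [predict(w, b, scores[i]) for i in test_idx]
holdout_accuracy = sum(p == labels[i] for p, i in zip(preds, test_idx)) / len(test_idx)
print(f"hold-out accuracy: {holdout_accuracy:.2f}")
```

A leave-one-out variant would repeat the same training call once per patient, holding out a single index each time and averaging the results.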
RESULTS
Compared with the individual algorithms, the ensemble model achieved a more accurate result. On the independent test set, the ensemble model achieved an accuracy of 0.78, an F1-score of 0.57 and a log-loss of 0.49. Shapley values representing the contribution of each algorithm to the final classification result of the ensemble model were calculated. This information gives the end user added value when assessing whether the classification result is appropriate for a particular case. It also makes it possible to evaluate, at a global level, which methodological approaches of the individual algorithms are likely to have the most impact.
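For readers unfamiliar with the quantities reported above, the sketch below shows how accuracy, F1-score and log-loss are computed, and how exact Shapley values assign a contribution to each of seven inputs. It rests on assumptions: the labels, probabilities, weights and baseline are invented for illustration, the toy `ensemble_output` is a linear function rather than the study's SVM, and features outside a coalition are replaced by a fixed baseline, which may differ from the procedure the authors used.

```python
from itertools import combinations
from math import factorial, log

def classification_metrics(y_true, p_pos, threshold=0.5):
    """Accuracy, F1-score and log-loss for binary labels (0/1) and
    predicted probabilities of the positive class."""
    y_pred = [1 if p >= threshold else 0 for p in p_pos]
    tp = sum(t == 1 and q == 1 for t, q in zip(y_true, y_pred))
    fp = sum(t == 0 and q == 1 for t, q in zip(y_true, y_pred))
    fn = sum(t == 1 and q == 0 for t, q in zip(y_true, y_pred))
    accuracy = sum(t == q for t, q in zip(y_true, y_pred)) / len(y_true)
    f1 = 2 * tp / (2 * tp + fp + fn) if tp else 0.0
    eps = 1e-15  # clip probabilities so log() stays finite
    log_loss = -sum(
        log(min(max(p if t == 1 else 1 - p, eps), 1 - eps))
        for t, p in zip(y_true, p_pos)
    ) / len(y_true)
    return accuracy, f1, log_loss

def shapley_values(f, x, baseline):
    """Exact Shapley values of f at x relative to a baseline input.
    Features outside a coalition are set to their baseline value."""
    n = len(x)

    def value(coalition):
        return f([x[j] if j in coalition else baseline[j] for j in range(n)])

    phi = []
    for j in range(n):
        others = [k for k in range(n) if k != j]
        total = 0.0
        for size in range(n):
            # Weight of a coalition of this size in the Shapley average.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for S in combinations(others, size):
                total += weight * (value(set(S) | {j}) - value(set(S)))
        phi.append(total)
    return phi

# Invented labels and probabilities (assumption), five cases.
acc, f1, ll = classification_metrics([1, 0, 1, 1, 0], [0.9, 0.2, 0.4, 0.7, 0.6])
print(f"accuracy={acc:.2f} F1={f1:.2f} log-loss={ll:.2f}")

# Hypothetical linear ensemble over the scores of seven base algorithms.
weights = [0.8, 0.1, -0.3, 0.5, 0.0, 0.2, 0.4]

def ensemble_output(s):
    return sum(wj * sj for wj, sj in zip(weights, s)) - 1.0

case = [0.9, 0.4, 0.7, 0.8, 0.5, 0.3, 0.6]   # one patient's base scores
baseline = [0.5] * 7                          # reference scores
phi = shapley_values(ensemble_output, case, baseline)
for j, contribution in enumerate(phi, start=1):
    print(f"algorithm {j}: {contribution:+.3f}")
```

For a linear function, each Shapley value reduces to the weight times the difference between the case score and the baseline score, and the values sum to the difference between the output for the case and the output for the baseline. This gives a quick sanity check on the enumeration.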
CONCLUSIONS
Our proposal is an innovative approach for integrating the different algorithms that populate the literature, and it lays the foundations for future evaluations in broader application scenarios.

Identifiers

pubmed: 39255296
doi: 10.1371/journal.pone.0303217
pii: PONE-D-23-37990

Publication types

Journal Article; Multicenter Study

Languages

eng

Citation subsets

IM

Pagination

e0303217

Copyright information

Copyright: © 2024 Fanizzi et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Conflict of interest statement

The authors have declared that no competing interests exist.

Authors

Annarita Fanizzi (A)

Laboratorio Biostatistica e Bioinformatica, I.R.C.C.S. Istituto Tumori 'Giovanni Paolo II', Bari, Italy.

Federico Fadda (F)

Laboratorio Biostatistica e Bioinformatica, I.R.C.C.S. Istituto Tumori 'Giovanni Paolo II', Bari, Italy.

Michele Maddalo (M)

Servizio di Fisica Sanitaria, Azienda Ospedaliero-Universitaria di Parma, Parma, Italy.

Sara Saponaro (S)

Fisica Sanitaria, Azienda Usl Toscana Nord Ovest, Lucca, Italy.

Leda Lorenzon (L)

Fisica Sanitaria, Azienda Sanitaria dell'Alto Adige, Bolzano, Italy.

Leonardo Ubaldi (L)

Dip. Scienze Biomediche Sperimentali e Cliniche "Mario Serio", Università degli Studi di Firenze, Viale Morgagni, Firenze.
Istituto Nazionale di Fisica Nucleare, Sez. Firenze, Via Sansone 1, Sesto Fiorentino, Firenze.

Nicola Lambri (N)

IRCCS Humanitas Research Hospital, Medical Physics Unit of Radiotherapy and Radiosurgery Department, via Manzoni, Rozzano, Milan, Italy.
Department of Biomedical Sciences, Humanitas University, via Rita Levi Montalcini, Pieve Emanuele, Milan, Italy.

Alessia Giuliano (A)

U.O.C. Fisica Sanitaria, Azienda Ospedaliero-Universitaria Pisana, Pisa, Italy.

Emiliano Loi (E)

SC Fisica Sanitaria, IRCCS Istituto Romagnolo per lo Studio dei Tumori (IRST) "Dino Amadori", Meldola, Italy.

Michele Signoriello (M)

Fisica Sanitaria, Azienda sanitaria universitaria Giuliano Isontina, Trieste, Italy.

Marco Branchini (M)

Fisica Sanitaria, Azienda Socio Sanitaria Territoriale della Valtellina e dell'Alto Lario, Sondrio, Italy.

Gina Belmonte (G)

Fisica Sanitaria, Azienda Usl Toscana Nord Ovest, Lucca, Italy.

Marco Giannelli (M)

U.O.C. Fisica Sanitaria, Azienda Ospedaliero-Universitaria Pisana, Pisa, Italy.

Pietro Mancosu (P)

IRCCS Humanitas Research Hospital, Medical Physics Unit of Radiotherapy and Radiosurgery Department, via Manzoni, Rozzano, Milan, Italy.

Cinzia Talamonti (C)

Dip. Scienze Biomediche Sperimentali e Cliniche "Mario Serio", Università degli Studi di Firenze, Viale Morgagni, Firenze.
Istituto Nazionale di Fisica Nucleare, Sez. Firenze, Via Sansone 1, Sesto Fiorentino, Firenze.

Mauro Iori (M)

Medical Physics Unit, Azienda USL-IRCCS di Reggio Emilia, Reggio Emilia, Italy.

Sabina Tangaro (S)

Dipartimento di Fisica Applicata, Università degli Studi di Bari Aldo Moro, Bari, Italy.

Michele Avanzo (M)

Centro di Riferimento Oncologico di Aviano (CRO) IRCCS, Via F. Gallini, Aviano, Italy.

Raffaella Massafra (R)

Laboratorio Biostatistica e Bioinformatica, I.R.C.C.S. Istituto Tumori 'Giovanni Paolo II', Bari, Italy.
