Explainable artificial intelligence for microbiome data analysis in colorectal cancer biomarker identification.

biomarker identification; colorectal cancer; explainable artificial intelligence; machine learning; microbiome; microbiota; precision medicine

Journal

Frontiers in microbiology
ISSN: 1664-302X
Abbreviated title: Front Microbiol
Country: Switzerland
NLM ID: 101548977

Publication information

Publication date:
2024
History:
received: 03 12 2023
accepted: 24 01 2024
medline: 1 3 2024
pubmed: 1 3 2024
entrez: 1 3 2024
Status: epublish

Abstract

Background
Colorectal cancer (CRC) is a type of tumor caused by the uncontrolled growth of cells in the mucosa lining the last part of the intestine. Emerging evidence underscores an association between CRC and gut microbiome dysbiosis. The high mortality rate of this cancer has made it necessary to develop new early diagnostic methods. Machine learning (ML) techniques can represent a solution to evaluate the interaction between intestinal microbiota and host physiology. Through explainable artificial intelligence (XAI) it is possible to evaluate the individual contributions of microbial taxonomic markers for each subject. Our work also implements the SHapley Additive exPlanations (SHAP) algorithm to identify for each subject which parameters are important in the context of CRC.
Results
The proposed study aimed to implement an explainable artificial intelligence framework using both gut microbiota data and demographic information from subjects to distinguish control subjects from those with CRC. Our analysis revealed an association between gut microbiota and this disease. We compared three machine learning algorithms, and the Random Forest (RF) algorithm emerged as the best classifier, with a precision of 0.729 ± 0.038 and an area under the Precision-Recall curve of 0.668 ± 0.016. Additionally, SHAP analysis highlighted the most crucial variables in the model's decision-making, facilitating the identification of specific bacteria linked to CRC. Our results confirmed the role of certain bacteria, such as […].
Discussion
These findings emphasize the potential of leveraging gut microbiota data within an explainable AI framework for CRC classification. The significant association observed aligns with existing knowledge. The precision exhibited by the RF algorithm reinforces its suitability for such classification tasks. The SHAP analysis not only enhanced interpretability but also identified specific bacteria crucial in CRC determination. This approach opens avenues for targeted interventions based on microbial signatures. Further exploration is warranted to deepen our understanding of the intricate interplay between microbiota and health, providing insights for refined diagnostic and therapeutic strategies.
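
The per-subject attribution idea behind SHAP can be illustrated with a minimal sketch. The code below is not the authors' pipeline: it computes exact Shapley values by brute force for a hypothetical three-input scoring function (`toy_score`, with made-up inputs `taxon_a`, `taxon_b` and `age`), relative to an assumed all-zero reference profile. SHAP implementations estimate the same quantities far more efficiently for models such as Random Forests; the brute-force version is only meant to show what the values represent.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values for one subject `x` relative to `baseline`."""
    n = len(x)
    features = list(range(n))

    def value(subset):
        # Features in `subset` take the subject's values; the rest stay at baseline.
        mixed = [x[i] if i in subset else baseline[i] for i in features]
        return predict(mixed)

    phi = [0.0] * n
    for i in features:
        others = [j for j in features if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size that exclude feature i.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def toy_score(v):
    # Hypothetical risk score, invented for illustration only.
    taxon_a, taxon_b, age = v
    return 0.5 * taxon_a + 0.2 * taxon_b * age + 0.1 * age


subject = [0.4, 0.5, 1.0]   # made-up values for one subject
baseline = [0.0, 0.0, 0.0]  # assumed reference profile

phi = shapley_values(toy_score, subject, baseline)
for name, contribution in zip(["taxon_a", "taxon_b", "age"], phi):
    print(name, round(contribution, 4))
```

The contributions add up to the difference between the subject's score and the baseline score, which is the additivity property that lets each variable's share of an individual prediction be read off directly. The interaction term between `taxon_b` and `age` is split equally between the two inputs.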

Identifiers

pubmed: 38426064
doi: 10.3389/fmicb.2024.1348974
pmc: PMC10901987

Publication types

Journal Article

Languages

eng

Pagination

1348974

Copyright information

Copyright © 2024 Novielli, Romano, Magarelli, Bitonto, Diacono, Chiatante, Lopalco, Sabella, Venerito, Filannino, Bellotti, De Angelis, Iannone and Tangaro.

Conflict of interest statement

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest. The author(s) declared that they were an editorial board member of Frontiers, at the time of submission. This had no impact on the peer review process and the final decision.

Authors

Pierfrancesco Novielli (P)

Dipartimento di Scienze del Suolo, della Pianta e degli Alimenti, Università degli Studi di Bari Aldo Moro, Bari, Italy.
Istituto Nazionale di Fisica Nucleare, Sezione di Bari, Bari, Italy.

Donato Romano (D)

Dipartimento di Scienze del Suolo, della Pianta e degli Alimenti, Università degli Studi di Bari Aldo Moro, Bari, Italy.
Istituto Nazionale di Fisica Nucleare, Sezione di Bari, Bari, Italy.

Michele Magarelli (M)

Dipartimento di Scienze del Suolo, della Pianta e degli Alimenti, Università degli Studi di Bari Aldo Moro, Bari, Italy.

Pierpaolo Di Bitonto (PD)

Dipartimento di Scienze del Suolo, della Pianta e degli Alimenti, Università degli Studi di Bari Aldo Moro, Bari, Italy.

Domenico Diacono (D)

Istituto Nazionale di Fisica Nucleare, Sezione di Bari, Bari, Italy.

Annalisa Chiatante (A)

Dipartimento di Scienze del Suolo, della Pianta e degli Alimenti, Università degli Studi di Bari Aldo Moro, Bari, Italy.

Giuseppe Lopalco (G)

Dipartimento di Medicina di Precisione e Rigenerativa e Area Jonica, Università degli Studi di Bari Aldo Moro, Bari, Italy.

Daniele Sabella (D)

Dipartimento di Medicina di Precisione e Rigenerativa e Area Jonica, Università degli Studi di Bari Aldo Moro, Bari, Italy.

Vincenzo Venerito (V)

Dipartimento di Medicina di Precisione e Rigenerativa e Area Jonica, Università degli Studi di Bari Aldo Moro, Bari, Italy.

Pasquale Filannino (P)

Dipartimento di Scienze del Suolo, della Pianta e degli Alimenti, Università degli Studi di Bari Aldo Moro, Bari, Italy.

Roberto Bellotti (R)

Istituto Nazionale di Fisica Nucleare, Sezione di Bari, Bari, Italy.
Dipartimento Interateneo di Fisica M. Merlin, Università degli Studi di Bari Aldo Moro, Bari, Italy.

Maria De Angelis (M)

Dipartimento di Scienze del Suolo, della Pianta e degli Alimenti, Università degli Studi di Bari Aldo Moro, Bari, Italy.

Florenzo Iannone (F)

Dipartimento di Medicina di Precisione e Rigenerativa e Area Jonica, Università degli Studi di Bari Aldo Moro, Bari, Italy.

Sabina Tangaro (S)

Dipartimento di Scienze del Suolo, della Pianta e degli Alimenti, Università degli Studi di Bari Aldo Moro, Bari, Italy.
Istituto Nazionale di Fisica Nucleare, Sezione di Bari, Bari, Italy.

MeSH classifications