Host phenotype classification from human microbiome data is mainly driven by the presence of microbial taxa.
Journal
PLoS computational biology
ISSN: 1553-7358
Abbreviated title: PLoS Comput Biol
Country: United States
NLM ID: 101238922
Publication information
Publication date:
04 2022
History:
received: 13 Oct 2021
accepted: 29 Mar 2022
revised: 3 May 2022
pubmed: 22 Apr 2022
medline: 6 May 2022
entrez: 21 Apr 2022
Status:
epublish
Abstract
Machine learning-based classification approaches are widely used to predict host phenotypes from microbiome data. Classifiers typically take operational taxonomic units or relative abundance profiles as input features. Such data are intrinsically sparse, which makes it possible to base predictions on the presence/absence, rather than the relative abundance, of microbial taxa. This also raises the question of whether it is the presence, rather than the abundance, of particular taxa that is relevant for discrimination, an aspect that has so far been overlooked in the literature. In this paper, we aim to fill this gap by performing a meta-analysis of 4,128 publicly available metagenomes associated with multiple case-control studies. At species-level taxonomic resolution, we show that it is the presence, rather than the relative abundance, of specific microbial taxa that matters when building classification models. These findings are robust to the choice of classifier and are confirmed by statistical tests for identifying differentially abundant/present taxa. The results are further confirmed at coarser taxonomic resolutions and validated on 4,026 additional 16S rRNA samples from 30 public case-control studies.
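As a rough illustration of the presence/absence encoding the abstract refers to, the sketch below binarizes a toy relative abundance table and computes per-group prevalence of each taxon. The data, the taxon names, the function names, and the zero threshold are all hypothetical choices made for this example; the abstract does not state the threshold or the pipeline the authors used.

```python
from typing import Dict, List

# Hypothetical toy data: rows are samples, columns are taxa (relative abundances).
TAXA = ["taxon_A", "taxon_B", "taxon_C"]
ABUNDANCE = [
    [0.70, 0.30, 0.00],  # case
    [0.55, 0.45, 0.00],  # case
    [0.60, 0.00, 0.40],  # control
    [0.20, 0.00, 0.80],  # control
]
LABELS = ["case", "case", "control", "control"]


def to_presence_absence(profile: List[float], threshold: float = 0.0) -> List[int]:
    """Map each relative abundance to 1 if it exceeds the threshold, else 0.

    The default threshold of 0.0 is an assumption made for illustration.
    """
    return [1 if value > threshold else 0 for value in profile]


def prevalence_by_group(
    matrix: List[List[int]], labels: List[str], taxa: List[str]
) -> Dict[str, Dict[str, float]]:
    """Fraction of samples in each group in which each taxon is present."""
    result: Dict[str, Dict[str, float]] = {}
    for group in sorted(set(labels)):
        rows = [row for row, label in zip(matrix, labels) if label == group]
        result[group] = {
            taxon: sum(row[j] for row in rows) / len(rows)
            for j, taxon in enumerate(taxa)
        }
    return result


# Binary feature matrix that a classifier could take instead of abundances.
PRESENCE = [to_presence_absence(row) for row in ABUNDANCE]
PREVALENCE = prevalence_by_group(PRESENCE, LABELS, TAXA)

if __name__ == "__main__":
    for group, values in PREVALENCE.items():
        print(group, values)
```

In this toy table, taxon_A is present in every sample and so carries no presence/absence signal, even though its abundance varies, while taxon_B and taxon_C separate the two groups by presence alone.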
Identifiers
pubmed: 35446845
doi: 10.1371/journal.pcbi.1010066
pii: PCOMPBIOL-D-21-01860
pmc: PMC9064115
Chemical substances
RNA, Ribosomal, 16S
0
Publication types
Journal Article
Meta-Analysis
Research Support, Non-U.S. Gov't
Languages
eng
Citation subsets
IM
Pagination
e1010066
Conflict of interest statement
The authors have declared that no competing interests exist.