MSPminer: abundance-based reconstitution of microbial pan-genomes from shotgun metagenomic data.
Journal
Bioinformatics (Oxford, England)
ISSN: 1367-4811
Abbreviated title: Bioinformatics
Country: England
NLM ID: 9808944
Publication information
Publication date:
01 05 2019
History:
received: 27 04 2018
revised: 29 08 2018
accepted: 24 09 2018
pubmed: 27 09 2018
medline: 19 05 2020
entrez: 26 09 2018
Status:
ppublish
Abstract
Analysis toolkits for shotgun metagenomic data achieve strain-level characterization of complex microbial communities by capturing intra-species gene content variation. Yet these tools are limited by the available reference genomes, which are far from covering all microbial variability: many species have not yet been sequenced or have only a few strains available. Binning co-abundant genes obtained from de novo assembly is a powerful reference-free technique to discover and reconstitute the gene repertoire of microbial species. While current methods accurately identify species core parts, they miss many accessory genes or split them into small gene groups that remain unassociated with core clusters. We introduce MSPminer, a computationally efficient software tool that reconstitutes Metagenomic Species Pan-genomes (MSPs) by binning co-abundant genes across metagenomic samples. MSPminer relies on a new robust measure of proportionality coupled with an empirical classifier to group and distinguish not only species core genes but also accessory genes. Applied to a large-scale metagenomic dataset, MSPminer successfully delineates in a few hours the gene repertoires of 1661 microbial species with similar specificity and higher sensitivity than existing tools. The taxonomic annotation of MSPs reveals hitherto unknown microorganisms and brings coherence to the nomenclature of the species of the human gut microbiota. The provided MSPs can be readily used for taxonomic profiling and biomarker discovery in human gut metagenomic samples. In addition, MSPminer can be applied to gene count tables from other ecosystems to perform similar analyses. The binary is freely available for non-commercial users at www.enterome.com/downloads. Supplementary data are available at Bioinformatics online.
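The idea of binning co-abundant genes by proportionality can be illustrated with a toy sketch. The code below is not MSPminer's measure or classifier, which the abstract does not specify; it assumes a simple stand-in (the median count ratio between two genes, scored by how many shared samples fall within 20% of it). The function names, thresholds, and count table are all hypothetical.

```python
from statistics import median


def proportionality(x, y, tolerance=0.2, min_shared=3):
    """Toy proportionality check between two gene count vectors.

    Only samples where both genes are detected are compared, so a gene
    that is absent from some samples is not penalized for it.
    Returns (ratio, score), or None if too few samples are shared.
    """
    ratios = [b / a for a, b in zip(x, y) if a > 0 and b > 0]
    if len(ratios) < min_shared:
        return None
    ratio = median(ratios)
    # Fraction of shared samples whose ratio is close to the median ratio.
    score = sum(abs(q - ratio) <= tolerance * ratio for q in ratios) / len(ratios)
    return ratio, score


def bin_genes(counts, seed, min_score=0.8):
    """Group genes whose counts are proportional to those of a seed gene."""
    members = [seed]
    for gene, vector in counts.items():
        if gene == seed:
            continue
        result = proportionality(counts[seed], vector)
        if result is not None and result[1] >= min_score:
            members.append(gene)
    return members


# Hypothetical gene count table: one list of counts per gene, one entry per sample.
counts = {
    "core_a": [10, 20, 0, 40, 30],
    "core_b": [20, 40, 0, 80, 60],      # proportional in every sample
    "accessory": [5, 0, 0, 20, 15],     # proportional only where detected
    "unrelated": [7, 1, 50, 3, 90],     # no consistent ratio
}

print(bin_genes(counts, "core_a"))
```

In this toy table the accessory gene is missing from one sample in which the seed gene is present, yet it is still grouped with the core genes because its counts are proportional wherever it is detected. The unrelated gene is left out.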
Identifiers
pubmed: 30252023
pii: 5106712
doi: 10.1093/bioinformatics/bty830
pmc: PMC6499236
Publication types
Journal Article
Research Support, Non-U.S. Gov't
Languages
eng
Citation subsets
IM
Pagination
1544-1552
Copyright information
© The Author(s) 2018. Published by Oxford University Press.