Multicenter evaluation of gut microbiome profiling by next-generation sequencing reveals major biases in partial-length metabarcoding approach.


Journal

Scientific reports
ISSN: 2045-2322
Abbreviated title: Sci Rep
Country: England
NLM ID: 101563288

Publication information

Publication date:
18 Dec 2023
History:
received: 27 02 2023
accepted: 27 10 2023
medline: 20 12 2023
pubmed: 20 12 2023
entrez: 19 12 2023
Status: epublish

Abstract

Next-generation sequencing workflows, using either metabarcoding or metagenomic approaches, have contributed massively to expanding knowledge of the human gut microbiota, but methodological biases compromise reproducibility across studies. Although these biases have been quantified individually in several comparative analyses, none of those analyses measured inter-laboratory reproducibility using similar DNA material. Here, we designed a multicenter study involving seven participating laboratories dedicated to partial-length (P1 to P5) or full-length (P6) metabarcoding, or to metagenomic profiling (MGP), using DNA from a mock microbial community or DNA extracted from 10 fecal samples collected at two time points from five donors. Fecal material was collected, and the DNA was extracted, according to the IHMS protocols. The mock and isolated DNA were then provided to the participating laboratories for sequencing. Following sequencing analysis according to the laboratories' routine pipelines, relative taxonomic-count tables defined at the genus level were provided and analyzed. Large variations in alpha-diversity between laboratories, uncorrelated with sequencing depth, were detected among the profiles. Half of the genera identified by P1 were unique to this partner, and two-thirds of the genera identified by MGP were not detected by P3. Analysis of beta-diversity revealed lower inter-individual variance than inter-laboratory variance. The taxonomic profiles of P5 and P6 were more similar to those of MGP than were those obtained by P1, P2, P3, and P4. Reanalysis of the raw sequences obtained by partial-length metabarcoding profiling, using a single bioinformatic pipeline, harmonized the description of the bacterial profiles, which became more similar to each other, except for P3, and closer to the profiles obtained by MGP. This study highlights the major impact of the bioinformatics pipeline, and primarily of the database used for taxonomic annotation. Laboratories need to benchmark and optimize their bioinformatic pipelines using standards to monitor their effectiveness in accurately detecting taxa present in gut microbiota.
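
The comparison of inter-individual and inter-laboratory variation described in the abstract can be illustrated with a minimal sketch. The abstract does not state which diversity indices were used, so the Shannon index and Bray-Curtis dissimilarity below are assumptions, and the laboratory names, donor names, and abundance values are hypothetical toy data, not results from the study.

```python
import math
from itertools import combinations


def shannon(profile):
    """Shannon index (natural log) of a genus -> abundance profile."""
    total = sum(profile.values())
    return -sum((v / total) * math.log(v / total)
                for v in profile.values() if v > 0)


def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two genus -> abundance profiles."""
    genera = set(a) | set(b)
    num = sum(abs(a.get(g, 0.0) - b.get(g, 0.0)) for g in genera)
    den = sum(a.get(g, 0.0) + b.get(g, 0.0) for g in genera)
    return num / den if den else 0.0


# Hypothetical toy data: (laboratory, donor) -> genus-level relative abundances.
profiles = {
    ("lab_A", "donor_1"): {"Bacteroides": 0.6, "Faecalibacterium": 0.4},
    ("lab_A", "donor_2"): {"Bacteroides": 0.5, "Faecalibacterium": 0.5},
    ("lab_B", "donor_1"): {"Bacteroides": 0.2, "Prevotella": 0.8},
    ("lab_B", "donor_2"): {"Bacteroides": 0.3, "Prevotella": 0.7},
}


def mean_dissimilarity(profiles, between):
    """Mean Bray-Curtis over sample pairs.

    between="donors": same laboratory, different donors (inter-individual).
    between="labs":   same donor, different laboratories (inter-laboratory).
    """
    values = []
    for (key1, p1), (key2, p2) in combinations(profiles.items(), 2):
        (lab1, donor1), (lab2, donor2) = key1, key2
        if between == "donors" and lab1 == lab2 and donor1 != donor2:
            values.append(bray_curtis(p1, p2))
        elif between == "labs" and donor1 == donor2 and lab1 != lab2:
            values.append(bray_curtis(p1, p2))
    return sum(values) / len(values)


inter_donor = mean_dissimilarity(profiles, "donors")
inter_lab = mean_dissimilarity(profiles, "labs")
alpha = {key: shannon(p) for key, p in profiles.items()}
```

In this toy table the same donor profiled by two laboratories differs more than two donors profiled by one laboratory, which is the pattern the abstract reports for beta-diversity.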

Identifiers

pubmed: 38114587
doi: 10.1038/s41598-023-46062-7
pii: 10.1038/s41598-023-46062-7

Publication types

Journal Article

Languages

eng

Citation subsets

IM

Pagination

22593

Grants

Agency: European Research Council
ID: 2017-AdG No. 788191
Country: International

Copyright information

© 2023. The Author(s).

Authors

Hugo Roume (H)

Université Paris-Saclay, INRAE, MetaGenoPolis, 78350, Jouy-en-Josas, France.
Discovery & Front End Innovation, Lesaffre Institute of Science & Technology, Lesaffre International, 101 rue de Menin, 59700, Marcq-en-Barœul, France.

Stanislas Mondot (S)

Université Paris-Saclay, INRAE, AgroParisTech, Micalis Institute, 78350, Jouy-en-Josas, France.

Adrien Saliou (A)

BIOASTER, Microbiology Technology Institute, 40 Avenue Tony Garnier, 69007, Lyon, France.

Sophie Le Fresne-Languille (S)

Biofortis SAS, 3 Route de la Chatterie, Saint-Herblain, 44800, Nantes, France.

Joël Doré (J)

Université Paris-Saclay, INRAE, MetaGenoPolis, 78350, Jouy-en-Josas, France. joel.dore@inrae.fr.
Université Paris-Saclay, INRAE, AgroParisTech, Micalis Institute, 78350, Jouy-en-Josas, France. joel.dore@inrae.fr.

MeSH classifications