Metagenomic compendium of 189,680 DNA viruses from the human gut microbiome.
Journal
Nature microbiology
ISSN: 2058-5276
Abbreviated title: Nat Microbiol
Country: England
NLM ID: 101674869
Publication information
Publication date:
07 2021
History:
received: 2021-03-09
accepted: 2021-05-25
pubmed: 2021-06-26
medline: 2021-09-21
entrez: 2021-06-25
Status:
ppublish
Abstract
Bacteriophages have important roles in the ecology of the human gut microbiome but are under-represented in reference databases. To address this problem, we assembled the Metagenomic Gut Virus catalogue that comprises 189,680 viral genomes from 11,810 publicly available human stool metagenomes. Over 75% of genomes represent double-stranded DNA phages that infect members of the Bacteroidia and Clostridia classes. Based on sequence clustering we identified 54,118 candidate viral species, 92% of which were not found in existing databases. The Metagenomic Gut Virus catalogue improves detection of viruses in stool metagenomes and accounts for nearly 40% of CRISPR spacers found in human gut Bacteria and Archaea. We also produced a catalogue of 459,375 viral protein clusters to explore the functional potential of the gut virome. This revealed tens of thousands of diversity-generating retroelements, which use error-prone reverse transcription to mutate target genes and may be involved in the molecular arms race between phages and their bacterial hosts.
Identifiers
pubmed: 34168315
doi: 10.1038/s41564-021-00928-6
pii: 10.1038/s41564-021-00928-6
pmc: PMC8241571
Chemical substances
DNA, Viral
0
Viral Proteins
0
Publication types
Journal Article
Research Support, N.I.H., Extramural
Research Support, Non-U.S. Gov't
Languages
eng
Citation subsets
IM
Pagination
960-970
Grants
Agency: NCI NIH HHS
ID: P30 CA124435
Country: United States
Agency: NIAID NIH HHS
ID: R01 AI148623
Country: United States