Meta-SourceTracker: application of Bayesian source tracking to shotgun metagenomics.

Bioinformatics; Environmental microbiology; Software

Journal

PeerJ
ISSN: 2167-8359
Abbreviated title: PeerJ
Country: United States
NLM ID: 101603425

Publication information

Publication date:
2020
History:
received: 2019-07-23
accepted: 2020-02-21
entrez: 2020-04-02
pubmed: 2020-04-02
medline: 2020-04-02
Status: epublish

Abstract

Microbial source tracking methods are used to determine the origin of contaminating bacteria and other microorganisms, particularly in contaminated water systems. The Bayesian SourceTracker approach uses deep-sequencing marker gene libraries (16S ribosomal RNA) to determine the proportional contributions of bacteria from many potential source environments to a given sink environment simultaneously. Since its development, SourceTracker has been applied to a wide range of studies, from beach contamination to human behavior. Here, we demonstrate a novel application of SourceTracker to metagenomic datasets and test this approach using sink samples from a study of coastal marine environments. Source environment metagenomes were obtained from metagenomics studies of gut, freshwater, marine, sand and soil environments. As part of this effort, we implemented features for determining the stability of source proportion estimates, including precision visualizations for performance optimization, and performed domain-specific source-tracking analyses (i.e., Bacteria, Archaea, Eukaryota and viruses). We also applied SourceTracker to metagenomic libraries generated from samples collected from the International Space Station (ISS). SourceTracker proved highly effective at predicting the composition of known sources using shotgun metagenomic libraries. In addition, we showed that different taxonomic domains sometimes presented highly divergent pictures of environmental source origins for both the coastal marine and ISS samples. These findings indicated that applying SourceTracker to separate domains may provide a deeper understanding of the microbial origins of complex, mixed-source environments, and further suggested that certain domains may be preferable for tracking specific sources of contamination.
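
Illustrative sketch (not part of the original record): the idea of estimating the proportional contributions of several source environments to one sink can be shown with a much simpler stand-in than SourceTracker itself. The code below uses a maximum-likelihood EM mixture estimate, not SourceTracker's Bayesian Gibbs sampler, and it omits the "unknown" source. The function name `estimate_source_proportions`, the toy taxon profiles, and the counts are all hypothetical and chosen only for illustration.

```python
import numpy as np


def estimate_source_proportions(source_profiles, sink_counts, n_iter=500, tol=1e-10):
    """Estimate mixing proportions of known sources in a sink sample.

    Simplified EM stand-in for Bayesian source tracking (an assumption,
    not the SourceTracker algorithm).
    source_profiles: (n_sources, n_taxa) abundances per source.
    sink_counts:     (n_taxa,) observed counts in the sink.
    """
    sources = np.asarray(source_profiles, dtype=float) + 1e-9  # avoid zeros
    sources /= sources.sum(axis=1, keepdims=True)  # rows become distributions
    counts = np.asarray(sink_counts, dtype=float)

    n_sources = sources.shape[0]
    props = np.full(n_sources, 1.0 / n_sources)  # start from a uniform mix

    for _ in range(n_iter):
        # E-step: probability that a read of each taxon came from each source
        weighted = props[:, None] * sources
        resp = weighted / weighted.sum(axis=0, keepdims=True)
        # M-step: expected share of sink reads assigned to each source
        new_props = (resp * counts).sum(axis=1) / counts.sum()
        converged = np.abs(new_props - props).max() < tol
        props = new_props
        if converged:
            break
    return props


# Hypothetical toy data: two sources described over four taxa.
gut = [0.7, 0.2, 0.1, 0.0]
marine = [0.0, 0.1, 0.3, 0.6]
# Sink built as a 25% gut / 75% marine mixture of 1000 reads.
sink = [175, 125, 250, 450]

proportions = estimate_source_proportions([gut, marine], sink)
print(dict(zip(["gut", "marine"], proportions.round(3))))
```

Under the same assumptions, a domain-specific analysis of the kind described in the abstract would amount to running the estimate separately on the taxon columns belonging to each domain (Bacteria, Archaea, Eukaryota, viruses) and comparing the resulting proportions.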

Identifiers

pubmed: 32231882
doi: 10.7717/peerj.8783
pii: 8783
pmc: PMC7100590

Publication types

Journal Article

Languages

eng

Pagination

e8783

Copyright information

©2020 McGhee et al.

Conflict of interest statement

The authors declare there are no competing interests.

Authors

Jordan J McGhee (JJ)

Bioinformatics and Medical Informatics Program, San Diego State University, San Diego, CA, United States of America.

Nick Rawson (N)

Department of Mathematics and Statistics, San Diego State University, San Diego, CA, United States of America.

Barbara A Bailey (BA)

Department of Mathematics and Statistics, San Diego State University, San Diego, CA, United States of America.

Antonio Fernandez-Guerra (A)

Microbial Genomics and Bioinformatics Research Group, Max Planck Institute for Marine Microbiology, Bremen, Germany.
Current affiliation: Lundbeck Foundation GeoGenetics Centre, GLOBE Institute, University of Copenhagen, Copenhagen, Denmark.

Laura Sisk-Hackworth (L)

Department of Biology, San Diego State University, San Diego, CA, United States of America.

Scott T Kelley (ST)

Department of Biology, San Diego State University, San Diego, CA, United States of America.

MeSH classifications