SCRAPP: A tool to assess the diversity of microbial samples from phylogenetic placements.

Keywords: diversity; microbiome; phylogenetic placement; species delimitation

Journal

Molecular Ecology Resources
ISSN: 1755-0998
Abbreviated title: Mol Ecol Resour
Country: England
NLM ID: 101465604

Publication information

Publication date:
Jan 2021
History:
received: 05 03 2020
revised: 24 07 2020
accepted: 25 08 2020
pubmed: 1 10 2020
medline: 18 8 2021
entrez: 30 9 2020
Status: ppublish

Abstract

Microbial ecology research is currently driven by the continuously decreasing cost of DNA sequencing and the improving accuracy of data analysis methods. One such analysis method is phylogenetic placement, which establishes the phylogenetic identity of the anonymous environmental sequences in a sample by means of a given phylogenetic reference tree. However, assessing the diversity of a sample remains challenging, as traditional methods do not scale well with the increasing data volumes and/or do not leverage the phylogenetic placement information. Here, we present SCRAPP, a highly parallel and scalable tool that uses a molecular species delimitation algorithm to quantify the diversity distribution over the reference phylogeny for a given phylogenetic placement of the sample. SCRAPP employs a novel approach to clustering phylogenetic placements, called placement space clustering, which performs dimensionality reduction efficiently so that the tool scales to large data volumes. Furthermore, it uses the phylogeny-aware molecular species delimitation method mPTP to quantify diversity. We evaluated SCRAPP using both simulated and empirical data sets. We used the simulated data to verify our approach. Tests on an empirical data set show that SCRAPP-derived metrics can classify samples by their diversity-correlated features as well as or better than existing, commonly used approaches. SCRAPP is available at https://github.com/pbdas/scrapp.
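
To make the phrase "diversity distribution over the reference phylogeny" more concrete, the following minimal sketch groups placed query sequences by the reference-tree branch on which each has its most likely placement. It is an illustration only and not the tool's implementation: the abstract states that the tool clusters placements and applies mPTP, neither of which is shown here. The sketch assumes input shaped like a parsed jplace document, with a "fields" list and a "placements" list; the function name and the toy data are hypothetical.

```python
from collections import defaultdict


def group_queries_by_best_edge(jplace):
    """Map each reference-tree edge to the queries whose best placement is on it."""
    fields = jplace["fields"]
    edge_col = fields.index("edge_num")
    lwr_col = fields.index("like_weight_ratio")

    by_edge = defaultdict(list)
    for entry in jplace["placements"]:
        # Pick the placement row with the highest likelihood weight ratio.
        best = max(entry["p"], key=lambda row: row[lwr_col])
        # "n" holds plain names; "nm" holds [name, multiplicity] pairs.
        names = entry.get("n") or [name for name, _ in entry.get("nm", [])]
        by_edge[best[edge_col]].extend(names)
    return dict(by_edge)


# Hypothetical toy input; all values are invented for illustration.
example = {
    "fields": ["edge_num", "likelihood", "like_weight_ratio",
               "distal_length", "pendant_length"],
    "placements": [
        {"p": [[3, -1200.5, 0.9, 0.01, 0.02], [4, -1203.1, 0.1, 0.02, 0.03]],
         "n": ["read_a"]},
        {"p": [[3, -1100.0, 0.7, 0.01, 0.05], [7, -1101.0, 0.3, 0.02, 0.04]],
         "n": ["read_b"]},
        {"p": [[7, -900.0, 1.0, 0.03, 0.01]],
         "nm": [["read_c", 2]]},
    ],
}

groups = group_queries_by_best_edge(example)
for edge, names in sorted(groups.items()):
    print(edge, names)
```

In a pipeline of the kind the abstract describes, each per-branch group would then be reduced and passed to a species delimitation step. Here the groups are only listed, so the query count per branch is at most a crude stand-in for a diversity estimate.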

Identifiers

pubmed: 32996237
doi: 10.1111/1755-0998.13255
pmc: PMC7756409

Publication types

Journal Article

Languages

eng

Citation subsets

IM

Pagination

340-349

Grants

Agency: Klaus Tschira Stiftung

Copyright information

© 2020 The Authors. Molecular Ecology Resources published by John Wiley & Sons Ltd.

Citation

Mol Ecol Resour. 2021 Jan;21(1):340-349

Authors

Pierre Barbera

Computational Molecular Evolution Group, Heidelberg Institute for Theoretical Studies, Heidelberg, Germany.

Lucas Czech

Computational Molecular Evolution Group, Heidelberg Institute for Theoretical Studies, Heidelberg, Germany.

Sarah Lutteropp

Computational Molecular Evolution Group, Heidelberg Institute for Theoretical Studies, Heidelberg, Germany.

Alexandros Stamatakis

Computational Molecular Evolution Group, Heidelberg Institute for Theoretical Studies, Heidelberg, Germany.
Institute for Theoretical Informatics, Karlsruhe Institute of Technology, Karlsruhe, Germany.
