Accurate estimation of microbial sequence diversity with Distanced.


Journal

Bioinformatics (Oxford, England)
ISSN: 1367-4811
Titre abrégé: Bioinformatics
Pays: England
ID NLM: 9808944

Informations de publication

Date de publication:
01 02 2020
Historique:
received: 02 05 2019
revised: 16 07 2019
accepted: 21 08 2019
pubmed: 11 9 2019
medline: 18 9 2020
entrez: 11 9 2019
Statut: ppublish

Résumé

Microbes are the most diverse organisms on the planet. Deep sequencing of ribosomal DNA (rDNA) suggests thousands of different microbes may be present in a single sample. However, errors in sequencing have made any estimate of within-sample (alpha) diversity uncertain. We developed a tool to estimate alpha diversity of rDNA sequences from microbes (and other sequences). Our tool, Distanced, calculates how different (distant) sequences would be without sequencing errors. It does this using a Bayesian approach. Using this approach, Distanced accurately estimated alpha diversity of rDNA sequences from bacteria and fungi. It had lower root mean square prediction error (RMSPE) than when using no tool (leaving sequencing errors uncorrected). It was also accurate with non-microbial sequences (antibody mRNA). State-of-the-art tools (DADA2 and Deblur) were far less accurate. They often had higher RMSPE than when using no tool. Distanced thus represents an improvement over existing tools. Distanced will be useful to several disciplines, given microbial diversity affects everything from human health to ecosystem function. Distanced is freely available at https://github.com/thackmann/Distanced. Supplementary data are available at Bioinformatics online.

Identifiants

pubmed: 31504180
pii: 5556106
doi: 10.1093/bioinformatics/btz668
doi:

Types de publication

Journal Article Research Support, Non-U.S. Gov't

Langues

eng

Sous-ensembles de citation

IM

Pagination

728-734

Informations de copyright

© The Author(s) 2019. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.

Auteurs

Timothy J Hackmann (TJ)

Department of Animal Science, University of California, Davis, CA 95616, USA.

Articles similaires

Genome, Chloroplast Phylogeny Genetic Markers Base Composition High-Throughput Nucleotide Sequencing

Selecting optimal software code descriptors-The case of Java.

Yegor Bugayenko, Zamira Kholmatova, Artem Kruglov et al.
1.00
Software Algorithms Programming Languages
Populus Soil Microbiology Soil Microbiota Fungi

Exploring blood-brain barrier passage using atomic weighted vector and machine learning.

Yoan Martínez-López, Paulina Phoobane, Yanaima Jauriga et al.
1.00
Blood-Brain Barrier Machine Learning Humans Support Vector Machine Software

Classifications MeSH