Database size positively correlates with the loss of species-level taxonomic resolution for the 16S rRNA and other prokaryotic marker genes.


Journal

PLoS computational biology
ISSN: 1553-7358
Abbreviated title: PLoS Comput Biol
Country: United States
NLM ID: 101238922

Publication information

Publication date:
05 Aug 2024
History:
received: 04 01 2024
accepted: 22 07 2024
medline: 5 8 2024
pubmed: 5 8 2024
entrez: 5 8 2024
Status: aheadofprint

Abstract

For decades, the 16S rRNA gene has been used to taxonomically classify prokaryotic species and to taxonomically profile microbial communities. However, the 16S rRNA gene has been criticized for being too conserved to differentiate between distinct species. We argue that the inability to differentiate between species is not a unique feature of the 16S rRNA gene. Rather, we observe the gradual loss of species-level resolution for other nearly universal prokaryotic marker genes as the number of gene sequences increases in reference databases. This trend was strongly correlated with how well represented a taxonomic group was in the database and indicates that, at the gene level, the boundaries between many species might be fuzzy. Through our study, we argue that any approach that relies on a single marker to distinguish bacterial taxa is fraught, even if some markers appear to be discriminative in current databases.
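
The trend described in the abstract can be illustrated with a toy sketch. This is not the authors' method: it assumes, for simplicity, that two species are indistinguishable when they share an exactly identical marker sequence, and the function name `resolved_fraction`, the species labels, and the short sequences are all made up for illustration.

```python
from collections import defaultdict


def resolved_fraction(records):
    """Fraction of species whose marker sequences are not shared with any other species.

    `records` is a list of (species, sequence) pairs. Exact sequence identity
    is used as the (simplifying, assumed) criterion for ambiguity.
    """
    seq_to_species = defaultdict(set)
    for species, seq in records:
        seq_to_species[seq].add(species)

    all_species = {species for species, _ in records}
    if not all_species:
        return 1.0

    # Any species that owns a sequence also seen in another species is ambiguous.
    ambiguous = set()
    for owners in seq_to_species.values():
        if len(owners) > 1:
            ambiguous |= owners

    return 1 - len(ambiguous) / len(all_species)


# Hypothetical toy "database", listed in the order sequences are added.
records = [
    ("A", "ACGT"),
    ("B", "ACGA"),
    ("C", "ACGG"),
    ("A", "ACGA"),  # species A now shares a sequence with species B
    ("C", "ACGT"),  # species C now shares a sequence with species A
]

# Recompute resolution as the database grows.
for size in range(1, len(records) + 1):
    print(size, round(resolved_fraction(records[:size]), 3))
```

In this toy case the resolved fraction can only stay flat or fall as sequences are added for a fixed set of species, which mirrors the qualitative claim above. A real analysis would likely use a similarity threshold rather than exact identity.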

Identifiers

pubmed: 39102435
doi: 10.1371/journal.pcbi.1012343
pii: PCOMPBIOL-D-24-00016

Publication types

Journal Article

Languages

eng

Citation subsets

IM

Pagination

e1012343

Copyright information

Copyright: This is an open access article, free of all copyright, and may be freely reproduced, distributed, transmitted, modified, built upon, or otherwise used by anyone for any lawful purpose. The work is made available under the Creative Commons CC0 public domain dedication.

Conflict of interest statement

The authors have declared that no competing interests exist.

Authors

Seth Commichaux (S)

Center for Food Safety and Nutrition, Food and Drug Administration, Laurel, Maryland, United States of America.

Tu Luan (T)

Department of Computer Science, University of Maryland, College Park, Maryland, United States of America.
Center for Bioinformatics and Computational Biology, University of Maryland, College Park, Maryland, United States of America.

Harihara Subrahmaniam Muralidharan (HS)

Department of Computer Science, University of Maryland, College Park, Maryland, United States of America.
Center for Bioinformatics and Computational Biology, University of Maryland, College Park, Maryland, United States of America.

Mihai Pop (M)

Department of Computer Science, University of Maryland, College Park, Maryland, United States of America.
Center for Bioinformatics and Computational Biology, University of Maryland, College Park, Maryland, United States of America.

MeSH classifications