DAIRYdb: a manually curated reference database for improved taxonomy annotation of 16S rRNA gene sequences from dairy products.
16S
Accuracy
Cheese
Dairy
Database
Microbiome
Milk
OTU classification
Starter
Taxonomy annotation
Teat
Whey
Journal
BMC Genomics
ISSN: 1471-2164
Abbreviated title: BMC Genomics
Country: England
NLM ID: 100965258
Publication information
Publication date:
08 Jul 2019
History:
received: 07 Aug 2018
accepted: 18 Jun 2019
entrez: 10 Jul 2019
pubmed: 10 Jul 2019
medline: 18 Dec 2019
Status:
epublish
Abstract
Assigning reads to taxonomic units is a key step in microbiome analysis pipelines. Accurate taxonomy annotation of 16S reads, particularly at species rank, remains challenging because reads are short and classification databases are curated differently. Because the species encountered in dairy products are phylogenetically closely related, accurate species annotation is crucial to achieve sufficient phylogenetic resolution for downstream ecological studies and food diagnostics. Curated databases dedicated to the environment of interest are expected to improve the accuracy and resolution of taxonomy annotation. We provide a manually curated database of 10,290 full-length 16S rRNA gene sequences from prokaryotes, tailored for the analysis of dairy products (https://github.com/marcomeola/DAIRYdb). The performance of the DAIRYdb was compared with that of the universal databases Silva, LTP, RDP and Greengenes. The DAIRYdb significantly outperformed all other databases, independently of the classification algorithm, by enabling more accurate taxonomy annotation down to the species rank. The DAIRYdb automatically and accurately annotates over 90% of sequences from either single or paired hypervariable regions. The manually curated DAIRYdb strongly improves taxonomic annotation accuracy for microbiome studies in dairy environments. It is a practical solution that enables automation of this key step, thus facilitating the routine application of NGS microbiome analyses for microbial ecology studies and diagnostics in dairy products.
Identifiers
pubmed: 31286860
doi: 10.1186/s12864-019-5914-8
pii: 10.1186/s12864-019-5914-8
pmc: PMC6615214
Chemical substances
RNA, Ribosomal, 16S
0
Publication types
Journal Article
Languages
eng
Citation subsets
IM
Pagination
560