Exploring microbial functional biodiversity at the protein family level: From metagenomic sequence reads to annotated protein clusters.

biodiversity cluster annotation metagenomes metatranscriptomes microbial dark matter protein clustering protein families

Journal

Frontiers in bioinformatics
ISSN: 2673-7647
Abbreviated title: Front Bioinform
Country: Switzerland
NLM ID: 9918227263306676

Publication information

Publication date:
2023
History:
received: 7 Feb 2023
accepted: 21 Feb 2023
entrez: 24 Mar 2023
pubmed: 25 Mar 2023
medline: 25 Mar 2023
Status: epublish

Abstract

Metagenomics has enabled access to the genetic repertoire of natural microbial communities, and metagenome shotgun sequencing has become the method of choice for studying and classifying microorganisms from various environments. To this end, several methods have been developed to process and analyze sequence data, from raw reads to end products such as predicted protein sequences or families. In this article, we provide a thorough review that simplifies these processes and discuss the alternative methodologies that can be followed to explore biodiversity at the protein family level. We provide details on analysis tools and comment on their scalability, advantages, and disadvantages. Finally, we report the available data repositories and recommend various approaches for protein family annotation related to phylogenetic distribution, structure prediction, and metadata enrichment.

Identifiers

pubmed: 36959975
doi: 10.3389/fbinf.2023.1157956
pii: 1157956
pmc: PMC10029925

Publication types

Journal Article; Review

Languages

eng

Pagination

1157956

Copyright information

Copyright © 2023 Baltoumas, Karatzas, Paez-Espino, Venetsianou, Aplakidou, Oulas, Finn, Ovchinnikov, Pafilis, Kyrpides and Pavlopoulos.

Conflict of interest statement

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Authors

Fotis A Baltoumas (FA)

Institute for Fundamental Biomedical Research, BSRC "Alexander Fleming", Vari, Greece.

Evangelos Karatzas (E)

Institute for Fundamental Biomedical Research, BSRC "Alexander Fleming", Vari, Greece.

David Paez-Espino (D)

Lawrence Berkeley National Laboratory, DOE Joint Genome Institute, Berkeley, CA, United States.

Nefeli K Venetsianou (NK)

Institute for Fundamental Biomedical Research, BSRC "Alexander Fleming", Vari, Greece.

Eleni Aplakidou (E)

Institute for Fundamental Biomedical Research, BSRC "Alexander Fleming", Vari, Greece.

Anastasis Oulas (A)

The Cyprus Institute of Neurology and Genetics, Nicosia, Cyprus.

Robert D Finn (RD)

European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Cambridge, United Kingdom.

Sergey Ovchinnikov (S)

John Harvard Distinguished Science Fellowship Program, Harvard University, Cambridge, MA, United States.

Evangelos Pafilis (E)

Institute of Marine Biology, Biotechnology and Aquaculture (IMBBC), Hellenic Centre for Marine Research (HCMR), Heraklion, Greece.

Nikos C Kyrpides (NC)

Lawrence Berkeley National Laboratory, DOE Joint Genome Institute, Berkeley, CA, United States.

Georgios A Pavlopoulos (GA)

Institute for Fundamental Biomedical Research, BSRC "Alexander Fleming", Vari, Greece.
Center of New Biotechnologies and Precision Medicine, Department of Medicine, School of Health Sciences, National and Kapodistrian University of Athens, Athens, Greece.
Hellenic Army Academy, Vari, Greece.

MeSH classifications