A catalog of small proteins from the global microbiome.
Journal
Nature communications
ISSN: 2041-1723
Abbreviated title: Nat Commun
Country: England
NLM ID: 101528555
Publication information
Publication date:
31 Aug 2024
History:
received: 15 Jan 2024
accepted: 19 Aug 2024
medline: 31 Aug 2024
pubmed: 31 Aug 2024
entrez: 30 Aug 2024
Status:
epublish
Abstract
Small open reading frames (smORFs) shorter than 100 codons are widespread and perform essential roles in microorganisms, where they encode proteins active in several cell functions, including signal pathways, stress response, and antibacterial activities. However, the ecology, distribution and role of small proteins in the global microbiome remain unknown. Here, we construct a global microbial smORFs catalog (GMSC) derived from 63,410 publicly available metagenomes across 75 distinct habitats and 87,920 high-quality isolate genomes. GMSC contains 965 million non-redundant smORFs with comprehensive annotations. We find that archaea harbor more smORFs proportionally than bacteria. We moreover provide a tool called GMSC-mapper to identify and annotate small proteins from microbial (meta)genomes. Overall, this publicly-available resource demonstrates the immense and underexplored diversity of small proteins.
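To make the length criterion in the abstract concrete, the following is a minimal sketch of filtering open reading frames by codon count. It is not the GMSC pipeline or GMSC-mapper, whose methods are not described in this record. The function name `find_smorfs`, the restriction to the forward strand, the use of ATG as the only start codon, and the choice not to count the stop codon are all assumptions made for illustration.

```python
# Illustrative only: a naive forward-strand ORF scan, not the method used to build GMSC.
STOP_CODONS = {"TAA", "TAG", "TGA"}
MAX_CODONS = 100  # the abstract describes smORFs as shorter than 100 codons


def find_smorfs(dna, max_codons=MAX_CODONS):
    """Return sorted (start, end, codon_count) tuples for ORFs below max_codons.

    Assumptions (not from the source): an ORF starts at the first ATG in a
    frame, ends at the next in-frame stop codon, and the stop codon is not
    included in codon_count. `end` is the index just past the stop codon.
    """
    dna = dna.upper()
    hits = []
    for frame in range(3):
        start = None  # index of the ATG that opened the current ORF, if any
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                n_codons = (i - start) // 3
                if n_codons < max_codons:
                    hits.append((start, i + 3, n_codons))
                start = None  # close the ORF and look for the next start
    return sorted(hits)


if __name__ == "__main__":
    # A toy sequence with one four-codon ORF in the third reading frame.
    print(find_smorfs("CC" + "ATG" + "AAA" * 3 + "TAA" + "GG"))
```

A real gene predictor would also scan the reverse strand, consider alternative start codons, and score coding potential. This sketch shows only the length cutoff.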
Identifiers
pubmed: 39214983
doi: 10.1038/s41467-024-51894-6
pii: 10.1038/s41467-024-51894-6
pmc: PMC11364881
Chemical substances
Bacterial Proteins
0
Publication types
Journal Article
Languages
eng
Citation subsets
IM
Pagination
7563
Grants
Agency: National Natural Science Foundation of China (National Science Foundation of China)
ID: 61932008
Copyright information
© 2024. The Author(s).