A genome-phenome association study in native microbiomes identifies a mechanism for cytosine modification in DNA and RNA.
5-hydroxymethylcytosine carbamoyltransferase
DNA modification
GWAS
MetaGPA
RNA modification
functional metagenomics
genetics
genomics
infectious disease
microbiology
microbiome
Journal
eLife
ISSN: 2050-084X
Abbreviated title: Elife
Country: England
NLM ID: 101579614
Publication information
Publication date:
08 11 2021
History:
received: 04 05 2021
accepted: 05 11 2021
pubmed: 9 11 2021
medline: 27 1 2022
entrez: 8 11 2021
Status:
epublish
Abstract
Shotgun metagenomic sequencing is a powerful approach for studying microbiomes in an unbiased manner and is of increasing relevance for identifying novel enzymatic functions. However, the potential of metagenomics to relate microbiome composition to function has thus far been underutilized. Here, we introduce the Metagenomics Genome-Phenome Association (MetaGPA) study framework, which links genetic information in metagenomes with a dedicated functional phenotype. We applied MetaGPA to identify enzymes associated with cytosine modifications in environmental samples. From the 2365 genes that met our significance criteria, we confirmed known pathways for cytosine modifications and proposed novel cytosine-modifying mechanisms. Specifically, we identified and characterized a novel nucleic acid-modifying enzyme, 5-hydroxymethylcytosine carbamoyltransferase, that catalyzes the formation of a previously unknown cytosine modification, 5-carbamoyloxymethylcytosine, in DNA and RNA. Our work introduces MetaGPA as a novel and versatile tool for advancing functional metagenomics.
Other abstracts
Type: plain-language-summary
(eng)
Many industrial processes, such as starch processing and oil refinement, use chemicals that harm the environment. These can often be replaced with more sustainable biological processes that are powered by proteins called enzymes. Enzymes are micro-factories that speed up biochemical reactions in most living things. Communities of microorganisms (also known as microbiomes) are a rich but often untapped resource for discovering enzymes that can be harnessed for industrial purposes.
To gain a better picture of the microbes present within a population, researchers often extract and sequence the genetic material of all microorganisms in an environmental sample, also known as the metagenome. While current methods for analyzing the metagenome are good at identifying new species, they often provide limited information about each microorganism's functional role within the community. This makes it difficult to find new enzymes that may be useful for industry.
Here, Yang, Lin et al. have developed a new technique called Metagenomics Genome-Phenome Association, or MetaGPA for short. The method works in a similar way to genome-wide association studies (GWAS), which are used to identify genes involved in human disease. However, instead of disease-associated genes in humans, MetaGPA finds microbial genes that are associated with a biological process useful for biotechnology. Like GWAS, the new approach created by Yang, Lin et al. compares two groups: the first contains microorganisms that carry out a specific process, and the second contains all organisms in the microbiome. The metagenome of each group is extracted, and a computational pipeline is then applied to identify genes, including those coding for enzymes, that are found more often in the group performing the desired task.
To test the technique, Yang, Lin et al. used MetaGPA to find new enzymes involved in DNA modification. Microbiome samples were collected from coastal water and sewage, and the computational pipeline was applied to discover genes that are associated with this process. Further analysis revealed that one of the identified genes codes for an enzyme that introduces a previously unknown change to DNA. MetaGPA could be applied to other processes and microbiomes and, if successful, may help researchers to identify more diverse enzymes than are currently available. This could scale up the discovery of new enzymes that can be used to power industrial reactions.
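The two-group comparison described in the summary can be illustrated with a small enrichment test. The sketch below is not the authors' pipeline: it assumes that the unit being counted is a sequence (for example, an assembled contig) that either carries a given gene family or does not, and it uses a one-sided Fisher's exact test as one plausible way to ask whether a family is found more often in the selected group than in the control group. The function names, the family labels, and the counts are hypothetical.

```python
from math import comb


def enrichment_p_value(selected_with, selected_total, control_with, control_total):
    """One-sided Fisher's exact test for over-representation.

    Returns the probability of seeing at least `selected_with` carriers in the
    selected group if carriers were spread at random across both groups
    (hypergeometric null model). This is an illustrative choice of test,
    not necessarily the one used by MetaGPA.
    """
    total = selected_total + control_total
    carriers = selected_with + control_with
    denominator = comb(total, selected_total)
    upper = min(carriers, selected_total)
    numerator = sum(
        comb(carriers, k) * comb(total - carriers, selected_total - k)
        for k in range(selected_with, upper + 1)
    )
    return numerator / denominator


def associated_families(counts, selected_total, control_total, alpha=0.05):
    """Return (family, p-value) pairs with p < alpha, most significant first.

    `counts` maps a family name to (carriers in selected, carriers in control).
    """
    hits = []
    for family, (sel, ctl) in counts.items():
        p = enrichment_p_value(sel, selected_total, ctl, control_total)
        if p < alpha:
            hits.append((family, p))
    return sorted(hits, key=lambda hit: hit[1])


# Hypothetical toy counts: 4 sequences in each group.
toy_counts = {"family_A": (4, 0), "family_B": (2, 2)}
hits = associated_families(toy_counts, selected_total=4, control_total=4)
```

With these toy counts only `family_A` passes the threshold, because all of its carriers fall in the selected group. A real analysis over thousands of gene families would also need a multiple-testing correction, which this sketch omits.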
Identifiers
pubmed: 34747693
doi: 10.7554/eLife.70021
pii: 70021
pmc: PMC8670742
Chemical substances
DNA, Bacterial
0
RNA, Bacterial
0
Cytosine
8J337D1HZY
Publication types
Journal Article
Research Support, Non-U.S. Gov't
Languages
eng
Citation subsets
IM
Copyright information
© 2021, Yang et al.
Conflict of interest statement
WY, ND, RV, PW, YL, IC, IS, and LE are employees of New England Biolabs Inc, a manufacturer of restriction enzymes and molecular reagents. YL and WJ were employees of New England Biolabs Inc.