Expression regulation of genes is linked to their CpG density distributions around transcription start sites.
Journal
Life science alliance
ISSN: 2575-1077
Abbreviated title: Life Sci Alliance
Country: United States
NLM ID: 101728869
Publication information
Publication date:
09 2022
History:
received: 2021-11-17
revised: 2022-05-07
accepted: 2022-05-09
entrez: 2022-05-17
pubmed: 2022-05-18
medline: 2022-05-21
Status:
epublish
Abstract
The CpG dinucleotide and its methylation play vital roles in gene regulation. Previous studies have divided genes into several categories based on the CpG density around transcription start sites and found that housekeeping genes tend to have high CpG density, whereas tissue-specific genes are generally characterized by low CpG density. In this study, we investigated how the CpG density distribution of a gene affects its transcription and regulation pattern. Using a semi-supervised neural network that we designed, which takes data augmentation into account, we divided human genes into three clusters based on the CpG density distribution around the transcription start site; genes within each cluster share a similar CpG density distribution. Beyond sequence properties, these clusters exhibit distinctly different structural features, regulatory mechanisms, correlation patterns between expression level and CpG/TpG density, and variations in expression and epigenetic marks during tumorigenesis. For instance, the activation of cluster 3 genes relies more on 3D genome reorganization than that of cluster 1 and 2 genes, whereas cluster 2 genes show the strongest correlation between gene expression and H3K27me3. Genes exhibiting uncoupled correlation between gene regulation and histone modifications are mainly in cluster 3. These results emphasize that the usage of epigenetic marks in gene regulation is partially rooted in sequence properties of genes, such as their CpG density distribution, and explain to some extent why the relation between epigenetic marks and gene expression is controversial.
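To make the notion of a "CpG density distribution around the transcription start site" concrete, the sketch below computes a sliding-window CpG density profile over a sequence. It is an illustration only: the abstract does not state how the authors defined density, so the window size, step, and normalization (CG dinucleotides divided by the number of dinucleotide positions in the window) are assumptions, and the function name `cpg_density_profile` is hypothetical.

```python
def cpg_density_profile(seq, window=100, step=50):
    """Return the CpG density of each sliding window along `seq`.

    Density here is an assumed definition: the number of "CG"
    dinucleotides in the window divided by the number of dinucleotide
    positions (window - 1). The abstract does not specify the
    definition used in the study.
    """
    if window < 2:
        raise ValueError("window must cover at least one dinucleotide")
    seq = seq.upper()
    profile = []
    # Slide a fixed-size window along the sequence; a trailing
    # fragment shorter than `window` is ignored.
    for start in range(0, len(seq) - window + 1, step):
        win = seq[start:start + window]
        cg_count = sum(1 for i in range(len(win) - 1) if win[i:i + 2] == "CG")
        profile.append(cg_count / (window - 1))
    return profile


# Toy example: a CpG-rich stretch followed by a CpG-free stretch.
toy_sequence = "CGCGAAAA"
toy_profile = cpg_density_profile(toy_sequence, window=4, step=2)
```

In practice the input would be the genomic sequence flanking an annotated transcription start site, and profiles of this kind for many genes would be the input to a clustering step such as the one the abstract describes.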
Identifiers
pubmed: 35580989
pii: 5/9/e202101302
doi: 10.26508/lsa.202101302
pmc: PMC9113945
Publication types
Journal Article
Research Support, Non-U.S. Gov't
Languages
eng
Citation subsets
IM
Copyright information
© 2022 Tian et al.