Microsatellite density landscapes illustrate short tandem repeats aggregation in the complete reference human genome.


Journal

BMC Genomics
ISSN: 1471-2164
Abbreviated title: BMC Genomics
Country: England
NLM ID: 100965258

Publication information

Publication date:
14 Oct 2024
History:
received: 18 05 2024
accepted: 26 09 2024
medline: 15 10 2024
pubmed: 15 10 2024
entrez: 14 10 2024
Status: epublish

Abstract

Over the past decades, microsatellites have increasingly been recognized as biologically significant for the human genome and human health, and the complete human reference sequence T2T-CHM13 greatly facilitates a comprehensive study of short tandem repeats (STRs) in the human genome. Here, microsatellite density landscapes were built for all 24 chromosomes of T2T-CHM13, the first complete reference sequence of the human genome. These landscapes show that STRs tend to aggregate, forming a large number of STR density peaks. We classified 8,823 High Microsatellites Density Peaks (HMDPs), 35,257 Middle Microsatellites Density Peaks (MMDPs) and 199,649 Low Microsatellites Density Peaks (LMDPs) on the 24 chromosomes, and also classified the motif types of every microsatellite density peak. These density peaks are mainly composed of a single motif; AT is the most dominant motif, followed by AATGG and CCATT. In addition, 514 genomic regions in the full T2T-CHM13 genome were characterized by their microsatellite density features. These landscape maps show that microsatellites aggregate at many genomic positions, forming a large number of density peaks composed mainly of a single motif type, which indicates that local microsatellite density varies enormously along every chromosome of T2T-CHM13.
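
The idea of a density landscape with low, middle and high density peaks can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the abstract does not state the window size, the minimum repeat count or the density cut-offs, so WINDOW, MIN_REPEATS and THRESHOLDS below are hypothetical values chosen for illustration, and the regular expression only detects perfect repeats of 1-6 bp motifs.

```python
import re
from collections import Counter

# Hypothetical parameters: none of these values come from the abstract.
WINDOW = 1000                  # window size in bp (assumed)
MIN_REPEATS = 3                # minimum tandem copies to call an STR (assumed)
THRESHOLDS = (0.1, 0.3, 0.6)   # low / middle / high density cut-offs (assumed)

# Perfect tandem repeats of a 1-6 bp motif with at least MIN_REPEATS copies.
# The lazy quantifier makes the regex prefer the shortest repeating unit.
STR_RE = re.compile(r"([ACGT]{1,6}?)\1{%d,}" % (MIN_REPEATS - 1))


def find_strs(seq):
    """Return (start, end, motif) for each perfect STR found by the regex."""
    return [(m.start(), m.end(), m.group(1))
            for m in STR_RE.finditer(seq.upper())]


def window_density(seq, window=WINDOW):
    """For each fixed window, return (fraction of bases in STRs, top motif)."""
    n_windows = (len(seq) + window - 1) // window
    covered = [0] * n_windows
    motifs = [Counter() for _ in range(n_windows)]
    for start, end, motif in find_strs(seq):
        # Split an STR that spans several windows across those windows.
        for w in range(start // window, (end - 1) // window + 1):
            lo = max(start, w * window)
            hi = min(end, (w + 1) * window)
            covered[w] += hi - lo
            motifs[w][motif] += hi - lo
    result = []
    for w in range(n_windows):
        size = min(window, len(seq) - w * window)  # last window may be short
        top = motifs[w].most_common(1)[0][0] if motifs[w] else None
        result.append((covered[w] / size, top))
    return result


def classify(density, thresholds=THRESHOLDS):
    """Map a density value to 'high', 'middle', 'low' or 'none'."""
    low, mid, high = thresholds
    if density >= high:
        return "high"
    if density >= mid:
        return "middle"
    if density >= low:
        return "low"
    return "none"
```

Calling window_density on a chromosome sequence and passing each density to classify gives a coarse landscape in which every window carries a density class and its dominant motif. Reverse-complement motifs such as AATGG and CCATT are counted separately here, matching the way the abstract lists them.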

Identifiers

pubmed: 39402450
doi: 10.1186/s12864-024-10843-9
pii: 10.1186/s12864-024-10843-9

Publication types

Journal Article

Languages

eng

Citation subsets

IM

Pagination

960

Copyright information

© 2024. The Author(s).

Authors

Yun Xia (Y)

Bioinformatic Center, College of Biology, Hunan University, Lushan Road (S), Yuelu District, Changsha, 410082, China.

Douyue Li (D)

Bioinformatic Center, College of Biology, Hunan University, Lushan Road (S), Yuelu District, Changsha, 410082, China.

Tingyi Chen (T)

Bioinformatic Center, College of Biology, Hunan University, Lushan Road (S), Yuelu District, Changsha, 410082, China.

Saichao Pan (S)

Bioinformatic Center, College of Biology, Hunan University, Lushan Road (S), Yuelu District, Changsha, 410082, China.

Hanrou Huang (H)

Bioinformatic Center, College of Biology, Hunan University, Lushan Road (S), Yuelu District, Changsha, 410082, China.

Wenxiang Zhang (W)

Bioinformatic Center, College of Biology, Hunan University, Lushan Road (S), Yuelu District, Changsha, 410082, China.

Yulin Liang (Y)

Bioinformatic Center, College of Biology, Hunan University, Lushan Road (S), Yuelu District, Changsha, 410082, China.

Yongzhuo Fu (Y)

Bioinformatic Center, College of Biology, Hunan University, Lushan Road (S), Yuelu District, Changsha, 410082, China.

Zhuli Peng (Z)

Bioinformatic Center, College of Biology, Hunan University, Lushan Road (S), Yuelu District, Changsha, 410082, China.

Hongxi Zhang (H)

Bioinformatic Center, College of Biology, Hunan University, Lushan Road (S), Yuelu District, Changsha, 410082, China.

Liang Zhang (L)

Bioinformatic Center, College of Biology, Hunan University, Lushan Road (S), Yuelu District, Changsha, 410082, China.

Shan Peng (S)

Bioinformatic Center, College of Biology, Hunan University, Lushan Road (S), Yuelu District, Changsha, 410082, China.

Ruixue Shi (R)

Bioinformatic Center, College of Biology, Hunan University, Lushan Road (S), Yuelu District, Changsha, 410082, China.

Xingxin He (X)

Bioinformatic Center, College of Biology, Hunan University, Lushan Road (S), Yuelu District, Changsha, 410082, China.

Siqian Zhou (S)

Bioinformatic Center, College of Biology, Hunan University, Lushan Road (S), Yuelu District, Changsha, 410082, China.

Weili Jiao (W)

Bioinformatic Center, College of Biology, Hunan University, Lushan Road (S), Yuelu District, Changsha, 410082, China.

Xiangyan Zhao (X)

Bioinformatic Center, College of Biology, Hunan University, Lushan Road (S), Yuelu District, Changsha, 410082, China.

Xiaolong Wu (X)

Bioinformatic Center, College of Biology, Hunan University, Lushan Road (S), Yuelu District, Changsha, 410082, China.

Lan Zhou (L)

Bioinformatic Center, College of Biology, Hunan University, Lushan Road (S), Yuelu District, Changsha, 410082, China.

Jingyu Zhou (J)

Bioinformatic Center, College of Biology, Hunan University, Lushan Road (S), Yuelu District, Changsha, 410082, China.

Qingjian Ouyang (Q)

Bioinformatic Center, College of Biology, Hunan University, Lushan Road (S), Yuelu District, Changsha, 410082, China.

You Tian (Y)

Bioinformatic Center, College of Biology, Hunan University, Lushan Road (S), Yuelu District, Changsha, 410082, China.

Xiaoping Jiang (X)

Bioinformatic Center, College of Biology, Hunan University, Lushan Road (S), Yuelu District, Changsha, 410082, China.

Yi Zhou (Y)

Bioinformatic Center, College of Biology, Hunan University, Lushan Road (S), Yuelu District, Changsha, 410082, China.

Shiying Tang (S)

Bioinformatic Center, College of Biology, Hunan University, Lushan Road (S), Yuelu District, Changsha, 410082, China.

Junxiong Shen (J)

Bioinformatic Center, College of Biology, Hunan University, Lushan Road (S), Yuelu District, Changsha, 410082, China.

Kazusato Ohshima (K)

Faculty of Agriculture, Saga University, Saga, 840-8502, Japan.

Zhongyang Tan (Z)

Bioinformatic Center, College of Biology, Hunan University, Lushan Road (S), Yuelu District, Changsha, 410082, China. zhongyangtan@yeah.net.
