Microsatellite density landscapes illustrate short tandem repeats aggregation in the complete reference human genome.
Human genome
Landscape
Microsatellite density
STRs aggregation
Journal
BMC genomics
ISSN: 1471-2164
Abbreviated title: BMC Genomics
Country: England
NLM ID: 100965258
Publication information
Publication date:
14 Oct 2024
History:
received: 18 May 2024
accepted: 26 Sep 2024
medline: 15 Oct 2024
pubmed: 15 Oct 2024
entrez: 14 Oct 2024
Status:
epublish
Abstract
BACKGROUND: Over the past decades, microsatellites have increasingly been recognized as biologically significant for the human genome and human health. The complete assembled reference sequence of the human genome, T2T-CHM13, greatly facilitates a comprehensive study of short tandem repeats (STRs) in the human genome. RESULTS: We built microsatellite density landscapes of all 24 chromosomes for T2T-CHM13, the first complete reference sequence of the human genome. These landscapes show that STRs tend to aggregate in characteristic ways, forming a large number of STR density peaks. We classified 8,823 High Microsatellite Density Peaks (HMDPs), 35,257 Middle Microsatellite Density Peaks (MMDPs) and 199,649 Low Microsatellite Density Peaks (LMDPs) on the 24 chromosomes, and also classified the motif types of every microsatellite density peak. These STR density peaks are mainly composed of a single motif; AT is the most dominant motif, followed by AATGG and CCATT. In addition, 514 genomic regions in the full T2T-CHM13 genome were characterized by their microsatellite density features. CONCLUSIONS: These landscape maps show that microsatellites aggregate at many genomic positions in the complete reference genome to form a large number of microsatellite density peaks, each composed mainly of a single motif type, indicating that local microsatellite density varies enormously along every chromosome of T2T-CHM13.
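To make the idea of a density landscape concrete, the following minimal sketch computes the fraction of each fixed-size window covered by perfect tandem repeats. It is an illustration only, not the authors' pipeline: the regular expression, the minimum copy number, the minimum repeat length, the window size, and the function names `find_strs` and `window_density` are all assumptions chosen for the example, and only perfect repeats with 1 to 6 bp motifs are detected.

```python
import re

# Assumed definition for this sketch: a perfect repeat of a 1-6 bp motif
# occurring at least MIN_COPIES times in a row. Not taken from the paper.
MIN_COPIES = 3
STR_PATTERN = re.compile(r"([ACGT]{1,6}?)\1{%d,}" % (MIN_COPIES - 1))


def find_strs(seq, min_len=12):
    """Yield (start, end, motif) for perfect repeats spanning >= min_len bases."""
    for match in STR_PATTERN.finditer(seq.upper()):
        if match.end() - match.start() >= min_len:
            yield match.start(), match.end(), match.group(1)


def window_density(seq, window=1000):
    """Return, per non-overlapping window, the fraction of bases inside an STR."""
    n_windows = (len(seq) + window - 1) // window
    covered = [0] * n_windows
    for start, end, _motif in find_strs(seq):
        for pos in range(start, end):
            covered[pos // window] += 1
    # The last window may be shorter than `window`, so normalise by its true size.
    return [
        count / min(window, len(seq) - i * window)
        for i, count in enumerate(covered)
    ]


if __name__ == "__main__":
    # Toy sequence: 20 bases of (AT)n followed by 20 non-repetitive bases.
    toy = "AT" * 10 + "GATTACAGCTGATTACAGCT"
    print(list(find_strs(toy)))
    print(window_density(toy, window=20))
```

Windows with a high covered fraction would correspond to density peaks in such a landscape; any cut-offs separating high, middle and low peaks would have to come from the full paper, since the abstract does not state them.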
Identifiers
pubmed: 39402450
doi: 10.1186/s12864-024-10843-9
pii: 10.1186/s12864-024-10843-9
Publication types
Journal Article
Languages
eng
Citation subsets
IM
Pagination
960
Copyright information
© 2024. The Author(s).