Short tandem repeat mutations regulate gene expression in colorectal cancer.
Journal
Scientific reports
ISSN: 2045-2322
Abbreviated title: Sci Rep
Country: England
NLM ID: 101563288
Publication information
Publication date:
09 Feb 2024
History:
received: 15 Dec 2023
accepted: 04 Feb 2024
medline: 10 Feb 2024
pubmed: 10 Feb 2024
entrez: 9 Feb 2024
Status:
epublish
Abstract
Short tandem repeat (STR) mutations are prevalent in colorectal cancer (CRC), especially in tumours with the microsatellite instability (MSI) phenotype. While STR length variations are known to regulate gene expression under physiological conditions, the functional impact of STR mutations in CRC remains unclear. Here, we integrate STR mutation data with clinical information and gene expression data to study the gene regulatory effects of STR mutations in CRC. We confirm that STR mutability in CRC highly depends on the MSI status, repeat unit size, and repeat length. Furthermore, we present a set of 1244 putative expression STRs (eSTRs) for which the STR length is associated with gene expression levels in CRC tumours. The length of 73 eSTRs is associated with expression levels of cancer-related genes, nine of which are CRC-specific genes. We show that linear models describing eSTR-gene expression relationships allow for predictions of gene expression changes in response to eSTR mutations. Moreover, we found an increased mutability of eSTRs in MSI tumours. Our evidence of gene regulatory roles for eSTRs in CRC highlights a mostly overlooked way through which tumours may modulate their phenotypes. Future extensions of these findings could uncover new STR-based targets in the treatment of cancer.
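The abstract states that linear models of eSTR-gene expression relationships allow predictions of expression changes after an eSTR mutation. The following is a minimal sketch of that idea, not the authors' pipeline: it fits an ordinary least-squares line with NumPy to synthetic data and uses the slope to estimate the change for a given length change. The function names, the synthetic values, and the use of a single STR length per tumour are all assumptions made for illustration.

```python
import numpy as np


def fit_estr_model(str_lengths, expression):
    """Fit expression = intercept + slope * STR length by least squares.

    Hypothetical helper: both inputs are 1-D sequences with one value per tumour.
    """
    x = np.asarray(str_lengths, dtype=float)
    y = np.asarray(expression, dtype=float)
    # polyfit returns coefficients from highest degree to lowest
    slope, intercept = np.polyfit(x, y, deg=1)
    return slope, intercept


def predict_expression_change(slope, old_length, new_length):
    """Predicted expression change when the STR length changes."""
    return slope * (new_length - old_length)


# Synthetic, noise-free data (assumption): expression rises 0.5 units per repeat unit
lengths = np.array([10, 11, 12, 13, 14, 15])
expr = 2.0 + 0.5 * lengths

slope, intercept = fit_estr_model(lengths, expr)

# Estimated effect of a two-unit contraction (12 -> 10 repeat units)
delta = predict_expression_change(slope, old_length=12, new_length=10)
print(f"slope={slope:.3f}, intercept={intercept:.3f}, predicted change={delta:.3f}")
```

In a real analysis the model would likely include covariates and multiple-testing control when screening many STR-gene pairs; those steps are omitted here.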
Identifiers
pubmed: 38336885
doi: 10.1038/s41598-024-53739-0
pii: 10.1038/s41598-024-53739-0
Publication types
Journal Article
Languages
eng
Citation subsets
IM
Pagination
3331
Grants
Agency: Schweizerischer Nationalfonds zur Förderung der Wissenschaftlichen Forschung
ID: CRSII5_193832
Agency: Schweizerischer Nationalfonds zur Förderung der Wissenschaftlichen Forschung
ID: IZSEZ0_203264
Agency: Horizon 2020
ID: 823886
Copyright information
© 2024. The Author(s).