The pan-tandem repeat map highlights multiallelic variants underlying gene expression and agronomic traits in rice.
Journal
Nature communications
ISSN: 2041-1723
Abbreviated title: Nat Commun
Country: England
NLM ID: 101528555
Publication information
Publication date:
24 Aug 2024
History:
received: 23 Jun 2023
accepted: 20 Aug 2024
medline: 26 Aug 2024
pubmed: 26 Aug 2024
entrez: 24 Aug 2024
Status:
epublish
Abstract
Tandem repeats (TRs) are genomic regions in which the number of tandemly repeated units varies, and they are often multiallelic. Their characteristics and contributions to gene expression and quantitative traits in rice are largely unknown. Here, we survey rice TR variations based on 231 genome assemblies and the rice pan-genome graph. We identify 227,391 multiallelic TR loci, including 54,416 TR variations that are absent from the Nipponbare reference genome. Only one-third of TR variations show strong linkage with nearby bi-allelic variants (SNPs, Indels and PAVs). Using 193 panicle and 202 leaf transcriptomic datasets, we reveal that 485 and 511 TRs, respectively, act as QTLs for nearby gene expression independently of bi-allelic variations. Using plant height and grain width as examples, we identify and validate TR contributions to variation in rice agronomic traits. These findings enhance our understanding of the functions of multiallelic variants and facilitate rice molecular breeding.
Identifiers
pubmed: 39181885
doi: 10.1038/s41467-024-51854-0
pii: 10.1038/s41467-024-51854-0
Publication types
Journal Article
Languages
eng
Citation subsets
IM
Pagination
7291
Grants
Agency: Youth Innovation Promotion Association of the Chinese Academy of Sciences (Youth Innovation Promotion Association CAS)
ID: Y2023QC36
Copyright information
© 2024. The Author(s).