Tandem repeats ubiquitously flank and contribute to translation initiation sites.

Keywords: Genome-scale; Homology; TIS selection; Tandem repeat; Translation initiation site

Journal

BMC genomic data
ISSN: 2730-6844
Abbreviated title: BMC Genom Data
Country: England
NLM ID: 101775394

Publication information

Publication date:
27 07 2022
History:
received: 04 04 2022
accepted: 18 07 2022
entrez: 27 7 2022
pubmed: 28 7 2022
medline: 30 7 2022
Status: epublish

Abstract

While the evolutionary divergence of cis-regulatory sequences impacts translation initiation sites (TISs), the role of tandem repeats (TRs) in TIS selection remains largely unknown. Here, we employed the TIS homology concept to study a possible link between TISs and TRs of all core lengths and repeat numbers. Human, as the reference, and 83 other species were selected, and data were extracted for all protein-coding genes (n = 1,611,368) and transcripts (n = 2,730,515) annotated for those species in Ensembl 102. Following TIS identification, two different weighting vectors were employed to assign TIS homology, and the co-occurrence pattern of TISs with upstream flanking TRs was studied in the selected species. The results were assessed by 10-fold cross-validation. On average, every TIS was flanked by 1.19 TRs of various categories within its 120 bp upstream sequence, per species. We detected statistically significant enrichment of non-homologous human TISs co-occurring with human-specific TRs. In contrast, homologous human TISs co-occurred significantly with non-human-specific TRs. In total, 2991 human genes had at least one transcript whose TIS was flanked by a human-specific TR. Text mining of a number of the identified genes, such as CACNA1A, EIF5AL1, FOXK1, GABRB2, MYH2, SLC6A8, and TTN, indicated predominant expression and function in the human brain and/or skeletal muscle. We conclude that TRs ubiquitously flank and contribute to TIS selection at the trans-species level. Future functional analyses, such as a combination of genome editing strategies and in vitro protein synthesis, may be employed to further investigate the impact of TRs on TIS selection.
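
To make the 120 bp upstream window concrete, the following is a minimal Python sketch of scanning the sequence immediately upstream of a TIS for perfect tandem repeats. It is an illustration only: the abstract does not state how TRs were detected, so the regex approach, the function names (`find_tandem_repeats`, `upstream_repeats`), and the thresholds (`max_unit`, `min_copies`) are all assumptions made for this example, not the authors' pipeline.

```python
import re

UPSTREAM_WINDOW = 120  # bp upstream of the TIS, the window size given in the abstract


def find_tandem_repeats(seq, min_unit=1, max_unit=6, min_copies=3):
    """Return (start, unit, copies) for perfect tandem repeats in seq.

    Hypothetical helper: thresholds are illustrative assumptions.
    """
    seq = seq.upper()
    hits = []
    for unit_len in range(min_unit, max_unit + 1):
        # A unit of unit_len bases followed by at least (min_copies - 1) exact copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_copies - 1))
        for m in pattern.finditer(seq):
            unit = m.group(1)
            # Skip units that are themselves built from a shorter unit (e.g. "AA"),
            # since the shorter unit length already reports that repeat.
            if any(
                unit == unit[:k] * (unit_len // k)
                for k in range(1, unit_len)
                if unit_len % k == 0
            ):
                continue
            hits.append((m.start(), unit, len(m.group(0)) // unit_len))
    return hits


def upstream_repeats(transcript_seq, tis_index, window=UPSTREAM_WINDOW):
    """Find tandem repeats in the window immediately upstream of the TIS.

    tis_index is the 0-based position of the first base of the start codon.
    Returned start positions are relative to transcript_seq.
    """
    start = max(0, tis_index - window)
    upstream = transcript_seq[start:tis_index]
    return [
        (start + pos, unit, copies)
        for pos, unit, copies in find_tandem_repeats(upstream)
    ]


if __name__ == "__main__":
    # Toy sequence: a (CAG)x4 repeat sits just upstream of the ATG at index 16.
    toy = "TT" + "CAG" * 4 + "TT" + "ATGAAA"
    print(upstream_repeats(toy, tis_index=16))
```

Counting the hits returned per TIS and averaging over transcripts would give a figure comparable in kind to the per-species average reported above, although the actual value depends entirely on the detection criteria chosen.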


Identifiers

pubmed: 35896982
doi: 10.1186/s12863-022-01075-5
pii: 10.1186/s12863-022-01075-5
pmc: PMC9331589

Databases

figshare
10.6084/m9.figshare.15405267

Publication types

Journal Article

Languages

eng

Citation subsets

IM

Pagination

59

Copyright information

© 2022. The Author(s).

Authors

Ali M A Maddi (AMA)

Laboratory of Complex Biological systems and Bioinformatics (CBB), Department of Bioinformatics, Institute of Biochemistry and Biophysics (IBB), University of Tehran, Tehran, Tehran, 1417614411, Iran.

Kaveh Kavousi (K)

Laboratory of Complex Biological systems and Bioinformatics (CBB), Department of Bioinformatics, Institute of Biochemistry and Biophysics (IBB), University of Tehran, Tehran, Tehran, 1417614411, Iran. kkavousi@ut.ac.ir.

Masoud Arabfard (M)

Chemical Injuries Research Center, Systems Biology and Poisonings Institute, Baqiyatallah University of Medical Sciences, Tehran, Tehran, 1435916471, Iran.

Hamid Ohadi (H)

School of Physics and Astronomy, University of St. Andrews, St. Andrews, KY16 9SS, UK.

Mina Ohadi (M)

Iranian Research Center on Aging, University of Social Welfare and Rehabilitation Sciences, Tehran, Tehran, 1985713871, Iran. mi.ohadi@uswr.ac.ir.
