Comparative analysis of morabine grasshopper genomes reveals highly abundant transposable elements and rapidly proliferating satellite DNA repeats.
Journal
BMC biology
ISSN: 1741-7007
Abbreviated title: BMC Biol
Country: England
NLM ID: 101190720
Publication information
Publication date: 21 12 2020
History:
received: 22 08 2020
accepted: 10 11 2020
entrez: 22 12 2020
pubmed: 23 12 2020
medline: 3 7 2021
Status: epublish
Abstract
Abstract sections
BACKGROUND
Repetitive DNA sequences, including transposable elements (TEs) and tandemly repeated satellite DNAs (satDNAs), collectively called the "repeatome", are found in high proportion in organisms across the Tree of Life. Grasshoppers have large genomes, averaging 9 Gb, that contain a high proportion of repetitive DNA, which has hampered progress in assembling reference genomes. Here we combined linked-read genomics with transcriptomics to assemble, characterize, and compare the structure of repetitive DNA sequences in four chromosomal races of the morabine grasshopper Vandiemenella viatica species complex and to determine their contribution to genome evolution.
RESULTS
We obtained linked-read genome assemblies of 2.73-3.27 Gb from estimated genome sizes of 4.26-5.07 Gb DNA per haploid genome of the four chromosomal races of V. viatica. These constitute the third largest insect genomes assembled so far. Combining complementary annotation tools and manual curation, we found a large diversity of TEs and satDNAs, constituting 66 to 75% of each genome assembly. A comparison of sequence divergence within the TE classes revealed massive accumulation of recent TEs in all four races (314-463 Mb per assembly), indicating that their large genome sizes are likely due to similar rates of TE accumulation. Transcriptome sequencing showed more biased TE expression in reproductive tissues than in somatic tissues, implying permissive transcription in gametogenesis. Out of 129 satDNA families, 102 were shared among the four chromosomal races, which likely represent a diversity of satDNA families in the ancestor of the V. viatica chromosomal races. Notably, 50 of these shared satDNA families underwent differential proliferation since the recent diversification of the V. viatica species complex.
CONCLUSION
This in-depth annotation of the repeatome in morabine grasshoppers provided new insights into the genome evolution of Orthoptera. Our TE analysis revealed a massive recent accumulation of TEs equivalent to the size of entire Drosophila genomes, which likely explains the large genome sizes in grasshoppers. Despite an overall high similarity of the TE and satDNA diversity between races, the patterns of TE expression and satDNA proliferation suggest rapid evolution of grasshopper genomes on recent timescales.
Identifiers
pubmed: 33349252
doi: 10.1186/s12915-020-00925-x
pii: 10.1186/s12915-020-00925-x
pmc: PMC7754599
Chemical substances
DNA Transposable Elements
0
DNA, Satellite
0
Publication types
Journal Article
Research Support, Non-U.S. Gov't
Languages
eng
Citation subsets
IM
Pagination
199
Grants
Agency: Swedish Research Council Vetenskapsrådet
ID: 2014-6325
Country: International
Agency: Marie Sklodowska Curie Actions, Co-fund Project INCA
ID: 600398
Country: International
Agency: Swedish Research Council Formas
ID: 2017-01597
Country: International
Agency: Sven och Lilly Lawskis fund
ID: N2018-0045
Country: International