ChimeraTE: a pipeline to detect chimeric transcripts derived from genes and transposable elements.
Journal
Nucleic acids research
ISSN: 1362-4962
Abbreviated title: Nucleic Acids Res
Country: England
NLM ID: 0411011
Publication information
Publication date:
13 Oct 2023
History:
accepted: 09 Aug 2023
revised: 25 Jul 2023
received: 03 Nov 2022
pubmed: 24 Aug 2023
medline: 24 Aug 2023
entrez: 24 Aug 2023
Status:
ppublish
Abstract
Transposable elements (TEs) produce structural variants and are considered an important source of genetic diversity. Notably, TE-gene fusion transcripts, i.e. chimeric transcripts, have been associated with adaptation in several species. However, the identification of these chimeras is hindered by the lack of detection tools that work at a transcriptome-wide scale and by the reliance on a reference genome, even though different individuals/cells/strains have different TE insertions. Therefore, we developed ChimeraTE, a pipeline that uses paired-end RNA-seq reads to identify chimeric transcripts through two different modes. Mode 1 is the reference-guided approach that employs canonical genome alignment, and Mode 2 identifies chimeras derived from fixed or insertionally polymorphic TEs without any reference genome. We have validated both modes using RNA-seq data from four Drosophila melanogaster wild-type strains. We found that ∼1.12% of all genes generate chimeric transcripts, most of them from TE-exonized sequences. Approximately 23% of all detected chimeras were absent from the reference genome, indicating that the TEs in these chimeric transcripts may be recent, polymorphic insertions. ChimeraTE is the first pipeline able to automatically uncover chimeric transcripts without a reference genome; its two modes can be used to investigate the contribution of TEs to transcriptome plasticity.
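As an illustration of the general idea of using paired-end reads to support a TE-gene chimera, the following minimal Python sketch flags read pairs in which one mate overlaps an annotated exon and the other mate overlaps an annotated TE. It is not ChimeraTE's implementation: the criterion, the `Interval` and `Mate` types, the `chimeric_pairs` function and the toy coordinates are all assumptions made for this example, and a real analysis would read alignments and annotations from files.

```python
from typing import NamedTuple


class Interval(NamedTuple):
    """Annotated feature (exon or TE); coordinates are 0-based, end-exclusive."""
    chrom: str
    start: int
    end: int
    name: str


class Mate(NamedTuple):
    """Aligned position of one mate of a read pair (hypothetical toy type)."""
    chrom: str
    start: int
    end: int


def overlapping(mate, features):
    """Return the names of features sharing at least one base with the mate."""
    return {
        f.name
        for f in features
        if f.chrom == mate.chrom and f.start < mate.end and mate.start < f.end
    }


def chimeric_pairs(pairs, exons, tes):
    """Return sorted (pair_id, gene, te) tuples for pairs with one mate on an
    exon and the other mate on a TE (assumed criterion, for illustration)."""
    hits = set()
    for pair_id, (m1, m2) in pairs.items():
        # Test both orientations: either mate may be the exon-side mate.
        for exon_mate, te_mate in ((m1, m2), (m2, m1)):
            for gene in overlapping(exon_mate, exons):
                for te in overlapping(te_mate, tes):
                    hits.add((pair_id, gene, te))
    return sorted(hits)


# Toy data: one exon, one TE insertion, three read pairs.
exons = [Interval("2L", 100, 200, "geneA")]
tes = [Interval("2L", 400, 600, "roo")]
pairs = {
    "r1": (Mate("2L", 150, 250), Mate("2L", 450, 550)),  # exon + TE
    "r2": (Mate("2L", 120, 180), Mate("2L", 190, 260)),  # exon + exon
    "r3": (Mate("2L", 500, 580), Mate("2L", 90, 110)),   # TE + exon
}

result = chimeric_pairs(pairs, exons, tes)
print(result)
```

In this toy input, pairs `r1` and `r3` are reported and `r2` is not, because both of its mates fall on the exon only.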
Identifiers
pubmed: 37615575
pii: 7249921
doi: 10.1093/nar/gkad671
pmc: PMC10570057
Publication types
Journal Article
Languages
eng
Citation subsets
IM
Pagination
9764-9784
Grants
Agency: Agence Nationale de la Recherche
ID: ANR-14-CE19-0016-01
Agency: Fondation pour la Recherche Médicale
ID: DEP20131128536
Agency: Idex Lyon
Agency: Campus France Eiffel
ID: P769649C
Agency: TIGER
ID: H2020-MSCA-IF-2014-658726
Agency: National Council for Scientific and Technological Development
ID: 308020/2021-9
Agency: São Paulo Research Foundation
ID: 2020/06238-2
Copyright information
© The Author(s) 2023. Published by Oxford University Press on behalf of Nucleic Acids Research.