TE-greedy-nester: structure-based detection of LTR retrotransposons and their nesting.


Journal

Bioinformatics (Oxford, England)
ISSN: 1367-4811
Abbreviated title: Bioinformatics
Country: England
NLM ID: 9808944

Publication information

Publication date:
22 Dec 2020
History:
received: 11 Sep 2019
revised: 08 Jun 2020
accepted: 07 Jul 2020
pubmed: 15 Jul 2020
medline: 9 Mar 2021
entrez: 15 Jul 2020
Status: ppublish

Abstract

Transposable elements (TEs) in eukaryotes are often inserted into one another, producing sequences that are a complex mixture of full-length elements and their fragments. Reconstructing the full-length elements and the order in which they were inserted is important for studies of genome and transposon evolution. However, mutations and genome rearrangements accumulated over evolutionary time make this reconstruction error-prone and reduce the efficiency of software that aims to recover all nested full-length TEs. We created software that uses a greedy recursive algorithm to mine increasingly fragmented copies of full-length LTR retrotransposons in assembled genomes and other sequence data. The software, called TE-greedy-nester, considers not only sequence similarity but also the structure of elements. The tool was tested on a set of natural and synthetic sequences, and its accuracy was compared with that of similar software. TE-greedy-nester outperformed the other tools on several measures, notably computation time and recovery of full-length TEs in highly nested regions. The software is available at http://gitlab.fi.muni.cz/lexa/nested. Supplementary data are available at Bioinformatics online.
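The greedy recursive idea described in the abstract can be illustrated with a minimal sketch. This is not the TE-greedy-nester implementation: the one-letter alphabet, the `INTACT` pattern and the `find_nested` function are all hypothetical stand-ins, assuming that an intact element can be recognised by its structure (two LTRs flanking domains in the expected order). Each call finds one intact element, records it in original coordinates, excises it so that the fragments of its host element become contiguous, and recurses on the shortened sequence.

```python
import re
from typing import List, Optional, Tuple

# Toy alphabet (an assumption for illustration, not real DNA):
#   L = long terminal repeat, g / p = protein domains in their expected order,
#   . = other element sequence, - = host genome.
# An intact element is L ... g ... p ... L with no foreign insertion inside.
INTACT = re.compile(r"L\.*g\.*p\.*L")


def find_nested(
    seq: str, origin: Optional[List[int]] = None, round_no: int = 1
) -> List[Tuple[int, int, int]]:
    """Return (start, end, round) per element, in coordinates of the input.

    `end` is exclusive; `round` is the recursion step in which the element
    was found. Elements found in later rounds were interrupted by earlier ones
    or simply lie further along the sequence.
    """
    if origin is None:
        # origin[i] = position in the input of the i-th remaining character
        origin = list(range(len(seq)))
    match = INTACT.search(seq)
    if match is None:
        return []
    s, e = match.span()
    hit = (origin[s], origin[e - 1] + 1, round_no)
    # Excise the element so the flanking fragments of its host join up,
    # then search the shortened sequence again.
    rest = find_nested(seq[:s] + seq[e:], origin[:s] + origin[e:], round_no + 1)
    return [hit] + rest


# An outer element (positions 2-15) interrupted by an inner one (positions 5-11).
print(find_nested("--L.gL.g.p.L.p.L--"))  # [(5, 12, 1), (2, 16, 2)]
```

In this toy input the outer element cannot be matched until the inner one has been removed, which is why structure-aware excision recovers a full-length copy that a single pass would miss.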

Identifiers

pubmed: 32663247
pii: 5871348
doi: 10.1093/bioinformatics/btaa632
pmc: PMC7755421

Chemical substances

DNA Transposable Elements 0
Retroelements 0

Publication types

Journal Article
Research Support, Non-U.S. Gov't

Languages

eng

Citation subsets

IM

Pagination

4991-4999

Copyright information

© The Author(s) 2020. Published by Oxford University Press.

Authors

Matej Lexa (M)

Department of Plant Developmental Genetics, Institute of Biophysics of the Czech Academy of Sciences, 61200 Brno, Czech Republic.
Department of Machine Learning and Data Processing, Faculty of Informatics, Masaryk University, 60200 Brno, Czech Republic.

Pavel Jedlicka (P)

Department of Plant Developmental Genetics, Institute of Biophysics of the Czech Academy of Sciences, 61200 Brno, Czech Republic.

Ivan Vanat (I)

Department of Machine Learning and Data Processing, Faculty of Informatics, Masaryk University, 60200 Brno, Czech Republic.

Michal Cervenansky (M)

Department of Plant Developmental Genetics, Institute of Biophysics of the Czech Academy of Sciences, 61200 Brno, Czech Republic.
Department of Machine Learning and Data Processing, Faculty of Informatics, Masaryk University, 60200 Brno, Czech Republic.

Eduard Kejnovsky (E)

Department of Plant Developmental Genetics, Institute of Biophysics of the Czech Academy of Sciences, 61200 Brno, Czech Republic.
