TrEMOLO: accurate transposable element allele frequency estimation using long-read sequencing data combining assembly and mapping-based approaches.

Genome; Haplotypes; Long-read DNA sequencing; Nanopore sequencing; Software; Structural variation; Transposable element allelic frequency; Transposable element calling

Journal

Genome biology
ISSN: 1474-760X
Abbreviated title: Genome Biol
Country: England
NLM ID: 100960660

Publication information

Publication date:
3 April 2023
History:
received: 21 July 2022
accepted: 23 March 2023
medline: 5 April 2023
entrez: 4 April 2023
pubmed: 5 April 2023
Status: epublish

Abstract

Transposable Element MOnitoring with LOng-reads (TrEMOLO) is a new software tool that combines assembly- and mapping-based approaches to robustly detect genetic elements called transposable elements (TEs). Using high- or low-quality genome assemblies, TrEMOLO can detect most TE insertions and deletions and estimate their allele frequency in populations. Benchmarking with simulated data revealed that TrEMOLO outperforms other state-of-the-art computational tools. TE detection and frequency estimation by TrEMOLO were validated using simulated and experimental datasets. TrEMOLO is therefore a comprehensive tool well suited to accurately studying TE dynamics. TrEMOLO is available under the GNU GPL 3.0 license at https://github.com/DrosophilaGenomeEvolution/TrEMOLO.

Identifiers

pubmed: 37013657
doi: 10.1186/s13059-023-02911-2
pii: 10.1186/s13059-023-02911-2
pmc: PMC10069131

Chemical substances

DNA Transposable Elements 0

Publication types

Journal Article; Research Support, Non-U.S. Gov't

Languages

eng

Citation subsets

IM

Pagination

63

Copyright information

© 2023. The Author(s).

Authors

Mourdas Mohamed (M)

Institute of Human Genetics, UMR9002, CNRS and Université de Montpellier, Montpellier, France.

François Sabot (F)

DIADE, University of Montpellier, CIRAD, IRD, Montpellier, France.
IFB - Southgreen Bioversity, CIRAD, INRAE, IRD, Montpellier, France.

Marion Varoqui (M)

Institute of Human Genetics, UMR9002, CNRS and Université de Montpellier, Montpellier, France.

Bruno Mugat (B)

Institute of Human Genetics, UMR9002, CNRS and Université de Montpellier, Montpellier, France.

Katell Audouin (K)

TAGC, UMR 1090 INSERM, Marseille, France.

Alain Pélisson (A)

Institute of Human Genetics, UMR9002, CNRS and Université de Montpellier, Montpellier, France.

Anna-Sophie Fiston-Lavier (AS)

ISEM, Université Montpellier, CNRS, IRD, CIRAD, EPHE, Montpellier, France. anna-sophie.fiston-lavier@umontpellier.fr.
Institut Universitaire de France (IUF), Paris, France. anna-sophie.fiston-lavier@umontpellier.fr.

Séverine Chambeyron (S)

Institute of Human Genetics, UMR9002, CNRS and Université de Montpellier, Montpellier, France. severine.chambeyron@igh.cnrs.fr.
