Fake IDs? Widespread misannotation of DNA transposons as a general transcription factor.


Journal

Genome biology
ISSN: 1474-760X
Abbreviated title: Genome Biol
Country: England
NLM ID: 100960660

Publication information

Publication date:
13 Nov 2023
History:
received: 14 Apr 2023
accepted: 01 Nov 2023
medline: 15 Nov 2023
pubmed: 14 Nov 2023
entrez: 14 Nov 2023
Status: epublish

Abstract

Accurate annotation of genes and transposable elements (TEs) is vital for understanding genomes, but current annotation pipelines often misannotate TEs as genes. This study shows how DNA transposons in non-mammalian species have been erroneously annotated as the general transcription factor II-I repeat domain-containing protein 2 (GTF2IRD2), because GTF2IRD2 contains a 3' fused hAT transposase domain. It also demonstrates the generality of this problem by identifying TEs misannotated as genes in other vertebrate genomes. Such misannotations can lead to errors in phylogenetic analyses and waste investigators' time. The study proposes adding a final TE check to gene annotation pipelines to mitigate this problem.
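
The final TE check proposed in the abstract could look roughly like the following. This is a minimal illustrative sketch, not the authors' method: the function name `flag_possible_te_genes`, the input layout (gene ID mapped to a list of domain descriptions), and the keyword list are all assumptions made for the example. Only the hAT transposase domain is named in the abstract.

```python
from typing import Dict, List, Tuple

# Hypothetical keyword list for TE-associated protein domains.
# "transposase" is motivated by the abstract (hAT transposase domain);
# the other entries are illustrative assumptions.
TE_DOMAIN_KEYWORDS: Tuple[str, ...] = (
    "transposase",
    "reverse transcriptase",
    "integrase",
)


def flag_possible_te_genes(
    gene_domains: Dict[str, List[str]],
    keywords: Tuple[str, ...] = TE_DOMAIN_KEYWORDS,
) -> Dict[str, List[str]]:
    """Return gene models whose domain descriptions mention a TE-associated keyword.

    gene_domains maps a gene ID to the domain descriptions reported for its
    protein product (an assumed input layout). Matching is a case-insensitive
    substring search.
    """
    flagged: Dict[str, List[str]] = {}
    for gene_id, domains in gene_domains.items():
        hits = [d for d in domains if any(k in d.lower() for k in keywords)]
        if hits:
            flagged[gene_id] = hits
    return flagged


if __name__ == "__main__":
    # Made-up example annotations, for demonstration only.
    example = {
        "geneA": ["GTF2I-like repeat", "hAT transposase domain"],
        "geneB": ["Zinc finger"],
    }
    print(flag_possible_te_genes(example))
```

A flag here only marks a gene model for manual review. A genuine GTF2IRD2 gene would also be flagged, since it contains the hAT transposase domain, so a check like this cannot by itself separate the true gene from a misannotated transposon.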

Identifiers

pubmed: 37957683
doi: 10.1186/s13059-023-03102-9
pii: 10.1186/s13059-023-03102-9
pmc: PMC10641963

Chemical substances

DNA Transposable Elements 0
Transcription Factors, General 0

Publication types

Journal Article

Languages

eng

Citation subsets

IM

Pagination

260

Copyright information

© 2023. The Author(s).

Authors

Nozhat T Hassan (NT)

School of Biological Sciences, University of Adelaide, North Terrace, Adelaide, South Australia, 5005, Australia.

David L Adelson (DL)

School of Biological Sciences, University of Adelaide, North Terrace, Adelaide, South Australia, 5005, Australia. david.adelson@adelaide.edu.au.
