Unsupervised AI reveals insect species-specific genome signatures.
Chromatin
Genome signature
Insect genome
Oligonucleotide usage
Transcription factor binding motifs
Unsupervised machine learning
Journal
PeerJ
ISSN: 2167-8359
Abbreviated title: PeerJ
Country: United States
NLM ID: 101603425
Publication information
Publication date: 2024
History:
received: 2023-11-17
accepted: 2024-02-07
medline: 2024-03-11
pubmed: 2024-03-11
entrez: 2024-03-11
Status: epublish
Abstract
Insects are a highly diverse group with a wide variety of traits, including the presence or absence of wings and of metamorphosis. These diverse traits are of great interest for studying genome evolution, and numerous comparative genomic studies have examined a wide phylogenetic range of insects. Here, we analyzed 22 insects spanning a wide phylogenetic range (Endopterygota, Paraneoptera, Polyneoptera, Palaeoptera, and other insects) by applying a batch-learning self-organizing map (BLSOM) to the oligonucleotide compositions of their genomic fragments (100-kb or 1-Mb sequences). BLSOM is an unsupervised machine learning algorithm that can extract species-specific characteristics of oligonucleotide composition (genome signatures). The genome signature is of particular interest for the mechanisms and biological significance underlying species-specific differences; it can serve as a powerful probe for exploring the various roles of genome sequences other than protein coding, and for uncovering features hidden in the genome sequence. Because BLSOM is an unsupervised clustering method, sequences were clustered on oligonucleotide composition alone, without any information about the species from which each fragment was derived. Therefore, not only interspecies but also intraspecies separation can be achieved. Here, we have revealed the specific genomic regions with oligonucleotide compositions distinct from the usual sequences of each insect genome,
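The workflow the abstract describes (cut each genome into fixed-length fragments, compute an oligonucleotide composition vector per fragment, cluster the vectors with a batch-updated self-organizing map without species labels) can be illustrated with a minimal Python sketch. This is not the authors' BLSOM implementation: the function names, the toy sequences, the 50-base fragment size, the dinucleotide setting, the initialisation, and the neighbourhood schedule are all assumptions chosen for brevity.

```python
from collections import Counter
from itertools import product


def fragments(sequence, size):
    """Split a sequence into consecutive, non-overlapping fragments of `size` bases.
    A trailing piece shorter than `size` is dropped."""
    return [sequence[i:i + size] for i in range(0, len(sequence) - size + 1, size)]


def oligo_composition(fragment, k=4):
    """Relative frequency of each k-mer over A/C/G/T, in lexicographic order.
    Windows containing other characters (e.g. N) are ignored."""
    fragment = fragment.upper()
    counts = Counter(fragment[i:i + k] for i in range(len(fragment) - k + 1))
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    total = sum(counts[m] for m in kmers)
    return [counts[m] / total if total else 0.0 for m in kmers]


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def best_unit(vector, weights):
    """Index of the map node whose weight vector is closest to `vector`."""
    return min(range(len(weights)), key=lambda j: squared_distance(vector, weights[j]))


def batch_som(vectors, rows, cols, epochs=3, start_radius=1):
    """Simplified batch SOM: every input is assigned to its best-matching node
    first, then each node becomes the mean of the inputs assigned to nodes
    within its (shrinking) square neighbourhood."""
    nodes = [(r, c) for r in range(rows) for c in range(cols)]
    # Placeholder initialisation (assumption): seed nodes from evenly spaced inputs.
    weights = [list(vectors[(i * len(vectors)) // len(nodes)]) for i in range(len(nodes))]
    for epoch in range(epochs):
        radius = max(0, start_radius - epoch)  # neighbourhood shrinks each epoch
        bmus = [best_unit(v, weights) for v in vectors]
        updated = []
        for n, (r, c) in enumerate(nodes):
            members = [v for v, b in zip(vectors, bmus)
                       if max(abs(nodes[b][0] - r), abs(nodes[b][1] - c)) <= radius]
            if members:
                updated.append([sum(col) / len(members) for col in zip(*members)])
            else:
                updated.append(weights[n])  # no inputs nearby: keep old weights
        weights = updated
    return weights, [best_unit(v, weights) for v in vectors]


# Toy input (made up): an AT-rich and a GC-rich "genome", 50-base fragments, dinucleotides.
toy = fragments("AT" * 50, 50) + fragments("GC" * 50, 50)
vectors = [oligo_composition(f, k=2) for f in toy]
_, units = batch_som(vectors, rows=1, cols=4)
print(units)  # map node index assigned to each fragment
```

No species label is passed to `batch_som`; fragments end up on the same or different nodes purely from their composition vectors, which is the property the abstract relies on for both interspecies and intraspecies separation. Real analyses would use the 100-kb or 1-Mb fragments and a far larger map.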
Identifiers
pubmed: 38464746
doi: 10.7717/peerj.17025
pii: 17025
pmc: PMC10924456
Databases
figshare
doi: 10.6084/m9.figshare.25036358.v1
Publication types
Journal Article
Languages
eng
Citation subsets
IM
Pagination
e17025
Copyright information
©2024 Sawada et al.
Conflict of interest statement
The authors declare there are no competing interests.