DNA sequence similarity analysis using image texture analysis based on first-order statistics.


Journal

Journal of molecular graphics & modelling
ISSN: 1873-4243
Abbreviated title: J Mol Graph Model
Country: United States
NLM ID: 9716237

Publication information

Publication date:
09 2020
History:
received: 17 07 2019
revised: 13 03 2020
accepted: 23 03 2020
pubmed: 23 5 2020
medline: 22 6 2021
entrez: 23 5 2020
Status: ppublish

Abstract

Similarity analysis is one of the key processes of DNA sequence analysis in computational biology and bioinformatics. Nearly all research that explores evolutionary relationships, gene function, protein structure prediction and sequence retrieval requires similarity calculations. One major task in alignment-free DNA sequence similarity calculation is to develop novel mathematical descriptors for DNA sequences. In this paper, we present a novel approach to DNA sequence similarity analysis that uses similarity calculations on texture images. Texture analysis methods, a subset of digital image processing methods, are used here on the assumption that their calculations can be adapted to alignment-free DNA sequence similarity analysis. Gray-level textures were created from values assigned to the nucleotides in the DNA sequences. Similarities between these textures were then calculated using histogram-based texture analysis based on first-order statistics. We obtained texture features for three DNA data sets of different lengths and calculated their similarity matrices. The phylogenetic trees produced by our method are similar to those obtained with the MEGA software, which is based on sequence alignment. Our findings show that texture analysis metrics can be used to characterize DNA sequences.
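
The pipeline described in the abstract (nucleotides mapped to gray levels, a gray-level histogram, first-order statistics, then a pairwise distance matrix) might be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: the abstract gives neither the gray values assigned to the nucleotides, nor the exact set of statistics, nor the distance measure, so the `GRAY_LEVELS` mapping, the six features and the Euclidean distance below are all hypothetical choices, as are the function names.

```python
import math
from collections import Counter

# Hypothetical gray-level assignment; the abstract does not state the values used.
GRAY_LEVELS = {"A": 0, "C": 85, "G": 170, "T": 255}


def first_order_features(seq):
    """Return histogram-based first-order statistics of a DNA sequence.

    Output order: mean, variance, skewness, kurtosis, energy, entropy.
    This feature set is an assumption chosen for illustration.
    """
    # Turn the sequence into gray-level "pixels", skipping unknown symbols.
    pixels = [GRAY_LEVELS[base] for base in seq.upper() if base in GRAY_LEVELS]
    if not pixels:
        raise ValueError("sequence contains no A, C, G or T")
    n = len(pixels)
    # Normalized gray-level histogram p(i).
    hist = {level: count / n for level, count in Counter(pixels).items()}
    mean = sum(i * p for i, p in hist.items())
    variance = sum((i - mean) ** 2 * p for i, p in hist.items())
    std = math.sqrt(variance)
    # Skewness and kurtosis are undefined for a flat texture; report 0.0 there.
    skewness = sum((i - mean) ** 3 * p for i, p in hist.items()) / std ** 3 if std else 0.0
    kurtosis = sum((i - mean) ** 4 * p for i, p in hist.items()) / std ** 4 if std else 0.0
    energy = sum(p * p for p in hist.values())
    entropy = sum(-p * math.log2(p) for p in hist.values())
    return [mean, variance, skewness, kurtosis, energy, entropy]


def feature_distance(seq_a, seq_b):
    """Euclidean distance between feature vectors (an assumed metric)."""
    return math.dist(first_order_features(seq_a), first_order_features(seq_b))


def distance_matrix(seqs):
    """Pairwise distance matrix for a list of sequences."""
    return [[feature_distance(a, b) for b in seqs] for a in seqs]
```

First-order statistics depend only on the gray-level histogram and not on pixel positions, so in this sketch two sequences with the same nucleotide composition have a distance of zero. The features are also left unscaled, which lets the variance dominate the distance; a practical version would probably normalize each feature before comparing.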

Identifiers

pubmed: 32442904
pii: S1093-3263(19)30551-0
doi: 10.1016/j.jmgm.2020.107603

Publication types

Journal Article

Languages

eng

Citation subsets

IM

Pagination

107603

Copyright information

Copyright © 2020 Elsevier Inc. All rights reserved.

Authors

Emre Delibaş (E)

Department of Computer Engineering, Faculty of Engineering, Cumhuriyet University, 58140, Sivas, Turkey. Electronic address: edelibas@cumhuriyet.edu.tr.

Ahmet Arslan (A)

Department of Computer Engineering, Faculty of Engineering, Selçuk University, 42250, Konya, Turkey. Electronic address: ahmetarslan@selcuk.edu.tr.
