Graph-based analysis of DNA sequence comparison in closed cotton species: A generalized method to unveil genetic connections.


Journal

PloS one
ISSN: 1932-6203
Abbreviated title: PLoS One
Country: United States
NLM ID: 101285081

Publication information

Publication date:
2024
History:
received: 23 12 2023
accepted: 21 06 2024
medline: 17 9 2024
pubmed: 17 9 2024
entrez: 17 9 2024
Status: epublish

Abstract

As a bioinformatics tool, graph theory provides a systematic way to model and analyse complex biological data. The number of sequences held in DNA databases is growing quickly. To determine the origin of a species and identify homologous sequences, it is crucial to detect similarities between DNA sequences. Measuring sequence similarity accurately has been one of the main problems facing computational biologists, and alignment-free techniques are required to address it. This study presents a graph-theoretic mathematical technique for comparing DNA sequences. Each DNA sequence was divided into pairs of nucleotides, from which a weighted loop digraph and a corresponding weighted vector were computed. Sequence similarity was then assessed with distance measures such as cosine, correlation, and Jaccard. To verify the method, DNA segments from the genomes of ten species of cotton were tested, and K-means clustering was performed to evaluate the efficacy of the proposed methodology. The study is a proof of concept: it applies the proposed distance matrix technique to a given set of data, and further optimisation of the proposed solution is expected to improve the results towards full accuracy. More broadly, the paper highlights graph theory as an effective tool for studying biological data and comparing sequences in bioinformatics.
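The pipeline described in the abstract (nucleotide pairs, a weighted loop digraph, a weight vector, then pairwise distances) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not say whether the pairs overlap or how the edge weights are scaled, so the sketch assumes overlapping pairs and relative-frequency weights. It also assumes the weighted (Ruzicka) form of the Jaccard distance, since the weights are real-valued. All function names are hypothetical.

```python
from itertools import product
from math import sqrt

NUCLEOTIDES = "ACGT"
# All 16 ordered pairs; AA, CC, GG and TT are the loops of the digraph.
PAIRS = ["".join(p) for p in product(NUCLEOTIDES, repeat=2)]


def pair_weight_vector(seq):
    """Return the 16 edge weights of the pair digraph as relative frequencies.

    Assumption: pairs overlap (positions 1-2, 2-3, ...) and weights are
    normalised counts; pairs containing other symbols (e.g. N) are skipped.
    """
    counts = dict.fromkeys(PAIRS, 0)
    seq = seq.upper()
    for a, b in zip(seq, seq[1:]):
        if a + b in counts:
            counts[a + b] += 1  # edge a -> b gains one unit of weight
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no valid nucleotide pair")
    return [counts[p] / total for p in PAIRS]


def cosine_distance(u, v):
    dot = sum(x * y for x, y in zip(u, v))
    norm = sqrt(sum(x * x for x in u)) * sqrt(sum(y * y for y in v))
    if norm == 0:
        raise ValueError("distance undefined for a zero vector")
    return 1.0 - dot / norm


def correlation_distance(u, v):
    # One minus the Pearson correlation, i.e. the cosine distance
    # between the mean-centred vectors.
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    return cosine_distance([x - mu for x in u], [y - mv for y in v])


def weighted_jaccard_distance(u, v):
    # Assumption: real-valued (Ruzicka) generalisation of Jaccard distance.
    num = sum(min(x, y) for x, y in zip(u, v))
    den = sum(max(x, y) for x, y in zip(u, v))
    return 1.0 - num / den if den else 0.0


def distance_matrix(seqs, dist=cosine_distance):
    """Pairwise distances between the weight vectors of the sequences."""
    vecs = [pair_weight_vector(s) for s in seqs]
    return [[dist(a, b) for b in vecs] for a in vecs]


if __name__ == "__main__":
    # Toy sequences for illustration only, not data from the study.
    for row in distance_matrix(["ACGTAC", "ACGTTT", "GGGCCC"]):
        print([round(d, 3) for d in row])
```

The resulting distance matrix, or the weight vectors themselves, could then be passed to a clustering routine such as K-means. How the study configured that step is not stated in the abstract.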

Identifiers

pubmed: 39288143
doi: 10.1371/journal.pone.0306608
pii: PONE-D-23-43366

Chemical substances

DNA, Plant 0

Publication types

Journal Article; Comparative Study

Languages

eng

Citation subsets

IM

Pagination

e0306608

Copyright information

Copyright: © 2024 Khan et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Conflict of interest statement

The authors have declared that no competing interests exist.

Authors

Riaz Hussain Khan (RH)

Institute of Mathematics, Khwaja Fareed University of Engineering & Information Technology, Rahim Yar Khan, Punjab, Pakistan.

Nadeem Salamat (N)

Institute of Mathematics, Khwaja Fareed University of Engineering & Information Technology, Rahim Yar Khan, Punjab, Pakistan.

A Q Baig (AQ)

Department of Mathematics and Statistics, Institute of Southern Punjab, Multan, Punjab, Pakistan.
School of New Energy and Intelligent Networked Automobiles, University of Sanya, Sanya, China.

Zaffar Ahmed Shaikh (ZA)

Department of Computer Science and Information Technology, Benazir Bhutto Shaheed University Lyari, Karachi, Pakistan.
School of Engineering, École Polytechnique Fédérale de Lausanne, Lausanne, Switzerland.

Amr Yousef (A)

Electrical Engineering Department, College of Engineering, University of Business and Technology, Jeddah, Saudi Arabia.
Engineering Mathematics Department, Alexandria University, Alexandria, Egypt.
