CSBFinder: discovery of colinear syntenic blocks across thousands of prokaryotic genomes.


Journal

Bioinformatics (Oxford, England)
ISSN: 1367-4811
Abbreviated title: Bioinformatics
Country: England
NLM ID: 9808944

Publication information

Publication date:
15 05 2019
History:
received: 15 05 2018
revised: 06 09 2018
accepted: 14 10 2018
pubmed: 16 10 2018
medline: 16 05 2020
entrez: 16 10 2018
Status: ppublish

Abstract

Identification of conserved syntenic blocks across microbial genomes is important for several problems in comparative genomics such as gene annotation, study of genome organization and evolution and prediction of gene interactions. Current tools for syntenic block discovery do not scale up to the large quantity of prokaryotic genomes available today. We present a novel methodology for the discovery, ranking and taxonomic distribution analysis of colinear syntenic blocks (CSBs): groups of genes that are consistently located close to each other, in the same order, across a wide range of taxa. We present an efficient algorithm that identifies CSBs in large genomic datasets. The algorithm is implemented and incorporated in a novel tool with a graphical user interface, denoted CSBFinder, that ranks the discovered CSBs according to a probabilistic score and clusters them to families according to their gene content similarity. We apply CSBFinder to data mine 1487 prokaryotic genomes including chromosomes and plasmids. For post-processing analysis, we generate heatmaps for visualizing the distribution of CSB family members across various taxa. We exemplify the utility of CSBFinder in operon prediction, in deciphering unknown gene function and in taxonomic analysis of colinear syntenic blocks. CSBFinder software and code are publicly available at https://github.com/dinasv/CSBFinder. Supplementary data are available at Bioinformatics online.
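
To make the notion of a colinear syntenic block concrete, the following is a minimal illustrative sketch, not the CSBFinder algorithm. It assumes each genome is given as an ordered list of gene-family labels and reports every run of `k` adjacent genes that occurs, in the same order, in at least `min_genomes` genomes. The function name `find_colinear_blocks`, its parameters, and the toy genomes are hypothetical; strand orientation, tolerated insertions, probabilistic scoring and family clustering, all mentioned in the abstract, are deliberately left out.

```python
from collections import defaultdict


def find_colinear_blocks(genomes, k, min_genomes):
    """Naive exact-match search for conserved gene-order patterns.

    genomes: dict mapping a genome id to an ordered list of gene-family labels.
    k: number of adjacent genes in a candidate block.
    min_genomes: minimum number of distinct genomes that must contain the block.
    Returns a dict mapping each qualifying gene tuple to the sorted genome ids.
    """
    support = defaultdict(set)
    for genome_id, genes in genomes.items():
        # Slide a window of k adjacent genes along the genome.
        for i in range(len(genes) - k + 1):
            support[tuple(genes[i:i + k])].add(genome_id)
    # Keep only blocks seen in enough distinct genomes.
    return {
        block: sorted(ids)
        for block, ids in support.items()
        if len(ids) >= min_genomes
    }


# Toy input (hypothetical gene-family labels, not real data).
genomes = {
    "g1": ["A", "B", "C", "D"],
    "g2": ["X", "A", "B", "C"],
    "g3": ["B", "C", "D", "Y"],
}
blocks = find_colinear_blocks(genomes, k=3, min_genomes=2)
for block, ids in sorted(blocks.items()):
    print("-".join(block), ids)
```

On the toy input, the blocks A-B-C and B-C-D each appear in two of the three genomes. A window-counting approach like this only finds exact contiguous matches of a fixed length, so it serves as a definition-by-example rather than a scalable method.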

Identifiers

pubmed: 30321308
pii: 5132694
doi: 10.1093/bioinformatics/bty861

Publication types

Journal Article
Research Support, Non-U.S. Gov't

Languages

eng

Citation subsets

IM

Pagination

1634-1643

Grants

Agency: European Research Council
ID: 281357
Country: International

Copyright information

© The Author(s) 2018. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.

Authors

Dina Svetlitsky (D)

Department of Computer Science, Ben-Gurion University of the Negev, Beer-Sheva, Israel.

Tal Dagan (T)

Institute of General Microbiology, Christian-Albrechts University Kiel, Kiel, Germany.

Vered Chalifa-Caspi (V)

Bioinformatics Core Facility, National Institute for Biotechnology in the Negev, Ben-Gurion University of the Negev, Beer-Sheva, Israel.

Michal Ziv-Ukelson (M)

Department of Computer Science, Ben-Gurion University of the Negev, Beer-Sheva, Israel.
