Reciprocal best match graphs.

Hierarchically colored cograph Pairwise best hit Phylogenetic tree Reciprocal best match heuristics Vertex colored graph

Journal

Journal of mathematical biology
ISSN: 1432-1416
Titre abrégé: J Math Biol
Pays: Germany
ID NLM: 7502105

Informations de publication

Date de publication:
02 2020
Historique:
received: 19 03 2019
revised: 10 06 2019
pubmed: 7 11 2019
medline: 19 3 2021
entrez: 7 11 2019
Statut: ppublish

Résumé

Reciprocal best matches play an important role in numerous applications in computational biology, in particular as the basis of many widely used tools for orthology assessment. Nevertheless, very little is known about their mathematical structure. Here, we investigate the structure of reciprocal best match graphs (RBMGs). In order to abstract from the details of measuring distances, we define reciprocal best matches here as pairwise most closely related leaves in a gene tree, arguing that conceptually this is the notion that is pragmatically approximated by distance- or similarity-based heuristics. We start by showing that a graph G is an RBMG if and only if its quotient graph w.r.t. a certain thinness relation is an RBMG. Furthermore, it is necessary and sufficient that all connected components of G are RBMGs. The main result of this contribution is a complete characterization of RBMGs with 3 colors/species that can be checked in polynomial time. For 3 colors, there are three distinct classes of trees that are related to the structure of the phylogenetic trees explaining them. We derive an approach to recognize RBMGs with an arbitrary number of colors; it remains open however, whether a polynomial-time for RBMG recognition exists. In addition, we show that RBMGs that at the same time are cographs (co-RBMGs) can be recognized in polynomial time. Co-RBMGs are characterized in terms of hierarchically colored cographs, a particular class of vertex colored cographs that is introduced here. The (least resolved) trees that explain co-RBMGs can be constructed in polynomial time.

Identifiants

pubmed: 31691135
doi: 10.1007/s00285-019-01444-2
pii: 10.1007/s00285-019-01444-2
doi:

Types de publication

Journal Article

Langues

eng

Sous-ensembles de citation

IM

Pagination

865-953

Références

Nat Methods. 2016 May;13(5):425-30
pubmed: 27043882
Algorithms Mol Biol. 2017 Aug 29;12:23
pubmed: 28861118
J Math Biol. 2019 Jun;78(7):2015-2057
pubmed: 30968198
J Math Biol. 2013 Jan;66(1-2):399-420
pubmed: 22456957
Nucleic Acids Res. 2011 Jul;39(13):e88
pubmed: 21572104
J Math Biol. 2020 Jan 30;:
pubmed: 32002659
Proc Natl Acad Sci U S A. 1999 Mar 16;96(6):2896-901
pubmed: 10077608
Genomics Proteomics Bioinformatics. 2017 Dec;15(6):361-370
pubmed: 29133277
J Math Biol. 2018 Nov;77(5):1459-1491
pubmed: 29951855
PLoS Comput Biol. 2009 Jan;5(1):e1000262
pubmed: 19148271
Bioinformatics. 2017 Jul 15;33(14):i75-i82
pubmed: 28881964
J Mol Biol. 1998 Nov 6;283(4):707-25
pubmed: 9790834
Proc Natl Acad Sci U S A. 2015 Feb 17;112(7):2058-63
pubmed: 25646426
PLoS One. 2014 Aug 19;9(8):e105015
pubmed: 25137074
J Math Biol. 2019 Aug;79(3):969-986
pubmed: 31111195
Bioinformatics. 2003 Sep 1;19(13):1710-1
pubmed: 15593400
J Math Biol. 2017 Jul;75(1):199-237
pubmed: 27904954
Methods Mol Biol. 2018;1704:1-28
pubmed: 29277861
Science. 1997 Oct 24;278(5338):631-7
pubmed: 9381173
Trends Genet. 2000 May;16(5):227-31
pubmed: 10782117

Auteurs

Manuela Geiß (M)

Bioinformatics Group, Department of Computer Science, Leipzig University, Härtelstraße 16-18, 04107, Leipzig, Germany.
Interdisciplinary Center of Bioinformatics, Leipzig University, Härtelstraße 16-18, 04107, Leipzig, Germany.

Peter F Stadler (PF)

Bioinformatics Group, Department of Computer Science, Leipzig University, Härtelstraße 16-18, 04107, Leipzig, Germany.
Interdisciplinary Center of Bioinformatics, Leipzig University, Härtelstraße 16-18, 04107, Leipzig, Germany.
German Centre for Integrative Biodiversity Research (iDiv) Halle-Jena-Leipzig, Leipzig University, Härtelstraße 16-18, 04107, Leipzig, Germany.
Competence Center for Scalable Data Services and Solutions, Leipzig University, Härtelstraße 16-18, 04107, Leipzig, Germany.
Leipzig Research Center for Civilization Diseases, Leipzig University, Härtelstraße 16-18, 04107, Leipzig, Germany.
Max-Planck-Institute for Mathematics in the Sciences, Inselstraße 22, 04103, Leipzig, Germany.
Institute for Theoretical Chemistry, University of Vienna, Währingerstraße 17, 1090, Vienna, Austria.
Santa Fe Institute, 1399 Hyde Park Rd., Santa Fe, NM, 87501, USA.

Marc Hellmuth (M)

Institute of Mathematics and Computer Science, University of Greifswald, Walther-Rathenau-Straße 47, 17487, Greifswald, Germany. mhellmuth@mailbox.org.
Center for Bioinformatics, Saarland University, Building E 2.1, P.O. Box 151150, 66041, Saarbrücken, Germany. mhellmuth@mailbox.org.

Articles similaires

Genome, Chloroplast Phylogeny Genetic Markers Base Composition High-Throughput Nucleotide Sequencing
Animals Hemiptera Insect Proteins Phylogeny Insecticides
Amaryllidaceae Alkaloids Lycoris NADPH-Ferrihemoprotein Reductase Gene Expression Regulation, Plant Plant Proteins

Classifications MeSH