Reciprocal best match graphs.
Hierarchically colored cograph
Pairwise best hit
Phylogenetic tree
Reciprocal best match heuristics
Vertex colored graph
Journal
Journal of mathematical biology
ISSN: 1432-1416
Titre abrégé: J Math Biol
Pays: Germany
ID NLM: 7502105
Informations de publication
Date de publication:
02 2020
02 2020
Historique:
received:
19
03
2019
revised:
10
06
2019
pubmed:
7
11
2019
medline:
19
3
2021
entrez:
7
11
2019
Statut:
ppublish
Résumé
Reciprocal best matches play an important role in numerous applications in computational biology, in particular as the basis of many widely used tools for orthology assessment. Nevertheless, very little is known about their mathematical structure. Here, we investigate the structure of reciprocal best match graphs (RBMGs). In order to abstract from the details of measuring distances, we define reciprocal best matches here as pairwise most closely related leaves in a gene tree, arguing that conceptually this is the notion that is pragmatically approximated by distance- or similarity-based heuristics. We start by showing that a graph G is an RBMG if and only if its quotient graph w.r.t. a certain thinness relation is an RBMG. Furthermore, it is necessary and sufficient that all connected components of G are RBMGs. The main result of this contribution is a complete characterization of RBMGs with 3 colors/species that can be checked in polynomial time. For 3 colors, there are three distinct classes of trees that are related to the structure of the phylogenetic trees explaining them. We derive an approach to recognize RBMGs with an arbitrary number of colors; it remains open however, whether a polynomial-time for RBMG recognition exists. In addition, we show that RBMGs that at the same time are cographs (co-RBMGs) can be recognized in polynomial time. Co-RBMGs are characterized in terms of hierarchically colored cographs, a particular class of vertex colored cographs that is introduced here. The (least resolved) trees that explain co-RBMGs can be constructed in polynomial time.
Identifiants
pubmed: 31691135
doi: 10.1007/s00285-019-01444-2
pii: 10.1007/s00285-019-01444-2
doi:
Types de publication
Journal Article
Langues
eng
Sous-ensembles de citation
IM
Pagination
865-953Références
Nat Methods. 2016 May;13(5):425-30
pubmed: 27043882
Algorithms Mol Biol. 2017 Aug 29;12:23
pubmed: 28861118
J Math Biol. 2019 Jun;78(7):2015-2057
pubmed: 30968198
J Math Biol. 2013 Jan;66(1-2):399-420
pubmed: 22456957
Nucleic Acids Res. 2011 Jul;39(13):e88
pubmed: 21572104
J Math Biol. 2020 Jan 30;:
pubmed: 32002659
Proc Natl Acad Sci U S A. 1999 Mar 16;96(6):2896-901
pubmed: 10077608
Genomics Proteomics Bioinformatics. 2017 Dec;15(6):361-370
pubmed: 29133277
J Math Biol. 2018 Nov;77(5):1459-1491
pubmed: 29951855
PLoS Comput Biol. 2009 Jan;5(1):e1000262
pubmed: 19148271
Bioinformatics. 2017 Jul 15;33(14):i75-i82
pubmed: 28881964
J Mol Biol. 1998 Nov 6;283(4):707-25
pubmed: 9790834
Proc Natl Acad Sci U S A. 2015 Feb 17;112(7):2058-63
pubmed: 25646426
PLoS One. 2014 Aug 19;9(8):e105015
pubmed: 25137074
J Math Biol. 2019 Aug;79(3):969-986
pubmed: 31111195
Bioinformatics. 2003 Sep 1;19(13):1710-1
pubmed: 15593400
J Math Biol. 2017 Jul;75(1):199-237
pubmed: 27904954
Methods Mol Biol. 2018;1704:1-28
pubmed: 29277861
Science. 1997 Oct 24;278(5338):631-7
pubmed: 9381173
Trends Genet. 2000 May;16(5):227-31
pubmed: 10782117