Unblended disjoint tree merging using GTM improves species tree estimation.
Divide-and-conquer pipelines
Large-scale phylogeny estimation
Species tree estimation
Journal
BMC genomics
ISSN: 1471-2164
Abbreviated title: BMC Genomics
Country: England
NLM ID: 100965258
Publication information
Publication date:
16 Apr 2020
History:
entrez: 18 Apr 2020
pubmed: 18 Apr 2020
medline: 14 Jan 2021
Status:
epublish
Abstract
Phylogeny estimation is an important part of much biological research, but large-scale tree estimation is infeasible using standard methods due to computational issues. Recently, an approach to large-scale phylogeny has been proposed that divides a set of species into disjoint subsets, computes trees on the subsets, and then merges the trees together using a computed matrix of pairwise distances between the species. The novel component of these approaches is the last step: Disjoint Tree Merger (DTM) methods. We present GTM (Guide Tree Merger), a polynomial time DTM method that adds edges to connect the subset trees, so as to provably minimize the topological distance to a computed guide tree. Thus, GTM performs unblended mergers, unlike the previous DTM methods. Yet, despite the potential limitation, our study shows that GTM has excellent accuracy, generally matching or improving on two previous DTMs, and is much faster than both. The proposed GTM approach to the DTM problem is a useful new tool for large-scale phylogenomic analysis, and shows the surprising potential for unblended DTM methods.
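The abstract's objective, minimizing a topological distance between the merged tree and a guide tree, can be illustrated with a small scoring sketch. It assumes the distance is the Robinson-Foulds (RF) distance, which the abstract does not name, and it uses a hypothetical nested-tuple tree encoding. It only scores candidate trees against a guide tree; it is not the GTM algorithm itself.

```python
# Illustrative sketch only: RF distance is an assumed choice of
# "topological distance", and the nested-tuple encoding is hypothetical.

def leaf_set(tree):
    """Return the frozenset of leaf names under a nested-tuple tree."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaf_set(child) for child in tree))


def bipartitions(tree):
    """Collect the non-trivial bipartitions of a tree, viewed as unrooted."""
    all_leaves = leaf_set(tree)
    anchor = min(all_leaves)  # fixed leaf used to pick a canonical side
    splits = set()

    def visit(node):
        if isinstance(node, str):
            return frozenset([node])
        below = frozenset().union(*(visit(child) for child in node))
        # Store the side of the split that does not contain the anchor leaf.
        side = all_leaves - below if anchor in below else below
        # Keep only splits with at least two leaves on each side.
        if 2 <= len(side) <= len(all_leaves) - 2:
            splits.add(side)
        return below

    visit(tree)
    return splits


def rf_distance(tree_a, tree_b):
    """Count the bipartitions found in exactly one of the two trees."""
    if leaf_set(tree_a) != leaf_set(tree_b):
        raise ValueError("trees must share the same leaf set")
    return len(bipartitions(tree_a) ^ bipartitions(tree_b))


# Hypothetical guide tree and one candidate merged tree on five species.
guide = ((("A", "B"), "C"), ("D", "E"))
candidate = ((("A", "C"), "B"), ("D", "E"))

print(rf_distance(guide, guide))      # identical trees
print(rf_distance(guide, candidate))  # trees differing in one grouping
```

Under this reading, a merger that respects the subset trees would be preferred when its score against the guide tree is lowest among the allowed ways of connecting them.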
Identifiers
pubmed: 32299343
doi: 10.1186/s12864-020-6605-1
pii: 10.1186/s12864-020-6605-1
pmc: PMC7161100
Publication types
Journal Article
Languages
eng
Citation subsets
IM
Pagination
235