Using Constrained-INC for Large-Scale Gene Tree and Species Tree Estimation.
Journal
IEEE/ACM transactions on computational biology and bioinformatics
ISSN: 1557-9964
Titre abrégé: IEEE/ACM Trans Comput Biol Bioinform
Pays: United States
ID NLM: 101196755
Informations de publication
Date de publication:
Historique:
pubmed:
6
8
2020
medline:
29
12
2021
entrez:
6
8
2020
Statut:
ppublish
Résumé
Incremental tree building (INC) is a new phylogeny estimation method that has been proven to be absolute fast converging under standard sequence evolution models. A variant of INC, called Constrained-INC, is designed for use in divide-and-conquer pipelines for phylogeny estimation where a set of species is divided into disjoint subsets, trees are computed on the subsets using a selected base method, and then the subset trees are combined together. We evaluate the accuracy of INC and Constrained-INC for gene tree and species tree estimation on simulated datasets, and compare it to similar pipelines using NJMerge (another method that merges disjoint trees). For gene tree estimation, we find that INC has very poor accuracy in comparison to standard methods, and even Constrained-INC(using maximum likelihood methods to compute constraint trees) does not match the accuracy of the better maximum likelihood methods. Results for species trees are somewhat different, with Constrained-INC coming close to the accuracy of the best species tree estimation methods, while being much faster; furthermore, using Constrained-INC allows species tree estimation methods to scale to large datasets within limited computational resources. Overall, this study exposes the benefits and limitations of divide-and-conquer strategies for large-scale phylogenetic tree estimation.
Identifiants
pubmed: 32750844
doi: 10.1109/TCBB.2020.2990867
doi:
Types de publication
Journal Article
Research Support, Non-U.S. Gov't
Research Support, U.S. Gov't, Non-P.H.S.
Langues
eng
Sous-ensembles de citation
IM