QuCo: quartet-based co-estimation of species trees and gene trees.
Journal
Bioinformatics (Oxford, England)
ISSN: 1367-4811
Abbreviated title: Bioinformatics
Country: England
NLM ID: 9808944
Publication information
Publication date:
24 06 2022
History:
entrez: 27 06 2022
pubmed: 28 06 2022
medline: 30 06 2022
Status:
ppublish
Abstract
Phylogenomics faces a dilemma: on the one hand, the most accurate species and gene tree estimation methods are those that co-estimate them; on the other hand, these co-estimation methods do not scale to moderately large numbers of species. Summary-based methods, which first infer gene trees independently and then combine them, are much more scalable but are prone to gene tree estimation error, which is inevitable when inferring trees from limited-length data. Gene tree estimation error is not just random noise; it can create biases such as long-branch attraction. We introduce a scalable likelihood-based approach to co-estimation under the multi-species coalescent model. The method, called quartet co-estimation (QuCo), takes as input independently inferred distributions over gene trees and, for each quartet, computes the most likely species tree topology and internal branch length, marginalizing over gene tree topologies and ignoring branch lengths by making several simplifying assumptions. It then updates the gene tree posterior probabilities based on the species tree. The focus on gene tree topologies and the heuristic division into quartets enable fast likelihood calculations. We benchmark our method with extensive simulations for quartet trees in zones known to produce biased species trees and further with larger trees. We also run QuCo on a biological dataset of bees. Our results show better accuracy than the summary-based approach ASTRAL run on estimated gene trees. QuCo is available at https://github.com/maryamrabiee/quco. Supplementary data are available at Bioinformatics online.
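The per-quartet step described in the abstract can be illustrated with a minimal sketch. This is not QuCo's implementation; the function names, the grid search over branch lengths, and the input format are assumptions made for illustration. The sketch relies on the standard multi-species coalescent result that, for a species quartet with internal branch length d in coalescent units, a gene tree matches the species topology with probability 1 - (2/3)e^{-d} and takes each of the two alternative topologies with probability (1/3)e^{-d}. It also assumes the input gene tree posteriors were inferred under a uniform prior over the three topologies, so that each posterior is proportional to the corresponding sequence likelihood.

```python
import math

# The three possible unrooted topologies for a quartet of taxa a, b, c, d.
TOPOLOGIES = ("ab|cd", "ac|bd", "ad|bc")


def msc_quartet_probs(species_topology, d):
    """Gene tree topology probabilities under the multi-species coalescent
    for a species quartet with internal branch length d (coalescent units)."""
    discordant = math.exp(-d) / 3.0
    return {
        t: (1.0 - 2.0 * discordant) if t == species_topology else discordant
        for t in TOPOLOGIES
    }


def log_likelihood(species_topology, d, gene_posteriors):
    """Log-likelihood (up to an additive constant) of a species quartet,
    marginalizing over the gene tree topology of every gene.
    Assumes posteriors were computed under a uniform topology prior."""
    probs = msc_quartet_probs(species_topology, d)
    return sum(
        math.log(sum(probs[t] * post[t] for t in TOPOLOGIES))
        for post in gene_posteriors
    )


def best_species_quartet(gene_posteriors, grid=None):
    """Grid search (an illustrative choice) over topology and branch length."""
    if grid is None:
        grid = [i * 0.01 for i in range(1, 501)]  # d in (0, 5]
    ll, topology, d = max(
        (log_likelihood(t, d, gene_posteriors), t, d)
        for t in TOPOLOGIES
        for d in grid
    )
    return topology, d, ll


def update_gene_posteriors(species_topology, d, gene_posteriors):
    """Reweight each gene's topology distribution by the coalescent
    probabilities implied by the chosen species quartet, then renormalize."""
    probs = msc_quartet_probs(species_topology, d)
    updated = []
    for post in gene_posteriors:
        weights = {t: probs[t] * post[t] for t in TOPOLOGIES}
        total = sum(weights.values())
        updated.append({t: w / total for t, w in weights.items()})
    return updated


# Hypothetical input: six genes leaning toward ab|cd, two toward ac|bd.
genes = [{"ab|cd": 0.6, "ac|bd": 0.3, "ad|bc": 0.1}] * 6 + [
    {"ab|cd": 0.2, "ac|bd": 0.5, "ad|bc": 0.3}
] * 2
topology, d, ll = best_species_quartet(genes)
updated = update_gene_posteriors(topology, d, genes)
```

Because only three topologies and a single branch length are involved per quartet, the likelihood is cheap to evaluate, which is consistent with the abstract's remark that restricting attention to topologies and quartets enables fast calculations.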
Identifiers
pubmed: 35758818
pii: 6617531
doi: 10.1093/bioinformatics/btac265
pmc: PMC9235488
Publication types
Journal Article
Research Support, U.S. Gov't, Non-P.H.S.
Languages
eng
Citation subsets
IM
Pagination
i413-i421
Copyright information
© The Author(s) 2022. Published by Oxford University Press.