Incongruence in the phylogenomics era.


Journal

Nature reviews. Genetics
ISSN: 1471-0064
Abbreviated title: Nat Rev Genet
Country: England
NLM ID: 100962779

Publication information

Publication date:
Dec 2023
History:
accepted: 19 May 2023
medline: 16 Nov 2023
pubmed: 28 Jun 2023
entrez: 27 Jun 2023
Status: ppublish

Abstract

Genome-scale data and the development of novel statistical phylogenetic approaches have greatly aided the reconstruction of a broad sketch of the tree of life and resolved many of its branches. However, incongruence (the inference of conflicting evolutionary histories) remains pervasive in phylogenomic data, hampering our ability to reconstruct and interpret the tree of life. Biological factors, such as incomplete lineage sorting, horizontal gene transfer, hybridization, introgression, recombination and convergent molecular evolution, can lead to gene phylogenies that differ from the species tree. In addition, analytical factors, including stochastic, systematic and treatment errors, can drive incongruence. Here, we review these factors, discuss methodological advances to identify and handle incongruence, and highlight avenues for future research.

Identifiers

pubmed: 37369847
doi: 10.1038/s41576-023-00620-x
pii: 10.1038/s41576-023-00620-x

Publication types

Journal Article; Review

Languages

eng

Citation subsets

IM

Pagination

834-850

Grants

Agency: NIAID NIH HHS
ID: R01 AI153356
Country: United States

Copyright information

© 2023. Springer Nature Limited.

Références

Simpson, G. G. The Principles of Classification and a Classification of Mammals Vol. 85 (American Museum of Natural History, 1945).
Jarvis, E. D. et al. Whole-genome analyses resolve early branches in the tree of life of modern birds. Science 346, 1320–1331 (2014).
pubmed: 25504713 doi: 10.1126/science.1253451 pmcid: 4405904
Parks, D. H. et al. A standardized bacterial taxonomy based on genome phylogeny substantially revises the tree of life. Nat. Biotechnol. 36, 996–1004 (2018).
pubmed: 30148503 doi: 10.1038/nbt.4229
One Thousand Plant Transcriptomes Initiative. One thousand plant transcriptomes and the phylogenomics of green plants. Nature 574, 679–685 (2019).
doi: 10.1038/s41586-019-1693-2
Li, Y. et al. HGT is widespread in insects and contributes to male courtship in lepidopterans. Cell 185, 2975–2987.e10 (2022).
pubmed: 35853453 doi: 10.1016/j.cell.2022.06.014 pmcid: 9357157
Eisen, J. A. Phylogenomics: improving functional predictions for uncharacterized genes by evolutionary analysis. Genome Res. 8, 163–167 (1998).
pubmed: 9521918 doi: 10.1101/gr.8.3.163
Delsuc, F., Brinkmann, H. & Philippe, H. Phylogenomics and the reconstruction of the tree of life. Nat. Rev. Genet. 6, 361–375 (2005).
pubmed: 15861208 doi: 10.1038/nrg1603
Crotty, S. M. et al. GHOST: recovering historical signal from heterotachously evolved sequence alignments. Syst. Biol. 69, 249–264 (2020).
pubmed: 31364711
Rokas, A., Williams, B. L., King, N. & Carroll, S. B. Genome-scale approaches to resolving incongruence in molecular phylogenies. Nature 425, 798–804 (2003).
pubmed: 14574403 doi: 10.1038/nature02053
Kawahara, A. Y. et al. Phylogenomics reveals the evolutionary timing and pattern of butterflies and moths. Proc. Natl Acad. Sci. USA 116, 22657–22663 (2019).
pubmed: 31636187 doi: 10.1073/pnas.1907847116 pmcid: 6842621
Misof, B. et al. Phylogenomics resolves the timing and pattern of insect evolution. Science 346, 763–767 (2014).
pubmed: 25378627 doi: 10.1126/science.1257570
Dunn, C. W. et al. Broad phylogenomic sampling improves resolution of the animal tree of life. Nature 452, 745–749 (2008).
pubmed: 18322464 doi: 10.1038/nature06614
Bond, J. E. et al. Phylogenomics resolves a spider backbone phylogeny and rejects a prevailing paradigm for Orb web evolution. Curr. Biol. 24, 1765–1771 (2014).
pubmed: 25042592 doi: 10.1016/j.cub.2014.06.034
Li, Y. et al. A genome-scale phylogeny of the kingdom Fungi. Curr. Biol. 31, 1653–1665.e5 (2021).
pubmed: 33607033 doi: 10.1016/j.cub.2021.01.074 pmcid: 8347878
Simion, P. et al. A large and consistent phylogenomic dataset supports sponges as the sister group to all other animals. Curr. Biol. 27, 958–967 (2017).
pubmed: 28318975 doi: 10.1016/j.cub.2017.02.031
Whelan, N. V. et al. Ctenophore relationships and their placement as the sister group to all other animals. Nat. Ecol. Evol. 1, 1737–1746 (2017).
pubmed: 28993654 doi: 10.1038/s41559-017-0331-3 pmcid: 5664179
Lemmon, A. R. & Moriarty, E. C. The importance of proper model assumption in Bayesian phylogenetics. Syst. Biol. 53, 265–277 (2004).
pubmed: 15205052 doi: 10.1080/10635150490423520
Mao, Y. et al. A high-quality bonobo genome refines the analysis of hominid evolution. Nature 594, 77–81 (2021).
pubmed: 33953399 doi: 10.1038/s41586-021-03519-x pmcid: 8172381
Meleshko, O. et al. Extensive genome-wide phylogenetic discordance is due to incomplete lineage sorting and not ongoing introgression in a rapidly radiated bryophyte genus. Mol. Biol. Evol. 38, 2750–2766 (2021).
pubmed: 33681996 doi: 10.1093/molbev/msab063 pmcid: 8233498
Feng, S. et al. Incomplete lineage sorting and phenotypic evolution in marsupials. Cell 185, 1646–1660.e18 (2022).
pubmed: 35447073 doi: 10.1016/j.cell.2022.03.034 pmcid: 9200472
Avise, J. C. & Robinson, T. J. Hemiplasy: a new term in the lexicon of phylogenetics. Syst. Biol. 57, 503–507 (2008).
pubmed: 18570042 doi: 10.1080/10635150802164587
Maddison, W. P. & Knowles, L. L. Inferring phylogeny despite incomplete lineage sorting. Syst. Biol. 55, 21–30 (2006).
pubmed: 16507521 doi: 10.1080/10635150500354928
Degnan, J. H. & Rosenberg, N. A. Gene tree discordance, phylogenetic inference and the multispecies coalescent. Trends Ecol. Evol. 24, 332–340 (2009).
pubmed: 19307040 doi: 10.1016/j.tree.2009.01.009
Song, S., Liu, L., Edwards, S. V. & Wu, S. Resolving conflict in eutherian mammal phylogeny using phylogenomics and the multispecies coalescent model. Proc. Natl Acad. Sci. USA 109, 14942–14947 (2012).
pubmed: 22930817 doi: 10.1073/pnas.1211733109 pmcid: 3443116
Flouri, T., Jiao, X., Rannala, B. & Yang, Z. Species tree inference with BPP using genomic sequences and the multispecies coalescent. Mol. Biol. Evol. 35, 2585–2593 (2018).
pubmed: 30053098 doi: 10.1093/molbev/msy147 pmcid: 6188564
Bouckaert, R. et al. BEAST 2.5: an advanced software platform for Bayesian evolutionary analysis. PLoS Comput. Biol. 15, e1006650 (2019).
pubmed: 30958812 doi: 10.1371/journal.pcbi.1006650 pmcid: 6472827
Liu, L., Yu, L., Kubatko, L., Pearl, D. K. & Edwards, S. V. Coalescent methods for estimating phylogenetic trees. Mol. Phylogenet. Evol. 53, 320–328 (2009).
pubmed: 19501178 doi: 10.1016/j.ympev.2009.05.033
Liu, L., Yu, L. & Edwards, S. V. A maximum pseudo-likelihood approach for estimating species trees under the coalescent model. BMC Evol. Biol. 10, 302 (2010).
pubmed: 20937096 doi: 10.1186/1471-2148-10-302 pmcid: 2976751
Zhang, C., Rabiee, M., Sayyari, E. & Mirarab, S. ASTRAL-III: polynomial time species tree reconstruction from partially resolved gene trees. BMC Bioinform. 19, 153 (2018).
doi: 10.1186/s12859-018-2129-y
Zhang, C. & Mirarab, S. Weighting by gene tree uncertainty improves accuracy of quartet-based species trees. Mol. Biol. Evol. 39, msac215 (2022). This study describes the latest version of the state-of-the-art software for phylogenomic inference using summary-based coalescence methods. By incorporating weighting schemes that reduce the contribution of weakly supported gene trees and/or of trees with long branch lengths.
pubmed: 36201617 doi: 10.1093/molbev/msac215 pmcid: 9750496
Morel, B., Williams, T. A. & Stamatakis, A. Asteroid: a new algorithm to infer species trees from gene trees under high proportions of missing data. Bioinformatics 39, btac832 (2023).
pubmed: 36576010 doi: 10.1093/bioinformatics/btac832
Kominek, J. et al. Eukaryotic acquisition of a bacterial operon. Cell 176, 1356–1366.e10 (2019).
pubmed: 30799038 doi: 10.1016/j.cell.2019.01.034 pmcid: 7295392
Arnold, B. J., Huang, I.-T. & Hanage, W. P. Horizontal gene transfer and adaptive evolution in bacteria. Nat. Rev. Microbiol. 20, 206–218 (2022).
pubmed: 34773098 doi: 10.1038/s41579-021-00650-4
Gophna, U. & Altman-Price, N. Horizontal gene transfer in Archaea — from mechanisms to genome evolution. Annu. Rev. Microbiol. 76, 481–502 (2022).
pubmed: 35667126 doi: 10.1146/annurev-micro-040820-124627
Van Etten, J. & Bhattacharya, D. Horizontal gene transfer in eukaryotes: not if, but how much? Trends Genet. 36, 915–925 (2020).
pubmed: 33012528 doi: 10.1016/j.tig.2020.08.006
Lapierre, P., Lasek-Nesselquist, E. & Gogarten, J. P. The impact of HGT on phylogenomic reconstruction methods. Brief. Bioinform. 15, 79–90 (2014).
pubmed: 22908214 doi: 10.1093/bib/bbs050
Wisecaver, J. H. & Rokas, A. Fungal metabolic gene clusters: caravans traveling across genomes and environments. Front. Microbiol. 6, 161 (2015).
pubmed: 25784900 doi: 10.3389/fmicb.2015.00161 pmcid: 4347624
Sevillya, G., Adato, O. & Snir, S. Detecting horizontal gene transfer: a probabilistic approach. BMC Genomics 21, 106 (2020).
pubmed: 32138652 doi: 10.1186/s12864-019-6395-5 pmcid: 7057450
Gladyshev, E. A., Meselson, M. & Arkhipova, I. R. Massive horizontal gene transfer in Bdelloid rotifers. Science 320, 1210–1213 (2008).
pubmed: 18511688 doi: 10.1126/science.1156407
Szöllősi, G. J., Boussau, B., Abby, S. S., Tannier, E. & Daubin, V. Phylogenetic modeling of lateral gene transfer reconstructs the pattern and relative timing of speciations. Proc. Natl Acad. Sci. USA 109, 17513–17518 (2012). This study uses a statistical model of genome evolution that considers gene duplications, gene losses and horizontal gene transfers in phylogenetic reconstruction, demonstrating that incongruence stemming from these processes can inform inferences of evolutionary history.
pubmed: 23043116 doi: 10.1073/pnas.1202997109 pmcid: 3491530
Williams, T. A. et al. Integrative modeling of gene and genome evolution roots the archaeal tree of life. Proc. Natl Acad. Sci. USA 114, E4602–E4611 (2017).
pubmed: 28533395 doi: 10.1073/pnas.1618463114 pmcid: 5468678
Morel, B. et al. SpeciesRax: a tool for maximum likelihood species tree inference from gene family trees under duplication, transfer, and loss. Mol. Biol. Evol. 39, msab365 (2022).
pubmed: 35021210 doi: 10.1093/molbev/msab365 pmcid: 8826479
Zhang, D. et al. Most genomic loci misrepresent the phylogeny of an avian radiation because of ancient gene flow. Syst. Biol. 70, 961–975 (2021).
pubmed: 33787929 doi: 10.1093/sysbio/syab024 pmcid: 8357342
Hibbins, M. S. & Hahn, M. W. Phylogenomic approaches to detecting and characterizing introgression. Genetics 220, iyab173 (2022).
pubmed: 34788444 doi: 10.1093/genetics/iyab173
Sang, T. & Zhong, Y. Testing hybridization hypotheses based on incongruent gene trees. Syst. Biol. 49, 422–434 (2000).
pubmed: 12116420 doi: 10.1080/10635159950127321
Langdon, Q. K., Peris, D., Kyle, B. & Hittinger, C. T. sppIDer: a species identification tool to investigate hybrid genomes with high-throughput sequencing. Mol. Biol. Evol. 35, 2835–2849 (2018).
pubmed: 30184140 pmcid: 6231485
Steenwyk, J. L. et al. Pathogenic allodiploid hybrids of Aspergillus fungi. Curr. Biol. 30, 2495–2507.e7 (2020).
pubmed: 32502407 doi: 10.1016/j.cub.2020.04.071 pmcid: 7343619
Yu, Y., Dong, J., Liu, K. J. & Nakhleh, L. Maximum likelihood inference of reticulate evolutionary histories. Proc. Natl Acad. Sci. USA 111, 16448–16453 (2014).
pubmed: 25368173 doi: 10.1073/pnas.1407950111 pmcid: 4246314
Durand, E. Y., Patterson, N., Reich, D. & Slatkin, M. Testing for ancient admixture between closely related populations. Mol. Biol. Evol. 28, 2239–2252 (2011).
pubmed: 21325092 doi: 10.1093/molbev/msr048 pmcid: 3144383
Pease, J. B. & Hahn, M. W. Detection and polarization of introgression in a five-taxon phylogeny. Syst. Biol. 64, 651–662 (2015). This work describes a method for detecting incomplete lineage sorting and introgression in the five-taxon case, enabling identification of the taxa involved and the direction of introgression.
pubmed: 25888025 doi: 10.1093/sysbio/syv023
Hahn, M. W. & Hibbins, M. S. A three-sample test for introgression. Mol. Biol. Evol. 36, 2878–2882 (2019).
pubmed: 31373630 doi: 10.1093/molbev/msz178
Suvorov, A. et al. Widespread introgression across a phylogeny of 155 Drosophila genomes. Curr. Biol. 32, 111–123.e5 (2022).
pubmed: 34788634 doi: 10.1016/j.cub.2021.10.052
Posada, D. & Crandall, K. A. The effect of recombination on the accuracy of phylogeny estimation. J. Mol. Evol. 54, 396–402 (2002).
pubmed: 11847565 doi: 10.1007/s00239-001-0034-9
Bruen, T. C., Philippe, H. & Bryant, D. A simple and robust statistical test for detecting the presence of recombination. Genetics 172, 2665–2681 (2006).
pubmed: 16489234 doi: 10.1534/genetics.105.048975 pmcid: 1456386
Martin, D. P. et al. RDP5: a computer program for analyzing recombination in, and removing signals of recombination from, nucleotide sequence datasets. Virus Evol. 7, veaa087 (2021).
pubmed: 33936774 doi: 10.1093/ve/veaa087
Sackton, T. B. & Clark, N. Convergent evolution in the genomics era: new insights and directions. Phil. Trans. R. Soc. B 374, 20190102 (2019).
pubmed: 31154976 doi: 10.1098/rstb.2019.0102 pmcid: 6560275
Li, Y., Liu, Z., Shi, P. & Zhang, J. The hearing gene Prestin unites echolocating bats and whales. Curr. Biol. 20, R55–R56 (2010). Striking example of convergent molecular evolution in Prestin, a gene that encodes a protein involved in echolocation. Even though echolocating bats and whales are not sister lineages, bat and whale sequences of Prestin group these lineages together, demonstrating how convergent evolution can contribute to incongruence.
pubmed: 20129037 doi: 10.1016/j.cub.2009.11.042
Castoe, T. A. et al. Evidence for an ancient adaptive episode of convergent molecular evolution. Proc. Natl Acad. Sci. USA 106, 8986–8991 (2009).
pubmed: 19416880 doi: 10.1073/pnas.0900233106 pmcid: 2690048
Minh, B. Q. et al. IQ-TREE 2: new models and efficient methods for phylogenetic inference in the genomic era. Mol. Biol. Evol. 37, 1530–1534 (2020).
pubmed: 32011700 doi: 10.1093/molbev/msaa015 pmcid: 7182206
Musil, M. et al. FireProt
pubmed: 33346815 doi: 10.1093/bib/bbaa337
Hanson-Smith, V. & Johnson, A. PhyloBot: a web portal for automated phylogenetics, ancestral sequence reconstruction, and exploration of mutational trajectories. PLoS Comput. Biol. 12, e1004976 (2016).
pubmed: 27472806 doi: 10.1371/journal.pcbi.1004976 pmcid: 4966924
Martijn, J. et al. Hikarchaeia demonstrate an intermediate stage in the methanogen-to-halophile transition. Nat. Commun. 11, 5490 (2020).
pubmed: 33127909 doi: 10.1038/s41467-020-19200-2 pmcid: 7599335
Martijn, J., Vosseberg, J., Guy, L., Offre, P. & Ettema, T. J. G. Deep mitochondrial origin outside the sampled alphaproteobacteria. Nature 557, 101–105 (2018).
pubmed: 29695865 doi: 10.1038/s41586-018-0059-5
Muñoz-Gómez, S. A. et al. Site-and-branch-heterogeneous analyses of an expanded dataset favour mitochondria as sister to known Alphaproteobacteria. Nat. Ecol. Evol. 6, 253–262 (2022). This article describes a novel model of protein evolution that considers compositional heterogeneity both across sites of a data matrix and across branches of a phylogeny. This model is likely better than site-homogeneous or site-heterogenous models in cases where compositional heterogeneity varies across time and across the phylogeny such as the thorny question of the origin of mitochondria.
pubmed: 35027725 doi: 10.1038/s41559-021-01638-2
Riley, R. et al. Comparative genomics of biotechnologically important yeasts. Proc. Natl Acad. Sci. USA 113, 9882–9887 (2016).
pubmed: 27535936 doi: 10.1073/pnas.1603941113 pmcid: 5024638
Shen, X.-X. et al. Reconstructing the backbone of the Saccharomycotina yeast phylogeny using genome-scale data. G3 6, 3927–3939 (2016).
pubmed: 27672114 doi: 10.1534/g3.116.034744 pmcid: 5144963
Shen, X.-X., Hittinger, C. T. & Rokas, A. Contentious relationships in phylogenomic studies can be driven by a handful of genes. Nat. Ecol. Evol. 1, 0126 (2017). This article describes a novel approach to visualize single-gene and single-site support for conflicting phylogenetic hypotheses. Application of this approach on phylogenomic data from different instances of incongruence reveals that a few, or even single, genes or sites in very large phylogenomic data matrices can drive incongruence.
doi: 10.1038/s41559-017-0126
Shen, X.-X. et al. Tempo and mode of genome evolution in the budding yeast subphylum. Cell 175, 1533–1545.e20 (2018).
pubmed: 30415838 doi: 10.1016/j.cell.2018.10.023 pmcid: 6291210
Gitzendanner, M. A., Soltis, P. S., Wong, G. K.-S., Ruhfel, B. R. & Soltis, D. E. Plastid phylogenomic analysis of green plants: a billion years of evolutionary history. Am. J. Bot. 105, 291–301 (2018).
pubmed: 29603143 doi: 10.1002/ajb2.1048
Wickett, N. J. et al. Phylotranscriptomic analysis of the origin and early diversification of land plants. Proc. Natl Acad. Sci. USA 111, E4859–E4868 (2014).
pubmed: 25355905 doi: 10.1073/pnas.1323926111 pmcid: 4234587
Cheng, S. et al. Genomes of subaerial Zygnematophyceae provide insights into land plant evolution. Cell 179, 1057–1067.e14 (2019).
pubmed: 31730849 doi: 10.1016/j.cell.2019.10.019
Aberer, A. J., Krompass, D. & Stamatakis, A. Pruning rogue taxa improves phylogenetic accuracy: an efficient algorithm and webservice. Syst. Biol. 62, 162–166 (2013).
pubmed: 22962004 doi: 10.1093/sysbio/sys078
Struck, T. H. TreSpEx — detection of misleading signal in phylogenetic reconstructions based on tree information. Evol. Bioinform. Online 10, EBO.S14239 (2014).
doi: 10.4137/EBO.S14239
Amemiya, C. T. et al. The African coelacanth genome provides insights into tetrapod evolution. Nature 496, 311–316 (2013).
pubmed: 23598338 doi: 10.1038/nature12027 pmcid: 3633110
Liu, S. et al. Ancient and modern genomes unravel the evolutionary history of the rhinoceros family. Cell 184, 4874–4885.e16 (2021).
pubmed: 34433011 doi: 10.1016/j.cell.2021.07.032
Perri, A. R. et al. Dire wolves were the last of an ancient New World canid lineage. Nature 591, 87–91 (2021).
pubmed: 33442059 doi: 10.1038/s41586-020-03082-x
Townsend, J. P. Profiling phylogenetic informativeness. Syst. Biol. 56, 222–231 (2007).
pubmed: 17464879 doi: 10.1080/10635150701311362
Patel, S., Kimball, R. T. & Braun, E. L. Error in phylogenetic estimation for bushes in the tree of life. J. Phylogenet. Evol. Biol. 01, 1000110 (2013).
doi: 10.4172/2329-9002.1000110
Rokas, A. & Carroll, S. B. Bushes in the tree of life. PLoS Biol. 4, e352 (2006).
pubmed: 17105342 doi: 10.1371/journal.pbio.0040352 pmcid: 1637082
Pipes, L., Wang, H., Huelsenbeck, J. P. & Nielsen, R. Assessing uncertainty in the rooting of the SARS-CoV-2 phylogeny. Mol. Biol. Evol. 38, 1537–1543 (2021). This article shows that statistical support for the rooting of the SAR-CoV-2 phylogeny is weak, suggesting that there is a limit in our power to resolve certain phylogenetic branches.
pubmed: 33295605 doi: 10.1093/molbev/msaa316
Steenwyk, J. L. et al. OrthoSNAP: a tree splitting and pruning algorithm for retrieving single-copy orthologs from gene family trees. PLoS Biol. 20, e3001827 (2022).
pubmed: 36228036 doi: 10.1371/journal.pbio.3001827 pmcid: 9595520
Willson, J., Roddur, M. S., Liu, B., Zaharias, P. & Warnow, T. DISCO: species tree inference using multicopy gene family tree decomposition. Syst. Biol. 71, 610–629 (2022).
pubmed: 34450658 doi: 10.1093/sysbio/syab070
Springer, M. S. & Gatesy, J. The gene tree delusion. Mol. Phylogenet. Evol. 94, 1–33 (2016).
pubmed: 26238460 doi: 10.1016/j.ympev.2015.07.018
Sanderson, M. J., McMahon, M. M. & Steel, M. Terraces in phylogenetic tree space. Science 333, 448–450 (2011).
pubmed: 21680810 doi: 10.1126/science.1206357
Xi, Z. et al. Phylogenomics and a posteriori data partitioning resolve the Cretaceous angiosperm radiation Malpighiales. Proc. Natl Acad. Sci. USA 109, 17519–17524 (2012).
pubmed: 23045684 doi: 10.1073/pnas.1205818109 pmcid: 3491498
Sanderson, M. J., McMahon, M. M., Stamatakis, A., Zwickl, D. J. & Steel, M. Impacts of terraces on phylogenetic inference. Syst. Biol. 64, 709–726 (2015).
pubmed: 25999395 doi: 10.1093/sysbio/syv024
Steenwyk, J. L., Shen, X.-X., Lind, A. L., Goldman, G. H. & Rokas, A. A robust phylogenomic time tree for biotechnologically and medically important fungi in the genera Aspergillus and Penicillium. mBio 10, e00925-19 (2019).
pubmed: 31289177 doi: 10.1128/mBio.00925-19 pmcid: 6747717
Smith, B. T., Mauck, W. M., Benz, B. W. & Andersen, M. J. Uneven missing data skew phylogenomic relationships within the lories and lorikeets. Genome Biol. Evol. 12, 1131–1147 (2020).
pubmed: 32470111 doi: 10.1093/gbe/evaa113 pmcid: 7486955
Emms, D. M. & Kelly, S. OrthoFinder: phylogenetic orthology inference for comparative genomics. Genome Biol. 20, 238 (2019). This article describes OrthoFinder, a state-of-the-art software for the identification of groups of orthologous genes that considers incomplete lineage sorting and gene duplication and loss, improving the accuracy of ortholog inference.
pubmed: 31727128 doi: 10.1186/s13059-019-1832-y pmcid: 6857279
Weisman, C. M., Murray, A. W. & Eddy, S. R. Many, but not all, lineage-specific genes can be explained by homology detection failure. PLoS Biol. 18, e3000862 (2020).
pubmed: 33137085 doi: 10.1371/journal.pbio.3000862 pmcid: 7660931
Martín-Durán, J. M., Ryan, J. F., Vellutini, B. C., Pang, K. & Hejnol, A. Increased taxon sampling reveals thousands of hidden orthologs in flatworms. Genome Res. 27, 1263–1272 (2017).
pubmed: 28400424 doi: 10.1101/gr.216226.116 pmcid: 5495077
Eddy, S. R. Accelerated profile HMM searches. PLoS Comput. Biol. 7, e1002195 (2011).
pubmed: 22039361 doi: 10.1371/journal.pcbi.1002195 pmcid: 3197634
Tassia, M. G., David, K. T., Townsend, J. P. & Halanych, K. M. TIAMMAt: leveraging biodiversity to revise protein domain models, evidence from innate immunity. Mol. Biol. Evol. 38, 5806–5818 (2021).
pubmed: 34459919 doi: 10.1093/molbev/msab258 pmcid: 8662601
Scannell, D. R., Byrne, K. P., Gordon, J. L., Wong, S. & Wolfe, K. H. Multiple rounds of speciation associated with reciprocal gene loss in polyploid yeasts. Nature 440, 341–345 (2006).
pubmed: 16541074 doi: 10.1038/nature04562
Philippe, H. et al. Phylogenomics revives traditional views on deep animal relationships. Curr. Biol. 19, 706–712 (2009).
pubmed: 19345102 doi: 10.1016/j.cub.2009.02.052
Steenwyk, J. L. et al. PhyKIT: a broadly applicable UNIX shell toolkit for processing and analyzing phylogenomic data. Bioinformatics 37, 2325–2331 (2021).
pubmed: 33560364 doi: 10.1093/bioinformatics/btab096 pmcid: 8388027
Mai, U. & Mirarab, S. TreeShrink: fast and accurate detection of outlier long branches in collections of phylogenetic trees. BMC Genom. 19, 272 (2018).
doi: 10.1186/s12864-018-4620-2
Tice, A. K. et al. PhyloFisher: a phylogenomic package for resolving eukaryotic relationships. PLoS Biol. 19, e3001365 (2021).
pubmed: 34358228 doi: 10.1371/journal.pbio.3001365 pmcid: 8345874
Kocot, K. M., Citarella, M. R., Moroz, L. L. & Halanych, K. M. PhyloTreePruner: a phylogenetic tree-based approach for selection of orthologous sequences for phylogenomics. Evol. Bioinform. Online 9, EBO.S12813 (2013).
doi: 10.4137/EBO.S12813
Parks, D. H., Imelfort, M., Skennerton, C. T., Hugenholtz, P. & Tyson, G. W. CheckM: assessing the quality of microbial genomes recovered from isolates, single cells, and metagenomes. Genome Res. 25, 1043–1055 (2015).
pubmed: 25977477 doi: 10.1101/gr.186072.114 pmcid: 4484387
Hugoson, E., Lam, W. T. & Guy, L. miComplete: weighted quality evaluation of assembled microbial genomes. Bioinformatics 36, 936–937 (2020).
pubmed: 31504158 doi: 10.1093/bioinformatics/btz664
Jukes, T. H. & Cantor, C. R. In Mammalian Protein Metabolism 1st edn, Vol. III (ed. Munro, H. N.) Ch. 24 (Academic Press, 1969).
Kimura, M. A simple method for estimating evolutionary rates of base substitutions through comparative studies of nucleotide sequences. J. Mol. Evol. 16, 111–120 (1980).
pubmed: 7463489 doi: 10.1007/BF01731581
Felsenstein, J. Evolutionary trees from DNA sequences: a maximum likelihood approach. J. Mol. Evol. 17, 368–376 (1981).
pubmed: 7288891 doi: 10.1007/BF01734359
Tavaré, S. Some probabilistic and statistical problems in the analysis of DNA sequences. Lect. Math. Life Sci. 17, 57–86 (1986).
Arenas, M. Trends in substitution models of molecular evolution. Front. Genet. 6, 319 (2015).
pubmed: 26579193 doi: 10.3389/fgene.2015.00319 pmcid: 4620419
Yang, Z., Nielsen, R. & Hasegawa, M. Models of amino acid substitution and applications to mitochondrial protein evolution. Mol. Biol. Evol. 15, 1600–1611 (1998).
pubmed: 9866196 doi: 10.1093/oxfordjournals.molbev.a025888
Whelan, S. & Goldman, N. A general empirical model of protein evolution derived from multiple protein families using a maximum-likelihood approach. Mol. Biol. Evol. 18, 691–699 (2001).
pubmed: 11319253 doi: 10.1093/oxfordjournals.molbev.a003851
Le, S. Q. & Gascuel, O. An improved general amino acid replacement matrix. Mol. Biol. Evol. 25, 1307–1320 (2008).
pubmed: 18367465 doi: 10.1093/molbev/msn067
Darriba, D., Taboada, G. L., Doallo, R. & Posada, D. jModelTest 2: more models, new heuristics and parallel computing. Nat. Methods 9, 772–772 (2012).
pubmed: 22847109 doi: 10.1038/nmeth.2109 pmcid: 4594756
Susko, E. & Roger, A. J. On the use of information criteria for model selection in phylogenetics. Mol. Biol. Evol. 37, 549–562 (2020).
pubmed: 31688943 doi: 10.1093/molbev/msz228
Spielman, S. J. Relative model fit does not predict topological accuracy in single-gene protein phylogenetics. Mol. Biol. Evol. 37, 2110–2123 (2020).
pubmed: 32191313 doi: 10.1093/molbev/msaa075 pmcid: 7306691
Abadi, S., Azouri, D., Pupko, T. & Mayrose, I. Model selection may not be a mandatory step for phylogeny reconstruction. Nat. Commun. 10, 934 (2019).
pubmed: 30804347 doi: 10.1038/s41467-019-08822-w pmcid: 6389923
Bloom, J. D. An experimentally determined evolutionary model dramatically improves phylogenetic fit. Mol. Biol. Evol. 31, 1956–1978 (2014). Through systematic mutagenesis, functional selection and sequencing experiments, this study experimentally determines a substitution model for a viral protein. This parameter-free model is a much better fit than models with hundreds of parameters, highlighting the potential of high-throughput experimental strategies in improving the accuracy of phylogenetic inference.
pubmed: 24859245 doi: 10.1093/molbev/msu173 pmcid: 4104320
Kainer, D. & Lanfear, R. The effects of partitioning on phylogenetic inference. Mol. Biol. Evol. 32, 1611–1627 (2015).
pubmed: 25660373 doi: 10.1093/molbev/msv026
Lanfear, R., Frandsen, P. B., Wright, A. M., Senfeld, T. & Calcott, B. PartitionFinder 2: new methods for selecting partitioned models of evolution for molecular and morphological phylogenetic analyses. Mol. Biol. Evol. 34, 772–773 (2016).
Lartillot, N. & Philippe, H. A Bayesian mixture model for across-site heterogeneities in the amino-acid replacement process. Mol. Biol. Evol. 21, 1095–1109 (2004). This landmark study introduces site-heterogeneous models of sequence evolution. By considering compositional heterogeneity across sites, these models can better ameliorate the impact of long-branch attraction artefacts.
pubmed: 15014145 doi: 10.1093/molbev/msh112
Si Quang, L., Gascuel, O. & Lartillot, N. Empirical profile mixture models for phylogenetic reconstruction. Bioinformatics 24, 2317–2323 (2008).
doi: 10.1093/bioinformatics/btn445
Stairs, C. W. et al. Anaeramoebae are a divergent lineage of eukaryotes that shed light on the transition from anaerobic mitochondria to hydrogenosomes. Curr. Biol. 31, 5605–5612.e5 (2021).
pubmed: 34710348 doi: 10.1016/j.cub.2021.10.010
Galindo, L. J., López-García, P., Torruella, G., Karpov, S. & Moreira, D. Phylogenomics of a new fungal phylum reveals multiple waves of reductive evolution across Holomycota. Nat. Commun. 12, 4973 (2021).
pubmed: 34404788 doi: 10.1038/s41467-021-25308-w pmcid: 8371127
Williams, T. A., Cox, C. J., Foster, P. G., Szöllősi, G. J. & Embley, T. M. Phylogenomics provides robust support for a two-domains tree of life. Nat. Ecol. Evol. 4, 138–147 (2019).
pubmed: 31819234 doi: 10.1038/s41559-019-1040-x pmcid: 6942926
Minin, V., Abdo, Z., Joyce, P. & Sullivan, J. Performance-based selection of likelihood models for phylogeny estimation. Syst. Biol. 52, 674–683 (2003).
pubmed: 14530134 doi: 10.1080/10635150390235494
Yang, Z. & Rannala, B. Molecular phylogenetics: principles and practice. Nat. Rev. Genet. 13, 303–314 (2012).
pubmed: 22456349 doi: 10.1038/nrg3186
Sullivan, J. & Swofford, D. L. Are guinea pigs rodents? The importance of adequate models in molecular phylogenetics. J. Mamm. Evol. 4, 77–86 (1997).
doi: 10.1023/A:1027314112438
Lartillot, N., Brinkmann, H. & Philippe, H. Suppression of long-branch attraction artefacts in the animal phylogeny using a site-heterogeneous model. BMC Evol. Biol. 7, S4 (2007).
pubmed: 17288577 doi: 10.1186/1471-2148-7-S1-S4 pmcid: 1796613
Susko, E. & Roger, A. J. Long branch attraction biases in phylogenetics. Syst. Biol. 70, 838–843 (2021).
pubmed: 33528562 doi: 10.1093/sysbio/syab001
Husník, F., Chrudimský, T. & Hypša, V. Multiple origins of endosymbiosis within the Enterobacteriaceae (γ-Proteobacteria): convergence of complex phylogenetic approaches. BMC Biol. 9, 87 (2011).
pubmed: 22201529 doi: 10.1186/1741-7007-9-87 pmcid: 3271043
Capella-Gutiérrez, S., Marcet-Houben, M. & Gabaldón, T. Phylogenomics supports microsporidia as the earliest diverging clade of sequenced fungi. BMC Biol. 10, 47 (2012).
pubmed: 22651672 doi: 10.1186/1741-7007-10-47 pmcid: 3586952
Graybeal, A. Is it better to add taxa or characters to a difficult phylogenetic problem? Syst. Biol. 47, 9–17 (1998).
pubmed: 12064243 doi: 10.1080/106351598260996
Hillis, D. M. Inferring complex phytogenies. Nature 383, 130–131 (1996).
pubmed: 8774876 doi: 10.1038/383130a0
Lopez, P., Casane, D. & Philippe, H. Heterotachy, an important process of protein evolution. Mol. Biol. Evol. 19, 1–7 (2002).
pubmed: 11752184 doi: 10.1093/oxfordjournals.molbev.a003973
Philippe, H., Zhou, Y., Brinkmann, H., Rodrigue, N. & Delsuc, F. Heterotachy and long-branch attraction in phylogenetics. BMC Evol. Biol. 5, 50 (2005).
pubmed: 16209710 doi: 10.1186/1471-2148-5-50 pmcid: 1274308
Bergsten, J. A review of long-branch attraction. Cladistics 21, 163–193 (2005).
pubmed: 34892859 doi: 10.1111/j.1096-0031.2005.00059.x
Geuten, K., Massingham, T., Darius, P., Smets, E. & Goldman, N. Experimental design criteria in phylogenetics: where to add taxa. Syst. Biol. 56, 609–622 (2007).
pubmed: 17654365 doi: 10.1080/10635150701499563
Pollock, D. D., Zwickl, D. J., McGuire, J. A. & Hillis, D. M. Increased taxon sampling is advantageous for phylogenetic inference. Syst. Biol. 51, 664–671 (2002).
pubmed: 12228008 doi: 10.1080/10635150290102357
Brady, S. G., Litman, J. R. & Danforth, B. N. Rooting phylogenies using gene duplications: an empirical example from the bees (Apoidea). Mol. Phylogenet. Evol. 60, 295–304 (2011).
pubmed: 21600997 doi: 10.1016/j.ympev.2011.05.002
Mathews, S., Clements, M. D. & Beilstein, M. A. A duplicate gene rooting of seed plants and the phylogenetic position of flowering plants. Phil. Trans. R. Soc. B 365, 383–395 (2010).
pubmed: 20047866 doi: 10.1098/rstb.2009.0233 pmcid: 2838261
Emms, D. M. & Kelly, S. STRIDE: species tree root inference from gene duplication events. Mol. Biol. Evol. 34, 3267–3278 (2017).
pubmed: 29029342 doi: 10.1093/molbev/msx259 pmcid: 5850722
Naser-Khdour, S., Quang Minh, B. & Lanfear, R. Assessing confidence in root placement on phylogenies: an empirical study using nonreversible models for mammals. Syst. Biol. 71, 959–972 (2022).
pubmed: 34387349 doi: 10.1093/sysbio/syab067
Bettisworth, B. & Stamatakis, A. Root Digger: a root placement program for phylogenetic trees. BMC Bioinformatics 22, 225 (2021).
pubmed: 33932975 doi: 10.1186/s12859-021-03956-5 pmcid: 8088003
Drummond, A. J., Ho, S. Y. W., Phillips, M. J. & Rambaut, A. Relaxed phylogenetics and dating with confidence. PLoS Biol. 4, e88 (2006).
pubmed: 16683862 doi: 10.1371/journal.pbio.0040088 pmcid: 1395354
Tria, F. D. K., Landan, G. & Dagan, T. Phylogenetic rooting using minimal ancestor deviation. Nat. Ecol. Evol. 1, 0193 (2017).
doi: 10.1038/s41559-017-0193
Ashkenazy, H., Sela, I., Levy, K. E., Landan, G. & Pupko, T. Multiple sequence alignment averaging improves phylogeny reconstruction. Syst. Biol. 68, 117–130 (2019).
pubmed: 29771363 doi: 10.1093/sysbio/syy036
Wang, L.-S. et al. The impact of multiple protein sequence alignment on phylogenetic estimation. IEEE/ACM Trans. Comput. Biol. Bioinform. 8, 1108–1119 (2011).
doi: 10.1109/TCBB.2009.68
Landan, G. & Graur, D. Characterization of pairwise and multiple sequence alignment errors. Gene 441, 141–147 (2009).
pubmed: 18614299 doi: 10.1016/j.gene.2008.05.016
Ali, R. H., Bogusz, M. & Whelan, S. Identifying clusters of high confidence homologies in multiple sequence alignments. Mol. Biol. Evol. 36, 2340–2351 (2019).
pubmed: 31209473 doi: 10.1093/molbev/msz142 pmcid: 6933875
Zhang, C., Zhao, Y., Braun, E. L. & Mirarab, S. TAPER: pinpointing errors in multiple sequence alignments despite varying rates of evolution. Methods Ecol. Evol. 12, 2145–2158 (2021).
doi: 10.1111/2041-210X.13696
Tan, G. et al. Current methods for automated filtering of multiple sequence alignments frequently worsen single-gene phylogenetic inference. Syst. Biol. 64, 778–791 (2015). Upending conventional wisdom, this study convincingly demonstrates that trimming typically reduces the accuracy of phylogenetic inference and contributes to incongruence.
pubmed: 26031838 doi: 10.1093/sysbio/syv033 pmcid: 4538881
Steenwyk, J. L., Buida, T. J., Li, Y., Shen, X.-X. & Rokas, A. ClipKIT: a multiple sequence alignment trimming software for accurate phylogenomic inference. PLoS Biol. 18, e3001007 (2020). This article describes a novel and more accurate approach to multiple sequence alignment trimming, where phylogenetically informative sites, which are more easily defined than phylogenetically uninformative sites, are retained and other sites are removed.
pubmed: 33264284 doi: 10.1371/journal.pbio.3001007 pmcid: 7735675
Susko, E. & Roger, A. J. On reduced amino acid alphabets for phylogenetic inference. Mol. Biol. Evol. 24, 2139–2150 (2007).
pubmed: 17652333 doi: 10.1093/molbev/msm144
Blanquart, S. A Bayesian compound stochastic process for modeling nonstationary and nonhomogeneous sequence evolution. Mol. Biol. Evol. 23, 2058–2071 (2006).
pubmed: 16931538 doi: 10.1093/molbev/msl091
Phillips, M. J., Delsuc, F. & Penny, D. Genome-scale phylogeny and the detection of systematic biases. Mol. Biol. Evol. 21, 1455–1458 (2004).
pubmed: 15084674 doi: 10.1093/molbev/msh137
Laumer, C. E. et al. Support for a clade of Placozoa and Cnidaria in genes with minimal compositional bias. eLife 7, e36278 (2018).
pubmed: 30373720 doi: 10.7554/eLife.36278 pmcid: 6277202
Hernandez, A. M. & Ryan, J. F. Six-state amino acid recoding is not an effective strategy to offset compositional heterogeneity and saturation in phylogenetic analyses. Syst. Biol. 70, 1200–1212 (2021).
pubmed: 33837789 doi: 10.1093/sysbio/syab027 pmcid: 8513762
Foster, P. G. et al. Recoding amino acids to a reduced alphabet may increase or decrease phylogenetic accuracy. Syst. Biol. https://doi.org/10.1093/sysbio/syac042 (2022).
doi: 10.1093/sysbio/syac042
Wascher, M. & Kubatko, L. Consistency of SVDQuartets and maximum likelihood for coalescent-based species tree estimation. Syst. Biol. 70, 33–48 (2021).
pubmed: 32415974 doi: 10.1093/sysbio/syaa039
Alda, F. et al. Resolving deep nodes in an ancient radiation of neotropical fishes in the presence of conflicting signals from incomplete lineage sorting. Syst. Biol. 68, 573–593 (2019).
pubmed: 30521024 doi: 10.1093/sysbio/syy085
Shen, X.-X., Steenwyk, J. L. & Rokas, A. Dissecting incongruence between concatenation- and quartet-based approaches in phylogenomic data. Syst. Biol. 70, 997–1014 (2021).
pubmed: 33616672 doi: 10.1093/sysbio/syab011
Darriba, D., Flouri, T. & Stamatakis, A. The state of software for evolutionary biology. Mol. Biol. Evol. 35, 1037–1046 (2018).
pubmed: 29385525 doi: 10.1093/molbev/msy014 pmcid: 5913673
Shen, X.-X., Li, Y., Hittinger, C. T., Chen, X. & Rokas, A. An investigation of irreproducibility in maximum likelihood phylogenetic inference. Nat. Commun. 11, 6096 (2020). This study reports that a considerable fraction of single gene phylogenies inferred from phylogenomic data matrices is irreproducible, leading to a novel source of incongruence in phylogenomic studies.
pubmed: 33257660 doi: 10.1038/s41467-020-20005-6 pmcid: 7705714
Shen, X.-X., Salichos, L. & Rokas, A. A genome-scale investigation of how sequence, function, and tree-based gene properties influence phylogenetic inference. Genome Biol. Evol. 8, 2565–2580 (2016).
pubmed: 27492233 doi: 10.1093/gbe/evw179 pmcid: 5010910
Mongiardino Koch, N. Phylogenomic subsampling and the search for phylogenetically reliable loci. Mol. Biol. Evol. 38, 4025–4038 (2021).
pubmed: 33983409 doi: 10.1093/molbev/msab151 pmcid: 8382905
Haag, J., Höhler, D., Bettisworth, B. & Stamatakis, A. From easy to hopeless — predicting the difficulty of phylogenetic analyses. Mol. Biol. Evol. 39, msac254 (2022).
pubmed: 36395091 doi: 10.1093/molbev/msac254 pmcid: 9728795
Hillis, D. M. & Bull, J. J. An empirical test of bootstrapping as a method for assessing confidence in phylogenetic analysis. Syst. Biol. 42, 182–192 (1993).
doi: 10.1093/sysbio/42.2.182
Anisimova, M., Gil, M., Dufayard, J.-F., Dessimoz, C. & Gascuel, O. Survey of branch support methods demonstrates accuracy, power, and robustness of fast likelihood-based approximation schemes. Syst. Biol. 60, 685–699 (2011).
pubmed: 21540409 doi: 10.1093/sysbio/syr041 pmcid: 3158332
Lemoine, F. et al. Renewing Felsenstein’s phylogenetic bootstrap in the era of big data. Nature 556, 452–456 (2018).
pubmed: 29670290 doi: 10.1038/s41586-018-0043-0 pmcid: 6030568
Molloy, E. K. & Warnow, T. To include or not to include: the impact of gene filtering on species tree estimation methods. Syst. Biol. 67, 285–303 (2018).
pubmed: 29029338 doi: 10.1093/sysbio/syx077
Minh, B. Q., Hahn, M. W. & Lanfear, R. New methods to calculate concordance factors for phylogenomic datasets. Mol. Biol. Evol. 37, 2727–2733 (2020). This article reports the development of methods to calculate the degree to which sites or genes support a particular branch of a phylogeny, also known as concordance factors, and their implementation in the IQ-TREE software. Concordance factors are very useful in identifying the presence of incongruence among a set of trees.
pubmed: 32365179 doi: 10.1093/molbev/msaa106 pmcid: 7475031
Ané, C., Larget, B., Baum, D. A., Smith, S. D. & Rokas, A. Bayesian estimation of concordance among gene trees. Mol. Biol. Evol. 24, 412–426 (2006).
pubmed: 17095535 doi: 10.1093/molbev/msl170
Baum, D. A. Concordance trees, concordance factors, and the exploration of reticulate genealogy. Taxon 56, 417–426 (2007).
doi: 10.1002/tax.562013
Larget, B. R., Kotha, S. K., Dewey, C. N. & Ané, C. BUCKy: gene tree/species tree reconciliation with Bayesian concordance analysis. Bioinformatics 26, 2910–2911 (2010).
pubmed: 20861028 doi: 10.1093/bioinformatics/btq539
Salichos, L. & Rokas, A. Inferring ancient divergences requires genes with strong phylogenetic signals. Nature 497, 327–331 (2013).
pubmed: 23657258 doi: 10.1038/nature12130
Kobert, K., Salichos, L., Rokas, A. & Stamatakis, A. Computing the internode certainty and related measures from partial gene trees. Mol. Biol. Evol. 33, 1606–1617 (2016).
pubmed: 26915959 doi: 10.1093/molbev/msw040 pmcid: 4868120
Zhou, X. et al. Quartet-based computations of internode certainty provide robust measures of phylogenetic incongruence. Syst. Biol. 69, 308–324 (2020). This article reports the development of internode certainty measures for phylogenomic data matrices with partial taxon coverage. By explicitly quantifying the level of incongruence of a given internal branch among a set of phylogenetic trees, internode certainty measures are a key tool for diagnosing the presence of incongruence in phylogenomic studies.
pubmed: 31504977 doi: 10.1093/sysbio/syz058
Salichos, L., Stamatakis, A. & Rokas, A. Novel information theory-based measures for quantifying incongruence among phylogenetic trees. Mol. Biol. Evol. 31, 1261–1271 (2014).
pubmed: 24509691 doi: 10.1093/molbev/msu061
Huson, D. H. & Bryant, D. Application of phylogenetic networks in evolutionary studies. Mol. Biol. Evol. 23, 254–267 (2006).
pubmed: 16221896 doi: 10.1093/molbev/msj030
Huson, D. H. SplitsTree: analyzing and visualizing evolutionary data. Bioinformatics 14, 68–73 (1998).
pubmed: 9520503 doi: 10.1093/bioinformatics/14.1.68
Huson, D. H., Klöpper, T., Lockhart, P. J. & Steel, M. A. Reconstruction of reticulate networks from gene trees. In Proc. 9th Annual International Conference on Research in Computational Molecular Biology, RECOMB 2005 (eds Miyano, S. et al.) 233–249 (Springer, Berlin, 2005).
Wen, D., Yu, Y., Zhu, J. & Nakhleh, L. Inferring phylogenetic networks using PhyloNet. Syst. Biol. 67, 735–740 (2018).
pubmed: 29514307 doi: 10.1093/sysbio/syy015 pmcid: 6005058
Lutteropp, S., Scornavacca, C., Kozlov, A. M., Morel, B. & Stamatakis, A. NetRAX: accurate and fast maximum likelihood phylogenetic network inference. Bioinformatics 38, 3725–3733 (2022).
pubmed: 35713506 doi: 10.1093/bioinformatics/btac396 pmcid: 9344847
Arcila, D. et al. Genome-wide interrogation advances resolution of recalcitrant groups in the tree of life. Nat. Ecol. Evol. 1, 0020 (2017).
doi: 10.1038/s41559-016-0020
Pease, J. B., Brown, J. W., Walker, J. F., Hinchliff, C. E. & Smith, S. A. Quartet sampling distinguishes lack of support from conflicting support in the green plant tree of life. Am. J. Bot. 105, 385–403 (2018).
pubmed: 29746719 doi: 10.1002/ajb2.1016
Sayyari, E. & Mirarab, S. Testing for polytomies in phylogenetic species trees using quartet frequencies. Genes 9, 132 (2018).
pubmed: 29495636 doi: 10.3390/genes9030132 pmcid: 5867853
Ogden, T. H. & Rosenberg, M. S. Multiple sequence alignment accuracy and phylogenetic inference. Syst. Biol. 55, 314–328 (2006).
pubmed: 16611602 doi: 10.1080/10635150500541730
Zhou, X., Shen, X.-X., Hittinger, C. T. & Rokas, A. Evaluating fast maximum likelihood-based phylogenetic programs using empirical phylogenomic data sets. Mol. Biol. Evol. 35, 486–503 (2018).
pubmed: 29177474 doi: 10.1093/molbev/msx302
Suvorov, A., Hochuli, J. & Schrider, D. R. Accurate inference of tree topologies from multiple sequence alignments using deep learning. Syst. Biol. 69, 221–233 (2020).
pubmed: 31504938 doi: 10.1093/sysbio/syz060
Azouri, D., Abadi, S., Mansour, Y., Mayrose, I. & Pupko, T. Harnessing machine learning to guide phylogenetic-tree search algorithms. Nat. Commun. 12, 1983 (2021).
pubmed: 33790270 doi: 10.1038/s41467-021-22073-8 pmcid: 8012635
Rosenzweig, B. K., Hahn, M. W. & Kern, A. Accurate detection of incomplete lineage sorting via supervised machine learning. Preprint at bioRxiv https://doi.org/10.1101/2022.11.09.515828 (2022).
doi: 10.1101/2022.11.09.515828
Grealey, J. et al. The carbon footprint of bioinformatics. Mol. Biol. Evol. 39, msac034 (2022). This article examines the environmental impact and carbon footprint of bioinformatic analyses, including phylogenetics, offering numerous suggestions for greener computing.
pubmed: 35143670 doi: 10.1093/molbev/msac034 pmcid: 8892942
Darriba, D. et al. ModelTest-NG: a new and scalable tool for the selection of DNA and protein evolutionary models. Mol. Biol. Evol. 37, 291–294 (2020).
pubmed: 31432070 doi: 10.1093/molbev/msz189
Posada, D. jModelTest: phylogenetic model averaging. Mol. Biol. Evol. 25, 1253–1256 (2008).
pubmed: 18397919 doi: 10.1093/molbev/msn083
Kumar, S. Embracing green computing in molecular phylogenetics. Mol. Biol. Evol. 39, msac043 (2022).
pubmed: 35243506 doi: 10.1093/molbev/msac043 pmcid: 8894743
Höhler, D., Haag, J., Kozlov, A. M. & Stamatakis, A. A representative performance assessment of maximum likelihood based phylogenetic inference tools. Preprint at bioRxiv https://doi.org/10.1101/2022.10.31.514545 (2022).
doi: 10.1101/2022.10.31.514545
Scornavacca, C. & Galtier, N. Incomplete lineage sorting in mammalian phylogenomics. Syst. Biol. 66, 112–120 (2016).
Galtier, N. A model of horizontal gene transfer and the bacterial phylogeny problem. Syst. Biol. 56, 633–642 (2007).
pubmed: 17661231 doi: 10.1080/10635150701546231
Stolzer, M. et al. Inferring duplications, losses, transfers and incomplete lineage sorting with nonbinary species trees. Bioinformatics 28, i409–i415 (2012).
pubmed: 22962460 doi: 10.1093/bioinformatics/bts386 pmcid: 3436813
Nabhan, A. R. & Sarkar, I. N. The impact of taxon sampling on phylogenetic inference: a review of two decades of controversy. Brief. Bioinform. 13, 122–134 (2012).
pubmed: 21436145 doi: 10.1093/bib/bbr014
Li, Y., Shen, X.-X., Evans, B., Dunn, C. W. & Rokas, A. Rooting the animal tree of life. Mol. Biol. Evol. 38, 4322–4333 (2021). A systematic and in-depth examination of the evidence in favour of the sponge-sister and ctenophore-sister hypotheses concerning the rooting of the animal tree of life.
pubmed: 34097041 doi: 10.1093/molbev/msab170 pmcid: 8476155
Cheon, S., Zhang, J. & Park, C. Is phylotranscriptomics as reliable as phylogenomics? Mol. Biol. Evol. 37, 3672–3683 (2020).
pubmed: 32658973 doi: 10.1093/molbev/msaa181 pmcid: 7743905
Minh, B. Q., Dang, C. C., Vinh, L. S. & Lanfear, R. QMaker: fast and accurate method to estimate empirical models of protein evolution. Syst. Biol. 70, 1046–1060 (2021).
pubmed: 33616668 doi: 10.1093/sysbio/syab010 pmcid: 8357343
Sharma, S. & Kumar, S. Fast and accurate bootstrap confidence limits on genome-scale phylogenies using little bootstraps. Nat. Comput. Sci. 1, 573–577 (2021).
pubmed: 34734192 doi: 10.1038/s43588-021-00129-5 pmcid: 8560003
Hoang, D. T., Chernomor, O., von Haeseler, A., Minh, B. Q. & Vinh, L. S. UFBoot2: improving the ultrafast bootstrap approximation. Mol. Biol. Evol. 35, 518–522 (2018).
pubmed: 29077904 doi: 10.1093/molbev/msx281
Kowalczyk, A. et al. RERconverge: an R package for associating evolutionary rates with convergent traits. Bioinformatics 35, 4815–4817 (2019).
pubmed: 31192356 doi: 10.1093/bioinformatics/btz468 pmcid: 6853647
Leigh, J. W., Susko, E., Baumgartner, M. & Roger, A. J. Testing congruence in phylogenomic analysis. Syst. Biol. 57, 104–115 (2008).
pubmed: 18288620 doi: 10.1080/10635150801910436
Al Jewari, C. & Baldauf, S. L. Conflict over the Eukaryote root resides in strong outliers, mosaics and missing data sensitivity of site-specific (CAT) mixture models. Syst. Biol. 72, 1–16 (2023).
pubmed: 35412616 doi: 10.1093/sysbio/syac029
Camacho, C. et al. BLAST+: architecture and applications. BMC Bioinformatics 10, 421 (2009).
doi: 10.1186/1471-2105-10-421
Zhang, C., Scornavacca, C., Molloy, E. K. & Mirarab, S. ASTRAL-Pro: quartet-based species-tree inference despite paralogy. Mol. Biol. Evol. 37, 3292–3307 (2020).
pubmed: 32886770 doi: 10.1093/molbev/msaa139 pmcid: 7751180
Lartillot, N., Rodrigue, N., Stubbs, D. & Richer, J. PhyloBayes MPI: phylogenetic reconstruction with infinite mixtures of profiles in a parallel environment. Syst. Biol. 62, 611–615 (2013).
pubmed: 23564032 doi: 10.1093/sysbio/syt022
Kozlov, A. M., Darriba, D., Flouri, T., Morel, B. & Stamatakis, A. RAxML-NG: a fast, scalable and user-friendly tool for maximum likelihood phylogenetic inference. Bioinformatics 35, 4453–4455 (2019).
pubmed: 31070718 doi: 10.1093/bioinformatics/btz305 pmcid: 6821337
Liu, L., Yu, L., Pearl, D. K. & Edwards, S. V. Estimating species phylogenies using coalescence times among sequences. Syst. Biol. 58, 468–477 (2009).
pubmed: 20525601 doi: 10.1093/sysbio/syp031
Chifman, J. & Kubatko, L. Quartet inference from SNP data under the coalescent model. Bioinformatics 30, 3317–3324 (2014).
pubmed: 25104814 doi: 10.1093/bioinformatics/btu530 pmcid: 4296144
Redmond, A. K. & McLysaght, A. Evidence for sponges as sister to all other animals from partitioned phylogenomics with mixture models and recoding. Nat. Commun. 12, 1783 (2021).
pubmed: 33741994 doi: 10.1038/s41467-021-22074-7 pmcid: 7979703
Pisani, D. et al. Genomic data do not support comb jellies as the sister group to all other animals. Proc. Natl Acad. Sci. USA 112, 15402–15407 (2015).
pubmed: 26621703 doi: 10.1073/pnas.1518127112 pmcid: 4687580
Feuda, R. et al. Improved modeling of compositional heterogeneity supports sponges as sister to all other animals. Curr. Biol. 27, 3864–3870.e4 (2017).
pubmed: 29199080 doi: 10.1016/j.cub.2017.11.008
Ryan, J. F. et al. The genome of the Ctenophore Mnemiopsis leidyi and its implications for cell type evolution. Science 342, 1242592 (2013).
pubmed: 24337300 doi: 10.1126/science.1242592 pmcid: 3920664
Moroz, L. L. et al. The ctenophore genome and the evolutionary origins of neural systems. Nature 510, 109–114 (2014).
pubmed: 24847885 doi: 10.1038/nature13400 pmcid: 4337882
King, N. & Rokas, A. Embracing uncertainty in reconstructing early animal evolution. Curr. Biol. 27, R1081–R1088 (2017).
pubmed: 29017048 doi: 10.1016/j.cub.2017.08.054 pmcid: 5679448
Dunn, C. W., Leys, S. P. & Haddock, S. H. D. The hidden biology of sponges and ctenophores. Trends Ecol. Evol. 30, 282–291 (2015).
pubmed: 25840473 doi: 10.1016/j.tree.2015.03.003
Nielsen, C. Early animal evolution: a morphologist’s view. R. Soc. Open Sci. 6, 190638 (2019).
pubmed: 31417759 doi: 10.1098/rsos.190638 pmcid: 6689584
Burkhardt, P. et al. Syncytial nerve net in a ctenophore adds insights on the evolution of nervous systems. Science 380, 293–297 (2023).
pubmed: 37079688 doi: 10.1126/science.ade5645
Liebeskind, B. J., Hillis, D. M., Zakon, H. H. & Hofmann, H. A. Complex homology and the evolution of nervous systems. Trends Ecol. Evol. 31, 127–135 (2016).
pubmed: 26746806 doi: 10.1016/j.tree.2015.12.005
Sachkova, M. Y. et al. Neuropeptide repertoire and 3D anatomy of the ctenophore nervous system. Curr. Biol. 31, 5274–5285.e6 (2021).
pubmed: 34587474 doi: 10.1016/j.cub.2021.09.005
Burkhardt, P. Ctenophores and the evolutionary origin(s) of neurons. Trends Neurosci. 45, 878–880 (2022).
pubmed: 36207172 doi: 10.1016/j.tins.2022.09.001
Baños, H., Susko, E. & Roger, A. J. Is over-parameterization a problem for profile mixture models? Preprint at bioRxiv https://doi.org/10.1101/2022.02.18.481053 (2022).
doi: 10.1101/2022.02.18.481053
Kapli, P. & Telford, M. J. Topology-dependent asymmetry in systematic errors affects phylogenetic placement of Ctenophora and Xenacoelomorpha. Sci. Adv. 6, eabc5162 (2020).
pubmed: 33310849 doi: 10.1126/sciadv.abc5162 pmcid: 7732190
Whelan, N. V. & Halanych, K. M. Who let the CAT out of the Bag? Accurately dealing with substitutional heterogeneity in phylogenomic analyses. Syst. Biol. 66, 232–255 (2017).
pubmed: 27633354
Whelan, N. V. & Halanych, K. M. Available data do not rule out Ctenophora as the sister group to all other Metazoa. Nat. Commun. 14, 711 (2023).
pubmed: 36765046 doi: 10.1038/s41467-023-36151-6 pmcid: 9918479
Parey, E. et al. Genome structures resolve the early diversification of teleost fishes. Science 379, 572–575 (2023). This study uses conservation of genome structure or synteny as an independent source of phylogenomic data. In combination with phylogenomic sequence data, these rare genomic changes resolve controversial relationships in early fish evolution.
pubmed: 36758078 doi: 10.1126/science.abq4257
Schultz, D. T. et al. Ancient gene linkages support ctenophores as sister to other animals. Nature 618, 110–117 (2023).
pubmed: 37198475 doi: 10.1038/s41586-023-05936-6 pmcid: 10232365

Authors

Jacob L Steenwyk (JL)

Howard Hughes Medical Institute and the Department of Molecular and Cell Biology, University of California, Berkeley, Berkeley, CA, USA.
Department of Biological Sciences, Vanderbilt University, Nashville, TN, USA.
Vanderbilt Evolutionary Studies Initiative, Vanderbilt University, Nashville, TN, USA.

Yuanning Li (Y)

Institute of Marine Science and Technology, Shandong University, Qingdao, China.

Xiaofan Zhou (X)

Guangdong Laboratory for Lingnan Modern Agriculture, Guangdong Province Key Laboratory of Microbial Signals and Disease Control, Integrative Microbiology Research Centre, South China Agricultural University, Guangzhou, China.

Xing-Xing Shen (XX)

Key Laboratory of Biology of Crop Pathogens and Insects of Zhejiang Province, Institute of Insect Sciences, Zhejiang University, Hangzhou, China.

Antonis Rokas (A)

Department of Biological Sciences, Vanderbilt University, Nashville, TN, USA. antonis.rokas@vanderbilt.edu.
Vanderbilt Evolutionary Studies Initiative, Vanderbilt University, Nashville, TN, USA.
Heidelberg Institute for Theoretical Studies, Heidelberg, Germany.


MeSH classifications