Rooting Species Trees Using Gene Tree-Species Tree Reconciliation.

Keywords: Amalgamated likelihood estimation; Evolution; Phylogenetics; Reconciliation; Rooting

Journal

Methods in molecular biology (Clifton, N.J.)
ISSN: 1940-6029
Abbreviated title: Methods Mol Biol
Country: United States
NLM ID: 9214969

Publication information

Publication date:
2022
History:
entrez: 9 9 2022
pubmed: 10 9 2022
medline: 14 9 2022
Status: ppublish

Abstract

Interpreting phylogenetic trees requires a root, which provides the direction of evolution and polarizes ancestor-descendant relationships. But inferring the root using genetic data is difficult, particularly when the closest available outgroup is only distantly related, a situation that is common for microbes. In this chapter, we present a workflow for estimating rooted species trees and the evolutionary history of the gene families that evolve within them using probabilistic gene tree-species tree reconciliation. We illustrate the pipeline using a small dataset of prokaryotic genomes, for which the example scripts can be run using modest computer resources. We describe the rooting method used in this work in the context of other rooting strategies and discuss some of the limitations and opportunities presented by probabilistic gene tree-species tree reconciliation methods.

Identifiers

pubmed: 36083449
doi: 10.1007/978-1-0716-2691-7_9

Publication types

Journal Article

Languages

eng

Citation subsets

IM

Pagination

189-211

Copyright information

© 2022. The Author(s), under exclusive license to Springer Science+Business Media, LLC, part of Springer Nature.

Authors

Brogan J Harris (BJ)

School of Biological Sciences, University of Bristol, Bristol, UK.

Paul O Sheridan (PO)

School of Biological Sciences, University of Bristol, Bristol, UK.
School of Biological Sciences, University of Aberdeen, Aberdeen, UK.

Adrián A Davín (AA)

Australian Centre for Ecogenomics, School of Chemistry and Molecular Biosciences, The University of Queensland, Brisbane, Australia.

Cécile Gubry-Rangin (C)

School of Biological Sciences, University of Aberdeen, Aberdeen, UK.

Gergely J Szöllősi (GJ)

Dept. of Biological Physics, Eötvös Loránd University, Budapest, Hungary.
MTA-ELTE "Lendület" Evolutionary Genomics Research Group, Budapest, Hungary.
Institute of Evolution, Centre for Ecological Research, Budapest, Hungary.

Tom A Williams (TA)

School of Biological Sciences, University of Bristol, Bristol, UK. tom.a.williams@bristol.ac.uk.
