Markov-Modulated Continuous-Time Markov Chains to Identify Site- and Branch-Specific Evolutionary Variation in BEAST.


Journal

Systematic biology
ISSN: 1076-836X
Abbreviated title: Syst Biol
Country: England
NLM ID: 9302532

Publication information

Publication date:
01 01 2021
History:
received: 17 05 2019
revised: 19 04 2020
accepted: 06 05 2020
pubmed: 18 5 2020
medline: 19 11 2021
entrez: 17 5 2020
Status: ppublish

Abstract

Markov models of character substitution on phylogenies form the foundation of phylogenetic inference frameworks. Early models made the simplifying assumption that the substitution process is homogeneous over time and across sites in the molecular sequence alignment. While standard practice adopts extensions that accommodate heterogeneity of substitution rates across sites, heterogeneity in the process over time in a site-specific manner remains frequently overlooked. This is problematic, as evolutionary processes that act at the molecular level are highly variable, subjecting different sites to different selective constraints over time, impacting their substitution behavior. We propose incorporating time variability through Markov-modulated models (MMMs), which extend covarion-like models and allow the substitution process (including relative character exchange rates as well as the overall substitution rate) at individual sites to vary across lineages. We implement a general MMM framework in BEAST, a popular Bayesian phylogenetic inference software package, allowing researchers to compose a wide range of MMMs through flexible XML specification. Using examples from bacterial, viral, and plastid genome evolution, we show that MMMs impact phylogenetic tree estimation and can substantially improve model fit compared to standard substitution models. Through simulations, we show that marginal likelihood estimation accurately identifies the generative model and does not systematically prefer the more parameter-rich MMMs. To mitigate the increased computational demands associated with MMMs, our implementation exploits recent developments in BEAGLE, a high-performance computational library for phylogenetic inference. [Bayesian inference; BEAGLE; BEAST; covarion, heterotachy; Markov-modulated models; phylogenetics.].
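
As a rough illustration of the model class described in the abstract, the following Python sketch assembles the rate matrix of a Markov-modulated model from several ordinary substitution models and a matrix of switching rates between them. It uses a common construction in which the expanded state is a (model class, character) pair: within a class the character evolves under that class's substitution model, and the class itself switches without changing the character. The function names, the choice of two Jukes-Cantor models, and all numerical rates are assumptions made for illustration; this is not the BEAST implementation or its XML interface.

```python
def jc69(rate):
    """Return a 4x4 Jukes-Cantor generator with the given off-diagonal rate."""
    return [[-3.0 * rate if i == j else rate for j in range(4)] for i in range(4)]


def markov_modulated_generator(models, switch):
    """Combine K substitution generators into one (K*S)x(K*S) generator.

    models: list of K generators, each S x S (rows sum to zero).
    switch: K x K generator of the switching process between model classes
            (off-diagonal rates >= 0, rows sum to zero).
    State index a*S + i means "model class a, character i".
    """
    k = len(models)
    s = len(models[0])
    big = [[0.0] * (k * s) for _ in range(k * s)]
    for a in range(k):
        for i in range(s):
            # Substitutions within class a follow that class's own model.
            for j in range(s):
                big[a * s + i][a * s + j] += models[a][i][j]
            # Switching from class a to class b leaves the character unchanged.
            # The a == b term adds the (negative) total switching rate to the diagonal.
            for b in range(k):
                big[a * s + i][b * s + i] += switch[a][b]
    return big


# Hypothetical example: a slow and a fast Jukes-Cantor process,
# with symmetric switching between them at rate 0.5.
slow, fast = jc69(0.1), jc69(1.0)
switch = [[-0.5, 0.5],
          [0.5, -0.5]]
generator = markov_modulated_generator([slow, fast], switch)
```

Every row of the resulting 8 x 8 matrix sums to zero, so it is a valid continuous-time Markov chain generator. Setting one class's rate to zero in this sketch gives an on/off model of the covarion type that the abstract says MMMs extend.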

Identifiers

pubmed: 32415977
pii: 5838195
doi: 10.1093/sysbio/syaa037
pmc: PMC7744037

Databanks

Dryad
10.5061/dryad.230s5h0

Publication types

Journal Article
Research Support, N.I.H., Extramural
Research Support, Non-U.S. Gov't

Languages

eng

Citation subsets

IM

Pagination

181-189

Grants

Agency: NIAID NIH HHS
ID: U19 AI135995
Country: United States
Agency: NIAID NIH HHS
ID: R01 AI107034
Country: United States
Agency: Wellcome Trust
ID: 206298/Z/17/Z
Country: United Kingdom

Copyright information

© The Author(s) 2020. Published by Oxford University Press, on behalf of the Society of Systematic Biologists.

Authors

Guy Baele (G)

Department of Microbiology, Immunology and Transplantation, Rega Institute, KU Leuven, Herestraat 49, 3000 Leuven, Belgium.

Mandev S Gill (MS)

Department of Microbiology, Immunology and Transplantation, Rega Institute, KU Leuven, Herestraat 49, 3000 Leuven, Belgium.

Paul Bastide (P)

Department of Microbiology, Immunology and Transplantation, Rega Institute, KU Leuven, Herestraat 49, 3000 Leuven, Belgium.

Philippe Lemey (P)

Department of Microbiology, Immunology and Transplantation, Rega Institute, KU Leuven, Herestraat 49, 3000 Leuven, Belgium.

Marc A Suchard (MA)

Department of Biostatistics, Jonathan and Karin Fielding School of Public Health, University of California, Los Angeles, CA 90095, USA.
Department of Biomathematics, David Geffen School of Medicine at UCLA, University of California, Los Angeles, CA 90095, USA.
Department of Human Genetics, David Geffen School of Medicine at UCLA, University of California, Los Angeles, CA 90095, USA.
