Data integration in Bayesian phylogenetics.

Bayesian networks; Gaussian processes; continuous-time Markov processes; phylogenetic comparative methods; phylogeography

Journal

Annual review of statistics and its application
ISSN: 2326-8298
Abbreviated title: Annu Rev Stat Appl
Country: United States
NLM ID: 101622422

Publication information

Publication date:
2023
History:
medline: 1 1 2023
pubmed: 1 1 2023
entrez: 22 5 2024
Status: ppublish

Abstract

Researchers studying the evolution of viral pathogens and other organisms increasingly encounter and use large and complex data sets from multiple different sources. Statistical research in Bayesian phylogenetics has risen to this challenge. Researchers use phylogenetics not only to reconstruct the evolutionary history of a group of organisms, but also to understand the processes that guide its evolution and spread through space and time. To this end, it is now the norm to integrate numerous sources of data. For example, epidemiologists studying the spread of a virus through a region incorporate data including genetic sequences (e.g. DNA), time, location (both continuous and discrete) and environmental covariates (e.g. social connectivity between regions) into a coherent statistical model. Evolutionary biologists routinely do the same with genetic sequences, location, time, fossil and modern phenotypes, and ecological covariates. These complex, hierarchical models readily accommodate both discrete and continuous data and have enormous combined discrete/continuous parameter spaces including, at a minimum, phylogenetic tree topologies and branch lengths. The increased size and complexity of these statistical models have spurred advances in computational methods to make them tractable. We discuss both the modeling and computational advances below, as well as unsolved problems and areas of active research.

Identifiers

pubmed: 38774036
doi: 10.1146/annurev-statistics-033021-112532
pmc: PMC11108065

Publication types

Journal Article

Languages

eng

Pagination

353-377

Authors

Gabriel W Hassler (GW)

Department of Computational Medicine, University of California, Los Angeles, USA, 90095.

Andrew Magee (A)

Department of Biostatistics, University of California, Los Angeles, USA, 90095.

Zhenyu Zhang (Z)

Department of Biostatistics, University of California, Los Angeles, USA, 90095.

Guy Baele (G)

Department of Microbiology and Immunology, Rega Institute, KU Leuven, Leuven, Belgium, 3000.

Philippe Lemey (P)

Department of Microbiology and Immunology, Rega Institute, KU Leuven, Leuven, Belgium, 3000.

Xiang Ji (X)

Department of Mathematics, Tulane University, New Orleans, USA, 70118.

Mathieu Fourment (M)

Australian Institute for Microbiology and Infection, University of Technology Sydney, Ultimo NSW, Australia, 2007.

Marc A Suchard (MA)

Department of Computational Medicine, University of California, Los Angeles, USA, 90095.
Department of Biostatistics, University of California, Los Angeles, USA, 90095.
Department of Human Genetics, University of California, Los Angeles, USA, 90095.

MeSH classifications