Data integration in Bayesian phylogenetics.
Bayesian networks
Gaussian processes
continuous-time Markov processes
phylogenetic comparative methods
phylogeography
Journal
Annual review of statistics and its application
ISSN: 2326-8298
Abbreviated title: Annu Rev Stat Appl
Country: United States
NLM ID: 101622422
Publication information
Publication date: 2023
History:
medline: 2023-01-01
pubmed: 2023-01-01
entrez: 2024-05-22
Status: ppublish
Abstract
Researchers studying the evolution of viral pathogens and other organisms increasingly encounter and use large, complex data sets drawn from multiple sources. Statistical research in Bayesian phylogenetics has risen to this challenge. Researchers use phylogenetics not only to reconstruct the evolutionary history of a group of organisms but also to understand the processes that guide its evolution and spread through space and time. To this end, it is now the norm to integrate numerous sources of data. For example, epidemiologists studying the spread of a virus through a region incorporate genetic sequences (e.g., DNA), time, location (both continuous and discrete), and environmental covariates (e.g., social connectivity between regions) into a coherent statistical model. Evolutionary biologists routinely do the same with genetic sequences, location, time, fossil and modern phenotypes, and ecological covariates. These complex, hierarchical models readily accommodate both discrete and continuous data and have enormous combined discrete/continuous parameter spaces including, at a minimum, phylogenetic tree topologies and branch lengths. The increased size and complexity of these statistical models have spurred advances in computational methods to make them tractable. We discuss both the modeling and computational advances below, as well as unsolved problems and areas of active research.
Identifiers
pubmed: 38774036
doi: 10.1146/annurev-statistics-033021-112532
pmc: PMC11108065
Publication types
Journal Article
Languages
eng