Automatic Differentiation is no Panacea for Phylogenetic Gradient Computation.

Keywords: Bayesian inference; gradient; phylogenetics; variational inference

Journal

Genome Biology and Evolution
ISSN: 1759-6653
Abbreviated title: Genome Biol Evol
Country: England
NLM ID: 101509707

Publication information

Publication date:
2023-06-01
History:
accepted: 2023-05-25
medline: 2023-06-22
pubmed: 2023-06-02
entrez: 2023-06-02
Status: ppublish

Abstract

Gradients of probabilistic model likelihoods with respect to their parameters are essential for modern computational statistics and machine learning. These calculations are readily available for arbitrary models via "automatic differentiation" implemented in general-purpose machine-learning libraries such as TensorFlow and PyTorch. Although these libraries are highly optimized, it is not clear if their general-purpose nature will limit their algorithmic complexity or implementation speed for the phylogenetic case compared to phylogenetics-specific code. In this paper, we compare six gradient implementations of the phylogenetic likelihood functions, in isolation and also as part of a variational inference procedure. We find that although automatic differentiation can scale approximately linearly in tree size, it is much slower than the carefully implemented gradient calculation for tree likelihood and ratio transformation operations. We conclude that a mixed approach combining phylogenetic libraries with machine learning libraries will provide the optimal combination of speed and model flexibility moving forward.
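To make the contrast between automatic differentiation and a hand-derived gradient concrete, the following is a minimal sketch and not the implementations benchmarked in the paper. It assumes the simplest possible setting: a two-sequence tree under the Jukes-Cantor (JC69) model with a single branch length t. The `Dual` class and all function names are hypothetical, introduced only for illustration. The sketch differentiates the log-likelihood once with a tiny forward-mode automatic differentiation class and once with the closed-form derivative.

```python
import math


class Dual:
    """Forward-mode AD value: `val` is the number, `der` its derivative."""

    def __init__(self, val, der=0.0):
        self.val = val
        self.der = der

    @staticmethod
    def lift(x):
        # Promote plain numbers to constants (derivative 0).
        return x if isinstance(x, Dual) else Dual(float(x))

    def __add__(self, other):
        o = Dual.lift(other)
        return Dual(self.val + o.val, self.der + o.der)

    __radd__ = __add__

    def __sub__(self, other):
        o = Dual.lift(other)
        return Dual(self.val - o.val, self.der - o.der)

    def __rsub__(self, other):
        return Dual.lift(other) - self

    def __mul__(self, other):
        o = Dual.lift(other)
        # Product rule.
        return Dual(self.val * o.val, self.der * o.val + self.val * o.der)

    __rmul__ = __mul__


def d_exp(x):
    e = math.exp(x.val)
    return Dual(e, e * x.der)


def d_log(x):
    return Dual(math.log(x.val), x.der / x.val)


def jc69_log_lik_dual(t, n_sites, n_diff):
    """JC69 log-likelihood of two aligned sequences; t is a Dual."""
    decay = d_exp(t * (-4.0 / 3.0))
    p_same = 0.25 + 0.75 * decay  # P(same base after time t)
    p_diff = 0.25 - 0.25 * decay  # P(one specific different base)
    # Each site also carries the stationary frequency 1/4.
    return (n_sites - n_diff) * d_log(p_same * 0.25) + n_diff * d_log(p_diff * 0.25)


def jc69_grad_autodiff(t, n_sites, n_diff):
    """Gradient w.r.t. t obtained by seeding the dual part with 1."""
    return jc69_log_lik_dual(Dual(t, 1.0), n_sites, n_diff).der


def jc69_grad_analytic(t, n_sites, n_diff):
    """Hand-derived gradient of the same log-likelihood."""
    decay = math.exp(-4.0 * t / 3.0)
    p_same = 0.25 + 0.75 * decay
    p_diff = 0.25 - 0.25 * decay
    # d p_same / dt = -decay ; d p_diff / dt = decay / 3
    return (n_sites - n_diff) * (-decay) / p_same + n_diff * (decay / 3.0) / p_diff


if __name__ == "__main__":
    # Illustrative inputs only: 100 sites, 10 of which differ.
    print(jc69_grad_autodiff(0.1, 100, 10))
    print(jc69_grad_analytic(0.1, 100, 10))
```

Both routes agree to floating-point precision here. The hand-derived version does less arithmetic per evaluation because it skips the generic bookkeeping; how large that gap becomes on full trees with real libraries is the empirical question the abstract addresses.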

Identifiers

pubmed: 37265233
pii: 7188956
doi: 10.1093/gbe/evad099
pmc: PMC10282121

Publication types

Journal Article
Research Support, Non-U.S. Gov't
Research Support, N.I.H., Extramural

Languages

eng

Citation subsets

IM

Grants

Agency: NIH HHS
ID: S10 OD028685
Country: United States
Agency: Howard Hughes Medical Institute
Country: United States

Copyright information

© The Author(s) 2023. Published by Oxford University Press on behalf of Society for Molecular Biology and Evolution.

Authors

Mathieu Fourment (M)

Australian Institute for Microbiology and Infection, University of Technology Sydney, Ultimo, NSW, Australia.

Christiaan J Swanepoel (CJ)

Centre for Computational Evolution, The University of Auckland, Auckland, New Zealand.
School of Computer Science, The University of Auckland, Auckland, New Zealand.

Jared G Galloway (JG)

Public Health Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, Washington, USA.

Xiang Ji (X)

Department of Mathematics, Tulane University, New Orleans, Louisiana, USA.

Karthik Gangavarapu (K)

Department of Human Genetics, University of California, Los Angeles, California, USA.

Marc A Suchard (MA)

Department of Human Genetics, University of California, Los Angeles, California, USA.
Department of Computational Medicine, University of California, Los Angeles, California, USA.
Department of Biostatistics, University of California, Los Angeles, California, USA.

Frederick A Matsen IV (FA)

Public Health Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, Washington, USA.
Department of Statistics, University of Washington, Seattle, Washington, USA.
Department of Genome Sciences, University of Washington, Seattle, Washington, USA.
Howard Hughes Medical Institute, Fred Hutchinson Cancer Research Center, Seattle, Washington, USA.
