Accelerated regression-based summary statistics for discrete stochastic systems via approximate simulators.

Approximate Bayesian Computation; Biochemical reaction systems; Discrete stochastic reaction systems; Gillespie algorithm; Summary statistics

Journal

BMC bioinformatics
ISSN: 1471-2105
Abbreviated title: BMC Bioinformatics
Country: England
NLM ID: 100965194

Publication information

Publication date:
23 Jun 2021
History:
received: 2 Dec 2020
accepted: 10 Jun 2021
entrez: 24 Jun 2021
pubmed: 25 Jun 2021
medline: 29 Jun 2021
Status: epublish

Abstract

Approximate Bayesian Computation (ABC) has become a key tool for calibrating the parameters of discrete stochastic biochemical models. For higher dimensional models and data, its performance is strongly dependent on having a representative set of summary statistics. While regression-based methods have been demonstrated to allow for the automatic construction of effective summary statistics, their reliance on first simulating a large training set creates a significant overhead when applying these methods to discrete stochastic models for which simulation is relatively expensive. In this work, we present a method to reduce this computational burden by leveraging approximate simulators of these systems, such as ordinary differential equations and τ-Leaping approximations. We have developed an algorithm to accelerate the construction of regression-based summary statistics for Approximate Bayesian Computation by selectively using the faster approximate algorithms for simulations. By posing the problem as one of ratio estimation, we use state-of-the-art methods in machine learning to show that, in many cases, our algorithm can significantly reduce the number of simulations from the full resolution model at a minimal cost to accuracy and little additional tuning from the user. We demonstrate the usefulness and robustness of our method with four different experiments. We provide a novel algorithm for accelerating the construction of summary statistics for stochastic biochemical systems. Compared to the standard practice of exclusively training from exact simulator samples, our method is able to dramatically reduce the number of required calls to the stochastic simulator at a minimal loss in accuracy. This can immediately be implemented to increase the overall speed of the ABC workflow for estimating parameters in complex systems.
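
The abstract contrasts an exact stochastic simulator (the Gillespie algorithm, per the keywords) with cheaper approximations such as τ-leaping. The following is a minimal sketch of that contrast on a hypothetical birth-death system (0 -> X at rate k_birth, X -> 0 at rate k_death * x); the model, the function names, and the parameter values are illustrative assumptions and are not taken from the paper, and the sketch does not reproduce the authors' summary-statistic algorithm.

```python
import math
import random


def ssa_birth_death(k_birth, k_death, x0, t_end, rng):
    """Exact Gillespie simulation of the assumed birth-death model; returns X(t_end)."""
    t, x = 0.0, x0
    while True:
        a_birth = k_birth            # propensity of 0 -> X
        a_death = k_death * x        # propensity of X -> 0
        a_total = a_birth + a_death
        if a_total <= 0.0:
            return x                 # no reaction can fire any more
        t += rng.expovariate(a_total)  # waiting time to the next reaction
        if t > t_end:
            return x
        # Pick which reaction fires, proportionally to its propensity.
        if rng.random() * a_total < a_birth:
            x += 1
        else:
            x -= 1


def poisson_sample(lam, rng):
    """Knuth's multiplication method; only suitable for small lam."""
    threshold = math.exp(-lam)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def tau_leap_birth_death(k_birth, k_death, x0, t_end, tau, rng):
    """Fixed-step tau-leaping: fire a Poisson number of each reaction per step."""
    x = x0
    for _ in range(int(round(t_end / tau))):
        births = poisson_sample(k_birth * tau, rng)
        deaths = poisson_sample(k_death * x * tau, rng)
        x = max(x + births - deaths, 0)  # clamp to avoid negative copy numbers
    return x


rng = random.Random(0)
exact = [ssa_birth_death(10.0, 0.1, 100, 10.0, rng) for _ in range(200)]
approx = [tau_leap_birth_death(10.0, 0.1, 100, 10.0, 0.1, rng) for _ in range(200)]
exact_mean = sum(exact) / len(exact)
approx_mean = sum(approx) / len(approx)
print(exact_mean, approx_mean)
```

For this linear model both simulators should fluctuate around the stationary mean k_birth / k_death = 100. The exact simulator advances one reaction event at a time, whereas τ-leaping takes a fixed number of steps regardless of how many reactions fire, which is where the potential saving in simulation cost comes from.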

Identifiers

pubmed: 34162329
doi: 10.1186/s12859-021-04255-9
pii: 10.1186/s12859-021-04255-9
pmc: PMC8220802

Publication types

Journal Article

Languages

eng

Citation subsets

IM

Pagination

339

Grants

Agency: NIBIB NIH HHS
ID: 2-R01-EB014877-04A1
Country: United States

Authors

Richard M Jiang (RM)

Department of Computer Science, University of California, Santa Barbara, Santa Barbara, USA. rmjiang@ucsb.edu.

Fredrik Wrede (F)

Department of Information Technology, Uppsala University, Uppsala, Sweden.

Prashant Singh (P)

Department of Information Technology, Uppsala University, Uppsala, Sweden.

Andreas Hellander (A)

Department of Information Technology, Uppsala University, Uppsala, Sweden.

Linda R Petzold (LR)

Department of Computer Science, University of California, Santa Barbara, Santa Barbara, USA.
