Bayesian inference of causal effects from observational data in Gaussian graphical models.

Keywords: Markov equivalence class; causal inference; directed acyclic graph; graphical model; objective Bayes; observational data

Journal

Biometrics
ISSN: 1541-0420
Abbreviated title: Biometrics
Country: United States
NLM ID: 0370625

Publication information

Publication date:
03 2021
History:
received: 16 10 2019
revised: 25 03 2020
accepted: 30 03 2020
pubmed: 16 4 2020
medline: 26 10 2021
entrez: 16 4 2020
Status: ppublish

Abstract

We assume that multivariate observational data are generated from a distribution whose conditional independencies are encoded in a directed acyclic graph (DAG). For any given DAG, the causal effect of one variable on another can be evaluated through intervention calculus. A DAG is typically not identifiable from observational data alone; however, its Markov equivalence class (a collection of DAGs) can be estimated from the data. As a consequence, for the same intervention, a set of causal effects, one for each DAG in the equivalence class, can be evaluated. In this paper, we propose a fully Bayesian methodology for inference on the causal effects of any intervention in the system. The main features of our method are: (a) uncertainty about both the equivalence class and the causal effects is jointly modeled; (b) priors on the parameters of the modified Cholesky decomposition of the precision matrices across all DAG models are constructively assigned starting from a unique prior on the complete (unrestricted) DAG; (c) an efficient algorithm is adopted to sample from the posterior distribution on graph space; (d) an objective Bayes approach, requiring virtually no user specification, is used throughout. We demonstrate the merits of our methodology in simulation studies, in which it compares favorably with current state-of-the-art procedures. Finally, we examine a real data set of gene expressions for Arabidopsis thaliana.
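The idea that one intervention yields a set of causal effects, one per DAG in the equivalence class, can be illustrated with a minimal sketch. This is not the Bayesian procedure summarized in the abstract: it assumes a linear Gaussian model with a known covariance matrix and uses the standard parent-adjustment formula, under which the effect of do(X_x) on X_y is the coefficient of X_x in the regression of X_y on X_x and the parents of X_x (and zero when X_y is a parent of X_x). The function name `causal_effect`, the three-node chain, and the coefficients 0.8 and 0.5 are hypothetical choices made for illustration only.

```python
import numpy as np


def causal_effect(sigma, adj, x, y):
    """Effect of do(X_x) on X_y in a linear Gaussian DAG (illustrative sketch).

    sigma : covariance matrix of the variables (assumed known here).
    adj   : adjacency matrix with adj[k, j] = 1 meaning an edge k -> j.
    """
    parents = np.flatnonzero(adj[:, x])
    if y in parents:
        # A parent of x is never a descendant of x, so the effect is zero.
        return 0.0
    s = np.concatenate(([x], parents))
    # Regression coefficients of X_y on (X_x, pa(x)); the first one is the effect.
    coef = np.linalg.solve(sigma[np.ix_(s, s)], sigma[s, y])
    return float(coef[0])


# Covariance implied by the (hypothetical) data-generating DAG 0 -> 1 -> 2
# with unit error variances: X = B X + e, hence Sigma = (I - B)^{-1} (I - B)^{-T}.
B = np.zeros((3, 3))
B[1, 0] = 0.8  # X1 = 0.8 * X0 + e1
B[2, 1] = 0.5  # X2 = 0.5 * X1 + e2
A = np.linalg.inv(np.eye(3) - B)
sigma = A @ A.T

# The three DAGs that are Markov equivalent to the chain 0 - 1 - 2.
dag_chain = np.array([[0, 1, 0], [0, 0, 1], [0, 0, 0]])    # 0 -> 1 -> 2
dag_reverse = np.array([[0, 0, 0], [1, 0, 0], [0, 1, 0]])  # 0 <- 1 <- 2
dag_fork = np.array([[0, 0, 0], [1, 0, 1], [0, 0, 0]])     # 0 <- 1 -> 2

# One causal effect of do(X1) on X2 per DAG in the equivalence class.
effects = [causal_effect(sigma, g, x=1, y=2)
           for g in (dag_chain, dag_reverse, dag_fork)]
print(effects)
```

The three DAGs imply the same conditional independencies, so observational data cannot separate them, yet they give different answers for the same intervention: the effect is zero in the reversed chain and nonzero in the other two. A fully Bayesian treatment along the lines of the abstract would place posterior weights on the graphs and on the parameters rather than plugging in a fixed covariance.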

Identifiers

pubmed: 32294233
doi: 10.1111/biom.13281

Publication types

Journal Article; Research Support, Non-U.S. Gov't

Languages

eng

Citation subsets

IM

Pagination

136-149

Grants

Agency: Università Cattolica del Sacro Cuore
ID: D.3.2
Agency: Università Cattolica del Sacro Cuore
ID: D1

Copyright information

© 2020 The International Biometric Society.

Authors

Federico Castelletti (F)

Department of Statistical Sciences, Università Cattolica del Sacro Cuore, Milan, Italy.

Guido Consonni (G)

Department of Statistical Sciences, Università Cattolica del Sacro Cuore, Milan, Italy.
