A Probabilistic Programming Approach to Protein Structure Superposition.

Bayesian modelling; deep probabilistic programming; protein structure prediction; protein superposition

Journal

Proceedings of the ... IEEE Symposium on Computational Intelligence in Bioinformatics and Computational Biology : CIBCB. IEEE Symposium on Computational Intelligence in Bioinformatics and Computational Biology
Abbreviated title: Proc IEEE Symp Comput Intell Bioinforma Comput Biol
Country: United States
NLM ID: 101530701

Publication information

Publication date:
Jul 2019
History:
entrez: 2021-10-18
pubmed: 2019-07-01
medline: 2019-07-01
Status: ppublish

Abstract

Optimal superposition of protein structures or other biological molecules is crucial for understanding their structure, function, dynamics and evolution. Here, we investigate the use of probabilistic programming to superimpose protein structures guided by a Bayesian model. Our model, THESEUS-PP, is based on the THESEUS model, a probabilistic model of protein superposition based on rotation, translation and perturbation of an underlying, latent mean structure. THESEUS-PP was implemented in the probabilistic programming language Pyro. Unlike conventional methods that minimize the sum of the squared distances, THESEUS takes into account correlated atom positions and heteroscedasticity (i.e., atom positions can have different variances). THESEUS performs maximum likelihood estimation using iterative expectation-maximization. In contrast, THESEUS-PP allows automated maximum a posteriori (MAP) estimation using suitable priors over rotation, translation, variances and latent mean structure. The results indicate that probabilistic programming is a powerful new paradigm for the formulation of Bayesian probabilistic models concerning biomolecular structure. Specifically, we envision the use of the THESEUS-PP model as a suitable error model or likelihood in Bayesian protein structure prediction using deep probabilistic programming.
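The contrast drawn in the abstract between ordinary least-squares superposition and a heteroscedastic model can be illustrated with a small toy example. The sketch below is not THESEUS or THESEUS-PP: it has no expectation-maximization, no correlated positions, no priors and no Pyro. It assumes 2D coordinates and per-atom variances that are already known, and it simply weights each atom by the inverse of its variance. The function names `weighted_superpose_2d` and `transform` are hypothetical and chosen for illustration only.

```python
import math


def weighted_superpose_2d(mobile, target, variances=None):
    """Rotate and translate `mobile` onto `target` (lists of (x, y) tuples),
    minimising the sum of squared distances weighted by 1 / variance.
    With variances=None this reduces to ordinary least squares."""
    if variances is None:
        variances = [1.0] * len(mobile)
    weights = [1.0 / v for v in variances]
    total = sum(weights)

    def centroid(points):
        cx = sum(w * x for w, (x, _) in zip(weights, points)) / total
        cy = sum(w * y for w, (_, y) in zip(weights, points)) / total
        return cx, cy

    (mcx, mcy), (tcx, tcy) = centroid(mobile), centroid(target)

    # Accumulate weighted dot and cross products of the centred coordinates;
    # in 2D the optimal rotation angle is atan2(cross, dot).
    dot = cross = 0.0
    for w, (mx, my), (tx, ty) in zip(weights, mobile, target):
        mx, my, tx, ty = mx - mcx, my - mcy, tx - tcx, ty - tcy
        dot += w * (mx * tx + my * ty)
        cross += w * (mx * ty - my * tx)
    theta = math.atan2(cross, dot)

    # Apply the rotation about the mobile centroid, then move to the target centroid.
    c, s = math.cos(theta), math.sin(theta)
    fitted = [
        (c * (x - mcx) - s * (y - mcy) + tcx, s * (x - mcx) + c * (y - mcy) + tcy)
        for x, y in mobile
    ]
    return theta, fitted


def transform(points, theta, shift):
    """Helper for the demo: rotate by theta (radians), then translate by shift."""
    c, s = math.cos(theta), math.sin(theta)
    return [(c * x - s * y + shift[0], s * x + c * y + shift[1]) for x, y in points]


# Toy data: a rigid copy of four points, with one "flexible" atom displaced.
mobile = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 2.0)]
target = transform(mobile, 0.5, (3.0, -1.0))
target[3] = (target[3][0] + 1.0, target[3][1] + 0.5)

theta_ls, _ = weighted_superpose_2d(mobile, target)
theta_w, _ = weighted_superpose_2d(mobile, target, variances=[1.0, 1.0, 1.0, 1e6])
print(f"true angle 0.500, least squares {theta_ls:.3f}, variance-weighted {theta_w:.3f}")
```

In this toy setting the displaced atom pulls the unweighted fit away from the true rotation, while assigning it a large variance lets the remaining atoms determine the superposition. In the models described in the abstract, the variances are not supplied by hand but estimated together with the rotation, translation and mean structure.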

Identifiers

pubmed: 34661202
doi: 10.1109/cibcb.2019.8791469
pmc: PMC8515897
mid: NIHMS1744718

Publication types

Journal Article

Languages

eng

Grants

Agency: NIGMS NIH HHS
ID: R01 GM094468
Country: United States
Agency: NIGMS NIH HHS
ID: R01 GM121384
Country: United States
Agency: NIGMS NIH HHS
ID: R01 GM132499
Country: United States

Authors

Lys Sanz Moreta (LS)

Department of Computer Science, University of Copenhagen, Denmark.

Ahmad Salim Al-Sibahi (AS)

Department of Computer Science, University of Copenhagen/Skanned.com, Denmark.

Douglas Theobald (D)

Department of Biochemistry, Brandeis University, Waltham, MA 02452, USA.

William Bullock (W)

The Bioinformatics Centre, Section for Computational and RNA Biology, University of Copenhagen, Copenhagen, Denmark.

Basile Nicolas Rommes (BN)

The Bioinformatics Centre, Section for Computational and RNA Biology, University of Copenhagen, Copenhagen, Denmark.

Andreas Manoukian (A)

The Bioinformatics Centre, Section for Computational and RNA Biology, University of Copenhagen, Copenhagen, Denmark.

Thomas Hamelryck (T)

Department of Computer Science, University of Copenhagen, Denmark.
The Bioinformatics Centre, Section for Computational and RNA Biology, University of Copenhagen, Copenhagen, Denmark.

MeSH classifications