Generating tertiary protein structures via interpretable graph variational autoencoders.
Journal
Bioinformatics advances
ISSN: 2635-0041
Abbreviated title: Bioinform Adv
Country: England
NLM ID: 9918282081306676
Publication information
Publication date:
2021
History:
received: 2021-08-26
revised: 2021-11-07
accepted: 2021-11-17
entrez: 2023-01-26
pubmed: 2021-11-29
medline: 2021-11-29
Status:
epublish
Abstract
Modeling the structural plasticity of protein molecules remains challenging. Most research has focused on obtaining one biologically active structure. This includes the recent AlphaFold2, which has been hailed as a breakthrough for protein modeling. Computing one structure does not suffice to understand how proteins modulate their interactions and even evade our immune system. Revealing the structure space available to a protein remains challenging. Data-driven approaches that learn to generate tertiary structures are increasingly garnering attention. These approaches exploit the ability to represent tertiary structures as contact or distance maps and make direct analogies with images to harness convolution-based generative adversarial frameworks from computer vision. Since such opportunistic analogies do not allow capturing highly structured data, current deep models struggle to generate physically realistic tertiary structures. We present novel deep generative models that build upon the graph variational autoencoder framework. In contrast to existing literature, we represent tertiary structures as 'contact' graphs, which allow us to leverage graph-generative deep learning. Our models are able to capture rich local and distal constraints and additionally compute disentangled latent representations that reveal the impact of individual latent factors. This elucidates what the factors control and makes our models more interpretable. Rigorous comparative evaluation along various metrics shows that the models we propose advance the state of the art. While there is still much ground to cover, the work presented here is an important first step, and graph-generative frameworks promise to get us to our goal of unraveling the exquisite structural complexity of protein molecules. Code is available at https://github.com/anonymous1025/CO-VAE. Supplementary data are available at
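The 'contact' graph representation mentioned in the abstract can be illustrated with a minimal sketch. It is not taken from the authors' code: the function names `contact_graph` and `edge_list`, the use of one 3D point per residue, and the 8.0 Å cutoff are all assumptions made here for illustration only.

```python
import math

def contact_graph(coords, threshold=8.0):
    """Build a symmetric 0/1 adjacency matrix from per-residue 3D coordinates.

    Two residues are treated as "in contact" when their Euclidean distance
    is below `threshold` (8.0 is an assumed cutoff, not a value from the paper).
    """
    n = len(coords)
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(coords[i], coords[j]) < threshold:
                adj[i][j] = adj[j][i] = 1  # undirected edge, no self-loops
    return adj

def edge_list(adj):
    """List each undirected edge once as an (i, j) pair with i < j."""
    n = len(adj)
    return [(i, j) for i in range(n) for j in range(i + 1, n) if adj[i][j]]

# Hypothetical toy chain: four residues spaced 3.8 units apart on a line.
coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
adj = contact_graph(coords)
edges = edge_list(adj)
```

In this toy chain, residues up to two positions apart fall under the cutoff, so the graph holds both local (sequence-adjacent) edges and one step of longer-range edges. A graph-generative model would operate on an adjacency matrix of this kind rather than on an image-like grid.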
Identifiers
pubmed: 36700110
doi: 10.1093/bioadv/vbab036
pii: vbab036
pmc: PMC9710582
Publication types
Journal Article
Languages
eng
Pagination
vbab036
Copyright information
© The Author(s) 2021. Published by Oxford University Press.