A probabilistic view of protein stability, conformational specificity, and design.
Journal
Scientific reports
ISSN: 2045-2322
Abbreviated title: Sci Rep
Country: England
NLM ID: 101563288
Publication information
Publication date:
19 09 2023
History:
received: 10 08 2023
accepted: 04 09 2023
medline: 21 09 2023
pubmed: 20 09 2023
entrez: 19 09 2023
Status:
epublish
Abstract
Various approaches have used neural networks as probabilistic models for the design of protein sequences. These "inverse folding" models employ different objective functions, which come with trade-offs that have not been assessed in detail before. This study introduces probabilistic definitions of protein stability and conformational specificity and demonstrates the relationship between these chemical properties and the p(structure|sequence) Boltzmann probability objective. This links the Boltzmann probability objective function to experimentally verifiable outcomes. We propose a novel sequence decoding algorithm, referred to as "BayesDesign", that leverages Bayes' Rule to maximize the p(structure|sequence) objective instead of the p(sequence|structure) objective common in inverse folding models. The efficacy of BayesDesign is evaluated in the context of two protein model systems, the NanoLuc enzyme and the WW structural motif. Both BayesDesign and the baseline ProteinMPNN algorithm increase the thermostability of NanoLuc and increase the conformational specificity of WW. The possible sources of error in the model are analyzed.
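The difference between the two objectives named in the abstract can be illustrated with a toy example. By Bayes' Rule, p(structure | sequence) = p(sequence | structure) p(structure) / p(sequence), and for a fixed target structure the term p(structure) is constant, so ranking sequences by log p(sequence | structure) - log p(sequence) is equivalent to ranking them by p(structure | sequence). The sketch below is not the BayesDesign implementation: the two probability tables, the two-letter alphabet, and the name `bayes_score` are hypothetical and chosen only to show how the two objectives can pick different sequences.

```python
import math

# Hypothetical toy distributions over all length-2 sequences on a
# two-letter alphabet. The numbers are illustrative only, not from the paper.
# p_seq_given_struct stands in for an inverse folding model, p(sequence | structure).
# p_seq stands in for a structure-agnostic sequence prior, p(sequence).
p_seq_given_struct = {"AA": 0.50, "AG": 0.30, "GA": 0.15, "GG": 0.05}
p_seq = {"AA": 0.70, "AG": 0.10, "GA": 0.15, "GG": 0.05}


def bayes_score(seq):
    """Return log p(seq | struct) - log p(seq).

    This equals log p(struct | seq) minus the constant log p(struct),
    so it ranks sequences the same way p(struct | seq) does.
    """
    return math.log(p_seq_given_struct[seq]) - math.log(p_seq[seq])


# Objective common in inverse folding: maximize p(sequence | structure).
best_inverse_folding = max(p_seq_given_struct, key=p_seq_given_struct.get)

# Bayes-reweighted objective: maximize p(structure | sequence).
best_bayes = max(p_seq, key=bayes_score)

print(best_inverse_folding, best_bayes)
```

In this toy setting the sequence that is most probable given the structure is also very probable without any structural information, so dividing by p(sequence) favors a different sequence that is more specific to the target structure.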
Identifiers
pubmed: 37726313
doi: 10.1038/s41598-023-42032-1
pii: 10.1038/s41598-023-42032-1
pmc: PMC10509192
Publication types
Journal Article
Languages
eng
Citation subsets
IM
Pagination
15493
Copyright information
© 2023. Springer Nature Limited.