AlphaFold2 fails to predict protein fold switching.
AlphaFold2
fold-switching
protein-folding
structural heterogeneity
Journal
Protein science : a publication of the Protein Society
ISSN: 1469-896X
Abbreviated title: Protein Sci
Country: United States
NLM ID: 9211750
Publication information
Publication date:
06 2022
History:
revised: 2022-05-05
received: 2022-03-24
accepted: 2022-05-07
entrez: 2022-05-31
pubmed: 2022-06-01
medline: 2022-06-03
Status:
ppublish
Abstract
AlphaFold2 has revolutionized protein structure prediction by leveraging sequence information to rapidly model protein folds with atomic-level accuracy. Nevertheless, previous work has shown that these predictions tend to be inaccurate for structurally heterogeneous proteins. To systematically assess factors that contribute to this inaccuracy, we tested AlphaFold2's performance on 98 fold-switching proteins, which assume at least two distinct-yet-stable secondary and tertiary structures. Topological similarities were quantified between five predicted and two experimentally determined structures of each fold-switching protein. Overall, 94% of AlphaFold2 predictions captured one experimentally determined conformation but not the other. Despite these biased results, AlphaFold2's estimated confidences were moderate-to-high for 74% of fold-switching residues, a result that contrasts with overall low confidences for intrinsically disordered proteins, which are also structurally heterogeneous. To investigate factors contributing to this disparity, we quantified sequence variation within the multiple sequence alignments used to generate AlphaFold2's predictions of fold-switching and intrinsically disordered proteins. Unlike intrinsically disordered regions, whose sequence alignments show low conservation, fold-switching regions had conservation rates statistically similar to canonical single-fold proteins. Furthermore, intrinsically disordered regions had systematically lower prediction confidences than either fold-switching or single-fold proteins, regardless of sequence conservation. AlphaFold2's high prediction confidences for fold switchers indicate that it uses sophisticated pattern recognition to search for one most probable conformer rather than protein biophysics to model a protein's structural ensemble. Thus, it is not surprising that its predictions often fail for proteins whose properties are not fully apparent from solved protein structures.
Our results emphasize the need to look at protein structure as an ensemble and suggest that systematic examination of fold-switching sequences may reveal propensities for multiple stable secondary and tertiary structures.
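The abstract does not state which metric was used to quantify sequence variation within the multiple sequence alignments. As an illustration only, the sketch below scores each alignment column with one common choice, normalized Shannon entropy, where 1 means fully conserved. The function name `column_conservation`, the gap handling, and the toy alignment are assumptions made for this example, not the authors' procedure.

```python
import math
from collections import Counter


def column_conservation(msa, gap_chars="-."):
    """Return one conservation score in [0, 1] per alignment column.

    Assumption: conservation = 1 - (Shannon entropy / log2(20)),
    computed over non-gap residues only. This is a generic metric,
    not necessarily the one used in the study.
    """
    if not msa:
        return []
    length = len(msa[0])
    if any(len(seq) != length for seq in msa):
        raise ValueError("all aligned sequences must have the same length")

    max_entropy = math.log2(20)  # 20 standard amino acids
    scores = []
    for col in range(length):
        residues = [seq[col].upper() for seq in msa if seq[col] not in gap_chars]
        if not residues:
            scores.append(0.0)  # all-gap column: treated as unconserved
            continue
        total = len(residues)
        counts = Counter(residues)
        entropy = -sum((n / total) * math.log2(n / total) for n in counts.values())
        # Clamp in case non-standard symbols push entropy above log2(20).
        scores.append(max(0.0, 1.0 - entropy / max_entropy))
    return scores


# Hypothetical three-sequence alignment, for illustration only.
toy_msa = ["ACD", "ACE", "A-F"]
toy_scores = column_conservation(toy_msa)
```

Averaging such scores over the residues of a fold-switching or disordered region would give one region-level value that can be compared across protein classes, under the same assumptions.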
Identifiers
pubmed: 35634782
doi: 10.1002/pro.4353
pmc: PMC9134877
Chemical substances
Intrinsically Disordered Proteins
0
Publication types
Journal Article
Research Support, N.I.H., Intramural
Languages
eng
Citation subsets
IM
Pagination
e4353
Copyright information
Published 2022. This article is a U.S. Government work and is in the public domain in the USA.