Heterogeneity of the GFP fitness landscape and data-driven protein design.
E. coli
GFP
computational biology
evolutionary biology
fitness landscape
machine learning
molecular evolution
protein engineering
systems biology
Journal
eLife
ISSN: 2050-084X
Abbreviated title: Elife
Country: England
NLM ID: 101579614
Publication information
Publication date: 05 05 2022
History:
received: 25 11 2021
accepted: 25 03 2022
pubmed: 6 5 2022
medline: 24 5 2022
entrez: 5 5 2022
Status:
epublish
Abstract
Studies of protein fitness landscapes reveal biophysical constraints guiding protein evolution and empower prediction of functional proteins. However, generalisation of these findings is limited by the scarcity of systematic data on fitness landscapes of proteins with a defined evolutionary relationship. We characterized the fitness peaks of four orthologous fluorescent proteins with a broad range of sequence divergence. While two of the four studied fitness peaks were sharp, the other two were considerably flatter, being almost entirely free of epistatic interactions. Mutationally robust proteins, characterized by a flat fitness peak, were not optimal templates for machine-learning-driven protein design; instead, predictions were more accurate for fragile proteins with epistatic landscapes. Our work provides insights for the practical application of fitness landscape heterogeneity in protein engineering.
Identifiers
pubmed: 35510622
doi: 10.7554/eLife.75842
pii: 75842
pmc: PMC9119679
Chemical substances
Proteins
0
Publication types
Journal Article
Research Support, Non-U.S. Gov't
Languages
eng
Citation subsets
IM
Grants
Agency: Medical Research Council
ID: MC_UP_1605/9
Country: United Kingdom
Agency: Medical Research Council
ID: UKRI MC-A658-5QEA0
Country: United Kingdom
Copyright information
© 2022, Gonzalez Somermeyer et al.
Conflict of interest statement
LG, AF, AM, NB, AI, JM, MA, EP, KS, FK: No competing interests declared.