Modeling Sequence-Space Exploration and Emergence of Epistatic Signals in Protein Evolution.
data-driven models
epistasis
fitness landscapes
protein evolution
sequence space
Journal
Molecular biology and evolution
ISSN: 1537-1719
Abbreviated title: Mol Biol Evol
Country: United States
NLM ID: 8501455
Publication information
Publication date:
07 01 2022
History:
pubmed: 10 11 2021
medline: 1 4 2022
entrez: 9 11 2021
Status:
ppublish
Abstract
During their evolution, proteins explore sequence space through an interplay between random mutations and phenotypic selection. Here, we build on recent progress in reconstructing data-driven fitness landscapes for families of homologous proteins to propose stochastic models of experimental protein evolution. These models quantitatively predict important features of experimentally evolved sequence libraries, such as fitness distributions and position-specific mutational spectra. They also allow us to efficiently simulate sequence libraries for a wide range of combinations of experimental parameters, such as sequence divergence, selection strength, and library size. We showcase the potential of the approach by reanalyzing two recent experiments that sought to determine protein structure from signals of epistasis emerging in experimental sequence libraries. To be detectable, these signals require sufficiently large and sufficiently diverged libraries. Our modeling framework offers a quantitative explanation for the different outcomes of recently published experiments. Furthermore, we can forecast the outcome of time- and resource-intensive evolution experiments, thereby opening a way to computationally optimize experimental protocols.
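The interplay of random mutation and selection described in the abstract can be illustrated with a minimal sketch. This is not the authors' model: the landscape below is a randomly generated toy with site terms and nearest-neighbour pair terms (an assumption standing in for a data-driven landscape), and the function names, the per-site mutation rate `mu`, and the selection strength `beta` are all hypothetical choices made for illustration.

```python
import math
import random

Q = 20  # number of amino-acid states (integer-encoded)

def make_toy_landscape(length, rng, coupling_scale=0.5):
    """Random site terms plus nearest-neighbour pair terms (toy assumption)."""
    fields = [[rng.gauss(0.0, 1.0) for _ in range(Q)] for _ in range(length)]
    couplings = {
        (i, i + 1): [[rng.gauss(0.0, coupling_scale) for _ in range(Q)]
                     for _ in range(Q)]
        for i in range(length - 1)
    }
    return fields, couplings

def fitness(seq, fields, couplings):
    """Additive site contributions plus pairwise (epistatic) contributions."""
    site_terms = sum(fields[i][a] for i, a in enumerate(seq))
    pair_terms = sum(J[seq[i]][seq[j]] for (i, j), J in couplings.items())
    return site_terms + pair_terms

def mutate(seq, mu, rng):
    """Redraw each site with probability mu (the redraw may return the same state)."""
    return [rng.randrange(Q) if rng.random() < mu else a for a in seq]

def evolve(wild_type, fields, couplings, rounds, library_size, mu, beta, rng):
    """Alternate mutation and fitness-weighted resampling for a number of rounds."""
    library = [list(wild_type) for _ in range(library_size)]
    for _ in range(rounds):
        mutants = [mutate(s, mu, rng) for s in library]
        scores = [beta * fitness(s, fields, couplings) for s in mutants]
        top = max(scores)  # subtract the maximum to keep exp() numerically safe
        weights = [math.exp(x - top) for x in scores]
        library = rng.choices(mutants, weights=weights, k=library_size)
    return library

def mean_divergence(library, wild_type):
    """Average number of sites that differ from the wild type."""
    return sum(sum(a != b for a, b in zip(s, wild_type))
               for s in library) / len(library)

if __name__ == "__main__":
    rng = random.Random(0)
    fields, couplings = make_toy_landscape(30, rng)
    wild_type = [rng.randrange(Q) for _ in range(30)]
    library = evolve(wild_type, fields, couplings, rounds=10,
                     library_size=200, mu=0.02, beta=1.0, rng=rng)
    print("mean divergence from wild type:", mean_divergence(library, wild_type))
```

Varying `rounds`, `mu`, `beta`, and `library_size` in such a sketch mimics, in a very simplified way, scanning the experimental parameters mentioned above (sequence divergence, selection strength, and library size).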
Identifiers
pubmed: 34751386
pii: 6424001
doi: 10.1093/molbev/msab321
pmc: PMC8789065
Chemical substances
Proteins
0
Publication types
Journal Article
Research Support, Non-U.S. Gov't
Languages
eng
Citation subsets
IM
Copyright information
© The Author(s) 2021. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.