DeCoDe: degenerate codon design for complete protein-coding DNA libraries.


Journal

Bioinformatics (Oxford, England)
ISSN: 1367-4811
Abbreviated title: Bioinformatics
Country: England
NLM ID: 9808944

Publication information

Publication date:
01 06 2020
History:
received: 13 08 2019
revised: 13 02 2020
accepted: 13 03 2020
pubmed: 17 3 2020
medline: 30 10 2020
entrez: 17 3 2020
Status: ppublish

Abstract

High-throughput protein screening is a critical technique for dissecting and designing protein function. Libraries for these assays can be created through a number of means, including targeted or random mutagenesis of a template protein sequence or direct DNA synthesis. However, mutagenic library construction methods often yield vastly more nonfunctional than functional variants and, despite advances in large-scale DNA synthesis, individual synthesis of each desired DNA template is often prohibitively expensive. Consequently, many protein-screening libraries rely on the use of degenerate codons (DCs), mixtures of DNA bases incorporated at specific positions during DNA synthesis, to generate highly diverse protein-variant pools from only a few low-cost synthesis reactions. However, selecting DCs for sets of sequences that covary at multiple positions dramatically increases the difficulty of designing a DC library and leads to the creation of many undesired variants that can quickly outstrip screening capacity.

We introduce a novel algorithm for total DC library optimization, degenerate codon design (DeCoDe), based on integer linear programming. DeCoDe significantly outperforms state-of-the-art DC optimization algorithms and scales well to more than a hundred proteins sharing complex patterns of covariation (e.g. the lab-derived avGFP lineage). Moreover, DeCoDe is, to our knowledge, the first DC design algorithm with the capability to encode mixed-length protein libraries. We anticipate DeCoDe to be broadly useful for a variety of library generation problems, ranging from protein engineering attempts that leverage mutual information to the reconstruction of ancestral protein states.

Availability: github.com/OrensteinLab/DeCoDe
Contact: yaronore@bgu.ac.il
Supplementary information: Supplementary data are available at Bioinformatics online.
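To make the covariation problem described in the abstract concrete, the following minimal sketch expands degenerate codons into the amino acids they encode and counts the variants produced when one DC is chosen independently per position. It is an illustration only and is not DeCoDe's integer linear programming method. The function names (`expand_codon`, `encoded_amino_acids`, `library_variants`) and the two-residue target set are hypothetical; the IUPAC ambiguity codes and the standard genetic code are the only external facts assumed.

```python
from itertools import product

# IUPAC nucleotide ambiguity codes (standard nomenclature).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

# Standard genetic code, with bases ordered T, C, A, G at each codon position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def expand_codon(dc):
    """Return every concrete codon covered by a degenerate codon such as 'NNK'."""
    return ["".join(p) for p in product(*(IUPAC[base] for base in dc.upper()))]


def encoded_amino_acids(dc):
    """Return the set of amino acids (and '*' for stop) a degenerate codon encodes."""
    return {CODON_TABLE[codon] for codon in expand_codon(dc)}


def library_variants(dcs):
    """Return all protein sequences encoded by one degenerate codon per position."""
    per_position = [sorted(encoded_amino_acids(dc)) for dc in dcs]
    return {"".join(p) for p in product(*per_position)}


# Hypothetical targets that covary at two positions: Ala-Val and Gly-Leu.
targets = {"AV", "GL"}
# GST covers GCT (Ala) and GGT (Gly); STT covers CTT (Leu) and GTT (Val).
variants = library_variants(["GST", "STT"])
undesired = variants - targets

print(sorted(variants))
print(sorted(undesired))
```

In this toy case a single pair of DCs covers both targets but also encodes the two off-target combinations. Choosing DCs so that such off-target variants stay within screening capacity is the kind of trade-off the abstract refers to.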

Identifiers

pubmed: 32176271
pii: 5807608
doi: 10.1093/bioinformatics/btaa162
pmc: PMC7267834

Chemical substances

Codon 0
Proteins 0

Publication types

Journal Article
Research Support, N.I.H., Extramural
Research Support, Non-U.S. Gov't
Research Support, U.S. Gov't, Non-P.H.S.

Languages

eng

Citation subsets

IM

Pagination

3357-3364

Grants

Agency: NIGMS NIH HHS
ID: DP2 GM123641
Country: United States

Copyright information

© The Author(s) 2020. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.

Authors

Tyler C Shimko (TC)

Department of Genetics.

Polly M Fordyce (PM)

Department of Genetics.
Department of Bioengineering.
Stanford ChEM-H, Stanford University, Stanford, CA 94305, USA.
Chan Zuckerberg Biohub, San Francisco, CA 94158, USA.

Yaron Orenstein (Y)

School of Electrical and Computer Engineering, Ben-Gurion University of the Negev, Beer-Sheva 8410501, Israel.
