An exploratory in silico comparison of open-source codon harmonization tools.

Keywords: CHARMING; Codon harmonization; Codon usage bias; CodonWizard; EuGene; Galaxy; Synthetic biology

Journal

Microbial cell factories
ISSN: 1475-2859
Abbreviated title: Microb Cell Fact
Country: England
NLM ID: 101139812

Publication information

Publication date:
06 Nov 2023
History:
received: 14 07 2023
accepted: 14 10 2023
medline: 8 11 2023
pubmed: 7 11 2023
entrez: 6 11 2023
Status: epublish

Abstract

BACKGROUND: Expressing a gene in a heterologous host without first adapting its native sequence can affect both the amount of protein synthesized and its folding, hampering protein activity and even cell viability. Over the past decades, several strategies have been developed to optimize the translation of heterologous genes by accommodating the difference in codon usage between species. Although a handful of studies have assessed codon optimization strategies, to the best of our knowledge no study has evaluated and compared codon harmonization algorithms. To highlight their importance and encourage meaningful discussion, we compared the in silico performance of different open-source codon harmonization tools and investigated the influence of different gene-specific factors.

RESULTS: In total, 27 genes were harmonized with four tools toward two different heterologous hosts. The difference in %MinMax values between the harmonized and the original sequences (ΔMinMax) was calculated, and the results were analyzed statistically. Not all tools performed similarly, so the choice of tool should depend on the intended application. Almost all biological factors under investigation (GC content, RNA secondary structures and choice of heterologous host) had a significant influence on the harmonization results and thus must be taken into account. These findings were substantiated using a validation dataset consisting of 8 strategically chosen genes.

CONCLUSIONS: Due to the size of the dataset, no complex models could be developed. However, this initial study showcases significant differences between the results of various codon harmonization tools. Although more elaborate investigation is needed, it is clear that biological factors such as GC content, RNA secondary structures and the heterologous host must be taken into account when selecting a codon harmonization tool.
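The abstract does not spell out how ΔMinMax was computed, so the following is only a minimal sketch of one plausible reading. It assumes the commonly used sliding-window definition of %MinMax (window average of codon frequencies, scaled between the minimum, average and maximum achievable for the same amino acids) and assumes that ΔMinMax can be summarized as the mean absolute per-window difference between two profiles. The codon usage tables `SOURCE` and `HOST`, the function names and the window size are hypothetical and chosen for illustration only.

```python
from statistics import mean

# Hypothetical toy codon usage tables (arbitrary frequencies, illustration only).
SOURCE = {"Ala": {"GCT": 10, "GCC": 40, "GCA": 10, "GCG": 20},
          "Lys": {"AAA": 10, "AAG": 30}}
HOST = {"Ala": {"GCT": 20, "GCC": 10, "GCA": 40, "GCG": 10},
        "Lys": {"AAA": 30, "AAG": 10}}


def percent_min_max(seq, usage, window=18):
    """Return the %MinMax value of every sliding window of `window` codons."""
    codon_to_aa = {c: aa for aa, codons in usage.items() for c in codons}
    codons = [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]
    profile = []
    for start in range(len(codons) - window + 1):
        win = codons[start:start + window]
        # Frequencies of the synonymous codons for each position in the window.
        families = [usage[codon_to_aa[c]] for c in win]
        actual = mean(fam[c] for fam, c in zip(families, win))
        highest = mean(max(fam.values()) for fam in families)
        lowest = mean(min(fam.values()) for fam in families)
        average = mean(mean(fam.values()) for fam in families)
        if actual >= average:
            # %Max: positive, 100 when only the most frequent codons are used.
            span = highest - average
            profile.append(100 * (actual - average) / span if span else 0.0)
        else:
            # %Min: negative, -100 when only the rarest codons are used.
            profile.append(-100 * (average - actual) / (average - lowest))
    return profile


def mean_abs_delta(profile_a, profile_b):
    """Assumed ΔMinMax summary: mean absolute per-window difference."""
    if len(profile_a) != len(profile_b):
        raise ValueError("profiles must cover the same number of windows")
    return mean(abs(a - b) for a, b in zip(profile_a, profile_b))


# Native gene scored against its (hypothetical) source organism.
native = percent_min_max("GCCAAG" * 2, SOURCE, window=2)
# The same sequence moved unchanged into the (hypothetical) host.
unchanged = percent_min_max("GCCAAG" * 2, HOST, window=2)
# A synonymous variant that uses the host's most frequent codons instead.
harmonized = percent_min_max("GCAAAA" * 2, HOST, window=2)

print(mean_abs_delta(native, unchanged))
print(mean_abs_delta(native, harmonized))
```

In this toy case the unchanged sequence lands on the host's rarest codons, while the synonymous variant reproduces the native profile in the host, which is the behaviour a harmonization tool aims for. Real analyses would use genome-derived codon usage tables and a larger window.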

Identifiers

pubmed: 37932726
doi: 10.1186/s12934-023-02230-y
pii: 10.1186/s12934-023-02230-y
pmc: PMC10626681

Chemical substances

Codon 0
Proteins 0
Biological Factors 0

Publication types

Journal Article

Languages

eng

Citation subsets

IM

Pagination

227

Grants

Agency: Fonds Wetenschappelijk Onderzoek
ID: 198258
Agency: Fonds Wetenschappelijk Onderzoek
ID: 1SB8423N
Agency: Fonds Wetenschappelijk Onderzoek
ID: S001422N

Copyright information

© 2023. The Author(s).

Authors

Thomas Willems (T)

Centre for Industrial Biotechnology and Biocatalysis (InBio.be), Department of Biotechnology, Faculty of Bioscience Engineering, Ghent University, Coupure Links 653, Ghent, 9000, Belgium.

Wim Hectors (W)

Centre for Industrial Biotechnology and Biocatalysis (InBio.be), Department of Biotechnology, Faculty of Bioscience Engineering, Ghent University, Coupure Links 653, Ghent, 9000, Belgium.

Jeltien Rombaut (J)

Centre for Industrial Biotechnology and Biocatalysis (InBio.be), Department of Biotechnology, Faculty of Bioscience Engineering, Ghent University, Coupure Links 653, Ghent, 9000, Belgium.

Anne-Sofie De Rop (AS)

Centre for Industrial Biotechnology and Biocatalysis (InBio.be), Department of Biotechnology, Faculty of Bioscience Engineering, Ghent University, Coupure Links 653, Ghent, 9000, Belgium.

Stijn Goegebeur (S)

Centre for Industrial Biotechnology and Biocatalysis (InBio.be), Department of Biotechnology, Faculty of Bioscience Engineering, Ghent University, Coupure Links 653, Ghent, 9000, Belgium.

Tom Delmulle (T)

Centre for Industrial Biotechnology and Biocatalysis (InBio.be), Department of Biotechnology, Faculty of Bioscience Engineering, Ghent University, Coupure Links 653, Ghent, 9000, Belgium.

Maarten L De Mol (ML)

Centre for Industrial Biotechnology and Biocatalysis (InBio.be), Department of Biotechnology, Faculty of Bioscience Engineering, Ghent University, Coupure Links 653, Ghent, 9000, Belgium.

Sofie L De Maeseneire (SL)

Centre for Industrial Biotechnology and Biocatalysis (InBio.be), Department of Biotechnology, Faculty of Bioscience Engineering, Ghent University, Coupure Links 653, Ghent, 9000, Belgium. Sofie.DeMaeseneire@UGent.be.

Wim K Soetaert (WK)

Centre for Industrial Biotechnology and Biocatalysis (InBio.be), Department of Biotechnology, Faculty of Bioscience Engineering, Ghent University, Coupure Links 653, Ghent, 9000, Belgium.
