Removing the Bottleneck: Introducing cMatch - A Lightweight Tool for Construct-Matching in Synthetic Biology.
Keywords
genetic construct
matching
parts
quality control
software
synthetic biology
tool
Journal
Frontiers in bioengineering and biotechnology
ISSN: 2296-4185
Abbreviated title: Front Bioeng Biotechnol
Country: Switzerland
NLM ID: 101632513
Publication information
Publication date:
2021
History:
received: 2021-09-28
accepted: 2021-12-14
entrez: 2022-01-27
pubmed: 2022-01-28
medline: 2022-01-28
Status: epublish
Abstract
We present a software tool, cMatch, that reconstructs and identifies synthetic genetic constructs from their sequences, or from a set of sub-sequences, based on two practical pieces of information: their modular structure and libraries of components. Although cMatch was developed for combinatorial pathway engineering problems and addresses their quality control (QC) bottleneck, it is not restricted to these applications. QC takes place after assembly, transformation and growth. It has a simple goal: to verify that the genetic material contained in a cell matches what was intended to be built and, when it does not, to locate the discrepancies and estimate their severity. The QC step is crucial for reproducibility and reliability, and failure at this step requires repeating the construction and/or sequencing steps. When performed manually or semi-manually, QC is an extremely time-consuming, error-prone process that scales very poorly with the number of constructs and their complexity. To make QC frictionless and more reliable, cMatch automates an operation we have called "construct-matching". Construct-matching is more thorough than simple sequence-matching, as it matches at the functional level and quantifies the match both at the level of individual components and across the whole construct. Two algorithms, CM_1 and CM_2, are presented; they differ in the nature of their inputs. CM_1 is the core construct-matching algorithm and is to be used when input sequences are long enough to cover constructs in their entirety (e.g., obtained with methods such as next-generation sequencing). CM_2 is an extension designed to deal with shorter sequences (e.g., obtained with Sanger sequencing) that need recombining. Both algorithms are shown to yield accurate construct-matching in a few minutes (even on hardware with limited processing power), together with a set of metrics that can be used to improve the robustness of the decision-making process. To ensure reliability and reproducibility, cMatch builds on the highly validated Smith-Waterman pairwise-matching algorithm. All the tests presented were conducted on synthetic data for challenging yet realistic constructs and on real data gathered during studies on a metabolic engineering example (lycopene production).
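To make the idea of matching at the component level concrete, here is a minimal Python sketch. It is not the cMatch implementation and does not reproduce CM_1 or CM_2: it only illustrates how a Smith-Waterman local-alignment score could be used to pick, for each slot of a construct template, the best-scoring component from a library, and to summarise the result with a per-component and a whole-construct score. The function names, the scoring parameters (match=2, mismatch=-1, gap=-2), the normalisation, the use of the minimum as the whole-construct score, and the toy sequences are all assumptions made for illustration. The sketch also ignores the order and position of components, which a real tool would need to check.

```python
def smith_waterman_score(query, target, match=2, mismatch=-1, gap=-2):
    """Best local-alignment score of `query` against `target` (linear gap penalty)."""
    prev = [0] * (len(target) + 1)
    best = 0
    for i in range(1, len(query) + 1):
        curr = [0] * (len(target) + 1)
        for j in range(1, len(target) + 1):
            sub = match if query[i - 1] == target[j - 1] else mismatch
            # Local alignment: a cell never drops below zero.
            curr[j] = max(0, prev[j - 1] + sub, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


def match_construct(sequence, template, libraries, match=2):
    """For each slot in `template`, pick the library component with the highest
    normalised score (1.0 means the component is found exactly in `sequence`).

    Hypothetical helper: the whole-construct score is taken as the weakest
    component score, which is only one of many possible choices.
    """
    results = []
    for slot in template:
        scored = []
        for name, component in libraries[slot].items():
            raw = smith_waterman_score(component, sequence, match=match)
            scored.append((raw / (match * len(component)), name))
        score, name = max(scored)
        results.append((slot, name, score))
    overall = min(score for _, _, score in results)
    return results, overall


# Toy data (invented for this example, not taken from the paper).
libraries = {
    "promoter": {"pA": "TTGACAGCTAGC", "pB": "TTTACGGCTAGC"},
    "cds": {"geneX": "ATGGCTAAAGGTTAA", "geneY": "ATGCCCTTTGGGTAA"},
}
sequence = "GG" + "TTGACAGCTAGC" + "AC" + "ATGCCCTTTGGGTAA" + "CC"

results, overall = match_construct(sequence, ["promoter", "cds"], libraries)
for slot, name, score in results:
    print(slot, name, round(score, 3))
print("whole-construct score:", round(overall, 3))
```

Normalising each raw score by the score of a perfect match gives a value between 0 and 1 per component, which makes a mutated or wrong part easy to spot next to intact ones.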
Identifiers
pubmed: 35083201
doi: 10.3389/fbioe.2021.785131
pii: 785131
pmc: PMC8784771
Publication types
Journal Article
Languages
eng
Pagination
785131
Copyright information
Copyright © 2022 Casas, Bultelle, Motraghi and Kitney.
Conflict of interest statement
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.