EnzymeML: seamless data flow and modeling of enzymatic data.
Journal
Nature methods
ISSN: 1548-7105
Abbreviated title: Nat Methods
Country: United States
NLM ID: 101215604
Publication information
Publication date:
03 2023
History:
received: 2020-11-04
accepted: 2022-12-21
pubmed: 2023-02-10
medline: 2023-03-14
entrez: 2023-02-09
Status:
ppublish
Abstract
The design of biocatalytic reaction systems is highly complex owing to the dependency of the estimated kinetic parameters on the enzyme, the reaction conditions, and the modeling method. Consequently, reproducibility of enzymatic experiments and reusability of enzymatic data are challenging. We developed the XML-based markup language EnzymeML to enable storage and exchange of enzymatic data such as reaction conditions, the time course of the substrate and the product, kinetic parameters and the kinetic model, thus making enzymatic data findable, accessible, interoperable and reusable (FAIR). The feasibility and usefulness of the EnzymeML toolbox are demonstrated in six scenarios, for which data and metadata of different enzymatic reactions are collected and analyzed. EnzymeML serves as a seamless communication channel between experimental platforms, electronic lab notebooks, tools for modeling of enzyme kinetics, publication platforms and enzymatic reaction databases. EnzymeML is open and transparent, and invites the community to contribute. All documents and codes are freely available at https://enzymeml.org.
Identifiers
pubmed: 36759590
doi: 10.1038/s41592-022-01763-1
pii: 10.1038/s41592-022-01763-1
Publication types
Journal Article
Research Support, U.S. Gov't, P.H.S.
Research Support, Non-U.S. Gov't
Research Support, U.S. Gov't, Non-P.H.S.
Languages
eng
Citation subsets
IM
Pagination
400-402
Grants
Agency: U.S. Department of Health & Human Services | U.S. Food and Drug Administration (U.S. Food & Drug Administration)
ID: Grant U01FD006484
Copyright information
© 2023. The Author(s), under exclusive licence to Springer Nature America, Inc.
References
Iqbal, S. A., Wallach, J. D., Khoury, M. J., Schully, S. D. & Ioannidis, J. P. A. Reproducible research practices and transparency across the biomedical literature. PLoS Biol. 14, e1002333 (2016).
doi: 10.1371/journal.pbio.1002333
pubmed: 26726926
pmcid: 4699702
Wulf, C. et al. A unified research data infrastructure for catalysis research—challenges and concepts. ChemCatChem 13, 3223–3236 (2021).
doi: 10.1002/cctc.202001974
Halling, P. et al. An empirical analysis of enzyme function reporting for experimental reproducibility: missing/incomplete information in published papers. Biophys. Chem. 242, 22–27 (2018).
doi: 10.1016/j.bpc.2018.08.004
pubmed: 30195215
pmcid: 6258184
Stroberg, W. & Schnell, S. On the estimation errors of KM and V from time-course experiments using the Michaelis–Menten equation. Biophys. Chem. 219, 17–27 (2016).
doi: 10.1016/j.bpc.2016.09.004
pubmed: 27677118
Cvijovic, M. et al. Bridging the gaps in systems biology. Mol. Genet. Genomics 289, 727–734 (2014).
doi: 10.1007/s00438-014-0843-3
pubmed: 24728588
Pleiss, J. Standardized data, scalable documentation, sustainable storage—EnzymeML as a basis for FAIR data management in biocatalysis. ChemCatChem 13, 3909–3913 (2021).
doi: 10.1002/cctc.202100822
Range, J. et al. EnzymeML—a data exchange format for biocatalysis and enzymology. FEBS J. 289, 5864–5874 (2022).
doi: 10.1111/febs.16318
pubmed: 34890097
Wilkinson, M. D. et al. The FAIR Guiding Principles for scientific data management and stewardship. Sci. Data 3, 160018 (2016).
doi: 10.1038/sdata.2016.18
pubmed: 26978244
pmcid: 4792175
Tipton, K. F. et al. Standards for reporting enzyme data: the STRENDA Consortium: what it aims to do and why it should be helpful. Perspect. Sci. 1, 131–137 (2014).
doi: 10.1016/j.pisc.2014.02.012
Hucka, M. et al. The systems biology markup language (SBML): a medium for representation and exchange of biochemical network models. Bioinformatics 19, 524–531 (2003).
doi: 10.1093/bioinformatics/btg015
pubmed: 12611808
Malzacher, S., Range, J., Halupczok, C., Pleiss, J. & Rother, D. BioCatHub, a graphical user interface for standardized data acquisition in biocatalysis. Chem. Ing. Tech. 92, 1251–1251 (2020).
doi: 10.1002/cite.202055297
Hoops, S. et al. COPASI—a complex pathway simulator. Bioinformatics 22, 3067–3074 (2006).
doi: 10.1093/bioinformatics/btl485
pubmed: 17032683
Christensen, C. D., Hofmeyr, J. H. S. & Rohwer, J. M. PySCeSToolbox: a collection of metabolic pathway analysis tools. Bioinformatics 34, 124–125 (2018).
doi: 10.1093/bioinformatics/btx567
pubmed: 28968872
Swainston, N. et al. STRENDA DB: enabling the validation and sharing of enzyme kinetics data. FEBS J. 285, 2193–2204 (2018).
doi: 10.1111/febs.14427
pubmed: 29498804
pmcid: 6005732
Wittig, U., Rey, M., Weidemann, A., Kania, R. & Müller, W. SABIO-RK: an updated resource for manually curated biochemical reaction kinetics. Nucleic Acids Res. 46, D656–D660 (2018).
doi: 10.1093/nar/gkx1065
pubmed: 29092055
Bezerra, R. M. F. & Dias, A. A. Discrimination among eight modified Michaelis–Menten kinetics models of cellulose hydrolysis with a large range of substrate/enzyme ratios: inhibition by cellobiose. Appl. Biochem. Biotechnol. 112, 173–184 (2004).
doi: 10.1385/ABAB:112:3:173
pubmed: 15007185
Buchholz, P. C. F., Ohs, R., Spiess, A. C. & Pleiss, J. Progress curve analysis within BioCatNet: comparing kinetic models for enzyme-catalyzed self-ligation. Biotechnol. J. 14, e1800183 (2019).
doi: 10.1002/biot.201800183
pubmed: 29999245
Dias Gomes, M., Moiseyenko, R. P., Baum, A., Jørgensen, T. M. & Woodley, J. M. Use of image analysis to understand enzyme stability in an aerated stirred reactor. Biotechnol. Prog. 35, e2878 (2019).
doi: 10.1002/btpr.2878
pubmed: 31254450
Woodley, J. M. Advances in biological conversion technologies: new opportunities for reaction engineering. React. Chem. Eng. 5, 632–640 (2020).
doi: 10.1039/C9RE00422J
Courtot, M. et al. Controlled vocabularies and semantics in systems biology. Mol. Syst. Biol. 7, 543 (2011).
doi: 10.1038/msb.2011.77
pubmed: 22027554
pmcid: 3261705
Kluyver, T. et al. Jupyter Notebooks—a publishing format for reproducible computational workflows. In Positioning and Power in Academic Publishing: Players, Agents and Agendas, Proc. 20th Int. Conf. on Electronic Publishing (eds. Loizides, F. & Schmidt, B.) 87–90 (IOS Press, 2016).
Hunter, J. D. Matplotlib: a 2D graphics environment. Comput. Sci. Eng. 9, 90–95 (2007).
doi: 10.1109/MCSE.2007.55
Virtanen, P. et al. SciPy 1.0: fundamental algorithms for scientific computing in Python. Nat. Methods 17, 261–272 (2020).
doi: 10.1038/s41592-019-0686-2
pubmed: 32015543
pmcid: 7056644
Newville, M. et al. lmfit/lmfit-py: 1.1.0; https://doi.org/10.5281/zenodo.7370358 (2022).
Pinto, M. F. et al. interferENZY: a web-based tool for enzymatic assay validation and standardized kinetic analysis. J. Mol. Biol. 433, 166613 (2021).
doi: 10.1016/j.jmb.2020.07.025
pubmed: 32768452
Crosas, M. The Dataverse Network®: an open-source application for sharing, discovering and preserving data. D-Lib Magazine 17, 2 (2011).
Olivier, B. G., Rohwer, J. M. & Hofmeyr, J.-H. S. Modelling cellular systems with PySCeS. Bioinformatics 21, 560–561 (2005).
doi: 10.1093/bioinformatics/bti046
pubmed: 15454409
Dräger, A. et al. JSBML: a flexible Java library for working with SBML. Bioinformatics 27, 2167–2168 (2011).
doi: 10.1093/bioinformatics/btr361
pubmed: 21697129
pmcid: 3137227