EHreact: Extended Hasse Diagrams for the Extraction and Scoring of Enzymatic Reaction Templates.


Journal

Journal of chemical information and modeling
ISSN: 1549-960X
Abbreviated title: J Chem Inf Model
Country: United States
NLM ID: 101230060

Publication information

Publication date:
25 Oct 2021
History:
pubmed: 30 Sep 2021
medline: 23 Nov 2021
entrez: 29 Sep 2021
Status: ppublish

Abstract

Data-driven computer-aided synthesis planning that uses organic or biocatalyzed reactions from large databases has gained increasing interest over the last decade, sparking the development of numerous tools to extract, apply, and score general reaction templates. Generating reaction rules for enzymatic reactions is especially challenging because substrate promiscuity varies between enzymes, so the optimal level of rule specificity and the optimal number of included atoms differ from enzyme to enzyme. This complicates automated extraction from databases and has promoted the creation of manually curated reaction rule sets. Here, we present EHreact, a purely data-driven, open-source software tool that extracts and scores reaction rules from sets of reactions known to be catalyzed by an enzyme, at appropriate levels of specificity and without expert knowledge. EHreact extracts reaction rules and groups them into tree-like structures, Hasse diagrams, based on common substructures in the imaginary transition structures. Each diagram can be used to output a single reaction rule or a set of rules, and to calculate the probability that a new substrate is processed by the given enzyme. It does so by inferring information about the reactive site of the enzyme from the known reactions and their grouping in the template tree. EHreact heuristically predicts the activity of a given enzyme on a new substrate, outperforming current approaches in accuracy and functionality.
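
The grouping idea in the abstract can be illustrated with a toy sketch. This is not EHreact's code or API. As a loud simplifying assumption, each template is modeled as a set of hypothetical feature labels, and "is a substructure of" is approximated by the subset relation. The function names `build_hasse` and `deepest_match_depth` are invented for this illustration, and the depth value is only a crude stand-in for the scoring described in the abstract.

```python
from typing import Dict, FrozenSet, List

# Hypothetical stand-in for a substructure pattern: a set of feature labels.
Template = FrozenSet[str]


def build_hasse(templates: List[Template]) -> Dict[Template, List[Template]]:
    """Map each template to its direct, more specific children.

    An edge general -> specific is kept only if no third template lies
    strictly between the two (the covering relation of the subset order).
    """
    nodes = sorted(set(templates), key=lambda t: (len(t), sorted(t)))
    children: Dict[Template, List[Template]] = {t: [] for t in nodes}
    for general in nodes:
        for specific in nodes:
            if not general < specific:  # strict subset test
                continue
            has_intermediate = any(general < mid < specific for mid in nodes)
            if not has_intermediate:
                children[general].append(specific)
    return children


def deepest_match_depth(
    children: Dict[Template, List[Template]],
    root: Template,
    substrate: FrozenSet[str],
) -> int:
    """Depth of the most specific template contained in the substrate.

    Returns -1 if even the root template does not match.
    """
    if not root <= substrate:
        return -1
    best = 0
    for child in children[root]:
        depth = deepest_match_depth(children, child, substrate)
        if depth >= 0:
            best = max(best, depth + 1)
    return best


# Toy data: one general template and three more specific ones.
root = frozenset({"C=O"})
acid = frozenset({"C=O", "O-H"})
aryl_acid = frozenset({"C=O", "O-H", "aromatic"})
amide = frozenset({"C=O", "N-H"})

diagram = build_hasse([root, acid, aryl_acid, amide])
query = frozenset({"C=O", "O-H", "aromatic", "Cl"})
print(deepest_match_depth(diagram, root, query))
```

In this simplification, a substrate that matches a deeper node shares more context with the known substrates. A real implementation would compare molecular graphs rather than label sets.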

Identifiers

pubmed: 34587449
doi: 10.1021/acs.jcim.1c00921
pmc: PMC8549070

Publication types

Journal Article; Research Support, Non-U.S. Gov't

Languages

eng

Citation subsets

IM

Pagination

4949-4961

Grants

Agency: NIGMS NIH HHS
ID: T32 GM087237
Country: United States

Authors

Esther Heid (E)

Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States.

Samuel Goldman (S)

Computational and Systems Biology, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States.

Karthik Sankaranarayanan (K)

Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States.

Connor W Coley (CW)

Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States.

Christoph Flamm (C)

Department of Theoretical Chemistry, University of Vienna, 1090 Vienna, Austria.

William H Green (WH)

Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States.
