A Machine Learning Bioinformatics Method to Predict Biological Activity from Biosynthetic Gene Clusters.


Journal

Journal of Chemical Information and Modeling
ISSN: 1549-960X
Abbreviated title: J Chem Inf Model
Country: United States
NLM ID: 101230060

Publication information

Publication date:
28 06 2021
History:
pubmed: 28 5 2021
medline: 10 8 2021
entrez: 27 5 2021
Status: ppublish

Abstract

Research in natural products, the genetically encoded small molecules produced by organisms in an idiosyncratic fashion, deals with molecular structure, biosynthesis, and biological activity. Bioinformatics analyses of microbial genomes can successfully reveal the genetic instructions (biosynthetic gene clusters) that produce many natural products. Genes-to-molecule predictions made on biosynthetic gene clusters have revealed many important new structures. There is no comparable method for genes-to-biological-activity predictions. To address this missing pathway, we developed a machine learning bioinformatics method for predicting a natural product's antibiotic activity directly from the sequence of its biosynthetic gene cluster. We trained commonly used machine learning classifiers to predict antibacterial or antifungal activity based on features of known natural product biosynthetic gene clusters. We have identified classifiers that can attain accuracies as high as 80% and that have enabled the identification of biosynthetic enzymes and their corresponding molecular features that are associated with antibiotic activity.
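
As a rough illustration of the idea in the abstract (a classifier trained on gene cluster features, then inspected for features associated with activity), the sketch below fits a small logistic regression in plain Python. It is not the authors' method or code: the feature names (`domain_A`, `domain_B`, `domain_C`), the presence/absence encoding, the toy training set, and the choice of logistic regression are all assumptions made for this example, and the data are invented.

```python
import math

# Hypothetical features: presence (1) or absence (0) of an annotated
# domain in a biosynthetic gene cluster. Names are placeholders.
FEATURES = ["domain_A", "domain_B", "domain_C"]

# Invented toy data: (feature vector, label); label 1 = "antibacterial".
TRAIN = [
    ([1, 0, 1], 1),
    ([1, 1, 0], 1),
    ([1, 0, 0], 1),
    ([0, 1, 0], 0),
    ([0, 0, 1], 0),
    ([0, 1, 1], 0),
]


def sigmoid(z):
    """Logistic function mapping a real score to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(weights, bias, x):
    """Predicted probability that a cluster with features x is active."""
    return sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))


def train_logistic(data, n_features, lr=0.5, epochs=500):
    """Fit logistic regression with full-batch gradient descent."""
    weights = [0.0] * n_features
    bias = 0.0
    n = len(data)
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in data:
            err = predict_proba(weights, bias, x) - y
            for j in range(n_features):
                grad_w[j] += err * x[j]
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def accuracy(weights, bias, data):
    """Fraction of examples whose thresholded prediction matches the label."""
    correct = sum(
        (predict_proba(weights, bias, x) >= 0.5) == bool(y) for x, y in data
    )
    return correct / len(data)


weights, bias = train_logistic(TRAIN, len(FEATURES))
train_accuracy = accuracy(weights, bias, TRAIN)

# Rank features by absolute weight as a crude measure of association.
ranked = sorted(zip(FEATURES, weights), key=lambda p: abs(p[1]), reverse=True)
```

Here `train_accuracy` is measured on the training examples themselves, so it says nothing about generalization; a realistic evaluation would need held-out clusters or cross-validation. The weight ranking in `ranked` is only a simple stand-in for the kind of feature analysis the abstract mentions.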

Identifiers

pubmed: 34042443
doi: 10.1021/acs.jcim.0c01304
pmc: PMC8243324

Chemical substances

Anti-Bacterial Agents 0
Biological Products 0

Publication types

Journal Article
Research Support, N.I.H., Extramural

Languages

eng

Citation subsets

IM

Pagination

2560-2571

Grants

Agency: NIGMS NIH HHS
ID: F32 GM128267
Country: United States
Agency: NCCIH NIH HHS
ID: R01 AT009874
Country: United States

Authors

Allison S Walker (AS)

Department of Biological Chemistry and Molecular Pharmacology, Harvard Medical School, 240 Longwood Avenue, Boston, Massachusetts 02115, United States.

Jon Clardy (J)

Department of Biological Chemistry and Molecular Pharmacology, Harvard Medical School, 240 Longwood Avenue, Boston, Massachusetts 02115, United States.
