Accelerating the Screening of Small Peptide Ligands by Combining Peptide-Protein Docking and Machine Learning.


Journal

International journal of molecular sciences
ISSN: 1422-0067
Abbreviated title: Int J Mol Sci
Country: Switzerland
NLM ID: 101092791

Publication information

Publication date:
29 Jul 2023
History:
received: 13 Jun 2023
revised: 19 Jul 2023
accepted: 28 Jul 2023
medline: 14 Aug 2023
pubmed: 12 Aug 2023
entrez: 12 Aug 2023
Status: epublish

Abstract

This research introduces a novel pipeline that couples machine learning (ML) with molecular docking to accelerate the screening of small peptide ligands by predicting peptide-protein docking performance. Eight ML algorithms were evaluated for their potential. Notably, Light Gradient Boosting Machine (LightGBM), despite having an F1-score and accuracy comparable to its counterparts, showed superior computational efficiency. LightGBM was used to classify the peptide-protein docking performance of the entire tetrapeptide library of 160,000 peptide ligands against four viral envelope proteins. The library was classified into two groups, 'better performers' and 'worse performers'. By training the LightGBM algorithm on just 1% of the tetrapeptide library, we successfully classified the remaining 99% with an accuracy of 0.81-0.85 and an F1-score of 0.58-0.67. Three different molecular docking programs were used to show that the process is not software dependent. With an adjustable probability threshold (from 0.5 to 0.95), the process could be accelerated at least 10-fold while still reaching 90-95% concurrence with the method without ML. This study validates the efficiency of machine learning coupled with molecular docking in rapidly identifying top peptides without relying on high-performance computing power, making it an effective tool for screening potential bioactive compounds.
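
The bookkeeping behind the numbers in the abstract can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' code: it enumerates the 160,000 tetrapeptides over the 20 standard amino acids, sets aside 1% as a training subset, and applies a probability threshold to decide which peptides would still be docked. The function names, the random sampling strategy, the example probabilities, and the speed-up formula (docking runs without ML divided by docking runs with ML) are all assumptions made for illustration. In practice the probabilities would come from a trained classifier such as LightGBM, which is not included here.

```python
import itertools
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # 20 standard residues, one-letter codes


def tetrapeptide_library():
    """Enumerate every tetrapeptide: 20**4 = 160,000 sequences."""
    return ["".join(p) for p in itertools.product(AMINO_ACIDS, repeat=4)]


def split_library(library, train_fraction=0.01, seed=0):
    """Randomly pick a subset to dock and train on (assumed strategy);
    the remainder is left for the classifier to label."""
    rng = random.Random(seed)
    n_train = round(len(library) * train_fraction)
    train = rng.sample(library, n_train)
    train_set = set(train)
    rest = [pep for pep in library if pep not in train_set]
    return train, rest


def select_for_docking(probabilities, threshold):
    """Keep peptides whose predicted probability of being a
    'better performer' meets the threshold.
    `probabilities` maps peptide -> probability (hypothetical input)."""
    return [pep for pep, prob in probabilities.items() if prob >= threshold]


def estimated_speedup(library_size, n_train, n_selected):
    """Assumed definition: docking runs without ML / docking runs with ML."""
    return library_size / (n_train + n_selected)


library = tetrapeptide_library()
train, rest = split_library(library)
print(len(library), len(train), len(rest))  # 160000 1600 158400

# Made-up probabilities for three peptides, for illustration only.
example = {"AAAA": 0.97, "ACDE": 0.62, "WYWY": 0.31}
print(select_for_docking(example, 0.5))   # ['AAAA', 'ACDE']
print(select_for_docking(example, 0.95))  # ['AAAA']

# If 14,400 peptides passed the threshold (a hypothetical count),
# 16,000 docking runs would replace 160,000.
print(estimated_speedup(160_000, 1_600, 14_400))  # 10.0
```

Raising the threshold shrinks the set of peptides sent to docking, which increases the speed-up at the cost of possibly discarding some true top performers. This is the trade-off the abstract reports as 90-95% concurrence with the method without ML.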

Identifiers

pubmed: 37569520
pii: ijms241512144
doi: 10.3390/ijms241512144
pmc: PMC10419121

Chemical substances

Ligands 0
Proteins 0
Peptides 0

Publication types

Journal Article

Languages

eng

Citation subsets

IM

Grants

Agency: NHLBI NIH HHS
ID: R01 HL149452
Country: United States

Authors

Josep-Ramon Codina (JR)

Department of Biochemistry and Molecular Biology, Miller School of Medicine, University of Miami, Miami, FL 33136, USA.

Marcello Mascini (M)

Department of Bioscience and Technology for Food, Agriculture and Environment, University of Teramo, 64100 Teramo, Italy.

Emre Dikici (E)

Department of Biochemistry and Molecular Biology, Miller School of Medicine, University of Miami, Miami, FL 33136, USA.
Dr. John T. Macdonald Foundation Biomedical Nanotechnology Institute (BioNIUM), University of Miami, Miami, FL 33136, USA.

Sapna K Deo (SK)

Department of Biochemistry and Molecular Biology, Miller School of Medicine, University of Miami, Miami, FL 33136, USA.
Dr. John T. Macdonald Foundation Biomedical Nanotechnology Institute (BioNIUM), University of Miami, Miami, FL 33136, USA.

Sylvia Daunert (S)

Department of Biochemistry and Molecular Biology, Miller School of Medicine, University of Miami, Miami, FL 33136, USA.
Dr. John T. Macdonald Foundation Biomedical Nanotechnology Institute (BioNIUM), University of Miami, Miami, FL 33136, USA.
Clinical and Translational Science Institute (CTSI), University of Miami, Miami, FL 33136, USA.
