Accelerating the Screening of Small Peptide Ligands by Combining Peptide-Protein Docking and Machine Learning.
classification algorithms
machine learning
molecular docking
small peptides
Journal
International journal of molecular sciences
ISSN: 1422-0067
Abbreviated title: Int J Mol Sci
Country: Switzerland
NLM ID: 101092791
Publication information
Publication date:
29 Jul 2023
History:
received: 13 Jun 2023
revised: 19 Jul 2023
accepted: 28 Jul 2023
medline: 14 Aug 2023
pubmed: 12 Aug 2023
entrez: 12 Aug 2023
Status:
epublish
Abstract
This research introduces a novel pipeline that couples machine learning (ML) with molecular docking to accelerate the screening of small peptide ligands by predicting peptide-protein docking performance. Eight ML algorithms were analyzed for their potential. Notably, Light Gradient Boosting Machine (LightGBM), despite having an F1-score and accuracy comparable to its counterparts, showed superior computational efficiency. LightGBM was used to classify the peptide-protein docking performance of the entire tetrapeptide library of 160,000 peptide ligands against four viral envelope proteins. The library was classified into two groups, 'better performers' and 'worse performers'. By training the LightGBM algorithm on just 1% of the tetrapeptide library, we successfully classified the remaining 99% with accuracies of 0.81-0.85 and F1-scores of 0.58-0.67. Three different molecular docking programs were used to show that the process is not software dependent. With an adjustable probability threshold (from 0.5 to 0.95), the process could be accelerated at least 10-fold while still achieving 90-95% concurrence with the method without ML. This study validates the efficiency of machine learning coupled with molecular docking in rapidly identifying top peptides without relying on high-performance computing power, making it an effective tool for screening potential bioactive compounds.
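The library size and the 1%/99% split quoted in the abstract can be checked with a short sketch. It assumes the library is every sequence of four residues drawn from the 20 standard amino acids (20^4 = 160,000) and that the training subset is a uniform random sample; the abstract does not say how the 1% was chosen. The function names are hypothetical, and `select_above_threshold` only illustrates one plausible reading of the adjustable probability threshold. The classifier itself (LightGBM in the study) and the docking step are not shown.

```python
import itertools
import random

# One-letter codes of the 20 standard amino acids.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def tetrapeptide_library():
    """Enumerate every tetrapeptide over the 20 standard residues."""
    return ["".join(p) for p in itertools.product(AMINO_ACIDS, repeat=4)]


def split_for_training(library, fraction=0.01, seed=0):
    """Assumption: pick the training subset by uniform random sampling."""
    rng = random.Random(seed)
    n_train = round(len(library) * fraction)
    train = set(rng.sample(library, n_train))
    rest = [p for p in library if p not in train]
    return sorted(train), rest


def select_above_threshold(probabilities, threshold=0.5):
    """Keep peptides whose predicted probability of being a
    'better performer' meets the threshold (hypothetical helper)."""
    return [p for p, prob in probabilities.items() if prob >= threshold]


library = tetrapeptide_library()
train, rest = split_for_training(library)
print(len(library), len(train), len(rest))
```

In this sketch, raising `threshold` toward 0.95 shrinks the set of peptides passed on for full docking, which is where a speed-up of the kind the abstract reports would come from.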
Identifiers
pubmed: 37569520
pii: ijms241512144
doi: 10.3390/ijms241512144
pmc: PMC10419121
Chemical substances
Ligands
0
Proteins
0
Peptides
0
Publication types
Journal Article
Languages
eng
Citation subsets
IM
Grants
Agency: NHLBI NIH HHS
ID: R01 HL149452
Country: United States