MultiPep: a hierarchical deep learning approach for multi-label classification of peptide bioactivities.

deep learning; machine learning; peptide bioactivity prediction; peptide therapeutics

Journal

Biology methods & protocols
ISSN: 2396-8923
Abbreviated title: Biol Methods Protoc
Country: England
NLM ID: 101693064

Publication information

Publication date:
2021
History:
received: 08 09 2021
revised: 28 10 2021
accepted: 17 11 2021
entrez: 15 12 2021
pubmed: 16 12 2021
medline: 16 12 2021
Status: epublish

Abstract

Peptide-based therapeutics are here to stay and will prosper in the future. A key step in identifying novel peptide drugs is the determination of their bioactivities. Recent advances in peptidomics screening approaches hold promise as a strategy for identifying novel drug targets. However, these screenings typically generate an immense number of peptides, and tools for ranking these peptides prior to planning functional studies are warranted. Whereas a couple of tools in the literature predict multiple classes, these are constructed using multiple binary classifiers. We here aimed to use an innovative deep learning approach to generate an improved peptide bioactivity classifier capable of distinguishing between multiple classes. We present MultiPep: a deep learning multi-label classifier that assigns peptides to zero or more of 20 bioactivity classes. We train and test MultiPep on data from several publicly available databases. The same data are used for a hierarchical clustering, whose dendrogram shapes the architecture of MultiPep. We test a new loss function that combines a customized version of the Matthews correlation coefficient with binary cross entropy (BCE), and show that this is better than using class-weighted BCE as the loss function. Further, we show that MultiPep surpasses state-of-the-art peptide bioactivity classifiers and that it predicts known and novel bioactivities of FDA-approved therapeutic peptides. In conclusion, we present innovative machine learning techniques used to produce a peptide prediction tool to aid peptide-based therapy development and hypothesis generation.
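
The abstract does not spell out how the Matthews correlation coefficient is customized or how it is combined with BCE, so the following is only a minimal sketch of one common way to build such a loss: a differentiable "soft" MCC computed per class from predicted probabilities, added to plain BCE. The function names (`soft_mcc`, `bce`, `mcc_bce_loss`, `predict_labels`), the weight `alpha`, and the 0.5 threshold are assumptions made for illustration and are not taken from MultiPep.

```python
import numpy as np

def soft_mcc(y_true, y_prob, eps=1e-7):
    """Per-class MCC computed from probabilities instead of hard labels.

    y_true, y_prob: arrays of shape (n_samples, n_classes).
    Assumption: this generic soft MCC stands in for the paper's customized one.
    """
    tp = (y_prob * y_true).sum(axis=0)
    tn = ((1 - y_prob) * (1 - y_true)).sum(axis=0)
    fp = (y_prob * (1 - y_true)).sum(axis=0)
    fn = ((1 - y_prob) * y_true).sum(axis=0)
    num = tp * tn - fp * fn
    # eps keeps the denominator non-zero when a class has no positives or negatives
    den = np.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn) + eps)
    return num / den

def bce(y_true, y_prob, eps=1e-7):
    """Mean binary cross entropy over all samples and classes."""
    p = np.clip(y_prob, eps, 1 - eps)
    return float(-(y_true * np.log(p) + (1 - y_true) * np.log(1 - p)).mean())

def mcc_bce_loss(y_true, y_prob, alpha=1.0):
    """Hypothetical combination: alpha * mean(1 - soft MCC) + BCE."""
    mcc_term = float((1.0 - soft_mcc(y_true, y_prob)).mean())
    return alpha * mcc_term + bce(y_true, y_prob)

def predict_labels(y_prob, threshold=0.5):
    """Multi-label decision: each class is thresholded independently,
    so a peptide can receive zero, one, or several labels."""
    return (y_prob >= threshold).astype(int)

if __name__ == "__main__":
    # Toy batch: 4 peptides, 3 classes (the real model uses 20 classes).
    y_true = np.array([[1, 0, 0], [0, 1, 0], [1, 1, 0], [0, 0, 1]], dtype=float)
    y_prob = np.array([[0.9, 0.2, 0.1], [0.3, 0.8, 0.1],
                       [0.7, 0.6, 0.2], [0.1, 0.1, 0.4]])
    print("loss:", mcc_bce_loss(y_true, y_prob))
    print("labels:\n", predict_labels(y_prob))
```

In this sketch the MCC term rewards balanced performance on rare and frequent classes alike, which is one plausible reason to prefer it over class-weighted BCE; whether that matches the authors' motivation is not stated in the abstract.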

Identifiers

pubmed: 34909478
doi: 10.1093/biomethods/bpab021
pii: bpab021
pmc: PMC8665375

Publication types

Journal Article

Languages

eng

Pagination

bpab021

Copyright information

© The Author(s) 2021. Published by Oxford University Press.

Authors

Alexander G B Grønning (AGB)

Novo Nordisk Foundation Center for Basic Metabolic Research, Faculty of Health and Medical Sciences, University of Copenhagen, 2200 Copenhagen, Denmark.

Tim Kacprowski (T)

Division Data Science in Biomedicine, Peter L. Reichertz Institute for Medical Informatics, TU Braunschweig and Hannover Medical School, 38106 Braunschweig, Germany.
Braunschweig Integrated Centre for Systems Biology (BRICS), 38106 Braunschweig, Germany.

Camilla Schéele (C)

Novo Nordisk Foundation Center for Basic Metabolic Research, Faculty of Health and Medical Sciences, University of Copenhagen, 2200 Copenhagen, Denmark.

MeSH classifications