Classification of bioactive peptides: A systematic benchmark of models and encodings.

Keywords: Bioactive peptide; Functional classification; Machine learning; Sequence encoding; Systematic evaluation

Journal

Computational and structural biotechnology journal

ISSN: 2001-0370

Abbreviated title: Comput Struct Biotechnol J

Country: Netherlands

NLM ID: 101585369

Publication information

Publication date:
Dec 2024

History:

received: 19 03 2024

revised: 10 05 2024

accepted: 22 05 2024

medline: 13 6 2024

pubmed: 13 6 2024

entrez: 13 6 2024

Status: epublish

Abstract

Bioactive peptides are short amino acid chains possessing biological activity and exerting physiological effects relevant to human health. Despite their therapeutic value, their identification remains a major problem, as it mainly relies on time-consuming in vitro tests. While bioinformatic tools for the identification of bioactive peptides are available, they focus on specific functional classes and have not been systematically tested in realistic settings. To tackle this problem, bioactive peptide sequences and functions were gathered from a variety of databases to generate a unified collection of bioactive peptides from microbial fermentation. This collection was organized into nine functional classes, including some previously studied and some unexplored, such as immunomodulatory, opioid and cardiovascular peptides. Upon assessing their sequence properties, four alternative encoding methods were tested in combination with a multitude of machine learning algorithms, from basic classifiers like logistic regression to advanced algorithms like BERT. Tests on a total of 171 models showed that, while some functions are intrinsically easier to detect, no single combination of classifier and encoder worked universally well for all classes. For this reason, we unified the best individual models for each class and generated CICERON (Classification of bIoaCtive pEptides fRom micrObial fermeNtation), a tool for the functional classification of peptides. State-of-the-art classifiers were found to underperform on our realistic benchmark dataset compared to the models included in CICERON. Altogether, our work provides a tool for real-world peptide classification and can serve as a benchmark for future model development.
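The idea of pairing a sequence encoding with one model per functional class can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the abstract does not name the four encodings, so amino acid composition is used as a generic stand-in; the nearest-centroid rule replaces the classifiers actually benchmarked; and the names `aa_composition` and `OneVsRestCentroid`, the toy sequences, and the labels `class_a` and `class_b` are hypothetical placeholders, not data or code from the paper or from CICERON.

```python
from collections import Counter

# The 20 standard amino acids, in a fixed order that defines the feature axes.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def aa_composition(sequence):
    """Encode a peptide as a 20-dimensional vector of residue frequencies."""
    counts = Counter(ch for ch in sequence.upper() if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        return [0.0] * len(AMINO_ACIDS)
    return [counts[aa] / total for aa in AMINO_ACIDS]


def centroid(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def squared_distance(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


class OneVsRestCentroid:
    """One independent binary model per class (illustrative stand-in only).

    A peptide may carry several functions, so each class gets its own
    positive and negative centroid and is decided separately.
    """

    def fit(self, sequences, label_sets):
        encoded = [aa_composition(s) for s in sequences]
        self.models = {}
        for cls in sorted(set().union(*label_sets)):
            pos = [x for x, labels in zip(encoded, label_sets) if cls in labels]
            neg = [x for x, labels in zip(encoded, label_sets) if cls not in labels]
            if pos and neg:  # skip classes without both positive and negative examples
                self.models[cls] = (centroid(pos), centroid(neg))
        return self

    def predict(self, sequence):
        """Return the set of classes whose positive centroid is closer."""
        x = aa_composition(sequence)
        return {
            cls
            for cls, (pos, neg) in self.models.items()
            if squared_distance(x, pos) < squared_distance(x, neg)
        }


# Invented toy sequences and labels, used only to exercise the code.
toy_sequences = ["KKKKA", "KKRKA", "DDEDA", "DEEDA"]
toy_labels = [{"class_a"}, {"class_a"}, {"class_b"}, {"class_b"}]
model = OneVsRestCentroid().fit(toy_sequences, toy_labels)
print(model.predict("KKKRA"))
```

Keeping one binary model per class mirrors the abstract's observation that no single classifier and encoder pair worked well for every class: in such a layout, each class entry could in principle be swapped for a different encoder or model without touching the others.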

Identifiers

DOI: 10.1016/j.csbj.2024.05.040 PMID: 38867723 PMC: PMC11168199

pubmed: 38867723

doi: 10.1016/j.csbj.2024.05.040

pii: S2001-0370(24)00186-7

pmc: PMC11168199

Publication types

Journal Article

Languages

eng

Pagination

2442-2452

Conflict of interest statement

The authors declare no conflict of interest.

Authors

Edoardo Bizzotto

Guido Zampieri

Laura Treu

Pasquale Filannino

Raffaella Di Cagno

Stefano Campanaro
