TargetCPP: accurate prediction of cell-penetrating peptides from optimized multi-scale features using gradient boost decision tree.

Cell-penetrating peptides; Composite protein sequence representation; Composition transition and distribution; Gradient boost; Split amino acid composition

Journal

Journal of computer-aided molecular design
ISSN: 1573-4951
Abbreviated title: J Comput Aided Mol Des
Country: Netherlands
NLM ID: 8710425

Publication information

Publication date:
08 2020
History:
received: 12 11 2019
accepted: 09 03 2020
pubmed: 18 3 2020
medline: 28 9 2021
entrez: 18 3 2020
Status: ppublish

Abstract

Cell-penetrating peptides (CPPs) are short permeable proteins that have emerged as a drug delivery tool for carrying therapeutic agents, including genetic materials and macromolecules, into cells. Recently, CPPs have become a hotspot of life science research and, because they are nontoxic, have opened a new route to disease treatment without harming cell viability. Correct identification of CPPs can therefore inform medical applications. Given the shortcomings of traditional experimental identification of CPPs, an intelligent predictor is urgently needed to identify CPPs accurately among large numbers of uncharacterized sequences. We developed a novel computational method, called TargetCPP, to discriminate CPPs from non-CPPs with improved accuracy. In TargetCPP, the peptide sequences are first encoded with four distinct methods: composite protein sequence representation, composition transition and distribution, split amino acid composition, and information theory features. The resulting feature vectors were fused, and the minimum redundancy and maximum relevancy feature selection method was applied to choose an optimal subset of features. Finally, predictive models were trained on the optimized features with several classification algorithms. Among these classifiers, the gradient boost decision tree algorithm achieved excellent performance throughout the experiments. Notably, TargetCPP attained prediction accuracies of 93.54% and 88.28% on the jackknife and independent tests, respectively. These empirical results indicate that the proposed method outperforms state-of-the-art methods. We anticipate that the outcomes of this study will provide a strong basis for large-scale prediction of CPPs and instructive guidance for clinical therapy and medical applications.
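
As a rough illustration of one of the four encodings named in the abstract, the sketch below computes a split amino acid composition: the peptide is cut into an N-terminal, a middle, and a C-terminal segment, and the 20 amino acid frequencies of each segment are concatenated into a 60-dimensional vector. The abstract does not state the segment lengths, so the 5-residue defaults for `n_term` and `c_term`, and the function names `composition` and `split_aac`, are assumptions made for this example and should not be read as the authors' implementation.

```python
from collections import Counter
from typing import List

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def composition(segment: str) -> List[float]:
    """Return the 20 standard amino acid frequencies of a segment."""
    counts = Counter(segment)
    # Only standard residues contribute to the denominator.
    total = sum(counts[aa] for aa in AMINO_ACIDS)
    if total == 0:
        return [0.0] * len(AMINO_ACIDS)  # empty segment: all zeros
    return [counts[aa] / total for aa in AMINO_ACIDS]


def split_aac(sequence: str, n_term: int = 5, c_term: int = 5) -> List[float]:
    """Concatenate the compositions of the N-terminal, middle, and
    C-terminal segments into one 60-dimensional feature vector.

    ASSUMPTION: the 5-residue terminal lengths are illustrative only.
    For peptides shorter than n_term + c_term the terminal segments
    overlap and the middle segment is empty.
    """
    seq = sequence.upper()
    n_seg = seq[:n_term]
    c_seg = seq[-c_term:] if c_term else ""
    middle = seq[n_term:len(seq) - c_term]
    return composition(n_seg) + composition(middle) + composition(c_seg)


if __name__ == "__main__":
    features = split_aac("RKKRRQRRR")  # arginine-rich example sequence
    print(len(features))
```

Vectors of this kind could then be fused with the other encodings, filtered by a feature selection step, and passed to a gradient boosting classifier, as the abstract describes.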

Identifiers

pubmed: 32180124
doi: 10.1007/s10822-020-00307-z
pii: 10.1007/s10822-020-00307-z
Chemical substances

Cell-Penetrating Peptides 0

Publication types

Journal Article; Research Support, Non-U.S. Gov't

Languages

eng

Citation subsets

IM

Pagination

841-856

Authors

Muhammad Arif (M)

School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing, 210094, China.

Saeed Ahmad (S)

School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing, 210094, China.

Farman Ali (F)

School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing, 210094, China.

Ge Fang (G)

School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing, 210094, China.

Min Li (M)

School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing, 210094, China.

Dong-Jun Yu (DJ)

School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing, 210094, China. njyudj@njust.edu.cn.
