DEEPSMP: A deep learning model for predicting the ectodomain shedding events of membrane proteins.


Journal

Journal of bioinformatics and computational biology
ISSN: 1757-6334
Abbreviated title: J Bioinform Comput Biol
Country: Singapore
NLM ID: 101187344

Publication information

Publication date:
06 2020
History:
pubmed: 25 6 2020
medline: 22 6 2021
entrez: 25 6 2020
Status: ppublish

Abstract

Membrane proteins play essential roles in modern medicine. Recent studies have reported some membrane proteins involved in ectodomain shedding events as potential drug targets and biomarkers of serious diseases. However, few effective tools exist for identifying the shedding events of membrane proteins, so an effective prediction tool is needed. In this study, we design an end-to-end prediction model using deep neural networks with long short-term memory (LSTM) units and an attention mechanism to predict the ectodomain shedding events of membrane proteins from sequence information alone. First, evolutionary profiles are encoded from the original protein sequences by Position-Specific Iterated BLAST (PSI-BLAST) against the UniRef50 database. Then, LSTM units, which contain memory cells, are used to hold information from past inputs to the network, and the attention mechanism is applied to detect sorting signals in proteins regardless of their position in the sequence. Finally, a fully connected dense layer and a softmax layer are used to obtain the final prediction results. We also try to reduce overfitting by using dropout, L2 regularization, and bagging ensemble learning during model training. To ensure a fair performance comparison, we first run cross-validation on the training dataset obtained from an existing paper. The average accuracy and area under the receiver operating characteristic curve (AUC) in five-fold cross-validation are 81.19% and 0.835 for our proposed model, compared with 75% and 0.78, respectively, for a previously published tool. To further validate the proposed model, we also evaluate it on an independent test dataset. The accuracy, sensitivity, and specificity are 83.14%, 84.08%, and 81.63% for our proposed model, compared with 70.20%, 71.97%, and 67.35% for the existing model. The experimental results show that the proposed model can be regarded as a general tool for predicting ectodomain shedding events of membrane proteins. The pipeline of the model and prediction results can be accessed at the following URL: http://www.csbg-jlu.info/DeepSMP/.
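
The abstract describes an attention step that finds signals regardless of their position, a dense plus softmax output, and a bagging ensemble. The following is a minimal pure-Python sketch of those three pieces, not the authors' implementation. It assumes the per-residue vectors (for example, LSTM outputs) are already computed. The dot-product scoring against a `query` vector, the two-class output, and averaging member probabilities at prediction time are all assumptions made for illustration; the abstract does not specify them. Every function and parameter name here is hypothetical.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(hidden_states, query):
    """Collapse T per-residue vectors into one context vector.

    hidden_states: list of T vectors of length d (assumed LSTM outputs).
    query: hypothetical learned scoring vector of length d.
    """
    # One relevance score per sequence position (assumed dot-product form).
    scores = [sum(h_i * q_i for h_i, q_i in zip(h, query)) for h in hidden_states]
    weights = softmax(scores)  # non-negative, sums to 1 over positions
    dim = len(hidden_states[0])
    # Weighted sum over positions; the result does not depend on their order.
    context = [sum(w * h[j] for w, h in zip(weights, hidden_states)) for j in range(dim)]
    return context, weights


def classify(context, W, b):
    """Fully connected layer followed by softmax over the classes."""
    logits = [sum(w_j * c_j for w_j, c_j in zip(row, context)) + b_k
              for row, b_k in zip(W, b)]
    return softmax(logits)


def bagging_predict(hidden_states, members):
    """Average class probabilities over ensemble members (assumed soft voting)."""
    probs = [classify(attention_pool(hidden_states, m["query"])[0], m["W"], m["b"])
             for m in members]
    n = len(probs)
    return [sum(p[k] for p in probs) / n for k in range(len(probs[0]))]


# Toy input: three positions, with a strong signal at position 1.
states = [[0.1, 0.0], [2.0, 1.0], [0.0, 0.2]]
context, weights = attention_pool(states, query=[1.0, 0.5])
```

In this toy input the largest attention weight falls on position 1, and reversing `states` yields the same context vector up to floating-point rounding. That order independence is what lets an attention layer of this kind pick up a signal wherever it occurs in the sequence. In a real bagging setup each member would be trained on a different bootstrap resample of the training set; only the prediction-time averaging is shown here.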

Identifiers

pubmed: 32576054
doi: 10.1142/S0219720020500171

Chemical substances

Membrane Proteins 0

Publication types

Journal Article
Research Support, Non-U.S. Gov't

Languages

eng

Citation subsets

IM

Pagination

2050017

Authors

Zhongbo Cao (Z)

Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, College of Computer Science and Technology, Jilin University, Changchun 130012, P. R. China.
School of Management Science and Information Engineering, Jilin University of Finance and Economics, Changchun 130117, P. R. China.

Wei Du (W)

Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, College of Computer Science and Technology, Jilin University, Changchun 130012, P. R. China.

Gaoyang Li (G)

Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, College of Computer Science and Technology, Jilin University, Changchun 130012, P. R. China.

Huansheng Cao (H)

Center for Fundamental and Applied Microbiomics, Biodesign Institute, Arizona State University, Tempe, AZ 85287, USA.
