DeepDNAbP: A deep learning-based hybrid approach to improve the identification of deoxyribonucleic acid-binding proteins.


Journal

Computers in biology and medicine
ISSN: 1879-0534
Abbreviated title: Comput Biol Med
Country: United States
NLM ID: 1250250

Publication information

Publication date:
06 2022
History:
received: 15 09 2021
revised: 11 03 2022
accepted: 20 03 2022
pubmed: 5 4 2022
medline: 20 5 2022
entrez: 4 4 2022
Status: ppublish

Abstract

Accurate identification of DNA-binding proteins (DBPs) is critical for both understanding protein function and drug design. DBPs also play essential roles in biological activities such as DNA replication, repair, transcription, and splicing. Experimental identification of DBPs is time-consuming and sometimes biased toward prediction, so computational methods that can accurately predict potential DBPs from sequence information are highly desirable. In this paper, a novel predictor called DeepDNAbP has been developed to predict DBPs from sequences using a convolutional neural network (CNN) model. First, we apply three feature extraction methods, namely position-specific scoring matrix (PSSM), pseudo-amino acid composition (PseAAC), and tripeptide composition (TPC), to represent protein sequence patterns. Second, SHapley Additive exPlanations (SHAP) is employed to remove redundant and irrelevant features. Finally, the best features are provided to the CNN classifier to construct the DeepDNAbP model for identifying DBPs. The final DeepDNAbP predictor achieves superior prediction performance in K-fold cross-validation tests and outperforms existing DBP predictors. DeepDNAbP is poised to be a powerful computational resource for the prediction of DBPs. The web application and curated datasets in this study are freely available at: http://deepdbp.sblog360.blog/.
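As an illustration of one of the three feature types named in the abstract, the following is a minimal sketch of tripeptide composition (TPC): the relative frequency of each of the 8000 possible three-residue windows in a protein sequence. It is not the authors' implementation. The function name, the 20-letter alphabet ordering, the choice to skip windows containing non-standard residues, and the normalization by the number of valid windows are all assumptions made here for the example.

```python
from itertools import product

# The 20 standard amino acids; this ordering is an assumption of the sketch.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# All 20^3 = 8000 tripeptides, and a lookup from tripeptide to vector index.
TRIPEPTIDES = ["".join(p) for p in product(AMINO_ACIDS, repeat=3)]
INDEX = {tripeptide: i for i, tripeptide in enumerate(TRIPEPTIDES)}


def tripeptide_composition(sequence):
    """Return an 8000-dimensional TPC vector for a protein sequence.

    Windows containing characters outside the 20 standard residues are
    skipped (an assumption), and counts are divided by the number of
    valid windows so the vector sums to 1 when any window is valid.
    """
    seq = sequence.upper()
    counts = [0] * len(TRIPEPTIDES)
    valid_windows = 0
    for i in range(len(seq) - 2):
        idx = INDEX.get(seq[i:i + 3])
        if idx is not None:
            counts[idx] += 1
            valid_windows += 1
    if valid_windows == 0:
        # Sequence too short or made only of non-standard residues.
        return [0.0] * len(TRIPEPTIDES)
    return [c / valid_windows for c in counts]


# Hypothetical toy sequence, used only to show the call.
features = tripeptide_composition("ACDACD")
```

A vector like `features` would then be concatenated with the other descriptors before feature selection and classification; those later steps are not sketched here because the record gives no details about them.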

Identifiers

pubmed: 35378437
pii: S0010-4825(22)00225-6
doi: 10.1016/j.compbiomed.2022.105433

Chemical substances

DNA-Binding Proteins 0
DNA 9007-49-2

Publication types

Journal Article
Research Support, N.I.H., Extramural

Languages

eng

Citation subsets

IM

Pagination

105433

Grants

Agency: NIAMS NIH HHS
ID: R01 AR069055
Country: United States

Copyright information

Copyright © 2022. Published by Elsevier Ltd.

Authors

Md Faruk Hosen (MF)

Department of Information and Communication Technology, Mawlana Bhashani Science and Technology University, Santosh, Tangail, 1902, Bangladesh.

S M Hasan Mahmud (SMH)

Department of Computer Science, American International University-Bangladesh (AIUB), Kuratoli, Dhaka, 1229, Bangladesh.

Kawsar Ahmed (K)

Department of Information and Communication Technology, Mawlana Bhashani Science and Technology University, Santosh, Tangail, 1902, Bangladesh.

Wenyu Chen (W)

School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu, 611731, China.

Mohammad Ali Moni (MA)

School of Health and Rehabilitation Sciences, Faculty of Health and Behavioural Sciences, The University of Queensland, St Lucia, QLD, 4072, Australia.

Hong-Wen Deng (HW)

Tulane Center for Biomedical Informatics and Genomics, Division of Biomedical Informatics and Genomics, John W. Deming Department of Medicine, School of Medicine, Tulane University, New Orleans, LA, 70112, USA.

Watshara Shoombuatong (W)

Center of Data Mining and Biomedical Informatics, Faculty of Medical Technology, Mahidol University, Bangkok, 10700, Thailand. Electronic address: watshara.sho@mahidol.ac.th.

Md Mehedi Hasan (MM)

Tulane Center for Biomedical Informatics and Genomics, Division of Biomedical Informatics and Genomics, John W. Deming Department of Medicine, School of Medicine, Tulane University, New Orleans, LA, 70112, USA. Electronic address: mhasan1@tulane.edu.
