Improving Protein-Protein Interaction Prediction Using Protein Language Model and Protein Network Features.

Protein language model Protein-protein interactions prediction network

Journal

Analytical biochemistry
ISSN: 1096-0309
Abbreviated title: Anal Biochem
Country: United States
NLM ID: 0370535

Publication information

Publication date:
26 Apr 2024
History:
received: 24 01 2024
revised: 12 04 2024
accepted: 25 04 2024
medline: 29 4 2024
pubmed: 29 4 2024
entrez: 28 4 2024
Status: aheadofprint

Abstract

Interactions between proteins are ubiquitous across a wide variety of biological processes. Accurately identifying protein-protein interactions (PPIs) is important for understanding the mechanisms of protein function and for facilitating drug discovery. Although wet-lab methods are the best way to identify PPIs, they are time-consuming, costly, and labor-intensive. Hence, considerable effort has gone into developing computational methods that improve the performance of PPI prediction. In this study, we propose a novel hybrid computational method, KSGPPI, that aims to improve PPI prediction performance by extracting discriminative information from protein sequences and interaction networks. The KSGPPI model comprises two feature extraction modules. In the first module, a large protein language model, ESM-2, is employed to capture the global complex patterns concealed within protein sequences. Feature representations are then further extracted through CKSAAP, and a two-dimensional convolutional neural network (CNN) is used to capture local information. In the second module, a protein similar to the query protein is retrieved from the STRING database using the sequence alignment tool NW-align, and the Node2vec algorithm is then used to obtain a graph embedding feature for the query protein from the protein interaction network of that similar protein. Finally, the features from the two modules are fused and fed into fully connected neural networks to predict PPIs. Five-fold cross-validation on the benchmark datasets shows that KSGPPI achieves an average prediction accuracy of 88.96%. Additionally, its average Matthews correlation coefficient (0.781) is significantly higher than that of state-of-the-art PPI prediction methods.
The standalone package of KSGPPI is freely available for download at https://github.com/rickleezhe/KSGPPI.
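
To make the CKSAAP step named in the abstract more concrete, the following is a minimal sketch of the generic composition of k-spaced amino acid pairs encoding: for each gap k, it counts the residue pairs separated by k positions and normalizes the counts. It is not taken from the KSGPPI package. The function name `cksaap`, the default `k_max=3`, and the choice to skip pairs containing non-standard residues are assumptions made for illustration; the abstract does not state the gap values used or how this encoding is combined with the ESM-2 output.

```python
from itertools import product

# The 20 standard amino acids and all 400 ordered residue pairs.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
PAIRS = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]
PAIR_INDEX = {pair: i for i, pair in enumerate(PAIRS)}


def cksaap(sequence, k_max=3):
    """Return a CKSAAP feature vector of length 400 * (k_max + 1).

    For each gap k in 0..k_max, count pairs formed by residues at
    positions i and i + k + 1, then divide by the number of valid pairs.
    k_max=3 is an illustrative assumption, not a value from the paper.
    """
    sequence = sequence.upper()
    features = []
    for k in range(k_max + 1):
        counts = [0] * len(PAIRS)
        total = 0
        for i in range(len(sequence) - k - 1):
            pair = sequence[i] + sequence[i + k + 1]
            idx = PAIR_INDEX.get(pair)
            # Assumption: pairs with non-standard residues (e.g. 'X') are skipped.
            if idx is not None:
                counts[idx] += 1
                total += 1
        # Frequencies for this gap; all zeros if the sequence is too short.
        features.extend(c / total if total else 0.0 for c in counts)
    return features


if __name__ == "__main__":
    vec = cksaap("MKTAYIAKQRQISFVKSHFSRQ", k_max=3)
    print(len(vec))
```

Each gap contributes a block of 400 frequencies, so the vector could be reshaped into a (k_max + 1) × 400 array before being passed to a two-dimensional CNN; whether KSGPPI arranges its input this way is not stated in the abstract.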

Identifiers

pubmed: 38679191
pii: S0003-2697(24)00094-0
doi: 10.1016/j.ab.2024.115550

Publication types

Journal Article

Languages

eng

Citation subsets

IM

Pagination

115550

Copyright information

Copyright © 2024. Published by Elsevier Inc.

Conflict of interest statement

Declaration of Competing Interest: The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Authors

Jun Hu (J)

College of Information Engineering, Zhejiang University of Technology, Hangzhou, 310023, China. Electronic address: hujunum@zjut.edu.cn.

Zhe Li (Z)

College of Information Engineering, Zhejiang University of Technology, Hangzhou, 310023, China.

Bing Rao (B)

School of Information & Electrical Engineering, Hangzhou City University, Hangzhou, 310015, China.

Maha A Thafar (MA)

Computer Science Department, College of Computers and Information Technology, Taif University, Taif, Saudi Arabia.

Muhammad Arif (M)

College of Science and Engineering, Hamad Bin Khalifa University, Doha 34110, Qatar. Electronic address: mfarif@hbku.edu.qa.

MeSH classifications