Improving Protein-Protein Interaction Prediction Using Protein Language Model and Protein Network Features.
Protein language model
Protein-protein interactions prediction
network
Journal
Analytical biochemistry
ISSN: 1096-0309
Abbreviated title: Anal Biochem
Country: United States
NLM ID: 0370535
Publication information
Publication date:
26 Apr 2024
History:
received: 24 Jan 2024
revised: 12 Apr 2024
accepted: 25 Apr 2024
medline: 29 Apr 2024
pubmed: 29 Apr 2024
entrez: 28 Apr 2024
Status:
aheadofprint
Abstract
Interactions between proteins are ubiquitous in a wide variety of biological processes. Accurately identifying protein-protein interactions (PPIs) is important for understanding the mechanisms of protein function and for facilitating drug discovery. Although wet-lab experiments are the best way to identify PPIs, they are time-consuming, costly, and labor-intensive. Hence, considerable effort has gone into developing computational methods for PPI prediction. In this study, we propose a novel hybrid computational method, KSGPPI, that aims to improve PPI prediction by extracting discriminative information from protein sequences and interaction networks. The KSGPPI model comprises two feature extraction modules. In the first module, a large protein language model, ESM-2, is employed to capture the global complex patterns concealed within protein sequences. Feature representations are then further extracted through CKSAAP (composition of k-spaced amino acid pairs), and a two-dimensional convolutional neural network (CNN) is used to capture local information. In the second module, a protein similar to the query protein is retrieved from the STRING database using the sequence alignment tool NW-align, and the graph embedding feature for the query protein is then computed with the Node2vec algorithm on the protein interaction network of that similar protein. Finally, the features from the two modules are fused and fed into fully connected neural networks to predict PPIs. Five-fold cross-validation on the benchmark datasets shows that KSGPPI achieves an average prediction accuracy of 88.96%. Additionally, its average Matthews correlation coefficient (0.781) is significantly higher than those of state-of-the-art PPI prediction methods.
The standalone package of KSGPPI is freely available for download at https://github.com/rickleezhe/KSGPPI.
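To illustrate the CKSAAP encoding named in the abstract, the following is a minimal sketch of how the composition of k-spaced amino acid pairs can be computed for one sequence. It is not taken from the KSGPPI package: the function name `cksaap`, the default `k_max=3`, the normalization by the number of valid pairs, and the choice to skip non-standard residues are all assumptions made for illustration.

```python
from itertools import product

# The 20 standard amino acids and the 400 ordered residue pairs.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
PAIRS = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]
PAIR_INDEX = {pair: i for i, pair in enumerate(PAIRS)}


def cksaap(sequence, k_max=3):
    """Return a (k_max + 1) x 400 list of k-spaced pair frequencies.

    Row k holds the frequency of each ordered residue pair whose two
    members are separated by exactly k other residues. k_max is a
    hypothetical default, not a value reported in the abstract.
    """
    seq = sequence.upper()
    features = []
    for k in range(k_max + 1):
        counts = [0] * len(PAIRS)
        n_windows = len(seq) - k - 1
        for i in range(max(n_windows, 0)):
            pair = seq[i] + seq[i + k + 1]
            idx = PAIR_INDEX.get(pair)
            if idx is not None:  # assumption: skip non-standard residues
                counts[idx] += 1
        total = sum(counts)
        # Normalize by the number of valid pairs; all zeros if none exist.
        features.append([c / total if total else 0.0 for c in counts])
    return features


if __name__ == "__main__":
    matrix = cksaap("MKTAYIAKQR", k_max=2)
    print(len(matrix), len(matrix[0]))
```

Each row of the result is a 400-dimensional frequency vector, so stacking the rows gives a two-dimensional array of the kind a 2D CNN can take as input; how KSGPPI actually combines this with the ESM-2 representation is not specified in the abstract.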
Identifiers
pubmed: 38679191
pii: S0003-2697(24)00094-0
doi: 10.1016/j.ab.2024.115550
Publication types
Journal Article
Languages
eng
Citation subsets
IM
Pagination
115550
Copyright information
Copyright © 2024. Published by Elsevier Inc.
Conflict of interest statement
Declaration of Competing Interest: The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.