A Transformer-Based Ensemble Framework for the Prediction of Protein-Protein Interaction Sites.


Journal

Research (Washington, D.C.)
ISSN: 2639-5274
Abbreviated title: Research (Wash D C)
Country: United States
NLM ID: 101747148

Publication information

Publication date:
2023
History:
received: 02 08 2023
accepted: 08 09 2023
medline: 29 9 2023
pubmed: 29 9 2023
entrez: 29 9 2023
Status: epublish

Abstract

The identification of protein-protein interaction (PPI) sites is essential to research on protein function and to the discovery of new drugs. A variety of computational tools based on machine learning have been developed to accelerate the identification of PPI sites. However, existing methods suffer from either low predictive accuracy or a limited scope of application. Specifically, some methods learn only global or only local sequential features, which leads to low predictive accuracy, while others achieve better performance by extracting residue interactions from structures but are limited in scope by their heavy dependence on precise structural information. There is an urgent need for a method that integrates comprehensive information to achieve accurate, proteome-wide profiling of PPI sites. Herein, a novel ensemble framework for PPI site prediction, EnsemPPIS, is proposed based on transformer and gated convolutional networks. EnsemPPIS effectively captures not only global and local patterns but also residue interactions. It is unique in (a) extracting residue interactions from protein sequences with a transformer and (b) further integrating global and local sequential features through an ensemble learning strategy. Compared with various existing methods, EnsemPPIS exhibited either superior performance or broader applicability on multiple PPI site prediction tasks. Moreover, pattern analysis based on the interpretability of EnsemPPIS demonstrated that it is capable of learning residue interactions within the local structure of PPI sites using only sequence information. The EnsemPPIS web server is freely available at http://idrblab.org/ensemppis.
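
The building blocks named in the abstract can be illustrated with a minimal, dependency-free sketch. It is not the EnsemPPIS implementation. It assumes identity query/key/value projections in the attention step, a depthwise GLU-style gate for the convolution, and simple averaging as the ensemble rule. All function names (`self_attention`, `gated_conv1d`, `ensemble`) and the toy embeddings are hypothetical.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(X):
    """Scaled dot-product self-attention with identity projections (assumption).

    X is a list of L residue embeddings, each a list of d floats.
    Returns (outputs, weights); weights[i][j] says how strongly residue i
    attends to residue j, a crude stand-in for a residue-residue interaction.
    """
    d = len(X[0])
    weights = []
    for q in X:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in X]
        weights.append(softmax(scores))
    outputs = [
        [sum(w * v[c] for w, v in zip(row, X)) for c in range(d)]
        for row in weights
    ]
    return outputs, weights


def gated_conv1d(X, w, v):
    """Depthwise gated 1D convolution: conv(X, w) * sigmoid(conv(X, v)).

    w and v are kernels of the same odd length; the sequence is zero-padded
    so the output has the same length as the input (local context only).
    """
    k, half, L, d = len(w), len(w) // 2, len(X), len(X[0])

    def conv(kernel, i, c):
        return sum(
            kernel[t] * X[i + t - half][c]
            for t in range(k)
            if 0 <= i + t - half < L
        )

    return [
        [conv(w, i, c) / (1.0 + math.exp(-conv(v, i, c))) for c in range(d)]
        for i in range(L)
    ]


def ensemble(p_a, p_b):
    """Average per-residue probabilities from two models (assumed rule)."""
    return [(a + b) / 2 for a, b in zip(p_a, p_b)]


# Toy input: three residues with two-dimensional embeddings (made up).
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
attended, attn = self_attention(X)
local = gated_conv1d(X, [0.0, 1.0, 0.0], [0.0, 0.0, 0.0])
combined = ensemble([0.2, 0.8], [0.4, 0.6])
```

Each row of `attn` sums to 1 and spans the whole sequence, which is what lets attention relate distant residues, whereas `gated_conv1d` only sees a window of neighbouring positions.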

Identifiers

pubmed: 37771850
doi: 10.34133/research.0240
pii: 0240
pmc: PMC10528219

Publication types

Journal Article

Languages

eng

Pagination

0240

Copyright information

Copyright © 2023 Minjie Mou et al.

Authors

Minjie Mou (M)

College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, National Key Laboratory of Advanced Drug Delivery and Release Systems, Zhejiang University, Hangzhou 310058, China.

Ziqi Pan (Z)

College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, National Key Laboratory of Advanced Drug Delivery and Release Systems, Zhejiang University, Hangzhou 310058, China.

Zhimeng Zhou (Z)

College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, National Key Laboratory of Advanced Drug Delivery and Release Systems, Zhejiang University, Hangzhou 310058, China.

Lingyan Zheng (L)

College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, National Key Laboratory of Advanced Drug Delivery and Release Systems, Zhejiang University, Hangzhou 310058, China.

Hanyu Zhang (H)

College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, National Key Laboratory of Advanced Drug Delivery and Release Systems, Zhejiang University, Hangzhou 310058, China.

Shuiyang Shi (S)

College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, National Key Laboratory of Advanced Drug Delivery and Release Systems, Zhejiang University, Hangzhou 310058, China.

Fengcheng Li (F)

College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, National Key Laboratory of Advanced Drug Delivery and Release Systems, Zhejiang University, Hangzhou 310058, China.

Xiuna Sun (X)

College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, National Key Laboratory of Advanced Drug Delivery and Release Systems, Zhejiang University, Hangzhou 310058, China.

Feng Zhu (F)

College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, National Key Laboratory of Advanced Drug Delivery and Release Systems, Zhejiang University, Hangzhou 310058, China.
Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, Alibaba-Zhejiang University Joint Research Center of Future Digital Healthcare, Hangzhou 330110, China.

MeSH classifications