PCGAN: a generative approach for protein complex identification from protein interaction networks.
Journal
Bioinformatics (Oxford, England)
ISSN: 1367-4811
Abbreviated title: Bioinformatics
Country: England
NLM ID: 9808944
Publication information
Publication date:
01 08 2023
History:
received: 20 11 2022
revised: 23 07 2023
accepted: 01 08 2023
medline: 28 8 2023
pubmed: 2 8 2023
entrez: 2 8 2023
Status:
ppublish
Abstract
Protein complexes are groups of polypeptide chains linked by non-covalent protein-protein interactions. They play important roles in biological systems and perform numerous functions, including DNA transcription, mRNA translation, and signal transduction. Over the past decade, a number of computational methods have been developed to identify protein complexes from protein interaction networks by mining dense subnetworks or subgraphs. In this article, unlike existing work, we propose a novel approach to this task based on generative adversarial networks, called PCGAN (identifying Protein Complexes by GAN). Using some real complexes as training samples, our method learns a model that generates new complexes from a protein interaction network. To effectively support model training and testing, we construct two more comprehensive and reliable protein interaction networks and a larger gold-standard complex set by merging existing ones for the same organism (including human and yeast). Extensive comparison studies indicate that our method is superior to existing protein complex identification methods in terms of various performance metrics. Furthermore, functional enrichment analysis shows that the identified complexes are of high biological significance, which indicates that the generated protein complexes are very likely real complexes. Availability: https://github.com/yul-pan/PCGAN.
Identifiers
pubmed: 37531266
pii: 7235566
doi: 10.1093/bioinformatics/btad473
pmc: PMC10457665
Publication types
Journal Article
Research Support, Non-U.S. Gov't
Languages
eng
Citation subsets
IM
Copyright information
© The Author(s) 2023. Published by Oxford University Press.