Classification in biological networks with hypergraphlet kernels.
Journal
Bioinformatics (Oxford, England)
ISSN: 1367-4811
Abbreviated title: Bioinformatics
Country: England
NLM ID: 9808944
Publication information
Publication date:
17 05 2021
History:
received: 09 07 2019
revised: 13 06 2020
accepted: 26 08 2020
pubmed: 05 09 2020
medline: 09 06 2021
entrez: 05 09 2020
Status:
ppublish
Abstract
Biological and cellular systems are often modeled as graphs in which vertices represent objects of interest (genes, proteins and drugs) and edges represent relational ties between these objects (binds-to, interacts-with and regulates). This approach has been highly successful owing to the theory, methodology and software that support analysis and learning on graphs. Graphs, however, suffer from information loss when modeling physical systems due to their inability to accurately represent multiobject relationships. Hypergraphs, a generalization of graphs, provide a framework to mitigate information loss and unify disparate graph-based methodologies. We present a hypergraph-based approach for modeling biological systems and formulate vertex classification, edge classification and link prediction problems on (hyper)graphs as instances of vertex classification on (extended, dual) hypergraphs. We then introduce a novel kernel method on vertex- and edge-labeled (colored) hypergraphs for analysis and learning. The method is based on exact and inexact (via hypergraph edit distances) enumeration of hypergraphlets, i.e. small hypergraphs rooted at a vertex of interest. We empirically evaluate this method on fifteen biological networks and show its potential use in a positive-unlabeled setting to estimate the interactome sizes in various species. Availability: https://github.com/jlugomar/hypergraphlet-kernels. Supplementary data are available at Bioinformatics online.
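The reduction of edge classification to vertex classification on a dual hypergraph, mentioned in the abstract, can be illustrated with a minimal sketch. It uses the standard textbook definition of the dual (each hyperedge becomes a vertex; each original vertex becomes a hyperedge containing its incident hyperedges), which may differ in detail from the extended and dual constructions used in the paper. The function name `dual_hypergraph` and the toy data are assumptions made for illustration only.

```python
from collections import defaultdict


def dual_hypergraph(hyperedges):
    """Return the dual of a hypergraph given as {edge_id: set_of_vertices}.

    In the dual, every original hyperedge becomes a vertex, and every original
    vertex v becomes a hyperedge holding all original hyperedges that contain v.
    (Standard definition; assumed here, not taken from the paper.)
    """
    dual = defaultdict(set)
    for edge_id, vertices in hyperedges.items():
        for v in vertices:
            dual[v].add(edge_id)
    return dict(dual)


# Hypothetical toy data: three multi-protein relationships as hyperedges.
complexes = {
    "c1": {"A", "B", "C"},
    "c2": {"B", "D"},
    "c3": {"C", "D", "E"},
}

dual = dual_hypergraph(complexes)
# Each complex (originally a hyperedge) is now a vertex of the dual, so a
# vertex classifier applied to the dual labels the original hyperedges.
```

Applying the function twice recovers the original hypergraph whenever there are no isolated vertices or empty hyperedges, which is a convenient sanity check on the construction.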
Identifiers
pubmed: 32886115
pii: 5901538
doi: 10.1093/bioinformatics/btaa768
pmc: PMC8128478
Chemical substances
Proteins
0
Publication types
Journal Article
Research Support, N.I.H., Extramural
Research Support, Non-U.S. Gov't
Research Support, U.S. Gov't, Non-P.H.S.
Languages
eng
Citation subsets
IM
Pagination
1000-1007
Grants
Agency: NIMH NIH HHS
ID: R01 MH105524
Country: United States
Copyright information
© The Author(s) 2020. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.