Community detection with node attributes in multilayer networks.
Journal
Scientific Reports
ISSN: 2045-2322
Abbreviated title: Sci Rep
Country: England
NLM ID: 101563288
Publication information
Publication date: 25 09 2020
History:
received: 28 04 2020
accepted: 04 09 2020
entrez: 26 9 2020
pubmed: 27 9 2020
medline: 27 9 2020
Status: epublish
Abstract
Community detection in networks is commonly performed using information about interactions between nodes. Recent advances have been made to incorporate multiple types of interactions, thus generalizing standard methods to multilayer networks. Often, though, one can access additional information regarding individual nodes, attributes, or covariates. A relevant question is thus how to properly incorporate this extra information in such frameworks. Here we develop a method that incorporates both the topology of interactions and node attributes to extract communities in multilayer networks. We propose a principled probabilistic method that does not assume any a priori correlation structure between attributes and communities but rather infers this from data. This leads to an efficient algorithmic implementation that exploits the sparsity of the dataset and can be used to perform several inference tasks; we provide an open-source implementation of the code online. We demonstrate our method on both synthetic and real-world data and compare performance with methods that do not use any attribute information. We find that including node information helps in predicting missing links or attributes. It also leads to more interpretable community structures and allows the quantification of the impact of the node attributes given in input.
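The idea of scoring network structure and node attributes together can be illustrated with a small sketch. This is not the authors' implementation. It assumes a Poisson likelihood for edge counts, a categorical likelihood for a single node attribute, and a hypothetical weight `gamma` that balances the two terms; the abstract specifies none of these choices. All names (`joint_log_likelihood`, `U`, `W`, `beta`, `gamma`) are illustrative, and the parameters are fixed by hand here, whereas in an actual fit they would be inferred from data.

```python
import math

def expected_edges(U, W_layer, i, j):
    """Expected weight of edge i-j in one layer: sum over k, q of U[i][k] * W[k][q] * U[j][q]."""
    K = len(W_layer)
    return sum(U[i][k] * W_layer[k][q] * U[j][q] for k in range(K) for q in range(K))

def joint_log_likelihood(A, X, U, W, beta, gamma, eps=1e-12):
    """Weighted sum of a Poisson edge term and a categorical attribute term.

    A[l][i][j]: non-negative edge count between nodes i and j in layer l
    X[i]:       index of node i's attribute category
    U[i][k]:    non-negative membership of node i in community k
    W[l][k][q]: non-negative affinity between communities k and q in layer l
    beta[k][z]: probability of category z given community k (rows sum to 1)
    gamma:      assumed weight in [0, 1] placed on the attribute term
    """
    n = len(U)
    ll_edges = 0.0
    for l, layer in enumerate(A):
        for i in range(n):
            for j in range(n):
                if i == j:
                    continue  # ignore self-loops in this sketch
                m = expected_edges(U, W[l], i, j)
                # Poisson log-probability without the constant log(A!) term
                ll_edges += layer[i][j] * math.log(m + eps) - m
    ll_attr = 0.0
    for i in range(n):
        total = sum(U[i]) + eps
        # Mix community-level category probabilities by normalized membership
        p = sum(U[i][k] / total * beta[k][X[i]] for k in range(len(beta)))
        ll_attr += math.log(p + eps)
    return (1.0 - gamma) * ll_edges + gamma * ll_attr

# Toy data: 1 layer, 4 nodes, 2 communities, 2 attribute categories
U = [[1.0, 0.0], [0.9, 0.1], [0.1, 0.9], [0.0, 1.0]]
W = [[[1.0, 0.1], [0.1, 1.0]]]
beta = [[0.8, 0.2], [0.3, 0.7]]
A = [[[0, 1, 0, 0], [1, 0, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0]]]
X = [0, 0, 1, 1]

structure_only = joint_log_likelihood(A, X, U, W, beta, gamma=0.0)
attributes_only = joint_log_likelihood(A, X, U, W, beta, gamma=1.0)
balanced = joint_log_likelihood(A, X, U, W, beta, gamma=0.5)
```

Setting `gamma=0.0` scores the interactions alone and `gamma=1.0` scores the attributes alone, so intermediate values give one simple way to vary how much the attributes contribute.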
Identifiers
pubmed: 32978484
doi: 10.1038/s41598-020-72626-y
pii: 10.1038/s41598-020-72626-y
pmc: PMC7519123
Publication types
Journal Article
Research Support, Non-U.S. Gov't
Research Support, U.S. Gov't, Non-P.H.S.
Languages
eng
Citation subsets
IM
Pagination
15736