Abstract
The complex nature of many real-world networks is motivating researchers to investigate or extend network analysis methods such as centrality computation, link prediction, and community detection. One of these complex structures is the multilayer network, in which each layer is itself a network. Multilayer networks frequently possess complex local structures of multimodal data and interlinked relations. Thus, efficient detection of local communities in such networks often remains a key challenge. In this paper, we propose a community detection strategy, called CoDeBi, which leverages Formal Concept Analysis (FCA) to find possibly overlapping and nested communities in multilayer networks. At the preprocessing stage, we exploit operations such as apposition, subposition and composition on the formal contexts associated with individual layers to generate a global formal context representing the whole multilayer network. In the first step of CoDeBi, we extract the formal concepts that capture groups in the global formal context. In the second step, we filter the extracted formal concepts to keep only those with a high harmonic mean of the stability and separation indices. Such groups represent core communities. In the third step, we detect final communities by refining the core groups using Silhouette Analysis. Our validation study shows that CoDeBi can accurately identify communities in bipartite graphs, and hence can be exploited for community detection in multilayer networks. Another contribution of this paper is the use of Triadic Concept Analysis to adapt our approach to tridimensional networks represented by a tridimensional adjacency matrix.
Notes
- 1.
This diagram is produced using the public domain software called ConExp.
- 2.
We sometimes use simplified notations for sets. E.g. 125 stands for \(\{1, 2, 5\}\) and ab for \(\{a, b\}\). We also write \((X_j, X_k) \subseteq K_j {\times } K_k\) to mean that \(X_j\subseteq K_j\) and \(X_k \subseteq K_k\).
- 3.
The equation is slightly different from the general one proposed by Kuznetsov and Makhalova (2016) because the symbol \(\supseteq \) is more appropriate than \(=\) .
References
Berlingerio, M., Coscia, M., & Giannotti, F. (2011a). Finding and characterizing communities in multidimensional networks. In 2011 International Conference on Advances in Social Networks Analysis and Mining (pp. 490–494). IEEE.
Berlingerio, M., Coscia, M., Giannotti, F., Monreale, A., & Pedreschi, D. (2011b). Foundations of multidimensional network analysis. In 2011 International Conference on Advances in Social Networks Analysis and Mining (ASONAM) (pp. 485–489). IEEE.
Blondel, V. D., Guillaume, J.-L., Lambiotte, R., & Lefebvre, E. (2008). Fast unfolding of communities in large networks. Journal of Statistical Mechanics: Theory and Experiment, 2008(10), P10008.
Boccaletti, S., Bianconi, G., Criado, R., del Genio, C. I., Gómez-Gardeñes, J., Romance, M., Sendiña-Nadal, I., Wang, Z., & Zanin, M. (2014). The structure and dynamics of multilayer networks. arXiv:1407.0742.
Boden, B., Günnemann, S., Hoffmann, H., & Seidl, T. (2012). Mining coherent subgraphs in multi-layer graphs with edge labels. In Proceedings of the 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (pp. 1258–1266). ACM.
Borgatti, S. P. (2009). 2-mode concepts in social network analysis. Encyclopedia of Complexity and System Science, 6, 8279–8291.
Boutemine, O., & Bouguessa, M. (2017). Mining community structures in multidimensional networks. TKDD, 11(4), 51:1–51:36.
Buzmakov, A., Kuznetsov, S. O., & Napoli, A. (2014). Scalable estimates of concept stability. In International Conference on Formal Concept Analysis (pp. 157–172). Springer.
Cerf, L., Besson, J., Robardet, C., & Boulicaut, J. (2009). Closed patterns meet n-ary relations. TKDD, 3(1), 3:1–3:36.
Chakraborty, T., Dalmia, A., Mukherjee, A., & Ganguly, N. (2017). Metrics for community analysis: A survey. ACM Computing Surveys (CSUR), 50(4), 54.
Collins, L. M., & Dent, C. W. (1988). Omega: A general formulation of the rand index of cluster recovery suitable for non-disjoint solutions. Multivariate Behavioral Research, 23(2), 231–242.
Crampes, M., & Plantié, M. (2012). Détection de communautés dans les graphes bipartis. In IC 2012 (p. 125).
Dickison, M. E., Magnani, M., & Rossi, L. (2016). Multilayer Social Networks (1st edn). New York: Cambridge University Press.
Dong, X., Frossard, P., Vandergheynst, P., & Nefedov, N. (2012). Clustering with multi-layer graphs: A spectral perspective. IEEE Transactions on Signal Processing, 60(11), 5820–5831.
Du, N., Wang, B., Wu, B., & Wang, Y. (2008). Overlapping community detection in bipartite networks. In Proceedings of the 2008 IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology-Volume 01 (pp. 176–179). IEEE Computer Society.
Dunlavy, D. M., Kolda, T. G., & Kegelmeyer, W. P. (2011). Multilinear algebra for analyzing data with multiple linkages. In Graph Algorithms in the Language of Linear Algebra (pp. 85–114). SIAM.
Everett, M. G., & Borgatti, S. P. (2013). The dual-projection approach for two-mode networks. Social Networks, 35(2), 204–210.
Fortunato, S. (2010). Community detection in graphs. Physics Reports, 486(3–5), 75–174.
Ganter, B., & Obiedkov, S. A. (2016). Conceptual Exploration. Berlin: Springer.
Ganter, B., & Wille, R. (1999). Formal Concept Analysis: Mathematical Foundations. New York: Springer. Translator-C. Franzke.
Hacene, M. R., Huchard, M., Napoli, A., & Valtchev, P. (2013). Relational concept analysis: Mining concept lattices from multi-relational data. Annals of Mathematics and Artificial Intelligence, 67(1), 81–108.
Hmimida, M., & Kanawati, R. (2015). Community detection in multiplex networks: A seed-centric approach. NHM, 10(1), 71–85.
Ibrahim, M. H., & Missaoui, R. (2018). An efficient approximation of concept stability using low-discrepancy sampling. In Graph-Based Representation and Reasoning - 23rd International Conference on Conceptual Structures, ICCS 2018, Edinburgh, UK, June 20-22, 2018, Proceedings (pp. 24–38).
Interdonato, R., Atzmueller, M., Gaito, S., Kanawati, R., Largeron, C., & Sala, A. (2019). Feature-rich networks: Going beyond complex network topologies. Applied Network Science, 4(1), 4:1–4:13.
Jay, N., Kohler, F., & Napoli, A. (2008). Analysis of social communities with iceberg and stability-based concept lattices. In International Conference on Formal Concept Analysis (pp. 258–272). Springer.
Kim, J., & Lee, J.-G. (2015). Community detection in multi-layer graphs: A survey. ACM SIGMOD Record, 44(3), 37–48.
Kivelä, M., Arenas, A., Barthelemy, M., Gleeson, J. P., Moreno, Y., & Porter, M. A. (2014). Multilayer networks. Journal of Complex Networks, 2(3), 203–271.
Klimushkin, M., Obiedkov, S., & Roth, C. (2010). Approaches to the selection of relevant concepts in the case of noisy data. In International Conference on Formal Concept Analysis (pp. 255–266). Springer.
Kolda, T. G., & Bader, B. W. (2009). Tensor decompositions and applications. SIAM Review, 51(3), 455–500.
Kuznetsov, S. O. (2007). On stability of a formal concept. Annals of Mathematics and Artificial Intelligence, 49(1), 101–115.
Kuznetsov, S. O., & Makhalova, T. (2018). On interestingness measures of formal concepts. Information Sciences, 442, 202–219.
Kuznetsov, S. O., & Makhalova, T. P. (2016). On stability of triadic concepts. In Proceedings of the Thirteenth International Conference on Concept Lattices and Their Applications, Moscow, Russia, July 18-22, 2016 (pp. 245–253).
Lancichinetti, A., Radicchi, F., Ramasco, J. J., & Fortunato, S. (2010). Finding statistically significant communities in networks. arXiv:1012.2363.
Lehmann, F., & Wille, R. (1995). A triadic approach to formal concept analysis. In ICCS (pp. 32–43).
Lehmann, S., Schwartz, M., & Hansen, L. K. (2008). Biclique communities. Physical Review E, 78(1), 016108.
Li, H., Nie, Z., Lee, W.-C., Giles, L., & Wen, J.-R. (2008). Scalable community discovery on textual data with relations. In Proceedings of the 17th ACM Conference on Information and Knowledge Management (pp. 1203–1212). ACM.
Messaoudi, A., Missaoui, R., & Ibrahim, M. H. (2019). Detecting overlapping communities in two-mode data networks using formal concept analysis. Revue des Nouvelles Technologies de l’Information, Extraction et Gestion des connaissances, RNTI-E-35, 189–200.
Newman, M. E., & Girvan, M. (2004). Finding and evaluating community structure in networks. Physical Review E, 69(2), 026113.
Newman, M. E. J. (2003). The structure and function of complex networks. SIAM Review, 45(2), 167–256.
Nicosia, V., Mangioni, G., Carchiolo, V., & Malgeri, M. (2009). Extending the definition of modularity to directed graphs with overlapping communities. Journal of Statistical Mechanics: Theory and Experiment, 2009(03), 3–24.
Palla, G., Derényi, I., Farkas, I., & Vicsek, T. (2005). Uncovering the overlapping community structure of complex networks in nature and society. Nature, 435(7043), 814.
Potgieter, A., April, K. A., Cooke, R. J., & Osunmakinde, I. O. (2009). Temporality in link prediction: Understanding social complexity. Emergence: Complexity & Organization (E: CO), 11(1), 69–83.
Rosvall, M., & Bergstrom, C. T. (2008). Maps of random walks on complex networks reveal community structure. Proceedings of the National Academy of Sciences, 105(4), 1118–1123.
Roth, C., Obiedkov, S., & Kourie, D. G. (2008). On succinct representation of knowledge community taxonomies with formal concept analysis. International Journal of Foundations of Computer Science, 19(02), 383–404.
Rousseeuw, P. J. (1987). Silhouettes: A graphical aid to the interpretation and validation of cluster analysis. Journal of Computational and Applied Mathematics, 20, 53–65.
Salehi, M., Sharma, R., Marzolla, M., Magnani, M., Siyari, P., & Montesi, D. (2015). Spreading processes in multilayer networks. IEEE Transactions on Network Science and Engineering, 2(2), 65–83.
Shi, C., Li, Y., Zhang, J., Sun, Y., & Yu, P. S. (2017). A survey of heterogeneous information network analysis. IEEE Transactions on Knowledge and Data Engineering, 29(1), 17–37.
Silva, A., Meira, W., Jr., & Zaki, M. J. (2012). Mining attribute-structure correlated patterns in large attributed graphs. Proceedings of the VLDB Endowment, 5(5), 466–477.
Sun, Y., & Han, J. (2012). Mining Heterogeneous Information Networks: Principles and Methodologies. Synthesis Lectures on Data Mining and Knowledge Discovery. San Rafael: Morgan & Claypool Publishers.
Tang, L., & Liu, H. (2010). Community detection and mining in social media. Synthesis Lectures on Data Mining and Knowledge Discovery, 2(1), 1–137.
Tang, W., Lu, Z., & Dhillon, I. S. (2009). Clustering with multiple graphs. In 2009 Ninth IEEE International Conference on Data Mining (pp. 1016–1021). IEEE.
Valtchev, P., & Missaoui, R. (2001). Building concept (Galois) lattices from parts: Generalizing the incremental methods. In Conceptual Structures: Broadening the Base, 9th International Conference on Conceptual Structures, ICCS 2001, Stanford, CA, USA, July 30-August 3, 2001, Proceedings (pp. 290–303).
Valtchev, P., Missaoui, R., & Lebrun, P. (2002). A partition-based approach towards constructing Galois (concept) lattices. Discrete Mathematics, 256(3), 801–829.
Wang, Q., & Fleury, E. (2013). Overlapping community structure and modular overlaps in complex networks. Mining Social Networks and Security Informatics (pp. 15–40). Berlin: Springer.
Wille, R. (1995). The basic theorem of triadic concept analysis. Order, 12(2), 149–158.
Wille, R. (1996). Conceptual structures of multicontexts. In International Conference on Conceptual Structures (pp. 23–39). Springer.
Xie, J., Kelley, S., & Szymanski, B. K. (2013). Overlapping community detection in networks: The state-of-the-art and comparative study. ACM Computing Surveys (CSUR), 45(4), 43.
Xu, Z., Ke, Y., Wang, Y., Cheng, H., & Cheng, J. (2012). A model-based approach to attributed graph clustering. In Proceedings of the 2012 ACM SIGMOD International Conference on Management of Data (pp. 505–516). ACM.
Zeng, Z., Wang, J., Zhou, L., & Karypis, G. (2006). Coherent closed quasi-clique discovery from large dense graph databases. In Proceedings of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (pp. 797–802). ACM.
Zhang, S., Wang, R.-S., & Zhang, X. (2007). Identification of overlapping community structure in complex networks using fuzzy c-means clustering. Physica A: Statistical Mechanics and its Applications, 374, 483–490.
Zhou, K., Martin, A., & Pan, Q. (2015). Evidential communities for complex networks. arXiv:1501.01780.
Zhou, Y., Cheng, H., & Yu, J. X. (2009). Graph clustering based on structural/attribute similarities. Proceedings of the VLDB Endowment, 2(1), 718–729.
Acknowledgements
The first author acknowledges the financial support of the Natural Sciences and Engineering Research Council of Canada (NSERC). All the authors are grateful to the reviewers for their relevant comments and suggestions.
Appendix
The Omega index (Collins and Dent 1988) is based on counts of node pairs: those that share no community, as well as those that occur together in exactly one community, two communities, and so on (Chakraborty et al. 2017).
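Let \(R=\{R_1, R_2, \ldots , R_J\}\) be the set of the J ground-truth communities in a graph with N nodes, and \(C=\{C_1, C_2, \ldots , C_K\}\) the set of the K detected communities. The Omega index is then defined as follows:
\[\mathrm{Omega}(R, C) = \frac{\mathrm{Omega}_u(R, C) - \mathrm{Omega}_e(R, C)}{1 - \mathrm{Omega}_e(R, C)},\]
where the unadjusted omega index Omega\(_u\) is defined as
\[\mathrm{Omega}_u(R, C) = \frac{1}{M} \sum_{j=0}^{\min (J, K)} |t_j(R) \cap t_j(C)|,\]
where \(M = N(N - 1)/2\) represents the number of node pairs, and \(t_j (R)\) is the set of node pairs that appear together in exactly j communities of the ground-truth set R (\(t_j (C)\) is defined likewise for C). Finally, the expected omega index Omega\(_e\) is given by
\[\mathrm{Omega}_e(R, C) = \frac{1}{M^2} \sum_{j=0}^{\min (J, K)} |t_j(R)| \cdot |t_j(C)|.\]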
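As an illustration, the Omega index can be sketched in a few lines of Python. This is a minimal sketch, not the implementation used in the chapter: the function names `pair_counts` and `omega_index` are hypothetical, communities are assumed to be given as sets of sortable node identifiers, and the index is computed as the observed fraction of node pairs on which both covers agree about the number of shared communities, corrected by the agreement expected by chance.

```python
from collections import Counter
from itertools import combinations


def pair_counts(communities):
    """Map each unordered node pair to the number of communities containing both nodes."""
    counts = Counter()
    for community in communities:
        for u, v in combinations(sorted(set(community)), 2):
            counts[(u, v)] += 1
    return counts


def omega_index(nodes, truth, detected):
    """Omega index between two (possibly overlapping) covers of `nodes`."""
    nodes = sorted(set(nodes))
    m = len(nodes) * (len(nodes) - 1) // 2  # M = N(N-1)/2 node pairs
    if m == 0:
        raise ValueError("need at least two nodes")
    in_truth, in_detected = pair_counts(truth), pair_counts(detected)

    # Histograms: number of pairs that co-occur in exactly j communities (j = 0 included).
    hist_truth, hist_detected, agree = Counter(), Counter(), 0
    for pair in combinations(nodes, 2):
        j_truth, j_detected = in_truth[pair], in_detected[pair]
        hist_truth[j_truth] += 1
        hist_detected[j_detected] += 1
        agree += j_truth == j_detected

    observed = agree / m  # unadjusted index
    expected = sum(hist_truth[j] * hist_detected[j] for j in hist_truth) / m**2
    if expected == 1.0:
        # Assumption of this sketch: when every pair falls in the same class in
        # both covers the denominator vanishes, and we report perfect agreement.
        return 1.0
    return (observed - expected) / (1.0 - expected)
```

For instance, `omega_index([1, 2, 3, 4], [{1, 2}, {3, 4}], [{1, 2}, {3, 4}])` compares a cover with itself. Enumerating all pairs keeps the sketch short but costs quadratic time in the number of nodes, so it is only suitable for small graphs.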
The overlapping Normalized Mutual Information is computed as follows. For each node i, its membership in the detected community structure C can be described by a binary vector \(x_i\) of length |C|, where \((x_i)_k\) is set to 1 if node i is a member of the k-th cluster \(C_k\), and 0 otherwise. The k-th entry of this vector can be viewed as a random variable \(X_k\) whose probability distribution is given by:
\(P(X_k = 1) = N_k/N\), \(P(X_k = 0)= 1 - P(X_k =1)\), where \(N_k= |C_k|\), and N is the number of nodes in the graph. The same holds for the random variable \(Y_l\) associated with the \(l\)-th cluster in the ground-truth community structure R.
To define the entropy measures \(H(X_k)\) and \(H(X_k, Y_l)\), both the empirical marginal probability distribution \(P(X_k)\) and the joint probability distribution \(P(X_k, Y_l)\) are needed. The conditional entropy of a cluster \(X_k\) given \(Y_l\) is defined as \(H(X_k|Y_l) = H(X_k, Y_l) - H(Y_l)\). The entropy of \(X_k\) with respect to the entire vector Y is based on the best matching between \(X_k\) and any component of Y:
\[H(X_k|Y) = \min _{l \in \{1, \ldots , J\}} H(X_k|Y_l),\]
where \(J = |R|\). With \(K = |C|\), the normalized conditional entropy of a community structure X with respect to Y is
\[H(X|Y) = \frac{1}{K} \sum _{k=1}^{K} \frac{H(X_k|Y)}{H(X_k)}.\]
Similarly, we define H(Y|X).
Then, the final Overlapping Normalized Mutual Information formula for two community structures C and R is given by:
\[\mathrm{ONMI}(C, R) = 1 - \frac{1}{2}\left[ H(X|Y) + H(Y|X)\right] .\]
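The steps above can be sketched in Python as follows. This is an illustrative sketch under stated assumptions rather than the code used in the chapter: the function names are hypothetical, communities are given as sets of nodes, n is the number of nodes, and ONMI is taken as one minus the average of the two normalized conditional entropies, each obtained by matching every community to the community of the other structure that minimizes the conditional entropy. Communities with zero entropy (empty or containing all nodes) are assumed to contribute 0, a choice made only for this sketch.

```python
from math import log2


def h(p):
    """Contribution -p*log2(p) of one probability to an entropy (0 when p == 0)."""
    return -p * log2(p) if p > 0 else 0.0


def binary_entropy(size, n):
    """Entropy of the membership indicator of a community with `size` of n nodes."""
    return h(size / n) + h((n - size) / n)


def cond_entropy(x, y, n):
    """H(X_k | Y_l) = H(X_k, Y_l) - H(Y_l) for two node sets x and y among n nodes."""
    both = len(x & y)
    only_x, only_y = len(x) - both, len(y) - both
    neither = n - both - only_x - only_y
    joint = sum(h(c / n) for c in (both, only_x, only_y, neither))
    return joint - binary_entropy(len(y), n)


def normalized_cond_entropy(cover_x, cover_y, n):
    """H(X|Y): average over k of min_l H(X_k|Y_l) / H(X_k)."""
    total = 0.0
    for x in cover_x:
        h_x = binary_entropy(len(x), n)
        if h_x == 0:
            continue  # sketch assumption: degenerate communities contribute 0
        total += min(cond_entropy(x, y, n) for y in cover_y) / h_x
    return total / len(cover_x)


def onmi(cover_c, cover_r, n):
    """Overlapping NMI between two non-empty covers of a graph with n nodes."""
    cover_c = [set(c) for c in cover_c]
    cover_r = [set(r) for r in cover_r]
    return 1.0 - 0.5 * (normalized_cond_entropy(cover_c, cover_r, n)
                        + normalized_cond_entropy(cover_r, cover_c, n))
```

A call such as `onmi([{1, 2}, {3, 4}], [{1, 3}, {2, 4}], 4)` compares two covers of a four-node graph whose membership indicators are statistically independent.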
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this chapter
Cite this chapter
Missaoui, R., Messaoudi, A., Ibrahim, M.H., Abdessalem, T. (2022). Detecting Communities in Complex Networks Using Formal Concept Analysis. In: Jaziri, R., Martin, A., Rousset, MC., Boudjeloud-Assala, L., Guillet, F. (eds) Advances in Knowledge Discovery and Management. Studies in Computational Intelligence, vol 1004. Springer, Cham. https://doi.org/10.1007/978-3-030-90287-2_5
DOI: https://doi.org/10.1007/978-3-030-90287-2_5
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-90286-5
Online ISBN: 978-3-030-90287-2
eBook Packages: Intelligent Technologies and Robotics (R0)