
Mining Overlapping Communities and Inner Role Assignments through Bayesian Mixed-Membership Models of Networks with Context-Dependent Interactions

Published: 10 January 2018

Abstract

Community discovery and role assignment have recently been integrated into an unsupervised approach for the exploratory analysis of overlapping communities and inner roles in networks. However, the formation of ties in these prototypical research efforts is not truly realistic, since it does not account for a fundamental aspect of link establishment in real-world networks, i.e., the explicative reasons that cause interactions among nodes. Such reasons can be interpreted as generic requirements of nodes, which are met by other nodes and essentially pertain both to the nodes themselves and to their interaction contexts (i.e., the respective communities and roles).
In this article, we present two new model-based machine-learning approaches, wherein community discovery and role assignment are seamlessly integrated and simultaneously performed through approximate posterior inference in Bayesian mixed-membership models of directed networks. The devised models account for the explicative reasons governing link establishment in terms of node-specific and contextual latent interaction factors. The former are inherently characteristic of nodes, while the latter characterize nodes in the context of the individual communities and roles. The generative process of both models assigns nodes to communities with respective roles and connects them through directed links, which are probabilistically governed by their node-specific and contextual interaction factors. The proposed models differ in how they exploit the contextual interaction factors. More precisely, in one model, all contextual interaction factors have the same impact on link generation. In the other model, the contextual interaction factors are weighted by the extent of involvement of the linked nodes in the respective communities and roles.
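The generative process described above can be illustrated with a rough Python sketch. The abstract does not state the actual priors or link function, so everything concrete here is an assumption made for illustration: the Dirichlet memberships, the Gaussian factors, the logistic link, the `bias` term, and all names (`generate_network`, `theta`, `psi`, `node_f`, `ctx_f`). The `weighted` flag mimics the distinction between the two models: when it is off, contextual factors contribute unscaled; when it is on, they are scaled by the nodes' involvement in the drawn community and role.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def generate_network(n_nodes=30, n_comms=3, n_roles=2, n_factors=4,
                     weighted=False, bias=2.0, seed=0):
    """Sample a directed network from a toy mixed-membership process.

    All distributional choices are illustrative assumptions, not the
    models defined in the article.
    """
    rng = np.random.default_rng(seed)
    # Mixed membership of each node over communities: shape (N, K).
    theta = rng.dirichlet(np.ones(n_comms), size=n_nodes)
    # Membership of each node over roles within each community: (N, K, R).
    psi = rng.dirichlet(np.ones(n_roles), size=(n_nodes, n_comms))
    # Node-specific latent interaction factors: (N, F).
    node_f = rng.normal(size=(n_nodes, n_factors))
    # Contextual factors, one vector per node, community and role: (N, K, R, F).
    ctx_f = rng.normal(size=(n_nodes, n_comms, n_roles, n_factors))

    adj = np.zeros((n_nodes, n_nodes), dtype=int)
    for u in range(n_nodes):
        for v in range(n_nodes):
            if u == v:
                continue  # no self-links in this sketch
            # Each endpoint draws a community, then a role inside it.
            cu = rng.choice(n_comms, p=theta[u])
            ru = rng.choice(n_roles, p=psi[u, cu])
            cv = rng.choice(n_comms, p=theta[v])
            rv = rng.choice(n_roles, p=psi[v, cv])
            # Involvement weights are used only in the weighted variant.
            wu = theta[u, cu] * psi[u, cu, ru] if weighted else 1.0
            wv = theta[v, cv] * psi[v, cv, rv] if weighted else 1.0
            score = node_f[u] @ node_f[v]
            score += (wu * ctx_f[u, cu, ru]) @ (wv * ctx_f[v, cv, rv])
            # Directed link u -> v drawn from a Bernoulli with logistic link.
            adj[u, v] = rng.binomial(1, sigmoid(score - bias))
    return adj, theta
```

Calling `generate_network(weighted=True)` switches to the involvement-weighted variant; comparing the two adjacency matrices for the same seed gives a feel for how the weighting dampens the contextual contribution.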
We develop MCMC algorithms implementing approximate posterior inference and parameter estimation within our models.
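To give a flavor of what MCMC-based approximate posterior inference involves, the following is a minimal random-walk Metropolis sketch on a deliberately tiny problem. It is not the article's algorithm: the model (a single scalar interaction strength `b` with a Gaussian prior and a logistic Bernoulli likelihood over observed links), the function names, and the tuning values are all assumptions chosen for illustration.

```python
import numpy as np

def log_posterior(b, links, prior_sd=10.0):
    # Bernoulli log-likelihood with logistic link: sum(y*b - log(1 + e^b)).
    log_lik = np.sum(links * b - np.logaddexp(0.0, b))
    # Zero-mean Gaussian prior on b (up to an additive constant).
    log_prior = -0.5 * (b / prior_sd) ** 2
    return log_lik + log_prior

def metropolis(links, n_iter=6000, burn_in=1000, step=0.5, seed=0):
    """Random-walk Metropolis sampler for the scalar parameter b."""
    rng = np.random.default_rng(seed)
    b = 0.0
    current = log_posterior(b, links)
    samples = []
    for it in range(n_iter):
        proposal = b + step * rng.normal()
        candidate = log_posterior(proposal, links)
        # Accept with probability min(1, posterior ratio).
        if np.log(rng.uniform()) < candidate - current:
            b, current = proposal, candidate
        if it >= burn_in:
            samples.append(b)
    return np.array(samples)

# Toy data: 80 observed links out of 100 candidate node pairs.
links = np.array([1] * 80 + [0] * 20)
samples = metropolis(links)
posterior_mean = samples.mean()
```

With this weak prior, the posterior mean of `b` should land near the log-odds of the observed link frequency. A sampler for a full mixed-membership model would additionally resample community and role assignments for every link, which this sketch omits.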
Finally, we conduct an extensive comparative evaluation, which demonstrates the superiority of the proposed models in community compactness and link prediction on various real-world and synthetic networks.

Supplementary Material

a18-costa-apndx.pdf (costa.zip)
Supplemental movie, appendix, image, and software files for "Mining Overlapping Communities and Inner Role Assignments through Bayesian Mixed-Membership Models of Networks with Context-Dependent Interactions"


      Published In

      ACM Transactions on Knowledge Discovery from Data, Volume 12, Issue 2
      Survey Papers and Regular Papers
      April 2018
      376 pages
      ISSN: 1556-4681
      EISSN: 1556-472X
      DOI: 10.1145/3178544
      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      Published: 10 January 2018
      Accepted: 01 June 2017
      Revised: 01 September 2016
      Received: 01 December 2014
      Published in TKDD Volume 12, Issue 2

      Author Tags

      1. Bayesian probabilistic network analysis
      2. Overlapping community detection
      3. link prediction
      4. role assignment

      Qualifiers

      • Research-article
      • Research
      • Refereed


      Cited By

      • (2024) The Core Might Change Anyhow We Define It. Complexity 2024. DOI: 10.1155/2024/3956877. Online publication date: 1-Jan-2024
      • (2023) Rule-Based Detection of Anomalous Patterns in Device Behavior for Explainable IoT Security. IEEE Transactions on Services Computing 16:6 (4514-4525). DOI: 10.1109/TSC.2023.3327822. Online publication date: Nov-2023
      • (2023) An IoT-based Approach to Expert Recommendation in Community Question Answering for Disaster Recovery. 2023 IEEE International Conference on Data Mining Workshops (ICDMW) (860-867). DOI: 10.1109/ICDMW60847.2023.00116. Online publication date: 4-Dec-2023
      • (2023) Here are the answers. What is your question? Bayesian collaborative tag-based recommendation of time-sensitive expertise in question-answering communities. Expert Systems with Applications 225 (120042). DOI: 10.1016/j.eswa.2023.120042. Online publication date: Sep-2023
      • (2022) Discovering Organizational Hierarchy through a Corporate Ranking Algorithm. Complexity 2022. DOI: 10.1155/2022/8154476. Online publication date: 1-Jan-2022
      • (2022) Hierarchical Bayesian text modeling for the unsupervised joint analysis of latent topics and semantic clusters. International Journal of Approximate Reasoning 147:C (23-39). DOI: 10.1016/j.ijar.2022.05.002. Online publication date: 1-Aug-2022
      • (2022) Overlapping communities and roles in networks with node attributes. Artificial Intelligence 302:C. DOI: 10.1016/j.artint.2021.103580. Online publication date: 1-Jan-2022
      • (2021) Role-Aware Information Spread in Online Social Networks. Entropy 23:11 (1542). DOI: 10.3390/e23111542. Online publication date: 19-Nov-2021
      • (2020) Integrating overlapping community discovery and role analysis: Bayesian probabilistic generative modeling and mean-field variational inference. Engineering Applications of Artificial Intelligence 89 (103437). DOI: 10.1016/j.engappai.2019.103437. Online publication date: Mar-2020
      • (2019) Document Clustering and Topic Modeling: A Unified Bayesian Probabilistic Perspective. 2019 IEEE 31st International Conference on Tools with Artificial Intelligence (ICTAI) (278-285). DOI: 10.1109/ICTAI.2019.00047. Online publication date: Nov-2019
