Abstract
Topic modeling can strengthen the interplay between community discovery and role analysis in social media, and it helps uncover communities and roles that are both social and topic-aware. In this manuscript, we use topic modeling to inform the seamless integration of community discovery and role analysis. To this end, we develop a generative model of social media, in which the interrelation among communities, roles and topics is explained from a fully Bayesian perspective. Essentially, communities, roles and topics are latent factors that interact in an underlying generative process to govern link formation and message wording. Posterior inference under the devised model supports a variety of exploratory, descriptive and predictive tasks. These include the detection and interpretation of overlapping communities, roles and topics, as well as the prediction of missing links. We derive the mathematical details of variational inference and design a coordinate-ascent algorithm that implements it. An empirical assessment on real-world social media shows that the proposed model is more accurate in community discovery and link prediction than several established competitors, which supports both our modeling effort and the underlying intuition.
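The idea that latent communities, roles and topics jointly govern link formation and message wording can be illustrated with a toy simulation. The sketch below is not the model developed in this manuscript: the factorization, the Dirichlet hyperparameters, the `role_activity` propensities and every function name are assumptions introduced purely for illustration. It only shows how a link and a message can both be drawn from the same latent factors.

```python
import random


def dirichlet(alpha, size, rng):
    """Draw a symmetric Dirichlet sample by normalizing Gamma variates."""
    draws = [rng.gammavariate(alpha, 1.0) for _ in range(size)]
    total = sum(draws)
    return [d / total for d in draws]


def categorical(probs, rng):
    """Sample an index according to the given probability vector."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


def generate(n_nodes=6, n_comm=2, n_roles=2, n_topics=3, vocab_size=8,
             words_per_message=5, seed=0):
    """Toy generative process (illustrative assumption, not the paper's model)."""
    rng = random.Random(seed)
    # Mixed membership of each node over communities (overlap is allowed).
    membership = [dirichlet(0.5, n_comm, rng) for _ in range(n_nodes)]
    # Role distribution inside each community.
    role_mix = [dirichlet(1.0, n_roles, rng) for _ in range(n_comm)]
    # Topic distribution for each (community, role) pair.
    topic_mix = [[dirichlet(0.5, n_topics, rng) for _ in range(n_roles)]
                 for _ in range(n_comm)]
    # Word distribution for each topic.
    word_dist = [dirichlet(0.5, vocab_size, rng) for _ in range(n_topics)]
    # Hypothetical per-role propensity to establish links.
    role_activity = [rng.uniform(0.3, 0.9) for _ in range(n_roles)]

    # Link formation: the source node picks a community and a role; the link
    # is more likely when the target node also belongs to that community.
    links = []
    for u in range(n_nodes):
        for v in range(n_nodes):
            if u == v:
                continue
            c = categorical(membership[u], rng)
            r = categorical(role_mix[c], rng)
            if rng.random() < role_activity[r] * membership[v][c]:
                links.append((u, v))

    # Message wording: community and role select a topic, the topic emits words.
    messages = {}
    for u in range(n_nodes):
        c = categorical(membership[u], rng)
        r = categorical(role_mix[c], rng)
        t = categorical(topic_mix[c][r], rng)
        messages[u] = [categorical(word_dist[t], rng)
                       for _ in range(words_per_message)]
    return links, messages


if __name__ == "__main__":
    links, messages = generate()
    print(len(links), "directed links")
    print(messages)
```

Posterior inference would run in the opposite direction, recovering the latent memberships, roles and topics from observed links and messages. That step is not sketched here.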
Notes
Notice that, in the case of collaboration networks, the term message refers to the corresponding type of coauthored content, such as project proposals, deliverables and publications. In particular, one data set used for the experimental assessment of Sect. 6 is chosen from the scientific collaboration domain and, in such a context, message is a synonym of publication.
The mathematical derivation of both the functional forms of the individual factors on the right-hand side of Eq. 3 and the updates of the respective variational parameters is omitted for brevity.
Additional information
This manuscript is an extended version of the PAKDD’2018 Long Presentation paper “Marrying Community Discovery and Role Analysis in Social Media via Topic Modeling”.
Cite this article
Costa, G., Ortale, R. Topic-aware joint analysis of overlapping communities and roles in social media. Int J Data Sci Anal 9, 415–429 (2020). https://doi.org/10.1007/s41060-019-00190-4
DOI: https://doi.org/10.1007/s41060-019-00190-4