Topic-aware joint analysis of overlapping communities and roles in social media

Costa, Gianni; Ortale, Riccardo

doi:10.1007/s41060-019-00190-4

Regular Paper
Published: 05 June 2019

Volume 9, pages 415–429 (2020)

International Journal of Data Science and Analytics

Gianni Costa¹ &
Riccardo Ortale¹

Abstract

Topic modeling can be used to improve the mutuality and interpenetration of community discovery and role analysis in social media. Also, it is useful to uncover communities and roles that are both social and topic-aware. In the present manuscript, we explore the exploitation of topic modeling to inform the seamless integration of community discovery and role analysis. For this purpose, we develop an innovative generative model of social media, in which the interrelation among communities, roles and topics is explained from a fully Bayesian perspective. Essentially, communities, roles and topics are latent factors that interact in an underlying generative process, to govern link formation and message wording. Posterior inference under the devised model allows for a variety of exploratory, descriptive and predictive tasks. These include the detection and interpretation of overlapping communities, roles and topics as well as the prediction of missing links. We derive the mathematical details of variational inference and design a coordinate-ascent algorithm implementing the latter. An empirical assessment on real-world social media demonstrates a superior accuracy of the proposed model in community discovery and link prediction compared to several established competitors, which substantiates the rationality of both our modeling effort and the underlying intuition.
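
As a rough illustration of the kind of generative process the abstract describes, the following Python sketch samples a toy network in which latent communities, roles and topics jointly drive link formation and message wording. It is not the model proposed in the article: the function name generate_toy_network, all parameters, the Dirichlet priors, the role affinity matrix and the link probability formula are assumptions made purely for illustration.

```python
import numpy as np


def generate_toy_network(n_users=6, n_communities=2, n_roles=2, n_topics=3,
                         vocab_size=8, words_per_message=5, seed=0):
    """Sample a toy directed network and one message per user.

    Illustrative assumption only, not the authors' model.
    """
    rng = np.random.default_rng(seed)

    # Overlapping communities: each user has a mixed membership vector.
    membership = rng.dirichlet(np.full(n_communities, 0.5), size=n_users)
    # Each user also has a distribution over roles.
    roles = rng.dirichlet(np.ones(n_roles), size=n_users)
    # Hypothetical role-to-role affinity in [0.2, 1.0].
    role_affinity = rng.uniform(0.2, 1.0, size=(n_roles, n_roles))
    # Each community favors some topics; each topic favors some words.
    community_topics = rng.dirichlet(np.ones(n_topics), size=n_communities)
    topic_words = rng.dirichlet(np.ones(vocab_size), size=n_topics)

    # Link formation: shared community affiliation scaled by role affinity.
    shared = membership @ membership.T
    interaction = roles @ role_affinity @ roles.T
    link_prob = shared * interaction
    links = rng.random((n_users, n_users)) < link_prob
    np.fill_diagonal(links, False)  # no self-links

    # Message wording: pick a community, then a topic, then the words.
    messages = []
    for user in range(n_users):
        community = rng.choice(n_communities, p=membership[user])
        topic = rng.choice(n_topics, p=community_topics[community])
        words = rng.choice(vocab_size, size=words_per_message,
                           p=topic_words[topic])
        messages.append((user, int(community), int(topic), words.tolist()))
    return membership, link_prob, links, messages
```

Inference in the article runs in the opposite direction, recovering the latent factors from observed links and messages; the sketch covers only the forward sampling step.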

Notes

Notice that, in the case of collaboration networks, the term message refers to the corresponding type of coauthored content, such as project proposals, deliverables and publications. In particular, one data set used for the experimental assessment of Sect. 6 is chosen from the scientific collaboration domain and, in such a context, message is a synonym of publication.
The mathematical derivation of both the functional forms of the individual factors on the right-hand side of Eq. 3 and the updates of the respective variational parameters is omitted for brevity.

Author information

Authors and Affiliations

ICAR-CNR, Via P. Bucci, 8/9C, Rende, CS, Italy
Gianni Costa & Riccardo Ortale

Corresponding author

Correspondence to Riccardo Ortale.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This manuscript is an extended version of the PAKDD’2018 Long Presentation paper “Marrying Community Discovery and Role Analysis in Social Media via Topic Modeling”.

About this article

Cite this article

Costa, G., Ortale, R. Topic-aware joint analysis of overlapping communities and roles in social media. Int J Data Sci Anal 9, 415–429 (2020). https://doi.org/10.1007/s41060-019-00190-4

Received: 14 December 2018
Accepted: 22 May 2019
Published: 05 June 2019
Issue Date: May 2020
DOI: https://doi.org/10.1007/s41060-019-00190-4
