
Topic-aware joint analysis of overlapping communities and roles in social media

  • Regular Paper
International Journal of Data Science and Analytics

Abstract

Topic modeling can strengthen the mutual reinforcement of community discovery and role analysis in social media. It also helps uncover communities and roles that are both social and topic-aware. In this manuscript, we explore how topic modeling can inform the seamless integration of community discovery and role analysis. To this end, we develop a generative model of social media in which the interrelation among communities, roles and topics is explained from a fully Bayesian perspective. Essentially, communities, roles and topics are latent factors that interact in an underlying generative process to govern link formation and message wording. Posterior inference under the model supports a variety of exploratory, descriptive and predictive tasks, including the detection and interpretation of overlapping communities, roles and topics as well as the prediction of missing links. We derive the mathematical details of variational inference and design a coordinate-ascent algorithm that implements it. An empirical assessment on real-world social media shows that the proposed model is more accurate in community discovery and link prediction than several established competitors, which supports both our modeling effort and the underlying intuition.
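
The generative story outlined in the abstract (latent communities, roles and topics that jointly govern link formation and message wording) can be illustrated with a toy simulation. The sketch below is not the model developed in the paper: the choice of priors, the way roles gate links, the way the source community selects a topic, and every name in the code (`generate`, `membership`, `link_prob` and so on) are assumptions made purely for illustration.

```python
import random


def dirichlet(rng, alpha, size):
    """Draw from a symmetric Dirichlet by normalising Gamma variates."""
    draws = [rng.gammavariate(alpha, 1.0) for _ in range(size)]
    total = sum(draws)
    return [d / total for d in draws]


def categorical(rng, probs):
    """Sample an index according to a probability vector."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


def generate(num_users=6, num_communities=2, num_roles=2, num_topics=3,
             vocab=("graph", "topic", "role", "link", "model", "text"),
             words_per_message=5, seed=0):
    """Toy generative process (illustrative assumptions, not the paper's model)."""
    rng = random.Random(seed)
    # Mixed memberships: each user spreads mass over several communities,
    # which is what makes the communities overlapping.
    membership = [dirichlet(rng, 0.5, num_communities) for _ in range(num_users)]
    # Each community has its own distribution over roles and over topics.
    role_given_comm = [dirichlet(rng, 1.0, num_roles) for _ in range(num_communities)]
    topic_given_comm = [dirichlet(rng, 1.0, num_topics) for _ in range(num_communities)]
    # Each topic has its own distribution over vocabulary words.
    word_given_topic = [dirichlet(rng, 0.5, len(vocab)) for _ in range(num_topics)]
    # Assumed link mechanism: a Bernoulli rate per (source role, target role).
    link_prob = [[rng.betavariate(1.0, 1.0) for _ in range(num_roles)]
                 for _ in range(num_roles)]

    links, messages = [], []
    for u in range(num_users):
        for v in range(num_users):
            if u == v:
                continue
            # Latent community and role of each endpoint for this interaction.
            cu = categorical(rng, membership[u])
            cv = categorical(rng, membership[v])
            ru = categorical(rng, role_given_comm[cu])
            rv = categorical(rng, role_given_comm[cv])
            if rng.random() < link_prob[ru][rv]:
                links.append((u, v))
                # The source community picks a topic, the topic picks the words.
                topic = categorical(rng, topic_given_comm[cu])
                words = [vocab[categorical(rng, word_given_topic[topic])]
                         for _ in range(words_per_message)]
                messages.append((u, v, topic, words))
    return links, messages
```

Calling `generate(seed=1)` returns the sampled directed links together with one message per link. Posterior inference runs in the opposite direction: given only the observed links and words, it estimates quantities that play the part of `membership`, `role_given_comm`, `topic_given_comm` and `word_given_topic` in this sketch.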




Notes

  1. Note that, in the case of collaboration networks, the term message refers to the corresponding type of coauthored content, such as project proposals, deliverables and publications. In particular, one data set used in the experimental assessment of Sect. 6 comes from the scientific collaboration domain, where message is a synonym of publication.

  2. The mathematical derivation of both the functional forms of the individual factors on the right-hand side of Eq. 3 and the updates of the respective variational parameters is omitted for brevity.
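
For orientation only: Eq. 3 is not reproduced in this preview, so the following assumes that it is a mean-field factorization of the variational distribution. Under that assumption, the generic coordinate-ascent update of a single factor takes the standard form below. The symbols $\mathbf{x}$ (observations), $\mathbf{z}$ (latent variables) and $q_j$ (individual factors) are generic placeholders, not the notation of the paper.

```latex
q(\mathbf{z}) = \prod_{j} q_j(z_j),
\qquad
\log q_j^{*}(z_j) = \mathbb{E}_{q_{-j}}\!\left[ \log p(\mathbf{x}, \mathbf{z}) \right] + \mathrm{const}
```

Here $\mathbb{E}_{q_{-j}}$ denotes the expectation over all factors other than $q_j$. Cycling through the factors and applying this update in turn never decreases the evidence lower bound, which is why the scheme is called coordinate ascent.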


Author information


Corresponding author

Correspondence to Riccardo Ortale.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This manuscript is an extended version of the PAKDD’2018 Long Presentation paper “Marrying Community Discovery and Role Analysis in Social Media via Topic Modeling”.


About this article


Cite this article

Costa, G., Ortale, R. Topic-aware joint analysis of overlapping communities and roles in social media. Int J Data Sci Anal 9, 415–429 (2020). https://doi.org/10.1007/s41060-019-00190-4


  • DOI: https://doi.org/10.1007/s41060-019-00190-4
