Abstract
Community and topic are two widely studied patterns in social network analysis. However, most existing studies either use textual content to improve community detection or use link structure to guide topic modeling. Some recent studies take both link-emphasized community and text-emphasized topic into account, but they model community and topic with the same latent variable. In practice, community and topic differ from each other, so it is more reasonable to model them with different variables. To discover communities, topics, and their relations simultaneously, a mutual enhanced infinite generative model (MEI) is proposed. The model distinguishes community from topic and relates the two via community-topic distributions. Communities and topics are detected simultaneously and enhance each other during the learning process. To determine the appropriate number of communities and topics automatically, the Hierarchical/Dirichlet Process Mixture model (H/DPM) is employed. A Gibbs sampling based approach is adopted to learn the model parameters. Experiments are conducted on a co-author network extracted from DBLP, in which each author is associated with his/her published papers. Experimental results show that the proposed model outperforms several baseline models in terms of perplexity and link prediction performance.
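To illustrate how a Dirichlet process prior lets the number of groups be inferred rather than fixed in advance, the following is a minimal sketch of the Chinese restaurant process, the standard sequential view of a Dirichlet process. It shows the prior only; it is not the MEI model or its Gibbs sampler, and the function name `crp_assignments` and the concentration parameter name `alpha` are assumptions introduced here for illustration.

```python
import random


def crp_assignments(n_items, alpha, seed=0):
    """Sample cluster labels for n_items from a Chinese restaurant process.

    alpha is the concentration parameter (hypothetical name): larger values
    make it more likely that an item opens a new cluster.
    """
    rng = random.Random(seed)
    counts = []  # counts[k] = number of items currently in cluster k
    labels = []  # labels[i] = cluster index assigned to item i
    for _ in range(n_items):
        # An existing cluster k is chosen with weight counts[k];
        # a brand-new cluster is chosen with weight alpha.
        weights = counts + [alpha]
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(counts):
            counts.append(1)  # open a new cluster
        else:
            counts[k] += 1    # join an existing cluster
        labels.append(k)
    return labels, counts


if __name__ == "__main__":
    labels, counts = crp_assignments(n_items=200, alpha=1.0)
    print("number of clusters:", len(counts))
    print("cluster sizes:", counts)
```

The number of clusters is not an input: it emerges from the sampling process and grows slowly with the number of items. In a full mixture model, each weight would additionally be multiplied by the likelihood of the item's data under that cluster.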
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Duan, D., Li, Y., Li, R., Lu, Z., Wen, A. (2011). MEI: Mutual Enhanced Infinite Generative Model for Simultaneous Community and Topic Detection. In: Elomaa, T., Hollmén, J., Mannila, H. (eds) Discovery Science. DS 2011. Lecture Notes in Computer Science, vol 6926. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-24477-3_10
DOI: https://doi.org/10.1007/978-3-642-24477-3_10
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-24476-6
Online ISBN: 978-3-642-24477-3
eBook Packages: Computer Science, Computer Science (R0)