DOI: 10.1145/2348283.2348360
research-article

Social-network analysis using topic models

Published: 12 August 2012

Abstract

In this paper, we discuss how we can extend probabilistic topic models to analyze the relationship graph of popular social-network data, so that we can group or label the edges and nodes in the graph based on their topic similarity. In particular, we first apply the well-known Latent Dirichlet Allocation (LDA) model and its existing variants to the graph-labeling task and argue that the existing models do not handle popular nodes (nodes with many incoming edges) in the graph very well. We then propose possible extensions to this model to deal with popular nodes. Our experiments show that the proposed extensions are very effective in labeling popular nodes, showing significant improvements over the existing methods. Our proposed methods can be used for providing, for instance, more relevant friend recommendations within a social network.
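The abstract does not spell out how the graph is fed to a topic model. One plausible reading, taken here as an assumption rather than as the authors' stated method, is to treat each follower as a "document" and each node it follows as a "word", so that a popular node is one that shows up in most documents. The sketch below only illustrates that mapping and a simple way to flag popular nodes; the toy graph, the function names `edges_to_documents` and `popular_nodes`, and the `min_share` threshold are all hypothetical.

```python
from collections import defaultdict

# Hypothetical toy follow graph as (follower, followed) pairs.
EDGES = [
    ("alice", "nasa"), ("alice", "celebrity"),
    ("bob", "nasa"), ("bob", "spacex"), ("bob", "celebrity"),
    ("carol", "chef"), ("carol", "celebrity"),
    ("dave", "chef"), ("dave", "foodblog"), ("dave", "celebrity"),
]


def edges_to_documents(edges):
    """Group outgoing edges by follower.

    Assumed mapping: each follower is one 'document' and the nodes it
    follows are that document's 'words', in edge order.
    """
    docs = defaultdict(list)
    for follower, followed in edges:
        docs[follower].append(followed)
    return dict(docs)


def popular_nodes(edges, min_share):
    """Return, sorted by name, the nodes followed by at least
    `min_share` of all followers (an illustrative threshold)."""
    followers_of = defaultdict(set)
    for follower, followed in edges:
        followers_of[followed].add(follower)
    n_docs = len({follower for follower, _ in edges})
    return sorted(
        node
        for node, followers in followers_of.items()
        if len(followers) / n_docs >= min_share
    )


docs = edges_to_documents(EDGES)
print(docs)
print(popular_nodes(EDGES, 0.75))
```

Under this mapping, the documents could be passed to any standard LDA implementation. A node followed by nearly every user carries little information about which topic a given follower belongs to, which is one way to see why popular nodes might need special handling.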



Published In

SIGIR '12: Proceedings of the 35th international ACM SIGIR conference on Research and development in information retrieval
August 2012
1236 pages
ISBN:9781450314725
DOI:10.1145/2348283
Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. handling popular nodes
  2. latent dirichlet allocation
  3. social-network analysis
  4. topic model

Qualifiers

  • Research-article

Conference

SIGIR '12
Acceptance Rates

Overall Acceptance Rate 792 of 3,983 submissions, 20%
