DOI: 10.1145/2736277.2741673

Tackling the Achilles Heel of Social Networks: Influence Propagation based Language Model Smoothing

Published: 18 May 2015

Abstract

Online social networks enjoy worldwide prosperity, having revolutionized the way people discover, share, and distribute information. With millions of registered users and a proliferation of user-generated content, social networks have become "giants" that seem able to support almost any research task. However, the giants do have an Achilles heel: extreme data sparsity. Compared with the massive data in the whole collection, an individual posted document (e.g., a microblog of fewer than 140 characters) seems too sparse to be told apart from other documents in various research scenarios, even though the documents are in fact different. In this paper we propose to tackle the Achilles heel of social networks by smoothing the language model via influence propagation. We formulate a socialized factor graph model that utilizes both the textual correlations between document pairs and the socialized augmentation networks behind the documents, such as user relationships and social interactions. These factors are modeled as attributes and dependencies among documents and their corresponding users. An efficient algorithm is designed to learn the proposed factor graph model. Finally, we propagate term counts to smooth documents based on the estimated influence. Experimental results on Twitter and Weibo datasets validate the effectiveness of the proposed model. By leveraging the smoothed language model with social factors, our approach obtains significant improvements over several alternative methods on both intrinsic and extrinsic evaluations, measured in terms of perplexity, nDCG, and MAP.
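The final step described in the abstract, propagating term counts between documents according to estimated influence, can be illustrated with a small sketch. This is not the paper's algorithm: the abstract gives no formulas, so the linear interpolation below, the function names `propagate_term_counts` and `to_language_model`, the mixing weight `lam`, and the hand-written influence scores are all assumptions made for illustration. In the paper the influence values would come from the learned factor graph model, which is not reproduced here.

```python
from collections import Counter


def propagate_term_counts(doc_counts, influence, lam=0.5):
    """Mix each document's term counts with counts from documents that influence it.

    doc_counts: dict mapping doc id -> Counter of raw term counts.
    influence:  dict mapping (source id, target id) -> non-negative weight
                (hypothetical stand-in for the estimated influence).
    lam:        assumed share of mass taken from influencing documents.
    """
    smoothed = {}
    for doc_id, counts in doc_counts.items():
        # Collect positive incoming influence from other known documents.
        incoming = {
            src: w
            for (src, dst), w in influence.items()
            if dst == doc_id and src != doc_id and src in doc_counts and w > 0
        }
        total = sum(incoming.values())
        if total == 0:
            # No influencing documents: keep the raw counts unchanged.
            smoothed[doc_id] = {t: float(c) for t, c in counts.items()}
            continue
        # Keep (1 - lam) of the document's own counts ...
        mixed = {t: (1.0 - lam) * c for t, c in counts.items()}
        # ... and distribute lam across sources in proportion to influence.
        for src, w in incoming.items():
            share = lam * w / total
            for term, c in doc_counts[src].items():
                mixed[term] = mixed.get(term, 0.0) + share * c
        smoothed[doc_id] = mixed
    return smoothed


def to_language_model(counts):
    """Normalize (possibly fractional) counts into a unigram distribution."""
    total = sum(counts.values())
    return {t: c / total for t, c in counts.items()} if total else {}


# Toy example: d2 influences d1, so d1 gains probability mass for d2's terms.
docs = {
    "d1": Counter("rain today".split()),
    "d2": Counter("rain storm warning".split()),
}
scores = {("d2", "d1"): 0.8}
model_d1 = to_language_model(propagate_term_counts(docs, scores)["d1"])
print(model_d1)
```

In the toy example the short document d1 never mentions "storm", yet after propagation it assigns that term non-zero probability, which is the effect that smoothing is meant to have on sparse posts.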

      Published In

      WWW '15: Proceedings of the 24th International Conference on World Wide Web
      May 2015
      1460 pages
      ISBN:9781450334693

      Sponsors

      • IW3C2: International World Wide Web Conference Committee

      Publisher

      International World Wide Web Conferences Steering Committee

      Republic and Canton of Geneva, Switzerland

      Author Tags

      1. influence propagation
      2. language model smoothing
      3. social network

      Qualifiers

      • Research-article

      Conference

      WWW '15
      Sponsor:
      • IW3C2

      Acceptance Rates

      WWW '15 Paper Acceptance Rate 131 of 929 submissions, 14%;
      Overall Acceptance Rate 1,899 of 8,196 submissions, 23%

      Bibliometrics & Citations

      Bibliometrics

      Article Metrics

      • Downloads (last 12 months): 7
      • Downloads (last 6 weeks): 1
      Reflects downloads up to 08 Feb 2025

      Citations

      Cited By

      • (2023) Trust-Aware Evidence Reasoning and Spatiotemporal Feature Aggregation for Explainable Fake News Detection. Applied Sciences, 13:9 (5703). DOI: 10.3390/app13095703. Online publication date: 5-May-2023
      • (2022) MiSTR: A Multiview Structural-Temporal Learning Framework for Rumor Detection. IEEE Transactions on Big Data, 8:4 (1007-1019). DOI: 10.1109/TBDATA.2021.3107481. Online publication date: 1-Aug-2022
      • (2021) PostCom2DR: Utilizing information from post and comments to detect rumors. Expert Systems with Applications (116071). DOI: 10.1016/j.eswa.2021.116071. Online publication date: Oct-2021
      • (2016) Learning cascaded influence under partial monitoring. Proceedings of the 2016 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (255-262). DOI: 10.5555/3192424.3192471. Online publication date: 18-Aug-2016
      • (2016) Learning to Respond with Deep Neural Networks for Retrieval-Based Human-Computer Conversation System. Proceedings of the 39th International ACM SIGIR conference on Research and Development in Information Retrieval (55-64). DOI: 10.1145/2911451.2911542. Online publication date: 7-Jul-2016
      • (2016) Socialized Language Model Smoothing via Bi-directional Influence Propagation on Social Networks. Proceedings of the 25th International Conference on World Wide Web (1395-1406). DOI: 10.1145/2872427.2874811. Online publication date: 11-Apr-2016
      • (2016) Learning cascaded influence under partial monitoring. 2016 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM) (255-262). DOI: 10.1109/ASONAM.2016.7752243. Online publication date: Aug-2016
