research-article

Tackling the Achilles Heel of Social Networks: Influence Propagation based Language Model Smoothing

Authors:

Xiaohua Hu

WWW '15: Proceedings of the 24th International Conference on World Wide Web

Pages 1318 - 1328

https://doi.org/10.1145/2736277.2741673

Published: 18 May 2015

Abstract

Online social networks now enjoy worldwide popularity, having revolutionized the way people discover, share, and distribute information. With millions of registered users and a proliferation of user-generated content, social networks have become "giants" that appear able to support almost any research task. However, the giants have an Achilles Heel: extreme data sparsity. Compared with the massive data in the whole collection, an individual posted document (e.g., a microblog of fewer than 140 characters) seems too sparse to be told apart from others in many research scenarios, even though the documents do in fact differ. In this paper we propose to tackle the Achilles Heel of social networks by smoothing the language model via influence propagation. We formulate a socialized factor graph model that utilizes both the textual correlations between document pairs and the socialized augmentation networks behind the documents, such as user relationships and social interactions. These factors are modeled as attributes of, and dependencies among, documents and their corresponding users. We design an efficient algorithm to learn the proposed factor graph model. Finally, we propagate term counts to smooth documents based on the estimated influence. Experimental results on Twitter and Weibo datasets validate the effectiveness of the proposed model. By leveraging the smoothed language model with social factors, our approach obtains significant improvements over several alternative methods in both intrinsic and extrinsic evaluations, measured in terms of perplexity, nDCG, and MAP.
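To make the final step of the abstract concrete, the following is a minimal sketch of propagating term counts between documents along influence edges and then measuring perplexity. It is not the paper's factor graph model: the influence weight is set by hand rather than learned, and the names `propagate_counts`, `language_model`, `perplexity`, the additive constant `alpha`, and the two toy documents are all assumptions introduced for illustration.

```python
import math
from collections import Counter


def propagate_counts(docs, influence):
    """Add weighted term counts from each source document to its target.

    docs: dict mapping doc id -> Counter of raw term counts.
    influence: dict mapping (source id, target id) -> weight (hand-set here;
    in the paper the influence is estimated by the learned model).
    """
    smoothed = {d: {t: float(c) for t, c in counts.items()}
                for d, counts in docs.items()}
    for (src, dst), weight in influence.items():
        for term, count in docs[src].items():
            smoothed[dst][term] = smoothed[dst].get(term, 0.0) + weight * count
    return smoothed


def language_model(counts, vocab, alpha=0.01):
    """Unigram model with a small additive constant (an assumption) so that
    every vocabulary term has non-zero probability."""
    total = sum(counts.values()) + alpha * len(vocab)
    return {t: (counts.get(t, 0.0) + alpha) / total for t in vocab}


def perplexity(model, tokens):
    """Perplexity of a token sequence under a unigram model."""
    log_prob = sum(math.log(model[t]) for t in tokens)
    return math.exp(-log_prob / len(tokens))


# Two toy microblog posts; d1 is assumed to influence d2 with weight 0.5.
docs = {
    "d1": Counter("obama speech tonight".split()),
    "d2": Counter("watching the speech".split()),
}
influence = {("d1", "d2"): 0.5}
vocab = sorted(set().union(*docs.values()))

smoothed = propagate_counts(docs, influence)
held_out = ["obama", "speech"]
before = perplexity(language_model(docs["d2"], vocab), held_out)
after = perplexity(language_model(smoothed["d2"], vocab), held_out)
print(f"perplexity before: {before:.2f}, after: {after:.2f}")
```

In this toy setting, the term "obama" never appears in `d2`, so the unsmoothed model assigns it almost no probability; after propagation `d2` borrows half a count from `d1`, and the held-out perplexity drops. Perplexity is the intrinsic measure named in the abstract; nDCG and MAP would require a retrieval setup that this sketch does not attempt.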

Index Terms

Tackling the Achilles Heel of Social Networks: Influence Propagation based Language Model Smoothing
1. Applied computing
  1. Law, social and behavioral sciences
2. Information systems
  1. Information retrieval
    1. Document representation


Published In

WWW '15: Proceedings of the 24th International Conference on World Wide Web

May 2015

1460 pages

ISBN:9781450334693

General Chairs:
Aldo Gangemi (National Research Council, Italy & Paris 13 University-CNRS, France)
Stefano Leonardi (Sapienza University of Rome, Italy)
Alessandro Panconesi (Sapienza University of Rome, Italy)

Copyright © 2015 Copyright is held by the International World Wide Web Conference Committee (IW3C2).

Sponsors

IW3C2: International World Wide Web Conference Committee

In-Cooperation

SIGWEB: ACM Special Interest Group on Hypertext, Hypermedia, and Web

Publisher

International World Wide Web Conferences Steering Committee

Republic and Canton of Geneva, Switzerland


Conference

WWW '15: 24th International World Wide Web Conference

May 18 - 22, 2015

Florence, Italy

Sponsor: IW3C2

Acceptance Rates

WWW '15 Paper Acceptance Rate: 131 of 929 submissions, 14%

Overall Acceptance Rate: 1,899 of 8,196 submissions, 23%

Article Metrics

Total Citations: 7
Total Downloads: 213
Downloads (Last 12 months): 7
Downloads (Last 6 weeks): 1

Reflects downloads up to 08 Feb 2025

Cited By

Chen J, Zhou G, Lu J, Wang S, Li S (2023). Trust-Aware Evidence Reasoning and Spatiotemporal Feature Aggregation for Explainable Fake News Detection. Applied Sciences 13:9 (5703). Online publication date: 5-May-2023.
https://doi.org/10.3390/app13095703
Li J, Bao P, Shen H, Li X (2022). MiSTR: A Multiview Structural-Temporal Learning Framework for Rumor Detection. IEEE Transactions on Big Data 8:4 (1007-1019). Online publication date: 1-Aug-2022.
https://doi.org/10.1109/TBDATA.2021.3107481
Yang Y, Wang Y, Wang L, Meng J (2021). PostCom2DR: Utilizing information from post and comments to detect rumors. Expert Systems with Applications (116071). Online publication date: Oct-2021.
https://doi.org/10.1016/j.eswa.2021.116071
Zhang J, Ma J, Tang J, Subrahmanian V, Rokne J, Kumar R, Caverlee J, Tong H (2016). Learning cascaded influence under partial monitoring. Proceedings of the 2016 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (255-262). Online publication date: 18-Aug-2016.
https://dl.acm.org/doi/10.5555/3192424.3192471
Yan R, Song Y, Wu H, Perego R, Sebastiani F, Aslam J, Ruthven I, Zobel J (2016). Learning to Respond with Deep Neural Networks for Retrieval-Based Human-Computer Conversation System. Proceedings of the 39th International ACM SIGIR conference on Research and Development in Information Retrieval (55-64). Online publication date: 7-Jul-2016.
https://dl.acm.org/doi/10.1145/2911451.2911542
Yan R, Li C, Hsieh H, Hu P, Hu X, He T, Bourdeau J, Hendler J, Nkambou R, Horrocks I, Zhao B (2016). Socialized Language Model Smoothing via Bi-directional Influence Propagation on Social Networks. Proceedings of the 25th International Conference on World Wide Web (1395-1406). Online publication date: 11-Apr-2016.
https://dl.acm.org/doi/10.1145/2872427.2874811
Zhang J, Ma J, Tang J (2016). Learning cascaded influence under partial monitoring. 2016 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM) (255-262). Online publication date: Aug-2016.
https://doi.org/10.1109/ASONAM.2016.7752243
