DOI: 10.1145/3340531.3412024
research-article

Fusing Parallel Social Contexts within Flexible-Order Proximity for Microblog Topic Detection

Published: 19 October 2020

Abstract

Topic detection in social media is challenging because messages are large in scale, short, noisy, and informal. Most existing methods consider only textual content, or jointly model posts and the first-order structural characteristics of social networks, and thus ignore the impact of larger neighborhoods in microblog conversations on topics. Moreover, simply combining separately learned content and structure representations fails to capture their nonlinear correlation and their different importance in topic inference. To address these issues, we propose a novel random-walk-based Parallel Social Contexts Fusion Topic Model (PCFTM) for Weibo conversations. First, a user-level conversation network with content information is built from the reposting and commenting relationships among users. Through random walks of different lengths on this network, we obtain user sequences containing parallel content and structure contexts, which are used to capture the flexible-order proximity of users. We then propose a self-fusion network embedding to capture the nonlinear correlation between the parallel social contexts: the content embedding sequence processed by a CNN serves as the initial value of the structure embedding sequence fed to a Bi-LSTM. A user-level self-attention is further used to mine the different importance of users to topics. Finally, the user sequence embedding is incorporated into neural variational inference for topic detection, which adaptively balances the intrinsic complementarity between content and structure and makes full use of both local and global social contexts in topic inference. Extensive experiments on three real-world Weibo datasets demonstrate the effectiveness of the proposed model.
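
The random-walk step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy interaction list, the choice of an undirected graph, uniform neighbor sampling, and the names `build_user_graph`, `random_walk`, and `sample_walks` are all assumptions made for illustration. It only shows how walks of several lengths over a repost/comment network yield user sequences that reach beyond first-order neighbors.

```python
import random


def build_user_graph(interactions):
    """Build an undirected adjacency map from (user_a, user_b) pairs.

    Each pair stands for a repost or comment relationship (assumed undirected here).
    """
    graph = {}
    for a, b in interactions:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    # Sorted neighbor lists keep sampling reproducible for a fixed seed.
    return {user: sorted(neighbors) for user, neighbors in graph.items()}


def random_walk(graph, start, length, rng):
    """Return a uniform random walk containing at most `length` users."""
    walk = [start]
    while len(walk) < length:
        neighbors = graph[walk[-1]]
        if not neighbors:
            break
        walk.append(rng.choice(neighbors))
    return walk


def sample_walks(graph, lengths, walks_per_node, seed=0):
    """Sample walks of every requested length starting from every user."""
    rng = random.Random(seed)
    walks = []
    for length in lengths:
        for _ in range(walks_per_node):
            for user in sorted(graph):
                walks.append(random_walk(graph, user, length, rng))
    return walks


# Hypothetical toy data: four users connected by reposts or comments.
interactions = [("u1", "u2"), ("u2", "u3"), ("u3", "u4"), ("u1", "u4")]
graph = build_user_graph(interactions)
walks = sample_walks(graph, lengths=[2, 4], walks_per_node=1)
for walk in walks:
    print(" -> ".join(walk))
```

Short walks stay close to the starting user, while longer walks reach more distant users; mixing several lengths is one simple way to expose both kinds of context to a downstream sequence model.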

Supplementary Material

MP4 File (3340531.3412024.mp4)
In this paper, we propose a novel random-walk-based Parallel Social Contexts Fusion Topic Model (PCFTM) for microblog conversations.
The merits of PCFTM come from three aspects:
First, we consider users' flexible, high-order proximity for topic detection in social media, which overcomes the restriction of previous methods that infer topics only from social contexts within a fixed, low-order proximity.
Second, we fuse the parallel social contexts into the user sequence embedding rather than directly combining separately learned content and structure representations, which effectively captures their nonlinear correlation.
Third, we feed the user sequence embedding into neural variational inference to adaptively balance the intrinsic complementarity between content and structure, and their different importance, in order to generate more coherent topics.
Extensive experiments on three real-world Weibo datasets demonstrate that the proposed PCFTM is effective.
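
The abstract attributes the "different importance" of users to a user-level self-attention. As a rough illustration only, the sketch below applies plain scaled dot-product self-attention to a short sequence of user embeddings and mean-pools the result into one sequence embedding. The identity query/key/value projections, the mean pooling, the toy vectors, and the names `softmax`, `self_attention`, and `pool_sequence` are assumptions; the paper's exact formulation is not given on this page.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(embeddings):
    """Scaled dot-product self-attention with identity projections (Q = K = V)."""
    dim = len(embeddings[0])
    scale = math.sqrt(dim)
    weights = []
    for query in embeddings:
        scores = [
            sum(q * k for q, k in zip(query, key)) / scale for key in embeddings
        ]
        weights.append(softmax(scores))
    # Each output row is the attention-weighted sum of all user embeddings.
    outputs = [
        [sum(w * value[d] for w, value in zip(row, embeddings)) for d in range(dim)]
        for row in weights
    ]
    return outputs, weights


def pool_sequence(outputs):
    """Mean-pool the attended user embeddings into one sequence embedding."""
    count = len(outputs)
    return [sum(column) / count for column in zip(*outputs)]


# Hypothetical 2-dimensional embeddings for three users in one walk.
user_embeddings = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, weights = self_attention(user_embeddings)
sequence_embedding = pool_sequence(outputs)
print(weights[0])
print(sequence_embedding)
```

In a trained model the projections would be learned, so the weights would reflect how much each user matters for the topic rather than raw embedding similarity.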



Published In

CIKM '20: Proceedings of the 29th ACM International Conference on Information & Knowledge Management
October 2020
3619 pages
ISBN:9781450368599
DOI:10.1145/3340531
Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. flexible-order proximity
  2. microblog conversations
  3. parallel social contexts fusion topic model

Qualifiers

  • Research-article

Funding Sources

  • the National Natural Science Foundation of China
  • the Tianjin Natural Science Foundation
  • the National Key R&D Program of China

Conference

CIKM '20

Acceptance Rates

Overall Acceptance Rate 1,861 of 8,427 submissions, 22%

Cited By

  • (2023) On the modeling of cyber-attacks associated with social engineering. Journal of Information Security and Applications, 75:C. DOI: 10.1016/j.jisa.2023.103501. Online publication date: 26-Jul-2023
  • (2022) Topic Model on Microblog with Dual-Streams Graph Convolution Networks. 2022 International Joint Conference on Neural Networks (IJCNN), 1-8. DOI: 10.1109/IJCNN55064.2022.9892645. Online publication date: 18-Jul-2022
  • (2021) Post2Story. Proceedings of the 29th ACM International Conference on Multimedia, 2786-2788. DOI: 10.1145/3474085.3478559. Online publication date: 17-Oct-2021
  • (2021) Socializing in Interference: Quantum-Inspired Topic Model with Crystal-Like Structure Grid for Microblog Topic Detection. Neural Information Processing, 132-140. DOI: 10.1007/978-3-030-92307-5_16. Online publication date: 2-Dec-2021
