DOI: 10.1145/3289600.3290957

Sparsemax and Relaxed Wasserstein for Topic Sparsity

Published: 30 January 2019

Abstract

Topic sparsity refers to the observation that individual documents usually focus on a few salient topics rather than covering a wide variety of topics, and that a real topic adopts a narrow range of terms rather than a wide coverage of the vocabulary. Understanding topic sparsity is especially important for analyzing user-generated web content and social media, which typically take the form of extremely short posts and discussions. As the topic sparsity of individual documents in online social media increases, so does the difficulty of analyzing these text sources with traditional methods. In this paper, we propose two novel neural models that provide sparse posterior distributions over topics based on the Gaussian sparsemax construction, enabling efficient training by stochastic backpropagation. We construct an inference network conditioned on the input data and infer the variational distribution with the relaxed Wasserstein (RW) divergence. Unlike existing works based on the Gaussian softmax construction and the Kullback-Leibler (KL) divergence, our approaches identify latent topic sparsity while maintaining training stability, predictive performance, and topic coherence. Experiments on large text corpora of different genres demonstrate the effectiveness of our models, which outperform both probabilistic and neural methods.
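
To make the abstract's construction concrete, the following is a minimal NumPy sketch of the sparsemax transformation (Martins and Astudillo, 2016), which the Gaussian sparsemax construction applies to a reparameterized Gaussian sample. The inference-network outputs (mu, log_sigma) in the usage lines are illustrative assumptions, not the paper's exact architecture.

```python
import numpy as np

def sparsemax(z):
    # Euclidean projection of a score vector z onto the probability
    # simplex (Martins & Astudillo, 2016). Unlike softmax, it can
    # return exact zeros, yielding sparse topic proportions.
    z = np.asarray(z, dtype=float)
    z_sorted = np.sort(z)[::-1]                  # scores in descending order
    cumsum = np.cumsum(z_sorted)
    k = np.arange(1, z.size + 1)
    support = 1 + k * z_sorted > cumsum          # coordinates kept in the support
    k_z = k[support][-1]                         # support size
    tau = (cumsum[support][-1] - 1.0) / k_z      # threshold shared by the support
    return np.maximum(z - tau, 0.0)

# Illustrative Gaussian sparsemax draw (hypothetical parameterization):
# sample via the reparameterization trick, then project onto the simplex.
rng = np.random.default_rng(0)
mu, log_sigma = np.array([0.1, 1.1, 0.2]), np.zeros(3)   # toy inference-net outputs
eps = rng.standard_normal(3)
theta = sparsemax(mu + np.exp(log_sigma) * eps)          # sparse, sums to 1
```

Because sparsemax is piecewise linear, gradients flow through the nonzero support, which is what makes stochastic backpropagation through the sampled topic proportions feasible.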
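The relaxed Wasserstein (RW) divergence used for the variational objective (Guo et al., 2017) generalizes the Wasserstein distance by replacing the quadratic transport cost with a Bregman cost; the paper's exact loss is not reproduced here. For orientation only, the quantity being relaxed has a closed form between diagonal Gaussians (the Fréchet distance), sketched below under the assumption of diagonal covariances.

```python
import numpy as np

def w2_diag_gaussians(mu1, sigma1, mu2, sigma2):
    # Closed-form 2-Wasserstein (Frechet) distance between
    # N(mu1, diag(sigma1**2)) and N(mu2, diag(sigma2**2)):
    # W2^2 = ||mu1 - mu2||^2 + ||sigma1 - sigma2||^2.
    mu1, sigma1, mu2, sigma2 = (np.asarray(a, dtype=float)
                                for a in (mu1, sigma1, mu2, sigma2))
    return np.sqrt(np.sum((mu1 - mu2) ** 2) + np.sum((sigma1 - sigma2) ** 2))

# e.g. distance between an approximate posterior and a standard normal prior
print(w2_diag_gaussians([0.5, -0.2], [1.2, 0.8], [0.0, 0.0], [1.0, 1.0]))
```

Unlike the KL divergence, this distance stays finite and well behaved when the two distributions have disjoint support, which is the training-stability argument sketched in the abstract.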

Published In

WSDM '19: Proceedings of the Twelfth ACM International Conference on Web Search and Data Mining
January 2019
874 pages
ISBN: 9781450359405
DOI: 10.1145/3289600


Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. neural topic modeling
  2. relaxed Wasserstein divergence
  3. sparsemax
  4. stochastic gradient backpropagation
  5. topic sparsity

Qualifiers

  • Research-article

Conference

WSDM '19

Acceptance Rates

WSDM '19 paper acceptance rate: 84 of 511 submissions, 16%.
Overall acceptance rate: 498 of 2,863 submissions, 17%.

Article Metrics

  • Downloads (last 12 months): 19
  • Downloads (last 6 weeks): 2
Reflects downloads up to 10 Nov 2024

Cited By

  • (2024) The Generative Generic-Field Design Method Based on Design Cognition and Knowledge Reasoning. Sustainability 16(22), 9841. DOI: 10.3390/su16229841. Online publication date: 12-Nov-2024.
  • (2024) Dynamic topic language model on heterogeneous children's mental health clinical notes. The Annals of Applied Statistics 18(4). DOI: 10.1214/24-AOAS1930. Online publication date: 1-Dec-2024.
  • (2024) Augmented projection Wasserstein distances: Multi-dimensional projection with neural surface. Journal of Statistical Planning and Inference 233, 106185. DOI: 10.1016/j.jspi.2024.106185. Online publication date: Dec-2024.
  • (2024) A survey on neural topic models: methods, applications, and challenges. Artificial Intelligence Review 57(2). DOI: 10.1007/s10462-023-10661-7. Online publication date: 25-Jan-2024.
  • (2024) Lifelong Hierarchical Topic Modeling via Non-negative Matrix Factorization. Web and Big Data, 155-170. DOI: 10.1007/978-981-97-2421-5_11. Online publication date: 12-May-2024.
  • (2023) A survey of topic models: From a whole-cycle perspective. Journal of Intelligent & Fuzzy Systems, 1-25. DOI: 10.3233/JIFS-233551. Online publication date: 8-Sep-2023.
  • (2023) DATM: A Novel Data Agnostic Topic Modeling Technique With Improved Effectiveness for Both Short and Long Text. IEEE Access 11, 32826-32841. DOI: 10.1109/ACCESS.2023.3262653. Online publication date: 2023.
  • (2023) Encouraging Sparsity in Neural Topic Modeling with Non-Mean-Field Inference. Machine Learning and Knowledge Discovery in Databases: Research Track, 142-158. DOI: 10.1007/978-3-031-43421-1_9. Online publication date: 18-Sep-2023.
  • (2022) The short texts classification based on neural network topic model. Journal of Intelligent & Fuzzy Systems: Applications in Engineering and Technology 42(3), 2143-2155. DOI: 10.3233/JIFS-211471. Online publication date: 1-Jan-2022.
  • (2022) Neural variational sparse topic model for sparse explainable text representation. Information Processing and Management 58(5). DOI: 10.1016/j.ipm.2021.102614. Online publication date: 22-Apr-2022.
