Neural Topic Models for Hierarchical Topic Detection and Visualization

Pham, Dang; Le, Tuan M. V.

doi:10.1007/978-3-030-86523-8_3

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 12977)

Included in the following conference series:

Joint European Conference on Machine Learning and Knowledge Discovery in Databases

Abstract

Given a corpus of documents, hierarchical topic detection aims to learn a topic hierarchy in which topics are more general at the higher levels and become more specific toward the lower levels. In this paper, we consider the joint problem of hierarchical topic detection and document visualization. We propose a joint neural topic model that not only detects topic hierarchies but also generates a visualization of documents and their topic structure. Being able to view the topic hierarchy and see how documents are distributed across it lets us quickly identify documents and topics of interest at the desired granularity. We conduct both quantitative and qualitative experiments on large real-world datasets. The results show that our method produces a better hierarchical visualization of topics and documents while achieving competitive performance in hierarchical topic detection, compared with state-of-the-art baselines.

Notes

1. The source code is available at https://github.com/dangpnh2/htv.
2. r is Euclidean distance in our experiments.
3. In the experiments, we set $a=b=0.01$.
4. https://mlg.ucd.ie/datasets/bbc.html.
5. https://ana.cachopo.org/datasets-for-single-label-text-categorization.
6. https://scikit-learn.org/0.19/datasets/twenty_newsgroups.html.
7. https://data.mendeley.com/datasets/9rw3vkcfy4/6.
8. Its implementation is at https://github.com/akashgit/autoencoding_vi_for_topic_models.
9. We use the implementation at https://github.com/dangpnh2/plsv_vae.
10. We use the implementation at https://github.com/blei-lab/hlda.
11. We use the implementation at https://github.com/misonuma/tsntm.
12. https://github.com/DmitryUlyanov/Multicore-TSNE.
13. https://nlp.cs.nyu.edu/wikipedia-data/.

Acknowledgments

This research is sponsored by NSF #1757207 and NSF #1914635.

Author information

Authors and Affiliations

Department of Computer Science, New Mexico State University, Las Cruces, USA
Dang Pham & Tuan M. V. Le

Corresponding authors

Correspondence to Dang Pham or Tuan M. V. Le.

Editor information

Editors and Affiliations

ELLIS - The European Laboratory for Learning and Intelligent Systems, Alicante, Spain
Nuria Oliver
ETHZ and EPFL, Zürich, Switzerland
Fernando Pérez-Cruz
Johannes Gutenberg University of Mainz, Mainz, Germany
Stefan Kramer
École Polytechnique, Palaiseau, France
Jesse Read
Basque Center for Applied Mathematics, Bilbao, Spain
Jose A. Lozano

Cite this paper

Pham, D., Le, T.M.V. (2021). Neural Topic Models for Hierarchical Topic Detection and Visualization. In: Oliver, N., Pérez-Cruz, F., Kramer, S., Read, J., Lozano, J.A. (eds) Machine Learning and Knowledge Discovery in Databases. Research Track. ECML PKDD 2021. Lecture Notes in Computer Science, vol 12977. Springer, Cham. https://doi.org/10.1007/978-3-030-86523-8_3

DOI: https://doi.org/10.1007/978-3-030-86523-8_3
Published: 11 September 2021
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-86522-1
Online ISBN: 978-3-030-86523-8
eBook Packages: Computer Science, Computer Science (R0)
