Abstract
Given a corpus of documents, hierarchical topic detection aims to learn a topic hierarchy in which topics are more general at the higher levels and become more specific toward the lower levels. In this paper, we consider the joint problem of hierarchical topic detection and document visualization. We propose a joint neural topic model that not only detects topic hierarchies but also generates a visualization of documents and their topic structure. Viewing the topic hierarchy together with how documents are distributed across it lets us quickly identify documents and topics of interest at the desired granularity. We conduct both quantitative and qualitative experiments on large real-world datasets. The results show that, compared with state-of-the-art baselines, our method produces a better hierarchical visualization of topics and documents while achieving competitive performance in hierarchical topic detection.
Notes
- 1. The source code is available at https://github.com/dangpnh2/htv.
- 2. \(r\) is the Euclidean distance in our experiments.
- 3. In the experiments, we set \(a=b=0.01\).
- 8. Its implementation is at https://github.com/akashgit/autoencoding_vi_for_topic_models.
- 9. We use the implementation at https://github.com/dangpnh2/plsv_vae.
- 10. We use the implementation at https://github.com/blei-lab/hlda.
- 11. We use the implementation at https://github.com/misonuma/tsntm.
Acknowledgments
This research is sponsored by NSF #1757207 and NSF #1914635.
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Pham, D., Le, T.M.V. (2021). Neural Topic Models for Hierarchical Topic Detection and Visualization. In: Oliver, N., Pérez-Cruz, F., Kramer, S., Read, J., Lozano, J.A. (eds) Machine Learning and Knowledge Discovery in Databases. Research Track. ECML PKDD 2021. Lecture Notes in Computer Science, vol 12977. Springer, Cham. https://doi.org/10.1007/978-3-030-86523-8_3
DOI: https://doi.org/10.1007/978-3-030-86523-8_3
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-86522-1
Online ISBN: 978-3-030-86523-8
eBook Packages: Computer Science, Computer Science (R0)