Abstract
Extreme multi-label text classification (XMC) is an important yet challenging problem in the NLP community: assigning to each document its most relevant subset of class labels from an extremely large label collection. For example, the input text could be a story document on chinastory.cn, and the labels could be story categories that indicate its potential meaning. However, naively applying standard neural network models to XMC leads to sub-optimal performance because of the large output space and label sparsity. In this paper, we present the first attempt at applying reinforcement learning to XMC. Experimental results on public datasets and on our own engineering datasets demonstrate that our approach achieves the expected performance when evaluated against state-of-the-art methods.
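To make the output-space and label-sparsity issue concrete, here is a minimal Python sketch. It is an illustration only and is not taken from the paper: the document ids, label ids, the label-collection size `NUM_LABELS`, and the helper names `label_density` and `multi_hot` are all hypothetical. It shows how few entries of a dense multi-hot target are positive when each document carries only a handful of labels.

```python
# Illustrative sketch only: the toy documents, label ids, and NUM_LABELS
# below are made-up assumptions, not data or code from the paper.
from typing import Dict, List, Set


def label_density(doc_labels: Dict[str, Set[int]], num_labels: int) -> float:
    """Fraction of all (document, label) pairs that are positive."""
    positives = sum(len(labels) for labels in doc_labels.values())
    return positives / (len(doc_labels) * num_labels)


def multi_hot(labels: Set[int], num_labels: int) -> List[int]:
    """Dense 0/1 target vector that a plain classifier head would predict."""
    return [1 if i in labels else 0 for i in range(num_labels)]


NUM_LABELS = 100_000  # hypothetical size of the label collection

# Each document is assigned a small subset of the label collection.
doc_labels = {
    "doc_a": {17, 4_211, 93_500},
    "doc_b": {17},
    "doc_c": {58_002, 58_003},
}

density = label_density(doc_labels, NUM_LABELS)
target = multi_hot(doc_labels["doc_a"], NUM_LABELS)

print(f"label density: {density:.6f}")
print(f"positive targets for doc_a: {sum(target)} of {len(target)}")
```

With targets this sparse, storing each document's labels as a set (or a sparse matrix row) is usually far cheaper than materialising the dense vector, which is one reason naive dense output layers scale poorly in this setting.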
H. Teng and Y. Li contributed equally.
References
Liu, J., Chang, W.C., Wu, Y., Yang, Y.: Deep learning for extreme multi-label text classification. In: International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 115–124 (2017)
Kim, Y.: Convolutional neural networks for sentence classification. arXiv preprint arXiv:1408.5882 (2014)
Johnson, R., Zhang, T.: Effective use of word order for text categorization with convolutional neural networks. arXiv preprint arXiv:1412.1058 (2014)
Bhatia, K., Jain, H., Kar, P., Varma, M., Jain, P.: Sparse local embeddings for extreme multi-label classification. In: Advances in Neural Information Processing Systems, pp. 730–738 (2015)
Choo, J., Lee, C., Reddy, C.K., Park, H.: UTOPIAN: user-driven topic modeling based on interactive nonnegative matrix factorization. IEEE Trans. Vis. Comput. Graph. 19(12), 1992–2001 (2013)
Teng, H., Liu, H., Yu, L., Sun, F.: Representative video action discovery using interactive non-negative matrix factorization. In: Hu, X., Xia, Y., Zhang, Y., Zhao, D. (eds.) ISNN 2015. LNCS, vol. 9377, pp. 205–212. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-25393-0_23
Joulin, A., Grave, E., Bojanowski, P., Mikolov, T.: Bag of tricks for efficient text classification. arXiv preprint arXiv:1607.01759 (2016)
Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013)
Prabhu, Y., Varma, M.: FastXML: a fast, accurate and stable tree-classifier for extreme multi-label learning. In: Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 263–272 (2014)
Zhang, T., Huang, M., Zhao, L.: Learning structured representation for text classification via reinforcement learning. In: Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence (2018)
Lewis, D.D., Yang, Y., Rose, T.G., Li, F.: RCV1: a new benchmark collection for text categorization research. J. Mach. Learn. Res. 5, 361–397 (2004)
Chinastory. https://www.chinastory.cn/english/index.html
Copyright information
© 2021 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Teng, H., Li, Y., Long, F., Xu, M., Ling, Q. (2021). Reinforcement Learning for Extreme Multi-label Text Classification. In: Sun, F., Liu, H., Fang, B. (eds) Cognitive Systems and Signal Processing. ICCSIP 2020. Communications in Computer and Information Science, vol 1397. Springer, Singapore. https://doi.org/10.1007/978-981-16-2336-3_22
DOI: https://doi.org/10.1007/978-981-16-2336-3_22
Publisher Name: Springer, Singapore
Print ISBN: 978-981-16-2335-6
Online ISBN: 978-981-16-2336-3
eBook Packages: Computer Science, Computer Science (R0)