Abstract
Multi-label document classification is an important challenge with many real-world applications. Multi-label ranking is a common approach to multi-label classification. However, existing works usually suffer from incomplete, context-free representations and from non-automatic, part-based model implementations. To address these problems, we propose an LSTM\(^2\) (long short-term memory) model for document classification. The model consists of two steps. The first is the repLSTM process, a supervised LSTM that introduces the document labels in order to learn the document representation. The second is the rankLSTM process, in which the order of each document's labels is rearranged in accordance with a semantic tree, which better exploits the strengths of the LSTM on sequences. In addition, because labels are predicted serially, the model can be trained as a whole. Connectionist Temporal Classification (CTC) is also used in this process to deal with error propagation for variable-length output (the number of labels in each document). Experiments on three generalization datasets achieve good results.
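As a rough illustration of the two ideas named for the rankLSTM step, the sketch below orders a document's label set by its position in a label tree and applies the standard CTC collapsing rule (merge consecutive repeats, then drop blanks) to recover a variable-length label sequence. The tree contents, the pre-order traversal, the blank symbol and all function names are assumptions made for this example; the abstract does not specify how the semantic tree is built or traversed.

```python
from typing import Dict, List, Optional, Sequence, Set

# Hypothetical label hierarchy (parent -> children); not taken from the paper.
TREE: Dict[str, List[str]] = {
    "root": ["science", "sports"],
    "science": ["physics", "biology"],
    "sports": ["football"],
}

BLANK = "_"  # assumed symbol for the CTC blank


def preorder(tree: Dict[str, List[str]], node: str = "root") -> List[str]:
    """Return every descendant of `node` in depth-first pre-order."""
    order: List[str] = []
    for child in tree.get(node, []):
        order.append(child)
        order.extend(preorder(tree, child))
    return order


def order_labels(labels: Set[str], tree: Dict[str, List[str]]) -> List[str]:
    """Turn an unordered label set into a sequence that follows the tree order."""
    rank = {label: i for i, label in enumerate(preorder(tree))}
    return sorted(labels, key=rank.__getitem__)


def ctc_collapse(frames: Sequence[str]) -> List[str]:
    """Standard CTC collapsing: merge consecutive repeats, then drop blanks."""
    out: List[str] = []
    prev: Optional[str] = None
    for sym in frames:
        if sym != prev and sym != BLANK:
            out.append(sym)
        prev = sym
    return out


if __name__ == "__main__":
    target = order_labels({"football", "physics", "science"}, TREE)
    print(target)
    frames = ["science", "science", BLANK, "physics", BLANK, "football", "football"]
    print(ctc_collapse(frames))
```

Both calls in the usage section yield the same three-label sequence, which shows how a fixed number of output steps can map to a label list whose length varies per document.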
Copyright information
© 2016 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Yan, Y., Yin, XC., Yang, C., Zhang, BW., Hao, HW. (2016). Multi-label Ranking with LSTM\(^2\) for Document Classification. In: Tan, T., Li, X., Chen, X., Zhou, J., Yang, J., Cheng, H. (eds) Pattern Recognition. CCPR 2016. Communications in Computer and Information Science, vol 663. Springer, Singapore. https://doi.org/10.1007/978-981-10-3005-5_29
DOI: https://doi.org/10.1007/978-981-10-3005-5_29
Publisher Name: Springer, Singapore
Print ISBN: 978-981-10-3004-8
Online ISBN: 978-981-10-3005-5
eBook Packages: Computer Science (R0)