LSTM\(^{2}\): Multi-Label Ranking for Document Classification

Neural Processing Letters

Abstract

Multi-label document classification is a common challenge in many real-world applications. Multi-label ranking is a widely used approach, but existing studies usually disregard the effects of context and the relationships among labels during the scoring process. In this paper, we propose a Long Short-Term Memory (LSTM)-based multi-label ranking model for document classification, named LSTM\(^2\), which consists of repLSTM, an adaptive data representation process, and rankLSTM, a unified learning-ranking process. In repLSTM, a supervised LSTM learns the document representation by incorporating the document labels. In rankLSTM, the order of each document's labels is rearranged in accordance with a semantic tree, so that the label semantics are compatible with the sequential learning of the LSTM. The whole model can be trained by sequentially predicting labels. Connectionist Temporal Classification (CTC) is performed in rankLSTM to address error propagation for a variable number of labels in each document. Experiments on document classification conducted on three typical datasets demonstrate the strong performance of the proposed approach.
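
As a rough illustration of two ideas named in the abstract, the sketch below orders a document's label set along a label tree and applies the standard Connectionist Temporal Classification (CTC) collapse rule (merge consecutive repeats, then drop blanks) to a sequence of per-step predictions. The abstract does not specify how the semantic tree is built or traversed, so the toy tree, the pre-order traversal, the `<blank>` symbol, and the function names `preorder`, `order_labels`, and `ctc_collapse` are all assumptions made for this example and should not be read as the authors' implementation.

```python
from itertools import groupby

BLANK = "<blank>"  # hypothetical name for the CTC blank symbol


def preorder(tree, root):
    """Return the labels of a tree (dict: parent -> list of children) in pre-order."""
    order, stack = [], [root]
    while stack:
        node = stack.pop()
        order.append(node)
        # Push children reversed so the leftmost child is visited first.
        stack.extend(reversed(tree.get(node, [])))
    return order


def order_labels(doc_labels, tree, root):
    """Turn a document's label set into a sequence that follows the tree's pre-order.

    Pre-order is an assumption; the source only says the order follows a semantic tree.
    """
    rank = {label: i for i, label in enumerate(preorder(tree, root))}
    return sorted(doc_labels, key=rank.__getitem__)


def ctc_collapse(step_predictions):
    """Standard CTC collapse: merge consecutive repeats, then remove blanks."""
    merged = [label for label, _ in groupby(step_predictions)]
    return [label for label in merged if label != BLANK]


# Toy label tree, invented for illustration only.
toy_tree = {
    "root": ["science", "sports"],
    "science": ["biology", "physics"],
    "sports": ["tennis"],
}

# Target label sequence for a document tagged {tennis, physics, science}.
target = order_labels({"tennis", "physics", "science"}, toy_tree, "root")

# Per-step predictions longer than the label sequence, as CTC allows.
decoded = ctc_collapse(
    ["science", "science", BLANK, "physics", BLANK, BLANK, "tennis", "tennis"]
)
```

Because the collapse step absorbs repeats and blanks, the number of prediction steps does not have to match the number of labels, which is one way a variable number of labels per document can be handled.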

Notes

  1. http://www.bioasq.org/participate/challenges.

  2. http://www.d.umn.edu/~tpederse/enron.html.

  3. http://www.daviddlewis.com/resources/testcollections/rcv1/.

  4. https://www.csie.ntu.edu.tw/~cjlin/libsvm/.

  5. http://lamda.nju.edu.cn/code_MLkNN.ashx.

  6. http://www.vlfeat.org/matconvnet/.

  7. http://www.fit.vutbr.cz/~imikolov/rnnlm/.

  8. http://deeplearning.net/tutorial/lstm.html.

Author information

Corresponding author

Correspondence to Yan Yan.

About this article

Cite this article

Yan, Y., Wang, Y., Gao, WC. et al. LSTM\(^{2}\): Multi-Label Ranking for Document Classification. Neural Process Lett 47, 117–138 (2018). https://doi.org/10.1007/s11063-017-9636-0

  • DOI: https://doi.org/10.1007/s11063-017-9636-0
