Deep short text classification with knowledge powered attention

Authors:

Haiyun Jiang

AAAI'19/IAAI'19/EAAI'19: Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence and Thirty-First Innovative Applications of Artificial Intelligence Conference and Ninth AAAI Symposium on Educational Advances in Artificial Intelligence

Article No.: 767, Pages 6252 - 6259

https://doi.org/10.1609/aaai.v33i01.33016252

Published: 27 January 2019

Abstract

Short text classification is one of the important tasks in Natural Language Processing (NLP). Unlike paragraphs or documents, short texts are more ambiguous because they lack sufficient contextual information, which poses a great challenge for classification. In this paper, we retrieve knowledge from an external knowledge source to enhance the semantic representation of short texts. We treat conceptual information as a kind of knowledge and incorporate it into deep neural networks. To measure the importance of knowledge, we introduce attention mechanisms and propose deep Short Text Classification with Knowledge powered Attention (STCKA). We use Concept towards Short Text (C-ST) attention and Concept towards Concept Set (C-CS) attention to acquire the weights of concepts from two aspects, and we classify a short text with the help of this conceptual information. Unlike traditional approaches, our model acts like a human being, who has an intrinsic ability to make decisions based on observation (i.e., training data for machines) and pays more attention to important knowledge. We also conduct extensive experiments on four public datasets for different tasks. The experimental results and case studies show that our model outperforms the state-of-the-art methods, justifying the effectiveness of knowledge powered attention.
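The idea of weighting concepts from two aspects can be illustrated with a small sketch. The abstract does not give the scoring functions, so everything below is an assumption made for illustration: dot-product scoring for C-ST attention, scoring against the concept-set mean for C-CS attention, a fixed blending factor `gamma`, and all function names. It is not the authors' implementation.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def c_st_weights(concepts, text_vec):
    """Concept towards Short Text: weight each concept by its relevance to the text.
    ASSUMPTION: plain dot-product scoring, chosen only for illustration."""
    return softmax([dot(c, text_vec) for c in concepts])

def c_cs_weights(concepts):
    """Concept towards Concept Set: weight each concept by its relevance to the whole set.
    ASSUMPTION: score against the mean concept vector, chosen only for illustration."""
    n, dim = len(concepts), len(concepts[0])
    mean = [sum(c[k] for c in concepts) / n for k in range(dim)]
    return softmax([dot(c, mean) for c in concepts])

def combine(alpha, beta, gamma=0.5):
    """Blend the two weight vectors and renormalize.
    ASSUMPTION: gamma is a fixed hyperparameter here."""
    return softmax([gamma * a + (1 - gamma) * b for a, b in zip(alpha, beta)])

def concept_representation(concepts, weights):
    """Weighted sum of concept vectors using the final attention weights."""
    dim = len(concepts[0])
    return [sum(w * c[k] for w, c in zip(weights, concepts)) for k in range(dim)]

# Toy data: a 2-d short-text vector and three hypothetical concept vectors.
text_vec = [1.0, 0.0]
concepts = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]

alpha = c_st_weights(concepts, text_vec)   # relevance to the short text
beta = c_cs_weights(concepts)              # relevance to the concept set
weights = combine(alpha, beta)             # final concept weights
rep = concept_representation(concepts, weights)
```

In this toy setup the first concept points in the same direction as the text vector, so it receives the largest final weight. A classifier would then consume `rep` together with the short-text representation.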

Index Terms

Deep short text classification with knowledge powered attention
1. Computing methodologies
  1. Artificial intelligence
    1. Natural language processing
  2. Machine learning
    1. Learning paradigms
      1. Supervised learning
        Supervised learning by classification
    2. Machine learning approaches
      1. Neural networks
2. Information systems

Index terms have been assigned to the content through auto-classification.

Information & Contributors

Information

Published In

AAAI'19/IAAI'19/EAAI'19: Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence and Thirty-First Innovative Applications of Artificial Intelligence Conference and Ninth AAAI Symposium on Educational Advances in Artificial Intelligence

January 2019

10088 pages

ISBN:978-1-57735-809-1

Copyright © 2019 Association for the Advancement of Artificial Intelligence.

Sponsors

Association for the Advancement of Artificial Intelligence

Publisher

AAAI Press

Publication History

Published: 27 January 2019

Qualifiers

Research-article
Research
Refereed limited

Bibliometrics & Citations

Bibliometrics

Total citations: 14
Total downloads: 114
Downloads (last 12 months): 50
Downloads (last 6 weeks): 8

Reflects downloads up to 25 Feb 2025
