Abstract
Graph convolutional network (GCN)-based text classification methods have achieved impressive results by modeling the structural relationships between words and texts. However, existing GCN-based methods tend to ignore the semantic representation of nodes and the global structural information among them. In addition, they represent a text using only word-granularity information from within the text itself, i.e., an endogenous source. Existing graph convolutional network approaches also face two major challenges on large, dense graphs: neighbor explosion and noisy inputs. To address these shortcomings, this paper proposes an inductive text classification method that combines representation learning on heterogeneous information networks with exogenous knowledge. First, a weighted heterogeneous information network for text (HINT) is constructed by introducing exogenous knowledge, with node types covering texts, entities, and words. Unstructured text is thereby represented as a structured heterogeneous information network, which broadens the granularity of text features and exploits exogenous structural information and explicit semantic information to make the text representation more interpretable. We further harden the graph neural network against the neighbor explosion and noisy inputs arising from HINT using two strategies, graph sampling and DropEdge, yielding improved semi-supervised classification performance. Experiments on four publicly available text classification datasets show that our approach achieves state-of-the-art performance.
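The following is a minimal sketch of the two ideas the abstract combines, under stated assumptions: documents are linked to their words and to entities supplied by an external knowledge source (forming a small HINT-style heterogeneous graph), and a DropEdge-style step randomly removes a fraction of edges before each training epoch. The helper names (build_hint_edges, drop_edge), the uniform edge weights, and the toy data are all illustrative, not the authors' implementation; the actual model uses weighted edges and trains a GCN on top of this graph.

```python
# Illustrative sketch only (not the authors' code): a toy text-entity-word
# heterogeneous graph plus DropEdge-style edge removal before each epoch.
import numpy as np

def build_hint_edges(docs, entities_per_doc):
    """Return an (E, 2) array of edges linking each doc to its words/entities.

    Node ids: documents occupy 0..len(docs)-1; unique words and entities get
    the following ids. Real HINT edges would carry weights (e.g., TF-IDF for
    doc-word edges); here every edge implicitly has weight 1 for brevity.
    """
    word_ids, entity_ids, edges = {}, {}, []
    next_id = len(docs)
    for d, (doc, ents) in enumerate(zip(docs, entities_per_doc)):
        for tok in doc.split():
            if tok not in word_ids:
                word_ids[tok] = next_id
                next_id += 1
            edges.append((d, word_ids[tok]))
        for ent in ents:
            if ent not in entity_ids:
                entity_ids[ent] = next_id
                next_id += 1
            edges.append((d, entity_ids[ent]))
    return np.array(edges, dtype=np.int64)

def drop_edge(edges, p=0.2, rng=None):
    """Keep a random (1 - p) fraction of edges; resample every epoch."""
    rng = rng if rng is not None else np.random.default_rng(0)
    keep = rng.random(len(edges)) >= p
    return edges[keep]

docs = ["graph networks classify text", "entity linking helps text models"]
ents = [["Graph_network"], ["Entity_linking"]]  # hypothetical linked entities
edges = build_hint_edges(docs, ents)
print(len(edges), "edges before DropEdge;", len(drop_edge(edges)), "kept after")
```

Resampling the retained edge set each epoch is what distinguishes DropEdge from one-off graph pruning: the model sees a different sparse view of the graph every pass, which regularizes training and limits the cost of neighbor explosion.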
Data Availability
The R8 and R52 data that support the findings of this study are available from the Reuters-21578 corpus, http://www.daviddlewis.co. The Ohsumed data that support the findings of this study are available from the Text Categorization corpora, http://disi.unitn.it/moschitti/corpora.htm. The TREC data that support the findings of this study are available from the TREC corpus, https://trec.nist.gov/data.html. The 20NG data that support the findings of this study are available from the 20 Newsgroups collection, https://trec.nist.gov/data.html. The MR data that support the findings of this study are available from the movie review corpus, https://www.cs.cornell.edu/people/pabo/movie-review-data/.
Funding
This research was partially funded by the Innovation Program of the Chinese Academy of Agricultural Sciences (Grant No. CAAS-ASTIP-2021-AII-06), the Central Public-interest Scientific Institution Basal Research Fund (Grant No. key laboratory open subject of Agricultural Information Research Institute 22), the Key Laboratory of Agricultural Big Data, Ministry of Agriculture and Rural Affairs, Beijing, China, and the Digital Agriculture Technology System Beijing Innovation Team (Grant No. BAIC10-2022-E10).
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Li, H., Yan, Y., Wang, S. et al. Text classification on heterogeneous information network via enhanced GCN and knowledge. Neural Comput & Applic 35, 14911–14927 (2023). https://doi.org/10.1007/s00521-023-08494-0