DOI: 10.1007/978-3-031-28244-7_11
Article

An Interpretable Knowledge Representation Framework for Natural Language Processing with Cross-Domain Application

Published: 02 April 2023

Abstract

Data representation plays a crucial role in natural language processing (NLP), forming the foundation for most NLP tasks. Indeed, NLP performance depends heavily on the effectiveness of the preprocessing pipeline that builds the data representation. Many representation learning frameworks, such as Word2Vec, encode input data based on local contextual information that interconnects words. Such approaches can be computationally intensive, and their encodings are hard to explain. Here we propose an interpretable representation learning framework built on the Tsetlin Machine (TM), an interpretable logic-based algorithm that has exhibited competitive performance in numerous NLP tasks. We employ the TM clauses to build a sparse propositional (Boolean) representation of natural language text. Each clause is a class-specific propositional rule that links words semantically and contextually. Through visualization, we illustrate how the resulting data representation provides semantically more distinct features, better separating the underlying classes. As a result, the downstream classification task becomes less demanding, benefiting simple machine learning classifiers such as the Support Vector Machine (SVM). We evaluate our approach on six NLP classification tasks and twelve domain adaptation tasks. Our main finding is that the proposed technique significantly outperforms the vanilla TM in accuracy, approaching that of deep neural network (DNN) baselines. Furthermore, we present a case study showing how the representations derived from our framework are interpretable. (We use an asynchronous, parallel version of the Tsetlin Machine, available at https://github.com/cair/PyTsetlinMachineCUDA.)
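To make the clause-based representation concrete, the sketch below is a minimal illustration of the pipeline the abstract describes, not the authors' released code. It uses a Booleanized bag-of-words input, a few hand-written conjunctive clauses standing in for the class-specific rules a trained TM (e.g., via PyTsetlinMachineCUDA) would learn, and an SVM trained on the resulting clause-output bits. The toy corpus, the clause set, and the clause_features helper are all invented for illustration.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.svm import LinearSVC

# Toy corpus: two sentiment classes (1 = positive, 0 = negative).
docs = [
    "great movie loved it",
    "loved the acting great fun",
    "boring plot not great",
    "not fun boring and slow",
]
labels = [1, 1, 0, 0]

# Booleanized bag-of-words input, the form the TM consumes.
vectorizer = CountVectorizer(binary=True)
X = vectorizer.fit_transform(docs).toarray()
vocab = vectorizer.vocabulary_

# Hand-written stand-ins for learned TM clauses. Each clause is a
# conjunction of literals: (word, True) requires the word to be
# present, (word, False) requires it to be absent (negated literal).
clauses = [
    [("great", True), ("not", False)],
    [("loved", True)],
    [("boring", True)],
    [("not", True), ("fun", False)],
]

def clause_features(x):
    """Map one Boolean document vector to its clause-output bits."""
    return [
        int(all(bool(x[vocab[word]]) == wanted for word, wanted in clause))
        for clause in clauses
    ]

# Sparse propositional representation: one bit per clause.
Z = [clause_features(x) for x in X]

# A simple downstream classifier now suffices.
clf = LinearSVC().fit(Z, labels)
print(clf.predict(Z))  # -> [1 1 0 0]
```

The design point the sketch preserves is that each feature is a readable AND-rule over word literals (including negated ones), so the downstream SVM operates on features a human can inspect directly.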


Cited By

  • (2024) eval-rationales: An End-to-End Toolkit to Explain and Evaluate Transformers-Based Models. In: Advances in Information Retrieval, pp. 212–217. https://doi.org/10.1007/978-3-031-56069-9_20
  • (2023) Building Concise Logical Patterns by Constraining Tsetlin Machine Clause Size. In: Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, pp. 3395–3403. https://doi.org/10.24963/ijcai.2023/378


Published In

Advances in Information Retrieval: 45th European Conference on Information Retrieval, ECIR 2023, Dublin, Ireland, April 2–6, 2023, Proceedings, Part I
Apr 2023
780 pages
ISBN: 978-3-031-28243-0
DOI: 10.1007/978-3-031-28244-7

Publisher

Springer-Verlag

Berlin, Heidelberg

Author Tags

  1. Natural language processing (NLP)
  2. Tsetlin machine (TM)
  3. Propositional logic
  4. Knowledge representation
  5. Domain adaptation
  6. Interpretable representation

Qualifiers

  • Article

