DOI: 10.1145/3459637.3482126 · CIKM Conference Proceedings · Short paper

Grad-SAM: Explaining Transformers via Gradient Self-Attention Maps

Published: 30 October 2021

Editorial Notes

The authors have requested minor, non-substantive changes to the VoR and, in accordance with ACM policies, a Corrected VoR was published on January 20, 2022. For reference purposes the VoR may still be accessed via the Supplemental Material section on this page.

Abstract

Transformer-based language models have significantly advanced the state of the art in many linguistic tasks. As this revolution continues, the ability to explain model predictions has become a major area of interest for the NLP community. In this work, we present Gradient Self-Attention Maps (Grad-SAM), a novel gradient-based method that analyzes self-attention units and identifies the input elements that best explain the model's prediction. Extensive evaluations on various benchmarks show that Grad-SAM obtains significant improvements over state-of-the-art alternatives.
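The idea the abstract describes, gating self-attention maps by their gradients to score input tokens, can be illustrated with a small self-contained sketch. This is not the paper's implementation: the toy attention model, the function names (`toy_model_score`, `grad_sam_relevance`), and the use of finite differences in place of backpropagation are all illustrative assumptions.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def toy_model_score(A, V, w):
    """Scalar 'prediction' of a toy attention model.
    A: (heads, T, T) attention maps, V: (T, d) values, w: (d,) readout."""
    ctx = A @ V                     # (heads, T, d): attended values
    pooled = ctx.mean(axis=(0, 1))  # (d,): pool over heads and positions
    return float(pooled @ w)

def grad_sam_relevance(A, V, w, eps=1e-5):
    """Grad-SAM-style token relevance: attention weights gated by the
    positive part of their gradients, averaged over heads and query
    positions. Gradients are estimated with forward finite differences
    here so the sketch needs no autograd framework."""
    base = toy_model_score(A, V, w)
    G = np.zeros_like(A)
    it = np.nditer(A, flags=["multi_index"])
    for _ in it:
        idx = it.multi_index
        A_pert = A.copy()
        A_pert[idx] += eps
        G[idx] = (toy_model_score(A_pert, V, w) - base) / eps
    # elementwise attention x ReLU(gradient), then average -> one weight per token
    return (A * np.maximum(G, 0.0)).mean(axis=(0, 1))  # shape (T,)

rng = np.random.default_rng(0)
A = softmax(rng.normal(size=(2, 4, 4)))  # 2 heads, 4 tokens
V = rng.normal(size=(4, 3))
w = rng.normal(size=(3,))
R = grad_sam_relevance(A, V, w)
print(R.shape, bool((R >= 0).all()))  # (4,) True
```

In a real Transformer one would instead backpropagate the class score to each layer's attention tensors and aggregate the gated maps across layers; the ReLU keeps only gradients that increase the prediction, which is the gating idea the name Grad-SAM refers to.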

Supplementary Material

3482126-vor (3482126-vor.pdf)
Version of Record for "Grad-SAM: Explaining Transformers via Gradient Self-Attention Maps" by Barkan et al., Proceedings of the 30th ACM International Conference on Information & Knowledge Management (CIKM '21)




Published In

CIKM '21: Proceedings of the 30th ACM International Conference on Information & Knowledge Management
October 2021
4966 pages
ISBN:9781450384469
DOI:10.1145/3459637
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States



Author Tags

  1. bert
  2. deep learning
  3. explainable & interpretable ai
  4. nlp
  5. self-attention
  6. transformers
  7. transparent machine learning

Qualifiers

  • Short-paper

Conference

CIKM '21

Acceptance Rates

Overall Acceptance Rate 1,861 of 8,427 submissions, 22%


Article Metrics

  • Downloads (last 12 months): 145
  • Downloads (last 6 weeks): 32
Reflects downloads up to 25 Dec 2024

Cited By
  • (2024) Beyond the Hype. In Understanding Generative AI in a Cultural Context, ch. 16, 399-422. DOI: 10.4018/979-8-3693-7235-7.ch016. Online publication date: 27-Dec-2024
  • (2024) Explainability for Large Language Models: A Survey. ACM Transactions on Intelligent Systems and Technology 15(2), 1-38. DOI: 10.1145/3639372. Online publication date: 22-Feb-2024
  • (2024) Probabilistic Path Integration with Mixture of Baseline Distributions. In Proceedings of the 33rd ACM International Conference on Information and Knowledge Management, 570-580. DOI: 10.1145/3627673.3679641. Online publication date: 21-Oct-2024
  • (2024) A Learning-based Approach for Explaining Language Models. In Proceedings of the 33rd ACM International Conference on Information and Knowledge Management, 98-108. DOI: 10.1145/3627673.3679548. Online publication date: 21-Oct-2024
  • (2024) A Counterfactual Framework for Learning and Evaluating Explanations for Recommender Systems. In Proceedings of the ACM Web Conference 2024, 3723-3733. DOI: 10.1145/3589334.3645560. Online publication date: 13-May-2024
  • (2024) B-Cos Alignment for Inherently Interpretable CNNs and Vision Transformers. IEEE Transactions on Pattern Analysis and Machine Intelligence 46(6), 4504-4518. DOI: 10.1109/TPAMI.2024.3355155. Online publication date: Jun-2024
  • (2024) SoK: Explainable Machine Learning in Adversarial Environments. In 2024 IEEE Symposium on Security and Privacy (SP), 2441-2459. DOI: 10.1109/SP54263.2024.00021. Online publication date: 19-May-2024
  • (2024) Explainable Generative AI (GenXAI): A Survey, Conceptualization, and Research Agenda. Artificial Intelligence Review 57(11). DOI: 10.1007/s10462-024-10916-x. Online publication date: 15-Sep-2024
  • (2023) Deep Integrated Explanations. In Proceedings of the 32nd ACM International Conference on Information and Knowledge Management, 57-67. DOI: 10.1145/3583780.3614836. Online publication date: 21-Oct-2023
  • (2023) Computing and Evaluating Saliency Maps for Image Classification: A Tutorial. Journal of Electronic Imaging 32(2). DOI: 10.1117/1.JEI.32.2.020801. Online publication date: 1-Mar-2023
