DOI: 10.1145/3485447.3512260
Short paper

On Explaining Multimodal Hateful Meme Detection Models

Published: 25 April 2022

Abstract

Hateful meme detection is a new multimodal task that has gained significant traction in academic and industry research communities. Researchers have recently applied pre-trained visual-linguistic models to this multimodal classification task, and some of these solutions have yielded promising results. However, what these visual-linguistic models learn for the hateful meme classification task remains unclear. For instance, it is unclear whether these models can capture derogatory references or slurs across the two modalities (i.e., image and text) of hateful memes. To fill this research gap, this paper proposes three research questions to improve our understanding of how visual-linguistic models perform the hateful meme classification task. We found that the image modality contributes more to the hateful meme classification task, and that the visual-linguistic models can perform visual-text slur grounding to a certain extent. Our error analysis also shows that the visual-linguistic models have acquired biases, which result in false-positive predictions.
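The per-modality contribution finding rests on gradient-based attribution: each input feature is assigned a share of the model's prediction, and the shares are aggregated per modality. A minimal input×gradient sketch over a toy linear fusion classifier illustrates the idea (all feature values and weights below are made up for illustration; the paper's actual analysis probes pre-trained visual-linguistic models, not this toy model):

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def input_x_gradient(weights, features, dscore_dlogit):
    # Input x gradient attribution: attr_i = x_i * dScore/dx_i.
    # For a linear logit, dScore/dx_i = w_i * dScore/dlogit.
    return [x * w * dscore_dlogit for x, w in zip(features, weights)]

# Hypothetical fused features and weights for one meme (illustrative only).
img_feats, img_w = [0.9, 0.4, 0.7], [0.8, 0.5, 0.6]
txt_feats, txt_w = [0.2, 0.1, 0.3], [0.4, 0.3, 0.2]

# Forward pass: hatefulness score of the toy classifier.
logit = (sum(x * w for x, w in zip(img_feats, img_w))
         + sum(x * w for x, w in zip(txt_feats, txt_w)))
p = sigmoid(logit)
dscore_dlogit = p * (1.0 - p)  # derivative of sigmoid w.r.t. the logit

# Aggregate absolute attribution mass per modality.
img_attr = sum(abs(a) for a in input_x_gradient(img_w, img_feats, dscore_dlogit))
txt_attr = sum(abs(a) for a in input_x_gradient(txt_w, txt_feats, dscore_dlogit))

print(f"image attribution: {img_attr:.4f}, text attribution: {txt_attr:.4f}")
print("image modality dominates:", img_attr > txt_attr)
```

Comparing the aggregated attribution masses is one simple way to quantify which modality a prediction leans on; in practice, libraries such as Captum provide these attribution methods for real models.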




      Published In

      WWW '22: Proceedings of the ACM Web Conference 2022
      April 2022
      3764 pages
      ISBN:9781450390965
      DOI:10.1145/3485447

Publisher

Association for Computing Machinery, New York, NY, United States


      Author Tags

      1. explainable machine learning
      2. hate speech
      3. hateful memes
      4. multimodal

      Qualifiers

      • Short-paper
      • Research
      • Refereed limited

      Conference

WWW '22: The ACM Web Conference 2022
April 25–29, 2022
Virtual Event, Lyon, France

      Acceptance Rates

      Overall Acceptance Rate 1,899 of 8,196 submissions, 23%


      Cited By

• (2024) AISG's Online Safety Prize Challenge: Detecting Harmful Social Bias in Multimodal Memes. Companion Proceedings of the ACM Web Conference 2024, 1884–1891. DOI: 10.1145/3589335.3665993
• (2024) Decoding Memes: A Comprehensive Analysis of Late and Early Fusion Models for Explainable Meme Analysis. Companion Proceedings of the ACM Web Conference 2024, 1681–1689. DOI: 10.1145/3589335.3652504
• (2024) Understanding (Dark) Humour with Internet Meme Analysis. Companion Proceedings of the ACM Web Conference 2024, 1276–1279. DOI: 10.1145/3589335.3641249
• (2024) MemeCraft: Contextual and Stance-Driven Multimodal Meme Generation. Proceedings of the ACM Web Conference 2024, 4642–4652. DOI: 10.1145/3589334.3648151
• (2024) Towards Explainable Harmful Meme Detection through Multimodal Debate between Large Language Models. Proceedings of the ACM Web Conference 2024, 2359–2370. DOI: 10.1145/3589334.3645381
• (2024) Multimodal Hate Speech Detection in Memes Using Contrastive Language-Image Pre-Training. IEEE Access 12, 22359–22375. DOI: 10.1109/ACCESS.2024.3361322
• (2024) Capturing the Concept Projection in Metaphorical Memes for Downstream Learning Tasks. IEEE Access 12, 1250–1265. DOI: 10.1109/ACCESS.2023.3347988
• (2024) CETA: Context-Enhanced and Target-Aware Hateful Meme Inference Method. Natural Language Processing and Chinese Computing, 95–106. DOI: 10.1007/978-981-97-9443-0_8
• (2023) Decoding the Underlying Meaning of Multimodal Hateful Memes. Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, 5995–6003. DOI: 10.24963/ijcai.2023/665
• (2023) PromptMTopic: Unsupervised Multimodal Topic Modeling of Memes using Large Language Models. Proceedings of the 31st ACM International Conference on Multimedia, 621–631. DOI: 10.1145/3581783.3613836
