Multimodal Deep Learning with Discriminant Descriptors for Offensive Memes Detection

Published: 28 September 2023
    Abstract

    A meme is a visual representation that illustrates a thought or concept. Memes are spreading steadily in this era of rapidly expanding social media platforms, and they are becoming an increasingly popular form of expression. Within meme and emotion analysis, detecting offensive content is a crucial task. However, identifying and comprehending the underlying emotion of a meme can be difficult because its content is multimodal. Additionally, there is a lack of meme datasets that address how offensive a meme is, and the existing ones are biased towards the dominant labels or categories, leading to imbalanced training sets. In this article, we present a descriptive, balanced dataset to help detect the offensive nature of meme content using a proposed multimodal deep learning model. Two deep semantic models, baseline BERT and hateXplain-BERT, are systematically combined with several deep ResNet architectures to estimate the severity of offensive memes. This process is based on the Meme-Merge collection, which we construct from two publicly available datasets. The experimental results demonstrate the model's effectiveness in classifying offensive memes, achieving F1 scores of 0.7315 on the baseline datasets and 0.7140 on Meme-Merge. The proposed multimodal deep learning approach also outperforms the baseline model in three meme tasks: metaphor understanding, sentiment understanding, and intention detection.
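    To make the described architecture concrete, the sketch below shows one plausible way to fuse BERT text embeddings with ResNet image features for meme classification. It is a minimal illustration under stated assumptions, not the authors' exact model: the fusion strategy (feature concatenation), the hidden-layer sizes, the dropout rate, and the choice of ResNet-50 are all assumptions, and the hateXplain checkpoint name appears only as a comment.

    # A minimal fusion sketch (assumed architecture, not the paper's exact model):
    # BERT encodes the meme text, ResNet-50 encodes the meme image, and the two
    # feature vectors are concatenated and passed to a small classification head.
    import torch
    import torch.nn as nn
    from torchvision.models import resnet50
    from transformers import BertModel

    class MemeOffensivenessClassifier(nn.Module):
        def __init__(self, num_classes: int = 2):
            super().__init__()
            # Text branch: a pretrained BERT encoder. A hateXplain-tuned checkpoint
            # (e.g., "Hate-speech-CNERG/bert-base-uncased-hatexplain") could be
            # swapped in for the hateXplain-BERT variant.
            self.text_encoder = BertModel.from_pretrained("bert-base-uncased")
            # Image branch: a pretrained ResNet-50 with its final FC layer removed,
            # leaving a 2048-d pooled feature vector per image.
            backbone = resnet50(weights="IMAGENET1K_V2")
            self.image_encoder = nn.Sequential(*list(backbone.children())[:-1])
            # Fusion head: concatenate the 768-d text and 2048-d image features.
            self.classifier = nn.Sequential(
                nn.Linear(768 + 2048, 512),
                nn.ReLU(),
                nn.Dropout(0.3),
                nn.Linear(512, num_classes),
            )

        def forward(self, input_ids, attention_mask, images):
            text_feat = self.text_encoder(
                input_ids=input_ids, attention_mask=attention_mask
            ).pooler_output                                     # (batch, 768)
            image_feat = self.image_encoder(images).flatten(1)  # (batch, 2048)
            return self.classifier(torch.cat([text_feat, image_feat], dim=1))

    In a setup like this, the two encoders could be fine-tuned jointly or frozen as fixed feature extractors; the paper's reported F1 scores come from its own training regime and fusion design, which this sketch does not reproduce.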



    Published In

    Journal of Data and Information Quality, Volume 15, Issue 3
    September 2023, 326 pages
    ISSN: 1936-1955
    EISSN: 1936-1963
    DOI: 10.1145/3611329

    Publisher

    Association for Computing Machinery, New York, NY, United States

    Publication History

    Published: 28 September 2023
    Online AM: 15 May 2023
    Accepted: 27 March 2023
    Revised: 11 March 2023
    Received: 15 December 2022
    Published in JDIQ Volume 15, Issue 3


    Author Tags

    1. Multimodal analysis
    2. offensiveness detection
    3. memes
    4. deep learning

    Qualifiers

    • Research-article

