Multimodal Deep Learning with Discriminant Descriptors for Offensive Memes Detection

Published: 28 September 2023
    Abstract

    A meme is a visual representation that illustrates a thought or concept. Memes are spreading steadily in this era of rapidly expanding social media platforms, and they are becoming an increasingly popular form of expression. Within meme and emotion analysis, detecting offensive content is a crucial task. However, identifying and comprehending the underlying emotion of a meme can be difficult because its content is multimodal. Additionally, there is a lack of meme datasets that address how offensive a meme is, and the existing ones are biased towards the dominant labels or categories, leading to imbalanced training sets. In this article, we present a descriptive, balanced dataset to help detect the offensive nature of meme content using a proposed multimodal deep learning model. Two deep semantic models, baseline BERT and hateXplain-BERT, are systematically combined with several deep ResNet architectures to estimate the severity of offensive memes. This process is based on the Meme-Merge collection, which we construct from two publicly available datasets. The experimental results demonstrate the model's effectiveness in classifying offensive memes, achieving F1 scores of 0.7315 on the baseline datasets and 0.7140 on Meme-Merge. The proposed multimodal deep learning approach also outperforms the baseline model in three meme tasks: metaphor understanding, sentiment understanding, and intention detection.
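    To make the described architecture concrete, the sketch below shows one plausible way to fuse BERT text embeddings with ResNet image features for meme classification. It is a minimal illustration under stated assumptions, not the authors' exact model: the fusion strategy (feature concatenation), the hidden-layer sizes, the dropout rate, and the choice of ResNet-50 are all assumptions, and the hateXplain checkpoint name appears only as a comment.

    # A minimal fusion sketch (assumed architecture, not the paper's exact model):
    # BERT encodes the meme text, ResNet-50 encodes the meme image, and the two
    # feature vectors are concatenated and passed to a small classification head.
    import torch
    import torch.nn as nn
    from torchvision.models import resnet50
    from transformers import BertModel

    class MemeOffensivenessClassifier(nn.Module):
        def __init__(self, num_classes: int = 2):
            super().__init__()
            # Text branch: a pretrained BERT encoder. A hateXplain-tuned checkpoint
            # (e.g., "Hate-speech-CNERG/bert-base-uncased-hatexplain") could be
            # swapped in for the hateXplain-BERT variant.
            self.text_encoder = BertModel.from_pretrained("bert-base-uncased")
            # Image branch: a pretrained ResNet-50 with its final FC layer removed,
            # leaving a 2048-d pooled feature vector per image.
            backbone = resnet50(weights="IMAGENET1K_V2")
            self.image_encoder = nn.Sequential(*list(backbone.children())[:-1])
            # Fusion head: concatenate the 768-d text and 2048-d image features.
            self.classifier = nn.Sequential(
                nn.Linear(768 + 2048, 512),
                nn.ReLU(),
                nn.Dropout(0.3),
                nn.Linear(512, num_classes),
            )

        def forward(self, input_ids, attention_mask, images):
            text_feat = self.text_encoder(
                input_ids=input_ids, attention_mask=attention_mask
            ).pooler_output                                     # (batch, 768)
            image_feat = self.image_encoder(images).flatten(1)  # (batch, 2048)
            return self.classifier(torch.cat([text_feat, image_feat], dim=1))

    In a setup like this, the two encoders could be fine-tuned jointly or frozen as fixed feature extractors; the paper's reported F1 scores come from its own training regime and fusion design, which this sketch does not reproduce.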



    Published In

    Journal of Data and Information Quality, Volume 15, Issue 3
    September 2023, 326 pages
    ISSN: 1936-1955
    EISSN: 1936-1963
    DOI: 10.1145/3611329

    Publisher

    Association for Computing Machinery, New York, NY, United States

    Publication History

    Published: 28 September 2023
    Online AM: 15 May 2023
    Accepted: 27 March 2023
    Revised: 11 March 2023
    Received: 15 December 2022
    Published in JDIQ Volume 15, Issue 3


    Author Tags

    1. Multimodal analysis
    2. offensiveness detection
    3. memes
    4. deep learning

    Qualifiers

    • Research-article

