DOI: 10.1145/3474085.3475190
research-article

Multi-Modal Sarcasm Detection with Interactive In-Modal and Cross-Modal Graphs

Published: 17 October 2021

Abstract

Sarcasm is a sophisticated linguistic act that expresses an incongruity between what is said and the sentiment actually implied, and it is pervasive on social media platforms. Compared with text-only sarcasm detection, multi-modal sarcasm detection is better suited to the rapidly growing social media platforms, where people are interested in creating multi-modal messages. For tweets consisting of text and images on Twitter, the key to improving multi-modal sarcasm detection is determining the incongruity relations between texts and images. In this paper, we investigate multi-modal sarcasm detection from a novel perspective: we determine the sentiment inconsistencies within a single modality and across different modalities by constructing heterogeneous in-modal and cross-modal graphs (InCrossMGs) for each multi-modal example. Based on these graphs, we explore an interactive graph convolutional network (GCN) structure that jointly and interactively learns the incongruity relations of the in-modal and cross-modal graphs to identify the significant clues for sarcasm detection. Experimental results demonstrate that our proposed model achieves state-of-the-art performance in multi-modal sarcasm detection.
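
To make the idea of graph convolution over in-modal and cross-modal graphs more concrete, here is a minimal sketch in plain Python. It is not the paper's implementation. It assumes the standard symmetrically normalized propagation rule for a GCN layer, a hypothetical four-node example (two text tokens, two image patches), hand-picked binary edges, and an identity matrix standing in for learned weights. Stacking an in-modal step and a cross-modal step is only one possible reading of "interactive"; the abstract does not specify how the graphs are built or combined.

```python
import math


def normalize_adjacency(adj):
    """Return D^{-1/2} (A + I) D^{-1/2} for a square adjacency matrix."""
    n = len(adj)
    # Add self-loops so every node keeps part of its own features.
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    degree = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(degree[i] * degree[j]) for j in range(n)]
            for i in range(n)]


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def gcn_layer(adj, features, weight):
    """One graph-convolution step: ReLU(A_norm @ H @ W)."""
    propagated = matmul(normalize_adjacency(adj), features)
    projected = matmul(propagated, weight)
    return [[max(0.0, value) for value in row] for row in projected]


# Hypothetical toy example: nodes 0-1 are text tokens, nodes 2-3 are image patches.
in_modal = [      # edges only within a modality
    [0, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 1, 0],
]
cross_modal = [   # edges only between modalities
    [0, 0, 1, 0],
    [0, 0, 0, 1],
    [1, 0, 0, 0],
    [0, 1, 0, 0],
]
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, -0.5]]
weight = [[1.0, 0.0], [0.0, 1.0]]  # identity; stands in for a learned matrix

# Assumed interaction scheme: an in-modal step followed by a cross-modal step.
hidden = gcn_layer(in_modal, features, weight)
hidden = gcn_layer(cross_modal, hidden, weight)
print(hidden)
```

In a real model the edge weights, node features, and weight matrices would be learned or derived from pretrained text and image encoders; the values above exist only to make the propagation step runnable.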

Published In

MM '21: Proceedings of the 29th ACM International Conference on Multimedia
October 2021
5796 pages
ISBN: 9781450386517
DOI: 10.1145/3474085

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. graph networks
  2. multi-modal sarcasm detection
  3. sarcasm detection

Qualifiers

  • Research-article

Conference

MM '21: ACM Multimedia Conference
October 20 - 24, 2021
Virtual Event, China

Acceptance Rates

Overall Acceptance Rate 995 of 4,171 submissions, 24%

Article Metrics

  • Downloads (last 12 months): 275
  • Downloads (last 6 weeks): 23
Reflects downloads up to 03 Oct 2024

Cited By

  • (2024) Sarcasm Detection. Harnessing Artificial Emotional Intelligence for Improved Human-Computer Interactions, 197-221. DOI: 10.4018/979-8-3693-2794-4.ch012. Online publication date: 6-Jun-2024
  • (2024) A Semantic Enhancement Framework for Multimodal Sarcasm Detection. Mathematics 12:2 (317). DOI: 10.3390/math12020317. Online publication date: 18-Jan-2024
  • (2024) Multi-Modal Sarcasm Detection with Sentiment Word Embedding. Electronics 13:5 (855). DOI: 10.3390/electronics13050855. Online publication date: 23-Feb-2024
  • (2024) A Multi-View Interactive Approach for Multimodal Sarcasm Detection in Social Internet of Things with Knowledge Enhancement. Applied Sciences 14:5 (2146). DOI: 10.3390/app14052146. Online publication date: 4-Mar-2024
  • (2024) Modeling Multi-Task Joint Training of Aggregate Networks for Multi-Modal Sarcasm Detection. Proceedings of the 2024 International Conference on Multimedia Retrieval, 833-841. DOI: 10.1145/3652583.3658015. Online publication date: 30-May-2024
  • (2024) Learning Multitask Commonness and Uniqueness for Multimodal Sarcasm Detection and Sentiment Analysis in Conversation. IEEE Transactions on Artificial Intelligence 5:3 (1349-1361). DOI: 10.1109/TAI.2023.3298328. Online publication date: Mar-2024
  • (2024) Context-Aware Dual Attention Network for Multimodal Sarcasm Detection. ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 12777-12781. DOI: 10.1109/ICASSP48485.2024.10448377. Online publication date: 14-Apr-2024
  • (2024) A Relation-Aware Heterogeneous Graph Transformer on Dynamic Fusion for Multimodal Classification Tasks. ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 7855-7859. DOI: 10.1109/ICASSP48485.2024.10446972. Online publication date: 14-Apr-2024
  • (2024) AutoAMS: Automated attention-based multi-modal graph learning architecture search. Neural Networks 179 (106427). DOI: 10.1016/j.neunet.2024.106427. Online publication date: Nov-2024
  • (2024) Multifaceted and deep semantic alignment network for multimodal sarcasm detection. Knowledge-Based Systems 301 (112298). DOI: 10.1016/j.knosys.2024.112298. Online publication date: Oct-2024