Multimodality in Misinformation Detection

  • Chapter
  • In: Dive into Misinformation Detection

Part of the book series: The Information Retrieval Series (INRE, volume 30)


Abstract

The fifth chapter explores the importance of multimodality in solving various NLP tasks. It opens with a basic introduction to multimodality and then surveys its applications and challenges. The chapter then presents a case study discussing how and why multimodality matters in misinformation detection. The case study describes an end-to-end deep learning system that determines whether a multimedia post is authentic or fake from its text and image inputs. The crux of any multimodal system is extracting effective multimodal features, and doing so is the prime objective of the misinformation detection approach described in this chapter.
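As a rough illustration of the kind of system the abstract describes, the following minimal sketch (in PyTorch) shows the basic pattern: encode each modality, fuse the resulting features, and classify the post as real or fake. This is not the authors' implementation; the encoders here are simple stand-ins for the pretrained text and image models (e.g., BERT- and CNN-style encoders) common in this literature, and all module and dimension choices are illustrative assumptions.

```python
# A minimal sketch of an early-fusion text-image classifier for
# misinformation detection. Not the chapter's actual model; encoder
# choices and dimensions are illustrative assumptions.
import torch
import torch.nn as nn


class MultimodalFakeNewsClassifier(nn.Module):
    def __init__(self, vocab_size=30522, text_dim=128, img_dim=2048, fused_dim=256):
        super().__init__()
        # Text branch: embedding + mean pooling stands in for a
        # BERT-style sentence encoder.
        self.embed = nn.EmbeddingBag(vocab_size, text_dim)
        # Image branch: a linear projection stands in for CNN features
        # (e.g., the penultimate layer of a pretrained ResNet/VGG).
        self.img_proj = nn.Linear(img_dim, text_dim)
        # Fusion: concatenate the modality features, then classify
        # with a small MLP head.
        self.head = nn.Sequential(
            nn.Linear(2 * text_dim, fused_dim),
            nn.ReLU(),
            nn.Linear(fused_dim, 2),  # logits for {real, fake}
        )

    def forward(self, token_ids, img_feats):
        t = self.embed(token_ids)          # (batch, text_dim)
        v = self.img_proj(img_feats)       # (batch, text_dim)
        fused = torch.cat([t, v], dim=-1)  # simple early fusion
        return self.head(fused)


# Toy usage: a batch of 4 posts, each with 16 tokens and a
# precomputed 2048-dim image feature vector.
model = MultimodalFakeNewsClassifier()
tokens = torch.randint(0, 30522, (4, 16))
images = torch.randn(4, 2048)
logits = model(tokens, images)
print(logits.shape)  # torch.Size([4, 2])
```

Concatenation is the simplest fusion strategy; published systems in this area often replace it with attention-based or bilinear fusion to model cross-modal interactions more directly.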





Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this chapter


Cite this chapter

Ekbal, A., Kumari, R. (2024). Multimodality in Misinformation Detection. In: Dive into Misinformation Detection. The Information Retrieval Series, vol 30. Springer, Cham. https://doi.org/10.1007/978-3-031-54834-5_5


  • DOI: https://doi.org/10.1007/978-3-031-54834-5_5


  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-54833-8

  • Online ISBN: 978-3-031-54834-5

  • eBook Packages: Computer Science, Computer Science (R0)
