Multimodality in Misinformation Detection

  • Chapter
  • In: Dive into Misinformation Detection

Part of the book series: The Information Retrieval Series (INRE, volume 30)


Abstract

The fifth chapter explores the importance of multimodality in solving various NLP tasks. It opens with a basic introduction to multimodality and then surveys its applications and challenges. The chapter then presents a case study discussing how and why multimodality matters in misinformation detection. The case study describes an end-to-end deep learning system that determines whether a multimedia post is authentic or fake from its text and image inputs. The crux of any multimodal system is extracting effective multimodal features, and doing so is the prime objective of the misinformation detection approach described in this chapter.
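As a rough illustration of the kind of system the abstract describes, the following minimal sketch (in PyTorch) shows the basic pattern: encode each modality, fuse the resulting features, and classify the post as real or fake. This is not the authors' implementation; the encoders here are simple stand-ins for the pretrained text and image models (e.g., BERT- and CNN-style encoders) common in this literature, and all module and dimension choices are illustrative assumptions.

```python
# A minimal sketch of an early-fusion text-image classifier for
# misinformation detection. Not the chapter's actual model; encoder
# choices and dimensions are illustrative assumptions.
import torch
import torch.nn as nn


class MultimodalFakeNewsClassifier(nn.Module):
    def __init__(self, vocab_size=30522, text_dim=128, img_dim=2048, fused_dim=256):
        super().__init__()
        # Text branch: embedding + mean pooling stands in for a
        # BERT-style sentence encoder.
        self.embed = nn.EmbeddingBag(vocab_size, text_dim)
        # Image branch: a linear projection stands in for CNN features
        # (e.g., the penultimate layer of a pretrained ResNet/VGG).
        self.img_proj = nn.Linear(img_dim, text_dim)
        # Fusion: concatenate the modality features, then classify
        # with a small MLP head.
        self.head = nn.Sequential(
            nn.Linear(2 * text_dim, fused_dim),
            nn.ReLU(),
            nn.Linear(fused_dim, 2),  # logits for {real, fake}
        )

    def forward(self, token_ids, img_feats):
        t = self.embed(token_ids)          # (batch, text_dim)
        v = self.img_proj(img_feats)       # (batch, text_dim)
        fused = torch.cat([t, v], dim=-1)  # simple early fusion
        return self.head(fused)


# Toy usage: a batch of 4 posts, each with 16 tokens and a
# precomputed 2048-dim image feature vector.
model = MultimodalFakeNewsClassifier()
tokens = torch.randint(0, 30522, (4, 16))
images = torch.randn(4, 2048)
logits = model(tokens, images)
print(logits.shape)  # torch.Size([4, 2])
```

Concatenation is the simplest fusion strategy; published systems in this area often replace it with attention-based or bilinear fusion to model cross-modal interactions more directly.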





Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this chapter


Cite this chapter

Ekbal, A., Kumari, R. (2024). Multimodality in Misinformation Detection. In: Dive into Misinformation Detection. The Information Retrieval Series, vol 30. Springer, Cham. https://doi.org/10.1007/978-3-031-54834-5_5


  • DOI: https://doi.org/10.1007/978-3-031-54834-5_5


  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-54833-8

  • Online ISBN: 978-3-031-54834-5

  • eBook Packages: Computer Science, Computer Science (R0)
