
A multimodal sentiment analysis approach for tweets by comprehending co-relations between information modalities

Published in Multimedia Tools and Applications

Abstract

With the popularity of smart devices and online social media platforms, people express their views in multiple modalities such as text, images, and audio. Recent research in sentiment analysis is therefore no longer limited to a single modality; instead, it combines all available modalities to predict sentiment more accurately. Multimodal sentiment analysis (MSA) is the process of extracting sentiment from multiple modalities such as text, images, and audio. Existing works predict the sentiment of each modality independently, and these unimodal predictions then drive the final sentiment. This paper presents an MSA approach that derives the final sentiment of an image-text tweet through multimodal decision-level fusion, incorporating both the features of the individual modalities and the semantic relations between them. A dataset is prepared from an existing benchmark MSA dataset by annotating each tweet as a whole with a final sentiment after assessing all of its modalities. The proposed approach is evaluated on this dataset and compared with state-of-the-art MSA methods. An in-depth analysis of the comparison results shows that the proposed approach outperforms existing methods in terms of accuracy and F1-score.
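To make the fusion idea concrete, the following is a minimal Python sketch of decision-level (late) fusion for an image-text post. It illustrates the general technique only, not the authors' implementation: the two classifiers are hypothetical stubs, and the fusion weights (which, in the spirit of the paper, would be informed by inter-modal semantic relations) are assumed fixed values chosen for the example.

```python
# Minimal sketch of decision-level (late) fusion for image-text sentiment.
# NOT the paper's method: classifiers are stubs, weights are illustrative.
import numpy as np

LABELS = ["negative", "neutral", "positive"]

def text_sentiment(text: str) -> np.ndarray:
    # Hypothetical stub: a real system would run a trained text model
    # (e.g., a fine-tuned transformer) and return P(label | text).
    return np.array([0.10, 0.20, 0.70])

def image_sentiment(image_path: str) -> np.ndarray:
    # Hypothetical stub standing in for a visual sentiment classifier
    # (e.g., a CNN) returning P(label | image).
    return np.array([0.20, 0.50, 0.30])

def late_fusion(p_text: np.ndarray, p_image: np.ndarray,
                w_text: float = 0.6, w_image: float = 0.4):
    # Decision-level fusion: combine the two unimodal probability
    # distributions. Fixed weights are an assumption for this sketch;
    # they could instead depend on how semantically consistent the
    # image and text are.
    fused = w_text * p_text + w_image * p_image
    return LABELS[int(np.argmax(fused))], fused

if __name__ == "__main__":
    label, probs = late_fusion(text_sentiment("What a great day!"),
                               image_sentiment("tweet_photo.jpg"))
    print(label, probs.round(3))  # -> positive [0.14 0.32 0.54]
```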




Data Availability Statement

Our data are drawn from the publicly available MVSA-Single (MVSA-S) dataset, described in Section 3. The collected data were further annotated to suit the present work; the annotated dataset is available from the corresponding author on request.


Author information


Corresponding author

Correspondence to Dwijen Rudrapal.

Ethics declarations

Conflict of Interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Chakraborty, D., Rudrapal, D. & Bhattacharya, B. A multimodal sentiment analysis approach for tweets by comprehending co-relations between information modalities. Multimed Tools Appl 83, 50061–50085 (2024). https://doi.org/10.1007/s11042-023-17569-y
