research-article

MahaEmoSen: Towards Emotion-aware Multimodal Marathi Sentiment Analysis

Published: 22 September 2023

Abstract

With the advent of the Internet, social media platforms have witnessed an enormous increase in user-generated textual and visual content. Microblogs on platforms such as Twitter are extremely useful for comprehending how individuals feel about a specific issue through their posted texts, images, and videos. Owing to the plethora of content generated, it is necessary to gain insight into its emotional and sentimental inclination. Individuals express themselves in a variety of languages and, lately, the number of people preferring native languages has been increasing consistently. The Marathi language is predominantly spoken in the Indian state of Maharashtra. However, sentiment analysis in Marathi has rarely been addressed. In light of the above, we propose an emotion-aware multimodal Marathi sentiment analysis method (MahaEmoSen). Unlike existing studies, we leverage the emotions embedded in tweets, in addition to assimilating content-based information from the textual and visual modalities of social media posts, to perform sentiment classification. We mitigate the problem of small training sets by applying data augmentation techniques. A word-level attention mechanism is applied to the textual modality for contextual inference and for filtering out noisy words from tweets. Experimental results on real-world social media datasets demonstrate that our proposed method outperforms existing methods for Marathi sentiment analysis in resource-constrained circumstances.
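
The abstract names a word-level attention mechanism but gives no equations, so the following is only a minimal sketch of the general idea, assuming a simple dot-product score against a single context vector followed by a softmax. The function name `word_attention`, the context vector, and the toy embeddings are all hypothetical and are not taken from the article; the authors' actual formulation may differ.

```python
import math
from typing import List, Sequence, Tuple


def word_attention(
    word_vectors: Sequence[Sequence[float]],
    context: Sequence[float],
) -> Tuple[List[float], List[float]]:
    """Pool word vectors into one tweet vector using softmax attention weights.

    Each word is scored by its dot product with a context vector (learned in a
    real model, fixed here for illustration); scores are softmax-normalised.
    """
    scores = [sum(w * c for w, c in zip(vec, context)) for vec in word_vectors]
    # Subtract the maximum score for numerical stability before exponentiating.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Weighted sum of the word vectors, computed one dimension at a time.
    pooled = [
        sum(weight * vec[i] for weight, vec in zip(weights, word_vectors))
        for i in range(len(context))
    ]
    return weights, pooled


# Toy 2-dimensional embeddings for a three-word tweet (made-up numbers).
toy_vectors = [[2.0, 0.0], [0.1, 0.1], [1.5, 0.5]]
toy_context = [1.0, 0.0]
weights, tweet_vector = word_attention(toy_vectors, toy_context)
```

In this toy input the second word scores lowest against the context vector, so it receives the smallest weight and contributes least to the pooled tweet vector, which is the sense in which attention can down-weight noisy words.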


Cited By

  • (2025) A Hybrid Sentiment and Emotion Analysis Model for Marathi Text Using Horse Herd Optimization, Bidirectional RNN, and Affective Cognitive Computing. Cognitive Computing and Cyber Physical Systems, 11-24. DOI: 10.1007/978-3-031-77081-4_2. Online publication date: 9-Feb-2025
  • (2024) Enhancing User Experience through Emotion-Aware Interfaces: A Multimodal Approach. Journal of Innovative Image Processing 6:1, 27-39. DOI: 10.36548/jiip.2024.1.003. Online publication date: Mar-2024

    Information & Contributors

    Information

    Published In

    ACM Transactions on Asian and Low-Resource Language Information Processing, Volume 22, Issue 9
    September 2023
    226 pages
    ISSN: 2375-4699
    EISSN: 2375-4702
    DOI: 10.1145/3625383

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 22 September 2023
    Online AM: 02 September 2023
    Accepted: 11 August 2023
    Received: 17 April 2023
    Published in TALLIP Volume 22, Issue 9

    Author Tags

    1. Sentiment analysis
    2. Marathi
    3. emotions
    4. multimodal data classification

    Qualifiers

    • Research-article

    Bibliometrics & Citations

    Bibliometrics

    Article Metrics

    • Downloads (Last 12 months): 300
    • Downloads (Last 6 weeks): 10
    Reflects downloads up to 08 Feb 2025
