DOI: 10.1145/3551626.3564950
Affective Embedding Framework with Semantic Representations from Tweets for Zero-Shot Visual Sentiment Prediction

Published: 13 December 2022

Abstract

This paper presents a zero-shot visual sentiment prediction method that uses semantic representation features of texts from tweets as non-visual auxiliary data. Previous studies show that visual sentiment prediction methods can only predict sentiment labels drawn from the sentiment theory used in the training dataset; they cannot predict new labels defined in other sentiment theories. Zero-shot learning has been proposed to address this problem of predicting new labels. The previous zero-shot visual sentiment prediction method uses Word2vec features and adjective-noun pair features to obtain the semantic relationship between images and sentiment words, so that unseen sentiments can be predicted. However, many adjective-noun pairs are unrelated to sentiments, which makes it difficult to bridge the affective gap between low-level visual features and high-level sentiment semantics. To better compensate for this affective gap, we introduce new non-visual auxiliary data. Because people tend to share their feelings through both images and texts on social networking services, texts from tweets are effective side information for images in visual sentiment prediction. We therefore introduce semantic representations from tweets as new non-visual auxiliary data to construct an affective embedding space, yielding a more effective zero-shot visual sentiment prediction model. Moreover, we propose a cross-dataset zero-shot task for visual sentiment prediction, which better reflects the real situation in which testing and training images may come from different domains. The contributions of this paper are the combination of several semantic representation features for zero-shot visual sentiment prediction and the proposal of the cross-dataset zero-shot task for visual sentiment prediction. Experiments on several open datasets show the effectiveness of the proposed method.
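For orientation, the following is a minimal, hypothetical sketch of the general zero-shot idea the abstract describes: project image features into the space of sentiment-word embeddings (e.g., Word2vec vectors) and predict an unseen sentiment label by nearest-neighbor similarity in that shared space. All module names, layer sizes, and inputs are illustrative assumptions, not the authors' implementation, and the tweet-based semantic representations central to the paper are omitted here for brevity.

```python
# Illustrative sketch only (not the authors' code): zero-shot sentiment
# prediction by embedding image features and sentiment-word vectors into a
# shared space and choosing the closest unseen label.
import torch
import torch.nn as nn
import torch.nn.functional as F

class AffectiveEmbedding(nn.Module):
    """Projects CNN image features into the space of sentiment-word embeddings."""
    def __init__(self, img_dim=2048, word_dim=300, hidden_dim=512):
        super().__init__()
        self.proj = nn.Sequential(
            nn.Linear(img_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, word_dim),
        )

    def forward(self, img_feat):
        # L2-normalize so cosine similarity reduces to a dot product.
        return F.normalize(self.proj(img_feat), dim=-1)

def predict_unseen_sentiment(model, img_feat, label_word_vecs):
    """Return the index of the unseen sentiment label whose word embedding
    is closest (by cosine similarity) to the projected image feature."""
    with torch.no_grad():
        emb = model(img_feat)                          # (B, word_dim)
        labels = F.normalize(label_word_vecs, dim=-1)  # (L, word_dim)
        sims = emb @ labels.t()                        # cosine similarities
        return sims.argmax(dim=-1)

# Usage with random stand-in tensors; real inputs would be CNN image features
# and Word2vec vectors of the unseen sentiment words (e.g., from Mikels' theory).
model = AffectiveEmbedding()
img_feat = torch.randn(4, 2048)          # batch of 4 image feature vectors
unseen_label_vecs = torch.randn(6, 300)  # word vectors of 6 unseen labels
print(predict_unseen_sentiment(model, img_feat, unseen_label_vecs))
```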


Cited By

  • (2024) Bridging Visual Affective Gap: Borrowing Textual Knowledge by Learning from Noisy Image-Text Pairs. Proceedings of the 32nd ACM International Conference on Multimedia, 602-611. DOI: 10.1145/3664647.3680875. Online publication date: 28-Oct-2024.
  • (2024) Zero-Shot Visual Sentiment Prediction via Cross-Domain Knowledge Distillation. IEEE Open Journal of Signal Processing, 5, 177-185. DOI: 10.1109/OJSP.2023.3344079. Online publication date: 2024.


    Published In

    MMAsia '22: Proceedings of the 4th ACM International Conference on Multimedia in Asia
    December 2022
    296 pages
    ISBN:9781450394789
    DOI:10.1145/3551626


    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 13 December 2022


    Author Tags

    1. cross-dataset
    2. tweet analysis
    3. visual sentiment prediction
    4. zero-shot learning

    Qualifiers

    • Research-article

    Funding Sources

    • JSPS KAKENHI
    • AMED

    Conference

MMAsia '22: ACM Multimedia Asia
    December 13 - 16, 2022
    Tokyo, Japan

    Acceptance Rates

    Overall Acceptance Rate 59 of 204 submissions, 29%

