DOI: 10.1145/3551626.3564950
Affective Embedding Framework with Semantic Representations from Tweets for Zero-Shot Visual Sentiment Prediction

Published: 13 December 2022

Abstract

This paper presents a zero-shot visual sentiment prediction method that uses semantic representation features of texts from tweets as non-visual auxiliary data. Previous studies show that visual sentiment prediction methods can only predict sentiment labels drawn from the sentiment theory used in the training dataset; they cannot predict new labels defined in other sentiment theories. Zero-shot learning has been proposed to address this problem of predicting new labels. The previous zero-shot visual sentiment prediction method uses Word2vec features and adjective-noun pair features to obtain the semantic relationship between images and sentiment words, so that unseen sentiments can be predicted. However, many adjective-noun pairs are unrelated to sentiments, which makes it difficult to bridge the affective gap between low-level visual features and high-level sentiment semantics. To better compensate for this affective gap, we introduce new non-visual auxiliary data. Because people tend to share their feelings through both images and texts on social networking services, texts from tweets are effective side information for images in visual sentiment prediction. We therefore introduce semantic representations from tweets as new non-visual auxiliary data to construct an affective embedding space, yielding a more effective zero-shot visual sentiment prediction model. Moreover, we propose a cross-dataset zero-shot task for visual sentiment prediction, which better reflects the real situation in which testing and training images may come from different domains. The contributions of this paper are the combination of several semantic representation features for zero-shot visual sentiment prediction and the proposal of the cross-dataset zero-shot task for visual sentiment prediction. Experiments on several open datasets show the effectiveness of the proposed method.
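For orientation, the following is a minimal, hypothetical sketch of the general zero-shot idea the abstract describes: project image features into the space of sentiment-word embeddings (e.g., Word2vec vectors) and predict an unseen sentiment label by nearest-neighbor similarity in that shared space. All module names, layer sizes, and inputs are illustrative assumptions, not the authors' implementation, and the tweet-based semantic representations central to the paper are omitted here for brevity.

```python
# Illustrative sketch only (not the authors' code): zero-shot sentiment
# prediction by embedding image features and sentiment-word vectors into a
# shared space and choosing the closest unseen label.
import torch
import torch.nn as nn
import torch.nn.functional as F

class AffectiveEmbedding(nn.Module):
    """Projects CNN image features into the space of sentiment-word embeddings."""
    def __init__(self, img_dim=2048, word_dim=300, hidden_dim=512):
        super().__init__()
        self.proj = nn.Sequential(
            nn.Linear(img_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, word_dim),
        )

    def forward(self, img_feat):
        # L2-normalize so cosine similarity reduces to a dot product.
        return F.normalize(self.proj(img_feat), dim=-1)

def predict_unseen_sentiment(model, img_feat, label_word_vecs):
    """Return the index of the unseen sentiment label whose word embedding
    is closest (by cosine similarity) to the projected image feature."""
    with torch.no_grad():
        emb = model(img_feat)                          # (B, word_dim)
        labels = F.normalize(label_word_vecs, dim=-1)  # (L, word_dim)
        sims = emb @ labels.t()                        # cosine similarities
        return sims.argmax(dim=-1)

# Usage with random stand-in tensors; real inputs would be CNN image features
# and Word2vec vectors of the unseen sentiment words (e.g., from Mikels' theory).
model = AffectiveEmbedding()
img_feat = torch.randn(4, 2048)          # batch of 4 image feature vectors
unseen_label_vecs = torch.randn(6, 300)  # word vectors of 6 unseen labels
print(predict_unseen_sentiment(model, img_feat, unseen_label_vecs))
```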


Cited By

  • (2024) Bridging Visual Affective Gap: Borrowing Textual Knowledge by Learning from Noisy Image-Text Pairs. Proceedings of the 32nd ACM International Conference on Multimedia, 602-611. DOI: 10.1145/3664647.3680875. Online publication date: 28-Oct-2024.
  • (2024) Zero-Shot Visual Sentiment Prediction via Cross-Domain Knowledge Distillation. IEEE Open Journal of Signal Processing, 5, 177-185. DOI: 10.1109/OJSP.2023.3344079. Online publication date: 2024.


    Published In

    MMAsia '22: Proceedings of the 4th ACM International Conference on Multimedia in Asia
    December 2022
    296 pages
    ISBN:9781450394789
    DOI:10.1145/3551626


    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 13 December 2022


    Author Tags

    1. cross-dataset
    2. tweet analysis
    3. visual sentiment prediction
    4. zero-shot learning

    Qualifiers

    • Research-article

    Funding Sources

    • JSPS KAKENHI
    • AMED

    Conference

MMAsia '22: ACM Multimedia Asia
    December 13 - 16, 2022
    Tokyo, Japan

    Acceptance Rates

    Overall Acceptance Rate 59 of 204 submissions, 29%

