Abstract
In this paper we investigate a multimodal feature learning approach, based on neural network models such as Skip-gram and Denoising Autoencoders, for sentiment analysis of micro-blogging content, such as Twitter short messages, which consist of a short text and, possibly, an image. The approach is motivated by recent advances in: i) training neural network language models, which have proved extremely efficient on web-scale text corpora and perform very well at capturing syntactic and semantic word similarities; ii) unsupervised learning, with neural networks, of robust visual features that can be recovered from partial observations caused by occlusions or by noisy and heavily modified images. We propose a novel architecture that incorporates these neural networks, test it on several standard Twitter datasets, and show that the approach is efficient and obtains good classification results.
Notes
Twitter reports having 271 million monthly active users who send 500 million status updates per day: https://about.twitter.com/company
Cite this article
Baecchi, C., Uricchio, T., Bertini, M. et al. A multimodal feature learning approach for sentiment analysis of social network multimedia. Multimed Tools Appl 75, 2507–2525 (2016). https://doi.org/10.1007/s11042-015-2646-x
DOI: https://doi.org/10.1007/s11042-015-2646-x