
A social emotion classification approach using multi-model fusion

Published: 01 January 2020

Abstract

With the proliferation of online video publishing, the amount of multimodal content on the Internet has grown exponentially, and emotion analysis research has developed from traditional single-modality analysis to complex multimodal analysis. However, although some recent studies consider multiple modalities, most pay little attention to merging visual and audio emotional information at the feature or decision level. In this paper, we extract visual, textual, and audio information from video and propose a multimodal emotion classification framework to capture the emotions of users in social networks. We design a 3DCLS (3D Convolutional Long Short-Term Memory) hybrid model to classify visual emotions and a CNN–RNN hybrid model to classify text-based emotions. Finally, the visual, audio, and text modalities are fused to generate the final classification results. Experiments on the MOUD and IEMOCAP emotion datasets show that the proposed framework outperforms existing models in multimodal emotion analysis.
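The abstract does not detail the 3DCLS architecture, but the pattern it names, a deep 3D convolutional front-end feeding a convolutional LSTM (ConvLSTM) whose final hidden state is classified, can be sketched in PyTorch as below. This is a minimal illustration, not the authors' implementation: the class names (ConvLSTMCell, ThreeDCLS), layer widths, kernel sizes, and the six-class output are all assumptions.

```python
import torch
import torch.nn as nn

class ConvLSTMCell(nn.Module):
    # Minimal convolutional LSTM cell in the style of Shi et al. (2015):
    # all four gate pre-activations come from one convolution over [input, hidden].
    def __init__(self, in_ch, hid_ch, kernel=3):
        super().__init__()
        self.gates = nn.Conv2d(in_ch + hid_ch, 4 * hid_ch, kernel,
                               padding=kernel // 2)

    def forward(self, x, state):
        h, c = state
        i, f, g, o = self.gates(torch.cat([x, h], dim=1)).chunk(4, dim=1)
        c = torch.sigmoid(f) * c + torch.sigmoid(i) * torch.tanh(g)
        h = torch.sigmoid(o) * torch.tanh(c)
        return h, c

class ThreeDCLS(nn.Module):
    # Hypothetical 3D-CNN + ConvLSTM pipeline; all sizes are illustrative only.
    def __init__(self, n_classes=6, hid_ch=64):
        super().__init__()
        self.hid_ch = hid_ch
        self.c3d = nn.Sequential(
            nn.Conv3d(3, 32, 3, padding=1), nn.ReLU(),
            nn.MaxPool3d((1, 2, 2)),               # pool space, keep time
            nn.Conv3d(32, 64, 3, padding=1), nn.ReLU(),
            nn.MaxPool3d((2, 2, 2)),               # pool space and time
        )
        self.clstm = ConvLSTMCell(64, hid_ch)
        self.head = nn.Linear(hid_ch, n_classes)

    def forward(self, clips):                       # clips: (B, 3, T, H, W)
        feats = self.c3d(clips)                     # (B, 64, T', H', W')
        B, _, T, H, W = feats.shape
        h = feats.new_zeros(B, self.hid_ch, H, W)
        c = feats.new_zeros(B, self.hid_ch, H, W)
        for t in range(T):                          # unroll ConvLSTM over time
            h, c = self.clstm(feats[:, :, t], (h, c))
        return self.head(h.mean(dim=(2, 3)))        # pool space -> class logits

logits = ThreeDCLS()(torch.randn(2, 3, 16, 64, 64))  # two 16-frame RGB clips
```

The design point worth noting is that the ConvLSTM keeps its hidden state spatial (a feature map rather than a vector), so spatial structure survives across time steps until the final pooling.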

Highlights

Proposes a 3DCLS model that captures spatio-temporal information for emotion recognition by cascading a deep 3D convolutional network with a convolutional long short-term memory (ConvLSTM) recurrent neural network.
Introduces a CNN–RNN hybrid model in which a CNN extracts emotional features from text and an RNN classifies the extracted features (see the sketch after this list).
Constructs a multimodal fusion framework that uses multiple kernel learning (MKL) to integrate heterogeneous visual, textual, and audio information.
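The highlights describe the text branch only at this level: a CNN extracts emotional features and an RNN classifies them. A minimal sketch of that pattern, in the same hedged spirit as the 3DCLS sketch above (the class name, embedding size, filter count, LSTM width, and class count are assumptions, not values from the paper):

```python
import torch
import torch.nn as nn

class CnnRnnTextClassifier(nn.Module):
    # Hypothetical CNN -> RNN text pipeline: a 1D convolution extracts local
    # n-gram features, and an LSTM classifies the resulting feature sequence.
    def __init__(self, vocab_size=10000, emb_dim=128, n_filters=100,
                 hidden=128, n_classes=6):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim)
        self.conv = nn.Conv1d(emb_dim, n_filters, kernel_size=3, padding=1)
        self.rnn = nn.LSTM(n_filters, hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_classes)

    def forward(self, tokens):                    # tokens: (B, L) word ids
        x = self.emb(tokens).transpose(1, 2)      # (B, emb_dim, L)
        x = torch.relu(self.conv(x))              # (B, n_filters, L)
        _, (h, _) = self.rnn(x.transpose(1, 2))   # h: (1, B, hidden)
        return self.head(h[-1])                   # final hidden state -> logits

logits = CnnRnnTextClassifier()(torch.randint(0, 10000, (2, 40)))
```

For the MKL fusion stage the excerpt gives no kernel details. True MKL learns the per-modality kernel weights jointly with the classifier; the snippet below is only a fixed-weight stand-in that combines one linear kernel per modality and feeds the result to a precomputed-kernel SVM, with random arrays as placeholder features.

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
# Placeholder visual/text/audio features for 50 samples; dims are arbitrary.
Xv, Xt, Xa = (rng.normal(size=(50, d)) for d in (64, 128, 32))
y = rng.integers(0, 2, size=50)                  # placeholder labels

# One linear kernel per modality; true MKL would learn these weights.
K = 0.4 * Xv @ Xv.T + 0.3 * Xt @ Xt.T + 0.3 * Xa @ Xa.T
clf = SVC(kernel="precomputed").fit(K, y)
```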



Published In

Future Generation Computer Systems, Volume 102, Issue C, January 2020, 1062 pages

Publisher

Elsevier Science Publishers B.V., Netherlands

Author Tags

1. Multimodal fusion
2. Emotion analysis
3. 3D convolutional neural network
4. Recurrent neural network

Qualifiers

• Research-article


            Cited By

• (2024) A Survey on Variational Autoencoders in Recommender Systems. ACM Computing Surveys 56(10), 1–40. https://doi.org/10.1145/3663364
• (2024) EmoWear: Exploring Emotional Teasers for Voice Message Interaction on Smartwatches. Proceedings of the 2024 CHI Conference on Human Factors in Computing Systems, 1–16. https://doi.org/10.1145/3613904.3642101
• (2024) Amount-Based Covert Communication Over Blockchain. IEEE Transactions on Network and Service Management 21(3), 3095–3111. https://doi.org/10.1109/TNSM.2024.3358013
• (2023) Ensemble Convolution Neural Network for Robust Video Emotion Recognition Using Deep Semantics. Scientific Programming 2023. https://doi.org/10.1155/2023/6859284
• (2023) A Survey of Textual Emotion Recognition and Its Challenges. IEEE Transactions on Affective Computing 14(1), 49–67. https://doi.org/10.1109/TAFFC.2021.3053275
• (2023) Speech emotion recognition approaches. Speech Communication 154(C). https://doi.org/10.1016/j.specom.2023.102974
• (2023) Modality-invariant temporal representation learning for multimodal sentiment classification. Information Fusion 91(C), 504–514. https://doi.org/10.1016/j.inffus.2022.10.031
• (2023) Textual emotion detection utilizing a transfer learning approach. The Journal of Supercomputing 79(12), 13075–13089. https://doi.org/10.1007/s11227-023-05168-5
• (2022) Cross-modal distillation with audio–text fusion for fine-grained emotion classification using BERT and Wav2vec 2.0. Neurocomputing 506(C), 168–183. https://doi.org/10.1016/j.neucom.2022.07.035
• (2022) Automated emotion recognition. Computer Methods and Programs in Biomedicine 215(C). https://doi.org/10.1016/j.cmpb.2022.106646
