Deep Learning for Human Affect Recognition: Insights and New Developments

Published: 01 April 2021

Abstract

Automatic human affect recognition is a key step towards more natural human-computer interaction. Recent trends include recognition in the wild using a fusion of audiovisual and physiological sensors, a challenging setting for conventional machine learning algorithms. Since 2010, novel deep learning algorithms have been applied increasingly in this field. In this paper, we review the literature on human affect recognition between 2010 and 2017, with a special focus on approaches using deep neural networks. By classifying a total of 950 studies according to their usage of shallow or deep architectures, we are able to show a trend towards deep learning. Reviewing a subset of 233 studies that employ deep neural networks, we comprehensively quantify their applications in this field. We find that deep learning is used for learning of (i) spatial feature representations, (ii) temporal feature representations, and (iii) joint feature representations for multimodal sensor data. Exemplary state-of-the-art architectures illustrate the progress. Our findings show the role deep architectures will play in human affect recognition, and can serve as a reference point for researchers working on related applications.
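
To make the three uses of deep learning listed above more concrete, the following minimal sketch (illustrative only, not an architecture taken from the surveyed papers; all layer sizes, tensor shapes, and the seven-class output are assumptions chosen for the example) stacks a per-frame CNN for spatial features, an LSTM over the frame descriptors for temporal features, and a concatenation of the video summary with an audio feature vector as a simple joint multimodal representation.

```python
# Illustrative sketch only: one possible CNN-RNN pipeline of the kind the survey
# groups into (i) spatial, (ii) temporal, and (iii) joint representation learning.
# Layer sizes, input shapes, and the audio feature dimension are arbitrary assumptions.
import torch
import torch.nn as nn

class AffectSketch(nn.Module):
    def __init__(self, num_classes=7, audio_dim=88):
        super().__init__()
        # (i) Spatial features: a small CNN applied independently to each video frame.
        self.cnn = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),            # -> (batch*time, 64, 1, 1)
        )
        # (ii) Temporal features: an LSTM over the sequence of per-frame descriptors.
        self.rnn = nn.LSTM(input_size=64, hidden_size=128, batch_first=True)
        # (iii) Joint features: fuse the video summary with an audio feature vector.
        self.fusion = nn.Linear(128 + audio_dim, 128)
        self.classifier = nn.Linear(128, num_classes)

    def forward(self, frames, audio_feats):
        # frames: (batch, time, 3, H, W); audio_feats: (batch, audio_dim)
        b, t = frames.shape[:2]
        x = self.cnn(frames.flatten(0, 1)).flatten(1)   # (b*t, 64) per-frame descriptors
        x = x.view(b, t, -1)                            # (b, t, 64) frame sequence
        _, (h, _) = self.rnn(x)                         # h[-1]: (b, 128) video summary
        joint = torch.cat([h[-1], audio_feats], dim=1)  # joint audiovisual representation
        return self.classifier(torch.relu(self.fusion(joint)))

# Example call with random stand-in data:
# logits = AffectSketch()(torch.randn(2, 16, 3, 64, 64), torch.randn(2, 88))
```

Published systems generally build on pretrained image or face networks and use more elaborate fusion schemes; the sketch only shows how the spatial, temporal, and joint stages sit relative to each other.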

Published In

IEEE Transactions on Affective Computing, Volume 12, Issue 2 (April-June 2021), 274 pages

Publisher

IEEE Computer Society Press, Washington, DC, United States

Qualifiers

• Research-article

Cited By

• (2024) Discriminative spatial-temporal feature learning for modeling network intrusion detection systems. Journal of Computer Security, 32(1), 1-30. DOI: 10.3233/JCS-220031
• (2024) COLD Fusion: Calibrated and Ordinal Latent Distribution Fusion for Uncertainty-Aware Multimodal Emotion Recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence, 46(2), 805-822. DOI: 10.1109/TPAMI.2023.3325770
• (2024) FEELing (key)Pressed: Implicit Touch Pressure Bests Brain Activity for Modeling Emotion Dynamics in the Space Between Stressed & Relaxed. IEEE Transactions on Haptics, 17(3), 310-318. DOI: 10.1109/TOH.2023.3308059
• (2024) Multimodal Prediction of Obsessive-Compulsive Disorder and Comorbid Depression Severity and Energy Delivered by Deep Brain Electrodes. IEEE Transactions on Affective Computing, 15(4), 2025-2041. DOI: 10.1109/TAFFC.2024.3395117
• (2024) A broad-deep fusion network-based fuzzy emotional intention inference model for teaching validity evaluation. Information Sciences, 654(C). DOI: 10.1016/j.ins.2023.119837
• (2024) A multimodal fusion-based deep learning framework combined with local-global contextual TCNs for continuous emotion recognition from videos. Applied Intelligence, 54(4), 3040-3057. DOI: 10.1007/s10489-024-05329-w
• (2024) Exploring contactless techniques in multimodal emotion recognition: insights into diverse applications, challenges, solutions, and prospects. Multimedia Systems, 30(3). DOI: 10.1007/s00530-024-01302-2
• (2024) Spatial deep feature augmentation technique for FER using genetic algorithm. Neural Computing and Applications, 36(9), 4563-4581. DOI: 10.1007/s00521-023-09245-x
• (2024) Nonverbal Immediacy Analysis in Education: A Multimodal Computational Model. From Animals to Animats 17, 326-338. DOI: 10.1007/978-3-031-71533-4_26
• (2023) RegBN. Proceedings of the 37th International Conference on Neural Information Processing Systems, 21687-21701. DOI: 10.5555/3666122.3667071
