Article

Exploring bag of words architectures in the facial expression domain

Authors:

Marian Bartlett

ECCV'12: Proceedings of the 12th international conference on Computer Vision - Volume 2

Pages 250 - 259

https://doi.org/10.1007/978-3-642-33868-7_25

Published: 07 October 2012

Abstract

Automatic facial expression recognition (AFER) has advanced substantially over the past two decades. This work explores the application of bag of words (BoW), a highly mature approach to object and scene recognition, to AFER. We first highlight the reasons that make the BoW task differ for AFER compared with object and scene recognition. We then propose suitable extensions to the BoW architecture for the AFER task. These extensions address some of the limitations of current state-of-the-art appearance-based approaches to AFER. Our BoW architecture is based on the spatial pyramid framework, augmented by multiscale dense SIFT features and a recently proposed approach for object classification: locality-constrained linear coding with max-pooling. Combining these, we obtain a powerful facial representation that works well even with linear classifiers. We show that a well-designed BoW architecture can provide a performance benefit for AFER, and we empirically evaluate elements of the proposed architecture. The proposed BoW approach surpasses previous state-of-the-art results, achieving an average recognition rate of 96% on AFER for two public datasets.
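To make the encoding and pooling stages named in the abstract more concrete, the following is a minimal NumPy sketch of approximate locality-constrained linear coding followed by spatial pyramid max-pooling. It is an illustration under stated assumptions, not the authors' implementation: the function names, the number of neighbours `k`, the regulariser `beta`, the pyramid levels, and the random stand-in descriptors and codebook are all hypothetical. In a real pipeline the descriptors would be multiscale dense SIFT features and the codebook would be learned from training data.

```python
import numpy as np


def llc_encode(descriptors, codebook, k=5, beta=1e-4):
    """Approximate LLC: reconstruct each descriptor from its k nearest
    codewords, with the weights constrained to sum to one.
    k and beta are illustrative values, not taken from the paper."""
    n, m = descriptors.shape[0], codebook.shape[0]
    # Squared Euclidean distance from every descriptor to every codeword.
    d2 = (
        (descriptors ** 2).sum(axis=1, keepdims=True)
        - 2.0 * descriptors @ codebook.T
        + (codebook ** 2).sum(axis=1)
    )
    nearest = np.argsort(d2, axis=1)[:, :k]
    codes = np.zeros((n, m))
    for i in range(n):
        idx = nearest[i]
        z = codebook[idx] - descriptors[i]  # codewords centred on the descriptor
        cov = z @ z.T                       # local covariance, k x k
        cov += (beta * np.trace(cov) + 1e-12) * np.eye(k)  # regularise
        w = np.linalg.solve(cov, np.ones(k))
        codes[i, idx] = w / w.sum()         # enforce the sum-to-one constraint
    return codes


def spatial_pyramid_max_pool(codes, xy, levels=(1, 2, 4)):
    """Max-pool codes inside each cell of a spatial pyramid.
    xy holds descriptor positions normalised to [0, 1] (an assumption)."""
    pooled = []
    for g in levels:
        cell = np.minimum((xy * g).astype(int), g - 1)
        cell_id = cell[:, 1] * g + cell[:, 0]
        for c in range(g * g):
            mask = cell_id == c
            if mask.any():
                pooled.append(codes[mask].max(axis=0))
            else:
                pooled.append(np.zeros(codes.shape[1]))
    feature = np.concatenate(pooled)
    norm = np.linalg.norm(feature)
    return feature / norm if norm > 0 else feature


# Stand-in data: random vectors in place of dense SIFT descriptors.
rng = np.random.default_rng(0)
descriptors = rng.random((50, 128))
positions = rng.random((50, 2))
codebook = rng.random((16, 128))

codes = llc_encode(descriptors, codebook)
feature = spatial_pyramid_max_pool(codes, positions)
```

The pooled vector has one block of codebook-sized entries per pyramid cell (here 1 + 4 + 16 cells), and could then be passed to a linear classifier such as a linear SVM.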

Exploring bag of words architectures in the facial expression domain
1. Computing methodologies
  1. Artificial intelligence
    1. Computer vision
      1. Computer vision problems
      2. Computer vision tasks
  2. Machine learning
    1. Machine learning algorithms

Published In

ECCV'12: Proceedings of the 12th international conference on Computer Vision - Volume 2

October 2012

609 pages

ISBN:9783642338670

Editors:
Andrea Fusiello, Dipartimento di Ingegneria Elettrica, Gestionale e Meccanica (DIEGM), Università degli Studi di Udine, Via delle Scienze, 208, Udine, Italy
Vittorio Murino, Dipartimento di Ingegneria Elettrica, Gestionale e Meccanica (DIEGM), IIT Istituto Italiano di Tecnologia, Via Morego 30, Genoa, Italy
Rita Cucchiara, Dipartimento di Ingegneria dell'Informazione, Università degli Studi di Modena e Reggio Emilia, Strada Vignolege, 905, Modena, Italy

Sponsors

Adobe
TOYOTA
Google Inc.
IBM Research
Microsoft Research

Publisher

Springer-Verlag

Berlin, Heidelberg

Qualifiers

Article

Cited By

Alzahrani M, Aldhyani T, Alsubari S, Althobaiti M, Fahad A (2022). Developing an Intelligent System with Deep Learning Algorithms for Sentiment Analysis of E-Commerce Product Reviews. Computational Intelligence and Neuroscience 2022. DOI: 10.1155/2022/3840071. Online publication date: 1-Jan-2022.
https://dl.acm.org/doi/10.1155/2022/3840071
Wang Y, Xu X, Zhuang Y (2021). Learning Dynamics for Video Facial Expression Recognition. Proceedings of the 2021 4th International Conference on Algorithms, Computing and Artificial Intelligence, pp. 1-5. DOI: 10.1145/3508546.3508581. Online publication date: 22-Dec-2021.
https://dl.acm.org/doi/10.1145/3508546.3508581
Huang K, Li J, Cheng S, Yu J, Tian W, Zhao L, Hu J, Chang C (2020). An Efficient Algorithm of Facial Expression Recognition by TSG-RNN Network. MultiMedia Modeling, pp. 161-174. DOI: 10.1007/978-3-030-37734-2_14. Online publication date: 5-Jan-2020.
https://dl.acm.org/doi/10.1007/978-3-030-37734-2_14
Chen J, Jin Y, Akram M, Li K, Chen E (2019). Novel multi-convolutional neural network fusion approach for smile recognition. Multimedia Tools and Applications 78:12, pp. 15887-15907. DOI: 10.1007/s11042-018-6945-x. Online publication date: 1-Jun-2019.
https://dl.acm.org/doi/10.1007/s11042-018-6945-x
Sen D, Datta S, Balasubramanian R (2019). Facial emotion classification using concatenated geometric and textural features. Multimedia Tools and Applications 78:8, pp. 10287-10323. DOI: 10.1007/s11042-018-6537-9. Online publication date: 1-Apr-2019.
https://dl.acm.org/doi/10.1007/s11042-018-6537-9
Fan Y, Lam J, Li V, D'Mello S, Georgiou P, Scherer S, Provost E, Soleymani M, Worsley M (2018). Video-based Emotion Recognition Using Deeply-Supervised Neural Networks. Proceedings of the 20th ACM International Conference on Multimodal Interaction, pp. 584-588. DOI: 10.1145/3242969.3264978. Online publication date: 2-Oct-2018.
https://dl.acm.org/doi/10.1145/3242969.3264978
Fan J, Tie Y, Qi L (2018). Facial Expression Recognition Based on Multiple Feature Fusion in Video. Proceedings of the 2018 International Conference on Computing and Pattern Recognition, pp. 86-92. DOI: 10.1145/3232829.3232839. Online publication date: 23-Jun-2018.
https://dl.acm.org/doi/10.1145/3232829.3232839
Zhang F, Mao Q, Shen X, Zhan Y, Dong M (2018). Spatially Coherent Feature Learning for Pose-Invariant Facial Expression Recognition. ACM Transactions on Multimedia Computing, Communications, and Applications 14:1s, pp. 1-19. DOI: 10.1145/3176646. Online publication date: 6-Mar-2018.
https://dl.acm.org/doi/10.1145/3176646
Chen J, Chen Z, Chi Z, Fu H (2018). Facial Expression Recognition in Video with Multiple Feature Fusion. IEEE Transactions on Affective Computing 9:1, pp. 38-50. DOI: 10.1109/TAFFC.2016.2593719. Online publication date: 1-Jan-2018.
https://dl.acm.org/doi/10.1109/TAFFC.2016.2593719
Du H, Zheng H, Yu M (2018). Facial Expression Recognition Based on Region-Wise Attention and Geometry Difference. Pattern Recognition and Computer Vision, pp. 183-194. DOI: 10.1007/978-3-030-03338-5_16. Online publication date: 23-Nov-2018.
https://dl.acm.org/doi/10.1007/978-3-030-03338-5_16
