
Texture and Geometry Scattering Representation-Based Facial Expression Recognition in 2D+3D Videos

Published: 06 March 2018

Abstract

Facial Expression Recognition (FER) is one of the most important topics in computer vision and pattern recognition, and it has attracted increasing attention for its scientific challenges and application potential. In this article, we propose a novel and effective approach to FER using multi-modal two-dimensional (2D) and 3D videos, which encodes both static and dynamic clues through a scattering convolution network. First, a shape-based detection method is introduced to locate the start and end of an expression in a video, segment its onset, apex, and offset states, and sample the important frames for emotion analysis. Second, the apex frames of 2D videos are represented by scattering, conveying static texture details. Those of 3D videos are processed in a similar way; however, to highlight static shape details, several geometric maps derived from multiple-order differential quantities, i.e., Normal Maps and Shape Index Maps, are generated as the input of scattering, instead of the original smooth facial surfaces. Third, the average of neighboring samples centred at each key texture frame or shape map in the onset stage is computed, and the scattering features extracted from all the averaged samples of the 2D and 3D videos are then concatenated to capture dynamic texture and shape cues, respectively. Finally, Multiple Kernel Learning is adopted to combine the features of the 2D and 3D modalities and compute similarities to predict the expression label. Thanks to the scattering descriptor, the proposed approach not only encodes distinct local texture and shape variations of different expressions, as several milestone operators such as SIFT and HOG do, but also captures subtle information hidden in the high frequencies of both channels, which is crucial for distinguishing expressions that are easily confused. Validation is conducted on the BU-4DFE and BP-4D databases, and the accuracies reached are highly competitive, demonstrating the competency of the approach for this task.
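The Shape Index Maps mentioned in the abstract follow the Koenderink and van Doorn formulation, which condenses the two principal curvatures at a surface point into a single value in [-1, 1]. A minimal sketch of that formula is shown below; this is not the authors' implementation, and the function name and the convention of mapping planar points to 0 are assumptions:

```python
import numpy as np

def shape_index(k1, k2):
    """Koenderink & van Doorn (1992) shape index from principal curvatures.

    Values lie in [-1, 1]: -1 spherical cup, -0.5 rut, 0 saddle,
    +0.5 ridge, +1 spherical cap. Planar points (k1 == k2 == 0) map
    to 0 here, which is a convention assumed for this sketch.
    """
    k1 = np.asarray(k1, dtype=float)
    k2 = np.asarray(k2, dtype=float)
    hi = np.maximum(k1, k2)   # larger principal curvature
    lo = np.minimum(k1, k2)   # smaller principal curvature
    # hi - lo >= 0, so arctan2 stays within [-pi/2, pi/2], matching the
    # arctan((hi + lo) / (hi - lo)) in the original formula while also
    # handling umbilic points (hi == lo) without division by zero
    return (2.0 / np.pi) * np.arctan2(hi + lo, hi - lo)
```

Applied per pixel to a range image's curvature maps, this yields the Shape Index Map that the paper feeds to the scattering network in place of the raw facial surface.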
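The final step fuses per-modality similarities before prediction. The paper uses Multiple Kernel Learning with an SVM to learn the combination weights; the sketch below is a simplification that keeps only the kernel-fusion idea, with fixed weights standing in for the learned MKL coefficients and a nearest-class-mean rule in the fused kernel's feature space standing in for the SVM (all names are assumptions):

```python
import numpy as np

def fuse_kernels(kernels, weights):
    """Weighted sum of base kernels; MKL would learn `weights` instead."""
    return sum(w * K for w, K in zip(weights, kernels))

def fused_kernel_predict(K_train, K_test, k_test_diag, y_train):
    """Nearest-class-mean classification in the fused kernel feature space.

    K_train: (n, n) fused train/train kernel; K_test: (m, n) fused
    test/train kernel; k_test_diag: (m,) fused K(x, x) per test point.
    Uses ||phi(x) - mean_c||^2 = K(x,x) - 2*mean_i K(x,x_i) + mean_ij K(x_i,x_j).
    """
    classes = np.unique(y_train)
    dists = np.empty((len(k_test_diag), len(classes)))
    for j, c in enumerate(classes):
        idx = np.flatnonzero(y_train == c)
        cross = K_test[:, idx].mean(axis=1)        # mean K(test, class members)
        within = K_train[np.ix_(idx, idx)].mean()  # mean K within the class
        dists[:, j] = k_test_diag - 2.0 * cross + within
    return classes[np.argmin(dists, axis=1)]
```

With linear base kernels built from toy 2D-texture and 3D-shape feature vectors, fusing the two Gram matrices and classifying against the fused kernel behaves like a single classifier over the concatenated, weighted features, which is the intuition behind combining the 2D and 3D channels.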



    Published In

    ACM Transactions on Multimedia Computing, Communications, and Applications  Volume 14, Issue 1s
    Special Section on Representation, Analysis and Recognition of 3D Humans and Special Section on Multimedia Computing and Applications of Socio-Affective Behaviors in the Wild
    March 2018
    234 pages
    ISSN:1551-6857
    EISSN:1551-6865
    DOI:10.1145/3190503
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 06 March 2018
    Accepted: 01 August 2017
    Revised: 01 June 2017
    Received: 01 February 2017
    Published in TOMM Volume 14, Issue 1s


    Author Tags

    1. 2D and 3D videos
    2. facial expression recognition
    3. multi-modal fusion
    4. scattering descriptor

    Qualifiers

    • Research-article
    • Research
    • Refereed

    Funding Sources

    • Research Program of State Key Laboratory of Software Development Environment
    • Partner University Foundation
    • PUF 4D Vision project
    • National Natural Science Foundation of China
    • Microsoft Research Asia Collaborative Program
    • French Research Agency
    • National Key Research and Development Plan
    • l'Agence Nationale de Recherche (ANR)
    • Jemime project


    Cited By

    • (2024) Suitable and Style-Consistent Multi-Texture Recommendation for Cartoon Illustrations. ACM Transactions on Multimedia Computing, Communications, and Applications 20, 7, 1--26. DOI: 10.1145/3652518
    • (2024) A Comprehensive Survey on Affective Computing: Challenges, Trends, Applications, and Future Directions. IEEE Access 12, 96150--96168. DOI: 10.1109/ACCESS.2024.3422480
    • (2024) Cross-Domain Facial Expression Recognition by Combining Transfer Learning and Face-Cycle Generative Adversarial Network. Multimedia Tools and Applications. DOI: 10.1007/s11042-024-18713-y
    • (2023) Facial expression recognition method based on PSA-YOLO network. Frontiers in Neurorobotics 16. DOI: 10.3389/fnbot.2022.1057983
    • (2023) Spiking-Fer: Spiking Neural Network for Facial Expression Recognition With Event Cameras. Proceedings of the 20th International Conference on Content-based Multimedia Indexing, 1--7. DOI: 10.1145/3617233.3617235
    • (2023) A comprehensive survey on techniques to handle face identity threats: challenges and opportunities. Multimedia Tools and Applications 82, 2, 1669--1748. DOI: 10.1007/s11042-022-13248-6
    • (2022) A systematic review on affective computing: emotion models, databases, and recent advances. Information Fusion 83--84, 19--52. DOI: 10.1016/j.inffus.2022.03.009
    • (2021) A Survey on Various Deep Learning Algorithms for an Efficient Facial Expression Recognition System. International Journal of Image and Graphics 23, 3. DOI: 10.1142/S0219467822400058
    • (2021) Multi-angle face expression recognition based on generative adversarial networks. Computational Intelligence 38, 1, 20--37. DOI: 10.1111/coin.12437
    • (2021) Spatio-Temporal Encoder-Decoder Fully Convolutional Network for Video-Based Dimensional Emotion Recognition. IEEE Transactions on Affective Computing 12, 3, 565--578. DOI: 10.1109/TAFFC.2019.2940224
