A Deep Learning System for Recognizing Facial Expression in Real-Time

Published: 05 June 2019

Abstract

This article presents an image-based, real-time facial expression recognition system that can recognize the facial expressions of several subjects on a webcam simultaneously. The proposed methodology combines a supervised transfer learning strategy with a joint supervision method that uses center loss, which is crucial for facial tasks. MobileNet, a recently proposed Convolutional Neural Network (CNN) model that offers both accuracy and speed, is deployed in both an offline and a real-time framework, enabling fast and accurate real-time output. The system is evaluated on two publicly available datasets, JAFFE and CK+. It reaches an accuracy of 95.24% on the JAFFE dataset and 96.92% on the 6-class CK+ dataset, which contains only the last frames of the image sequences. Finally, the average run-time cost of recognition in the real-time implementation is around 3.57 ms/frame on an NVIDIA Quadro K4200 GPU.
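
The joint supervision described above can be illustrated with a minimal sketch. It assumes the commonly used formulation in which a softmax cross-entropy term is added to a weighted center-loss term, where the center loss is half the squared distance between a sample's feature vector and the center of its class. The function names, the batch averaging, and the weight `lam` are assumptions made for illustration; the abstract does not give the article's actual formulation or hyperparameters.

```python
import math


def softmax_cross_entropy(logits, label):
    """Cross-entropy of one sample from raw class scores (numerically stable)."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[label]


def center_loss(feature, center):
    """Half the squared Euclidean distance between a feature and its class center."""
    return 0.5 * sum((f - c) ** 2 for f, c in zip(feature, center))


def joint_loss(features, logits, labels, centers, lam=0.01):
    """Mean softmax loss plus lam times mean center loss over a batch.

    lam is a hypothetical weighting value, not one taken from the article.
    """
    n = len(labels)
    ce = sum(softmax_cross_entropy(z, y) for z, y in zip(logits, labels)) / n
    cl = sum(center_loss(x, centers[y]) for x, y in zip(features, labels)) / n
    return ce + lam * cl


# Toy batch: two samples, two classes, 2-D features (all values are made up).
features = [[1.0, 0.0], [0.0, 2.0]]
logits = [[2.0, 0.0], [0.0, 2.0]]
labels = [0, 1]
centers = [[1.0, 0.0], [0.0, 0.0]]
print(joint_loss(features, logits, labels, centers))
```

The center-loss term pulls features of the same class toward a shared center, which is why it is typically paired with the softmax term rather than used alone: the softmax term keeps the classes separable while the center term tightens each class cluster.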

Published In

ACM Transactions on Multimedia Computing, Communications, and Applications, Volume 15, Issue 2
May 2019
375 pages
ISSN: 1551-6857
EISSN: 1551-6865
DOI: 10.1145/3339884
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 05 June 2019
Accepted: 01 January 2019
Revised: 01 November 2018
Received: 01 September 2018
Published in TOMM Volume 15, Issue 2

Author Tags

  1. Facial expression recognition
  2. deep learning networks

Qualifiers

  • Research-article
  • Research
  • Refereed

Funding Sources

  • Qatar National Research Fund (a member of Qatar Foundation)
  • NPRP

Cited By

  • (2024) XR technologies to enhance the emotional skills of people with autism spectrum disorder: A systematic review. Computers & Graphics 121 (103942). DOI: 10.1016/j.cag.2024.103942. Online publication date: Jun-2024
  • (2024) MiniTomatoNet: a lightweight CNN for tomato leaf disease recognition on heterogeneous FPGA-SoC. The Journal of Supercomputing 80:15 (21837-21866). DOI: 10.1007/s11227-024-06301-8. Online publication date: 1-Oct-2024
  • (2024) Enhanced spatio-temporal 3D CNN for facial expression classification in videos. Multimedia Tools and Applications 83:4 (9911-9928). DOI: 10.1007/s11042-023-16066-6. Online publication date: 1-Jan-2024
  • (2023) A feature boosted deep learning method for automatic facial expression recognition. PeerJ Computer Science 9 (e1216). DOI: 10.7717/peerj-cs.1216. Online publication date: 31-Jan-2023
  • (2023) Virtual Reality Solutions Employing Artificial Intelligence Methods: A Systematic Literature Review. ACM Computing Surveys 55:10 (1-29). DOI: 10.1145/3565020. Online publication date: 2-Feb-2023
  • (2023) ExpresSense: Exploring a Standalone Smartphone to Sense Engagement of Users from Facial Expressions Using Acoustic Sensing. Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems (1-18). DOI: 10.1145/3544548.3581235. Online publication date: 19-Apr-2023
  • (2023) Meta-MMFNet: Meta-learning-based Multi-model Fusion Network for Micro-expression Recognition. ACM Transactions on Multimedia Computing, Communications, and Applications 20:2 (1-20). DOI: 10.1145/3539576. Online publication date: 25-Sep-2023
  • (2023) A Review of Emotions Recognition via Facial Expressions for Human-Robot Interaction. 2023 International Conference on Information Technology, Applied Mathematics and Statistics (ICITAMS) (116-120). DOI: 10.1109/ICITAMS57610.2023.10525397. Online publication date: 20-Mar-2023
  • (2023) Transformer-Based Feature Fusion Approach for Multimodal Visual Sentiment Recognition Using Tweets in the Wild. IEEE Access 11 (47070-47079). DOI: 10.1109/ACCESS.2023.3274744. Online publication date: 2023
  • (2023) Real time facial expression and gender recognition with feature integrated CNN. The Imaging Science Journal 69:5-8 (254-269). DOI: 10.1080/13682199.2022.2157956. Online publication date: 11-Jan-2023
