DOI: 10.1145/3274783.3275184 (ACM SenSys conference proceedings; short paper)

Speech Emotion Recognition via Attention-based DNN from Multi-Task Learning

Published: 04 November 2018

Abstract

Speech unlocks huge potential for emotion recognition. Highly accurate, real-time understanding of human emotion from speech can greatly assist human-computer interaction. Previous work is often limited either to coarse-grained emotion learning tasks or by low precision in emotion recognition. To address these problems, we construct a real-world, large-scale corpus covering four common emotions (anger, happiness, neutral, and sadness). We also propose a multi-task attention-based DNN model (MT-A-DNN) for emotion learning. MT-A-DNN efficiently learns the high-order dependencies and non-linear correlations underlying the audio data. Extensive experiments show that MT-A-DNN outperforms conventional methods on emotion recognition, taking a step toward real-time acoustic emotion recognition on many smart audio devices.
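The abstract names two ingredients, attention over the audio representation and multi-task learning over a shared model, but this excerpt does not give MT-A-DNN's actual architecture. As a rough illustration only, the forward pass of such a model can be sketched as attention pooling over frame-level acoustic features followed by task-specific output heads on a shared encoder. All shapes, the single dense-layer encoder, and the auxiliary task below are hypothetical stand-ins, not the authors' design.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

# Hypothetical shapes: T frames of D-dim acoustic features (e.g. MFCCs),
# H-dim hidden representation.
T, D, H = 120, 40, 64
frames = rng.standard_normal((T, D))

# Shared encoder: one dense layer standing in for the DNN trunk.
W_enc = rng.standard_normal((D, H)) * 0.1
hidden = np.tanh(frames @ W_enc)          # (T, H) frame embeddings

# Attention pooling: score each frame, normalize, weighted sum.
w_att = rng.standard_normal(H) * 0.1
alpha = softmax(hidden @ w_att)           # (T,) attention weights, sum to 1
utterance = alpha @ hidden                # (H,) utterance-level embedding

# Multi-task heads sharing the encoder: the 4-way emotion task from the
# abstract, plus a hypothetical auxiliary task (e.g. speaker gender).
W_emo = rng.standard_normal((H, 4)) * 0.1
W_aux = rng.standard_normal((H, 2)) * 0.1
p_emotion = softmax(utterance @ W_emo)    # (4,) probabilities over emotions
p_aux = softmax(utterance @ W_aux)        # (2,) auxiliary-task probabilities

EMOTIONS = ["anger", "happiness", "neutral", "sadness"]
print(EMOTIONS[int(p_emotion.argmax())])
```

In multi-task training, both heads would be optimized jointly so the shared encoder learns representations useful for all tasks; here the weights are random, so only the data flow is shown.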



      Published In

      SenSys '18: Proceedings of the 16th ACM Conference on Embedded Networked Sensor Systems
      November 2018
      449 pages
      ISBN:9781450359528
      DOI:10.1145/3274783
      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      Published: 04 November 2018


      Author Tags

      1. Multi-Task Learning
      2. Speech Emotion Recognition

      Qualifiers

      • Short-paper
      • Research
      • Refereed limited

      Acceptance Rates

      Overall Acceptance Rate 174 of 867 submissions, 20%

      Article Metrics

      • Downloads (Last 12 months)27
      • Downloads (Last 6 weeks)6
      Reflects downloads up to 10 Nov 2024


      Cited By

• (2024) Detecting Moral Emotions with Facial and Vocal Expressions: A Multimodal Emotion Recognition Approach. 2024 IEEE 4th International Conference on Human-Machine Systems (ICHMS), 10.1109/ICHMS59971.2024.10555674, pp. 1-5. Online publication date: 15-May-2024.
• (2023) Multitask Learning From Augmented Auxiliary Data for Improving Speech Emotion Recognition. IEEE Transactions on Affective Computing, 10.1109/TAFFC.2022.3221749, 14(4), pp. 3164-3176. Online publication date: 1-Oct-2023.
• (2023) An Adversarial Training Based Speech Emotion Classifier With Isolated Gaussian Regularization. IEEE Transactions on Affective Computing, 10.1109/TAFFC.2022.3169091, 14(3), pp. 2361-2374. Online publication date: 1-Jul-2023.
• (2023) Focusing on Needs: A Chatbot-Based Emotion Regulation Tool for Adolescents. 2023 IEEE International Conference on Systems, Man, and Cybernetics (SMC), 10.1109/SMC53992.2023.10394600, pp. 2295-2300. Online publication date: 1-Oct-2023.
• (2022) Data Augmentation for Audio-Visual Emotion Recognition with an Efficient Multimodal Conditional GAN. Applied Sciences, 10.3390/app12010527, 12(1), 527. Online publication date: 5-Jan-2022.
• (2022) Multi-Task Semi-Supervised Adversarial Autoencoding for Speech Emotion Recognition. IEEE Transactions on Affective Computing, 10.1109/TAFFC.2020.2983669, 13(2), pp. 992-1004. Online publication date: 1-Apr-2022.
• (2022) Refined Feature Vectors for Human Emotion Classifier by Combining Multiple Learning Strategies with Recurrent Neural Networks. 2022 International Conference on Breakthrough in Heuristics And Reciprocation of Advanced Technologies (BHARAT), 10.1109/BHARAT53139.2022.00042, pp. 160-165. Online publication date: Apr-2022.
• (2021) A Deep Learning-Based Approach to Constructing a Domain Sentiment Lexicon: A Case Study in Financial Distress Prediction. Information Processing & Management, 10.1016/j.ipm.2021.102673, 58(5), 102673. Online publication date: Sep-2021.
• (2020) Learning Better Representations for Audio-Visual Emotion Recognition with Common Information. Applied Sciences, 10.3390/app10207239, 10(20), 7239. Online publication date: 16-Oct-2020.
• (2020) Emotional Speaker Recognition based on Machine and Deep Learning. 2020 2nd International Multidisciplinary Information Technology and Engineering Conference (IMITEC), 10.1109/IMITEC50163.2020.9334138, pp. 1-8. Online publication date: 25-Nov-2020.
