DOI: 10.1145/3264869.3264875
research-article

A Deep Learning-based Stress Detection Algorithm with Speech Signal

Published: 26 October 2018

Abstract

In this paper, we propose a deep learning-based psychological stress detection algorithm using speech signals. With increasing demand for communication between humans and intelligent systems, automatic stress detection is becoming an interesting research topic. Stress can be reliably detected by measuring the level of specific hormones (e.g., cortisol), but this is not a convenient method for detecting stress in human-machine interactions. The proposed algorithm first extracts mel-filterbank coefficients from pre-processed speech data and then makes a binary decision (i.e., stressed or unstressed) using long short-term memory (LSTM) and feed-forward networks. To evaluate the performance of the proposed algorithm, speech, video, and bio-signal data were collected in a well-controlled environment. In the decision process, we used only the speech signals of subjects whose salivary cortisol level varied by more than 10%. Using the proposed algorithm, we achieved 66.4% accuracy in detecting the stress state of 25 subjects, demonstrating the feasibility of using speech signals for automatic stress detection.
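
The pipeline described in the abstract (mel-filterbank features, then an LSTM, then a feed-forward layer with a binary output) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not give frame sizes, filter counts, layer widths, or training details, so the 25 ms / 10 ms framing, 40 filters, the layer sizes, the per-utterance normalization, the 0.5 threshold, and all function names here are assumptions, and the weights are random rather than trained.

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_mels, n_fft, sr):
    """Triangular mel filters, shape (n_mels, n_fft // 2 + 1)."""
    mel_pts = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_mels + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_pts) / sr).astype(int)
    fb = np.zeros((n_mels, n_fft // 2 + 1))
    for m in range(1, n_mels + 1):
        left, center, right = bins[m - 1], bins[m], bins[m + 1]
        for k in range(left, center):          # rising edge
            fb[m - 1, k] = (k - left) / max(center - left, 1)
        for k in range(center, right):         # falling edge
            fb[m - 1, k] = (right - k) / max(right - center, 1)
    return fb

def log_mel_features(signal, sr=16000, frame_len=400, hop=160,
                     n_fft=512, n_mels=40):
    """Log mel-filterbank energies, shape (n_frames, n_mels)."""
    n_frames = 1 + (len(signal) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = signal[idx] * np.hamming(frame_len)
    power = np.abs(np.fft.rfft(frames, n=n_fft, axis=1)) ** 2 / n_fft
    return np.log(power @ mel_filterbank(n_mels, n_fft, sr).T + 1e-10)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_last_state(x, W, U, b):
    """Run a single-layer LSTM over x (T, D); return the final hidden state (H,)."""
    H = U.shape[1]
    h, c = np.zeros(H), np.zeros(H)
    for x_t in x:
        z = W @ x_t + U @ h + b                # stacked gate pre-activations (4H,)
        i, f, o = sigmoid(z[:H]), sigmoid(z[H:2 * H]), sigmoid(z[2 * H:3 * H])
        g = np.tanh(z[3 * H:])                 # candidate cell update
        c = f * c + i * g
        h = o * np.tanh(c)
    return h

def init_params(n_mels=40, hidden=32, ff=16, seed=0):
    """Random (untrained) weights; sizes are illustrative assumptions."""
    rng = np.random.default_rng(seed)
    s = 0.1
    return {
        "W": s * rng.standard_normal((4 * hidden, n_mels)),
        "U": s * rng.standard_normal((4 * hidden, hidden)),
        "b": np.zeros(4 * hidden),
        "W1": s * rng.standard_normal((ff, hidden)),
        "b1": np.zeros(ff),
        "w2": s * rng.standard_normal(ff),
        "b2": 0.0,
    }

def predict_stress(features, params, threshold=0.5):
    """Return (probability, is_stressed) for one utterance."""
    # Per-utterance mean/variance normalization (an assumption, not from the paper).
    x = (features - features.mean(axis=0)) / (features.std(axis=0) + 1e-8)
    h = lstm_last_state(x, params["W"], params["U"], params["b"])
    hidden = np.maximum(0.0, params["W1"] @ h + params["b1"])  # feed-forward (ReLU)
    p = float(sigmoid(params["w2"] @ hidden + params["b2"]))
    return p, p >= threshold

# One second of synthetic noise stands in for pre-processed speech.
speech = np.random.default_rng(1).standard_normal(16000)
feats = log_mel_features(speech)
prob, stressed = predict_stress(feats, init_params())
print(feats.shape, round(prob, 3), stressed)
```

With untrained weights the output probability is meaningless; the sketch only shows how frame-level filterbank features flow through a recurrent layer into a single stressed/unstressed decision per utterance.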

Information

Published In

AVSU'18: Proceedings of the 2018 Workshop on Audio-Visual Scene Understanding for Immersive Multimedia
October 2018
46 pages
ISBN:9781450359771
DOI:10.1145/3264869

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. cortisol
  2. deep learning
  3. long short-term memory (LSTM)
  4. stress detection

Qualifiers

  • Research-article

Funding Sources

  • National Research Foundation of Korea

Conference

MM '18: ACM Multimedia Conference
October 26, 2018
Seoul, Republic of Korea

Bibliometrics

Article Metrics

  • Downloads (Last 12 months): 74
  • Downloads (Last 6 weeks): 9
Reflects downloads up to 01 Jan 2025

Cited By

  • (2024) Integration of MFCCs and CNN for Multi-Class Stress Speech Classification on Unscripted Dataset. IIUM Engineering Journal, 25:2 (381-395). DOI: 10.31436/iiumej.v25i2.3207. Online publication date: 14-Jul-2024
  • (2024) Multimodal depression detection using deep learning in the workplace. 2024 Fourth International Conference on Advances in Electrical, Computing, Communication and Sustainable Technologies (ICAECT) (1-8). DOI: 10.1109/ICAECT60202.2024.10468966. Online publication date: 11-Jan-2024
  • (2024) LSTM‐based real‐time stress detection using PPG signals on raspberry Pi. IET Wireless Sensor Systems. DOI: 10.1049/wss2.12083. Online publication date: 30-Oct-2024
  • (2024) Unravelling stress levels in continuous speech through optimal feature selection and deep learning. Procedia Computer Science, 235 (1722-1731). DOI: 10.1016/j.procs.2024.04.163. Online publication date: 2024
  • (2024) EEG sensor driven assistive device for elbow and finger rehabilitation using deep learning. Expert Systems with Applications, 244 (122954). DOI: 10.1016/j.eswa.2023.122954. Online publication date: Jun-2024
  • (2024) Audio spectrogram analysis in IoT paradigm for the classification of psychological-emotional characteristics. International Journal of Information Technology. DOI: 10.1007/s41870-024-02166-5. Online publication date: 5-Sep-2024
  • (2024) Real-Time Stress Detection from Raw Noisy PPG Signals Using LSTM Model Leveraging TinyML. Arabian Journal for Science and Engineering. DOI: 10.1007/s13369-024-09095-2. Online publication date: 7-May-2024
  • (2024) Real-time stress detection from smartphone sensor data using genetic algorithm-based feature subset optimization and k-nearest neighbor algorithm. Multimedia Tools and Applications, 83:1 (1-32). DOI: 10.1007/s11042-023-15706-1. Online publication date: 1-Jan-2024
  • (2023) Stress and Anxiety Detection via Facial Expression Through Deep Learning. 2023 3rd International Conference on Technological Advancements in Computational Sciences (ICTACS) (1565-1568). DOI: 10.1109/ICTACS59847.2023.10389882. Online publication date: 1-Nov-2023
  • (2023) Analysis and Detection of Speech under Emotional Stress. 2023 21st International Conference on Emerging eLearning Technologies and Applications (ICETA) (493-498). DOI: 10.1109/ICETA61311.2023.10343755. Online publication date: 26-Oct-2023
