Abstract
Electroglottographic (EGG) signals are acquired directly from the glottis and therefore effectively represent the excitation source component of the human speech production system. Compared to speech signals, EGG signals are smooth and carry perceptually relevant emotional information. This paper presents a sequence of experiments on an emotion recognition system built by Gaussian mixture modeling (GMM) of perceptually motivated Mel frequency cepstral coefficient (MFCC) features extracted from the EGG. The conclusions drawn from these experiments are twofold. (1) The 13 static MFCC features gave better emotion recognition performance than the 39 MFCC features obtained by adding the dynamic coefficients (\(\Delta\) and \(\Delta\Delta\)). (2) Increasing the number of Mel filters used for MFCC computation, which emphasizes the low frequency regions of the EGG, was found to improve emotion recognition performance for EGG. These experimental results are verified on the EGG data available in the classic German emotional speech database (EmoDb) for four emotions (Anger, Happy, Boredom and Fear) in addition to Neutral signals.
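The two feature choices compared in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the common Mel mapping 2595 log10(1 + f/700), an 8 kHz upper band edge (that is, 16 kHz sampling), and the usual regression formula for dynamic coefficients with a window of two frames. The filter counts 24 and 40 and the names `mel_filter_centers`, `delta` and `add_dynamic` are hypothetical and chosen only for illustration.

```python
import math


def hz_to_mel(f_hz):
    # Assumed Mel mapping; other variants of the scale exist.
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_filter_centers(n_filters, f_min=0.0, f_max=8000.0):
    """Center frequencies (Hz) of n_filters filters spaced evenly on the Mel scale."""
    lo, hi = hz_to_mel(f_min), hz_to_mel(f_max)
    step = (hi - lo) / (n_filters + 1)
    return [mel_to_hz(lo + step * i) for i in range(1, n_filters + 1)]


def delta(frames, width=2):
    """Regression-based dynamic coefficients; edge frames are repeated as padding."""
    denom = 2.0 * sum(n * n for n in range(1, width + 1))
    last = len(frames) - 1
    out = []
    for t in range(len(frames)):
        row = []
        for k in range(len(frames[t])):
            acc = 0.0
            for n in range(1, width + 1):
                ahead = frames[min(t + n, last)][k]
                behind = frames[max(t - n, 0)][k]
                acc += n * (ahead - behind)
            row.append(acc / denom)
        out.append(row)
    return out


def add_dynamic(frames, width=2):
    """Append delta and delta-delta to each static frame (13 values become 39)."""
    d1 = delta(frames, width)
    d2 = delta(d1, width)
    return [s + a + b for s, a, b in zip(frames, d1, d2)]


if __name__ == "__main__":
    # Hypothetical filter counts; the abstract does not state the values used.
    for n in (24, 40):
        centers = mel_filter_centers(n)
        low = sum(1 for c in centers if c < 1000.0)
        print(n, "filters:", low, "centered below 1 kHz")
```

Because the filters are evenly spaced on the Mel scale, raising the filter count places more filters below 1 kHz, which is one way to read the emphasis on low frequency regions described in conclusion (2). `add_dynamic` shows how a 13-dimensional static frame grows to the 39-dimensional vector discussed in conclusion (1).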
Copyright information
© 2018 Springer International Publishing AG
About this paper
Cite this paper
Ajay, S.G., Pravena, D., Govind, D., Pradeep, D. (2018). Exploring the Significance of Low Frequency Regions in Electroglottographic Signals for Emotion Recognition. In: Thampi, S., Krishnan, S., Corchado Rodriguez, J., Das, S., Wozniak, M., Al-Jumeily, D. (eds) Advances in Signal Processing and Intelligent Recognition Systems. SIRS 2017. Advances in Intelligent Systems and Computing, vol 678. Springer, Cham. https://doi.org/10.1007/978-3-319-67934-1_28
DOI: https://doi.org/10.1007/978-3-319-67934-1_28
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-67933-4
Online ISBN: 978-3-319-67934-1
eBook Packages: Engineering (R0)