DOI: 10.1145/2522848.2531743

Multi classifier systems and forward backward feature selection algorithms to classify emotional coloured speech

Published: 09 December 2013

Abstract

Systems for the recognition of psychological characteristics such as the emotional state in real-world scenarios have to deal with several difficulties, among them unconstrained environments and uncertainties in one or several input channels. A more crucial aspect, however, is the content of the data itself. Psychological states are highly person-dependent, and often even humans cannot determine the correct state a person is in. A successful recognition system therefore has to deal with data that is not very discriminative and often simply misleading. To succeed, a critical view of features and decisions is essential in order to select only the most valuable ones. This work compares a common multi-classifier-system approach based on state-of-the-art features with a modified forward-backward feature selection algorithm that uses a long-term stopping criterion; the second approach also takes features of the voice-quality family into account. Both approaches are based on the audio modality only. The challenge dataset sits between real-world datasets, which are still very hard to handle, and over-acted datasets, which were popular in the past and are well understood today.
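As a rough illustration of the second approach, the sketch below implements a wrapper-style forward-backward feature selection loop with a long-term stopping criterion: rather than halting at the first plateau, the search only stops after a fixed number of iterations without a new best subset. The scoring function, the k-NN wrapper classifier, and the `patience` window are illustrative assumptions, not the paper's exact configuration.

```python
# Minimal sketch of forward-backward feature selection with a
# long-term stopping criterion. The classifier, scorer, and
# patience window are illustrative assumptions, not the paper's
# exact setup.
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier


def subset_score(X, y, features):
    """Cross-validated accuracy of a placeholder wrapper classifier
    on the given feature subset (assumed scorer)."""
    clf = KNeighborsClassifier()
    return cross_val_score(clf, X[:, features], y, cv=5).mean()


def forward_backward_selection(X, y, patience=5):
    selected, remaining = [], list(range(X.shape[1]))
    best_score, best_subset, stale = -np.inf, [], 0

    while remaining and stale < patience:
        # Forward step: add the single feature with the largest gain.
        s, f = max((subset_score(X, y, selected + [f]), f) for f in remaining)
        selected.append(f)
        remaining.remove(f)

        # Backward step: drop any feature whose removal improves the score.
        improved = True
        while improved and len(selected) > 1:
            improved = False
            for g in list(selected):
                trial = [x for x in selected if x != g]
                s_drop = subset_score(X, y, trial)
                if s_drop > s:
                    selected.remove(g)
                    remaining.append(g)
                    s, improved = s_drop, True

        # Long-term criterion: tolerate temporary plateaus and stop
        # only after `patience` iterations without a new best subset.
        if s > best_score:
            best_score, best_subset, stale = s, list(selected), 0
        else:
            stale += 1

    return best_subset, best_score


# Example usage on synthetic data:
# X, y = np.random.randn(200, 40), np.random.randint(0, 7, 200)
# features, acc = forward_backward_selection(X, y)
```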

Published In

ICMI '13: Proceedings of the 15th ACM International Conference on Multimodal Interaction
December 2013
630 pages
ISBN:9781450321297
DOI:10.1145/2522848

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. affective computing
  2. emotion recognition
  3. feature selection
  4. human computer interaction
  5. multi classifier systems

Qualifiers

  • Research-article

Conference

ICMI '13

Acceptance Rates

ICMI '13 Paper Acceptance Rate: 49 of 133 submissions, 37%
Overall Acceptance Rate: 453 of 1,080 submissions, 42%

Article Metrics

  • Downloads (last 12 months): 2
  • Downloads (last 6 weeks): 0

Reflects downloads up to 04 Oct 2024

Cited By

  • (2021) Functional Brain Imaging Reliably Predicts Bimanual Motor Skill Performance in a Standardized Surgical Task. IEEE Transactions on Biomedical Engineering 68(7):2058-2066. DOI: 10.1109/TBME.2020.3014299. Online publication date: Jul-2021.
  • (2018) Multi-classifier-Systems: Architectures, Algorithms and Applications. Computational Intelligence for Pattern Recognition, pp. 83-113. DOI: 10.1007/978-3-319-89629-8_4. Online publication date: 1-May-2018.
  • (2017) Fusion Architectures for Multimodal Cognitive Load Recognition. Multimodal Pattern Recognition of Social Signals in Human-Computer-Interaction, pp. 36-47. DOI: 10.1007/978-3-319-59259-6_4. Online publication date: 1-Jun-2017.
  • (2017) Bimodal Recognition of Cognitive Load Based on Speech and Physiological Changes. Multimodal Pattern Recognition of Social Signals in Human-Computer-Interaction, pp. 12-23. DOI: 10.1007/978-3-319-59259-6_2. Online publication date: 1-Jun-2017.
  • (2017) Multimodal Affect Recognition in the Context of Human-Computer Interaction for Companion-Systems. Companion Technology, pp. 387-408. DOI: 10.1007/978-3-319-43665-4_19. Online publication date: 5-Dec-2017.
  • (2016) Inferring mental overload based on postural behavior and gestures. Proceedings of the 2nd workshop on Emotion Representations and Modelling for Companion Systems, pp. 1-4. DOI: 10.1145/3009960.3009961. Online publication date: 16-Nov-2016.
  • (2016) Revisiting the EmotiW challenge: how wild is it really? Journal on Multimodal User Interfaces 10(2):151-162. DOI: 10.1007/s12193-015-0202-7. Online publication date: 12-Feb-2016.
  • (2016) On Gestures and Postural Behavior as a Modality in Ensemble Methods. Artificial Neural Networks in Pattern Recognition, pp. 312-323. DOI: 10.1007/978-3-319-46182-3_26. Online publication date: 9-Sep-2016.
  • (2015) Combining feature-level and decision-level fusion in a hierarchical classifier for emotion recognition in the wild. Journal on Multimodal User Interfaces 10(2):125-137. DOI: 10.1007/s12193-015-0203-6. Online publication date: 18-Nov-2015.
  • (2015) Bio-Visual Fusion for Person-Independent Recognition of Pain Intensity. Multiple Classifier Systems, pp. 220-230. DOI: 10.1007/978-3-319-20248-8_19. Online publication date: 3-Jun-2015.
