
A cross-cultural, multimodal, affective corpus for gesture expressivity analysis

  • Original Paper
  • Published in: Journal on Multimodal User Interfaces

Abstract

This paper presents a multimodal, cross-cultural corpus of affective behavior. The corpus construction process, including the design and implementation of the underlying experiment, is discussed together with the resulting acoustic prosody, facial expression and gesture expressivity features. The emphasis lies on the cross-cultural aspect of gestural behavior: a common corpus construction protocol is defined with the aim of identifying cultural patterns in non-verbal behavior across three cultures, namely German, Greek and Italian. Culture-specific findings on gesture expressivity are derived from the affective analysis performed. In addition, the multimodal aspect, covering prosody and facial expressions, is investigated with respect to fusion techniques. Finally, a plan for releasing the corpus to the public domain is discussed, with the aim of establishing it as a benchmark multimodal, cross-cultural standard and reference point.
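As background for the gesture expressivity features mentioned above, the minimal Python sketch below shows how expressivity cues of this general kind (overall activation, spatial extent, fluidity) can be computed from tracked hand coordinates. The feature definitions, function names and sampling rate are illustrative assumptions and do not reproduce the exact formulations used for this corpus.

    import numpy as np

    def expressivity_features(hand_xy, fps=25.0):
        """Illustrative expressivity cues from a (T, 2) array of tracked hand positions.

        The definitions below are simplified stand-ins, not the exact features
        extracted for the corpus described in this paper.
        """
        hand_xy = np.asarray(hand_xy, dtype=float)
        dt = 1.0 / fps

        velocity = np.diff(hand_xy, axis=0) / dt       # frame-to-frame velocity
        acceleration = np.diff(velocity, axis=0) / dt  # frame-to-frame acceleration

        # Overall activation: total amount of movement over the segment.
        overall_activation = float(np.sum(np.linalg.norm(velocity, axis=1)) * dt)

        # Spatial extent: area of the bounding box swept by the hand.
        width, height = np.ptp(hand_xy, axis=0)
        spatial_extent = float(width * height)

        # Fluidity: smoother motion yields lower acceleration variance.
        fluidity = float(1.0 / (1.0 + np.var(np.linalg.norm(acceleration, axis=1))))

        return {"overall_activation": overall_activation,
                "spatial_extent": spatial_extent,
                "fluidity": fluidity}

    # Example: a short synthetic trajectory sampled at 25 fps.
    t = np.linspace(0.0, 2.0 * np.pi, 50)
    print(expressivity_features(np.stack([np.cos(t), np.sin(2.0 * t)], axis=1)))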


Notes

  1. Initially, we had expected emotions to be expressed more or less homogeneously across the three modalities. A first analysis, however, based exclusively on annotations obtained for the audio channel, showed surprisingly small improvements when information from the other modalities was added to the audio channel [26]. This was taken as a first hint of a possible discrepancy between the modalities (a minimal illustration of such a late-fusion comparison is sketched after these notes).

  2. http://www.iis.fraunhofer.de/en/bf/bv/ks/gpe/demo/.
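The late-fusion comparison alluded to in note 1 can be illustrated, in very reduced form, by averaging per-modality class posteriors and comparing the fused decision against the audio-only one. The random posteriors, the number of classes and the unweighted averaging used below are assumptions for illustration only; they are not the fusion techniques evaluated in [26].

    import numpy as np

    rng = np.random.default_rng(0)

    # Hypothetical per-modality posterior probabilities for 10 samples and 4 emotion
    # classes; in practice these would come from classifiers trained on audio,
    # facial-expression and gesture features respectively.
    p_audio = rng.dirichlet(np.ones(4), size=10)
    p_face = rng.dirichlet(np.ones(4), size=10)
    p_gesture = rng.dirichlet(np.ones(4), size=10)
    labels = rng.integers(0, 4, size=10)  # hypothetical ground-truth labels

    def accuracy(posteriors, labels):
        return float(np.mean(np.argmax(posteriors, axis=1) == labels))

    # Audio-only decision vs. simple unweighted late fusion of the three modalities.
    p_fused = (p_audio + p_face + p_gesture) / 3.0
    print("audio only :", accuracy(p_audio, labels))
    print("late fusion:", accuracy(p_fused, labels))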

References

  1. Abrilian S, Devillers L, Buisine S, Martin JC (2005) EmoTV1: annotation of real-life emotions for the specification of multimodal affective interfaces. In: International proceedings of HCI

  2. Amazon Web Services: Public Data Sets (2012) http://aws.amazon.com/publicdatasets/. Accessed 31 Jan 2012

  3. Amir N, Weiss A, Hadad R (2009) Is there a dominant channel in perception of emotions? In: 3rd International conference on affective computing and intelligent interaction and workshops, 2009 (ACII 2009). IEEE, Amsterdam, pp 1–6

  4. Bänziger T, Pirker H, Scherer K (2006) GEMEP-GEneva multimodal emotion portrayals: a corpus for the study of multimodal emotional expressions. In: The workshop programme corpora for research on emotion and affect, 23 May 2006. Citeseer, p 15

  5. Battocchi A, Pianesi F, Goren-Bar D (2005) A first evaluation study of a database of kinetic facial expressions (DaFEx). In: Proceedings of the 7th international conference on multimodal interfaces (ICMI ’05). ACM, New York, pp 214–221

  6. Burkhardt F, Paeschke A, Rolfes M, Sendlmeier W, Weiss B (2005) A database of German emotional speech. In: Proceedings of interspeech, Lissabon, pp 1517–1520

  7. Busso C, Bulut M, Lee C, Kazemzadeh A, Mower E, Kim S, Chang J, Lee S, Narayanan S (2008) IEMOCAP: interactive emotional dyadic motion capture database. Lang Resour Eval 42(4):335–359

  8. Caridakis G, Raouzaiou A, Bevacqua E, Mancini M, Karpouzis K, Malatesta L, Pelachaud C (2007) Virtual agent multimodal mimicry of humans. Special issue on multimodal corpora. Lang Resour Eval 41(3–4):367–388. http://www.image.ece.ntua.gr/publications.php

  9. Caridakis G, Raouzaiou A, Karpouzis K, Kollias S (2006) Synthesizing gesture expressivity based on real sequences. Workshop on multimodal corpora: from multimodal behaviour theories to usable models. In: LREC 2006 conference, Genoa, Italy, 24–26 May 2006. http://www.image.ece.ntua.gr/publications.php

  10. Caridakis G, Wagner J, Raouzaiou A, Curto Z, Andre E, Karpouzis K (2010) A multimodal corpus for gesture expressivity analysis. In: Multimodal corpora: advances in capturing, coding and analyzing multimodality, LREC, Malta, 17–23 May 2010

  11. Castellano G, Leite I, Pereira A, Martinho C, Paiva A, McOwan P (2010) Inter-act: an affective and contextually rich multimodal video corpus for studying interaction with robots. In: Proceedings of the international conference on multimedia. ACM, New York, pp 1031–1034

  12. Cowie R, Douglas-Cowie E, Savvidou S, McMahon E, Sawey M, Schröder M (2000) ‘FEELTRACE’: an instrument for recording perceived emotion in real time. In: ISCA tutorial and research workshop (ITRW) on speech and emotion, Citeseer

  13. Creative Commons: BY-NC-SA 3.0 (2012) http://creativecommons.org/licenses/by-nc-sa/3.0/. Accessed 2 Feb 2012

  14. Douglas-Cowie E, Campbell N, Cowie R, Roach P (2003) Emotional speech: towards a new generation of databases. Speech Commun 40(1–2):33–60

  15. Douglas-Cowie E, Cowie R, Sneddon I, Cox C, Lowry O, Mcrorie M, Martin J, Devillers L, Abrilian S, Batliner A et al (2007) The HUMAINE database: addressing the collection and annotation of naturalistic and induced emotional data. In: Affective computing and intelligent interaction, pp 488–500

  16. Douglas-Cowie E, Devillers L, Martin JC, Cowie R, Savvidou S, Abrilian S, Cox C (2005) Multimodal databases of everyday emotion: facing up to complexity. In: INTERSPEECH 2005, pp 813–816

  17. Velten E (1968) A laboratory task for induction of mood states. Behav Res Ther 6:473–482

  18. Ekman P et al (1971) Universals and cultural differences in facial expressions of emotion. University of Nebraska Press, Lincoln

  19. Elfenbein H, Beaupré M, Lévesque M, Hess U (2007) Toward a dialect theory: cultural differences in the expression and recognition of posed facial expressions. Emotion 7(1):131

  20. Fanelli G, Gall J, Romsdorfer H, Weise T, Van Gool L (2010) A 3-d audio-visual corpus of affective communication. IEEE Trans Multimedia 12(6):591–598

  21. Fleiss J, Levin B, Paik M (2003) Statistical methods for rates and proportions. Wiley series in probability and mathematical statistics. Probability and mathematical statistics. Wiley, New York

  22. Kessous L, Castellano G, Caridakis G (2009) Multimodal emotion recognition in speech-based interaction using facial expression, body gesture and acoustic analysis. J Multimodal User Interfaces. doi:10.1007/s12193-009-0025-5. http://www.image.ece.ntua.gr/publications.php

  23. Kipp M (2001) Anvil-a generic annotation tool for multimodal dialogue. In: Seventh European conference on speech communication and technology. ISCA

  24. Küblbeck C, Ernst A (2006) Face detection and tracking in video sequences using the modified census transformation. Image Vis Comput 24:564–572

  25. Leite I, Pereira A, Martinho C, Paiva A (2008) Are emotional robots more fun to play with? In: 17th IEEE international symposium on robot and human interactive communication, 2008, RO-MAN 2008. IEEE, pp 77–82

  26. Lingenfelser F, Wagner J, André E (2011) A systematic discussion of fusion techniques for multi-modal affect recognition tasks. In: ICMI, pp 19–26

  27. Plutchik R (1994) The psychology and biology of emotion. HarperCollins College Publishers

  28. Russell JA (1980) A circumplex model of affect. J Pers Soc Psychol 39:1161–1178. doi:10.1037/h0077714

  29. Shami M, Verhelst W (2007) Automatic classification of expressiveness in speech: a multi-corpus study. Speaker classification II, pp 43–56

  30. Soleymani M, Lichtenauer J, Pun T, Pantic M (2011) A multi-modal affective database for affect recognition and implicit tagging. IEEE Trans Affect Comput (PrePrints), p 1

  31. Velten E (1968) A laboratory task for induction of mood states. Behav Res Ther 6:473–482

  32. Vogt T, André E (2009) Exploring the benefits of discretization of acoustic features for speech emotion recognition. In: Proceedings of 10th conference of the international speech communication association (INTERSPEECH). ISCA, Brighton, UK, pp 328–331

  33. Wagner J, Lingenfelser F, André E (2011) The social signal interpretation framework (SSI) for real time signal processing and recognition. In: Proceedings of Interspeech 2011

  34. Wagner J, Lingenfelser F, André E, Kim J (2011) Exploring fusion methods for multimodal emotion recognition with missing data. IEEE Trans Affect Comput (PrePrints)

  35. Zara A, Maffiolo V, Martin J, Devillers L (2007) Collection and annotation of a corpus of human-human multimodal interactions: emotion and others anthropomorphic characteristics. In: Affective computing and intelligent interaction, pp 464–475

  36. Zeng Z, Pantic M, Roisman G, Huang T (2009) A survey of affect recognition methods: audio, visual, and spontaneous expressions. IEEE Trans Pattern Anal Mach Intell 31(1):39–58

Acknowledgments

This work was partially funded by the European Commission under the grant agreements eCute (FP7-ICT-2009-5), ILHAIRE (FP7-ICT-2009.8.0) and CEEDS (FP7-ICT-2009-5).

Author information

Corresponding author

Correspondence to G. Caridakis.

About this article

Cite this article

Caridakis, G., Wagner, J., Raouzaiou, A. et al. A cross-cultural, multimodal, affective corpus for gesture expressivity analysis. J Multimodal User Interfaces 7, 121–134 (2013). https://doi.org/10.1007/s12193-012-0112-x
