Abstract
This work presents a multimodal, cross-cultural corpus of affective behavior. The corpus construction process, including the design and implementation of the recording experiment, is discussed along with the resulting acoustic prosody, facial expression, and gesture expressivity features. The emphasis lies on the cross-cultural aspect of gestural behavior: a common corpus construction protocol is defined with the aim of identifying cultural patterns in non-verbal behavior across three cultures, namely German, Greek, and Italian. Culture-specific findings regarding gesture expressivity are derived from the affective analysis performed. In addition, the multimodal aspect, covering prosody and facial expressions, is investigated in terms of fusion techniques. Finally, a plan to release the corpus to the public domain is discussed, aiming to establish it as a benchmark multimodal, cross-cultural standard and reference point.
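To make the notion of gesture expressivity features concrete, the minimal Python sketch below computes rough proxies for three expressivity dimensions commonly used in this line of work (spatial extent, overall activation, power) from a tracked hand trajectory. The function name, sampling rate, and synthetic trajectory are assumptions for illustration only and do not reproduce the corpus's actual feature extraction pipeline.

```python
import numpy as np

def expressivity_features(hand_xy, fps=25.0):
    """Illustrative expressivity proxies from a 2-D hand trajectory.

    hand_xy: array of shape (T, 2), tracked hand position per frame.
    Returns rough proxies for spatial extent, overall activation and power;
    names follow the expressivity dimensions in the literature, not the
    paper's exact implementation.
    """
    hand_xy = np.asarray(hand_xy, dtype=float)
    # Spatial extent: diagonal of the bounding box swept by the hand.
    extent = np.linalg.norm(hand_xy.max(axis=0) - hand_xy.min(axis=0))
    # Overall activation: mean speed of the hand over the gesture.
    velocity = np.diff(hand_xy, axis=0) * fps
    activation = np.linalg.norm(velocity, axis=1).mean()
    # Power: mean acceleration magnitude (how forceful the movement is).
    acceleration = np.diff(velocity, axis=0) * fps
    power = np.linalg.norm(acceleration, axis=1).mean()
    return {"spatial_extent": extent, "activation": activation, "power": power}

# Example with a short synthetic trajectory sampled at 25 fps.
trajectory = np.cumsum(np.random.randn(50, 2), axis=0)
print(expressivity_features(trajectory))
```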



Notes
Initially, we had expected that emotions would be expressed more or less homogeneously across the three modalities. However, a first analysis, still based exclusively on annotations obtained for the audio channel, showed surprisingly small improvements when information from the other modalities was added to the audio channel [26]. This was taken as a first hint of a possible discrepancy between the modalities.
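To illustrate the kind of comparison referred to in this note, the sketch below contrasts an audio-only classifier with a simple feature-level fusion of audio and visual features. The data, feature dimensions, and classifier (scikit-learn logistic regression) are synthetic assumptions for illustration and do not correspond to the fusion schemes evaluated in [26].

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Hypothetical per-segment features for each modality and shared labels.
rng = np.random.default_rng(0)
n = 200
audio_feats = rng.normal(size=(n, 20))   # e.g. prosodic features
video_feats = rng.normal(size=(n, 15))   # e.g. facial / gesture features
labels = rng.integers(0, 2, size=n)      # binary affect labels (synthetic)

# Audio-only baseline vs. a naive feature-level fusion of both modalities.
audio_only = cross_val_score(LogisticRegression(max_iter=1000),
                             audio_feats, labels, cv=5).mean()
fused = cross_val_score(LogisticRegression(max_iter=1000),
                        np.hstack([audio_feats, video_feats]), labels, cv=5).mean()
print(f"audio only: {audio_only:.3f}  audio+video fusion: {fused:.3f}")
```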
References
Abrilian S, Devillers L, Buisine S, Martin JC (2005) EmoTV1: annotation of real-life emotions for the specification of multimodal affective interfaces. In: International proceedings of HCI
Amazon Web Services: Public Data Sets (2012) http://aws.amazon.com/publicdatasets/. Accessed 31 Jan 2012
Amir N, Weiss A, Hadad R (2009) Is there a dominant channel in perception of emotions? In: 3rd International conference on affective computing and intelligent interaction and workshops, 2009 (ACII 2009). IEEE, Amsterdam, pp 1–6
Bänziger T, Pirker H, Scherer K (2006) GEMEP-GEneva multimodal emotion portrayals: a corpus for the study of multimodal emotional expressions. In: The workshop programme corpora for research on emotion and affect, 23 May 2006. Citeseer, p 15
Battocchi A, Pianesi F, Goren-Bar D (2005) A first evaluation study of a database of kinetic facial expressions (DaFEx). In: Proceedings of the 7th international conference on multimodal interfaces (ICMI ’05). ACM, New York, pp 214–221
Burkhardt F, Paeschke A, Rolfes M, Sendlmeier W, Weiss B (2005) A database of German emotional speech. In: Proceedings of interspeech, Lissabon, pp 1517–1520
Busso C, Bulut M, Lee C, Kazemzadeh A, Mower E, Kim S, Chang J, Lee S, Narayanan S (2008) IEMOCAP: interactive emotional dyadic motion capture database. Lang Resour Eval 42(4):335–359
Caridakis G, Raouzaiou A, Bevacqua E, Mancini M, Karpouzis K, Malatesta L, Pelachaud C (2007) Virtual agent multimodal mimicry of humans. Lang Resour Eval 41(3–4):367–388 (special issue on multimodal corpora). http://www.image.ece.ntua.gr/publications.php
Caridakis G, Raouzaiou A, Karpouzis K, Kollias S (2006) Synthesizing gesture expressivity based on real sequences. Workshop on multimodal corpora: from multimodal behaviour theories to usable models. In: LREC 2006 conference, Genoa, Italy, 24–26 May 2006. http://www.image.ece.ntua.gr/publications.php
Caridakis G, Wagner J, Raouzaiou A, Curto Z, Andre E, Karpouzis K (2010) A multimodal corpus for gesture expressivity analysis. In: Multimodal corpora: advances in capturing, coding and analyzing multimodality, LREC, Malta, 17–23 May 2010
Castellano G, Leite I, Pereira A, Martinho C, Paiva A, McOwan P (2010) Inter-act: an affective and contextually rich multimodal video corpus for studying interaction with robots. In: Proceedings of the international conference on multimedia. ACM, New York, pp 1031–1034
Cowie R, Douglas-Cowie E, Savvidou S, McMahon E, Sawey M, Schröder M (2000) ‘FEELTRACE’: an instrument for recording perceived emotion in real time. In: ISCA tutorial and research workshop (ITRW) on speech and emotion, Citeseer
Creative Commons: BY-NC-SA 3.0 (2012) http://creativecommons.org/licenses/by-nc-sa/3.0/. Accessed 2 Feb 2012
Douglas-Cowie E, Campbell N, Cowie R, Roach P (2003) Emotional speech: towards a new generation of databases. Speech Commun 40(1–2):33–60
Douglas-Cowie E, Cowie R, Sneddon I, Cox C, Lowry O, McRorie M, Martin J, Devillers L, Abrilian S, Batliner A et al (2007) The HUMAINE database: addressing the collection and annotation of naturalistic and induced emotional data. In: Affective computing and intelligent interaction, pp 488–500
Douglas-Cowie E, Devillers L, Martin JC, Cowie R, Savvidou S, Abrilian S, Cox C (2005) Multimodal databases of everyday emotion: facing up to complexity. In: INTERSPEECH 2005, pp 813–816
Velten E (1968) A laboratory task for induction of mood states. Behav Res Ther 6:473–482
Ekman P et al (1971) Universals and cultural differences in facial expressions of emotion. University of Nebraska Press, Lincoln
Elfenbein H, Beaupré M, Lévesque M, Hess U (2007) Toward a dialect theory: cultural differences in the expression and recognition of posed facial expressions. Emotion 7(1):131
Fanelli G, Gall J, Romsdorfer H, Weise T, Van Gool L (2010) A 3-d audio-visual corpus of affective communication. IEEE Trans Multimedia 12(6):591–598
Fleiss J, Levin B, Paik M (2003) Statistical methods for rates and proportions. Wiley series in probability and mathematical statistics. Probability and mathematical statistics. Wiley, New York
Kessous L, Castellano G, Caridakis G (2009) Multimodal emotion recognition in speech-based interaction using facial expression, body gesture and acoustic analysis. J Multimodal User Interfaces. doi:10.1007/s12193-009-0025-5. http://www.image.ece.ntua.gr/publications.php
Kipp M (2001) Anvil: a generic annotation tool for multimodal dialogue. In: Seventh European conference on speech communication and technology. ISCA
Küblbeck C, Ernst A (2006) Face detection and tracking in video sequences using the modified census transformation. Image Vis Comput 24:564–572
Leite I, Pereira A, Martinho C, Paiva A (2008) Are emotional robots more fun to play with? In: 17th IEEE international symposium on robot and human interactive communication, 2008, RO-MAN 2008. IEEE, pp 77–82
Lingenfelser F, Wagner J, André E (2011) A systematic discussion of fusion techniques for multi-modal affect recognition tasks. In: ICMI, pp 19–26
Plutchik R (1994) The psychology and biology of emotion. HarperCollins College Publishers
Russell JA (1980) A circumplex model of affect. J Pers Soc Psychol 39:1161–1178. doi:10.1037/h0077714
Shami M, Verhelst W (2007) Automatic classification of expressiveness in speech: a multi-corpus study. Speaker classification II, pp 43–56
Soleymani M, Lichtenauer J, Pun T, Pantic M (2011) A multi-modal affective database for affect recognition and implicit tagging. IEEE Trans Affect Comput 99(PrePrints):1
Vogt T, André E (2009) Exploring the benefits of discretization of acoustic features for speech emotion recognition. In: Proceedings of 10th conference of the international speech communication association (INTERSPEECH). ISCA, Brighton, UK, pp 328–331
Wagner J, Lingenfelser F, André E (2011) The social signal interpretation framework (SSI) for real time signal processing and recognition. In: Proceedings of Interspeech 2011
Wagner J, Lingenfelser F, André E, Kim J (2011) Exploring fusion methods for multimodal emotion recognition with missing data. IEEE Trans Affect Comput 99(PrePrints)
Zara A, Maffiolo V, Martin J, Devillers L (2007) Collection and annotation of a corpus of human-human multimodal interactions: emotion and others anthropomorphic characteristics. In: Affective computing and intelligent interaction, pp 464–475
Zeng Z, Pantic M, Roisman G, Huang T (2009) A survey of affect recognition methods: audio, visual, and spontaneous expressions. IEEE Trans Pattern Anal Mach Intell 31(1):39–58
Acknowledgments
This work was partially funded by the European Commission under the grant agreements eCute (FP7-ICT-2009-5), ILHAIRE (FP7-ICT-2009.8.0) and CEEDS (FP7-ICT-2009-5).
Cite this article
Caridakis, G., Wagner, J., Raouzaiou, A. et al. A cross-cultural, multimodal, affective corpus for gesture expressivity analysis. J Multimodal User Interfaces 7, 121–134 (2013). https://doi.org/10.1007/s12193-012-0112-x