Abstract
We present a new, freely available, multimodal corpus for research into, amongst other areas, real-time realistic interaction between humans in online virtual environments. The corpus targets an online dance class scenario in which students, with avatars driven by whatever 3D capture technology is locally available to them, learn choreographies with teacher guidance in an online virtual dance studio. Accordingly, the corpus consists of student/teacher dance choreographies captured concurrently at two different sites using a variety of media modalities, including synchronised audio rigs, multiple cameras, wearable inertial measurement devices and depth sensors. Each of the several dancers performs a set of fixed choreographies, which are graded according to specific evaluation criteria, and ground-truth dance choreography annotations are provided. For sensor modalities that are not synchronised at capture time, the corpus also includes recordings of distinctive events to support data stream synchronisation. The total duration of the recorded content is 1 h and 40 min per sensor, amounting to 55 h of recordings across all sensors. Although the corpus is tailored to the online dance class scenario, the data is free to download and use for any research and development purpose.
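The paper's abstract does not prescribe how the distinctive events should be used for alignment, but a common approach is to cross-correlate two streams around a shared event (e.g. a hand clap captured by every device) and take the correlation peak as the inter-stream offset. The following is a minimal Python sketch under that assumption; the function name and the normalisation step are illustrative choices, not taken from the corpus documentation.

```python
import numpy as np

def estimate_offset(ref: np.ndarray, other: np.ndarray, rate: float) -> float:
    """Estimate the lag (in seconds) of `other` relative to `ref` by
    locating the peak of their full cross-correlation.

    Both signals are assumed to be 1-D, sampled at the same rate, and to
    contain the same distinctive event (e.g. a clap). For non-audio
    modalities, a 1-D proxy such as accelerometer magnitude can be used.
    """
    # Zero-mean, unit-variance normalisation makes the peak location
    # robust to differing gains across sensors.
    ref = (ref - ref.mean()) / (ref.std() + 1e-12)
    other = (other - other.mean()) / (other.std() + 1e-12)

    # corr[k] is largest at the shift where the two signals best match.
    corr = np.correlate(other, ref, mode="full")
    lag = int(np.argmax(corr)) - (len(ref) - 1)

    # Positive lag: the event occurs `lag` samples later in `other`,
    # so `other` should be shifted earlier by lag / rate seconds.
    return lag / rate
```

In practice one would window each stream around the annotated event before correlating, and first resample modalities with different rates (say, 48 kHz audio against a 100 Hz inertial unit; these figures are illustrative) to a common rate.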
Notes
More ratings by other experienced Salsa dancers will be provided in the near future.
Acknowledgments
The authors and 3DLife would like to acknowledge the support of Huawei in the creation of this dataset. In addition, warmest thanks go to all the contributors to these capture sessions, especially the dancers: Anne-Sophie K., Anne-Sophie M., Bertrand, Gabi, Gael, Habib, Helene, Jacky, Jean-Marc, Laetitia, Martine, Ming-Li, Remi, Roland and Thomas; and the tech guys: Alfred, Dave, Dominique, Fabrice, Gael, Georgios, Gilbert, Marc, Mounira, Noel, Phil, Radek, Robin, Slim, Qianni, Sophie-Charlotte, Thomas, Xinyu and Yves. This research was partially supported by the European Commission under contract FP7-247688 3DLife.
Cite this article
Essid, S., Lin, X., Gowing, M. et al. A multi-modal dance corpus for research into interaction between humans in virtual environments. J Multimodal User Interfaces 7, 157–170 (2013). https://doi.org/10.1007/s12193-012-0109-5