Featured Papers by Baptiste Caramiaux
Sonic interaction is the continuous relationship between user actions and sound, mediated by some technology. Because interaction with sound may be task-oriented or experience-based, it is important to understand the nature of action-sound relationships in order to design rich sonic interactions. We propose a participatory approach to sonic interaction design that first considers the affordances of sounds in order to imagine embodied interaction and, based on this, generates interaction models for interaction designers wishing to work with sound. We describe a series of workshops, called Form Follows Sound, where participants ideate imagined sonic interactions and then realize working interactive sound prototypes. We introduce the Sonic Incident technique as a way to recall memorable sound experiences. We identified three interaction models for sonic interaction design: conducting, manipulating, and substituting. These three models offer interaction designers and developers a framework on which to build richer sonic interactions.
This paper presents a gesture recognition/adaptation system for human-computer interaction applications that goes beyond activity classification and that, complementary to gesture labelling, characterises movement execution. We describe a template-based recognition method that simultaneously aligns the input gesture to the templates using a Sequential Monte Carlo inference technique. Contrary to standard template-based methods built on dynamic programming, such as Dynamic Time Warping, the algorithm has an adaptation process that tracks gesture variation in real time. The method continuously updates the estimated parameters and recognition results during execution of the gesture, which offers key advantages for continuous human-machine interaction. The technique is evaluated in several ways: recognition and early recognition are evaluated on 2D on-screen pen gestures; adaptation is assessed on synthetic data; and both early recognition and adaptation are evaluated in a user study involving 3D free-space gestures. The method is not only robust to noise and successful in adapting to parameter variation, but also performs recognition as well as or better than non-adapting offline template-based methods.
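The adaptation idea, tracking alignment and execution speed at once, can be illustrated with a deliberately minimal one-dimensional particle filter. This is only a sketch, not the published algorithm: the template, the observed gesture (the same profile performed 1.5x slower), and all noise parameters are invented for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Invented 1-D example: a recorded template, and the same gesture
# performed 1.5x slower than the template (150 vs 100 samples).
template = np.sin(np.linspace(0.0, np.pi, 100))
observed = np.sin(np.linspace(0.0, np.pi, 150))

N = 1000
phase = np.zeros(N)               # progress through the template, in [0, 1]
speed = rng.normal(1.0, 0.3, N)   # relative execution speed (1.0 = same tempo)

for y in observed:
    # Propagate: each particle advances by its own speed, with diffusion.
    speed = speed + rng.normal(0.0, 0.02, N)
    phase = np.clip(phase + speed / len(template), 0.0, 1.0)

    # Weight: how well the template value at each particle's phase explains y.
    idx = (phase * (len(template) - 1)).astype(int)
    w = np.exp(-0.5 * ((y - template[idx]) / 0.1) ** 2)
    w /= w.sum()

    # Resample to concentrate particles on plausible alignments.
    keep = rng.choice(N, size=N, p=w)
    phase, speed = phase[keep], speed[keep]

est_speed = float(speed.mean())   # below 1.0: the gesture is slower than the template
```

Because each particle carries its own speed estimate, the filter reports how the performance deviates from the template (here, slower execution) while it aligns, which is the property that distinguishes this family of methods from offline DTW.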
Expressivity is a visceral capacity of the human body. To understand what makes a gesture expressive, we need to consider not only its spatial placement and orientation, but also its dynamics and the mechanisms enacting them. We start by defining gesture and gesture expressivity, and then present fundamental aspects of muscle activity and ways to capture information through electromyography (EMG) and mechanomyography (MMG). We present pilot studies that inspect the ability of users to control spatial and temporal variations of 2D shapes and that use muscle sensing to assess expressive information in gesture execution beyond space and time. This leads us to the design of a study that explores the notion of gesture power in terms of control and sensing. The results give interaction designers insights for moving beyond simplistic gestural interaction, towards the design of interactions that draw upon the nuances of expressive gesture.
We present a way to make environmental recordings controllable again through continuous annotations of the high-level semantic parameter one wishes to control, e.g. wind strength or crowd excitation level. A partial annotation can be propagated to cover the entire recording via cross-modal analysis between gesture and sound using canonical time warping (CTW). The annotations then serve as a descriptor for lookup in corpus-based concatenative synthesis, inverting the sound/annotation relationship. The workflow has been evaluated in a preliminary subject test; results from canonical correlation analysis (CCA) show high consistency between annotations, with a small set of audio descriptors well correlated with them. An experiment on the propagation of annotations shows the superior performance of CTW over CCA with as little as 20 s of annotated material.
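The lookup step of corpus-based concatenative synthesis can be sketched in a few lines. Everything here (the grain descriptors, the control curve, the nearest-neighbour rule) is a hypothetical simplification of the actual workflow, which operates on real audio units and propagated annotations.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical corpus: each grain of the recording carries one descriptor
# value, standing in for the propagated annotation (e.g. wind strength).
grain_descr = rng.uniform(0.0, 1.0, 200)

def select_grain(target: float) -> int:
    """Return the index of the grain whose annotation value is closest
    to the requested control value, inverting the sound/annotation map."""
    return int(np.argmin(np.abs(grain_descr - target)))

# Driving the lookup with a rising control curve selects grains whose
# annotation values rise accordingly.
control = np.linspace(0.1, 0.9, 5)
sequence = [select_grain(t) for t in control]
selected = [grain_descr[i] for i in sequence]
```

In a real system each selected grain would be an audio segment to concatenate; the point of the sketch is only the inversion: annotate once, then use the annotation as a retrieval key.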
Gesture-to-sound mapping is generally defined as the association between gestural and sound parameters. This article describes an approach that brings forward the perception-action loop as a fundamental design principle for gesture-sound mapping in digital musical instruments. Our approach considers the process of listening as the foundation, and the first step, in the design of action-sound relationships. In this design process, the relationship between action and sound is derived from actions that can be perceived in the sound. Building on previous work on listening modes and gestural descriptions, we propose to distinguish between three mapping strategies: instantaneous, temporal, and metaphoric. Our approach makes use of machine learning techniques for building prototypes, from digital musical instruments to interactive installations. Four different examples of scenarios and prototypes are described and discussed.
We investigated gestural descriptions of sound stimuli performed during a listening task. Our hypothesis is that the strategies used in gestural responses depend on the level of identification of the sound source, and specifically on the identification of the action causing the sound. To validate this hypothesis, we conducted two experiments. In the first, we built two corpora of sounds: the first contains sounds with identifiable causal actions, the second sounds for which no causal action could be identified. These corpus properties were validated through a listening test. In the second experiment, participants performed arm and hand gestures synchronously while listening to sounds taken from these corpora. Afterwards, we conducted interviews asking participants to verbalize their experience while watching their own video recordings. They were questioned on their perception of the sounds and on their gestural strategies. We showed that for sounds whose causal action can be identified, participants mainly mimic the action that produced the sound. When no action can be associated with the sound, participants instead trace contours related to the sound's acoustic features. We also found that inter-participant gesture variability is higher for causal sounds than for non-causal sounds. This variability demonstrates that in the first case participants have several ways of producing the same action, whereas in the second case the sound features tend to make the gestural responses consistent.
PhD Thesis, Sep 12, 2012
This thesis presents studies on the analysis of the relationship between gesture and sound, with the aim of informing the design of expressive digital instruments for musical performance. These studies relate to various areas of research and call for a multidisciplinary approach.
We open the thesis with an exploratory study that frames its objectives and issues. The thesis focuses on two main themes: the gestural response to sound stimuli, and the modelling of gesture for analysis and control.
Within the first theme, we propose experimental studies showing the cognitive strategies of participants when they associate gestures with sounds they hear. First, we show that these strategies are related to the level of identification of the causal sound source. Then, when the causal source is not identifiable, relationship strategies vary in the correspondence between the parameters of the gesture and those of the sound.
Within the second theme, we address the problem of modelling the temporal structures of musical gesture. We present a first model for tracking and recognizing the temporal profiles of gesture parameters in real time. Motivated by the structural aspects of music, we show the relevance of a segmental Markov model for segmenting and parsing musical gesture, thereby moving the analysis of gesture from a signal-level to a symbolic point of view.
Finally, applications of the different theoretical contributions are presented. They are proofs of concept that illustrate the specific research questions in practice: a system for sound selection based on gestural queries, and a system for sound re-synthesis based on morphological tracking.
Journal of New Music Research, Jan 1, 2012
This article presents a segmentation model applied to musician movements, taking into account different time structures. In particular we report on ancillary gestures, which are not directly linked to sound production whilst still being entirely part of the global instrumental gesture. Specifically, we study movements of the clarinet captured with an optical 3D motion capture system, analysing ancillary movements under the assumption that they can be considered a sequence of primitive actions regarded as base shapes. A stochastic model called the segmental hidden Markov model is used: it represents a continuous trajectory as a sequence of primitive temporal profiles taken from a given dictionary. We evaluate the model using two criteria, the Euclidean norm and the log-likelihood. We show that the size of the dictionary does not predominantly influence the fitting accuracy, and we propose a method for building a dictionary based on the log-likelihood criterion. Finally, we show that the sequence of primitive shapes can also be considered a sequence of symbols, enabling us to interpret the data as symbolic patterns and motifs. Based on this representation, we show that circular patterns occur in all players' performances. This symbolic step produces a different layer of interpretation, linked to a larger time scale, which might not be obvious from a direct signal representation.
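A much-simplified stand-in for the segmental idea can be sketched as greedy matching of fixed-length segments against a dictionary of primitive shapes. The actual segmental hidden Markov model also infers segment durations probabilistically and decodes with dynamic programming; the shapes, segment length, and noise level below are invented for illustration.

```python
import numpy as np

# A small dictionary of primitive temporal profiles, all of length L.
L = 50
s = np.linspace(0.0, 1.0, L)
dictionary = {
    "rise": s,
    "fall": 1.0 - s,
    "bump": np.sin(np.pi * s),
}

# Synthetic trajectory built from known primitives, with mild noise.
rng = np.random.default_rng(3)
truth = ["rise", "bump", "fall"]
signal = np.concatenate([dictionary[k] for k in truth])
signal = signal + 0.05 * rng.normal(size=signal.size)

# Greedy decoding: for each fixed-length segment, pick the primitive
# with the smallest squared error.
decoded = []
for start in range(0, signal.size, L):
    seg = signal[start:start + L]
    decoded.append(min(dictionary, key=lambda k: np.sum((seg - dictionary[k]) ** 2)))
```

With well-separated shapes and light noise, the greedy decoder recovers the generating sequence, which is the symbolic layer the article builds its pattern analysis on.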
Papers by Baptiste Caramiaux
In our work on the computational design of expressive gestural interaction, we have experienced various challenges for advanced optimisation methods. Here we highlight two of these challenges, based on the design and use of a Bayesian model called the Gesture Variation Follower, with the aim of discussing them with a broader community of designers and HCI practitioners during the workshop.
Work in Progress accepted at the Conference on Tangible, Embedded and Embodied Interaction (TEI2013)
This paper presents work in progress on applying a multimodal interaction (MMI) approach to studying interactive music performance. We report on a study in which an existing musical work was used to provide a gesture vocabulary. The biophysical sensing already used in the work served as the input modality, augmented with several other sensing modalities not present in the original piece. The bioacoustics-based sensor, accelerometer sensors, and full-body motion capture system generated data recorded into a multimodal database. We plotted the data from the different modalities and offer observations based on visual analysis of the collected data. Our preliminary results show that the information from the different modalities is complementary, and we note three types of complementarity: synchronicity, coupling, and correlation.
Proceedings of the 1st …, Jan 1, 2011
Music as a multimodal phenomenon promises to provide new insights into music cognition. Studied from an embodied perspective, body movements play a major role in our musical experiences. Here we address how motor invariants such as the two-thirds power law relate to music cognition. A sample of 64 musically trained and untrained participants were asked to represent 20 short musical excerpts gesturally. In one of two conditions, their hand movements, captured with a Microsoft Kinect, created a real-time visualization on a screen in front of them. Results revealed that the two-thirds power law is violated in the presence of visual feedback, especially for musical excerpts with low pulse clarity. Participants also used more space with visual feedback, and when pulse clarity was low. These findings suggest that 3D drawings of music, particularly in the absence of a clear beat, are less endpoint-oriented and more continuously monitored. We discuss the applicability of the two-thirds power law in studies involving music-induced movements.
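The two-thirds power law states that tangential speed and curvature are linked by v = K·κ^(-1/3). Elliptic motion traced at a constant angular rate satisfies the law exactly, which gives a simple numerical sanity check; the trajectory below is synthetic and unrelated to the study's data.

```python
import numpy as np

# Ellipse traced at constant angular rate: obeys the two-thirds power law.
t = np.linspace(0.0, 2 * np.pi, 2000, endpoint=False)
x, y = 3.0 * np.cos(t), 1.0 * np.sin(t)

# Derivatives by central differences.
dx, dy = np.gradient(x, t), np.gradient(y, t)
ddx, ddy = np.gradient(dx, t), np.gradient(dy, t)

v = np.hypot(dx, dy)                          # tangential speed
kappa = np.abs(dx * ddy - dy * ddx) / v ** 3  # curvature

# The slope of log v against log kappa recovers the exponent (-1/3).
slope = np.polyfit(np.log(kappa), np.log(v), 1)[0]
print(round(slope, 3))   # close to -0.333
```

Deviations of this fitted exponent from -1/3 are one way to quantify how strongly a recorded movement "disrespects" the law, as in the visual-feedback condition above.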
Proceedings of NIME'13, May 27, 2013
We present a study that explores the affordances evoked by sounds and sound-gesture mappings. To do this, we use a sensor system with a minimal form factor in a user study designed to minimize cultural association. The study focuses on understanding how participants describe sounds and the gestures produced while playing designed sonic interaction mappings. This approach seeks to move from object-centric affordance towards investigating embodied gestural sonic affordances.
Lecture Notes in Computer Science 5934: Embodied Communication and Human-Computer Interaction, Jan 1, 2010
This article reports on the exploration of a method based on canonical correlation analysis (CCA) for the analysis of the relationship between gesture and sound in the context of music performance and listening. The method is a first step in the design of an analysis tool for gesture-sound relationships. In this exploration we used motion capture data recorded from subjects performing free hand movements while listening to short sound examples. We assume that even though the relationship between gesture and sound might be more complex, at least part of it can be revealed and quantified by linear multivariate regression applied to the motion capture data and to audio descriptors extracted from the sound examples. After outlining the theoretical background, the article shows how the method allows for pertinent reasoning about the relationship between gesture and sound by analysing the data sets recorded from multiple and individual subjects.
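The core CCA computation can be sketched with plain linear algebra: whiten each block of variables, then take the singular values of the cross-covariance. The data below are synthetic, with a single shared latent source standing in for real motion capture features and audio descriptors.

```python
import numpy as np

rng = np.random.default_rng(2)

# Synthetic stand-in data: one latent source drives both "gesture"
# features (4-D) and "audio descriptor" features (3-D), plus noise.
n = 200
latent = rng.normal(size=n)
G = np.outer(latent, rng.normal(size=4)) + 0.3 * rng.normal(size=(n, 4))
A = np.outer(latent, rng.normal(size=3)) + 0.3 * rng.normal(size=(n, 3))

def first_canonical_corr(X, Y):
    """Largest canonical correlation: top singular value of the
    whitened cross-covariance matrix."""
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    Cxx = X.T @ X / (len(X) - 1)
    Cyy = Y.T @ Y / (len(Y) - 1)
    Cxy = X.T @ Y / (len(X) - 1)

    def inv_sqrt(C):
        # Inverse matrix square root via eigendecomposition (C is SPD).
        w, V = np.linalg.eigh(C)
        return V @ np.diag(w ** -0.5) @ V.T

    M = inv_sqrt(Cxx) @ Cxy @ inv_sqrt(Cyy)
    return float(np.linalg.svd(M, compute_uv=False)[0])

rho = first_canonical_corr(G, A)   # high: the shared source is recovered
```

A canonical correlation near 1 indicates a strong linear relationship between the best linear combinations of the two blocks, which is the quantity the article uses to reason about gesture-sound coupling.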
SMC’12 Proceedings of the 9th Sound and Music Computing Conference, 2012
We propose a hierarchical approach for the design of gesture-to-sound mappings, with the goal of taking into account multilevel time structures in both gesture and sound processes. This allows for the integration of temporal mapping strategies, complementing mapping systems based on instantaneous relationships between gesture and sound synthesis parameters.
DIS'12 Proceedings of the Designing Interactive Systems Conference, Jan 1, 2012
In this paper, we explore the use of movement qualities as an interaction modality. The notion of movement qualities is widely used in dance practice and can be understood as how a movement is performed, independently of its specific trajectory in space. We implemented our approach in the context of an artistic installation called A light touch, which invites the participant to interact with a moving light spot reacting to the qualities of their hand movements. A user experiment showed that such interaction based on movement qualities tends to enhance the user experience, favouring explorative and expressive usage.
MOCO'14 International Workshop on Movement and Computing
While human-human and human-object interactions involve very rich, complex and nuanced gestures, gestures as captured for human-computer interaction remain relatively simplistic. Our approach is to consider the study of variation in motion input as a way of understanding expression and expressivity in human-computer interaction, and to propose computational solutions for capturing and using these expressive variations. The paper is an attempt at drawing design guidelines for modelling systems that adapt to motion variations. We illustrate them through two case studies: the first model estimates temporal and geometrical motion variations, while the second tracks variations in motion dynamics. These case studies are illustrated in two applications.
baptistecaramiaux.com
Proceedings of NIME 2011, Jan 1, 2011