What Does A Body Know? is a concert work for Digital Ventriloquized Actor (DiVA) and sound clips. A DiVA is a real-time gesture-controlled formant-based speech synthesizer using a Cyberglove®, touchglove, and Polhemus Tracker® as the main interfaces. When used in conjunction with the performer's own voice, solos and "duets" can be performed in real time.
We describe the implementation of an environment for Gesturally-Realized Audio, Speech and Song Performance (GRASSP), which includes a glove-based interface, a mapping/training interface, and a collection of Max/MSP/Jitter bpatchers that allow the user to improvise speech, song, sound synthesis, sound processing, sound localization, and video processing. The mapping/training interface provides a framework for performers to specify by example the mapping between gesture and sound or video controls. We demonstrate the effectiveness of the GRASSP environment for gestural control of musical expression by creating a gesture-to-voice system that is currently being used by performers.
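The mapping-by-example workflow can be illustrated with a small sketch. The code below is a minimal illustration, not the actual GRASSP implementation (which is built from Max/MSP/Jitter bpatchers): the glove features and formant-style control parameters are hypothetical, and a k-nearest-neighbours regressor stands in for whatever mapping the system actually trains.

```python
# Minimal sketch of mapping-by-example: the performer records gesture/sound-parameter
# pairs, and a regressor interpolates new gestures at performance time.
# Feature names, values, and the regressor choice are illustrative assumptions.
import numpy as np
from sklearn.neighbors import KNeighborsRegressor

# Training examples: glove features (e.g., finger-joint angles) -> synth controls
# (e.g., formant frequencies and a gain).  All values are placeholders.
gestures = np.array([
    [0.1, 0.8, 0.2],   # "open hand" pose
    [0.9, 0.1, 0.4],   # "closed hand" pose
    [0.5, 0.5, 0.9],   # "pointing" pose
])
controls = np.array([
    [730.0, 1090.0, 0.8],   # F1, F2 (Hz), gain for an /a/-like sound
    [300.0, 2200.0, 0.6],   # /i/-like sound
    [450.0,  800.0, 0.7],   # /o/-like sound
])

mapper = KNeighborsRegressor(n_neighbors=2, weights="distance")
mapper.fit(gestures, controls)

# At performance time, an incoming glove frame is mapped to control values,
# which would then be sent on to the synthesizer (e.g., over OSC).
live_frame = np.array([[0.3, 0.7, 0.5]])
f1, f2, gain = mapper.predict(live_frame)[0]
print(f"F1={f1:.0f} Hz, F2={f2:.0f} Hz, gain={gain:.2f}")
```

A distance-weighted regressor keeps the recorded poses as exact anchor points while still interpolating smoothly between them, which is the essence of specifying a mapping by example.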
We have added a dynamic biomechanical mapping layer, containing a model of the human vocal tract with tongue muscle activations as input and tract geometry as output, to a real-time gesture-controlled voice synthesizer system used for musical performance and speech research. Using this mapping layer, we conducted user studies comparing control of the model's muscle activations via a 2D set of force sensors with a position-controlled kinematic input space that maps directly to the sound. Preliminary user evaluation suggests that force input was more difficult to use, but the resulting output sound was more intelligible and natural than with the kinematic controller. This result shows that force input is potentially feasible for browsing through a vowel space in an articulatory voice synthesis system, although further evaluation is required.
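To make the contrast between the two controllers concrete, here is a toy one-dimensional sketch, not the actual biomechanical vocal-tract model: the kinematic path maps hand position straight to a vowel-space coordinate, while the force path drives an assumed first-order muscle activation whose dynamics then move that coordinate. All constants are illustrative.

```python
# Toy contrast between the kinematic and force-based control paths.
# The first-order activation dynamics, gains, and damping are assumptions,
# not the behaviour of the actual biomechanical tongue model.

def kinematic_control(hand_position):
    """Position input maps directly to the vowel-space coordinate."""
    return hand_position

def force_control(force, activation, position, dt=0.01, tau=0.05, gain=2.0, damping=4.0):
    """Force input updates a muscle activation, which in turn moves the coordinate."""
    activation += dt * (force - activation) / tau        # activation lags the applied force
    velocity = gain * activation - damping * position    # activation pulls, stiffness restores
    position += dt * velocity
    return activation, position

# Example: hold a constant force and watch the coordinate settle over 2 seconds.
act, pos = 0.0, 0.0
for _ in range(200):
    act, pos = force_control(force=1.0, activation=act, position=pos)
print(f"settled coordinate under constant force: {pos:.2f}")
print(f"kinematic coordinate for the same input: {kinematic_control(1.0):.2f}")
```

The extra dynamical layer is what makes force input harder to steer moment to moment, while also smoothing the trajectory through the vowel space.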
We describe the Responsive User Body Suit (RUBS), a tactile instrument worn by performers that allows the generation and manipulation of audio output using touch triggers. The RUBS system is a responsive interface between organic touch and electronic audio, intimately located on the performer's body. This system offers an entry point into a more intuitive method of music performance. A short overview of body instrument philosophy and related work is followed by the development and implementation process of the RUBS as both an interface and performance instrument. Lastly, observations, design challenges and future goals are discussed.
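As a concrete illustration of touch-trigger logic, the sketch below assumes a pressure-style sensor thresholded with hysteresis; the sensor name, thresholds, and callbacks are hypothetical rather than the actual RUBS design.

```python
# Minimal touch-trigger sketch: a firm press fires a sound event, a release
# ends it, and the two thresholds differ so noisy readings do not chatter.
ON_THRESHOLD = 0.6    # normalized pressure needed to trigger
OFF_THRESHOLD = 0.3   # release level, lower than ON to avoid chattering

def make_trigger(sensor_name, on_event, off_event):
    """Return a stateful function converting raw sensor readings to events."""
    active = False
    def update(value):
        nonlocal active
        if not active and value >= ON_THRESHOLD:
            active = True
            on_event(sensor_name)
        elif active and value <= OFF_THRESHOLD:
            active = False
            off_event(sensor_name)
    return update

# Usage with print callbacks standing in for audio playback.
trigger = make_trigger("left_forearm",
                       on_event=lambda s: print(f"{s}: note on"),
                       off_event=lambda s: print(f"{s}: note off"))
for reading in [0.1, 0.4, 0.7, 0.8, 0.5, 0.2]:
    trigger(reading)
```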
The Integrated Multimodal Score-following Environment (IMuSE) [5] is a software project aimed at developing a system for the creation, rehearsal and performance of score-based interactive computer music compositions. An enhanced version of the NoteAbilityPro [7] music notation software is the central controller in IMuSE. The score contains conventional notes and performance indications as well as discrete and continuous control messages that can be sent to other applications such as MaxMSP [15] or Pure Data (Pd) [11] during performance. In addition, multiple modes of score-following [2] can be used to synchronize the live performance to the score. Score-following strategies include pitch-tracking of monophonic instruments, pitch-tracking and amplitude-tracking of polyphonic instruments, and gesture-tracking of performers' hand movements. In this paper, we present an overview of the IMuSE system, with a focus on its abilities to monitor and coordinate multiple pitch and gesture trackers.
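A minimal sketch of the simplest of these strategies, monophonic pitch-tracking, is shown below. The score data, tolerance, and matching rule are illustrative assumptions rather than the IMuSE algorithm; in the real system the dispatched messages would go to applications such as MaxMSP or Pd.

```python
# Toy monophonic score-follower: incoming detected pitches advance a pointer
# through the expected note list, and each reached event dispatches its
# attached control messages.  Score content and tolerance are placeholders.
SCORE = [
    {"pitch": 60, "messages": ["start reverb"]},     # C4
    {"pitch": 64, "messages": []},                   # E4
    {"pitch": 67, "messages": ["trigger clip 2"]},   # G4
]
TOLERANCE = 0.5  # allowed deviation in semitones

def follow(detected_pitches, score=SCORE):
    position = 0
    for pitch in detected_pitches:
        if position >= len(score):
            break
        if abs(pitch - score[position]["pitch"]) <= TOLERANCE:
            for msg in score[position]["messages"]:
                print(f"event {position}: send '{msg}'")
            position += 1
    return position

# A performance with a slightly flat E4 still advances the follower.
print("reached event", follow([60.1, 63.7, 67.0]))
```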
Gesturally Realized Audio, Speech, and Song Performance (GRASSP) is a software implementation of a real-time parallel-formant speech synthesizer in Max/MSP. The synthesizer is based on the JSRU (Joint Speech Research Unit) parallel-formant speech synthesizer. The resulting synthesizer is controlled by a Cyberglove, a custom glove, a foot pedal, and a Polhemus tracker.
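The parallel-formant idea can be sketched in a few lines: a single excitation source feeds several second-order resonators in parallel, one per formant, and their scaled outputs are summed. The formant values below approximate an /a/ vowel; the bandwidths, gains, and impulse-train excitation are simplifying assumptions rather than the JSRU parameter set.

```python
# Minimal parallel-formant synthesis sketch: impulse-train excitation through
# parallel two-pole resonators, one per formant, summed into one signal.
import numpy as np

FS = 16000           # sample rate (Hz)
F0 = 110             # glottal pitch (Hz)
DURATION = 0.5       # seconds
FORMANTS = [(730, 90, 1.0), (1090, 110, 0.5), (2440, 170, 0.25)]  # (freq, bandwidth, gain)

def resonator(x, freq, bandwidth, fs=FS):
    """Two-pole digital resonator modelling a single formant."""
    r = np.exp(-np.pi * bandwidth / fs)
    theta = 2 * np.pi * freq / fs
    a1, a2 = 2 * r * np.cos(theta), -r * r
    y = np.zeros_like(x)
    for n in range(len(x)):
        y[n] = x[n] + a1 * y[n - 1] + a2 * y[n - 2]   # y[-1], y[-2] read still-zero samples
    return y

# Impulse-train excitation at the pitch period.
n_samples = int(FS * DURATION)
excitation = np.zeros(n_samples)
excitation[::int(FS / F0)] = 1.0

# Sum the parallel formant branches and normalize.
signal = sum(gain * resonator(excitation, f, bw) for f, bw, gain in FORMANTS)
signal /= np.max(np.abs(signal))
print("synthesized", len(signal), "samples")
```

In a gesture-controlled setting, the per-formant frequencies and gains become the real-time control parameters that the glove data drives.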
2008 12th IEEE International Symposium on Wearable Computers, 2008
A mobile gesture-controlled speech synthesis system raises various design and aesthetic issues. Solutions must help create speech/song as effectively and expressively as possible, while still promoting the aesthetic vision of the clothing. This technology will allow users to expand their range of artistic expression, allowing them to Walk the Walk and Talk the Talk.
CHI '11 Extended Abstracts on Human Factors in Computing Systems, 2011
Vocal production is one of the most ubiquitous and expressive activities of people, yet understanding and synthesizing it remain elusive. When vocal synthesis is elevated to include new forms of singing and sound production, fundamental changes to culture and musical expression emerge. Nowadays, Text-To-Speech (TTS) synthesis seems unable to suggest innovative solutions for new computing trends, such as mobility…
CHI '11 Extended Abstracts on Human Factors in Computing Systems, 2011
What Does A Body Know? is a concert work for Digital Ventriloquized Actor (DiVA) and sound clips. A DiVA is a real-time gesture-controlled formant-based speech synthesizer using a Cyberglove®, touchglove, and Polhemus Tracker® as the main interfaces. When used in conjunction with the performer's own voice, solos and "duets" can be performed in real time.
The Journal of the Acoustical Society of America, 2009
We describe progress on creating digital ventriloquized actors (DIVAs). DIVAs use hand gestures to synthesize audiovisual speech and song by means of an intermediate conversion of hand gestures to articulator (e.g., tongue, jaw, lip, and vocal cords) parameters of a computational three-dimensional vocal tract model. Our parallel-formant speech synthesizer is modified to fit within the Max/MSP visual programming language. We added spatial sound and various voice excitation parameters in an easy-to-use environment suitable for musicians. The musician's gesture style is learned from examples. DIVAs will be used in three composed stage works of increasing complexity performed internationally, starting with one performer and culminating in three performers simultaneously using their natural voices as well as the hand-based synthesizer. Training performances will be used to study the processes associated with skill acquisition and the coordination of multiple “voices” within and…
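The intermediate gesture-to-articulator conversion can be pictured with a small sketch. The channel names, articulator parameters, and weight matrix below are illustrative assumptions; in the actual system this mapping is learned from the performer's examples rather than hand-picked.

```python
# Hypothetical linear mapping from glove/tracker channels to named articulator
# parameters that a vocal-tract model would consume.  Weights are placeholders.
import numpy as np

GLOVE_CHANNELS = ["thumb_bend", "index_bend", "wrist_pitch", "hand_height"]
ARTICULATORS = ["jaw_opening", "tongue_height", "tongue_advance", "lip_rounding"]

# Rows: articulators, columns: glove channels.
W = np.array([
    [0.0, 0.0, 0.2, 0.8],   # jaw opening mostly follows hand height
    [0.0, 0.7, 0.3, 0.0],   # tongue height from index finger bend
    [0.0, 0.3, 0.7, 0.0],   # tongue advancement from wrist pitch
    [0.9, 0.1, 0.0, 0.0],   # lip rounding from thumb bend
])

def gesture_to_articulators(glove_frame):
    """Map a vector of normalized glove readings to articulator parameters."""
    values = np.clip(W @ np.asarray(glove_frame), 0.0, 1.0)
    return dict(zip(ARTICULATORS, values))

print(gesture_to_articulators([0.2, 0.9, 0.4, 0.6]))
```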
The Journal of the Acoustical Society of America, 2010
This study investigates vowel mappings for a voice synthesizer controlled by hand gestures for artistic performance. The vowel targets lie on a horizontal plane navigated by the movement of the right hand in front of the performer. Two vowel mappings were explored. In one mapping, the vowels were evenly distributed in a circle to make the vowel targets easier for the performer to find. In the other mapping, the vowels were arranged according to the F2 versus F1 space. Linear hand motions were then made through the vowel space while plotting the formant trajectories. The evenly distributed mapping resulted in formant trajectories that were not monotonic; the F1 and F2 contours varied up and down as the hand carried out the linear motions. This had the unintended result of producing multiple diphthongs. In contrast, the F2 versus F1 mapping enabled the performer to create monotonic formant trajectories and the perception of a single diphthong. The performer found it easier to sp…
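The difference between the two mappings can be made concrete with a short sketch: one maps hand position linearly onto the F2 versus F1 plane, the other interpolates formants from vowel targets laid out evenly on a circle. The vowel formant values are textbook approximations, and the target layout and inverse-distance interpolation are illustrative assumptions rather than the system's actual scheme.

```python
# Contrast of the two vowel mappings: linear F2-vs-F1 scaling produces
# monotonic formant trajectories for a straight hand motion, while the
# evenly spaced circle of targets can push formants up and down.
import numpy as np

VOWELS = {"i": (300, 2200), "e": (500, 1800), "a": (730, 1090), "o": (450, 800), "u": (325, 700)}

def f2f1_mapping(x, y, f1_range=(250, 800), f2_range=(600, 2300)):
    """Hand position in [0,1]^2 scaled directly onto the F1/F2 plane."""
    f1 = f1_range[0] + y * (f1_range[1] - f1_range[0])
    f2 = f2_range[0] + x * (f2_range[1] - f2_range[0])
    return f1, f2

def circle_mapping(x, y):
    """Vowel targets evenly spaced on a circle; formants from inverse-distance weights."""
    angles = np.linspace(0, 2 * np.pi, len(VOWELS), endpoint=False)
    targets = {v: (0.5 + 0.4 * np.cos(a), 0.5 + 0.4 * np.sin(a)) for v, a in zip(VOWELS, angles)}
    weights = {v: 1.0 / (np.hypot(x - tx, y - ty) + 1e-6) for v, (tx, ty) in targets.items()}
    total = sum(weights.values())
    f1 = sum(w * VOWELS[v][0] for v, w in weights.items()) / total
    f2 = sum(w * VOWELS[v][1] for v, w in weights.items()) / total
    return f1, f2

# Trace a straight hand motion across the plane under both mappings.
for t in np.linspace(0, 1, 5):
    f1a, f2a = f2f1_mapping(t, t)
    f1b, f2b = circle_mapping(t, t)
    print(f"t={t:.2f}  F2/F1 map: F1={f1a:.0f} F2={f2a:.0f}   circle map: F1={f1b:.0f} F2={f2b:.0f}")
```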