Large-Scale Mobile Audio Environments
for Collaborative Musical Interaction
Mike Wozniewski &
Nicolas Bouillot
Centre for Intelligent Machines
McGill University
Montréal, Québec, Canada
{mikewoz,nicolas}@cim.mcgill.ca
Zack Settel
Université de Montréal
Montréal, Québec, Canada
zs@sympatico.ca
Jeremy R. Cooperstock
Centre for Intelligent Machines
McGill University
Montréal, Québec, Canada
jer@cim.mcgill.ca
ABSTRACT
New application spaces and artistic forms can emerge when
users are freed from constraints. In the general case of
human-computer interfaces, users are often confined to a
fixed location, severely limiting mobility. To overcome this
constraint in the context of musical interaction, we present
a system to manage large-scale collaborative mobile audio
environments, driven by user movement. Multiple participants navigate through physical space while sharing overlaid virtual elements. Each user is equipped with a mobile
computing device, GPS receiver, orientation sensor, microphone, headphones, or various combinations of these technologies. We investigate methods of location tracking, wireless audio streaming, and state management between mobile
devices and centralized servers. The result is a system that
allows mobile users, with subjective 3-D audio rendering,
to share virtual scenes. The audio elements of these scenes
can be organized into large-scale spatial audio interfaces,
thus allowing for immersive mobile performance, locative
audio installations, and many new forms of collaborative
sonic activity.
Keywords
sonic navigation, mobile music, spatial interaction, wireless
audio streaming, locative media, collaborative interfaces
1. INTRODUCTION & BACKGROUND
With the design of new interfaces for musical expression,
it is often argued that control paradigms should capitalize
on natural human skills and activities. As a result, a wide
range of tracking solutions and sensing platforms have been
explored, which translate human action into signals that can
be used for the control of music and other forms of media.
The physical organization of interface components plays an
important role in the usability of the system, since user motion naturally provides kinesthetic feedback, allowing a user
to better remember the style of interaction and gestures required to trigger certain events. Also, as digital devices become increasingly mobile and ubiquitous, we expect interactive applications to become more distributed and integrated
within our physical environment. This prospect yields a
new domain for musical interaction employing augmented-reality interfaces and large multi-user environments.
We present a system where multiple participants can navigate about a university campus, several city blocks, or an
even larger space. Participants are equipped with position-tracking and
orientation-sensing technology, and their locations are relayed
to other participants and to any servers that are managing
the current state. With a mobile device for communication,
users are able to interact with an overlaid virtual audio environment containing a number of processing elements. The
physical space thus becomes a collaborative augmented-reality environment where immersive musical interfaces can
be explored. Musicians can input audio at their locations,
while virtual effects processors can be scattered through
the scene to transform those signals. All users, performers and audience alike, receive subjectively rendered spatial
audio corresponding to their particular locations, allowing
for unique experiences that are not possible in traditional
music performance venues.
Permission to make digital or hard copies of all or part of this work for
personal or classroom use is granted without fee provided that copies are
not made or distributed for profit or commercial advantage and that copies
bear this notice and the full citation on the first page. To copy otherwise, to
republish, to post on servers or to redistribute to lists, requires prior specific
permission and/or a fee.
NIME08, Genova, Italy
Copyright 2008 Copyright remains with the author(s).
Figure 1: A mobile performer
1.1 Background
In earlier work, we have spent significant time exploring
how virtual worlds can be used as musical interfaces. The
result of this investigation has led to the development of the
Audioscape engine [30]¹, which allows for the spatial organization of sound processing, and provides an audiovisual
rendering of the scene for feedback. Audio elements can
be arranged in a 3-D world and precise control over the directivity of propagating audio is provided to the user. For
example, an audio signal emitted by a sound generator may
be steered toward a sound processor that exists at some
3-D location. The processed signal may again be steered
towards a virtual microphone that captures and busses the
sound to a loudspeaker where it is heard. The result is a
technique of spatial signal bussing, which lends itself particularly
well to many common mixing operations. Gain control, for
instance, is accomplished by adjusting the distance between
two nodes in the scene, while filter parameters can be controlled by changing orientations.
¹ Available at www.audioscape.org
The paradigm of organizing sound processing in three-dimensional space has been explored in some of our previous
publications [26, 27, 28]. We have seen that users easily understand how to interact with these scenes, especially when
actions are related to everyday activity. For instance, it is
instantly understood that increasing the distance between
two virtual sound elements will decrease the intensity of
the transmitted signal, and that pointing a sound source
in a particular direction will result in a stronger signal at
the target location. We have designed and prototyped several applications using these types of interaction techniques,
including 3-D mixing, active listening, and virtual effects racks [27, 29]. Furthermore, we began to share virtual
scenes between multiple participants, each with subjective
audio rendering and steerable audio input, allowing for the
creation of virtual performance venues and support for virtual reality video conferencing [31].
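To make the spatial bussing idea concrete, the following minimal sketch computes the gain between two nodes from their separation and from the direction in which the source is aimed. It is not the Audioscape implementation: the inverse-distance law, the cardioid directivity pattern, and all function names are assumptions chosen purely for illustration.

```python
import math


def distance_gain(distance_m, reference_m=1.0):
    """Assumed inverse-distance law: unity gain inside reference_m, then 1/d."""
    return reference_m / max(distance_m, reference_m)


def directivity_gain(source_pos, aim_dir, target_pos):
    """Assumed cardioid pattern: 1.0 on-axis, 0.0 directly behind the source."""
    to_target = [t - s for s, t in zip(source_pos, target_pos)]
    norm_target = math.sqrt(sum(c * c for c in to_target))
    norm_aim = math.sqrt(sum(c * c for c in aim_dir))
    if norm_target == 0.0 or norm_aim == 0.0:
        return 1.0  # coincident nodes or no aim vector: treat as omnidirectional
    cos_angle = sum(a * t for a, t in zip(aim_dir, to_target)) / (norm_aim * norm_target)
    return 0.5 * (1.0 + cos_angle)


def bus_gain(source_pos, aim_dir, target_pos):
    """Combined gain applied to a signal steered from a source to a target node."""
    distance = math.dist(source_pos, target_pos)
    return distance_gain(distance) * directivity_gain(source_pos, aim_dir, target_pos)


# A source at the origin aimed along +x, with a target 2 m away on the x axis.
aimed_at_target = bus_gain((0, 0, 0), (1, 0, 0), (2, 0, 0))
aimed_away = bus_gain((0, 0, 0), (-1, 0, 0), (2, 0, 0))
```

Under these assumptions, moving the target further away or turning the source away from it both reduce the transmitted level, which matches the behaviour users are described as finding intuitive.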
While performers appreciated the functionality of these
earlier systems, they were nevertheless hampered by constraints on physical mobility. These applications operated
mainly with game-like techniques, where users stood in front
of screens, and navigated through the scene using controllers
such as joysticks or gamepads. The fact that the gestures for
moving and steering sound were abstracted through these
intermediate devices resulted in a lack of immersive feeling
and made the interfaces more complicated to learn.
We thus decided to incorporate more physical movement,
for example, sensing the user’s head movement with an orientation sensor attached to headphones, and applying this
to effect changes in the apparent spatial audio rendering.
To further extend this degree of physical involvement, we began to add real-world location awareness to the system, allowing users to move around the space physically instead of
virtually. For example, our 4Dmix3 installation [4] tracked
up to six users in an 80 m² gallery space. The motion of
each user controlled the position of a recording buffer, which
could travel among a number of virtual sound generators in
the scene. The result was a type of remixing application,
where users controlled the mix by moving through space.
In the remainder of this paper, we explore the use of
larger-scale position tracking, such as that of the Global Positioning System (GPS), and the resulting challenges and
opportunities that such technology presents. We evolve
our framework to support a more distributed and mobile-capable architecture, which results in the need for wireless
audio streaming and the distribution of information about
the mobile participants. Sections 2 and 3 describe the additional technical elements that need to be introduced into the
system to support wireless and mobile applications, while
Section 4 demonstrates a prototypical musical application
using this new architecture. Musicians in the Mobile Audioscape are able to navigate through an outdoor environment containing a superimposed set of virtual audio elements. Real physical gestures can be used to steer and
move sound through the space, providing an easily understood paradigm of interaction in what can now be thought
of as a mobile music venue.
1.2 Mobile Music Venues
By freeing users from the confines of computer terminals and interfaces that severely limit mobility, application
spaces emerge that can operate in a potentially unbounded
physical space. These offer many novel possibilities that can
lead to new artistic approaches; or they can re-contextualize
existing concepts that can then be revisited and expanded
upon. An excellent example is parade music, where sound
emission is spatially dynamic or mobile; a passive listener
remains in one place while different music is coming and
going. One hundred years ago, Charles Ives integrated this
concept into symphonic works, where different musical material flowed through the score, extending our notions of
counterpoint to include those based on proximity of musical material. The example of parade music listening expands to include two other cases: a mobile listener can walk
with or against the parade, yielding additional relationships
to the music. Our work also integrates the concept of active listening; material may be organized topographically
in space, produced by mobile performers and encountered
non-linearly by mobile listeners. From this approach come
several rich musical forms which, like sculpture, integrate
point of view; listeners/observers create their own unique
rendering. Thus, artists may create works that explore the
spatial dynamics of musical experience, where flowing music content is put in counterpoint by navigation. Musical
scores begin to resemble maps, and listeners play a larger
role in authoring their experiences.
1.3 Related Work
With respect to collaborative musical interfaces, Blaine
and Fels provide an overview of many systems, classifying
them according to attributes such as scale, type of media,
amount of directed interaction, learning curve, and level
of physicality, among others [7]. However, most of these
systems require users to be in a relatively fixed location in
front of a computer. The move to augmented- or mixed-reality spaces seems like a natural evolution, offering users
a greater level of immersion in the collaboration, and their
respective locations can be used for additional control.
In terms of locative media, some projects have considered
the task of tagging geographical locations with sound. The
[murmur] project [2] is one simple example, where users
tag interesting locations with phone numbers. Others can
call the numbers using their mobile phones and listen to
audio recordings related to the locations. Similarly, the
Hear&There project [20] allows recording audio at a given
GPS coordinate, while providing a spatial rendering of other
recordings as users walk around. Unfortunately, this is limited to a single-person experience, where the state of the
augmented reality scene is only available on one computer.
Tanaka proposed an ad-hoc (peer-to-peer) wireless networking strategy to allow multiple musicians to share sound simultaneously using hand-held computers [22]. Later work
by Tanaka and Gemeinboeck [23] capitalized on location-based services available on 3G cellular networks to acquire
coarse locations of mobile devices. They proposed the creation of locative media instruments, where geographic localization is used as a musical interface.
Large areas can also be used for musical interaction in
other ways. Sonic City [16] proposed mobility, rather than
location alone, as the basis for interaction. As a user walks around a
city, urban sounds are processed in real time as a result of
readings from devices such as accelerometers, light sensors,
temperature sensors, and metal detectors. Similarly, the
Sound Mapping [19] project included gyroscopes along with
GPS sensors in a suitcase that users could push around a
small area. Both position changes and subtle movements
could be used to manipulate the sound that was transmitted
between multiple cases in the area via radio signal.
Orientation or heading can also provide useful feedback,
since spatial sound conveys a great deal of information about
directions of objects and the acoustics of an environment.
Projects including GpsTunes [21] and Melodious Walkabout
[15] use this type of information to provide audio cues that
guide individuals in specific directions.
We take inspiration from the projects mentioned above,
and incorporate many of these ideas into our work. However, real-time high-fidelity audio support for multiple individuals has not been well addressed. Tanaka’s work [22], as
well as some of our past experiences [8], demonstrate how
we can deal with the latencies associated with distributed
audio performance, but minimizing latency remains a major focus of our work. The ability to create virtual audio
scenes will be supported with some additions to our existing
Audioscape engine. To address the need for distributed mobile interaction, we are adding large-scale location sensing
and the ability to distribute state, signals, and computation among mobile clients effectively. These challenges are
addressed in the following sections.
2. LOCATIVE TECHNOLOGY
In order to support interaction in large-scale spaces, we
require methods of tracking users and communicating between them. A variety of mobile devices are available for
this purpose, potentially equipped with powerful processors,
wireless transmission, and sensing technologies. For our initial prototypes, we chose to develop on Gumstix (verdex
XM4-bt) processors with expansion boards for audio I/O,
GPS, storage, and WiFi communication [17]. These devices
have the benefit of being full-function miniature computers
(FFMC) with a large development community, and as a
result, most libraries and drivers can be supported easily.
2.1 Wireless Standards
Given that the most generally available wireless technologies on mobile devices are Bluetooth and WiFi, we consider
the benefits and drawbacks of each of these standards. For
transmission of data between sensors located on the body
and the main processing device, Bluetooth is a viable solution. However, even with Bluetooth 2.0, the practical transfer
rate is typically limited to approximately 2.1 Mbps. If we
want to send or receive audio (16-bit samples at 44.1 kHz),
approximately 700 kbps of bandwidth is needed for each
stream. In theory, this allows for interaction between up
to three individuals, where each user sends one stream and
receives two. Given the need to support a greater number
of participants, we are forced to use WiFi.² Furthermore,
the range of Bluetooth is limiting, whereas WiFi can relay
signals through access points. We can also make
use of higher-level protocols such as the Optimized Link State
Routing protocol (OLSR) [18], which computes optimal forwarding paths for ad-hoc nodes. This is a viable way to
reconfigure wireless networks while individuals are moving.
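The bandwidth argument above can be checked with a few lines of arithmetic. The sketch below assumes mono, uncompressed PCM and ignores packet headers and protocol overhead; the function names are illustrative only.

```python
def pcm_bitrate_kbps(sample_rate_hz, bits_per_sample, channels=1):
    """Raw bitrate of an uncompressed PCM stream, excluding any packet overhead."""
    return sample_rate_hz * bits_per_sample * channels / 1000.0


def max_participants(link_kbps, stream_kbps):
    """Each participant's link carries N streams in total (1 sent, N - 1 received)."""
    return int(link_kbps // stream_kbps)


exact_kbps = pcm_bitrate_kbps(44100, 16)            # 705.6 kbps per mono stream
rounded_limit = max_participants(2100, 700)         # 3, using the rounded 700 kbps figure
exact_limit = max_participants(2100, exact_kbps)    # 2, using the exact bitrate
```

With the rounded figure of 700 kbps, a 2.1 Mbps link fits exactly three streams per participant. With the exact 705.6 kbps it falls just short, which suggests that three participants over Bluetooth is a best-case limit with no headroom.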
2.2 GPS
GPS has seen widespread integration into a variety of
commodity hardware such as cell phones and PDAs. These
provide position tracking in outdoor environments, typically
associated with the 3-D geospatial coordinates of users.
However, accuracy in consumer-grade devices is quite poor,
ranging from approximately 5m in the best case (high-quality receiver with open skies) [25] to 100m or more
[6]. Several methods exist to reduce error. For example,
differential GPS (DGPS) uses carefully calibrated base stations that transmit error corrections over radio frequencies.
The idea is that mobile GPS units in the area will have
similar positional drift, and correcting this can yield accuracies of under 1m. Another technique, known as assisted
GPS (AGPS), takes advantage of common wireless networks
(cellular, Bluetooth, WiFi) in urban environments to access
reference stations with a clear view of the sky (e.g., on the
roofs of buildings). Although accuracy is still on the order of 15m, the interesting benefit of this system is that
localization can be attained indoors (with an accuracy of
approximately 50m) [6].
² We note viable alternatives on the horizon, such as the
newly announced SOUNDabout Lossless codec, which allows even smaller audio streams to be sent over Bluetooth.
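The intuition behind differential correction, that nearby receivers share a similar positional drift, can be illustrated with a deliberately simplified sketch. It works directly on latitude and longitude, whereas real DGPS systems typically apply corrections to the underlying satellite range measurements. The function name and all coordinates are hypothetical.

```python
def dgps_correct(rover_fix, base_fix, base_truth):
    """Subtract the base station's observed error from a nearby rover's fix.

    All arguments are (latitude, longitude) tuples in degrees. Assumes the
    rover and the base station experience the same drift at the same moment.
    """
    error_lat = base_fix[0] - base_truth[0]
    error_lon = base_fix[1] - base_truth[1]
    return (rover_fix[0] - error_lat, rover_fix[1] - error_lon)


# Hypothetical example: a surveyed base station and a rover sharing one drift.
base_truth = (45.5048, -73.5772)
drift = (4e-5, -3e-5)
base_fix = (base_truth[0] + drift[0], base_truth[1] + drift[1])
rover_truth = (45.5060, -73.5780)
rover_fix = (rover_truth[0] + drift[0], rover_truth[1] + drift[1])
corrected = dgps_correct(rover_fix, base_fix, base_truth)
```

In practice the shared-drift assumption only holds approximately and degrades with distance from the base station, so the residual error is never exactly zero as it is in this idealized example.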
2.3 Orientation & Localization
While GPS devices provide location information, it is also
important to capture a listener’s head orientation so that
spatial cues can be provided, the resulting sound appearing
to propagate from a particular direction. Most automotive
GPS receivers report heading information by tracking the
vehicle trajectory over time. This is a viable strategy for inferring the orientation of a vehicle, but a listener’s head can
change orientation independently of body motion. Moreover, the types of applications we are targeting will likely
involve periods of time where a user does not change position, but stays in one place and orients his or her body in
various directions. Therefore, additional orientation sensing
seems to be a requirement.
In human psychoacoustic perception, accuracy and responsiveness of orientation information are important, since
a listener’s ability to localize sound is highly dependent on
changes in phase, amplitude, and spectral content with respect to head motion. Responsiveness, in particular, is a
significant challenge, considering the wireless nature of the
system. Listeners will be moving their heads continuously
to help localize sounds, and a delay of more than 70ms in
spatial cues can hinder this process [10]. Furthermore, it
has been demonstrated that head-tracker latency is most
noticeable in augmented reality applications, as a listener
can compare virtual sounds to reference sounds in the real
environment. In these cases, latencies as low as 25ms can be
detected, and begin to impair performance in localization
tasks at slightly greater values [11]. It is therefore suggested
that latency be maintained below 30ms.
To track head orientation, we attach an inertial measurement unit (IMU) to the headphones of each participant,
capable of sensing instantaneous 3-D orientation with an
error of less than 1 degree. It should be mentioned that not
all applications will require this degree of precision, and
some deployments could potentially make use of trajectory-based orientation information. For instance, the Melodious
Walkabout [15] uses aggregated GPS data to determine the
direction of travel, and provides auditory cues to guide individuals in specific directions. Users hear music to their
left if they are meant to take a left turn, whereas a low-pass
filtered version of their audio is heard if they are traveling
in the wrong direction. We can conceive of other types of
applications, where instantaneous head orientation is not
needed, and users could adjust to the paradigm of hearing audio spatialization according to trajectory rather than
line of sight. Of particular interest are high-velocity applications such as skiing or cycling, where users are generally
looking forward, in the direction of travel. Such constraints
can help with predictions of possible orientations, while the
faster speed helps to overcome the coarse resolution of current GPS technology.
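Trajectory-based orientation of the kind discussed above can be sketched as follows: a heading is estimated from two successive GPS fixes using the standard initial great-circle bearing formula, and a virtual source is then placed relative to that heading. This is an illustrative sketch rather than the method used by any of the cited systems, and the function names are assumptions.

```python
import math


def heading_from_fixes(lat1, lon1, lat2, lon2):
    """Initial great-circle bearing from fix 1 to fix 2, in degrees clockwise from north."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dlon = math.radians(lon2 - lon1)
    x = math.sin(dlon) * math.cos(phi2)
    y = math.cos(phi1) * math.sin(phi2) - math.sin(phi1) * math.cos(phi2) * math.cos(dlon)
    return math.degrees(math.atan2(x, y)) % 360.0


def relative_azimuth(listener_heading_deg, bearing_to_source_deg):
    """Angle of a source relative to the listener's heading, in [-180, 180).

    Negative values are to the listener's left, positive values to the right.
    """
    return (bearing_to_source_deg - listener_heading_deg + 180.0) % 360.0 - 180.0


# A listener travelling due east, with a virtual source due north of them.
heading = heading_from_fixes(0.0, 0.0, 0.0, 1.0)
azimuth = relative_azimuth(heading, 0.0)
```

Because the heading is derived from position differences, it becomes unreliable when the displacement between fixes is comparable to the GPS error, which is one reason higher-velocity applications are better suited to this approach.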
3. WIRELESS AUDIO STREAMING
The move to mobile technology presents significant design challenges in the domain of audio transmission, largely
related to scalability and the effects of latency on user experience. More precisely, a certain level of quality needs to be
maintained to ensure that mobile performers and audience
members experience audio fidelity that is comparable to
traditional venues. The design of effective solutions should
take into account that WiFi networks provide variable performance based on the environment, and that small and
lightweight mobile devices are, at present, limited in terms
of computation capabilities.
3.1 Scalability
Reliance on unicast communication between users in a
group suffers from a potential n² growth in the number of audio interactions between them and, in turn, leads to a bandwidth explosion. We have
investigated a number of solutions to deal with this problem.
Multicast technology, for instance, allows devices to send
UDP packets to an IP multicast address that virtualizes a
group of receivers. Interested clients are able to subscribe to
the streams of relevance, drastically reducing the overall required bandwidth. However, IP multicast over IEEE 802.11
wireless LAN is known to exhibit unacceptable performance
[14] due to unsupported collision avoidance and acknowledgement at the MAC layer. Our benchmark tests confirm
that multicast transmission experienced higher jitter than
unicast, mandating a larger receiver buffer to maintain quality. Furthermore, packet loss for the multicast tests was in
the order of 10-15%, resulting in a distorted audio stream,
while unicast had almost negligible losses of 0.3%. Based on
these results, we decided to rely for now on a point-to-point
streaming methodology while experimenting with emerging
non-standard multicast protocols, in anticipation of future
improvements.
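The scaling behaviour discussed in this section can be made explicit by counting the streams that each topology places on the network. The sketch below counts one stream per sender and receiver pair; the function names are illustrative, and the server-relayed case is included only for comparison.

```python
def unicast_mesh_streams(n_users):
    """Every user sends a separate unicast stream to every other user."""
    return n_users * (n_users - 1)


def server_relay_streams(n_users):
    """Every user sends one stream to a server and receives one rendered mix."""
    return 2 * n_users


def multicast_streams(n_users):
    """Every user sends a single stream to a multicast group address."""
    return n_users


for n in (3, 6, 10):
    print(n, unicast_mesh_streams(n), server_relay_streams(n), multicast_streams(n))
```

The quadratic growth of the full unicast mesh is what motivates interest in multicast, even though the measured loss and jitter currently rule it out on standard 802.11 networks.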
3.2 Low Latency Streaming
Mobile applications tend to rely on compression algorithms to respect bandwidth constraints. As a result they
often incur signal delays that challenge musical interaction
and performer synchronization. Acceptable latency tolerance depends on the style of music, with figures as low as
10ms [12] for fast pieces. More typically, musicians have difficulty synchronizing with latencies above 50ms [13]. Most
audio codecs require greater than this amount of encoding time.³ Due in part to the limited computational resources
available on our mobile devices, we instead transmit uncompressed audio, thus fully avoiding codec delays in the
system.
Other sources of latency include packetization delay, corresponding to the time required to fill a packet with data
samples for transmission, and network delay, which varies
according to network load and results in jitter at the receiver. Soundcard latencies also play a role, but we consider this to be outside of our control. The most effective
method for managing these remaining delays may be to minimize the size of transmitted packets. By sending a smaller
number of audio samples in each network packet, we also
decrease the amount of time that we must wait for those
samples to arrive from the soundcard.
In this context, we have developed a dynamically reconfigurable transmission protocol for low-latency, high-fidelity
audio streaming. Our protocol, nstream, supports dynamic
adjustment of sender throughput and receiver buffer size.
This is accomplished by switching between different levels
of PCM quantization (8, 16 and 32 bit), packet size, and receiver buffer size. The protocol is developed as an external
for Pure Data [3], and can be deployed on both a central
server and a mobile device.
³ Possible exceptions are the Fraunhofer Ultra-Low Delay Codec (offering a 6ms algorithmic delay) [24] and the
SOUNDabout Lossless codec (claiming under 10ms).
In benchmark tests, we have successfully transmitted uncompressed streams with an outgoing packet size of 64 samples. The receiver buffer holds two packets in the queue
before decoding, meaning that a delay of three packets is
encountered before the result can be heard. With a sampling rate of 44.1kHz, this translates to a packetization and
receiving latency of 3 × (64/44.1) = 4.35ms. In addition,
the network delay can be as low as 2ms, provided that the
users are relatively close to each other, and typically does
not exceed 10ms for standard wireless applications. The
sum of these latencies is in the order of 7-15ms.
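The latency figures in this paragraph follow from a simple relation, written out here as a worked equation. The symbols are introduced for illustration only and do not appear in the original text: $k$ is the number of packet durations incurred before playback, $N$ the number of samples per packet, $f_s$ the sampling rate, and $L_{\mathrm{net}}$ the network delay.

```latex
\begin{align*}
L_{\mathrm{buf}} &= k \cdot \frac{N}{f_s}
  = 3 \times \frac{64}{44.1\ \mathrm{kHz}} \approx 4.35\ \mathrm{ms} \\
L_{\mathrm{total}} &= L_{\mathrm{buf}} + L_{\mathrm{net}},
  \qquad 2\ \mathrm{ms} \le L_{\mathrm{net}} \le 10\ \mathrm{ms}
  \;\Rightarrow\; 6.35\ \mathrm{ms} \le L_{\mathrm{total}} \le 14.35\ \mathrm{ms}
\end{align*}
```

Since $L_{\mathrm{buf}}$ scales linearly with $N$, doubling the packet size doubles the packetization and receiving latency while halving the packet rate.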
Practical performance will, of course, depend on the wireless network being used and the number of streams transmitted. Our experiments show that high packet rate results
in network instability and high jitter. In such situations it
is necessary to increase packet size to help maintain an acceptable packet rate. This motivates us, as future work, to
investigate algorithms for autonomous adaptation of low-latency protocols that deal both with quality and scalability.
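As a rough illustration of the trade-off described above, the following sketch picks the smallest packet size that keeps the aggregate packet rate under a ceiling. It is not the adaptation algorithm envisaged as future work; the ceiling value, the candidate sizes, and the function names are all hypothetical.

```python
def packet_rate(streams, samples_per_packet, sample_rate_hz=44100):
    """Aggregate number of packets per second for a given number of streams."""
    return streams * sample_rate_hz / samples_per_packet


def smallest_packet_size(streams, max_packets_per_s, sample_rate_hz=44100,
                         candidates=(64, 128, 256, 512, 1024)):
    """Return the smallest candidate size whose packet rate stays under the ceiling.

    Falls back to the largest candidate when no size satisfies the ceiling.
    """
    for size in candidates:
        if packet_rate(streams, size, sample_rate_hz) <= max_packets_per_s:
            return size
    return candidates[-1]


# With a hypothetical ceiling of 1000 packets per second:
single_stream = smallest_packet_size(1, 1000)   # 64 samples per packet
four_streams = smallest_packet_size(4, 1000)    # 256 samples per packet
```

A practical scheme would also need to measure jitter and loss at run time rather than rely on a fixed ceiling, since the acceptable packet rate depends on the wireless environment.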
4. MOBILE AUDIOSCAPE
Our initial prototyping devices, Gumstix, were chosen
to provide: 1) wireless networking for bidirectional high-quality, low-latency audio and data streams, 2) local audio processing, 3) on-board device hosting for navigation
and other types of USB or Bluetooth sensors, 4) minimal
size/weight, and 5) Linux support. A more detailed explanation of our hardware infrastructure can be found in
another publication [9], in particular, the method of Bluetooth communication between Gumstix and sensors.
To develop on these devices, a cross-compilation toolchain
was needed that could produce binaries for the ARM-based
400MHz Gumstix processors (Marvell’s PXA270). The first
library that we needed to build was a version of Pure Data
(Pd), which is used extensively for audio processing and
control signal management by our Audioscape engine. In particular, we used Pure Data anywhere (PDa), a special
fixed-point version of Pd for use with the processors typically found on mobile devices [5]. Several externals needed
to be built for PDa, including a customized version of the
Open Sound Control (OSC) objects, where multicast support was added, and the nstream object, mentioned in Section 3.2. The latter was also specially designed to support
both regular Pd and PDa, using sample conversion for interoperability between an Apple laptop, PC and Gumstix
units.
We also supplied each user with an HP iPAQ, loaded
with a customized application that could graphically represent their location on a map. This program was authored
with HP Mediascape software [1], which supports the playback of audio, video, and even Flash based on user position.
The most useful aspect of this software was the fact that
we could use Flash XML Sockets to receive GPS locations
of other participants and update the display accordingly.
Although we used the Compact Flash GPS receivers with
the iPAQs for sending GPS data, the interface between Mediascape software and the Flash program running within it
only allowed for updates at 2Hz, corresponding to a latency
of at least 500ms before position-based audio changes were
heard. The use of the GPSstix receiver, directly attached
to the Gumstix processor, is highly recommended to anyone
attempting to reproduce this work.
Figure 2: Mobile Audioscape architecture. Solid
lines indicate audio streaming while dotted lines
show transmission of control signals.
Figure 3: Two participants jamming in a virtual
echo chamber, which has been arbitrarily placed on
the balcony of a building at the Banff Centre.
The resulting architecture is illustrated in Figure 2. Input audio streams are sent as mono signals to an Audioscape
server on a nearby laptop. The server also receives all control data via OSC from the iPAQ devices and stores location
information for each user. A spatialized rendering is computed, and stereo audio signals are sent back to the users.
For all streams, we send audio with a sampling rate of 44.1
kHz and 16-bit samples.
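To illustrate what a position update sent as control data might look like, the sketch below encodes an OSC 1.0 message containing three 32-bit floats. The address pattern `/user/1/position` and the argument layout are hypothetical, since the paper does not specify the OSC namespace used by the system. Only the byte-level encoding (null-terminated strings padded to four bytes, big-endian floats) follows the OSC specification.

```python
import struct


def _osc_string(text):
    """Encode an OSC string: ASCII, null-terminated, padded to a multiple of 4 bytes."""
    data = text.encode("ascii") + b"\x00"
    return data + b"\x00" * (-len(data) % 4)


def osc_message(address, *floats):
    """Build an OSC message whose arguments are all 32-bit big-endian floats."""
    type_tags = "," + "f" * len(floats)
    payload = b"".join(struct.pack(">f", value) for value in floats)
    return _osc_string(address) + _osc_string(type_tags) + payload


# Hypothetical position update for user 1 (x, y, z in metres within the scene).
packet = osc_message("/user/1/position", 1.0, 2.0, 0.0)
```

Such a packet would typically be sent to the server as a single UDP datagram, for example with `socket.sendto(packet, (host, port))` from the Python standard library.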
In terms of network topology, wireless ad-hoc connections
are used, allowing users to venture far away from buildings
with access points (provided that the laptop server is moved
as well). Due to the number of streams being transmitted,
audio is sent with 256 samples per packet, which ensures an
acceptable packet rate and reduces jitter on the network.
The result is a latency of 3 × (256/44.1) = 17.4ms for packetization and a minimal network delay of about 2ms. However, since audio is sent to a central server for processing
before being heard, these delays are actually encountered
twice, for a total latency of approximately 40ms. This is
well within the acceptable limit for typical musical performance, and was not noticed by users of the system.
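The end-to-end figure above can be reproduced by treating the route through the server as two identical hops. The sketch below uses only the values quoted in this section (256 samples per packet, three packet durations of buffering, 2ms of network delay); the function names are illustrative.

```python
def hop_latency_ms(samples_per_packet, network_ms, sample_rate_hz=44100,
                   packets_buffered=3):
    """Latency of one hop: packetization and receive buffering plus network delay."""
    buffering_ms = packets_buffered * samples_per_packet / sample_rate_hz * 1000.0
    return buffering_ms + network_ms


def path_latency_ms(hops, samples_per_packet, network_ms):
    """Latency of a path made of several identical hops."""
    return hops * hop_latency_ms(samples_per_packet, network_ms)


via_server = path_latency_ms(2, 256, 2.0)   # performer -> server -> listener
single_hop = path_latency_ms(1, 256, 2.0)   # one device directly to another
```

This gives roughly 38.8ms through the server, consistent with the quoted figure of approximately 40ms, and about half of that for a single hop. Soundcard latency, which the text treats as outside the system's control, is not included.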
The artistic application we designed allows users to navigate through an overlaid virtual audio scene. Various sound
loops exist at fixed locations, where users may congregate
and jam with accompanying material. Several virtual volumetric regions are also located in the environment, allowing
some users to escape within a sonically isolated area of the
scene. Furthermore, each of these enclosed regions serves
as a resonator, providing musical audio processing (e.g., delay, harmonization or reverb) to signals played within. As
soon as players enter such a space, their sounds are modified, and a new musical experience is encountered. Figure
3 shows two such performers, who have chosen to jam in a
harmonized echo chamber. They are equipped with Gumstix and iPAQs, both carried unobtrusively in their pockets.
5. DISCUSSION
Approaching mobile music applications from the perspective of virtual overlaid environments allows novel paradigms
of artistic practice to be realized. The virtualization of performer and audience movement allows for interaction with
sound and audio processing in a spatial fashion that leads to
new types of interfaces and thus, new musical experiences.
We have presented the challenges associated with supporting multiple participants in such a system, including
the need for accurate sensing technologies and network architectures that can support low latency communication in
a scalable fashion. The prototype application that we developed was well-received by those who experimented with it,
but many improvements still need to be made. The coarseness of resolution available in consumer-grade GPS technology is such that an application must span a wide area for it
to be of any value. This is problematic, since the range of
a WiFi network is much smaller, mandating redirection of
signals through additional access points or OLSR peers. If
all signals must first travel to a server for processing, then
distant nodes will suffer from very large latency.
One solution is to distribute the state of the virtual scene
to all client machines, and perform rendering locally on the
mobile devices. For the prototype application that we developed, this would cut latency in half since audio signals
would only need to travel from one device to another, without the need to return from a central processing server. Furthermore, this strategy would allow users to be completely
free in terms of mobility, rather than having to remain within range of
the server for basic functionality. However, for scenes of
any moderate complexity, this demands much more processing power and memory than is currently available in
consumer devices, and of course, the number of users will
still be limited by the available network bandwidth required
for peer-to-peer streaming.
A full investigation into distributing audio streams, state
and computational load will be presented in future work,
but for the moment we have provided a first step into the
exploration of large-scale mobile audio environments. The
multi-user nature of the system coupled with high-fidelity
audio distribution provides a new domain for musical practice. We have already designed outdoor spaces for sonic
investigation, and hope to perform and create novel musical interfaces in this new mobile context.
6. ACKNOWLEDGEMENTS
The authors wish to acknowledge the generous support
of NSERC and Canada Council for the Arts, which have
funded the research and artistic development described in
this paper through their New Media Initiative. The prototype application described in Section 4 was developed in co-production with The Banff New Media Institute (Alberta,
Canada). The authors would like to thank the participants
of the locative media residency for facilitating the work and
in particular, Duncan Speakman, who provided valuable assistance with the HP Mediascape software.