The MIT Press is collaborating with JSTOR to digitize, preserve and extend access to
Computer Music Journal
This content downloaded from 204.168.144.64 on Sun, 13 May 2018 05:27:24 UTC
All use subject to http://about.jstor.org/terms
3-D Sound Spatialization using Ambisonic Techniques

David G. Malham and Anthony Myatt
Music Technology Group
Department of Music
University of York
York YO1 5DD, UK
(dgm2, am12)@unix.york.ac.uk
http://www.york.ac.uk/insts/mustech/welcome.htm
http://www.york.ac.uk/depts/music/welcome.htm

spaced microphones were set left, center, and right, and they fed similarly placed speakers. They were much helped by the support of Leopold Stokowski in this work, and, indeed, excellent stereo recordings of his orchestra are available from as early as 1932.

Around the same time, at EMI in Britain, Alan Blumlein was working on a different approach, based on creating a convincing illusion of the original sound field, rather than the actual sound field itself. His simple, crossed pair of directional microphones that captured directional information on just two channels of audio is now widely regarded as one of the best ways to capture an audio event to be presented over two loudspeakers. The ideas inherent in this approach are also fundamental to the ambisonic 3-D surround sound system.

These stereo techniques were little used over the next few years, mostly as a result of the technological limitations of the available recording systems (despite the advent of multi-track film-based recorders). The next major development was featured in Walt Disney's animated film Fantasia, produced in 1939, again with Leopold Stokowski's involvement (Hope 1979). A team of engineers from RCA and Disney developed a special nine-track recorder, based on nine separate synchronized optical recorders. The orchestra was recorded with 33 microphones that were mixed by (orchestral) section onto six of the tracks. The seventh track contained a mono mix of the first six, the eighth track recorded the sound from a distant microphone, and the ninth track held a metronome or click track.

For the cinema presentation, a three-channel version of the eight-channel original was played back from a four-channel optical recorder that was synchronized with the film. The intention was that this would be played back over 90 loudspeakers spread behind the screen and around the auditorium, although financial pressures meant that this was rarely done in practice. Notches on the edge of the soundtrack film operated a switching mechanism that sent selected sounds to the side and rear speakers. Adrian Hope indicates that church bells were presented at the rear, and that the Ave Maria chorus progressed from the rear to link with the solo voice at the front. We believe that this may have been the first musical use of electronic sound spatialization.

In Europe during the early 1950s, composers involved in the new electronically based forms of music (and in this we are including both the musique concrète and elektronische Musik schools) became interested in the possibility of using sound positioning and movement compositionally (Schaeffer 1951; Stockhausen 1956). For Gesang der Jünglinge, Karlheinz Stockhausen employed five loudspeakers (or groups of loudspeakers). He intended four to be positioned around the audience and one above. The loudspeakers around the audience were fed from a four-track tape, and the fifth (vertical) signal was provided by a mono tape that was manually synchronized to the four-track tape. Even at the world premiere in the large broadcast studio of the West Deutscher Rundfunk (WDR), Cologne, it was impossible to use this configuration; the fifth loudspeaker had to be placed at center front. Since the premiere, only a four-track mix has been used. Stockhausen says of this piece, "... in this composition, for the first time, the direction and movement of the sounds in space [emphasis in the original] is shaped by a musician, opening up a new dimension in musical experience" (Stockhausen 1956).

Five years earlier, Jacques Poullin was working in Paris on the potentiomètre d'espace system for distributing sound among four loudspeakers, typically two in front of the audience, another above them, and one at the rear. This allowed a performer to position a sound by simply moving a small, hand-held transmitter coil toward or away from four large receiver coils arranged around the performer in a manner that reflected the loudspeaker positions. The sound controlled by the performer came from one track of a special five-track tape recorder; the other four tracks each supplied one loudspeaker to provide a mixture of live and preset sound positioning. It is interesting to note that this "position tracking system" is similar to those now used in virtual reality head-tracking systems. Pierre Schaeffer speaks of the experiments on the spatial projection of sound: "... le nouveau procédé est une dialectique du sons dans l'espace et je pense que le terme de musique spatiale [emphasis in original]
lui conviendrait mieux que celui de stéréophonie," or, "the new process is a dialectic of sound in space and I think that the term spatial music could fit it better than stereophony" (Schaeffer 1951).

One of the most significant examples of spatial music was composed during the 1950s: Edgar Varèse's Poème électronique. Featured in the Philips pavilion at the Brussels World Fair of 1958, it was experienced by up to two million visitors. The technology used was complex and cumbersome, involving 15 tape recorders and 400 loudspeakers (Stimson 1991), and with the might of the Philips Corporation and their engineering expertise, it was highly successful. However, as with all work of this period (and indeed up until the 1970s), it lacked both a simple control system and the support of a comprehensive theory of sound localization.

Several attempts were made to design control systems for the movement of sound during this early period, including systems such as that developed at the West Deutscher Rundfunk in Cologne and used for Karlheinz Stockhausen's Kontakte (1960). This used a rotating, highly directional loudspeaker to distribute sounds between microphones. The outputs of the microphones were then recorded and played back over fixed loudspeakers. As computer-based systems became available, investigations started into various aspects of sound spatialization that were not previously accessible. In particular, work was done on artificial reverberation (Schroeder 1962; Moorer 1979) and Doppler shifts (Chowning 1971), which made a significant contribution to the understanding of distance and movement cues in artificially spatialized sounds. In 1983, F. Richard Moore published a description of a generalized model for spatial sound processing (Moore 1983), incorporated within the cmusic unit generator module, space.

In recent years, a number of systems offering spatial manipulations of sounds have appeared, and the application of computers has been of great benefit. There are both commercial systems, such as the various forms of Dolby Surround, ambisonics, Roland RSS, and QSound, and also more specialized diffusion arrays like the UK-based Birmingham Electro-Acoustic Sound Theatre (BEAST) and Groupe de Musique Expérimentale de Bourges' GMEBaphone.

Approaches to Spatial Reproduction

Electronic systems that allow sounds to be positioned artificially in space can adopt one of two approaches. They can either attempt to mimic the complex changes in timbre, delay, and amplitude that occur directly at our ears as a sound moves from one position to another, or they can generate phantom images (Thiele and Plenge 1977) by amplitude profiling of feeds to multiple speakers. The first approach, based on the Head Related Transfer Function or HRTF (and used in commercial systems such as QSound, Roland's RSS, and the Convolvotron (Begault 1994)), normally requires the use of headphones. Some variants, such as RSS, use interaural cross-talk cancellation (Cooper and Bauck 1989) to enable the images to be presented over a pair of loudspeakers. Systems using interaural cross-talk cancellation only work correctly in an extremely small "sweet spot" within the listening area. Presentation over headphones can be very good, approaching the listener's own capabilities for localization. This assumes, first, that personalized HRTFs are used, and second, that head movements are tracked to allow the system to modify the sound presented over the headphones, so that sound positions remain constant with respect to the external space, not the listener's head. Without the head tracking, front-back reversals are likely to occur: sounds intended to be due forward appear to come from behind, and vice versa. This is a common problem with binaural recordings, i.e., stereo recordings made with microphones placed in the "ears" of a dummy head. These front-back reversals are known to occur even within natural listening conditions, but there they are much less common. We use head rotation as one of the main ways to distinguish front from rear sound sources. Rotation should result in the sounds moving in one direction if they are in front of the head, and in the opposite direction if they are at the back. However, there appears to be no way at present to apply such
rotations to naturally produced binaural recordings. Of course, some sounds are very difficult to localize under any circumstances (such as electronic telephone "bells"), so strategies such as head rotation do not necessarily work even within a natural sound field.

Whatever the technical merits of HRTF techniques, they are unlikely to be used in the concert situation until such time as someone can afford to equip every seat in an auditorium with its own headphones, head tracker, and digital signal processor unit providing HRTF processing. It will also be necessary to persuade concert goers to wear headphones. Such an approach may, however, be more acceptable within the context of concerts broadcast over the World-Wide Web. Artificial creation of sound fields using HRTFs also requires significant processing power, typically requiring many hundreds of multiply-accumulate instructions per sample, for each individual sound source.

The two-channel amplitude-panned stereo to which we are accustomed is the most common example of a system that generates phantom images by amplitude profiling of feeds to multiple speakers. An 18-dB difference in the levels of a signal presented via two loudspeakers, spaced to subtend an angle of 60 degrees at the listener's position, will place a phantom image in the louder loudspeaker. Progressively smaller differences will cause the image to move toward the halfway line between the speakers, then out toward the other speaker as that in turn becomes louder. The listening area where this works well is small, but larger than that for interaural cross-talk canceled HRTF. The image, especially for central sounds, becomes increasingly unstable as the angle between the loudspeakers (as seen by the listener) goes beyond the optimum 60 degrees. Once it reaches 90 degrees (as in four-loudspeaker quadraphonic systems!), stable central images cannot be formed, especially at the sides (Thiele and Plenge 1977). Image widths also tend to vary with the spectral content of the sounds used.

Such amplitude-panned systems, whether they are two channel as above, four-channel quadraphonic systems, or some versions of the Dolby Surround family (including systems with higher numbers of channels, such as those proposed for HDTV systems), suffer from two major problems. In all systems of this type, the content of an audio channel is intended to provide the signal for a particular loudspeaker, implying that the loudspeaker layouts in the studio and playback venue must match exactly for the positioning of sounds to remain consistent. This was clearly recognized by those working at the Institute of Sonology at Utrecht State University: "There are four loudspeakers in this studio so that the user can form an impression of the sound field as it would be in the concert hall. The ideal dimensions of a multi-track studio are those of a medium-sized concert hall" (Weiland 1975).

It is, of course, rarely possible to match the size or shape of the final performance space when working in a studio. Moreover, a system of this type that has fewer than six channels directly feeding the same number of speakers cannot meet the criterion of 60-degree angles between adjacent speakers, so it must inevitably suffer from unstable images in intermediate positions. While this can be tolerated for film or video work, where the listener's attention is locked to the relatively small area where the picture is presented, it is completely unsatisfactory for the composer of electroacoustic music. The only systems that overcome these difficulties to any real extent are the UMX system based on the work of Cooper and Shiga (Cooper and Shiga 1972) and the ambisonic system, based on the work of, among others, Michael Gerzon (Gerzon 1972). These two systems are fundamentally equivalent.

Control over sound positions and movements can be accomplished either in the studio or in the performance venue or, ideally, both. For the composer in the studio to have a reasonably good idea of how sound objects will behave in the performance space, a system that allows for essentially seamlessly scalable loudspeaker configurations is necessary. This removes the changes produced as a result of differing numbers and positions of loudspeakers in the studio and performance spaces, leaving only the difference in actual acoustics to disrupt the composer's intentions. In purely technical terms, this rules out quadraphonics and related approaches, as noted above. However, in no way
should this be taken to imply that it is invalid to treat loudspeakers and the acoustic space within which they reside as instruments to be played by skilled performers. In striving to present sounds as heard in the studio, we must be careful not to ignore the enormous potential of the speaker acoustics as an instrument, a potential that can be confirmed by anyone who has attended a BEAST or a [...]

[...] vertical dimension is essential if a with-height replay system is required. If the B-format specifications are followed, assuming suitable loudspeaker/decoder systems are used, then operation in different venues will be as similar as local acoustics allow. In all other respects the two parts of the system, encoding and decoding, are completely separate.

[...] recorded with a sound field microphone or with synthesized sound fields containing many sources. These multiplying coefficients can be used to position monophonic sounds anywhere on the surface of the sound field.

It is possible to manipulate whole sound fields that contain many different sound sources in different positions, including naturally recorded ones. We have developed the following standard definitions about the way sounds move to new positions to keep the equations coherent and to minimize confusion between movement types.

positive angle of rotation - a counter-clockwise, or, by convention, leftward rotation.
rotation - a circular movement about the Z-axis: the same as a counter-clockwise movement in the horizontal plane.
tilt - a rotation about the X-axis: the same as a counter-clockwise movement in the vertical left-right plane.
tumble - a rotation about the Y-axis. This is the same as a counter-clockwise movement in the vertical front-back plane. Note that a tumble is the same as a tilt rotated 90 degrees about the Z-axis.

Using these definitions it is possible, for example, to rotate the whole sound field around the Z-axis. For simplicity, consider the case of a sound field consisting of a single sound source with amplitude r positioned on the horizontal plane at an angle C from the center-front position. Given the B-format signals for the untransformed position, [...]

Since the rotation is about the Z-axis, W and Z remain unchanged. If the same procedure is applied to the tilt and tumble equations we have the following.

Tilt by Angle E

W' = W,
X' = X,
Y' = Y × cos E - Z × sin E, and
Z' = Y × sin E + Z × cos E.

Tumble by Angle F

W' = W,
X' = X × cos F - Z × sin F,
Y' = Y, and
Z' = X × sin F + Z × cos F.

These equations can be combined to produce complex transformations such as rotate-tilt.

Rotate-Tilt

X' = X × cos D - Y × sin D,
Y' = X × sin D × cos E + Y × cos D × cos E - Z × sin E, and
Z' = X × sin D × sin E + Y × cos D × sin E + Z × cos E.

Many other combinations of movements are possible.
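
The movement definitions and the tilt, tumble, and rotate-tilt equations can be sketched in Python. This is only an illustration under stated assumptions, not the authors' Csound code. All function names are hypothetical. The `encode` function assumes the commonly used first-order B-format encoding equations, which do not appear in this part of the text, and `rotate` assumes the Z-axis rotation obtained by setting E = 0 in the rotate-tilt equations.

```python
import math

W_GAIN = 0.707  # assumed conventional scaling of the omnidirectional W signal


def encode(sample, azimuth, elevation=0.0):
    """Place a mono sample on the unit sphere (angles in radians).

    Assumed standard first-order encoding; a positive azimuth is a
    counter-clockwise (leftward) angle from center front.
    """
    w = sample * W_GAIN
    x = sample * math.cos(azimuth) * math.cos(elevation)
    y = sample * math.sin(azimuth) * math.cos(elevation)
    z = sample * math.sin(elevation)
    return (w, x, y, z)


def rotate(b, d):
    """Rotate the sound field about the Z-axis by D; W and Z are unchanged."""
    w, x, y, z = b
    return (w,
            x * math.cos(d) - y * math.sin(d),
            x * math.sin(d) + y * math.cos(d),
            z)


def tilt(b, e):
    """Tilt the sound field about the X-axis by E; W and X are unchanged."""
    w, x, y, z = b
    return (w,
            x,
            y * math.cos(e) - z * math.sin(e),
            y * math.sin(e) + z * math.cos(e))


def tumble(b, f):
    """Tumble the sound field about the Y-axis by F; W and Y are unchanged."""
    w, x, y, z = b
    return (w,
            x * math.cos(f) - z * math.sin(f),
            y,
            x * math.sin(f) + z * math.cos(f))


def rotate_tilt(b, d, e):
    """Rotate by D, then tilt by E (expands to the rotate-tilt equations)."""
    return tilt(rotate(b, d), e)


# A source at center front, turned a quarter circle counter-clockwise,
# ends up directly to the left: all of its energy moves from X to Y.
front = encode(1.0, 0.0)
left = rotate(front, math.pi / 2)
print([round(v, 3) for v in left])  # [0.707, 0.0, 1.0, 0.0]
```

Because every transformation acts on the four signals as a whole, the same functions apply unchanged to a field that contains many sources, which is the point made in the text about manipulating whole sound fields.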

[...] provides good stereo presentation. A UHJ decoder and four or more horizontally placed loudspeakers can reproduce virtually all of the horizontal positional information contained in a full B-format signal. This involves designing wide-band, 90-degree phase shifters for both encoding and decoding (Gerzon 1977a, 1977b).

Decoding Ambisonics

Decoding ambisonically encoded signals can appear complicated. The complexity appears in the optimization of decoders for British systems (with a limited number of loudspeakers and a small listening area) that use psychoacoustic techniques, but these are not productive in systems used to cover large areas (Malham 1992). Designing decoders for large areas, such as concert halls, requires a design strategy aimed at achieving even power distribution. The distribution of loudspeakers should be as even as possible; recognized loudspeaker configurations include squares and regular hexagons for horizontal-only work, and a cube as the practical minimum for with-height work. Each individual speaker is then fed a combination of the B-format signals corresponding to its position with respect to the center of the array. For instance, for a square (horizontal only) array of four speakers, arranged left front (LF), right front (RF), left back (LB), and right back (RB), the signals are:

LF = W + 0.707(X + Y),
RF = W + 0.707(X - Y),
LB = W + 0.707(-X + Y), and
RB = W + 0.707(-X - Y).

For a cubic array, the signals for the four planar corners in the "up" (U) and "down" (D) planes are:

LFU = W + 0.707(X + Y + Z),
RFU = W + 0.707(X - Y + Z),
LBU = W + 0.707(-X + Y + Z),
RBU = W + 0.707(-X - Y + Z),
LFD = W + 0.707(X + Y - Z),
RFD = W + 0.707(X - Y - Z),
LBD = W + 0.707(-X + Y - Z), and
RBD = W + 0.707(-X - Y - Z).

The directivity factor of 0.707 in the equations above results in a cardioid source directional response for each loudspeaker. This is optimum for listening positions close to the loudspeakers or outside the loudspeaker array (Malham 1992). Where the intended listening area is significantly smaller than the speaker array, a more hypercardioid shape can be employed by increasing the directivity factor, which results in improved imaging for centrally located listeners.

Decoders based on these equations can easily be built with simple operational-amplifier circuitry, and it is possible to implement the cubic design by setting up an eight-in, eight-out mixing desk to produce suitable decoding. Loudspeakers and amplifiers used in the array should all be similar in size, response, and output.

Compositional Applications

In the past, the availability and expense of the hardware required to realize spatial compositions has often forced composers to position sound in one dimension only, and hope that further possibilities might present themselves when the piece is performed. Many composers use the careful control of reverberation, phase, and amplitude to introduce additional spatial cues within the stereo field. Some compositional expertise exists in creating stereo tapes that can recreate these effects in concert performances (Wishart 1986), but inaccuracies often occur where the acoustic conditions of the playback venue introduce additional spatial cues (Smalley 1992). This type of problem, coupled with the slow introduction of spatialization technology in domestic sound-reproduction systems, has restricted the use of space as a compositional parameter, and often discouraged composers from considering multidimensional sound localization. Electroacoustic composers recognize that sounds can contain or imply movement (Smalley 1986), but few have been able to fully explore the compositional implications and developments of this, specifically, where movement or the position of sounds is inherent in a piece and all its performances.
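
The square and cubic decoding equations given under Decoding Ambisonics differ only in the signs applied to X, Y, and Z for each loudspeaker, so they can be sketched as a single routine driven by a table of signs. The Python below is an illustrative sketch, not the authors' Csound decoder: the names `SQUARE`, `CUBE`, `decode`, and `mix` are hypothetical, a B-format sample is assumed to be a `(W, X, Y, Z)` tuple, and the `directivity` parameter stands in for the 0.707 factor discussed in the text.

```python
# Sign of each loudspeaker's (X, Y, Z) contribution, read off the equations.
SQUARE = {
    "LF": (1, 1, 0),
    "RF": (1, -1, 0),
    "LB": (-1, 1, 0),
    "RB": (-1, -1, 0),
}

# Cube: the four square corners in the "up" (U, +Z) and "down" (D, -Z) planes.
CUBE = {
    name + plane: (sx, sy, sz)
    for plane, sz in (("U", 1), ("D", -1))
    for name, (sx, sy, _) in SQUARE.items()
}


def decode(b, layout, directivity=0.707):
    """Return one feed per loudspeaker for a single B-format sample."""
    w, x, y, z = b
    return {
        name: w + directivity * (sx * x + sy * y + sz * z)
        for name, (sx, sy, sz) in layout.items()
    }


def mix(*signals):
    """Sum B-format samples component by component."""
    return tuple(sum(parts) for parts in zip(*signals))


# A unit source at center front, with W assumed to be scaled by 0.707.
front = (0.707, 1.0, 0.0, 0.0)
feeds = decode(front, SQUARE)
```

With these values the two front loudspeakers each receive 1.414 and the two rear loudspeakers receive nothing, which is the cardioid behaviour attributed to the 0.707 directivity factor. Since each feed is a weighted sum of W, X, Y, and Z, decoding a mix of several B-format signals gives the same feeds as mixing their separate decodes.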

Our hope is that ambisonic technology will be a first step in widening composers' access to spatialization. We present below a summary of techniques appropriate to recordings encoded using this technique, along with a Csound implementation of B-format encoding and horizontal plane decoding. These can be used on computer platforms that support the output of four-channel sound files. We believe that ambisonics represents a potential standard for positional encoding techniques that will enable compositions with spatial information to be performed on a range of simple loudspeaker configurations without specialized hardware.

The compositional possibilities described here are based on the definitions of ambisonic theory (Gerzon 1972) and experiments by various composers at the University of York Department of Music. Work in ambisonics is also being conducted at the Australian Center for the Arts and Technology (Vennonen 1994).

The ambisonic formats that are currently implemented allow full three-, two- and one-dimensional reproduction, using B-format, horizontal decoding of B-format, and UHJ, respectively. The UHJ encoding can also reproduce two-dimensional soundscapes. The full three-dimensional encoding system, B-format, can be used for all implementations, as it can be converted to the lower formats.

Sound field microphones (Farrah 1979) can be used to provide three-dimensionally encoded sound-source material for composition. Mimetic source material (Emmerson 1986) and its spatial content can be captured in this way for compositional use, and it can also be combined with sound material with artificially encoded spatial information. Additionally, the X, Y, Z, and W signals can be manipulated to post-process environmental recordings (effectively changing the original microphone position); zoom into sounds within the captured landscape; or subject the entire sound image to rotation, tumbling, or other time-dependent motions.

Any source material without spatial encoding, such as synthesized sound or mono recordings, can be positioned or moved within an ambisonic sound field when subjected to the type of encoding process illustrated in the Csound examples accompanying this article. The number of sound sources reproduced within the sound field can exceed the number of loudspeakers used to reproduce it, and may be greater than the number of recorded tracks used to represent it. Each sound within a synthesized sound field can be encoded with movement. One of the great benefits of ambisonic encoding is that B-format signals can be mixed together to produce a resultant sound field that retains the positional information of all its components. Composers can manipulate the spatial path of individual sound sources, and mix them with further encoded sources to combine a series of different motions.

When sounds have been ambisonically encoded, a decoding process will reproduce the position and movement of sounds within the sound field if suitable decoder and loudspeaker configurations are used. A sound that has been encoded to move around the listener will do so in all recognized loudspeaker configurations, from a small, single-listener environment to a large concert-hall performance system. The relative position of sounds and the scale of the movement will depend on the distance between loudspeakers, i.e., the absolute sound position is not necessarily encoded, and a sound encoded to move across the front of the image will always do so despite the distance between the front speakers. Playing back a B-format signal, with an appropriate loudspeaker configuration, will automatically reproduce all encoded position and movement information.

Composers who have access to playback systems using four loudspeakers can only monitor a two-dimensional sound field plane. This type of system is capable of B-format playback, but not full with-height reproduction. It is possible, however, to monitor the horizontal and vertical planes individually for a B-format signal designed to include height encoding.

Considerations

Ambisonic systems use decoding methods that are based on physical and psychoacoustic positional information to reproduce sound fields, but there are further matters of perception, sound localization,
and movement that should be considered when generating synthesized sound fields.

Our experiments with ambisonic playback systems show that the spatial perception of a sound is highly frequency dependent. Some localization of low-frequency sound is possible, but the strong positional cues are provided by higher spectral components. Also, sounds that have a widely distributed spectral energy can be localized more easily than can narrow-band signals (consequently, sounds with sharp attack characteristics are usually easy to locate). There are certain conventions and expectations for the localization of high- and low-frequency sounds based on our experience of sound in the environment (Begault 1994): high-frequency sound normally occurs above us, and low-frequency below us (or at ground level). Inverting these relationships often results in poor localization, unless other cognitive sound cues exist. The acoustic response of the playback venue can produce spurious positional information that can also make localization difficult. Additionally, locating moving sounds is easier than locating stationary sounds, as with all playback systems that use phantom images.

Doppler shifts can be very important to the perception of movement (Dodge 1985). If a sound source which does not contain inherent clues to its movement is moved within an artificially constructed sound field, its perception can be impeded. Doppler shifts are not necessarily required in all circumstances, but they can assist in highlighting the movement of some sounds.

The most difficult problem in synthesizing or manipulating sound fields is the dominance of visual perception (Begault 1994). This acousmatic problem can be acute in ambisonic loudspeaker configurations, because the phantom image technique allows large angular distances to be subtended between loudspeakers at the listening position. The absence of a visual sound source is difficult for some listeners, but reducing visual dominance by lowering light levels has been shown to increase the perception of aural localization. This problem rarely occurs if the angular distance between loudspeakers is small.

Visualizing the movement of sound may also be a problem. Compositional processes that use graphic notations of sound movement can assume that visual and aural acuities are equal. Complex sound trajectories that can be visualized (and notated) are often impossible to perceive. Further assumptions about aural perception, influenced by a familiarity with multiple microphone (and multitrack) recording techniques and stereo reproduction, also produce certain expectations related to sound localization. These methods present sounds as point sources, usually within a stereo field, which is rarely a true representation of the spatial characteristics of the source. Three-dimensional recording techniques represent sound sources and their spatial positions, including diffuse sound sources.

Several experiments have shown tha[t] [...] sound fields played back over recogniz[ed loud]speaker configurations can be perceive[d from out]side of the array (Malham 1992). Here [...] can "look in" on sound positions and [...] rather than being surrounded by the s[ound field]. This has several possible implications f[or the inte]gration of ambisonic and traditional so[und] diffusion systems. Further work is be[ing] [...] to determine the possibilities of such s[ystems] [...] the potential sound field distortion ef[fects] [...] would be introduced by the use of ind[ividual] loudspeakers. Several ambisonic speak[er] [...] have been tested for use as traditional [diffusion sys]tems using two-track tapes and hardw[are de]signed to position the two channels of [...] source at different points within the [...] (Malham 1992). This has some advant[ages] [...] placing the stereo image in specific lou[dspeakers] because smooth spatial transitions be[tween] loudspeakers can be achieved even if th[ey are posi]tioned at angles greater than 60 degree[s to the lis]tener. This method is not a parallel to [the perfor]mance practice of sound diffusion arti[...] not reproduce the accuracy of sound p[...] achieved by using very large arrays of [...]

Do It Yourself

The Csound orchestra and score files p[resented in] Figures 1, 2, 3, and 5 enable simple am[bisonic] [...]

Figure 1. A Csound orchestra definition for encoding B-format ambisonic data.
Figure 2. A Csound orchestra definition for decoding sound files encoded with B-format ambisonic data. [Listing header: "Ambisonic Decoding Orchestra"; the listing begins with instr 2.]
Figure 3. A Csound score file to demonstrate the encoding instrument of Figure 1. [Listing header: "Ambisonic Encoding Score".]
Figure 4. Loudspeaker arrangement for playing back four-channel ambisonic sounds.
Figure 5. A Csound score file to demonstrate the decoding instrument of Figure 2.

Conclusions

Our experimental work has been carried out using both software and hardware ambisonic implementations, including a purpose-built programmable periphonic decoder that produces full three-dimensional surround sound over 16 loudspeakers. The Csound examples are presented here to enable other users to experiment with ambisonic encoding and decoding, without using specialized hardware devices. The simplicity with which ambisonic encoding can represent sound sources in three-dimensional space is a great advantage, and some of the possibilities for sound field manipulation and generation are almost unique to this technique. There is also the potential to convert ambisonic signals to other formats, e.g., binaural, for other applications (Malham 1993).
The techniques described here only refer to first-order ambisonic encoding. The original work by Gerzon (Gerzon 1972) includes descriptions of higher-order systems that can increase the spatial accuracy of both recording and playback.
There are many musical and psychoacoustic issues that require further investigation. We hope that the information presented here will encourage progress in these areas.
Further information can be obtained on the World-Wide Web ambisonic home page, via the URL "http://www.york.ac.uk/insts/mustech/3d_sound/ambison.htm".

References

Askew, A. 1981. "The Amazing Clement Ader." Studio Sound 23:9-11.
Begault, Durand R. 1994. 3-D Sound for Virtual Reality and Multimedia. Boston: Academic Press, pp. 65-66, 84, and 191-245.
Chowning, J. 1971. "The Simulation of Moving Sound Sources." Journal of the Audio Engineering Society 19(1):2-6. Reprinted in Computer Music Journal 1(3):48-52.
Cooper, D. H., and J. L. Bauck. 1989. "Prospects for Transaural Recording." Journal of the Audio Engineering Society 37(1/2):3-19.
Cooper, D. H., and T. Shiga. 1972. "Discrete Matrix Multi-Channel Stereo." Journal of the Audio Engineering Society 20(5):346-360.
Dodge, C., and T. A. Jerse. 1985. Computer Music. New York: Schirmer Books, pp. 245-247.
Emmerson, S. 1986. "The Relation of Language to Materials." In The Language of Electroacoustic Music. London: MacMillan, pp. 17-39.
Farrah, K. 1979. "The SoundField Microphone." Wireless World. November: 99-103.
Fox, B. 1982. "Early Stereo Recording." Studio Sound 24(5):36-42.
Gerzon, M. A. 1972. "Periphony: With-Height Sound Reproduction." Journal of the Audio Engineering Society 21(1):2-10.
Gerzon, M. A. 1974. "Surround-Sound Psychoacoustics." Wireless World. December: 484.
Gerzon, M. A. 1977a. "Design of Ambisonic Decoders for Multi Speaker Surround Sound." Paper presented at the 58th Audio Engineering Society Convention, 4 November, New York.
Gerzon, M. A. 1977b. "Surround Sound Decoders" (7 parts). Wireless World. January to August issues, 1977.
Hope, Adrian. 1979. "Fantasia-Multitracked." Studio Sound 21(8):29-30.
Malham, D. G. 1992. "Experience with Large Area 3-D Ambisonic Sound Systems." In Proceedings of the Institute of Acoustics 14(5):209-216. St. Albans: Institute of Acoustics.
Malham, D. G. 1993. "3-D Sound for Virtual Reality using Ambisonic Techniques." In Proceedings of the 3rd Annual Conference on Virtual Reality (addendum). Westport: Meckler.
Malham, D. G., and R. O. Orton. 1991. "Progress in the Application of 3-Dimensional Ambisonic Sound Systems to Computer Music." In Proceedings of the International Computer Music Conference (ICMC). Montreal: ICMC, pp. 467-470.
Moore, F. R. 1983. "A General Model for Spatial Processing of Sounds." Computer Music Journal 7(3):6-15.
Moorer, J. A. 1979. "About this Reverberation Business." Computer Music Journal 3(2):13-28.
Sanal, A. J. 1976. "Looking Backward." Journal of the Audio Engineering Society 24(10):832.
Schaeffer, P. 1951. "Journal d'Orphee." In F. Bayle, ed., Pierre Schaeffer l'œuvre musicale. France 1990. Paris: INA/GRM and Librairie SEGUIER.
Schroeder, M. R. 1962. "Natural Sounding Artificial Reverberation." Journal of the Audio Engineering Society 10(3):219-223.
Smalley, D. 1986. "Spectro-Morphology." In S. Emmerson, ed., The Language of Electroacoustic Music. London: MacMillan, pp. 73-80.
Smalley, D. 1992. "The Listening Imagination." In H. O. Paynter, et al., Companion to Contemporary Musical Thought, vol. 1. London: Routledge.
Stimson, Ann. 1991. "The Script for Poeme Electronique; Traces from a Pioneer." Proceedings of the International Computer Music Conference. Montreal: ICMC, pp. 308-310.
Stockhausen, K. 1956. "Programme Notes for the 1956 World Premiere of Gesang Der Jünglinge." In liner notes for the 1992 CD Stockhausen 3 Elektronische Musik 1952-1960 Stockhausen: Verlag.
Thiele, G., and G. Plenge. 1977. "Localization of Lateral Phantom Sources." Journal of the Audio Engineering Society 25(4):196-200.
Vennonen, K. 1994. "A Practical System for Three-Dimensional Sound Projection." In Proceedings of the Symposium on Computer Animation and Computer Music. Canberra, Australia: Australian Centre for the Arts and Technology.
Weiland, F. C. 1975. "Electronic Music-Musical Aspects of the Electronic Medium." Internal publication, Institute of Sonology, Utrecht State University.
Wishart, T. 1986. "Sound Symbols and Landscapes." In S. Emmerson, ed., The Language of Electroacoustic Music. London: MacMillan, p. 45.