
The Ambisonic Decoder Toolbox: Extensions For Partial Coverage Loudspeaker Arrays

The document discusses Ambisonics, a system for spatial audio representation and reproduction. It focuses on Ambisonic decoders and the challenges of designing decoders for irregular speaker arrays.

Ambisonics is a system for representing sound fields independently of specific speaker signals. It involves capturing sound using microphone arrays and reproducing it using speaker arrays. Key components include B-format, panners, and decoders.

The goals in designing Ambisonic decoders are to mimic natural hearing conditions by achieving constant amplitude/energy gain for all sources and matching low and high frequency perceived directions. A key challenge is maximizing energy concentration in the source direction.

THE AMBISONIC DECODER TOOLBOX: EXTENSIONS FOR PARTIAL COVERAGE LOUDSPEAKER ARRAYS
Aaron J. Heller, AI Center, SRI International, Menlo Park, CA US
Eric M. Benjamin, Surround Research, Pacifica, CA US

Linux Audio Conference, May 3, 2014

What is Ambisonics?
Extensible, hierarchical system for representing sound fields
Says how something should sound, rather than specifying speaker signals

Capture or creation
Microphone arrays
2-D or 3-D
Natural B-format, Tetrahedral, Spherical arrays

Ambisonic Panners

Reproduction
2-D, horizontal or 3-D with height loudspeaker arrays
Any size or shape array of loudspeakers
Heller, Benjamin, The Ambisonic Decoder Toolbox, Linux Audio Conference 2014, ZKM, Karlsruhe, Germany

What is an Ambisonic Decoder?


In Ambisonics, the program format is independent of the reproduction layout.
The decoder's task is to create the best possible perceptual impression that the sound field is being reproduced accurately, given the resources available
Bandwidth, number of speakers, configuration of speakers
We use the term "decoder" to mean the configuration for a decoding engine that does the actual signal processing


Goals for decoder design


Mimic conditions of natural hearing
Constant amplitude gain for all source directions (P)
Constant energy gain for all source directions (E)
At low frequencies, correct reproduced wavefront direction and velocity (rV)
At high frequencies, maximum concentration of energy in the source direction (rE)
Matching high- and low-frequency perceived directions
Getting rE correct is the most difficult aspect
Recent work shows that it is also the most important!


Designing Decoders
Decoders for regular polygon and polyhedron loudspeaker arrays are easy to design
Build the speaker encoding matrix, K, by sampling the spherical harmonics at the speaker directions
Use the pseudoinverse to find the basic decoding matrix, M
rE is guaranteed to point in the same direction as rV

However
Room geometry or visual considerations often limit speaker placement
3-D HOA requires placing more speakers above and below the listener
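
The two design steps for regular arrays (sample the spherical harmonics at the speaker directions to get K, then take the pseudoinverse to get M) can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions, not the toolbox's code: it assumes first-order Ambisonics with channels ordered W, X, Y, Z and an orthonormal (N3D-style) normalization, and it uses a hypothetical array at the eight corners of a cube. The function names are invented for this example.

```python
import numpy as np

def first_order_sh(dirs):
    """Sample first-order real spherical harmonics (W, X, Y, Z) at unit
    vectors `dirs`. N3D-style normalization is an assumption of this sketch.
    Returns an array of shape (4, n_directions)."""
    dirs = np.atleast_2d(dirs)
    w = np.ones(len(dirs))
    return np.vstack([w, np.sqrt(3.0) * dirs.T])

def basic_decoder(speaker_dirs):
    """Basic decoding matrix M (n_speakers x 4): the pseudoinverse of the
    speaker encoding matrix K (4 x n_speakers)."""
    K = first_order_sh(speaker_dirs)
    return np.linalg.pinv(K)

# Hypothetical regular array: the eight corners of a cube.
cube = np.array([[x, y, z] for x in (-1, 1) for y in (-1, 1) for z in (-1, 1)],
                dtype=float)
cube /= np.linalg.norm(cube, axis=1, keepdims=True)

M = basic_decoder(cube)

# Decode a plane wave from an arbitrary direction and form both
# localization vectors from the resulting speaker gains.
src = np.array([0.6, 0.48, 0.64])            # unit vector
g = M @ first_order_sh(src)[:, 0]            # one gain per speaker
r_v = (g @ cube) / g.sum()                   # velocity vector
r_e = ((g ** 2) @ cube) / (g ** 2).sum()     # energy vector
```

For this regular array both vectors point at the source, which is the property the slide refers to; with an irregular array the same code would generally show rV and rE diverging.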

How you'd like to do it

AuraLab, San Francisco



A useful compromise

The Bubble, San Francisco



Tradeoffs
Once we deviate from regular geometry, we must trade off localization accuracy for uniform loudness
Directions of rE and rV are not the same
Localization degrades outside the area with a high density of loudspeakers
Gerzon used nonlinear optimization for this
Many implementations: Wiggins, Moore & Wakefield, Tsang, BLaH
Works well for small arrays (e.g., ITU 5.1)
Convergence is slow for large HOA arrays (hours)
IDHOA (Scaini and Arteaga) looks promising
Better objective function and zeroing out of small coefficients

New Strategies in Toolbox


Use an inversion technique suited to ill-conditioned matrices
Constant energy decoder
Truncated SVD
Energy limited
Invert a well-behaved full-sphere virtual speaker array, map to a real array
Hybrid Ambisonic-VBAP
AllRAD (Zotter and Frank)
Derive a new set of basis functions for which inversion is well behaved
Spherical Slepian Functions
EPAD (Zotter, Pomberger, Noisternig)
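
As an illustration of the first strategy, a truncated-SVD inversion might look like the sketch below. It is a generic version of the technique, not the toolbox's implementation: the relative threshold `rel_tol`, the function name, and the nearly flat six-speaker test array are all assumptions made for this example. The array has almost no height information, so the Z row of the encoding matrix is tiny and a plain pseudoinverse produces very large gains.

```python
import numpy as np

def truncated_svd_pinv(K, rel_tol=0.1):
    """Pseudoinverse of K that discards singular values smaller than
    rel_tol times the largest one. `rel_tol` is a hypothetical tuning
    parameter. Returns (decoder matrix, number of singular values kept)."""
    U, s, Vt = np.linalg.svd(K, full_matrices=False)
    keep = s >= rel_tol * s[0]
    s_inv = np.zeros_like(s)
    s_inv[keep] = 1.0 / s[keep]
    return Vt.T @ (s_inv[:, None] * U.T), int(keep.sum())

# Hypothetical ill-conditioned case: a ring of six speakers whose
# elevations alternate between +1 and -1 degree.
az = np.radians(np.arange(6) * 60.0)
el = np.radians(np.where(np.arange(6) % 2 == 0, 1.0, -1.0))
dirs = np.column_stack([np.cos(az) * np.cos(el),
                        np.sin(az) * np.cos(el),
                        np.sin(el)])

# First-order encoding matrix (W, X, Y, Z; N3D-style scaling assumed).
K = np.vstack([np.ones(6), np.sqrt(3.0) * dirs.T])

M_plain = np.linalg.pinv(K)              # keeps the tiny Z singular value
M_tsvd, rank = truncated_svd_pinv(K)     # drops it

gain_plain = np.linalg.norm(M_plain, 2)  # largest gain of the plain inverse
gain_tsvd = np.linalg.norm(M_tsvd, 2)    # largest gain after truncation
```

The truncated decoder simply gives up on reproducing the height component that the array cannot render, in exchange for bounded speaker gains.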

Are these decoders Ambisonic?


Ambisonic theory specifies performance goals, not how to design a decoder
We use the same criteria for these decoders
But
Apply them only to source directions in the covered part of the sphere
Require them to be well behaved in other directions
[Figure: 3rd order Hybrid Ambi-VBAP (AllRAD). Three panels, each plotted over azimuth (degrees) and elevation (degrees): (a) rE vs. test direction; (b) rE direction error (degrees); (c) energy gain (dB).]


CCRMA Listening Room


22 identical loudspeakers in five rings
Horizontal ring of 8 loudspeakers
2 rings of 6 loudspeakers, one 50° below horizontal and one 40° above
1 loudspeaker at each pole
Array is almost regular
Upper 15 used for hemispherical dome
Full-sphere decoder described in our LAC2012 paper

AllRAD Hybrid Ambi-VBAP


240-point spherical design for virtual speaker array
Dome of upper 15 loudspeakers of CCRMA Listening Room, 8-6-1
Imaginary speaker at bottom
Design procedure detailed in paper
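
The two-stage idea (decode to a dense regular virtual array, then pan each virtual speaker onto the real array) can be illustrated with a horizontal-only analogue. The sketch below is not the procedure from the paper: it assumes circular harmonics instead of spherical ones, uniformly spaced virtual speakers in place of the 240-point spherical design, pairwise amplitude panning in place of 3-D VBAP, and a hypothetical five-speaker layout. All function names are invented for the example.

```python
import numpy as np

def circ_harmonics(az, order):
    """Circular harmonics [1, cos(az), sin(az), cos(2 az), ...] sampled at
    azimuths `az` (radians). Shape: (2 * order + 1, n_directions)."""
    az = np.atleast_1d(az)
    rows = [np.ones_like(az)]
    for m in range(1, order + 1):
        rows += [np.cos(m * az), np.sin(m * az)]
    return np.vstack(rows)

def pan_2d(src_az, spk_az):
    """Energy-normalized pairwise panning gains for one source azimuth.
    Assumes adjacent speakers are less than 180 degrees apart."""
    idx = np.argsort(spk_az)
    gains = np.zeros(len(spk_az))
    p = np.array([np.cos(src_az), np.sin(src_az)])
    for a, b in zip(idx, np.roll(idx, -1)):
        L = np.array([[np.cos(spk_az[a]), np.cos(spk_az[b])],
                      [np.sin(spk_az[a]), np.sin(spk_az[b])]])
        g = np.linalg.solve(L, p)
        if np.all(g >= -1e-9):               # source lies between this pair
            g = np.clip(g, 0.0, None)
            gains[[a, b]] = g / np.linalg.norm(g)
            return gains
    raise ValueError("source direction not covered by any speaker pair")

def allrad_2d(spk_az, order, n_virtual=72):
    """Decoder matrix (n_speakers x n_harmonics): virtual decode, then pan."""
    virt_az = 2.0 * np.pi * np.arange(n_virtual) / n_virtual
    # Stage 1: the regular virtual array is well conditioned, so a plain
    # pseudoinverse is safe here.
    D_virtual = np.linalg.pinv(circ_harmonics(virt_az, order))
    # Stage 2: panning gains from each virtual speaker to the real speakers.
    G = np.column_stack([pan_2d(a, spk_az) for a in virt_az])
    return G @ D_virtual

# Hypothetical irregular layout: centre, front pair, rear pair.
spk_az = np.radians([0.0, 30.0, -30.0, 110.0, -110.0])
M = allrad_2d(spk_az, order=1)

# Speaker gains for a source straight ahead.
g_front = M @ circ_harmonics(0.0, 1)[:, 0]
```

Because every virtual speaker is panned with non-negative gains onto at most two real speakers, the real-array decoder inherits the good behaviour of the virtual one without inverting the irregular array's own encoding matrix.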


AllRAD performance rV


AllRAD performance rE


AllRAD rV direction grid


AllRAD rE direction grid


Spherical Slepian Functions


Linear combinations of spherical harmonics
Produce a new set of basis functions that are concentrated in the region of interest on the sphere and close to zero outside it
Remain orthogonal within the region
Used in satellite geodesy to model Earth's gravitational and magnetic fields from incomplete data
In Ambisonic decoding, we can specify a region of the sphere, a dome or a ring, and derive a well-behaved set of basis functions for that region.
Design procedure detailed in paper
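
One standard way to obtain such functions is to eigen-decompose the Gram matrix of the spherical harmonics integrated over the region only: the eigenvectors give the linear combinations, and each eigenvalue is the fraction of that function's energy that falls inside the region. The sketch below shows this for first order and a dome reaching down to -30° elevation. It is an illustration under assumptions (N3D-style harmonics, a Gauss-Legendre quadrature grid, invented function name), not the design procedure from the paper.

```python
import numpy as np

def slepian_first_order(min_elev_deg, n_quad=16):
    """Eigen-decompose the first-order spherical-harmonic Gram matrix over
    the dome with elevation >= min_elev_deg. Returns (eigenvalues,
    eigenvectors) sorted by decreasing concentration in the dome."""
    z_lo = np.sin(np.radians(min_elev_deg))
    # Gauss-Legendre nodes in z = sin(elevation), mapped to [z_lo, 1].
    t, wt = np.polynomial.legendre.leggauss(n_quad)
    z = 0.5 * (1.0 - z_lo) * t + 0.5 * (1.0 + z_lo)
    wz = 0.5 * (1.0 - z_lo) * wt
    az = 2.0 * np.pi * np.arange(2 * n_quad) / (2 * n_quad)
    Z, AZ = np.meshgrid(z, az, indexing="ij")
    # Quadrature weights normalized so the full sphere would sum to 1.
    W = np.repeat(wz[:, None], az.size, axis=1) / (2.0 * az.size)
    r = np.sqrt(1.0 - Z ** 2)
    Y = np.stack([np.ones_like(Z),
                  np.sqrt(3.0) * r * np.cos(AZ),
                  np.sqrt(3.0) * r * np.sin(AZ),
                  np.sqrt(3.0) * Z]).reshape(4, -1)
    D = (Y * W.ravel()) @ Y.T        # Gram matrix restricted to the dome
    lam, V = np.linalg.eigh(D)       # ascending order
    return lam[::-1], V[:, ::-1]

# Dome from the zenith down to 30 degrees below the horizon.
lam, V = slepian_first_order(-30.0)
```

Functions with eigenvalues near 1 live almost entirely inside the dome and are the ones worth keeping for decoding; those with small eigenvalues mostly describe the uncovered region and can be dropped.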

3rd order spherical harmonics (blue = inverted polarity)

3rd order spherical Slepian functions for +90° to -30° dome (first 13 used for decoder)


Spherical Slepian performance rV


Spherical Slepian performance rE


Spherical Slepian rV direction grid


Spherical Slepian rE direction grid


[Figure: 3rd order Hybrid Ambi-VBAP (AllRAD). Three panels, each plotted over azimuth (degrees) and elevation (degrees): (a) rE vs. test direction; (b) rE direction error (degrees); (c) energy gain (dB).]

[Figure: 3rd order spherical Slepian function (EPAD). Three panels, each plotted over azimuth (degrees) and elevation (degrees): (a) rE vs. test direction; (b) rE direction error (degrees); (c) energy gain (dB).]


In situ performance measurements


Tested
AllRAD Dome
Spherical Slepian Dome
Full-sphere (from LAC2012)

Dummy head and reference omni
Dome array using upper 15 speakers in CCRMA's listening room (8-6-1)

Collected
Individual speaker IRs
Ambisonically panned IRs at 10° azimuth, 30° elevation intervals for each decoder

Analyzed horizontal data
250 Hz ITD (rV)
1 to 4 kHz ILD (rE)


ITD and ILD measurements


[Plots: 250 Hz ITD; 1-4 kHz ILD]

Observations
The measured ITDs were similar with the three decoders but ILDs were very different
This supports the subjective observations that the three decoders sound different
Detailed analysis is pending


Informal listening tests


3rd-order test programs
Full-sphere mix of Babel by Allette Brooks (Jay Kadis)
Chroma XII by Rebecca Sanders (Jörn Nettingsmeier)
Both dome decoders sounded good subjectively (but different!)
Compact and directionally accurate localization down to horizon
Faded below horizon
SSF decoder sounded brighter and more detailed than AllRAD
Neither decoder sounded as good as the full-sphere reference decoder
1st-order orchestral recording not reproduced well
Most of orchestra is below the horizon

Decoding Engine
New decoding engine written in FAUST
No inherent limit on order
Dual band, NFC filters, distance compensation
Toolbox writes out configuration section, appends implementation
Compiles to LADSPA, LV2, Pd, SuperCollider, VST, AU
Can be used independently of toolbox
Drawback: Configuration baked into plugin
Toolbox also writes out configuration files for
Kronlachner's ambiX plugin suite
Adriaensen's AmbDec

Implementation
Toolbox runs in MATLAB and GNU Octave
Implements all known channel ordering and normalization conventions; both mixed-order conventions (HP and HV)
No inherent limit on Ambisonic order
Actively in use by a few beta testers
Mixed results for graphics output in Octave
Moving graphics output code to Python with MayaVi
Interface to IDHOA optimizer
GNU Affero General Public License
Faust decoder engine BSD 3-Clause License
Git repo at https://bitbucket.org/ambidecodertoolbox/adt


Summary and Conclusions


Extensions to Ambisonic Decoder Toolbox to handle speaker configurations that do not cover the full sphere
New decoder engine written in Faust
Ability to generate decoders quickly has proven valuable in performance settings
Plans
Dual-band AllRAD and Slepian decoders
Optimizer to refine decoders
Open question:
What to do when sources move into areas of poor coverage?
Current implementation fades them out.
Decorrelate and mix into other speakers?
Should transmission standards include rendering hints?


Thanks!
Fernando Lopez-Lezcano for helping with the listening tests and in-situ measurements, and overall feedback and encouragement.
Andrew Kimpel, Marc Lavallée, and Paul Power who are active users.
Richard Lee, Jörn Nettingsmeier, and Bob Oldendorf who read early drafts and provided feedback.
LAC 2014 reviewers and organizers


Human Auditory Localization


At low frequencies (up to about 800 Hz) works by Interaural Time Differences (ITDs)
At middle frequencies (800 Hz to 5 kHz) works by Interaural Level Differences (ILDs)
Transition is fairly sharp due to the ITDs becoming ambiguous once the wavelength becomes smaller than the ear spacing.
2-channel stereo doesn't get it right
ILD cues are such that the images tend to stick to the nearest speaker
Ambisonics was designed from the beginning to get this correct with modest resources.
Small number of program channels and loudspeakers

Gerzon's Theory of Auditory Localization

Early workers in stereo did theoretical analysis showing how stereo did (or didn't) provide proper localization cues
Gerzon's contribution was to integrate those theories and come up with a theory that defined
rV, the vector sum of the signals from the loudspeakers
rE, the vector sum of the squares of the signals from the loudspeakers.
Because they provide a simple mathematical encapsulation, we can use these to
design decoders
prove theorems, e.g., polygonal decoder theorem
help understand what various spatial sound reproduction systems can and cannot do
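
Written out in their usual normalized form, for real-valued speaker gains, the two vectors are as follows. The symbols are notation introduced here rather than taken from the slides: g_i is the gain of loudspeaker i, \hat{u}_i is the unit vector from the listener toward it, and N is the number of loudspeakers. The denominators are the amplitude gain P and energy gain E named in the design goals.

```latex
\mathbf{r}_V = \frac{\sum_{i=1}^{N} g_i \, \hat{\mathbf{u}}_i}{\sum_{i=1}^{N} g_i},
\qquad
\mathbf{r}_E = \frac{\sum_{i=1}^{N} g_i^{2} \, \hat{\mathbf{u}}_i}{\sum_{i=1}^{N} g_i^{2}}
```

The direction of each vector is the predicted image direction, and its length (at most 1) indicates how concentrated the image is.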



Localization Vector Theory


rV predicts low-frequency localization almost perfectly.
If rV = 1, then low-frequency sounds will be precisely located.
rE predicts mid-frequency localization moderately well.
If rE = 1, then mid-frequency localization will be good
BUT rE is always less than 1, unless the sound is coming from a single point source.
At best rE = cos(θ/2), where θ is the angle between the loudspeakers, so for a square array rE ≈ 0.707.
In general, rE is low in directions with few loudspeakers
Best we can do is have performance change smoothly from dense areas to sparse areas.
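
The square-array figure can be checked numerically. The sketch below assumes the normalized definitions of rV and rE (gain-weighted and squared-gain-weighted averages of the speaker direction vectors) and a source panned with equal gains to the two nearest speakers of a square; the function name and layout are chosen for this example only.

```python
import numpy as np

def localization_vectors(gains, speaker_dirs):
    """Return (rV, rE) for real speaker gains and unit direction vectors."""
    g = np.asarray(gains, dtype=float)
    u = np.asarray(speaker_dirs, dtype=float)
    r_v = (g @ u) / g.sum()
    r_e = ((g ** 2) @ u) / (g ** 2).sum()
    return r_v, r_e

# Square array: speakers at 45, 135, 225 and 315 degrees azimuth.
az = np.radians([45.0, 135.0, 225.0, 315.0])
square = np.column_stack([np.cos(az), np.sin(az)])

# Source straight ahead, midway between the speakers at 45 and 315 degrees,
# reproduced with equal gains in those two speakers only.
r_v, r_e = localization_vectors([1.0, 0.0, 0.0, 1.0], square)
r_e_mag = np.linalg.norm(r_e)            # equals cos(90 deg / 2)

# A single active speaker is the only way to reach a magnitude of 1.
_, r_e_single = localization_vectors([1.0, 0.0, 0.0, 0.0], square)
r_e_single_mag = np.linalg.norm(r_e_single)
```

Feeding any more speakers than the nearest pair spreads the energy further and shortens rE below this bound.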


Energy Localization Vector


Maximizing rE and getting it to point in the right direction is the crux of the decoder design problem.
Easy with regular arrays
Irregular arrays always involve tradeoffs
Virtually all real-world arrays are irregular!
Arrays need to fit in real rooms
ITU 5.1 is the dominant domestic standard, rear speakers 120° apart.
Because rE is a non-linear function of speaker position, we currently need to use numerical optimization methods.
