Blah3: Is My Decoder Ambisonic?
Total Page:16
File Type:pdf, Size:1020Kb
Is My Decoder Ambisonic? Aaron J. Heller Artificial Intelligence Center, SRI International Menlo Park, CA 94025, USA Email: [email protected] Richard Lee Eric M. Benjamin Pandit Littoral Dolby Laboratories Cooktown, Queensland 4895, AU San Francisco, CA 94044, USA Email: [email protected] Email: [email protected] Revision 2 Abstract In earlier papers, the present authors established the importance of various aspects of Am- bisonic decoder design: a decoding matrix matched to the geometry of the loudspeaker array in use, phase-matched shelf filters, and near-field compensation [1, 2]. These are needed for accurate reproduction of spatial localization cues, such as interaural time difference (ITD), interaural level difference (ILD), and distance cues. Unfortunately, many listening tests of Ambisonic reproduction reported in the literature either omit the details of the decoding used or utilize suboptimal decoding. In this paper we review the acoustic and psychoacoustic criteria for Ambisonic reproduc- tion, present a methodology and tools for “black box” testing to verify the performance of a candidate decoder, and present and discuss the results of this testing on some widely used decoders. Keywords: Ambisonics, decoder design, listening tests, surround sound, psychoacoustics NOTE: This is a revised and corrected version of the paper presented at the 125th Convention of the Audio Engineering Society, held October 1 – 5, 2008 in San Francisco, CA USA $Id: imda-revised.tex 25737 2012-06-10 23:40:07Z heller $ 1 1 INTRODUCTION rience with good Ambisonic reproduction, we This paper is about testing Ambisonic de- might have stopped there and written off Am- coders. The decoder is the component of an bisonics as yet another failed surround sound Ambisonic reproduction system that derives the technology. loudspeaker signals from the program signals. Instead, we went to the benchmark of Unlike most other surround sound systems in good Ambisonic playback, what are known in- which each channel of a recording is intended formally as Classic Ambisonic Decoders the to drive a single loudspeaker directly, an Am- hardware-based decoders designed by the origi- bisonic recording can be played back on a va- nal Ambisonics team [5] and built up an offline, riety of speaker layouts, both 2-D and 3-D, by file-to-file decoding workflow that mimicked the using an appropriate decoder. processing performed by those decoders. Since A key feature of Ambisonic theory is that it each step produced an intermediate file, we were provides a mathematical encapsulation of prac- able to verify that our implementation was per- tically all known auditory localization models, forming as expected. The techniques described except the pinna coloration and impulsive (high- in this paper are a formalization and extension frequency) interaural time delay models. These of this verification process. mathematical descriptions can be used to prove Finally, by using a playback program that theorems about surround sound recording and provided synchronized playback of a number of reproduction, predict what spatial information files and rapid switching among them, we were can and cannot be conveyed by a particular sys- able to gain an understanding of the effects of tem, guide the design of decoders, and as dis- each of the key components in an Ambisonic cussed in this paper, evaluate and validate im- decoder: a decoding matrix matched to the ge- plementations. ometry of the loudspeaker array in use, phase- We assume that the reader is familiar with matched shelf filters, and near-field compensa- the basic workings of surround sound in gen- tion (NFC). These listening tests demonstrated eral and Ambisonics in particular. Background that using the correct decoder results in dramat- material on these topics, as well as sample Am- ically improved performance [1, 2]. bisonic recordings, can be found at various web- A number of recent papers have reported sites [3, 4]. on the results of Ambisonic listening tests that Our interest in determining whether or not a have used decoders that are clearly faulty or given decoder meets the criteria for Ambisonic employed incorrectly. As an example, in ref- reproduction is motivated by practical consider- erence [6] the authors compare various spatial- ations. When we first conducted listening tests, ization techniques, including Ambisonics. The we did what many do: obtained some record- methodology used was well thought out, but un- ings made with a Soundfield microphone, set up fortunately the software used to decode the Am- six loudspeakers in a hexagon, downloaded a de- bisonic program material may not have been the coder off the Internet, and listened with the de- most appropriate: fault settings. What we experienced was quite “The ‘in-phase’ ambisonic decoder confusing completely ambiguous localization was selected as it is recommended and severe comb filtering artifacts from slight for larger rooms and listening areas, head movements. Over the next few listening preventing anti-phase signals to be sessions, we tried other software decoders and fed to the loudspeaker opposite to other settings with different but equally unsat- the sound source.” isfying results. Had we not had previous expe- Later in the paper, the authors conclude that 2 Ambisonics provides poor localization. How- want to design a decoder; they simply want to ever, given that the listening tests were per- verify that an existing one works properly and formed with single listeners using a speaker then use it. array with 2-meter radius, the best (known) Good engineering practice dictates that the methodology for decoding would have been to behaviors of the individual components of a sys- perform exact, or velocity decoding at low fre- tem under test be verified so that its overall per- quencies, energy decoding at middle and high formance can be properly characterized. While frequencies and use near-field compensation. the design criteria have been outlined or implied Another recent paper discusses “spectral in many papers, we have found no discussion of impairment” in Ambisonic reproduction [7]. tools or methodologies to assess how well they While not stated explicitly, a careful reading re- have been met in a given implementation. veals that exact decoding was used over the en- We confine the discussion in this paper to de- tire frequency range. This is known to produce coders suitable for one or two listeners.1 In this comb-filtering artifacts, and again, better decod- paper we test only horizontal, first-order Am- ing strategies are known. bisonic decoders; however, the extensions for Craven sums up the current situation as fol- full 3-D reproduction (periphony) and arbitrary lows [8], orders are straightforward. There are a number of additional factors, any There may be doubt about the of which can have a large effect on the quality meaning of the term “Ambison- of playback but are beyond the scope of what ics”. The precise definition given is discussed here, such as room acoustics, accu- in [Sec. 1.1 of this paper] seems to racy of speaker positioning and matching, tim- have been largely ignored, and the ing skew in multichannel D/A converters, and so term “ambisonic” is now applied forth. Simply due to the number of interconnec- loosely to any system that makes tions, speakers, and amplifiers in a typical Am- use of circular or spherical harmon- bisonics playback system, the odds of making a ics. setup error are much higher than in the case of Most software decoders have many adjust- stereo and the faults more difficult to diagnose ments, but their authors provide little or no guid- than, for example, a channel reversal in stereo ance on appropriate settings for various play- reproduction. back situations, making it difficult for a user to Due to space limitations, we test just four know if they are functioning correctly without pdecoders and a single speaker configuration, the extensive listening tests. We have read many ac- 3 : 1 rectangle. In this configuration there are counts of “phasey”, “ambiguous”, or “unpleas- four speakers at azimuths 30 and 150 de- ant” Ambisonic reproduction that can be at- grees in the horizontal plane. It was preferred tributed to this problem. In particular, phasey over a square layout in previous listening tests, reproduction will occur when exact velocity de- as well as being easier to fit in most domestic coding is used at higher frequencies, where the rooms. wavelengths are smaller than the inter-ear dis- 1Design of decoders that work well over large areas is tance. a distinct art and in general involves additional constraints The key point here is that it is not enough to that compromise their performance for small areas. [9] simply specify that an Ambisonic decoder was used; not all decoders or decoder philosophies perform in the same way. It is also worth to noting that various workers in the field may not 3 1.1 Definition of Ambisonics tion of the localization perception, and the mag- In summary, we are trying to decide if a given nitude indicates the quality of the localization. decoder and loudspeaker configuration meet the In natural hearing, from a single source the mag- criteria for Ambisonic reproduction as defined nitude of each is exactly 1 and the direction is by Gerzon in [10]. the direction to the source. Ideally, both types of cue will be accurately A decoder or reproduction system recreated by a multispeaker playback environ- ◦ for 360 surround sound is defined ment and they will be in agreement with each to be Ambisonic if, for a central lis- other. In terms of Gerzon’s models this means tening position, it is designed such that rV and rE should agree in direction up to that around 4 kHz; that below 400 Hz, the magnitude of rV is near unity for all reproduced directions; i) velocity and energy vec- j j 2 and that between 700 Hz and 4 kHz, rE is max- tor directions are the same at imized over as many reproduced directions as least up to around 4 kHz, such possible.