Proc. of the 18th Int. Conference on Digital Audio Effects (DAFx-15), Trondheim, Norway, Nov 30 - Dec 3, 2015

REAL-TIME 3D AMBISONICS USING FAUST, PROCESSING, PURE DATA, AND OSC

Pierre Lecomte
Lab. de Mécanique des Structures et Systèmes Couplés (LMSSC), Conservatoire National des Arts et Métiers, Paris, France
Groupe d'Acoustique de l'Université de Sherbrooke (GAUS), Université de Sherbrooke, Québec, Canada
[email protected]

Philippe-Aubert Gauthier
Groupe d'Acoustique de l'Université de Sherbrooke (GAUS), Université de Sherbrooke, Québec, Canada

ABSTRACT

This paper presents several digital signal processing (DSP) tools for the real-time synthesis of a 3D sound pressure field using Ambisonics technologies. The spatialization of a monophonic signal or the reconstruction of natural 3D recorded sound pressure fields is considered. The DSP required to generate the loudspeaker signals is implemented using the FAUST programming language. FAUST enables and simplifies the compilation of the developed tools on several architectures and in different DSP tool formats. In this paper, a focus is made on the near-field filters implementation, which allows for the encoding of spherical waves with distance information. The gain variation with distance is also taken into account. The control of the synthesis can be made by software or hardware controllers, such as a joystick, by means of PURE DATA and Open Sound Control (OSC) messages. A visual feedback tool using the PROCESSING programming language is also presented in the most recent implementation. The aim of this research derives from a larger research project on physically-accurate sound field reproduction for simulators in engineering and industrial applications.

1. INTRODUCTION

Ambisonics technologies allow describing 3D sound pressure fields using a projection on a truncated spherical harmonics basis [1]. The resulting 3D encoded sound pressure field can later be manipulated, decoded and reconstructed over several loudspeaker layouts or even headphones [2]. Ambisonics has two main objectives: the spatialization of virtual sound sources and the reconstruction of recorded sound pressure fields [3]. Several software solutions exist to create, transmit, manipulate, and render sound pressure fields using Ambisonics technologies; see references [4, 5, 6, 7, 8] as examples. Albeit being popular for practical applications in music, sound design and audio contexts, classical and common Ambisonics implementations suffer from a few drawbacks that limit their use for physically-accurate sound field reproduction with applications to environment simulators (vehicles, working environments, etc.) in an industrial or engineering context. Indeed, the near-field encoding [2] is rarely provided, and the encoding/decoding in a 3D context is limited to the first orders, hence limiting the spatial resolution and the area size of physically-accurate reproduction. Indeed, if the sound field is controlled up to an order M, the reconstruction area size is frequency-dependent and given by r = Mc/(2πf) [9] (where r is the area size radius, c the speed of sound in air, and f the frequency). The near-field support is also essential for physically-accurate sound field reproduction, as it takes into account the loudspeaker distance from the origin in order to compensate for the loudspeakers' spherical waves. In this trend, this work is motivated by the need to develop a practical implementation of Ambisonics for industrial applications such as the laboratory reproduction of industrial sound environments, vehicle cabin noise, virtual vibroacoustic models, architectural spaces, etc. In these scenarios, the reproduced sound field must be as close as possible to the target sound field. Some recent examples of such application potentials are found in Refs. [10, 11, 12, 13]. Typical outcomes are related to listening tests, sound quality testing, perceptual studies and others. On this matter, as mentioned by Vorländer [12] with respect to Ambisonics implementations that often include modifications inspired by auditory perception to increase the listener experience with respect to some expectations, a "generally applicable reproduction system [for sound field simulation] must not introduce any artificial auditory cue which is not part of the simulated sound." [12].

In this context, this paper presents an implementation of Ambisonics technologies for the real-time synthesis of 3D sound fields up to 5th order. The signal processing is implemented in the FAUST1 (Functional AUdio STream) programming language. This language proposes a functional approach to implement the signal processing and compiles into efficient C++ code [14]. From this code, DSP tools are provided in several formats, such as VST, LV2, Pure Data, Max/MSP, JACK QT, and others. Thus, from the same FAUST code, one can generate tools working on various operating systems and configurations.

The focus of this paper is on the physical encoding and decoding of 3D sound pressure fields, with special attention dedicated to the development, definition and implementation of the near-field filters. After defining the notations in use in Sec. 2, the main equations of Ambisonics are recalled in Sec. 3. In Sec. 4 the implementation in the FAUST programming language is addressed, with special attention to the near-field filters. Section 5 presents a visual feedback tool using the PROCESSING language. This tool helps visualizing in 3D the virtual sources and the loudspeaker levels. Finally, in Sec. 6, the user control interface is addressed, presenting the possibility of interfacing all elements with the OSC protocol.

1http://faust.grame.fr/
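To give a rough sense of scale for the relation r = Mc/(2πf) quoted above, the following sketch (ours, not part of the original paper) evaluates the radius of the physically-accurate reconstruction area for a given order and frequency, assuming c = 340 m/s:

```python
import math

def reconstruction_radius(M, f, c=340.0):
    """Radius r = M*c / (2*pi*f) of the zone where the sound field
    is accurately controlled up to order M at frequency f [9]."""
    return M * c / (2 * math.pi * f)

# A 5th-order system at 1 kHz controls the field in a sphere of roughly 0.27 m radius,
# i.e. about a head-sized zone, while a 1st-order system covers only about 5 cm.
r5 = reconstruction_radius(5, 1000.0)
r1 = reconstruction_radius(1, 1000.0)
```

This illustrates why orders higher than the classical first order are needed for physically-accurate reproduction around a listener.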


2. COORDINATE SYSTEM AND NOTATIONS

In this section, the different notations used throughout the paper are presented and illustrated.

Spherical coordinate system

The following spherical coordinate system is used throughout this paper and shown in Fig. 1:

u1 = r cos(θ) cos(δ),  u2 = r sin(θ) cos(δ),  u3 = r sin(δ)   (1)

Figure 1: Spherical coordinate system. A point P(u1, u2, u3) is described by its radius r, azimuth θ and elevation δ.

A virtual source position is denoted with its spherical coordinates r1, θ1, δ1. The rendering loudspeakers are arranged on a sphere of radius r0, as shown in Fig. 2.

Figure 2: Ambisonics playback layout. Blue: the rendering loudspeakers disposed on a spherical layout of radius r0. Red: the virtual source at radius position r1.

Spherical harmonics

The spherical harmonics used in this paper are defined as in [1]:

Y^σ_mn(θ, δ) = sqrt( (2m+1) ε_n (m−n)!/(m+n)! ) P_mn(sin(δ)) × { cos(nθ) if σ = 1 ; sin(nθ) if σ = −1 }   (2)

where P_mn are the associated Legendre functions of order m and degree n, with m, n ∈ N and n ≤ m, σ = ±1, and ε_n = 1 if n = 0, ε_n = 2 if n > 0. The spherical harmonics order is denoted by m and its degree by n. For each order m there are (2m + 1) spherical harmonics. Thus, a basis truncated at order M contains L = (M + 1)^2 functions.

Notations

The Laplace variable is denoted s and the z-transform variable z (discrete domain). A vector is denoted by lowercase bold font v and a matrix by uppercase bold font M. Superscript T designates the transposition operation. j is the imaginary unit, with j^2 = −1.

3. AMBISONICS

In this section, the principal Ambisonics equations are recalled. They will later be used for the real-time DSP implementation.

3.1. Encoding

In Ambisonics, the encoding step consists in deriving B-Format2 signals from either a monophonic signal (with a spatialization context) or microphone array signals (natural sound field encoding).

3.1.1. Monophonic signal

From a monophonic signal, the encoding can be done for a plane wave with amplitude a(z) and direction of azimuth θ1 and elevation δ1, or for a spherical wave, adding a distance information r1:

B^σ_mn(z) = a(z) Y^σ_mn(θ1, δ1)               (plane wave)
B^σ_mn(z) = a(z) F_{m,r1}(z) Y^σ_mn(θ1, δ1)   (spherical wave)   (3)

In Eq. (3), the filters F_{m,r1}(z) are the forward near-field filters, which take into account the finite distance r1 of the virtual source [2].

3.1.2. Rigid spherical microphone array encoding

For a natural 3D sound pressure field recording made with a rigid spherical microphone array of radius ra, the B^σ_mn are given by:

B^σ_mn(z) = E_{m,ra}(z) Σ_{i=1}^{N} Y^σ_mn(θi, δi) wi pi(z)   (4)

E_{m,ra}(z) are the equalization filters which take into account the diffraction of the rigid sphere [3]. The sound pressure signal at the i-th capsule position (ra, θi, δi) is denoted pi(z), on the array of N microphones. The B^σ_mn components are guaranteed to be exact up to order M if the spatial sampling of the spherical microphone array respects the orthonormality criterion of spherical harmonics up to M [15, 16]. Thus, there is possibly a weighting factor wi for each capsule in Eq. (4) to ensure this condition. The working bandwidth of the array without spatial aliasing is given by f ≤ Mc/(2π ra) [17], where f is the frequency, c the speed of sound, and M the maximum working order for the array.

Equation (4) is for a triplet of indices (m, n, σ). Thus, up to order M, the B-Format signals vector is obtained with matrix notation:

b(z) = E(z) · Y^T_mic · W_mic · p(z)   (5)

b(z) (L×1) = [B^1_00(z) ··· B^σ_mn(z) ··· B^1_M0(z)]. E(z) (L×L) is the diagonal matrix of equalization filters with diagonal terms [E_{0,ra}(z) ··· E_{m,ra}(z) ··· E_{M,ra}(z)], each E_{m,ra}(z) term being replicated (2m + 1) times on the main diagonal. Y_mic (N×L) is the matrix of spherical harmonics up to order M evaluated at each direction (θi, δi). W_mic (N×N) is the diagonal matrix of weightings wi.

2We call here B-Format the signals vector b(z) = [B^1_00(z) ··· B^σ_mn(z) ··· B^1_M0(z)]
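To make Eq. (2) concrete, the following sketch (ours, not code from the paper; function names are hypothetical) implements the spherical harmonics of Eq. (2) and can be used to check numerically that a basis truncated at order M contains L = (M+1)^2 functions that are orthonormal over the sphere:

```python
import math

def assoc_legendre(m, n, x):
    """Associated Legendre function P_mn(x), order m, degree n <= m,
    without the Condon-Shortley phase (convention of [1])."""
    p = 1.0                                   # P_nn(x) = (2n-1)!! (1-x^2)^(n/2)
    for k in range(1, n + 1):
        p *= (2 * k - 1) * math.sqrt(max(0.0, 1.0 - x * x))
    if m == n:
        return p
    p_prev, p_cur = p, (2 * n + 1) * x * p    # P_{n+1,n}
    for l in range(n + 2, m + 1):             # upward recurrence in the order
        p_prev, p_cur = p_cur, ((2 * l - 1) * x * p_cur - (l - 1 + n) * p_prev) / (l - n)
    return p_cur

def ymn(m, n, sigma, theta, delta):
    """Real spherical harmonic Y^sigma_mn(theta, delta) of Eq. (2)."""
    eps = 1.0 if n == 0 else 2.0
    norm = math.sqrt((2 * m + 1) * eps * math.factorial(m - n) / math.factorial(m + n))
    azim = math.cos(n * theta) if sigma == 1 else math.sin(n * theta)
    return norm * assoc_legendre(m, n, math.sin(delta)) * azim

def basis(M):
    """All (m, n, sigma) index triplets up to order M: L = (M+1)^2 of them."""
    return [(m, n, s) for m in range(M + 1) for n in range(m + 1)
            for s in ((1,) if n == 0 else (1, -1))]
```

With this normalization, (1/4π) ∫ Y_i Y_j dΩ = δ_ij, which is the orthonormality criterion invoked for Eq. (4).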


p(z) (N×1) = [p1(z) ··· pi(z) ··· pN(z)] is the vector of capsule signals.

3.2. Decoding

In this paper, only the basic decoder (mode-matching [18]) is recalled, for a full-sphere configuration of radius r0 respecting the spherical harmonics orthonormality via weighting coefficients [16]. For decoders adapted to irregular loudspeaker layouts or other decoding concerns, see for example [6]. If one considers a set of N2 loudspeakers, the input signal of the l-th loudspeaker at position (r0, θl, δl) is obtained from the B^σ_mn signals by:

s_l(z) = w_l Σ_{m=0}^{M} (F_{m,r0})^{-1}(z) Σ_{n=0}^{m} Σ_{σ=±1} Y^σ_mn(θl, δl) B^σ_mn(z)   (6)

where (F_{m,r0})^{-1}(z) are the inverse near-field filters, or near-field compensation filters, which take into account the finite distance r0 of the rendering loudspeakers [2]. Since the reconstructed sound pressure field is the summation of the sound fields generated by each loudspeaker, the vector of input signals s is given by:

s(z) = W_spk · Y_spk · F^{-1}_{r0}(z) · b(z)   (7)

where s(z) (N2×1) = [s1(z) ··· sl(z) ··· sN2(z)]. F^{-1}_{m,r0}(z) (L×L) is the diagonal matrix of near-field compensation filters with diagonal terms [1/F_{0,r0}(z) ··· 1/F_{m,r0}(z) ··· 1/F_{M,r0}(z)], each 1/F_{m,r0}(z) term being replicated (2m + 1) times on the main diagonal. Y_spk (N2×L) is the matrix of L spherical harmonics up to order M evaluated at each direction (θl, δl). W_spk (N2×N2) is the diagonal matrix of weightings wl. Note that the resulting matrix M_spk = W_spk · Y_spk is the Ambisonics decoding matrix as defined in Ref. [6].

3.3. Equivalent panning-law

In a spatialization context, recalling Eq. (3) and Eq. (6) along with the additivity theorem of spherical harmonics [19], one can directly compute the loudspeaker input signals:

s_l(z) = a(z) w_l Σ_{m=0}^{M} [(2m+1)/F_{m,r0}(z)] P_m(γ_l)              (plane wave)
s_l(z) = a(z) w_l Σ_{m=0}^{M} (2m+1) (F_{m,r1}/F_{m,r0})(z) P_m(γ_l)     (spherical wave)   (8)

where γ_l = cos(δ1) cos(δl) cos(θ1 − θl) + sin(δ1) sin(δl) is relative to the angle between the virtual source in (θ1, δ1) and the l-th loudspeaker.

4. FAUST IMPLEMENTATION

The implementation of Ambisonics tools such as an encoder, a basic decoder, near-field filters, and spatialization tools using the equivalent panning-law is made using the FAUST language. An overview of the currently developed tools is shown in Fig. 3. From left to right, the vertical branches correspond to: 1) microphone signal processing, 2) panning of a monophonic signal as a virtual point source with encoding and decoding, and 3) panning of a monophonic signal as a virtual point source with the equivalent panning law. The rightmost branch is the control via OSC and visual feedback. Each box corresponds to a module described in the next sections. The common functions of the modules are implemented in library files, which enables the quick design of new modules by re-using existing FAUST code. In the following section, the near-field filters implementation is detailed.

4.1. Near-field filters

4.1.1. Forward filters

The forward filters are given in the Laplace domain [2]:

F_{m,r1}(s) = Σ_{i=0}^{m} [ (m+i)! / ((m−i)! i! 2^i) ] (c/(s r1))^i   (9)

with s = j2πf. To convert this analog filter into the digital z domain, the use of a bilinear transform requires double-precision arithmetic to be efficient, as pointed out by Adriaensen in [20]. In this latter reference, he proposes another s-to-z mapping scheme to obtain a digital realization of the filter which is robust with the single-precision floating point format:

ζ^{-1} = z^{-1}/(1 − z^{-1}) = Σ_{k=1}^{∞} (z^{-1})^k   for |z| > 1
s = 2Fs/(1 + 2ζ^{-1})   (10)

where Fs is the sampling frequency. The implementation of an m-th order forward filter is made by a product of sections of the form:

H_{1,r1}(z) = g1(r1)(1 + d_{1,1}(r1) ζ^{-1})
H_{2,r1}(z) = g2(r1)(1 + d_{2,1}(r1) ζ^{-1} + d_{2,2}(r1) ζ^{-2})   (11)

For example, a 3rd order filter F_{3,r1} is realized with the product of a 1st order section and a 2nd order section: F_{3,r1}(z) = H_{1,r1}(z) · H_{2,r1}(z). The computation of each coefficient g1, g2, d_{1,1}, d_{2,1}, d_{2,2} in Eq. (11) is detailed in [20].

4.1.2. Stabilization with inverse filters

The forward filters F_{m,r1}(z) present an infinite gain at 0 Hz. Thus, they must be stabilized by multiplying with an inverse filter (or near-field compensation filter) 1/F_{m,r0}(z). This can be done at the encoding stage, as suggested by Daniel [2]:

B^σ_mn(z) = a(z) (F_{m,r1}/F_{m,r0})(z) Y^σ_mn(θ1, δ1)   (spherical wave)   (12)

It means that, to encode a spherical wave according to Eq. (12), one should know the rendering array radius r0 a priori. However, this is not a major concern. Indeed, if the encoded spherical wave is reconstructed on another layout of radius r2, one can correct by multiplying the B^σ_mn(z) components by F_{m,r0}(z)/F_{m,r2}(z).

4.1.3. Gain correction for spherical sources

In [2], the near-field filters are defined with the assumption that the amplitude a(z) of the spherical wave in Eq. (12) is taken at the origin O in Fig. 2. Thus, the propagation term e^{−s r1/c}/(4π r1)


[Figure 3 block diagram. Leftmost branch: spherical microphone signals p1 ··· pN → encoding Y^T_mic W_mic → equalization E_{m,ra} → NFC filtering 1/F_m(r0) → B-Format signals b_00 ··· b_M0. Center branches: monophonic signal a1 → encoding Y^σ_mn(θ1, δ1) → NF filtering F_m(r1)/F_m(r0) → B-Format signals; b_00 ··· b_M0 → decoding M_spk → loudspeaker signals s1 ··· sN2; monophonic signal a2 → panning-law (r0, r2, θ2, δ2) → loudspeaker signals. Rightmost branch: PURE DATA control patch → OSC messages (r1, θ1, δ1; r2, θ2, δ2; signal levels in dBFS) → modules and PROCESSING visual feedback.]

Figure 3: Main features of the current implementation: real-time synthesis of 3D recordings (leftmost vertical branch), spatialization (two centered vertical branches), and control via OSC with visual feedback (rightmost vertical branch). The digital temporal signals are represented by black arrows and the OSC instruction messages by green dashed arrows.

(in the Laplace domain) is not taken into account. This leads to the definition of Eq. (9) for the forward filters. In the same way, the near-field compensation filters do not take into account the propagation from the rendering source to the origin O: e^{−s r0/c}/(4π r0). The main consequence is that there is no gain variation with the distance of the virtual source. To correct this, one multiplies the amplitude a(z) of the spherical wave in Eq. (12) by (r0/r1) z^{−Fs(r1−r0)/c}. The corresponding delay can be fractional, depending on r1 and r0. In the case of a focused source (i.e. inside the rendering loudspeaker enclosure), the delay is negative and an implementation could require a time-reversal method, as in Wave Field Synthesis [21]. However, since this delay is the same for all rendering loudspeakers, according to Eqs. (6) and (8), it is omitted for simplicity. As a result, the pressure field of a virtual point source is reproduced physically in the reproduction zone with correct gain and with a phase shift of z^{−Fs(r1−r0)/c} when the latter is omitted. Finally, the encoding and the panning-law of a spherical wave become:

B^σ_mn(z) = a(z) (r0/r1) (F_{m,r1}/F_{m,r0})(z) Y^σ_mn(θ1, δ1)
s_l(z) = a(z) (r0/r1) w_l Σ_{m=0}^{M} (2m+1) (F_{m,r1}/F_{m,r0})(z) P_m(γ_l)   (13)

4.1.4. FAUST implementation

The near-field filters F_{m,r1}(z)/F_{m,r0}(z) (as well as the near-field compensation filters 1/F_{m,r0}(z)) are implemented in a FAUST library nfc.lib following Eq. (11) and up to 5th order. The implementation is based on a Direct-Form II [22]. As an example, the FAUST code for the second-order section is given as:

TFNF2(b0,b1,b2,a1,a2) = sub~sum1(a1,a2) : sum2(b0,b1,b2)
with {
    sum1(k1,k2,x) = x:(+~_<:((_':+~_),*(k1)):*(k2),_:+);
    sum2(k0,k1,k2,x) = x<:*(k0),+~_,_:_,(-<:*(k1),(_':+~_)*(k2):+):+;
    sub(x,y) = y-x;
};

The block diagram for this code is shown in Fig. 4. The coefficients g2(r1), d_{2,1}(r1), d_{2,2}(r1), g2(r0), d_{2,1}(r0), d_{2,2}(r0) were precomputed in this figure for simplicity, to obtain the corresponding b0(r1), b1(r1), b2(r1), a1(r0), a2(r0) coefficients, poles and zeroes of the filter. However, nfc.lib provides the real-time computation of these coefficients knowing r1 and r0, according to [20]. In the current implementation, the near-field filters are provided as independent modules (see Fig. 3) or included in the equivalent panning-law module (see Sec. 4.4).

Figure 4: Block diagram representation of the TFNF2 function. The input signal to be filtered is on the main top-left diagram. The other diagrams detail each block of this main diagram. x and y are input signals in a block. These diagrams were generated using the faust2svg tool.

As an example, the gain frequency responses of the filters (r0/r1) F_{m,r1}/F_{m,r0} up to order five are shown in Fig. 5 for r1 = 1 m, r0 = 3 m (solid lines) and r1 = 3 m, r0 = 1 m (dashed lines), with c = 340 m/s and Fs = 48000 Hz. For focused sources (i.e. r1 ≤ r0, solid lines in Fig. 5), as r1 decreases, or as r0 increases, and as m increases, the gain of the filters increases at low frequencies. Thus, when encoding a focused spherical source (i.e. r1 ≤ r0), one should be aware of these extreme gains. These "bass-boost" effects create strong artifacts outside the control zone and can easily damage the rendering loudspeakers. A solution could be to impose a minimum r1 with regard to a maximum (r0/r1) F_{m,r1}/F_{m,r0} gain. This maximum gain is then related to the maximum linear excursion of the loudspeakers. Note that another approach for focused sources in Ambisonics could also be a solution [23].

Figure 5: Gain of the frequency response function of the filters (r0/r1) (F_{m,r1}/F_{m,r0})(f) for r1 = 1 m, r0 = 3 m (solid lines) and r1 = 3 m, r0 = 1 m (dashed lines), m ∈ {0, 1, 2, 3, 4, 5}, Fs = 48000 Hz.

4.2. Encoding of captured sound field and decoding

This case corresponds to the leftmost vertical branch of Fig. 3. The encoding of a 3D sound field captured by a spherical microphone array as described in Eq. (5) requires a matrix operation, as does the decoding operation of Eq. (7). This matrix operation is done in FAUST as suggested by Heller [6]:

// bus with gains
gain(c) = R(c) with {
    R((c,cl)) = R(c),R(cl);
    R(1) = _;
    R(0) = !;
    R(float(0)) = R(0);
    R(float(1)) = R(1);
    R(c) = *(c);
};

matrix(n,m) = par(i,n,_) <: par(i,m,gain(a(i)):>_);
// n: number of inputs : column number
// m: number of outputs : row number
// a(i): row vectors of matrix coefficients

In Eqs. (5) and (7), the matrices Y^T_mic · W_mic and W_spk · Y_spk are precomputed and then implemented numerically in the FAUST code, row by row, according to the code above. In the current version, the encoding matrix for two types of microphones is coded:

• Spherical microphones using Lebedev grids, working up to 1st, 3rd or 5th order, as shown in Fig. 9.
• A spherical microphone using a Pentakis-Dodecahedron grid, working up to 4th order, as shown in [3].

In Fig. 3, the obtained modules are sketched under the names "Encoding" and "Decoding".

4.3. Rigid spherical microphone filters

As mentioned in Sec. 3.1.2, the B^σ_mn(z) components derived from rigid spherical microphone array signals should be filtered by the E_{m,ra}(z) filters to take into account the diffraction of the rigid sphere supporting the capsules. These filters present a theoretical infinite gain at 0 Hz and a very large "bass-boost" at high orders [24]. They are stabilized by high-pass filters [25] or Tikhonov filters [3], which cut the low frequencies at high orders. The resulting filters are implemented as FIR filters. However, in [26, 27, 24] IIR implementations are proposed, with lower-order amplification according to the higher-order limitation. These approaches should be investigated in future works for a FAUST implementation, since IIR parametric filters could be interesting to monitor in real time the performance of a spherical microphone array. For now, in the reported implementation, the filters are implemented as FIR filters. The cut-off frequencies of the high-pass filters are chosen with a maximum amplification level. The real-time convolution is made with BRUTEFIR3 for the Linux environment. This module is sketched in Fig. 3 under the name "Equalization".

3http://www.ludd.luth.se/~torger/brutefir.html

4.4. Encoding of virtual source and panning law

In the FAUST implementation reported in this paper, the encoding of a virtual source is based on Eq. (3) for a plane wave and Eq. (13) for a spherical wave. In the current state of the implementation, the spherical harmonics are explicitly coded in a ymn.lib library up to order five (36 functions). However, they could be computed by recurrence if higher orders were required. The monophonic


signal a1 is multiplied by the corresponding spherical harmonics evaluated at the desired direction (θ1, δ1) and filtered by the near-field filters with the desired radii (r1, r0), as shown in Fig. 3: the Ambisonics signals are then obtained.

The equivalent panning laws of Eq. (8) circumvent the need to encode and decode with matrix operations. Indeed, the computation is reduced to a sum over the orders thanks to the additivity theorem of spherical harmonics. This is of great interest for real-time synthesis. For the moment, the panning-laws are implemented for several Lebedev spheres using N = 6, 26, or 50 loudspeakers, as presented in [16]. The Legendre polynomials Pm are explicitly coded in ymn.lib up to order five. The weights wl and angles θl, δl of the spheres are implemented in a lebedev.lib file. This spatialization module is sketched in Fig. 3 under the name "Panning-law": the loudspeaker input signals are computed from a monophonic signal a2 and the spatial parameters (r2, θ2, δ2, r0).
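The equivalence exploited by this module can be cross-checked numerically. The sketch below (ours, not code from the actual library; restricted to M = 1 for brevity, and evaluating the analog filter expression of Eq. (9) directly at s = j2πf instead of the digital IIR realization) computes a loudspeaker gain both with the spherical-wave panning law of Eq. (13) and with explicit encoding (Eq. (12), plus the r0/r1 gain correction) followed by mode-matching decoding (Eq. (6)); the additivity theorem guarantees the two paths agree:

```python
import math

def forward_nf(m, r, f, c=340.0):
    """Analog forward near-field filter F_{m,r}(s) of Eq. (9) at s = j*2*pi*f."""
    x = c / (1j * 2 * math.pi * f * r)
    return sum(math.factorial(m + i) / (math.factorial(m - i) * math.factorial(i) * 2 ** i)
               * x ** i for i in range(m + 1))

def y1(theta, delta):
    """First-order harmonics of Eq. (2): [Y^1_00, Y^1_11, Y^-1_11, Y^1_10], orders [0,1,1,1]."""
    s3 = math.sqrt(3.0)
    return ([1.0, s3 * math.cos(delta) * math.cos(theta),
             s3 * math.cos(delta) * math.sin(theta), s3 * math.sin(delta)], [0, 1, 1, 1])

def panning_gain(f, r1, th1, d1, r0, thl, dl):
    """Spherical-wave panning law of Eq. (13), M = 1, unit a(z) and w_l.
    Returns a complex frequency-domain gain."""
    gamma = (math.cos(d1) * math.cos(dl) * math.cos(th1 - thl)
             + math.sin(d1) * math.sin(dl))
    legendre = [1.0, gamma]  # P_0(gamma), P_1(gamma)
    return (r0 / r1) * sum((2 * m + 1) * forward_nf(m, r1, f) / forward_nf(m, r0, f)
                           * legendre[m] for m in range(2))

def encode_decode_gain(f, r1, th1, d1, r0, thl, dl):
    """Same quantity via explicit encoding then decoding: sum over all (m,n,sigma)
    of (F_{m,r1}/F_{m,r0}) * Y(theta_1,delta_1) * Y(theta_l,delta_l)."""
    b, orders = y1(th1, d1)
    y, _ = y1(thl, dl)
    return (r0 / r1) * sum(forward_nf(m, r1, f) / forward_nf(m, r0, f) * bi * yi
                           for m, bi, yi in zip(orders, b, y))
```

At high frequencies the near-field ratio tends to 1 and, for a loudspeaker aligned with the source, the gain tends to (r0/r1)(M+1)^2, consistent with the plane-wave limit of Eq. (8).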

5. VISUAL FEEDBACK WITH PROCESSING

FAUST provides a graphical user interface with VU-meters. Unfortunately, those meters are organized as classical lines of VU-meters (see Fig. 7 as an example). Thus, they do not provide an efficient visual cue of where the signal energy is distributed on the loudspeaker spherical layout. Moreover, a 3D representation of the virtual source helps to provide an objective spatial view of the source position. Such a type of 3D representation is also much more useful to composers or engineers who are not familiar with under-the-hood DSP. In this context, a tool for 3D visual feedback was implemented using the PROCESSING4 language. The code uses an Atom class to create a ball representing a loudspeaker. The loudspeaker coordinates are given in a .csv file and are easily adaptable to any configuration. The virtual sources are represented as balls with fading trajectories. The RMS gain (in dBFS) of each loudspeaker given by the Ambisonics solution drives in real-time the size and the color of the loudspeaker balls via OSC messages. The position of each source is also received by OSC messages. The Peasycam library5 allows zooming, moving and panning the 3D view. The described visual tool is shown in Fig. 6. The sphere used here is a Lebedev sphere using N2 = 50 loudspeakers, as used in [16].

Figure 6: 3D visual feedback tool using the PROCESSING language. The virtual source is shown as a red ball with fading trajectory (top left corner). Each loudspeaker is represented by a ball with radius and color driven by the loudspeaker signal level (RMS, dBFS). 0 dBFS corresponds to the red color on the left color scale and −∞ dBFS to green. All information is received by OSC messages generated by a joystick and a PURE DATA patch.

6. USER CONTROL INTERFACE

6.1. Controls by software

The user control interface is provided by the FAUST compiled code. Thus, depending on the application, it can be a standalone application, a MAX/MSP patch, a VST or LV2 plug-in, or others. It consists of sliders and entry boxes to control the parameters (r1, θ1, δ1, r0). Check-boxes are used to mute an order m. VU-meters give the signal levels in dBFS for the B^σ_mn(z) components or the sl. As an example, a JACK6 standalone application interface for real-time 3D spatialization using the panning law of Eq. (13) is shown in Fig. 7.

6.2. Controls by hardware device and OSC messages

FAUST supports OSC to control the parameters of the DSP tools. For example, with a hardware controller and a PURE DATA patch, it is possible to generate OSC messages which control the plug-in as well as the visual feedback tool of Sec. 5. This corresponds to the rightmost branch of Fig. 3. The use of hardware controllers and FAUST-generated plug-ins allows for the easy control and recording of the synthesis parameters. As an example, the implemented Ambisonics encoder loaded in the ARDOUR7 Digital Audio Workstation (DAW) as an LV2 plug-in is shown in Fig. 8. With the visual feedback tool described in Sec. 5, this configuration enables the recording of automation tracks describing "manually-created" trajectories.

7. EXPERIMENTAL SETUP

An experimental setup is presented briefly in this section as an application of the tools presented above.

7.1. 3D sound pressure field capture

The recording of natural sound pressure fields is made with a rigid spherical microphone, "MemsBedev"8, shown in Fig. 9. This microphone was made by 3D printing and uses N = 50 microphone capsules on a Lebedev grid [28]. Each capsule is made of 4 MEMS9 microphones to reduce the background noise. The microphone works up to 5th order, as explained in Ref. [16]. The analog signals are digitized with a commercial front-end. The tools presented in this article can provide in real-time the Ambisonics components B^σ_mn(z) up to 5th order according to Eq. (5) and Fig. 3.

4https://processing.org/
5http://mrfeinberg.com/peasycam/
6http://jackaudio.org
7http://ardour.org/
8http://www.cinela.fr/
9MEMS: Micro Electro-Mechanical System
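The OSC messages exchanged between the controller, the FAUST-generated plug-ins and the visual feedback tool of Sec. 6.2 follow the OSC 1.0 binary format (null-padded strings, a type-tag string, then big-endian arguments). As a minimal illustration (our sketch; the address path below is hypothetical, not the one used by the actual tools), such a message can be packed with the Python standard library:

```python
import struct

def osc_pad(b):
    """Null-terminate and pad a byte string to a multiple of 4 bytes (OSC 1.0)."""
    return b + b"\0" * (4 - len(b) % 4)

def osc_float_message(address, value):
    """Encode an OSC message carrying a single float32 argument (type tag ',f')."""
    return osc_pad(address.encode("ascii")) + osc_pad(b",f") + struct.pack(">f", value)

# e.g. a hypothetical message setting the virtual source radius r1 to 1.5 m
msg = osc_float_message("/source/radius", 1.5)
```

Sent over UDP, such packets are what a PURE DATA patch produces from joystick input and what the FAUST and PROCESSING sides parse.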


Figure 7: Standalone application interface running under JACK. The sliders allow controlling the gains and positions of two virtual sources (bottom, grey and yellow). The check-boxes (bottom left) mute and unmute a specific order at the synthesis stage. The entry box (bottom right) allows giving the radius r0 of the rendering spherical loudspeaker layout. The VU-meters give the signal levels in dBFS for each loudspeaker. In this case there are N2 = 50 loudspeakers according to a Lebedev grid.

Figure 9: MemsBedev microphone array with N = 50 MEMS microphones.
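The VU-meters of Fig. 7 display RMS levels in dBFS. As a minimal illustration of that metric (our sketch, not the FAUST implementation), the level of a block of samples can be computed as:

```python
import math

def rms_dbfs(samples):
    """RMS level of a block of samples in dBFS (full scale = 1.0)."""
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return -float("inf") if rms == 0.0 else 20.0 * math.log10(rms)

# A full-scale square wave sits at 0 dBFS; a full-scale sine at about -3 dBFS.
```

A silent block maps to −∞ dBFS, matching the green end of the color scale in Fig. 6.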

Figure 8: Ambisonics encoder loaded in ARDOUR as an LV2 plug-in. The position parameters are controlled with the mouse or by a hardware joystick via OSC messages. Resulting automation tracks are shown in green in this figure. The LV2 plug-in is generated from FAUST code using faust2lv2.

7.2. 3D sound pressure field synthesis

The decoding of the pressure sound field is made in real-time with a decoder made with FAUST according to Eq. (7) and Fig. 3. The Lebedev spherical loudspeaker layout used for the sound field rendering is shown in Fig. 10. For recording and rendering purposes, it is possible to record in 3D in one place and render in real-time the 3D sound pressure field with the loudspeaker sphere located at another location. For spatialization purposes, the user takes place in the sphere and drives the position of the virtual sources using a hardware controller. The hardware controller is linked to a PURE DATA patch and generates OSC messages for the synthesis using FAUST, as summarized in Fig. 3. This experimental setup can thus help to mix 3D audio contents in situ.

Figure 10: Lebedev spherical loudspeaker layout with N2 = 50 loudspeakers.

8. CONCLUSION

Several integrated tools for real-time 3D sound pressure field synthesis using Ambisonics were developed and presented in this paper. The signal processing was implemented using FAUST, the visual feedback with PROCESSING, and the communication between tools with the OSC protocol. The near-field filters were implemented in a FAUST library and allow synthesizing spherical waves. The gain correction with distance is also implemented, which is of great importance for engineering applications or simulators. In the current version of the implementation, the synthesis is controlled up to 5th order in 3D. Some future improvements could include transformations in the Ambisonics domain [29], recent advances in radial filter implementation [27], or the inclusion of the Doppler effect for moving sources [30]. The code of this project is freely available online under a GPL license10. In the near future, the described implementation and experimental setup will be used to reproduce various recorded environments and achieve physical evaluation of the reproduced sound fields.

10https://github.com/sekisushai/ambitools


9. REFERENCES

[1] Jérôme Daniel, Représentation de champs acoustiques, application à la transmission et à la reproduction de scènes sonores complexes dans un contexte multimédia, Ph.D. thesis, Université Paris 6, Paris, 2000.

[2] Jérôme Daniel, "Spatial sound encoding including near field effect: Introducing distance coding filters and a viable, new ambisonic format," in Audio Engineering Society Conference: 23rd International Conference: Signal Processing in Audio Recording and Reproduction, Helsingør, 2003, pp. 1–15, AES.

[3] Sébastien Moreau, Jérôme Daniel, and Stéphanie Bertet, "3D sound field recording with higher order Ambisonics — objective measurements and validation of spherical microphone," in Audio Engineering Society Convention 120, Paris, 2006, pp. 1–24, AES.

[4] Matthias Kronlachner, "Plug-in suite for mastering the production and playback in surround sound and Ambisonics," in Linux Audio Conference, 2013.

[5] Christian Nachbar, Franz Zotter, Etienne Deleflie, and Alois Sontacchi, "AmbiX — A suggested Ambisonics format," in Ambisonics Symposium, Lexington, 2011.

[6] Aaron J. Heller and Eric M. Benjamin, "The Ambisonic Decoder Toolbox: Extensions for partial-coverage loudspeaker arrays," in Linux Audio Conference, 2014.

[7] Julien Colafrancesco, Pierre Guillot, Eliott Paris, Anne Sèdes, and Alain Bonardi, "La bibliothèque HOA, bilan et perspectives," in Journées d'Informatique Musicale, 2013, pp. 189–197.

[8] Matthias Geier and Sascha Spors, "Spatial audio with the SoundScape Renderer," in 27th Tonmeistertagung — VDT International Convention, 2012.

[9] Darren B. Ward and Thushara D. Abhayapala, "Reproduction of a plane-wave sound field using an array of loudspeakers," IEEE Transactions on Speech and Audio Processing, vol. 9, no. 6, pp. 697–707, 2001.

[10] Philippe-Aubert Gauthier, Cédric Camier, Thomas Padois, Yann Pasco, and Alain Berry, "Sound field reproduction of real flight recordings in aircraft cabin mock-up," Journal of the Audio Engineering Society, vol. 63, no. 1/2, pp. 6–20, 2015.

[11] Anthony Bolduc, Philippe-Aubert Gauthier, Telina Ramanana, and Alain Berry, "Sound field reproduction of vibroacoustic models: Application to a plate with wave field synthesis," in Audio Engineering Society Conference: 55th International Conference: Spatial Audio, 2014.

[12] Michael Vorländer, Auralization: Fundamentals of Acoustics, Modelling, Simulation, Algorithms and Acoustic Virtual Reality, Springer Science & Business Media, 2007.

[13] Estelle Bongini, Modèle acoustique global et synthèse sonore du bruit d'un véhicule: application aux véhicules ferroviaires, Ph.D. thesis, Université de Provence — Aix-Marseille I, 2008.

[14] Yann Orlarey, Dominique Fober, and Stéphane Letz, "FAUST: An efficient functional approach to DSP programming," New Computational Paradigms for Computer Music, vol. 290, 2009.

[15] Boaz Rafaely, "Analysis and design of spherical microphone arrays," IEEE Transactions on Speech and Audio Processing, vol. 13, no. 1, pp. 135–143, Jan. 2005.

[16] Pierre Lecomte, Philippe-Aubert Gauthier, Christophe Langrenne, Alexandre Garcia, and Alain Berry, "On the use of a Lebedev grid for Ambisonics," in Audio Engineering Society Convention 139, New York, 2015.

[17] Boaz Rafaely, Barak Weiss, and Eitan Bachmat, "Spatial aliasing in spherical microphone arrays," IEEE Transactions on Signal Processing, vol. 55, no. 3, pp. 1003–1010, 2007.

[18] Mark A. Poletti, "Three-dimensional surround sound systems based on spherical harmonics," Journal of the Audio Engineering Society, vol. 53, no. 11, pp. 1004–1025, 2005.

[19] George B. Arfken and Hans J. Weber, Mathematical Methods for Physicists, Elsevier, 6th edition, 2005.

[20] Fons Adriaensen, "Near field filters for higher order Ambisonics," http://kokkinizita.linuxaudio.org/papers/hoafilt.pdf, 2006, accessed: 2013-10-28.

[21] Sascha Spors, Rudolf Rabenstein, and Jens Ahrens, "The theory of wave field synthesis revisited," in Audio Engineering Society Convention 124, Amsterdam, 2008, pp. 1–19, AES.

[22] Alan V. Oppenheim and Ronald W. Schafer, Digital Signal Processing, Prentice-Hall, Englewood Cliffs, NJ, 1975.

[23] Jens Ahrens and Sascha Spors, "Focusing of virtual sound sources in higher order Ambisonics," in Audio Engineering Society Convention 124, Amsterdam, 2008, pp. 1–9, AES.

[24] Stefan Lösler and Franz Zotter, "Comprehensive radial filter design for practical higher-order Ambisonic recording," Fortschritte der Akustik, DAGA, 2015.

[25] Jérôme Daniel and Sébastien Moreau, "Further study of sound field coding with higher order Ambisonics," in Audio Engineering Society Convention 116, Berlin, 2004, pp. 1–14, AES.

[26] Hannes Pomberger, "Angular and radial directivity control for spherical loudspeaker arrays," 2008.

[27] Robert Baumgartner, Hannes Pomberger, and Matthias Frank, "Practical implementation of radial filters for Ambisonic recordings," in Proc. of ICSA, Detmold, 2011.

[28] V. I. Lebedev, "Values of the nodes and weights of quadrature formulas of Gauss-Markov type for a sphere from the ninth to seventeenth order of accuracy that are invariant with respect to an octahedron," USSR Computational Mathematics and Mathematical Physics, vol. 15, no. 1, pp. 44–51, 1975.

[29] Matthias Kronlachner and Franz Zotter, "Spatial transformations for the enhancement of Ambisonic recordings," in Proceedings of the 2nd International Conference on Spatial Audio, Erlangen, 2014.

[30] Gergely Firtha and Peter Fiala, "Sound field synthesis of uniformly moving virtual monopoles," Journal of the Audio Engineering Society, vol. 63, no. 1/2, pp. 46–53, 2015.
