
Orthogonal Factors Describing Primary and Spatial Sensations of the Sound Field in a Concert Hall

Yoichi Ando

Graduate School of Science and Technology, Kobe University, Rokkodai, Nada, Kobe 657-8501, Japan

Subjective preference for the sound field in a concert hall is described on the basis of a model of the human auditory-brain system. The model consists of the autocorrelation function (ACF) mechanism and the interaural crosscorrelation function (IACF) mechanism for signals arriving at the two ear entrances, and the specialization of the human cerebral hemispheres [Ando, Architectural Acoustics, AIP/Springer, 1998]. From this viewpoint, primary sensations such as pitch or the missing fundamental, loudness, and timbre, together with the duration sensation, which is introduced here as a fourth, are well described by the temporal factors extracted from the ACF, associated with the left hemisphere. Spatial sensations such as apparent source width (ASW) and subjective diffuseness are described by the spatial factors extracted from the IACF, associated with the right hemisphere.

ORTHOGONAL FACTORS

Primary sensations and spatial sensations, as well as subjective preference for sound fields, are well described by a model of the auditory-brain system. The model includes autocorrelation function (ACF) and interaural crosscorrelation function (IACF) mechanisms [1,2]. Important evidence supporting this model was discovered in relation to auditory-brain activity [2]. This article reviews how primary sensations and spatial sensations are mainly described by temporal and spatial factors extracted from the ACF and the IACF, respectively.

Factors extracted from the ACF

The ACF is defined by

$$\Phi_p(\tau) = \frac{1}{2T}\int_{-T}^{+T} p'(t)\,p'(t+\tau)\,dt \tag{1}$$

where $p'(t) = p(t) * s(t)$, $s(t)$ being the ear sensitivity, which is essentially formed by the transfer function of the physical system to the oval window of the cochlea. For convenience, $s(t)$ may be chosen as the impulse response of an A-weighted network [1,2]. The ACF and the power density spectrum contain the same information. Four factors can be extracted from the ACF:

(1) Energy represented at the origin of the delay, $\Phi_p(0)$;

(2) Fine structure, including peaks and delays (Figure 1a). For instance, $\tau_1$ and $\phi_1$ are the delay time and the amplitude of the first peak of the ACF, $\tau_n$ and $\phi_n$ being the delay time and the amplitude of the n-th peak. Usually, there is a certain correlation between $\tau_n$ and $\tau_{n+1}$, and between $\phi_n$ and $\phi_{n+1}$;

(3) Effective duration of the envelope of the normalized ACF, $\tau_e$, which is defined by the ten-percentile delay and which represents a repetitive feature or reverberation contained in the sound source itself.

FIGURE 1. Definition of independent factors other than $\Phi_p(0)$ extracted from the normalized ACF. (a) Values of $\tau_1$ and $\phi_1$ for the first peak, plotted as $\phi_p(\tau)$ against delay time $\tau$ [ms]; (b) the effective duration of the ACF, $\tau_e$, plotted as $\log|\phi_p(\tau)|$ [dB] against delay time $\tau$ [ms], is obtained in practice by extrapolating the envelope of the normalized ACF from the initial 5 dB of the decay.

The normalized ACF is defined by

$$\phi_p(\tau) = \Phi_p(\tau)/\Phi_p(0) \tag{2}$$

As shown in Figure 1b, the value of $\tau_e$ is obtained by fitting a straight line to the initial decay and extrapolating it to the delay time at -10 dB, if the initial envelope of the ACF decays exponentially. Therefore, the four orthogonal temporal factors that can be extracted from the ACF are $\Phi_p(0)$, $\tau_1$, $\phi_1$, and $\tau_e$.

FIGURE 2. Definition of independent factors IACC, $\tau_{IACC}$ and $W_{IACC}$ extracted from the normalized IACF.

Auditory-Temporal Window

In the analysis of the running ACF, the so-called "auditory-temporal window", $2T$ in Equation (1), is of particular interest and must be determined.
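To make the role of the window 2T in Equation (1) concrete, the following is a minimal sketch of how Φp(0), τ1, φ1 and τe might be estimated from one window of a sampled signal. It is not the author's implementation. The function names (`normalized_acf`, `local_peaks`, `first_peak`, `effective_duration`) are hypothetical, the ear-sensitivity filter s(t) is omitted (the input is assumed to be p'(t) already), τ1 is taken as the highest local maximum for τ > 0, and τe is obtained from a straight-line fit to the envelope peaks within the initial 5 dB of decay, extrapolated to -10 dB on a 10 log10 scale.

```python
import numpy as np

def normalized_acf(x, fs, window_s, max_lag_s):
    """Discretized Eqs. (1) and (2) for one window of length 2T = window_s.

    Returns (lags in seconds, normalized ACF phi_p, energy Phi_p(0)).
    Assumption: x is already the filtered signal p'(t).
    """
    n = int(round(window_s * fs))      # samples in the window 2T
    m = int(round(max_lag_s * fs))     # largest lag in samples
    if len(x) < n + m:
        raise ValueError("signal must cover the window plus the largest lag")
    seg = x[:n]
    # Average of p'(t) p'(t + tau) over the window, for each lag.
    big_phi = np.array([np.dot(seg, x[k:k + n]) / n for k in range(m + 1)])
    return np.arange(m + 1) / fs, big_phi / big_phi[0], big_phi[0]

def local_peaks(y):
    """Indices of local maxima of y, end points excluded."""
    return np.flatnonzero((y[1:-1] > y[:-2]) & (y[1:-1] >= y[2:])) + 1

def first_peak(lags, phi):
    """tau_1 and phi_1, taken here as the highest local maximum for tau > 0."""
    idx = local_peaks(phi)
    if idx.size == 0:
        raise ValueError("no peak found within the analysed lag range")
    best = idx[np.argmax(phi[idx])]
    return lags[best], phi[best]

def effective_duration(lags, phi, fit_range_db=5.0):
    """tau_e: delay at which the fitted envelope line reaches -10 dB."""
    level = 10.0 * np.log10(np.maximum(np.abs(phi), 1e-12))
    # Envelope points: the origin plus the local maxima of |phi|.
    idx = np.concatenate(([0], local_peaks(np.abs(phi))))
    keep = idx[level[idx] >= -fit_range_db]   # initial part of the decay
    slope, intercept = np.polyfit(lags[keep], level[keep], 1)
    return (-10.0 - intercept) / slope

# Illustrative input (an assumption): three harmonics of an absent 200 Hz tone.
fs = 48000
t = np.arange(int(0.2 * fs)) / fs
x = sum(np.sin(2 * np.pi * f * t) for f in (600.0, 800.0, 1000.0))
lags, phi, energy = normalized_acf(x, fs, window_s=0.1, max_lag_s=0.008)
tau_1, phi_1 = first_peak(lags, phi)
print("Phi_p(0) =", energy, " tau_1 [ms] =", 1e3 * tau_1, " phi_1 =", phi_1)
```

Here `window_s` plays the role of 2T in Equation (1). A strictly periodic input such as this tone complex has an ACF envelope that does not decay, so `effective_duration` is meaningful only for signals whose normalized ACF actually decays within the analysed lag range.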
Since the initial part of the ACF within its effective duration $\tau_e$ contains the most important information of the signal, the recommended signal duration $(2T)_r$ is given by

$$(2T)_r \approx K_1(\tau_e)_{\min} \quad [\mathrm{s}] \tag{3}$$

where $(\tau_e)_{\min}$ is the minimum value of $\tau_e$ obtained by analyzing the running ACF, $K_1$ being a constant of around 30 [7]. The running step ($R_s$) is selected as $K_2(2T)_r$, with $K_2$ chosen, say, in the range 1/4 to 3/4.

Factors extracted from the IACF

The IACF is given by

$$\Phi_{lr}(\tau) = \frac{1}{2T}\int_{-T}^{+T} p'_l(t)\,p'_r(t+\tau)\,dt \tag{4}$$

where $p'_{l,r}(t) = p_{l,r}(t) * s(t)$, $p_{l,r}(t)$ being the sound pressures at the left- and right-ear entrances. The normalized IACF is given by

$$\phi_{lr}(\tau) = \Phi_{lr}(\tau)/[\Phi_{ll}(0)\Phi_{rr}(0)]^{1/2} \tag{5}$$

where $\Phi_{ll}(0)$ and $\Phi_{rr}(0)$ are the autocorrelation functions at $\tau = 0$, that is, the sound energies arriving at the left- and right-ear entrances, respectively. Spatial factors extracted from the IACF are defined in Figure 2 [2].

In analyzing the running IACF, $2T$ is also selected by Equation (3). For the purpose of spatial design of sound fields, however, longer values of $(2T)_r$ may be useful, because it is essentially time independent.

PRIMARY SENSATIONS

Loudness

Let us now consider primary sensations. Loudness $s_L$ is given by

$$s_L = f[\Phi_p(0), \tau_1, \phi_1, \tau_e, D] \tag{6}$$

where $D$ is the duration of the sound signal, as represented by musical notes. It is worth noting that the value of $\tau_1$ corresponds to the pitch of the sound and/or the missing fundamental, as discussed below. Since the sampling frequency of the sound wave is more than twice the maximum audio frequency, the value $10\log[\Phi_p(0)/\Phi(0)_{\mathrm{ref}}]$ is far more accurate than the $L_{eq}$ measured by a sound level meter.

Scale values of loudness within the critical band were obtained in paired-comparison tests (with filters with slopes of 1080 or 2068 dB/octave) under the condition of a constant $\Phi_p(0)$ [2,4]. Obviously, when a sound signal has a similar repetitive feature, $\tau_e$ becomes large, as for a pure tone, and greater loudness results. Thus a plot of loudness versus bandwidth is not flat within the critical band. This contradicts previous results for the frequency range centered on 1 kHz [5].

Pitch

The second primary sensation to which the ACF applies is the pitch, or the missing fundamental, of the noise. It is given by

$$s_P = f[\Phi_p(0), \tau_1, \phi_1, \tau_e, D] \tag{7}$$

When a sound signal contains only a number of harmonics without the fundamental frequency, we hear the fundamental as a pitch. This phenomenon is well explained by the delay time of the first peak in the ACF fine structure, $\tau_1$ [6,7]. According to experimental results on the pitch perceived when listening to bandpass noises without any fundamental frequency, the pitch $s_P$ is expressed by Equation (7) as well, under the condition of a constant $\Phi_p(0)$. The strength of the pitch sensation is described by the magnitude of the first peak of the ACF, $\phi_1$. For a signal of short duration, the factor $D$ must be taken into account.

Timbre

The third primary sensation, timbre, which includes pitch, loudness, and duration, might be expressed by

$$s_T = f[\Phi_p(0), \tau_e, \tau_1, \phi_1, D] \tag{8}$$

It is worth noting that the intelligibility of single syllables as a function of the delay time of a single reflection can be well calculated from the four orthogonal factors extracted from the running ACF analyzed for the piece between the consonant and vowel sounds [7]. A recent investigation clearly shows that timbre, or dissimilarity judgment, is an overall subjective response similar to the subjective preference for sound fields in a concert hall.

Duration

The fourth primary sensation, which is introduced here, is the perception of signal duration, which is given by [12,13]

$$s_D = f[\Phi_p(0), \tau_1, \phi_1, \tau_e, D] \tag{9}$$

One experimental result has been expressed in relation to $\tau_1$, $\phi_1$, and $D$ [8]. Table 1 summarizes the primary sensations in relation to the factors extracted from the ACF and the physical signal duration $D$.

Table 1. Primary sensations in relation to factors extracted from the autocorrelation function and the interaural crosscorrelation function.

Factors              Primary sensations
                     Loudness   Pitch    Timbre a)   Duration
ACF  $\Phi_p(0)$     X          x        X           X
     $\tau_1$        X          X        X           X
     $\phi_1$        x          X        X           X
     $\tau_e$        X          x        X           x
$D$                  x b)       x b)     X b)        X

X and x: major and minor factors influencing the corresponding response, respectively.
a) Timbre in relation to all of the temporal and spatial factors is under investigation.
b) It is suggested that loudness, pitch and timbre should be examined in relation to the signal duration.

$$s = f[LL, \mathrm{IACC}, \tau_{IACC}, W_{IACC}] \tag{10}$$

where

$$LL = 10\log[\Phi_p(0)/\Phi(0)_{\mathrm{ref}}] \tag{11}$$

and $\Phi_p(0) = [\Phi_{ll}(0)\Phi_{rr}(0)]^{1/2}$, $\Phi_{ll}(0)$ and $\Phi_{rr}(0)$ being the ACFs at $\tau = 0$ (sound energies) of the signals arriving at the left- and right-ear entrances. Among the four orthogonal factors in Equation (10), the interaural delay time, $\tau_{IACC}$, is a significant factor in determining the perceived horizontal direction of the source. A well-defined direction is perceived when the normalized interaural crosscorrelation function has one sharp maximum, a high value of the IACC and a narrow value of the $W_{IACC}$, due to high-frequency components. On the other hand, subjective diffuseness, or no spatial directional impression, corresponds to a low value of IACC (< 0.15) [9].

Of particular interest is that, for the perception of a sound source located in the median plane, the temporal factors extracted from the ACF of sound signal