Acoust. Sci. & Tech. 26, 2 (2005) PAPER

The effect of various source signal properties on measurements of the interaural crosscorrelation coefficient

Russell Mason, Tim Brookes and Francis Rumsey
Institute of Sound Recording, University of Surrey, Guildford, Surrey, GU2 7XH, UK
(Received 29 June 2004, Accepted for publication 7 December 2004)

Abstract: Measurements that attempt to predict the perceived spatial impression of musical signals in concert halls typically are conducted by calculating the interaural cross-correlation coefficient (IACC) of an impulse response. The causes of interaural decorrelation are investigated and it is found that this is affected by frequency dependent interaural time and level differences and variations in these over time. It is found that the IACC of impulsive and of narrowband tonal signals can be very different from each other in a wide range of acoustical environments, due to the differences in the spectral content and the duration of the signals. From this, it is concluded that measurements made of impulsive signals are unsuitable for attempting to predict the perceived spatial impression of musical signals. It is suggested that further work is required to develop a set of test signals that is representative of a wide range of musical stimuli.

Keywords: Spatial impression, Concert hall acoustics, Objective measurement techniques

PACS number: 43.55.Hy, 43.55.Mc [DOI: 10.1250/ast.26.102]

1. INTRODUCTION

Research into auditorium acoustics has indicated that spatial impression is an important component that contributes to the perceived quality or preference ratings of concert halls [1]. Consequently, a large amount of research has been conducted into both the perception of spatial impression in concert halls [2–4] and objective measurement techniques that attempt to predict it [5–7]. This research has included much discussion about the separate attributes that make up spatial impression, and it appears that a consensus has been reached that there are two major factors: apparent source width (ASW) and listener envelopment (LEV) [3]. Objective measurements that have been developed to attempt to predict these factors can be divided into two types: those based on lateral energy [8,9] and those based on the interaural cross-correlation coefficient (IACC) [5,7]. This paper focuses on the latter, as it is closer to the human perceptual process due to the fact that it quantifies the properties of the signals that arrive at the ears of the binaural receiver as opposed to using a more artificial technique for capturing the soundfield.

The basis of the IACC is the normalized cross-correlation function (NCC), which is calculated as shown in Eq. (1), where $x$ and $y$ are the two signals whose correlation is to be calculated, $t_1$ to $t_2$ is the period over which the correlation is calculated and $\tau$ is an offset between the two signals under measurement.

$$\mathrm{NCC}(\tau) = \frac{\int_{t_1}^{t_2} x(t)\, y(t+\tau)\, dt}{\left[\int_{t_1}^{t_2} x^2(t)\, dt \int_{t_1}^{t_2} y^2(t)\, dt\right]^{1/2}} \tag{1}$$

The NCC is a measure of the similarity of any two signals, though for the purpose of the IACC it is employed to analyse a pair of binaural signals (the signals that reach the ears of a listener or a head and torso simulator). In this case, $\tau$ is usually measured over a range that is large enough to encompass the maximum interaural time difference (ITD) that is caused by the physical separation of human ears, typically ±1 ms. The final IACC value is then taken to be the maximum absolute value across the range of $\tau$, as shown in Eq. (2).

$$\mathrm{IACC} = \left|\mathrm{NCC}(\tau)\right|_{\max}, \quad \text{for } -1\,\mathrm{ms} < \tau < +1\,\mathrm{ms} \tag{2}$$

Previous research indicates that a simple stimulus with an IACC close to 1 will be perceived to be relatively narrow. As the IACC of the signal is reduced, the sound will be perceived to increase in size or width, until a point at which it may separate into two spatially distinct components, one positioned at each ear [10,11].

The most common method used for measuring the IACC of concert hall acoustics is to calculate the IACC of

an impulse response [12]. The impulse response is the equivalent of exciting the acoustical environment with a wideband impulse from a source (usually positioned on the stage) and recording the resulting sound field at one or more receiver positions (usually in the audience). This impulse response is first divided at a point 80 ms after the direct sound; the measurements made of the early part are related to the perceived properties of the source and the measurements made of the later part are related to the perceived properties of the acoustical environment [5]. This is a logical method of making such measurements, as a large number of factors can be derived from such an impulse response.

However, there are a number of problems with this method of IACC-based measurement. Firstly, it has been shown that the two time segments of the impulse response do not affect the perceived properties of the source or the environment in an orthogonal manner [9,13]. In fact, the authors found that the properties of the reflections that arrive after 80 ms can affect the perceived source width more than the reflections that arrive before 80 ms [13]. Secondly, it is known that variations in IACC over time can be perceived as a change in width [14], and therefore ideally this should be quantified using a 'running' measurement [15]. Finally, objective results calculated by analysing the reverberant decay from an impulse may not necessarily be relevant to the perception of musical signals. The reason for this is the dissimilarity between an impulsive signal and the majority of the types of programme material that are produced in a concert hall during performances. This difference means that the results of measurements made of impulses can be different from measurements made of corresponding musical signals, as found by Griesinger [16].

This paper considers this last factor. By analysing a number of different signal types that have been passed through a number of acoustical systems, the causes of interaural decorrelation are explored, and the likely effects of these on measurements that relate to spatial impression are estimated. From this, recommendations are made regarding the most suitable source signals for use when making IACC-based measurements that relate to certain aspects of spatial impression.

The paper contains three main sections. The first section examines the causes of interaural decorrelation in a simple acoustical environment consisting of a direct sound from the median plane followed by a single lateral reflection. The second section develops this by investigating the effect of two different source signal types on the resulting IACC. The third section then involves analysis of these two source signals in more complex acoustical environments.

2. CAUSES OF INTERAURAL DECORRELATION IN A SIMPLE ACOUSTICAL ENVIRONMENT

Two major factors that can affect the interaural correlation of a signal in a simple acoustical environment are variations in interaural time difference (ITD) and interaural level difference (ILD) over time. Blauert and Lindemann found that these affect the perceived spatial impression of a sound [17], and it may be considered as follows. If a continuous signal is presented to both ears and either the ITD or ILD is varied slowly over time, then a listener will perceive this as movement. However, if the rate of change is faster than a few Hz, then the subject may not perceive the change of location, due to the perceptual process of 'localisation lag' or 'binaural sluggishness' [18]. Grantham and Wightman found that for rates of temporal fluctuation in ITD of greater than 20 Hz or so, the perception of movement was replaced by the perception that the stimulus was 'wide' or 'diffuse' [19].

2.1. Acoustical Cause of Fluctuations in ITD and ILD

It is useful to consider how these fluctuations are created in natural acoustical environments, starting with the simple case of a direct sound and a single lateral reflection. If a sine wave arrives from a source in the median plane, and is followed by a reflection from the lateral plane, then these two components will interfere at the ears. The result of this is summation or cancellation of the signals that reach the ears, and this interaction may be different at each ear due to the fact that the direct sound and the reflection arrive from different directions. The direct sound will reach both ears at the same time and at the same level, whereas the reflection will reach the ear that is nearest to the reflecting surface before and possibly at a higher level than the ear that is furthest from the reflecting surface. The resulting interaction depends strongly on the relationship between the wavelength of the source signal and the relative delay and level of the direct sound and reflection. This means that there will be a frequency dependent interaction, which is likely to be different at the two ears, and this will result in interaural level and phase differences that are dependent on frequency.

If the source signal is more complex than a single sinusoid, the frequency-dependent summation or cancellation may lead to interaural differences in level and phase that are different for each frequency component. When these components are summed, the resulting signal at each ear can have differences in level and phase that vary over time.

This can be demonstrated using a simulation of a sound source located 15 m in front of a binaural receiver, with a single reflection from a surface 5 m to the right of and

parallel to the path from the source to the receiver, as shown in Fig. 1. If a source signal consisting of three sine tones of 480, 500 and 520 Hz is passed through this system, frequency-dependent addition and cancellation will occur that is different at the two ears. This causes fluctuations in interaural time and level difference, as can be seen in Fig. 2.

Fig. 1 Diagram of the positions of the simulated omnidirectional source 15 m in front of the 0.18 m wide binaural receiver, with a single reflection from a wall with an absorption coefficient of 0 that is positioned 5 m to the right hand side and parallel to the path of the direct sound from the source to the receiver.

It can be seen in Fig. 2 that despite the fact that the source signal consists of an equal amount of each frequency component, this is not the case when the signal arrives at the receiver due to the frequency dependent interaction mentioned above. It is also apparent that the level of each frequency component that arrives at the ears of the listener is different in each ear. In addition to this, there are also likely to be frequency dependent phase differences between the frequency components arriving at each ear. These interaural differences mean that when the frequency components are summed, the total signal in each channel is different, which can be interpreted as a time-varying ITD and ILD. The variation in ILD over time is visible as a change in the relative amplitude of the signals, and the variation in ITD over time is apparent as an alternate leading and lagging in the phase of the two signals.

These temporal fluctuations in ITD and ILD can be measured individually to give a clearer indication of the fluctuations that are present in the signal. The fluctuations in ITD can be measured through the use of a consecutive series of IACC calculations with a short time window, the positions of the maxima of each across a range of $\tau$ indicating the ITD in that segment of time as discussed in [20]. The fluctuations in ILD can be measured by calculating the amplitude difference of the left and right ear signals and smoothing the result to reduce the effect of the temporal detail of the signal, as discussed in [20]. The results of this analysis are shown in Fig. 3.

It can be seen in Fig. 3 that the temporal fluctuations in ITD and ILD are cyclic, at a rate that matches the spacing between the frequency components that make up the original source signal. It therefore appears that the fluctuations that are created are dependent on the characteristics of the source signal as well as the reflection pattern.

2.2. Relationship between Temporal Fluctuations in ITD and ILD and the IACC

Fig. 2 Plot of the waveform and spectral content of a sound source signal consisting of three sine tones of 480, 500 and 520 Hz, and the resulting left and right binaural channels when passed through the acoustical simulation shown in Fig. 1. [Panels: source signal, left ear signal, right ear signal; amplitude against time (secs) and magnitude (dBFS) against frequency (Hz).]

Fig. 3 Plot of the measured fluctuations in interaural time difference (ITD, upper plot) and interaural level difference (ILD, lower plot) of the binaural signal shown in the lower two plots of Fig. 2. [Axes: ITD (msecs) and ILD against time (secs).]

The temporal fluctuations in ITD and ILD are closely related to the IACC of a signal, as was found by Blauert and Lindemann [17]. They found that measurements of the IACC and the magnitude of fluctuations in ITD and ILD

were highly correlated with each other, as well as with subjective judgements of spaciousness.

As shown in [20], increasing the magnitude of the fluctuations in either ITD or ILD reduces the resulting IACC. However, a change in the magnitude of the fluctuations in ILD alters the IACC less than a similar change in the magnitude of the fluctuations in ITD.

Griesinger considered the relationship between the fluctuations in ILD and ITD and the resulting IACC, and noted that the IACC "links fluctuations in both the amplitude and the relative phase of the signals in a particular way" [21]. This poses the question of whether a more accurate result (in terms of predicting the perceived effect) will be achieved by measuring the fluctuations in ITD and ILD directly, or whether a measurement based on the IACC is sufficient. From a perceptual point of view, it is not yet clear whether the human auditory system detects the fluctuations in ITD and ILD directly, or whether the perceptual process is more similar to a cross-correlation mechanism. However, it is currently more practical to use a measurement technique based on the IACC, due to the difficulty of measuring fluctuations in ILD¹. As concluded by Griesinger, measurements based on the IACC have proved useful, and "might turn out to be as good or better than any other measure" [21].

¹ For instance, if the level difference is calculated by subtracting the sample values of one channel from the sample values of the other, the result can be affected by the overall instantaneous signal level and deviations from an ITD of 0, as discussed in [20]. It is possible to lessen the effect of these confounding factors by taking an average value across a short window; however, this requires a trade-off between the duration of averaging and the maximum rate of fluctuations that can be detected.

2.3. Summary

This section of the paper has outlined the theory that the perception of certain auditory spatial attributes may be caused by fluctuations in interaural time and level differences over time. It has been shown that these can be created in simple acoustical environments, as long as the source signal is more complex than a single sinusoid. The relationship between these fluctuations and the IACC has also been considered, and it has been noted that as yet it is unclear which of these the auditory system detects. As measurements based on the IACC are most commonly used for this type of analysis, it is proposed that it is logical to continue this unless firm evidence is found to contradict its use. Therefore, the remainder of the paper will focus primarily on IACC-based measurements, but will investigate the fluctuations in ITD and ILD where necessary to give an indication of their contribution to the resulting IACC.

It was also noted that the properties of the fluctuations in ITD and ILD that were created were strongly affected by the spectral properties of the source signal. This indicates that the properties of the source signal may be important when analysing the spatial impression created by an acoustical environment.

3. COMPARISON BETWEEN MEASUREMENTS OF AN IMPULSE RESPONSE AND MEASUREMENTS OF A MORE CONTINUOUS NARROWBAND SOURCE SIGNAL

As discussed in the introduction, measurements that aim to predict the perceived source width of a sound in an acoustical environment are most commonly implemented by analysing the impulse response measured from one or more source positions to one or more receiver positions. However, it was found in the previous section that the frequency content of the source signal has an effect on the resulting fluctuations in interaural time difference (ITD) and interaural level difference (ILD). In view of this, it is useful to investigate how measurements made of the impulse response, as opposed to a source signal convolved with the impulse response, compare. The results of this analysis are shown in Fig. 4.

Fig. 4 Plot of the impulse response of the acoustical simulation shown in Fig. 1, together with measurements of the fluctuations in interaural time difference (ITD) and interaural level difference (ILD), and the variation in the resulting interaural cross-correlation coefficient (IACC) over time. [Panels, top to bottom: left ear signal, right ear signal, ITD (msecs), ILD, IACC, all against time (secs).]
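As a rough cross-check on the arrival times that such an impulse response contains, the geometry of Fig. 1 can be evaluated with a simple image-source calculation. The sketch below is illustrative only and is not the simulation used in the paper: it assumes a speed of sound of 343 m/s, treats the ears as two points 0.18 m apart with no head shadowing, and places the receiver at the origin. The function name `arrival_times` is hypothetical.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed value (not stated in the text)


def arrival_times(source_distance=15.0, wall_offset=5.0, ear_spacing=0.18,
                  c=SPEED_OF_SOUND):
    """Return {ear: (direct_s, reflected_s)} for the Fig. 1 layout.

    Coordinates: receiver centre at (0, 0), source at (0, source_distance),
    rigid wall along the line x = wall_offset (to the listener's right).
    The ears are modelled as points at x = -/+ ear_spacing / 2, which ignores
    head shadowing and is an assumption of this sketch.
    """
    half = ear_spacing / 2.0
    ears = {"left": (-half, 0.0), "right": (half, 0.0)}
    source = (0.0, source_distance)
    # Image source: the real source mirrored in the wall plane.
    image = (2.0 * wall_offset, source_distance)

    times = {}
    for name, (ex, ey) in ears.items():
        direct = math.hypot(source[0] - ex, source[1] - ey) / c
        reflected = math.hypot(image[0] - ex, image[1] - ey) / c
        times[name] = (direct, reflected)
    return times


times = arrival_times()
direct_itd = times["left"][0] - times["right"][0]
reflection_itd = times["left"][1] - times["right"][1]
reflection_delay = times["right"][1] - times["right"][0]
print(f"ITD of direct sound: {direct_itd * 1e3:.3f} ms")
print(f"ITD of reflection:   {reflection_itd * 1e3:.3f} ms (right ear leads)")
print(f"Delay of reflection: {reflection_delay * 1e3:.2f} ms after the direct sound")
```

Under these assumptions the direct sound has an ITD of zero, while the reflection reaches the right ear first by a fraction of a millisecond, which is consistent with the deviation in ITD that the text attributes to the lateral reflection.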


The upper two plots of Fig. 4 show the left and right channels of the binaural impulse response, and this is followed by measurements of the fluctuations in ITD and ILD, and the running IACC (where calculations are made every 1.5 ms through the stimulus with $t_1$ to $t_2$ in Eq. (1) set to be 50 ms as discussed in [22]), in descending order. As may be expected, the measurements of the fluctuations in ITD and ILD show that the arrival of the direct sound causes no change in either of these, and that the arrival of the lateral reflection causes a deviation in both ITD and ILD. It must be noted that the duration of the deviation in ITD is related to the duration of the measurement window (in this case a rectangular window with a duration of 25 ms). In a similar manner, the duration of the deviation in ILD is related to the smoothing function (in this case a 5 ms moving average low-pass filter). The measurement of the IACC shows a similar trend: the correlation is unchanged at a value of 1 for the direct sound, and the lateral reflection causes a reduction in the value to approximately 0.65. Again, the extent of this variation is related to the duration of the measurement window.

3.1. Causes of Decorrelation for an Impulsive Signal

It is useful to consider the cause of the measured decorrelation for the impulse response of a direct sound and a single reflection. Further analysis, as shown in Fig. 5, indicates that there are two main contributing factors, one related to the lateral reflection itself and one related to the interaction between the direct sound and the lateral reflection.

Fig. 5 Measurement of the variation in the interaural cross-correlation coefficient (IACC) over time using a running IACC: the upper plot shows the measurement of a single lateral reflection from 40°, the lower plot shows the measurement of a direct sound from the median plane followed by a lateral reflection from 40° that is delayed by 25 ms. [Axes: IACC against time (secs).]

Firstly, as can be seen in the upper plot of Fig. 5, there is decorrelation due to the fact that the reflection does not arrive from the median plane and therefore has a non-zero ILD. The measurement result is not affected by the non-zero ITD as the maximum value is used across a range of $\tau$ of ±1 ms; a range that covers all possible natural ITDs. However, as the non-zero ILD is not uniform across frequency, a certain amount of decorrelation remains when measured across a practical frequency range. In addition to this, if variations in the IACC are measured over time using a consecutive series of calculations, it is possible for a particular measurement window to include a section of the binaural impulse response that only contains information in one channel. Examples of this can be seen at the beginning and end of the measurement of the reflection shown in the upper plot of Fig. 5. At a time of approximately 0.05 s, the measured IACC drops from a value of 1 as the start of one channel of the impulse enters the measurement window, the impulse arriving later in the other channel. At a time of approximately 0.15 s, the measured IACC again drops as the end of the impulse leaves the measurement window. It is possible to reduce this effect by the use of a measurement window with smoother end transitions, some form of temporal smoothing, or weighting of the results by the signal level, though it appears that this is not included in the most commonly used measurement techniques [12].

Secondly, as can be seen in the lower plot of Fig. 5, the interaction between the direct sound and the reflection within a single measurement window also affects the measured correlation. If the long-term IACC is measured (a measurement where $t_1$ in Eq. (1) is the start of the stimulus and $t_2$ is the end of the stimulus), then both the direct sound and the reflection will be included in the same calculation, and each will have a certain effect on the result. If the measurement is made using a number of shorter windows, each window may or may not contain both arriving signals, and if both are included, the proportion of each will be dependent on the delay of the reflection, and the duration and position of the measurement window. The effect of the interaction can be seen in the results in the time period of 0.05 to 0.1 s in the lower plot of Fig. 5. In this region, the measurement windows contain proportions of both the direct sound and the reflection, and the variations in the result across this range are caused by the differing proportions of each arriving signal in each measurement window. By comparing the upper and lower plots of Fig. 5, it can be seen that the interaction between the direct sound and the reflection in a single measurement window results in a lower correlation compared to the reflection alone, and once the direct sound has ended at approximately 0.1 s, the remaining result is identical to the single reflection alone.
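The running measurement discussed above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `ncc`, `iacc` and `running_iacc` are hypothetical, the integrals of Eq. (1) are approximated by sums over the samples inside each rectangular window, and the reflection is modelled as a single sample with an assumed, frequency-independent level difference (0.8 against 0.5) and an assumed interaural delay of 14 samples at 48 kHz. Because the level difference does not vary with frequency, this toy model can only reproduce the second factor (the interaction of direct sound and reflection in one window); a reflection on its own still gives an IACC of 1 here.

```python
import numpy as np


def ncc(x, y, max_lag):
    """Discrete form of Eq. (1) for lags -max_lag..+max_lag (in samples)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(x)
    lags = np.arange(-max_lag, max_lag + 1)
    norm = np.sqrt(np.dot(x, x) * np.dot(y, y))
    if norm == 0.0:
        # Undefined when either channel is silent across the window.
        return lags, np.full(len(lags), np.nan)
    values = np.empty(len(lags))
    for i, k in enumerate(lags):
        if k >= 0:
            values[i] = np.dot(x[:n - k], y[k:])   # sum of x[t] * y[t + k]
        else:
            values[i] = np.dot(x[-k:], y[:n + k])
    return lags, values / norm


def iacc(x, y, fs, max_itd=1e-3):
    """Eq. (2): maximum absolute NCC for offsets within +/- max_itd seconds."""
    _, values = ncc(x, y, int(round(max_itd * fs)))
    return np.max(np.abs(values))


def running_iacc(x, y, fs, window=0.050, hop=0.0015):
    """IACC in consecutive rectangular windows (50 ms long, every 1.5 ms)."""
    n_win = int(round(window * fs))
    n_hop = max(1, int(round(hop * fs)))
    starts = np.arange(0, len(x) - n_win + 1, n_hop)
    values = np.array([iacc(x[s:s + n_win], y[s:s + n_win], fs) for s in starts])
    return starts / fs, values


# Toy binaural impulse response: direct sound plus one lateral reflection.
fs = 48000
left = np.zeros(int(0.3 * fs))
right = np.zeros_like(left)
direct = int(0.100 * fs)
reflection = direct + int(0.025 * fs)   # reflection delayed by 25 ms
left[direct] = right[direct] = 1.0      # direct sound: identical at both ears
right[reflection] = 0.8                 # nearer ear: earlier and louder (assumed)
left[reflection + 14] = 0.5             # farther ear: about 0.3 ms later (assumed)

times, values = running_iacc(left, right, fs)
print(f"minimum running IACC: {np.nanmin(values):.3f}")
```

In this sketch, windows that contain only the direct sound give an IACC of 1, whereas windows that contain both the direct sound and the reflection give a value of about 0.70, illustrating how the proportions of each arrival within a window lower the measured correlation.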


3.2. Comparison between Impulsive and More Continuous Signals

The above analysis is also relevant when considering the IACC that results when differing source signals are passed through an acoustical environment that adds at least one reflection to the direct sound. If the duration of the source signal is less than the delay between the direct sound and the reflection, then there will be no interaction between sound arriving directly and the reflected version. In this case, the result may be more similar to the upper plot in Fig. 5, where any decorrelation is solely caused by the properties of the reflection. On the other hand, if the duration of the source signal is longer than the delay between the direct sound and any reflections (or longer than the delay between each pair of reflections if there is more than one), then the interaction between the simultaneously arriving signals will affect the results in a similar manner to that shown in the lower plot of Fig. 5. This will be discussed in more detail later.

It can be seen that the fluctuations in ITD and ILD that are created by the impulse response (shown in Fig. 4) and by the three sine tone (TST) signal fed through the same acoustic simulation (shown in Fig. 6) are very different. However, the resulting minimum IACC values are similar; approximately 0.65 for the impulse response (shown in Fig. 4) and approximately 0.67 for the TST signal (shown in Fig. 6).

Fig. 6 Plot of the three sine tone (TST) signal convolved with the impulse response of the acoustical simulation shown in Fig. 1, together with measurements of the fluctuations in interaural time difference (ITD) and interaural level difference (ILD), and the variation in the resulting interaural cross-correlation coefficient (IACC) over time. [Panels, top to bottom: left ear signal, right ear signal, ITD (msecs), ILD, IACC, all against time (secs).]

It must be noted that the impulse and the TST signal have very different spectral characteristics, and the comparison would be fairer if these were more similar. In view of this, the impulse response was filtered with a 3rd order Butterworth bandpass filter, with the cut-off frequencies set to match the upper and lower frequency components of the TST signal: i.e. 20 Hz above and below the stimulus centre frequency. In addition, the centre frequency of the signals may have an effect on the result, therefore both impulsive and TST signals were created with a range of centre frequencies with octave spacing from 125 Hz to 8,000 Hz.

The TST signals consisted of a tone at the centre frequency and tones 20 Hz either side of this. Each of these was then passed through the simulation of a direct sound and a single early reflection as described above. The impulsive signals were created by filtering the impulse response of the direct sound and the single reflection as described above. In order to maximise the likelihood of similar results between the impulsive and tonal signals, the IACC was calculated as a long-term measurement over the entire signal, therefore ensuring interaction between the direct sound and the reflection in both cases. The results are shown in Table 1.

Table 1 Results of the long-term interaural cross-correlation coefficient (IACC) measurements of the impulse response shown in Fig. 4 filtered using a bandpass filter with a 40 Hz bandwidth centred on the centre frequency indicated in the left column, and the related three sine tone (TST) signal consisting of tones at the centre frequency and 20 Hz either side, passed through the acoustical simulation shown in Fig. 1.

Centre frequency (Hz)    IACC of narrowband impulse response    IACC of TST signal
125                      0.99                                   0.99
250                      0.98                                   0.97
500                      0.66                                   0.67
1,000                    0.87                                   0.89
2,000                    0.61                                   0.52
4,000                    0.94                                   0.97
8,000                    0.70                                   0.78

It can be seen that the results of the two sets of measurements are highly correlated in most cases, though with larger differences at 2,000 and 8,000 Hz. Further analysis indicates that these larger differences are caused by interactions at specific frequencies that affect one type of signal and not the other. As an example, the frequency response of the interaction between the direct sound and the reflection around 2,000 Hz is shown in Fig. 7. It can be


seen that there is a dip in the response in the right channel at approximately 2,010 Hz that is deep and narrow (similar to a high Q bandstop filter with a rejection ratio of approximately 35 dB). This causes a large interaural difference for the filtered impulse that has energy at the frequency of the dip, but causes only a small interaural difference for the TST stimuli whose spectral components are on each side of the dip. Repeating the analysis using stimuli centred on 2,010 Hz gives a resulting IACC that is much lower for both stimulus types, and that is more similar between the two types of stimulus (0.30 for the filtered impulse and 0.34 for the TST signal).

Fig. 7 Frequency response of the interaction between the direct sound and the reflection around 2,000 Hz. [Axes: FFT magnitude (dBFS) against frequency (Hz); traces: right channel, left channel.]

Repeating these measurements using source signals consisting of different frequency components and simulations that include differing source-receiver and receiver-wall distances again shows that the IACC measurements made of the two signal types (filtered impulse response or continuous tonal signal convolved with the impulse response) give similar results apart from when there is a high Q notch in the response that affects the two signals differently due to their differing spectral content.

3.3. Summary

This section of the paper investigated some of the differences between IACC-based measurements of an impulse response and of source signals convolved with the impulse response. The mechanisms that cause decorrelation of an impulsive signal were discussed, together with the effect of different measurement strategies. It was shown that there are two factors that affect the measured results: differences in spectral content and interaction between the direct sound and the reflection. The first of these can result in differing measured values due to the acoustical interaction between the direct sound and the reflection causing strong cancellations at certain frequencies that resemble high Q bandstop filters. However, even if the spectral content of the two signals is identical, the differences between the temporal properties of the two signals mean that the acoustical interaction may not occur at the ears due to the signals arriving at separate times. This can also depend on the method used to measure the IACC; a long-term measurement may give a more similar result for the two signal types due to the inclusion of the interaction between the direct sound and the reflection which may not occur acoustically.

4. INVESTIGATION OF MORE COMPLEX REFLECTION PATTERNS

The paper has discussed the effect of different source signals on the resulting IACC, but has so far only used a simple example consisting of a direct sound and a single reflection. It is logical to expand this to investigate the effect of more complex reflection patterns.

The analysis of the differences between the two types of source signal can be expanded to include a full binaural impulse response of an acoustical environment such as a concert hall. From the results shown above for a direct sound with a single reflection, it may be expected that the multiple reflections that occur in an acoustical environment will interact with the direct sound in a similar manner and that frequency-dependent and time-varying ITDs and ILDs will still be created. However, there will be a number of significant differences with the addition of a more complex reflection pattern as may be present in an acoustical environment. Firstly, the greater number of reflections means that the interactions will be more complex. Secondly, the greater range of delays is likely to cause the duration of the stimulus to have a greater effect on the results. Finally, the presence of a large number of reflections with a greater range of delays gives rise to the possibility that reverberation will be perceived as a separate entity from the perceived sound source.

The effect of the duration of source signals on the measured IACC in reverberant environments was investigated by Yanagawa, Yamasaki and Itow [23]. They found that when continuous and repetitive (i.e. stationary) signals were used, the maximum measured correlation value changed rapidly at first, and then more slowly as more reflections reached the receiver, until a steady-state value was reached. This can be explained as follows. As each reflection arrives at the receiver position, it will interact with the direct sound, causing the resulting signal to change. If a simplistic model of a reverberant decay is assumed, such that reflections that arrive later will be of a lower level, then the higher relative level of the earlier reflections is likely to cause a large change in the signal that reaches the receiver, whereas the lower relative level of the reflections that arrive later is likely to cause a lesser change. Therefore, over the duration of time that new reflections of the direct sound arrive at the receiver, the variation in the total signal at the receiver changes less and

less, until a point where the signal that reaches the receiver becomes steady-state.

On the other hand, for a short source signal (such as an impulse) the sound field will not reach a steady-state [23]. The transient nature of the source signal causes the direct sound that arrives at the receiver to end quickly, which means that the system of reflections is not being driven continuously, and that the different reflections that arrive at the receiver will cause the total signal to vary by a large amount over the period of the reverberant decay.

To demonstrate this, a simple rectangular room was simulated that was approximately 20 m wide by 43 m long by 18 m tall, with a reverberation time of approximately 1.9 s, as shown in Fig. 8. An omnidirectional source was simulated at a representative stage position that was 3 m left of the centre, 5 m from the front wall, at a height of 2 m. A binaural receiver was simulated at a representative audience position that was 4 m to the right of the centre, 13 m from the back wall, at a height of 1.8 m and facing towards the source.

Fig. 8 Diagram of the simulated simple rectangular room with dimensions of 20 m wide by 43 m long by 18 m tall, with surface absorption set to 0.25 and the diffusion set to 10%, resulting in an RT60 of approximately 1.9 s. The source position is denoted A0, and the receiver position is indicated by + and denoted 00.

As the total duration of the impulse response is much longer with a complete room simulation, it is more likely that variations in the IACC will be perceivable, even when taking a conservative estimate of the rate of change that can be perceived. Therefore running measurements are required to plot the variations in the IACC over time, as shown in Fig. 9.

The left hand plots of Fig. 9 show the left and right ear signals of the impulse response from the source to the binaural receiver, the fluctuations in ITD and ILD, and the running IACC, respectively. These were measured using an identical procedure to that used for Fig. 4 and described in section 3. It can be seen that the fluctuations in ITD are relatively slow at the beginning of the impulse response, and become gradually more erratic throughout the reverberant decay. For the majority of the decay, the ITD is centred on zero, which indicates that the sound may be perceived to be central. It can also be seen that the fluctuations in ILD are erratic from near the start of the impulse response, and decay in a similar manner to the impulse response due to their relationship with the overall sound level. The IACC starts at a value of 1 but then rapidly drops when the first lateral reflections arrive at the receiver. From approximately 50 ms onwards the IACC is almost constant at a value of approximately 0.15.

The results of measurements of the impulse response can be compared with the measurements of the TST signal convolved with the impulse response, as shown in the right hand plots of Fig. 9. The major difference is that in all three measurements (fluctuations in ITD, fluctuations in ILD, and IACC), two sections can be differentiated that relate to the presence or absence of a direct sound. For instance, the fluctuations in ITD build up gradually over time and then become almost constant until the direct sound ends (at approximately 1 s), at which point they become more erratic. A similar trend can be seen with the fluctuations in ILD, though in this case as the direct sound ends the fluctuations begin to decay at the same point as they become more erratic. Finally, the IACC measurement drops rapidly from a value of 1 and fluctuates until reaching an almost constant value from approximately 0.5 s to approximately 1 s. As for the other measurements, when the direct sound ends the result again varies erratically. This is in agreement with the findings of Yanagawa et al. described above.

The fact that the effect of the direct sound can be seen in the measurement results is an important difference between the two stimulus types. As discussed in the previous sections, the result will be different if there is an interaction between the direct sound and/or the reflections, either within a measurement window or acoustically. The measured results of the TST signal are different to those of the impulse response because there is an acoustical interaction in the former that is not present in the latter.

As discussed in the previous section, the comparison is fairer if the two signals have a similar bandwidth. Therefore the impulse response was filtered to have a similar bandwidth to the TST signal, as described previously, and the analysis was similar to that in the previous example.
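The bandwidth matching described above can be sketched as follows. This is only an illustration under stated assumptions: the filter used in the paper is not reproduced here, so a brick-wall FFT mask stands in for it, and the band edges (470 to 530 Hz), the sample rate and the helper names `bandlimit` and `three_sine_tones` are all hypothetical choices made for this sketch. Only the tone frequencies of 480, 500 and 520 Hz come from the text.

```python
import numpy as np

def bandlimit(signal, fs, low_hz, high_hz):
    """Brick-wall band-pass: zero every FFT bin outside [low_hz, high_hz].

    Assumption: a stand-in for the paper's filter, which is not specified here.
    """
    signal = np.asarray(signal, dtype=float)
    spectrum = np.fft.rfft(signal)
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / fs)
    mask = (freqs >= low_hz) & (freqs <= high_hz)
    return np.fft.irfft(spectrum * mask, n=len(signal))

def three_sine_tones(fs, dur_s, freqs_hz=(480.0, 500.0, 520.0)):
    """Sum of three equal-level sine tones (the TST frequencies from the text)."""
    t = np.arange(int(round(dur_s * fs))) / fs
    return sum(np.sin(2.0 * np.pi * f * t) for f in freqs_hz) / len(freqs_hz)

fs = 8000                      # assumed sample rate for the sketch
impulse = np.zeros(fs)
impulse[0] = 1.0               # wideband impulse
narrow_impulse = bandlimit(impulse, fs, 470.0, 530.0)  # assumed band edges
tst = three_sine_tones(fs, 1.0)
```

After this step both stimuli occupy roughly the same frequency region, so any remaining difference between their measured IACC values cannot be attributed to bandwidth alone.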


[Fig. 9 panels. Left column: "Impulse response captured by binaural receiver", time axis 0 to 1 s. Right column: "Three sine tone signal captured by binaural receiver", time axis 0 to 2 s. Rows from top to bottom: left ear signal, right ear signal, ITD (ms), ILD, IACC.]

Fig. 9 Plot of the left and right ear signals respectively, together with measurements of the fluctuations in interaural time difference (ITD) and interaural level difference (ILD), and the variation in the resulting interaural cross-correlation coefficient (IACC) over time for an impulse response from a source to a binaural receiver in an acoustical environment in the left hand plots, and a signal consisting of sine tones at 480, 500 and 520 Hz convolved with the impulse response in the right hand plots.
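A running IACC of the kind plotted in the bottom row can be sketched as below. The function `iacc` follows Eqs. (1) and (2): the normalised cross-correlation is evaluated for lags within ±1 ms and the maximum absolute value is taken. The window length (50 ms), hop size (10 ms) and function names are assumptions made for this illustration; the measurement procedure actually used for the figure is the one described in section 3, which is not reproduced here.

```python
import numpy as np

def iacc(left, right, fs, max_lag_ms=1.0):
    """Maximum absolute normalised cross-correlation for lags within +/- max_lag_ms."""
    left = np.asarray(left, dtype=float)
    right = np.asarray(right, dtype=float)
    n = len(left)
    max_lag = int(round(max_lag_ms * 1e-3 * fs))
    norm = np.sqrt(np.sum(left ** 2) * np.sum(right ** 2))
    if norm == 0.0:
        return 0.0  # silent window: correlation undefined, report 0
    best = 0.0
    for lag in range(-max_lag, max_lag + 1):
        if lag >= 0:
            num = np.dot(left[:n - lag], right[lag:])   # sum of x[t] * y[t + lag]
        else:
            num = np.dot(left[-lag:], right[:n + lag])
        best = max(best, abs(num) / norm)
    return best

def running_iacc(left, right, fs, win_ms=50.0, hop_ms=10.0):
    """IACC in successive windows; window and hop sizes are assumed values."""
    win = int(round(win_ms * 1e-3 * fs))
    hop = int(round(hop_ms * 1e-3 * fs))
    starts = range(0, len(left) - win + 1, hop)
    times = np.array([s / fs for s in starts])
    values = np.array([iacc(left[s:s + win], right[s:s + win], fs) for s in starts])
    return times, values
```

Calling `running_iacc(left_ear, right_ear, fs)` on a pair of equal-length ear signals returns the window start times and one IACC value per window, which can then be plotted against time.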

The results are shown in Fig. 10.

Again, it can be seen that the results of all the measurements are different for the filtered impulse response and the TST signal. In this case, it appears that the fluctuations in ITD and ILD are less erratic compared to the measurements of the wideband impulse response, but that the IACC varies more erratically, and does not settle to an almost constant value.

From these results, it can be concluded that measurements made of an impulse response are likely to be different to those made of a stationary signal, such as the TST example shown above. In addition, there appears to be no clear relationship between the two sets of results, which means that it will be difficult to predict the results of one based on the results of the other.

4.1. Summary

This section considered the effect of more complex acoustical environments on the measurements of spatial impression made of an impulse response from a source to a receiver, or a source signal that has been convolved with that impulse response. The analysis was expanded to include a complete binaural impulse response from a source to a receiver in a simulation of a simple rectangular concert hall. The running measurements showed very different results between the two stimulus types due to the lack of interaction of the direct sound with the reflections in the case of the impulse response. It was noted that in this case this factor is more important, due to the longer duration of the impulse response.

It was concluded that as the results of the measurements made of the impulse and of the TST signal were so different, measurements of one may not accurately relate to the perceived spatial impression of the other, and it would be difficult to use the results of a measurement of one of the stimuli to predict the results of the other.
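The difference in acoustical interaction can be illustrated with a deliberately simplified single-channel sketch, which is not the room simulation used in the paper. A path consisting of a direct sound plus one reflection is driven first by an impulse and then by a 500 Hz sine tone. The sample rate, reflection gain, delays and function names are assumptions chosen so that the arithmetic is easy to follow.

```python
import numpy as np

FS = 48000  # sample rate in Hz (assumed for this sketch)

def direct_plus_reflection(signal, delay_samples, gain):
    """Pass a signal through a direct path plus one delayed, scaled reflection."""
    h = np.zeros(delay_samples + 1)
    h[0] = 1.0                 # direct sound
    h[delay_samples] += gain   # single reflection
    return np.convolve(signal, h)

def overlap_peak(freq_hz, delay_samples, gain, dur_s=0.1):
    """Peak level of a sine tone while direct sound and reflection overlap."""
    t = np.arange(int(round(dur_s * FS))) / FS
    tone = np.sin(2.0 * np.pi * freq_hz * t)
    out = direct_plus_reflection(tone, delay_samples, gain)
    return np.max(np.abs(out[delay_samples:len(tone)]))

# Impulse: the direct sound has ended before the reflection arrives,
# so the two arrivals never overlap.
impulse = np.zeros(256)
impulse[0] = 1.0
arrivals = np.flatnonzero(direct_plus_reflection(impulse, 48, 0.5))

# Continuous tone: direct sound and reflection overlap and interfere.
cancel = overlap_peak(500.0, 48, 0.5)      # reflection half a period late (1 ms)
reinforce = overlap_peak(500.0, 96, 0.5)   # reflection one full period late (2 ms)
```

With the impulse, the output contains two separate arrivals at samples 0 and 48. With the tone, the overlapping direct sound and reflection sum to a level of |1 + 0.5·exp(−j2πfτ)|, which is 0.5 for the 1 ms delay and 1.5 for the 2 ms delay. In a binaural measurement the delays differ at each ear, so this interaction changes the two ear signals differently.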


[Fig. 10 panels. Left column: "Impulse response captured by binaural receiver", time axis 0 to 1 s. Right column: "Three sine tone signal captured by binaural receiver", time axis 0 to 2 s. Rows from top to bottom: left ear signal, right ear signal, ITD (ms), ILD, IACC.]

Fig. 10 Plot of the left and right ear signals respectively, together with measurements of the fluctuations in interaural time difference (ITD) and interaural level difference (ILD), and the variation in the resulting interaural cross-correlation coefficient (IACC) over time for a narrowband filtered impulse response from a source to a binaural receiver in an acoustical environment in the left hand plots, and a signal consisting of sine tones at 480, 500 and 520 Hz convolved with the impulse response in the right hand plots.

5. DISCUSSION

This study has only used a limited range of stimuli: a wideband impulse, narrowband filtered impulses, or repetitive and continuous (i.e. stationary) narrowband signals based on sine tones. A further type of signal that has not yet been considered is a continuous and non-repetitive (i.e. nonstationary) signal that could be either wideband or narrowband, such as noise. Noise can be considered to be a random signal, and if this is passed through one of the acoustical systems used in the analysis above, the variations in the properties of the signal over time mean that the interaction at the ears of the listener will also vary continuously and randomly. This will result in random temporal variations in ITD, ILD and IACC. For this type of stimulus it will be difficult to predict the resulting value unless the short-term spectral properties of the signal are known.

The most important type of signal that has not yet been considered is a musical signal. As the measurements of spatial impression are intended to predict the perception of musical signals within a concert hall, it is worth considering what properties a musical signal will have and how these will be perceived. The majority of musical signals are tonal and relatively continuous; as noted by Barron, impulsive sounds are rare in the normal experience of concert hall acoustics [24]. In addition, McManus et al. observed that for a majority of musical notes, any impulsive elements that may be present are at the onset of the note and this is followed by a sustained tonal segment [25]. When these are produced in an acoustical environment, it is highly unlikely that the human perceptual system can process the signals arriving at the ears to determine a detailed impression of the impulse response of the room. Therefore, the spatial impression must be derived from the physical properties of the musical signal that arrives at the ears, and in order to predict this perception accurately, the measurement result must be similar to that for a source signal that is representative of the programme material that will be auditioned in that acoustical environment.


As musical signals consist mainly of components that are more continuous and tonal, they are more similar to the TST signal than to the impulsive signal used in the analysis above. However, musical signals are unlikely to be as repetitive as the sine tone signals and therefore they can be interpreted as being somewhere between the sine tone signals (that are continuous and repetitive) and the noise signals (that are continuous and vary randomly). Previous analysis has indicated that a resonating note or chord, such as a plucked acoustic guitar chord or a sustained piano note, has properties closer to a stationary signal after the initial transient attack [20]. On the other hand, a sustained note or chord that is continuously excited, such as a bowed cello note or sustained trumpet note, is likely to vary somewhat over time due to variations in the excitation [20]. Whilst the spectral content of either of the latter will not vary as much or as randomly as a noise signal, the variations that are present will cause fluctuations in the ITD, ILD and IACC that may be difficult to predict.

The fact that the source signal has such a large effect on the measured IACC in an acoustical environment causes a problem for making perceptually relevant measurements of spatial impression. Ideally, the measurement should predict the perception for all of the programme material that will be produced in the acoustical environment that is being measured. There is then a compromise to be made between using a small number of test signals to keep the measurement process practically short, and the need for the results to be applicable to as wide a range of signal types as possible.

There are a number of potential solutions to this problem. For instance, a relatively simple approach would be to select a number of single notes or chords from a range of different musical instruments, perhaps grouped into categories based on the method of sound production (e.g. bowed string instruments, plucked string instruments, single reed woodwind, double reed woodwind, brass, etc.). By selecting one or more recordings from each category and measuring these within a concert hall, it may be possible to generate results that are representative for each type of musical instrument.

An alternative approach would be to create a range of synthetic test signals that represent a range of musical instruments. This could be achieved by determining the properties (such as attack, decay, sustain and release times [26], and pitch, spectral content and their variation over time [27]) of a number of musical signals, perhaps categorised by the method of sound production as discussed above. This would enable the synthesis of a number of artificial signals with similar properties to the musical signals, which could again be measured within a concert hall to generate results that are representative for each type of musical instrument.

A final approach would be to create a single test signal whose properties (such as those mentioned above) vary over time to cover the full range of values that may be generated by real musical instruments. If this is reproduced within a concert hall, the resulting measurement could then be used to predict the range of spatial impression that is possible within that hall from a wide range of musical instruments. Additionally, if temporal segments of the time-varying test signal have properties that are similar to one or more categories of musical instrument, then these would give results that are relevant to each category of musical instrument in addition to the overall range of variation.

Further research is required to determine the detailed temporal and spectral properties of a wide range of musical signals, in order that practical and representative test signals can be created.

Certainly, it is clear that measurements made of an impulse response are not satisfactory for predicting the perceived spatial impression of musical signals, due to the fact that no information is gained about the interactions between the direct sound and/or the reflections of the acoustical environment.

6. CONCLUSIONS

This paper investigated the cause of interaural decorrelation for two types of stimulus (impulses and narrowband continuous signals), when they are passed through various acoustical systems. It was demonstrated that the decorrelation can be caused by frequency dependent interaural time differences (ITDs) and interaural level differences (ILDs), and by variations in these over time. It was also shown that the measured result can depend on the properties of the source signal. Firstly, the spectral detail of the signal can affect the result due to differing frequency dependent interactions in each ear. Secondly, the duration of the signal can affect the result due to a presence or absence of an acoustical interaction between the direct sound and/or the reflections.

Wideband impulsive signals and continuous tonal signals differ in both duration and spectral content, and it was confirmed that IACC measurements made of impulse responses and continuous tonal signals that have been convolved with the impulse responses can differ greatly. The properties of musical signals were also considered, and it was concluded that they are more similar to continuous tonal signals than impulsive signals. Consequently, it was determined that it is not possible to predict the spatial impression of musical signals in concert halls from IACC measurements made of the corresponding impulse responses.

It was found that the most accurate prediction of the perceived spatial impression of musical signals in a concert hall will be achieved by the measurement of signals with

temporal and spectral properties that are similar to common musical signals. It was therefore proposed that further research is required to develop a set of test signals in order to be able to undertake practical measurements that relate to as many types of musical signals as possible.

ACKNOWLEDGEMENTS

This work was supported by the Engineering and Physical Sciences Research Council (EPSRC), UK, grant GR/R55528/01.

REFERENCES

[1] L. Beranek, Concert and Opera Halls: How They Sound (Acoustical Society of America, Woodbury, 1996).
[2] W. de V. Keet, "The influence of early lateral reflections on the spatial impression," Proc. 6th ICA, pp. E53–E56 (1968).
[3] M. Morimoto and Z. Maekawa, "Auditory spaciousness and envelopment," Proc. 13th ICA, pp. 215–218 (1989).
[4] Y. Ando, Architectural Acoustics: Blending Sound Sources, Sound Fields, and Listeners (Springer-Verlag, New York, 1998).
[5] T. Hidaka, L. L. Beranek and T. Okano, "Interaural cross-correlation, lateral fraction, and low- and high-frequency sound levels as measures of acoustical quality in concert halls," J. Acoust. Soc. Am., 98, 988–1007 (1995).
[6] M. Morimoto and K. Iida, "A practical evaluation method of auditory source width in concert halls," J. Acoust. Soc. Jpn. (E), 16, 59–69 (1995).
[7] R. Mason, T. Brookes and F. Rumsey, "Development of the interaural cross-correlation coefficient into a more complete auditory width prediction model," Proc. 18th ICA, Vol. IV, pp. 2453–2456 (2004).
[8] M. Barron and A. H. Marshall, "Spatial impression due to early lateral reflections in concert halls: the derivation of a physical measure," J. Sound Vib., 77, 211–232 (1981).
[9] J. S. Bradley and G. A. Soulodre, "The influence of late arriving energy on spatial impression," J. Acoust. Soc. Am., 97, 2263–2271 (1995).
[10] R. I. Chernyak and N. A. Dubrovsky, "Pattern of the noise images and the binaural summation of loudness for the different interaural correlation of noise," Proc. 6th ICA, pp. A53–A56 (1968).
[11] J. Blauert and W. Lindemann, "Spatial mapping of intracranial auditory events for various degrees of interaural coherence," J. Acoust. Soc. Am., 79, 806–813 (1986).
[12] International Organization for Standardization, "Acoustics - measurement of the reverberation time of rooms with reference to other acoustical parameters," ISO 3382 (1997).
[13] R. Mason, T. Brookes and F. Rumsey, "The perceptual relevance of extant techniques for the objective measurement of spatial impression," Proc. Inst. Acoust., Auditorium Acoust. Conf., 24, part 4, no. 3, pp. 1–9 (2002).
[14] S. E. Boehnke, S. E. Hall and T. Marquardt, "Detection of static and dynamic changes in interaural correlation," J. Acoust. Soc. Am., 112, 1617–1626 (2002).
[15] K. Iida and M. Morimoto, "Basic study on sound field simulation based on running interaural cross-correlation," Appl. Acoust., 38, 303–317 (1993).
[16] D. Griesinger, "The psychoacoustics of apparent source width, spaciousness and envelopment in performance spaces," Acustica, 83, 721–731 (1997).
[17] J. Blauert and W. Lindemann, "Auditory spaciousness: Some further psychoacoustic analyses," J. Acoust. Soc. Am., 80, 533–542 (1986).
[18] J. Blauert, "On the lag of lateralization caused by interaural time and intensity differences," Audiology, 11, 265–270 (1972).
[19] W. D. Grantham and F. L. Wightman, "Detectability of varying interaural temporal differences," J. Acoust. Soc. Am., 63, 511–523 (1978).
[20] R. Mason, "Elicitation and measurement of auditory spatial attributes in reproduced sound," PhD Thesis, University of Surrey (2002).
[21] D. Griesinger, "IALF - binaural measures of spatial impression and running reverberance," Aud. Eng. Soc. Preprint, 3292 (1992).
[22] R. Mason, T. Brookes and F. Rumsey, "Creation and verification of a controlled experimental stimulus for investigating selected perceived spatial attributes," Aud. Eng. Soc. Preprints, 5771 (2003).
[23] H. Yanagawa, Y. Yamasaki and T. Itow, "Effect of transient signal length on crosscorrelation functions in a room," J. Acoust. Soc. Am., 84, 1728–1733 (1988).
[24] M. Barron, "Spatial impression and envelopment in concert halls," Proc. Inst. Acoust., 21, 163–170 (1999).
[25] J. A. McManus, C. Evans and P. W. Mitchell, "The dynamics of recorded music," Aud. Eng. Soc. Preprints, 3701 (1993).
[26] A. Melka, "Messungen der Klangeinsatzdauer bei Musikinstrumenten (Measurements of tone-rise time of musical instruments)," Acustica, 23, 108–117 (1970).
[27] S. Ando and K. Yamaguchi, "Statistical study of spectral parameters in musical instrument tones," J. Acoust. Soc. Am., 94, 37–45 (1993).
