acoustics

Article Dynamic Spatial Responsiveness in Concert Halls

Evan Green * and Eckhard Kahle Kahle Acoustics, 1050 Brussels, Belgium * Correspondence: [email protected]; Tel.: +32-2-345-10-10

 Received: 30 May 2019; Accepted: 2 July 2019; Published: 22 July 2019 

Abstract: In musical , a proportion of the reflected sound energy arriving at the ear is not consciously perceived. Investigations by Wettschurek in the 1970s showed the detectability to be dependent on the overall loudness and direction of arrival of reflected sound. The relationship Wettschurek found between reflection detectability, listening level, and direction of arrival correlates well with the subjective progression of spatial response during a musical crescendo: from frontal at pianissimo, through increasing apparent source width, to a fully present room acoustic at forte. “Dynamic spatial responsiveness” was mentioned in some of the earliest research and recent work indicates that it is a key factor in acoustical preference. This article describes measurements of perception thresholds made using a binaural virtual acoustics system—these show good agreement with Wettschurek’s results. The perception measurements indicate that the subjective effect of reflections varies with overall listening level, even when the reflection level, , and direction relative to the direct sound are maintained. Reflections which are perceptually fused with the source may at louder overall listening levels become allocated to the room presence. An algorithm has been developed to visualize dynamic spatial responsiveness—i.e., which aspects of a three-dimensional (3D) Room Impulse Response would be detectable at different dynamic levels—and has been applied to measured concert hall impulse responses.

Keywords: auditorium acoustics; dynamics; spaciousness; spatial response; auditory perception

1. Introduction

1.1. (Dynamic) Spatial Responsiveness Musicians and conductors consider the concert hall part of their instrument—it is a tool for musical expression, with the best concert halls enhancing the range of expression. The deliberate variation of intensity—in other words, changes in musical dynamics—is a key ingredient in most musical composition and performance. Without both strong and subtle dynamic changes, much musical and emotional intensity is lost. Playing a musical instrument with more intensity generates not only an increase in objective sound level, but also changes in the frequency spectrum of the sound. The connection between musical dynamics and spatial impression was observed in some of the earliest research into psychoacoustics. Marshall [1] writes in 1966 that “as a property of the hall, [spatial impression] relates to loudness attributes ... for the listener, it generates a sense of envelopment in the sound and of direct involvement with it”, with similar observations being made by Keet [2] and Barron [3]. Writing in 1978, Kuhl [4] states that (translated from German) “[spatial impression] is only, if at all, present in forte passages. Musical dynamics therefore not only influence the loudness, but with a sufficient spatial responsiveness, they increase the involvement of the room in the music. For the lay-listener, who is perhaps not even aware of this effect, this spaciousness is an unconscious yet pleasant experience.” In certain conditions therefore, increases in dynamics can generate an additional dimension of spatial impression for the listener—changes in the spatial impression due to changes in dynamics will be referred to here as “dynamic spatial responsiveness”.

Acoustics 2019, 1, 549–560; doi:10.3390/acoustics1030031 www.mdpi.com/journal/acoustics AcousticsAcoustics 20192019,, 21 FOR PEER REVIEW 5502

Spatial impression, apparent source width (ASW), and listener envelopment (LEV) have been studiedSpatial in much impression, detail [3,5–10], apparent with source typical width approach (ASW),es using and listener either energy envelopment integrals (LEV) over haverelatively been longstudied time in intervals much detail (for [3example,,5–10], with “early” typical (0–80 approaches ms) or “late” using (0 either ms to energy infinity) integrals time ranges) over relatively and/or largelong timeangular intervals ranges (for (such example, as in “early”the definition (0–80 ms) of Lateral or “late” Fraction), (0 ms to infinity)or considering time ranges) spatial and effects/or large as staticangular and ranges unchanging (such aswith in theperformance definition dynamics. of Lateral Fraction),While valuable or considering results have spatial been e ffachieved,ects as static the formerand unchanging approach “bundles” with performance energy from dynamics. different While times valuable and directions results have and does been not achieved, take into the account former theapproach differing “bundles” sensitivity energy of our from hearing different to reflecti timesons and from directions different and directions, does not take and intonor accountis masking the factoreddiffering in: sensitivity when long of our time hearing ranges to and reflections large angu fromlar di ffrangeserent directions, are integrated, and nor the is (potentially masking factored very important)in: when long details time of ranges what happens and large in angular those rang rangeses remain are integrated, unclear. the The (potentially latter approach very important)overlooks thedetails importance of what happensof dynamics in those and rangesits importance remain unclear.in the subjective The latter acoustical approach quality overlooks of concert the importance halls. of dynamicsRecently, and the its research importance group in theat Aalto subjective Universi acousticalty has qualitytaken up of concertthe topic halls. of dynamics anew, demonstratingRecently, thethat researchdynamic group spatial at responsiveness Aalto University is related has taken to changes up the topicin instrument of dynamics frequency anew, responsesdemonstrating and directivity that dynamic at higher spatial dynamic responsiveness levels [11–14]: is related due to to greater changes binaural in instrument sensitivity frequency to high frequenciesresponses and at lateral directivity angles, at higheran increased dynamic spatial levels impression [11–14]: due results to greater from binauralthe additional sensitivity higher to harmonicshigh frequencies that are at generated lateral angles, when an instruments increased spatial are played impression loudly. results However, from little-known the additional research higher publishedharmonics by that Wettschurek are generated [15] when in instruments1978 demonst arerates played that loudly. the connection However, little-knownbetween dynamics, research loudness,published and by Wettschurek spatial impression [15] in 1978 is linked demonstrates to other thatfundamental the connection processes between in the dynamics, human loudness,hearing system,and spatial in particular impression how is reflections linked to from other differen fundamentalt directions processes are masked in the or human unmasked hearing at different system, overallin particular listening how (dynamic) reflections levels. from different directions are masked or unmasked at different overall listening (dynamic) levels. 1.2. Research by Wettschurek 1.2. Research by Wettschurek In Wettschurek’s thesis [15], he describes an experiment to determine the perception threshold In Wettschurek’s thesis [15], he describes an experiment to determine the perception threshold of of a test reflection dependent on overall listening levels. The experiment is shown diagrammatically a test reflection dependent on overall listening levels. The experiment is shown diagrammatically in in Figure 1. Figure1.

FigureFigure 1. 1. DiagrammaticDiagrammatic representation representation of of an an experiment experiment to to determine determine the the perception perception threshold threshold of ofa reflectiona reflection E Ein inthe the presence presence of of direct direct sound sound D, D, a alateral lateral reflection reflection R, R, and and reverberation H. H. After After WettschurekWettschurek [15]. [15].

AA synthetic soundsound fieldfield was was generated generated in anin anechoican anechoic environment environment consisting consisting of the of direct the sound,direct sound,a 2 s reverberation a 2 second reverberation starting at 50 starting ms, and at two 50 ms, discrete and two reflections: discrete one reflections: reflection one being reflection a fixed being lateral a fixedreflection lateral at reflection 60◦ azimuth at 60° and azimuth 30 ms delay, and 30 and ms the delay, other and a test the reflection other a test at 70 reflection ms and ofat variable70 ms and level of variableand variable level position.and variable The position direct sound,. The direct lateral sound, reflection, lateral and reflection, reverberation and reverberation levels were set levels to achieve were seta direct-to-reverberant to achieve a direct-to-reverberant ratio of 0 dB. Withratio anechoicof 0 dB. With speech anechoic used as speech source used material, as source the overall material, sound the overalllevel at sound the listening level at position the listening (“listening position level”) (“listening was adjusted level”) in was the rangeadjusted 20–80 in the dB inrange 5 dB 20–80 increments, dB in 5subsequently dB increments, the subsequently level of the the test level reflection of the wastest reflection adjusted towas establish adjusted the to perceptionestablish the threshold perception of thresholdthis reflection. of this reflection. The results of this experiment, shown in Figure 2, exhibit a number of notable features:

Acoustics 2019, 1 551

AcousticsThe 2019 results, 2 FOR of PEER this REVIEW experiment, shown in Figure2, exhibit a number of notable features: 3

• reflections from behind (“Hinten”) have the highest perception threshold at all listening levels; • reflections from behind (“Hinten”) have the highest perception threshold at all listening levels; • for all reflection directions, the perception threshold decreases almost linearly with increasing • for all reflection directions, the perception threshold decreases almost linearly with increasing listeninglistening levellevel untiluntil approximatelyapproximately 4040 dB;dB; • at listening levels higher than 40 dB, the sensitivity to reflections from the front (“Vorne”) begins • at listening levels higher than 40 dB, the sensitivity to reflections from the front (“Vorne”) begins toto plateau.plateau. Sensitivity to reflections reflections from behind (Hinten) plateaus at levels above above approximately approximately 6060 dB;dB; • aboveabove 4040 dB dB however, however, the sensitivitythe sensitivity to reflections to refle fromctions the sidefrom continues the side to increasecontinues approximately to increase • linearlyapproximately with listening linearly level; with listening level; • byby 8080 dB,dB, thethe sensitivitysensitivity to reflectionsreflections from the side is almostalmost 10 dBdB greatergreater thanthan forfor reflectionsreflections • fromfrom thethe frontfront oror rear,rear, whichwhich atat thisthis listeninglistening levellevel havehave anan almostalmost equalequal sensitivitysensitivity ofof 8 dB below thethe directdirect soundsound level.level.

Figure 2.2. Perception threshold relative to the direct soundsound for a 70 ms reflectionreflection as a functionfunction of the sound pressure levellevel atat thethe listeninglistening positionposition forfor threethree directionsdirections ofof arrival:arrival: RearRear (“Hinten”,(“Hinten”, azimuthazimuth

180180°),◦), frontfront (“Vorne”,(“Vorne”, azimuth azimuth 0◦ )0°) and and Side Side (“Seite”, (“Seite”, azimuth azimuth 90◦ )90°) (After (After Wettschurek Wettschurek [15] Figure [15] (Figure 1.11)). 1.11)). Wettschurek’s results provide a strong basis for understanding the increase of spatial impression Wettschurek’s results provide a strong basis for understanding the increase of spatial impression with increasing listening level, independent of any changes in the source spectrum: with increasing listening level, independent of any changes in the source spectrum: • At the lowest listening levels, most reflections would not be perceived and room presence would • At the lowest listening levels, most reflections would not be perceived and room presence would bebe extremelyextremely weakweak oror nonexistent.nonexistent. The direct sound would dominate with the sources on stage beingbeing clearlyclearly localizable;localizable; • As overall listening level increases through a crescendo, reflections from the front and side (having • As overall listening level increases through a crescendo, reflections from the front and side a(having lower thresholda lower threshold than those than from those the from rear) wouldthe rear be) would the first be to the be first perceived, to be perceived, with the expectedwith the subjectiveexpected subjective effect being effect an initial being increase an initial in theincrease apparent in the source apparent width source (ASW), width as observed (ASW), and as subsequentlyobserved and measuredsubsequently by Keet measured [2]; by Keet [2]; • As overall listening level increases further, the perception threshold for reflections from all • As overall listening level increases further, the perception threshold for reflections from all directionsdirections ofof arrivalarrival continuescontinues toto decrease.decrease. As a result, additional,additional, relatively quiet reflectionsreflections (e.g.,(e.g., fromfrom behind)behind) becomebecome perceivableperceivable andand roomroom presencepresence increases—thisincreases—this correspondscorresponds withwith listeninglistening experienceexperience inin concertconcert hallshalls andand thethe commentscomments ofof MarshallMarshall [[1],1], KuhlKuhl [[4],4], andand others;others; • AboveAbove aa listeninglistening level of 60 dB, the thresholds for reflectionsreflections from the front and rear plateau, • whilewhile thethe threshold threshold for for side side reflections reflections continues continues to decrease.to decrease. If reflection If reflection paths paths from from the side the exist,side thenexist, these then should these should become become increasingly increasingly perceivable perceivable as overall as listening overall level listening increases, level resulting increases, in aresulting subjective in increasea subjective in room increase presence. in room presence. Wettschurek refers to this journey through a musical crescendo as the room “waking up”. However, this assumes that similar perception threshold relationships exist for all reflection delays and for music as well as speech. Measurements to corroborate this were not published by Wettschurek.

Acoustics 2019, 1 552

Wettschurek refers to this journey through a musical crescendo as the room “waking up”. However, this assumes that similar perception threshold relationships exist for all reflection delays and for music Acoustics 2019, 2 FOR PEER REVIEW 4 as well as speech. Measurements to corroborate this were not published by Wettschurek.

2. Reflection Reflection Thresholds with Speech and MusicMusic

2.1. Experiment with Speech In orderorder toto understand understand how how perception perception thresholds thresholds for for reflections reflections may may vary vary for di forfferent different delay times,delay directionstimes, directions of arrival, of arrival, and for and music, for Wettschurek’smusic, Wettschurek’s experiment experiment was duplicated was duplicated using a binauralusing a binaural method. Themethod. sound The field sound shown field in Wettschurek’sshown in Wettschurek’s experiment ex (Figureperiment1) was (Figure duplicated 1) was in duplicated a virtual Ambisonics in a virtual systemAmbisonics programmed system programmed using MaxMSP using and MaxMSP SPAT [16 ]an softwared SPAT and[16] then software reproduced and then binaurally reproduced over open-backedbinaurally over headphones open-backed using headphones the subject’s using own th measurede subject’s head-relatedown measured transfer head-related function. transfer Head trackingfunction. wasHead used tracking in all was experiments. used in all Measurements experiments. Measurements of the overall of listening the overall level listening were made level under were themade headphones under the usingheadphones a calibrated using sounda calibrated level meter. sound level meter. In a first first step of the experiment, experiment, participants were asked to increase the level of the test reflection reflection until an audible didifferencefference in the sound quality occurred (this could be any change in timbre, spaciousness, loudness, loudness, etc.). etc.). In In the the second second step, step, the the reflection reflection was was first first set setto be to clearly be clearly audible audible and andthen thenreduced reduced in level in level until until the the presence presence of ofthe the reflection reflection could could not not be be detected. The thresholds determined via via these these two two steps steps were were then then averaged, averaged, but butwere were generally generally within within 1–2 dB 1–2 of dBeach of other. each other.Two participants Two participants carried carriedout the outfollowing the following experiment experiments,s, one with one significant with significant experimental experimental listening listeningexperience, experience, the other the with other lesser with lesserexperience. experience. The results The results shown shown are arethe the average average of of the the two participants—as has been found in previous thresholdthreshold experiments, the didifferencesfferences found betweenbetween participants were small, of the order 2–3 dB. AlthoughAlthough the number of subjects was small, Barron [3], [3], for example,example, has has previously previously found found that that using using small smal numbersl numbers of subjects of subjects (sometimes (sometimes only twoonly [3 two]) yields [3]) validyields results. valid results. For the first first experiment, experiment, it it was was attempted attempted to to re reproduceproduce Wettschurek’s Wettschurek’s experiment experiment as asclosely closely as aspossible possible using using anechoic anechoic speech speech and and a a70 70 ms ms delayed delayed test test reflection reflection.. The The resulting resulting thresholds are shown in Figure3 3,, withwith Wettschurek’sWettschurek’s measurements measurements (as (as per per Figure Figure2 )2) adjacent adjacent and and to to the the same same scale scale for reference.

Figure 3. ReflectionReflection perception thresholds (reflection (reflection level re. direct direct sound) for speech for a 70 ms delayed reflection.reflection. Left: measured using binaural reproduction. Right: measured using a loudspeaker system (after Wettschurek [[15]).15]).

Although thethe perceptionperception thresholds thresholds with with binaural binaural reproduction reproduction are are lower lower than than those those measured measured by Wettschurek,by Wettschurek, the the result result is encouraging, is encouraging, as the as generalthe general relationship relationsh betweenip between the frontal,the frontal, rear, rear, and sideand reflectionsside reflections is similar. is similar. There There are, are, however, however, clear clear di ffdierences:fferences: in in the the binaural binaural case, case,the the rearrear reflectionreflection threshold follows the front reflectionreflection closely—thisclosely—this could be attributed to known issues of medianmedian plane confusion with binaural reproduction. Furthermore, it is known that the source material can have a significant effect on measured thresholds, and although anechoic speech was used in both measurements, the source material was different. Nevertheless, it is clear that the thresholds determined using binaural reproduction show trends similar to the (arguably more reliable) loudspeaker tests. For reference, the absolute level of the test reflection, when listened to on its own,

Acoustics 2019, 1 553 plane confusion with binaural reproduction. Furthermore, it is known that the source material can have a significant effect on measured thresholds, and although anechoic speech was used in both measurements, the source material was different. Nevertheless, it is clear that the thresholds determined Acousticsusing binaural 2019, 2 FOR reproduction PEER REVIEWshow trends similar to the (arguably more reliable) loudspeaker tests.5 For reference, the absolute level of the test reflection, when listened to on its own, was in all cases wassignificantly in all cases above significantly the threshold above of hearing, the threshold therefore of thehearing, measured therefore perception the measured threshold perception is an effect thresholdof masking. is an effect of masking.

2.2. Reflection Reflection Perception Thresholds with Music Experiments were carried out to determine reflect reflectionion perception thresholds for music using an anechoic recording of solo cello. This source materi materialal was chosen for its even sound level throughout the recordingrecording and and legato legato musical musical expression expression to contrast to contrast with with the consonant-rich, the consonant-rich, more impulsivemore impulsive speech speechsource usedsource in used the previous in the previous test. The test. experimental The experimental procedure procedure and participants and participants were as described were as describedabove, and above, tests wereand tests made were for reflectionsmade for reflection delayed bys delayed 70, 50, andby 70, 40 50, ms. and All 40 other ms. settingsAll other including settings includingthe fixed lateral the fixed reflection, lateral reverberation,reflection, reverberation, and direct-to-reverberant and direct-to-reverberant ratio were maintained ratio were as maintained previously. Theas previously. results are shownThe results in Figure are 4shown below, in with Figure the speech4 below, measurement with the speech for 70 measurement ms shown in grey for in70 thems shownleft panel. in grey in the left panel.

Figure 4. ReflectionReflection perception thresholds (reflection (reflection level re. direct sound) for music, dependent on reflectionreflection delay,delay, reflection reflection direction, direction, and and overall overall listening listening level. Forlevel. reference, For reference, the binaurally the binaurally measured measuredreflection thresholdsreflection thresholds for speech fo andr speech 70 ms and delay 70 time ms delay are shown time are in grey shown on thein grey left panel.on the left panel.

Figure4 4 shows shows that, that, as as one one might migh expect,t expect, the the 70 70 ms ms perception perception thresholds thresholds are are in general in general higher higher for formusic music than than for speech, for speech, reflecting reflecting the longer the longer “cognitive “cognitive integration integration time” for time” musical for sourcemusical material. source Thematerial. 70 ms The reflection 70 ms reflection thresholds thresholds for side reflections for side reflections are nevertheless are nevertheless 4–5 dB below 4–5 those dB below for frontal those and for frontal6–7 dB belowand 6–7 those dB below for rear those directions. for rear For directions. the higher For listening the higher levels, listening these levels, are slightly theseless are thanslightly for lessspeech, than but for the speech, difference but the at low difference listening at levels low listening (under 40 levels dBA) (under is greater 40with dBA) music is greater than withwith speech.music Atthan 50 with ms delayspeech. time, At 50 the ms threshold delay time, for sidethe threshold reflections for is side almost reflections identical is toalmost that at identical 70 ms, whileto that the at 70thresholds ms, while for the frontal thresholds and rear for reflectionsfrontal and are rear higher. reflections In general, are higher. the sensitivityIn general, to the side sensitivity reflections to sideincreases reflections by 6–7 increases dB over theby range6–7 dB of over listening the range levels of measured listening (approx. levels measured 25–75 dB), (approx. while the 25–75 increase dB), whilein sensitivity the increase for frontal in sensitivity and rear for reflections frontal and is 4 dBrear at reflections most, with is an 4 increasedB at most, of only with 1–2 an dBincrease at 40 ms of onlydelay 1–2 time. dB at 40 ms delay time.

2.3. Subjective Effects of Changing Listening Level While carrying out the perception threshold measurements, it was clear that reflections at different delay times and directions of arrival had very different subjective effects, but also that the overall listening level had an influence on the subjective quality, even when the reflection level was close to the perception threshold. Three main changes in the quality of the sound were perceived: (1) tone coloration, (2) changes to the source spatial extent (reflection experienced as “fused” with the

Acoustics 2019, 1 554

2.3. Subjective Effects of Changing Listening Level While carrying out the perception threshold measurements, it was clear that reflections at different delay times and directions of arrival had very different subjective effects, but also that the overall listening level had an influence on the subjective quality, even when the reflection level was close to the perception threshold. Three main changes in the quality of the sound were perceived: (1) tone coloration, (2) changes to the source spatial extent (reflection experienced as “fused” with the source), and (3) changes to the room spatial impression (reflection experienced as separate to the source). It has long been known that one of the primary effects of reflections with short delay times is to change the tone color [3]. This was indeed the case in this experiment: when listening for whether a short-delay-time reflection was perceptible, “comb filtering” effects were often the primary quality that was identifiable. The comb filtering effect was strongest for reflections from the front and rear, while for a reflection from the side, depending on delay and listening level, tone coloration would occur simultaneously with a change in source spatial extent. At 70 ms delay time, the subjective effect of the test reflection, once it became perceptible, was generally a change in spatial extent or spatial impression for side reflections and perceived distance for frontal and rear reflections. Most interesting, however, was the change in subjective impression of a 70 ms reflection with overall listening level. At a certain loudness level, the reflection could switch from “belonging to the source” to “belonging to the room”. There is increasing evidence to support the idea that our auditory system segregates the auditory scene into various “cognitive auditory streams”: a simple description of listening in musical concerts is that the auditory scene is perceived as a source (foreground) stream and a room (background) stream [17]. This experiment indicates that there are complex factors at play as to whether a sound is allocated to the source stream or room stream (or indeed to both streams, or no stream at all when below threshold): not only is delay time a factor (this experiment and others indicate that there is not a simple cut-off time at 80 ms), but also reflection direction of arrival and overall listening level. Depending on the particular combination of these factors, sounds arriving before 80 ms may not necessarily be fused into the source stream but could be segregated into the room stream. Since our subjective impressions of musical clarity, proximity, and intimacy seem to relate to aspects of stream segregation—as well as to our ability to segregate sound energy into streams at all—it is important to gain a better understanding of these relationships and auditory/cognitive processes. It should also be noted that the number of participants in the experiment described here was very small, so more research is required to understand this effect in more detail and for larger groups of listeners. Another question that arises from this measurement is that of the effect of the “unperceived” sound energy. As mentioned above, the reflected energy which is below the perception threshold is above the absolute threshold of hearing (it is clearly audible once the direct sound and other reflected energy is switched off in the experiment). Does this “unperceived” sound energy have other subjective effects that are not picked up in this experiment, or might our hearing and cognitive system consider it to simply be noise? One conclusion that can be drawn from the above tests with music is that, if lateral reflections are present, then as overall listening level increases one would perceive more lateral sound and the room spatial impression should change in response to the dynamics, resulting in an additional spatial dimension coupled to changes in loudness. However, if lateral reflections are weak or absent, the small differences in perception threshold with listening level for frontal and rear reflections indicate that the room spatial impression would tend to remain static: the subjective perception of the room acoustic would be less responsive to musical dynamics. Whether a dynamic or more static room presence is preferable, and whether this has a significant bearing on the overall assessment of acoustical quality is, to an extent, a matter of taste—experimental results from a much larger number of participants than were used in these experiments would be required to gain sufficient insight into this matter. Acoustics 2019, 1 555

3. Detecting Dynamic Spatial Responsiveness Typical objective measures related to spatial impression such as the lateral fraction (LF) and interaural cross-correlation IACC integrate lateral sound energy over a time window and so do not take into account the specific time, level, and direction of arrival of sound energy. In order see whether components of the sound field that may contribute to dynamic spatial responsiveness can be detected and visualized, and to compare measurements from different concert halls, a filter algorithm has been developedAcoustics 2019 based, 2 FOR onPEER the REVIEW thresholds measured above. 7

3.1. Dynamic Spatial Response Filter The dynamic spatial response filter filter algorithm uses as its input measuredmeasured first-orderfirst-order ambisonic B-format 3D room impulse responses (3DRIR) and pr processesocesses these to reveal only the reflected reflected sound energy that would be subjectivelysubjectively perceived at aa givengiven listeninglistening level.level. Although concert halls and music are of primary interest in this paper, since Wettschurek’s experiments used a larger number of subjects andand cancan therefore therefore be be considered considered to to be be more more reliable, reliable, these these reflection reflection perception perception thresholds thresholds have beenhave usedbeen inused the in filter. the Tofilter. determine To determine the thresholds the thre forsholds delay for times delay other times than other 70 ms than and 70 for ms directions and for ofdirections arrival otherof arrival than those other measured than those by Wettschurek,measured by trends Wettschurek, in our own trends binaural in measurementsour own binaural and interpolationmeasurements have and beeninterpol used.ation have been used. The filteringfiltering processprocess isis shown shown diagrammatically diagrammatically in in Figure Figure5. Firstly,5. Firstly, the the X, Y,X, and Y, and Z channels Z channels of the of first-orderthe first-order B-format B-format 3DRIR 3DRIR are used are toused calculate to calculate the direction the direction of arrival of(DOA; arrival azimuth (DOA; =azimuthθ, elevation = θ, =elevationΦ) for each = Φ sample) for each in the sample impulse in response—the the impulse algorithmsresponse—the developed algorithms at Aalto developed University at [ 18Aalto] are usedUniversity for this [18] step. are used for this step.

Figure 5. AlgorithmAlgorithm schematic schematic for a dynamic spatial response filter. filter.

The direct sound level L(0) is determined from the omnidirectional channel W for a user-selected The direct sound level L(0) is determined from the omnidirectional channel W for a user-selected window of w samples. Next, for each sample i of the impulse response, the sound level L(i) is window of w samples. Next, for each sample i of the impulse response, the sound level L(i) is evaluated evaluated by integrating the squared impulse response over a window of the same length w as used by integrating the squared impulse response over a window of the same length w as used for the direct for the direct sound, centered on the sample of interest i. Since it is desired to collect energy in the window that can be considered to constitute a coherent reflection, each sample n in the window is weighted according to how similar the DOA is when compared to the central sample i. Therefore, before integrating the energy in the window, a weighting g is applied to each sample n in the window according to its DOA relative to the DOA of the central sample in the window. Samples with the same DOA as sample i are given a weighting 1, while samples arriving from the opposite DOA are weighted 0, according to a cosine rule:

𝜃 𝜃 ∅ ∅ 𝑔𝑡 cos cos (1) 2 2

Acoustics 2019, 1 556 sound, centered on the sample of interest i. Since it is desired to collect energy in the window that can be considered to constitute a coherent reflection, each sample n in the window is weighted according to how similar the DOA is when compared to the central sample i. Therefore, before integrating the energy in the window, a weighting g is applied to each sample n in the window according to its DOA relative to the DOA of the central sample in the window. Samples with the same DOA as sample i are given a weighting 1, while samples arriving from the opposite DOA are weighted 0, according to a cosine rule:     θn θi ∅n ∅i g(t ) = cos − cos − (1) n 2 2

The DOA and delay time of the sample i are used to establish the threshold LTH by interpolating the Wettschurek data and our own measurements at different delay times. If the level in the window L(i) relative to the direct level L(0) is greater than the threshold LTH, i.e., if the sample i can be considered to contributeAcoustics to a2019 perceivable, 2 FOR PEER REVIEW reflection and is not masked, then the sample i is passed through8 to the filtered output impulse response W unchanged: The DOA and delay time of the sample i are used to establish the threshold LTH by interpolating the Wettschurek data and our own measurements at different delay times. If the level in the window

L(0) L(i) LTH(s, d, θ, φ) ; W(i, s) TH= W(i) (2) L(i) relative to the direct level− L(0)≥ is greater than the threshold L , i.e., if the sample i can be considered to contribute to a perceivable reflection and is not masked, then the sample i is passed Otherwisethrough the to samplethe filtered is output considered impulse to response be fully 𝑊 masked, unchanged: i.e., the output W is set to 0 for the sample i:

𝐿0 𝐿𝑖 𝐿𝑠, 𝑑, 𝜃, 𝜙 ; 𝑊 𝑖, 𝑠 𝑊𝑖 (2) L(0) L(i) < LTH(s, d, θ, φ) ; W(i, s) = 0 (3) Otherwise the sample is− considered to be fully masked, i.e., the output 𝑊 is set to 0 for the sample i: This is repeated for all listening levels of interest s to give a series of modified 3DRIRs, one for 𝐿0 𝐿𝑖 𝐿 𝑠, 𝑑, 𝜃, 𝜙 ; 𝑊 𝑖, 𝑠 0 (3) each listening level. This is repeated for all listening levels of interest s to give a series of modified 3DRIRs, one for 3.2. Resultseach from listening Measurements level. The dynamic3.2. Results spatial from Measurements response filter has been applied to 3DRIRs measured in the Nouveau Siècle concert hall inThe Lille, dynamic France. spatial This response hall startedfilter has lifebeen in applied the mid-20th to 3DRIRs Centurymeasured in as the a fan-shapedNouveau Siècle conference hall (Figureconcert6, left) hall and in Lille, was France. reconfigured This hall started in 2013, life in maintainingthe mid-20th Century the stage as a fan-shaped and floor conference rake, to create a hall (Figure 6, left) and was reconfigured in 2013, maintaining the stage and floor rake, to create a parallel-sided concert hall (Figure6, right). parallel-sided concert hall (Figure 6, right).

(a)

(b)

Figure 6. (a) Comparison plans of Nouveau Siècle, Lille. Left: fan-shaped conference hall before Figure 6. renovation.(a) Comparison Right: parallel-sided plans of Nouveau concert hall Siafterècle, renovation. Lille. Left: Measurement fan-shaped position conference for results in hall before renovation.Figures Right: 7 and parallel-sided 8 shown with red concert circle. ( hallb) Left: after Lille renovation. Nouveau Siècle Measurement before renovation. position Right: after for results in Figures7 andrenovation8 shown with with new side red walls, circle. balconies, ( b) Left: stage Lille enclosure, Nouveau and canopy. Siècle before renovation. Right: after renovation with new side walls, balconies, stage enclosure, and canopy. The fan-shaped conference hall exhibited subjective deficiencies for music uses, including a lack of reverberation, envelopment, and room presence, with the latter two attributed to an insufficient number and strength of lateral reflections. In the newly configured hall, the side walls were made parallel, while the addition of side balconies, soffits, and a modification of the ceiling design were intended to provide additional lateral reflections. Listening tests in the reconfigured hall indicate that envelopment and room presence have both been increased and that the acoustic now also exhibits greater dynamic spatial responsiveness. The question is, can these qualities be seen in the measured 3DRIRs?

Acoustics 2019, 1 557

The fan-shaped conference hall exhibited subjective deficiencies for music uses, including a lack of reverberation, envelopment, and room presence, with the latter two attributed to an insufficient number and strength of lateral reflections. In the newly configured hall, the side walls were made parallel, while the addition of side balconies, soffits, and a modification of the ceiling design were intended to provide additional lateral reflections. Listening tests in the reconfigured hall indicate that envelopment Acousticsand room 2019 presence, 2 FOR PEER have REVIEW both been increased and that the acoustic now also exhibits greater dynamic9 spatial responsiveness. The question is, can these qualities be seen in the measured 3DRIRs? Figure 77 showsshows measuredmeasured 3DRIRs3DRIRs forfor aa seatseat 1717 mm fromfrom thethe stagestage inin thethe mainmain seatingseating area.area. TheThe left-handleft-hand measurement measurement is before the renovation, and the right-hand plot after. The The 3DRIRs 3DRIRs are are viewed viewed fromfrom thethe top,top, withwith the the direct direct sound sound aligned aligned to theto th “Front”e “Front” direction. direction. The The arrival arrival time time is shown is shown by color, by color,with red with shades red shades indicating indicating energy energy arriving arriving up to 500 up ms,to 500 greens ms, ingreens the range in the up range to 100 up ms, to 100 and ms, blue and up blueto 50 up ms. to While 50 ms. it While is clear it that is clear after that the after renovation the renovation there is more there energy is more arriving energy from arriving lateral from directions, lateral directions,indicated by indicated the higher by soundthe higher level sound in the level “Left” in and the “Right”“Left” and directions, “Right” candirections, filtering can help filtering to visualize help tothe visualize subjectively the subjectively perceived increased perceived dynamic increased spatial dynamic responsiveness spatial responsiveness after renovation? after renovation?

Figure 7. Left:Left: unprocessed unprocessed impulse impulse response response for for No Nouveauuveau Siècle Siècle before before reconstruction and reconfiguration.reconfiguration. Right: Right: unprocessed unprocessed impulse impulse response response for for Nouveau Nouveau Siècle Siècle after after reconfiguration. reconfiguration.

Figure 88 belowbelow showsshows thethe resultsresults ofof applyingapplying thethe dynamicdynamic spatialspatial responseresponse filterfilter toto thethe NouveauNouveau SiècleSiècle measurements measurements (Figure (Figure 7)7) for for listening listening levels levels of of 50, 50, 60, 60, and and 70 70 dB dB and and a 1 a ms 1 ms window window length. length. As withAs with the theunprocessed unprocessed plots plots in inFigure Figure 7,7 results, results fr fromom before before the the renovation renovation project project are shown on thethe left,left, results results after renovation are on the right. At 50 dB listening level, before renovation, pe perceivablerceivable sound sound energy energy is is detected from from the rear (likely seat reflections reflections in in the empty hall) hall) and and from from the reflective reflective stage enclosure, along along with with a a few few other “spikes” “spikes” in in the the range range 0–40 0–40 ms. ms. Post Post renovation, renovation, although although less less sound sound energy energy would would seem seem to be to perceivablebe perceivable at this at this50 dB 50 listening dB listening level, level, a bundle a bundle of lateral of lateral sound sound energy energy from the from right-hand the right-hand side is clearlyside is clearlyevident evident along alongwith withsome somelater latersound sound afte afterr 70 70ms: ms: the the beginnings beginnings of of noticeable noticeable source broadening and running reverberation could be expected. At 60 dB listening level, prerenovation, the the reflections reflections already already perceivable perceivable at at 50 50 dB are supplemented by by a a bundle bundle of of lateral lateral energy energy at atarou aroundnd 80 80ms. ms. However, However, in the in therenovated renovated room, room, the right-sidethe right-side reflections reflections present present at 50 dB at are 50 joined dB are by joined reflections by reflections from the left-front, from the left-rear, left-front, and left-rear, right- rear,and right-rear,all in the range all in 60–80 the range ms: 60–80subjectively, ms: subjectively, it would be it expected would be that expected the sound that theis now sound becoming is now enveloping,becoming enveloping, along with along a larger with apparent a larger source apparent width. source width. At 70 dB listening level, the spatial response in both cases is filled filled out, with energy in the range 100–200 msms becomingbecoming apparent. apparent. However, However, in thein the renovated renovated room, room, a new a bundlenew bundle of lateral of energylateral aroundenergy around50 ms from 50 ms the from left the becomes left becomes apparent apparent and overall and overall the perceivable the perceivable reflected reflected energy energy is primarily is primarily lateral. lateral. These plots indicate that, both before and after renovation, the perceivable room spatial impression should increase with listening level—however, it is clear that subjective room spatial impression and envelopment should be much stronger after renovation since the strength of perceivable lateral reflections is around 6 dB higher than before renovation. This magnitude of difference is not so evident in the full unprocessed impulse responses shown in Figure 7. Furthermore, before renovation the strongest reflections were clustered around the source or arrived from behind, and this attribute did not change with listening level, indicating an unresponsive room spatial impression. After renovation, the spatial zone around the source is relatively free of reflected

Acoustics 2019, 1 558

These plots indicate that, both before and after renovation, the perceivable room spatial impression should increase with listening level—however, it is clear that subjective room spatial impression and envelopment should be much stronger after renovation since the strength of perceivable lateral reflections is around 6 dB higher than before renovation. This magnitude of difference is not so evident in the full unprocessed impulse responses shown in Figure7. Furthermore, before renovation the strongest reflections were clustered around the source or arrived from behind, and this attribute did Acoustics 2019, 2 FOR PEER REVIEW 10 not change with listening level, indicating an unresponsive room spatial impression. After renovation, the spatial zone around the source is relatively free of reflected sound energy (this can be attributed sound energyto the absorption (this can of be the attributed new choir seating to the and absorp the partiallytion of absorbing the new upstage choir wall)seating with and the lateral the partially absorbingdirections upstage being wall) filled with with the reflected lateral energy. directions being filled with reflected energy.

Figure 8.Figure Dynamic 8. Dynamic spatial spatial response response filterfilter applied applied to measurements to measurements of Nouveau of Si èNouveaucle, Lille for Siècle, different Lille for differentlistening listening levels levels of (a of) 50 (a dB,) 50 (b )dB, 60 dB,(b) and60 dB, (c) 70 and dB. Left(c) 70 column: dB. Left before column: renovation. before Right renovation. column: Right column: afterafter renovation. renovation.

Overall, through processing 3DRIRs with the dynamic spatial response filter, features of the impulse responses that indicate how the room spatial impression is perceived at different listening levels can be extracted and visualized. In the case of the Nouveau Siècle concert hall, the extracted features correspond well with the subjective effect of changing room spatial impression as loudness increases. The filter is currently relatively crude, based on a small dataset of thresholds and a very basic algorithm for attributing measured sound energy to coherent reflections. To improve the algorithm, further research is planned into the analysis of energy belonging to individual reflections and how the limited spatial acuity of the human hearing system (particularly in judging the direction of reflected energy) can be used to improve the filter.

4. Conclusions Connections between musical and acoustical attributes—in this case objective loudness changes that generate subjective differences in spatial impression or room response—are believed to be an important factor in acoustical preference. Research into such connections has however been rare. Expanding on the work by Wettschurek in the 1970s, measurements of the perception thresholds of early reflections have been made using a binaural virtual acoustics system. Both speech and music were used as a source, and various reflection directions of arrival and delay times were tested. These

Acoustics 2019, 1 559

Overall, through processing 3DRIRs with the dynamic spatial response filter, features of the impulse responses that indicate how the room spatial impression is perceived at different listening levels can be extracted and visualized. In the case of the Nouveau Siècle concert hall, the extracted features correspond well with the subjective effect of changing room spatial impression as loudness increases. The filter is currently relatively crude, based on a small dataset of thresholds and a very basic algorithm for attributing measured sound energy to coherent reflections. To improve the algorithm, further research is planned into the analysis of energy belonging to individual reflections and how the limited spatial acuity of the human hearing system (particularly in judging the direction of reflected energy) can be used to improve the filter.

4. Conclusions Connections between musical and acoustical attributes—in this case objective loudness changes that generate subjective differences in spatial impression or room response—are believed to be an important factor in acoustical preference. Research into such connections has however been rare. Expanding on the work by Wettschurek in the 1970s, measurements of the perception thresholds of early reflections have been made using a binaural virtual acoustics system. Both speech and music were used as a source, and various reflection directions of arrival and delay times were tested. These measurements indicate that for music, perception thresholds for reflections from the front and behind vary little with overall listening level whereas for side reflections, the perception threshold decreases substantially with increasing listening level. This indicates that during a crescendo, increasing numbers of lateral reflections should become perceivable—the hall “wakes up”—while a much weaker effect is to be expected for frontal or rear reflections. These findings correspond with listening experience, with those halls lacking lateral reflections also tending to exhibit weaker dynamic spatial responsiveness. It was also observed in the listening tests that, depending on the overall listening level, delay, and direction of arrival, reflections could be allocated to the source auditory stream (fused with the source) or experienced as part of the room stream i.e., separate from the source. This is an important factor in our perception and rating of acoustical quality, and therefore warrants detailed further research with much larger groups of listeners than were used here. The question of how the “below threshold” sound energy influences our subjective impression is also pertinent. Based on measured reflection perception thresholds, 3D room impulse responses have been filtered in order to visualize changes in perceivable reflected energy at difference listening levels. Initial tests of the so-called dynamic spatial response filter using measurements made in the Nouveau Siècle concert hall in Lille, France indicate that such filtering can aid in the discrimination of impulse responses measured in concert halls that do exhibit dynamic spatial responsiveness. As a next step in this research, the measured thresholds and filter algorithm will be refined and applied to additional concert halls.

Author Contributions: Conceptualization, E.G. and E.K.; methodology, E.G. and E.K.; software, E.G.; validation, E.G. and E.K.; formal analysis, E.G. and E.K.; investigation, E.G. and E.K.; resources, E.G. and E.K.; data curation, E.G.; writing—original draft preparation, E.G.; writing—review and editing, E.K.; visualization, E.G.; supervision, E.K.; project administration, E.G. and E.K. Conflicts of Interest: The authors declare no conflict of interest.

References

1. Marshall, H. A note on the importance of room cross-section in concert halls. J. Sound Vib. 1966, 5, 100–112. [CrossRef] 2. Keet, W.V. The influence of early lateral reflections on the spatial impression. In Proceedings of the Sixth International Congress on Acoustics, Tokyo, Japan, 21–28 August 1968; Volume 3, pp. E53–E56. 3. Barron, M. The subjective effects of first reflections in concert halls—The need for lateral reflections. J. Sound Vib. 1971, 15, 475–494. [CrossRef] 4. Kuhl, W. Räumlichkeit als Komponente des Raumeindrucks. Acta Acust. 1978, 40, 167–181. Acoustics 2019, 1 560

5. Barron, M.; Marshall, A.H. Spatial Impression due to Early Lateral Reflections in Concert Halls: The Derivation of a Physical Measure. J. Sound Vib. 1981, 77, 211–232. [CrossRef] 6. Barron, M. Objective Measures of Spatial Impression in Concert Halls. In Proceedings of the 11th International Congress on Acoustics, Paris, France, 19–27 July 1983; pp. 105–108. 7. Bradley, J.S.; Soulodre, G.A. Objective measures of listener envelopment. J. Acoust. Soc. Am. 1995, 98, 2590–2597. [CrossRef] 8. Bradley, J.S.; Reich, R.D.; Norcross, S.G. On the Combined Effects of Early and Late-Arriving Sound on Spatial Impression in Concert Halls. J. Acoust. Soc. Am. 2000, 108, 651–661. [CrossRef][PubMed] 9. Morimoto, M.; Iida, K.; Sakagami, K. The role of reflections from behind the listener in spatial impression. Appl. Acoust. 2001, 62, 109–124. [CrossRef] 10. Furaya, H.; Fujimoto, K.; Ji, C.Y.; Higa, N. Arrival direction of late sound and listener envelopment. Appl. Acoust. 2001, 62, 125–136. [CrossRef] 11. Pätynen, J.; Tervo, S.; Robinson, P.W.; Lokki, T. Concert halls with strong lateral reflections enhance musical dynamics. Proc. Natl. Acad. Sci. USA 2014, 11, 4409–4414. [CrossRef][PubMed] 12. Pätynen, J.; Lokki, T. Perception of music dynamics in concert hall acoustics. J. Acoust. Soc. Am. 2016, 140, 3787–3798. [CrossRef][PubMed] 13. Pätynen, J.; Lokki, T. Concert halls with strong and lateral sound increase the emotional impact of orchestra music. J. Acoust. Soc. Am. 2016, 139, 1214–1224. [CrossRef][PubMed] 14. Pätynen, J.; Lokki, T. Towards a common parameter characterizing the dynamical responsiveness of a concert hall. J. Acoust. Soc. Am. 2017, 141, 3599. [CrossRef] 15. Wettschurek, R. Über die Abhängigkeit raumakustischer Wahrnehmung von der Lautstärke. Ph.D. Thesis, Technical University of Berlin, Berlin, Germany, 1976. 16. SPAT Acoustic Simulation Tool by IRCAM. Available online: http://forumnet.ircam.fr/product/spat-en2 (accessed on 1 March 2017). 17. Griesinger, D. The Psychoacoustics of Apparent Source Width, Spaciousness and Envelopment in Performance Space. Acta Acust. 1997, 83, 721–731. 18. SDM Toolbox from Aalto University. Available online: https://www.mathworks.com/matlabcentral/ fileexchange/56663-sdm-toolbox (accessed on 1 June 2017).

© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).