The Role of Spectral-Envelope Characteristics in Perceptual Blending of Wind-Instrument Sounds

ACTA ACUSTICA UNITED WITH ACUSTICA Vol. 101 (2015) 1039 –1051 DOI 10.3813/AAA.918898 The Role of Spectral-Envelope Characteristics in Perceptual Blending of Wind-Instrument Sounds Sven-Amin Lembke, Stephen McAdams Centre for Interdisciplinary Research in Music Media and Technology (CIRMMT), Schulich School of Music, McGill University,555 SherbrookeStreet West, Montréal, Québec, Canada H3A 1E3. [email protected] Summary Certain combinations of musical instruments lead to perceptually more blended timbres than others. Orchestra- tion commonly seeks these combinations and can benefitfrom generalized acoustical descriptions of perceptually relevant features that allowthe prediction of blend. Previous research on correlating such instrument-specific features with the perception of blend shows an important role of spectral-envelope characteristics, leaving unan- swered, however, whether global or local characteristics are more important (e.g., spectral centroid or formant structure). This paper reports howwind instruments can be characterized through pitch-generalized spectral- envelope descriptions that exhibit their formant structure and howthis is represented in an auditory model. Two experiments employing blend-production and blend-rating tasks study the perceptual relevance of formants to blend, involving dyads of arecorded instrument sound and aparametrically varied synthesized sound. Frequency relationships between formants influence blend critically,asdoes the degree of formant prominence. In addition, multiple linear regression relying primarily on local spectral-envelope characteristics explains 87% of the vari- ance in blend ratings. Aperceptual model for the contribution of spectral characteristics to perceivedblend is proposed. PACS no. 43.66.Ba, 43.66.Jh, 43.75.Cd 1. Introduction Along aperceptual continuum, maximum blend is most likely only achievedfor concurrent sounds in pitch unison Knowledge of instrument timbre leads composers to se- or octaves. Even though other intervals may be rightly as- lect certain instruments overothers to fulfill adesired sumed to maketwo instruments more distinct, certain in- purpose in orchestrating amusical work. One such pur- strument combinations would still exhibit higher degrees pose is achieving a blended combination of instruments. of blend than others. On the opposite extreme of this The blending of instrumental timbres is thought to depend continuum, astrong distinctness of individual instruments mainly on factors such as the synchronybetween note leads to the perception of aheterogeneous, non-blended onsets, partial tones aligned along the harmonic series, sound. Assuming auditory fusion to rely on low-levelpro- and specificcombinations of instruments [1]. Whereas the cesses (related to auditory scene analysis, see [5]), in- first twofactors depend on compositional decisions and creasingly strong and congruent perceptual cues for blend their precise execution during musical performance, the should counteract even deliberate attempts to identify in- third factor strongly relies on instrument-specificacous- dividual sounds. tical characteristics. Arepresentative characterization of Previous research on timbre perception has shown a these features would thus facilitate explaining and theoriz- dominant importance of spectral properties. Timbre sim- ing perceptual effects related to blend. In agreement with ilarity has been linked to spectral-envelope characteristics past research [1, 2, 3], blend is defined as the perceptual fu- [6]. Similarity-based behavioral groupings of stimuli re- sion of concurrent sounds, with acorresponding decrease flect acategorization into distinct spectral-envelope types in the distinctness of individual sounds. It can involvedif- [7] or the exchange of spectral envelopes between synthe- ferent practical applications, such as augmenting adomi- sized instruments results in an analogous inversion of po- nant timbre by adding other subordinate timbres or creat- sitions in multidimensional timbre space [8]. Furthermore, ing an entirely novel, emergent timbre [4]. This paper ad- Strong &Clark [9] reported increasing confusion in in- dresses only the first scenario, as the latter likely involves strument identification (e.g., oboe with trumpet)whenever more than twoinstruments. prominent spectral-envelope traits are disfigured, mak- ing instruments resemble each other more. With regard Received15June 2014, to blending, Kendall &Carterette [2] established alink accepted 11 May 2015. between timbre similarity and blend, by relating closer ©S.Hirzel Verlag · EAA 1039 ACTA ACUSTICA UNITED WITH ACUSTICA Lembke,McAdams: Spectral-envelope characteristics Vol. 101 (2015) timbre-space proximity between pairs of single-instrument introduces the chosen approach to spectral-envelope de- sounds to higher blend ratings for the same sounds form- scription, its corresponding representation through audi- ing dyads. ‘Darker’ timbres have been hypothesized to be tory models, and howinthe perceptual investigation the favorable to blend [4, 10], quantified through the global spectral description is operationalized in terms of paramet- spectral-envelope descriptor spectral centroid,with ‘dark’ ric variations of formant frequencylocation. Section 3out- referring to lower centroids. Strong blend wasfound to be lines the design of twobehavioral experiments that inves- best explained by alow centroid composite,i.e., the cen- tigate the relevance of local variations of formant structure troid sum of the sounds forming adyad. to blend perception, with their specificmethods and find- By contrast with global descriptors, attempts to ex- ings presented in Sections 4and 5, respectively.Finally, plain blending through local spectral-envelope character- the combined results from acoustical and perceptual in- istics focus on prominent spectral maxima, also termed vestigations are discussed in Section 6, leading to the es- formants [11, 12] in this context. Reuter [3] reported be- tablishment of aspectral model for blend in Section 7. havioral findings in favoroftimbre blend occurring whenever formant regions between twoinstruments coincide. 2. Spectral-envelope characteristics His explanation argues that this coincidence avoids incomplete masking [13], which inversely hypothesizes that Acorpus of wind instrument recordings wasused to es- the non-coincidence of formant locations prevents audi- tablish ageneralized acoustical description for each in- tory fusion due to incomplete mutual masking of the pre- strument. The orchestral instrument samples were drawn sumedly salient formants between instruments, facilitating from the Vienna Symphonic Library1 (VSL), supplied as the detection of their distinct identities. stereo WAVfiles (44.1 kHz sampling rate, 16-bit dynamic As prominent signifiers of spectral envelopes, formants resolution), with only left-channel data considered. The have been employed widely to describe wind instruments investigated instruments comprised (French)horn, bas- [3, 7, 8, 12, 14, 15, 16, 17, 18, 19]. Likethe formant soon, Ctrumpet, B clarinet, oboe, and flute, with the structure found in the human voice [11, 12, 20], formants available audio samples spanning their respective pitch in wind instruments are located at absolute frequencyre- ranges in semitone increments. Because the primary fo- gions, which remain largely unaffected by pitch change cus concerned spectral aspects, all selected samples con- [14, 17, 18]. This invariance may in fact allowfor the gen- sisted of long, sustained notes without vibrato. As spectral eralized acoustical description for these instruments and envelopes commonly exhibit significant variation across together with assessing its potential constraints (e.g., in- dynamic markings, all samples included only mezzoforte strument register,dynamic marking), it will be of value to markings, representing an intermediate levelofinstrument musical applications (e.g., [21]). Furthermore, it is mean- dynamics. ingful to assess howsuch prominent spectral features are represented at an intermediary stage between acoustics 2.1. Spectral-envelope description and perception, i.e., at asensorineural level, simulated by Past investigations of pitch-invariant spectral-envelope computational models of the human auditory system. Au- characteristics pursued comprehensive assessments of ditory models can account for effects related to spectral spectral analyses encompassing extended pitch ranges of masking, i.e., to what neural excitation pattern aspectrum instruments [14, 17, 18]. The spectral-envelope descrip- of asingle or compound sound leads. Forinstance, ex- tion employed in this paper wasbased on an empirical es- citation patterns typically involveanasymmetric upward timation technique relying on the initial computation of spread in frequency, butthe shape of excitation still varies power-density spectra for the sustained portions of sounds both as afunction of frequencyand excitation level. The (excluding onset and offset), followed by the detection Auditory Image Model (AIM)simulates different stages of of partial tones, i.e., their frequencies and power levels. the peripheral auditory system, covering the transduction Acurve-fitting procedure employing a cubic smoothing of acoustical signals into neural responses and the sub- spline (piecewise polynomial of order 3) applied to the sequent temporal integration across auditory filters yield- composite distribution of partial tones overall pitches ing the stabilized auditory image (SAI), which provides yielded

The Role of Spectral-Envelope Characteristics in Perceptual Blending of Wind-Instrument Sounds

The Science of String Instruments

Vapar Synth--A Variational Parametric Model for Audio Synthesis

A Comparison of Viola Strings with Harmonic Frequency Analysis

Timbre Perception

Automatic Transcription of Bass Guitar Tracks Applied for Music Genre Classiﬁcation and Sound Synthesis

Advances in Perceptual Stereo Audio Coding Using Linear Prediction Techniques

Expressive Sampling Synthesis

Uncontrolled Manifolds and Short-Delay Reflexes in Speech Motor Control : a Modeling Study Andrew Szabados

Parsing the Spectral Envelope

Analysing Deep Learning-Spectral Envelope Prediction Methods For

A Separate and Restore Approach to Score-Informed Music Decomposition

A Comparison of Spectral Smoothing Methods for Segment Concatenation Based Speech Synthesis Q David T