Nonlinear Dynamics of the Perceived Pitch of Complex Sounds
Total Page:16
File Type:pdf, Size:1020Kb
Nonlinear Dynamics of the Perceived Pitch of Complex Sounds Julyan H. E. Cartwright1,∗, Diego L. Gonz´alez2,†, and Oreste Piro3,‡ 1Instituto Andaluz de Ciencias de la Tierra, IACT (CSIC-UGR), E-18071 Granada, Spain. 2Istituto Lamel, CNR, I-40129 Bologna, Italy 3Institut Mediterrani d’Estudis Avanc¸ats, IMEDEA (CSIC–UIB), E-07071 Palma de Mallorca, Spain (Physical Review Letters, 82, 5389–5392, 1999) We apply results from nonlinear dynamics to an old problem in acoustical physics: the mechanism of the perception of the pitch of sounds, especially the sounds known as complex tones that are important for music and speech intelligibility. PACS numbers: 05.45.-a, 43.66.+y, 87.19.La The pitch of a sound is where we perceive it to lie on a (Fig. 1b), the difference combination tone ωC = ω2 − ω1 musical scale. Like all sensations, pitch is a subjective quan- (i.e., p = 1, q = −1) between two successive partials has tity related to physical attributes of the stimulus, in this case the frequency of the missing fundamental ω0, that is ωC = mainly its component frequencies. For a pure tone with a sin- (k + 1)ω0 − kω0 = ω0. However, a crucial experiment seri- gle frequency component, the relation is monotonic and en- ously challenged nonlinear theories of the residue. Schouten ables us to adopt an operational definition of pitch in terms et al. [4] demonstrated that the behavior of the residue cannot of frequency: the pitch of an arbitrary sound is given by the be described by a difference combination tone: if we shift all frequency of a pure tone of the same pitch. The scientific in- the partials by the same amount ∆ω (Fig. 1c), the difference vestigation of pitch has a long history dating back to Pythago- combination tone remains unchanged, and the same should ras, but the origin of the pitch of complex tones with several thus be true of the residue. Instead it is found that the per- frequency components is still not well understood. The first ceived pitch also shifts, showing a linear dependence on ∆ω perceptual theories considered pitch to arise at a peripheral (Fig. 1d). This phenomenon is known as the first pitch shift level in the auditory system [1–5]; more recently it has been effect, and has been accurately measured in many experiments thought that central nervous system processing is necessary (psychoacoustic experiments on pitch can attain an accuracy [6–8]. However, the experimental evidence is that this pro- of 0.2% [13]). A first attempt to model qualitatively the be- cessing is carried out before the primary auditory cortex [9]. havior of the pitch shift shows that the slopes of the lines in The latest models integrate neural and peripheral processing Fig. 1d depend roughly on the inverse of the harmonic num- [10,11]. Here we develop a nonlinear theory of pitch percep- ber k +1 of the central partial of a three-component complex tion for complex tones that describes experimental results on tone. However, for small k at least, the change in slope is the pitch perception of complex sounds at least as well as do slightly but consistently larger than this, but smaller if we re- current models, and that removes the need for extensive pro- place k +1 by k. This behavior is known as the second pitch cessing at higher levels of the auditory system. shift effect. Also, an enlargement of the spacing between par- A key phenomenon in pitch perception is known as the tials while maintaining fixed the central frequency produces problem of the missing fundamental, virtual pitch, or residue a decrease in the residue pitch. As this anomalous behavior perception [12], and consists of the perception of a pitch that seems to be correlated with the second pitch shift effect, it cannot be mapped to any frequency component of the stimu- is usually included within it [4]. Pitch-shift experiments, and lus. Suppose that a periodic tone such as that shown in Fig. 1a others that demonstrated that the residue is elicited dichoti- arXiv:chao-dyn/9907002v1 26 Jun 1999 is presented to the ear. Its pitch is perceived to be that of a cally (part of the stimulus exciting one ear and the rest of the pure tone at the frequency of the fundamental. The number stimulus the other, controlateral, ear) and not just monotically of higher harmonics and their relative amplitudes give tim- and diotically (all of the stimulus exciting one or both ears) bral characteristics to the sound, which allow one to distin- led to the abandonment of peripheral (periodicity [2,4] and guish a trumpet from a violin playing the same musical note. place [1,3]) theories and to the development of theories that Now suppose that the fundamental and some of the first few considered the pitch of complex tones to be a result of central higher harmonics are removed (Fig. 1b). Although the timbre nervous system processing [6–8]; thence to integrated neural changes, the pitch of the tone remains unchanged and equal to and peripheral models [10,11]. the missing fundamental; this is residue perception. We demonstrate below that the crucial pitch-shift experi- The first physical theory for the residue is due to von ment of Schouten et al. can be accurately described in terms Helmholtz [3], who attributed it to the generation of differ- of generic attractors of nonlinear dynamical systems, such that ence combination tones in the nonlinearities of the ear. A a theory of pitch perception for complex tones can be con- passive nonlinearity fed by two sources with frequencies ω1 structed without the need to resort to extensive central pro- and ω2 generates combination tones of frequency ωC , which cessing. We model the auditory system as a generic nonlinear are nontrivial solutions of the equation pω1 + qω2 + ωC =0, forced oscillator, and identify experimental data with struc- where p and q are integers. For a harmonic complex tone turally stable behavior of this class of dynamical system. We 1 emphasize that since we are interested in universal behavior, Since complex tones can be decomposed as a series of the results we obtain are not dependent on the construction partials — a sum of purely sinusoidal components — and of a particular model, but rather represent the behavior of any residue perception is elicited with at least two of these, we dynamical system of this type. search for suitable attractors in the class of two-frequency quasiperiodically forced oscillators. Quasiperiodically forced oscillators show a great variety of qualitative behavior that falls into the three categories of periodic attractors, quasiperi- odic attractors, and strange attractors (both chaotic and non- Temporal Fourier Pitch Waveform Spectrum chaotic). We propose that the residue behavior we seek to 1 explain is a resonant response to forcing the auditory system a 2 quasiperiodically. From stability arguments [14–17] we sin- 3 4 5 P=ω gle out a particulartype of two-frequencyquasiperiodicattrac- 6 7 0 8 tor, which we term a three-frequencyresonance, as the natural ω0 candidate for modelling the residue. Three-frequency reso- 4 b 5 nances are given by the nontrivial solutions of the equation 6 7 pω1+qω2+rω =0, where p, q, and r are integers, ω1 and ω2 8 P=ω R 0 are the forcing frequencies, and ωR is the resonant response, and can be written compactly in the form (p,q,r). Notice that ω 0 combination tones are three-frequency resonances of the re- c stricted class (p, q, 1). This is the only type of response possi- P=ω +∆P ble from a passive nonlinearity, whereas a dynamical system 0 such as a forced oscillator is an active nonlinearity with at ω least one intrinsic frequency, and can exhibit the full panoply 0 ∆ω of three-frequency resonances, which include subharmonics d k=6 k=7 k=8 P of combination tones. The investigation of three-frequency systems is a young area of research, and there is not yet any ∆ ∆P ∝∆ω P consistency in the nomenclatureof these resonancesin the sci- ω0 ∆ω entific literature: as well as three-frequency resonances, they are also called weak resonances or partial mode lockings; see ω ω ω f 70 8 0 9 0 Baesens et al. [18] and references therein. It is known that FIG. 1. Fourier spectra and pitch of complex tones. Whereas three-frequency resonances form an extensive web in the pa- pure tones have a sinusoidal waveform corresponding to a single fre- rameter space of a dynamical system. In particular, between quency, almost all musical sounds are complex tones that consist of any two parent resonances (p1, q1, r1) and (p2, q2, r2) lies the a lowest frequency component, or fundamental, together with higher daughter resonance (p1 + p2, q1 + q2, r1 + r2) on the straight frequency overtones. The fundamental plus overtones are together line in parameter space connecting the parents. collectively called partials. a A harmonic complex tone. The over- For pitch-shift experiments, the vicinity of the external fre- tones are successive integer multiples k = 2, 3, 4 . of the funda- quencies to successive multiples of some missing fundamen- mental ω0 that determines the pitch. The partials of a harmonic com- tal ensures that (k + 1)/k is a good rational approximation to plex tone are termed harmonics. b Another harmonic complex tone. their frequency ratio. Hence we concentrate on a small inter- The fundamental and the first few higher harmonics have been re- val around the missing fundamental between the frequencies moved. The pitch remains the same and equal to the missing fun- damental.