Diversity in Pitch Perception Revealed by Task Dependence
Total Page:16
File Type:pdf, Size:1020Kb
ARTICLES https://doi.org/10.1038/s41562-017-0261-8 Diversity in pitch perception revealed by task dependence Malinda J. McPherson! !1,2* and Josh H. McDermott! !1,2 Pitch conveys critical information in speech, music and other natural sounds, and is conventionally defined as the perceptual correlate of a sound’s fundamental frequency (F0). Although pitch is widely assumed to be subserved by a single F0 esti- mation process, real-world pitch tasks vary enormously, raising the possibility of underlying mechanistic diversity. To probe pitch mechanisms, we conducted a battery of pitch-related music and speech tasks using conventional harmonic sounds and inharmonic sounds whose frequencies lack a common F0. Some pitch-related abilities—those relying on musical interval or voice recognition—were strongly impaired by inharmonicity, suggesting a reliance on F0. However, other tasks, including those dependent on pitch contours in speech and music, were unaffected by inharmonicity, suggesting a mechanism that tracks the frequency spectrum rather than the F0. The results suggest that pitch perception is mediated by several different mechanisms, only some of which conform to traditional notions of pitch. itch is one of the most common terms used to describe sound. could involve first estimating the F0 of different parts of a sound Although in lay terms pitch denotes any respect in which and then registering how the F0 changes over time. However, pitch sounds vary from high to low, in scientific parlance pitch is changes could also be registered by measuring a shift in the con- P 24–27 the perceptual correlate of the rate of repetition of a periodic sound stituent frequencies of a sound, without first extracting F0 . It (Fig. 1a). This repetition rate is known as the sound’s fundamental thus seemed plausible that pitch perception in different stimulus frequency, or F0, and conveys information about the meaning and and task contexts might involve different computations. identity of sound sources. In music, F0 is varied to produce melo- We probed pitch computations using inharmonic stimuli, ran- dies and harmonies. In speech, F0 variation conveys emphasis and domly jittering each frequency component of a harmonic sound intent, as well as lexical content in tonal languages. Other everyday to make the stimulus aperiodic and inconsistent with any single sounds (birdsong, sirens, ringtones and so on) are also identified F0 (Fig. 1b)30. Rendering sounds inharmonic should disrupt in part by their F0. Pitch is thus believed to be a key intermediate F0-specific mechanisms and impair performance on pitch-related perceptual feature and has been a topic of intense interest through- tasks that depend on such mechanisms. A handful of previous out history1–3. studies have manipulated harmonicity for this purpose and found The goal of pitch research has historically been to characterize modest effects on pitch discrimination that varied somewhat the mechanism for estimating F0 from sound4,5. Periodic sounds across listeners and studies25–27. As we revisited this line of inquiry, contain frequencies that are harmonically related, being multiples it became clear that effects of inharmonicity differed substantially of the F0 (Fig. 1a). The role of peripheral frequency cues, such as across pitch tasks, suggesting that pitch perception might parti- the place and timing of excitation in the cochlea, have thus been a tion into multiple mechanisms. The potential diversity of pitch focal point of pitch research6–10. The mechanisms for deriving F0 mechanisms seemed important both for the basic understanding from these peripheral representations are also the subject of a rich of the architecture of the auditory system and for understanding research tradition6–14. Neurophysiological studies in non-human the origins of pitch deficits in listeners with hearing impairment or animals have revealed F0-tuned neurons in the auditory cortex of cochlear implants. one species (the marmoset)15,16, although as of yet there are no com- We thus examined the effect of inharmonicity on essentially parable findings in other species17,18. Functional imaging studies in every pitch-related task we could conceive and implement. These humans suggest pitch-responsive regions in non-primary auditory ranged from classic psychoacoustic assessments with pairs of notes cortex19–21. The role of these regions in pitch perception is an active to ethologically relevant melody and voice recognition tasks. Our area of research22,23. Despite considerable efforts to characterize the results show that some pitch-related abilities—those relying on mechanisms for F0 estimation, there has been relatively little con- musical interval or voice perception—are strongly impaired by sideration of whether behaviours involving pitch might necessitate inharmonicity, suggesting a reliance on F0 estimation. However, other sorts of computations24–27. tasks relying on the direction of pitch change, including those using One reason to question the underlying basis of pitch percep- pitch contours in speech and music, were unaffected by inhar- tion is that our percepts of pitch support a wide variety of tasks. In monicity. Such inharmonic sounds individually lack a well-defined some cases it seems likely that the F0 of a sound must be encoded, pitch in the normal sense, but when played sequentially nonetheless as when recognizing sounds with a characteristic F0, such as a per- elicit the sensation of pitch change. The results suggest that what son’s voice28, but in many situations we instead judge the way that has traditionally been couched as ‘pitch perception’ is subserved by F0 changes over time—often referred to as relative pitch—as when several distinct mechanisms, only some of which conform to the recognizing a melody or speech intonation pattern29. Relative pitch traditional F0-related notion of pitch. 1Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA, USA. 2Program in Speech and Hearing Bioscience and Technology, Harvard University, Cambridge, MA, USA. *e-mail: [email protected] NATURE HUMAN BEHAVIOUR | www.nature.com/nathumbehav © 2017 Macmillan Publishers Limited, part of Springer Nature. All rights reserved. ARTICLES NATURE HUMAN BEHAVIOUR a Waveform Power spectrum Autocorrelation 1.0 1.0 –10 0.8 0.8 –20 0.6 0.6 –30 ) 0.4 0.4 r ( 0.2 0.2 –40 0 0 –0.2 –50 –0.2 Power (dB) –0.4 –60 Correlation –0.4 –0.6 –0.6 –70 –0.8 –0.8 Sound pressure (arbitrary units) –1.0 –80 –1.0 21 23 25 27 29 31 0 500 1,000 1,500 2,000 2,500 3,000 012345678 Time (ms) Frequency (Hz) Lag (ms) b Waveform Power spectrum Autocorrelation 1.0 –10 1.0 0.8 0.8 –20 0.6 0.6 –30 ) 0.4 0.4 r ( 0.2 –40 0.2 0 0 –0.2 –50 –0.2 Power (dB) –0.4 –60 Correlation –0.4 –0.6 –0.6 –70 –0.8 –0.8 Sound pressure (arbitrary units) –1.0 –80 –1.0 21 23 25 27 29 31 0 500 1,000 1,500 2,000 2,500 3,000 012345678 Time (ms) Frequency (Hz) Lag (ms) Fig. 1 | Example harmonic and inharmonic tones. a, Waveform, power spectrum and autocorrelation for a harmonic complex tone with a F0 of 250!Hz. The waveform is periodic (repeating in time), with a period corresponding to one cycle of the F0. The power spectrum contains discrete frequency components (harmonics) that are integer multiples of the F0. The harmonic tone yields an autocorrelation of 1 at a time lag corresponding to its period (1/F0). b, Waveform, power spectrum and autocorrelation for an inharmonic tone generated by randomly ‘jittering’ the harmonics of the tone in a. The waveform is aperiodic and the constituent frequency components are not integer multiples of a common F0 (evident in the irregular spacing in the frequency domain). Such inharmonic tones are thus inconsistent with any single F0. The inharmonic tone exhibits no clear peak in its autocorrelation, indicative of its aperiodicity. Results by detecting shifts in the spectrum without estimating F0, perfor- Experiment 1: Pitch discrimination with pairs of synthetic tones. mance should be impaired for Inharmonic-changing but similar for We began by measuring pitch discrimination using a two-tone dis- Harmonic and Inharmonic conditions. Finally, if the jitter manipu- crimination task standardly used to assess pitch perception31–33. lation was insufficient to disrupt F0 estimation, performance should Participants heard two tones and were asked whether the second be similar for all three conditions. tone was higher or lower than the first (Fig. 2a). We compared per- To isolate the effects of harmonic structure, a fixed bandpass fil- formance for three conditions: a condition where the tones were ter was applied to each tone (Fig. 2c). This filter was intended to harmonic, and two inharmonic conditions (Fig. 2b). Here and else- approximately equate the spectral centroids (centres of mass) of where, stimuli were made inharmonic by adding a random amount the tones, which might otherwise be used to perform the task, and of ‘jitter’ to the frequency of each partial of a harmonic tone (up to to prevent listeners from tracking the frequency component at the 50% of the original F0 in either direction) (Fig. 1b). This manipula- F0 (by filtering it out). This type of tone also mimics the acous- tion was designed to severely disrupt the ability to recover the F0 tics of many musical instruments, in which a source that varies in of the stimuli. One measure of the integrity of the F0 is available in F0 is passed through a fixed filter (for example, the resonant body the autocorrelation peak height, which was greatly reduced in the of the instrument). Here, and in most other experiments, low-pass inharmonic stimuli (Fig. 1 and Supplementary Fig.