Perception & Psychophysics
1981, 29 (4), 395-402

Perception of acoustic iterance: Pitch and infrapitch

RICHARD M. WARREN and JAMES A. BASHFORD, JR.
University of Wisconsin, Milwaukee, Wisconsin 53201

Detection of acoustic repetition is considered as a continuum extending from .5 through 16,000 Hz. Perceptual characteristics are mapped for the entire range, using repeated randomly derived waveforms (segments from Gaussian noise) as model stimuli. Contributions from temporal domain (neural periodicity) analysis extend from about .5 through 5,000 Hz and from frequency domain (neural place) analysis from roughly 50 through 16,000 Hz. Within the range of overlapping analyses (50 through 5,000 Hz), it is difficult to separate the effects of temporal cues from place cues. However, by using low-frequency acoustic iteration from 1 through 16 Hz, we were able to study temporal analysis in the absence of place cues to repetition. New perceptual phenomena are reported for the "infrapitch" produced by "infratones," some of which are analogous to phenomena observed for the pitch produced by tones. It appears useful for theory to consider pitch and infrapitch as a single topic: the perception of acoustic iterance.

Waveforms excised from Gaussian noise and repeated without pauses are useful as model periodic stimuli. Such repeated noise segments (which will be called recycled Gaussian noise or RGN) have no a priori restrictions concerning power and phase spectra, and thus represent a general case for determining general rules governing perception of acoustic repetition or iterance.

As we shall see, experiments with RGN have suggested that perception of acoustic iterance represents a continuum extending well below the tonal limit, ranging from about .5 through 16,000 Hz (repetition periods from about 2 sec through .06 msec). Two types of neural analyses appear to operate within this range: temporal domain or periodicity analysis from roughly .5 through 5,000 Hz, and frequency domain or place analysis from roughly 50 through 16,000 Hz.

Low-frequency RGNs from about .5 through 4 Hz sound like periodic "whooshing," and from about 4 through 20 Hz sound like "motorboating" (Guttman & Julesz, 1963). The noisy or hiss-like quality associated with lower frequencies disappears for RGNs at about 100 Hz (Warren & Bashford, 1978). At higher frequencies, in keeping with other complex tones, pitch is determined by the period, and timbre is determined by the particular structure of individual RGNs.

Let us look more closely at the RGNs below the tonal range, or "infratonal" RGNs,¹ first studied by Guttman and Julesz (1963). An RGN of, say, 1 Hz has a line spectrum spanning all audible frequencies, with successive harmonics separated by 1 Hz. The ear's spectral analyzing power is limited (see Plomp, 1964) and cannot resolve such closely spaced components. The interaction of the many unresolved harmonics stimulating the same locus on the basilar membrane produces complex local patterns of amplitude fluctuation that are repeated once each second. The change of amplitude with time differs at separate loci, but each of these different patterns has the same period. Information concerning this low-frequency periodicity seems to be available along the entire length of the basilar membrane, for, when we swept a 1/3-octave filter through a broadband infratonal RGN, listeners could hear an infrapitch periodicity corresponding in frequency to that of the iterated acoustic waveform at all center frequencies of the filter.

It should be noted that infratonal acoustic iterance is detected solely through temporal information without the concurrent place information characteristic of the tonal range.

We designed Experiment 2 to determine whether such a "pure" neural temporal analysis in the infratonal range follows rules observed for the mixed temporal and place analysis occurring in the tonal range. But, before undertaking Experiment 2, it was considered desirable to complete the description of RGNs of different frequencies by studying the unexplored range from 20 through 100 Hz. Experiment 1 provides information to fill this gap.
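The construction of an RGN and the 1-Hz line spacing described above can be illustrated numerically. The following is a minimal sketch, not the apparatus used in the study: the function names, the 16,000-Hz sample rate, the 4-sec duration, and the seed are all assumptions made for illustration. It tiles one frozen segment of Gaussian noise and then measures the spacing between the non-negligible lines of its spectrum.

```python
import numpy as np

def make_rgn(repetition_hz, duration_s, sample_rate=16000, seed=0):
    """Tile one frozen Gaussian-noise segment to build a periodic signal.

    All parameter values are illustrative assumptions, not values from the paper.
    """
    rng = np.random.default_rng(seed)
    period_samples = int(round(sample_rate / repetition_hz))
    segment = rng.standard_normal(period_samples)  # one randomly derived waveform
    total_samples = int(duration_s * sample_rate)
    n_repeats = int(np.ceil(total_samples / period_samples))
    signal = np.tile(segment, n_repeats)[:total_samples]
    return signal, period_samples

def line_spacing_hz(signal, sample_rate):
    """Return the smallest spacing between non-negligible spectral lines."""
    spectrum = np.abs(np.fft.rfft(signal))
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate)
    lines = freqs[spectrum > 1e-6 * spectrum.max()]
    return float(np.min(np.diff(lines)))

# A 1-Hz RGN lasting a whole number of periods (4 sec).
rgn, period = make_rgn(repetition_hz=1.0, duration_s=4.0)
spacing = line_spacing_hz(rgn, 16000)
print(f"period = {period} samples, line spacing = {spacing} Hz")
```

Because the signal spans a whole number of periods, energy appears only at integer multiples of the repetition frequency, which is the line spectrum the text describes.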

This study was supported in part by a grant from the National Science Foundation (BNS 79-12402) and in part by funds provided by the Graduate School and the College of Letters and Science at the University of Wisconsin-Milwaukee. Requests for reprints should be directed to Richard M. Warren, Department of Psychology, University of Wisconsin, Milwaukee, Wisconsin 53201.

Copyright 1981 Psychonomic Society, Inc. 0031-5117/81/040395-08$01.05/0

EXPERIMENT 1: Perceptual Categories and Boundaries for Low-Frequency Pitch

When we explored perception of RGNs between 100 and .5 Hz in a preliminary study, we could find no distinct threshold value for pitch. The RGN became progressively more hiss-like with decreasing frequency. Somewhere between 100 and 50 Hz, the RGN underwent a transition from a smooth, homogeneous signal to one that sounded rough and seemed to pulsate at the repetition rate. Within this transition range, it sometimes was possible to change at will from an "analytic" mode producing the rough sensation with a temporally fine structure to a "synthetic" mode producing the smooth unitary percept. Further decrease in frequency caused the pulsing sound gradually to lose its tonal quality. While Guttman and Julesz (1963) described RGNs as sounding like "motorboating" from 19 through 4 Hz and as sounding like "whooshing" from 4 through 1 or .5 Hz, no formal data were presented for placing this transition at 4 Hz. The listeners in our laboratory also heard the boundary between motorboating and whooshing at about 4 Hz. However, the transition seemed to take place over an octave or more, and any values within this range seemed acceptable to our listeners.

It should be noted that detection of RGN iterance requires no special training. It can be heard clearly within a few seconds by inexperienced listeners for any frequency greater than 1 Hz.

Method

Subjects. Six subjects were used, each of whom had training in auditory research experimentation, music, or both. They all had heard RGNs of different frequencies before the start of the experiment.

Stimuli. The stimuli were produced by repetition of white-noise segments. Output voltage from a white-noise generator was sampled every 20 µsec and coded in 12-bit form by a digital delay line built to our specifications by the Physical Data Company. Maximum storage was 600,000 bits, corresponding to a 1-sec delay when the delay line was operating at its maximum bandwidth of 16,000 Hz. By using a frequency synthesizer as an external clock, it was possible to control the stepping rate of the delay line's shift registers. Input was bandpassed from 50 through 16,000 Hz. By closing a "recycle switch," input to the delay line was rejected, and the signal was looped or recirculated indefinitely in digital form. Digital-to-analog conversion produced an RGN with a period determined by the number of shift registers placed in the circuit and the external clock rate. An internal antialiasing filter reduced spectral artifacts corresponding to digital-to-analog conversion, and reduced transients corresponding to the closing of the digital loop (such transients could not be heard by listeners or seen on sound spectrographs). A low-pass filter (48 dB/octave cutoff) limited the frequency of signals delivered to the matched TDH-49 headphones to 8,000 Hz, producing a sharp cutoff at the upper limit of the headphones' flat response.

RGN repetition rates from 1.22 through 300 Hz were used, corresponding to periods from 823 through 3.33 msec, respectively. This range was spanned in 13 neighboring subranges, each having limits with a repetition ratio of 1:1.527. The lowest subrange, for example, extended from 1.22-1.86 Hz, the next from 1.86-2.84 Hz, etc. The subjects were free to adjust the repetition rate within each subrange by turning a knob that changed the clock frequency controlling the stepping rate of the shift registers. Each time the subrange was changed by the experimenter, a new waveform was captured from the noise generator with the delay line clock-rate set at the lowest frequency of the subrange. Adjustments by subjects within a subrange caused temporal expansion or contraction of the repeated waveform. Stimuli were heard diotically at 75 dB SPL.

Procedure. The subjects were tested while seated in an audiometric room. There were two sessions, each lasting about 30 min. In the first session, the subjects chose the RGN repetition rate corresponding to their lower limit for pitch (i.e., they judged that a decrease below this value did not produce a change in any pitch-like quality of the RGN), and also selected the RGN repetition rate within the pitch range corresponding to the transition from a smooth, homogeneous sound to a rough, pulsed sound. Each type of judgment was made twice: once with presentation of subranges by the experimenter in order of increasing repetition rates (initial subrange setting at maximum value) and once with presentation of subranges in the opposite order (with minimum initial settings). The subjects varied the repetition rates within successive subranges until the criterion value was reached by adjustment within a subrange. The experimenter then recorded the clock frequency driving the delay line, which was used to calculate the transition-threshold judgment. Judgments of the pitch/infrapitch boundary were alternated with judgments of the smooth/pulsating boundary, and presentation orders were balanced within the group of subjects. During the second testing session on a subsequent day, each subject made the same judgments, with presentation orders reversed with respect to those received in the first session.

Results and Discussion

All the subjects found it possible to make judgments of the pitch/infrapitch and smooth/pulsing category boundaries for RGNs. Each of the subjects reported that the qualitative changes corresponding to both types of transitions varied gradually with repetition frequency, with some perceptual ambiguity at the values selected for transitions.

Table 1
Repetition Frequencies (in Hertz) Corresponding to Transitions in Perceptual Quality for Repeated Gaussian Noise Segments

                      Mean Frequency
                          Order
Transition            I       D       C       SE
Infrapitch/Pitch     18.7    22.5    20.5     2.1
Pulsing/Smooth       70.5    74.0    72.2    11.7

Note. The means for increasing (I) order and decreasing (D) order of frequency presentation each represent 12 judgments (2 for each of the six subjects), with a total of 24 judgments for the combined (C) means.

The mean threshold values obtained in Experiment 1 are shown in Table 1. These values, together with earlier information for other RGN repetition rates (data for the infrapitch range reported by Guttman & Julesz, 1963; data for the portion of the pitch range above 100 Hz reported by Warren & Bashford, 1978) were used to construct Figure 1, which describes the perceptual characteristics for the entire range of readily detectable RGN repetition rates.² While boundaries separating the different perceptual qualities are represented diagrammatically at single frequencies, it should be emphasized that all transitions are gradual and occur in the vicinity of the indicated frequencies. Within the transitional range, it is often possible to direct attention to one side of the boundary or the other. Tones above 5,000 Hz are
considered amelodic in Figure 1, since musical intervals cannot be identified for frequencies above this limit (Bachem, 1948; Ward, 1954). Figure 1 also indicates the frequency ranges of neural mechanisms underlying the continuum of acoustic periodicity detection. The literature indicates that the loci of excitation maxima along the basilar membrane provide direct information concerning the frequency of spectral components of the stimulating sound from a diffuse lower limit of roughly 50 through about 16,000 Hz (Bekesy, 1960). The discharge of auditory nerve fibers associated with these loci maintains some degree of synchrony with the local stimulus frequency up to about 4,000 or 5,000 Hz (Hind, 1972; Rose, Brugge, Anderson, & Hind, 1967), so that there also can be neural time-domain cues to tonal pitch below that limit.³ As was discussed earlier, low-frequency acoustic repetition can be detected only through a neural time-domain analysis. The fundamental of a long-period infratonal RGN is below the limit for detectable spectral components, and the harmonic components of the acoustic line spectrum are too closely spaced to permit neural place resolution. Only neural periodicities corresponding to the temporal patterns of amplitude fluctuation along the basilar membrane provide information concerning acoustic iterance.

Hence, the lower infrapitch range provides a rather special opportunity for examining the results of neural periodicity analysis in the absence of neural place involvement. Experiment 2 was designed to examine the results of such an isolated neural periodicity analysis of simultaneous harmonic repetition rates.

[Figure 1. Horizontal axis: waveform repetition frequency, in hertz (.5 to 10,000) and in octaves (0 to 14).]

Figure 1. Detection of acoustic iterance for model stimuli consisting of repeated segments of Gaussian noise. Iterance is designated in Hz and also as the number of octaves above the lower frequency limit for detection. The upper portion of the figure describes the perceptual characteristics: Transitions between the qualities given are gradual, and category boundaries are diffuse, with limits approximately as shown. The lower portion of the figure describes the probable neurological mechanisms operating at particular repetition frequencies, with diagonal lines slanting downward to the left for temporal domain (neural periodicity) and diagonal lines slanting downward to the right for frequency domain (neural place). See the text for further details and discussion.

EXPERIMENT 2: Perception of Multiple Infratonal Repetition Rates

When two recycled Gaussian noises are mixed, and each of these RGNs has an identical period but a different randomly derived waveform, the resultant sound is a new recycled noise with the same period. Informal observations had shown that if mixed infratonal RGNs of the same sound-pressure level had repetition rates that differed only very slightly from unison (more than about one part in 10³), then no consistent infrapitch periodicity was heard.⁴ Loss of consistent periodicity also occurred when an infratonal RGN was mixed with on-line noise of equal intensity, indicating that each of the RGNs in the pair mistuned from unison acted as did nonperiodic noise in masking the periodicity of the other. We reasoned that, if the mistuning were increased so that the periodicities had small integral ratios, the harmonic mixture of repetition rates might exhibit one or more detectable periodicities.

RGNs with integral ratios were mixed in Experiment 2. When the ratio was 1:2, the mixture had an ensemble (or fundamental) repetition frequency equal to that of the component with the lower rate. We designed a part of our experiment to determine if this overall repetition rate could be detected and also if the higher harmonic component could be perceived, even though it did not correspond to the waveform repetition frequency. It should be noted that an autocorrelation analysis could reveal the presence of harmonic components, as will be discussed later.

In addition to the RGNs with repetition ratios of 1:2, we also used pairs with ratios of 2:3 and 3:4. Such mixtures produce a periodic waveform having a frequency differing from both components, with a fundamental or ensemble periodicity corresponding to unity. We wished to determine (1) if either or both harmonic component periodicities could be detected, even though neither corresponded to the frequency of the ensemble waveform, and (2) if the fundamental repetition rate could be detected, even though it was missing from the components. Analogies to each of these types of perception occur in the pitch range, (1) corresponding to the "hearing out" of harmonic components within complex tones (Helmholtz, 1877/1954), and (2) corresponding to perception of a pitch equivalent to the fundamental when the fundamental is missing in a complex tone (see Plomp, 1976).

Method

Subjects. Four subjects were used, each of whom had served in Experiment 1.

Stimuli. The Physical Data digital delay line described in Experiment 1 (maximum signal storage of 1 sec) was employed together with an Eventide Model 1745A digital delay line (maximum total storage of 600 msec with extra storage module) that had a 50-kHz sampling rate and coded the signal in a 10-bit form. The frequency responses of the two delay lines were matched to within ±2 dB from 50 Hz through 12 kHz at the sampling rates employed. In addition to internal antialiasing filters, the output from each delay line was bandpass filtered (48 dB/octave slopes) from 100 to 8,000 Hz. Single infratonal RGNs prepared in this fashion had no audible clicks, and sound spectrograms gave no evidence of transients corresponding to clicks.

A single crystal served as the time base for both delay lines. This clock was located in the Eventide delay line, and its output was used to drive an adjustable Rockland Model 5100 frequency synthesizer, which in turn drove the Physical Data delay line. The frequency synthesizer operated at values close to 600 kHz in driving the Physical Data delay line and could be adjusted in steps of .001 Hz.

Although each delay line had a frequency response that measured flat to within ±1.5 dB from 100 Hz through 8 kHz using a Rockland Model 512S spectrum analyzer, in order to minimize spectral bias, RGN captures on each delay line were based on recordings (made previously on the same channel) of white noise produced by the same generator passed through the other delay line set at zero delay. The repetition ratios of members of an RGN pair were accurate, as given in this paper, to at least one part in 10⁸. Six sets of component pairs were used: 1 + 2 Hz, 4 + 8 Hz, 2 + 3 Hz, 3 + 4 Hz, 8 + 12 Hz, and 12 + 16 Hz. Experimental stimuli were prepared by recording both RGNs at the same time on separate channels of an Ampex Model 440C tape recorder: Simultaneous recording prevented random variations in recording speeds from changing the precise integral ratio of repetition rates.⁵ Separate noise captures for each of the six stimulus pairs were used in each of the four experimental sessions. Recordings were used for convenience, but only after preliminary observations indicated that results were similar to those obtained with on-line RGNs.⁶

Listeners matched the RGN periods that they heard, using nonrecycled white noise at 75 dB SPL, interrupted by 30-msec silent gaps occurring regularly at a period controlled by a knob that they adjusted. For mixed pairs, component RGNs were each at 75 dB. Mixing was accomplished using a Gately SPM6 mixer.

Procedure. Preliminary experiments indicated that the fundamental period corresponding to the waveform repetition of a harmonic mixture of infratonal RGNs was perceived readily and was the strongest periodicity heard. It was necessary to use considerable caution in choosing an experimental procedure for determining whether or not the harmonic RGNs also could be detected. We could not use the procedure described by Helmholtz (1877/1954) for hearing out spectral components of a complex tone by first presenting a spectral component alone and then listening for its continuity when followed immediately by the complex tone at a louder level. When this procedure is used with an RGN (say, 3 Hz), followed immediately by a louder mixture of 2- and 3-Hz RGNs, the 3-Hz component indeed can be heard readily, along with the dominant 1-Hz periodicity. But the 3-Hz repetition can be heard to continue if followed by a "pure" 1-Hz RGN with no harmonic components, and illusory continuity can be heard even when the 3-Hz RGN is followed immediately by an on-line (nonrepeated) noise.⁷ We also observed that listeners who had had considerable practice listening to harmonic mixtures of RGNs reported perceiving infratonal harmonics within single "pure" RGNs without immediate prior exposure to that harmonic.⁸ Experiment 2 was designed so as to minimize the risk of a false demonstration of the ability to detect multiple infratonal repetition rates. Our listeners, while experienced in listening to single RGNs, were kept from prior experience with the multiple infratonal periodicities used in this study. As a further precaution, three separate measures for the detection of harmonic periodicity were used: (1) a report of the number of components heard, (2) the matching of periodicities for each of the components heard, and (3) a report of the duration of continued perception of an RGN, first heard alone, following the addition of a second harmonically related RGN (a brief silence preceded addition of the second RGN).⁹

The subjects were tested in a sound-attenuating booth and listened to stimuli diotically through matched TDH-49 headphones. They participated in four sessions, each lasting approximately 45 min. During each session, the subjects made three types of judgments with each of the six RGN pairs, finishing all judgments with one pair before moving to the next. The order in which pairs were presented was counterbalanced across subjects and sessions.

For their first task with a mixed pair, the subjects listened as long as they wished and then told the experimenter the number of different repetition frequencies that they could detect. For their second task, they matched each of the repetition frequencies heard (in order of increasing frequency) with the interruption rate of an on-line noise. They switched at will between the RGN pair and the noise interrupted by a periodic 30-msec silent gap, adjusting the interruption rate to match the repetition rate. The experimenter always set the interruption rate at the lower limit of .1 Hz before the subject started adjustment. After completing a match, the listener was presented with the component RGNs individually (with orders counterbalanced across sessions) and asked to choose which, if either of them, matched the interruption rate previously selected. (The on-line noise interrupted at the rate just chosen was available to the subject, who could switch between this sound and the single RGN at will.)

The third and final task required listening for several seconds to one component of an RGN pair heard alone and then, when ready, instructing the experimenter to add the other component. All input to the headphones was shut off for 2 sec (see Footnote 9), and the mixture was then presented. The subject kept a timing button depressed whenever the original RGN repetition rate could be heard in the mixture. The criterion for successful detection required the listener to hear the first component in the mixture for 20 continuous seconds, starting within 30 sec after the addition of the other component.¹⁰ Both members of each RGN pair were judged in this fashion, with the order of presentation counterbalanced across sessions.

Results and Discussion

When the subjects listened to the stimuli consisting of pairs of RGNs with repetition ratios of 1:2, 2:3, and 3:4, multiple rates could be heard with each of the three separate tasks. When asked how many repetition rates were heard (the first task), two or more rates were reported for more than 65% of the stimulus presentations. The second task involved matching of repetition rates, and, as shown in Table 2, matches were generally accurate. The fundamental or ensemble periodicity tended to be dominant and, as can be seen in the table, was matched to within one-half octave or better in almost every trial. Usually, in addition to matching to the fundamental, listeners were able to match to either one or both of the frequencies of the RGN pair. Finally, it was possible to continue to hear an RGN following addition of a second harmonically related RGN (the third task). Continued perception of the target RGN occurred more frequently than is listed in Table 2: Often, this component would disappear and then reappear, and so not meet the criterion for continuous detection necessary for a successful trial. Thus, each of three separate measures used in Experiment 2 indicates that listeners can hear harmonic components in a mixture of infratonal RGNs.
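The remark that an autocorrelation analysis could reveal the harmonic components of such a mixture can be illustrated with a short numerical sketch. This is not the authors' analysis: the sample rate, seed, and helper names below are assumptions chosen for illustration. Two independent noise segments are tiled at 2 Hz and 3 Hz, summed, and the normalized circular autocorrelation of the sum is read at the ensemble period (1 sec), at each component period, and at an unrelated lag.

```python
import numpy as np

def tiled_noise(period_samples, total_samples, rng):
    """Repeat one Gaussian-noise segment until total_samples are filled."""
    segment = rng.standard_normal(period_samples)
    return np.tile(segment, total_samples // period_samples)

def circular_autocorrelation(x):
    """Normalized circular autocorrelation computed through the FFT."""
    power = np.abs(np.fft.fft(x)) ** 2
    r = np.fft.ifft(power).real
    return r / r[0]

sample_rate = 12000             # assumed; makes every period a whole number of samples
total = 2 * sample_rate         # two ensemble periods of a 2-Hz + 3-Hz pair
rng = np.random.default_rng(1)  # arbitrary seed

rgn_2hz = tiled_noise(sample_rate // 2, total, rng)
rgn_3hz = tiled_noise(sample_rate // 3, total, rng)
mixture = rgn_2hz + rgn_3hz     # equal-level components, as in the mixed pairs

r = circular_autocorrelation(mixture)
r_ensemble = r[sample_rate]         # lag of 1 sec (ensemble period)
r_2hz = r[sample_rate // 2]         # lag of 1/2 sec
r_3hz = r[sample_rate // 3]         # lag of 1/3 sec
r_unrelated = r[sample_rate // 5]   # lag of 1/5 sec, a period of neither component
print(r_ensemble, r_2hz, r_3hz, r_unrelated)
```

Under these assumptions the correlation is complete at the 1-sec ensemble lag, roughly one-half at each component period (only one of the two equal-level components repeats there), and near zero at the unrelated lag. The sketch shows only that the information is present in the waveform; it makes no claim about how listeners extract it.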

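The stimulus parameters quoted in the Method section of Experiment 1 can be cross-checked with simple arithmetic. The sketch below uses only figures stated in the text (600,000 bits of storage, 12-bit coding, one sample every 20 µsec, 13 subranges with a limit ratio of 1:1.527 starting at 1.22 Hz); the variable names are illustrative.

```python
# Delay-line storage: 600,000 bits at 12 bits per sample, one sample every 20 usec.
samples_stored = 600_000 // 12
max_delay_sec = samples_stored * 20e-6

# Thirteen neighboring subranges, each spanning a ratio of 1:1.527, starting at 1.22 Hz.
edges_hz = [1.22 * 1.527 ** k for k in range(14)]
subranges = list(zip(edges_hz[:-1], edges_hz[1:]))

# Periods (in msec) at the lowest and highest repetition rates.
longest_period_msec = 1000.0 / edges_hz[0]
shortest_period_msec = 1000.0 / edges_hz[-1]

print(f"maximum delay: {max_delay_sec:.3f} sec")
for low, high in subranges:
    print(f"{low:7.2f} - {high:7.2f} Hz")
print(f"periods: {longest_period_msec:.0f} to {shortest_period_msec:.2f} msec")
```

The stored samples give the 1-sec maximum delay stated in the text, the first two subranges reproduce the quoted 1.22-1.86 Hz and 1.86-2.84 Hz limits, and the thirteenth subrange ends just under 300 Hz. The longest period computes to about 820 msec, slightly below the quoted 823 msec, which is consistent with the lowest rate having been rounded to 1.22 Hz.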
Table 2
Repetition Frequencies Perceived for Mixed Pairs of Repeated Gaussian Noise Segments

          Alone-       Acceptable    Mean Acceptable
Pair      Combined*    Matches†      Match              SE

Octave Pairs (in Hertz)
  1         100          100           1.02            .014
  2          56           50           2.04            .148
  4          94          100           3.92            .041
  8          50           25           8.89            .368

Harmonic Component Pairs and Their Fundamental Repetition Frequencies (in Hertz)
  1 (F)                  100           1.03            .017
  2          50           63           1.95            .034
  3          25            6           2.81
  1 (F)                   94           1.03            .019
  3          44           19           2.65            .224
  4          19           38           4.00            .099
  4 (F)                  100           3.96            .051
  8          63           50           9.69            .287
 12          44           19          12.00            .889
  4 (F)                  100           4.08            .064
 12          44           50          12.70            .243
 16          31           19          17.10            .480

Note. The fundamental repetition frequencies (marked F) were generated by mixing the pair of harmonic components listed below them.
*Percentage of 16 trials (4 trials for each of four subjects) during which the repetition frequency of the component listed, first heard alone, could be detected as continuing for a minimum of 20 sec after being combined with the second component.
†Percentage of trials for which matching of the repetition frequency listed was within criterion limits (see text for further details).

ADDITIONAL OBSERVATIONS AND GENERAL DISCUSSION

Let us consider again the nature of neural place and neural periodicity analyses of acoustic repetition. Wiener (1930) has shown that an autocorrelational analysis in the time domain can provide the same information provided by a power spectrum analysis in the frequency domain. Both types of analyses appear to occur with tones, but the role of each in various aspects of pitch perception is controversial, partly because either alone can explain many of the phenomena observed (for discussion of theories of pitch perception, see de Boer, 1976; Evans, 1978; Plomp, 1967, 1976).

In the lowest several octaves of detectable acoustic repetition, frequency analysis based upon place cannot provide information concerning the repetition frequency of RGNs. The fundamental and lower spectral harmonics of infratonal RGNs are not audible, and, as discussed earlier, the individual higher spectral harmonics are too closely spaced for resolution by a frequency domain analysis. Only neural patterns in the time domain can provide information concerning acoustic iterance at these low frequencies (see Figure 1).

While a time domain analysis is required for perception of infratonal frequencies, the question can be asked whether such temporal analysis of RGNs is based upon recognition of the recurrence of singularities in the pattern of stimulation or upon a holistic recognition of the repeated pattern. Earlier work with infratonal repetition, using recycled sequences consisting of three or four successive sounds (selected from among sinusoidal tones, square waves, and various noise-derived sounds), indicated that a holistic pattern recognition took place with such sequences when item durations were too brief to permit naming of individual components within the iterated patterns (Warren, 1974; Warren & Ackroff, 1976; see, also, Warren, 1976). The subjects listening to these repeated sequences could discriminate readily between permuted orders of component sounds. If recurrence of singularities was the only cue to repetition, permuted orders of the same components repeated at the same frequency could not be distinguished, since the repetition of singularities corresponding to the individual components would occur at the same rate with different internal arrangements. Only temporal relations within the patterns were changed with permuted orders.

These observations led to a preliminary experiment with RGNs. An RGN with a period of 150 msec was divided into three 50-msec segments, which we can call A, B, and C. Using programming equipment and digital delay lines, two RGNs were made available to listeners: one was ABCABCA... and one was ACBACBA.... Even untrained listeners could distinguish between these two patterns, indicating that temporal relations within the patterns could be appreciated readily. It appears that the detection of infrapitch periodicity may be related to what has been called "echoic memory" (Neisser, 1967) or "tape-recorder memory" (Norman, 1972).

In music, "rhythm" can be considered as repetition of special auditory patterns at infratonal frequencies. It is of interest that music plays with acoustic repetition (and systematic variations from strict repetition) for frequencies covering most of the range shown in Figure 1, extending upward through the melodic range to within a few octaves of the limit of audibility, and extending downward in the rhythmic range for a few octaves below the limit for "whooshing" of repeated Gaussian noise segments (the value of .5 Hz in Figure 1 is not intended as the lowest detectable iteration frequency for acoustic patterns, even for RGN; see Footnote 2). Music can involve multiple infratonal periodicities of considerable complexity: African polyrhythms use sequences of percussive sounds that can contain several harmonically related rhythmic frequencies, and listeners are able to attend either to the ensemble periodicity or to one or another of the rhythmic components. However, unlike the case with RGNs, rhythmic percussive lines can be followed even when they deviate from harmonic relations with other lines. Some African music uses "additive polyrhythms" in which component inharmonic periodicities are each perceived (see Sachs, 1953), while our work has shown that mixing of inharmonic RGNs inhibits detection of any periodicity.¹¹

We have seen that consideration of RGNs as model stimuli led to the hypothesis that responses to acoustic periodicities from roughly .5 through 16,000 Hz represent the single perceptual continuum described by Figure 1.¹² This hypothesis in turn led to the expectation that some phenomena found in the pitch range may be found for infrapitch as well. Experiment 2 describes the first of these analogs, but we have looked for, and found, others. One of these analogs is infrapitch echo.

When a noise is mixed with its echo having a delay of T sec, a pitch corresponding to 1/T Hz can be heard. This "echo pitch" has been reported to occur for delays from about .02 sec through 5 × 10⁻⁴ sec, corresponding to pitches from 50 through 2,000 Hz (for reviews, see Bilsen, 1970; Yost & Hill, 1978). Theories concerning the basis of echo pitch generally involve the spectral peaks in the power spectra that occur at integral multiples of 1/T Hz. Indeed, one of the names for this sound, "rippled noise pitch," refers to its power spectrum.

However, we considered that analysis in the temporal domain might be involved as well, so that infrapitch echo might be detectable. Investigation led to the finding that echo delay of noise can be identified down to at least 2 Hz (T = .5 sec), although pitch is indeed lost at about 50 Hz. Infrapitch echo is marked by an intermittent quality and is similar to (but weaker and more difficult to hear than) RGNs of the same repetition frequency. The spectral peaks of infrapitch echo are too closely spaced along the basilar membrane for a power spectrum analysis based on place of stimulation to be effective, and detection appears to be based on analyses in the temporal domain. Further discussion of infrapitch echo is given elsewhere (Warren, Bashford, & Wrightson, 1980).

Both Experiment 2 and infrapitch echo involve temporal patterns superimposed on the same cochlear loci. The different patterns of musical polyrhythms also stimulate identical (or overlapping) regions of the basilar membrane. But what would be perceived if harmonically related infrapitch repetitions were delivered to separate loci on the same cochlea with no spatial overlap? In the pitch range, when the lower harmonic components of a complex tone stimulate different loci on the basilar membrane, integration of frequency information takes place, and a single pitch corresponding to the fundamental frequency (or waveform repetition frequency) generally is heard, even when the fundamental is missing from the harmonic sequence of spectral components. The relative contributions of spatial domain and temporal domain information to this integration are a matter of some dispute (for review, see Plomp, 1976). However, we have found that it is possible to examine integration of temporal information at different loci in the absence of place cues to repetition by using harmonic infratonal iterances.

A beating sinusoidal tone pair can be used to produce any frequency of infratonal amplitude modulation at any frequency-sensitive place on the basilar membrane. The difference in frequencies of the pure tones determines the beat rate, and the center frequency of the tone pair determines the place of this infratonal amplitude modulation. Use of several simultaneous pairs of beating tones permits generation of a harmonic sequence of beats (a complex beat), with any desired locus for individual harmonic components. It was found that pooling of the infratonal temporal information took place under all conditions employed, including those involving nonoverlapping widely separated positions on the basilar membrane. Even when the fundamental rate of the complex beat was missing, the ensemble or fundamental infratonal iterance was heard (Warren, Note 1).

Hence, while Experiment 2 and infrapitch echo detection indicate that temporal patterns that are mixed at the same locus can be differentiated, complex beats demonstrate that separate temporal patterns at different loci can be integrated. Temporal domain analysis appears to be quite versatile and capable of producing effects at infratonal frequencies analogous to those encountered with tones.

REFERENCE NOTE

1. Warren, R. M. Auditory integration of multiple beat rates. Manuscript submitted for publication, 1980.

REFERENCES

BACHEM, A. Chroma fixation at the ends of the musical frequency scale. Journal of the Acoustical Society of America, 1948, 20, 704-705.
BEKESY, G. VON. Experiments in hearing. New York: McGraw-Hill, 1960.
BILSEN, F. A. Repetition pitch; its implication for hearing theory and room acoustics. In R. Plomp & G. F. Smoorenburg (Eds.), Frequency analysis and periodicity detection in hearing. Leiden, The Netherlands: Sijthoff, 1970.
BILSEN, F. A., & RITSMA, R. J. Repetition pitch and its implication for hearing theory. Acustica, 1969/70, 22, 63-73.
DE BOER, E. On the 'residue' and auditory pitch perception. In W. D. Keidel & W. D. Neff (Eds.), Handbook of sensory physiology: Auditory system, clinical and special topics (Vol. 5, Pt. 3). Berlin: Springer-Verlag, 1976.
EVANS, E. F. Place and time coding of frequency in the peripheral auditory system: Some physiological pros and cons. Audiology, 1978, 17, 369-420.
GUTTMAN, N., & JULESZ, B. Lower limits of auditory periodicity analysis. Journal of the Acoustical Society of America, 1963, 35, 610.
GUTTMAN, N., & PRUZANSKY, S. Lower limits of pitch and musical pitch. Journal of Speech and Hearing Research, 1962, 5, 207-214.
HELMHOLTZ, H. L. F. VON. On the sensations of tone as a physiological basis for the theory of music (2nd English ed.). New

York: Dover, 1954. (Translation conformal with 4th German ed., 1877)
HIND, J. E. Physiological correlates of auditory stimulus periodicity. Audiology, 1972, 11, 42-57.
HOUTGAST, T. Psychophysical evidence for lateral inhibition in hearing. Journal of the Acoustical Society of America, 1972, 51, 1885-1894.
NEISSER, U. Cognitive psychology. New York: Appleton-Century-Crofts, 1967.
NORMAN, D. A. The role of memory in the understanding of language. In J. F. Kavanagh & I. G. Mattingly (Eds.), Language by ear and by eye: The relationships between speech and reading. Cambridge, Mass: M.I.T. Press, 1972.
PLOMP, R. The ear as a frequency analyzer. Journal of the Acoustical Society of America, 1964, 36, 1628-1636.
PLOMP, R. Pitch of complex tones. Journal of the Acoustical Society of America, 1967, 41, 1526-1533.
PLOMP, R. Aspects of tone sensation. London: Academic Press, 1976.
ROSE, J. E., BRUGGE, J. F., ANDERSON, D. J., & HIND, J. E. Phase-locked response to low-frequency tones in single auditory nerve fibers of the squirrel monkey. Journal of Neurophysiology, 1967, 30, 769-793.
SACHS, C. Rhythm and tempo. New York: Norton, 1953.
WARD, W. D. Subjective musical pitch. Journal of the Acoustical Society of America, 1954, 26, 369-380.
WARREN, R. M. Illusory changes of distinct speech upon repetition - The verbal transformation effect. British Journal of Psychology, 1961, 52, 249-258.
WARREN, R. M. Auditory temporal discrimination by trained listeners. Cognitive Psychology, 1974, 6, 237-256.
WARREN, R. M. Auditory illusions and perceptual processes. In N. J. Lass (Ed.), Contemporary issues in experimental phonetics. New York: Academic Press, 1976.
WARREN, R. M., & ACKROFF, J. M. Two types of auditory sequence perception. Perception & Psychophysics, 1976, 20, 387-394.
WARREN, R. M., & BASHFORD, J. A. Production of white tone from white noise and voiced speech from whisper. Bulletin of the Psychonomic Society, 1978, 11, 327-329.
WARREN, R. M., BASHFORD, J. A., JR., & WRIGHTSON, J. M. Infrapitch echo. Journal of the Acoustical Society of America, 1980, 68, 1301-1305.
WARREN, R. M., OBUSEK, C. J., & ACKROFF, J. M. Auditory induction: Perceptual synthesis of absent sounds. Science, 1972, 176, 1149-1151.
WIENER, N. Generalized harmonic analysis. Acta Mathematica, 1930, 55, 117-258.
YOST, W. A., & HILL, R. Strength of the pitches associated with ripple noise. Journal of the Acoustical Society of America, 1978, 64, 485-492.

NOTES

1. We will use the terms "tone" and "pitch" in keeping with the following definitions approved by the American National Standards Institute: "A tone is a sound wave capable of exciting an auditory sensation having pitch" and "Pitch is that attribute of auditory sensation in terms of which sounds may be ordered on a scale extending from high to low" (ANSI S3.20-1973). This scale extending from high to low usually is taken as melodic pitch rather than detectable repetition (e.g., Guttman & Pruzansky, 1962, considered 19 Hz to be the lowest frequency imparting "pitch" to pulse trains). Therefore, we will consider RGNs with detectable periodicities lying below the limits of musical tonality to be "infratones" producing an "infrapitch" that can be ordered from fast to slow (rather than from high to low).

2. Guttman and Julesz (1963) gave .5 Hz as the lower limit for RGN detection, using as their criterion relatively easy detection achieved with relatively little training. While they gave no specific values, they indicated that a practiced observer could, with effort, achieve detection of repetition at lower frequencies. We found that our best listener could, after considerable training, tap out the period for unknown waveform repetition frequencies down to at least .1 Hz. These very low frequencies were generated using two recirculating digital delay lines, say, one producing an RGN at 2 Hz and the other producing an RGN at 2.1 Hz; when the RGNs were mixed, the ensemble waveform was repeated at .1 Hz.

3. For a detailed discussion of peripheral place and time coding of tonal frequencies and for interesting suggestions concerning their possible interactions, see Evans (1978).

4. It was necessary to use uncorrelated segments of white noise for these observations of mistuned unison. When we recycled the very same noise segment having a repetition rate of 2 Hz on two separate delay lines and increased the clock frequency driving one of the delay lines to change one of these identical waveforms to 2.2 Hz, then cyclical pitch variations occurred at a rate of .2 Hz (i.e., once every 5 sec) when the RGNs were mixed. A rising pitch was heard as the component RGN waveforms moved into alignment, followed immediately by a falling pitch as they moved out of alignment. These glissandos appear to be a form of echo pitch (for discussions of echo pitch or repetition pitch, produced by mixing two sounds of similar or identical waveforms, see Bilsen & Ritsma, 1969/70; Plomp, 1976). Analogous double glissandos also have been described for mistuned pitch-range RGNs (Warren, Note 1).

5. As with mistuned unisons (ratio 1:1), detection of mixed infrapitch RGNs with periodicity ratios of 1:2, 2:3, 3:4 was quite sensitive to mistuning from integral ratios. Mistuning by more than one part in 10³ disrupted detection of regular periodicities, and mistuning by one part in 10⁴ or 10⁵ produced a continual change in the quality of the periodicity heard. Mistuning of one part in 10⁶ was generally undetectable, and, in the experiment reported, mistuning was well below the limit of detectability.

6. While random variations in the speed of the tape recorder used in Experiment 2 caused perturbations in frequency of about one part in 10⁴, such variation did not affect the playback ratio of the frequencies of the RGNs that had been recorded simultaneously, so that their moment-by-moment interaction produced a fixed periodic waveform.

7. This apparent continuity seems to result from a temporal induction, which has been described by Warren, Obusek, and Ackroff (1972) as follows: "If there is contextual evidence that a sound may be present at a given time, and if the peripheral units stimulated by a louder sound include those which would be stimulated by the anticipated fainter sound, then the fainter sound may be heard as present." Since infratonal RGNs and Gaussian noise are broadband, they stimulate all loci along the basilar membrane, and illusory continuity is heard when the RGN is followed by the noise. For a more detailed description of temporal induction and a comparison with the related concept of "pulsation threshold" by Houtgast (1972), see Plomp (1976).

8. It should be noted that illusory continuity is not the only type of temporal induction. Sufficient training or experience can lead to an anticipated sound being heard within any louder sound that could be masking the anticipated signal were it present (see Warren, 1976).

9. The brief silent interval before presentation of the mixture was necessary to inhibit the illusory continuity of a noise through a subsequent temporally contiguous louder noise. Such illusory continuity of a fainter noise has been reported to persist for several tens of seconds (Warren, Obusek, & Ackroff, 1972).

10. It was sometimes necessary for the listener to "search" for several seconds before teasing out the target periodic component from the mixture; hence, the allowance for initial searching time. Identification of the target was not considered successful unless it could be heard continuously for 20 sec, in order to avoid the possibility of a transient misidentification. Preliminary experiments using an initial target RGN, followed (after a .2-sec pause) by a harmonically related single RGN (3 dB louder than the target), indicated the effectiveness of the procedure employed for avoiding false identifications of components in mixed RGNs.

11. The differences noted in perception of polyrhythms and mixed infratonal RGNs are not inconsistent with our suggestion that repeated Gaussian noise be considered as a general or model stimulus: That is, the rules governing perception of RGNs apply to other audible periodic sounds at corresponding repetition frequencies; but also, additional rules limited to special repeated sounds may be found. Other complex sounds with infratonal periodicities and special additional characteristics are repeated sequences containing three or four component sounds (Warren, 1974) and repeated single words (Warren, 1961, 1976).

12. It should be noted that specific sounds may not be suitable for periodicity detection over this entire range. Since sinusoidal tones become inaudible below about 20 Hz, they are eliminated as stimuli for low frequencies. Pulse trains at low frequencies are heard as clicks separated by silence, and the only lower limit for detection of repetition is set by the patience of the listener.

(Received for publication November 3, 1980; revision accepted February 23, 1981.)
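The arithmetic behind Notes 2 and 4 can be illustrated with a short sketch. It is not the authors' procedure. It assumes only that each RGN is strictly periodic and that both repetition frequencies are exact rational numbers, in which case the mixture repeats at the greatest common divisor of the two frequencies. The function name `ensemble_frequency` is hypothetical and introduced here for illustration.

```python
from fractions import Fraction
from math import gcd


def ensemble_frequency(f1, f2):
    """Repetition frequency (Hz) of a mixture of two periodic waveforms.

    Assumption (not from the paper): f1 and f2 are exact rationals, given
    as decimal strings or integers, so the mixture repeats at gcd(f1, f2).
    """
    a, b = Fraction(f1), Fraction(f2)
    # For fractions in lowest terms, gcd(p1/q1, p2/q2) = gcd(p1, p2) / lcm(q1, q2).
    numerator = gcd(a.numerator, b.numerator)
    denominator = a.denominator * b.denominator // gcd(a.denominator, b.denominator)
    return Fraction(numerator, denominator)


# Note 2: RGNs at 2 Hz and 2.1 Hz give an ensemble waveform repeating at .1 Hz.
note2 = ensemble_frequency("2", "2.1")
print(note2, "Hz; period", 1 / note2, "sec")

# Note 4: identical waveforms at 2 Hz and 2.2 Hz realign at .2 Hz (every 5 sec).
note4 = ensemble_frequency("2", "2.2")
print(note4, "Hz; period", 1 / note4, "sec")
```

Frequencies are passed as strings so that 2.1 is treated as exactly 21/10 rather than as a binary floating-point approximation, which would yield an extremely long and physically meaningless common period.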