
Recognition of Music in Long-Term Memory: Are Melodic and Temporal Patterns Equal Partners?


Memory & Cognition, 1997, 25 (4), 518-533


SYLVIE HEBERT and ISABELLE PERETZ
University of Montreal, Montreal, Quebec, Canada

The notion that the melody (i.e., pitch structure) of familiar music is more recognizable than its accompanying rhythm (i.e., temporal structure) was examined with the same set of nameable musical excerpts in three experiments. In Experiment 1, the excerpts were modified so as to keep either their original pitch variations, with durations set to isochrony (melodic condition), or their original temporal pattern, played on a single constant pitch (rhythmic condition). The subjects, who were selected without regard to musical training, were found to name more tunes and to rate their feeling of knowing the musical excerpts far higher in the melodic condition than in the rhythmic condition. These results were replicated in Experiment 2, wherein the melodic and rhythmic patterns of the musical excerpts were interchanged to create chimeric mismatched tunes. The difference in saliency of the melodic pattern and the rhythmic pattern also emerged with a music-title-verification task in Experiment 3, hence discarding response selection as the main source of the discrepancy. The lesser effectiveness of rhythmic structure appears to be related to its lesser encoding distinctiveness relative to melodic structure. In general, rhythm was found to be a poor cue for the musical representations that are stored in long-term memory. Nevertheless, in all three experiments, the most effective cue for music identification involved the proper combination of pitches and durations. Therefore, the optimal code of access to long-term memory for music resides in a combination of rhythm and melody, of which the latter would be the most informative.

Author Note. Order of authors is alphabetical, since both authors equally contributed to the paper. This research was supported by a predoctoral fellowship to the first author and by a research grant to the second author, both from the National Science and Engineering Research Council of Canada. The authors are particularly grateful to Brian Ross, Marilyn Boltz, Andrea Halpern, and one anonymous reviewer for their constructive and insightful comments made on an earlier draft of this paper. Correspondence should be addressed to I. Peretz, Departement de Psychologie, Universite de Montreal, C.P. 6128, succ. Centre-ville, Montreal, Quebec, Canada H3C 3J7 (e-mail: [email protected]).

Recognition of familiar music is immediate and easy for everyone. Despite this apparent effortlessness, music recognition is a complex procedure that recruits multiple processing components. At the very least, recognition of a familiar piece of music entails a mapping procedure between the sensory input and a stored long-term representation that captures some of the invariant properties of the music. These long-term representations are assumed to be stored in a network that contains all the specific musical pieces to which one has been exposed during her/his lifetime. Given the size and variety of this network, it is essential to specify the nature of the sensory code that is used by the mapping procedure to select a particular stored representation.

There are a large number of sensory cues that can, in principle, serve as the entry code(s) to the network. Among these, the cues that arise from sequential variations along the pitch and the temporal dimension are apparently the most determinant. We will refer to these two classes of information with the generic terms melody and rhythm, as defined by pitch and temporal variations, respectively. Variations in intensity (dynamics), in spectral composition (timbre), and in speed (tempo) probably facilitate recognition insofar as these characteristics respect the structure of the original music or of its most frequently heard version. These variations are, however, considered secondary in that they are not determining factors of music recognition. For instance, most listeners easily recognize their national anthem, regardless of whether it is sung, played on an instrument lacking dynamics (such as the harpsichord), or played fast or slowly (within certain limits, as shown by Warren, Gardner, Brubaker, & Bashford, 1991). Thus, music recognition appears to rely most heavily on melodic and temporal cues. The question raised in the present study concerns their relative effectiveness as access codes to the musical stored representations.

This question grew out of neuropsychological observations. In a former study (Peretz et al., 1994), we found that brain damage can interfere with the ability to recognize music in an interesting way. Two patients (C.N. and G.L.), whose speech and cognitive functions were otherwise normal, were found to be unable to identify or experience a feeling of knowing for musical excerpts that were highly familiar to them before their brain injury. Prior knowledge of the music was checked with presentation of the titles or lyrics that usually accompany the musical selections; this verbal information was readily identified by the 2 patients. Thus, both patients suffer from an auditory deficit that is specific to music. What intrigued us is that both patients were also found to be severely impaired in their perceptual analysis of sequential pitch variations. In contrast, none of them was impaired in discriminating and retaining

Copyright 1997 Psychonomic Society, Inc.

temporal patterns. Thus, in principle, both patients could have used rhythm as a basis for recognition of familiar music. The fact that the patients could not recognize familiar tunes raised the possibility that melodic information, whose processing was impaired in both patients, is critical to access long-term memory for music.

The literature is scarce on this issue. To our knowledge, the only study that has dealt with the relative importance of pitch over temporal cues in the recognition of familiar music by normal listeners is that of White (1960). In that study, listeners were required to recognize familiar musical excerpts that had undergone several transformations. Among these, two are of relevance here. One transformation consisted of presenting the excerpts isochronously, that is, with every tone being of the same duration; only the melody was kept as in the original. The other transformation consisted of the reverse; that is, the excerpts were played on a single pitch, keeping only the original temporal pattern. When presented with the isochronous sequences, listeners achieved 88% correct identification. With the temporal pattern, they were 33% correct (chance level being 10%, as will be clarified below). This latter result clearly indicates that recognizing a well-known piece of music from its temporal structure alone is not straightforward, despite its wide acceptance as a reliable cue for music recognition (Dowling & Fujitani, 1971; Halpern, 1984; Lenneberg, 1967; Sturges & Martin, 1974). What this study clearly shows, however, is that melody is more determinant than rhythm in the recognition of familiar music. This outcome is thus highly consistent with the neuropsychological observations.

There are nevertheless a number of methodological points in White's (1960) study that deserve closer examination. One of these points is that top-down processing (i.e., from memory to perception), by way of consulting the stored representations to enhance recognition of the perceptual input, may have played a large role in White's results. Top-down processing was promoted by the fact that only 10 different musical selections were employed and that all 10 titles were provided as written multiple choices on the response sheet. Thus, subjects knew from the beginning of the experiment what the possibilities were. Moreover, each of these 10 excerpts was presented in 12 different formats along with its original version. Through repetition, even under substantial transformation, recognition may have been facilitated. Therefore, anticipation and prior exposure may have boosted performance to an artificially high level. Yet, it is unlikely that these factors fully account for the rather large discrepancy observed between recognition on the basis of the temporal structure and recognition on the basis of melodic structure; this difference appears genuine, although it requires a more controlled assessment.

Reexamination of White's study was thus the primary goal of the present study. In particular, it is of interest to distinguish bottom-up processing, which is driven by sensory information, from top-down processing, which can arise from the activation of stored representations via conceptual information such as that provided by titles. The idea is that, by distinguishing the data-driven and memory-driven sources of information, we will be able to better specify both the nature of the perceptual cues that are effective in activating the appropriate musical stored representation and the content of these memory representations. Given the paucity of research in the area, specifying the general conditions under which melody and rhythm contribute to the recognition of familiar music represents a much needed enterprise.

In effect, the primacy of pitch over time factors cannot be considered as a general property of music processing. This primacy can be reversed, depending on the experimental context. In judging similarity between pairs of brief unfamiliar musical sequences, subjects have been shown to exhibit higher reliance on temporal factors than on melodic ones (Monahan & Carterette, 1985). In judging whether or not a musical sequence was identical to a comparison sequence interleaved with distractor tones, subjects' attention was governed by rhythmic expectancies more than by melodic ones (Dowling, Lung, & Herrbold, 1987). These two studies illustrate that temporal information may take precedence over pitch information in some conditions. At the present stage, it is difficult to identify the parameters that influence the saliency of melody relative to rhythm in music processing. The experimental situations differ on too many levels, such as familiarity of the musical material, task demands, and musical background of the subjects, to draw general principles. By following up on White's study and examining more systematically the respective contributions of the melodic and the temporal structure to the recognition of familiar music, we aim to enlarge the database and, in doing so, provide a firmer ground against which the primacy of pitch over time factors can be weighted.

We explored this question in three experiments, using the same set of 25 familiar musical excerpts taken from popular songs. The selection of this particular subset was motivated by the need to collect naming judgments, which are among the best estimates of the precision of recognition. The stimuli thus correspond to the few musical excerpts that elicited name agreement in the vast majority of subjects in a former normative study (Peretz, Babai, Lussier, Hebert, & Gagnon, 1995). Experiment 1 was basically a replication study of White (1960) with unanticipated and single presentation of stimuli. Subjects were presented with all 25 excerpts in either a melodic format, by being isochronous, or in a temporal format, by being produced on a constant pitch. Their task was to name, if possible, each excerpt and to rate their feeling of knowing it on a 5-point scale.

In Experiment 2, the melodic pattern of one excerpt was combined with the temporal pattern of another excerpt so that each component was equally familiar but improperly matched. This procedure was used so as to achieve maximal comparability in isolating the temporal and melodic structure. Subjects were required to perform the same task as in Experiment 1, while focusing on melody and ignoring rhythm, or vice versa. Finally, in Experiment 3, a music-title verification task was employed instead of the naming task so as to measure the

possibility raised in the context of White's study that access to the title favors top-down processing.

EXPERIMENT 1

Experiment 1 compared melodic and temporal patterns as determinants of the first contact between an isolated and unanticipated familiar musical fragment and its representation in long-term memory. To do so, two groups of university subjects, selected without regard to musical training, were presented with either the melodic or the rhythmic version of 25 nameable musical excerpts. In the rhythmic group, subjects were presented with the temporal patterns devoid of pitch variations. In the melodic group, subjects were tested with the melodic patterns devoid of temporal variations. After the test, prior knowledge of the stimuli was verified with the original version of the tunes, that is, with their melodic and rhythmic patterns combined. For each stimulus, subjects were required to provide a naming response and a feeling-of-knowing judgment.

We take naming as the best indicator of the achievement of the mapping procedure between the stimulus and a stored memory representation, and we take feeling-of-knowing judgments as a less precise but convergent measure. Feeling of knowing was assessed by way of a 5-point rating scale, with 1 meaning unknown (i.e., no feeling of recognizing the stimulus) and 5 meaning very familiar (a high degree of confidence in recognizing it). These judgments usually refer to meta-memory phenomena (see Nelson, 1984, for a review). However, in the present study, these judgments are treated as an estimate of the retrievability of the target, following Koriat's (1993) conceptualization. They are therefore considered as a comparative measure of the amount and intensity of activation of a stored representation that is elicited by hearing each musical excerpt when recovery of its title has failed. Numerous studies have shown that subjective conviction, as reflected by feeling-of-knowing judgments, is predictive of objective performance (e.g., Leonesio & Nelson, 1990). Therefore, in the present study, subjects who fail to retrieve the names of the presented music might nevertheless estimate with better-than-chance success whether they would be able to produce them in response to more information, as when presented with the original version of the music.

Method
Subjects. Thirty-two adults from the university community (range = 22-48 years of age; M = 27.7 years) volunteered to participate in this experiment. Since music is culture-dependent to an important degree, the subjects were required to have been living in Quebec since at least 2 years of age and to have French as their first language. They were recruited without regard to musical training. About a third of the subjects were categorized as musically experienced, having had 5 or more years of formal musical training (defined as lessons on an instrument or in [...]); none were professional musicians.

Stimuli and Material. The 25 familiar musical excerpts that served as stimuli (and are fully provided in Appendix A) were chosen from our pool of 144 musical excerpts normalized for familiarity and verbal associations (Peretz et al., 1995). The excerpts correspond to the musical part of folk songs and were rated as highly familiar, with a mean score of 4.9 and a standard deviation of 0.36 on a 5-point scale, with 1 meaning unknown and 5 meaning very familiar. All stimuli were correctly named by at least 80% of the 60 subjects tested in the earlier study. No more than 25 musical selections fulfill this latter criterion. Since these judgments were obtained with computer-generated fragments, there was no need to verify the appropriate length, tempo, or instrumentation for full recognizability of the stimuli. Each fragment consisted of the basic melodic and temporal patterns as written in available scores. No dynamic, timbre, accompaniment, or performance timing variations were present. The fragments were one or two phrases long (according to Piston's, 1978, criteria), contained 10-30 notes (M = 16.9 notes), and lasted 3.9-14.5 sec (M = 8.5 sec). This version of the stimuli will be referred to as the combined version, since it incorporates both melodic and temporal variations.

Each excerpt was then submitted to two transformations that corresponded to the two different experimental versions. In one version, all pitch variations were removed and replaced by a single pitch that could vary from one excerpt to the next but that remained constant within the same excerpt (the pitches were selected from the middle range, from C4 to C5). This transformation kept the original variations in duration and tempo; it will be referred to as the rhythmic version. In the melodic version, all temporal variations were removed, leaving only the original pitches. Each note duration was set to 600 msec, a value that is representative of a musical note duration according to Fraisse (1974). This procedure somewhat lengthened the stimuli in the melodic condition, with a mean duration of 10.2 sec. Musical notations for the three versions (combined, rhythmic, and melodic) as applied to "Bonne Fete" ("Happy Birthday") are displayed in Figure 1.

The stimuli were randomly sorted in each version and were generated by a Yamaha TX-81Z synthesizer controlled by an IBM-compatible computer. The analog output was preset to a MIDI timbre imitating the sound of a pan flute and was recorded onto a cassette tape. Tapes were played back to the subject in free field at a comfortable listening level.

Procedure. The subjects were tested in small groups and were assigned randomly but in equal number to either the rhythmic or the melodic version. The subjects were first informed about the nature of the transformations applied to the sequences that they would be hearing. The session began with two examples, which had been taken from the same pool of familiar musical selections but had not been selected as stimuli. These two examples were transformed in the same way as the following experimental trials. The experimenter provided the correct response after each of these two examples and then told the subjects that they would no longer obtain feedback in the following experimental trials. The subjects were required to rate, on a 5-point scale, whether they had recognized the sequence (with 1 meaning I did not recognize it at all, and 5 meaning I did recognize it very well). They were further required to write down the title of the presented musical stimulus whenever possible. They had 6 sec to respond. After having been presented with the experimental version, the combined version was presented to them as a control for their prior knowledge of the stimuli. With the combined version, the subjects performed the same rating and naming tasks within the same time limits (i.e., 6 sec). The whole testing session lasted approximately 30 min.

Results and Comments
The subjects made few naming errors (corresponding to 12% of the total number of errors); most errors were omissions. These naming scores (with 1 for a correct title and 0 for an omission or an incorrect title) as well as the ratings of knowing (from 1 to 5) were submitted to separate

Figure 1. Example of stimuli used in Experiments 1 and 3. The combined, melodic, and rhythmic versions of "Bonne Fete" ("Happy Birthday") are displayed in the top, middle, and bottom musical scores, respectively.
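The two transformations illustrated in Figure 1 amount to simple operations on a note list: flatten pitch while keeping durations (rhythmic version), or flatten durations to 600 msec while keeping pitches (melodic version). The following sketch illustrates the logic over a hypothetical (MIDI pitch, duration in msec) representation; the note list, the constant-pitch choice, and the representation itself are our assumptions for illustration, not the original synthesizer-control software.

```python
# Illustrative sketch of the two stimulus transformations.
# A tune is a list of (midi_pitch, duration_ms) notes; the constant
# pitch (C4 = MIDI 60, within the C4-C5 range mentioned in the text)
# and the example values below are assumptions.

ISOCHRONOUS_MS = 600  # representative note duration (Fraisse, 1974)

def rhythmic_version(tune, constant_pitch=60):
    """Keep the original temporal pattern; play every note on one pitch."""
    return [(constant_pitch, dur) for _pitch, dur in tune]

def melodic_version(tune, note_ms=ISOCHRONOUS_MS):
    """Keep the original pitch pattern; set every duration to 600 msec."""
    return [(pitch, note_ms) for pitch, _dur in tune]

# Hypothetical opening of "Bonne Fete" ("Happy Birthday"):
bonne_fete = [(60, 300), (60, 300), (62, 600), (60, 600), (65, 600), (64, 1200)]
```

Applying `rhythmic_version(bonne_fete)` leaves only the duration pattern on a single pitch, and `melodic_version(bonne_fete)` leaves only the pitch contour at a constant 600-msec pace, mirroring the bottom and middle scores of Figure 1.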

analyses of variance (ANOVAs). Each type of response was analyzed by considering either subjects as the random variable (F1) or items as the random variable (F2). Group (rhythmic vs. melodic) and version (experimental vs. combined) were the between- and within-subjects factors, respectively, when subjects were the random variable. Group (melodic vs. rhythmic) and version (experimental vs. combined) were both treated as within-items factors when items were the random variable.

In Figure 2, the scores are represented as percentages of correct naming (left panel) and as mean ratings of knowing judgments (right panel) for each group tested with either the rhythmic or the melodic version, as well as their results on the combined version. As can be seen, the two experimental groups did not differ in their naming scores for the combined musical excerpts, with means of 90% and 92% for the melodic group and the rhythmic group, respectively. Similarly, their confidence in knowing the combined excerpts was very high. As expected from our previous normative work, there was no noticeable difference between groups in prior knowledge of the musical material. All excerpts were easily named and well recognized. This was not the case for the melodic and rhythmic versions, as revealed by the interaction between group and version [F1(1,30) = 80.73, MSe = 6.34; F2(1,24) = 63.08, MSe = 5.19; both ps < .001]. Although the subjects generally performed more poorly with the experimental version than with the combined version (p < .05, by Tukey A post hoc comparisons), they produced more correct titles with the melodic version (M = 49.2%) than with the rhythmic version (M = 6%) [t(30) = 7.348, by an unpaired t test computed across subjects; t(24) = 8.050, by a paired t test across items; both ps < .01]. It is noteworthy that the correct naming responses obtained in the melodic version were distributed across 19 of the 25 stimuli. Six stimuli never elicited the correct title (for no obvious reasons), whereas the others did in a fairly distributed manner across both subjects and items.

The feeling-of-knowing judgments yielded a similar pattern of results (see right panel in Figure 2). The interaction between group and version was also significant [F1(1,30) = 62.7, MSe = 0.44; F2(1,24) = 68.93, MSe = 0.34; ps < .001]. As predicted, the ratings were reliably higher for the melodic version than for the rhythmic version [t(30) = 9.746, and t(24) = 8.228; both ps < .001]. Given the scale in use, with 1 meaning unrecognized and 5 meaning well recognized, a mean rating below 2.5 indicates low recognizability of the stimulus, whereas a score above 2.5 signals retrievability. Accordingly, rhythmic patterns of the familiar tunes were, on average, perceived as unfamiliar (M = 1.6), whereas the melodic patterns were judged overall to be familiar (M = 3.5).¹

The performance obtained here with the combined version may have been enhanced by prior presentation in a different format, since all subjects were first tested with either the rhythmic or the melodic version of the same stimuli. To control for this possible bias, a new group of 15 subjects, selected along the same criteria as in the present experiment, were presented with the combined version only and were required to perform the same task. Essentially the same high level of performance was obtained from these unprimed subjects. Their confidence in knowing the stimuli was extremely high (M = 4.95), and this was confirmed by their near-perfect naming scores (M = 93.2% correct). When the data obtained with this group are compared statistically with those of the subjects who performed the same tasks with the experimental versions, the same effects and interaction as those reported earlier were obtained. We can therefore confidently reject the possibility that the high levels of performance previously obtained with the combined version were due to repetition priming.

Musical training was not found to exert any significant influence in the present experiment. Number of years of instrument practice for each subject did not correlate with the ratings of feeling of knowing [r(29) = .10] or with the number of correct naming responses [r(29) = .09] across the two experimental versions.

In summary, the present experiment replicated the results initially reported by White (1960) with a larger sub-

Figure 2. Mean percentages of correct naming (left panel) and mean ratings of knowing judgments (right panel) obtained by the melodic and rhythmic groups with the experimental versions and with the combined control condition in Experiment 1. Error bars represent the standard error of the mean (equal to 0.0 for knowing judgments in the combined version).

set offamiliar and unanticipated tunes. The subjects here tern but came from different tunes. This procedure retains named about halfofthe musical excerpts on the basis of the same genre of musical structure and the same level their melodic structure alone and only 6% on the basis of of familiarity for the musical components while depart­ their temporal structure. Thus, the present results con­ ing from the original arrangement ofpitch and duration. firm the notion that familiar music is not easily identi­ Subjects were instructed to focus their attention on ei­ fied by its rhythm. The melodic pattern is clearly more ther the melodic or the rhythmic pattern ofthese chimeric informative in this regard, although it does not fully ac­ tunes; otherwise, task demands were identical to those count for music identification. Naming a tune on the of Experiment 1. Ifrecognition from melodic variations basis of its melodic pattern does not reach the level ob­ was indeed facilitated by isochrony in Experiment 1, tained for the original excerpt where pitches and durations then presenting the same melodic information with a new are combined. rhythm-that is, with temporal variations that are un­ related to the original tune-should considerably impair EXPERIMENT 2 recognition performance.

The results obtained in Experiment I may be conceived Method as arising from the way pitches and durations are differ­ Subjects. Twenty-eight students (range = 19-42 years of age; entially structured in popular songs. Popular music is very M = 25 years) were selected using the same criteria as in Experi­ simple-particularly with respect to their temporal struc­ ment 1. Six subjects had taken music lessons in the past (for 4-11 years); none were active performers. ture, which is basically organized around two durations Stimuli and Material. From the 25 musical excerpts used in Ex­ (a short duration and a long duration in a ratio of2 to I). periment 1, 19 could be paired in terms of note number and were It is therefore possible that portions ofthe original rhythm kept as stimuli. With these 19 excerpts, we created 19 mismatched of a tune incidentally coincide with an isochronous se­ tunes by interchanging the melodic pattern and the temporal pat­ quence. Such portions, which contain the original combi­ tern. This was done by pairing two excerpts containing the same nation ofmelody and rhythm, may be sufficient for recog­ number of notes, without any concern for other musical parameters (see Figure 3). For 3 remaining tunes of the same length, the same nition. In other words, shifting tunes to a constant duration procedure was applied so that none kept its original rhythmic pat­ may be less detrimental to recognition than shifting them tern. All the mismatched tunes are provided in Appendix B. to a constant pitch, because the original tunes also con­ Procedure. The same set of mismatched tunes served as stimuli tain a subset ofnotes with a constant duration but, less fre­ in the two experimental conditions; only the instructions to the sub­ quently, a series ofnotes with a constant pitch. jects differed. 
Halfof the subjects were instructed to name and rate To reduce such coincidences, we presented here to new their feeling of knowing for the melodic pattern while ignoring the groups ofsubjects the same set oftunes with their respec­ rhythmic variations. The other half were required to focus on rhythm while ignoring melodic variations. Otherwise, the procedure was tive melodic and rhythmic patterns interchanged (see identical to that of Experiment 1. The subjects were tested in small Figure 3, for an example). With these mismatched tunes, groups, with two examples followed by 19 experimental stimuli. each stimulus still possessed a melodic and a temporal pat- After completing the task in the experimental condition, the sub- MELODY AND RHYTHM IN MUSIC RECOGNITION 523

Bonne rete Lundi matin, Ie roi, sa femme et le petit prince

m r e h I y o t d Mismatched h y m

r m h e Y I t o h d m y

Figure 3. Example of stimuli used in Experiment 2. The original combined version of "Bonne Fete" and "Lundi Matin" are represented on top. Interchanging their respective melodic and rhythmic patterns produces the two mis­ matched tunes represented below. jects were presented with the combined (i.e., matched) version of on the basis ofthe melodic pattern. The rhythm-instructed the musical excerpts. group achieved only 8% correct (see left panel in Fig­ ure 4). Interestingly, the subjects who were instructed to Results and Comments focus on rhythm erred in giving the title corresponding As can be seen in Figure 4, there is a striking similar­ to the melodic pattern. This occurred in 28% oftheir nam­ ity between the results obtained in this experiment and ing responses. It is as if they could not ignore the melodic those ofExperiment I (see Figure 2). Statistical analyses information, which was even processed up to the level of ofthe data were performed in the same fashion as in Ex­ name retrieval. These naming errors seldomly occurred periment 1 and revealed a similar outcome. in the melody-instructed group, with less than 2% ofthe First, the subjects were all highly and equally familiar trials. This rate ofintrusions in the melodic condition was with the musical material as indexed by their excellent per­ significantly lower than the one observed in the rhythmic formance with the combined (matched) version. However, condition [1(26) = 4.119, and 1(18) = 5.389;ps < .001]. this high rate ofnaming responses and ofconfidence level It is apparent that recognizability ofthe mismatched se­ in recognizing the stimuli dropped with the mismatched quences was contaminated by the melodic dimension de­ tunes (i.e., the experimental version in Figure 4). On the spite the fact that it had to be ignored. 
There is, thus, an naming scores with the mismatched tunes, the melody­ asymmetry in selective attention, with attention being instructed group was found to be clearly superior to the driven mainly by melodic information rather than tem­ rhythm-instructed group (p < .05, by Tukey A post hoc poral information. comparisons). This pattern ofresults was supported by an The results on the feeling-of-knowingjudgments mir­ interaction between group and version [Fl(I,26) = 31.78, ror those ofthe correct naming responses (see right panel MSe = 7.30; F2(1,18) = 42.11, MSe = 5.53; both ps < in Figure 4). The interaction between group and version .00 I]. As in Experiment I, the subjects from the melody­ was highly significant [FI(I,26) = 39.60, MSe = 0.18; instructed group named, on average, 53% of the stimuli F2( 1,18) = 17.61, MSe = 0.54; both ps < .001], indicat- 524 HEBERT AND PERETZ

.§ 100 6.,..------, a 90 : 80 5 ~ 70 U ~ 4 60 ...... '0 50 l:! 3 I:: al f :~ ~ 2 1il 20 0- 1 ~ 10 Q) 0 ~ iIlll o rill Experimental Combined Experimental Combined

Version [] Melody-instructed group Version

• Rhythm-instructed group Figure 4. Mean percentages of correct naming (left panel) and mean ratings of knowing judgments (right panel) obtained for the experimental (mismatched) tunes by the melody-instruction group and the rhythm­ instruction group, as well as their scores for the combined (matched) version in Experiment 2. Error bars rep­ resent the standard error of the mean.

The experimental mismatched tunes were judged to be more familiar when the subjects focused on the melodic pattern than when they paid attention to the temporal pattern (p < .05, by Tukey A post hoc comparisons). In fact, the melodic pattern was generally judged to be familiar (with a mean rating of 3.7), whereas the temporal pattern of the same mismatched tunes was judged to be unfamiliar (with a mean rating of 2.2).

As in Experiment 1, musical training was found to have little impact on these tasks. In fact, the number of years of instrument practice was negatively correlated, but not significantly so, with both feeling-of-knowing judgments [r(26) = -.29] and correct naming responses [r(26) = -.20]. The absence of an expertise effect should, however, be interpreted with caution, since there were few musically trained subjects in the present samples.

It is still possible that recognition from the melodic variations occurred by coincidental matches with the unrelated rhythm. Indeed, it is plausible that the temporal structures of our stimuli are interchangeable because they are very similar. Hence, creating mismatched tunes would not alter the original musical structures to any significant degree. For the musically trained reader who can check the stimuli fully presented in the Appendices, this possibility is not very likely. The most fundamental characteristic of temporal structure, namely, metric organization, was changed from duple to triple, and vice versa, in more than half the stimuli. In such cases, the temporal accenting that occurs on a downbeat falls at different points in time in the original and its mismatched counterpart.

We thought it desirable, however, to add some formal evidence in support of the notion that temporal structure was indeed modified by the mismatching procedure. To this end, we analyzed our two stimulus sets (the original excerpts and their mismatched formats) with the rules used by Boltz (1991, 1993), because she has addressed similar questions with stimuli coming from preexisting folk music. In her work, Boltz used basically two rules to define the sequence in terms of accent structure. One rule is related to melodic information, in stating that tonic triad members (the tonic, mediant, and dominant) create melodic accents. The other rule is defined along the temporal dimension and attributes accents to notes that are prolonged in duration. When the two kinds of accents converge on the same note, the joint accent structure draws attention to the same points in time and so facilitates processing and, hence, recognizability. We computed the serial position of these joint accents in our 19 original tunes and found that only 4 of the 18 that lent themselves to such an analysis kept these joint accents in the same serial positions in the mismatched tunes. Thus, for the large majority of our stimuli, we can conclude that changing the rhythm did modify the original accent structure. This modification of accent structure did not affect naming accuracy of the melodic component: The subjects identified more tunes when there was a disruption of the accent structure (59% correct) than when there was no obvious change (37%). Note, however, that the latter score concerns only 4 stimuli.

In summary, the present results clearly show that the melodic pattern is more indicative of identity than is its temporal pattern for highly familiar music. In the present experiment, where the melodic pattern was embedded in a new rhythm, there was no evidence of a decrement in recognition and naming relative to when the melodic pattern was presented without rhythmic variations (in Experiment 1). The subjects were successful in ignoring the irrelevant rhythmic pattern, be it isochronous or unrelated to the original music. This was not the case for the temporal pattern, which remains a very poor cue for music identification.

EXPERIMENT 3

Compelling evidence for the more determinant role played by melodic information over temporal information in music recognition and identification has been gathered in Experiments 1 and 2. Melodic information does not, however, fully account for music recognition. In Experiments 1 and 2, the original combination of melody and rhythm was systematically better recognized and identified than was the melodic component considered separately. Thus, temporal information must contribute to the final decision, even if it is a very poor cue when considered in isolation. The level in the processing of the musical information at which temporal information contributes to the recognition process remains to be assessed.

We hypothesized that perhaps the rhythmic contribution to the recognition process lacks selectivity (to which we will refer as the selection-difficulty hypothesis). Temporal structures are shared, at least in large segments, across several tunes, as previously stated. Hence, temporal information may map onto the stored representations of multiple candidates and make response selection problematic. If correct, then narrowing the number of responses down to a single choice, by providing a title, should considerably enhance recognition performance on the basis of temporal cues. Temporal information could even reach the efficacy level of melodic information in a title-verification task. Alternatively, temporal patterns may not contain sufficiently distinctive features to activate a single internal representation (to which we will refer as the encoding-distinctiveness hypothesis). In that case, providing a title for verification will not affect response decision; hence, recognizability will be low. Performance may, however, depend on top-down processing, as will be clarified below after the design of the experiment has been described.

The aim of Experiment 3 was to distinguish these two hypotheses with regard to the inferiority of rhythm relative to melody in recognition. The same set of tunes and transformations used in Experiment 1 served as stimuli here. There were two types of sequences: the melodic isochronous transformation and the temporal equitonal transformation. Again, the combined version served as a baseline against which to compare the respective contributions of melodic and temporal information. The major difference was the visual presentation of a title that corresponded to the auditory sequence in half of the trials. Titles were presented at different onset times with respect to the unfolding auditory stimuli. The title appeared at the onset, the middle, or the offset of the auditory sequence. Thus, there were nine experimental conditions, defined by the musical version presented (melodic, rhythmic, and combined) and the title onset asynchrony (TOA; onset, middle, and offset). Each condition was employed with a different group of subjects in order to avoid repetition priming.

TOA was manipulated here, enabling us to assess the contribution of top-down processing. Reading of a title is expected to activate one specific internal representation, which, in turn, will generate expectations that can be checked against the perceptual input, either directly by examining the auditory input or indirectly by affecting response decisions. If this top-down influence is direct, by supplying information to be checked in the unfolding sequence, then performance should be best when the title is provided early, that is, at the onset of the auditory input. When the title appears late, at the offset of the auditory sequence, the selection-difficulty hypothesis and the encoding-distinctiveness hypothesis entail opposite outcomes for the rhythmic patterns. Following the selection hypothesis, several candidates should have been activated at the offset of the auditory sequence, and the title would help to select one of them. In that case, performance should be fairly accurate. According to the encoding-distinctiveness hypothesis, the rhythmic sequence should not have activated a particular candidate in memory, and, hence, the late title appearance will not be effective as a selection cue. In that case, recognition performance should drop considerably. Presenting the title in the middle of the sequence was expected to produce intermediate effects. Because the melodic pattern is believed to provide more distinctive cues to music identity, the title-verification task on the basis of melodic information should be less sensitive to the early versus late appearance of the title.

Since subjects were individually tested with on-line stimulus generation and data collection by computer, it was convenient to measure response times (RTs). The RT data were, however, expected to be less reliable than accuracy measures, given the small number of responses and the differences in time at which the target stimuli can be differentiated.

Method

Subjects. One hundred and thirty-five subjects (range = 18-51 years of age; M = 21.7 years), who did not participate in the preceding experiments, were tested here. They were selected using the same criteria as in Experiments 1 and 2. Eleven subjects considered themselves musicians (average number of years of instrument practice and theory = 9.3; range = 3-16 years).

Material and Apparatus. The same three sets of 25 musical excerpts used in Experiment 1 (the combined version and their corresponding melodic and rhythmic transformations) served as stimuli. These stimuli were paired with their respective titles, containing 2-10 words (M = 4.2; see Appendix A), so as to create the experimental pairings. To create distractor pairings, 25 new pairs of tune and title were selected so as to be highly familiar (mean rating = 4.7 on a 5-point scale; range = 4.6-4.9; see Peretz et al., 1995) and to be similar in length to the experimental tunes. The distractor tunes lasted from 4.4 to 15.5 sec (M = 7.9 sec), and their titles contained 1-7 words (M = 3.7). The distractor tunes were transformed in the same way as the experimental ones, so as to generate a combined version, a rhythmic transformation, and a melodic transformation.

Unlike the experimental pairings, the distractor tunes and titles were randomly paired so as to avoid a true pairing. These true and untrue tune-title pairings were organized in three sets, each containing 25 experimental trials and 25 distractor trials. Thus, there was a combined set, a rhythmic set, and a melodic set corresponding to the combined, rhythmic, and melodic versions, respectively.

In order to reduce variability in RTs, the duration of each tune was kept constant across the three versions. To do so, the tone duration of the melodic version was determined by dividing the total duration of its combined version (as well as of its rhythmic transformation, since they had identical durations) by the number of tones contained in the excerpt. Thus, unlike in Experiment 1, where tone duration was fixed at 600 msec, the tone duration of the melodic versions here varied across stimuli from 283 to 857 msec (M = 471 msec), while remaining constant within a melody.

Each set of stimuli was randomized and generated by the same equipment used in the previous experiments. The analog output was then digitized at a sampling rate of 44.1 kHz via the Soundtools program from Digidesign on a Macintosh II FX computer and directly transmitted to the subject via a digital interface amplifier and Maico headphones. Coupling the visual presentation on the computer screen with the auditory stimulus, as well as with the response recordings, was done with the Experimenter program (Altmann, Wathanasin, Birkett, & Russell, 1992).

A trial consisted of the presentation of an auditory sequence synchronized with the presentation of a title on the computer screen. TOA was such that the title appeared (1) at the onset of the first tone of the auditory sequence, (2) in the middle of the sequence, corresponding to its total duration divided by 2 within 1-msec accuracy (on average, 4.25 sec after the beginning of the sequence), or (3) at the offset of the last tone of the auditory sequence (on average, after 8.5 sec). These various TOAs are illustrated in Figure 5 with "Happy Birthday" presented in its combined version.
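The duration-matching rule and the three TOAs just described can be sketched as follows; the excerpt values below are illustrative only, not the authors' actual stimulus parameters:

```python
# Sketch of the duration rules described above. The formulas follow the text;
# the example excerpt (8.5 sec, 18 tones) is hypothetical.

def melodic_tone_duration_ms(combined_duration_ms: float, n_tones: int) -> float:
    """Isochronous melodic version: total duration of the combined version
    divided by the number of tones in the excerpt."""
    return combined_duration_ms / n_tones

def toa_schedule_ms(total_duration_ms: float) -> dict:
    """Title onset asynchronies: first-tone onset, middle (total / 2), offset."""
    return {"onset": 0.0,
            "middle": total_duration_ms / 2,
            "offset": total_duration_ms}

# A hypothetical 8.5-sec excerpt containing 18 tones:
print(melodic_tone_duration_ms(8500, 18))  # ~472 msec, within the reported 283-857 msec range
print(toa_schedule_ms(8500))               # middle at 4250 msec, offset at 8500 msec
```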

Figure 5. Illustration of the three different time asynchronies used in Experiment 3 between visual presentation of the title and the onset of the auditory presentation of "Bonne Fête" (in its combined version).

Each set of stimuli was presented to the subject with a constant TOA. This created nine conditions, defined by the three musical versions (combined, melodic, and rhythmic) and the three TOAs (onset, middle, and offset).

Procedure. The subjects were randomly assigned, but in equal number, to one of the nine possible experimental conditions. There were thus 15 subjects performing the task under the same condition. The subjects were individually tested in a session lasting about 20 min. They were informed about the nature of the structural transformation applied to the musical sequences (if any) that they would be hearing. The experimental session began with four examples. The subjects were instructed to respond as quickly as possible whether or not the visually presented title corresponded to the musical excerpt that they were hearing. They responded "yes" or "no" on a response box that was connected to the computer. The RTs were computed from the onset of title presentation. The title appeared in the center of the screen and disappeared after 5 sec, or as soon as the subject had responded. The auditory sequence went on until its end and, 2 sec later, was followed by the next trial. No feedback was provided to the subjects.

Results and Comments

For each subject, the number of hits ("yes" responses for true pairings) and false alarms ("yes" responses for untrue pairings) was computed. The correction recommended by Snodgrass and Corwin (1988) was applied to these scores, by adding .5 to each value and dividing the sum by N + 1, where N is the number of experimental and distractor trials for true and untrue pairings, respectively. The scores are presented in Table 1 along with the d' measure, which takes both hits and false alarms into account. Statistical analyses were performed by way of a 3 x 3 ANOVA, taking d' as the dependent measure. The between-subjects factors were TOA (onset vs. middle vs. offset) and version (combined vs. melodic vs. rhythmic).

Table 1
Mean Corrected Rates of Hits and False Alarms, Along With the Derived d' Scores, Obtained for Each Version at Each Delay (i.e., Title Onset Asynchrony) Between Auditory Presentation of the Sequence and Presentation of the Title

                          Title Onset Asynchrony
                        Onset    Middle    Offset
  Combined Version
    Hits                  .96       .96       .95
    False alarms          .07       .06       .05
    d'                   3.43      3.46      3.45
  Melodic Version
    Hits                  .90       .80       .80
    False alarms          .28       .20       .15
    d'                   2.06      1.82      1.99
  Rhythmic Version
    Hits                  .84       .75       .66
    False alarms          .32       .40       .37
    d'                   1.60      1.00      0.83

As can be seen in Table 1, title verification for the combined sequences was nearly perfect across TOAs. It was less accurate for the melodic version, which remains, however, better recognized than the rhythmic version. In the latter case, a striking aspect of the results is that performance decreases systematically as title presentation is delayed. These trends were generally supported statistically. There was a large effect of version [F(2,126) = 178.50, MSe = 0.34, p < .001] and a main effect of TOA [F(2,126) = 3.22, MSe = 0.34, p < .05]. However, the interaction between version and TOA just missed significance [F(4,126) = 2.25, MSe = 0.34, p < .07].

Although the interaction between version and TOA failed to reach the conventional level of significance, it is apparent in the data that performance was differentially affected by the delay in title presentation in each version. To assess statistically the influence of title delay on performance, separate ANOVAs were performed for each version, considering TOA as the only between-subjects factor. These analyses revealed that increasing the time delay between auditory presentation and title appearance decreased performance significantly in the rhythmic version only [F(2,42) = 9.88, MSe = 0.24, p < .001]. In the combined and melodic versions, the effect of TOA was far from being significant (both Fs < 1).

Another aspect of the results that needed to be assessed separately was the apparent superiority of the melodic version over the rhythmic version at each TOA (see Table 1). These comparisons were assessed by unilateral t tests at each TOA. The melodic advantage was fairly general, although it was much reduced when the title was presented at the onset of the auditory input [t(28) = 2.00, 3.54, and 6.27 at onset, middle, and offset TOAs, respectively; all ps < .05].

Thus, the accuracy measures show that title verification is generally an easy task to perform. Verification was better than chance in all conditions (chance level being 50% in accuracy and zero for d' measures), indicating that recognition can occur on the basis of rhythmic information alone. This conclusion could hardly be drawn from the results obtained in the two preceding experiments; in Experiments 1 and 2, rhythmic cues did not evoke a clear sense of recognizability. Another important result obtained with the title-verification task is that recognition accuracy was modulated by the moment at which the title appeared on the screen. Performance was best when the title appeared early in the evaluation of the sensory input, particularly so with the rhythmic version. This effect of TOA on performance is highly consistent with the top-down hypothesis, whereby title presentation activates the corresponding stored musical representation, which, in turn, can supply the relevant information to be checked in the sensory input. When a title is presented at onset, the activated mental representation can be directly compared to the unfolding stimulus. When a title is presented at offset, such a direct verification of the sensory input can no longer take place; the title mainly serves to select the correct response. Recourse to direct verification of the sensory input seems to account largely for the results observed in the rhythmic version and less so for those in the melodic and combined versions.

The RTs provide support for the use of a verification procedure with all musical versions. As can be seen in Figure 6, the subjects were slower to respond when the title appeared at the beginning of the sequence than when it appeared later. This indicates that the listener kept monitoring the input as long as it was available. In other words, [...]
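The correction described in the Results (add .5 to each count and divide by N + 1) and the derived d' can be sketched as follows; the 25-trial counts match the design, but the example subject is hypothetical:

```python
from statistics import NormalDist

def corrected_rate(n_yes: int, n_trials: int) -> float:
    """Snodgrass and Corwin (1988) correction: (count + .5) / (N + 1)."""
    return (n_yes + 0.5) / (n_trials + 1)

def d_prime(hit_rate: float, fa_rate: float) -> float:
    """d' = z(hit rate) - z(false-alarm rate), with z the inverse normal CDF."""
    z = NormalDist().inv_cdf
    return z(hit_rate) - z(fa_rate)

# Hypothetical subject: 24 "yes" on 25 true pairings, 2 "yes" on 25 untrue ones.
hits = corrected_rate(24, 25)          # 24.5 / 26, roughly .94
false_alarms = corrected_rate(2, 25)   # 2.5 / 26, roughly .10
print(round(d_prime(hits, false_alarms), 2))
```

Note that d' is computed from each subject's own corrected rates and then averaged, so the group means in Table 1 need not equal d' recomputed from the mean hit and false-alarm rates.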

[Figure 6: mean response times (msec) for the original (combined), melodic, and rhythmic versions at each TOA.]

There was one aspect of the RTs that puzzled us: Since naming the combined sequences was easy for every subject, as shown in Experiments 1 and 2, why did the subjects take so long to verify the title of the combined stimuli (i.e., 746 msec) when it appeared at the end of the sequence? At that point in time, most subjects should have [...]

[...]sentation in memory. Nevertheless, the optimal code seems to require a combination of both melodic and rhythmic information.

The empirical evidence for the predominance of melodic cues over temporal cues was gathered in naming responses, feeling-of-knowing judgments, and speeded title verification. In Experiment 1, naming responses and feeling-of-knowing judgments revealed that transformations of familiar musical excerpts that kept the original pitch variations were clearly more diagnostic of music identity than were the original temporal variations. For about half of the musical excerpts, melodic information alone led to title retrieval. In contrast, the original temporal variations were judged to be unfamiliar and largely failed to evoke a title. This large advantage of the melodic pattern over the temporal pattern in activating music memories was reproduced in Experiment 2 with chimeric musical sequences. The latter were created by combining the melodic pattern of one familiar excerpt with the temporal pattern of another familiar excerpt. The subjects' task was to selectively attend to melody and ignore rhythm, or vice versa. They were found to weigh pitch structure more heavily than temporal structure, even when instructed to ignore the pitch variations. Generally, the subjects were not affected by what happened to rhythm when focusing on melody: They simply ignored temporal information. Finally, in the title-verification task used in Experiment 3, the subjects were found to match the visually presented title with the auditory sequence more quickly and, in general, more accurately when the latter contained melodic cues than when it carried only temporal cues.

We do not claim, however, that temporal structure plays a negligible role in music recognition. In each experiment, performance on the original tunes, which combine the melodic and temporal structures, was systematically superior to performance on the separate structures. Melodic information in isolation is often not sufficient to produce a perfect match with a stored representation; temporal structure must be included to do so. Furthermore, as shown in Experiment 3, temporal structure alone can lead to recognition when a title is provided as an early aid to monitor the sensory input. That is, the subjects were found to be quite accurate in matching a rhythmic sequence with a title while listening to the stimulus. However, when the title appeared too late relative to the monitoring of the input (i.e., when the auditory stimulation had ceased), performance was found to be rather poor. The latter result provides important information with regard to the nature of the contribution of rhythm to the process of recognizing familiar music. First, rhythmic information must be contained in the stored representation of these musical pieces. Otherwise, there would be no possibility for the listener to use a top-down monitoring procedure. Second, the weaker contribution of the rhythmic pattern to the recognition process is not simply due to a selection difficulty, following the idea that temporal patterns are shared to an important degree by many different tunes. If this were the case, title presentation, by narrowing down the possibilities to a single choice, should facilitate response decision, even when the title appears well after auditory presentation. The results are more compatible with the idea that, relative to melody, rhythm lacks encoding distinctiveness when the purpose of the task is to activate a representation in long-term memory.

The present results raise the question as to the degree of autonomy that melodic organization holds with respect to temporal organization in the recognition of familiar music. Are these two types of organization distinct, such that one organization is more effective than the other? Or should we consider that melody and rhythm are normally treated as a single entity, whose melodic content would be most salient? This question corresponds to a much debated issue in music cognition.

There are currently two distinct positions with regard to the question of autonomy for melody and rhythm. Proponents of the integration view, mainly developed by Mari Riess Jones and, more recently, by Marilyn Boltz, argue that melody and rhythm are treated as a unified dimension (e.g., Jones & Boltz, 1989). In their model, referred to as the dynamic model, temporal structure is considered to be the vehicle that drives attention toward or away from important melodic (i.e., nontemporal) information. This time-oriented strategy not only facilitates initial encoding but is believed to apply to remembering as well. Basically, it is suggested that pitch and time function as integral parts and, therefore, that remembering a melody involves the reinstatement of the original pitch-time structure. Proponents of the opposite view, the independence view, argue instead that listeners' percept of musical sequences results from an independent influence of melodic and temporal factors (Monahan & Carterette, 1985; Palmer & Krumhansl, 1987a, 1987b), at least in the initial stages of processing (Peretz & Kolinsky, 1993; Peretz et al., 1994; Thompson, 1994).

The studies that have addressed these two models empirically have yielded mixed results. In support of the integration view are findings that rhythmic patterns that conflict with melodic structure yield a much poorer performance than when the two coincide. Such integrality effects have been observed in a variety of contexts, including written recall (Boltz, 1991; Boltz & Jones, 1986; Deutsch, 1980), delayed recognition (Jones & Ralston, 1991; Jones, Summerell, & Marshburn, 1987; but see Smith & Cuddy, 1989, for negative results), retrospective duration judgments (Boltz, 1995), same-different classification judgments (Boltz, 1993; Jones, Boltz, & Kidd, 1982; Peretz & Kolinsky, 1993), and phrase-completion judgments (Boltz, 1989a, 1989b, 1989c; Schmuckler & Boltz, 1994). Moreover, interactive effects of rhythm and melody manipulations have been observed in selective-attention tasks; when subjects are explicitly instructed to ignore the rhythmic dimension (Bigand & Pineau, 1996; Boltz, 1989a; Kidd, Boltz, & Jones, 1984), they still exhibit response patterns that are affected by this dimension, and conversely for the melodic dimension (Jones, 1987; Pitt & Monahan, 1987).

The other studies, which have obtained results more compatible with the independence view, have also involved a large variety of tasks.

Melodic and rhythmic structures have been shown to exert independent influences in similarity ratings for unfamiliar melodies (Monahan & Carterette, 1985; Monahan, Kendall, & Carterette, 1987), comparison of simultaneously or consecutively presented pairs of musical segments (Thompson & Sinclair, 1993), phrase-completion judgments (Palmer & Krumhansl, 1987a, 1987b), and change detection in individual tones and patterns (Thompson, 1994). Thus, task factors cannot easily account for the discrepancy between these studies and those supporting the integration model.

One of us (Peretz, 1990; Peretz & Kolinsky, 1993) has provided neuropsychological evidence that melody and rhythm are dissociable after brain damage. As mentioned in the introduction, processing of melodic information can be impaired while processing of temporal information remains intact, and vice versa. In a detailed case study (Peretz & Kolinsky, 1993), the dissociation between melody and rhythm was replicated across different sets of material and was maintained in conditions that promoted integration in the normal brain (by using a Stroop-like design in a same-different comparison task of brief musical sequences). These results argue against the view that melody and rhythm are treated as a unified dimension throughout processing. At the same time, they support the view that integration takes place after early separation of the two dimensions (see also Thompson, 1994, for a similar view).

The present results are also compatible with the view that integration occurs late in the processing of music, that is, after the initial stages of perceptual analysis, which are believed to be largely preattentive and mandatory. The present observations suggest that, between access to the appropriate stored representation and recognition responses, an assembly process needs to take place whereby the melodic information and temporal information are combined.

What remains unknown, however, is at which processing stages the combination takes place and to what extent this combination process requires integration of melodic and rhythmic information into a unique code. That is, integration into a unique representation may be required for the mapping process between the perceptual input and the stored representation of the tune. In that case, the melodic and rhythmic perceptual properties of this unique trace should be accessible separately. Alternatively, the two kinds of structure may not be integrated; rather, they may function as distinct but highly interconnected representations of the same tune. In that case, integration might concern the retrieval of the memory trace, which would combine these two distinct codes for adequate musical recognition. The present data are compatible with both possibilities. Further work would be required to distinguish between these two possible classes of functional organization. Nevertheless, the present results highlight the need to incorporate into both accounts a weighting factor in favor of melodic information as the most salient dimension in long-term memory for tunes.

The ensuing question is obviously the generalizability of the present findings. In this study, we have examined a rather limited set of tunes. This selection was strict in that only nameable tunes could be used. This raises the issue as to what extent the results obtained with these highly familiar tunes are generalizable to other tunes that are less familiar, and to what extent they are generalizable to music in general. This is obviously a question to which we cannot provide any direct response. This should be the goal of future investigations. A change in paradigm would be necessary, however, since naming or title verification with most musical selections cannot be taken as a reliable index of music identification.

We predict, however, that the present findings would generalize at least to the recognizability of most music of the Western culture. Indeed, the superiority of pitch factors over time factors may just reflect the regularities exhibited by the music to which we have been exposed since an early age. The music of our culture is simply more elaborate in pitch than in duration, and it uses pitch categories that are fixed to certain frequencies, whereas durational categories are more flexibly defined.

Western music is very stereotyped in rhythm. According to Fraisse (1982), who counted the frequency of different durations in various classical compositions ranging from Beethoven to Bartók, composers principally use two contrasting durations in any piece: a long duration and a short duration in a ratio of 2 to 1. This accounts for over 85% of the notes in a given composition. The large majority of the tunes (22 of 25) used in the present study fit this description. Pitches are more variable. On average, the musical fragments contained 5.9 different pitches within the key of the tune. Yet, such variations in pitch can be considered rather stereotyped because they do not include tonal deviations, as is generally the case in music compositions. Moreover, pitch was not harmonically elaborated here, as is also generally the case in most music. Nevertheless, even in these monophonic folk tunes, a disparity in variability along the pitch and duration dimensions is apparent and may account for their differential distinctiveness.

Furthermore, tones usually have a fixed pitch in our culture (with A4 fixed at 440 Hz). Durations are less constrained, allowing large deviations for the purpose of expressiveness (e.g., Gabrielsson, 1993). Perhaps the rigidity of the pitch system affords efficient anchoring points for perception and memory, whereas the flexibility of the time structure serves mostly purposes other than recognition.

However, this asymmetry between melodic and rhythmic cues for recognition is likely to be culture-specific. Other cultures have more sophisticated rhythms (e.g., in ancestral Indian cults on the Martinique island, rhythmic patterns per se constitute musical tokens; see Desroches & Benoist, 1983). Even in less extreme instances, one would probably find cultures where a reverse contribution of melodic and rhythmic patterns to recognizability can be observed. With Balzano (1987), we might suppose that another culture whose music is less elaborate in pitch structure than ours would have a more rhythmically rich tradition as a source of additional structure in its music.

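For readers who wish to compute the Goodman–Kruskal gamma correlation mentioned in Note 1, a minimal sketch follows; the feeling-of-knowing ratings and recognition outcomes below are hypothetical, not data from the study:

```python
from itertools import combinations

def goodman_kruskal_gamma(xs, ys):
    """Goodman-Kruskal gamma: (concordant - discordant) / (concordant + discordant).
    Pairs tied on either variable are ignored."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        product = (x1 - x2) * (y1 - y2)
        if product > 0:
            concordant += 1
        elif product < 0:
            discordant += 1
    if concordant + discordant == 0:
        return float("nan")  # undefined, e.g., at-ceiling performance (cf. Note 1)
    return (concordant - discordant) / (concordant + discordant)

# Hypothetical data: feeling-of-knowing ratings (1-10) and later recognition (0/1)
fok = [9, 2, 7, 5, 10, 3]
recognized = [1, 0, 1, 0, 1, 1]
print(goodman_kruskal_gamma(fok, recognized))  # -> 0.75
```

Gamma is undefined when one variable is constant (every pair is tied), which is why it could not be computed for the near-perfect combined condition.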
REFERENCES

Altmann, G., Wathanasin, S., Birkett, T., & Russell, P. (1992). Experimenter [Computer program]. Sussex, U.K.: Network Analysis Ltd., University of Sussex.
Balzano, G. (1987). Measuring music. In A. Gabrielsson (Ed.), Action and perception in rhythm and music (pp. 177-199). Stockholm: Royal Swedish Academy of Music.
Bigand, E., & Pineau, M. (1996). Context effects on melody recognition: A dynamic interpretation. Cahiers de Psychologie Cognitive, 15, 121-134.
Boltz, M. G. (1989a). Perceiving the end: Effects of tonal relationships on melodic completion. Journal of Experimental Psychology: Human Perception & Performance, 15, 749-761.
Boltz, M. G. (1989b). Rhythm and "good endings": Effects of temporal structure on tonality judgments. Perception & Psychophysics, 46, 9-17.
Boltz, M. G. (1989c). Time judgments of musical endings: Effects of expectancies on the "filled interval effect." Perception & Psychophysics, 46, 409-418.
Boltz, M. G. (1991). Some structural determinants of melody recall. Memory & Cognition, 19, 239-251.
Boltz, M. G. (1993). The generation of temporal and melodic expectancies during musical listening. Perception & Psychophysics, 53, 585-600.
Boltz, M. G. (1995). Effects of event structure on retrospective duration judgments. Perception & Psychophysics, 57, 1080-1096.
Boltz, M. G., & Jones, M. R. (1986). Does rule recursion make melodies easier to reproduce? If not, what does? Cognitive Psychology, 18, 389-431.
Desroches, M., & Benoist, J. (1983). Tambours de l'Inde à la Martinique: Structure sonore d'un espace sacré [Drums from India to Martinique Island: Sound structure of a sacred space]. Études Créoles, 5, 39-58.
Deutsch, D. (1980). The processing of structured and unstructured tonal sequences. Perception & Psychophysics, 28, 381-389.
Dowling, W. J., & Fujitani, D. S. (1971). Contour, interval, and pitch recognition in memory for melodies. Journal of the Acoustical Society of America, 49, 524-531.
Dowling, W. J., Lung, K. M.-T., & Herrbold, S. (1987). Aiming attention in pitch and time in the perception of interleaved melodies. Perception & Psychophysics, 41, 642-656.
Fraisse, P. (1974). Psychologie du rythme [Psychology of rhythm]. Paris: Presses Universitaires de France.
Fraisse, P. (1982). Rhythm and tempo. In D. Deutsch (Ed.), The psychology of music (pp. 149-180). New York: Academic Press.
Gabrielsson, A. (1993). The complexities of rhythm. In T. Tighe & W. J. Dowling (Eds.), Psychology of music: The understanding of melody and rhythm (pp. 93-120). Hillsdale, NJ: Erlbaum.
Goodman, L. A., & Kruskal, W. H. (1954). Measures of association for cross classifications. Journal of the American Statistical Association, 49, 732-764.
Halpern, A. R. (1984). Organization in memory for familiar songs. Journal of Experimental Psychology: Learning, Memory, & Cognition, 10, 496-512.
Jones, M. R. (1987). Dynamic pattern structure in music: Recent theory and research. Perception & Psychophysics, 41, 621-634.
Jones, M. R., & Boltz, M. G. (1989). Dynamic attending and responses to time. Psychological Review, 96, 459-491.
Jones, M. R., Boltz, M. G., & Kidd, G. (1982). Controlled attending as a function of melodic and temporal context. Perception & Psychophysics, 32, 211-218.
Jones, M. R., & Ralston, J. T. (1991). Some influences of accent structure on melody recognition. Memory & Cognition, 19, 8-20.
Jones, M. R., Summerell, L., & Marshburn, E. (1987). Recognizing melodies: A dynamic interpretation. Quarterly Journal of Experimental Psychology, 39A, 89-121.
Kidd, G., Boltz, M. G., & Jones, M. R. (1984). Some effects of rhythmic context on melody recognition. American Journal of Psychology, 97, 153-173.
Koriat, A. (1993). How do we know that we know? The accessibility model of the feeling of knowing. Psychological Review, 100, 609-639.
Lenneberg, E. H. (1967). Biological foundations of language. New York: Wiley.
Leonesio, R. J., & Nelson, T. O. (1990). Do different metamemory judgments tap the same underlying aspects of memory? Journal of Experimental Psychology: Learning, Memory, & Cognition, 16, 464-470.
Monahan, C. B., & Carterette, E. C. (1985). Pitch and duration as determinants of musical space. Music Perception, 3, 1-32.
Monahan, C. B., Kendall, R. A., & Carterette, E. C. (1987). The effect of melodic and temporal contour on recognition memory for pitch change. Perception & Psychophysics, 41, 576-600.
Nelson, T. O. (1984). A comparison of current measures of the accuracy of feeling-of-knowing predictions. Psychological Bulletin, 95, 109-133.
Palmer, C., & Krumhansl, C. L. (1987a). Independent temporal and pitch structures in determination of musical phrases. Journal of Experimental Psychology: Human Perception & Performance, 13, 116-126.
Palmer, C., & Krumhansl, C. L. (1987b). Pitch and temporal contributions to musical phrase perception: Effects of harmony, performance timing, and familiarity. Perception & Psychophysics, 41, 505-518.
Peretz, I. (1990). Processing of local and global musical information in unilateral brain-damaged patients. Brain, 113, 1185-1205.
Peretz, I., Babaï, M., Lussier, I., Hébert, S., & Gagnon, L. (1995). Corpus de 144 extraits musicaux: Indices de familiarité, d'évocation verbale et d'âge d'acquisition [A corpus of 144 musical excerpts: Indices of familiarity, verbal evocation, and age of acquisition]. Canadian Journal of Experimental Psychology, 49, 211-239.
Peretz, I., & Kolinsky, R. (1993). Boundaries of separability between melody and rhythm in music discrimination: A neuropsychological perspective. Quarterly Journal of Experimental Psychology, 46A, 301-325.
Peretz, I., Kolinsky, R., Tramo, M., Labrecque, R., Hublet, C., Demeurisse, G., & Belleville, S. (1994). Functional dissociations following bilateral lesions of auditory cortex. Brain, 117, 1283-1302.
Piston, W. (1978). Harmony (4th ed.). New York: W. W. Norton.
Pitt, M. A., & Monahan, C. B. (1987). The perceived similarity of auditory polyrhythms. Perception & Psychophysics, 41, 534-546.
Schmuckler, M. A., & Boltz, M. G. (1994). Harmonic and rhythmic influences on musical expectancy. Perception & Psychophysics, 56, 313-325.
Smith, K., & Cuddy, L. L. (1989). Effects of metric and harmonic rhythm on the detection of pitch alterations in melodic sequences. Journal of Experimental Psychology: Human Perception & Performance, 15, 457-471.
Snodgrass, J., & Corwin, J. (1988). Pragmatics of measuring recognition memory: Applications to dementia and amnesia. Journal of Experimental Psychology: General, 117, 34-50.
Sturges, P. T., & Martin, J. G. (1974). Rhythmic structure in auditory temporal pattern perception and immediate memory. Journal of Experimental Psychology, 102, 377-383.
Thompson, W. F. (1994). Sensitivity to combinations of musical parameters: Pitch with duration, and pitch pattern with durational pattern. Perception & Psychophysics, 56, 363-374.
Thompson, W. F., & Sinclair, D. (1993). Pitch pattern, durational pattern, and timbre: A study of the perceptual integration of auditory qualities. Psychomusicology, 12, 29-47.
Warren, R. M., Gardner, D. A., Brubaker, B. S., & Bashford, J. A., Jr. (1991). Melodic and nonmelodic sequences of tones: Effects of duration on perception. Music Perception, 8, 277-290.
White, B. W. (1960). Recognition of distorted melodies. American Journal of Psychology, 73, 100-107.

NOTE

1. The gamma correlations (Goodman & Kruskal, 1954) that are usually calculated between feeling-of-knowing judgments and subsequent recall of the solicited target information could not be computed because the subjects demonstrated near-perfect performance in the combined version.

APPENDIX A
Original Tunes Used in Experiments 1, 2, and 3

[The musical notation cannot be reproduced in this text version; the tune titles, recovered from the original two-column layout, are listed below. Tempo indications are omitted.]

1. À la claire fontaine
2. Fais du feu
3. Petit papa Noël
4. Il était un petit navire
5. Les anges dans nos campagnes
6. Il est né le divin enfant
7. Mon beau sapin
8. Ce n'est qu'un au revoir
9. Minuit chrétien
10. Lundi matin
11. Bonne fête
12. Oh quand j'entends chanter
13. Gens du pays
14. Le petit renne au nez rouge
15. Frère Jacques
16. Vive le vent
17. Au clair de la lune
18. Chevaliers de la table ronde
19. Fais dodo
20. Le phoque en Alaska
21. Ô Canada
22. C'est le jour de l'an
23. Mon pays
24. Le petit bonheur
25. La vie en rose

APPENDIX B
Mismatched Tunes Used in Experiment 2

[The musical notation for the mismatched tunes cannot be reproduced in this text version.]

(Manuscript received August 2, 1995; revision accepted for publication August 6, 1996.)