Memory & Cognition
1989, 17 (4), 373-383

Modality effects in the coding and reproduction of rhythms

ARTHUR M. GLENBERG, STUART MANN, LISA ALTMAN, TIM FORMAN, and SEAN PROCISE University of Wisconsin, Madison, Wisconsin

The temporal coding assumption is that time of presentation is coded more accurately for auditory events than for visual events. This assumption has been used to explain the modality effect, in which recall of recent auditory events is superior to recall of recent visual events. We tested the temporal coding assumption by examining the coding and reproduction of quintessentially temporal stimuli: rhythms. The rhythms were produced by sequences of short and long auditory stimuli or short and long visual stimuli; in either case, the task was to reproduce the temporal sequence. The results from four experiments demonstrated reproduction of auditory rhythms superior to that of visual rhythms. We conclude that speech-based explanations of modality effects cannot accommodate these findings, whereas the findings are consistent with explanations based on the temporal coding assumption.

In classic information processing models (e.g., Atkinson & Shiffrin, 1968), the input modality of information plays little role after the information is coded in short-term store. In fact, early work on the modality effect (e.g., Crowder & Morton, 1969) seemed to confirm this limited role: Auditory information is recalled better than is visual information, but only at the end of a list of items, supposedly indicating recall from a sensory memory or a short-term memory. In the intervening years, we have found, however, that modality plays a role throughout the cognitive system. For example, J. M. Gardiner and Gregg (1979) have demonstrated long-lasting modality effects that are robust in the face of interference. Glenberg and Fernandez (1988) reported modality effects in position judgments and in frequency estimations. In this article, we report a new type of modality effect, one that offers support for a general interpretation of modality effects in memory. The new effect is that the coding and reproduction of rhythmic stimuli are more accurate for auditory rhythms than for visual rhythms. This supports the interpretation that modality effects reflect the fact that coding of temporal information for auditory events is superior to that for visual events.

Glenberg and Swanson (1986) proposed an account of modality effects based on what we will call the temporal coding assumption: Coding of time of presentation is more accurate for auditory events than for visual events. Evidence for this claim can be found in a variety of paradigms using a variety of manipulations. For example, Metcalfe, Glavanov, and Murdock (1981) presented stimuli in different spatial locations. Auditory stimuli were recalled better than were visual stimuli when recall was by temporal position, but this difference was eliminated when recall was by spatial location. Glenberg and Swanson (1986) manipulated the duration of the intervals separating successive to-be-remembered items on a list. This manipulation greatly affected recall of auditory items (enhancing the recency effect), but had little effect on the recall of visual items. Glenberg and Fernandez (1988) discovered modality effects in an order judgment task. After a list of stimuli was presented, subjects judged if the items in a re-presented pair were in the correct order. They found an auditory superiority both for the stimuli presented at the end of the list and for those presented at the beginning of the list. Furthermore, the auditory superiority increased with duration of the interitem intervals. Thus, across these paradigms, the temporal coding assumption is supported by the fact that there was greater sensitivity to temporal manipulations of auditory events than there was to temporal manipulations of visual events (but see Crowder and Greene, 1987, for a dissenting opinion).

These demonstrations of effects consistent with the temporal coding assumption rely on relatively long-term memory. Even in the immediate recall of a short list of items, the modality effect is enhanced by serial recall, which delays recall of the end-of-list items where the largest modality effect is found. The temporal coding assumption leads us to expect, however, effects much closer to the perception of the events. That is, it proposes that time of occurrence is coded (not just remembered) more accurately for auditory than for visual stimuli. This expectation led us to look for modality effects in a domain that is quintessentially temporal: the perception and reproduction of rhythms.

This research was funded by National Science Foundation Grant BNS 8416300 and University of Wisconsin Graduate School Grant 890200 awarded to Arthur Glenberg. We thank Michael Davis, Lisa Knudson, Nancy Lightfoot, and Rebecca Mauer for their help in designing these experiments. Requests for reprints may be sent to Arthur Glenberg, Department of Psychology, University of Wisconsin, Madison, WI 53706.

Copyright 1989 Psychonomic Society, Inc.

A rhythm is produced by a sequence of stimuli that can be identical except for their temporal durations. Thus, perceiving a rhythm requires coding of temporal information and not much more. Our expectation is that rhythms produced by a sequence of long- and short-duration auditory stimuli will be easier to perceive and reproduce than will identical rhythms produced by a sequence of long- and short-duration visual stimuli.¹ In the General Discussion section, we present the implications of our findings for explanations of the modality effect.

In our rhythm task, subjects were presented with a sequence of n (n = 5-8) stimuli. Each stimulus was presented for either a long duration (750 msec) or a short duration (250 msec), and each was followed by an empty, short (250 msec) interstimulus interval. For an auditory sequence, the stimuli were long- and short-duration tones. For a visual sequence, the stimuli were asterisks that were presented on a display terminal for a long or short duration. Immediately following the last stimulus, subjects reproduced the rhythm, using button presses to indicate the sequence of long- and short-duration stimuli.

This task is not ideal for studying the perception and coding of time of presentation; there is still a memory component. The memory component is minimized, however, in that there are only two stimuli (long and short), and the retention interval is very short.

In overview, we conducted four experiments. Experiments 1 and 2 demonstrated the basic finding that performance is better for auditory rhythms than for visual rhythms. In addition, these experiments ruled out explanations of the effect based on modality differences in the detectability of onsets and offsets of auditory and visual stimuli. In Experiment 3, we explored the possibility that the effect is due to response requirements and that it may reflect differential experience in coding auditory and visual rhythms. Experiment 4 demonstrated a correspondence between the rhythm modality effect and the modality effect found in recall tasks.

EXPERIMENT 1

Experiment 1 was designed to determine if auditory superiority in the rhythm task is due to perceived change from background. Our experience in the task indicated that the auditory stimulus is more alerting than is the visual stimulus (cf. Posner, Nissen, & Klein, 1976), perhaps because the auditory stimulus is a greater change from its background (silence) than the visual stimulus is from its background (ambient visual stimulation). This alerting could enhance auditory performance in at least two ways. First, the alerting may increase the accuracy of the perception of duration, because the actual onset of the auditory stimulus is noted more accurately than is the actual onset of the visual stimulus. Second, the alerting may recruit attentional resources to enhance the coding and memory of auditory duration information (whether or not the duration is perceived more accurately than is the duration of the visual stimulus).

We tested these ideas by manipulating the background. In the auditory-noise-background condition, the stimuli (both auditory and visual sequences) were presented while white noise was being played. The white noise should decrease the change from background for the auditory stimulus and thereby reduce performance on the auditory sequences. In the visual-noise-background condition, the stimuli were presented while visual noise (flickering random dot patterns) appeared on the display screen. This background should selectively reduce performance of the visual stimuli. Finally, in the no-background condition, the stimuli were presented in a quiet, dark environment. Our reasoning did not depend on equating the auditory and visual backgrounds or changes in the backgrounds. Instead, we intended to determine if changing the background in a modality selectively affects coding and reproduction of rhythms presented in that modality.

Method

Subjects. The subjects were 20 students enrolled in introductory psychology courses at the University of Wisconsin-Madison. They received credit toward a research requirement for participating in the experiment.

Materials. The subjects were seated approximately 55 cm in front of a visual display terminal and two loudspeakers. All subjects wore a limited-vision mask that restricted vision to a 6 × 12 cm area of the display screen. This mask was used to eliminate ambient visual stimulation in the no-background condition, but all subjects wore the mask throughout the experiment.

The auditory stimulus was a tone of complex composition.² The intensity of the tone at the ear was approximately 77 dB(A). The tone was presented for either 250 msec (short) or 750 msec (long). During the block of trials corresponding to the auditory-noise-background condition, white noise was played continuously at an intensity of approximately 75 dB(A).

The visual stimulus was a single asterisk (produced using an Apple II's Turtlegraphics font) centered in the visible area of the display screen. In the visual-noise-background condition, a flickering random dot pattern appeared on the display screen, except for a .5 × 1.2 cm rectangle in which the asterisk appeared. The random dot pattern appeared after the ready signal (see below) and was presented for 500 msec before the first stimulus and throughout the presentation of the stimuli.

Each sequence began with a ready signal consisting of a warning tone, a visual message indicating whether the following sequence would consist of tones or asterisks, and the number of tones or asterisks (five, six, or seven). The visual message was displayed for 500 msec. The ready signal was followed by a 500-msec delay that was filled with the visual noise background in that condition.

Each sequence consisted of five, six, or seven stimuli. The sequence always contained at least one long and one short stimulus. Other than this constraint, the sequences were generated randomly for each subject. Following the last stimulus in a sequence there was a 500-msec delay before the subject was prompted to respond (the delay was imposed to eliminate masking due to the prompt). The prompt consisted of a series of five, six, or seven dashes on the screen. For each serial position in the sequence, the subjects responded with a left-hand button to indicate a short stimulus and a right-hand button to indicate a long stimulus. On responding, the dash in the corresponding serial position was replaced by an S or an L. The subject responded in serial order, but, by pressing both buttons simultaneously, he/she could erase the sequence and start over. When satisfied with the response, the subject pressed a third button to record the response and initiate the ready signal for the next trial.

Design. Except for the use of two experimenters, all factors were manipulated within subjects. The order of the three background conditions (auditory noise, visual noise, no noise) was counterbalanced across subjects. Within a background condition, the modality of the sequences (auditory or visual) was blocked and counterbalanced. Within each modality, there were eight sequences of each length (5, 6, or 7). These sequences were blocked by length and were always presented in order starting with Sequence Length 5.

Before each change of background, the subjects were given a brief break and were informed of the nature of the upcoming condition. In addition, before the experimental trials began, the subjects were given practice (in the no-noise condition) on two sequences of tones and two sequences of asterisks. At the end of the experiment, they were asked if they had closed their eyes to avoid the visual noise background during the auditory trials. No subject admitted to doing this.

Results

The first two sequences of each length for each modality for each background condition were treated as practice and were not scored. The responses for the remaining six sequences were scored in two ways. First, a sequence was scored as 1 if the responses (short or long) in all serial positions were correct (or 0 otherwise). We will refer to this as the completely correct score. Second, we scored the response in each serial position as correct or not, so that we could examine serial position curves.

In the analyses of the completely correct variable, experimenter was treated as a factor (with the reduction in degrees of freedom for various error terms). Neither the main effect nor any of its interactions were significant. This factor was not included in analyses by serial position. All statistical analyses were conducted with the probability of a Type I error set equal to .05, and the dependent variable was always number correct (rather than proportions as indicated in the figures).

The data from the completely correct measure are illustrated in Figure 1. Note that there was a general advantage for the auditory sequences (.70) over the visual sequences (.58) [F(1,18) = 18.94, MSe = 2.63]. This result demonstrates auditory superiority in the coding and reproduction of rhythmic stimuli. There also was a highly significant effect of sequence length [F(2,36) = 51.37, MSe = 1.32].

Figure 1. Data from Experiment 1: Proportions of sequences recalled exactly as presented as a function of sequence length, modality of presentation, and background condition. (Panels: Auditory Sequences, Visual Sequences. Horizontal axis: sequence length, 5 to 7. Curves: None, Visual, and Auditory backgrounds.)

The change-from-background hypothesis predicts a decrement in performance when the stimulus modality matches the background modality. In other words, stimulus modality and background condition should interact. Although none of the two-factor interactions were significant (ps > .19), the three-factor, background condition × modality × length interaction was significant [F(4,72) = 2.98, MSe = 1.21]. Nonetheless, this interaction does not offer much support to the change-from-background hypothesis. Note that, except at Sequence Length 7, performance on the auditory sequences in the auditory background condition was within a few percentage points of performance in the visual-noise-background and no-noise-background conditions. Considering performance on the visual sequences, performance in the visual-noise-background condition (.61) was slightly larger than it was in the auditory-noise-background condition (.56) and the no-noise-background condition (.55).

The results from the serial position analysis, averaged over sequence length, are illustrated in Figure 2. For each sequence length, we scored the first two and last three serial positions so that we could include sequence length as a factor in the analysis of variance. There was a main effect for sequence modality, indicating auditory superiority [F(1,19) = 22.23, MSe = 1.09]; a main effect for length, indicating a decrease in performance with the longer lengths [F(2,38) = 36.51, MSe = 1.10]; and a main effect of serial position, indicating a decrease in recall across the positions [F(4,76) = 34.77, MSe = 1.31].

A number of interactions were significant. First, the modality × serial position interaction was significant, indicating an increase in the auditory advantage toward the end of the sequence [F(4,76) = 4.34, MSe = .64]. There was also a significant sequence length × serial position interaction [F(8,152) = 5.56, MSe = .70]. This interaction was produced by the fact that the three lengths did not differ much in the initial serial position (.97, .95, and .95, for Lengths 5, 6, and 7, respectively), but did differ in the final serial position (.87, .81, and .77). Finally, the three-factor background condition × sequence modality × length interaction was significant [F(4,76) = 2.87, MSe = .88]. As with the corresponding interaction in the completely correct analysis, this interaction was not systematic: In the no-noise background, the auditory superiority was greatest for sequences of Length 5, in the auditory background condition the auditory superiority was greatest for sequences of Length 6, and in the visual background condition, the auditory superiority was

greatest for sequences of Length 7. Table 1 provides the means contributing to these interactions.

The three-factor interaction notwithstanding, there is little support for the change-from-background hypothesis. Across the three background conditions, average performance on the auditory sequences ranged only from .90 to .92, and average performance on the visual sequences ranged only from .86 to .87. Nonetheless, the serial position curves are notably different from standard serial position curves in two respects. First, there is little or no recency effect. Second, the auditory superiority is found throughout the serial position curve, although as indicated by the significant modality × serial position interaction, auditory superiority does increase toward the end of the sequence. We will discuss this further after presentation of Experiment 2 and in the General Discussion section.

Figure 2. Data from Experiment 1: Proportion of stimuli correct at each input serial position as a function of modality of presentation. (Panels: Auditory Background, No Background, Visual Background. Horizontal axis: serial position, 1, 2, L-2, L-1, L. Curves: Auditory, Visual.)

Table 1
Means from Experiment 1

            Sequence              Serial Position
Modality    Length      1      2    Last-2  Last-1   Last

                        No Background
Auditory       5       .98    .97    .93     .92     .96
               6       .95    .98    .88     .89     .82
               7       .92    .90    .82     .82     .82
Visual         5       .93    .90    .88     .89     .80
               6       .94    .90    .92     .87     .79
               7       .89    .89    .76     .77     .68

                        Auditory Background
Auditory       5       .98    .96    .94     .92     .91
               6       .97    .93    .91     .90     .83
               7       .96    .95    .81     .77     .77
Visual         5       .98    .98    .94     .97     .83
               6       .94    .89    .86     .78     .73
               7       .96    .93    .76     .79     .69

                        Visual Background
Auditory       5       .98    .97    .93     .91     .87
               6       .98    .94    .92     .85     .87
               7       .99    .98    .83     .81     .90
Visual         5       .98    .94    .96     .92     .86
               6       .93    .97    .86     .83     .82
               7       .96    .92    .77     .77     .72

EXPERIMENT 2

Like Experiment 1, Experiment 2 tested the change-from-background hypothesis. In this experiment, however, the manipulation was of stimulus energy, rather than of type of background.

The auditory stimulus was either loud or soft. The change-from-background hypothesis predicts better performance with the loud stimulus. The visual stimulus was either a single asterisk or many asterisks on the display screen. The hypothesis predicts better performance in the many-asterisks condition.

Method

Subjects. The subjects were 22 students obtained from the same source as were those in Experiment 1.

Materials. In the soft condition, the auditory stimuli were identical to those used in the no-noise-background condition of Experiment 1, except that the intensity of a stimulus tone was approximately 55 dB(A). In the loud condition, the intensity was set to approximately 85 dB(A). The visual stimulus (asterisk) in the single condition was exactly the same as the visual stimulus in the no-noise condition of Experiment 1. In the many condition, the visual stimulus consisted of an array of asterisks that completely filled the visible part of the display screen.

Design. Except for assignment to one of the two experimenters, all factors were manipulated within subjects. The two auditory conditions (loud and soft) and the two visual conditions (single and many) were blocked, but the order of the blocks was counterbalanced across subjects. In addition, we counterbalanced the order of the two conditions within a modality block. Within a condition, the subjects were presented with 10 sequences of each of the lengths (5, 6, and 7, in that order). The structure of each sequence (ready signal, sequence, response interval) was identical to that used in Experiment 1.

Results

The results of the completely correct scoring are illustrated in Figure 3. The only significant effects were for sequence length [F(2,40) = 33.06, MSe = 2.67] and condition [F(3,60) = 4.27, MSe = 3.21]. We further examined the main effect of condition using planned orthogonal contrasts. The difference between the auditory and the visual sequences was significant [F(1,60) = 12.05],

but the difference between the loud and soft conditions was not significant, nor was the difference between the single and many conditions (both Fs < 1). Thus, once again, there is no support for the change from background hypothesis.

Figure 3. Data from Experiment 2: Proportions of sequences recalled exactly as presented as a function of sequence length and modality of presentation. (Horizontal axis: sequence length, 5 to 7. Curves: Loud, Soft, Few, Many.)

The results of the serial position analysis (first two and last three positions) are illustrated in Figure 4. There were significant main effects for condition [F(3,60) = 8.11, MSe = 2.67], length [F(2,40) = 22.11, MSe = 2.14], and position [F(4,80) = 17.59, MSe = 1.93]. Although it appears that the auditory advantage increased near the end of the sequence, the condition × position interaction was not significant [F(12,240) = 1.62, MSe = .72, p = .09]. As in Experiment 1, there was little recency effect, and the auditory advantage was found at most serial positions.

Figure 4. Data from Experiment 2: Proportion of stimuli correct at each input serial position as a function of modality of presentation. (Horizontal axis: serial position, 1, 2, Last-2, Last-1, Last. Curves: Soft, Loud, Few, Many.)

Discussion of Experiments 1 and 2

The most important result was that there was an auditory superiority in the coding and reproduction of rhythms. The result was somewhat smaller than we had expected given our own experience in the task. Our subjective impression is that there is no need to engage an active, effortful coding process for the auditory stimuli; the rhythm simply impresses itself upon one. In contrast, the visual stimuli seem to require attention and effortful processing to code the durations as long or short. When queried, all of our subjects felt similarly. Although small in absolute terms, the auditory superiority was very consistent.

The auditory superiority is remarkable in that the physical information that must be coded is exactly the same for the auditory and visual sequences: In either case, the subject is responsible only for the duration of the events. This should be contrasted with standard modality effects in which the physical stimuli (e.g., a spoken word and a written word) are markedly different for the two modalities. Consistent with the temporal coding assumption, the auditory modality apparently is more specialized for coding temporal information than is the visual modality.

The results of Experiments 1 and 2 offer little support for the change-from-background hypothesis. Although the auditory stimuli may be more alerting, it is not because there is a greater change from the auditory background than there is from the visual background.

EXPERIMENT 3

The auditory superiority may not be due to any intrinsic difference between coding of auditory and visual stimuli (as implied by the temporal coding assumption). Instead, it may simply reflect differential practice. Most of us have had extensive practice identifying auditory rhythms from listening to music, poetry, or a well-tuned engine. Rhythms carried by the onsets and offsets of visual events (e.g., a flickering television screen) seem rarer. Experiment 3 was designed to test this notion by giving the subjects extensive practice in the rhythm task over 2 days.

We examined the data for two indicants of differential amounts of previous to code rhythms. First, if it is just a matter of performance in the visual condition catching up to performance in the auditory condition, we should observe equal asymptotes after practice. Suppose, however, that performance does not reach asymptotic levels after 2 days of practice. In this case, we are forced to consider whether performance in the visual condition is improving faster than performance in the auditory condition (indicating that visual performance is at an earlier part of a negatively accelerated learning curve). To do this, we used auditory sequences of Lengths 6, 7, and 8, and visual sequences of Lengths 5, 6, and 7. We made the auditory task more difficult so that we could find auditory and visual conditions that started at about the same performance level (e.g., Auditory Length 7 and Visual Length 6), enabling us to examine the learning curves for different rates of learning.

We also manipulated the type of response the subjects made after the sequence. In Experiments 1 and 2, the subjects responded by using two keys to indicate the sequence of long and short stimuli. Perhaps this dichotomous response encouraged unnecessary verbal recoding of visual stimuli into the words long and short, leading to poorer performance with the visual stimuli. In Experiment 3, half of the subjects used the dichotomous (short-long) response mode, and half used a continuous response mode. In the continuous response mode, the subject pressed one key for a duration that matched the remembered duration of the first stimulus. The key then was released and pressed again to indicate the duration of the second stimulus, and so on. The duration of each keypress was indicated on the display terminal by a line whose horizontal extent increased while the key was pressed. Lifting the key moved the cursor down the display screen where the next duration would be indicated. The various line lengths (indicating durations) were continuously displayed during the response interval (much like the Ss and Ls in the dichotomous response mode), and the subject could start over at any time. In addition, the instructions never mentioned the words long and short, and never referred to the fact that only two durations were used. Thus, we tried to eliminate any encouragement to verbally recode the stimuli.

Method

Subjects. The subjects were 24 students from the same source as for Experiments 1 and 2.

Materials. The stimuli were analogous to the loud-tones and many-asterisks conditions from Experiment 2. Auditory sequence lengths were 6, 7, and 8, and the visual sequence lengths were 5, 6, and 7.

Design. The subjects were assigned randomly to one of two experimenters and to a response mode (dichotomous or continuous). For the subjects in the continuous response mode, the instructions were slightly modified to remove any reference to the number of different stimulus durations, and the words long and short were not used. The other factors were manipulated within subjects. One variable was day; the subjects participated in the experiment on 2 consecutive days. Each day, the subjects experienced three blocks of sequences. A block consisted of six auditory sequences at each length (6, 7, and 8) and six visual sequences at each length (5, 6, and 7). Within a block, the order of the auditory and visual sequences was counterbalanced over subjects (for one particular subject, however, the order remained the same across all blocks and days), but the sequences were always blocked by modality and were always presented in the order of increasing sequence length. In addition, before these trials started, the subjects had practiced on two auditory trials and two visual trials. All other aspects of the design and procedure were identical to those of Experiment 2.

Results

The first sequence in every group of six was considered as practice and was not scored; otherwise, scoring for the subjects in the dichotomous response condition was identical to that used in Experiments 1 and 2. A different procedure was used for the subjects in the continuous response condition. The raw data consisted of the durations for which the response key was pressed. We needed to find a cutoff duration so that any longer duration key press would be classified as long and any shorter duration key press would be classified as short. To find a cutoff for each subject, we used an iterative procedure that maximized the proportion of matches between the actual stimulus (long or short) and the classification. After the responses were classified, they were scored exactly as in the dichotomous response condition.

The results from the completely correct scoring are displayed in Figure 5. First, note that learning has, apparently, reached an asymptote by the end of Day 2. In fact, many of the curves show a slight downturn between Blocks 2 and 3 on Day 2. Thus, we need not worry about having to compare the rates of learning; instead, we can concentrate on the asymptotic levels. Second, for sequences of comparable lengths, performance in the auditory conditions exceeds performance in the visual conditions at the asymptotes and in the initial parts of the learning curves. Thus, the auditory superiority is unlikely to reflect only differential extraexperimental practice. On the other hand, note that there is some evidence (particularly in the dichotomous response condition) that performance in the visual conditions showed an initial spurt from Block 1 to Block 2 (on Day 1) not matched by the auditory conditions. Finally, note that, in the continuous response mode, the auditory superiority was as large and as consistent as it was in the dichotomous response mode. We learn from this last result that the auditory superiority was not due to the fact that the subjects were forced to use a dichotomous and verbally codable response mode.
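
The cutoff classification used for the continuous response condition in Experiment 3 can be illustrated with a short Python sketch. The article states only that an iterative procedure maximized the proportion of matches between the actual stimulus and the classification; the exhaustive search over midpoints between observed durations shown here, the function names (find_cutoff, classify, completely_correct), and the example durations are all assumptions made for illustration, not the authors' procedure or data.

```python
from typing import List, Sequence, Tuple


def find_cutoff(press_ms: Sequence[float], actual: Sequence[str]) -> Tuple[float, float]:
    """Return (cutoff, proportion of matches) for one subject.

    press_ms: key-press durations in msec, pooled over the subject's trials.
    actual:   the presented stimulus for each press, "S" (short) or "L" (long).
    Assumption: a simple exhaustive search stands in for the unspecified
    iterative procedure described in the article.
    """
    if len(press_ms) != len(actual) or not press_ms:
        raise ValueError("need equal-length, non-empty inputs")
    values = sorted(set(press_ms))
    # Candidate cutoffs: one below the shortest press, the midpoints between
    # neighbouring observed durations, and one above the longest press.
    candidates = [values[0] - 1.0]
    candidates += [(a + b) / 2.0 for a, b in zip(values, values[1:])]
    candidates.append(values[-1] + 1.0)

    best_cut, best_prop = candidates[0], -1.0
    for cut in candidates:
        matches = sum(
            ("L" if d > cut else "S") == s for d, s in zip(press_ms, actual)
        )
        prop = matches / len(press_ms)
        if prop > best_prop:  # keep the first cutoff that reaches the maximum
            best_cut, best_prop = cut, prop
    return best_cut, best_prop


def classify(press_ms: Sequence[float], cutoff: float) -> List[str]:
    """Label each press as long ("L") if it exceeds the cutoff, else short ("S")."""
    return ["L" if d > cutoff else "S" for d in press_ms]


def completely_correct(response: Sequence[str], presented: Sequence[str]) -> int:
    """Score 1 if every serial position matches the presented sequence, else 0."""
    return int(list(response) == list(presented))


if __name__ == "__main__":
    # Hypothetical presses from one subject, with the stimuli they reproduce.
    presses = [240, 300, 700, 820, 260, 760]
    stimuli = ["S", "S", "L", "L", "S", "L"]
    cutoff, proportion = find_cutoff(presses, stimuli)
    print(cutoff, proportion)
    print(completely_correct(classify(presses, cutoff), stimuli))
```

Because a single cutoff is fitted per subject across all serial positions, a sketch of this kind also makes it easy to see why position-specific cutoffs could differ from the pooled one.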

Figure 5. Data from Experiment 3: For each response mode, proportion of sequences recalled exactly as presented as a function of modality of presentation, sequence length, and practice. (Panels: Dichotomous, Continuous. Horizontal axis: Blocks 1 to 3 within Day 1 and Day 2. Curves: Vis-5, Vis-6, Vis-7, Aud-6, Aud-7, Aud-8.)

These observations are supported by analyses of variance conducted separately for the dichotomous and continuous response modes. Consider first the dichotomous response mode. Significant practice effects are indicated by the main effects for day [F(1,11) = 16.74, MSe = 2.09] and for block [F(2,22) = 11.80, MSe = 1.39]. The interaction of these two variables [F(2,22) = 10.95, MSe = 1.22] indicates that the improvement over blocks was smaller on Day 2 than it was on Day 1. That is, learning appears to have reached an asymptote by Day 2. There also was a main effect of length [F(2,22) = 40.60, MSe = 1.68], indicating a decrease in performance with an increase in sequence length. The modality main effect was not significant (F < 1) because of the use of longer sequence lengths for the auditory conditions.

Two higher order interactions also were significant. One interaction was among the modality, day, and block factors [F(2,22) = 5.57, MSe = .61]. This interaction indicates that, on Day 1, the visual sequences showed a greater improvement over blocks than did the auditory sequences. Thus, there is some support for an initial learning spurt for the visual sequences, as predicted by the differential practice hypothesis. The second interaction was among modality, day, and sequence length factors [F(2,22) = 3.67, MSe = .46]. We could find no sensible interpretation for this interaction.

The results from the analyses of the continuous response mode condition were simpler. There were main effects of day [F(1,11) = 11.36, MSe = 2.69] and block [F(2,22) = 3.45, MSe = 1.10], both of which indicate practice effects. There also was a main effect of sequence length [F(2,22) = 28.28, MSe = 1.23], indicating a decrease in performance with an increase in sequence length. The main effect of modality was not significant, again because of the use of longer sequence lengths for the auditory conditions. Also, there were no significant interactions involving modality, providing little support to the differential practice hypothesis for this response mode.

We analyzed serial position functions only for the dichotomous response mode, for Day 1, for sequence lengths of 6 and 7. We do not present the data from the continuous response mode because of the possibility that our scoring procedure (finding a cutoff between long and short responses that maximizes overall probability of being correct) might have had unexpected effects at different serial positions. That is, the cutoff that maximized performance at the early serial positions might have been different from the cutoff that maximized performance at the later serial positions. Furthermore, we do not present the analyses of data from Day 2 because of problems with ceiling effects at many of the serial positions.

The data are presented in Figure 6. Note that the serial position functions display the same two unusual characteristics that we observed in Experiments 1 and 2, namely, the auditory superiority throughout most of the serial positions and the absence of a strong recency effect.

[Figure 6 plot: horizontal axis, serial position (1, 2, Last-3, Last-2, Last-1, Last); curves for Aud-6, Vis-6, Aud-7, and Vis-7.]
Figure 6. Data from Experiment 3 (dichotomous response mode): Proportion of stimuli correct at each input serial position as a function of modality of presentation and sequence length.

There were main effects of modality [F(1,11) = 9.38, MSe = 1.96], sequence length [F(1,11) = 15.16, MSe = 1.76], serial position [F(5,55) = 12.61, MSe = .93], and block [F(2,22) = 12.40, MSe = 1.23]. The main effect of block was due to improvement from .80 to .87 to .89 across Blocks 1, 2, and 3, respectively. Although there is a trend in the data for the auditory superiority to increase across serial positions, the F ratio for the modality by serial position interaction was less than 1.0.

Three interactions were significant (or nearly so), but they were not of apparent theoretical interest. There was a block x serial position interaction [F(10,110) = 1.94, MSe = .52]. Over blocks, there was a 10% improvement at the last serial position, but only a 3% improvement at the first serial position, probably because of ceiling effects. The interaction between sequence length and serial position was marginally significant [F(5,55) = 2.29, MSe = .56, p = .058]. As can be seen in Figure 6, the decrease over serial positions was more dramatic for sequences of Length 7 than for sequences of Length 6. Finally, the block x sequence length x serial position interaction was significant [F(10,100) = 2.25, MSe = .33]. The interaction between length and serial position was not apparent in the last block, probably because of ceiling effects. Table 2 provides the means contributing to these interactions.

We draw two new conclusions from these results. First, because of differences between the modalities in asymptotic performance, we conclude that the auditory superiority does not reflect only differential preexperimental practice with auditory rhythms. There was a hint of differential practice for the dichotomous response mode in that there was an initial learning spurt for the visual sequences that was not matched by the auditory sequences. However, this interaction was not found for the continuous response mode, and, in any case, it did not eliminate the difference between the modalities in asymptotic performance. Second, because the difference between the modalities was also found in the continuous response
mode, we conclude that the auditory superiority did not result from a forced incompatible response mode after the visual presentations.

Table 2
Means from Experiment 3: Dichotomous Response Mode, Day 1, Sequence Lengths 6 and 7

                                Serial Position
Block  Modality    1     2     Last-3  Last-2  Last-1  Last
Sequence Length 6
1      Auditory   .92   .95    .87     .82     .82     .90
       Visual     .88   .78    .88     .72     .65     .73
2      Auditory   .98   .98    .97     .92     .88     .87
       Visual     .92   .93    .90     .90     .85     .88
3      Auditory   .98   .98    .95     .90     .95     .88
       Visual     .93   .92    .87     .95     .90     .82
Sequence Length 7
1      Auditory   .92   .87    .73     .78     .70     .75
       Visual     .95   .90    .75     .67     .68     .60
2      Auditory   .95   .92    .88     .82     .73     .82
       Visual     .92   .83    .78     .77     .67     .70
3      Auditory   .95   .90    .88     .88     .85     .87
       Visual     .92   .85    .85     .73     .83     .80

EXPERIMENT 4

For the sake of parsimony, we would like to maintain that the auditory superiority we have observed here is related (by the temporal coding assumption) to the modality effect observed in recall. Two aspects of the results argue against this relationship, however. First, the modality effect in recall is generally confined to the end of a list, but we observed modality effects throughout the list. It should be noted, however, that through-list modality effects have been reported for recall tasks. Greene (1985), Routh (1976), and J. M. Gardiner, Gregg, and M. M. Gardiner (1984) have reported through-list modality effects in serial recall, and J. M. Gardiner et al. (1984) have found them in .

Second, the modality effect in recall usually takes the form of an enhanced recency effect for auditory material compared to visual material. Here, we observed very little recency effect. However, there are many differences between the rhythm task and standard recall. For one, in the rhythm task, subjects are recalling a sequence of durations; in the standard recall task, we have proposed (Glenberg & Swanson, 1986) that subjects use temporal information as a retrieval cue to recall other information, namely, the items on the list. This extra stage intervening between the temporal information and recall could well hide any through-list effects (see, for example, Glenberg & Fernandez, 1988). Another difference between the standard task and the rhythm task is that typically a variety of stimuli is used in the standard task, not just two (long and short) as we used in the rhythm task.

Experiment 4 was designed to determine if the modality effect in recall would resemble the modality effect in the rhythm task when the sequences for the recall task are composed of just two stimuli. In Experiment 4, the subjects experienced four sequence presentation conditions. As in the previous experiments, auditory and visual rhythmic sequences were used. In addition, the subjects recalled sequences composed of the letters L and S, which were presented either aurally or visually.

Method

Subjects. The subjects were 16 members of the University of Wisconsin summer community. Seven were paid for their participation, and the others used participation in Experiment 4 to partially fulfill a course research requirement.

Materials. A new piece of equipment was used to present the rhythmic sequences. This equipment consisted of a small box (resting on top of the video monitor) that enclosed a fixed-frequency square-wave oscillator used to drive a 2-in. loudspeaker. A diffused lens 5-mm red LED was also part of the box. The Apple II+ computer was used to turn on the oscillator (for auditory rhythmic stimuli) and the LED (for the visual rhythmic stimuli) and to time the stimulus duration.

We created the auditory letter stimuli by recording single exemplars of the letters S and L on an Instavox, a device that allows rapid random access to analog recordings of sound. The Instavox was then used to construct a tape recording of the appropriate sequences of the spoken letters, which was amplified and played through the same speaker used to present the auditory rhythmic sequences. The visual letter sequences were presented in the standard Apple Turtlegraphics font on the video monitor controlled by the computer.

All stimulus sequences were 8 elements long. The 12 sequences used in Experiment 4 were constructed from the groups of elements LL, SSL, and LSS (each letter indicates a long or short element for the rhythm task or presentation of the letter L or S in the recall task). Each sequence contained exactly one LL group and two of the other groups (one of each or one used twice). Once three groups were selected (e.g., LL, SSL, LSS), all possible orders of the three groups were constructed, resulting in a set of 12 unique, 8-element sequences. These stimulus sequences were constructed to correspond to those used in another experiment that will not be reported here. Four random orderings of the 12 sequences were generated. These random orderings were assigned in a counterbalanced fashion to the four sequence presentation conditions.

Timing of the rhythmic sequences (including the ready signal and the responses) was as described for Experiments 1-3 (short = 250 msec, long = 750 msec, interstimulus interval = 250 msec). For the letter sequences, each letter was presented within a 400-msec window and followed by a 350-msec interval. Thus, the average duration of a letter sequence equaled the average duration of a rhythmic sequence. The ready signal and the response requirements for the letter sequences were exactly the same as those used for the rhythmic sequences.

Design. The two main factors (rhythmic or letter sequence and auditory or visual presentation) were manipulated within subjects, so that each subject experienced all four sequence presentation conditions. The sequences were blocked so that all of the letter sequences appeared first for half of the subjects and all of the rhythmic sequences appeared first for the others. Within these blocks, the sequences were blocked by modality of presentation so that all of the auditory sequences appeared first for half of the subjects and all of the visual sequences appeared first for the others. As noted before, each of the four types of trials was assigned one of the four orders of the 12 unique 8-element sequences. This assignment was counterbalanced across subjects. Each block of 12 trials was
preceded by instruction as to the nature of the upcoming trials and two practice trials.

[Figure 7 plot: horizontal axis, serial position; curves for Recall-Aud, Rhythm-Aud, Recall-Vis, and Rhythm-Vis.]
Figure 7. Data from Experiment 4: Proportion of stimuli correct at each input serial position as a function of sequence presentation condition.

Results and Discussion

The analysis will focus on the serial position scoring of the 12 sequences in each of the four main conditions. The data are presented in Figure 7.

The main effect for type of sequence [F(1,15) = 7.57, MSe = 7.89] indicates better performance with the letter sequences than with the rhythmic sequences. The main effect of modality [F(1,15) = 7.79, MSe = 4.04] indicates better performance with the auditory sequences than with the visual sequences. Finally, there was a main effect of serial position [F(7,105) = 11.53, MSe = 2.25]. None of the interactions were significant in this analysis, nor were any interactions significant in an analysis of the last three serial positions.

The major question is whether or not the unusual features of the modality effect in the rhythm task are also found in the recall task, now that we have controlled for the number of stimuli, presentation rate, and so forth. First, judging from Figure 7 and the statistical analyses, the auditory superiority was about the same size in both tasks. Second, for both tasks, the auditory superiority was found in most serial positions, not just at the end of the list. Third, because the interaction between task, modality, and serial position was not significant (p = .31, considering all positions; p = .14, considering the last three positions), there is not much evidence that the auditory recency effect was larger for the recall task than it was for the rhythm task. However, this last result is weak. Not only is it based on the acceptance of the null hypothesis, but it may well be influenced by a ceiling effect in the recall data. Nonetheless, it is clear from these data that the two modality effects were quite similar.

GENERAL DISCUSSION

There is a modality effect in the perception and reproduction of rhythms: Auditory rhythmic sequences are reproduced more accurately than are visual rhythmic sequences. We were able to reject three hypotheses that attributed this auditory superiority to relatively uninteresting processes. The change-from-background hypothesis was ruled out by the results from Experiments 1 and 2: Auditory and visual noise (and intensity of the auditory and visual signals) had little effect on the size of the auditory superiority. The differential learning hypothesis was ruled out by the results of Experiment 3: There was little to suggest that, given enough practice, perception of visual rhythms will approach that of auditory rhythms. The results of Experiment 3 also demonstrated that the auditory superiority is not due to specific response requirements.

The remainder of the discussion is organized around three questions. How are the modality effects in the rhythm and recall tasks related? What are the implications of this relationship for explanations of the modality effect? How can we conceptualize this relationship?

There are three reasons for believing that the modality effect found with rhythmic stimuli is produced by the same cognitive mechanisms that produce the modality effect found using letter stimuli. First, the rhythm modality effect is in itself evidence that the processes that produce the standard modality effect are being tapped. Of course, one could propose that modality of presentation influences multiple information processing stages and that different stages are responsible for the two modality effects. Nonetheless, until evidence for this claim is made available, it is more parsimonious to assume that the two modality effects are produced by the same underlying processes.

On the basis of the results of Experiments 1-3, one could argue that the rhythm modality effect is different from the standard modality effect, because it is found at all serial positions and because it is not associated with a large recency effect. However, this argument is at least partially nullified by the results of Experiment 4: The unusual features of the rhythm modality effect were also characteristic of the recall modality effect when the number of stimuli was reduced to two. Thus, the fact that the shapes of the serial position functions in the rhythm and recall tasks were similar is a second reason for believing that the two modality effects tap the same mechanisms.

Glenberg, Jona, and Curran (1989) discuss a third reason. Their reasoning is based on a correlational analysis: If two effects are produced by the same information processing mechanism, the sizes of the effects should correlate across subjects. In fact, Glenberg et al. found a correlation of .61 between the size of the modality effect in a rhythm task and the size of the modality effect in a more standard recall task. For reasons discussed in Glenberg et al.'s manuscript, application of this analysis to the data from Experiment 4 is open to a number of artifacts, and thus will not be reported.³

If, indeed, the rhythm and recall modality effects are produced by the same cognitive processes, what are the implications for explanations of the modality effect? First, we note that our results using the rhythm task bear a family resemblance to Roberts's (1986) finding of modality effects in memory for melodic stimuli (see also Greene & Samuel, 1986). Roberts presented to trained musicians sequences of musical notes or the visual counterparts of the notes using musical notation. As in Experiments 1-4, Roberts found auditory superiority that was not confined to the recency portion of the list, and she found relatively small recency effects. Thus, Roberts's results (as pointed out by Greene & Samuel, 1986), as well as the results from Experiments 1-4, demonstrate that auditory superiority is not necessarily linked to speech. Therefore, speech-based explanations of the modality effect are, at best, incomplete. Two such explanations are the Crowder and Morton (1969) speech-based precategorical acoustic store model and the Shand and Klima (1981) primary linguistic code hypothesis. The latter hypothesis is that the modality effect results from coding information using the code most often used by the perceiver for coding linguistic information.

We can make an even stronger claim. To the extent that the rhythm modality effect and the recall modality effect tap the same mechanisms (and Experiment 4 and Glenberg et al.'s 1989 data suggest that they do), we can discount speech-based hypotheses as viable accounts of any modality effect. That is, these hypotheses cannot account for the rhythm modality effect, and because the rhythm and recall modality effects are (probably) produced by the same mechanisms, the hypotheses cannot adequately explain the recall modality effect.

The results from Experiment 1 (in which background was manipulated) are also inconsistent with Nairne's (1988) framework for interpreting recency effects. Nairne proposed that visual recency effects are modest because visual features of the ambient visual environment interfere with the recently encoded visually presented items. This account would predict that, compared to the no-background condition, the visual-noise-background condition would severely degrade performance on the visual sequences. The data in Figure 1 provide little comfort for this position.

The rhythm modality effect also has positive implications in that it supports the idea that modality effects reflect the coding of temporal information. That is, the rhythm modality effect is consistent with the claim (J. M. Gardiner, 1983; Glenberg & Swanson, 1986) that auditory stimuli result in more accurate coding of temporal information than do visual stimuli. Rhythms are, of course, temporal stimuli, and the very fact that the rhythm modality effect is obtained with a pure temporal stimulus is suggestive that temporal mechanisms are contributing to the effect.

Glenberg et al. (1989) report data that are, perhaps, more convincing that the rhythm modality effect (and thus, by implication, the recall modality effect) reflects the coding of temporal information. Glenberg et al. disrupted the rhythmic (temporal) quality of the stimulus sequences by randomly varying the duration of the interstimulus interval between the elements in the rhythm task. This manipulation eliminated the rhythm modality effect by reducing performance on the auditory sequences to that on the visual sequences, whereas the manipulation had little effect on performance on the visual sequences. Thus, this purely temporal manipulation affects auditory sequences but not visual sequences, a result that strongly supports the assumption that temporal information plays a more important role in memory for auditory events than in memory for visual events.

We have yet to describe, however, exactly how temporal coding is affected by modality. One hypothesis, derived from the reports of our subjects, attributes modality differences to the coding of duration. Many subjects claimed that they attempted to code the visual stimuli by quietly humming. Perhaps then, duration is encoded by a single mechanism that is stimulated directly by auditory signals, but requires intervention (e.g., humming) to encode duration of visual signals. For example, the Baddeley and Salame (1986) phonetic store could be one such mechanism. Variability in beginning or ending the intervention could make discriminating long and short visual stimuli more difficult than discriminating long and short auditory stimuli. Alternatively, the intervention may not activate the time encoder to the same degree as direct stimulation.

A second hypothesis concerns the coding of sequential, or ordinal, relations among stimuli. This hypothesis is based on research investigating the coding of time of presentation of events (rather than of duration). This research indicates that the representation of time of occurrence is unlikely to be akin to a tag that gives clock time. Instead, it appears that time of presentation is coded as an ordinal relation, that the target event occurred after some other event, or simultaneously with another (Tzeng & Cotton, 1980; Winograd & Soloway, 1985). In addition, there is some evidence that this sort of relational coding is hierarchical, both in the short term (Lee & Estes, 1981) and the long term (Friedman, 1987; Friedman & Wilkins, 1985).

Following Winograd and Soloway (1985), suppose that when one event reminds the learner of a previous event, an ordinal (temporal) relation is coded between the two events. Suppose further that this reminding can be based on perceptual similarities as well as conceptual similarities. Now, why might this form of ordinal coding be superior for auditory events than for visual events? There are many possibilities, but one that appeals to us is based on the notion of auditory streaming (Bregman, 1981; Bregman & Campbell, 1971). An auditory stream is a type of perceptual organization imposed on the auditory environment. The stream is the equivalent of an auditory object. Whether or not successive auditory events stream depends on similarity of spatial location, pitch, and rhythm (Handel, Weaver, & Lawson, 1983). Thus, perhaps the auditory events used in Experiments 1-4 stream, enhancing the perception of similarity. Because of this perceived similarity, ordinal relations are coded between the representations of successive tones (or auditory letters) more readily than they are between the representations of successive asterisks (or visual letters). The result is a fine-grained coding of order relations for auditory events but

GREENE, R. L., & SAMUEL, A. G. (1986). Recency and suffix effects in serial recall of musical stimuli. Journal of Experimental Psychology: Learning, Memory, & Cognition, 12, 517-524.
HANDEL, S., WEAVER, M. S., &