Psychological Review J Copyright © 1977 \^J by the American Psychological Association, Inc.

VOLUME 84 NUMBER 2 MARCH 1977

Controlled and Automatic Human Information Processing: II. Perceptual Learning, Automatic Attending, and a General Theory

Richard M. Shiffrin Walter Schneider Indiana University University of California, Berkeley

The two-process theory of detection, search, and presented by Schneider and Shiffrin is tested and extended in a series of experiments. The studies demonstrate the qualitative difference between two modes of informa- tion processing: automatic detection and controlled search. They trace the course of the learning of automatic detection, of categories, and of automatic- attention responses. They show the dependence of automatic detection on at- tending responses and demonstrate how such responses interrupt controlled processing and interfere with the focusing of attention. The learning of cat- egories is shown to improve controlled search performance. A general frame- work for human information processing is proposed; the framework emphasizes the roles of automatic and controlled processing. The theory is compared to and contrasted with extant models of search and attention.

I. Introduction serial in nature with a limited comparison rate, is easily established, altered, and even In Part I of this paper (Schneider & reversed by the subject, and is strongly de- Shiffrin, 1977) we reported the results of pendent on load. Automatic detection is rela- several experiments on search and attention tively well learned in long-term , is that led us to formulate a theory of informa- demanding of attention only when a target tion processing based on two fundamental is presented, is parallel in nature, is difficult processing modes: controlled and automatic. to alter, to ignore, or to suppress once learned, In the context of search studies, these modes and is virtually unaffected by load. took the form of controlled search and auto- In the present article we shall report matic detection. Controlled search is highly several studies to elucidate further the proper- demanding of attentional capacity, is usually ties of automatic and controlled processing and their interrelations, to demonstrate the qualitative difference between these processing The research and theory reported here were sup- ported by PHS Grant 12717 and a Guggenheim modes, to study the development of automatic Fellowship to the first author, Grant MH23878 to detection and the role of the type and nature the Rockefeller University, and a Miller Fellowship of practice in such development, to study at University of California at Berkeley to the second the effects of categorization, and to examine author. This report represents equal and shared con- the development of automatic attending and tributions of both authors. Requests for reprints should be sent to Richard its effects. After the presentation of the M. Shiffrin, Department of , Indiana studies we shall present a general theory of University, Bloomington, Indiana 47401. ' information processing, with emphasis on the 127 128 RICHARD M. SHIFFRIN AND WALTER SCHNEIDER

TRIAL I

s TRIAL 2 r 63

CM -

TRIAL 3

TRIAL 4

TRIAL I

TRIAL 2 CJK

VM -

TRIAL 3

TRIAL 4

Figure 1. Examples of trials in the multiple-frame search paradigm of Experiment 1, Part I. In all cases, memory-set size = 4 and frame size = 2. Four varied-mapping (VM) trials and four con- sistent-mapping (CM) trials are depicted. The memory set is presented in advance of each trial, then the fixation dot goes on for .5 sec when the subject starts the trial, and then 20 frames are presented at a fixed time per frame. Either 0 or 1 member of the memory set is presented during each trial. Frame time, memory-set size, and frame size are varied across conditions. roles of automatic and controlled processing. frame. Each trial consists of the presentation Our theory will then be compared and con- of 20 frames in immediate succession. The trasted with extant theories of search and elements presented are characters (i.e., digits attention. or consonants) or random dot masks. In ad- vance of each trial the subject is presented A. Review of Paradigms and Results From with several items, called the memory set, and Part I is then required to detect any memory-set items that appear in the subsequent frames. The paradigm for most of the studies of The frame time is kept constant across the Part I (and Part II also) is depicted in 20 frames of each trial, and the basic depen- Figure 1. Four elements are presented simul- dent variable is the psychometric function taneously in a square; and their joint presen- relating accuracy to frame time for each tation for a brief period of time is termed a condition. PERCEPTUAL LEARNING AND AUTOMATIC ATTENDING 129

Three basic independent variables were Table 1 manipulated in Part I/Experiment 1. The Examples of CM and VM Trials for number of characters in each frame (the Four Successive Trials frame size, F) was varied from 1 to 4 (but was constant for all frames of a given trial). Memory Distractor Trial set set The number of characters presented in ad- Target vance of a trial (the memory-set size, M) Consistent mapping (CM) varied from 1 to 4. The product of M and F 1 7481 KGJCM 4 is termed the load. A memory-set item that 2 2583 CHFLD none appears in a frame is called a target; an 3 1739 KGFDM 7 item in a frame that is not in the memory 4 6S82 CMJKD 8 set is called a distractor. One half of the trials Varied mapping (VM) contained one target, and one half contained 1 MJDG CFHKL D no target. Finally, and most important, the 2 CJKH LGDFM none nature of the training procedure across trials 3 GMCH DLFKJ none and the relation of the memory-set items to 4 JLKF CDGHM L the distractors were varied. In the consistent mapping (CM) procedure, across all trials, memory-set items were never distractors (and only a single frame on each trial; accuracy vice versa). In addition, memory-set items was high and the results were analyzed on were from one category (e.g., digits) and the basis of the reaction time of the responses. distractors from another category (e.g., con- The results confirmed those of Part I/Experi- sonants). In the varied mapping (VM) ment 1, and a quantitative model was fit to procedure, memory-set items and distractors the VM results of both experiments. This were randomly intermixed over trials and model assumed that controlled search was were all from one category (e.g., consonants). a serial, terminating comparison process in Figure 1 gives examples of trials in both which one first compared all frame items the VM and CM conditions. Depicted are against one memory-set item before switching four consecutive trials from a CM block and to the next memory-set item. Each compari- four from a VM block in each of which son and each switch was assumed to require M = 4 and F = 2. Table 1 gives the memory some time to be executed. The success of the set, distractor, and target (if present) for model in fitting the results of both experi- each trial in Figure 1. Note that memory-set ments suggests that the same search mech- items and distractors intermix across trials anisms underlie search experiments that uti- in the VM condition but do not intermix over lize both accuracy and reaction time mea- trials in the CM condition. sures, and suggests that the same search The most important results are shown in mechanisms underlie performance in both Figure 2. These results showed that the VM divided-attention and search paradigms. conditions were strongly affected by load and The vast differences between results of the were quite difficult; the CM conditions were CM and VM conditions provided a basis for virtually unaffected by load and were all reorganizing and classifying the results of easier than even the easiest VM condition. previous search and detection studies. These It was suggested that a controlled, serial studies fell into a relatively simple organiza- search was operating in the VM conditions tion, and many perplexing and seemingly and that a qualitatively different process, contradictory results became explicable. automatic detection, was operating in the Part I/Experiment 3 utilized a multiple- CM conditions. target, multiple-frame procedure. The study The results of Part I/Experiment 1 were was similar to Part I/Experiment 1, but the analyzed on the basis of the accuracy of the subject was presented either zero, one, or two detection responses. Part I/Experiment 2 targets per trial and was required to report utilized comparable conditions but presented the number of detected targets. The condi- 130 RICHARD M. SHIFFRIN AND WALTER SCHNEIDER CONSISTENT VARIED MAPPINGS MAPPINGS 100 100 c/) O) H .(4,1) 4,2) I80 80 r- (4,4) A" (1,4) 60 LU 60 O LU LJ 40 40 80 120 120200 400 600 800 FRAME TIME (msec) FRAME TIME (msec) Figure 2. Data from Experiment 1, Part I. Hits as a function of frame time for each of the 12 conditions. Three frame times were utilized for each condition. The first number in parentheses indicates the memory-set size and the second number in parenthesis indicates the frame size.

dons of greatest interest were those in which frame procedure and will infer the presence two targets per trial were presented. In these of automatic detection or controlled search cases the spacing between the two targets from the nature of the spacing effect and the varied from 0 to 4 (0 spacing indicates simul- target-similarity effect. taneous presentation; spacings of 1, 2, and 4 indicate the number of intervening frame B. Rationale for the Experiments plus 1). Target similarity also varied, the two targets being either physically identical We argued in Part I that the cause of the (II) or different (NI). difference between the CM and VM results The results for Part I/Experiment 3 was the consistency of the mapping (over showed markedly different patterns for the trials) of the memory-set items and dis- VM and CM conditions. Consider target tractors to responses. We argued that con- similarity first. In the VM conditions de- sistent mapping leads to the development of tection of identical targets (II) was superior automatic detection, which enables auto- to detection of different targets (NI). How- matic-attention responses to become attached ever, in the CM conditions, the reverse to the memory-set items. The automatic-at- was true: NI detection was superior to II tention responses enable the serial, controlled detection. Consider target spacing next. In search to be bypassed by a parallel detection the VM conditions performance was lowest process unaffected by load. However, these when the targets occurred in successive frames hypotheses must be further tested for the (spacing 1). However, in the CM conditions following reasons. First, the fact that mem- performance was lowest when the targets were ory-set items were categorically distinct simultaneous (spacing 0). These results from the distractors was confounded with the helped emphasize the qualitative difference consistency of the mapping. These factors between the automatic detection processing will be separated in Part II/Experiments 1, mode presumed to be utilized in the CM 2, and 3. Second, the course of development conditions, and the controlled search mode of the hypothesized automatic detection presumed to be utilized in the VM conditions. process was not studied in Part I. The course In certain of the studies in this paper, we of learning will be traced in Part II/Experi- shall utilize this multiple-target, multiple- ments 1 and 3. Third, there was no demon- PERCEPTUAL LEARNING AND AUTOMATIC ATTENDING 131 stration in Part I that an automatic-atten- was always equal to 2 and the memory-set size was tion response is learned in CM paradigms. always equal to 4. Two disjoint character sets were used for the memory ensemble and the distractor Such a fact will be suggested by Part II / set. One consisted of the following nine consonants: Experiments 1 to 3 and demonstrated in B, C, D, F, G, H, J, K, and L; the other con- Part II/Experiment 4. Finally, although these sisted of the following nine consonants: Q, R, S, goals provide an initial justification for the T, V, W, X, Y, and Z. There were four new subjects, two of each sex, present experiments, these studies will serve 1 all naive to our tasks. Two subjects began training an even more important purpose in elu- with the consonants from the second half of the cidating the characteristics and development alphabet 'as the memory ensemble, and two subjects of automatic processing and in differentiating began with the consonants from the first half of automatic from controlled processing. the alphabet as the memory ensemble. There were five blocks of trials per session, each containing 60 test trials. There were no practice trials. Subjects II. The Development of Automatic Processing: were informed at the start of the experiment con- Perceptual Learning cerning the nature of the study and the composi- tion of the memory ensemble and the distractor set. A. Perceptual Learning and Unlearning in a Each trial began with the presentation of a memory CM Task Using Letters Only set (M = 4) selected randomly from the memory ensemble. Experiment 1 of the present series is simple During the first 1,500 trials of the experiment in conception. With the use of the same basic for each subject, the frame time was 200 msec; multiple-frame paradigm used in Part I/ during the following 600 trials, 120 msec. After these 2,100 trials the memory ensemble and the Experiment 1 (see Figure 1), performance is distractor set were switched for each subject and examined as a function of the amount of the frame time was set back to 200 msec. This training under consistent-mapping conditions. reversal 'Condition was then run at the frame time One change is made, however: Both the dis- of 200 msec for a total of 2,400 additional trials. The subjects were informed of the reversal at the time tractor set and the memory ensemble consist of the switch. of consonants. This procedure enables us to The subjects responded whenever a target was study the acquisition of automaticity when detected, or gave a negative response at the end of the two sets are not already-learned cate- the trial. The accuracy of the response and the reaction time for hits were both recorded. Sub- gories. Furthermore, it had been observed jects heard a tone signifying an error after each informally that the use of digits and con- incorrect response. sonants (in the CM conditions of Part I/ Experiment 1) led to a relatively rapid 2. Results and Discussion acquisition of automatic detection. It was felt that the use of letters only to make up The results are presented in Figure 3. Re- the two sets would slow down acquisition. sults for each block, averaged across sub- Values of M and F were chosen so as to jects, are graphed consecutively. Thus, the make controlled processing difficult (M = 4, graphed points in each interval are based on F—2), and a frame time (/) of 200 msec 120 observations. was chosen so as to make performance low In the initial group of 1,500 trials, the hit when controlled processing was being utilized. rate rose from just over 50% to about 90%, (These choices are justified by the data in while the false alarm rate dropped from Figure 2.) Thus it was expected that per- about 12% to about 3% (and the reaction formance would be quite poor at the start of training, when automatic detection had not been learned and controlled search had to be 1 Since the experiments are not reported in their used, but would improve markedly as auto- chronological order, we list here the experiments in their original order and indicate the subjects that matic detection developed. were used in each. Experiment 1 was run with new subjects. Experiment 4 was run next) using the I. Method subjects who took part in Experiment 3 of Part I. Experiment 2 was then run on three of the four The CM presentation procedure of Part I/Experi- subjects in Experiment 4. Experiment 3 was run ment 1 was utilized (see Figure 1). The frame size last, on new subjects. 132 RICHARD M. SHIFFRIN AND WALTER SCHNEIDER

100

EC/) ot 80 0:=

60

INITIAL LEARNING • REVERSED LEARNING- 40 HO: 200 MSEC FRAME TIME 2< 120 MSEC FRAME TIME _ LU-J 20 2 o *tt*^*fi*. 600 1200 1800 2400 X 850 o: £

I- 2 650 g 550 UJ 600 1200 1800 2400 CC TRIAL NUMBER Figure 3. Data from Experiment 1. Initial consistent-mapping learning and reversed consistent- mapping learning for target and distractor sets taken from the first and second halves of the alphabet. Memory-set size = 4J frame size = 2, frame times are shown. Percentage of hits, percentage of false alarms, and mean reaction time for hits are graphed as a function of trial number. After 2,100 trials the target and distractor sets were switched with each other.

time for hits dropped from about 770 msec alarm probabilities in the first 60 trials of to about 670 msec). the present study were .56 and .13, respec- It seems plausible that the subjects adopted tively. The estimated probability of detecting a controlled search strategy at the start of a single target and the observed probability training. Some support for this hypothesis is of giving a false alarm when no target was found in a comparison of the data from this present were .60 and .18, respectively, in the study with the data from Experiment 3 of VM condition of Experiment 3b of Part I in Part I (though different subjects were in- which M was equal to 4 and F was equal to 2 volved in the two studies). The hit and false (this condition and the first 60 trials of the PERCEPTUAL LEARNING AND AUTOMATIC ATTENDING 133 present study were both run at a frame time planted in long-term memory and if the at- of 200 msec). These two conditions should tention response occurs automatically, then give similar results if controlled search is it should prove very difficult for the subject being utilized in both studies; in fact, the to alter or unlearn his automatic response in probabilities are quite similar. any short period of time. To test this hy- It should be noted that the subjects all re- pothesis we reversed the memory ensemble ported extensive, attention-demanding re- and the distractor set for each subject (and hearsal of the memory set during about the set the frame time back to 200 msec). We first 600 trials of Experiment 1, but they hypothesized that automatic detection would gradually became unaware of rehearsal or prove impossible and that the subject would other attention-demanding controlled pro- be forced to revert to controlled search. cessing after this point. Both the objective The results of the reversal were quite evidence and subjective reports, then, sug- dramatic. The hit rate just after reversal gest that the subjects used controlled search fell to a level well below that seen at the at the start of Experiment 1 but gradually start of training when the subjects were com- shifted to automatic detection. pletely unpracticed. Very gradually there- The hit rate and false alarm rate after after the hit rate recovered, so that after 1,500 trials appeared to be changing only 2,400 trials of reversal training, performance slowly, but this could have resulted from a reached a level about equal to that seen after ceiling effect. The frame time was therefore 1,500 trials of original training. In summary, reduced to 120 msec for the next 600 trials. the original training resulted in quite strong At the conclusion of this additional training, negative transfer, rather than positive the hit rate appears to have reached about transfer. 82%, while the false alarm rate dropped to Subject's verbal reports indicated that a about 5%. These results may be compared shift back to controlled search occurred after with the CM results from Experiment 1 of reversal. Initially, before reversal, all sub- Part I (see Figure 2). In that study, with jects reported rehearsing the memory set / = 120 msec, M - 4, and F = 2, the CM hit during each trial. After the 2nd day (600 rate was about 92% and the false alarm rate trials) subjects reported that they were no (not shown) was about 5%. longer rehearsing and only glanced at the Thus, the present results indicate that memory set. However, subjects reported that subjects were utilizing automatic detection after reversal they tried various methods to by the end of the initial 2,100 trials of CM perform the task and eventually ended up training. First, they performed at a level not rehearsing the memory-set items again, though far from that of the Part I/Experiment 1 this rehearsal also decreased after a week of subjects in the comparable CM condition postreversal practice. This pattern of re- with the same frame time. Furthermore, the ports could indicate a shift from controlled present study utilized a larger memory en- search to automatic detection during original semble and distractor set than the VM con- learning, then a shift after reversal to con- ditions of Part I/Experiment 1, a fact that trolled search, and then finally a return to could only have increased the task difficulty. automatic detection. Second, from the Part I/Experiment 1 VM What might be the cause of the negative data we can estimate that a frame time of transfer after reversal? If the reversal caused 600-800 msec would have been required to subjects to revert to controlled search, then achieve a similar level of performance had it might have been expected that perform- controlled search been used. ance would fall to the level seen at the start At this stage of the experiment, after 2,100 of original learning (when controlled search trials of CM training, the subjects could be was presumably utilized). The actual results expected to have developed a well-learned suggest that controlled search is hindered attention response to the members of the when the distractors are items that subjects memory ensemble. If such learning is firmly have been previously trained to respond to as 134 RICHARD M. SHIFFRIN AND WALTER SCHNEIDER

CM targets (i.e., items that, due to previous always consisted solely of items from the training, give rise to automatic-attention distractor set in the CM conditions. Thus a responses). This hypothesis is verified in subject at the conclusion of the series of Experiment 4, to be reported later.2 Part I studies who had been searching, say, Beyond the poor performance at the start for consonants in digit distractors, had never of reversal learning, negative transfer is also been given a consonant as a distractor: evidenced by the extremely large amount of Whenever a consonant had appeared, it was training needed to overcome the effects of orig- a target. These facts naturally suggested a inal learning. After about 900 trials of original study in which the roles of consonants and practice the hit rate reached about 90%. digits were reversed for these subjects. However, it took about 2,100 trials to return to this performance level after reversal. Ap- 1. Method parently the necessity to unlearn the at- The procedure was identical to the multiple- tention responses to the previous memory-set frame, multiple-target paradigm of Experiment 3a items, or the necessity to overcome some of Part I (whose results were described in the learned inhibition to the previous distractors, Introduction) in most respects, except that the or both, causes a reduction in the rate of memory ensemble and the distractor set were switched in both the CM and VM conditions. Thus acquisition of automatic processing after in the CM condition the two subjects who had reversal. been searching for consonants in digit distractors It would be interesting to know whether now searched for digits in consonant distractors; negative transfer would obtain in several in the VM condition these two subjects had been variations of our paradigm. For example, in searching for digits in digits and now searched for consonants in consonants. These contingencies were the present study, the initial training might reversed for the remaining two subjects who had have caused the memory set to become a previously been searching for digits in consonant learned category, but probably did not cause distractors in their CM conditions. categorization of the distractor set. If the The frame time (/) was 60 msec for the CM conditions and 200 msec for the VM conditions. distractor set was a well-known category, Just as in Experiment 3 a of Part I, M was equal would reversal still cause a severe perform- to 2, and F was equal to 2; zero targets were pre- ance drop? Furthermore, it could be argued sented on one fourth of the trials, one target was that the initial training did not continue long presented on one fourth of the trials and two enough for the negative transfer to reach full targets were presented on one half of the trials. When two targets were presented, they were either strength. The amount of negative transfer identical (II) or nonidentical (NI), and the spac- might depend on the amount of overtraining ing between them was either 0, 1, 2, or 4. Each during initial learning. Such a view has been subject was given 12 blocks of 132 trials in each put forth for certain verbal and animal learn- of the VM and CM conditions. The VM and CM blocks alternated. ing situations (see Handler, 1962, 196S; Upon 'Completion of the initial set of 24 blocks, Jung, 1965 for two views on the subject). the frame time was increased to 120 msec and an Thus it might be asked whether performance additional 18 blocks of CM trials were run for each drops would be caused by reversal if the subject. original automatic detection responses were greatly overlearned. These two questions are 2 One might hypothesize that the basic compari- answered by Experiment 3. son rate of controlled search is altered and slowed after reversal, thereby accounting for the negative B. The Reversal of Consonants and Digits transfer. Two facts argue against this hypothesis. for Well-Practiced Subjects Experiment 2 shows that controlled search proceeds at an unchanged rate even when all items consist Throughout the series of experiments in of targets from previous CM conditions. Experiment Part I, the memory ensemble in the CM 4 shows that even one CM target presented in a condition was always the same, either digits to-be-ignored spatial location greatly reduces detec- tion of a simultaneously presented target in a to-be- or consonants, and no member of the CM attended location. This result suggests that reversal memory ensemble was ever a distractor, even performance is harmed because attention tends to in the VM conditions. The VM conditions be drawn to the distractors. PERCEPTUAL LEARNING AND AUTOMATIC ATTENDING 135

It should be noted that the subjects for this ex- tion of two targets is about 80% prior to periment had had roughly 20,000 trials of practice reversal and about 20% after reversal. Fur- in the various CM and VM conditions in earlier studies (see Footnote 1). thermore, these data represent 12 blocks of training over which very little, if any, re- covery was taking place. It is interesting to 2. Results and Discussion note that all of the subjects predicted that The results are depicted in Figure 4. A no change in their performance would occur simple correction for guesses and false alarms after the consonants and digits were reversed, has been carried out on the raw data, as and the subjects were uniformly startled and described in Part I, but this correction did even dismayed by the extreme difficulty of not affect the qualitative features of the re- the reversed task. sults. The circles in the left-hand panel give These results resolve the confounding be- the comparable VM prereversal data from tween categorization and the mapping condi- Experiment 3a of Part I (/= 200 msec), tions. Even when there is a well-learned and while the circles in the center panel give the well-practiced categorical difference between comparable CM prereversal data from Ex- memory ensemble and distractor set, reversal periment 3c of Part I (/ = ,60 msec); the in the CM paradigm gives rise to marked triangles in the two left-most panels give the impairment of performance. Thus, categorical data from the present experiment (see Foot- differences between memory ensemble and note 1). distractor set cannot be the key factor under- The most dramatic findings of Experiment lying the utilization of automatic detection. 2 are shown in the middle panel of Figure 4, Rather it is the consistency of mapping that which gives the CM reversal results. It is underlies the acquisition of automatic detec- evident that performance drops off markedly tion. after reversal. Estimated percentage of detec- Consider next the VM data given in the

REVERSAL CONDITIONS

— IUU u h ' ' ' ' ' D LU °o Q UJOT 80 - a A 01- , llJ " ^^°~ I-O A

zee D6U0 UJ< i v-: A a! ;J 40 . Nf^. o< ^A__^ *^A—Ti - £ 1- i i i i i i i i i i i i t i i i i i w 0'I24 0124 0124 SPACING SPACING SPACING 0 I 01 2 01 NUMBER OF TARGETS Figure 4. Data from Experiment 2. Probabilities of correctly detecting one target when one was presented, or two targets when two were presented, as a function of spacing, for each condition. The data shown have been adjusted to remove effects of guessing and false alarms. The observed percentage of "none" responses to no-target trials is also shown. Circles show the estimated percentage of detection prior to reversal (data from Experiment 3, Part I), while triangles indicate performance after reversal in Experiment 2. The squares in the right-hand panel show post- reversal detection from Experiment 2 after frame time was increased to 120 msec. The open symbols and dashed lines indicate the II conditions (identical targets); the closed circles and solid lines indicate the NI conditions (different targets). (VM = varied mapping; CM = consistent mapping; f = frame time, in msec.) 136 RICHARD M. SHIFFRIN AND WALTER SCHNEIDER left-hand panel of Figure 4. It is evident that search operates normally, with targets com- the switch from consonants to digits (or pared in a random order of search, and per- vice versa for other subjects) had almost formance is equivalent to that seen prior to no effect. The levels of performance remain the switch of character sets. the same as in Experiment 3a of Part I, and In short, the use in a search task of stimuli the qualitative pattern of results is still about that elicit automatic-attention responses (de- the same, showing the pattern indicative of veloped through previous CM training) can controlled search. That is, worst perform- (a) improve performance over that seen in ance occurred at spacing 1, and performance the usual VM conditions if these stimuli are was worst when the targets were different memory-set items but not distractors, in (NI). These effects indicate the presence of which case automatic detection will be uti- controlled search. In summary, a change of lized; (b) leave performance unchanged the character set used in the VM conditions from that seen in the usual VM conditions if had no appreciable effect under the present these stimuli are both memory-set items and conditions. distractor items, in which case normal con- These results are most interesting, be- trolled search will be utilized; and (c) im- cause after reversal the memory ensemble pair performance from that seen in the usual and the distractor set both consisted of VM conditions if these stimuli are distractors items that had come to elicit automatic- but not memory-set items, in which case attention responses in previous training. In subjects will utilize controlled search that the reversal conditions of Experiment 1 we will be reduced in effectiveness by automatic saw that making the distractor set but not the attending to distractors. memory ensemble consist of such "automatic" Let us now turn to the results of the final items resulted in considerable negative trans- reversed CM training blocks. We suspected fer. In the present experiment it is evident that the subjects in the reversed CM condi- that performance is not worse after reversal tion (the middle panel) were reverting back when subjects have been trained to re- to the use of controlled search, but perform- spond to both the targets and distractors in ance levels were so low that it was difficult previously CM training. to interpret the pattern of results. Thus These results might be expected on the frame time in the CM reversal condition was basis of the following reasoning. Just after increased to 120 msec, and 18 blocks of addi- reversal in both Experiment 1 and in the CM tional CM training were run. conditions of Experiment 2 the subjects were The results of the trials with / = 120 msec using controlled search. In Experiment 1, after are represented by the squares in the right- reversal, the automatic-attention responses hand panel of Figure 4. It should be noted to the distractor items tended to draw at- first that the pattern of the results is in- tention away from target items to which sub- dicative of controlled search in that the worst jects had not previously been trained to re- performance occurs at spacing 1. However, spond and hence to order the controlled search the results do not show the target-similarity in such a way that the distractor items were effect that we have been led to expect (in compared earlier in the search. Thus per- Experiment 3 of Part I) for controlled search: formance was impaired. (This hypothesis is Results for conditions using II target pairs confirmed in Experiment 4.) However, in the are not superior to those using NI target present experiment, after reversal, all items, pairs. Furthermore, the overall level of per- distractors and targets alike, gave rise to formance is higher than one would expect attention responses, which could be expected for the VM conditions. At a frame time of to cancel each other out on the average. 120 msec performance is higher than that That is, there should be no selective bias to seen for the VM conditions at a frame time compare distractors prior to targets since of 200 msec (in the left-hand panel of Fig- both types of items give rise to equivalent ure 4). attention responses. Thus, the controlled The results can be explained simply and PERCEPTUAL LEARNING AND AUTOMATIC ATTENDING 137

elegantly by the hypothesis that the sub- of memory ensemble and distractor set causes jects were carrying out a controlled search a great decrement in performance in the CM for the category as a whole. The Experi- conditions. The CM decrement occurs despite ment 2 reversal condition differed from the the categorical difference between the mem- Part I/Experiment 3 VM conditions in that ory-set items and distractor items (letters the memory-set items and the distractor vs. numbers). It is argued that the subjects, items in the present instance were cate- after reversal, adopt a controlled search gorically different (numbers and letters). strategy in which they search for the category This categorical difference obviously did not as a whole rather than search for the in- allow the subjects to utilize automatic detec- dividual members of the category. tion after reversal. However, the categorical The results of Experiments 1 and 2 could difference could very well have helped the hardly have provided a more dramatic demon- subjects to adopt a more efficient controlled stration of the qualitative difference between search. Suppose the subject ignored the automatic detection and controlled search, specific members of the memory set and and of the dependence of the search mecha- searched instead for any instance of the cate- nism on the nature of the training procedures gory "numbers" (or "letters" as the case (i.e., the consistency of the stimulus-response may be). Then only one comparison would mapping). be needed for each display item in each The results of Experiments 1 and 2 also frame, regardless of memory-set size. demonstrate the long-term nature of the Compared with the case when a categoriza- learning underlying the automatic process. tion is unavailable, a controlled search for Experiment 1 showed that even original the category of each input saves search time learning of an automatic response could take in two ways. First, there is no need for mul- thousands of trials of practice to develop if tiple memory-set comparisons for each frame a previously known categorization did not item. Second, there is no need for time- distinguish the memory ensemble and the dis- consuming switches between memory-set tractor set. Relearning after reversal was items. Thus performance improves consider- even slower to develop. Experiment 2 showed ably. Furthermore, the distinction between that relearning after reversal could be very identical and nonidentical multiple targets is retarded even when the sets were categorized, no longer meaningful—both targets are sim- as long as the original automatic detection ply category members, regardless of their process was well overlearned. physical similarity. Thus no difference be- tween the II and NI conditions would be predicted. The results in Figure 4 support C. The Role of Categorization in Controlled these contentions. Search and in the Development oj Automaticity As far as we can tell, through a block-by- block analysis of the reversal results, the sub- The role played by categorization in con- jects did not begin to recover any appreciable trolled search and in the development of degree of automatic detection even after 30 automatic detection is suggested by results blocks of CM reversal training. Undoubtedly, of the preceding experiments, but it is an the difficulty in relearning (or unlearning) important enough concept to be examined is related to the great amount of overtrain- in a separate experiment. Furthermore, the ing with the original mapping of stimuli to nature of the process by which categories responses. Eventually, however, were the are learned is also worth studying. These reversal experiment continued, we would ex- considerations underlay the design of Experi- pect automatic detection to develop once ment 3. We decided to use new subjects and again. to train them in two conditions, in neither In summary, then, the results of Experi- of which were the memory and distractor ment 2 show that a switch of character sets sets preexperimentally categorized. The first does not affect VM performance, but a switch condition was designed to induce controlled 138 RICHARD M. SHIFFRIN AND WALTER SCHNEIDER

100

S 80

60

40 M=2 M=4 o • Categorical 20 a • Mixed

f = 350

0-4 5-9 10-14 15-19 20-24 ^25 28 30-32 33-35 f VM Training CM Training SESSIONS

Figure S. Data from Experiment 3. Estimated percentage of detection of both targets, as a func- tion of session number, averaged over spacing and target-similarity conditions, for each of the four main conditions. At Session 25 training was switched to a CM procedure. At Session 30, the frame time was reduced to 160 msec. (VM = varied mapping; CM = consistent mapping; M = memory-set size; f = frame time, in msec). search in circumstances such that a cate- ditions, and two memory-set sizes were used in gorization could not develop. The second con- different blocks: M = 2 and Af = 4. In the mixed condition, the members of the dition was also designed to induce controlled memory set were chosen randomly on each trial from search, but in circumstances such that a the ensemble of eight consonants; the distractor set categorization of the memory sets could be consisted of four items randomly chosen from learned. Both of these conditions utilized a those consonants not used in the memory set. These varied-mapping procedure to prevent auto- four .distractors were chosen randomly to fill the nontarget, nonmask, positions in the various frames matic detection, and both were followed by of the trial. The key feature of the mixed condition a series of CM trials, during which trials was the fact that memory-set items and distractors automatic detection developed. Thus, the were randomly intermixed from trial to trial. CM trials traced the course of development In the categorical .condition, the eight ensemble items were divided into two sets of four that re- of automatic detection when a categoriza- mained disjoint throughout the experiment for each tion was, or was not, present at the start of subject. The sets of four are indicated by the posi- CM training. tion of the comma in the above listing of the con- sonants making up each set. Note that the two sets 1. Method of four were chosen to be highly confusable with each other in the sense that pairs of visually con- The subjects each took part in two conditions fusable letters were separated into the two sets. On called mixed and categorical, run in different blocks. a given trial one of these sets of four was chosen In each condition the memory ensemble and dis- randomly to be the distractor set, and the memory- tractor set were drawn from a total of eight con- set items were then chosen randomly from the sonants. The two eight-consonant sets did not over- remaining set. Thus, in the categorical condition lap. The sets were {GMFP, CNHD} and {RVJZ, there were just two disjoint categories of items, BWTX}. The assignment of these two sets to the one from which the memory set was drawn, and two conditions was permuted across subjects. one that served as the distractor set. However, the Both conditions used a VM procedure similar in category that served as the distractor set varied certain respects to that of Experiment 2 of this randomly from trial to trial. Thus, 'the procedure report and Experiment 3 of Part I. The multiple- still involved a varied mapping, but the categories target, multiple-frame paradigm was used with were never intermixed and could eventually become / — 250 msec per frame. Only 12 frames were used well learned. on each trial, with all targets appearing on frames To make it easier for the subjects to learn the 3 through 10. Frame size was equal to 2 in all con- categories, the members of the memory set pre- PERCEPTUAL LEARNING AND AUTOMATIC ATTENDING 139 sented at the start of each trial were always pre- 2. Results and Discussion of the VM sented in alphabetical order (this was also done in Conditions the mixed 'condition). In each session there were four blocks, always run in the following order: (a) categorical condition, M = 2; (b) categorical The results are summarized in Figures 5 condition, M = 4; (c) mixed condition, M = 2; (d) and 6. Figure 5 gives the estimated prob- mixed condition, M = 4. There were 96 trials in ability of detecting both targets when two each block, 24 with no target, 24 with one target, are present, averaged across target similarity, and 6 for each of the eight combinations of spacing spacing, subjects and certain combinations of and target similarity when two targets were presented. sessions. Performance in all four main con- Four new subjects, one male and three females, ditions tended to improve during VM train- took part. Each ran in 24 sessions of about 1 hour ing, but is clearly stable over the last 10 each. This finished the first, varied mapping, phase of the study. After these 24 sessions the subjects sessions (15-24). Initially the M = 2 con- should have been using controlled search in both dition is superior to the M — 4 condition in conditions, but should have learned the categories both the categorical and mixed cases. in the categorical condition. Before turning to the CM procedure used in subsequent sessions, we shall After training, however, the pattern changes discuss the VM results. considerably. In the categorical conditions

100

80 o> II NI o— & M = 2 2 60 Categorical, VM

c i i o *u- 0> 100 lu •a c a.

73 a > O a> o E -t > O r o O Mixed, VM

01012 4 i Spacing of two targets | Number of targets Figure 6. Data from Experiment 3. Estimated percentage of detection as a function of number, spacing, and similarity of targets for VM Sessions 19-23 for all conditions. (VM = varied mapping; M = memory-set size; II = identical targets; NI = nonidentical targets.) 140 RICHARD M. SHIFFRIN AND WALTER SCHNEIDER the results of the M — 2 and M = 4 condi- dition. This reasoning explains why the cate- tions converge (indicating category learning gorical conditions are both superior to the and the use of a categorical search strategy). best mixed condition, which uses a memory- In the mixed condition performance in the set size of 2. M = 4 condition remains much lower than in The category hypothesis, however, sug- the M = 2 condition, but performance in gests that target similarity should not matter, both of these is worse than in either cate- so that the II and NI functions should be gorical condition (suggesting that the pres- identical. To explain the spacing 0 results ence of well-known categories reduces the without abandoning the category model, we effective memory-set size to 1). suggest the following hypothesis. Suppose To ascertain whether controlled search was that when a subject locates a target category, being utilized in these various VM conditions he or she briefly switches to an item mode, and to determine the nature of that search, it perhaps to check that the input is truly a is necessary to consider the spacing functions category member. If the second item in the shown in Figure 6. This figure gives the frame is identical, then it will be found in an spacing functions for Sessions 19-23, when item-comparison mode, but if nonidentical, it performance had stabilized, averaged across will be located only if the subject reverts subjects. For both M = 2 and M = 4 con- quickly enough to a categorical-comparison ditions, the mixed condition shows the pat- mode. By the next frame, the reversion to a tern we have come to expect for controlled categorical mode is complete, so the various search, with performance worse at spacing 1 functions tend to converge. Why would sub- and worse for different targets (NI). jects tend to switch to an item mode? The Performance in the categorical condition categories were constructed so as to be ex- was equivalent when M = 2 and M — 4. tremely confusable with each other. Perhaps The usual performance impairment at spacing category is learned under these cir- 1 occurs for identical targets (II), but when cumstances but remains somewhat error the two targets differ (NI), performance in prone. Then it would be logical to check any the spacing 0 condition is greatly impaired, target category by using an item mode. a result not seen in previous VM conditions. Whatever the explanation for the details of Furthermore, the II and NI conditions show the spacing function, the data as a whole only a small difference except at spacing 0. make a strong case that categories have been The results from the mixed condition con- learned in the categorical condition during form to the pattern expected for controlled VM training, and that the presence of cate- search in three respects; When M = 4 per- gories in a VM situation allows the subject formance is worse than when M — 2; there to adopt a simpler and more efficient form of is a performance reduction at spacing 1; and controlled search. performance is worse for NI than II con- The argument that controlled search is ditions. operating in these conditions would be sub- The results from the categorical condition stantiated if it could be demonstrated that were unexpected in some respects but are performance improves when the subjects explicable in terms of an hypothesis that switched to CM training. We turn next to categories were learned and utilized in con- the CM procedure. trolled search. Let us assume that the pres- ence of a known category allows one to com- 3. Method pare the category of the memory set to the Following the 24th session of VM training, a CM category of any given display item in a single procedure was initiated. In the categorical condi- operation. Then the M = 2 conditions should tion, one of the categories was fixed thenceforth not differ from the M — 4 conditions, and in as the memory ensemble, and the other category fact both conditions should show performance was fixed thereafter as the distractor set. In the mixed condition, a set of 4 was chosen and was levels equivalent to those expected for a fixed thereafter as the memory ensemble—the re- memory-set size of 1 in a normal mixed con- maining 4 items were fixed thereafter as the dis- PERCEPTUAL LEARNING AND AUTOMATIC ATTENDING 141

tractor set. The two sets of 4 used in the mixed the first four sessions of CM training). In condition were those indicated by the position of other words, the learning of a category is the comma in the listing of the two sets of eight consonants: {GMFP, CNHD} and (RVJZ, BWTX}. apparently not a necessary prerequisite for In other words, these were the same sets that had the development of automatic detection. been used as categories in the categorical conditions Figure 7 shows the spacing functions for for other subjects. This CM training procedure was the six sessions (30-35) of CM training run identical for both conditions and followed in other respects the procedures used in the preceding VM at a 160-msec frame time. The results are conditions. Each subject was given at least 10 averaged over subjects and sessions. In ad- sessions of VM training. The frame time after CM dition the M — 2 and M = 4 conditions are Session 4 (Session 24 overall) was reduced to 160 lumped together since they did not differ. msec and after CM Session 11 (Session 35 overall) was .reduced again to 120 msec (at the time of this Somewhat to our surprise the pattern is al- writing this study had not been completed). most identical to that seen for the categorical condition during the latter stages of VM training, except that the performance level 4. Results and Discussion of the CM is much higher. The comparable pattern from Conditions the CM conditions of Part I/Experiment 3 The changes over sessions following the was quite different: Detection was about switch to CM training are depicted in Figure equal in all conditions except for a depressed 5. Consider first the results of Sessions 25 detection rate when spacing was 0 and when through 28, which immediately followed the the targets were identical. At spacing 0 the switch to CM training. Quite clearly, a very present data clearly show performance to be rapid and dramatic rise in performance took higher when the targets were identical. place, and furthermore, the results of the Disregarding the shape of the spacing mixed and categorical conditions tend to functions, the large improvement in perform- converge. By Session 28 the performance ance when CM training commenced does sug- level in the mixed condition had risen to over gest that the subjects were learning auto- 90% and in the categorical condition had matic detection. An hypothesis that reconciles risen to over 95%. Since the subjects were these facts suggests that the subjects were clearly approaching ceiling, the frame time using automatic detection to locate the first was then reduced to 160 msec and CM train- target, and were then reverting to controlled ing continued for six additional sessions. search to check the accuracy of the target Note that performance was still improving detection. Some time may be lost before the and that the mixed and categorical, and M subjects revert to automatic detection. Thus = 2 and M = 4, conditions effectively con- the advantage of II at spacing 0 would be verged. due to the temporary use of a controlled It is interesting to note that during VM search. This hypothesis is similar to that training, 20-25 sessions were necessary before used to explain the categorical results in the the M = 2 and M = 4 performance levels be- VM conditions (Figure 6). In both cases, a came equal in the categorical condition. Thus tendency to recheck the first detected item one might tentatively conclude that category may have led to an alteration in search encoding in the present experimental context strategy. The reason may be the same in both required 20-25 sessions to develop. Thus cases: Rechecking of located targets may category encoding would be unlikely to de- have been induced by the stimulus sets, velop for the mixed condition in just four which were chosen so that the memory set CM sessions. Yet after only four sessions of and distractor set were maximally visually CM training, the performance level in the confusable. Thus, both automatic detection mixed condition rose dramatically and ap- and category encoding may have been some- proached that of the categorical condition. This what error prone, thereby requiring recheck- result suggests that automatic detection de- ing in an item mode. We are currently ex- veloped in the mixed condition in the absence ploring this hypothesis. of a well-learned category (at least during In summary, a number of conclusions can 142 RICHARD M. SHIFFRIN AND WALTER SCHNEIDER

I/) 1 \J\J g «- ** c . ° ^x ^p=Z^^ - i- 80- d 0 - m •So .^.^^^.~-^" II N I Categorical ~ a— •* Mixed CM,f = 160MSEC Estimat e

detectio n i i i i i i -P * cr > O 0 1012 4 i Spacing of two targets t N umber of targets Figure 7. Data from Experiment 3. Estimated percentage of detection as a function of number, spacing, and similarity of targets for CM Sessions 30-35, averaged over memory-set size (which had no effect) for all conditions. (CM — consistent mapping; f = frame time; II = identical targets; NI = nonidentical targets.) be drawn from the results of Experiment 3. category. Verbal labels are one common class First, arbitrary collections of characters can of categories, but other types of categories be learned as categories. For this learning to might also exist. For example, a visual symbol occur, it is necessary that the categories re- might represent several objects. Of course, a main consistently denned across trials, but category in general need not exist in a form not necessary that the same category re- similar to any of the sensory modalities. A main the target set across trials. Second, the completely abstract node could represent two learning of a category in a VM search para- or more objects in memory. digm enables the subject to adopt a more With respect to search tasks, the import- efficient controlled search, one in which the ance of categories is clear. When all of the category of the memory set may be com- members of a memory set are members of pared in a single operation to the category the same category and no distractors are in of a display item, thus eliminating the effects that category, then a controlled search can of variations in memory-set size. Third, a bypass the individual memory-set items and switch to a CM training procedure results utilize comparisons that involve matching the in fairly rapid acquisition of automatic de- single category against the category of each tection, with a concomitant improvement in display item. For category search to be uti- performance for both the categorical and non- lized effectively, however, the category must categorical stimuli. These conclusions sup- be learned well enough that each displayed port those suggested by the results of Ex- element will be encoded automatically and periment 2 in all important respects. consistently according to its category. Of course other features, such as the element's D. What is a Category? A Discussion and name and shape, will also be encoded, but Selective Review only the category feature is necessary for a category search. We suggest that the category 1. Some Hypotheses Concerning Categories coding must be automatic, because if it were Experiments 2 and 3 showed how search not, a search of long-term memory would benefits from the learning of a categorical have to be carried out to identify the cate- distinction between memory ensemble and gory; the time taken for such a long-term distractor set. But what is a category? Most search would almost certainly wipe out any generally, we may define any object in mem- gains that are due to the reduction in num- ory that refers to (stands for, consists of) ber of comparisons allowed by the category any two or more objects in memory as a search (at least for small memory-set sizes). PERCEPTUAL LEARNING AND AUTOMATIC ATTENDING 143

Note that automatic category encoding is members to develop so quickly that advan- not the same as automatic detection. Auto- tages due to a category response were mini- matic detection refers to the case when a mized. stimulus gives rise to an automatic-attention It is conceptually important to keep in response that bypasses the need for a serial mind the possibility that automatic responses search through either memory or the display. might be attached to different stages of en- Automatic category encoding refers to the coding of a single stimulus. In particular, case when the stimulus gives rise auto- when an automatic-attention response de- matically to a node representing the category velops for a category, other attention re- of the stimulus. However, without an at- sponses may develop at a different rate for tention response attached to the stimulus the stimuli making up that category. Thus, node or the category node, the presence of if a categorization is available at the start of the category would have to be deduced CM training, the attention responses to the through a serial search of the displayed category might develop sooner than atten- elements. tion responses to some or all of the individual Our studies have shown that an automatic stimuli in the category. This effect would be category encoding of arbitrary collections of expected since any one stimulus would appear characters (consonants, in the case of Experi- only occasionally as a target, whereas every ment 3) can develop. The cause of the learn- target would be an instance of the category. ing is not yet clear. Some subjects in early On the other hand, in the cases in which the versions of Experiment 3 never showed any memory ensemble does not form a category evidence of categorical search (usually sub- at the start of CM training, there might be jects with low levels of performance). It may at least some individual stimuli that come be that these subjects did not notice the to elicit automatic-attention responses while categorical nature of the memory ensemble the category is still being learned. and hence could not develop a well-learned Once an automatic-attention or encoding category node in memory. Alternatively, response develops, we assume that it is no these subjects could have developed an ap- longer under control of the subject and will propriate category node but have failed to occur whenever its corresponding stimulus is use this node to facilitate their search. presented. This assumption will be supported There are several lines of evidence sug- by Experiment 4. However, when a cate- gesting that automatic detection may develop gorial encoding (but not an attention re- faster if a categorization is available at the sponse) is learned, it will not necessarily be start of consistent training. For example, it utilized by the subject during a controlled is easy to learn to search for a set of stimuli search. We suppose a category response will defined by a simple physical feature (see facilitate controlled search only if the subject Neisser, 1963, 1967, and studies in the next both notices the category and also decides to section). The CM results in Figure 5 show that alter his search to compare the category the categorical condition retains some superi- rather than the individual stimuli. These ority over the mixed condition for about eight considerations would no longer apply if an sessions (25-32) but the magnitude of the automatic-attention response were learned in difference is surprisingly small. It is possible response to the category; in such a case sub- that the high confusability between the two ject strategy would not matter since attention categories of Experiment 3 makes the cate- would be directed to the category in any gories and the automatic response difficult to event. learn. If so, more training prior to the CM phase could possibly have led to a larger dif- 2. Search Studies Using Categories ference in the rate of development for the two groups. Alternatively (or additionally) In a number of memory-search studies (in the small size of these categories may have these studies F = 1 and M varies), the mem- allowed automatic responses to the individual ory set consists of several categories, and 144 RICHARD M. SHIFFRIN AND WALTER SCHNEIDER the distractors might be drawn from these the use of some sort of controlled search that same categories or from different (unknown) is facilitated by the presence of categories in categories. The results of these studies all the memory set.3 demonstrate the use of some sort of con- The final studies to be considered are those trolled search, perhaps in conjunction with visual search studies in which M = 1 and F simultaneous automatic detection, that is varies, but in which the visually presented facilitated by the presence of categories. items fall in two or more categories, one of For example, Naus, Glucksberg, and Orn- which matches the category of the memory- stein (1972) and Naus (1974) used memory sets consisting of words from several cate- 3 We shall not attempt at this time to explain gories and a distractor set made up of words why a particular controlled search strategy is adopted from these same categories. The results in a particular paradigm. Procedural differences showed linear memory-set-size functions, among the studies would make such explanations parallel for negative and positive display pure guesswork. Another class of studies using categories also deserves mention for completeness, items. However, there was a slope reduction but is not discussed in the main text because of a when the number of categories increased. The difficulty in ascertaining whether the effects found authors suggested that categories were suc- in these studies were due to automatic or controlled cessively searched in random order, each in processing. These are visual search studies (M = 1, serial, exhaustive fashion, until the category F varied) that have used memory ensembles and distractor sets from different categories. They have of the displayed item was reached. When the shown either flat set-size functions or curvilinear set- search of this category was completed, a size functions with reduced slopes (e.g., Brand, response was initiated. Homa (1973) pre- 1971; Ingling, 1972; Jonides & Gleitman, 1972, sented a related paradigm; his results sug- 1976). These studies have utilized CM designs with low practice levels; this makes it difficult to ascertain gested that a serial, exhaustive search of whether the slope reductions are due to the presence categories was followed by a serial, exhaustive of automatic detection, to beneficial effects of cate- search within the category of the displayed gorization on controlled search, or to some combina- item. Williams (1971), who presented an- tion of these factors. other related paradigm, also argued that a Note that the slope reductions for visual set-size functions normally should be taken as evidence of serial, exhaustive search of categories was automatic detection. That is, in a controlled search followed by a serial search within the cate- one would expect a serial comparison process to gory. Okada and Burrows (1973) used mul- be needed to compare the memorized category tiple categories in the memory set and some- against the category of each displayed item. How- ever, it was seen in our VM 'data from Part I/ times cued the subject as to the category of Experiment 2 that a reduction of slope occurred the displayed item in advance of the trial. when M was equal to 1 and F varied, and we sug- Their results showed that the search could gested that a "controlled parallel search" might be be restricted to the cued category. Clifton responsible for this result. A similar argument might and Gutschera (1971) used memory sets con- be made to explain the slope reductions in the above visual search studies. We do not particularly favor sisting of two-digit numbers and showed that such an explanation because it does not explain on some trials the subjects would first com- why normal set-size functions are found in visual pare the 10s digits and then would compare search studies that do not use categories (e.g., the Is digit only if a 10s digit match was Atkinson, Holmbren, & Juola, 1969). A finding related to those from studies using found. Lively and Sanford (1972) used a categories in visual search is that of De Rosa and memory set from one category and a dis- Morin (1970) in a memory search study. They tractor set consisting of some members of showed that when the memory set consisted of that category and other members outside consecutive digits, then both the positive and nega- that category. Negative items outside the tive reaction times increased as a function of the numerical closeness of the displayed item to the category of the memory set showed a set-size boundaries of the consecutive memory set. The CM function with reduced slope. procedure was used with small amounts of practice, The various findings in these memory- so it is again difficult to ascertain whether such search studies differ in many details that will effects were due to the action of automatic detection or to some controlled search facilitation allowed by not be discussed here. They all tend to show the categorical character of the memory set. PERCEPTUAL LEARNING AND AUTOMATIC ATTENDING 145 set item. Smith (1962) and Green and Ander- tion deficits in these paradigms are due to son (1956) carried out similar studies in which the limitations of controlled processes, in a field of colored two-digit numbers was pre- particular, to the limited rate of short-term sented and the subject was asked to search search. for a particular two-digit number in a par- Thus, all our comments and conclusions ticular color. The results showed that search concerning search and detection apply equally rate depended much more on the number of well to attention studies. In particular, CM items in the designated color than on the training should lead to the development of total number of items presented, though both automatic detection that shoulld bypass effects were present. A VM procedure was divided-attention limitations, while VM utilized in their studies so that automatic training should cause controlled search that detection could not have been used to restrict should severely limit the ability to divide search to the items of a particular color. attention. Probably a two-phase controlled search was In discussing the development of automatic utilized in this situation. First, a rather fast detection we have proposed that an auto- serial search was probably carried out to matic-attention response is learned in re- locate the next item of the designated color; sponse to the unchanging members of the then a slow comparison of that two-digit memory ensemble. The present study will number was probably made to determine test this hypothesis. The test (Experiment whether it was the target. This explanation is 4d) will entail asking the subject to ignore supported by the fact that searching took a certain locations and then inserting in those long time (up to 20 sec), so that a 30-40 msec locations items that subjects had been pre- search rate for color could have accounted viously trained to respond to as CM targets for the increases in response time as total (to see whether these targets attract atten- size increased. These increases were small tion). only in comparison with the larger within- These studies will also answer the following category effects. question: To what degree can the subject The main conclusion to be drawn from the focus attention on a specified subset of the studies reported in this section, based on re- inputs without distraction from the remain- peated findings, is that the presence of cate- ing (irrelevant) inputs. Such studies are gories in search tasks can be used to facilitate, usually termed focused-attention studies, in benefit, and modify controlled search. contrast to the studies of Part I and Experi- ments 1 to 3 of the present article, studies III. Experiments on Focused Attention that would be appropriately termed divided- In part I we made the case that the attention studies, processes utilized in attention experiments and those utilized in search and detection experi- A. Terminology ments are often the same. In fact, in many There is a problem of terminology in the cases it is purely arbitrary whether a given studies of this section that is best solved by study is referred to as an "attention," the introduction of the following definitions: "search," or "detection" study. Experiment 1 in Part I could be described as a divided- 1. A foil refers to any input that appears in attention study in which attention had to a to-be-ignored display location, whereas a be divided among M memory-set items and distractor refers to a nontarget that appears F frame items during each frame. The results in to-be-attended display location. showed tremendous deficits in dividing at- 2. A CM foil is a foil that has previously tention in the VM conditions, and virtually been used in CM training as a memory-set no deficit in dividing attention in the CM item. conditions. The results of Experiment 2 of 3. A CM target foil is a CM foil that would Part I, and the fit of the quantitative model have been a target requiring a positive re- to both studies, showed that divided-atten- sponse if it had appeared in a to-be-attended 146 RICHARD M. SHIFFRIN AND WALTER SCHNEIDER

100 positions without decrement caused by the F-2 foils. F-4/diagonal F-4 1. Met hod eo The paradigm utilizes the multiple-target, multiple- frame VM procedure. There are three main con- 40 ditions: (a) M = 2, F = 2; (b) M = 2, F = 4; (c) M = 2, F = 4, diagonal. Condition (c), denoted 20 "diagonal," was designed so that one of the di- agonals of the display was always valid (to-be- attended), and the other diagonal was always in- M ~ 0 I I 0 I 2 4 | valid (to-be-ignored). In this condition four char- I SPACING OF 2 TARGETS | acters were presented on each frame, two valid, NUMBER OF TARGETS and two invalid foils. We expected that this diagonal condition would elicit performance similar to that Figure 8. Data from Experiment 4a. Estimated per- for the M = 2, F = 2 condition, and better than centage of detection as a function of number and that of the M = 2, F = 4 condition. spacing of targets in a VM procedure with frame time equal to 200 msec. The spacing 0 points are non- The four subjects in this study were the same identical targets, while the spacings of 1, 2, and 4 are as those used as in Experiment 3 of Part I. In the diagonal condition, only the upper-left and identical targets. (F = frame size.) lower-right frame positions ever contained a target. The subjects were fully instructed regarding this display location (although it appears in fact fact and were instructed to ignore the invalid in a to-be-ignored display location). It is a diagonal. For each subject, the blocks of trials member of the current memory set. utilizing the diagonal procedure were run after the other conditions were completed. The M = 2, F = 2 4. A VM foil is a foil that has previously condition had 14 blocks per subject; the M = 2, served as both a target and distractor in VM F = 4 condition had 11 blocks per subject; the conditions. diagonal condition had 16 blocks per subject, the 5. A VM target foil is a VM foil that would first 4 of which were practice blocks. There were 120 test trials per block, plus an additional 30 have been a target requiring a positive re- trials of practice for the first block of a session and sponse if it had appeared in a to-be-attended 15 trials of practice for each subsequent block in display location. It is a member of the cur- a session. rent memory set. The procedure used in this study differed some- 6. Valid positions or characters are to-be- what from that of Part I/Experiment 3 and from the other multiple-target tasks reported in this attended positions or characters. Invalid article. The primary difference lay in the relation positions or characters are to-be-ignored posi- of target similarity to spacing. The spacing 0 con- tions or characters. dition utilized NI targets only, and the spacing 1, 7. FII denotes a trial in which two identical 2, and 4 conditions utilized II targets only. In addition, the present experiment allowed targets to memory-set items appear, one in a to-be-at- reoccur in the same frame positions if they were not tended display position, and one in a to-be- in successive frames. Also, in the present study, the ignored display position. subjects pushed a single response button each time 8. FNI denotes a trial in which two dif- they thought they detected the target. The responses ferent memory-set items appear, one in a to- were to be made when the targets appeared rather than at the trial's end. be-attended display position, and one in a to-be-ignored position. 2. Results and Discussion

B. Focusing Attention in a VM Multiple- On about 1% of the trials in each condition Frame Task a response was made before the target ap- peared, and these responses were not counted. The first study in the present series of The results are presented in Figure 8. The experiments is designed to show that "con- results of the F = 2 and diagonal conditions trolled search" is not a misnomer, that sub- obviously do not differ, but results of both jects can control their search to the extent conditions are clearly superior to those of the that VM foils can be ignored, and that search F = 4 condition. These results were expected can be carried out through the valid display on the supposition that the subject should PERCEPTUAL LEARNING AND AUTOMATIC ATTENDING 147 have been able to control his processing in g IUU the diagonal condition so that comparisons H O would occur only for the characters on the UJ H valid diagonal. UJ 90 - The peculiar relationship of spacing to 0 *v Q 0 1 — 0 target similarity in this study caused the UJ 80 shape of the spacing functions to appear to cc N. / differ from those in previous multiple-target \- ^V o— o FII studies. In fact, however, the corresponding 70 \- •—•FNI points from the present study and Part I/ UJ Experiment 3 are quite close in value. It is o: UJ en i ii i interesting to note the subject's comments -I +1 +4 at the end of the experiment. They noticed NUMBER OF FRAMES FROM TARGET an excess number of II conditions overall, but TO FOIL none noticed that the II pairs were not tested Figure 9. Data from Experiment 4b. FII = target at spacing 0, nor that the NI pairs were not foil identical to target; FNI = target foil nonidentical tested at the longer spacings. to target. Percentage of target detection as a function The implications of this study are straight- of the spacing and the similarity between the target forward. Subjects can control their search in and the target foil. Varied-mapping procedures were used, and frame time was 200 msec. VM situations at least to the degree that comparisons can be limited to a specified The peculiarities of Experiment 4a were eliminated. diagonal. It is possible that characters on the The multiple-frame procedure was utilized with invalid diagonal are sometimes compared, but M = 2 and F = 4. The upper-left, lower-right not until the valid diagonal is searched first diagonal was valid, and the other diagonal was (if time were taken to compare foils during invalid. Either zero or one targets appeared on the valid diagonal, and an appropriate binary re- the search of the valid diagonal, then per- sponse was required. A VM task was utilized on formance would be worse). Of course, this the valid diagonal, and the foils on the invalid study demonstrates only a minimum amount diagonal were chosen from the distractor set for the of subject control. In future studies it would valid diagonal. In addition, there was on every be desirable to explore this matter more trial exactly one VM target foil (i.e., a member of the memory set) on the invalid diagonal. thoroughly. For example, we might ask: On one-third of the trials, only a VM target foil Can search alternate between diagonals in appeared, always on frames 8-13. On two-thirds of successive frames? Can search order be cued the trials both a target foil and target appeared; individually for each frame? on one-half of these trials the target and target foil were identical (FII), and on one-half of these trials the target and target foil were different (FNI). C. Distraction Caused by "Targets" Finally, the target appeared on any of frames 8-13, During VM Search and the target foil appeared equally often on frames —4, —1, +1, and +4 with respect to the target Experiment 4a showed no distracting effect frame. That is, the spacing between the target and target foil was systematically varied. Thus, the of VM foils (on the invalid diagonal). How- target-foil-only trials occurred with a probability ever, none of these foils was in fact identical of J, and each of the other eight conditions occurred to any member of the memory set. The with a probability of 8 X i = A. All these trials were present study, then, is designed to determine randomly intermixed within each block. the distracting effect of VM target foils: To Each block contained 144 test trials preceded by 15 practice trials. Each subject ran through a total what extent can members of the memory set of 12 or 13 blocks in four sessions. be ignored when they appear in invalid dis- play locations? 2. Results and Discussion 1. Method The results are shown in Figure 9. The The paradigm is similar in general outline to figure gives the percentage of hits as a func- that of the VM conditions of Part I/Experiment 3. tion of the spacing between the target and 148 RICHARD M. SHIFFRIN AND WALTER SCHNEIDER the target foil, when the target foil is iden- inadvertently checked in addition. In this tical (FIX) and nonidentical (FNI) to the event, target foils would occasionally be target. The correct rejection rate when no noticed and might harm subsequent detection target was present was 91%. The results may in a fashion similar to that caused by true be summarized briefly: The target foil had target detection. Of course, this hypothesis no effect on the hit rate except when it was is speculative and other explanations are different from the target (FNI) and preceded undoubtedly available. the target by 1 frame, in which case detec- Whatever the explanations for the details tion probability was decreased by .11. of the results, it is safe to summarize the In the present study a target foil reduced results of Experiments 4a and 4b as follows: performance when it preceded, but not when Subjects are able to control their search in it followed, the target, Suppose that this re- varied-mapping paradigms. The degree of duction is due to the same factors that pro- their control is sufficient to eliminate any duce decrements in multiple-target detection distracting effect of nontarget foils and to in VM conditions (Experiment 3, Part I). reduce the distracting effect of target foils Then one target probably inhibits detection well below the level caused by other targets. of a subsequent target, but not of an anteced- Thus, attention focusing is quite successful: ent target. That is, the reduced detection at VM foils have, at most, a small distracting spacing 1 seen in the VM conditions of Part effect when they appear in to-be-ignored I/Experiment 3 and also seen in Figure 4 locations during controlled search. and 6 of the present paper, could have been caused by either of the two targets' reducing D. The Distracting Effect of "Targets" detection of the other. The present results During CM Search provide some reason to argue that it is the first of two successive targets that reduces Experiments 4a and 4b examined the detection of the second. ability to focus attention during controlled Let us assume that the decrement in per- search. We next ask: To what degree does centage of detection of two targets in VM attention focusing affect automatic detec- paradigms is in fact a decrement in detection tion? Our previous studies have demonstrated of the second target. Then it is possible to that automatic detection is not affected by compare the magnitude of the decrements in frame size, so there would be no point in the multiple-target studies (e.g., Experi- carrying out a CM counterpart of Experiment ment 3, Part I) and the present study. In 4a in which the invalid diagonal would con- fact, the decrements are much larger in the tain distractors only. We decided to carry out earlier studies in which all display locations a counterpart to Experiment 4b to find out were valid. For example, the difference in whether a CM target foil would interfere detection between spacing 1 and spacing 4 with automatic target detection on the valid in Part I/Experiment 3 was 30% in the NI diagonal. condition and 8% in the II condition; the M and F were set equal to 2. During each comparable decrements in the present study frame, each diagonal contained one mask and are \\% and 0%. Thus a target foil reduces one character. One diagonal was always detection of a target in the next frame, but valid, the other invalid. On every trial a CM the reduction is much smaller than that target foil appeared on the invalid diagonal caused by an actual target. in exactly one frame. A CM target appeared How can the present findings be reconciled on the valid diagonal on two-thirds of the with those of Experiment 4a in which foils trials; on one-half of these trials the target did not impair performance? The pattern of and target foil were identical (FII) and on findings would be explicable if the valid one-half, nonidentical (FNI). In each of diagonal were always searched first and then, these cases, the target foil occurred either whenever that search finished early, one or four frames before, in the same frame with, more characters on the invalid diagonal were or four frames after the target (-J probability PERCEPTUAL LEARNING AND AUTOMATIC ATTENDING 149

each). There were 162 trials per block and 6 Table 2 blocks per each 1-hour session. Two sessions Effect of Distraction on A utomatic were run with frame time equal to 60 msec Detection: Experiment 4c and two sessions were then run with frame Spacing time equal to 30 msec. (target % correct The results are shown in Table 2. The Variable to foil) % hits rejections only significant effect was a slight drop in Session 1 60-msec frame time 90.0 performance (about 4%) in the FII condition FII -4 91.9 at / = 60 msec with simultaneous target and FNI -4 94.2 FII 0 89.4 foil (0 = 4.45, p<.QOQl); performance FNI 0 95.6 dropped considerably when / dropped to FII +4 96.1 95. 1 30 msec, but all conditions became roughly FNI +4 Session 2 equal. 60-msec frame time 91.2 The results of this study demonstrate that FII -4 95.4 FNI -4 93.5 a target foil hinders detection of an identical FII 0 90.7 FNI 0 93.8 simultaneous target. The magnitude of the FII +4 92.8 effect is small, but so is the magnitude of FNI +4 95.6 the decrement that occurs when two simul- Session 3 taneous CM targets are presented. In fact 30-msec frame time 75.2 FII -4 75.7 these decrements are not much different at FNI -4 79.1 FII 0 74.5 the equivalent frame times (4% vs. &%). FNI 0 77.6 It seems reasonable to assume that the same FII +4 77.1 mechanism is causing both effects.4 FNI +4 80.1 Session 4 It is interesting to compare these findings 30-msec frame time 71.5 to those of Eriksen and Eriksen (1974). In FII -4 77.3 FNI -4 78.9 our terminology, they used a single-frame FII 0 80.3 CM task with M — 2, F = 1, and they used FNI 0 81.0 FII +4 78.7 reaction time as the dependent measure. FNI +4 78.2 However, the distractor set and the memory Note. FII = trial in which two identical memory-set items ensemble were both of size 2 and were con- appear, one in a to-be-attended display position, and one in a to-be-ignored display position. FNI •= trial in which two dif- sistently mapped across trials, so that an ferent memory-set items appear, one in a to-be-attended posi- equally strong but opposite tendency to auto- tion, and one in a to-be-ignored position. matically respond to the members of the two slowing (since they automatically produced a sets was probably learned. (In our CM competing response); flanking characters studies, only the memory-set items can come that were not distractors or memory-set to elicit automatic responses, because dis- items produced a moderate slowing (regard- tractors are presented on every trial. The only exception in our studies occurred when 4 F = 1 in the single-frame paradigm of Part I, The results in Table 2 give some indications that the disrupting effect may change with practice. and even then the distractor set was always Over the four sessions, the differences between FII larger than the memory set.) In the Eriksen and FNI at spacing 0 was, respectively, 6.2%, 3.1%, and Eriksen study, the relevant item always 3.1%, and .7%. If this trend actually exists, it may appeared directly above the fixation point, be caused by the development of a new automatic but three invalid items were placed on each response that restricts search to the valid diagonal. That is, position-specific information might govern side of the relevant item. The nature and the automatic-attention response; such an auto- distance of these invalid items were varied. matic process could be learned because the valid When the distance to the nearest item reached diagonal never changes over trials. It would be 1 ° of visual angle, the invalid characters had interesting in future investigations to compare per- little effect on reaction time. At closer dis- formance in this condition with performance in a condition that alternates the valid diagonal from tances, all reaction times were slowed, but trial to trial, since alternation should prevent the flanking distractors produced the greatest long-term learning of position-specific encoding. 150 RICHARD M. SHIFFRIN AND WALTER SCHNEIDER

test of the hypothesis that an automatic- 90 attention response to consistently trained NO FOIL- o targets is learned. LU r~ 80 LU 1. Method Q Each frame contained 4 characters. The upper- 70 left to lower-right diagonal was valid. Memory-set cr size was 2, and the VM conditions were utilized. Search on the valid diagonal was for consonants in consonants (or digits in digits for other sub- 60 jects). The foils on the invalid diagonal were LU chosen from the distractor set used for the valid o diagonal, except for the CM foil. When a CM foil OL appeared it was chosen from the set that had served LU 0. 50 as the CM memory ensemble in all previous studies -I 0 +1 for that subject. Thus, if consonants were being NUMBER OF FRAMES FROM used on the valid diagonal, a CM foil would be a digit, and vice versa. TARGET TO FOIL On one-sixth of the trials, neither a target nor a Figure 10. Data from Experiment 4d. Percentage of CM foil appeared; on one-sixth of the trials only target detection in a varied-mapping procedure as a a target appeared; on one-sixth of the trials only function of the spacing between the target and the a CM foil and no target appeared; on one-half of target foil, when subjects had previously been trained the trials both a target and a CM foil appeared, to respond to the foil as a consistent-mapping target. with the spacing between them equally likely to Frame time was 200 msec. be —1, 0, +1. Each block of trials contained 144 test trials. The first block of each session had IS practice trials and the other three blocks had 5 less of their visual similarity to either set), practice trials. Two sessions were run for each of and flanking targets, whether identical or the four subjects (the same subjects as those used nonidentical, produced the least slowing in Experiments 4a-4c.) The frame time was 200 (since they automatically produced a com- msec. patible response). Both the Eriksen and Eriksen (1974) 2. Results and Discussion study and our study suggest that CM target foils can neither be excluded from processing The results are shown in Figure 10. Each nor prevented from affecting performance. In percentage is based on 7,68 observations. addition, however, the Eriksen and Eriksen When no CM foil or target appeared the result suggests that the spatial configuration false alarm rate was 4%. When only a CM of the inputs can determine the magnitude foil appeared the fajse alarm rate was again of these effects (see Footnote 4 for a discus- 4%. Thus, in the absence of an actual target sion of location-specific automatic detection). the presence of a CM foil caused no increase in false alarms. Since the appearance of CM E. The Distraction of Controlled Search foils was highly correlated with the ap- by Automatic Detection pearance of targets, one might have supposed that a bias or guessing strategy would arise, Controlled search depends on an apparently such that subjects would more often guess serial process that should easily be disturbed that targets were present if a CM foil was if an automatic-attention response occurs seen. The present data argue against such a that draws attention to an invalid location. bias (as do the data discussed below). Such a rationale suggests a paradigm for The hit rate when a target but no CM foil our next study in which CM foils appear on was present was 84%. The distraction caused the invalid diagonal while a controlled search by a CM foil may be determined through occurs on the valid diagonal. comparison with this performance level. This study is probably the most important When the CM foil preceded the target by one in this series (4a-4d) because it provides a frame the hit rate was &2

These findings are particularly important messages barely harm shadowing performance because previous demonstrations of difficul- when they are distinguishable by an obvious ties in focusing attention have usually utilized physical characteristic (such as spatial origin; incompatible responses. That is, a stimulus see Cherry, 1953; Cherry & Taylor, 1954). in response to which a subject has been trained These studies are probably like our Experi- to emit response x is presented in a to-be- ment 4a in which one diagonal is relevant ignored location in conjunction with a stimulus during VM search and the other can be requiring a response y that is incompatible ignored. A result related to Experiment 4b with x. In the Stroop Color-Word Inter- can be found in studies by Treisman (1964a, ference Test (1935), for example, the sub- 1964c) in which items of some possible ject must read names of colors that are relevance are presented in the nonshadowed printed in ink colors that do not correspond ear (similar to our target foils). For example, to the names. Reading speed is greatly slowed if the distracting message is in the same in this case (see Jensen & Rohwer, 1966). voice, but in French, a considerable distrac- Keele (1972) showed that decrements in tion occurs. Many studies have shown that processing stimuli resulted when a similar the message in the nonshadowed ear is an- color-name incompatibility was present. The alyzed to some considerable depth (e.g., Eriksen and Eriksen (1974) study discussed Lewis, 1970; Treisman, 1960, 1964b). In above also showed that response time was these studies it is not clear which type of dependent on the compatibility of responses processing causes the analysis, but in other between stimuli to be attended to and those studies it is clear that automatic processing to be ignored. is responsible for the analysis that occurs Many other examples of this kind can be (Corteen & Wood, 1972; Von Wright, Ander- found in the literature, but our demonstra- son, & Stenman, 1975). tions differ in one very important respect. Further reviews of such studies will not Namely, the distracting stimulus in our stud- be undertaken here because of the difficulty ies hurts performance even though subjects in ascertaining whether focused-attention have been trained to respond to it in exactly deficits are caused by controlled processing of the same way as to the target stimulus. Both invalid items or by automatic processing of types of studies demonstrate that the to-be- invalid items.5 If nothing else, our results ignored stimulus receives processing, but the strongly suggest that future research on studies utilizing incompatible responses do focused attention should include controls to not demonstrate any processing interaction differentiate deficits caused by controlled and short of the motor responses themselves. On automatic processing. the other hand, our studies suggest that the In summary, we have seen that subjects focused-attention deficit results from a di- can divide attention almost without deficit version of attention and an accompanying when automatic detection is utilized (Part I/ loss of processing time, since there is no re- sponse incompatibility. 6 Our studies have shown that focused-at- Controlled processing might be given to oc- casional inputs in an irrelevant ear, because ear of tention deficits can arise due to distraction origin is a fairly difficult cue to utilize (compared by either CM or VM items in appropriate with spatial location in a visual task, for example). contexts, though probably due to rather dif- Whether automatic processing or controlled proc- ferent mechanisms in the two cases. In many essing is used to restrict analysis to the desired ear, studies in the literature that have shown occasional failures of selection are to be expected, failures that will cause occasional items in the to-be- focused-attention deficits, it is not clear which ignored ear to be given controlled processing. In factor is responsible. For example, in shadow- addition, 'Controlled processing of items in the to-be- ing tasks the listener repeats aloud a desig- ignored ear may occur on occasion when processing nated message (usually in one ear) while of the items in the valid ear is completed and a brief interval occurs before the next valid input other distracting messages are also presented. arrives (an argument similar to this is used in the The early studies showed that distracting discussion of Experiment 4b). PERCEPTUAL LEARNING AND AUTOMATIC ATTENDING 153

Experiments 1 and 2, CM conditions, see matic-attention response to the features that Figure 2) and cannot divide attention with- are encoded from an input target; this re- out deficit when controlled search is utilized sponse directs attention to the representation (Part I/Experiments 1 and 2, VM conditions, in short-term memory of the relevant visual see Figure 2). Focused-attention deficits are input and also to the representation in mem- quite substantial when caused by automatic ory of the appropriate member of the memory responses to to-be-ignored items (Experi- set. Second, in addition, an automatic "tar- ment 4d and Figure 10) but are quite a bit get" response is learned that tells the subject smaller when caused by controlled processing that a target is among the inputs. Third, in of to-be-ignored items, since the subjects' addition, in some situations an automatic control is usually adequate to prevent such overt motor response (such as a button press) processing (Experiments 4a and 4b, and is learned in response to a target. Figures 8 and 9). One implication is the pre- The third factor, an automatically pro- diction that the type of training procedures, duced overt motor response, is probably VM or CM, will determine whether divided- limited to special situations in which the or focused-attention deficits will occur. same completely consistent response to all targets is immediately required. Our reaction G. The Nature of Automatic Detection time study (Part I, Experiment 2) may rep- The reversal studies, Experiments 1 and 2, resent such a situation, but our multiple- and the category study, Experiment 3, make frame tasks do not. In the multiple-frame it clear that long-term learning is responsible tasks, no overt response is required until the for the phenomenon of automatic detection trial's end. In addition, some of the tasks in that is seen in the CM conditions. The lengthy our studies require that the number of training period required for acquisition, and targets be counted, rather than that a re- the even longer training periods required sponse be made to each target as it appears. to alter the detection process once learned, Finally, the tasks in Experiment 4 occa- testify to the permanent nature of the learn- sionally present target foils in to-be-ignored ing. Furthermore, the phenomenon is power- locations, and the subject has little difficulty ful enough to transcend the laboratory suppressing responses to such targets. Thus, context. Subjects given CM training on we feel that automatic overt responses can one group of letters as memory-set items with be learned but are probably not a major another group as distractors, as in Experi- contributor to performance in most of our ments 1 and 3, report effects on reading out- tasks. side the laboratory. Despite the fact that all The first two factors, the occurrence of the letters used in the experiments were automatic "attention" and "target" responses, capitalized, the subjects reported that the undoubtedly play a major role in our tasks. memory-set items from the CM conditions However, such responses must be followed by tended to "jump out" from the page during some sort of controlled process or controlled normal reading. This effect was distracting decision to generate the required overt re- enough that one subject would not attempt to sponse. A number of controlled processes read for an hour or more after an experi- may be used. In most of our studies it would mental session. not have been sufficient to simply use the It is clear then that automatic detection occurrence of an automatic "attention" or reflects a powerful long-term process. We now "target" response as a basis for a decision ask: What is it that is learned? Experiments to respond. For one thing, some of our studies 4a-4d imply that an attention response is require responses to be counted; for another, learned, but many details of the automatic some of our studies present target foils to detection process remain to be specified. which responses must be withheld. Further- It is our feeling that several factors con- more, in Experiment 3 we saw that the mem- tribute to the process that has been termed ory and distractor sets could be so confus- automatic detection. First there is an auto- able that automatic responses could not be 154 RICHARD M. SHIFFRIN AND WALTER SCHNEIDER learned in a perfectly accurate manner, so harmful on another (when the producing that a switch to a controlled item mode was stimulus is a distractor). A subject cannot necessary to check the accuracy of automatic learn both an attending and a nonattending responses. response to the same stimulus. While this For all of these reasons, we propose that argument seems clear, an important theo- following the occurrence of an automatic- retical question remains. What is the under- attention or target response, the subject en- lying mechanism that inhibits the learning gages in the minimal controlled processing of an automatic-attention response in the necessary to satisfy the task requirements. VM search situation? Such controlled processes might consist of Suppose that a Learning event takes place comparing the memory representations to each time a target is found correctly and is which attention has been drawn, counting the therefore reinforced. In consistent (CM) target instances, and checking the spatial paradigms there will be no impediment to location of located targets to see whether the learning of attention responses. In VM they are foils or valid targets. paradigms, the outcome is less clear. It might Before concluding this discussion, it is be supposed that every item, whether cur- useful to consider briefly a few supplementary rently a memory-set item or a distractor, hypotheses regarding the automatic detec- comes to elicit attention responses due to tion process. Each of these hypotheses as- those trials on which it is a target. There are sumes the occurrence of an automatic-atten- then several possibilities: (a) Learning con- tion response. tinues until all items have acquired attention First, consider the possibility that atten- responses of roughly equal strength. Because tion is drawn automatically to the representa- the responses are of equal strength, they tend tion of the relevant visual input, which is to cancel each other, and controlled search then compared serially to the memory set. must be utilized. Later, if a switch is made This hypothesis is easily rejected since it to a CM paradigm, the strength of the re- implies, contrary to fact, that memory-set sponses to the fixed memory-set items be- size will have a large effect under CM con- comes greater than that for the distractor ditions. items, (b) As an attention response begins Second, consider the possibility that an at- to develop it starts to occur on trials when tention response directs the subject to the the input is a distractor. Because attention relevant visual input, whose category is then is directed to the wrong input, performance compared in one step to the category of the suffers, and the response is therefore in- memory set. This hypothesis is testable in hibited, (c) Each time a distractor is com- any of several ways but is difficult to reject pared during a controlled search, whether on the basis of present evidence. A tentative or not an attention response occurs, inhibition inference from Experiment 3 suggested that of any previously reinforced attention re- automatic detection for items in a four-letter sponse may take place because a comparison set developed much faster than category is carried out but not reinforced. According to learning for that set. If so, then this category explanations (b) and (c), the inhibition hypothesis could be ruled out (because the cancels any learning that would otherwise M = 2 and M = 4 CM functions converged occur, so that attention responses to any of quickly in the mixed condition of Experiment the items never develop to significant degrees. 3). Nevertheless, additional research will be We prefer hypotheses (b) or (c), or any needed to test this hypothesis conclusively. similar proposal that implies that attention Assuming that an automatic-attention re- responses do not develop in VM paradigms. sponse underlies automatic detection, it seems Is there any evidence to distinguish these clear why automatic detection does not de- views from hypothesis (a)? Only indirect velop in VM situations: An attention or overt evidence is available at present, but the response that is helpful on one trial (when models are easy to test. For example, after the producing stimulus is a target) will be VM training one could introduce new items PERCEPTUAL LEARNING AND AUTOMATIC ATTENDING 155

as targets, with other new items as distrac- complex interrelations among nodes. An in- tors, or with the previously trained VM dividual node may consist of a complex set items as distractors. The old VM distractors of information elements, including associa- should severely hinder performance if they tive connections, programs for responses or elicit attention responses developed during actions, and directions for other types of previous training. Other similar tests could information processing. What then sets off also be carried out, but a definitive answer one node from a group of nodes? One node is not yet available at the time of this writing. is distinguishable from a group of nodes be- cause it is unitized, that is, when any of its IV. A Framework for Information Processing elements are activated (i.e., placed in STS), all of them are activated. One activated node This section of the paper will be organized may of course activate another node, but it as follows. Section A will present an overview does not do so in all situations, only when the of the system and an overview of the role of context or the state of the information- controlled and automatic processing. Section processing system is appropriate. B will elaborate on the fundamental nature of controlled and automatic processing. Sec- tion C will present a framework for search, 2. Structural Levels detection, and attention. Section D will dis- cuss how automatic and controlled processing It seems likely that the structure of long- are utilized in memory and retrieval. term store, at least in part, is arranged in levels (perhaps sometimes in a hierarchical A. An Outline of a General Theory tree structure). By levels we refer to a tem- poral directionality of processing such that Memory is conceived to be a large and certain nodes activate other nodes but not permanent collection of nodes, which become vice versa. In sensory processing, there is a complexly and increasingly interassociated tendency for increasing information reduction and interrelated through learning. Most of as successive levels are activated. Thus, a these nodes are normally passive and inactive visually presented word could first be pro- and termed long-term store, or LTS, when in cessed as a pattern of contrast regions, colors, the inactive state. The set of currently ac- regular variations, and so forth; then lines, tivated nodes is termed short-term store, or angles, and other similar features could be STS. LTS is thus a permanent, passive repos- activated; then letters and letter names and itory for information. STS is a temporary verbal or articulatory codes; then the word's state; information in STS is said to be lost verbal code; and finally, the meaning and or forgotten when it reverts from an active semantic correlates of the word. This se- to an inactive phase. Control of the informa- quence is meant as an example, and we do tion-processing system is carried out through not wish to imply that these are the relevant a manipulation of the flow of information features, that this is the only possible order- into and out of STS. These control processes ing, or that this listing is exhaustive. Such include decisions of all sorts, rehearsal, cod- a sequence of feature encoding should occur ing, and search of short- and long-term store. automatically to a normally skilled reader. LTS contains learned sequences of informa- tion processing which may be initiated by a control process or by environmental or in- 3. Automatic Processes ternal information input, but are then ex- An automatic process can be defined within ecuted automatically with few demands on this system as a sequence of nodes that nearly the capacity of STS. always becomes active in response to a par- 1. Long-Term Structure ticular input configuration, where the inputs may be externally or internally generated and The structure of the nodes making up LTS include the general situational context, and will be treated as a very general graph with where the sequence is activated without the 156 RICHARD M. SHIFFRIN AND WALTER SCHNEIDER necessity of active control or attention by or intensities even these features will not be the subject. activated. An automatic sequence differs from a single Although we have described automatic node because it is not necessarily unitized. processes as largely beyond subject control, The same nodes may appear in different some indirect control is possible through ma- automatic sequences, depending on the con- nipulation of the activation threshold for auto- text. For example, a red light might elicit matic processes. In particular, according to a braking response when the perceiver is in what we shall call the contextual hypothesis, a car, and elicit a walking, halting, or traffic- the activation threshold can be lowered by scanning response when the perceiver is a the inclusion of information in STS (at the pedestrian. time of presentation of the activating input) Since an automatic process utilizes a rela- that is associatively related to the nodes tively permanent set of associative connec- making up the automatic sequence.6 tions in long-term store, most automatic Note that a lowering of the threshold does processes will require an appreciable amount not imply that the quality of processing is of training to develop fully. Furthermore, improved. One result of threshold lowering once learned, an automatic process will be is that the automatic process will be triggered difficult to suppress or to alter. by inputs that normally would and should When an automatic sequence is initiated, not do so. For example, the feature "horse" its nodes are activated and hence the as- might be incorrectly triggered by the input sociative information enters STS. This fact "house" if the threshold for "horse" is does not, however, mean that the various lowered sufficiently. elements of the automatic process must be available to consciousness or recallable at a S. Controlled Processes later time. The activation in STS could be extremely brief (in msec, say) and unless A controlled process utilizes a temporary attention is directed to the process when it sequence of nodes activated under control of, occurs or unless the sequence includes an and through attention by, the subject; the automatic attention-calling response, then sequence is temporary in the sense that each the information may be immediately lost activation of the sequence of nodes requires from STS, and the subject may be quite un- anew the attention of the subject. Because aware that the process took place. Even active attention by the subject is required, when an automatic-attention response is part only one such sequence at a time may be con- of the sequence, it will not necessarily affect trolled without interference, unless two se- ongoing controlled processing unless the quences each requires such a slow sequence of strength of the attention response is suffi- activations that they can be interwoven ciently high. serially. Controlled processes are therefore tightly capacity-limited, but the cost of ca- 4. Thresholds of Activation pacity limitations is balanced by the benefits However an automatic process is initiated, deriving from the ease with which such whether by internal or external input or by a processes may be set up, altered, and applied control process, it is presumed that the prob- in novel situations for which automatic se- ability that the process will run to comple- tion depends on the strength of the initiating 6 It may also be possible to lower the threshold stimulation. The simplest and most common by a .recent activation of the same automatic se- examples are seen in studies of psychophysical quence. However, this hypothesis is difficult to thresholds. If a letter, for example, is visually distinguish from the contextual one, because an presented at a low-enough intensity or a activation of a sequence is usually accomplished by the prior, .recent presentation of information as- short-enough duration, then it may be en- sociatively related to the sequence, and this related coded not as a letter but as a partial collec- information is still likely to be in STS at the time tion of line-like features. At lower durations of test. PERCEPTUAL LEARNING AND AUTOMATIC ATTENDING 157 quences have never been learned. Controlled STS has two somewhat distinct roles. The processing operations utilize short-term store, first is the provision of a temporary store- so the nature of their limitations is deter- house for information currently important to mined at least in part by the capacity limita- the organism. That is, it acts as a selective tions of STS. window on LTS to reduce the amount of information for processing to manageable 6. Short-Term Store (STS) proportions. The second role of STS is the provision of a work space for decision making, Short-term store is the labile form of the thinking, and control processes in general. memory system and consists of the set of concurrently activated nodes in memory. 7. Learning: Transfer to LTS The phenomenological feeling of conscious- ness may lie in a subset of STS, particularly Consider first what is meant by transfer in the subset that is attended to and given from STS to LTS. Transfer implies the for- controlled processing.7 mation in permanent memory of information The capacity of STS is determined sto- not previously there. To be precise, this con- chastically (see Shiffrin & Cook, Note 3) sists of the association (in a new relation- so that a large amount of information may ship) of information structures already in be present (activated) at any one moment, LTS. A minimal requirement for this new but only a small amount of information will associative structure is the simultaneous persist for an appreciable time period of presence in STS of the separate elements to several seconds or more. or loss be associated or related. That is, the various from STS is simply the reversion of cur- nodes to be linked in a new relationship must rently active information to an inactive state be activated in STS. Thus, transfer to LTS in LTS. does not imply the removal of the transferred What determines the loss rate? We sup- information from STS, nor the placing of pose that the rate of loss of any informational new "subunits" in LTS that do not already element or node in STS depends on the num- exist in either LTS or STS. Rather, transfer ber of similar elements simultaneously active means the formation of new associations (or in STS. By similarity we refer not only to the strengthening of old associations) be- formal physical similarity but also to sim- tween information structures or nodes not ilarity of features at comparable levels of previously associated (or strengthened) in processing (e.g., the typeface of a printed LTS. Most new associative structures will word will be less likely than the word itself include as a component the context in STS to cause forgetting of a verbally encoded at the time of the transfer. second word in memory). When a large The above remarks specify the nature of amount of similar information is present, the new learning in STS but not the cause under- loss rate will be rapid but will slow as the lying the storage mechanism. It has been amount of active information decreases. To give an example, when a complicated visual 7 An alternative formulation, closer to that sug- scene is presented to a subject briefly in a gested by Atkinson and Shiffrin (1968), would tachistoscope, a flood of visual information use the term STS to refer to those nodes that are enters STS and initiates a series of chains given attention and controlled processing. Other activated nodes would be referred to by another of automatic processing that result in higher term, such as "sensory register" in Atkinson and level features' also entering STS. However, Shiffrin's terminology. At such a general level of most of the activated information will have discussion, it is doubtful that there are substantive decayed and will be lost from STS in just differences between the two approaches—versions of a few hundred milliseconds after physical each could be constructed to be identical to each other. Each approach has its own heuristic ad- offset of the scene (see Sperling 1960); just vantages, but a theory is better judged in terms of a few of the features, perhaps at higher levels, its detailed assumptions and predictions than by will remain present longer than a few seconds. its choice of either type of terminology. 158 RICHARD M. SHIFFRIN AND WALTER SCHNEIDER

shown that rehearsal (and coding rehearsal) trolled search) and have shown how auto- are strongly implicated in learning (see matic processing (e.g., automatic detection) Shiffrin, 197Sb, for a discussion). We prefer can bypass attentional limitations. It should an extension of the rehearsal hypothesis. We not be overlooked that the initial learning of assume that what is stored is what is at- automatic processing may require a variety tended to and given controlled processing. of control processes and hence will be sub- Rote rehearsal is just one form that attention ject to various limitations that may dis- can take. In fact, maintenance rehearsal may appear when learning has progressed to a result in storage of low-level auditory or high level.8 verbal codes not helpful for long-term , (but helpful for long-term recognition), while 8. Retrieval of Information from Short-Term coding rehearsal may result in storage of Store high-level codes useful for recall. This model Several retrieval modes from STS are pos- suggests that some degree of attention or sible. An automatic process might direct the controlled processing is a prerequisite for retrieval process to just a subset of the ac- storage. Thus, incidental learning situations, tivated information (e.g., only the letters, in which no extended rehearsal takes place, not the masks, are compared in our VM will still result in some learning to the degree search studies). that the input items are attended to during The various controlled search strategies presentation. that are available are normally serial in na- These hypotheses that controlled processing ture but a variety of search orders is possible. will underlie learning apply of course to the Search order may depend upon instructions, development of automatic processing, and strength of short-term traces, the nature of in particular, to the development of auto- categories of the traces, the modality of the matic detection studied in the search and information, and the structure of STS. This detection paradigms earlier in this paper and last point is worth emphasizing: Since STS in Part I. In those paradigms it was seen is embedded in LTS, it partakes of the struc- that automatic detection developed only with ture of LTS and is not a totally undiffer- consistent training. Consistent training is entiated mass of information. Thus the order- crucial when the learned sequences in LTS ing of a search can utilize this existent struc- contains an internal or overt response that ture. In addition, the comparison process may will be harmful to performance on trials that be exhaustive or terminating, and the in- are inconsistent. For example, an attention formation located in one phase of the search response to an item will harm performance on can be used to redirect a later phase of the trials when that item is a distractor. Purely search. informational sequences (i.e., sequences that It is the control of search order that is do not include responses that direct internal responsible for many of the selective atten- processes or overt actions) will be stored in tion effects that are observed. Information in LTS when attended to, regardless of con- locations to which attention is first directed sistency of training. In all cases, however, (i.e., the first locations searched) will in consistent training and large numbers of general receive faster and more accurate repetitions should lead to stronger automatic processing for the obvious reasons. encodings and processes.

An implication of the hypothesis that trans- 8 fer to LTS is engendered by controlled pro- This distinction is helpful in understanding the relation of our work to studies of attention in cessing is the important concept that con- infrahuman organisms: the traditional studies of at- trolled processing, and hence attentional tention in discrimination learning and blocking (e.g. limitations, will be involved during the ac- see Mackintosh, 1975) are involved with those quisition of automatic processing. We have limitations that occur during acquisition, while a number of newer studies are involved with limita- largely been identifying attentional limita- tions in steady state situations when learning is not tions with controlled processing (e.g., con- possible (e.g. see Riley and Leith, 1976). PERCEPTUAL LEARNING AND AUTOMATIC ATTENDING 159

9. Retrieval of Information from Long-Term their nature imprecise and fallible. When Store retrieval fails, perhaps temporarily, then we Both automatic and controlled processing say forgetting has occurred. The causes of are utilized in long-term retrieval. In gen- such failure are somewhat to the side of the eral, an automatic associative mechanism will main interests of this paper but may be sum- cause currently inactive information in LTS marized briefly as follows. First, the probe to become active when associatively related cues used to activate related information in information is in STS. The simplest example LTS may be ineffective, either because the is the perceptual encoding process by which subject chooses them incorrectly or because information concerning environmental inputs the general context in STS at the moment is is located in LTS. As each feature is acti- dissimilar to that stored with the desired in- vated, it can serve as a base for further acti- formation in LTS. Second, the process of vation of features associatively related to it retrieval itself may harm additional retrieval and to other previously activated features. attempts, especially if the probe cues are This same principle applies to long-term not altered from one cycle of the search to retrieval of higher order information. the next. Third, the search may fail due to It should be noted well that information premature termination; that is, it may fail once activated automatically from LTS will due to a decision that further retrievals are then be in STS and may have to be located not worth the effort, or due to the reaching through use of any of the short-term retrieval of some sort of time limit available for mechanisms discussed above. search. (See Shiffrin, 1970, 1976, for a fuller How does the subject exert control over discussion of these matters.) LTS retrieval? The subject can utilize re- hearsal and selective processes to raise the B. The Characteristics of Controlled and strength of certain information in STS above Automatic Processing the strength of other information. Then the In capsule form, we have covered the major activated and retrieved information from phases of the information-processing system. LTS will tend to be information related to We shall now attempt to define and out- the selectively accentuated subset, line the characteristics of automatic and con- In Shiffrin's (1970) paper the details of trolled processing in a more precise fashion. the controlled LTS search were presented at length. In summary, the controlled search 1. Controlled Processing process is a series of cycles. On each cycle (a) the subject generates probe information It is important to note that not all control and places it in STS; (b) the probe informa- processes are available to conscious percep- tion, along with the general contextual infor- tion, and not all control processes can be mation presently in STS, activates associated manipulated through verbal instruction. It is information from LTS, called the search set; therefore convenient to divide control pro- (c) the subject searches the STS search set; cesses into two classes: accessible and veiled. (d) the subject decides whether the appro- Accessible control processes are those like priate information has been found and rote rehearsal or alphabetic search of LTS whether the search is over. The process then that can be instituted and modified by in- continues cycling. Note that step (c) is a struction. These are generally slow processes retrieval from STS, so that one phase of con- that are easily perceived by the subject. trolled LTS retrieval may be STS retrieval. Veiled control processes are those like the serial comparison of items in short-term store 10. Long-Term Forgetting that are difficult to modify through instruc- The forgetting of information in LTS is, tion. They are not easy to perceive through by definition in the present theory, a question introspection because they take place so of retrieval failure. The retrieval processes quickly. discussed in the preceding sections are by Both classes of control processes have the 160 RICHARD M. SHIFFRIN AND WALTER SCHNEIDER

following general properties: (a) They are trol, but once initiated all automatic processes limited-capacity processes requiring attention. run to completion automatically (though Because these limitations prevent multiple some indirect control is possible through control processes from occurring simulta- manipulation of the contents of STS at the neously, these processes often consist of the time of the inciting input), (c) They require stringing together in time of a series of considerable training to develop and are most singly controlled unitary operations, (b) The difficult to modify, once learned, (d) Their limitations of control processes are based on speed and automaticity will usually keep those of STS (such as the limited-comparison their constituent elements hidden from con- rate and the limited amount of information scious perception, (e) They do not directly that can be maintained without loss), (c) cause new learning in LTS (though they can Control processes can be adopted quickly, indirectly affect learning through forced al- without extensive training, and modified location of controlled processing), (f) Per- fairly easily (though not always by verbal formance levels will gradually improve over instruction), (d) Control processes can be trials as the automatic sequence is learned. used to control the flow of information within In many cases asymptotic performance levels and between levels, and between STS and may not be reached for thousands of trials. LTS. In particular, they can be utilized to It is particularly important to specify care- cause permanent learning (i.e., transfer to fully what we mean when we say that an LTS) both of automatic sequences and of in- automatic process does not partake of the formation in general, (e) Control processes capacity limitations of STS. To make this show a rapid development of asymptotic per- clear, let us suppose that X, Y, and Z are formance. For a given control process, if nodes that are automatically activated in automatic processing does not develop (due, turn by input I in the presence of general say, to varied-mapping procedures) and if context nodes C. We may depict the sequence the constituent elements of the process do as follows: not otherwise change due to long-term learn- ing, then performance level will stabilize very quickly at an asymptotic value. When per- C C formance improves over trials, it does so be- As each of X, Y, and Z is activated, it enters cause the control process is changed, or be- STS and its future residence in STS will be cause automatic processing develops, or be- affected by the limitations of STS. In fact, cause the constituent nodes that are linked if the nodes X, Y, and Z do not include an by the control process are themselves altered attention-attracting response, then it is quite by long-term learning. conceivable that all three nodes will decay Common examples of control processes in- and be lost within a few hundred milliseconds, clude maintenance or rote rehearsal, coding and the subject will be quite unaware that rehearsal, serial search, long-term memory such an automatic sequence ever occurred. search, and decisions and strategies of all Nevertheless, the sequence itself is not kinds. These will be discussed in more detail governed by the limitations of STS in the in the sections below. sense that the probability of its being ac- 2. Automatic Processing tivated and the rate of its occurrence will not be affected by other concurrent automatic Automatic processes have the following and conrolled processes taking place in STS properties, (a) They are not hindered by the (at least if context C is present and the capacity limitations of STS and do not re- other concurrent processes do not also utilize quire attention. Thus automatic processes nodes X, Y, or Z). often appear to act in parallel with one an- other and sometimes appear to be indepen- 3. The Development oj Automatic Processing dent of each other, (b) Some automatic The tendency for one node to activate processes may be initiated under subject con- another will be increased if both nodes are PERCEPTUAL LEARNING AND AUTOMATIC ATTENDING 161

present simultaneously in STS and if a con- trolled processing, it must not be concluded trol process and attention are directed toward that automatic processing invariably precedes these nodes, thereby increasing their salience controlled processing. In fact, automatic and in STS. controlled processes can proceed in parallel Before we can describe efficacious training with one another (as in Experiment 4d when conditions, we must distinguish between two automatic processing of the elements on the types of automatic processes. One type of invalid diagonal proceeded in parallel with automatic sequence may be called actional, controlled processing of the valid diagonal). because it includes phases that direct internal Even more important, controlled processing processes (like calls for attention) or phases is often used to initiate automatic processing. that produce overt responses (such as button Particularly in complex processing situations, presses). A second type, called informational, (such as reading), an ongoing mixture of con- contains no directions for actions. The dis- trolled and automatic processing is utilized. tinction is crucial because an informational The next steps in the serial, controlled sequence strengthened on one trial will not processing sequence are based on the output have deleterious consequences for perform- of automatic processes initiated earlier; then ance on other types of trials. On the other new automatic sequences are initiated and hand an actional sequence may give rise to these run to completion in parallel with the a response antagonistic to a response re- ongoing controlled processing. quired on another trial. Thus, to be useful in a given task, actional sequences require 5. The Benefits of Automatic and Controlled special, consistent, training conditions. Processing To make this point clear, suppose that node B produces a response antagonistic to A system based on the two basic processing that produced by node C, while nodes D and modes with the characteristics we have de- E produce no responses. If any of A-B, A-C, scribed has many advantages. In novel situa- A-D, or A-E is trained alone, then that tions or in situations requiring moment-to- sequence can be learned. If training on A-D moment decisions, controlled processing may is mixed from trial to trial with training on be adopted and used to perform accurately, A-E, then A will come to elicit both D and E. though slowly. Then as the situations become However, if A-B trials are mixed with A-C familiar, always requiring the same sequence trials then learning of both will be impossible, of processing operations, automatic processing since A cannot lead simultaneously to two will develop, attention demands will be eased, antagonistic responses. The automatic de- other controlled operations can be carried tection system discussed at length in this out in parallel with the automatic processing, paper is just an actional sequence, since an and performance will improve. Some of the automatic-attention response to one stimulus advantages of such a system are (a) It allows is incompatible with a simultaneous atten- the organism to make efficient use of a tion response to another stimulus; therefore limited-capacity processing system. The de- automatic detection requires consistent train- velopment of automatic processing allows the ing to develop. Thus, in the varied-mapping limited-capacity system to be cleared and conditions of any of our studies automatic devoted to other types of processing neces- detection could not develop and controlled sary for new tasks, (b) It allows attention processing had to be utilized. to be directed (through automatic-attention responses) to important stimulation, what- ever the nature of the ongoing controlled 4. Combinations of Automatic and Controlled Processing processing, (c) It allows the organism to adjust to changes in the environment that Although sensory inputs are first encoded make previously learned activity patterns with the automatic processing system, with useless or harmful, (d) It allows the organism the results being made available to con- to deal with novel situations for which auto- 162 RICHARD M. SHIFFRIN AND WALTER SCHNEIDER Long term store Short term store Level Level Automatic Level Level R s 2 Encoding N-1 N E — *=K *t=3— N K=3— — M=) S S "Ti* v~ /"" P i \, 0 0 0 i~ 4 ^ ^"' •^, heb U R ^f — / / N T "V *~ ' / S Y s* P (^- tku omatic E response U I P T N V. ,

Figure 11. A model for automatic and controlled processing during tasks requiring detection of certain input stimuli. Short-term store is the activated subset of long-term store. N levels of auto- matic encoding are shown, the activated nodes being depicted within each level. The dashed arrows going from higher to lower levels indicate the possibility that higher level features can sometimes influence the automatic processing of lower level features. The solid arrow from a node in Level 2 to the attention system indicates that this node has produced an automatic-attention response, and the large arrow from the attention system to Level 2 indicates that the attention system has responded. The arrow from level N to the Response Production indicates that this node has called for an automatic overt response, which will shortly be executed. The arrow from Controlled Processing to the Response Production indicates the normal mode of responding in which the response is based on controlled comparisons and decisions. Were it not for the automatic responses indicated, detection would have proceeded in a serial, controlled search of nodes and levels in an order chosen by the subject. matic sequences have not previously been For example, the letter "M" may first be learned, (e) It allows the organism to learn encoded in features indicating contrast, increasingly complex modes of processing by color, and position; then curvature, con- building upon automatically learned sub- vexity, and angles; then a visual letter systems. code and a verbal, acoustic-articulatory code, then the codes "letter," "consonant," C. A Framework jor Processing in Detection, "capital;" and finally, perhaps, semantic Search, and Attention Tasks and conceptual codes like "followed by 'N'," "middle of the alphabet," and the like. (We The framework we have in mind is built do not necessarily imply that these features within the general theory already presented. are correct or exhaustive; if, say, amplitude It is illustrated in Figures 11 and 12. When a components of the spatial-frequency analysis set of inputs is presented (let us suppose of the inputs prove to be relevant features for convenience that the inputs are visual) encoded by the system, the theory we de- then each begins to undergo automatic scribe would be unchanged in all important processing. The system automatically encodes respects.) What features will become ac- each stimulus input in a series of stages and tivated depends on the physical nature of the activates a series of features in the process. nervous system that was predetermined gen- PERCEPTUAL LEARNING AND AUTOMATIC ATTENDING 163

etically, the degree and type of prior learning, nodes that are automatically activated may the physical characteristics of the display initiate a response that will direct subsequent (like duration and contrast), and the general processing or subsequent actions. For ex- context, both that in the environment and ample, an attention response might be ac- that generated in STS by the subject. To the tivated by a feature; the attention response extent that the subject directs the sensory might direct controlled processing to the cor- receptor orientation and to the extent that responding set of features representing that internally generated information can alter the input stimulus, so long as other competing context in STS, the subject will have at least attention responses do not occur simulta- some indirect control over automatic sensory neously. As another example, an overt re- coding. sponse might be engendered by a particular The automatic processing as described stimulus (such as a startle response to a above takes place in parallel for each of the sudden loud noise, a galvanic skin response to input stimuli. The processing of each stimulus an aversively conditioned word, a ducking is often independent, except for lateral and response to a missile thrown at the head). temporal interactions at early stages, called If the set of inputs contains a target stimulus masking, and except for learned relationships that gives rise to a relevant nonconflicting between items that may affect processing at attention response, or to an overt response, later stages (if adjacent letters form a word, then we say that automatic detection is for example). The various features that are operating. Note that the attention response activated are all placed thereby in STS could be attached to features at any level where they reside for a short period before of processing and to more than one feature being lost (i.e., before returning to an in- at a time. In Figure 12, as an example, at- active state in LTS). tention responses are attached both to the We propose that some of the features or feature "8" and to the feature "number."

Visual Higher Visual character Category level features code codes Inputs ^\ code IV h"\S I Controlled processing Son's age" 'Model railroad track" etc.

Figure 12. Another view of the model shown in Figure 11. This figure depicts a conceivable (but abbreviated) series of feature encodings for a frame in which two characters and two masks are presented. The 'Consistent-mapping condition is shown in which numbers are memory-set items and letters are distractors. The arrows skipping levels indicate that a given feature can help activate features at several different levels. The stimulus 8 is a consistent-mapping target, and hence both the visual and category codes produce automatic-attention responses. The attention response has attracted attention to the information deriving from the input 8. 164 RICHARD M. SHIFFRIN AND WALTER SCHNEIDER

It is possible that many of the inputs will ever this is the case, the framework described give rise to attention or other responses and here applies equally to attention tasks. that these responses will conflict. For ex- Divided-attention tasks study the increments ample, every input might automatically in performance that may occur when the engender an attention response; since it is load is decreased. Decreased loads are usually impossible to direct attention to every item specified through advance instructions that in the display at once, the various attention certain inputs are irrelevant, thereby de- responses will conflict and cancel each other. creasing the size of the memory set, the frame In such cases, the subject will not be governed set, or both. by the output of the automatic system and According to our framework, experiments will base subsequent actions and responses that do not show dividing attention to re- on the results of a controlled search. These duce performance fall into two classes. One hypotheses were supported in Experiment 2 class includes those studies in which auto- (see Figure 4, left panel for the results). In matic detection is operating. These are usually this study, all inputs were items that sub- studies utilizing a CM paradigm and high ject had previously been trained to respond degrees of practice. Divided-attention deficits to with automatic-attention responses. The will not be seen because the target stimulus results showed that the subjects reverted to will be detected automatically, in parallel the use of controlled search; furthermore, with the other stimuli, and often indepen- the controlled search was almost identical dently of other stimuli. An exception to this to that seen in the usual VM conditions in general rule may apply when two (or more) which none of the inputs had been trained to targets are presented simultaneously on a elicit automatic-attention responses. trial. Even if both targets generate attention In cases when the input stimuli do not responses, there may be a difficulty in dis- activate automatic responses that govern the criminating a double from a single occurrence. detection process, then a controlled search In such an event, the controlled comparison must be used. In this event, the subject must system may have to be called into play to attempt to search through the set of features count the targets. (A detection decrement that is activated and to use a search strategy may then occur, because the second item that is as efficient as possible. For example, may have decayed from STS by the time in our visual search studies, letter codes are the comparison of the first item is complete; normally compared in preference to (and see Moray, 1975, Shiffrin, 1976, and the prior to) sets of angles and lines; a category discussion of the 0 spacing effect in the CM code representing a set of letters is often condition of Experiment 3, Part I.) compared in preference to (and prior to) The second class of studies in which di- individual letter codes, as shown in Experi- vided-attention deficits do not occur is that ment 3. Furthermore, the order of search in which controlled search is utilized but the and the placement of decisions within the capacity of STS is not stressed. Examples search are also controlled in an attempt to are seen in the series of studies by Shiffrin increase efficiency. To give an example, com- and his colleagues, summarized in Shiffrin's parisons in our studies in Part I took place (1975a) article. In such studies either the in an order that cycled through the frame load in STS is kept low, or a cue informing for a given memory-set item before switching the subject which input is relevant appears to a new memory-set item, and a matching at about the same time as the inputs. decision was made after every comparison. It is, however, only in specially designed To give another example, matching decisions situations that capacity limits are not stressed. in many memory search studies (M varying, In most studies requiring controlled search, F — 1) are withheld until all comparisons the extra time required to search a larger are completed (Sternberg, 1975). number of relevant inputs will impair per- Note finally that many attention tasks formance, whether measured by reaction utilize search or detection paradigms. When- time or accuracy. This was the case in the PERCEPTUAL LEARNING AND AUTOMATIC ATTENDING 165

VM conditions of the experiments reported terminating search model and discussed in this paper, and in Part I: In all these some alternative models. Our discussion was cases increases in load reduced performance. limited, however, to our basic single or There are two basic reasons for the perform- multiple-frame search task with small values ance reduction in controlled search studies. of M and F. We will now consider a few First, some of the features in STS may decay alternative paradigms, along with special and revert to LTS before the comparison assumptions they might require. process reaches them. Second, new inputs may arrive and require processing, forcing 1. Tasks Utilizing Large Memory Sets the comparison process to switch away from Suppose there are too many items in the the previous inputs. Probably only this second memory set to be maintained in STS. Of factor operated in the studies we have re- course, then, the memory set must be learned ported in this paper. For either of these in LTS in advance of the test. There are reasons, divided-attention deficits can be ex- then two basic search modes. In the first, pected in detection tasks requiring controlled the test item is automatically encoded and search. accesses a node in memory that contains the The situation is quite different, however, desired information. For example, in deciding in focused-attention studies. In these studies, whether an item is any word, the input item, certain inputs are known to be relevant, and if a word, is encoded to the nodes represent- others are to be ignored. If controlled search ing the word and the information in those is being utilized, there is little reason to ex- nodes is usually sufficient to classify the pect that any substantial deficit will be item as a word; on the other hand, a non- caused by the presence of to-be-ignored stim- word obeying appropriate orthographic con- uli, at least if it is clear to the subject well in straints is encoded to a lesser degree and the advance which stimuli and locations are failure of the automatic encoding process relevant and if there are no location con- could itself serve as a cue for "nonwordness." fusions. In such cases the subjects should The second type of search mode would con- direct the search order so that the relevant sist of entering successive parts of the mem- locations are searched first; thus perform- ory set from LTS into STS, using controlled ance deficits should not arise. Evidence sup- search to examine each subset in STS. For porting these propositions is found in Experi- example, if asked whether there is a flower, ment 4a: Subjects were able to search one country, or first name whose fourth letter is diagonal and ignore the other. "u" a subject might successively generate The tasks in which large deficits in focus- members of each category and serially check ing attention will occur are those in which them. irrelevant items give rise to attention re- The above comments make it clear that sponses during automatic encoding. In such the case of large memory sets is not different cases, attention will be attracted to the to- in principle from the cases in which the be-ignored item and a loss of time will occur memory set may be held in STS, though before the controlled search can be redirected the models may need additional assump- to the relevant inputs. The time lost will tions. For example, Atkinson and Juola impair performance. Such focused-attention (1974) carried out a study in which sub- deficits were seen in the reversal results of jects had a large memorized set in LTS. Experiments 1 and 2, and in the attention They proposed a hybrid model in which a results of Experiment 4. judgment based on familiarity enabled the Thus, our framework can be applied to a subject to decide on some trials without a wide variety of search, detection, and at- search that the test item was definitely in, tention tasks, although additional assump- or definitely not in, the memorized set. On tions must be made to generate precise models trials when this initial familiarity judgment to deal with the particulars of each task. In was ambiguous, then a controlled search Part I we presented a quantitative serial, through the set was made. 166 RICHARD M. SHIFFRIN AND WALTER SCHNEIDER

2. Large Display Sets inputs (assuming that one particular subset is consistently relevant). Second, the per- When large numbers of items appear in the ceptual categorization could influence the display, then it may turn out that acuity order and perhaps the nature of controlled varies considerably across the display for a search. To give one simple example, a dis- given fixation point. In turn, acuity changes play arranged in two separate rows will tend are likely to lead the subject to adopt a to be processed one row at a time, in reading strategy in which eye movements are used order. A second example occurs in situations to bring subsets of the display successively where controlled processing can enable one into a high-acuity region. The location of to more quickly decide about the category each new fixation can be governed by a of the input than about the memory-set controlled search strategy or by an automatic membership of the input. In these cases a process, but in either event the necessity of two-phase controlled search might be adopted, obtaining good acuity will force this process with the first phase locating the relevant to be serial across successive eye movements. inputs by category, and the second phase Within each fixation, furthermore, either matching these inputs to the memory set. automatic detection or controlled search could This is illustrated by Green and Anderson's be utilized. The search studies in which sub- (1956) and Smith's (1962) studies in which jects scan rows of characters for targets have a two-digit number of a particular color had shown both modes of search. For example, to be found in a display of many two-digit the Neisser (1963) results, reviewed in Part numbers of differing colors. I, showed that automatic detection developed Automatic processing based on perceptual for subjects well trained in responding to categorization may have played an important letter sets presented in CM fashion (memory- role in the studies of search and detection set size had no effect), while controlled search reported in this paper. We have assumed was utilized for sets that had been given throughout that only the display positions only small amounts of training. An interest- with characters, and not those with masks, ing intermediate case occurred when the sub- are considered in the search. How is search ject was instructed to locate a row of char- restricted to character positions? It is con- acters that did not contain a given character. ceivable that a preliminary controlled search In this case, even with CM training, the task is used to identify character positions, but proved quite difficult. In our view the sub- the results of Experiment 4a argue against ject would have to adopt a serial process such a view—knowing the relevant diagonal from row to row (rather than from fixation of a four-character frame leads to perform- to fixation). Within each row automatic de- ance identical to that seen when two masks tection might be used to locate the key and two characters appear randomly on each character, but then a decision would have frame. Thus, a preliminary controlled search to be made before the next row could be would have to be extremely fast relative to considered. the time for each character comparison. More likely, an automatic process develops that 3. Categorical Partitioning oj Display Sets directs the controlled search away from mask Partitioning of the display set has usually positions and toward character positions. been accomplished by a categorization de- Since such a response would be consistently pendent on a relatively gross simple physical trained and reinforced, it should be learned feature, such as color, shape, or size. When in a fashion similar to that for automatic- displays are segregated into two or more attention responses. groups by such features, processing may be It should be noted that the development affected in two ways. First, the patterning of automatic responses to direct controlled of the input can govern the nature of auto- search is very similar to the process termed matic processing—controlled processing may by Neisser (1967) as "pre-attentive." Neisser be directed automatically to a subset of the was trying to explain how search could be PERCEPTUAL LEARNING AND AUTOMATIC ATTENDING 167 directed to a perceptually segregated subset at some time after presentation. One example of the inputs. We are suggesting that an is the "split-span" technique reviewed by automatic process might develop and direct Broadbent (19S8, 1971). Other examples the search in situations where the search- are reported by Sperling (1960) and Von directing response is consistently trained (as Wright (1970), who studied cued partial is usually true in such studies). We do not report following tachistoscopic exposure of feel, however, that the segregation of the character displays. relevant inputs must be based on simple In these memory paradigms, essentially physical features. With enough consistent the same underlying mechanisms apply as training, an automatic search-directing re- in the detection paradigms, though memory sponse could develop for a set of stimuli storage rather than detection is the processing defined by quite complex features. For ex- goal. When controlled processing is utilized, ample, Gould and Cam (1973) reported the subject has considerable control over results suggesting that an automatic process storage: The items that are rehearsed or was directing search to the locations of items coded are the items that will be recalled. from a set of 10 potentially relevant letters If an input is presented that generates an and away from the locations of items from a automatic-attention response, then this in- set of 8 other letters that never included a put will receive attention and tend to be target. (Any target was always drawn from recalled regardless of the nature of the con- the potentially relevant set of 10 letters, but trolled process utilized to store the other on most trials distractors were chosen from items. An excellent demonstration of these this set of 10 as well as from the set of 8). points may be found in Kahneman (197S). Although the evidence is not yet available, we raise the possibility that the converse of 5. Threshold Detection Tasks the argument in the preceding paragraph An input in a threshold detection task, by might also hold: Even with perceptual seg- definition, is presented in such a way that regation of the display into categories deter- automatic encoding on most trials will be mined by simple physical features, auto- incomplete and therefore ambiguous. That is, matic restriction of the search to the relevant in Section IV.A.4 we discussed the fact the category might not be possible unless the input stimulation must be above some thresh- relevant category is consistent across trials. old level for automatic encoding to run to completion in an accurate fashion. In Figure 4. Attention Tasks 12, for example, the "M," if presented near threshold, might cause activation of some The models applying in attention studies of the features at the visual feature level, but that utilize detection or search paradigms no features at higher levels. This set of are just those discussed in the preceding partial features will in general be ambiguous sections. The general rule is that instructional —several letters and numbers might be con- manipulations affect the order of controlled sistent with the activated feature set. Thus, search; increases in load cause the time a target will on some trials give rise to feature needed for controlled search to increase; and sets consistent with either targets or distrac- consistent training leads to the development tors. Similarly, distractors will on some trials of automatic responses that allow attentional give rise to feature sets consistent with either limitations to be bypassed. targets or distractors. Under these conditions We will say a few words, however, con- it is clear even in CM situations that auto- cerning attention studies that do not utilize matic-attention responses and automatic de- detection or search paradigms. One class of tection can only develop very slowly, if at studies utilizes a memory paradigm. The all, because the internal representations of inputs are manipulated in ways similar to targets and distractors will be mapped con- those used in detection studies, but the sub- sistently to responses only on those few trials ject is asked to recall or recognize the items when the encoding happens to be complete. 168 RICHARD M. SHIFFRIN AND WALTER SCHNEIDER

A similar argument applies if encoding is in- detection tasks as well as many others, can accurate rather than incomplete. be summarized briefly in the following man- Even in CM threshold tasks, therefore, the ner: Sensory inputs are encoded auto- subject will be forced to adopt controlled matically, resulting in states of evidence search on most of the trials, namely the trials (consisting of feature sets) that the subject on which encoding is incomplete. On such utilizes in subsequent controlled processing. trials a serial comparison process will be While generally accurate, this view is too necessary to match the activated features simple in several respects. First, it must not against the features of the members of the be assumed that there is ever one moment memory set. Occasionally, however, auto- in time at which all activated features are matic encoding will manage to process the simultaneously present and available to the target completely. Suppose that almost al- subject for controlled processing. Rather it ways, those stimuli that are completely will usually be the case that some features processed will be encoded correctly. Suppose will still be undergoing the process of en- also that a CM procedure is being utilized in coding (and hence will not yet be activated) the task. In these special conditions, an at- while other features have already been acti- tention response can be learned that might vated and are in the process of being for- be utilized on those trials when complete gotten. Second, it must not be assumed that encoding of the target happens to occur. controlled processing will follow automatic A model very similar to this was used by processing in time. Such an assumption is a Shiffrin and Geisler (1973) to fit the results fairly accurate simplification when used in from a threshold detection study by Estes models for certain tasks including threshold (1972). In that model, stimuli automatically detection. However, it is an inherent feature processed to a final, incomplete result were of our theory that automatic and controlled called "component detections" and stimuli processing can often operate in parallel, so automatically processed to the complete that controlled processing can begin while (letter) level were called "letter recogni- automatic processing is still progressing. This tions." The model proposed a two-part con- point becomes crucial in situations in which trolled search in which the letter recognitions the subject is required to give extremely fast were searched first (in parallel through mem- responses (including those paradigms pre- ory, but serially through the display), and senting above-threshold stimuli), since a then the component detections were matched very fast response might have to be based serially, feature by feature, against the mem- on the features available at a certain point bers of the memory set. In light of the in time, even though better features might present work, we would like to propose for be available at a later point in time. consideration an alternative but similar model in which the letter recognitions result D. Automatic and Controlled Processing in in an automatic-attention response, so that Memory Storage and Retrieval controlled search need be utilized only if the In this section we shall review briefly how target is not among the letter recognitions. automatic and controlled processing might be In practice, the predictions of these two models would not differ greatly. involved in short-term retention, learning, and in long-term retrieval. In summary, it seems clear that controlled search, whether at the feature or character level, is bound to be the predominant de- 1. Maintenance of Information in Active tection mechanism in all threshold detection Memory tasks, but there is a possibility in CM para- We suggest that the control process most digms that automatic detection can be util- evident to introspection is rehearsal. Atkinson ized on a subset of the trials when the target and Shiffrin (1968) studied many tasks in happens to be encoded relatively completely. which subjects utilized a continually updated Our general view of processing, in threshold rehearsal "buffer" to aid their memory sys- PERCEPTUAL LEARNING AND AUTOMATIC ATTENDING 169 tern. Rote rehearsal is of course only one of Craik & Jacoby, 1975; Craik & Lockhart, many control processes that cause items to 1972; and Craik & Tulving, 1975, for a dis- have an extended residence in STS; coding, cussion of these issues.) deciding, retrieving, and the like also cause The other side of the storage question con- a similar result, though these processes are cerns the nature of LTS transfer when at- primarily intended to serve other purposes. tention is not directed to an item or sequence Iq general, items given attention of any sort of items. Studies of incidental learning tend are maintained in STS, including items to to show that subjects learn what they attend which attention is drawn by an automatic to, not what they are told to learn (see Craik process. & Tulving, 1975; Hyde & Jenkins, 1969, Items from which attention has been re- 1973; Postman, 1964; Schneider & Kintz, moved do not necessarily leave STS quickly, 1967). however. When the load is small and new If controlled processing is necessary for inputs are minimized, items can remain in long-term learning then automatic processing STS for extended periods (up to 40 sec in without controlled processing should not lead the Shiffrin, 1973, study). Another example to appreciable retention. One source of evi- is seen in the overt forced-rehearsal study dence is found by examining retention for reported in Atkinson and Shiffrin (1971). A distractors in CM tasks. Such items pre- long list of items was presented and subjects sumably have received almost no controlled rehearsed aloud a continually updated list processing. However, it may be assumed that of the three most recent items. As a result, due to prior exposures and prior learning the the last three items presented were always distractors are given automatic encoding at recalled on an immediate test. However, the least up to the "name" level (for evidence probability of recall for items prior to the see Corteen & Wood, 1972; Keele, 1972; last three decreased systematically as a func- Von Wright et al., 1975, among others). tion of the spacing from last rehearsal to Does the automatic encoding that these dis- test. Thus an item removed from controlled tractors receive during the many trials of a processing still requires a period of time CM task lead to any retention? Moray before it becomes lost from STS (i.e., be- (1959) had subjects repeat aloud a prose comes inactive). This persistence in the passage in one ear while a seven-word list absence of attention might be called the auto- was repeated 35 times in the other ear. Later matic component of short-term maintenance. recognition for the unattended words was at It is this automatic component of persistence the chance level. Gordon (1968) showed that that was studied by Shiffrin and Cook (see a set of four distractors in a CM search task Note 3) and that determines the basic ca- was recognized near the chance level even pacity and persistence of STS. In brief, after 10 days of practice. Gleitman and Shiffrin and Cook propose that the loss Jonides (1976) showed that distractors were process is a stochastic mechanism whose more poorly recognized in a CM search task rate increases as the amount of similar ma- than in a VM search task (though, due to terial concurrently in STS increases. the low levels of practice used in their study, the effect could have been due to the use of 2. Coding, or Transfer to LTS a controlled search for categories in the VM The role of control processes in long-term condition). storage is well established. Rote rehearsal Evidence of a rather different sort is found appears to transfer low-level auditory-verbal in reading tasks. If automatic processing does codes to LTS; these are not very useful for not lead to retention, then it might be ex- long-term recall but can facilitate long-term pected that a skilled reader who automatically recognition. Coding rehearsal appears to processes surface-structure features and who facilitate long-term retrieval of all kinds. In gives controlled processing to conceptual general, what is attended to and rehearsed is properties of a passage would retain little what is stored in LTS. (See Bjork, 1975; information regarding the surface-structure 170 RICHARD M. SHIFFRIN AND WALTER SCHNEIDER details. Bransford and Franks (1971) have tivate many features (and perhaps responses) collected data that may be interpreted in and place them in STS. Internally generated this fashion. Of course, if controlled process- inputs also tend automatically to activate ing is directed toward levels of analysis associated information in LTS and to place normally carried out automatically, then the it in STS. features attended to will be remembered (see In general, the two primary controlled Postman & Senders, 1946). phases of LTS retrieval consist of the pro- To summarize, whether or not some cesses the subject uses to search and process nominal storage manages to be eked out in the information in STS that has just been the absence of controlled or attentive process- activated, and the processes used to alter the ing, it seems clear that any phenomenon that probe information from step to step of the would be appropriately called "automatic LTS search. storage" is a far less important determinant It is important to note that the selection of LTS learning than controlled processing. of probe information is not entirely under Before leaving the general topic of learn- subject control. In fact, the probe informa- ing it is interesting to speculate on the de- tion includes the general contextual informa- velopment of complex information-processing tion present in STS at the time of retrieval, skills. Since controlled processing is limited, in addition to the specific cues the subject only a small part of the memory system can manipulates to facilitate retrieval. The gen- be modified at any one time. After invariant eral context is only partly under subject con- relations are learned at one stage, the pro- trol, since much of it is environmentally de- cessing becomes automatic and controlled termined and even the internally generated processing can be allocated to higher levels context (e.g., what the subject is currently of processing. "thinking about") may be difficult to alter For example, the child learning to read radically. Thus, changes in probe cues will would first give control processing and then be under subject control to only a degree, give automatic processing to various units of and this fact is one of the factors underlying information, The sequence of automatically retrieval failure. learned units might be foreground-back- This extremely brief and superficial over- ground, features, shapes, letters,, words, and view should not be allowed to give the reader meanings of phrases or sentences. The child an impression that distinguishing the auto- would be utilizing controlled processing to matic and controlled phases of retrieval is lay down "stepping stones" of automatic generally a simple matter. Furthermore, the processing as he moves on to more and more controlled phase can be a most complex and difficult levels of learning. The transition many-faceted process. To give just one ex- from controlled to automatic processing at ample, Anderson and Bower (1973) consider each stage would result in reduced discrimina- how subjects compare test sentences with tion time, more attention to higher order stored sentences in long-term memory. In features, and ignoring of irrelevant infor- our model, there are two phases to this ex- mation. Gibson (1969, chap. 20) describes periment: (a) a retrieval of the appropriate these effects to be three of the major trends informational nexus from long-term memory in perceptual development. In short, the through the use of probe information (in- staged development of skilled automatic per- cluding the test question) and (b) a com- formance can be interpreted as a sequence parison in short-term memory of the test of transitions from controlled to automatic question and the retrieved search set. The processing. Anderson and Bower model, in the present view, is primarily a model of the second of 3. Retrieval from LTS these processes—the comparison within Retrieval from LTS has a large automatic short-term memory of the test sentence and component. Sensory inputs result in an auto- the retrieved information. See Shiffrin's matic series of stages of encoding that ac- (1970, 1975b, 1976) studies for a more PERCEPTUAL LEARNING AND AUTOMATIC ATTENDING 171 elaborate discussion of long-term retrieval tive frequency of positive and negative processes. trials). In Figure 11, in which our more general framework is depicted, these stages V. Models of Search and Detection proposed by Sternberg may be placed as follows: "Encoding" is equivalent to auto- In this section the theory we have pro- matic encoding; "serial comparison and bi- posed will be compared with some of the nary decision" both are part of controlled previous models that have been proposed for processing; "response organization and mo- detection and search tasks. tor response execution" are part of response production. In conclusion, then, the Stern- A. The Sternberg Model: Serial, Exhaustive berg model differs from the detailed quan- Search titative model we fit to our data in Part I, but nevertheless may be viewed as a special The serial, exhaustive search model is pre- case of our general framework. sented in Sternberg (1966). Extensions, further evidence, and reviews of the litera- 2. Problems for the Serial, Exhaustive ture are given in Sternberg (1969a, 1969b, Comparison Model and Their Resolution 197S). This model is meant to apply to memory search tasks (F — 1, M varies) The serial, exhaustive model is particu- with small memory-set sizes. It has been larly important on account of the research it confirmed impressively often but is limited has generated that give results not consistent in scope, as it is meant to apply only to a with the theory in its simplest form. These small range of paradigms. In our present inconsistencies have led to many new models, theory, serial, exhaustive search is considered some of which will be discussed below. The to be one of the controlled search strategies, apparent inconsistencies include the follow- one that is very commonly adopted. ing results. The factors that lead subjects in our situ- ation to adopt terminating, controlled search 1. Reaction times depend on the serial remain uncertain at present. Possibly the presentation position of the tested item much larger search loads in our study led to within the memory set (in VM procedures a terminating strategy. One other minor dis- with memory sets changing every trial). crepancy in results involves the Sternberg The most pronounced effects occur when the (1966) finding apparently showing fixed sets memory-set items are presented quickly and (CM) and varied sets (VM) to give equiva- the display item is presented very soon after lent set-size functions. In retrospect we can the last memory-set items (Burrows & see that Sternberg's subjects in the fixed set Okada, 1971; Clifton & Birenbaum, 1970; procedure were given far too little training, Corballis, 1967; Klatzky, Juola, & Atkin- and the fixed sets were varied too often for son, 1971; Klatzky & Smith, 1972). automatic detection to have developed. 2. Reaction times depend on the relative frequency of presentation of items within 1. The Relation Between Sternberg's Model the memory set in CM procedures (see Bie- and Our Framework derman & Stacy; Krueger, 1970; Miller & Pachella, 1973; Shiffrin & Schneider, 1974; Reaction time was proposed by Sternberg Theios et al., 1973). (1969a, 1969b) to be the sum of motor re- 3. Reaction times can be speeded for cued sponse time and the times for four additive items in VM paradigms (Klatzky & Smith, stages: (a) encoding (affected by stimulus 1972) or speeded for expected items in CM legibility), (b) serial comparison (affected paradigms (Shiffrin & Schneider, 1974; but by size of the memory set), (c) binary de- note that the fixed sets changed every 1,60 cision (affected by whether a positive or trials so that only low degrees of practice negative trial has occurred), (d) translation were involved), or speeded in a VM para- and response organization (affected by rela- digm for an item repeated during the pre- 172 RICHARD M. SHIFFRIN AND WALTER SCHNEIDER sentation of the memory set (Baddeley & the comparison process is carried out in a Ecob, 1973). speeded fashion for an expected item. An- 4. The outcomes in CM paradigms often other possibility is that the encoding process differ in fundamental ways from those in or the response production stage is speeded VM paradigms, especially when some simple when an expected item is tested. Shiffrin and physical basis separates the memory and dis- Schneider (1974) attempted to carry out a tractor sets, or when the degree of training test discriminating among the possible mod- is high. Many such findings have been dis- els. While the general notion of expectancy cussed extensively in earlier sections of this was supported by the results, the study was paper and will not be reviewed again here. not conclusive in determining which version 5. Subjects typically give incorrect re- of the model was to be preferred. sponses on 1-10% of the trials, depending The important point to be emphasized is upon the condition. The serial search models that we (and Sternberg also) suggest that do not generally posit explicit mechanisms findings (1) to (3) listed above are not in- to predict errors. compatible with a controlled search process that is serial and exhaustive. We suggest these Before turning to alternative models, it is findings can be explained by factors that af- useful to consider how these phenomena are fect other stages of the response process than dealt with in the framework of our theory, the comparison stage or perhaps by a factor and also to see how Sternberg explains the that affects the comparison process only on results. certain trials. Many other researchers have The various findings in CM paradigms are preferred to discard the hypothesis of a serial, easiest to explain: Automatic detection de- exhaustive comparison process. Some of the velops that enables the serial search to be models proposed as alternatives will be dis- bypassed. Much of the research in this paper cussed below. was directed toward establishing this fact. The final problem for the serial, exhaustive Sternberg (1975) is less specific but also sug- search model is posed by the occurrence of gests that some alternative, more efficient, errors. The explanations of controlled search search process is used in such situations. processes by Sternberg (and by us) for re- Within varied-mapping paradigms, re- action time tasks do not propose an explicit course to the hypothesis of automatic detec- mechanism by which errors may occur. Many tion cannot be used to explain the results. investigators have simply ignored errors as It would be parsimonious if each of the long as the error rate rate is low, under 5%, above findings (1) to (3) proved to be the say. This is perhaps justifiable if errors are result of a common mechanism. Shiffrin and anomalous events that do not interact with Schneider (1974) proposed one possible mech- any of the other variables being studied. In- anism—namely, that when information con- deed the stability of findings across numerous cerning test probabilities is available to the studies that do not attempt to fix error rates subject, then one item is placed in a special at any given value and that vary consider- state prior to each test display (called a ably in the observed rates (in the range 0- state of "expectancy"), and that a test of 10%) lends some support to the view that the expected item results in a faster response ignoring errors will not distort the conclusions time than tests of nonexpected items. Stimu- drawn from the reaction time data. lus probability, serial position, cueing, stim- In principle, however, error rates cannot ulus repetition, or differential importance be ignored. Pachella (1974) has shown that could all be expected to determine the prob- instructions used to change error rates by ability with which stimuli will be expected. just a few percent can cause considerable The effects of expectancy could possibly changes in the level and pattern of reaction act at several different stages en route to the times. There are many ways in which error- execution of the response. One possibility, producing mechanisms can be appended to also mentioned by Sternberg (1975), is that controlled search models. Two of the most PERCEPTUAL LEARNING AND AUTOMATIC ATTENDING 173 important are as follows: First, the subject error rates corresponding to each amount of might terminate, or be forced to terminate, available search time are collected. The model the controlled search before the search is derived from one study can then be used to completed. Then a response would have to predict the results of the other. This is the be made as a guess based on partial informa- method that was adopted in Experiments 1 tion at most, or perhaps no response at all and 2 of Part I. This approach has the ad- would be made (an omission). These types vantage of relating two fields of inquiry that of error resulting from early search termina- are normally treated separately. In the tion account for almost all errors in search present paper, for example, attention tasks and attention tasks using above-threshold using accuracy as a measure were linked to stimuli with accuracy as a measure; in par- search tasks using reaction time as a measure. ticular, most of the errors in the accuracy It should be noted, by the way, that the tasks reported in the present paper and in type of error predicted for the results of Part Part I are errors of the type. Second, the I/Experiment 1 is that caused by premature search process might be carried out incor- termination of controlled search whose char- rectly; that is, a comparison might be car- acteristics were derived from the reaction ried out incorrectly owing to confusions time results of Part I/Experiment 2. How- among items, forgetting, misperception, and ever, the errors seen in Part I/Experiment 2 the like. When error mechanisms are ap- were not necessarily caused by the same pended to the basic search model a new ex- mechanism. In fact many, if not most, of panded theory results, which must be tested them may have resulted from confusions, through joint consideration of error data and forgetting, or anomalous condition-indepen- reaction time data. dent factors. Two basic approaches may be utilized to test search theories incorporating error pre- B. The Theios Model dictions. One approach involves manipulat- ing the error rate systematically within each There are really two separate treatments condition; the manipulation may be carried to be discussed here: the specific micromodel out through instruction, deadline training, used by Theios et al. (1973) to predict re- or signals-to-respond (Reed, 1973). This ap- sults that would otherwise be fit by a serial, proach has the advantage of making full use exhaustive model, and the general theory of both the error and reaction time data used by Theios (1973, 197S) to describe the from a single experiment. It has the dis- production of response times in a variety of advantage that the manipulations of error tasks. rates may affect the nature of the controlled search strategy adopted by the subject. For example, Reed (1976) carried out a remark- 1. Serial, Terminating Search Through a List able series of tests of a wide variety of models of Memory Items and Distractors using data collected in the signal-to-respond In this model a list is constructed in mem- procedure. Unfortunately, though one model ory, a list on which appear all items that could be singled out in preference to the might be displayed for test. Each of these others, the basic data differed in important items has an associated response cue (posi- respects from those ordinarily found in the tive or negative) attached to it. The memory- simpler version of the same paradigm. set items and distractors may in general be The alternative approach involves carrying intermingled in this list, but a variety of out two separate studies involving the same factors affect list ordering, and in many cases subjects. One study utilizes the usual re- the memory-set items may tend to appear action time methodology, with instructions early in the list. During the test, the subject to keep error rates low. The second study serially scans the list until the test item is utilizes a paradigm in which the available located, then terminates the search and re- search time is systematically varied and the sponds according to the cue information 174 RICHARD M. SHIFFRIN AND WALTER SCHNEIDER located along with that item. A group of serial, terminating search is not a good can- items at the end of the list is assumed, how- didate as a model for the simpler para- ever, to be searched in parallel, in one step, digms in which linear, parallel set-size func- if search has not previously terminated. These tions are collected (see Sternberg, 1975, for last items are supposed to be accessed through the relevant arguments). There are many LTS, while those scanned serially are in STS. additional reasons for us to reject the hy- (There are a variety of complications and pothesis of Theios et al; these reasons arise extensions added to this model that we from the results of the present paper. Bas- shall not discuss.) In at least some situations, ically, the Theios et al. model was proposed this model can approximately predict linear to deal with a variety of stimulus prob- set-size functions that are parallel for nega- ability effects that arise in CM paradigms. tive and positive trials. We have shown in this article that indeed a Various versions of the model were fitted different detection process develops in such to data collected by Theios et al. (1973). paradigms; however, the fully developed They utilized a CM procedure with a moder- automatic process and the pure controlled ate amount of training. It seems likely, there- process are two different mechanisms and fore, that many of the observed effects in both are relatively simple. In effect, Theios their study, especially those based on differ- et al. have been led to propose a rather com- ences in stimulus frequency, resulted from plex unitary model to try to fit data that the development of at least a small degree of probably reflect a mixture of two different automatic detection. The more frequent items simple search mechanisms. would come to elicit automatic responses sooner, causing the observed reduction in 2. A Hybrid Serial-Parallel Model response reaction times. In our terms, the A much more general theory of response subject's strategy was probably a mixture time generation in search tasks has been of automatic detection and controlled search, presented by Theios (1973, 1975). This making simple conclusions difficult. Never- theory has many similarities to the theory theless, even if automatic detection is not presented in the present article. It postulates involved, it may be asked whether models input and identification stages (which we of the complexity suggested by Theios (or would call automatic encoding), a response suggested by the comments above) are neces- determination stage (which in our terms is sary to fit his data. Shiffrin and Schneider either a controlled search or an automatic (1974) showed that a simple version of the process or some combination, depending on expectancy model using serial, exhaustive the paradigm), and response program selec- search through the positive set could provide tion and response output stages (which we a fit to the data about as good as that pro- term response production). Theios's de- vided by the more complex models of Theios scriptions of the situations in which auto- et al. (1973). Under these circumstances the matic processing should occur are quite sim- data do not provide strong support for the ilar to those we suggest in this paper. His Theios model. distinction between controlled and automatic Aside from questions concerning the de- processing is not made as explicit as in the tails of the Theios et al. (1973) study, one present article, and in his view the use of can address the more general hypothesis put controlled search is tied more to stimulus- forward by these investigators. This hy- response compatibility than to the type and pothesis suggests that most memory search amount of practice (i.e., the consistency of studies are best conceived of in terms of a the mapping). Furthermore, the details he (rather complex) serial, terminating search presents concerning controlled search pro- model. Although we have collected data cesses are somewhat different and more limited arguing for a serial, terminating search than we have presented in the present treat- through the memory set (Experiment 2, Part ments. On the whole, however, the Theios I), we agree with Sternberg (1975) that a (1975) theory is quite compatible with the PERCEPTUAL LEARNING AND AUTOMATIC ATTENDING 175 general approach taken in this paper; in in long-term memory, a location containing fact, we suggest that interested readers read information enabling the correct response to that paper for additional evidence regarding be given. These models are most applicable automatic processing in reaction time tasks. to CM paradigms, although they have some- times been applied to VM situations and al- though some have been elaborated to contain C. Parallel Processing Models serial, short-term search as an additional Certain parallel processing models have process. Generally, in these models the trace been proposed for VM search paradigms as strength of the code in long-term memory alternatives to the usual serial models. Atkin- determines the ease of access and the re- son et al. (1969), Murdock (1971), and sponse time for a given item. Townsend (1971) have all presented versions Speaking generally, it is our feeling that of such models. The Townsend model can such models capture part of the process we serve as an example. In this model the time have described as automatic detection. Note, to complete any one comparison is distributed however, that the two concepts are not exponentially with a rate constant that de- identical. An example will make this point pends on the number of current items under- clear. Suppose we define the memory set to going processing. When any item completes consist of all words whose fourth letter is comparison, the rates are immediately re- alphabetically prior to the second letter. A adjusted. word presented auditorally will presumably Such models almost completely mimic be encoded automatically until the long-term serial models. The same set-size predictions memory node is located that contains the are derived as for serial models, whether or spelling. Then a controlled process will check not termination of search is assumed. There the spelling to see whether the criterion is are some minor problems with these models. satisfied. Such a process is not "automatic For example, the exponential assumption detection." If several items are presented at implies a particular relation between the once, each would have to be checked serially. growth of the mean reaction time and the This is an example of a task designed so that growth of the variance of the reaction time, the information found in long-term store with set size. More important, however, this corresponding to a given input is sufficient sort of parallel model is not conceptually for a decision to be made correctly, without very different from the serial models—com- considering other members of the memory set. parisons occur, on the average, at evenly This property could hold for either VM or spaced intervals of time for each of the items CM tasks, regardless of task-specific train- of the memory set (or the display set). We ing and regardless of whether an attention (or prefer, therefore, to think of such models as other task-specific) response has been learned. members of a class also containing the serial The automatic detection mode is therefore models. As Sternberg (1966) has shown, similar to the search mode of trace-strength some types of parallel models, in which each models in that each input is encoded auto- item is processed independently of the num- matically, and relevant information is located ber of alternative items, cannot predict the in LTS. If the information located must be observed results. processed in limited, controlled fashion (so that multiple inputs would have to be de- cided about sequentially, for example) then D. Direct Memory Access Models Based on we would not consider the process to be Trace Strength one of automatic detection. On the other A number of models have included a search hand, if the information located is attention mechanism that resembles in certain respects directing, or even response eliciting, then the the automatic detection process presented in process would be one of automatic detection. this paper. In these models, an item presented Several varieties of trace-strength models for test is encoded directly to some location may be distinguished. We consider these next. 176 RICHARD M. SHIFFRIN AND WALTER SCHNEIDER

1. Trace-Strength Discrimination Models 2. The Atkins on-Juola Model In these models, the information found in This model (Atkinson & Juola, 1974) was memory for any tested item consists of a constructed to explain the results of search unidimensional value of strength (often in- studies in which a large, previously learned terpreted as familiarity). The task is de- ensemble of items forms the memory set. signed so that the response can be based on The model assumes that the test item is the retrieved strength value, and it is as- evaluated in terms of its familiarity—a value sumed that criteria can be chosen so that above a criterion leads to a positive response, errors are kept low in frequency. Examples a value below a lower criterion leads to a of such models are found in Corballis, Kirby, negative response, and a value between the and Miller (1972); Nickerson (1972); two criteria leads the subject to carry out a Baddeley and Ecob (1973); Cavanagh serial search of the list. Such a model is (1973, 1976), and Anderson (1973). quite compatible with our theory as applied There are many tasks in which we regard to a task of this kind, but there is at least a response strategy based on trace strength one problem requiring further research. The to be highly plausible, whether the trace estimated serial search rate is only about strength is utilized in a controlled decision 10 msec per item. If the serial search mode or used to initiate an automatic response. were similar to that in the usual procedure But we must ask whether such models are then a rate nearer 40 msec would be ex- reasonable when applied to typical VM pected. There are several hypotheses to ex- search paradigms. There is no question that plain the discrepancy—for example, it is pos- the models are capable of generating ac- sible that the search is somehow restricted curate predictions if strengths for items in to a subset of items that are similar to the variously sized memory sets are appropri- test item. A somewhat different hypothesis ately adjusted. However, a model with this would suggest that no serial search occurs much freedom could predict anything and at all, but that some rechecking of the would not be interesting. In practice, of familiarity value is necessary when the first course, various assumptions are made to observation is between the criteria, and that govern the possible changes in trace strengths the confusability of the test item, and hence across conditions. The extant models, how- the need to recheck the test item, will de- ever, do not appear to us to have the sim- pend on the list length. plicity and elegance of serial search theory In paradigms like that of Atkinson and when both are applied to VM search tasks. Juola (1974), where trace strength or famil- The models of Corballis (1975) and iarity is used as a basis for the response Cavanagh (1976) both assume that trace (thereby sometimes bypassing a serial search), strength is affected by rehearsal (called "se- it might be asked to what degree the detec- quential " by Corballis). Both models tion is an automatic process. Certainly the have been developed primarily in response to encoding of the stimulus and the generation the findings of serial position effects. As of a familiarity value is an automatic process. discussed earlier, such effects are not incom- The decision how to respond for a given patible with serial search, or even serial, value of familiarity could very well be a exhaustive search. Thus if rehearsal affects controlled process initially, though with suffi- search in important ways it might do so cient practice in the task, the decision and because it controls the trace strengths used response initiation for extreme familiarity to respond (in the trace-strength models), or values might also become automatized. because it affects serial search order (in a terminating search), or because it affects E. Some Conclusions about Search Models encoding or response time rather than com- parison time (in a serial, exhaustive model). Each of the models we have considered has Future research will be necessary to establish some elements in common with our general which, if any, of these possibilities is true. theory. At the same time, each has been PERCEPTUAL LEARNING AND AUTOMATIC ATTENDING 177 applied to some paradigms that we think in Stage 1 is called "filtering" or "stimulus would be better handled by an alternative set," and that in State 2 is called "pigeon- process. The direct access trace-strength holing" or "response set." There is a very theories, for example, have a close affinity short-term sensory-information store (less with our automatic detection mechanism than 1-sec duration) called a "buffer," which and have a natural application to CM para- accepts information in parallel from the digms, tasks with perceptual cues separating physical inputs. (This store is equivalent to the memory and distractor sets, or tasks in the more peripheral information in STS in which associations in LTS to each test item our theory.) The filter selects information contain information enabling a response to from the buffer and sends it through a be made. Such models should not, however, limited-capacity channel. In the Broadbent be applied, in our view, to VM paradigms of (1958) theory the filter allowed information the type that are usually studied in the from only one source at a time to continue laboratory. For such tasks serial, controlled through the system. In the theory proposed search seems a more attractive possibility in 1971, Broadbent's earlier view was modi- (or at least limited controlled search of one fied in accord with data put forth and theory sort or another). Much of the present article suggested by Treisman (e.g., 1960, 1969). has been devoted to demonstrating that two This modified theory suggests that the filter qualitatively different detection or search does not block, but only attenuates, informa- modes exist and to establishing the conditions tion from nonattended sources. The states of under which each is utilized. Many of the evidence in the system are the output of the models to date have attempted, we think filter. unsuccessfully, to apply a single mode of While there are obvious elements of sim- search to all the search paradigms. ilarity between our approach and Broad- The second point we wish to emphasize bent's, we would like to focus on the differ- is that many of the prior models have been ences : constructed to deal with just a few studies or just a few results from those studies. The 1. The Role of Control Processes history of investigation in this area has In our theory, selective attention is largely demonstrated beyond doubt that numerous defined in terms of control processes. Selec- models are capable of predicting results from tivity due to structural characteristics of the any given study. A proper evaluation of processing system, even structure that has models should incorporate two tests: (a) been learned, is not considered to be an at- Can the model predict a wide variety of tentional process. In the Broadbent model, results in differing paradigms, and can it the distinction between the structural, learned predict results in both the reaction time and components of selectivity and those under accuracy domain? (b) Can the model predict subject control are not as clearly distin- results from a single series of studies on the guished. One consequence of Broadbent's ap- same subjects, a series in which most of the proach is that the type and amount of train- commonly examined variables are manipu- ing is not implicated as a particularly im- lated? portant determinant of the presence or ab- sence of attentional deficits. Much of the VI. Models of Selective Attention data in the present article, however, has shown how the conditions of practice (par- A. The Broadbent and Treisman Models ticularly the consistency of the mapping of The Broadbent (1971) model supposes attention to a feature to a reinforced out- that there are two basic selective processes, come) determine the development of auto- one leading from the physical input to a set maticity, the bypass of controlled search, and of internal codes that serve as the evidence, the elimination of attentional limitations. and a second leading from the internal codes The emphasis on the role of control pro- to categories and responses. The selection cesses in our theory also implies that we are 178 RICHARD M. SHIFFRIN AND WALTER SCHNEIDER advocating an active rather passive view of than information on a nonattended channel. attentional selectivity. In our theory atten- We will discuss below some studies by tional selectivity is largely the result of ac- Shiffrin and his colleagues that seem to pro- centuation of certain informational elements vide strong evidence in favor of this view- through the use of control processing; in the point. Broadbent-Treisman approach, on the other hand, inhibition and attenuation of the pro- 3. Short-Term Store and Levels of Processing cessing of certain inputs play the most im- In the Broadbent theory there is a short- portant role in selectivity. term buffer more or less corresponding to a short-term memory for peripheral informa- 2. Selectivity Leading to the Internal States tion. In the original 1958 theory this was of Evidence the sole basis for short-term memory, but in In our theory the internal states of evi- the 1971 theory, a later short-term system dence are the result of automatic processing called "primary memory" was added sub- of sensory inputs, with subject control af- sequent to the filter and the limited-capacity fecting sensory encoding only indirectly and channel. Thus the new theory is in close cor- to a small degree. On the other hand, the respondence with the approach of Atkinson selectivity that results from filtering plays and Shiffrin (1968). a large and fundamental role in Broadbent's However, the Broadbent theory identifies theory. This fact alone would not be a source selectivity with particular levels of processing: of discrepancy between the theories if these Selectivity operates after the buffer and prior filtering processes reflected only relatively to primary memory. We, on the other hand, permanent, structural components of selec- treat all short-term memory as a single con- tivity. However, the examples of filtering tinuum consisting of the results of automatic given by Broadbent make it clear that this encoding as well as inputs from LTS. Thus process is intended to reflect temporary shifts STS consists of information at a wide variety in attentional control. Thus, in numerous of levels of processing. In our view, selec- studies, filtering is said to select inputs tivity is not restricted to any special levels. on the basis of a physical difference, like Rather, selectivity operates at whatever level location. In some of these studies the same controlled processing is utilizing. If con- physical cue is always assigned consistently, trolled processing is checking color serially, so that automatic detection (in our terms) then attentional selectivity will appear at the could have developed, but the number of peripheral level of hue, while if words are trials is usually so small that controlled pro- compared serially to see which is a synonym cessing probably was utilized. In other of of a memory item, then attentional selectivity these types of studies the physical cue was will appear at the central semantic level. assigned randomly across trials, so that selec- tion must have been based on temporary con- 4. The Role of Timing trol processes (see Broadbent, 1971, chap. An important concept in Broadbent's S). In either class of studies, Broadbent sug- theory is that of "switching time." Broadbent gests filtering as the basis for selection, with interprets the results of many studies by attended inputs presumably resulting in a assuming that attention (the filter) may better state of internal evidence than non- not be redirected from one source to another attended inputs. before the lapse of some minimal time, called We take a different view. We propose that switching time. Although some of the results automatic processing results in roughly Broadbent ascribes to switching time limita- equivalent internal states, but that controlled tions are due in our theory to other mech- processing must examine these internal states anisms, we have no objection to arguments sequentially. Thus information on an at- that attentional limitations are rooted in tended channel is examined first, and is there- limitations on the rate of processing opera- fore detected and recalled more effectively tions. To the contrary, we have carried this PERCEPTUAL LEARNING AND AUTOMATIC ATTENDING 179

argument much further and have suggested must be high-level ones, however, because in that the time utilized during controlled pro- the studies masks were used to delete low- cessing is in many tasks the immediate, level features from storage. Thus such results proximal cause of the inability to divide call into question the need for two selective attention. The results of Experiments 1 and processes, one before and one after genera- 2 in Part I, and our analysis of them, were tion of states of internal evidence. We argue intended to demonstrate the role of timing instead that only one controlled selection by relating the accuracy of performance in process occurs, after automatic encoding has attention tasks to the reaction times produced produced states of internal evidence. Even in search tasks. if it should eventually be determined that some small degree of selection occurs during B. The Shiffrin (1975a) Model initial perceptual processing, the accumula- The work by Shiffrin and his colleagues tion of data makes it clear that the magnitude summarized by Shiffrin (1975a) may be re- of attentional effects due to controlled pro- garded as a preface to the present studies and cessing after automatic encoding is far present theory. The primary aim of the greater than the magnitude of any effects due earlier studies was a demonstration that en- to early selection. For further discussion see coding quality is virtually unaffected by at- Shiffrin & Gardner (1972), Shiffrin, Craig, & tentional instructions. The results from a Cohen (1973), Shiffrin, Gardner, & Allmeyer (1973), Shiffrin & Grantham(1974), Shiffrin, series of experiments demonstrated that in- formation from a source (location) is pro- Pisoni, & Casteneda-Mendez (1974), and cessed as well when it is the only information Shiffrin, McKay, & Shaffer (1976). The hy- requiring processing as when it arrives pothesis supported by these studies, that of simultaneously with other information also a single selection phase following automatic encoding, has been proposed by a number of requiring processing. If filtering or attenua- tion were occurring between stimulus input researchers and theorists. We turn now to and the production of internal states of evi- these proposals. dence, then attention would have been al- located less effectively when simultaneous C. The Deutsch and Deutsch Theory inputs were used, and performance would As preface to the Deutsch and Deutsch have suffered. (1963) model and to other similar theories, It could well be argued that the results we will review some results in auditory selec- were obtained only because attentional ca- tive perception from shadowing or dichotic pacity had not been exceeded. In effect, this listening tasks. Then we will consider the is our own argument. Such an argument does Deutsch and Deutsch treatment, objections not help resuscitate the views that early to it, and our present view. filtering, attenuation, or blocking of pro- cessing takes place, since the situations in 1. Experiments on Auditory Shadowing and which the results were obtained were as com- Dichotic Listening plex at the stimulus input side as most of the studies in which attentional deficits are Starting with Cherry's (1953) study, nu- found. merous studies have utilized a shadowing It should be noted that the Shiffrin technique in which the subject repeats con- (197Sa) results are easily accommodated in tinually a stream of speech (usually presented Broadbent's general theory. It must be as- to one ear) while other messages are pre- sumed that the various inputs are retained in sented simultaneously (usually to the other the sensory buffer long enough that they may ear). Moray (1959) showed that information all be passed successfully through the limited- on the nonshadowed ear could not be recalled, capacity system (in essence this is our ex- recognized, or relearned with savings. How- planation, also, if "sensory buffer" is replaced ever, the subject's own name on the non- by "STS"). The features in the sensory buffer shadowed ear could be noticed and recalled 180 RICHARD M. SHIFFRIN AND WALTER SCHNEIDER at the end of the experiment. The traditional tion, overcomes these objections. If the stim- explanations for such results hold that the uli presented do not cause automatic-atten- information presented to the nonattended ear tion responses, then a controlled search situa- is greatly attenuated, but that some informa- tion (like that of Experiment 4a) obtains, tion, enough to let highly salient inputs be and a similar explanation holds: The subject noticed, does pass the filter. The basis for will carry on attention-demanding, controlled the types of information that are noticed on processing primarily on the shadowed or at- the nonshadowed ear is usually considered to tended message, with only minimal controlled be rooted in the content of the input. The processing of the nonattended information, earlier treatments supposed that simple phys- just enough to establish which information ical cues served as a basis (not only in is to be given deeper processing. Physical Broadbent, 19S8, but also in Neisser, 1967). cues, when available, will make this relevancy For example, two messages in the same voice decision easy. On the other hand, if auto- are confused, but if the messages are in a matic-attention responses have been pre- man's voice and a woman's voice, then the experimentally attached to stimuli, these nonshadowed message can be ignored. A stimuli will be attended to and remembered number of studies by Treisman (see 1969 for even when they are presented on a to-be- a review) showed that content of messages ignored channel. This fact explains the re- can also serve as a basis for selection though call for the subject's own name, for example. usually not as effectively as physical cues. Furthermore, the objection that selection is difficult without a physical cue does not hold, 2. The Deutsch and Deutsch Approach since we have shown that consistent train- Faced with selection on the basis of con- ing leads to the development of automatic- attention responses even without the pres- tent, Deutsch and Deutsch (1963) suggested that all inputs are analyzed to a relatively ence of a simple, consistent physical cue (though a great deal of training may be high level and that the results of the pro- cessing are then used to select certain stimuli necessary—see Experiments 1 and 3). for further processing, for memory, and for To return to the original controversy, some response. We are in essential agreement with of the most persuasive evidence in favor of this view, though we have elaborated on it the Deutsch and Deutsch position has been considerably and introduced the important the subject's ability to divide attention with- distinction between controlled search and out deficit, even in VM situations when auto- learned attention responses leading to auto- matic detection could not have been operat- matic detection. ing. Such data have often been discounted by The Deutsch and Deutsch model has not Broadbent and others on the following basis. yet gained general acceptance and in par- It is argued that relatively unanalyzed in- ticular has been rejected by Broadbent formation is held temporarily in a sensory (1971) and Treisman and Riley (1969). buffer; thus it should not be surprising that Within the framework of shadowing studies, multiple stimuli can pass through the filter several objections have been raised to the as long as a free period exists after presenta- Deutsch and Deutsch position. First, selection tion that allows the information to be passed without a physical cue remains difficult even serially from the buffer through the filter. with wide variations in content. Second, dif- This argument is used to explain the ap- ferences in content affect selection very little parent ability to analyze multiple simulta- when a physical cue is present. Third, selec- neous stimuli in a variety of tasks and is tion of a given word in a dichotic sequence thereby used to discount much of the evi- depends on the previously attended word and dence supporting the Deutsch and Deutsch not on the previously presented, but un- position. Broadbent's argument, however, attended, word. does not account for the results of our multi- Our present theory, with its distinction be- ple-frame studies, for in these studies there tween controlled search and automatic detec- were no blank periods to allow serial process- PERCEPTUAL LEARNING AND AUTOMATIC ATTENDING 181 ing, and yet the rate of presentation could trolled processing. Thus extrapolating from even be speeded under the consistent (CM) Norman's theory, visual search for a single training conditions. We do not disagree that memory item, even in a VM task, should be under VM conditions a postpresentation free almost capacity-unlimited and parallel. The period will allow simultaneous inputs to be subject should, by hypothesis, raise the per- handled serially by the attention system. We tinence value of the single item, and hence merely wish to point out that this explanation this item should be processed first. However, cannot help explain the relationship between all of our studies have demonstrated that our VM and CM conditions in the multiple- such pertinence changes cannot take place on frame tasks. Instead, Broadbent and Treis- a trial-by-trial basis. It is only after con- man would have to argue that the salience of siderable CM training that an automatic- the consistently trained targets is so high attention response develops (or, in Norman's that the filter plays no role in processing terms, that pertinence is raised). Thus a sub- whatsoever, in which case it might as well be ject may wish to find a particular stimulus argued that automatic processing bypasses but does not in general have the attentional the selective system. control to enable the stimulus to be located In summary, we argue that the position without a serial, limited, controlled search. of Deutsch and Deutsch, when extended in the manner of our present theory, is both E. The Neisser Model fully defensible and in better accord with the Neisser (1967) has proposed a general data than the opposing views. model that postulates an early stage of par- D. The Norman Model allel processing not under subject control called "preattentive" (similar to our sys- Norman, alone and with coauthors, has temic, automatic stage), and a later, controll- been responsible for a number of rather dif- able, serial stage referred to by the term ferent models. We consider here the version "focal attention." In many respects this gen- most closely in accord with that of Deutsch eral approach is quite similar to the present and Deutsch (Norman, 1968, 1969; see also treatment. For example, Neisser discusses in Lindsay & Norman, 1972). This model is some detail the fact that preattentive pro- basically an extension of the Deutsch and cesses can control attention or responses (as Deutsch approach in the sense that it pro- in eye movements). Furthermore, it is sug- poses a relatively complete analysis of all in- gested that practice can lead to search be- coming stimuli. As in the Deutsch and coming dependent on preattentive mecha- Deutsch approach, a mechanism is proposed nisms, and hence lead detection to become by which certain of these complete encodings automatic. are selected for further serial processing. This Neisser's own treatment of his visual search mechanism is called "pertinence." data differed in some respects from our present In the model, each set of encodings is as- treatment. For example, Neisser argued that signed a value of pertinence that determines irrelevant targets in CM designs are not pro- the order of further processing. The value of cessed in as much detail as relevant targets. pertinence is assumed to derive from two We argue that automatic analysis is equal, sources: the stimulus encoding and an anal- but that attention is drawn to the relevant ysis of previous inputs. The item selected for targets. Furthermore, there are some differ- further analysis is that one whose pertinence ences in emphasis between Neisser's and our value is highest as determined by a com- models. For example, Neisser tends to em- bination of sensory encoding and previous phasize that preattentive discriminations are analysis. In very general terms this model is based on relatively crude physical differ- highly similar to the theory we have proposed. ences among stimuli, whereas we feel "pre- However, the Norman model fails to make attentive" processes are determined by the important and necessary distinction be- conditions of training and practice, with phys- tween automatic-attention learning and con- ical differences affecting the rate of develop- 182 RICHARD M. SHIFFRIN AND WALTER SCHNEIDER ment of automaticity. However, such differ- determine the ability of features to attract ences should not be allowed to mask the attention. That is, we have shown how the similarity of the general approaches.9 attention-attracting ability of features de- pends on the use of consistent-mapping train- F. The LaBerge Model ing procedures. LaBerge (1973, 1975) presents an atten- On the whole, our model and LaBerge's tional model, in the context of reaction time agree when applied to the same data. Thus, matching tasks, that has many elements in although the two theories have been built common with the present theory. He sup- on somewhat different data bases, they are poses, like Deutsch and Deutsch, that there consistent even to the degree that there are exists an automatic, learned encoding system similarities in the diagrammatic representa- that operates in parallel on input stimuli and tions (compare our Figures 11 and 12 with produces features independently of controlled, the depictions of the model in LaBerge, 1973, attentive processing. Depending on prior 1975). learning, there are no limits to the depth of Let us now return to models that are in such processing. Subsequent to such auto- basic agreement with the Broadbent-Treis- matic encoding, controlled processing re- man view, as opposed to the Deutsch and quiring attention is necessary to carry out Deutsch view. further analyses of the input. The features resulting from automatic encoding are as- G. The Kahneman Theory sumed to be capable of calling attention to Kahneman (1973) proposes a rather ex- themselves. LaBerge proposes, and demon- tensive theory and reviews the literature in strates empirically, that new automatic en- considerable detail. We cannot begin to do codings, that is, perceptual learning, may justice to his overall system in the limited occur with practice. Finally, LaBerge em- space available and will therefore discuss phasizes that perceptual learning takes place only a few general features. Kahneman pro- mainly after high accuracy is achieved. All of poses a general theory of the allocation of these aspects of the system are quite similar capacity according to a factor called "effort." to ours. In our terms, he is concerned with the general One additional element of the LaBerge limits on controlled processing: How can model holds that attention may be directed control be allocated? Can total capacity toward a not-yet-learned feature, causing it vary? What types of operations can be car- to act temporarily as an automatically learned ried out together with what losses of unit, but only while attention is maintained. efficiency? That is, a feature that would normally re- Interpreting our theory in Kahneman's quire extended controlled processing to be terms, one would say that we have proposed encoded from automatically activated, sim- a specific time-based theory determining con- pler subunits, can, when attention is focused trolled processing capacity in detection tasks, on it in advance, be activated by the per- and that our theory gives a set of rules or ceptual encoding process. This suggests at least some degree of control over encoding. While we have downplayed the role of such 9 One must take care to distinguish the Neisser control, we have not ruled it out. In fact, (1967) view of attention from the Neisser (1967) we suggested that the results from Part I/ view of cognitive processing in general. The general view emphasizes the constructivist aspects of per- Experiment 2, when M was equal to 1, might ception, sometimes known as "top-down" processing, well be explained by such a mechanism. in which perception is heavily influenced at all If there is a difference between our ap- stages by the subject's expectations and general proach and LeBerge's, it lies in the different knowledge. The essence of this general view seems to be a bit incompatible with the notion of an areas of applications and the completeness initial automatic processing system that is fixed, of the assumptions. For example, we have permanent, and largely unaffected by higher level elaborated on the conditions of training that controlled processing. PERCEPTUAL LEARNING AND AUTOMATIC ATTENDING 183 strategies by which subjects reallocate their is aimed to describe visual detection in processing in such tasks. Such a time-based threshold tasks. It differs from many previous theory is not specifically given by Kahneman. models in that it is quantitative and hence Furthermore, in Kahneman's theory there are makes explicit, testable predictions. It makes many results taken to indicate a change in a key assumption that the number of features allocatable capacity, a change often deter- abstracted, and the rate of features ab- mined by changes in effort. We feel that stracted, from a given input stimulus will be some of these findings are better explained by inversely related to the total number of items recourse to the hypothesis that automatic simultaneously presented. That is, attention processing has developed. As we have argued must be shared (i.e., divided) among multiple in this paper, the development of automatic inputs, even at the earliest stages of pro- processing provides a means of improving cessing. The simultaneous-successive studies performance independently of the effort put by Shiffrin and Gardner (1972), described into controlled processing. In fact, the effort earlier, most clearly show that this assump- required is usually greatly reduced by the tion is false. The features abstracted are development of automatic processing even of equivalent worth whether or not the in- though performance improves. puts are successive or simultaneous. As we There is one other important specific area have seen in the present article, the difficulties of difference between Kahneman's theory in dividing attention arise later, during con- and ours. In common with a number of the trolled processing, subsequent to the en- theories we have described (i.e., Broadbent), coding of features. Kahneman's assumes that control of alloca- tion affects processing prior to the stage at I, Classification Models for Attentional which features indicating recognition are ac- Paradigms tivated. We have dealt with this issue at A number of investigators have provided length and will not repeat the arguments, useful and interesting classifications of at- but we feel that control does not operate at tentional paradigms. One example is a classi- this point in the midst of what we call auto- fication of types of attentional limitations matic processing (except in certain special by Norman and Bobrow (197S). In general, cases, such as instances where the inputs are Norman and Bobrow suppose that inputs degraded or ambiguous, in which case auto- may be given limited processing for either of matic encoding never reaches the recognition two reasons: "data-limitations" and "re- stage). source-limitations." In our terms, data-limita- Reversing the emphasis, it may be asked tions refer to the performance-reducing inter- what elements in Kahneman's approach are actions that occur during automatic encoding; not considered in our theory. The principal for example, a signal is heard better in quiet such element is the entire concept of effort. than in a background of white noise. Resource We have confined ourselves to tasks in which limitations refer to the kind of attention we assume effort is continually maintained limitations that we would classify as the at the highest possible level. Of course it result of controlled processing. Norman and must be true that performance will drop Bobrow argue that many authors have given when effort drops to a low enough level, but insufficient thought to the source of the per- our theory has not addressed this question -formance limitations observed in their tasks. at all. It is natural, however, for us to assume They argue that many tasks can be varied that changes in effort will affect controlled parametrically (quantitatively) so that data- processing but not automatic processing. limitations occur under some conditions and resource limitations under other conditions. H. The Norman and Rumelhart Model We agree fully with these arguments; our The attentional features of this model studies have, we think, spanned the range of (Norman & Rumelhart, 1970) derive largely these limitations and demonstrated these from Rumelhart's (1970) article. This model points quite clearly. 184 RICHARD M. SHIFFRIN AND WALTER SCHNEIDER

Many other investigators have provided in humans to models deriving from auto- classifications of attentional paradigms and matic-attention learning in consistent-map- results. The Treisman (1969) review, for ping paradigms in animals. The results in the example, is largely atheoretical and classifies present article should make it clear that such a variety of paradigms and tasks with a view a correspondence will fail. In fact, however, toward indicating the types of attention re- it is becoming clear that a comparison of striction that occur in each. Similarly Moray VM with VM, and CM with CM, studies in (1969a, 1969b) has classified attentional the two areas reveals similar findings. To studies and paradigms. Unfortunately, the give just one example, Riley and Leith variety of results and the experimenter's (1976) review some paradigms in which ani- ability to design new tasks seem to have mals are forced to utilize what we term con- grown about as fast as the growth in com- trolled processing; in such cases they reveal plexity of the classifications. Thus as useful capacity limitations and attentional control as these classifications have been, they have roughly similar to that found in human sub- been limited to a posteriori descriptions. jects (see also Mackintosh, 1975, and Wagner, 1976, for discussions of related J. Other Models matters). The brief summary of a number of theories given above by no means exhausts the field. K. Summary Keele (1973) and Hochberg (1970) pre- sented models like that of Deutsch and Many of the elements constituting our Deutsch and generally similar to ours. Im- theory may be found in previous proposals. portant and influential reviews, with con- There are certain themes, which have re- siderable theoretical import, have been curred in earlier theories, about which we written by Egeth (1967), Moray (1969a, take a particular position. 1969b),Lindsay (1970), and Swetsand Krist- 1. Control oj processing during encoding offerson (1970). Important research has been and feature abstraction. We argue that con- carried out by Eriksen and his associates trol is minimal, that subject-controlled filter- (e.g., Eriksen Si Collins, 1969; Eriksen & ing, blocking, or attenuating does not occur Eriksen, 1974; Eriksen & Hoffman, 1972; during this stage but subsequent to per- Eriksen & Rohrbaugh, 1970) and by Posner ceptual encoding. and his associates (e.g., Posner & Boies, 2. The role oj crude physical differences in 1971; Posner & Snyder, 1975). Although leading to automatic discrimination. We argue space limitations preclude our reviewing that the conditions of training are crucial to this work, we feel that the models we have the development of automatic processing and discussed above represent a cross section of automatic detection; the nature of physical typical approaches that have been put for- differences among items is of secondary im- ward for tasks that resemble those in this portance theoretically, though it may in paper. practice be important in determining the We would like to finish this discussion of rate of acquisition and levels of perform- alternative approaches by considering very ance during acquisition. briefly the relation of our theory to those 3. The limitations on, and capacity oj, con- arising in the fields of animal discrimination trolled processing. We have presented a learning and attention (e.g., see Mackintosh, specific, detailed model in which limitations 1975; Sutherland & Mackintosh, 1971). We on rate of processing determine the con- feel that data and theory in the human and trolling system's capacity. infrahuman areas are in much closer accord than previous work might lead one to believe. On the one hand, our theory may be seen There has been a tendency to compare at- as both a specification and extension of views tentional models deriving from controlled proposed in various earlier models. On the processing effects in varied-mapping studies other hand, our theory is grounded in a PERCEPTUAL LEARNING AND AUTOMATIC ATTENDING 185 much firmer empirical foundation (i.e., Ex- 2. The learning of an automatic-attention periments 1-3 of Part I, and 1-4 of Part II) response is a long-term phenomenon greatly than the previous proposals, encompasses a resistant to change (since many more trials greater breadth than some of the earlier were needed to recover to a given perform- views, and states its assumptions precisely ance level after reversal than to reach that enough that at least in our standard paradigm level in initial training). and in a number of others as well, predic- 3. The presence among the distractors of tions can be made in advance of experimental items that automatically attract attention manipulations. Perhaps most important, we harms the subject's ability to carry out a have, in Part I, taken steps toward quantita- controlled search (since postreversal per- tively linking attentional phenomena and formance dropped below initial performance search mechanisms, and have thereby tied levels). together two important fields of inquiry— 4. Eventually automatic-attention responses attention and search—that have too often can be "unlearned" and new sets of auto- been treated separately in previous models. matic responses learned, but only after con- siderable amounts of retraining. VII. Summary and Conclusions The studies reported in this article were Experiment 2 was carried out to test the designed to test and extend the ideas pre- effects of reversal when the two sets in- sented in Part I. In Part I we showed that volved were well-known categories. In Ex- controlled search is utilized in varied-mapping periment 2, the subjects who had been trained search paradigms and that automatic detec- extensively to search for digits in a back- tion is utilized in consistent-mapping search ground of consonants (or vice versa) were paradigms. These findings obtained in both reversed in a fashion similar to that used in multiple-frame tasks measuring accuracy and Experiment 1. The results showed in single-frame tasks measuring reaction time. The VM results in both types of tasks were 1. Performance in a VM condition after fitted by a quantitative model assuming reversal was identical to that in normal VM serial, terminating search, with comparisons controlled search, even though all items pre- cycling through the frame for each memory- sented were from the previous CM target cate- set item. gory. Hence, when all items have equally The experiments in the present article were strong automatic-attention responses, these designed to examine the learning and unlearn- either "cancel each other out" or can be ing of automatic detection, the role of cate- ignored, and normal controlled search takes gorization, and the learning of automatic at- place. tending. A summary of the main findings are 2. A reversal of memory and distractor given below. sets for the CM conditions reduced perform- Experiment 1 was carried out to study the ance almost to chance levels. This result development of automatic processing. Auto- shows that the phenomenon of automatic matic detection for Letter Set 1 in a back- detection is not due to the presence of a ground of Letter Set 2 developed over several categorical difference between memory and thousand training trials during which time distractor sets, since a distinction between performance improved considerably. Then the letters and numbers was still available after two sets were reversed and performance not reversal. only deteriorated sharply but dropped below 3. The pattern of results after reversal in- the level obtaining at the start of training dicated that a controlled search was being when controlled search was being utilized. utilized, but one that compared the category The results showed that of each input to the memory-set category 1. Automatic detection could develop when in a single operation. The results suggest that the memory and distractor sets were not pre- categories do improve search performance experimentally categorized. but do so by facilitating controlled search. 186 RICHARD M. SHIFFRIN AND WALTER SCHNEIDER

Experiment 3 was designed to test directly 2. Dividing attention is possible when the the role of categories in the development of targets have been consistently mapped during automatic detection and the operation of training until automatic detection operates. controlled search. It was shown that in VM 3. Focused-attention deficits arise when the conditions, category learning for arbitrary, distracting stimuli initiate automatic-atten- visually confusable sets of letters eventually tion responses. took place after 25 sessions of training. At 4. Focusing attention is possible during this point, controlled search was still being controlled processing (e.g., in VM tasks). used, but the comparison process had switched from individual items to categories as a We shall conclude by summarizing briefly whole (memory-set size had no effect). the progress we feel has been made in organ- However, when training was then switched izing the phenomena of detection, search, to a CM procedure, performance improved attention, and perceptual learning. One of radically and automatic detection was learned. the greatest contributions of the work is the The results suggest that fact that all these areas have been tied both empirically and theoretically to each other 1. Both categories and automatic responses and to common mechanisms. Another im- can be learned for arbitrarily constituted, portant contribution is the delineation of visually confusable character sets. the differences between, and the character- 2. The role of categories is to improve con- istics of, automatic and controlled processing. trolled search, and possibly to speed the We make no claims that these experiments acquisition of automatic detection. have concluded the study of these areas. Quite to the contrary, one of the most im- As a group, Experiments 1, 2, and 3 pro- portant results of this work is the host of new vided a convincing demonstration of the questions that can now be phrased, and we qualitatively different characteristics of con- hope, answered. trolled search and automatic detection. Experiment 4 was designed to test the sub- jects' ability to focus attention in the face Reference Notes of distraction by (a) neutral characters, (b) 1. Sperling, G. Multiple detections in a brief visual current targets in to-be-ignored display posi- stimulus: The sharing and switching of atten- tions, (c) items in to-be-ignored positions tion. Paper presented at the 16th annual meeting that subjects had been trained previously to of the Psychonomic Society, Denver, November 1975. respond to with automatic-attention re- 2. Sperling, G. Attention operating characteristics sponses. The results showed that controlled for visual search. Paper presented at the 9th search can usually be directed to locations annual meeting, New that the subject desires to attend to but that York, August 1976. 3. Shiffrin, R. M., & Cook, J. R. Short-term for- automatic-attention responses can overwhelm getting without rehearsal or interference: Data the controlled processing system and can and theory. Paper presented at the 16th annual cause attention to be allocated to positions meeting of the Psychonomic Society, Denver, that should be ignored. Thus automatic-atten- November 1975. tion responses cannot be ignored and will interrupt and redirect ongoing controlled References processing. Anderson, J. A. A theory for the recognition of The attention literature as a whole was items from short memorized lists. Psychological considered, and in light of our results the Review, 1973, 80, 417-438. following rules were suggested. Anderson, J. R., & Bower, G. H. Human associative memory. Washington, D.C.: V. H. Winston & Sons, 1. Divided-attention deficits arise from 1973. Atkinson, R. C., Holmbren, J. E., & Juola, J. F. limitations on controlled processing. In par- Processing time as influenced by the number of ticular, detection deficits are due to the elements in a visual display. Perception & Psycho- limited rate of the serial comparison process. physics, 1969, 6, 321-326. PERCEPTUAL LEARNING AND AUTOMATIC ATTENDING 187

Atkinson, R. C., & Juola, J. F. Search and decision archical search processes in a recognition memory processes in recognition memory. In D. H. task. Journal of Verbal Learning and Verbal Be- Krantz, R. C. Atkinson, R. D. Luce, & P. Suppes havior, 1971, 10, 528-541. (Eds.), Contemporary developments in mathe- Corballis, M. C. Serial order in recognition and matical psychology (Vol. 1). San Francisco: recall. Journal of Experimental Psychology, 1967, Freeman, 1974. 74, 99-105. Atkinson, R. C. & Shiffrin, R. M. Human memory: Corballis, M. C. Access to memory: An analysis of A proposed system and its control processes. In recognition times. In P. M. A. Rabbit & S. Domic K. W. Spence & J. T. Spence (Eds.), The psy- (Eds.), Attention and performance V. New chology of learning and motivation: Advances in York: Academic Press, 1975. research and theory (Vol. 2). New York: Academic Corballis, M. C., Kirby, J., & Miller, A. Access to Press, 1968. elements of a memorized list. Journal of Experi- Atkinson, R. C., & Shiffrin, R. M. The control of mental Psychology, 1972, 94, 185-190. short-term memory. Scientific American, 1971, 225, Corteen, R. S., & Wood, B. Autonomic responses to 82-90. shock-associated words in an unattended channel. Baddeley, A. D., & Ecob, J. R. Reaction time and Journal of Experimental Psychology, 1972, 94, short-term memory: Implications of repetition ef- 308-313. fects for the high-speed exhaustive scan hy- Craik, F. I. M., & Jacoby, L. L. A process view pothesis. Quarterly Journal of Experimental Psy- of short-term retention. In F. Restle et al., (Eds.), chology, 1973, 25, 229-240. Cognitive theory (Vol. 1). Hillsdale, N.J.: Erlbaum Biederman, I., & Stacey, E. W., Jr. Stimulus prob- Associates, 1975. ability and stimulus set size in memory scanning. Craik, F. I. M., & Lockhart, R. S. Levels of pro- Journal of Experimental Psychology, 1974, 102, cessing: A framework for memory research. Jour- 1100-1107. nal of Verbal Learning and Verbal Behavior, 1972, Bjork, R. A. Short-term storage: The ordered out- 11, 671-684. put of a central processor. In F. Restle et al., Craik, F. I. M.. & Tulving, E. Depth of processing (Eds.), Cognitive theory (Vol. 1). Hillsdale, N.J.: and the retention of words in . Erlbaum Associates, 1975. Journal of Experimental Psychology: General, Brand, J. Classification without identification in 1975, 104, 268-294. visual search. Quarterly Journal of Experimental DeRosa, D. V., & Morin, R. E. Recognition reaction Psychology, 1971, 23, 178-186. time for digits in consecutive and nonconsecutive Bransford. J. D., & Franks, J. J. The abstraction of memorized sets. Journal of Experimental Psy- linguistic ideas. , 1971, 2, chology, 1970, 83, 472-479. 331-350. Deutsch, J. A., & Deutsch, D. Attention: Some Broadbent, D. E. Perception and communication. theoretical considerations. Psychological Review, London: Pergamon Press, 1958. 1963, 70, 80-90. Broadbent, D. E. Decision and stress. London: Egeth, H. Selective attention. Psychological Bulletin, Academic Press, 1971. 1967, 67, 41-57. Burrows, D. & Okada, R. Serial position effects in Eriksen, C. W., & Collins, J. F. Temporal courses high-speed memory search. Perception & Psycho- of selective attention. Journal of Experimental physics, 1971, 10, 305-308. Psychology, 1969, 80, 254-261. Cavanagh, P. Holographic processes realizable in the Eriksen, B. A., & Eriksen, C. W. Effects of noise neural realm: Prediction of short-term memory letters upon identification of target in nonsearch performance (Doctoral dissertation, Carnegie- task. Perception & Psychophysics, 1974, 16, 143- Mellon University, 1972). Dissertation Abstracts 149. International, 1973, 33, 3280B. (University Micro- Eriksen, C. W., & Hoffman, J. E. Some character- films No. 72-234, 79) istics of selective attention in visual perception Cavanagh, P. Holographic and trace strength determined by vocal reaction time. Perception & models of rehearsal effects in the item recogni- Psychophysics, 1972, 11, 169-171. tion task. Memory & Cognition, 1976, 4, 186-199. Eriksen, C. W., & Rohrbaugh, J. W. Some fac- Cherry, C. Some experiments on the reception of tors determining efficiency of selective attention. speech with one and with two ears. Journal American Journal of Psychology, 1970, 83, 330- of the Acoustical Society of America, 1953, 25, 342. 975-979. Estes, W. K. Interactions of signal and background Cherry, E. C., & Taylor, W. K. Some further experi- variables in visual processing. Perception & Psy- ments upon the recognition of speech with one chophysics, 1972, 12, 278-286. and two ears. Journal of the Acoustical Society Gibson, E. J. Principles of perceptual learning and of America, 1954, 26, 554-559. development. New York: Prentice-Hall, 1969. Clifton, C., & Birenbaum, S. Effects of serial posi- Gleitman, H., & Jonides, J. The cost of categoriza- tion and delay of probe in a memory scan task. tion in visual search: Incomplete processing of Journal of Experimental Psychology, 1970, 86, 69- targets and field items. Perception & Psycho- 76. physics, 1976, 20, 281-288. Clifton, C., & Gutschera, K. D. Evidence for hier- Gordon, G. T. Interaction between items in visual 188 RICHARD M. SHIFFRIN AND WALTER SCHNEIDER

search. Journal of Experimental Psychology, 1968, LaBerge, D. Acquisition of automatic processing 76, 348-3SS. in perceptual and associative learning. In P. M. A. Gould, J. D., & Cam, R. Visual search, complex Rabbit & S. Dornic (Eds.), Attention and per- backgrounds, mental counters, and eye move- formance V. New York: Academic Press, 1975. ments. Perception & Psychophysics, 1973, 14, Lewis, J. L. Semantic processing of unattended 125-132. messages using dichotic listening. Journal of Ex- Green, B. F., & Anderson, L. K. Color coding in a perimental Psychology, 1970, 85, 225-228. visual search task. Journal of Experimental Psy- Lindsay, P. H. Multichannel processing in perception. chology, 1956, 5;, 19-24. In D. I. Mostofsky (Ed.), Attention: Contem- Hochberg, J. E. Attention, organization, and con- porary theory and analysis. New York: Appleton- sciousness. In D. I. Mostofsky (Ed.), Attention: Century-Crofts, 1970. Contemporary theory and analysis. New York: Lindsay, P. H., & Norman, D. A. Human informa- Appleton-Century-Crofts, 1970. tion processing: An introduction to psychology. Homa, D. Organization and long-term memory New York: Academic Press, 1972. search. Memory & Cognition, 1973, 1, 369-379. Lively, B. L., & Sanford, B. J. The use of category Hyde, T. S., & Jenkins, J. J. Differential effects of information in a memory search task. Journal of incidental tasks on the organization of recall of Experimental Psychology, 1972, 93, 379-385. a list of highly associated words. Journal oj Mackintosh, N. J. A theory of attention: Variations Experimental Psychology, 1969, 82, 472-481. in the associability of stimuli with reinforcement. Hyde, T. S., & Jenkins, J. J. Recall for words as a Psychological Review, 1975, 82, 276-298. function of semantic, graphic, and syntactic Mandler, G. From association to structure. Psycho- orienting tasks. Journal of Verbal Learning and logical Review, 1962, 69, 415-426. Verbal Behavior, 1973, 12, 471-480. Mandler, G. Subjects do think: A reply to Jung's Ingling, N. W. Categorization: A mechanism for comments. Psychological Review, 1965, 72, 323- rapid information processing. Journal of Experi- 326. mental Psychology, 1972, 94, 239-243. Miller, J. O., & Pachella, R. G. Locus of the stimulus Jensen, A. R., & Rohwer, W. D. The Stroop Color- probability effect. Journal of Experimental Psy- Word Test: A review. Acta Psychologica, 1966, 25, chology, 1973, 101, 227-231. 36-93. Moray, N. Attention in dichotic listening. Affective Jonides, J., & Gleitman, H. A conceptual category «ues and the influence of instructions. Quarterly effect in visual search: O as letter or as digit. Journal oj Experimental Psychology, 1959, 11, Perception &• Psychophysics, 1972, 12, 457-460. 56-60. Jonides, J., & Gleitman, H. The benefit of cate- Moray, N. Attention: Selective processes in vision gorization in visual search: Target location with- and hearing. New York: Academic Press, 1969. (a) out identification. Perception & Psychophysics, Moray, N. Listening and attention. Harmondsworth, 1976, 20, 289-298. Middlesex, England: Penguin Books, 1969. (b) Jung, J. Comments on Handler's "From association Moray, N. A data base for theories of selective to structure." Psychological Review, 1965, 72, listening. In P. M. A. Rabbit & S. Dornic (Eds.), 318-322. Attention and performance V. New York: Aca- Kahneman, D. Attention and effort. Englewood demic Press, 1975. Cliffs, N.J.: Prentice-Hall, 1973. Murdock, B. B., Jr. A parallel-processing model for Kahneman, D. Effort, recognition, and recall in scanning. Perception & Psychophysics, 1971, 10, auditory attention. In P. M. A. Rabbit & S. 289-291. Dornic (Eds.), Attention and performance V. Naus, M. J. Memory search of categorized lists: A New York: Academic Press, 1975. consideration of self-terminating search strategies. Keele, S. W. Attention demands of memory retrieval. Journal of Experimental Psychology, 1974, 702, Journal of Experimental Psychology, 1972, 93, 992-1000. 245-248. Naus, M. J., Glucksberg, S., & Ornstein, P. A. Keele, S. W. Attention and human performance. Taxonomic word categories and memory search. Pacific Palisades, Calif.: Goodyear, 1973. Cognitive Psychology, 1972, 3, 643-654. Klatzky, R. L., Juola, J. F., & Atkinson, R. C. Neisser, U. Decision time without reaction time: Test stimulus representation and experimental context effects in memory scanning. Journal of Experiments in visual scanning. American Journal of Psychology, 1963, 76, 376-385. Experimental Psychology, 1971, 87, 281-288. Klatzky, R. L., & Smith, E. E. Stimulus expectancy Neisser, U. Cognitive psychology. New York: and retrieval from short-term memory. Journal oj Appleton-Century-Crofts, 1967. Experimental Psychology, 1972, 94, 101-107. Nickerson, R. S. Binary-classification reaction time: Krueger, L. E. Effect of stimulus probability on A review of some studies of human information two-choice reaction time. Journal oj Experimental processing capabilities. Psychonomic Monograph Psychology, 1970, 84, 377-379. Supplements, 1972, 4, 275-318. LaBerge, D. Attention and the measurement of Norman, D. A. Towards a theory of memory and perceptual learning. Memory & Cognition, 1973, attention. Psychological Review, 1968, 75, 522-536. 1, 268-276. Norman, D. A. Memory while shadowing. Quarterly PERCEPTUAL LEARNING AND AUTOMATIC ATTENDING 189

Journal of Experimental Psychology, 1969, 21, Shiffrin, R. M. Capacity limitations in information 85-93. processing, attention, and memory. In W. K. Norman, D. A. & Bobrow, D. G. On data-limited Estes (Ed.), Handbook of learning and cognitive and resource-limited processes. Cognitive Psy- processes: Memory processes (Vol. 4). Hillsdale, chology, 1975, 7, 44-64. N.J.: Erlbaum Associates, 1976. Norman, D. A., & Rumelhart, D. E. A system for Shiffrin, R. M., Craig, J. C., & Cohen, U. On the perception and memory. In D. A. Norman (Ed.), degree of attention and capacity limitations in Models of human memory. New York: Academic tactile processing. Perception & Psychophysics, Press, 1970. 1973, 13, 328-336. Okada, R., & Burrows, D. Organizational factors in Shiffrin, R. M., & Gardner, G. T. Visual processing high-speed scanning. Journal of Experimental capacity and attentional control. Journal of Ex- Psychology, 1973, 101, 77-81. perimental Psychology, 1972, 93, 72-82. Pachella, R. G. The interpretation of reaction time Shiffrin, R. M., Gardner, G. T., & Allmeyer, D. H. in information processing research. In B. H. On the the degreee of attention and capacity limi- Kantowitz (Ed.), Human information processing: tations in visual processing. Perception & Psy- Tutorials in performance and cognition. Hillsdale, chophysics, 1973, 14, 231-236. N.J.: Erlbaum Associates, 1974. Shiffrin, R. M., & Geisler, W. S. Visual recognition Posner, M. I., & Boies, S. J. Components of atten- in a theory of information processing. In R. L. tion. Psychological Review, 1971, 78, 391-408. Solso (Ed.), Contemporary issues in cognitive Posner, M. I., & Snyder, C. R. R. Facilitation and psychology: The Loyola Symposium. Washington, inhibition in the processing of signals. In P. M. A. D.C.: V. H. Winston & Sons, 1973. Rabbit & S. Domic (Eds.), Attention and per- Shiffrin, R. M., & Grantham, D. W. Can attention formance V. New York: Academic Press, 197S. be allocated to sensory modalities? Perception & Postman, L. Short-term memory and incidental Psychophysics, 1974, IS, 460-474. learning. In A. W. Melton (Ed.), Categories of Shiffrin, R. M., McKay, D. P., & Shaffer, W. O. human learning. New York: Academic Press, 1964. Attending to forty-nine spatial positions at once. Postman, L., & Senders, V. Incidental learning and Journal of Experimental Psychology: Human generality of set. Journal of Experimental Psy- Perception and Performance, 1976, 2, 14-22. chology, 1946, 36, 153-165. Reed, A. V. Speed-accuracy tradeoff in recognition Shiffrin, R. M., Pisoni, D. B., & Castaneda-Mendez, memory. Science, 1973, 181, 574-576. K. Is attention shared beteen the ears? Cognitive Reed, A. V. List length and the time course of Psychology, 1974, 6, 190-215. recognition in immediate memory. Memory & Shiffrin, R. M., & Schneider, W. An expectancy Cognition, 1976, 4, 16-30. model for memory search. Memory & Cognition, Riley, D. A., & Leith, C. R. Multidimensional psy- 1974, 2, 616-628. chophysics and selective attention in animals. Smith, S. L. Color coding and visual search. Journal Psychological Bulletin, 1976, S3, 138-160. of Experimental Psychology, 1962, 64, 434-440. Rumelhart, D. E. A multicomponent theory of the Sperling, G. The information available in brief perception of briefly exposed visual displays. visual presentations. Psychological Monographs, Journal of Mathematical Psychology, 1970, 7, 1960, 74(Whole No. 498). 191-218. Sternberg, S. High-speed scanning in human memory. Schneider, F. W., & Kintz, B. L. An analysis of the Science, 1966, 153, 652-654. incidental-intentional learning dichotomy. Journal Sternberg, S. The discovery of processing stages: of Experimental Psychology, 1967, 73, 85-90. Extensions of Donder's method. In W. G. Roster Schneider, W., & Shiffrin, R. M. Controlled and (Ed.), Attention and performance 11. Amsterdam: automatic human information processing: I. De- North-Holland, 1969. (a) tection, search, and attention. Psychological Re- Sternberg, S. Memory scanning: Mental processes view, 1977, 84, 1-66. revealed by reaction time experiments. American Shiffrin, R. M. Memory search. In D. A. Norman Scientist, 1969, 57, 421-457. (b) (Ed.), Models of human memory. New York: Sternberg, S. Memory scanning: New findings and Academic Press, 1970. current controversies. Quarterly Journal of Ex- Shiffrin, R. M. Information persistence in short-term perimental Psychology, 1975, 27, 1-32. memory. Journal of Experimental Psychology, Stroop, J. R. Studies of interference in serial verbal 1973, 100, 39-49. reactions. Journal of Experimental Psychology, Shiffrin, R. M. The locus and role of attention in 1935, 18, 643-662. memory systems. In P. M. A. Rabbit & S. Domic Sutherland, N. S., & Mackintosh, N. J. Mechanisms (Eds.), Attention and performance V. New York: of animal discrimination learning. New York: Academic Press, 1975. (a) Academic Press, 1971. Shiffrin, R. M. Short-term store: The basis for a Swets, J. A., & Kristofferson, A. B. Attention. Annual memory search. In F. Restle et al. (Eds.), Cogni- Review of Psychology, 1970, 21, 339-366. tive theory (Vol. 1). Hillsdale, N.J.: Erlbaum Theios, J. Reaction time measurements in the study Associates, 1975. (b) of memory processes: Theory and data. In G. H. 190 RICHARD M. SHIFFRIN AND WALTER SCHNEIDER

Bower (Ed.), The psychology of learning and Treisman, A. M. Strategies and models of selective motivation: Advances in research and theory attention. Psychological Review, 1969, 76, 282-299. (Vol. 7). New York: Academic Press, 1973. Treisman, A. M., & Riley, J. G. A. Is selective at- Theios, J. The components of response latency in tention selective perception or selective response? simple human information tasks. In P. M. A. A further test. Journal of Experimental Psychol- Rabbit & S. Dornic (Eds.), Attention and per- ogy, 1969, 79, 27-34. formance V. New York: Academic Press, 197S. Von Wright, J. M. On selection in visual immediate Theios, J. et al. Memory scanning as a serial self- memory. In A. F. Sanders (Ed.), Attention and terminating .process. Journal of Experimental Psy- performance (Vol. 3). Amsterdam: North-Holland, chology, 1973, 97, 323-336. 1970. Townsend, J. T. A note on the identifiability of Von Wright, J. M., Anderson, K., & Stenman, U. parallel and serial processes. Perception & Psycho- Generalization of conditioned GSRs in dichotic physics, 1971, 10, 161-163. listening. In P. M. A. Rabbit & S. Dornic (Eds.), Treisman, A. M. Contextual cues in selective listen- Attention and performance V. New York: Aca- ing. Quarterly Journal of Experimental Psy- demic Press, 197S. chology, 1960, 12, 242-248. Wagner, A. R. Priming in STM: An information processing mechanism for self-generated or re- Treisman, A. M. Effect of irrelevant material on trieval-generated depression in performance. In the efficiency of selective listening. American Jour- T. J. Tighe & R. N. Leaton (Eds.), Habituatiom: nal of Psychology, 1964, 77, S33-S46. (a) Perspectives from child development, animal be- Treisman, A. M. Monitoring and storage of ir- havior, and neurophysiology. Hillsdale, N.J.: relevant messages in selective attention. Journal of Erlbaum Associates, 1976. Verbal Learning and Verbal Behavior, 1964, 3, Williams, J. D. Memory ensemble selection in human 449-459. (b) information processing. Journal of Experimental Treisman, A. M. Verbal cues, language and mean- Psychology, 1971, 88, 231-238. relevant messages in selective attention. Journal of Psychology, 1964, 77, 206-219. (c) Received August 30, 1976 •