Animal Learning & Behavior 1991, 19 (4), 317-325

Extinction of operant behavior: An analysis based on foraging considerations

ROGER L. MELLGREN
University of Texas, Arlington, Texas

and

TIMOTHY F. ELSMORE
Walter Reed Army Institute of Research, Washington, D.C.

In two experiments, the frequency of food provided by variable interval (VI) schedules prior to extinction was varied. In the first experiment, two-component multiple schedules resulted in a greater number of responses in extinction in the presence of the stimulus previously associated with the richer of the two component schedules than in the presence of the stimulus previously associated with the leaner schedule. In the second experiment, different groups of animals were trained on different VI schedules. Responding in extinction was analyzed into bouts of responding, showing that the number of response bouts increased and the number of responses per bout decreased with decreasing frequency of reinforcement during training. These data are compatible with an analysis of operant behavior based on an analogy to processes that presumably occur in naturalistic foraging situations. According to this analogy, behavior associated with search for a food source (i.e., number of response bouts) and that of procurement of food from a source (i.e., responses per bout) represent aspects of behavior that are differentially strengthened by different VI schedules. Extinction serves to reveal this differential strengthening.

It has been suggested that performance in the Skinner box is a useful analogue to aspects of foraging as it occurs in nature (e.g., Baum, 1983; Collier & Rovee-Collier, 1981; Fantino & Abarca, 1985; Kamil & Sargent, 1981; Killeen, Smith, & Hanson, 1981; Lea, 1979, 1981; Mellgren & Olson, 1983; Shettleworth, 1988, 1989). It may be argued that an operant chamber is analogous to a patch, or clump of food, that occurs in a natural environment and the problem the forager/subject faces is primarily one of procurement of the food available in the patch. An operant chamber is a very strange environment, however, because there is only one patch available and the forager is confined to a single location.

Foraging also involves several factors other than procurement of food from a patch. Among these other factors, search for and location of a patch are of obvious importance. Some of the maze procedures used in the laboratories of experimental psychologists may be analogous to searching and locating a patch in the natural environment (Batson, Best, Phillips, Patel, & Gilleland, 1986; Ilersich, Mazmanian, & Roberts, 1988; Mellgren & Olson, 1983; Roberts & Ilersich, 1989). "Search," as defined in the laboratory context, is a behavior that brings the subject into contact with a food source, or patch, and "contact with a patch" means being close enough to the response manipulandum in a Skinner box to be able to press it. Obviously, search involves movement through space, but contact with a patch (procurement) does not require movement through space. Rather, it involves repeated motor actions in contact with (or in close proximity to) the response manipulandum. Therefore, we argue that as long as the rat maintains contact and/or close proximity to the response manipulandum in the operant box it is "procuring." When the rat breaks contact with the manipulandum, it recontacts the lever only when it reinstates a "searching" process.

As long as the schedule of reinforcement remains in force, the subject will tend to stay in contact with the patch and be procuring. Essentially, the schedule of reinforcement dominates and controls the subject's behavior. The domination of the schedule is probably enhanced by the fact that standard schedules used in operant studies maintain a constant payoff over time, but in nature most patches deplete with continued procurement and therefore result in the subject's breaking contact with the patch. The extinction schedule, however, is a condition in which contact with the patch will not be continuous because it represents a patch in which total depletion has occurred. Thus, the extinction schedule is analogous to a patch that has been totally depleted of resources.

In conducting the research described in this report, the investigators adhered to the Guide for the Care and Use of Laboratory Animals, as promulgated by the Committee on Care and Use of Laboratory Animals of the Institute of Laboratory Animal Resources, National Research Council. The views of the authors do not purport to reflect the position of the Department of the Army or the Department of Defense (Para. 4-3, AR 360-5). Address correspondence to either Roger L. Mellgren, Department of Psychology, University of Texas at Arlington, Box 19528, Arlington, TX 76019, or Timothy F. Elsmore, Department of Medical Neurosciences, Walter Reed Army Institute of Research, Washington, DC 20307.

Copyright 1991 Psychonomic Society, Inc.

In laboratory studies of animal learning, extinction has often been used as a method for assessing the nature of learning that has preceded the extinction phase. Hearst (1986), for example, argues that extinction is a useful procedure for revealing those things that were learned during acquisition, but not evident in the behavior of the subject during the acquisition phase. This point has also been made by researchers interested in the partial-reinforcement effect (Amsel, 1967; Capaldi, 1967), a phenomenon that depends on transfer from different acquisition schedules to an extinction schedule for its definition.

Although there is a voluminous literature using extinction performance in runways as the main dependent variable to evaluate the effects of different schedules of reinforcement, very few of the data have come from the operant-box situation (e.g., Robbins, 1971). Exceptions to this generalization involve attempts to use operant methods in a manner analogous to runway studies (e.g., Bitgood & Platt, 1971; Haddad, Walkenbach, & Goeddel, 1980; Overmann & Denny, 1974) or studies done in the operant box in which there was no analogy to runway procedures, but the results were interpreted with respect to theories developed using runway data as their base (e.g., Pavlik & Carlton, 1965; Pavlik & Collier, 1973). In addition, there have been some operant studies by Nevin and his colleagues (Nevin, 1974, 1979; Nevin, Mandell, & Atak, 1983; Nevin, Mandell, & Yarensky, 1981) that were independent of both method and theory developed in the runway, and Nevin (1988) has addressed some of the apparent contradictions between operant and runway data with respect to the partial-reinforcement effect. Other relevant operant studies are primarily of historical interest (e.g., Jenkins, McFann, & Clayton, 1950; Perin, 1942; Skinner, 1938; Williams, 1938).

In the experiments described in this article, extinction of responding following variable interval schedules of food reinforcement was viewed as reflecting processes analogous to those that might occur in a foraging situation. In particular, the processes we hypothesized to be operating are those of search and procurement. "Search" was operationally defined as a period of noncontact with the response manipulandum terminated by contact with the manipulandum. We do not mean that the subject is exclusively engaged in searching behavior when not in contact with the response lever, but that the recontact with the lever after a period of absence of contact reflects the occurrence of searching. "Procurement" was operationally defined as sustained contact with the manipulandum. According to our analysis, the proper method for evaluating the extinction of operant behavior is with respect to these two processes. Timberlake and Lucas (1989) suggest a similar analysis based on the proximity of behavior relative to consumption of food. In their terminology, the "general search mode" is roughly what we call "search" and "focal search/food handling" is what we call "procurement." Timberlake and Lucas suggest that the partial reinforcement effect will be evident for behaviors of the general search mode, but the opposite will occur for focal search/food handling behaviors (1989, p. 269).

Previous research on the effect of relative density of scheduled reinforcers on resistance to extinction is consistent with the conclusion that search (general search) and procurement (focal search) represent different processes. The resistance to extinction shown by subjects in runways increases as percentage of reinforcement decreases (Robbins, 1971). In the operant situation, the subject is already in the potential food source and the most critical process is that of procurement by repeated contact with the response manipulandum. Studies that vary the density of scheduled reinforcers on both interval and ratio schedules in the operant box show that resistance to extinction increases as the density of reinforcement increases: the opposite result of the runway studies (Nevin, 1979).

We take the differences between runway and operant studies to reflect the relative importance in each procedure of search and procurement processes, that is, the components of foraging behavior as it occurs in nature. On the other hand, there are several differences between the two procedures that might also contribute to differences in results. The runway data are based on between-subjects comparisons, but the operant data are based on within-subject comparisons. We consider both types of comparisons in the present studies. The dependent variable in runway studies is the time to respond (or its reciprocal, speed), but rate of response is used in operant studies. The experiments in this report were designed and analyzed in order to test the hypothesis that search and procurement are differentially affected by schedules of reinforcement as measured during extinction, with the objective of reconciling the apparent differences between runway and operant data. Experiment 1 utilizes multiple schedules for within-subject comparisons, and Experiment 2 uses between-subjects schedule comparisons, analogous to the typical runway methodology.

EXPERIMENT 1

Previous work on "response strength" (Nevin, 1979) has documented that following multiple variable interval (mult VI) schedules in which different VIs are associated with different cues, there is greater response strength in extinction in the component of the schedule that formerly was associated with the richer schedule. In order to maintain continuity with this previous work, Experiment 1 used VI 18-, 56-, and 180-sec schedules, with one group of rats receiving a mult VI 18 VI 56 schedule and the other group receiving a mult VI 56 VI 180 schedule. Both groups were then given extinction to both components, mult (ext, ext), with the expectation that more responding would occur in the formerly rich component of the multiple schedule.

Method

Subjects. Twelve experimentally naive male albino Sprague-Dawley rats approximately 90 days old were randomly divided into two groups of 6 each. They were maintained on 12 g of Purina lab chow per day in addition to the approximately 4.5 g of food pellets they earned in the experiment. All rats were fed at the same time, approximately 15 min after the last subject finished the session. Water was available continuously in the home cage, and the rats were weighed daily.

Apparatus. Twelve chambers (Coulbourn Instruments) were used. Each chamber was equipped with two response levers located 6.5 cm from the floor of the chamber and separated by the food-delivery tray. Small pilot lights covered by white glass jewels centered 3.5 cm above each lever served as discriminative stimuli for the multiple schedules. The chambers were housed in sound-attenuating shells, and ventilation and masking noise were provided by exhaust fans mounted on the shells. Control of the apparatus and recording of responses was accomplished by a PDP/8e computer running the SUPERSKED software system (Snapper, Kadden, & Inglis, 1982).

Procedure. The rats were given 12 g of food per day for 8 days prior to the start of the experiment. They were then magazine trained by delivering food pellets on a variable time (VT) 56-sec schedule, with both response levers programmed to deliver pellets on a continuous reinforcement (CRF) schedule. Throughout both experiments, food-pellet delivery was accompanied by a 1-sec illumination of the food-delivery tray. A total of 100 pellets was given on the first session, either via the VT schedule or a combination of the VT schedule and leverpresses. For those subjects that did not press the lever during the first pretraining session, the experimenter shaped leverpressing on the next session. On the third pretraining session, a multiple CRF, CRF schedule was in effect. On this schedule, the light over the left or right lever was randomly chosen and illuminated. A press on the lever under the light resulted in delivery of a food pellet and another random selection of the operative lever. A total of 100 pellets was obtained by the subject in this session.

Half of the subjects received a multiple variable interval 18-sec, variable interval 56-sec schedule (mult VI 18 VI 56), and the other half received a multiple variable interval 56-sec, variable interval 180-sec schedule (mult VI 56 VI 180), for seven sessions. Intervals of the schedules were selected according to the method of Fleshler and Hoffman (1962). Illumination of the light above a lever was the signal that the lever was operative, and left and right lever lights were counterbalanced so that the richer schedule was on the left for half the subjects and on the right for the other half. Presses on the unlighted lever side were recorded, but had no programmed consequences. Each component was presented for 90 sec, after which the computer randomly sampled to determine which component would occur next. All lights were off and responses had no effect for 1 sec between components. To equate the number of pellets delivered on each lever, the ratio of sampling a particular component was 3 to 1 in favor of the less rich component (the VI 56 for the mult VI 18 VI 56 group and the VI 180 for the other group). For the group on the mult VI 18 VI 56 schedule, a total of 42 components was presented, and for the group on the mult VI 56 VI 180 schedule, a total of 132 components was presented. This arrangement would result in the subjects receiving approximately 50 pellets per component in each session. In actuality, the subjects earned somewhat fewer pellets than they theoretically might have earned (usually around 43 to 47 per component).

After seven sessions of acquisition training, extinction testing was done. The extinction session began with 8 presentations of reinforced components, 4 of the richer and 4 of the leaner, followed by 80 extinction presentations, 40 of each component, with the pellet dispenser turned off. Components lasted for 90 sec, and their order was randomly determined with the restriction that exactly 40 presentations of each component occurred in the session.

Results

The degree of discriminative control exhibited by the multiple schedule was evaluated in two ways. One index of discriminative control is the proportion of total responses on the correct lever. On Day 1 of acquisition, the mult VI 18 VI 56 group showed an accuracy level of 95.4 on the VI 18 component and 85.4 on the VI 56 component, and they maintained this level through the course of acquisition, ending with 96.8 and 82.0, respectively, on the seventh session. Discriminative control remained high in extinction at 94.2 and 90.8, respectively. The mult VI 56 VI 180 group did show improvement on this index, starting at 87.7 and 77.8 for the VI 56 and VI 180 components, and ending at 94.7 and 96.4, respectively, on the seventh session. Again, discriminative control remained high in extinction, 94.8 and 99.4, respectively.

Another index of discriminative performance is the relative response rate in each component of the schedule, that is, the degree to which the components of the multiple schedules controlled different rates of response. These data showed that the main change in responding across the course of acquisition was an increase in rate in the VI 18 component for the mult VI 18 VI 56 condition, whereas responding remained constant in the other components. Comparison of the VI 56 component between the two groups showed no difference in response rate, that is, a failure to observe a "nonspecific contrast effect" (cf. Dunham, 1968).

The density of schedule varied both within subjects and between subjects. We refer to the rich component as the VI 18 component for the mult VI 18 VI 56 condition and the VI 56 component for the mult VI 56 VI 180 condition. The two conditions, mult VI 18 VI 56 and mult VI 56 VI 180, differed in the overall richness of schedules (the combined rate of pellet delivery provided by the multiple schedule), and the effects of this difference are referred to as "group" effects.

The analysis of response rate in extinction included rich versus lean schedules (within subject) as one variable and mult VI 18 VI 56 versus mult VI 56 VI 180 as the other (between subjects). All statistical analyses that are referred to as significant are at the .05 level of significance or better. As would be expected from Nevin's (1974) previous findings, the richer schedules resulted in higher response rates in extinction than did leaner schedules [F(1,10) = 18.52]. The mult VI 18 VI 56 group had a higher response rate in extinction than did the mult VI 56 VI 180 group [F(1,10) = 6.59]. The interaction of component richness with groups [F(1,10) = 6.43] is accounted for by the relatively bigger difference in rate between the rich and lean components for the mult VI 18 VI 56 group as compared with the mult VI 56 VI 180 group. Of course, this pattern of extinction results is similar to the differences that existed at the end of acquisition as shown in Table 1. Therefore, the extinction rates were transformed by dividing the extinction rate by the rate of response on the last day of acquisition (Anderson, 1963).

The transformed extinction response rates showed no significant difference due to the richness of the component schedules or the group effect, but did show a significant interaction between the two variables [F(1,10) = 9.17]. The transformed extinction rates were much higher in the richer component for the mult VI 18 VI 56 condition, but there were no differences between the components of the mult VI 56 VI 180 condition, as shown in Table 1. The time course of extinction was considered by plotting the transformed extinction rates (using the first block of extinction as the denominator for the transformation) in blocks of 5 and 10 components. The slope of the rates failed to reveal systematic differences between rich and lean components, in contrast to other results (Nevin, 1988).

Table 1
Behavioral Measures for Mult VI 18 VI 56 or Mult VI 56 VI 180

                                               Training Schedule
                                      Mult VI 18 VI 56    Mult VI 56 VI 180
Dependent Variable                     Rich      Lean      Rich      Lean
Acquisition rate (responses per min)   57.5      26.6      22.7       9.9
Extinction rate (responses per min)    15.0       3.7       5.4       2.5
Extinction rate (transformation)        .26       .14       .24       .25
Latency (in seconds)                  46.51     52.92     37.64     38.00
Frequency of nonresponse              14.33     18.17     10.16     11.67

In discrete trial, runway extinction studies, the usual dependent variable is the latency, or time to respond (or its transformation, speed of response). Therefore, we analyzed the time to the first leverpress response after a component light was presented, assuming it to be analogous to latency in the runway. Failure to respond for the 90-sec presentation of one component was given the value of 90. Figure 1 shows the mean latencies (in seconds) across blocks of 10 component presentations.

Figure 1. Mean time to first response following presentation of the cue light starting a component across the blocks of extinction in Experiment 1. Nonresponse components were assigned a value of 90. (Horizontal axis: blocks of 10 extinction components. Legend: mult VI 18 VI 56 and mult VI 56 VI 180 groups; VI values of 18, 56, and 180 sec.)

An analysis of variance showed that the mult VI 18 VI 56 group had significantly longer response latencies than did the mult VI 56 VI 180 group [F(1,10) = 7.53]. Schedule richness and the interaction of groups with schedule richness were not significant sources of variance [F(1,10) = 1.00 and 0.83]. There was a significant effect due to blocks [F(3,30) = 3.87]. Overall, the groups started extinction at the same level and diverged across blocks.

The number of components in which no response occurred showed a pattern that would be expected from the time data. There were more components without a response for the mult VI 18 VI 56 group than for the mult VI 56 VI 180 group [F(1,10) = 5.65], but no difference due to schedule richness or the interaction of groups with schedule richness [F(1,10) = 0.10 and 0.45]. Table 1 shows both the latency averaged across extinction and the nonresponse data.

The conclusion about the effect of the richness, or density, of multiple schedules on the persistence of behavior in extinction is that "it depends." It depends on what measure of behavior is used to assess resistance to extinction and on whether the comparison is made between components but within subjects, or just between subjects. Relative rate of response (adjusted for different rates in acquisition) shows more resistance to extinction for the denser schedule in the mult VI 18 VI 56 group, but no between-groups difference. However, the absolute rate of response is higher following denser schedules both for a within-subject, between-components comparison and for a between-groups comparison. The response time and the tendency to make at least one response in a component suggest the opposite conclusion. There were shorter response times and a higher likelihood of making at least one response for less dense schedules between groups of subjects, supporting the conclusion that the less dense the schedule of reinforcement, the greater the resistance to extinction. Although these conclusions are seemingly contradictory, they are consistent with existing literature when viewed from the perspective of the foraging analogy described in the introduction of this article. This analogy claims that there are two processes controlling operant behavior. One is the tendency to "search," which we operationalize as the tendency to contact physically the response lever after a period away from it, and the other is the tendency to "procure," which we operationalize as the tendency to continue leverpressing once contact with the lever has occurred. The search process is usually isolated in the straight runway experiment, in which the subject is introduced into the situation by the experimenter, allowed to approach a potential food source, and then removed by the experimenter. Generally, the runway data are consistent in showing that the tendency to approach during extinction increases as density of scheduled food decreases. In the present situation, the tendency to approach is measured by the response time following presentation of the cue light. The between-group extinction comparison in this operant-box experiment is therefore consistent with between-groups runway findings in showing a shorter response time (and greater likelihood of response) with lower density of food in acquisition.

The tendency to persist in leverpressing once responding has been initiated is called the procurement process. If only overall number or rate of responses is considered in analyzing the data, then the procurement process tends to predominate and produce results that favor a hypothesis that resistance to extinction is greater following denser schedules of reinforcement, as was found. This finding is true even though we have not factored out the hypothesized "search time" at the beginning of each component.

EXPERIMENT 2

According to the foraging analogy to the operant situation, the use of a multiple schedule facilitates a particular pattern of behavior. We hypothesized that leverpressing is analogous to the procurement of food from a patch. Cessation of leverpressing and moving away from the lever, followed by the reinitiation of contact, is analogous to searching for a food source in the natural situation. In a multiple schedule, the termination of one component and initiation of another are not under the control of the subject, but are under the control of the experimenter, a situation not strictly analogous to the foraging situation. In this experiment, different groups of subjects were given simple experience with one value of a VI schedule followed by extinction. On the basis of the results of Experiment 1, we hypothesized that there would be a direct relationship between the density of the programmed reinforcer in acquisition and the persistence of responding, once responding was initiated, during extinction. It is important to note that it is only once the subject has initiated leverpressing that the number of responses is predicted to be directly related to the scheduled density of reinforcement. The foraging analogy suggests that the tendency to initiate contact with the patch (leverpress) is inversely related to previous schedule density. This hypothesis is consistent with the runway literature showing that persistence of responding increases as density of reinforcement decreases. Therefore, we hypothesize two opposite effects on the persistence of behavior as a function of the density of reinforcement. When measured as the tendency to leverpress (referred to as starting a "bout" of responding), persistence should increase as density decreases. When measured as the tendency to continue leverpressing once a bout has been initiated, persistence should decrease as density decreases.

Method

Subjects. Naive rats like those used in Experiment 1 were used in this experiment. Two replications of 12 rats were run, with 4 subjects per group per replication.

Apparatus. The same chambers and control apparatus were used in this experiment as were used in Experiment 1. Only the right lever was functional, and illumination was provided by the left light.

Procedure. The subjects were given 12 g of food per day for 8 days and then were placed individually in the operant chambers and given food pellets on a VT 56 schedule and CRF for leverpresses until 100 pellets had been received. A second day of these conditions was given to all subjects, and the rats that had not pressed the lever during the first session were given shaping by the experimenter. By the end of the second session, all subjects had learned to press the lever. The subjects were randomly distributed into one of three groups, VI 18, VI 56, or VI 180, and received the appropriate schedule for the next 8 sessions. The computer controlled the duration of a session so that once the subject had earned 100 pellets, the light inside the box went out, the lever became inoperative, and a signal was delivered to the experimenter indicating that a subject was ready to be removed from the apparatus. This procedure resulted in average session durations of 32-38 min for the VI 18 condition, 102-110 min for the VI 56 condition, and 320-330 min for the VI 180 condition. All subjects were given the daily food ration in the home cage approximately 15 min after the last subject in the VI 180 condition had finished.

The extinction session was conducted the day after the eighth acquisition session. It began with the subject's earning 10 pellets on the appropriate schedule and continued for 2 h after the 10th pellet was received. During extinction, each response and its time of occurrence were recorded by the computer for later analysis.

Results

To provide a general picture of extinction, analyses of variance were done on the response rate during extinction and the transformed measure of extinction, that is, extinction rate divided by acquisition rate. The baseline response rate as measured by the last session before extinction varied as a function of schedule [F(2,18) = 22.18], with higher density of available food resulting in higher response rates. Although extinction response rate increased with increasing acquisition schedule density, the differences were not statistically significant [F(2,18) = 2.68] (see Table 2). The derived measure of extinction, response rate in extinction divided by response rate in acquisition baseline, increased significantly as previous schedule of reinforcement decreased in density [F(2,18) = 24.61].

Table 2
Behavioral Measures as a Function of Schedule of Reinforcement

                                Schedule
Dependent Variable      VI 18     VI 56     VI 180
Baseline rate           52.56     27.73       8.46
Extinction rate          9.02      7.17       3.29
Extinction/baseline       .15       .29        .43

The pattern of results shown in Table 2 and the corresponding analyses do not provide a definitive insight into between-group extinction behavior following different values of variable interval schedules. On the one hand, response rate was greater with more densely scheduled reinforcers, but the differences failed to reach statistical significance. The fact that groups differed in their baseline rates at the start of extinction further complicates the interpretation of the data. Anderson (1963), for example, advocates the use of a transformation on extinction data when there are differences in performance levels at the end of acquisition. When this transformation is performed on the present data, a totally different picture emerges compared with the analysis of untransformed extinction response rates. The transformed measure (extinction rate relative to baseline) indicates that the lower density schedules produce greater persistence of responding than do the higher density schedules. The fact that the baseline response rates are so different encourages caution in interpreting differences in extinction based on a transformation of this sort. The observed differences in the transformed measure of extinction may be more a function of the differences in acquisition rates than a revelation of anything about extinction. Therefore, we turn to a more detailed evaluation of the extinction of responding for a clearer picture of the controlling processes.

The foraging analogy of operant behavior suggests that extinction of operant responding involves two processes, procurement and reinstatement of procurement following a pause in responding (search). This analysis suggests that operant responding, as defined by leverpressing, occurs in discrete bouts. The tendency to continue responding within a bout is one measure of the persistence of behavior. The tendency to engage in a bout of responding once leverpressing has been terminated is a different measure of the persistence of behavior. Therefore, the extinction data were analyzed for the occurrence of bouts of responding. The definition of when a bout had terminated was made by measuring the amount of time between successive leverpresses. When a pause (an interresponse time, IRT) between responses n and n + 1 of criterion duration or longer occurred, the computer treated the nth response as a member of bout number x and response n + 1 as a member of bout number x + 1. There is no a priori way of deciding what the criterion length of the IRT should be; therefore, a range of criteria was used to define bouts. Analyzing the data in this way resulted in two dependent variables: number of bouts and number of responses within bouts for each bout criterion.

The less dense the acquisition schedule, the greater the number of bouts. This relationship held regardless of the IRT used to define the bout criterion. The bout-termination criteria were 5, 10, 20, and 40 sec. The analyses of variance showed significant differences due to schedule for all criteria [F(2,18) = 8.96, 20.93, 21.70, and 7.43]. Figure 2 shows the number of response bouts for the different bout-termination criteria.

Figure 2. Mean number of bouts of responding in Experiment 2 for the three variable interval conditions, using four bout-termination criteria. (Horizontal axis: training VI in seconds, 18, 56, and 180. Legend: bout criterion in seconds, 5, 10, 20, and 40.)

The denser the acquisition schedule, the greater the number of responses per bout. This relationship held regardless of the IRT used to define bouts. The analyses of variance showed significant differences due to schedule for all criteria except 40 sec [F(2,18) = 9.23, 6.40, 4.71, and 2.75]. Figure 3 shows responses per bout for the different bout-termination criteria.

Figure 3. Mean number of responses per bout in Experiment 2 for the three variable interval conditions, using four bout-termination criteria. (Horizontal axis: training VI in seconds, 18, 56, and 180. Legend: bout criterion in seconds, 5, 10, 20, and 40.)

The summary of performance of the subjects seen in Figures 2 and 3 is also seen in the data of individual subjects. Figure 4 shows all of the cumulative records from individual subjects in each of the VI groups. Rather than plotting raw responses in which the differences in response rates would make it difficult to compare subjects and VI conditions, the cumulative records are presented as cumulative percentages of total responses in the extinction session. The pattern of pausing and then emitting a long bout of procurement responding in the VI 18 subjects gives a different appearance than the relatively smoother and less steep cumulative records of the VI 180 subjects. The VI 56 schedule was intermediate to the other two.

Figure 4. Cumulative records of the percentage of total responses in extinction for each subject in Experiment 2. The left panels are for the first squad of subjects, and the right panels are for the second, or replication, squad. (Horizontal axis: time in extinction in minutes. Panels are labeled by training schedule.)

Discussion

The results of Experiment 2, when analyzed using standard dependent variables such as response rate, number of responses, or transformed response rate based on acquisition rate, are equivocal. Perhaps the reason there is little published data on extinction following simple acquisition on a schedule is because the results are not definitive when the standard dependent variables are analyzed. Dividing extinction into two components, search and procurement, implies that the appropriate dependent variables are the "nonstandard" ones, bouts of responding and responding within bouts. Evaluating bouts and responses within bouts results in opposite conclusions regarding the resistance to extinction following different density schedules. Interestingly, this pattern of results is consistent with the existing literature if the analogy between search and the runway procedure and between procurement and the leverpressing procedure on multiple schedules is maintained. In runway investigations, increased resistance to extinction is associated with decreas-

the operation of processes that govern foraging behavior in the natural environment. For an omnivore such as the rat, foraging may take many forms. At a minimum, food must be located and, once located, must be harvested or procured. The degree to which laboratory preparations are consistent with the naturally occurring foraging behaviors of the subject may determine the generality of the results (Shettleworth, 1975). It is our contention that the laboratory preparations commonly used to study instrumental learning are closely related to two of the fundamental problems of foraging: search and location of a food source, and harvesting or procurement of food from the source. Runway studies are argued to be analogous to the search process as it occurs in foraging.
Once a food ing density of reinforcement, as was found with respect source has been located, there is often the additional to the number-of-bouts dependent variable in this experi­ problem ofobtaining the available food. "Handling," as ment. In operant investigations using multiple schedules it is often called in the foraging literature, or "procure­ of reinforcement, increased resistance to extinction is as­ ment, " as we call it, has its analogy in the Skinner box sociated with increasing density of reinforcement, as was apparatus. Under most of the standard procedures used found when the number of responses in a bout was the in laboratory studies, the focus is on one or the other of dependent variable. the processes related to foraging. In runway studies (and maze studies in general), when food is found it is immedi­ GENERAL DISCUSSION ately available in a food cup arrd comes packaged in soft pellets so that the rat does not even touch the food with We have attempted to show that resistance to extinc­ its paws in consuming it. In Skinner box studies, the sub­ tion, and more generally, persistence ofbehavior, reflects ject is placed into the box and obtains food by repeatedly 324 MELLGREN AND ELSMORE contacting the response manipulandum in a fixed loca­ ment 2. There was, of course, a major difference between tion. Thus, there is no influence of search processes as Experiments I and 2 since the opportunity to engage in long as the subject is under the control of the procure­ a bout of responding was enabled by the experimenter in ment requirement dictated by the schedule of reinforce­ the multiple-schedule procedure of Experiment 1, but was ment and remains in contact with the manipulandum. under the subject's control in Experiment 2. 
The within­ The argument that runways focus upon search processes subjects difference in total responses in each component while Skinner boxes focus upon procurement processes of the multiple schedule comes about because of the does not mean that each type of apparatus must necessar­ "procurement" difference between schedules. Once ily exclude the other process. The animal comes equipped responding begins in the denser component, more of it to engage in both search and procurement functions since occurs there than in the less dense schedule. When both they are both parts of the feeding behavior system (Tim­ components are presented an equal number of times in berlake and Lucas, 1989), and when the experimental sit­ extinction, the "procurement responding" results in a uation is structured in a way that requires both processes, greater total number ofresponses in the denser schedule. the animal is capable of engaging in both. For example, Despite these differences in total responses, there were Mellgren and Olson (1983) had rats run down a runway no differences in initiation of responding within groups to a "goal box" partially filled with sand. The rats had between the components of the multiple schedules. This to dig in the sand to find buried food. The running to the pattern of results lends further support to the hypothesis sand box clearly fits the analogy to search, and the dig­ that search and procurement represent separate processes ging for food clearly fits the analogy to procurement. In that are not similarly affected by a given outcome and il­ extinction, the persistence of running to the sand box was lustrates the usefulness of this analysis in understanding greater if food had been found only half the time during appetitive behavior. acquisition compared with all the time. Digging, on the other hand, was more persistent if food had been present REFERENCES on every trial in acquisition than if it had been present AMSEL, A. (1967). 
Partial reinforcement effects on vigor and persis­ on only half the trials. These results are consistent with tence. In K. W. Spence & J. T. Spence (Eds.), The psychology of those reported in this article in showing that resistance learning and motivation (Vol. I, pp. 1-65). New York: Academic to extinction may depend on the function ofthe behavior, Press. that is, search or procurement. ANDERSON, N. H. (1963). Comparison of different populations: Another important dimension for understanding extinc­ Resistance to extinction and transfer. Psychological Review, 70, 162-179. tion effects is the nature of the comparison (between­ BATSON, J. D., BEST, M. R., I'HILUPS,D. L., PATEL, H., '" GILULAND, subject or within-subject paradigms) and the dependent K. R. (1986). Foraging on the radial-arm maze: Effects of altering variable (time to respond or number of responses) used the reward at a target location. Animal Learning & Behavior, 14, to assess relative resistance to extinction. Response la­ 241-248. BAUM, W. M. (1983). Studying foraging in the psychologicallabora­ tency (or its transformation, response speed) seems rela­ tory. In R. L. Mellgren (Ed.), Animalcognition andbehavior (pp. 253­ tively insensitive when measured using a within-subject 283). Amsterdam: North-Holland. dependent variable, but relatively sensitive when used in BITGOOD, S. C., '" PLATT, J. R. (1971). A discrete-trials PREE in an conjunction with a between-subjects dependent variable. operant situation. Psychonomic Science, 23, 17-19. In Experiment I, the time-to-respond data showed no dif­ CAPALDI, E. J. (1967). A sequential hypothesis of instrumental learn­ ing. In K. W. Spence & J. T. Spence (Eds.), The psychology oflearn­ ferences between the VI 18 and VI56 components in the ing and motivation (Vol. I, pp. 67-157). New York: Academic Press. mult VI 18 VI56 group and no differences between the COLUER, G. H., ",ROVEE-COLUER, C. K. (1981). 
A comparative anaI­ VI56 and VI 180 components in the mult VI56 VI180 ysis of optimal foraging behavior: Laboratory simulation. In A. C. group. This insensitivity in the time-to-first-response mea­ Kamil & T. D. Sargent (Eds.), Foraging behavior: Ecological, etho­ logical, and psychological approaches (pp. 39-76). New York: sure occurred despite clear discrimination between the Garland. components in the number of responses made to each. In­ DUNHAM, P. J. (1968). Contrasted conditions of reinforcement: A selec­ terestingly, the between-subjects comparison for the same tive critique. Psychological Bulletin, 69, 295-315. measure was significant. The mult VIl8 VI56 group, FANTINO, E., '" ABARCA, N. (1985). Choice, optimal foraging, and the although making many more total responses in extinction, delay-reduction hypothesis. Behavioral & Brain Sciences, 8, 315-362. FLESHLER, M., '" HOFFMAN, H. S. (1962). A progression for generat­ was significantly slower to make the first response in ing variable interval schedules. Journal ofthe Experimental Analysis either component than the mult VI56 VIl80 group. Of of Behavior, 5, 529-530. course, the usual dependent variable in runway studies HADDAD, N. F., WALKENBACH, J., '" GoEDDEL, P. S. (1980). Sequen­ is time to run from the start box to the goal box, and the tial effects on rats' lever-pressing and pigeons' key-pecking. Ameri­ can Journal ofPsychology, 93, 41-51. usual comparison is a between-subjects one. It is interest­ HEARST. E. (1986). Extinction reveals stimulus control: Latent learn­ ing that the parallel dependent variable and experimental ing of feature-negative discriminations in pigeons. Journal of Ex­ design produced parallel results between the runway and perimental Psychology: Animal Behavior Processes, 13, 52-64. the Skinner box in this instance. IURSICH, T. J., MAZMANIAN, D. S., 4 ROBERTS, W. A. (1988). 
Forag­ The time-to-respond data from Experiment 1 are con­ ing for covered and uncovered food on a radial maze. Animal Learn­ ing & Behavior, 16, 388-394. sistent with the idea that lower density VI s should result JENKINS, W.O., McFANN, H., 4 CLAYTON, F. L. (1950). A methodo­ in more bouts of responding during extinction in Experi- logical study of extinction following aperiodic and continuous rein- EXTINCTION OF OPERANT BEHAVIOR 325

forcernent. Journal ofComparative &: Physiological Psychology, 43, PERIN,C. T. (1942). Behavior potential as a joint function ofthe amount 155-167. of training and degree of hunger at the time of extinction. Journal KAMIL, A. C., .t SARGENT, T. D. (Eds.) (1981). Foraging behavior: of Experimental Psychology, 30, 93-113. Ecological, ethological and psychological approaches. New York: ROBBINS, D. (1971). Partial reinforcement: A selective review of the Garland. alleyway literature since 1960. Psychological Bulletin, 76,415-431. KILLEEN, P. R., SMITH, J. P., &: HANSON, S. J. (1981). Central place ROBERTS, W. A., &: lLERSICH, T. J. (1989). Foraging on the radial maze: foraging in Rattus norvegicus. Animal Behavior, 29, 64-70. The role of travel time, food accessibility, and the predictability of LEA, S. E. G. (1979). Foraging and reinforcement schedules in the food location. Journal ofExperimental Psychology: Animal Behavior pigeon: Optimal and nonoptima1 aspects of choice. Animal Behavior, Processes, IS, 274-285. 27, 875-886. SHETTLEWORTH, S. J. (1975). Reinforcement and the organization of LEA, S. E. G. (1981). Correlation and contiguity in foraging behavior. behavior in golden hamsters: Hunger, environment, and food rein­ In P, Harzem & M. D. Zeiler (Eds.), Advances in analysis ofbe­ forcement. Journal of Experimental Psychology: Animal Behavior havior: Vol. 2. Predictability, correlation, and contiguity (pp. 355­ Processes, 104, 56-87. 406). Chichester, England: Wiley. SHETTLEWORTH, S. J. (1988). Foraging as operant behavior and oper­ MELLGREN, R. L., &: OLSON, M. W. (1983). Mazes, Skinner boxes, ant behavior as foraging: What have we learned? In G. Bower (Ed.), and feeding behavior. In R. L. Mellgren (Ed.), Animal cognition and The psychology oflearning and motivation: Advances in research and behavior (pp. 223-252). Amsterdam: Nonh-Holland. theory (Vol. 22, pp. 1-32). New York: Academic Press. NEVIN, J. A. (1974). Response strength in multiple schedules. 
Journal SHETTLEWORTH, S. J. (1989). Animals foraging in the lab: Problems ofthe Experimental Analysis of Behavior, 39, 49-59. and promises. Journal ofExperimental Psychology: Animal Behavior NEVIN, J. A. (1979). Reinforcement schedules and response strength. Processes, IS, 81-87. In M. D. Zeiler & P. Harzem (Eds.), Reinforcement and the organi­ SKINNER, B. F. (1938). The behavior oforganisms: An experimental zation of behavior (pp. 117-158). New York: Wiley. analysis. New York: Appleton-Century-Crofts. NEVIN, J. A. (1988). Behavioral momentum and the partial reinforce­ SNAPPER, A. G., KADDEN, R. M., &: INGLIS, G. B. (1982). State nota­ ment effect. Psychological Bulletin, 103, 44-56. tion of behavioral procedures. Behavioral Research Methods &: In­ NEVIN, J. A., MANDELL, C., &: ATAK, J. R. (1983). The analysis of strumentation, 14, 329-342. behavioral momentum. Journal ofthe Experimental Analysis ofBe­ TIMBERLAKE, W., &: LUCAS, G. A. (1989). Behavior systems and learn­ havior, 39, 49-59. ing: From misbehavior to general principles. In S. Klein & R. R. NEVIN, J. A .. MANDELL, C.; &: YARENSKY, P. (1981). Response rate Mowrer (Eds.), Contemporary learning theories: Instrumental con­ and resistance 10 change in chained schedules. Journal ofExperimental ditioning theory and the impact ofbiological constraints on learning Psychology: Animal Behavior Processes, 7, 278-294. (pp. 237-275). Hillsdale, NJ: Erlbaum. OVERMANN, S. R., &: DENNY, M. R. (1974). The free-operant partial WILLIAMS, S. B. (1938). Resistance to extinction as a function of the reinforcement effect: A discrimination analysis. Learning &: Motiva­ number of . Journal ofExperimental Psychology, 23, tion, 5, 248-257. 506-522. PAVLIK, W. B.,.t CARLTON, P. C. (1965). A reversed partia1reinforce­ ment effect. Journal of Experimental Psychology, 70, 417-425. PAVLIK, W. B., &: COLLIER, A. C. (1973). Reinforcer magnitude ef­ fects on a within-subjects reversed PRE. 
Bulletin ofthe Psychonomic (Manuscript received November 2, 1990; Society, 2, 233-234. revision accepted for publication May 30, 1991.)
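As an illustration only, the bout analysis described in the Results of Experiment 2 can be sketched in a few lines of Python. This is not the authors' program: the function names (`split_into_bouts`, `bout_summary`), the input format (a sorted list of leverpress times in seconds), and the example data are all hypothetical. The sketch applies the rule stated in the text, that an IRT of criterion duration or longer between responses n and n + 1 ends one bout and starts the next, and it reports the two dependent variables (number of bouts and responses per bout) for each of the four criteria.

```python
from typing import Dict, List, Sequence


def split_into_bouts(press_times: Sequence[float], criterion: float) -> List[List[float]]:
    """Group leverpress times (sec, ascending) into bouts.

    An interresponse time (IRT) of `criterion` seconds or longer between
    two successive presses closes the current bout and opens a new one.
    """
    bouts: List[List[float]] = []
    for t in press_times:
        if bouts and t - bouts[-1][-1] < criterion:
            bouts[-1].append(t)   # pause shorter than criterion: same bout
        else:
            bouts.append([t])     # first press, or pause >= criterion: new bout
    return bouts


def bout_summary(
    press_times: Sequence[float],
    criteria: Sequence[float] = (5, 10, 20, 40),
) -> Dict[float, Dict[str, float]]:
    """Number of bouts and mean responses per bout for each IRT criterion."""
    summary: Dict[float, Dict[str, float]] = {}
    for criterion in criteria:
        n_bouts = len(split_into_bouts(press_times, criterion))
        per_bout = len(press_times) / n_bouts if n_bouts else 0.0
        summary[criterion] = {"n_bouts": n_bouts, "responses_per_bout": per_bout}
    return summary


# Hypothetical press times (sec) for one subject; not data from the article.
example_times = [0.0, 1.0, 2.5, 9.0, 10.0, 31.0, 32.0, 33.0, 80.0]
for criterion, stats in bout_summary(example_times).items():
    print(criterion, stats)
```

Because a longer criterion can only merge bouts, the number of bouts never increases and the responses per bout never decrease as the criterion grows, which is one reason to report results across a range of criteria rather than a single value.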