Burrhus Frederic Skinner (1904 - 1990)
Chapter 5
Burrhus Frederic Skinner
1. Born Mar. 20, 1904, Susquehanna, Pennsylvania. 2. Earned his PhD from Harvard (1931). 3. Wanted to become a writer but, disappointed to find he had nothing to write about, became a great psychologist instead.
Burrhus Frederic Skinner
4. Wrote The Behavior of Organisms (1938) and Walden Two (1948), after Thoreau's Walden. 5. Taught at the University of Minnesota (1936-45). 6. Chair at Indiana University (1945-48). 7. Returned to Harvard (1948-90).
Burrhus Frederic Skinner
8. Beyond Freedom and Dignity (1971). 9. About Behaviorism (1976). 10. Upon Further Reflection (1987). 11. Continued to publish to the end of his life in journals like the Analysis of Behavior (1989).
Burrhus Frederic Skinner
12. Great contributions to learning and education. 13. Contributions to child development. 14. Project ORCON (ORganic CONtrol), in which pigeons were trained to guide missiles. 15. Died in 1990.
Comparison
Operant Conditioning:
- Skinnerian or operant conditioning
- Type R conditioning
- Reinforcing stimulus is contingent upon a response
- S - R - S (Food)

Respondent Conditioning:
- Classical, Pavlovian, or respondent conditioning
- Type S conditioning
- Reinforcing stimulus is contingent upon a stimulus
- S - S (Food) - R
Comparison Continued
Operant Conditioning:
- Responses are emitted to a known reinforcer.
- Conditioning strength = rate of response.

Respondent Conditioning:
- Responses are elicited by a known stimulus.
- Conditioning strength = response magnitude.
Theoretical Differences
Functionalists (Edward Thorndike, Burrhus Skinner): concentrated on responses as they brought about consequences. S - R - S

Associationists (Ivan Pavlov, Edwin Guthrie): concentrated on stimuli as they brought about responses. S - S - R
Radical Behaviorism
1. Behavior cannot be explained on the basis of drive, motivation, and purpose; all of these take psychology back to its mentalistic nature. 2. Behavior has to be explained on the basis of consequences (reinforcements, punishments) and environmental factors. This, Skinner proposed, was the backbone of all scientific psychology.
Principles of Operant Learning
1. We need to know what is reinforcing for the organism. How can we find a reinforcer? It is a process of selection, which can be difficult; reinforcers related to bodily conditions, like food and water, are easy to determine. 2. This reinforcement will predict the response. 3. Reinforcement increases the rate of responding.
Operant Chambers
Skinner devised operant chambers for rats and pigeons to study behavior in a controlled environment. Operant chambers provide opportunities to control reinforcements and other stimuli.
Magazine Training
1. At the beginning of this training the rat is deprived of food for 23 hours (a deprivation procedure) and placed in the operant chamber. 2. The experimenter presses a handheld switch, which makes a clicking sound (secondary reinforcer), and a food pellet (primary reinforcer) drops into the food magazine. 3. The rat learns to associate the clicking sound with the food pellet.
Magazine Training
4. To train the rat to come to the food magazine and eat, the experimenter presses the switch when the rat is near the magazine. After a few trials the rat associates the clicking sound with the arrival of food and stays close to the magazine to eat.
(Figure: the operant chamber, showing the lever, the food pellet, and the food magazine.)
Shaping
1. To train the rat to press the lever and get food, the experimenter shapes the rat's behavior. Shaping involves reinforcing (with the secondary reinforcer) behaviors that approximate the target behavior, i.e., coming closer and closer to the lever and finally pressing it. This procedure is called successive approximation. 2. To shape lever-pressing behavior, differential reinforcement can also be used: only lever-pressing behaviors are reinforced, not others.
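The shaping procedure can be sketched as a small simulation. Everything here is hypothetical (the behavior names, initial strengths, and update rule are illustrative, not from Skinner): each reinforced approximation gains strength, while earlier, no-longer-reinforced approximations extinguish as the criterion narrows toward lever pressing.

```python
def run_stage(strengths, reinforced, trials=150, inc=0.2, decay=0.99):
    """One stage of shaping (expected-value model with made-up numbers).

    Responding is distributed in proportion to current strengths.
    Behaviors in the reinforced set gain strength in proportion to how
    often they occur; all others weaken slightly, modeling extinction
    of the earlier, no-longer-reinforced approximations."""
    for _ in range(trials):
        total = sum(strengths.values())
        for b in strengths:
            if b in reinforced:
                strengths[b] += inc * strengths[b] / total
            else:
                strengths[b] *= decay

# initial response strengths in the chamber (hypothetical values)
strengths = {"rear": 1.0, "sniff": 1.0, "approach": 1.0, "touch": 0.2, "press": 0.1}

# successive approximations: the reinforced criterion narrows toward pressing
for stage in [{"approach", "touch", "press"}, {"touch", "press"}, {"press"}]:
    run_stage(strengths, stage)

print(max(strengths, key=strengths.get))  # lever pressing ends up dominant
```

The decay term matters: without extinguishing the earlier approximations, the first reinforced behavior (approaching) would stay dominant and pressing would never take over.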
Cumulative Recording
(Figure: a cumulative recorder. The paper moves at a constant speed, giving the time axis; the pen steps up once per response, so the record rises from the operant level one step per response.)
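The recorder's logic can be sketched in code. The press times below are hypothetical; the function just counts responses up to each moment, which is exactly what the constant-speed paper and stepping pen do mechanically.

```python
def cumulative_record(press_times, duration, step=1.0):
    """Return (time, cumulative responses) points for a cumulative record.

    On a cumulative recorder the paper moves at constant speed (x axis =
    time) and the pen steps up once per response, so a steep trace means
    rapid responding and a flat trace means no responding."""
    points = []
    t = 0.0
    while t <= duration:
        count = sum(1 for p in press_times if p <= t)  # responses so far
        points.append((t, count))
        t += step
    return points

# hypothetical lever presses: slow at first, then rapid
presses = [2.0, 5.5, 8.0, 9.0, 9.5, 10.0, 10.4, 10.8]
record = cumulative_record(presses, duration=12.0)
print(record[-1])  # (12.0, 8): all eight responses accumulated
```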
Responding Rate
(Figure: cumulative response traces. A rapid rate of responding produces a steep trace; a slow rate of responding produces a shallow trace.)
Cumulative Responses: Sniffy
(Figure: cumulative records of responses from the Sniffy simulation, in blocks of 75 responses.)
Extinction
S (Lever) - R (Lever-pressing response) - S (Food)

Remove the reinforcement (food) and the lever-pressing behavior is extinguished.
Extinction
(Figure: cumulative record during extinction. With food removed, responding declines back toward the operant level.)
Spontaneous Recovery
Just as in classical conditioning, we have spontaneous recovery in operant conditioning: a rest period after extinction reinstates the lever-pressing response in the animal.

(Figure: cumulative responses across trials, showing extinction, a period of rest, and spontaneous recovery of responding.)
Discrimination Learning
The organism can be conditioned to discriminate between two or more stimuli. A discriminative operant is a response that is emitted specifically to one stimulus (SD) but not the other (SΔ).
Discriminative Stimulus   Response            Reinforcement
Light 'ON' (SD)           Press lever         Food
Light 'OFF' (SΔ)          Lever not pressed   No Food
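The SD/SΔ contingency can be sketched as a toy learning rule (the update rule and numbers are hypothetical, not from Skinner): the tendency to press, tracked separately under light-on and light-off conditions, moves toward the outcome each condition delivers.

```python
def train_discrimination(trials=100, lr=0.1):
    """Sketch of discrimination training: pressing under SD (light on)
    is reinforced, pressing under SDelta (light off) is not. A simple
    error-correcting update (hypothetical) moves each press tendency
    toward the outcome it earns."""
    p = {"SD": 0.5, "SDelta": 0.5}   # initial press tendencies
    for i in range(trials):
        stim = "SD" if i % 2 == 0 else "SDelta"     # alternate conditions
        reinforced = 1.0 if stim == "SD" else 0.0   # food only under SD
        p[stim] += lr * (reinforced - p[stim])      # move toward outcome
    return p

p = train_discrimination()
print(round(p["SD"], 3), round(p["SDelta"], 3))  # high under SD, near zero under SΔ
```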
Secondary Reinforcement
“Any neutral stimulus paired with a primary reinforcer (e.g., food or water) takes on reinforcing properties of its own” (Hergenhahn and Olson, 2001) and is called a secondary reinforcer. Thus, all discriminative stimuli are secondary reinforcers.
Generalized Reinforcers
1. A secondary reinforcer becomes a generalized reinforcer when it is paired with a number of primary reinforcers. Money is thus a generalized reinforcer, for it is associated with primary reinforcers like food, drink, and mates. 2. Secondary reinforcement is similar to Allport's (1961) idea of functional autonomy: first there is activity for reinforcement, but then the activity by itself becomes reinforcing, e.g., one joins the merchant navy for money but comes to enjoy sailing for its own sake.
Chaining
A discriminative stimulus (SD) initiates a response whose outcome is a secondary reinforcer (SR) that serves as the discriminative stimulus (SD) for the next response, and so on, until the final response is followed by primary reinforcement.
SD (many stimuli) -> R (orients) -> SR/SD (sight of lever) -> R (approaches) -> SR/SD (contact with lever) -> R (presses bar) -> SR (food pellet)

Similar to Guthrie's movement-produced stimuli.
Reinforcement & Punishment
If a response is followed by a reinforcer, the response increases. However, if it is followed by a punisher, the response decreases.
Reinforcement
Reinforcer  Contingency  Example                              Behavior
Primary     Positive     Doing work, getting food             Work increases
Secondary   Positive     Studying books, getting good grades  Studying increases
Primary     Negative     Heater proximity avoids cold         Heater proximity increases
Secondary   Negative     Waking early, avoiding traffic       Waking early increases
Punishment
Punisher   Contingency  Example                            Behavior
Primary    Positive     Work with electricity, get shock   Work with electricity decreases
Secondary  Positive     Insult boss, get reprimanded       Insulting boss decreases
Primary    Negative     Quarrelsome behavior, lose food    Quarrelsome behavior decreases
Secondary  Negative     Coming home late, no going out     Coming late decreases
Consequences & Contingencies
Consequence     Positive contingency   Negative contingency
Reinforcement   Behavior increases     Behavior increases
Punishment      Behavior decreases     Behavior decreases
Like Thorndike, Skinner believed that positive reinforcement strengthened behavior but punishment did not weaken behavior.
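The tables above reduce to a single rule: the consequence type, not the positive/negative contingency, sets the direction of behavior change. As a tiny function (a sketch, with the example labels taken from the tables):

```python
def predicted_effect(consequence, contingency):
    """Predict the direction of behavior change from the contingency table.

    consequence: "reinforcement" or "punishment"
    contingency: "positive" (stimulus presented) or
                 "negative" (stimulus removed/avoided)
    Reinforcement always increases the behavior; punishment decreases it,
    regardless of whether the contingency is positive or negative."""
    assert consequence in {"reinforcement", "punishment"}
    assert contingency in {"positive", "negative"}
    return "increases" if consequence == "reinforcement" else "decreases"

# examples from the tables above
print(predicted_effect("reinforcement", "negative"))  # waking early to avoid traffic
print(predicted_effect("punishment", "positive"))     # shock after touching wiring
```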
Estes’s Punishment Experiment
(Figure: cumulative responses, 0 to 500, across three extinction sessions for two groups: "no reinforcement + punishment" and "no reinforcement." The punished group responds less at first, but the difference disappears, showing that punishment suppresses rather than removes the behavior.)
Punishment
1. Unwanted emotional byproducts (generalized fears). 2. Conveys no information to the organism. 3. Justifies inflicting pain on others. 4. Unwanted behaviors reappear in its absence. 5. Aggression toward the punishing agent. 6. One unwanted behavior appears in place of another.
Punishment
Why punishment? It reinforces the punisher!
Alternatives to Punishment: 1. Do not reinforce the unwanted behavior. 2. Let the individual engage in the undesirable behavior until he or she is sick of it. 3. Wait for the unwanted behavior to dissolve over the course of development.
Schedule of Reinforcement
A. When a response is always followed by reinforcement, it is called continuous reinforcement. Such a response, after learning, is easy to extinguish. B. When the occurrence of reinforcement is probabilistic, it is termed partial reinforcement, and the response is difficult to extinguish. During partial reinforcement superstitious behaviors arise: the animal behaves peculiarly to get reinforcement even when reinforcement is not being delivered.
Ratio Schedules
1. Reinforcement that occurs after every nth response is called a fixed ratio schedule. For example, when the rat must press the bar 5 times to get food, it is on an FR5 schedule. 2. Reinforcement that occurs after an average of n responses is known as a variable ratio schedule. Sometimes the reinforcement comes after 3 bar presses, at other times after 8, but the average equals 5 bar presses. Abbreviated VR5.
Interval Schedules
3. When reinforcement is given for the first response after a specified interval of time, it is called a fixed interval schedule: the animal's first response after 5 seconds gets food (FI5). 4. When the interval varies around an average, it is called a variable interval schedule. Sometimes the rat gets the food pellet after 3 seconds and sometimes after 8, but the average interval equals 5 seconds (VI5).
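The four basic schedules (FR, VR, FI, VI) can be sketched as small classes, each answering one question per response: is this response reinforced? The class names and the one-response-per-second usage below are illustrative, not from Skinner's apparatus.

```python
import random

class FR:
    """Fixed ratio: reinforce every nth response (e.g. FR5)."""
    def __init__(self, n):
        self.n, self.count = n, 0
    def respond(self, t):
        self.count += 1
        if self.count == self.n:
            self.count = 0
            return True
        return False

class VR:
    """Variable ratio: reinforce after a run of responses averaging n."""
    def __init__(self, n, seed=0):
        self.n, self.rng, self.count = n, random.Random(seed), 0
        self.required = self.rng.randint(1, 2 * n - 1)   # mean run length n
    def respond(self, t):
        self.count += 1
        if self.count >= self.required:
            self.count = 0
            self.required = self.rng.randint(1, 2 * self.n - 1)
            return True
        return False

class FI:
    """Fixed interval: reinforce the first response once n time units
    have elapsed since the last reinforcement."""
    def __init__(self, n):
        self.n, self.last = n, 0.0
    def respond(self, t):
        if t - self.last >= self.n:
            self.last = t
            return True
        return False

class VI:
    """Variable interval: like FI, but the interval varies around n."""
    def __init__(self, n, seed=0):
        self.n, self.rng, self.last = n, random.Random(seed), 0.0
        self.wait = self.rng.uniform(0, 2 * n)           # mean interval n
    def respond(self, t):
        if t - self.last >= self.wait:
            self.last = t
            self.wait = self.rng.uniform(0, 2 * self.n)
            return True
        return False

# one bar press per second for 20 s on FR5: food on presses 5, 10, 15, 20
fr5 = FR(5)
print([t for t in range(1, 21) if fr5.respond(t)])  # [5, 10, 15, 20]
```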
Schedules of Reinforcement
Different learning curves emerge with different reinforcement schedules; the curves for ratio schedules are steeper than those for interval schedules.

(Figure: the four basic schedules arranged by sequence (fixed vs. variable) and domain (ratio vs. interval): FR, VR, FI, VI.)
Concurrent Schedules
5. Concurrent schedules provide two simultaneous schedules of reinforcement; organisms (pigeons) will distribute their responses according to these schedules (Skinner, 1950).
(Figure: cumulative responses over 30 minutes under concurrent VI5 and VI10 schedules; responding on the VI5 key accumulates faster than on the VI10 key.)
Herrnstein's Matching Law
Herrnstein (1970; 1974) showed with a mathematical equation that relative reinforcement equals relative response (behavior).
The law can be written as:

B1 / (B1 + B2) = R1 / (R1 + R2)

where B1 and B2 are the rates of responding on the red and green keys, and R1 and R2 are the rates of reinforcement earned on them.

(Figure: relative behavior on the red key plotted against relative reinforcement on the red key; the data fall along the diagonal from 0.0 to 1.0.)
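The matching prediction is one line of arithmetic; the reinforcement rates below are hypothetical numbers chosen only to illustrate it.

```python
def matching(r1, r2):
    """Herrnstein's matching law: relative responding on key 1 equals
    the relative reinforcement earned on key 1,
        B1 / (B1 + B2) = R1 / (R1 + R2)."""
    return r1 / (r1 + r2)

# hypothetical rates: red key earns 40 reinforcers/hour, green key 10,
# so the law predicts 80% of the pecking goes to the red key
print(matching(40, 10))  # 0.8
```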
Simple Choice Behavior
Gratification from rewards can be immediate or delayed, and our simple choice behaviors are dictated by these reinforcements.

Delayed gratification: studying, rewarded with a good grade.
Immediate gratification: going to the movies, rewarded by seeing a movie.
Concurrent Chain Schedule
6a. Concurrent chain schedules produce complex choice behaviors; under one condition pigeons preferred the small, sooner reinforcer (Rachlin & Green, 1972).
Light -> delay of 2 sec -> 2 sec of grain
Light -> delay of 6 sec -> 6 sec of grain
(4 seconds difference in delay)
Concurrent Chain Schedule
6b. And under the other condition, pigeons preferred the large, delayed reinforcer (Rachlin & Green, 1972).
Light -> delay of 20 seconds -> 2 sec of grain
Light -> delay of 24 seconds -> 6 sec of grain
(4 seconds difference in delay)
Complex Choice Behavior
Thus organisms (human and animal) behave differently toward different rewards. Selection of rewards in a complex choice situation is based on a combination of reward magnitude (how large or small they are) and reward delay (the length of time to reach them).
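One standard way to capture this combination of magnitude and delay is hyperbolic discounting; the model and the numbers below are illustrative (they are not the values Rachlin and Green used), but the sketch reproduces their preference reversal: small-sooner wins near the rewards, large-later wins when a common delay is added.

```python
def value(amount, delay, k=1.0):
    """Present value of a delayed reward under hyperbolic discounting,
    V = A / (1 + k*D). k is a hypothetical discounting-rate parameter."""
    return amount / (1 + k * delay)

# small-sooner (2 units after 1 s) vs large-later (5 units after 6 s)
print(value(2, 1) > value(5, 6))    # True: near the rewards, small-sooner wins
# add a common 10 s delay to both options
print(value(2, 11) < value(5, 16))  # True: far away, large-later wins
```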
Progressive Ratio Schedule
7a. The progressive ratio schedule provides a tool to measure the efficacy of a reinforcer. To determine whether one reinforcer is more effective than another, the progressive ratio schedule requires the organism to indicate, in behavioral terms, the maximum it will "pay" for a particular reinforcer.
Progressive Ratio Schedule
7b. The organism is trained on a fixed ratio schedule, say FR2, and receives, say, 5 pellets of food. The schedule is then increased to FR4, so the animal now makes 4 responses before it gets the 5 pellets; then to FR8, and so on. There comes a point, for some schedule (FR64), at which the animal is no longer willing to respond for the reinforcement.
Progressive Ratio Schedule
7c. We can compare two reinforcers (food and water) and determine the schedule at which the animal's responding breaks down for each, thus comparing their efficacy. Here, responding for food breaks down before responding for water.

(Figure: mean log reinforcement rate against log FR schedule, FR1 through FR512, for reinforcement A (food) and reinforcement B (water); the food curve falls off at a lower ratio.)
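The doubling procedure above can be sketched as follows; the "maximum presses" values are hypothetical stand-ins for how hard the animal will work for each reinforcer.

```python
def break_point(max_presses_willing, start=2):
    """Find the breaking point on a progressive ratio schedule.

    The response requirement doubles each step (FR2, FR4, FR8, ...).
    max_presses_willing is the most responses the animal will emit for
    one delivery of this reinforcer (a stand-in for its efficacy).
    Returns the first requirement the animal fails to complete."""
    fr = start
    while fr <= max_presses_willing:
        fr *= 2               # requirement met: double it
    return fr                 # first schedule the animal gives up on

# hypothetical efficacies: the animal will work harder for water than food
print(break_point(40))   # food  -> gives up at FR64
print(break_point(200))  # water -> gives up at FR256
```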
Verbal Behavior
Language (verbal behavior) is a behavior like any other, and largely consists of speaking, listening, writing, and reading behaviors. These behaviors are governed by antecedent conditions (stimuli) and consequences (reinforcements).
Types of Verbal Behavior
1. Mand (from demand or command): a listening or talking behavior. The individual (child) behaves appropriately to the command given by another (adult) and is reinforced; the child may also request (demand) something to relieve a need. The adult says, "Look (mand), I have a toy for you." The child looks (behavior) and is reinforced with the toy (reinforcement).
Types of Verbal Behavior
2. Echoic behavior: a talking behavior. A word or a sentence is repeated verbatim; the echo can be aloud or silent, as in reading. The adult says "cookies" (stimulus), the child echoes the word (behavior) and gets a smile (reinforcement).
Types of Verbal Behavior
3. Tact: a talking behavior. A verbal behavior in which the individual correctly names or identifies (tacts) objects (stimuli), and other individuals reinforce a correct match. For example, the child says "flowers" and the adult says "good."
Types of Verbal Behavior
4. Autoclitic behavior: a talking behavior. This behavior occurs when a question (stimulus) is posed; the answer to the question is followed by reinforcement (praise). Also called intraverbal behavior. For example: "Which mammal lives in the sea?" "A whale!"
ABC of Verbal Behavior
Mand: Antecedent (A): state of deprivation or aversive stimulation. Behavior (B): verbal utterance. Consequence (C): reinforcer that reduces the state of deprivation.

Echoic: Antecedent (A): verbal utterance from another individual. Behavior (B): repetition of what the speaker says. Consequence (C): conditioned reinforcement (praise) from the other person.

Tact: Antecedent (A): stimulus (usually an object) in the environment. Behavior (B): verbal utterance naming or referring to the object. Consequence (C): conditioned reinforcement from the other person.

Autoclitic: Antecedent (A): verbal utterance (often a question) from another person. Behavior (B): verbal response (answer to the question). Consequence (C): verbal feedback or reinforcement.

Based on Skinner (1957)
Programmed Learning
Skinner was interested in applying the theory of learning to education, and therefore introduced teaching machines: electromechanical devices that promoted teaching and learning.
Programmed Learning
1. Teaching machines provide sustained activity. 2. They ensure a point is understood before moving on (small steps). 3. They present the learner with material he is ready for. 4. They help the learner find the right answer. 5. They provide immediate feedback.
Learning Theory & Behavior Technology
1. Skinner did not believe in formulating a theory of learning the way Hull did. 2. Behavior should be explained in terms of stimuli, not physiology. 3. Functional analysis of stimuli and behaviors, not the "why" of behaviors, should be the goal of psychology. 4. We need behavior technology to resolve human problems, but our culture, government, and religion erode the reinforcements for problem-free behaviors.
David Premack
1. Born October 26, 1925, Aberdeen, South Dakota. 2. Started working at the Yerkes Primate Biology Laboratory (1954). 3. Wrote Intelligence in Apes and Man (1976), The Mind of an Ape (1983), and Original Intelligence: The Architecture of the Human Mind (2002).
David Premack
4. Emeritus professor of psychology at the University of Pennsylvania. 5. William James Fellow Award (2005).
Premack Principle
Responses (behaviors) that occur at a higher frequency can be used as reinforcers for responses that occur at a lower frequency. In other words, high-probability behavior (HPB) can be used to reinforce low-probability behavior (LPB). For example, to increase grooming (LPB), eating (HPB) was used as a reinforcer: each time the animal groomed, it was given the opportunity to eat, and its grooming behavior increased.

(Figure: proportions of eating (HPB) and grooming (LPB) in the animal's baseline behavior.)
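The principle turns a baseline observation into a rule for picking reinforcers; the behaviors and proportions below are hypothetical, chosen only to illustrate it.

```python
def premack_reinforcer(baseline_proportions, target):
    """Premack principle: any behavior with a higher baseline probability
    than the target can serve as a reinforcer for the target.

    baseline_proportions: dict of behavior -> free-operant proportion
    target: the low-probability behavior to be increased
    Returns the behaviors usable as reinforcers for the target."""
    p_target = baseline_proportions[target]
    return [b for b, p in baseline_proportions.items()
            if b != target and p > p_target]

# hypothetical free-access baseline for one animal
baseline = {"eating": 0.50, "grooming": 0.15, "climbing": 0.35}
print(premack_reinforcer(baseline, "grooming"))  # ['eating', 'climbing']
```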
Relativity of Reinforcement
To test his theory in humans, Premack took 31 first-graders and gave them a gumball machine and a pinball machine to play with. Based on their activity he was able to classify them as "eaters" or "manipulators."
Phase I

Free access to both the gumball machine and the pinball machine.
Relativity of Reinforcement
Phase II

If the child was an eater, he was only allowed to eat if he played the pinball machine: playing behavior increased. If the child was a manipulator, he was only allowed to play if he ate from the gumball machine: eating behavior increased.
Transituational Nature of Reinforcement
A high-probability behavior like eating will become a low-probability behavior once the animal has eaten. Not only does the probability of the behavior change, but the very nature of the reinforcement changes with time.
Food: rewarding -> neutral -> punishing
Nature of reinforcement over time (Kimble, 1993).
Disequilibrium Hypothesis
Timberlake (1980) suggests that any activity can become a reinforcer if the activity is blocked in some way. If drinking is blocked, a state of disequilibrium is produced in the animal, and drinking can then be used as a reinforcer.
(Figure: a state of disequilibrium. Baseline proportions of eating, drinking, and activity-wheel behavior (30%, 20%, 10%); restricting drinking below its baseline (to 10%) creates the disequilibrium.)
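Timberlake's idea can be sketched by comparing each activity's scheduled access with its free-access baseline; the activities and proportions below are hypothetical.

```python
def disequilibrium(baseline, schedule):
    """Sketch of Timberlake's disequilibrium hypothesis.

    baseline: free-access proportion of time spent on each activity
    schedule: proportion allowed under the current constraint
    An activity pushed below its baseline is in deficit and can serve
    as a reinforcer; one pushed above baseline can serve as a punisher."""
    out = {}
    for act, base in baseline.items():
        allowed = schedule.get(act, base)
        if allowed < base:
            out[act] = "reinforcer (deficit)"
        elif allowed > base:
            out[act] = "punisher (excess)"
        else:
            out[act] = "neutral"
    return out

# hypothetical baselines (free access) vs a schedule that blocks drinking
baseline = {"eating": 0.30, "drinking": 0.20, "wheel": 0.10}
schedule = {"eating": 0.30, "drinking": 0.05, "wheel": 0.10}
status = disequilibrium(baseline, schedule)
print(status)  # drinking falls below baseline, so drinking access can now reinforce
```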
Marian Breland Bailey
1. Born Dec. 2, 1920, in Minneapolis, Minnesota. 2. Became the second PhD student under Skinner; later moved to Hot Springs, Arkansas, and relocated Animal Behavior Enterprises (ABE) there. 3. Studied functional analysis of behavior and taught at Henderson State University. 4. Died Sep. 25, 2001.
Instinctive Drift
When instinctive behavior comes into conflict with conditioned operant behavior, animals show a tendency to drift in the direction of the instinctive behavior. Marian Breland and Keller Breland trained raccoons to put wooden coins in a box (a commercial for a savings bank), but the raccoons had trouble depositing the coins, especially when there were two coins to deposit. The Brelands argued that the raccoons' instinctive behavior of washing (rubbing) food before eating came into conflict with the learned behavior.
Questions
17. Would you use the same reinforcers to manipulate the behavior of both children and adults? If not, what would make the difference? 18. What is the partial reinforcement effect? Briefly describe the ratio and interval reinforcement schedules studied by Skinner. 19. Explain the difference between Premack's and Timberlake's views of reinforcers.