Burrhus Frederic Skinner (1904 - 1990)
Chapter 5
Burrhus Frederic Skinner
1. Born Mar. 20, 1904, Susquehanna, Pennsylvania. 2. Earned his PhD from Harvard (1931). 3. Wanted to become a writer but, disappointed to find he had nothing to write about, became a great psychologist instead.
Burrhus Frederic Skinner
4. Wrote The Behavior of Organisms (1938) and Walden Two (1948), after Thoreau's Walden. 5. Taught at the University of Minnesota (1936-45). 6. Chair at Indiana University (1945-48). 7. Returned to Harvard (1948-90).
Burrhus Frederic Skinner
8. Beyond Freedom and Dignity (1971). 9. About Behaviorism (1976). 10. Upon Further Reflection (1987). 11. Continued to publish to the end of his life in journals like the Analysis of Behavior (1989).
Burrhus Frederic Skinner
12. Great contributions to learning and education. 13. Contributions to child development. 14. Project ORCON (ORganic CONtrol), in which pigeons were trained to guide missiles. 15. Died in 1990.
Comparison
Operant Conditioning:
- Skinnerian or operant conditioning
- Type R conditioning
- Reinforcing stimulus is contingent upon a response
- S - R - S (Food)

Respondent Conditioning:
- Classical, Pavlovian, or respondent conditioning
- Type S conditioning
- Reinforcing stimulus is contingent upon a stimulus
- S - S (Food) - R
Comparison Continued
Operant Conditioning:
- Responses are emitted to a known reinforcer.
- Conditioning strength = rate of response.

Respondent Conditioning:
- Responses are elicited by a known stimulus.
- Conditioning strength = response magnitude.
Theoretical Differences
Functionalists (Edward Thorndike, Burrhus Skinner): concentrated on responses as they brought about consequences. S - R - S

Associationists (Ivan Pavlov, Edwin Guthrie): concentrated on stimuli as they brought about responses. S - S - R
Radical Behaviorism
1. Behavior cannot be explained on the basis of drive, motivation, and purpose; all of these take psychology back to its mentalistic nature. 2. Behavior has to be explained on the basis of consequences (reinforcements, punishments) and environmental factors. This, Skinner proposed, was the backbone of all scientific psychology.
Principles of Operant Learning
1. We need to know what is reinforcing for the organism. How can we find a reinforcer? It is a process of selection, which can be difficult; reinforcers related to bodily conditions, like food and water, are easy to determine. 2. This reinforcement will predict the response. 3. Reinforcement increases the rate of responding.
Operant Chambers
Skinner devised operant chambers for rats and pigeons to study behavior in a controlled environment. Operant chambers provide opportunities to control reinforcements and other stimuli.
Magazine Training
1. At the beginning of this training the rat is deprived of food for 23 hours (a deprivation procedure) and placed in the operant chamber. 2. The experimenter presses a handheld switch, which makes a clicking sound (secondary reinforcer), and a food pellet (primary reinforcer) drops into the food magazine. 3. The rat learns to associate the clicking sound with the food pellet.
Magazine Training
4. To train the rat to come to the food magazine and eat, the experimenter presses the switch when the rat is near the magazine. After a few trials the rat associates the clicking sound with the arrival of food and stays close to the magazine to eat.
(Figure: the operant chamber, showing the lever, the food pellet, and the food magazine.)
Shaping
1. To train the rat to press the lever and get food, the experimenter shapes the rat's behavior. Shaping involves reinforcing (with the secondary reinforcer) behaviors that approximate the target behavior, i.e., coming closer and closer to the lever and finally pressing it. This procedure is called successive approximation. 2. To shape lever-pressing behavior, differential reinforcement can also be used: only lever-pressing behaviors are reinforced, not others.
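The shaping procedure can be sketched as a small simulation. Everything here is hypothetical (the behavior names, initial strengths, and update rule are illustrative, not from Skinner): each reinforced approximation gains strength, while earlier, no-longer-reinforced approximations extinguish as the criterion narrows toward lever pressing.

```python
def run_stage(strengths, reinforced, trials=150, inc=0.2, decay=0.99):
    """One stage of shaping (expected-value model with made-up numbers).

    Responding is distributed in proportion to current strengths.
    Behaviors in the reinforced set gain strength in proportion to how
    often they occur; all others weaken slightly, modeling extinction
    of the earlier, no-longer-reinforced approximations."""
    for _ in range(trials):
        total = sum(strengths.values())
        for b in strengths:
            if b in reinforced:
                strengths[b] += inc * strengths[b] / total
            else:
                strengths[b] *= decay

# initial response strengths in the chamber (hypothetical values)
strengths = {"rear": 1.0, "sniff": 1.0, "approach": 1.0, "touch": 0.2, "press": 0.1}

# successive approximations: the reinforced criterion narrows toward pressing
for stage in [{"approach", "touch", "press"}, {"touch", "press"}, {"press"}]:
    run_stage(strengths, stage)

print(max(strengths, key=strengths.get))  # lever pressing ends up dominant
```

The decay term matters: without extinguishing the earlier approximations, the first reinforced behavior (approaching) would stay dominant and pressing would never take over.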
Cumulative Recording
(Figure: a cumulative recorder. The paper moves at a constant speed, giving the time axis; the pen steps up once per response, so the record rises from the operant level one step per response.)
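The recorder's logic can be sketched in code. The press times below are hypothetical; the function just counts responses up to each moment, which is exactly what the constant-speed paper and stepping pen do mechanically.

```python
def cumulative_record(press_times, duration, step=1.0):
    """Return (time, cumulative responses) points for a cumulative record.

    On a cumulative recorder the paper moves at constant speed (x axis =
    time) and the pen steps up once per response, so a steep trace means
    rapid responding and a flat trace means no responding."""
    points = []
    t = 0.0
    while t <= duration:
        count = sum(1 for p in press_times if p <= t)  # responses so far
        points.append((t, count))
        t += step
    return points

# hypothetical lever presses: slow at first, then rapid
presses = [2.0, 5.5, 8.0, 9.0, 9.5, 10.0, 10.4, 10.8]
record = cumulative_record(presses, duration=12.0)
print(record[-1])  # (12.0, 8): all eight responses accumulated
```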
Responding Rate
(Figure: cumulative response traces. A rapid rate of responding produces a steep trace; a slow rate of responding produces a shallow trace.)
Cumulative Responses: Sniffy
(Figure: cumulative records of responses from the Sniffy simulation, in blocks of 75 responses.)
Extinction
S (Lever) - R (Lever-pressing response) - S (Food)

Remove the reinforcement (food) and the lever-pressing behavior is extinguished.
Extinction
(Figure: cumulative record during extinction. With food removed, responding declines back toward the operant level.)
Spontaneous Recovery
Just as in classical conditioning, we have spontaneous recovery in operant conditioning: a rest period after extinction reinstates the lever-pressing response in the animal.

(Figure: cumulative responses across trials, showing extinction, a period of rest, and spontaneous recovery of responding.)
Discrimination Learning
The organism can be conditioned to discriminate between two or more stimuli. A discriminative operant is a response that is emitted specifically to one stimulus (SD) but not the other (SΔ).
Discriminative Stimulus   Response            Reinforcement
Light 'ON' (SD)           Press lever         Food
Light 'OFF' (SΔ)          Lever not pressed   No Food
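The SD/SΔ contingency can be sketched as a toy learning rule (the update rule and numbers are hypothetical, not from Skinner): the tendency to press, tracked separately under light-on and light-off conditions, moves toward the outcome each condition delivers.

```python
def train_discrimination(trials=100, lr=0.1):
    """Sketch of discrimination training: pressing under SD (light on)
    is reinforced, pressing under SDelta (light off) is not. A simple
    error-correcting update (hypothetical) moves each press tendency
    toward the outcome it earns."""
    p = {"SD": 0.5, "SDelta": 0.5}   # initial press tendencies
    for i in range(trials):
        stim = "SD" if i % 2 == 0 else "SDelta"     # alternate conditions
        reinforced = 1.0 if stim == "SD" else 0.0   # food only under SD
        p[stim] += lr * (reinforced - p[stim])      # move toward outcome
    return p

p = train_discrimination()
print(round(p["SD"], 3), round(p["SDelta"], 3))  # high under SD, near zero under SΔ
```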
Secondary Reinforcement
“Any neutral stimulus paired with a primary reinforcer (e.g., food or water) takes on reinforcing properties of its own” (Hergenhahn and Olson, 2001) and is called a secondary reinforcer. Thus, all discriminative stimuli are secondary reinforcers.
Generalized Reinforcers
1. A secondary reinforcer becomes a generalized reinforcer when it is paired with a number of primary reinforcers. Money is thus a generalized reinforcer, for it is associated with primary reinforcers like food, drink, and mates. 2. Secondary reinforcement is similar to Allport's (1961) idea of functional autonomy: first there is activity for reinforcement, but then the activity by itself becomes reinforcing, e.g., one joins the merchant navy for money but comes to enjoy sailing for its own sake.
Chaining
A discriminative stimulus (SD) initiates a response whose outcome is a secondary reinforcer (SR) that serves as the discriminative stimulus (SD) for the next response, and so on, until the final response is followed by primary reinforcement.
SD (many stimuli) -> R (orients) -> SR/SD (sight of lever) -> R (approaches) -> SR/SD (contact with lever) -> R (presses bar) -> SR (food pellet)

Similar to Guthrie's movement-produced stimuli.
Reinforcement & Punishment
If a response is followed by a reinforcer, the response increases. However, if it is followed by a punisher, the response decreases.
Reinforcement
Reinforcer  Contingency  Example                              Behavior
Primary     Positive     Doing work, getting food             Work increases
Secondary   Positive     Studying books, getting good grades  Studying increases
Primary     Negative     Heater proximity avoids cold         Heater proximity increases
Secondary   Negative     Waking early, avoiding traffic       Waking early increases
Punishment
Punisher   Contingency  Example                            Behavior
Primary    Positive     Work with electricity, get shock   Work with electricity decreases
Secondary  Positive     Insult boss, get reprimanded       Insulting boss decreases
Primary    Negative     Quarrelsome behavior, lose food    Quarrelsome behavior decreases
Secondary  Negative     Coming home late, no going out     Coming late decreases
Consequences & Contingencies
Consequence     Positive contingency   Negative contingency
Reinforcement   Behavior increases     Behavior increases
Punishment      Behavior decreases     Behavior decreases
Like Thorndike, Skinner believed that positive reinforcement strengthened behavior but punishment did not weaken behavior.
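The tables above reduce to a single rule: the consequence type, not the positive/negative contingency, sets the direction of behavior change. As a tiny function (a sketch, with the example labels taken from the tables):

```python
def predicted_effect(consequence, contingency):
    """Predict the direction of behavior change from the contingency table.

    consequence: "reinforcement" or "punishment"
    contingency: "positive" (stimulus presented) or
                 "negative" (stimulus removed/avoided)
    Reinforcement always increases the behavior; punishment decreases it,
    regardless of whether the contingency is positive or negative."""
    assert consequence in {"reinforcement", "punishment"}
    assert contingency in {"positive", "negative"}
    return "increases" if consequence == "reinforcement" else "decreases"

# examples from the tables above
print(predicted_effect("reinforcement", "negative"))  # waking early to avoid traffic
print(predicted_effect("punishment", "positive"))     # shock after touching wiring
```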
Estes’s Punishment Experiment
(Figure: cumulative responses, 0 to 500, across three extinction sessions for two groups: "no reinforcement + punishment" and "no reinforcement." The punished group responds less at first, but the difference disappears, showing that punishment suppresses rather than removes the behavior.)
Punishment
1. Unwanted emotional byproducts (generalized fears). 2. Conveys no information to the organism. 3. Justifies inflicting pain on others. 4. Unwanted behaviors reappear in its absence. 5. Aggression toward the punishing agent. 6. One unwanted behavior appears in place of another.
Punishment
Why punishment? It reinforces the punisher!
Alternatives to Punishment: 1. Do not reinforce the unwanted behavior. 2. Let the individual engage in the undesirable behavior until he or she is sick of it. 3. Wait for the unwanted behavior to dissolve over the course of development.
Schedule of Reinforcement
A. When a response is always followed by reinforcement, it is called continuous reinforcement. Such a response, after learning, is easy to extinguish. B. When the occurrence of reinforcement is probabilistic, it is termed partial reinforcement, and the response is difficult to extinguish. During partial reinforcement superstitious behaviors arise: the animal behaves peculiarly to get reinforcement even when reinforcement is not being delivered.
Ratio Schedules
1. Reinforcement that occurs after every nth response is called a fixed ratio schedule. For example, when the rat must press the bar 5 times to get food, it is on an FR5 schedule. 2. Reinforcement that occurs after an average of n responses is known as a variable ratio schedule. Sometimes the reinforcement comes after 3 bar presses, at other times after 8, but the average equals 5 bar presses. Abbreviated VR5.
Interval Schedules
3. When reinforcement is given for the first response after a specified interval of time, it is called a fixed interval schedule: the animal's first response after 5 seconds gets food (FI5). 4. When the interval varies around an average, it is called a variable interval schedule. Sometimes the rat gets the food pellet after 3 seconds and sometimes after 8, but the average interval equals 5 seconds (VI5).
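The four basic schedules (FR, VR, FI, VI) can be sketched as small classes, each answering one question per response: is this response reinforced? The class names and the one-response-per-second usage below are illustrative, not from Skinner's apparatus.

```python
import random

class FR:
    """Fixed ratio: reinforce every nth response (e.g. FR5)."""
    def __init__(self, n):
        self.n, self.count = n, 0
    def respond(self, t):
        self.count += 1
        if self.count == self.n:
            self.count = 0
            return True
        return False

class VR:
    """Variable ratio: reinforce after a run of responses averaging n."""
    def __init__(self, n, seed=0):
        self.n, self.rng, self.count = n, random.Random(seed), 0
        self.required = self.rng.randint(1, 2 * n - 1)   # mean run length n
    def respond(self, t):
        self.count += 1
        if self.count >= self.required:
            self.count = 0
            self.required = self.rng.randint(1, 2 * self.n - 1)
            return True
        return False

class FI:
    """Fixed interval: reinforce the first response once n time units
    have elapsed since the last reinforcement."""
    def __init__(self, n):
        self.n, self.last = n, 0.0
    def respond(self, t):
        if t - self.last >= self.n:
            self.last = t
            return True
        return False

class VI:
    """Variable interval: like FI, but the interval varies around n."""
    def __init__(self, n, seed=0):
        self.n, self.rng, self.last = n, random.Random(seed), 0.0
        self.wait = self.rng.uniform(0, 2 * n)           # mean interval n
    def respond(self, t):
        if t - self.last >= self.wait:
            self.last = t
            self.wait = self.rng.uniform(0, 2 * self.n)
            return True
        return False

# one bar press per second for 20 s on FR5: food on presses 5, 10, 15, 20
fr5 = FR(5)
print([t for t in range(1, 21) if fr5.respond(t)])  # [5, 10, 15, 20]
```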
Schedules of Reinforcement
Different learning curves emerge with different reinforcement schedules; the curves for ratio schedules are steeper than those for interval schedules.

(Figure: the four basic schedules arranged by sequence (fixed vs. variable) and domain (ratio vs. interval): FR, VR, FI, VI.)
Concurrent Schedules
5. Concurrent schedules provide two simultaneous schedules of reinforcement; organisms (pigeons) will distribute their responses according to these schedules (Skinner, 1950).
(Figure: cumulative responses over 30 minutes under concurrent VI5 and VI10 schedules; responding on the VI5 key accumulates faster than on the VI10 key.)
Herrnstein's Matching Law
Herrnstein (1970; 1974) showed with a mathematical equation that relative reinforcement equals relative response (behavior).
The law can be written as:

B1 / (B1 + B2) = R1 / (R1 + R2)

where B1 and B2 are the rates of responding on the red and green keys, and R1 and R2 are the rates of reinforcement earned on them.

(Figure: relative behavior on the red key plotted against relative reinforcement on the red key; the data fall along the diagonal from 0.0 to 1.0.)
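The matching prediction is one line of arithmetic; the reinforcement rates below are hypothetical numbers chosen only to illustrate it.

```python
def matching(r1, r2):
    """Herrnstein's matching law: relative responding on key 1 equals
    the relative reinforcement earned on key 1,
        B1 / (B1 + B2) = R1 / (R1 + R2)."""
    return r1 / (r1 + r2)

# hypothetical rates: red key earns 40 reinforcers/hour, green key 10,
# so the law predicts 80% of the pecking goes to the red key
print(matching(40, 10))  # 0.8
```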
Simple Choice Behavior
Gratification from rewards can be immediate or delayed, and our simple choice behaviors are dictated by these reinforcements.

Delayed gratification: studying, rewarded with a good grade.
Immediate gratification: going to the movies, rewarded by seeing a movie.
Concurrent Chain Schedule
6a. Concurrent chain schedules produce complex choice behaviors; under one condition pigeons preferred the small, sooner reinforcer (Rachlin & Green, 1972).
Light -> delay of 2 sec -> 2 sec of grain
Light -> delay of 6 sec -> 6 sec of grain
(4 seconds difference in delay)
Concurrent Chain Schedule
6b. And under the other condition, pigeons preferred the large, delayed reinforcer (Rachlin & Green, 1972).
Light -> delay of 20 seconds -> 2 sec of grain
Light -> delay of 24 seconds -> 6 sec of grain
(4 seconds difference in delay)
Complex Choice Behavior
Thus organisms (human and animal) behave differently toward different rewards. Selection of rewards in a complex choice situation is based on a combination of reward magnitude (how large or small they are) and reward delay (the length of time to reach them).
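One standard way to capture this combination of magnitude and delay is hyperbolic discounting; the model and the numbers below are illustrative (they are not the values Rachlin and Green used), but the sketch reproduces their preference reversal: small-sooner wins near the rewards, large-later wins when a common delay is added.

```python
def value(amount, delay, k=1.0):
    """Present value of a delayed reward under hyperbolic discounting,
    V = A / (1 + k*D). k is a hypothetical discounting-rate parameter."""
    return amount / (1 + k * delay)

# small-sooner (2 units after 1 s) vs large-later (5 units after 6 s)
print(value(2, 1) > value(5, 6))    # True: near the rewards, small-sooner wins
# add a common 10 s delay to both options
print(value(2, 11) < value(5, 16))  # True: far away, large-later wins
```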
Progressive Ratio Schedule
7a. The progressive ratio schedule provides a tool to measure the efficacy of a reinforcer. To determine whether one reinforcer is more effective than another, the progressive ratio schedule requires the organism to indicate, in behavioral terms, the maximum it will "pay" for a particular reinforcer.
Progressive Ratio Schedule
7b. The organism is trained on a fixed ratio schedule, say FR2, and receives, say, 5 pellets of food. The schedule is then increased to FR4, so the animal now makes 4 responses before it gets the 5 pellets; then to FR8, and so on. There comes a point, for some schedule (FR64), at which the animal is no longer willing to respond for the reinforcement.
Progressive Ratio Schedule
7c. We can compare two reinforcers (food and water) and determine the schedule at which the animal's responding breaks down for each, thus comparing their efficacy. Here, responding for food breaks down before responding for water.

(Figure: mean log reinforcement rate against log FR schedule, FR1 through FR512, for reinforcement A (food) and reinforcement B (water); the food curve falls off at a lower ratio.)
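The doubling procedure above can be sketched as follows; the "maximum presses" values are hypothetical stand-ins for how hard the animal will work for each reinforcer.

```python
def break_point(max_presses_willing, start=2):
    """Find the breaking point on a progressive ratio schedule.

    The response requirement doubles each step (FR2, FR4, FR8, ...).
    max_presses_willing is the most responses the animal will emit for
    one delivery of this reinforcer (a stand-in for its efficacy).
    Returns the first requirement the animal fails to complete."""
    fr = start
    while fr <= max_presses_willing:
        fr *= 2               # requirement met: double it
    return fr                 # first schedule the animal gives up on

# hypothetical efficacies: the animal will work harder for water than food
print(break_point(40))   # food  -> gives up at FR64
print(break_point(200))  # water -> gives up at FR256
```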
Verbal Behavior
Language (verbal behavior) is a behavior like any other, and largely consists of speaking, listening, writing, and reading behaviors. These behaviors are governed by antecedent conditions (stimuli) and consequences (reinforcements).
Types of Verbal Behavior
1. Mand (from demand or command): a listening or talking behavior. The individual (child) behaves appropriately to the command given by another (adult) and is reinforced; the child may also request (demand) something to relieve a need. The adult says, "Look (mand), I have a toy for you." The child looks (behavior) and is reinforced with the toy (reinforcement).
Types of Verbal Behavior
2. Echoic behavior: a talking behavior. A word or a sentence is repeated verbatim; the echo can be aloud or silent, as in reading. The adult says "cookies" (stimulus), the child echoes the word (behavior) and gets a smile (reinforcement).
Types of Verbal Behavior
3. Tact: a talking behavior. A verbal behavior in which the individual correctly names or identifies (tacts) objects (stimuli), and other individuals reinforce a correct match. For example, the child says "flowers" and the adult says "good."
Types of Verbal Behavior
4. Autoclitic behavior: a talking behavior. This behavior occurs when a question (stimulus) is posed; the answer to the question is followed by reinforcement (praise). Also called intraverbal behavior. For example: "Which mammal lives in the sea?" "A whale!"
ABC of Verbal Behavior
Mand: Antecedent (A): state of deprivation or aversive stimulation. Behavior (B): verbal utterance. Consequence (C): reinforcer that reduces the state of deprivation.

Echoic: Antecedent (A): verbal utterance from another individual. Behavior (B): repetition of what the speaker says. Consequence (C): conditioned reinforcement (praise) from the other person.

Tact: Antecedent (A): stimulus (usually an object) in the environment. Behavior (B): verbal utterance naming or referring to the object. Consequence (C): conditioned reinforcement from the other person.

Autoclitic: Antecedent (A): verbal utterance (often a question) from another person. Behavior (B): verbal response (answer to the question). Consequence (C): verbal feedback or reinforcement.

Based on Skinner (1957)
Programmed Learning
Skinner was interested in applying the theory of learning to education, and therefore introduced teaching machines: electromechanical devices that promoted teaching and learning.
Programmed Learning
1. Teaching machines provide sustained activity. 2. They ensure a point is understood before moving on (small steps). 3. They present the learner with material he is ready for. 4. They help the learner find the right answer. 5. They provide immediate feedback.
Learning Theory & Behavior Technology
1. Skinner did not believe in formulating a theory of learning the way Hull did. 2. Behavior should be explained in terms of stimuli, not physiology. 3. Functional analysis of stimuli and behaviors, not the "why" of behaviors, should be the goal of psychology. 4. We need behavior technology to resolve human problems, but our culture, government, and religion erode the reinforcements for problem-free behaviors.
David Premack
1. Born October 26, 1925, Aberdeen, South Dakota. 2. Started working at the Yerkes Primate Biology Laboratory (1954). 3. Wrote Intelligence in Apes and Man (1976), The Mind of an Ape (1983), and Original Intelligence: The Architecture of the Human Mind (2002).
David Premack
4. Emeritus professor of psychology at the University of Pennsylvania. 5. William James Fellow Award (2005).
Premack Principle
Responses (behaviors) that occur at a higher frequency can be used as reinforcers for responses that occur at a lower frequency. In other words, high-probability behavior (HPB) can be used to reinforce low-probability behavior (LPB). For example, to increase grooming (LPB), eating (HPB) was used as a reinforcer: each time the animal groomed, it was given the opportunity to eat, and its grooming behavior increased.

(Figure: proportions of eating (HPB) and grooming (LPB) in the animal's baseline behavior.)
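The principle turns a baseline observation into a rule for picking reinforcers; the behaviors and proportions below are hypothetical, chosen only to illustrate it.

```python
def premack_reinforcer(baseline_proportions, target):
    """Premack principle: any behavior with a higher baseline probability
    than the target can serve as a reinforcer for the target.

    baseline_proportions: dict of behavior -> free-operant proportion
    target: the low-probability behavior to be increased
    Returns the behaviors usable as reinforcers for the target."""
    p_target = baseline_proportions[target]
    return [b for b, p in baseline_proportions.items()
            if b != target and p > p_target]

# hypothetical free-access baseline for one animal
baseline = {"eating": 0.50, "grooming": 0.15, "climbing": 0.35}
print(premack_reinforcer(baseline, "grooming"))  # ['eating', 'climbing']
```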
Relativity of Reinforcement
To test his theory in humans, Premack took 31 first-graders and gave them a gumball machine and a pinball machine to play with. Based on their activity he was able to classify them as "eaters" or "manipulators."
Phase I

Free access to both the gumball machine and the pinball machine.
Relativity of Reinforcement
Phase II

If the child was an eater, he was only allowed to eat if he played the pinball machine: playing behavior increased. If the child was a manipulator, he was only allowed to play if he ate from the gumball machine: eating behavior increased.
Transituational Nature of Reinforcement
A high-probability behavior like eating will become a low-probability behavior once the animal has eaten. Not only does the probability of the behavior change, but the very nature of the reinforcement changes with time.
Food: rewarding -> neutral -> punishing
Nature of reinforcement over time (Kimble, 1993).
Disequilibrium Hypothesis
Timberlake (1980) suggests that any activity can become a reinforcer if the activity is blocked in some way. If drinking is blocked, a state of disequilibrium is produced in the animal, and drinking can then be used as a reinforcer.
(Figure: a state of disequilibrium. Baseline proportions of eating, drinking, and activity-wheel behavior (30%, 20%, 10%); restricting drinking below its baseline (to 10%) creates the disequilibrium.)
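Timberlake's idea can be sketched by comparing each activity's scheduled access with its free-access baseline; the activities and proportions below are hypothetical.

```python
def disequilibrium(baseline, schedule):
    """Sketch of Timberlake's disequilibrium hypothesis.

    baseline: free-access proportion of time spent on each activity
    schedule: proportion allowed under the current constraint
    An activity pushed below its baseline is in deficit and can serve
    as a reinforcer; one pushed above baseline can serve as a punisher."""
    out = {}
    for act, base in baseline.items():
        allowed = schedule.get(act, base)
        if allowed < base:
            out[act] = "reinforcer (deficit)"
        elif allowed > base:
            out[act] = "punisher (excess)"
        else:
            out[act] = "neutral"
    return out

# hypothetical baselines (free access) vs a schedule that blocks drinking
baseline = {"eating": 0.30, "drinking": 0.20, "wheel": 0.10}
schedule = {"eating": 0.30, "drinking": 0.05, "wheel": 0.10}
status = disequilibrium(baseline, schedule)
print(status)  # drinking falls below baseline, so drinking access can now reinforce
```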
Marian Breland Bailey
1. Born Dec. 2, 1920, in Minneapolis, Minnesota. 2. Became the second PhD student under Skinner; later moved to Hot Springs, Arkansas, and relocated Animal Behavior Enterprises (ABE) there. 3. Studied functional analysis of behavior and taught at Henderson State University. 4. Died Sep. 25, 2001.
Instinctive Drift
When instinctive behavior comes into conflict with conditioned operant behavior, animals show a tendency to drift in the direction of the instinctive behavior. Marian Breland and Keller Breland trained raccoons to put wooden coins in a box (a commercial for a savings bank), but the raccoons had trouble depositing the coins, especially when there were two coins to deposit. The Brelands argued that the raccoons' instinctive behavior of washing (rubbing) food before eating came into conflict with the learned behavior.
Questions
17. Would you use the same reinforcers to manipulate the behavior of both children and adults? If not, what would make the difference? 18. What is the partial reinforcement effect? Briefly describe the ratio and interval reinforcement schedules studied by Skinner. 19. Explain the difference between Premack's and Timberlake's views of reinforcers.