Operant Conditioning AKA Pigeons in Cages
Operant Conditioning: Learning Through Consequences
• What is Operant Conditioning?
– It’s how a dog learns to sit on command
– It’s how you learned to drive a car
– It’s how you learned to make friends
• Operant Conditioning studies voluntary responses, whereas Classical Conditioning studies elicited responses.
Thorndike’s Law of Effect
• 1898: Edward Thorndike studies how animals solve problems.
• Because performance improved only gradually, Thorndike concluded that animals do not gain insight into the problem.
• Instrumental learning: behavior is instrumental in bringing about an outcome
• Law of effect: in a given situation, a response followed by a satisfying consequence will become more likely to occur and a response followed by an annoying consequence will become less likely to occur
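The law of effect can be sketched as a toy update rule. This is only an illustration: the numeric `strength` value, the `step` size, and the linear update are assumptions made for the example, not part of Thorndike's formulation.

```python
def apply_law_of_effect(strength, consequence, step=0.1):
    """Return the updated tendency (0.0 to 1.0) to repeat a response.

    `strength` and `step` are hypothetical quantities chosen for illustration.
    """
    if consequence == "satisfying":
        strength += step   # response becomes more likely
    elif consequence == "annoying":
        strength -= step   # response becomes less likely
    else:
        raise ValueError(f"unknown consequence: {consequence!r}")
    # Clamp so the tendency stays between 0 and 1.
    return min(1.0, max(0.0, strength))


# Three satisfying outcomes in a row strengthen the response.
strength = 0.5
for _ in range(3):
    strength = apply_law_of_effect(strength, "satisfying")
print(round(strength, 2))  # 0.8
```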
B.F. Skinner
• Coined the term Operant Conditioning
– A type of learning in which behavior is influenced by the consequences that follow it
• Known for the “Skinner Box”
B.F. Skinner
Primary Types of Consequences:
• Reinforcement: response is strengthened by an outcome that follows it
– Example A: rat receives food when he presses the lever, increasing the chance that he will press the lever
• Punishment: occurs when a response is weakened by outcomes that follow it
– Example A: rat receives a small electric shock when he presses the lever, decreasing the chance that he will press the lever
B.F. Skinner
• Three-Part Contingency:
– IF antecedent stimuli are present
• IF I say, “Sit!”
– AND behavior is emitted
• AND my dog Sheila sits,
– THEN consequences will occur
• THEN Sheila will receive a treat.
• The consequence of receiving food is contingent upon Sheila’s response of sitting.
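The IF / AND / THEN structure of the three-part contingency maps directly onto a conditional. A minimal sketch, where the function name and the string return values are made up for the Sheila example:

```python
def consequence(antecedent_present: bool, behavior_emitted: bool) -> str:
    """IF the antecedent is present AND the behavior is emitted, THEN the consequence occurs."""
    if antecedent_present and behavior_emitted:
        return "treat"
    return "no treat"


print(consequence(True, True))    # "Sit!" was said and Sheila sat -> treat
print(consequence(True, False))   # "Sit!" was said but Sheila did not sit -> no treat
print(consequence(False, True))   # Sheila sat without the command -> no treat
```

The treat is contingent on the response, but only in the presence of the antecedent.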
Identifying When to Respond
• Discriminative Stimuli signal that a particular response will now produce certain consequences.
– Back to the rat: If pressing the lever dispenses food only when a light is on, the rat will quickly learn to press the lever only when the light is on.
Five Major Operant Processes
PROCESS | BEHAVIOR | CONSEQUENCE | RESULT
REINFORCEMENT: Positive Reinforcement | Response occurs (rat presses lever) | Stimulus is presented (food pellets appear) | Response increases (lever pressing increases)
REINFORCEMENT: Negative Reinforcement | Response occurs (person takes aspirin) | An aversive stimulus is removed (headache pain goes away) | Response increases (increased tendency to take aspirin for headache relief)
OPERANT EXTINCTION | Response occurs (rat presses a lever) | Stimulus that was reinforcing the behavior no longer appears (no food pellets) | Response decreases (lever pressing decreases)
PUNISHMENT: Aversive Punishment (positive punishment) | Response occurs (two siblings fight over a toy) | An aversive stimulus is presented (parents spank children) | Response decreases (fighting decreases)
PUNISHMENT: Response Cost (negative punishment) | Response occurs (two siblings fight over a toy) | A stimulus is removed (no TV for a week) | Response decreases (fighting decreases)
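The five processes differ only in what happens to the stimulus and which way the response moves, so the table can be read as a lookup. A small sketch, in which the labels "presented", "removed", and "withheld" are shorthand invented for this example:

```python
# Keys: (what happens to the stimulus, what happens to the response).
# "withheld" is an assumed label for "the reinforcer no longer appears".
OPERANT_PROCESSES = {
    ("presented", "increases"): "positive reinforcement",
    ("removed", "increases"): "negative reinforcement",
    ("withheld", "decreases"): "operant extinction",
    ("presented", "decreases"): "aversive punishment (positive punishment)",
    ("removed", "decreases"): "response cost (negative punishment)",
}


def classify_operant_process(stimulus_change: str, response_trend: str) -> str:
    """Name the operant process for a stimulus change and a response trend."""
    try:
        return OPERANT_PROCESSES[(stimulus_change, response_trend)]
    except KeyError:
        raise ValueError("combination not covered by the five processes") from None


# Headache goes away after aspirin, and aspirin-taking increases.
print(classify_operant_process("removed", "increases"))  # negative reinforcement
```

Note that "negative" means a stimulus is removed, not that the response is weakened.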
Consequences: Positive Reinforcement
• Occurs when a response is strengthened by the subsequent presentation of a stimulus
– Rat receives food pellets when it presses the lever, and begins to press it more often
– A new employee receives praise for completing a small project quickly, and so completes more of her projects on time
– A volunteer receives donations, praise from her fellow workers, and feelings of pride in helping others, and so can keep going despite adverse conditions
Consequences: Positive Reinforcement
• Primary Reinforcers: stimuli that an organism naturally finds reinforcing because they satisfy a biological need (food, water, etc.)
• Secondary Reinforcers: stimuli that acquire reinforcing properties through their association with primary reinforcers (money, praise)
Consequences: Negative Reinforcement
• A response is strengthened by the subsequent removal of an aversive stimulus
– NOT PUNISHMENT!
– Example A:
• Your alarm clock goes off.
• You hit the ‘snooze’ button.
• Annoying ringing stops.
– Example B:
• Your mom nags you to clean your room.
• You clean your room.
• Mom stops nagging you.
Operant Extinction
• The weakening and eventual disappearance of a response because it is no longer reinforced.
– Often used as an alternative to punishment in attempting to reduce undesirable behaviors.
• Mrs. Adams sought help at a child guidance clinic because her 5-year-old son, Jonathan, delighted in misbehaving. She had tried to reason with him. Then she resorted to yelling. Finally, she began spanking him. Even that did not work.
• Psychologist’s advice: Ignore him when he misbehaves. Pay attention when he acts appropriately. He is actually after attention, and she is awarding it (though it’s not pleasant, it’s still attention).
Consequence: Aversive Punishment
• Or Punishment by application, or positive punishment
• Response is weakened by the subsequent presentation of a stimulus
– Spanking
– Child touching a hot burner
Consequence: Response Cost
• Taking away something that satisfies the individual
• A response is weakened by the subsequent removal of a stimulus
– Time-out (kids or hockey players)
– Speeding ticket fine
Shaping and Chaining
• Shaping: reinforcing successive approximations toward a final response
– Reinforcing performance of the ‘next step’ in a sequence, but not performance of the previous step
• Chaining: developing a sequence of responses by reinforcing each response with the opportunity to perform the next response
– The rat has learned to press a lever for food when the light comes on. We place a bell in his cage. Eventually, he bumps into it and the bell rings, turning on the light. The rat sees the light, presses the lever, and receives food. Over time, the rat will learn to ring the bell because the response is reinforced by the light turning on, which provides the opportunity to press the lever for food.
Generalization and Discrimination
• Operant generalization: an operant response occurs to a new antecedent stimulus or situation that is similar to the original
– Lassie responds to the “sit” command, even when it is given by someone other than her owner.
• Operant discrimination: the operant response will occur to one antecedent stimulus, but not to another
– Upon seeing a police car, you slow down, but you do not do so upon seeing just any car.
Schedules of Reinforcement
• Reinforcement comes in different patterns, referred to as schedules.
• Continuous reinforcement: every response of a particular type is reinforced.
– Every time you put $1.25 into the soda machine, you get a bottle of soda.
• Partial reinforcement: only a portion of the responses of a particular type are reinforced.
– Powerful motivators
Schedules of Reinforcement: Partial Reinforcement
• Fixed-ratio schedules: reinforcement occurs after a fixed number of responses
– FR-3: reinforcement occurs after every third response
• Variable-ratio schedules: reinforcement occurs after a variable number of responses, centered around an average
– The gambler plays a slot machine programmed to pay off on average once every 20 pulls. After 8 pulls, he receives a 10-coin jackpot. After 2 more pulls, he receives a 15-coin jackpot. But, after 30 more attempts, nothing. He’s frustrated, but still stuck in the VR cycle. That next attempt might just pay off... Or the next… or the next….
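The two ratio schedules can be sketched as counters that decide whether each response is reinforced. The class names are hypothetical, and the variable-ratio version assumes the required number of responses is drawn uniformly around the average, which is only one of several ways such a schedule could be programmed.

```python
import random


class FixedRatio:
    """Reinforce every `ratio`-th response (FR-3 when ratio=3)."""

    def __init__(self, ratio):
        self.ratio = ratio
        self.count = 0

    def respond(self):
        self.count += 1
        if self.count == self.ratio:
            self.count = 0
            return True   # reinforcement delivered
        return False


class VariableRatio:
    """Reinforce after a varying number of responses that averages `mean_ratio`."""

    def __init__(self, mean_ratio, seed=None):
        self.mean_ratio = mean_ratio
        self.rng = random.Random(seed)
        self.count = 0
        self.required = self._draw()

    def _draw(self):
        # Assumption: uniform on 1..(2*mean - 1), whose average is mean_ratio.
        return self.rng.randint(1, 2 * self.mean_ratio - 1)

    def respond(self):
        self.count += 1
        if self.count >= self.required:
            self.count = 0
            self.required = self._draw()
            return True
        return False


fr3 = FixedRatio(3)
print([fr3.respond() for _ in range(6)])  # [False, False, True, False, False, True]

slot_machine = VariableRatio(20, seed=0)
payoffs = sum(slot_machine.respond() for _ in range(20000))
print(payoffs)  # roughly 1000, i.e. about one payoff per 20 pulls
```

On the fixed-ratio schedule the payoff is predictable; on the variable-ratio schedule the gambler cannot tell which pull will pay, only the long-run average.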
Schedules of Reinforcement: Partial Reinforcement
• Fixed-Interval Schedule: the first response that occurs after a fixed time interval is reinforced
– After the rat’s lever press is reinforced, he must wait 2 minutes before another response will be reinforced, regardless of how many times he hits the lever.
• Variable-Interval Schedule
– Reinforcement is given for the first response after a variable time interval, centered around an average.
• Pop quizzes, random drug testing, roadside speed traps
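Interval schedules depend on elapsed time rather than on the number of responses. The sketch below passes the current time in seconds to each response; the class names, the choice to start the first interval at time 0, and the uniform draw for the variable interval are all assumptions made for illustration.

```python
import random


class FixedInterval:
    """Reinforce the first response made after `interval_s` seconds have elapsed."""

    def __init__(self, interval_s):
        self.interval_s = interval_s
        self.available_at = interval_s   # assumption: first interval starts at t = 0

    def respond(self, now_s):
        if now_s >= self.available_at:
            # Reinforce, then start the next interval from this response.
            self.available_at = now_s + self.interval_s
            return True
        return False   # extra responses during the interval earn nothing


class VariableInterval:
    """Like FixedInterval, but each wait is drawn around `mean_interval_s`."""

    def __init__(self, mean_interval_s, seed=None):
        self.mean_interval_s = mean_interval_s
        self.rng = random.Random(seed)
        self.available_at = self._draw()

    def _draw(self):
        # Assumption: waits are uniform between 0 and twice the mean.
        return self.rng.uniform(0, 2 * self.mean_interval_s)

    def respond(self, now_s):
        if now_s >= self.available_at:
            self.available_at = now_s + self._draw()
            return True
        return False


fi = FixedInterval(120)   # the 2-minute schedule from the rat example
times = [10, 60, 119, 120, 130, 239, 240, 300]
print([fi.respond(t) for t in times])
# [False, False, False, True, False, False, True, False]
```

However often the lever is pressed during the 2 minutes, only the first press after the interval ends is reinforced.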
Escape and Avoidance Conditioning
• Escape conditioning: the organism learns a response to terminate an aversive stimulus
– Acquired and maintained by negative reinforcement
– If you’re cold, putting on a sweater is negatively reinforced by the desirable consequence that you no longer shiver.
• Avoidance conditioning: the organism learns a response to avoid an aversive condition
– We dress warmly to avoid feeling cold in the first place.
Two-Factor Theory of Avoidance Learning
• Classical and Operant conditioning work together to achieve avoidance learning.
– Carol’s fear of cars is classically conditioned.
– The sight of a car elicits fear and causes her to flee, thereby avoiding riding in cars.
– The avoidance is negatively reinforced by fear reduction, so it remains strong.
– Extinction (without therapy) is difficult because she does not allow herself the opportunity to be in the car without experiencing pain and fear.
– Exposure therapy works by preventing avoidance.
Speaking of Classical Conditioning…
• Classical and Operant Conditioning can operate together… and frequently do.
• Example: A teacher has a talent for making chalk squeak on a chalkboard. Eventually, seeing him raising the chalk to the chalkboard will become a CS that automatically triggers a CR of shivers up the spine. It also becomes a signal for the students to cover their ears (an operant response) which is reinforced by the consequence of reducing the squeaking sound.
– Ta Da! One stimulus has classical and operant functions!
Operant vs. Classical Conditioning
Question | Classical Conditioning | Operant Conditioning
What type of behavior is involved? | Elicited: CR is a reflexive response triggered by a CS | Emitted: response operates on the environment and is under the subject’s control
How does learning occur? | Through CS-UCS pairings, one stimulus is associated with another | Organism’s responses are associated with reinforcing, punishing, or neutral consequences
What is the sequence of events? | The CS occurs before the CR and triggers it | Consequences follow an organism’s response; antecedent stimuli may set the occasion for emitting certain responses