
Operant Conditioning – The Reinforcement Theory of Learning

Updated on Mar 24, 2022

Reviewed by Dr. Nereida Gonzalez-Berrios, MD , Certified Psychiatrist

March 1, 2022


Key Takeaways

The basis of Operant conditioning is the idea that behavior is a learned response.
Behavior becomes strong or weak depending upon the outcome of the behavior.
The motive of the person is to get rewards and avoid punishments.
Operant conditioning is the brainchild of B.F. Skinner.
Reinforcements, punishments, and extinction form the basis of Operant Conditioning.
The theory is also known as instrumental conditioning and is a form of associative learning.

Psychologists describe operant conditioning as a theory of learning. It is a basic method in which good and desirable behavior is rewarded so that it is repeated in the future.

Maladaptive behavior, in contrast, is punished. Punishment weakens the behavior and makes it less likely to be repeated.

This theory speaks about the influence of rewards and punishments on learning.

The theory receives wide-scale praise for its relevance in real-life. It has many uses in social psychology and educational psychology.

In this article, let’s look into the minute aspects of the theory. We will further analyze how one can use it to shape healthy behavior patterns.

Plus, we will also see how to discard unhealthy habits. These habits prevent you from achieving the desired behaviors in life.

Read on……

Table of contents

What is Operant Conditioning?

A brief history of the operant conditioning theory

Classical and operant conditioning – a comparative and analytical study

Differences between classical conditioning and operant conditioning

A detailed discussion on the comparative analysis of classical and operant conditioning

Examples of operant conditioning

Principles of operant conditioning

Types of operant conditioning

Schedules of reinforcement in operant conditioning

Examples of positive reinforcement

Examples of negative reinforcement

Examples of positive punishment

Examples of negative punishment

Components of operant conditioning

Uses of the operant conditioning theory

Criticism of Operant conditioning

Summing up from ‘ThePleasantMind’

What is Operant Conditioning?

SUMMARY
Operant conditioning is a learning method first identified by B.F. Skinner. The theory says that actions and behavior leading to pleasurable consequences become strong. All those leading to adverse outcomes get weak. The theory refers to the role of reinforcement and punishment as a guiding light to learning.

Operant conditioning is a learning process. In this process, behavior leads to a consequence. The nature of the consequence modifies the organism’s tendency to repeat the behavior in the future.

This is also known as instrumental conditioning. Operant behavior is the one that comes out of operant conditioning.

In simpler terms, the operant behavior comes out of the events that follow it. These events can be positive or negative.

An operant behavior that leads to a positive event is likely to repeat itself in the future.

Suppose, a child plays the piano well and everyone applauds her for it. This will motivate her to play it again. This means that learning is getting reinforced with positive feedback.

Let us consider another example. A child watches too much television, and his parents scold him for it. Thus, he will be hesitant to repeat the action.

This is a case of operant behavior followed by a negative event, which makes the behavior less likely to repeat itself.

So, the key to operant conditioning is the immediate reinforcement of a response.

Operant conditioning gets its name for a particular reason. It is because an organism operates on the environment to produce a specific effect.

The organism first does something and is then reinforced by the environment. The reinforcement does not cause the behavior.

But it increases the chances of the behavior repeating itself in the future.

Hence, operant conditioning depends on the following factors –

The consequence that follows the behavior, that is, the reinforcement or reward.
The time lapse between the response and the reinforcement.
The nature of the behavior shown or the response given by the organism.

A brief history of the operant conditioning theory

B.F. Skinner (1904-1990) introduced the concept of operant conditioning. He was an American psychologist and behaviorist. His theory focuses on the observable causes of behavior.

Skinner’s work was largely influenced by Edward L. Thorndike’s Law of Effect (1898). Thorndike constructed “puzzle boxes” inside which he kept cats.

The cats could escape by achieving simple tasks like pushing or pulling a rope.

Thorndike found that initially, the cats took a long time to escape. But with repeated trial and error, the cats learned to escape more quickly.

Thus, the learned behavior to escape quickly from the puzzle box became stronger.

So, he concluded that actions which led to a favorable outcome, i.e., pulling or pushing a rope and thereby escaping, were more likely to be repeated.

Similarly, actions which did not give rise to a favorable response were less likely to be repeated. This was known as Trial and Error Learning.

Skinner modified the Law of Effect. He introduced a new term called reinforcement. According to Skinner, behavior that is reinforced tends to be repeated and becomes stronger.

And behavior that is not reinforced would tend to become weak.

Similar to Thorndike’s puzzle box, Skinner constructed an experimental setup of his own. It is known as the Skinner box or operant conditioning chamber.

Skinner placed rats inside the chamber. These rats had to learn to respond to stimuli by pressing a lever.

Pressing the lever would result in a reward in the form of food, or in the removal of an unpleasant stimulus like a loud noise or an electric shock.

Skinner developed an apparatus called a cumulative recorder to record the total number of responses by the rats.

This device could plot every response as an upward movement of a horizontally moving line. The slope (angle of slant) gave an idea about the rate of response.
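The link between the slope of the record and the response rate can be sketched in a few lines of Python. This is only an illustration: the function names `cumulative_record` and `response_rate` and the sample lever-press times are assumptions made for the example, not data from Skinner's experiments.

```python
def cumulative_record(response_times, duration, step=1):
    """Sample the running total of responses every `step` seconds.

    Returns a list of (time, total_responses_so_far) pairs, which is the
    kind of curve a cumulative recorder would draw.
    """
    times = sorted(response_times)
    record, total, i = [], 0, 0
    for t in range(0, duration + 1, step):
        # Count every response that happened up to and including time t.
        while i < len(times) and times[i] <= t:
            total += 1
            i += 1
        record.append((t, total))
    return record


def response_rate(record, start, end):
    """Slope of the record between two sampled times (responses per second)."""
    totals = dict(record)
    return (totals[end] - totals[start]) / (end - start)


# Hypothetical lever presses (in seconds): slow at first, faster later.
presses = [1, 2, 3, 4, 5, 5.5, 6, 6.5, 7, 7.5]
record = cumulative_record(presses, duration=8)
print(response_rate(record, 0, 4))  # slope of the early, flatter part
print(response_rate(record, 4, 8))  # slope of the later, steeper part
```

A steeper stretch of the record means more responses per second, which is why the slope can be read as the rate of response.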

Skinner discovered that this rate did not depend on any preceding event.

It was in direct contradiction to the earlier theories put forth by John B. Watson and Ivan Pavlov. Classical conditioning was the basis of those theories.

Skinner thus named this behavior that he observed as operant behavior.

He published his findings in his first book, “The Behavior of Organisms”, in 1938. Skinner is regarded as the Father of Operant Conditioning for his contributions to theories of learning.

Skinner later conducted his experiment using pigeons instead of rats. The structure of the Skinner box remained similar. Just the key-pecking mechanism replaced the lever-pressing one.

Classical and operant conditioning – a comparative and analytical study

Before discussing operant conditioning further, we need to know a bit about classical conditioning.

Classical conditioning is a type of unconscious or automatic learning procedure. It creates an association between a conditioned (neutral) stimulus and a response.

One can achieve it through the repeated presentation of the conditioned stimulus with an unconditioned (natural) stimulus that originally elicits that response.

In other words, a neutral stimulus comes to elicit the response that was originally elicited only by the natural stimulus.

This is also referred to as respondent conditioning.

Russian physiologist Ivan Petrovich Pavlov (1849-1936) first proposed the concept of classical conditioning. Hence, he is also known as the Father of Classical Conditioning.

Pavlov’s experiment with dogs in this context is well known.

Pavlov used the sound of a bell (neutral stimulus) in association with exposure to food (unconditioned or natural stimulus) to cause salivation in the dogs (response).

After repetition of the experiment several times, Pavlov observed that the sound of the bell alone would produce salivation in the dogs.

To honor Pavlov’s contributions, classical conditioning is also sometimes referred to as Pavlovian conditioning.

Important terms of classical conditioning theory

Let us now take a look at some of the important terms relevant in the context of classical conditioning.

Unconditioned Stimulus – This refers to a stimulus that gives rise to an unconditional or natural response in an organism.

For example, if we touch a very hot object, we tend to withdraw our hands immediately. In this case, the ‘hot object’ is an unconditioned stimulus.

Neutral Stimulus and Conditioned Stimulus – A neutral stimulus is one that initially does not trigger any response on its own.

However, by repeated exposure or due to some associated event, a neutral stimulus can turn into a conditioned stimulus, which creates a strong response in the organism.

For example, a hen may be a neutral stimulus to a child. But if a hen attacks that child, he will henceforth be afraid of a hen. In that case, the hen becomes a conditioned stimulus for him.

Unconditioned Response – This refers to an automatic or natural response that occurs on its own in the presence of an unconditioned stimulus.

In the previously mentioned example of withdrawing the hand on touching a hot object, the action or response of withdrawing the hand is the unconditioned response.

Conditioned Response – This is the creation of a learned response where no response was present before.

Considering the previously discussed example of the child being afraid of the hen, the fear response in the child is the conditioned response.

Thus, we can see that classical conditioning depends on the following factors –

The intensity of conditioned or neutral stimuli
Nature or type of the unconditioned stimuli, for example, if it’s unpleasant or pleasant
The time gap between the presentations of the two stimuli.

Differences between classical conditioning and operant conditioning

The following comparison table will help us to differentiate between classical conditioning and operant conditioning.

Areas of differences	Classical Conditioning	Operant Conditioning
The theory was given by	Ivan Petrovich Pavlov	Burrhus Frederic Skinner
Another name of the theory	Also known as respondent conditioning	Also known as instrumental conditioning
Animals used during the study	Dogs	Rats and pigeons
Definition in simple terms	It is a form of learning that takes place by forming an association between an unconditioned stimulus and a conditioned stimulus	It is a type of learning in which a response is followed by a consequence. The relationship between the response and its consequence determines what is learned
Area of focus	Single association between stimulus and response	A chain of responses that leads to the desired behavior
Based on	Behavior that is reflexive, automatic, or involuntary	Voluntary behavior
Basis of association	Law of contiguity	Law of effect
Reinforcement	The unconditioned stimulus is presented before the response	Given after the response occurs
Response	Elicited by the stimulus	Emitted by the organism


A detailed discussion on the comparative analysis of classical and operant conditioning

With reference to the above table, the differences between classical conditioning and operant conditioning are now explained briefly below:

Classical conditioning is the brainchild of Ivan Petrovich Pavlov, whereas B.F. Skinner gave the idea of operant conditioning.

Classical conditioning is also called respondent conditioning, since the organism responds to an environmental event. Operant conditioning is also called instrumental conditioning.

Pavlov conducted his experiments on dogs, whereas Skinner used rats and pigeons for his studies.

The theory of classical conditioning states that learning occurs through an association formed between the unconditioned and conditioned stimuli. According to the principle of operant conditioning, the relationship between a response and its consequence (reinforcement or punishment) determines whether that response is repeated in the future.

Classical conditioning focuses on a single stimulus-response association. Operant conditioning, in contrast, can build a chain of responses that leads to the desired behavior, as in shaping.

Classical conditioning is based on involuntary behavior, whereas operant conditioning is based on voluntary behavior.

In classical conditioning, the basis of bonding between stimulus and response lies in the law of contiguity. This law states that an association between two things occurs due to closeness in time and space.

In contrast, in operant conditioning, the Law of Effect is the basis of the bond between a response and its consequence.

This law states that any behavior followed by a pleasant outcome is likely to be repeated. Similarly, any behavior leading to an unpleasant result is likely to become less frequent.

Reinforcement in classical conditioning is the unconditioned stimulus. For example, one gives food before the response occurs.

In the case of operant conditioning, one gives reinforcement after the response.

In classical conditioning, behavior is “elicited.” It means that a stimulus automatically triggers the response.

In operant conditioning, behavior is “emitted.” This means that the organism produces the behavior on its own, without a specific stimulus forcing it.

Examples of operant conditioning

Now take a look at your surrounding environment. You will find several examples of operant conditioning.

Consider the case where a student gets the reward for a hundred percent attendance in school.

Another example can be an employee receiving a bonus for working extra hours outside the scheduled job time.

Now let us look at some other examples of operant conditioning.

Suppose a dog trainer is teaching a dog to do new tricks. He or she may give treats to the dog after it successfully performs a trick. This motivates the dog to learn and perform complex tricks easily.

An instance of this can be dogs that are being trained to participate in a dog show.

If a child talks during class, the teacher might ask him or her to stay back and write a hundred times on the blackboard that they will not talk during class again. The child is thus afraid to talk in class for fear of the consequences.
Suppose a child does not like solving arithmetic sums. His mother tells him that if he studies geography in the afternoon, he will not have to solve sums in the evening.
Many video games operate on the reward system. When a user completes a task or mission, he receives attractive prizes. This motivates him to play more.
On the other hand, a child spends too much time playing video games and his mother scolds him. The child would thus reduce his playing time out of the fear that his mother can scold him again.

Principles of operant conditioning

Being a behavioral psychologist, Skinner was keen to know the effects of reinforcement on learning.

In simple terms, reinforcement is any stimulus that can encourage or strengthen the behavior of a person.

When the behavior is strong enough, the chances of its future occurrence also increase.

The basic principles of B.F. Skinner’s theory are as follows:

Learning occurs as a result of a consequence. It means that if a behavior or response is followed by a favorable stimulus or event, it will increase the chances of its occurrence in the future. On the flip side, if the behavior is followed by punishment or unfavorable consequences, it becomes less likely to be repeated in the future.
Using reinforcements and punishment can shape and alter the behavior.
The consequences of a person’s action can make their behavior strong or weak.
The operant behavior occurs due to an association between behavior and consequence.

Types of operant conditioning

There are broadly three types of operant conditioning. These might result in a change of behavior in an organism. They are as follows –

1. Reinforcement

It is anything within the environment that gives strength to a behavior. In other words, it increases the chance or probability of behavior again in the future.

There can be two types of reinforcement – positive reinforcement and negative reinforcement. Both positive and negative reinforcers are used to shape and modify behavior.

Positive reinforcement

In this type of reinforcement, one adds a positive stimulus to a situation. This increases the chances that a given behavior will occur again.

One can refer to this kind of stimulus as a positive reinforcer.

There are several types of positive reinforcers that we use in our daily lives. These are various kinds of food, money, social approval, physical comfort, etc.

Negative reinforcement

In this case, an unpleasant stimulus is removed from a situation after the behavior occurs. This increases the likelihood of the behavior repeating itself. This kind of stimulus is called a negative reinforcer.

When behavior is negatively reinforced, it becomes stronger because it removes or avoids the unpleasant stimulus.

Negative reinforcers are also present in our surroundings. Examples can be a loud noise, electric shock, hunger pangs, etc.

The effect of both positive and negative reinforcement is the same. They both strengthen behavior.

However, there is a major difference. Positive reinforcement involves the presentation of a beneficial stimulus.

In comparison, negative reinforcement involves the removal of an unpleasant or harsh condition.

2. Punishment

This is defined as the presentation of an unpleasant or harsh stimulus after a behavior. It may also involve the removal of a pleasant one.

Punishment differs from negative reinforcement. A negative reinforcer strengthens a response, while punishment weakens or suppresses it.

However, the effects of punishment are less predictable than those of a reward. The punished behavior is often suppressed rather than forgotten, and it may return once the punishment stops.

Punishment lowers the chances of the behavior occurring in the future. If you punish a child for his/her nail-biting habit, they will remember it.

Later on, the child is less likely to repeat the same behavior.

Punishment can have the following effects–

It suppresses behavior.
It causes a negative feeling.
It spreads the effects of that negative feeling.

Punishment can be of two types –

Positive punishment: In this case, an aversive stimulus is presented after the behavior. For example, a fine levied against a biker for breaking traffic rules.
Negative punishment: Here, a positive stimulus or reinforcer is taken away after the behavior. For example, a child loses screen time for misbehaving.
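The four combinations of reinforcement and punishment can be summed up as a small lookup. The sketch below is only a memory aid: the function name `classify_consequence` and its two string arguments are made up for this illustration.

```python
def classify_consequence(change, stimulus):
    """Name the operant procedure for a given consequence.

    change:   "added" or "removed" (what happens to the stimulus)
    stimulus: "pleasant" or "unpleasant"
    """
    table = {
        ("added", "pleasant"): "positive reinforcement",
        ("removed", "unpleasant"): "negative reinforcement",
        ("added", "unpleasant"): "positive punishment",
        ("removed", "pleasant"): "negative punishment",
    }
    try:
        return table[(change, stimulus)]
    except KeyError:
        raise ValueError(f"unknown combination: {change!r}, {stimulus!r}")


# A child gets a sticker for staying quiet: a pleasant stimulus is added.
print(classify_consequence("added", "pleasant"))
# A child loses screen time for misbehaving: a pleasant stimulus is removed.
print(classify_consequence("removed", "pleasant"))
```

"Positive" and "negative" describe whether a stimulus is added or removed, not whether the outcome is good or bad. Both kinds of reinforcement strengthen behavior, and both kinds of punishment weaken it.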

3. Extinction

It is the process by which a learned behavior weakens and eventually stops.

For example, if a child no longer receives praise or rewards after scoring high grades in a subject, the child may lose the motivation to perform better in the future.

Extinction also refers to a slow weakening of responses when the reward is no longer there. In operant conditioning, extinction occurs when your behavior no longer receives any reward.

For example: If you train a child to do yoga every day in the morning and reward him/her with stickers every time they do it successfully, the behavior increases in strength.

But after some days, you stop giving the reward and it slowly weakens the response. The child no longer feels any kind of motivation to continue with the yoga sessions.
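The rise and fall of the response in the yoga example can be pictured with a toy simulation. The update rule below (move the response strength a fixed fraction of the way toward 1 when rewarded and toward 0 when not) is a simplifying assumption chosen for illustration; it is not taken from this article or from Skinner's work, and the `learning_rate` value is arbitrary.

```python
def simulate_extinction(rewarded_days, unrewarded_days, learning_rate=0.3):
    """Track a hypothetical response strength between 0 and 1, day by day."""
    strength = 0.0
    history = []
    # 1.0 means the behavior was rewarded that day, 0.0 means it was not.
    rewards = [1.0] * rewarded_days + [0.0] * unrewarded_days
    for reward in rewards:
        # Move the strength part of the way toward today's outcome.
        strength += learning_rate * (reward - strength)
        history.append(strength)
    return history


# Ten days with stickers, then ten days without.
history = simulate_extinction(rewarded_days=10, unrewarded_days=10)
print([round(s, 2) for s in history])
```

In this toy model the strength climbs while the stickers keep coming and then fades gradually, rather than dropping to zero at once, after they stop.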

Schedules of reinforcement in operant conditioning

The rate at which one acquires and maintains operant behavior is a function of the schedule of reinforcement employed.

In brief, a reinforcement schedule is a rule. It states the guidelines under which the delivery of reinforcements will take place.

Reinforcement is given after an action or behavior to make that behavior stronger.

One can give reinforcement in two ways: on a continuous schedule or on an intermittent one.

In a continuous schedule, the organism is reinforced after every correct response. This procedure leads to an increase in the response rate. But it is an inefficient use of reinforcement, as it is time-consuming.

On the other hand, intermittent schedules are more efficient. Here the organism is not reinforced for every response, which saves time.

Intermittent schedules are of four types. One can divide them into two broad headings. They are interval schedules and ratio schedules.

Interval-based schedules

Interval-based schedules can be of two types. They are –

Fixed-interval (FI) schedule:

In the FI schedule, reinforcement is given for the first response made after a fixed amount of time has elapsed since the previous reinforcement. Responses made before the interval is over are not reinforced, however many there are.

The FI reinforcement schedule has a particular pattern. Here, the rate of responding gradually increases with time. Then it sharply speeds up near the end of the interval.

This particular pattern of response is known as a fixed-interval scallop. It occurs because the time for reinforcement is fast approaching.

FI schedules also yield low rates of responding immediately after obtaining the reinforcement. This phenomenon is known as a post-reinforcement pause.
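A fixed-interval rule can be written as a short function. The sketch below assumes the common textbook reading of the schedule, in which the first response made once the interval has elapsed is the one that gets reinforced. The name `fixed_interval` and the response times are invented for the example.

```python
def fixed_interval(response_times, interval):
    """Return the times of the responses that earn reinforcement on an FI schedule."""
    reinforced = []
    available_at = interval  # the first reinforcer becomes available after one interval
    for t in sorted(response_times):
        if t >= available_at:
            reinforced.append(t)
            # The clock restarts from the moment reinforcement is delivered.
            available_at = t + interval
    return reinforced


# Hypothetical responses (seconds) on an FI 10-second schedule.
responses = [2, 5, 9, 10, 11, 12, 19, 21, 22, 35]
print(fixed_interval(responses, interval=10))  # [10, 21, 35]
```

Responses at 2, 5, and 9 seconds earn nothing, which fits the idea that responding early in the interval is not worth the effort.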

Variable interval (VI) schedule:

In the VI schedule, reinforcement becomes available after a time interval that varies from one reinforcement to the next. These intervals are irregular, and the organism cannot predict them. Thus, the organism is said to be on a variable-interval schedule.

Whether reinforcement is available under a variable-interval schedule is a function of time alone. The organism must still make an appropriate response after the interval is over to receive the reinforcement.

VI reinforcement helps to achieve steady response rates. These responses are slow to extinguish. This is because the organism cannot precisely predict when the next reinforcement will come.
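A variable-interval rule differs from a fixed one only in that the waiting time changes after every reinforcement. In the sketch below, the names `variable_interval` and `random_intervals`, and the choice of drawing waits uniformly between half and one and a half times the mean, are assumptions made for illustration.

```python
import random


def random_intervals(mean, rng):
    """Yield an endless stream of waiting times that average out to `mean`."""
    while True:
        yield rng.uniform(0.5 * mean, 1.5 * mean)


def variable_interval(response_times, intervals):
    """Return the times of the reinforced responses on a VI schedule.

    `intervals` supplies the waiting time to use after each reinforcement.
    """
    waits = iter(intervals)
    reinforced = []
    available_at = next(waits, None)
    for t in sorted(response_times):
        if available_at is None:  # no more waiting times were supplied
            break
        if t >= available_at:
            reinforced.append(t)
            wait = next(waits, None)
            available_at = None if wait is None else t + wait
    return reinforced


# One response per second; the waits are listed by hand for a predictable demo.
print(variable_interval(range(1, 31), [3, 8, 5, 10]))  # [3, 11, 16, 26]

# The same schedule with randomly drawn waits averaging 10 seconds.
rng = random.Random(0)
print(variable_interval(range(1, 120), random_intervals(10, rng)))
```

Because the organism cannot tell which wait is in force, responding at a steady pace is the only way to collect each reinforcer soon after it becomes available.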

Ratio-based schedules

Ratio-based schedules can also be of two types. These are –

Fixed-ratio (FR) schedule:

In the FR schedule, the reinforcement is given only after the organism emits a predetermined or “fixed” number of responses. The ratio denotes the number of responses required for each reinforcement.

Reinforcement by fixed ratio schedules produces extremely high response rates. The more the organism responds, the more reinforcement it receives.

This schedule is fairly common in everyday life. It exercises considerable control over behavior.
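A fixed-ratio rule is the simplest of the four to write down. The sketch below uses a hypothetical helper, `fixed_ratio`, that numbers the responses from 1 and reports which of them earn a reinforcer.

```python
def fixed_ratio(n_responses, ratio):
    """Return which responses (numbered from 1) are reinforced on an FR schedule."""
    if ratio < 1:
        raise ValueError("ratio must be at least 1")
    # Every `ratio`-th response is reinforced; the rest go unrewarded.
    return [n for n in range(1, n_responses + 1) if n % ratio == 0]


# 22 lever presses on an FR-5 schedule.
print(fixed_ratio(22, ratio=5))  # [5, 10, 15, 20]
```

Since reinforcement depends only on the count of responses, responding faster brings the next reinforcer sooner. With `ratio=1`, every response is reinforced, which is the continuous schedule described earlier.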

Variable-ratio (VR) schedule:

When the number of responses required for reinforcement is varied randomly around some specific average value, the organism is on a variable-ratio (VR) schedule.

The VR schedule produces an extremely high and constant response rate. This happens because the organism does not know precisely when the next reinforcement will be coming.

Extinction of behavior that one acquires on variable ratio schedules usually occurs at a slow pace.
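A variable-ratio rule can be sketched by redrawing the required number of responses after every reinforcement. The function name `variable_ratio` and the choice of drawing the requirement uniformly between 1 and `2 * mean_ratio - 1` (so that it averages `mean_ratio`) are assumptions made for this illustration.

```python
import random


def variable_ratio(n_responses, mean_ratio, rng):
    """Return which responses (numbered from 1) are reinforced on a VR schedule."""
    reinforced = []
    required = rng.randint(1, 2 * mean_ratio - 1)  # averages mean_ratio
    since_last = 0
    for n in range(1, n_responses + 1):
        since_last += 1
        if since_last >= required:
            reinforced.append(n)
            since_last = 0
            # Draw a fresh, unpredictable requirement for the next reinforcer.
            required = rng.randint(1, 2 * mean_ratio - 1)
    return reinforced


rng = random.Random(42)
rewards = variable_ratio(1000, mean_ratio=5, rng=rng)
print(len(rewards))   # close to 1000 / 5 on average
print(rewards[:10])   # the gaps between reinforced responses are irregular
```

Any single response might be the one that pays off, which is one way to picture why responding stays high and steady under this schedule.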

Examples of positive reinforcement

Let us see some of the examples of positive reinforcement that we find in our daily lives.

Giving a teenager a larger monthly allowance for helping with household chores.
A store manager gives a bonus to his employee for finishing work on time.
Giving chocolate to a child for getting up early for school every day.
A teacher gives stickers to her students for maintaining silence in the classroom.
You receive praise from your parents and sports coach for winning the tough game of tennis.
You pat your pet dog every time he picks up the thrown ball in the right manner.

Examples of negative reinforcement

Some of the daily life experiences of negative reinforcement are as follows:

A child is not told to clean the study desk (unpleasant action) if he/she finishes the homework on time (desired behavior).
One teaches a teenager that safe driving (good behavior) will reduce the chances of accidents (unpleasant event).
If you remove trash items from your storeroom (desired behavior) rats will not come inside the house (unpleasant stimulus).
Your boss will not shout at you (unpleasant stimulus) if you report to work on time.
If the child finishes the lunch plate neatly, parents will not throw away his/her toys.

Examples of positive punishment

Some of the easily found scenarios that explain positive punishment in a simple way are as follows:

The teacher assigns more class assignments to the student (punishment) because the child didn’t behave properly with other kids in the class.
Parents shout at their children for disobeying them.
The police issue a fine to a man for his rash and unsafe driving.
An official may charge you extra if you break rules in a pub.

Examples of negative punishment

Let us analyze a few examples of negative punishment to understand the concept better.

Students lose their music classes for being rowdy in the classroom.
Parents may not allow a child to enjoy screen time because he bullied his younger sibling badly.
Parents may stop giving pocket money to a teenager for his/her unruly conduct.
A police officer can seize your driving license if you do not stop reckless driving and speeding on busy streets.

Components of operant conditioning

There are various components of the theory of Operant Conditioning that have many different implications in our daily life.

Behavior modification

This refers to the alteration and changing of behavior in an individual. One can carry it out with the help of certain techniques or therapies.

Here, the reinforcement of good behavior takes place. At the same time, undesired behavior receives punishment. Its primary aim is to reduce maladaptive behavior in children and adults.

Behavior modification therapy includes behavior shaping and token economy. The brief descriptions of both are as follows:

Behavior Shaping

In this method, the first rewards go to gross approximations of the desired behavior. The next rewards are reserved for closer approximations.

Finally, only the desired behavior itself is rewarded. At each step, the organism moves slightly beyond the response that was last reinforced.

This slightly better response then becomes the new minimum standard for reinforcement.

Skinner compared shaping behavior to a sculptor molding a statue from a large lump of clay. The case of language development in a human child can be the best example of shaping behavior.

Children learn to talk slowly by trial and error. They make several mistakes in pronunciation before becoming perfect.

If parents reward the child after small successful attempts, the child is motivated to keep trying, and language skills develop more easily.
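The idea of a rising minimum standard in behavior shaping can be sketched as a loop. In the example below, each attempt gets a hypothetical quality score between 0 and 1, where 1 stands for the desired behavior. The function name `shape`, the scores, and the starting criterion are all assumptions made for illustration.

```python
def shape(attempts, criterion, target=1.0):
    """Reinforce attempts that meet the current criterion, then raise the bar.

    Returns a list of (quality, criterion_at_that_time, reinforced) tuples.
    """
    log = []
    for quality in attempts:
        reinforced = quality >= criterion
        log.append((quality, criterion, reinforced))
        if reinforced:
            # The reinforced attempt becomes the new minimum standard,
            # but the bar never rises above the desired behavior itself.
            criterion = min(quality, target)
    return log


# A rough first try is rewarded; later, only better attempts are.
attempts = [0.2, 0.1, 0.35, 0.3, 0.5, 0.45, 0.8, 1.0]
for quality, bar, reinforced in shape(attempts, criterion=0.15):
    print(quality, bar, "reward" if reinforced else "no reward")
```

An attempt of 0.3 would have earned a reward at the start, but it earns nothing once the bar has moved up to 0.35. This is how shaping nudges behavior toward the target step by step.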

Token economy

It is a system in which the desired behavior is reinforced with tokens. One can exchange these later for rewards.

Tokens can be of many types – fake money, stickers, etc. Rewards range from certain privileges to treats or activities.

Token economies are often used to manage the behavior of patients in hospital settings.
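The bookkeeping behind a token economy is simple enough to sketch as a small class. The class name `TokenEconomy`, the behaviors, the token values, and the reward prices below are all hypothetical and serve only to show the earn-then-exchange cycle.

```python
class TokenEconomy:
    """Keep track of tokens earned for desired behavior and spent on rewards."""

    def __init__(self, token_values, reward_prices):
        self.token_values = dict(token_values)    # behavior -> tokens earned
        self.reward_prices = dict(reward_prices)  # reward -> tokens required
        self.balance = 0

    def record_behavior(self, behavior):
        """Award tokens for a listed behavior; unlisted behavior earns nothing."""
        earned = self.token_values.get(behavior, 0)
        self.balance += earned
        return earned

    def redeem(self, reward):
        """Exchange tokens for a reward. Returns False if the balance is too low."""
        price = self.reward_prices[reward]
        if price > self.balance:
            return False
        self.balance -= price
        return True


economy = TokenEconomy(
    token_values={"made bed": 1, "finished homework": 2},
    reward_prices={"extra story at bedtime": 3},
)
economy.record_behavior("made bed")
economy.record_behavior("finished homework")
print(economy.redeem("extra story at bedtime"))  # True
print(economy.balance)                           # 0
```

The token acts as a bridge between the behavior and the eventual reward, so the reinforcement can be given right away even when the reward itself comes later.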

Uses of the operant conditioning theory

One can use the theory of operant conditioning in education and in behavior training. It is used in the following key areas:

Helping children unlearn bad habits such as nail-biting
Shaping desirable behavior in the classroom
Managing classrooms well
Learning new behavior and unlearning behavior that is not good or desirable
Training pet animals such as dogs, cats, and birds

Criticism of Operant conditioning

Operant conditioning is a behaviorist theory of learning. There were various reasons behind the criticism of this theory. A few of them are as follows:

The theory ignores the role of cognitive processes in learning.
It assumes that only reinforcements and punishments shape behavior. It doesn’t take into account the role of social factors in learning.
Operant conditioning also overlooks the influence of genetics in learning.


Summing up from ‘ThePleasantMind’

So, we see that operant conditioning has several real-world examples. We can explain many events that we witness in our daily lives through this principle.

The basis of this theory lies in the simple idea that when a desired action or behavior pairs up with a reward, it increases the chances of its future occurrence.

The theory has a wide range of uses in daily life. If you’re looking at forming a new habit or changing or building a new behavior pattern, you can use the principles of this theory effectively.


Chandrani Mukherjee

A Psychologist with a master's degree in Psychology, a former school psychologist, and a teacher by profession Chandrani loves to live life simply and happily. She is an avid reader and a keen observer. Writing has always been a passion for her, since her school days. It helps to de-stress and keeps her mentally agile. Pursuing a career in writing was a chance occurrence when she started to pen down her thoughts and experiences for a few childcare and parenting websites. Her lovable niche includes mental health, parenting, childcare, and self-improvement. She is here to share her thoughts and experiences and enrich the lives of few if not many.
