Primary VS Secondary Reinforcers
Measuring The Reinforcement Value of Reinforcement
[Warning: This post is more advanced than most of the content on this website. Don’t feel bad if you don’t understand everything. People who have read my ebook or worked in the animal training field will have an easier time comprehending and discussing this topic.]
Recently I came across a paper I wrote when I first started at SeaWorld. The paper is a proposal to conduct some “in-house studies” to determine how reinforcing the dolphins found different reinforcers. I wanted to present this paper today to start a discussion on animal reinforcement. That is all. This was an early-stage research proposal, and clearly there are items that would need to be addressed before an actual study could begin.
What do dolphins find most reinforcing?
I was inspired to answer this question because, up until this point, I had always assumed that for the most part, animals found primary reinforcement most reinforcing (primary reinforcement is anything that has intrinsic value to the animal – in this case, fish). It wasn’t until I started my career with SeaWorld Orlando at Whale and Dolphin Stadium that I was told that the animals actually found secondary reinforcers more reinforcing (secondary reinforcers are stimuli that have been conditioned to be reinforcing, for example, a basketball or ice).
Of course, animals are individuals and we can’t speak for all of them. However, I was curious to see if these animals had actually been conditioned to find secondary reinforcers more reinforcing than primary.
So, of my own accord, I created the proposal below. Honestly, I don’t know what version this proposal is, but the general concept is still there. Although the research was never executed, I thought I would share it with all of you. Maybe it would inspire some constructive discussion!
What do you think? Can an animal be conditioned to find a secondary reinforcer more reinforcing than primary? Would my “experiment” even have worked? How can we tell what animals find reinforcing? Leave your responses and thoughts below. I am going to be engaging in this discussion as much as possible so bring your questions, comments and critiques!
Primary Versus Secondary Reinforcement
The training theories and applications used by the Animal Training Department at Whale and Dolphin Stadium of SeaWorld Orlando hold that secondary reinforcers are more reinforcing to animals than primary reinforcers. In order to examine the validity of this statement I am proposing a series of trials to determine how much more reinforcing, if at all, secondary reinforcers are than primary reinforcers.
In order to fully explain the content of the proposed trial and understand the benefits from its findings we must first define what we mean when we discuss secondary reinforcers, primary reinforcers, and reinforcement in general, as applied to marine animal training.
The International Marine Animal Trainers’ Association (IMATA) defines a reinforcer as “any consequence of a response that increases the frequency of that response.” We have identified two subgroups of reinforcers that are utilized to increase desired behavior: secondary reinforcers and primary reinforcers.
According to IMATA, a secondary reinforcer is “a reinforcer that has acquired reinforcing value through learning by being paired with events that are already reinforcing.” This is also known as a conditioned reinforcer where “a stimulus becomes a reinforcer because it is paired with another reinforcer, usually a primary reinforcer.” The primary reinforcer is “an unconditioned reinforcer” or “anything of intrinsic value to an organism” – in our case, fish, water, air, etc. Our most commonly used primary reinforcer is fish.
In the Whale and Dolphin department we have identified the following three categories as secondary reinforcers that are more reinforcing than primary reinforcers:
Toys (buoys, balls, etc.)
Tactile (rub downs, water hose, etc.)
Consumables other than fish (jello, ice, etc.)
While it is agreed that other stimuli, such as whistle bridges, underwater sound tones, presence of a trainer, clapping, asking for behavior, and other environmental changes, are also technically secondary reinforcers, they are not, in general, more reinforcing than primary reinforcers because they have not been conditioned to be.
Having identified what primary and secondary reinforcers are, the question arises: how does one reinforcer gain more reinforcing value than another?
The thought process behind the Whale and Dolphin method of training is as follows.
Initially, primary reinforcers are the most reinforcing stimuli to an animal because they have intrinsic value to the animal and are necessary for basic survival. However, when properly conditioned, an animal can find a secondary reinforcer more reinforcing than primary.
For example, after the subject completes a behavior at a higher criteria than normal, a trainer at Whale and Dolphin should first issue a secondary reinforcer. For the case of this example, let’s assume it is a rub down. Following the rub down, the trainer must then issue a large amount of primary reinforcers in order to reinforce the rub down and the behavior that preceded it. Through time, this “magnitude” will change the balance of reinforcing value, giving the secondary reinforcer a higher reinforcing value. Theoretically, the subject will associate the rub down as a precursor to a “magnitude” and over time the secondary reinforcer will gain a higher reinforcing value because of what it represents (the precursor to a large sum of primary). Therefore, when the subject exceeds criteria or achieves highly desired behavior, the trainer should reinforce with a secondary and then, if desired, primary.
Conversely, when an animal achieves low criteria or requires multiple attempts to perform a desired behavior, a trainer should avoid using secondaries and give a small amount of primary instead. Due to proper conditioning, secondary reinforcers, over time, will obtain a higher value than primary and should be used for highly desired behavior, while average to low criteria (but correct) behavior should be reinforced only with primary so as not to overly increase the frequency of poor criteria or multiple attempts.
For example, the subject is asked to do a front flip and the animal refuses to perform the desired behavior. After the LRS, the trainer asks a second time, the animal succeeds and is reinforced with a small amount of primary.
In another scenario, the subject is asked to do a front flip. The subject does so immediately and at a height and precision above its normal criteria. The trainer should then reinforce with a secondary, such as ice, a buoy, or rub down and then may, if desired, reinforce with primary.
The higher reinforcing value of secondary reinforcers is also due to the novelty of the secondary stimulus. Each dolphin is offered a fixed amount of primary each day during a variety of sessions and shows, and therefore it becomes an expected “norm.” Secondary reinforcers, however, are used more sporadically (oftentimes as a “change”) and therefore are novel to the animals.
While this illustrates our training thought process, does it illustrate our training practices? In reality, do our animals actually find secondary reinforcers more reinforcing than primary reinforcers?
The goal of these trials will be to obtain information that will help our team align our training thought processes and our training practices so that they reflect the best environment for increasing desired behavior from our collection.
Proposed Trial: To determine the reinforcement value of secondary and primary reinforcers.
The results from the trial should answer the following seven questions.
Which secondary reinforcers does the animal find most reinforcing?
Which secondary reinforcers does the animal find least reinforcing?
Which primary reinforcers and/or amount of primary reinforcement does the animal find most reinforcing?
Which primary reinforcers and/or amount of primary reinforcement does the animal find least reinforcing?
Comparing low and high secondary reinforcers with low and high primary reinforcers, which reinforcers does the animal find most and least reinforcing?
Do these findings support the Whale and Dolphin training thought process?
If not, what corrections (if any) would align our training practices and thought process?
The proposed trial is divided into three phases, preceded by a research, due diligence, and planning component and followed by a findings and conclusions component:
Research, Due Diligence & Planning
Phase One: Value of Secondary Reinforcers
Phase Two: Value of Primary Reinforcers
Phase Three: Value Comparison of Secondary and Primary Reinforcers.
Findings and Conclusions
Research, Due Diligence & Planning
The team will consist of one to two Trial Coordinators and one Advisor. The Trial Coordinator(s) will be responsible for carrying out the trials and recording the data. They will also be in charge of analyzing the data and drawing conclusions from it. The Advisor will assist with the trials, offering advice and insight in order to make the trials as accurate as possible. The Advisor will also aid in the final report, making sure the conclusions drawn are supported by evidence from the trials.
Criteria of Subject & Reinforcers
I propose utilizing one dolphin for the entire set of trials. This will allow consistency in our findings and avoid analyzing an array of different preferences from different animals.
To be considered, the subject should have as little history of secondary and primary discrimination as possible, be available for trials at varied times during the day, week, and month, and be able to be worked by both Trial Coordinator(s) and Advisor.
Once the dolphin is determined, the secondary and primary reinforcers to be used must be determined. The subject’s history with these stimuli will be utilized, as well as trainer knowledge on the animal’s assumed preferences.
Trial Overview and Execution
In order to determine a “value” of a stimulus we must create an environment that allows the animal to choose one stimulus over another. For example, by offering two secondary reinforcers in one pool, each of equal distance to the subject animal, and free of other major reinforcers and stimuli, we assume the reinforcer that is “chosen” first by the subject animal is the reinforcer the animal finds more reinforcing and therefore has a higher reinforcement value. In short, when given the option of a basketball and a buoy, which one does the animal play with first? If it chooses the basketball – the basketball has the higher reinforcement value. If given the option of a herring or ice, which reinforcer does the subject animal choose? If it chooses the herring – the herring has the higher reinforcement value. This methodology will be the basis for these trials.
Trials should be executed at varied times of day, week, and even month. On the days trials are performed, social, behavioral, and medical occurrences before and after the trials should be noted. Trial findings in Phase One and Two will aid in determining trial setup in Phase Three. For example, if Phase One shows buoys are the subject’s most reinforcing secondary reinforcer and Phase Two shows that one capelin is the subject’s least reinforcing primary reinforcer, then the trial to compare the reinforcing value of the buoy and the capelin can be executed before the subject’s first feed of the day and/or after a large feed. Determining food motivation of the subject before each trial will be key.
There will be no set timeline for the trials because this is not a behavioral priority for the animal, there is no rush to determine these findings, and the accuracy of the trials is not dependent on any specific timeline.
It is understood that the results from this trial will be specific to one animal, specific to the actual scenario of the trial and not be an absolute representation of the entire Whale and Dolphin collection. It is also understood that each individual animal and scenario dictate what stimuli are most and least reinforcing. For example, some animals will always find a basketball more reinforcing than eight pounds of herring, while other animals will always find one capelin more reinforcing than a rub down with a trainer.
It is also known that these trials simply integrate the placement of reinforcers, not the application of reinforcers. Meaning, a ball by itself in a pool (placement) may be more or less reinforcing than playing fetch with the same ball with a trainer (application).
There is also the possibility that the subject is intelligent enough to take advantage of both secondary and primary reinforcers during the trial by choosing the primary first. For example, a ball and a herring are placed in the pool. The animal consumes the herring first and then plays with the ball as long as it wants or until it is offered additional primary for the return of the secondary. An ideal trial would immediately take away the reinforcer not chosen first, and over time the subject would learn that whatever is not chosen first will no longer be an option. This would give us a more accurate conclusion on the values of reinforcers. While this type of trial would be more comprehensive, it is more difficult to execute and could also lead to frustration and aggression in the subject. It is recommended a trial like this not be attempted in order to avoid these obstacles.
Even with these discrepancies, the information obtained will still be of use to the training staff at Whale and Dolphin. Overwhelming percentages favoring primary or secondary reinforcers will be a good indicator of where the value of reinforcement lies for the subject. The training staff can then utilize the more highly valued reinforcers appropriately or discuss the best practices to increase desired behavior through the use of reinforcers. If the percentages of favoritism are similar, then a conclusion could be drawn that specific scenarios play a more crucial role than realized or that the balance of value among our reinforcers needs to be modified or utilized more efficiently. All conclusions will include not only the results of the trials but also any notable behavior from the subject concerning the trials.
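As a rough illustration only, the “percentages of favoritism” could be tallied from the recorded first choices along these lines. This is a minimal Python sketch; the function name and the sample data are made up, and the proposal does not define what counts as “overwhelming,” so no threshold is assumed here.

```python
def first_choice_percentages(first_choices):
    """Percent of trials in which each reinforcer type was chosen first."""
    total = len(first_choices)
    if total == 0:
        return {}
    return {kind: 100 * first_choices.count(kind) / total
            for kind in sorted(set(first_choices))}

# Hypothetical record: which type the subject went to first in four trials.
sample = ["primary", "secondary", "primary", "primary"]
print(first_choice_percentages(sample))  # {'primary': 75.0, 'secondary': 25.0}
```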
Other Benefits of the Trials
These trials will add a new and novel stimulation for the subject animal. The subject will be utilized in a variety of pools and be exposed to a variety of reinforcers and multiple combinations of primary and secondary reinforcers. These trials can be utilized to break up the subject’s day and offer a change from the more controlled show and session scenarios.
These trials will be a great way to utilize and educate trainers on behavior, while assisting in the progression of training methods used in the area.
Overall, the subject benefits by being introduced to a new and novel scenario. The Trial Coordinator(s) benefit by getting firsthand experience with behavior, running professional trials, and better understanding the uses and applications of conditioned stimuli and primary reinforcers. Finally, the results from the trial will assist in the education of the entire training staff.
Results of these trials will be published in a formal report and a presentation can also be created to showcase the findings to the training staff.
Phase One: Value of Secondary Reinforcers
Trials in Phase One will determine which secondary reinforcers the subject finds least and most reinforcing. The trial will begin by choosing which secondary reinforcers the subject will be exposed to. For example, suppose the subject, Phil, has been chosen to be exposed to five different secondary reinforcers: jello, a buoy, a basketball, a running hose, and ice. The first trial will begin with a cube of jello and a buoy being placed in the same empty pool. The subject will be given access to the pool and the Trial Coordinator must record which stimulus the subject chooses first. Does Phil eat the jello and then go play with the buoy? Does Phil play with the buoy and completely ignore the jello? Let’s assume it’s the latter and the data is recorded.
Jello Vs Buoy: Buoy
The same trial will be repeated two more times – perhaps in different pools, at a different time of day, or consecutively. If the last two trials result in the same outcome as Trial One, then it can be concluded that the subject prefers the buoy over the jello. A 2/3 majority is needed to conclude the subject’s preference. This gives us the first order in Phase One. A buoy is more reinforcing to the subject animal than jello, or:
Buoy > Jello
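To make the 2/3 rule concrete, here is a minimal Python sketch of how the three recorded first choices for one pairing could be tallied. The function name, the use of None for a trial in which the subject chose neither stimulus, and the sample data are all assumptions for illustration, not part of the original proposal.

```python
from collections import Counter

def preferred(first_choices, needed=2):
    """Return the stimulus chosen first in at least `needed` trials, else None.

    A trial in which the subject chose neither stimulus is recorded as None
    and is not counted toward either stimulus.
    """
    counts = Counter(c for c in first_choices if c is not None)
    if not counts:
        return None
    stimulus, count = counts.most_common(1)[0]
    return stimulus if count >= needed else None

# Hypothetical record for Jello vs Buoy: the buoy was chosen first each time.
print(preferred(["buoy", "buoy", "buoy"]))  # buoy
```

A result of None would simply mean the pairing has not produced a 2/3 majority and needs further trials.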
Now the buoy can be compared to another secondary stimulus. Let’s say ice is the next comparison and the three trials result in the subject eating ice first and then playing with the buoy. We can then conclude that:
Ice > Buoy > Jello
Now let’s assume the ice is compared to the running hose and the subject completely ignores the running hose and only eats the ice. We can then assume that Ice > Running Hose, but we cannot assume that Running Hose > Buoy > Jello, because the hose has not yet been compared with the buoy or the jello. Additional trials of those stimuli would have to be run using the same method described above until an order was determined.
E.g., Ice > Buoy > Basketball > Jello > Running Hose
Once the order is determined Phase One is complete and the team may move to Phase Two.
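The bookkeeping for Phase One can be sketched in Python as follows. This is only an illustration under one loud assumption: that preferences are transitive (if Ice > Buoy and Buoy > Jello, then Ice > Jello), which the trials themselves would need to bear out. All function names are hypothetical. The sketch lists which pairings still need trials and returns a full order once every pairing is settled.

```python
from itertools import combinations

def closure(stimuli, results):
    """Map each stimulus to everything it beats, directly or by transitivity.

    `results` is a set of (winner, loser) pairs from completed pairings.
    """
    beats = {s: set() for s in stimuli}
    for winner, loser in results:
        beats[winner].add(loser)
    changed = True
    while changed:
        changed = False
        for s in stimuli:
            inferred = set()
            for t in beats[s]:
                inferred |= beats[t]
            if not inferred <= beats[s]:
                beats[s] |= inferred
                changed = True
    return beats

def undetermined_pairs(stimuli, results):
    """Pairs whose order cannot yet be inferred and still need trials."""
    beats = closure(stimuli, results)
    return [(a, b) for a, b in combinations(stimuli, 2)
            if b not in beats[a] and a not in beats[b]]

def ranking(stimuli, results):
    """Most to least reinforcing, or None while any pair is undetermined."""
    if undetermined_pairs(stimuli, results):
        return None
    beats = closure(stimuli, results)
    return sorted(stimuli, key=lambda s: len(beats[s]), reverse=True)

# The state described above: Buoy > Jello, Ice > Buoy, Ice > Running Hose.
stimuli = ["ice", "buoy", "jello", "hose"]
results = {("buoy", "jello"), ("ice", "buoy"), ("ice", "hose")}
print(undetermined_pairs(stimuli, results))  # [('buoy', 'hose'), ('jello', 'hose')]
```

With those two pairings still open, `ranking` returns None, which matches the point that the running hose cannot yet be placed relative to the buoy and the jello.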
Phase Two: Value of Primary Reinforcers
Phase Two follows the same principles as Phase One. The purpose of Phase Two is to determine which units of primary are most and least reinforcing for the subject. Fish will be the only primary reinforcement used in these trials.
The Trial Coordinator(s) must first determine what units of primary will be used for the trial. One capelin, a handful of capelin, one herring, or five herring can each be considered a unit of primary. The Trial Coordinator(s) can decide how many units and what kind of units of primary to use in this Phase.
Unit One: One Capelin
Unit Two: Five Capelin
Unit Three: One Herring
Unit Four: Five Herring
All the units will be tested following the same rules as in Phase One. Two units of primary will be compared in an empty pool. For example, if Unit One and Unit Three are placed in a pool and the subject chooses Unit Three first in all three trials, then we can assume that Unit Three > Unit One, or that one herring has a higher reinforcement value than one capelin.
Trials will continue for all units of primary until an order is derived.
(e.g., Unit Four > Unit Three > Unit Two > Unit One)
*It will be key to note how much primary the subject has had before each trial.
Phase Three: Value Comparison of Secondary and Primary Reinforcers
Phase Three will determine which types of reinforcers the subject finds more reinforcing. The trials will follow the same principles as Phase One and Two but will use one secondary reinforcer and one primary reinforcer in each trial.
In this Phase, a variety of trials will be configured. Secondary reinforcers that rated low on the reinforcement value scale can be compared with primary reinforcers that also rated low on the reinforcement value scale. Suppose the results of that trial conclude that Unit One (one capelin) is more reinforcing than a running hose. We can then assume that Unit One > Running Hose. We can continue comparing Unit One with the rest of the secondaries to see which secondaries the subject prefers over it. Based on the results of each trial we can determine which units we want to compare next.
For example, suppose Trial One of Phase Three determined that Unit One > Running Hose, and Trial Two of Phase Three determined that Unit One < Jello. In Trial Three of Phase Three we could then go up one unit and compare it to the same secondary (jello). Suppose the results of Trial Three of Phase Three determined that Unit Two > Jello.
Similar trials would take place until enough data was collected to support a conclusion.
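The “go up one unit” procedure can be sketched in Python as below. This is a hedged illustration rather than a prescribed method: the function names are hypothetical, the recorded outcomes are the made-up results from the example above, and the unit order assumes Phase Two produced Unit Four > Unit Three > Unit Two > Unit One.

```python
# Units of primary, lowest to highest, as ranked in the Phase Two example.
UNITS_LOW_TO_HIGH = ["one capelin", "five capelin", "one herring", "five herring"]

def smallest_unit_beating(secondary, units_low_to_high, primary_wins):
    """Step up through the units until one is preferred over `secondary`.

    `primary_wins(unit, secondary)` should return True when the unit of
    primary won the majority of trials against the secondary.
    """
    for unit in units_low_to_high:
        if primary_wins(unit, secondary):
            return unit
    return None  # no tested unit was preferred over this secondary

# Made-up outcomes matching the example: True means the primary was chosen.
recorded = {
    ("one capelin", "running hose"): True,
    ("one capelin", "jello"): False,
    ("five capelin", "jello"): True,
}

def lookup(unit, secondary):
    """Read a recorded outcome; a real run would conduct the trials instead."""
    return recorded[(unit, secondary)]

print(smallest_unit_beating("jello", UNITS_LOW_TO_HIGH, lookup))  # five capelin
```

Stepping up one unit at a time means each secondary is compared against only as many units of primary as it takes to find the point where the subject’s preference changes.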
Findings and Conclusions
After all the Phases have been completed and all the data is collected it is now time to review the results to form some conclusions. It will be very important for the Trial Coordinator(s) and Advisor to draw conclusions based on facts, and not on any assumptions or guessing. “The sun was in his eye, and that’s why he didn’t see the capelin” would not be appropriate rhetoric for the final report. Although we can note the weather and make a suggestion that the sun may have had an impact – we cannot make any assumptions. Only facts will be reported and any anthropomorphic dialogue is not suitable for this report.
The report will contain five sections, much like this proposal:
Introduction: Description of process and reasons for trials
Phase One: Planning, execution, and results
Phase Two: Planning, execution, and results
Phase Three: Planning, execution, and results
Conclusion and Findings: Application of results to our program
Once the final report is created a formal presentation can be given to the staff in order to showcase the trials and findings. The final report will be accessible to the entire staff and will be open for interpretation and discussion.