The Four Quadrants of Operant Conditioning: Why I Use Them All

helgisangret
Jun 11, 2023

I have a confession to make: I use all four quadrants of operant conditioning. This shouldn’t be a controversial statement, but some folks from the purely R+ community might react like I am an animal abuser for saying that, while some folks from a strictly traditional pressure/release training background will think I am completely out to lunch.

Before you switch your caps lock on and start hammering out a comment telling me why I am wrong, allow me to explain why and how I use all four quadrants in my training, and why I think equestrians from both extremes of the training spectrum should be open to doing the same.

Some of you may be well-versed in the quadrants and their definitions, and some of you may not. I am going to break down each one, one at a time, including the textbook definition, common misunderstandings, how it could be misused, and some examples of how I use it in a way which I think is fair and kind to the horse.

First let’s define the term Operant Conditioning. Simply put, it just means shaping behaviour through its consequences (aka: training). We can break this down into two categories – we are either trying to increase a behaviour that we want, or to decrease a behaviour that we don’t want. I always try to use operant conditioning of any kind while keeping this caveat in mind: We can get a horse to do just about anything if we are good at training (regardless of which method), but if the horse has a physical limitation or weakness that makes the behaviour difficult for him, we need to address that limitation, or adapt our expectations. If he is overriding some internal anxiety or stress to perform the behaviour, we cannot ignore this, otherwise it will create a breakdown in the behaviour later, cause some other unwanted behaviour to crop up, or potentially send the horse into a state of freeze, shutdown, or even learned helplessness.

The same must be said when we are trying to decrease an unwanted behaviour – we must remember that behaviour is communication, and stifling said behaviour without addressing the reason for its existence in the first place can cause the same problems mentioned above. If we forget this, then any quadrant of training we use has the potential to become unfair to the horse at best, or cruel at worst.

Plus I think it’s important to remember that ultimately horses will be horses, and sometimes they will do horse things that we might not want them to do, or that might not make sense to us, but that they simply need to do to be a horse.

So without further ado, let’s discuss each quadrant:

Positive Reinforcement (R+)

Let’s start with the one I think most people are familiar with. The name in this case means we are adding something (positive) to increase a behaviour’s occurrence (reinforcement). It stands to reason that if we want a horse to do more of something, then the thing that we add needs to be a reward of some kind. The reward needs to be a more motivating reinforcer than any reason the horse may have for not doing the thing. For example if your horse really loves standing still, and you want him to walk away from his friends at a brisk pace, a quick pat on the neck is not going to cut it. In many cases a food reward is used, as most horses find food quite motivating.

Some common misconceptions around this type of training include misunderstanding the name. People see the word “positive” and think that must mean “nice” or “happy” or “fun”. While this style of training can be much less uncomfortable for the horse than some other types, it is not automatically a great time just because of the name. One very common misconception is that it will turn your horse into a raging cookie monster. While this certainly can happen if it’s not used correctly, it shouldn’t happen if correct protocols and techniques are followed, including but not limited to the choice of food type, the timing of reinforcers, rewarding relaxation, and the emotional state and clarity of the trainer. If this type of training were guaranteed to create an animal that will be all over you the second you come anywhere near them, it would not be the primary training of choice for zoo handlers working with apex predators.

One area of caution that I have with this type of training is that it can be very easy to have a horse overstep their comfort zone in order to get the reward, without fully dealing with the underlying tension or anxiety. While this can be a helpful tool in encouraging a horse to push into the edges of their window of tolerance, we must be careful that the horse isn’t just roboting through the motions quickly without processing and regulating. For example if the horse is very worried about the trailer, but will get in quite quickly for a treat, we must watch out that they don’t suddenly panic once they have finished eating their treat and realize that they are inside the scary horse death box of doom.

Personally I tend to use this type of training as part of clicker or marker training for added clarity, and usually use it for overcoming areas where the horse has a fear (trailer loading, clipping, fly spray, etc.) or for sharpening up a behaviour that’s a bit sticky (picking up a foot, standing at the mounting block, etc.). I also like to use it to keep training fun by teaching the horse a few party tricks. It’s great for rainy or snowy days, or days when you don’t have time to ride but just want to get your horse’s brain working!

Negative Reinforcement (R-)

This is by and large the most common method of training that has been used with horses throughout history. Also known as “pressure and release” training, the basic premise is that we are increasing a behaviour by taking away an aversive when the horse performs said behaviour. If you think of any of the aids we use under saddle (squeeze the legs to ask the horse to go faster, stop squeezing when he goes), they all fall under this category. The word “aversive” often conjures up images of severe bits or heavy-handed pulling or whipping. While these certainly would be considered aversive and are unfortunately not uncommon practice in the horse world, “aversive” can be reframed as just pressure. This pressure can be as light as a squeeze of the ring finger on the reins, or even a shifting of weight onto our outside seat bone. But the long and short of it is, it has to be something the horse would rather not have happening, otherwise he wouldn’t be motivated for it to stop happening.

Because of the term “negative” in the name, as well as the many, many instances of far more pressure being used than is necessary, this quadrant of training can get a bad rap, especially amongst purely positive trainers. Another potential pitfall of this type of training is that an uneducated hand might either apply too much pressure to begin with, or not release the pressure soon enough. Poor timing can result in the horse “tuning out” the pressure, which in turn can cause us to need to continually escalate the amount of pressure applied to get the same result.

Personally I use this quadrant quite a lot. I ride using primarily traditional aids and methods, and I use a light pressure on the lead rope to get my horse to walk on in hand if a soft invitation doesn’t do the trick. I always try to use the least amount of pressure possible and am mindful of the timing of my releases. I also like to jazz it up by combining it with R+ and adding a reward right after releasing the pressure when possible, especially when teaching a new behaviour.

Positive Punishment (P+)

Now we are getting into the less understood and more frequently misused quadrants. Although the name Positive Punishment might bring to mind an image of someone gleefully beating their horse, it is actually defined as: Adding something to decrease a behaviour. While the thing we add does need to be an aversive, and certainly has the potential to be a far harsher aversive than is necessary, it can again be reframed as pressure. For example, if I am asking my horse to stand still, and he takes a step forward, I would apply pressure on the lead rope to get him to step back. Or I could swing the lead rope, tap his chest or leg with the whip, step forward into his space... All of those things would technically be positive punishment, and that is the way that I use it. It is simply the reverse of negative reinforcement. So, if I am riding a circle and my horse starts to swing his quarters to the outside, I apply some pressure behind the girth with my outside leg. That is positive punishment. I think it is important to give examples like this, because as soon as people see the word “punishment” they often think about examples of it being used incorrectly, or in the extreme. We must be careful never to use it in a reactionary or overly emotional way and must always use the smallest amount of pressure needed.

Negative Punishment (P-)

This is by far the least understood quadrant of the training scale (although there is absolutely an argument that you cannot use positive reinforcement without also using negative punishment, which I will get into). In P-, we are removing something to reduce a behaviour. The example people often think of is someone withholding food from their horse after a bad ride. Firstly, I would argue that this isn’t technically P-, because unless the removal of the food happens within 3 seconds of the behaviour, it has absolutely no correlation whatsoever to the shaping of the behaviour. At that point I would say it is just the human having an emotional reaction after the fact. Secondly, it can be extremely frustrating to the horse to remove their food, so this is not a technique I would recommend undertaking, even with the best of timing. Where we can get picky with the semantics, though, is here: if you are only rewarding the horse with a treat when he does the behaviour you want, and not when he doesn’t, is that not using both R+ and P- together?

The way that I primarily use this quadrant is with a non-food reinforcer such as wither scratches. For example, if I am working with a horse on standing still at the mounting block, I will park him where he needs to be and then start giving him his favourite scratches. The moment he takes a step, I stop scratching. Sometimes at that point I will use some mild pressure to reposition his feet, and then immediately begin the scratches again. It usually doesn’t take long for the horse to figure out that the best scratches happen at the block, and soon he is happily standing there quietly for long enough for an unhurried mount to take place.

That is how I use all four quadrants of operant conditioning. I use them in varying amounts from horse to horse, from day to day, and from moment to moment. I have no judgment on trainers who use differing amounts of each quadrant, as long as they are used with good timing, a good sense of feel, and in a way that is as clear and fair as possible to the horse. I truly believe that each quadrant has the potential to be done incorrectly or in a way which could be harmful to the horse, and that each quadrant has a place in training when used in the right way. It is all about context and what each horse needs in a given moment. For example, in situations where the horse already has a lot of fear, adding pressure of any kind can be too much and tip them over into fight/flight. On the other hand, with my guy who has a history and well-established pattern of dissociating when he gets overwhelmed, deep pressure can help him stay present in his body and help him regulate his nervous system. The tricky part is knowing which tool to use at what time. This is where knowing the horse you are working with and learning to read them very well is important.

I also believe that energy, intention, and the emotional and mental state of the trainer or handler have just as much of an effect as the techniques being used, but that is a whole other article...

If you think about different examples of times when you are working with your horse, can you pinpoint which quadrant(s) you are using? I think it is important to know which one(s) we are using at any given time, because then we can make sure we have the right timing and that we are shaping the behaviour in the direction we mean to. I think it is important to be open to using whichever quadrant is going to be the most effective, as long as it is being done in a way that is fair and kind to the horse. Most of all, I think it is important for us all to continue to expand our knowledge and skillset, and to be open to trying new things to be able to support our horses as best we can.
