The original version of this article was written – from the perspective of dogs – by Eileen Anderson of www.eileenanddogs.com in May 2015.
The original can be found on her website.
This version has been edited to be equine-appropriate, by Vikki Spit, Rosie Watson and Max Easey of Horse Charming and approved for publication by Eileen Anderson.
For the most part, translating this article into “horse” has involved changing the word “dog” to “horse” and replacing dog examples with those that would be more relevant to horse owners.
We are grateful to Eileen for her generosity in allowing us to bring this material to the attention of horse lovers.
Positive reinforcement-based training (+R) is subject to a lot of misunderstanding and misrepresentation. Many people genuinely don’t understand how it works, and others seem to deliberately misrepresent it. Some of these misunderstandings and misrepresentations are very “sticky.” Misunderstandings, straw men, myths – call them what you will, but they are out there, and they are potent.
Here are six that are quite common. There are many more out there. For example, I didn’t even hit on “horses trained with food learn to bite and nip” or “positive reinforcement training only works for tricks and easy horses” or “+R training is bribery.”
But the following six illuminate some common misunderstandings about positive reinforcement-based training.
- Positive reinforcement-based training is permissive.
I believe this one is a true misunderstanding for a lot of people. Before I started studying learning theory, I certainly would have had no clue how one could use positive reinforcement as part of a training plan, for instance, to get rid of an unwanted behaviour. All I could imagine was someone passing out cookies for good behaviour. It seemed like a good recipe for chaos. What would one do with a cookie if the horse did something “bad”? What I didn’t know was that positive reinforcement-based trainers not only reinforce desired behaviours, but also have several humane techniques for interfering with the reinforcement for unwanted behaviours so that they don’t pay off for the animal. These include antecedent arrangement, reinforcement of alternative behaviours, and in some cases negative punishment. Positive reinforcement-based training, especially when applied to behaviour problems, takes careful thought and planning. It is precise, deliberate, and the opposite of “let’s all hang out here in happy fairy rainbow land.”
- Positive reinforcement-based trainers just ignore bad behaviour.
This one also brings a very bad image to mind: a doting horse owner letting her animal forage on her for food, drag her around, or bite her. But the truth is quite different. What we actually do about unwanted behaviour is to 1) prevent it from happening in the first place; 2) teach the horse something acceptable to do instead; and occasionally, 3) punish it using negative punishment. We know that ignoring reinforced behaviours doesn’t make them go away. But to make things a little more complicated, there are two situations where “ignoring” is used in training. One is when training new behaviours and/or associating a verbal cue with a new behaviour. In these cases, if the horse offers an unwanted behaviour, nothing happens. We do not treat. But in these situations, we are not dealing with some habitual, harmful behaviour that is getting reinforced some other way. It’s just a wrong guess in a guessing game. The other situation where ignoring might be used as part of a training approach is when the animal’s behaviour is being reinforced with attention. But even in that situation, we would not use ignoring by itself. I now have a whole post about the issues with ignoring: “Does Ignoring Bad Behaviour Really Work?”
- Positive reinforcement trainers believe that nothing unpleasant should happen in the horse’s life, ever, and they try to protect their horse from all aversives.
First, this is impossible. Mild to moderate aversive stimuli are around us at all times, and we – and our animals – perform loads of behaviours to avoid or lessen them. Perhaps the horse is too hot. That’s aversive. Perhaps there is a fly buzzing around her head. That’s aversive. Perhaps the horse has to get a shot from the vet. That’s aversive! The truth is that we avoid training with aversives, even with mild ones.
As I’ve written elsewhere, if a horse is bothered by the rain blowing in his face, and turns his back to the wind, this is called natural or automatic negative reinforcement. The behaviour of turning away from the wind direction is reinforced by escape from driving rain. Wind and rain are unavoidable aversives in life. I might try to provide my horses with some relief from wind and rain by putting up a field shelter or planting hedges. But I would never deliberately make use of a naturally aversive situation – for example spraying the horse in the face with water – to get a certain behaviour out of my horse.
And as for major aversives (thinking vet visit again) – we do prepare the horse for them as best we can to make them less so. That’s the opposite of using their aversive qualities.
- Because of the previous claim, positive reinforcement-based trainers will do things like let their horse move into the path of traffic rather than pull on the lead rope or reins, or avoid any medical procedure that might “hurt.”
This one is almost always a straw man. I’m pretty sure the people saying it and acting like they believe it really don’t think we would stand by in an emergency and watch our horses get hurt. In an emergency we will body block or grab or apply halter pressure to a horse who is about to do something dangerous, just like any other normal human being who cares about his or her horse. Yes, this is using an aversive. But it is not part of a teaching scenario. Different behaviours are expected and needed in difficult situations. For example, a friend might ask me to use a needle to remove a sliver that she can’t reach. I would do this if asked, even if it might mean hurting her. But because I am willing to do that, it does not follow that I am fine with training her a new job skill by poking her with a needle every time she makes an error.
- Positive reinforcement-based trainers use punishment but just don’t know it (or just don’t admit it).
This is silly. We are generally the ones who are trying our best to leave mythology behind and learn the science behind good training. But again, the claim can come from someone who just doesn’t understand what it is we are doing; someone who figures there just has to be punishment in there somewhere! Sometimes there is. And those of us who use negative punishment know when we are using it! But a common variant of this claim is, “When you train, you don’t always give the horse the treat. You are withholding the reward and that’s punishment, har har har.” Actually it is not. As long as there is no consequence to the horse’s wrong guess, it is not punishment. It is extinction at work. Extinction by itself is no picnic for the horse either, but in general we don’t use it by itself. Usually another behaviour or multiple other behaviours are being reinforced, and we help the animal make the transition to performing one of those instead. We also know and freely admit that certain tools fall easily into aversive use. It’s no news that a plain old head collar or bitless bridle can be used to hurt a horse. That’s why when we start using any gear on a horse, we use counterconditioning to help the horse build pleasant associations, and we teach the horse behaviours we want with positive reinforcement, so as to minimize the chance of discomfort. This is the opposite of using the aversive properties of a piece of gear.
- Positive reinforcement-based training is just as stressful on horses as balanced or aversive-based training.
Training with positive reinforcement can surely be stressful. But as I’ve written elsewhere, the stressors generally have to do with lack of skill (errors by the trainer), or an added aversive situation that wasn’t planned. It is not sensible to argue that a method that consists of giving the horse food or scratching her when she performs a desirable behaviour is as aversive as a method that depends on applying discomfort, pain, or intimidation.
Every one of these points is focused on punishment or aversive stimuli. Clearly that is a sticking point in people’s understanding of positive reinforcement-based training. The claims also fit neatly into two categories. The first four misrepresent positive reinforcement-based training. They paint it in a ridiculous light and imply it is impossible or ineffective. The last two blur the lines between positive reinforcement-based training and training that involves deliberate use of aversives.
In rhetorical terms, the first four are straw man arguments, and the latter two use the tu quoque fallacy in addition to the continuum fallacy.
But as irritating as it is to read and hear these over and over, I try to keep in mind that they can be made from ignorance rather than malice. Every one of us grew up in a culture that instructs us to use aversives to attempt to change behaviour. The “cultural fog” around learning and behaviour that Dr Susan Friedman refers to makes us leery of reinforcement, and can cause us to equate it with mere indulgence or even moral corruption.
I am sure that many of the people who make these arguments are completely unfamiliar with the planning and precision that necessarily go into positive reinforcement training plans. I know I was. I got over it by listening to you folks out there who patiently explained the processes involved in positive reinforcement-based training. I hope you keep describing to the world what you do!
© Eileen Anderson 2015 eileenanddogs.com