How do positive reinforcement trainers get their horses to behave?

What many people find baffling about force-free, rewards-based training (positive reinforcement) is how it is possible to get the horse to do something in the first place, so that we can reward it.

This is because, until we come across this very different way of training, we have only ever been shown how to use some kind of pressure (aversive stimulation) to get behaviour. So while we like the idea of a way of training that is more genuinely rewarding for the horse, we can get a bit stuck for ideas about how to get the horse to do something we can reward!

When it comes to their behaviour, we ideally want 3 key things from our chosen way of keeping and training our horses and ponies:

  • We want to be able to produce repeatable desirable behaviours
  • In doing so, we want to avoid causing the horse to choose to perform undesirable behaviours
  • We want to reduce or eliminate undesirable behaviours

In order to produce repeatable desirable behaviours using positive reinforcement, we need to be able to do 3 things. We need to have a way to create the behaviour in the horse in the first place, then we need to reinforce that behaviour so that the horse will want to repeat it, and finally we need to pair it with a cue that can act as a unique prompt for that behaviour, so that the horse knows exactly what behaviour to perform to obtain reinforcement when that cue is given.

What is reinforcement?

Reinforcement makes behaviour more likely to be repeated. There are only two types of reinforcement. One is where the horse gains something he values and that provides him with a pleasurable outcome. The other is where the behaviour results in escape from something that is unpleasant, or that the horse expects to be unpleasant.

Successfully escaping or avoiding an actual or anticipated aversive (unpleasant) stimulus provides the horse or pony, donkey or mule with relief that it’s over or has been avoided. These types of reinforcement of behaviour are going on all the time, with or without our involvement, and even if we don’t realise what they are or know what they are called.

The foal that struggles to his feet when he is born, who eventually wobbles his way on his unsteady legs towards his mother for his first drink, gains life-giving milk and colostrum. His first experience of the world is of positive reinforcement – the gain of something appetitive and life giving.

The horse that turns his back to the wind and lowers his head in a hailstorm firstly escapes the painful feeling of hailstones on his face and then avoids further stinging pain by adjusting his position relative to the wind direction. His behaviour of turning away from the hail is negatively reinforced initially by his escape, and then he maintains or repeats that behaviour to avoid the pain from the hailstones. These are 2 forms of learning known as escape and avoidance learning, and these are what everyone relies on when using aversives to train horses.

The behaviour of the horse that pulls away from his handler when being led from the stable to the field is reinforced when he gets to the field full of grass and to his friends – more so if he ran out of hay hours ago, does not get much turn-out, is very anxious about being separated from his friends and experiences aversive handling when being led.

Whether we consider his behaviour to have been positively reinforced (we imagine his behaviour results from him gaining food and friends and freedom), or negatively reinforced (we think he experiences temporary relief from the unpleasant psychological and physical feelings of being starved and hungry and separated and confined and restrained), we can definitely know that this behaviour of pulling away from someone leading him is reinforced, if it keeps happening – even though we cannot know for sure what it is that he finds most reinforcing.

We know that behaviour that results in reinforcement will be repeated, so if we want to train a desirable behaviour, we need to have a way to form that behaviour first and a way to provide a reinforcing consequence for the horse so that the horse wants to do it again in the same circumstances. Only then can we get it on cue.

The key difference between positive reinforcement training and every other kind is that as trainers we try to use ways of forming behaviour that do not involve creating aversive situations for the horse to escape or avoid.

How do we get behaviour so we can reinforce it?

Whether we choose to deploy negative reinforcement or positive reinforcement strategies, there are only 6 different ways we can form behaviour – or indeed that behaviour comes about, whether it is behaviour we want or don’t want – in any animal.

In combination with an immediately reinforcing consequence (a motivation to do it again), these are the only ways we have to either cause the horse or pony to want to do the behaviour, or to explain to them what we want them to do.

The first and universally used way of training horses in ground and ridden work in traditional, classical, straightness training, academic training (such as that promoted by the Equitation Science advocates), western riding, and in all flavours of natural horsemanship, is aversive stimulation. An aversive (unpleasant) stimulus is applied to cause the horse to perform a behaviour that it will perform to escape or avoid that stimulus. Provided that when the desired behaviour happens, it is immediately reinforced by cessation or reduction in strength of the aversive, then the horse will consider that behaviour to have worked for her and will repeat it in the future.

The second way is through physical manipulation, also called moulding (or sometimes sculpting), and this is also used routinely with horses. This involves physically moving the entire animal or part of the animal into a position by direct contact with their body. This could involve taking hold of a body part (the head or a limb for instance) and pushing or pulling on the animal’s body directly to cause all or part of him to move into a position or place. This is also achieved by attaching restraint or manipulation devices to his body – such as halters and ropes, around the head, body or legs.

Most horse owners use manipulation routinely every day – for leading and feet handling. With horses we should always assume that this way of making them move or preventing their movement will be aversive to them initially, and that they will be either frightened or very likely to resist that pushing or pulling to begin with. They can of course learn through negative reinforcement (they are released only if they remain relaxed or when they cease to struggle) to comply, but their compliance should never be assumed to imply consent, confidence, acceptance or willingness – since it is accomplished entirely through coercive means.

Alternatively, by introducing them to being handled gradually, slowly and gently, without any restraint or additional aversives being used, they can learn through positive reinforcement to like it and to cooperate enthusiastically even if their movement is restricted. Horses do what works for them. If we are, for instance, teaching them to have their feet handled, and their struggling results in escape because we cannot hold onto their foot, that struggling will be repeated. Every farrier knows that! So it’s better to go slowly, building the time they can consent to their feet being held and handled, gradually, checking that the horse is totally relaxed before we start, and checking for relaxation at all times when training for all physical handling, than to risk creating a problem that can be difficult to overcome.

Other less generally useful ways to form specific behaviours, but ones that fit with a force-free philosophy, include using a food lure. A common everyday application of this is for carrot stretches. Sometimes a horse that has not been trained to lead yet can be enticed with food to go somewhere, pending proper training. Often though, people try to use food to entice horses to go somewhere they do not want to go and then trap them, and doing this can make a horse forever suspicious of people offering them food. But even if we don’t do that, a horse following food is focussed on the food, not especially on his own behaviour, so, other than for carrot stretches, it’s preferable to only use food to lure a horse as a temporary measure. Getting a horse trained properly to lead and load and giving him no reason to feel coerced or tricked and trapped into doing things – as soon as possible – is preferable to using a food lure.

Social or observational learning (learning by watching what happens to others and then doing what they do) happens with all social species including horses and can work to our advantage. Horses will see that if their mother is relaxed in certain situations, these need not be feared. Sometimes it is useful to use another horse as a lead to show an uncertain horse that he need not fear crossing water or over something on the ground, and we could reward the horse with some food or a scratch for doing that. But if the nervous horse is simply following another to avoid being left behind he may not always learn confidence in himself or to like being in that situation, even if we think we are rewarding that behaviour. We might just be using the confident horse as a lure for our nervous horse and not teaching him anything at all.

Ways that are unique to positive reinforcement

When we switch to using more positive reinforcement, two additional important options for getting behaviour to happen become available to us that are not available with a negative reinforcement approach.

The first involves creatively contriving situations in the environment of the horse, in which the behaviour is most likely to happen on its own, and then marking and rewarding it.

So we set up the ideal situation, wait for the behaviour to happen, and then make sure that the behaviour results in something that is immediately reinforcing for the horse.

In positive reinforcement training this is called “free shaping” (where successive steps in the direction of the finished behaviour are reinforced) or “capturing” (where the complete behaviour happens and can be opportunistically reinforced). If we are clever in our set-up, the behaviour we want is going to be the one the horse is most likely to choose to perform.

We must observe closely the behaviour of the horse, and then reinforce, by marking and rewarding (usually with food), any behaviour that is a step in the direction of the finished product. If the horse wants to do a behaviour anyway, we don’t even need to mark and reinforce it, but unless we do, we won’t be able to get it to where it can be made repeatable – on cue – and therefore something we could reproduce in the future.

The marker signal I refer to is called a bridging stimulus or bridge, because it bridges the short time lapse between when the horse performs the specific behaviour we want and when he receives the food or scratch.

The second and most commonly used technique, and one that can only be used with positive reinforcement, is target training.

Target training – how we take advantage of natural horse behaviour

Target training takes advantage of the natural behaviour of horses to investigate novel objects. We can carefully present a target prop to a horse, or put a target prop on the ground near the horse, and by bridging and rewarding his voluntary approach to investigate it by looking at, sniffing or touching it with his nose (or feet, if we are using something we want him to put his feet onto), we can teach the horse that his behaviour of touching this object will be positively reinforced.

This is all done with no pressure being put on the horse to approach the target, as none is needed. This is actually one of the most natural ways of creating behaviour – allowing a horse to perform what is a perfectly natural investigative behaviour when presented with novel objects.

Targeting can be used to form every behaviour for which aversive stimulation is normally used. It can be used for groundwork whether in hand or at liberty, and for ridden work, with or without anything on the head of the horse. We can use stationary targets such as cones or mats or dressage arena letters, and we can use a stick with a target on the end of it for teaching movements.

Once the horse learns that he will be positively reinforced for touching a specific body part to or being near the target, or for following a moving target (all of which takes seconds for most horses to learn), or for stepping on a target, we can use this to influence the movement of all or part of the horse. Having formed the behaviour using a target we can then substitute an alternative cue (visual, voice, touch) so as to reproduce the behaviour, and discontinue the use of the target. The target is just a way to show the horse where to be and what to do, and once that behaviour is on a cue the target prop can be faded out of the picture and is no longer needed to get the behaviour to happen, because the cue now achieves that purpose.

Done well, and built up into more complex behaviours over time, it is a very easy way to influence the movement and posture of a horse without the tension or anxiety that arises when the horse is vigilantly looking out for aversives, such as in a situation in which the people he is with are the source of routine aversive stimuli – so much so that for the horse, people come to have significant threat potential.

Targeting can be used to teach all groundwork and ridden movements – catching, haltering, leading over any surface or into a trailer, for teaching halt and standing still, backing up, moving the front end away, disengaging or moving the hindquarters over, circling, straightness on circles, stepping under behind, crossing over in front, lunging, long lining, moving in a forward-down stretched posture, for shoulder in, haunches in, side-pass, rein cues for turns to left and right, shifting the weight back, lateral and vertical flexion, walk, trot, canter, jumping, back and leg and abdominal muscle engagement.

Name something that you want the horse to do by way of moving his body (or keeping it still) and it can be trained with imagination and with bridge and target training.

For the most part, what we want to do with horses either involves them being really good at standing still and relaxing or it involves influencing their movement in all directions at all paces, in time and space.

To have that biomechanically healthy movement we need the horse to have the right kind of balanced, relaxed energy and enthusiasm.

If you have yet to learn how to incorporate target training into your way of training your horse, don’t miss out on some fabulous ways to make both everyday handling and biomechanically healthy movement easy and enjoyable for your horse, without pressure.