Verbal cues. Threat or treat?

I often see people discussing the use of voice commands and cues (verbal prompts to the horse to perform a behaviour) or praise (when we use our voice in an effort to let the horse know they did something we like).

I thought it might be useful to consider how horses come to learn what voice commands or cues or verbal praise actually mean, by way of a short lesson in the science of behaviour.

Vocal cues and commands have to be learned

To begin with, when we first train a horse to perform a behaviour, they have no idea of the meaning of either voice cues or praise.

Neither of these uses of the voice has any significance to an untrained horse.

She doesn’t know what they mean at all, and they have to be learned.

Before any training has taken place, we could give a verbal command and the animal would be unlikely to do anything other than what they are already doing, or what they choose to do based on other things going on in their environment.

Every animal has to learn what these sounds mean, and the process by which they come to have meaning is a form of associative learning.

What that means is that the animal comes to associate the sound we make with our voice with some other event or stimulus.

It’s a form of classical conditioning: we take a neutral stimulus and, by following it with something else that is meaningful to the horse, the neutral stimulus comes to be a predictor of something and can thereby begin to elicit its own emotional response.

The way in which vocal cues and praise are learned by the horse in any aversive training method is the complete opposite to the way they are learned in a positive reinforcement model.

By the way, I should say that this is not going to be a lesson in how to train cues. That is much too complex to do in a blog post, and would require you as the reader to have some good foundation knowledge of the practice of training with positive reinforcement. The structured lessons we provide are the way to go about learning that.

But what I can do is to talk about the fundamental differences in the learning of commands and cues between systems using aversive prompts with negative reinforcement of correct responses (so that would be in traditional and natural horsemanship based methods) and those using positive reinforcement.

Commands and praise in aversive training

In an aversive (negative reinforcement) system, the command can be given first, and then some kind of aversive stimulus is applied to produce the behaviour from the horse.

Imagine saying “Walk on!” or clucking with your tongue, or making a kissy noise, and then using a lunge whip, or your legs or a whip tap to apply some kind of aversive stimulus (also called pressure) to produce the behaviour.

As soon as the horse performs the correct behaviour (or makes an attempt which is in the right direction), the trainer should take that aversive stimulus off.

After some repetitions, the command predicts the onset of the aversive stimulus, and the horse will act before the aversive is applied in order to avoid it (assuming there is no competing motivation, that is, something more salient than the aversive from the trainer).

The process involves escape learning (the horse learns that he can escape from the aversive stimulus by acting) and then avoidance learning.

First of all the horse learns that he can escape the aversive stimulus (and make it stop) by “behaving” and then he realises that there is a warning that is given before the aversive stimulus is applied.

Pretty soon the horse will recognise the vocal command and because he will now be anticipating the aversive onset, he will act before it is applied, thus avoiding the actual unpleasantness of the aversive stimulus.

When people describe horses as “anticipating” or “making assumptions” about what they should do, it is very often because the horses have read signals from the handler or rider, perhaps given unintentionally, that an aversive stimulus is about to be applied.

A horse might start to “offer” to do something, because he thinks that this will mean he can avoid the onset of the aversive.

Psychologists refer to the learning process as “fear-conditioning” when commands are learned in this manner. The command predicts that an aversive will be applied to make a specific behaviour happen. If the animal does not respond, the command is enforced by applying the aversive until he performs the correct behaviour.

And in fact, it is necessary to enforce the command (make the horse do the behaviour) in order for the meaning of the command to be maintained.

The command comes to mean “perform [insert relevant behaviour], or experience some aversive stimulus until you do.”

Equally in an aversive training system, praise acquires a specific meaning.

Usually praise is given in the period after the animal has performed the desired behaviour (and the aversive stimulus has been removed), or during a period in which the animal is continuing to perform the behaviour so that the aversive stimulus is not re-applied.

So as an example, if we say “Walk on!” and give the horse a squeeze or a little kick with our legs to cause him to move, then in a negative reinforcement system, we should stop squeezing or kicking once he walks on if we want to increase the likelihood that he will walk on in future when we do these things. This is how we should negatively reinforce his correct behaviour.

The right thing to do if the horse were to stop walking (in an aversive training system) would be to repeat the cue to “Walk on” and then re-apply the lunge whip, leg squeeze, whip tap, or kick. It would not be to keep using legs and cues to keep the horse going, because then there is no relief for a correct response, and relief is required to strengthen behaviour.

While the horse is walking (which is what we prompted him to do) we should discontinue all input from us by way of pressure – and leave him alone.

Otherwise we are not providing any reinforcement (absence of pressure) for the behaviour we wanted.

So now, if our horse is doing what we want and, having taken the pressure off, we verbally praise him, the praise comes to signify a period of time during which the horse will not be subjected to anything aversive (at least for a second or two, or however long it is before we ask him to do something different, or he stops doing the commanded behaviour).

If the praise is given AS the pressure / aversive stimulus is taken off or ceased, then the praise acts as a conditioned negative reinforcer. It tells the horse “here comes a short period of nothing aversive from the rider / handler, following the behaviour just performed”.

As you can imagine, the use of vocal cues and praise is poorly understood by almost all of the riding instructors and horse trainers I have come across in the aversive training world, and few seem to understand that this is how their verbal commands and praise come to have significance for the horse. Consequently, for the most part they don’t really give the horse any consistent information.

Commands come to mean “do this [insert relevant behaviour] to avoid an aversive” and praise means “there’s going to be a break from some specific type of aversive stimulus / pressure.”

This of course assumes that the horse isn’t finding doing the commanded behaviour unpleasant, hard work, frightening or painful in the first place. And that’s a whole other story.

Cues and praise in appetitive (positive reinforcement) training

By contrast, in a positive reinforcement model, we get the behaviour happening first, without using any pressure or aversive stimulus to produce it. The behaviour will have been positively reinforced (which is why the horse offers it), and only once it is being offered reliably do we introduce a cue.

The cue becomes associated both with the behaviour that follows it (and we have to introduce the cue AS the behaviour is happening or as we cause it to happen – for example by presenting a target) and with the positive reinforcement that follows.

Cues come to signal the opportunity for positive reinforcement for the behaviour described by the cue.

As for “praise” in a positive reinforcement model, we can see the bridging stimulus / marker signal / click (if you choose to use a clicker) itself as being a form of “praise”, because it is conditioned (learned) to predict that there is going to be food or a good long lip-curling scratch.

There really isn’t a better or more precise form of praise than a short, sharp bridging stimulus given at the instant that the horse has done a behaviour (or sequence of behaviours) we want more of. That’s because we follow it with some food as reinforcement, which maintains the association between the bridge (the positive reinforcement equivalent of praise) and the food or scratch (the primary reinforcer).

Verbal praise can also function as a “keep going signal” (a subject for another day and a detailed description) – a stimulus that comes, after correct training, to predict that continuing the behaviour will result in the behaviour being bridged and reinforced eventually.

In summary, vocal cues and commands and praise have no meaning until they are associated with something.

A command becomes a threat that an aversive will be applied for non-response. A positively trained cue becomes a predictor of treats for a correct response.

If we want praise to acquire meaning to a horse, it has to be followed by something that has its own meaning and significance to that animal.

Praise on its own will have little significance if any, until the animal learns that it predicts “something” – whether that is something nice, or the temporary absence of something unpleasant. We choose nice!

Given the choice of threats or treats, we choose treats. We hope you do too!