26 Jul Operant Conditioning
Operant Conditioning (Trial and Error Learning)
Behavior = Consequences
- Operant conditioning forms an association between a behavior and a consequence.
When the horse tries a behavior and it has a good consequence, it will be more likely to repeat that behavior in the future when the same stimulus is presented.
If the behavior has a bad consequence, the horse will be less likely to repeat it in the future.
Trial and error learning is the most permanent form of learning, and successful trials will quickly develop into habits.
An example of learning by trial and error: The horse smells food in a container and pushes the container with its nose. The lid comes off and the horse gets to eat the food.
The next time the horse sees the container it will almost certainly try pushing it with its nose.
As the trainer, you always need to control the consequences of your horse's behavior, so that a behavior is always followed by the same consequences.
Consistency of consequences results in the formation of habits.
The Importance of Timing in Trial and Error Learning
Consequences have to be immediate and clearly linked to the behavior.
With humans, we can explain the connection between the consequence and the behavior, even if they are separated in time. For example, you might tell a friend that you’ll buy them dinner because they helped you move some furniture, or a parent might explain that a child can’t go and stay at a friend’s house because they didn’t take out the garbage.
With animals, you can’t explain the connection between the consequence and the behavior.
So the consequence has to be immediate, otherwise it will not be associated with the behavior.
The way to work around this is to use a bridge signal or marker signal such as a clicker.
This allows you to mark the behavior the moment it happens, and it bridges the time between the behavior and the consequence.
It is difficult to supply an animal with one of the things it naturally likes (or dislikes) in time for it to be an important consequence of the behavior. In other words, it’s hard to toss a fish to a dolphin while it’s in the middle of a jump or give a horse a treat while it’s in the middle of an extended trot.
This can be overcome by teaching the animal to associate something it wants (e.g. food) with something that’s easier to “deliver”, like the sound of a clicker.
The sound has no meaning to the horse at first, so it is repeatedly paired with a reinforcer (like food) until the horse associates the sound of the marker signal with getting the food.
This is done by classical conditioning and repetition.
In other words, the trainer clicks and then gives the horse some food. After a few repetitions the horse associates the clicker with the food. The clicker can then be used as a bridge signal, to mark a behavior and bridge the moment in time between the behavior and the reward.
The signal is usually a sound (like the clicker) but it can also be something visual or a touch.
A sound is the easiest to use because it works when the animal isn’t looking at the trainer, and also when the animal is too far away to touch.
The marker signal can be anything as long as it is distinct and the same every time.