Learning Theory: Current Status
When most students of psychology hear the term learning theory, they probably think about the history of psychology. After all, what comes to mind when you think about learning theory? Perhaps you think of the names of great psychologists such as Thorndike, Pavlov, Tolman, Hull, and Skinner. Perhaps you recall Skinner's pigeons pecking at lights in an operant chamber to obtain morsels of food. Or maybe you remember Pavlov's famous discovery that the sound of a metronome can be conditioned to make dogs salivate if it is presented together with powdered meat. But is learning theory a relic of psychology's past, as these vignettes might suggest? The answer is no. In fact, learning theory is an active and vibrant area of study in psychology, with a steady stream of fresh theories that attempt to tackle learning about complex stimuli, the involvement of context in memory retrieval, and the role of time in learning. Moreover, neuroscientists are increasingly relying on learning theory in the quest to dissect the brain systems involved in learning and memory. Learning theory is alive and well.
One of the most influential modern learning theories was crafted in the early 1970s by Robert A. Rescorla and Allan R. Wagner (Rescorla and Wagner, 1972). The Rescorla-Wagner model accounts for several features of a form of associative learning called Pavlovian conditioning. Pavlovian or classical conditioning is a type of learning in which "neutral" or conditional stimuli (CS), such as tones or lights, can predict the occurrence of biologically significant or unconditional stimuli (US), such as food or illness. After having been paired with a US, a CS elicits a learned or conditional response (CR) that often is similar to the unlearned or unconditional response (UR) elicited by the US.
Rescorla and Wagner used a simple mathematical formula to model Pavlovian conditioning:
ΔV_CS = α(λ - ΣV_T)   (1)
The equation has three critical variables: α, the salience of the CS; λ, the magnitude of the US; and ΣV_T, the total amount of learning accrued by all of the CSs present on the trial. (For simplicity, a fourth variable, β, describing the salience of the US, has been omitted.) According to this equation, the amount that is learned on a conditioning trial (ΔV_CS) is the product of the salience of the CS (α) and the difference between the magnitude of the US (λ) and the amount of learning that has already accrued to all of the CSs presented on that trial (ΣV_T). A key feature of the model is that the amount that is learned on any given trial (CS, US, or CS-US presentation) is determined by how much the animal has already learned about all of the CSs present on that trial.
To illustrate some of the properties of the Rescorla-Wagner model, Figure 1 displays a family of learning curves computed with Equation 1. Note that the highest point on each learning curve (the asymptote) approaches the value of λ. The rate at which the asymptote is approached is determined by α. This means that the amount that is learned during classical conditioning is set by the strength of the US, and the speed of that learning is set by the salience of the CS. For example, Pavlov's dogs learned to salivate profusely to a CS that was followed by a large quantity of powdered meat (the US). Similarly, the dogs reached that level of salivation quickly when Pavlov used a salient CS, such as a loud metronome. The Rescorla-Wagner model predicts these features of Pavlovian conditioning.
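The behavior of these learning curves can be sketched in a few lines of Python. This is only an illustration of Equation 1 for a single CS, so that the summed term is just the current strength of that CS. The function name, the parameter values, and the number of trials are assumptions chosen for the example; they are not taken from Figure 1.

```python
def learning_curve(alpha, lam, n_trials, v0=0.0):
    """Return the associative strength V of a single CS after each trial.

    alpha: salience of the CS; lam: magnitude of the US (lambda).
    All values here are illustrative assumptions.
    """
    v = v0
    curve = []
    for _ in range(n_trials):
        # Equation 1 with one CS present: the summed strength is just V.
        v += alpha * (lam - v)
        curve.append(v)
    return curve


slow = learning_curve(alpha=0.1, lam=1.0, n_trials=50)     # weak CS, strong US
fast = learning_curve(alpha=0.5, lam=1.0, n_trials=50)     # salient CS, strong US
weak_us = learning_curve(alpha=0.5, lam=0.5, n_trials=50)  # salient CS, weak US

for label, curve in [("alpha=0.1, lambda=1.0", slow),
                     ("alpha=0.5, lambda=1.0", fast),
                     ("alpha=0.5, lambda=0.5", weak_us)]:
    print(f"{label}: trial 5 = {curve[4]:.3f}, trial 50 = {curve[-1]:.3f}")
```

In this sketch, changing alpha changes how quickly a curve rises, while changing lam changes the level at which it flattens out.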
Despite its simplicity, the Rescorla-Wagner model can account for a wide variety of interesting learning phenomena, including extinction, blocking, and overshadowing. For example, in blocking, prior conditioning to one CS (CSA) prevents conditioning to another CS (CSB) when the two are presented together during compound conditioning (CSA/CSB-US). Simply put, learning does not occur to CSB because the US is already perfectly predicted by CSA; CSB doesn't tell the animal anything it doesn't already know. In terms of the Rescorla-Wagner model, the value of (λ - ΣV_T) is close to zero on the compound trial, because V_CSA is already close to λ (CSA is already conditioned). The beauty of the Rescorla-Wagner model is that this simple mathematical equation can predict many complex learning phenomena.
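Blocking can be reproduced with a short simulation of Equation 1. The sketch below is a hedged illustration, not a reconstruction of any published simulation: the helper names, the shared salience of 0.3 for both CSs, λ = 1, and the 30 trials per phase are all assumptions.

```python
def rw_trial(V, present, lam, alpha=0.3):
    """Apply Equation 1 to every CS present on one trial.

    V maps CS names to associative strength. The same alpha is assumed
    for every CS purely to keep the example small.
    """
    total = sum(V[cs] for cs in present)  # summed strength of all CSs present
    for cs in present:
        V[cs] += alpha * (lam - total)


def run_blocking(pretrain_a, n_trials=30):
    """Optionally condition A alone, then condition the A/B compound."""
    V = {"A": 0.0, "B": 0.0}
    if pretrain_a:
        for _ in range(n_trials):
            rw_trial(V, ["A"], lam=1.0)   # phase 1: CSA-US pairings
    for _ in range(n_trials):
        rw_trial(V, ["A", "B"], lam=1.0)  # phase 2: CSA/CSB-US pairings
    return V


blocked = run_blocking(pretrain_a=True)
control = run_blocking(pretrain_a=False)
print(f"V_B after pretraining A: {blocked['B']:.4f}")
print(f"V_B without pretraining: {control['B']:.4f}")
```

When CSA has been pretrained, the error term is nearly zero from the first compound trial, so CSB gains almost nothing. In the control run the two CSs share the available associative strength.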
Is the Whole More Than the Sum of Its Parts?
The Rescorla-Wagner model is an elemental model of conditioning because it makes predictions about learning based on the individual stimuli in a given trial. However, stimuli often occur together in complex combinations during classical conditioning. Consider, for example, the places where classical conditioning happens, such as a dentist's office. Here you learn to fear (CR) the sound of a dental drill (CS) that predicts tooth pain (US). But you also learn that the dentist's office itself, which is composed of many unique stimuli (reclining chair, observation light, gurgling fountain, and the odor of the dentist's breath, to name a few), predicts tooth pain and elicits learned fear. Rescorla and Wagner would argue that each of these contextual stimuli is an element, and that learning to these elements happens individually. According to their model, ΔV_CS is calculated for each element separately in order to determine the amount of learning that happens to these stimuli.
An alternative has been proposed by John M. Pearce at Cardiff University in the United Kingdom (Pearce, 1994). Pearce suggests that stimuli that co-occur should be treated as unique combinations or configurations of stimuli that are distinct from their elements. Hence, rather than break down the dentist's office into many individual elements (such as the gurgling sound and dentist's bad breath), Pearce argues that we form a configuration of stimuli, such as "dentist's office," that includes all of the individual stimuli and becomes associated with the US.
The power of this configural approach is apparent in a type of learning called negative patterning. In this form of learning, two different CSs (CSA and CSB) are each paired, on separate trials, with a US, such as food. After several pairings, these CSs will elicit responses associated with food delivery, such as salivation. During this training, trials are interspersed in which CSA and CSB are presented together (a so-called compound trial) without food delivery. What happens on these compound trials? You might imagine that there is profuse salivation because both CSA and CSB predict food. But animals and people readily learn to withhold salivation on the compound trials because no food is delivered on them.
The Rescorla-Wagner model has trouble with this phenomenon because the compound stimulus CSA/CSB is treated as the sum of two elements (CSA and CSB), each of which strongly predicts the US. Even though the US does not occur on compound trials, the summed associative strength of the compound (V_CSA + V_CSB) is always higher than that of either CS alone. The model therefore incorrectly predicts that the compound stimulus will elicit a greater CR than the elements presented alone. Pearce's model, however, can readily account for the fact that animals and people respond more to CSA and CSB than to the compound stimulus. Pearce assumes that the compound stimulus is distinct from the elements that compose it; in effect, it is like a third CS. Using a computational formula much like the Rescorla-Wagner model, Pearce shows that animals rapidly learn to respond discriminatively on CSA, CSB, and CSA/CSB trials. By considering stimuli as configurations that are more than the sum of their parts, Pearce has provided a new and powerful explanation for several forms of learning.
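The contrast between the two approaches can be illustrated with a small simulation of negative patterning. The elemental branch below applies Equation 1 as written. The configural branch is a deliberately stripped-down stand-in that simply gives the compound its own unit; it is not Pearce's model, which also specifies how learning generalizes between similar configurations. The function name, alpha = 0.1, λ = 1, and the trial order are assumptions made for the example.

```python
def train_negative_patterning(configural, alpha=0.1, cycles=500):
    """Train A+, B+, AB- and return the predicted response per trial type.

    configural=False: the compound is the sum of elements A and B.
    configural=True:  the compound is a separate unit "AB" (a simplified
                      assumption, with no generalization between units).
    """
    V = {"A": 0.0, "B": 0.0, "AB": 0.0}
    trial_types = [(("A",), 1.0), (("B",), 1.0), (("A", "B"), 0.0)]

    def active_units(cues):
        if configural and len(cues) > 1:
            return ["AB"]
        return list(cues)

    for _ in range(cycles):
        for cues, lam in trial_types:
            units = active_units(cues)
            total = sum(V[u] for u in units)  # summed strength of active units
            for u in units:
                V[u] += alpha * (lam - total)

    compound = sum(V[u] for u in active_units(("A", "B")))
    return {"A": V["A"], "B": V["B"], "AB": compound}


elemental_pred = train_negative_patterning(configural=False)
configural_pred = train_negative_patterning(configural=True)
print("elemental :", {k: round(v, 3) for k, v in elemental_pred.items()})
print("configural:", {k: round(v, 3) for k, v in configural_pred.items()})
```

In the elemental run, the predicted response to the compound stays above the response to either element, which is the wrong ordering. In the simplified configural run, the compound can be learned about separately, so the elements end up predicting the US and the compound does not.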
Retrieving the Meaning of an Event
Of course, stimuli, whether they are CSs or words, often have ambiguous meanings. Consider, for example, the word vessel. This word has several different meanings; it may refer to a watercraft, a container, or a tube containing blood. How do we interpret the meanings of such ambiguous stimuli? Recent studies indicate that the context within which the stimulus occurs is important for understanding its meaning. For instance, sentences such as "The Titanic was a grand vessel" or "The aorta is the largest vessel in the body" provide a context in which to understand the meaning of the word vessel.
Pavlovian conditioning experiments in animals indicate the importance of context in understanding the meaning of ambiguous CSs. For example, if a CS is first paired with food for several trials, but then presented by itself (without food), the ability of the CS to elicit a CR is weakened. This process is called extinction. Although one might imagine (as the Rescorla-Wagner model predicts) that presenting the CS by itself results in unlearning of the CS-US memory, considerable data suggest that extinction itself represents new learning. In this case, animals learn that the CS predicts the absence of the US. So how does the animal know how to respond to this ambiguous CS? After all, the CS has predicted both the presence and the absence of the US at various times.
Elegant work by Mark E. Bouton at the University of Vermont has solved this enigma (Bouton, 1993). Bouton has found that the context in which conditioning and extinction occur is critical for determining how animals respond to the CS. If animals are conditioned (i.e., receive CS-US pairings) in one context but are extinguished (i.e., receive the CS by itself) in a different context, then the context controls how the animals respond. If they are placed in the context where the CS was presented by itself, they retrieve a memory that the CS predicts the absence of the US and show little conditional responding. In contrast, if the animals are placed in the conditioning context, they retrieve a memory that the US will follow the CS and show high levels of learned behavior. Hence, Bouton argues that contexts—which include physical environments, internal states (such as hunger), and even time—regulate the ability of an ambiguous CS to evoke a learned response.
It's All in the Timing
The theories of Rescorla, Wagner, Pearce, and Bouton all assume that the foundation of learning is an association, a bond or link formed between two or more stimuli. Pavlovian conditioning, for example, assumes that an association forms between the CS and US, and that this association is necessary for the CS to elicit a learned response (CR). In a radical departure from this fundamental assumption of associative learning theory, Randy Gallistel and John Gibbon have argued that the time at which events occur, not associations between the events, is at the heart of learning and memory (Gallistel and Gibbon, 2000).
In any conditioning experiment, stimuli occur in a temporal as well as a physical context. Years of animal and human research have yielded important information about the influence of timing on Pavlovian conditioning. For example, conditioning is often optimal when the CS is turned on slightly before the US. Long intervals between the onset of the CS and US tend to produce weak learning; this is commonly called the delay of reinforcement gradient. However, conditioning is related to both the delay of reinforcement and the amount of time that elapses between CS-US trials (the intertrial interval or ITI). Learning rates are comparable across short and long delays of reinforcement as long as the ITI is varied in accordance with the CS-US delay.
Gallistel and Gibbon's temporal model, which they call rate estimation theory, explains this and other properties of conditioning. In their model, animals compute the ratios between the temporal intervals associated with the delay of reinforcement and the ITI. These computations are then used to make decisions about how to respond to a particular stimulus. In general, conditioning improves when the ratio between the ITI and the delay of reinforcement is large. And, as mentioned before, increasing the delay of reinforcement results in weaker conditioning unless the ITI increases as well. These predictions are made without any appeal to associations between the CS and US; in essence, the animals are just keeping time.
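The role of the ratio can be made concrete with a small worked example. The sketch below only computes the ratio of the ITI to the delay of reinforcement for three hypothetical training protocols. It is not an implementation of rate estimation theory, which specifies more than this ratio, and the function name and the durations in seconds are invented for illustration.

```python
def iti_to_delay_ratio(iti_seconds, delay_seconds):
    """Ratio of the intertrial interval to the CS-US delay."""
    return iti_seconds / delay_seconds


# Hypothetical protocols: (delay of reinforcement, ITI), both in seconds.
protocols = {
    "short delay, short ITI": (10.0, 60.0),
    "long delay, short ITI": (30.0, 60.0),
    "long delay, scaled ITI": (30.0, 180.0),
}

ratios = {name: iti_to_delay_ratio(iti, delay)
          for name, (delay, iti) in protocols.items()}

for name, ratio in ratios.items():
    print(f"{name}: ITI/delay = {ratio:.1f}")
```

Lengthening the delay while holding the ITI fixed lowers the ratio, which corresponds to weaker conditioning in the account above. Lengthening the ITI in proportion restores the original ratio, which corresponds to comparable learning across short and long delays.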
Despite the computational power of rate estimation theory, the model faces a fundamental challenge: Does the brain learn this way? The anatomical and physiological facts about brain function seem to be more consistent with associative models of learning and memory. For example, sensory pathways in the brain transmit information about CSs, USs, contexts, and other stimuli. There is considerable convergence among these pathways in several brain areas thought to be involved in learning and memory, such as the hippocampus and amygdala. Neurons in the brain also exhibit changes in synaptic function after learning, and these changes are consistent with the associative nature of Pavlovian conditioning (Maren, 1999).
One of the promises of associative learning theory is that it is stimulating and guiding neuroscientists in their quest for discovering the brain mechanisms of learning and memory. The continued elaboration and refinement of learning theories are an integral component of understanding both brain and behavior in humans and animals.
See also: LEARNING THEORY: A HISTORY
Bouton, M. E. (1993). Context, time, and memory retrieval in the interference paradigms of Pavlovian conditioning. Psychological Bulletin 114, 80-99.
Gallistel, C. R., and Gibbon, J. (2000). Time, rate, and conditioning. Psychological Review 107, 219-275.
Maren, S. (1999). Long-term potentiation in the amygdala: A mechanism for emotional learning and memory. Trends in Neurosciences 22, 561-567.
Pearce, J. M. (1994). Similarity and discrimination: A selective review and connectionist model. Psychological Review, 587-607.
Rescorla, R. A., and Wagner, A. R. (1972). A theory of Pavlovian conditioning: Variations in the effectiveness of reinforcement and nonreinforcement. In A. H. Black and W. F. Prokasy, eds., Classical Conditioning II: Current Research and Theory. New York: Appleton-Century-Crofts.