Conditioning, Classical and Instrumental
Classical (Pavlovian) and instrumental (Thorndikian) conditioning are the two most widely employed paradigms for studying simple associative learning resulting from the organism's exposure to the temporal conjunction of two or more events. The fully specified classical conditioning paradigm consists of a set of operations involving an unconditioned stimulus (US) that reliably produces an unconditioned response (UR) and a conditioned stimulus (CS) initially shown not to produce a response resembling the UR. The CS and US are then presented repeatedly to the organism in a specified order and temporal spacing, and a response similar to the UR develops to the CS; this response is called the conditioned response (CR). That is, CS-CR functions are obtained. Control over the temporal conjunction of the CS, US, and UR makes classical conditioning preparations ideal vehicles for studying associative learning because they can uniquely specify stimulus antecedents to the target response. Various temporal arrangements of the CS and US give rise to different forms of classical conditioning (e.g., delay, trace, simultaneous). Classical conditioning is called classical reward conditioning if the US is a positive stimulus and classical defense conditioning if it is a negative stimulus. The positive or negative designation depends on an independent demonstration that the organism performs the instrumental responses necessary to obtain the US or to remove itself from the US, respectively. Two features distinguish classical from instrumental conditioning: presentation or omission of the US is independent of CR occurrence, and the definition of a CR is restricted to a target response selected from among those effector systems elicited as URs by the US. Adherence to both components of the definition of classical conditioning avoids common confusions and ambiguities with other associative learning paradigms commonly designated as "classical conditioning."
The designation "classical conditioning" has been applied to paradigms meeting only the requirement that CS and US be administered independent of the target response, ignoring selection of the UR. As a consequence, the term "classical conditioning" has been extended from Russian physiologist Ivan Petrovich Pavlov's CS-CR to stimulus-stimulus (S-S) conditioning paradigms involving principally conditioned stimulus-instrumental response (CS-IR) and autoshaping procedures. The CS-IR paradigms include conditioned suppression and other classical-instrumental transfer procedures in which the stimulus-stimulus pairings of classical conditioning are conducted with a CS and a biologically significant event (e.g., shock) but without measurement of a UR or CR. The CS is then presented during ongoing instrumental behavior and its facilitatory or disruptive effect on responding is measured; therefore, CS-IR functions are obtained. Autoshaping consists of response-independent presentations of a lighted manipulandum (e.g., lighted key) as a CS and activation of a food magazine as the US; the target response is contact with the manipulandum (e.g., key pecking). Key pecking is not an instrumental response, nor is it a UR appearing in the constellation of URs to food in the mouth (Woodruff and Williams, 1976). Hence, acquisition of a response in an effector system not elicited by the US qualifies autoshaping as a "new" associative learning paradigm.
Some discriminative approach procedures have been designated as "Pavlovian" simply because an explicit cue (CS) is presented and food or water, designated the US, is made available at a fixed time following CS onset (e.g., Holland and Rescorla, 1975). The approach behavior, by definition instrumental to receipt of the reinforcing event, has then been erroneously designated a CR. At present, the preceding paradigms are widely employed in the study of associative learning, but whether they will converge with the empirical laws of CS-CR paradigms has yet to be determined systematically. The S-S and discriminative approach paradigms lack the capability of CS-CR paradigms to exercise absolute control over the timing and sequencing of stimulus events and to identify the stimulus antecedents to the target response from the outset of training. In addition, with CS-IR and discriminative approach procedures, the target response is instrumentally conditioned. Consequently, these paradigms might be expected to be even less likely to display convergence with the empirical laws of CS-CR paradigms. In any event, despite the greater technical demands of measuring URs and CRs in CS-CR research, the methodological characteristics of CS-CR paradigms favor their use in the study of associative learning.
The CS-CR paradigms are ideally suited for the study of the biological substrates of associative learning because the target response is defined anatomically by a set of movements or secretions. The US's elicitation of the UR permits identification of the target response's final common neural pathway(s) outside the conditioning situation and, thereby, affords the opportunity to observe changes in its activity from the start of conditioning (Thompson, 1976). In contrast, S-S contingency and discriminative approach paradigms are inherently unsuitable for studying the biological basis of learning. First, in CS-IR paradigms, changes in the instrumental target response are not the consequences of its participation in the learning process. Rather, the changes are the result of interactions of hypothetical (unobserved) CRs with the CS that are governed by prior CS-US pairings. As a consequence, any neural analysis of learning that is directed at changes in the target response is pointless. Second, since the target response in the discriminative approach paradigm is outcome-defined, a wide variety of different body movements can yield the required outcome. Therefore, it is virtually impossible to identify a final common pathway for the movements that make up the response.
The associative nature of a conditioning preparation has come to be determined by the contiguous occurrence of the CS and US and a set of control operations intended to estimate the contribution of other possible processes to responding. All response systems show some level of baseline activity, often raised by US presentations, which can produce an accidental coincidence of the CS and target response. Moreover, the likelihood of a target response to the CS may be systematically affected by alpha responses, which are URs to the CS in the same effector system as the target response, and by pseudo-conditioned and sensitized responses established on the basis of prior US-alone presentations. A detailing of the latency, duration, amplitude, and course of habituation of the alpha response with a control group given CS-alone presentations can provide a basis for eliminating alphas from consideration as CRs, since they are usually of a shorter latency than CRs. Hence, if a sufficiently long CS-US interval is employed, both alphas and CRs can be observed in the interval and scored accordingly (Gormezano, 1966; Gormezano et al., 1983).
The reinstatement or augmentation of alphas to the CS through US-alone presentations or CS-US pairings is referred to as sensitization. After eliminating alphas from consideration, the contribution of pseudo-CRs to CR measurement can be assessed by presentations of the US one or more times prior to CS presentations. The procedure frequently results in responses to the CS, labeled pseudo-CRs, which are treated separately from CRs because of their occurrence in the absence of CS-US pairings. However, the US-alone procedure precludes trial-by-trial assessment of pseudo-CRs for comparison with CRs. Accordingly, a single unpaired control procedure has evolved in which CS-alone and US-alone trials are presented randomly the same number of times as in the paired CS-US group, but at variable CS-US intervals exceeding those effective for CR acquisition. Under the unpaired control, responses on CS trials (excluding alphas) provide a summative measure of pseudo-CRs and baseline responses.
Use of the unpaired control is based on two associative assumptions: that temporal contiguity of the CS and US is necessary for CR acquisition, and that responding produced by the unpaired control is nonassociative, since the randomized sequencing of CSs and USs at exceedingly long, random intervals prevents any CS-US contiguity effects. However, associative theory and its unpaired control methodology have been challenged by a contingency hypothesis (Prokasy, 1965; Rescorla, 1967), which asserts that associative learning in classical conditioning can be viewed as determined by the statistical relationship between the CS and US. The hypothesis assumes that if US probability is greater in the presence of the CS than in its absence, a positive contingency prevails and excitatory associative effects would accrue to the CS; conversely, if US probability is higher in the absence than in the presence of the CS, the negative contingency would yield inhibitory associative effects. Moreover, the contingency hypothesis assumes the unpaired control's perfectly negative contingency would lead the CS to acquire inhibitory associative effects. Hence, Rescorla (1967) proposed a truly random control to provide an associatively neutral condition for assessing excitatory and inhibitory conditioning.
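The prediction rule of the contingency hypothesis can be written down compactly. The following Python sketch is only an illustration of the rule as stated above; the function name and its labels are hypothetical, and it says nothing about whether the hypothesis is empirically correct.

```python
def predicted_effect(p_us_given_cs: float, p_us_given_no_cs: float) -> str:
    """Return the associative effect the contingency hypothesis predicts.

    The two arguments are the US probabilities in the presence and in the
    absence of the CS. The function name and return labels are illustrative
    assumptions, not terms taken from the cited literature.
    """
    for p in (p_us_given_cs, p_us_given_no_cs):
        if not 0.0 <= p <= 1.0:
            raise ValueError("probabilities must lie in [0, 1]")
    if p_us_given_cs > p_us_given_no_cs:
        return "excitatory"   # positive contingency
    if p_us_given_cs < p_us_given_no_cs:
        return "inhibitory"   # negative contingency
    return "neutral"          # equal probabilities: truly random control


# Unpaired control: the US never occurs in the presence of the CS.
print(predicted_effect(0.0, 0.4))
# Truly random control: equal US probability with and without the CS.
print(predicted_effect(0.2, 0.2))
```

Under this rule the unpaired control falls on the inhibitory side, which is the basis of the objection raised against it.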
Rescorla (1967) specified the truly random control as involving independent programming of the CS and US or equal US probabilities in the presence and absence of the CS. However, what counts as pairing or unpairing cannot be determined a priori, only empirically. CS-US pairing is specified by the CS-US intervals demonstrated to produce CR acquisition for a specific preparation, while "explicitly unpaired" denotes the use of stimulus intervals outside the effective CS-US intervals. Consequently, in the absence of an empirically derived metric (i.e., effective CS-US conditioning intervals) to designate paired and unpaired conditions, it is virtually impossible to program an associatively neutral truly random control or predictable excitatory or inhibitory conditioning groups. Rescorla, seeking to validate the truly random control, reported that its CS had no effect upon avoidance conditioning (Rescorla, 1966) or upon responding in a CS-IR study where the shock-US probabilities in the presence and absence of the CS were equal (Rescorla, 1968). However, these findings were challenged by CS-IR studies revealing that trial number and frequency of chance CS-US pairings under a truly random control could substantially affect (excitatory) conditioning (Gormezano and Kehoe, 1975). Subsequently, Rescorla (1972) disavowed the contingency hypothesis and truly random control and reverted to the use of the unpaired control. Nevertheless, the truly random control is still widely employed despite the detailing of additional methodological limitations (Papini and Bitterman, 1990; Wasserman, 1989).
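As a rough illustration of the chance pairings at issue, the following Python sketch programs the CS and US independently over discrete time bins and counts how many CSs happen to be followed by a US within a short window. The bin structure, the parameter names, and the window length are assumptions made for illustration; as noted above, the effective CS-US interval for any real preparation can only be established empirically.

```python
import random


def chance_pairings(n_bins, p_cs, p_us, window, seed=0):
    """Count chance CS-US pairings under independent programming.

    Each time bin contains a CS with probability p_cs and, independently,
    a US with probability p_us. A CS counts as "paired" if a US occurs in
    any of the next `window` bins. All parameters are hypothetical.
    Returns (number of CSs, number of CSs followed by a US in the window).
    """
    rng = random.Random(seed)
    cs = [rng.random() < p_cs for _ in range(n_bins)]
    us = [rng.random() < p_us for _ in range(n_bins)]
    paired = 0
    for t in range(n_bins):
        # Look only at the bins after the CS, up to the window length.
        if cs[t] and any(us[t + 1 : t + 1 + window]):
            paired += 1
    return sum(cs), paired


n_cs, n_paired = chance_pairings(n_bins=2000, p_cs=0.05, p_us=0.05, window=3)
print(f"{n_paired} of {n_cs} CSs were followed by a US within the window")
```

Even with independent programming, the count of such pairings is generally above zero and grows with the number of trials, which is the kind of effect the challenge to the truly random control turns on.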
Instrumental conditioning procedures are all characterized by a contingent relationship between the organism's response and a stimulus. Typically, if the stimulus increases, decreases, or leaves unaffected the probability of the response, it is identified as positive, negative, or neutral, respectively. Although such labeling appears to be circular, Edward Thorndike's (1913) characterizations of stimuli as "satisfying" (positive) and "annoying" (negative) were not circular because they were specified by behavior changes independent of the target response. Noncircularity can also be achieved by demonstrating transitivity of stimulus effects on the target response to other (new) responses.
A positive or negative contingency between the target response and reinforcing stimulus gives rise to a variety of instrumental conditioning paradigms. The five most extensively studied are reward, punishment, omission, escape, and avoidance, which derive from responses producing a positive (reward) or a negative (punishment) stimulus; preventing a positive (omission) or negative (avoidance) stimulus from occurring; and terminating a negative stimulus (escape). Woods (1974), employing a classification schema that includes operant conditioning and presence or absence of a discriminative stimulus, enumerated sixteen instrumental conditioning paradigms. However, despite repeated attempts at conceptual clarification, the operant remains devoid of causal stimulus antecedents (Coleman, 1981) and, consequently, it cannot be employed to study associative learning. The operant is applicable only to the study of performance variables affecting postasymptotic or steady-state responding. Aside from the study of discrimination learning processes, discriminative instrumental conditioning paradigms are widely employed to assess the effects of concurrent classical conditioning of "fear" or "incentive motivation" to the discriminative stimulus upon the instrumental target response.
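The five paradigms named above follow from crossing what the response does with the sign of the stimulus. A minimal Python sketch of that mapping is given below; the labels "produces", "prevents", and "terminates" and the function name are illustrative choices, and the table covers only the five paradigms listed in the text, not Woods's (1974) full taxonomy.

```python
from typing import Optional

# (effect of the response on the stimulus, sign of the stimulus) -> paradigm
PARADIGMS = {
    ("produces", "positive"): "reward",
    ("produces", "negative"): "punishment",
    ("prevents", "positive"): "omission",
    ("prevents", "negative"): "avoidance",
    ("terminates", "negative"): "escape",
}


def paradigm(response_effect: str, stimulus: str) -> Optional[str]:
    """Return the paradigm name, or None for combinations not listed above."""
    return PARADIGMS.get((response_effect, stimulus))


print(paradigm("prevents", "negative"))    # avoidance
print(paradigm("terminates", "negative"))  # escape
```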
Any occurrence of the target response without prior conjunction with the reinforcing stimulus is designated a nonassociative response attributable to (1) base rate; (2) independent presentations of the reinforcing stimulus; and (3) presentations of the reinforcing stimulus independent of the target response. Implementing controls for the first two factors is self-evident, and achieving a control for the third factor has been essentially limited to the yoked-control design. In the design, pairs of subjects are selected and one of them is randomly designated the experimental and the other the control subject. During conditioning, when the experimental subject performs the target response, the contingent event is received by both subjects. Therefore, both members of the pair receive the same number and temporal distribution of stimulus events; the only difference is that the experimental subject always receives the reinforcing event after execution of the target response, while the yoked partner receives the reinforcing event independent of execution of the response.
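The logic of the yoked-control design can be sketched in a few lines of Python. The sketch assumes, purely for illustration, that every target response by the experimental subject is reinforced and that responses are recorded as integer time stamps; the function and field names are hypothetical.

```python
def yoked_schedule(experimental_responses, control_responses):
    """Derive the stimulus events for a yoked pair.

    The reinforcing event is delivered to both subjects whenever the
    experimental subject responds, so both receive the same number and
    temporal distribution of events. Any event that coincides with a
    response of the control subject does so only by accident.
    """
    deliveries = sorted(experimental_responses)
    control_set = set(control_responses)
    return {
        "experimental": deliveries,
        "control": list(deliveries),
        "control_coincidences": [t for t in deliveries if t in control_set],
    }


events = yoked_schedule([3, 8, 12], [1, 8, 20])
print(events["control"])               # same events as the experimental subject
print(events["control_coincidences"])  # accidental response-event conjunctions
```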
Thus, the yoked-control design appears to be admirably suited to test the (null) hypothesis that the temporal relationship between the response and subsequent stimulus event is irrelevant to the observed behavior change. Unfortunately, the design confounds within-subject sources of random error with the treatment effect: Control of stimulus events by experimental subjects can allow for systematic differences in the number of experimental subjects that are more affected by the stimulus event than their yoked partners. The possibility of such confounding has rendered the results of yoked-control designs necessarily ambiguous. As a consequence, a means for assessing the contribution of the third nonassociative factor to instrumental conditioning has not yet been achieved.
Coleman, S. R. (1981). Historical context and systematic functions of the concept of the operant. Behaviorism 9, 207-226.
Gormezano, I. (1966). Classical conditioning. In J. B. Sidowski, ed., Experimental methods and instrumentation in psychology. New York: McGraw-Hill.
Gormezano, I., and Kehoe, E. J. (1975). Classical conditioning: Some methodological-conceptual issues. In W. K. Estes, ed., Handbook of learning and cognitive processes, Vol. 2: Conditioning and behavior theory. Hillsdale, NJ: Erlbaum.
Gormezano, I., Kehoe, E. J., and Marshall, B. S. (1983). Twenty years of classical conditioning research with the rabbit. In J. M. Sprague and A. N. Epstein, eds., Progress in psychobiology and physiological psychology, Vol. 10. New York: Academic Press.
Holland, P. C., and Rescorla, R. A. (1975). Second-order conditioning with food unconditioned stimulus. Journal of Comparative and Physiological Psychology 88, 459-467.
Papini, M. R., and Bitterman, M. E. (1990). The role of contingency in classical conditioning. Psychological Review 97, 396-403.
Prokasy, W. F. (1965). Classical eyelid conditioning. Experimenter operations, task demands, and response shaping. In W. F. Prokasy, ed., Classical conditioning: A symposium. New York: Appleton-Century-Crofts.
Rescorla, R. A. (1966). Predictability and number of pairings in Pavlovian fear conditioning. Psychonomic Science 4, 383-384.
Rescorla, R. A. (1967). Pavlovian conditioning and its proper control procedures. Psychological Review 74, 71-80.
Rescorla, R. A. (1968). Probability of shock in the presence and absence of CS in fear conditioning. Journal of Comparative and Physiological Psychology 66, 1-5.
Rescorla, R. A. (1972). Informational variables in Pavlovian conditioning. In G. H. Bower and J. T. Spence, eds., Psychology of Learning and Motivation. New York: Academic Press.
Thompson, R. F. (1976). The search for the engram. American Psychologist 31, 209-227.
Wasserman, E. A. (1989). Pavlovian conditioning: Is contiguity irrelevant? American Psychologist 44, 1550-1551.
Woodruff, G., and Williams, D. R. (1976). The associative relation underlying autoshaping in the pigeon. Journal of the Experimental Analysis of Behavior 26, 1-13.
Woods, P. J. (1974). A taxonomy of instrumental conditioning. American Psychologist 29, 584-597.