Simply put, operant conditioning refers to a systematic program of rewards and punishments used to influence behavior or bring about desired behavior. Operant conditioning relies on two basic assumptions about human experience and psychology: (1) a particular act results in an experience that is a consequence of that act, and (2) the perceived quality of an act's consequence affects future behavior. In addition, a central idea of operant conditioning holds that the main influences on behavior are external—that is, a person's behavior is programmed by his or her external environment.
The Harvard psychologist B.F. Skinner pioneered the field of behaviorism in the late 1930s and continued to contribute to it through the mid-1970s. Operant conditioning is one of the key concepts of this school of psychology. Skinner called his brand of conditioning “operant conditioning” to distinguish it from the conditioning theory developed by the Russian physiologist Ivan Pavlov, now referred to as “classical conditioning.” Classical conditioning primarily concerned itself with reflexive or unlearned behavior, such as the jerking of a knee upon being tapped with a hammer. In a famous experiment, Pavlov trained dogs to salivate in expectation of food at the sound of a bell. Operant conditioning, by contrast, deals with learned, not reflexive, behavior; it works by reinforcing (rewarding) and punishing behavior based on the consequences it produces. Reinforcement is used to increase the probability that a behavior will occur in the future, whereas punishment aims to decrease that probability. The process of removing reinforcement from an act is called extinction.
Organizational management literature often refers to operant conditioning as part of reinforcement theory and work behavior modification. Unlike other theories of management and motivation, operant conditioning does not rely on attitudes, beliefs, intentions, and motivation for predicting and influencing behavior, although Skinner and other behaviorists do not suggest that these factors do not exist. Instead, they posit that these notions find their genesis in external conditions and reinforcement. Hence, organizational management theorists who adopt this approach look to external factors—the environment—to explain and influence behavior within the workplace. For example, this approach to management views motivation as a product of workers' environments, not as an internal quality of each individual worker's psychological makeup. Therefore, employees are highly motivated whenever quality work is reinforced with pay raises, promotions, and other conditions that employees find desirable.
Since most of the behavior taking place in a business is learned rather than reflexive, operant conditioning can be applied to organizational management. Workers learn various kinds of behavior before and after joining a company, and they encounter a host of stimuli in a company setting that can cause them to behave in certain ways with certain consequences. These kinds of behaviors are rewarded and punished depending on their value to a company. The stimuli in the workplace include schedules, corporate structures, company policies, telephone calls, managers, and so on. The consequences of workplace behavior include approval or disapproval from managers and coworkers, promotions, demotions, pay increases, and so forth. When consequences are directly linked to certain kinds of behavior, they are contingent on these kinds of behavior. The classic example is touching a hot stove and experiencing the immediate consequence of being burned.
However, most consequences in a company are only partially contingent on the behavior (performance) of employees, and thus there are often entire networks of relationships between employee behavior and its consequences. These relationships are called schedules of reinforcement, and applying operant conditioning to the work place means controlling these schedules.
Reinforcement schedules are either continuous or intermittent (partial). Continuous reinforcement schedules are those situations in which every occurrence of an act is reinforced. In contrast, intermittent schedules are those situations in which only some instances of an act are reinforced. Continuous reinforcement schedules generally facilitate new learning or the acquisition of new skills at the fastest rate. New employees learning how to process customer orders, for example, will learn the proper procedure the fastest if they are reinforced every time they take an order correctly. However, if a continuous schedule is suspended outright after being implemented for any substantial period, the behavior being reinforced might stop altogether. In addition, after a certain kind of behavior has been learned, it will occur more often if reinforced intermittently. Hence, employees who have learned the proper procedure for taking customer orders have the greatest likelihood of continuing to do so correctly if managers adopt an intermittent schedule after the behavior has been learned.
Moreover, reinforcement can be positive (adding something new, such as a raise or a promotion) or negative (removing something from the work environment, such as the constant supervision imposed on new employees until they demonstrate they have sufficiently learned their jobs). Negative reinforcement, however, should not be confused with punishment, which involves undesirable or aversive consequences and decreases the probability of an act being repeated. Negative reinforcement, rather, is a kind of reward that removes constraints or other elements from the work environment to encourage employee behavior.
As Nelson and Quick observe, reward systems should have corresponding strategic significance to the organization. Strategically designed reward systems enable organizations to achieve the objectives of motivating behavior and encouraging further accomplishments among employees while at the same time advancing the core objectives of the organization. Managers who adopt a strategic approach in their positive reinforcement actions present employees with rewards that have long-term implications (such as training, educational programs, and awards that demonstrate recognition of employees' achievements). For example, Marriott International annually honors its best twenty employees with the J. Willard Marriott Award of Excellence.
Organizational behavior modification (OBM) is the form of operant conditioning most commonly practiced by organizations because of its relevance to organizational management and its wide-ranging options for implementation. Social, nonfinancial, and financial reinforcements are the key pillars of the OBM concept. According to Nelson and Quick, a comprehensive review of past research on OBM found improved performance in both service-sector and manufacturing-sector organizations that adopted the approach. The review also revealed that monetary reinforcement had a greater positive impact on performance than other forms of rewards and recognition.
Because some behavior is so complex that it does not occur all at once, managers must reinforce progressive approximations of the desired behavior. This process begins with the reinforcement of behavior that may barely resemble the desired behavior, using a continuous reinforcement schedule with a progressive standard. Consequently, behavior must show improvement or greater approximation of the desired behavior to receive reinforcement as time goes on.
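This process of reinforcing progressive approximations can be sketched as a small simulation. All numbers here are illustrative assumptions, not values from the text: a hypothetical learner emits variable responses, any response that reaches the current criterion is reinforced, and reinforcement shifts the learner's typical behavior toward that criterion, after which the criterion is raised toward the target.

```python
import random

def shape(target=10.0, start=0.0, steps=500, seed=1):
    """Toy model of reinforcing progressive approximations.

    Responses vary around the learner's current typical level. A
    response is reinforced when it meets the current criterion;
    reinforcement shifts the typical level toward that criterion,
    and the criterion is then raised toward the target (a progressive
    standard on a continuous schedule, as described in the text).
    """
    rng = random.Random(seed)
    typical = start                       # learner's current typical response level
    criterion = min(target, start + 1.0)  # a rough approximation is accepted at first
    for _ in range(steps):
        response = typical + rng.uniform(-1.0, 2.0)    # behavior is variable
        if response >= criterion:                      # close enough: reinforce
            typical = 0.8 * typical + 0.2 * criterion  # reinforced level recurs
            criterion = min(target, typical + 1.0)     # demand a closer approximation
    return typical, criterion

final_level, final_criterion = shape()
```

Early in the run almost any rough approximation earns reinforcement; by the end, only responses at or near the target do, which is the progressive standard the paragraph describes.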
When managers wish to discourage certain kinds of behavior or decrease the probability of their occurrence, they can implement a schedule of punishment along the lines of a schedule of reinforcement. Punishment involves
the application of undesirable consequences or the removal of positive consequences following undesired behavior. However, negative consequences must be meted out with consideration of how they will affect individual workers, because what constitutes punishment for one worker may not for another. Ultimately, these consequences or stimuli must be linked to the undesired behavior and decrease the probability of its recurring in order to constitute punishment in the technical sense of the operant conditioning approach. Moreover, effective punishment usually embodies the following qualities: it is consistent, immediate, impersonal, and contingent on specific behavior. Finally, punishment should be informative—letting employees know why they are being punished—and employees should recognize that future punishment can be avoided by refraining from the undesired behavior.
Operant conditioning has been successfully applied in many settings: clinical, for individual behavior modification; teaching, for classroom management; instructional development, for programmed instruction; and management, for organizational behavior modification.
SEE ALSO Motivation and Motivation Theory; Organizational Behavior
Elfenbein, H.A. “Emotion in Organizations: A Review and Theoretical Integration.” Academy of Management Annals Vol. 1, December 2007, 316–331.
Geiser, Robert L. Behavior Mod and the Managed Society. Boston: Beacon Press, 1976.
Hinkin, T.R., and C.A. Schriesheim. “If You Don't Hear From Me You Know You Are Doing Fine: The Effects of Management Nonresponse to Employee Performance.” Cornell Hotel & Restaurant Administration Quarterly, November 2004, 362–372.
Lutz, J. Learning and Memory. 2nd ed. Long Grove, IL: Waveland Press, 2004.
Malott, R.W., and E.A. Trojan. Principles of Behavior. 5th ed. Upper Saddle River, NJ: Pearson/Prentice Hall, 2004.
Nadler, Leonard, and Zeace Nadler. The Handbook of Human Resource Development. 2nd ed. New York, NY: John Wiley & Sons, 1990.
Nelson, D.L., and J.C. Quick. Organizational Behavior. Mason, OH: Thomson South-Western, 2006.
Pinder, Craig C. Work Motivation: Theory, Issues, and Applications. Glenview, IL: Scott, Foresman and Company, 1984.
Skinner, B.F. About Behaviorism. New York, NY: Alfred A. Knopf, 1974.
Smith, P., and A. Dyson. “Get with the Programme.” The Safety and Health Practitioner, December 2004, 38–40.
Learning is an important topic for the social sciences. It can explain much of development, for example, and can be used in many applied settings (such as educational or clinical). There are various perspectives on learning. One of the most useful involves operant conditioning.
An operant is a voluntary behavior that is used in order to obtain a reinforcer or avoid a punisher. Operant Conditioning uses reinforcement and punishment systematically to facilitate learning. It is distinguished, first, by its focus on voluntary behaviors (“operants”) and, second, by its emphasis on the consequences of behavior. Other learning theories emphasize antecedents rather than consequences, and involuntary behaviors, or reflexes, rather than operants. The Classical Conditioning perspective, for instance, focuses on antecedents and reflexes. The founder of Classical Conditioning, Ivan Pavlov, used a bell as an antecedent stimulus in his well-known research with dogs. These dogs salivated after the ringing of the bell had been repeatedly associated with meat powder. Salivation is, of course, a reflex. Though this emphasis on reflexes implies that Classical Conditioning may not be as useful as Operant Conditioning, it did influence the development of the Lamaze birthing technique and has been adapted to the systematic desensitization of phobias and other problems that involve physiology and reflexes.
Another perspective, called Social Learning theory, is more consistent with Operant theory than is Classical Conditioning. It emphasizes modeling and observational learning, but it recognizes the impact of consequences. For example, a child might observe a television hero who is richly rewarded for some altruistic behavior; although the child watching is not reinforced, the child imitates the hero. Observational learning of this sort is sometimes described as “vicarious reinforcement.”
Reinforcement makes operant behavior more likely to occur in the future. Punishment makes operants less likely to occur. Reinforcement can be used in an operant procedure known as shaping, sometimes called the Method of Successive Approximation: reinforcement is given to behaviors that increasingly resemble a target behavior, until gradually the individual does in fact display that target behavior. Fading involves reinforcing one behavior simultaneously with prompts or assistance of some kind. The assistance is gradually withdrawn, or faded, until the behavior is emitted without any prompts. Undesirable behaviors can be eliminated from behavioral repertoires by punishing them. Alternatively, it is sometimes more appropriate to identify the reinforcers that are supporting the undesirable behavior and simply eliminate them; in this fashion, punishment is unnecessary. If behavior is not supported by reinforcement, it becomes extinct. This is the rationale for time-outs: individuals are placed in a setting where reinforcement is unavailable, so that inappropriate behavior is no longer supported.
Consequences are not always effective in controlling behavior. Indeed, one of the most important steps when using operant procedures involves the accurate identification of reinforcers and punishers. There are many idiosyncrasies; what controls the behavior of one person often has no impact on another. The effectiveness of consequences is determined in part by the type and amount given, but also by deprivation (hunger), the gradient (the interval between the behavior and the consequence), and the schedule (the number of behaviors that must be emitted to earn a reinforcer). In general, a shorter interval ensures that consequences are maximally effective. Early on it is also important to use a continuous schedule, with a one-to-one ratio (every single instance of the behavior earns the consequence). Larger ratios are useful to program generalization and maintenance. It is typically best to start with a continuous schedule, but then thin the schedule such that the individual must emit two, then three, then five, then ten behaviors to earn one reinforcer. Similarly, variable schedules can be used to ensure that behavior is resistant to extinction. In a variable (or intermittent) schedule the ratio of behaviors to reinforcers fluctuates (3:1, then 5:1, then 2:1, then 7:1, and so on). B. F. Skinner demonstrated each of these operant concepts using highly controlled laboratory experiments, typically with nonhuman species.
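The ratio schedules and thinning sequence just described can be sketched in a few lines. This is a minimal sketch of the delivery rules only; the thinning progression (1:1, then 2, 3, 5, 10) and the fluctuating ratios (3, 5, 2, 7) are the illustrative figures from the text, and the 100-response run length is an assumption:

```python
import random

def fixed_ratio_reinforcers(n_responses, ratio):
    """Fixed-ratio schedule: one reinforcer per `ratio` responses."""
    return n_responses // ratio

def variable_ratio_reinforcers(n_responses, ratios, seed=0):
    """Variable (intermittent) schedule: the required response count is
    redrawn after every reinforcer, so the ratio fluctuates unpredictably."""
    rng = random.Random(seed)
    needed, count, earned = rng.choice(ratios), 0, 0
    for _ in range(n_responses):
        count += 1
        if count >= needed:          # ratio met: deliver a reinforcer
            earned += 1
            count, needed = 0, rng.choice(ratios)
    return earned

# Thinning: the same 100 responses earn progressively fewer reinforcers
# as the schedule moves from continuous (1:1) to leaner ratios.
thinning = [fixed_ratio_reinforcers(100, r) for r in (1, 2, 3, 5, 10)]
# thinning == [100, 50, 33, 20, 10]

vr = variable_ratio_reinforcers(100, ratios=(3, 5, 2, 7))
```

Under the variable schedule the individual has no way to detect when reinforcement has stopped, which is why such schedules leave behavior resistant to extinction.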
At one point Skinner drew from operant principles to develop a highly controlled crib for his own daughter. It kept her environment at an ideal temperature, with controlled lighting and visual stimuli. This reflects Skinner’s emphasis on environmental control. The environment sometimes influences behavior in ways that one does not even notice, such as visual distraction or temperature, and sometimes the environmental influence takes the form of obvious consequences to one’s actions, such as reinforcers and punishers. Skinner felt that one could retain free will only if one maintained an awareness of the environmental and experiential influences on one’s behavior. The apparatus just described is sometimes called a Skinner Box, though that name is also sometimes used to describe the operant chamber in which rats are trained via reinforcement. Operant chambers automatically monitor and reinforce particular behaviors (for example, pressing down on a bar). They provide a high level of experimental control, which was typical of Skinner’s work. He felt that objectivity and experimental control were necessary if psychology was to be scientific. He once stated that sciences are only valid if they can “predict and control.”
Additional research by Skinner and others has demonstrated that operant conditioning is highly effective when used systematically in educational or clinical settings. In fact, according to Skinner, operant conditioning also occurs spontaneously in the natural environment. Parents, for example, may reinforce or punish behavior without really intending to do so (or at least without relying on operant theory to make their decisions). Skinner’s 1948 novel, Walden Two, describes a segment of the population that employs operant principles in a kind of utopia. The key idea is that consequences dramatically influence behavior and, as noted above, in order to exercise free will, people should be aware of operants’ effects and use them in a systematic and beneficial fashion.
Skinner, Burrhus Frederick. 1948. Walden Two. New York: Macmillan.
Skinner, Burrhus Frederick. 1953. Science and Human Behavior. New York: Macmillan.
Skinner, Burrhus Frederick. 1976. About Behaviorism. New York: Vintage.
Mark A. Runco
Approach to human learning based on the premise that human intelligence and will operate on the environment rather than merely respond to the environment's stimuli.
Operant conditioning is an elaboration of classical conditioning. It holds that human learning is more complex than the model developed by Ivan Pavlov (1849-1936) and involves human intelligence and will operating (thus the name) on the environment rather than being a slave to stimuli.
The Pavlovian model of classical conditioning was revolutionary in its time but eventually came to be seen as limited in its application to most human behavior, which is far more complex than a series of automatic responses to various stimuli. B.F. Skinner (1904-1990) elaborated on this concept by introducing the idea of consequences into the behaviorist formula of human learning. Pavlov's classical conditioning explained behavior strictly in terms of stimuli, demonstrating a causal relationship between stimuli and behavior. In Pavlov's model, humans responded to stimuli in specific, predictable ways. According to Skinner, however, behavior is far more complex, allowing for the introduction of choice and free will. According to operant conditioning, the likelihood that a behavior will be repeated depends to a great degree on the amount of pleasure (or pain) that behavior has caused or brought about in the past. Skinner also added to the vocabulary of behaviorism the concepts of negative and positive reinforcement and of punishment.

Reinforcement: the frequency of a behavior is increased as a consequence of the subject's own behavior. When a person receives reinforcement after engaging in some behavior, the person is likely to repeat that behavior (positive reinforcement); when a person experiences a negative state and does something to eliminate the undesired state, the person is likely to repeat that behavior (negative reinforcement).

Punishment: the frequency of a behavior is decreased as a consequence of the subject's own behavior. When a person engages in a behavior and something negative is applied as a result, that behavior is less likely to be repeated (positive punishment); when a person engages in a behavior and something positive is taken away, that behavior is less likely to be repeated (negative punishment).
According to the Skinner model of operant conditioning, humans learn behaviors through a trial-and-error process whereby they remember which behaviors elicited positive, or pleasurable, responses and which elicited negative ones. He derived these theories from observing the behaviors of rats and pigeons isolated in what have come to be known as Skinner boxes. Inside the boxes, rats that had been deprived of food were presented with a lever that, when pushed, would drop a pellet of food into the cage. Of course, the rat could not know this, and so the first time it hit the lever it was purely accidental, the result of what Skinner called random trial-and-error behavior. Eventually, however, the rat would "learn" that hitting the lever resulted in the appearance of food, and it would continue doing so. Receiving the food, then, in the language of operant conditioning, is considered the reinforcer, while hitting the lever becomes the operant—the way the organism operates on its environment.
Skinner's model of operant conditioning broke reinforcement down into four kinds of "schedules of reinforcement" in order to study their effects on behavior: fixed interval, variable interval, fixed ratio, and variable ratio. In a fixed interval schedule experiment, the lever in the rat's box provides food only at a fixed rate, regardless of how often the rat presses the lever—for example, no more than once every 60 seconds. Eventually, the rat adapts to this schedule, pressing the lever more frequently as each 60-second interval nears its end. In variable interval experiments, the lever becomes active at random intervals; rats adapt by pressing the lever at a steadier, more evenly spaced rate. An experiment using a fixed ratio schedule uses a lever that becomes active only after the rat presses it a specific number of times, and in a variable ratio experiment the number of presses between activations is random. In each case the rats' behavior adapts to the conditions and is adjusted to yield the most rewards.
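The two interval schedules can be made concrete with a short sketch. This models only the delivery rule (when the lever "pays off"), not the rat's adaptive behavior; the press pattern, the 60-second interval, and the candidate variable intervals are illustrative assumptions:

```python
import random

def fixed_interval_rewards(press_times, interval=60):
    """Fixed interval: after each reward the lever is inactive until
    `interval` seconds elapse; the next press after that is rewarded."""
    rewards, active_at = 0, interval
    for t in sorted(press_times):
        if t >= active_at:
            rewards += 1
            active_at = t + interval
    return rewards

def variable_interval_rewards(press_times, intervals=(30, 60, 90), seed=0):
    """Variable interval: the dead time after each reward is drawn at
    random, so the rat cannot predict when the lever is active again."""
    rng = random.Random(seed)
    rewards, active_at = 0, rng.choice(intervals)
    for t in sorted(press_times):
        if t >= active_at:
            rewards += 1
            active_at = t + rng.choice(intervals)
    return rewards

# A rat pressing once every 10 seconds for 10 minutes earns at most
# one reward per minute on the fixed 60-second schedule.
presses = range(0, 600, 10)
fi = fixed_interval_rewards(presses, interval=60)   # fi == 9
vi = variable_interval_rewards(presses)
```

However fast the rat presses under the fixed schedule, the reward count is capped by the interval, which is why animals learn to concentrate their presses near the end of each interval.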
The real-world ramifications of operant conditioning experiments are easy to imagine, and many of the experiments described would probably sound very familiar to parents who use such systems of rewards and punishments on a daily basis with their children regardless of whether they have ever heard of B.F. Skinner. His model has been used by learning theorists of various sorts to describe all kinds of human behaviors. Since the 1960s, however, behaviorism has taken a back seat to cognitive theories of learning, although few dispute the elementary tenets of operant conditioning and their use in the acquisition of rudimentary adaptive behaviors.