Reinforcement, Positive and Negative
Reinforcement is the procedure of presenting or removing stimuli (reinforcers) to maintain or increase the frequency or likelihood of a response. The term is also applied to the underlying process that leads to reinforcement or to the actual act of reinforcement, but many psychologists discourage such broad usage. Reinforcement is usually divided into two types: positive and negative.
A negative reinforcer is a stimulus that, when removed after a response, increases the frequency or likelihood of that response. Negative reinforcers can range from uncomfortable physical sensations or interpersonal situations to stimuli causing severe physical distress. The sound of an alarm clock is an example of a negative reinforcer. Assuming that the sound is unpleasant, turning it off serves to reinforce getting out of bed. A positive reinforcer is a stimulus that increases the frequency or likelihood of a response when its presentation is made contingent upon that response. Candy given to a child for cleaning his or her room is an example of a positive reinforcer.
Reinforcers can be further classified as primary or conditioned. Primary reinforcers naturally reinforce an organism. Their reinforcing properties are not learned. They are usually biological in nature and satisfy physiological needs. Examples include air, food, and water. Conditioned reinforcers do not serve to reinforce responses prior to conditioning. They are initially neutral with respect to the response in question, but, when repeatedly paired with a primary reinforcer, they develop the power to increase or maintain a response. Conditioned reinforcers are also called secondary reinforcers.
Reinforcement as a theoretical concept in psychology can be traced back to Russian physiologist Ivan P. Pavlov (1849–1936) and American psychologist Edward L. Thorndike (1874–1949), who both studied conditioning and learning in animals in the early 1900s. Pavlov developed the general procedures and terminology for studying what is now called classical conditioning. This term refers to both the experimental procedure and the type of learning that occurs within that procedure. Pavlov’s experiments involved giving a hungry dog dry meat powder every few minutes. The presentation of the meat powder was consistently paired with a bell tone. The meat powder made the dog salivate, and after a few experimental trials, the bell tone alone was enough to elicit salivation.
In Pavlov’s terminology, the meat powder was an unconditioned stimulus, because it reliably (unconditionally) led to salivation. He called the salivation an unconditioned response. The bell tone was a conditioned stimulus because the dog did not salivate in response to the bell until it had been conditioned to do so through repeated pairings with the meat powder. The salivation in response to the bell alone was, thus, a conditioned response.
Thorndike’s experiments involved placing cats inside specially designed boxes from which they could escape and get food only if they performed a specific behavior such as pulling on a string loop or pressing a panel. Thorndike then timed how long it took individual cats to gain release from the box over a number of trials. Thorndike found that the cats behaved aimlessly at first until they seemed to discover by chance the correct response or responses. Over repeated trials the cats began to quickly and economically execute the correct response or responses within seconds. It seemed that the initially random behaviors leading to release were strengthened, or reinforced, as a result of the positive consequence of escaping the box and receiving food. Thorndike also found that responses decreased and in some cases ceased altogether when the food reward was no longer given.
Thorndike’s procedures were greatly modified by Burrhus F. Skinner (1904–1990) in the 1930s and 1940s. Skinner conditioned rats to press down a small lever to obtain a food reward. This type of procedure and the resultant conditioning have become known as operant conditioning. The term “operant” refers to a focus on behaviors that alter, or operate on, the environment. It is also referred to as instrumental conditioning because the behaviors are instrumental in bringing about reinforcement. The food reward or any consequence that strengthens a behavior is called a “reinforcer of conditioning.” The decrease in response when the food or reinforcer is taken away is known as “extinction.” In operant conditioning theory, behaviors are maintained or cease because of their consequences for the organism.
Reinforcement takes on slightly different meanings in the two types of conditioning. In classical conditioning, reinforcement is the unconditioned stimulus delivered either simultaneously with or just after the conditioned stimulus. Here, the unconditioned stimulus strengthens the association between the conditioned and unconditioned stimuli. In operant conditioning, reinforcement simply serves to strengthen the response. Furthermore, in operant conditioning the reinforcer’s presentation or withdrawal is contingent upon performance of the targeted response. In classical conditioning the reinforcement, or unconditioned stimulus, occurs whether or not the targeted response is made.
Reinforcement schedules describe the timing and patterning of reinforcement presentation with respect to the response. Reinforcement may be scheduled in numerous ways, based upon the number, or sequencing, of responses, or on certain timing intervals with respect to the response. The consequences of behaviors always operate on some sort of schedule, and the schedule can affect the behavior as much as the reinforcement itself. For this reason a significant amount of research has focused on the effects of various schedules on the development and maintenance of targeted behaviors.
In operant conditioning research, two types of schedules that have been studied extensively are ratio and interval schedules. In ratio schedules, reinforcers are presented based on the number of responses made. In fixed ratio schedules, a reinforcer is presented for every fixed number of responses so that, for example, every fifth response might be reinforced. In variable ratio schedules, responses are reinforced using an average ratio of responses, but the number of responses needed for reinforcement changes unpredictably from one reinforcement to the next. In interval schedules, reinforcements are presented based on the length of time between reinforcements. Thus, the first response to occur after a given time interval from the last reinforcement will be reinforced. In fixed interval schedules, the time interval between reinforcement presentations remains the same. In variable interval schedules, time intervals between reinforcements change randomly around an average time interval.
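The four schedules described above can be sketched as a small simulation. This is only an illustrative sketch, not a standard laboratory procedure: the class names, the `run` helper, the choice of uniform random draws for the variable schedules, and the assumption that responses arrive at known timestamps (in seconds) are all hypothetical choices made for the example.

```python
import random


class FixedRatio:
    """Reinforce every n-th response."""

    def __init__(self, n):
        self.n = n
        self.count = 0  # responses since the last reinforcement

    def respond(self, t):
        self.count += 1
        if self.count >= self.n:
            self.count = 0
            return True
        return False


class VariableRatio:
    """Reinforce after a response count that varies around mean_n."""

    def __init__(self, mean_n, rng):
        self.mean_n = mean_n
        self.rng = rng
        self.count = 0
        self.required = self._draw()

    def _draw(self):
        # Assumption: uniform over 1 .. 2*mean_n - 1, which averages mean_n.
        return self.rng.randint(1, 2 * self.mean_n - 1)

    def respond(self, t):
        self.count += 1
        if self.count >= self.required:
            self.count = 0
            self.required = self._draw()
            return True
        return False


class FixedInterval:
    """Reinforce the first response made at least `interval` seconds
    after the last reinforcement."""

    def __init__(self, interval):
        self.interval = interval
        self.last = 0.0  # time of the last reinforcement

    def respond(self, t):
        if t - self.last >= self.interval:
            self.last = t
            return True
        return False


class VariableInterval:
    """Like FixedInterval, but the required wait varies around a mean."""

    def __init__(self, mean_interval, rng):
        self.mean_interval = mean_interval
        self.rng = rng
        self.last = 0.0
        self.wait = self._draw()

    def _draw(self):
        # Assumption: uniform over 0 .. 2*mean_interval, averaging mean_interval.
        return self.rng.uniform(0, 2 * self.mean_interval)

    def respond(self, t):
        if t - self.last >= self.wait:
            self.last = t
            self.wait = self._draw()
            return True
        return False


def run(schedule, response_times):
    """Return the times at which a response was reinforced."""
    return [t for t in response_times if schedule.respond(t)]


# One response per second for 20 seconds.
times = list(range(1, 21))
print(run(FixedRatio(5), times))     # [5, 10, 15, 20]
print(run(FixedInterval(6), times))  # [6, 12, 18]
print(run(VariableRatio(5, random.Random(0)), times))
print(run(VariableInterval(6, random.Random(0)), times))
```

In the ratio classes only the response count matters and the timestamp is ignored, whereas in the interval classes extra responses made before the interval has elapsed earn nothing. That difference mirrors the distinction the text draws between the two families of schedules.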
Research has shown that small differences in scheduling can create dramatic differences in behaviors. Ratio schedules usually lead to higher rates of response than interval schedules. Variable schedules, especially variable interval schedules, lead to highly stable behavior patterns. Furthermore, variably reinforced behaviors resist extinction, persisting long after they are no longer reinforced. This is why it is often difficult to extinguish some of our daily behaviors, since most are maintained under irregular or variable reinforcement schedules. Gambling is a clear example of this phenomenon, as only some bets are won yet gamblers continue taking their chances.
Reinforcement may be applied in numerous ways, not just to simple behaviors, but to complex behavior patterns as well. For example, it has been used to educate institutionalized children and adults with intellectual disabilities using shaping, or successive approximation. Shaping is the gradual building up of a desired behavior by systematically reinforcing smaller components of the desired behavior or similar behaviors. Much of this training has focused on self-care skills such as dressing, feeding, and grooming. In teaching a subject how to feed himself, for example, a bite of food may be made contingent on the person simply looking at a fork. The next time the food may be made contingent on the subject pointing to the fork, then touching it, and finally grasping it and bringing the food to his mouth. Shaping has also been used to decrease aggressive and self-destructive behaviors.
Key Terms
Classical conditioning: A procedure involving pairing a stimulus that naturally elicits a response with one that does not until the second stimulus elicits a response like the first.
Conditioned reinforcers: Also called secondary reinforcers, they do not have inherent reinforcing qualities but acquire them through repeated pairings with unconditioned reinforcers such as food or water.
Conditioning: A general term for procedures in which associative learning is the goal.
Extinction: A procedure in which reinforcement of a previously reinforced response is discontinued; it often leads to a decrease or complete stoppage of that response.
Learning theories: A number of different theories pertaining to the learning process.
Operant conditioning: Also called instrumental conditioning, it is a type of conditioning or learning in which reinforcements are contingent on a targeted response.
Reinforcement schedule: The timing and patterning of reinforcement presentation with respect to the response.
Shaping: The gradual achievement of a desired behavior by systematically reinforcing smaller components of it or similar behaviors.
Systematic desensitization: A therapeutic technique designed to decrease anxiety toward an object or situation.
Token economy: A therapeutic environment in which tokens representing rewards are used as secondary reinforcers to promote certain behaviors.
Unconditioned reinforcers: Also called primary reinforcers, they are inherently reinforcing and usually biological in nature, serving to satisfy physiological needs. In classical conditioning they are also any unconditioned stimuli.
Another successful application of reinforcement involves using token economies, primarily in institutional settings such as jails and homes for people with intellectual disabilities or mental illness. Token economies are a type of behavior therapy in which actual tokens are given as conditioned reinforcers contingent on the performance of desired behaviors. The token functions like money in that it has no inherent value. Its value lies in the rewards it can be used to obtain. For example, prisoners may be given tokens for keeping their cells in order, and they may be able to use the tokens to obtain certain privileges, such as extra desserts or extra exercise time. Most follow-up data indicate that behaviors reinforced by tokens, or any other secondary reinforcer, are usually not maintained once the reinforcement system is discontinued. Thus, while token economies can be quite successful in regulating and teaching behaviors in certain controlled settings, they have not proven successful in creating long-term behavioral change.
Systematic desensitization is a therapeutic technique, based on learning theory, that has been used successfully in psychotherapy to treat phobias and anxiety about objects or situations. It consists of exposing the client to a series of progressively more tension-provoking stimuli directly related to the fear. This is done under relaxed conditions until the client is successfully desensitized to the fear. Fear of public speaking, for example, might be gradually overcome by first showing the client pictures of such situations, then movies, then taking the client to an empty auditorium, then having the client give a speech within the empty auditorium, and so on, until the anxiety is extinguished. Systematic desensitization may be performed in numerous ways, depending on the nature of the fear and the client.
Recent trends in reinforcement research include conceptualizing the process underlying reinforcement as a physiological neural reaction. Some theorists believe the concept of reinforcement is superfluous in that some learning seems to occur without it, and simple mental associations may more adequately explain learning. The study of reinforcement is, for the most part, embedded in learning theory research.
Learning theories and the study of reinforcement achieved a central place in American experimental psychology from approximately the 1940s through the 1960s. Over time it became clear, however, that learning theories could not easily account for certain aspects of higher human learning and complex behaviors such as language and reasoning. More cognitively oriented theories focusing on internal mental processes were put forth, in part to fill that gap, and they have gained increasing support. Learning theories are no longer quite as exalted. Nonetheless, more recently, a number of psychologists have powerfully explained many apparently complex aspects of human cognition by applying little more than some basic principles of associative learning theory. In addition, these same principles have been persuasively used to explain certain decision-making processes, and they show potential for explaining a number of well-known yet poorly understood elements of perceptual learning. While learning theories may not be as powerful as their creators and supporters had hoped, they have added greatly to our understanding of certain aspects of learning and of changing behavior, and they show great potential for continuing to add to our knowledge.