Prisoner’s Dilemma (Psychology)
The prisoner’s dilemma game (PDG) is a method of indicating the results of the possible pairings of the cooperative and noncooperative choices of two players. PDG can be illustrated by either of the matrices in Figure 1. With a PDG there are two players, a column player (A) and a row player (B), each of whom has two choices, X or Y, resulting in four possible combinations of choices with each combination yielding a different set of payoffs or outcomes. Payoffs or outcomes can be thought of as rewards or as some index of player satisfaction. The usual convention is that numbers above the diagonal in each cell represent the outcomes for the column player, and numbers below the diagonal in each cell represent the outcomes for the row player. In the example matrices, the numbers, or outcomes, can be thought of as dollars.
Suppose that both players choose X. For the left-hand matrix this combination of choices would result in each player receiving $3. After such a combination of choices, one player, for example the column player, might be tempted to choose Y on the next trial. If the row player continued to choose X, the result would be that the column player’s outcome would increase to $4, but the row player’s outcome would decrease to $1. Following such a result, one can imagine that the row player would shift from X to Y on the next trial, with the result that both players would receive only $2. Such a possibility illustrates the dilemma. Each player can increase his or her outcomes by choosing Y, but if both players are guided by immediate self-interest, both will receive lower outcomes than could have been obtained through cooperation or mutual X choices.
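The sequence of trials just described can be traced in a few lines of Python. This is only a sketch: the names `PAYOFFS` and `play` are illustrative, and the dollar values are those the text gives for the left-hand matrix.

```python
# Left-hand matrix as described in the text, keyed by
# (column player's choice, row player's choice) and giving
# (column player's dollars, row player's dollars).
PAYOFFS = {
    ("X", "X"): (3, 3),
    ("Y", "X"): (4, 1),
    ("X", "Y"): (1, 4),
    ("Y", "Y"): (2, 2),
}


def play(column_choice, row_choice):
    """Return (column outcome, row outcome) for a single trial."""
    return PAYOFFS[(column_choice, row_choice)]


# Trial 1: mutual cooperation. Trial 2: the column player is tempted.
# Trial 3: the row player shifts from X to Y in response.
trials = [("X", "X"), ("Y", "X"), ("Y", "Y")]
for number, (col, row) in enumerate(trials, start=1):
    col_pay, row_pay = play(col, row)
    print(f"trial {number}: column {col} gets ${col_pay}, row {row} gets ${row_pay}")
```

By the third trial both players receive $2, which is less than the $3 each received under mutual X choices.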
The X choice is usually referred to as a cooperative choice, and the Y choice is sometimes referred to as a competitive choice and sometimes as a defecting choice. Which term is more appropriate depends on whether the Y choice is motivated by greed or by fear. If the Y choice is motivated by greed, or an interest in increasing outcomes, the choice is appropriately referred to as a competitive choice. On the other hand, if the Y choice is motivated by fear, or an interest in minimizing the reduction in outcome resulting from the other player’s Y choice, the Y choice is appropriately referred to as a defecting choice (a choice to withdraw from cooperation).
According to Matt Ridley, “broadly speaking any situation in which you are tempted to do something, but know it would be a great mistake if everybody did the same thing, is likely to be a prisoner’s dilemma” (1996, pp. 55–56). An everyday example relates to being honest versus cheating. Other examples relate to overwhaling, over-fishing, pollution of the air, pollution of the water, and conservation of water during a drought. These latter examples, which may involve more than two people and may provide more than two choices (e.g., how much water to save and not just whether to save or not to save water), are sometimes labeled resource dilemmas and sometimes labeled commons dilemmas. The term commons comes from Garrett Hardin’s (1968, 1993, 1998) description of the potential “tragedy” that could result from over-grazing in a shared, but unmanaged, medieval commons or pasture. Commons dilemmas share with the PDG the assumption that the pursuit of self-interest results in collective detriment. This assumption stands in contrast to Adam Smith’s (1776) marketplace model, which implies that the pursuit of self-interest results in collective benefit. One can regard many political disputes as partially flowing from disagreement regarding which model is more appropriate in a particular situation.
William Poundstone (1992, p. 123) maintains that recognition of the tension between self-interest and the common good has been widespread; he supports this assertion by citing numerous historical instances of statements similar to the “golden rule” as stated in the Bible in Matthew 7:12, “In everything, do to others what you would have them do to you.” However, the conflict between self-interest and the common good was first cast in the form of a two-by-two matrix by two mathematicians at Rand Corporation, Merrill Flood and Melvin Dresher (1952). The matrix was specifically created to provide an empirical test of mathematician John Nash’s concept of an equilibrium point, developed in his 1950 dissertation at Princeton University (see Colman 1995, pp. 58–61; Nasar 1998, pp. 115–122).
For the PDG, the Nash equilibrium is the lower right-hand cell. On the assumption that each player assumes that the other player will follow his or her self-interest, the players should arrive at this mutual-Y cell. Such choices should, furthermore, be stable or in equilibrium, because if either player alone moves from this cell, he or she will receive a lower outcome. In order to test Nash’s theory, Flood and Dresher had two colleagues in January 1950 play the PDG for one hundred trials. The results indicated that mutual Y choices, the Nash equilibrium, occurred only fourteen times. One player cooperated sixty-eight of one hundred times, and the other player cooperated seventy-eight of one hundred times. Although there was some competition, or defection, clearly the participants did not behave as Nash’s theory predicted.
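The claim that the mutual-Y cell is the only stable cell can be checked mechanically. The following is a minimal sketch using the left-hand matrix values given in the text; the function name `is_nash` is illustrative. A cell counts as an equilibrium if neither player can gain by changing his or her own choice while the other player's choice stays fixed.

```python
from itertools import product

# (column choice, row choice) -> (column outcome, row outcome)
PAYOFFS = {
    ("X", "X"): (3, 3),
    ("Y", "X"): (4, 1),
    ("X", "Y"): (1, 4),
    ("Y", "Y"): (2, 2),
}
CHOICES = ("X", "Y")


def is_nash(col, row):
    """True if neither player gains by switching his or her own choice alone."""
    col_pay, row_pay = PAYOFFS[(col, row)]
    col_cannot_gain = all(PAYOFFS[(alt, row)][0] <= col_pay for alt in CHOICES)
    row_cannot_gain = all(PAYOFFS[(col, alt)][1] <= row_pay for alt in CHOICES)
    return col_cannot_gain and row_cannot_gain


equilibria = [cell for cell in product(CHOICES, repeat=2) if is_nash(*cell)]
print(equilibria)  # [('Y', 'Y')]
```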
Subsequently, the Flood and Dresher matrix was labeled the prisoner’s dilemma game by mathematician Albert Tucker, Nash’s adviser at Princeton University. At a talk given to the Psychology Department at Stanford University in 1950, Tucker illustrated the matrix with an anecdote of two prisoners who had been arrested on suspicion of having committed a crime. Each prisoner had a choice of remaining silent (analogous to choosing X) or of giving evidence against the other (analogous to choosing Y). Either prisoner could minimize his sentence by giving evidence against the other, but when both gave evidence, the prisoners could be convicted of a more serious charge than when both remained silent.
The PDG is sometimes characterized as a matrix meeting two requirements. First, the outcomes in the four cells follow a rank order for the column player from upper right to upper left to lower right to lower left. Note from the left-hand matrix in Figure 1 that the payoffs in these cells are 4, 3, 2, 1. (For the row player, the rank order is from lower left to upper left to lower right to upper right.) Second, the average outcome for the upper-right and lower-left cells is less than the outcome in the upper-left cell. Note that 2.5, the average of 4 and 1, is less than 3. This second requirement guarantees that higher outcomes will be achieved by mutual X choices, rather than by alternating between the lower-left and upper-right cells.
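The two requirements can be written as a short test. The sketch below takes the column player's four outcomes by cell position; the function name is illustrative, and the second call uses hypothetical values chosen only to show a matrix that fails the second requirement.

```python
def is_prisoners_dilemma(upper_right, upper_left, lower_right, lower_left):
    """Check both requirements using the column player's outcomes."""
    # Requirement 1: rank order upper right > upper left > lower right > lower left.
    rank_order = upper_right > upper_left > lower_right > lower_left
    # Requirement 2: alternating between the off-diagonal cells must pay less
    # on average than mutual X choices.
    no_gain_from_alternating = (upper_right + lower_left) / 2 < upper_left
    return rank_order and no_gain_from_alternating


print(is_prisoners_dilemma(4, 3, 2, 1))  # left-hand matrix: True
print(is_prisoners_dilemma(6, 3, 2, 1))  # hypothetical: average 3.5 exceeds 3, so False
```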
Interdependence theory (Kelley and Thibaut 1978; Kelley et al. 2002) provides a more sophisticated perspective on the PDG. This perspective relies on four concepts that can be illustrated by the left-hand matrix in Figure 1. The first of these is labeled actor control (AC) by Harold Kelley and his coauthors, and reflects the direct control that each player has over his or her own outcomes. For the column player, AC is the difference in column averages (the average of 3 and 1, or 2, for the X column versus the average of 4 and 2, or 3, for the Y column). For the row player, AC is the difference in row averages. For both players, AC is 1.
The second concept is labeled partner control (PC) by Kelley and his associates, and reflects the direct control that the partner has over actor outcomes. For the column player, PC is the difference in row averages (the average of 3 and 4, or 3.5, for the X row versus the average of 1 and 2, or 1.5, for the Y row). For the row player, PC is the difference in column averages. For both players, PC is 2. Note that AC has a smaller absolute value than PC, and also that a choice that raises one player’s outcomes through AC lowers the partner’s outcomes through PC. If the column player, for example, gains through AC by shifting from X to Y, the row player loses through PC. From the perspective of AC and PC alone, the PDG is a matrix in which a small increase in one’s own outcomes results in a large loss in partner outcomes.
The third concept is labeled joint control (JC) by Kelley and his coauthors, and reflects the extent to which the players can maximize their outcomes by taking turns, or alternating, their X and Y choices. For both players, JC is the difference in diagonal averages (the average of 3 and 2 versus the average of 4 and 1). For both players, JC is 0; that is, there is no joint control or advantage in alternating X and Y choices. The purpose of the requirement that JC be 0 is similar to the purpose of the above requirement that the average outcome for either column or row player for the upper-right and lower-left cells is less than the outcome in the upper-left cell.
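The three averages described so far (AC, PC, and JC) can be computed directly from the column player's outcomes in the left-hand matrix. This is a sketch rather than the authors' own procedure: the function name is illustrative, and the sign conventions (for example, Y column minus X column for AC) are assumptions, since the text reports only the size of each difference.

```python
# Column player's outcomes in the left-hand matrix, keyed by
# (column player's choice, row player's choice).
COLUMN_OUTCOMES = {
    ("X", "X"): 3,
    ("Y", "X"): 4,
    ("X", "Y"): 1,
    ("Y", "Y"): 2,
}


def mean(a, b):
    return (a + b) / 2


def control_components(o):
    """Return (AC, PC, JC) for the column player."""
    # Actor control: Y column average minus X column average.
    ac = mean(o[("Y", "X")], o[("Y", "Y")]) - mean(o[("X", "X")], o[("X", "Y")])
    # Partner control: X row average minus Y row average.
    pc = mean(o[("X", "X")], o[("Y", "X")]) - mean(o[("X", "Y")], o[("Y", "Y")])
    # Joint control: difference between the two diagonal averages.
    jc = mean(o[("X", "X")], o[("Y", "Y")]) - mean(o[("Y", "X")], o[("X", "Y")])
    return ac, pc, jc


print(control_components(COLUMN_OUTCOMES))  # (1.0, 2.0, 0.0)
```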
The fourth and final interdependence theory concept is the correspondence between the two players’ outcomes across the four cells. For matrices such as those in Figure 1, in which the players’ outcomes are symmetric, correspondence is indexed by the correlation between the two players’ outcomes across the four cells. For the PDG, the correlation is always negative; that is, in general, as one player’s outcomes increase, the other player’s outcomes decrease.
The correlation between the outcomes for the two players is a mathematical consequence of the AC to PC ratio. For the left-hand matrix, the ratio is 1 to 2 and the correlation is -.80. For the right-hand matrix, the ratio is 3 to 4 and the correlation is -.96. This difference illustrates the point that the PDG is not one matrix, but a family of matrices with differing AC to PC ratios. The importance of this point becomes apparent in view of the interdependence theory assumption that correspondence reflects conflict of interest. As the AC to PC ratio becomes larger, and the correlation more negative, conflict of interest increases. From the perspective of interdependence theory, some PDG situations are more likely than others to lead to conflict.
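The link between the AC to PC ratio and the correlation can be verified numerically. The sketch below rests on two assumptions that go beyond the text. First, the closed form r = -2(AC)(PC) / (AC^2 + PC^2) is not stated in the source; it follows algebraically for a symmetric matrix with JC of 0 when the four cells are weighted equally. Second, because Figure 1 is not reproduced here, the 3 to 4 case uses a hypothetical matrix (outcomes 8, 5, 4, 1) constructed to have AC of 3 and PC of 4; it is not claimed to be the right-hand matrix.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of outcomes."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def correlation_from_ratio(ac, pc):
    """Assumed closed form for a symmetric matrix with JC of 0."""
    return -2 * ac * pc / (ac ** 2 + pc ** 2)


# Cell order: upper left, upper right, lower left, lower right.
left_column, left_row = [3, 4, 1, 2], [3, 1, 4, 2]    # values from the text
hypo_column, hypo_row = [5, 8, 1, 4], [5, 1, 8, 4]    # hypothetical, AC = 3, PC = 4

print(round(pearson(left_column, left_row), 2), correlation_from_ratio(1, 2))
print(round(pearson(hypo_column, hypo_row), 2), correlation_from_ratio(3, 4))
```

Both routes give -.80 for the 1 to 2 ratio and -.96 for the 3 to 4 ratio, matching the values reported above.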
Research has indicated that when the PDG is played between two groups, each of which is required to make a consensus choice on each trial, there is frequently, but not always, more competition than there is between individuals (see Wildschut et al. 2003 for a statistical summary, or meta-analysis, of published research). This difference, which is labeled a discontinuity effect, has been shown to be more evident as correspondence, or the correlation between the individuals’ or groups’ outcomes, becomes more negative.
SEE ALSO Common Good, The; Commons, The; Externality; Nash Equilibrium; Nash, John; Noncooperative Games
Colman, Andrew M. 1995. Game Theory and Its Applications in the Social and Biological Sciences. 2nd ed. London: Butterworth-Heinemann.
Flood, Merrill M. 1952. Some Experimental Games, Research Memorandum RM-789. Santa Monica, CA: Rand Corporation.
Hardin, Garrett. 1968. The Tragedy of the Commons. Science 162: 1243–1248.
Hardin, Garrett. 1998. Extensions of “The Tragedy of the Commons.” Science 280: 682–683.
Kelley, Harold H., and John W. Thibaut. 1978. Interpersonal Relations: A Theory of Interdependence. New York: Wiley.
Kelley, Harold H., et al. 2002. An Atlas of Interpersonal Situations. Cambridge, U.K.: Cambridge University Press.
Nash, John F., Jr. 1950. Non-Cooperative Games. PhD diss., Princeton University, Princeton, NJ.
Poundstone, William. 1992. Prisoner’s Dilemma. New York: Doubleday.
Ridley, Matt. 1996. The Origins of Virtue: Human Instincts and the Evolution of Cooperation. London: Penguin.
Smith, Adam. [1776] 2003. An Enquiry into the Nature and Causes of the Wealth of Nations. London: Methuen.
Wildschut, Tim, et al. 2003. Beyond the Group Mind: A Quantitative Review of the Interindividual-Intergroup Discontinuity Effect. Psychological Bulletin 129: 698–722.
Chester A. Insko
Prisoner’s Dilemma (Economics)
The prisoner’s dilemma is a classic example of an environment in which individuals rationally fail to cooperate even though cooperation would make each person better off. A standard formulation is the following: Two men commit a crime and are arrested. The following possible punishments await them. If both men confess, each will receive five years in prison. If neither man confesses, each will receive three years in prison. If one man confesses and the other does not, the confessor will receive one year in prison whereas the other receives ten.
The two men would each receive a lighter sentence if neither confessed than if both did. However, suppose each man is held in his own separate cell, so that it is impossible for the two men to coordinate their behavior. If the prisoners are acting rationally, each will confess. The reason is the following: A person accused of a crime should confess if doing so reduces his prison sentence. The complication in each man’s decision is that the consequence of his behavior depends on the behavior of the other man, which he does not know. However, in the prisoner’s dilemma, it turns out that the rational choice for a prisoner is to confess regardless of the behavior of the other prisoner. To see this, note that if the other prisoner confesses, then confession brings a sentence of five years whereas not confessing brings ten years. If the other prisoner does not confess, then confession brings a sentence of one year whereas not confessing brings three. Thus, confession is a dominant strategy, as it is always preferable to the alternative. The two men therefore choose strategies that, although individually rational, are collectively inferior to those they would choose if they could cooperate.
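The dominance argument can be checked case by case. The following sketch uses the sentences from the formulation above; the names `YEARS`, `best_reply`, and `dominant_strategy` are illustrative.

```python
# Years in prison, keyed by (own choice, other prisoner's choice).
YEARS = {
    ("confess", "confess"): 5,
    ("silent", "confess"): 10,
    ("confess", "silent"): 1,
    ("silent", "silent"): 3,
}
CHOICES = ("confess", "silent")


def best_reply(other_choice):
    """Own choice that minimizes years in prison, given the other's choice."""
    return min(CHOICES, key=lambda own: YEARS[(own, other_choice)])


def dominant_strategy():
    """Return the choice that is the best reply in every case, if one exists."""
    replies = {best_reply(other) for other in CHOICES}
    return replies.pop() if len(replies) == 1 else None


print(dominant_strategy())  # confess
# Mutual confession still costs each man more than mutual silence would.
print(YEARS[("confess", "confess")], YEARS[("silent", "silent")])  # 5 3
```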
When individuals play a sequence of prisoner’s dilemma games, rational behavior can lead to very different outcomes. In a repeated prisoner’s dilemma, each player makes a sequence of choices, so the decision to cooperate in one game can depend on the past behavior of the other player. Players can therefore reward each other by making choices that depend on the play of their opponent. The existence of equilibrium strategies that sustain cooperation in an infinitely repeated prisoner’s dilemma has been shown in many contexts; a classic formulation is explored in Drew Fudenberg and Eric Maskin’s 1986 paper. It is also possible for periods of cooperation to occur in finitely repeated games, as described in David Kreps et al.’s 1982 work.
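One way to see how repetition can change the rational choice is to compare discounted sentences under a simple conditional strategy. The sketch below is an illustration under stated assumptions, not the construction used in the cited papers: the game repeats forever, future years are weighted by a discount factor `delta` between 0 and 1, and each prisoner follows a hypothetical "grim trigger" rule (stay silent until the other confesses, then confess in every later period). The per-period sentences are those of the single-game example above.

```python
# Years in prison per period, from the single-game example in the text.
BOTH_SILENT = 3
BOTH_CONFESS = 5
CONFESS_ALONE = 1


def discounted_years_cooperating(delta):
    """Total discounted years if both prisoners stay silent in every period."""
    return BOTH_SILENT / (1 - delta)


def discounted_years_deviating(delta):
    """Confess once against a silent partner, then mutual confession forever."""
    return CONFESS_ALONE + delta * BOTH_CONFESS / (1 - delta)


def cooperation_is_sustainable(delta):
    """True if staying silent costs no more than deviating once."""
    return discounted_years_cooperating(delta) <= discounted_years_deviating(delta)


for delta in (0.25, 0.5, 0.75):
    print(delta, cooperation_is_sustainable(delta))
```

Under these assumptions, staying silent is at least as good as deviating whenever `delta` is 0.5 or higher, so a player who weights the future heavily enough has no incentive to confess.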
The prisoner’s dilemma is a basic component of any game theory textbook; a particularly insightful treatment may be found in Roger Myerson’s Game Theory (1991). A history that places the prisoner’s dilemma in the context of the development of game theory is William Poundstone’s book Prisoner’s Dilemma (1992).
Fudenberg, Drew, and Eric Maskin. 1986. The Folk Theorem in Repeated Games with Discounting and with Incomplete Information. Econometrica 54: 533–554.
Kreps, David, Paul Milgrom, D. John Roberts, and Robert Wilson. 1982. Rational Cooperation in the Finitely Repeated Prisoner’s Dilemma. Journal of Economic Theory 27: 245–252.
Myerson, Roger. 1991. Game Theory: Analysis of Conflict. Cambridge, MA: Harvard University Press.
Poundstone, William. 1992. Prisoner’s Dilemma. New York: Doubleday.
William A. Brock
Steven N. Durlauf