
Colloquium

Competition among cooperators: Altruism and reciprocity

Peter Danielson*

Centre for Applied Ethics, University of British Columbia, Vancouver, BC, Canada V6T 1Z2

Levine argues that neither self-interest nor altruism explains experimental results in bargaining and public goods games. Subjects' preferences appear also to be sensitive to their opponents' perceived altruism. Sethi and Somanathan provide a general account of reciprocal preferences that survive under evolutionary pressure. Although a wide variety of reciprocal strategies pass this evolutionary test, Sethi and Somanathan conjecture that fewer are likely to survive when reciprocal strategies compete with each other. This paper develops evolutionary agent-based models to test their conjecture in cases where reciprocal preferences can differ in a variety of games. We confirm that reciprocity is necessary but not sufficient for optimal cooperation. We explore the theme of competition among reciprocal cooperators and display three interesting emergent organizations: racing to the "moral high ground," unstable cycles of preference change, and, when we implement reciprocal mechanisms, hierarchies resulting from exploiting fellow cooperators. If reciprocity is a basic mechanism facilitating cooperation, we can expect interaction that evolves around it to be complex, non-optimal, and resistant to change.

The topic for this colloquium is competition and cooperation as factors in emergent human organization. Sober and Wilson (1) note that the behavior labeled "cooperation" by evolutionary game theorists is the same as that discussed in the altruism literature. So our topic can be taken very generally; indeed, it is "the central theoretical problem of sociobiology" (1). Naturally, the main question is how altruism/cooperation is possible among agents selected by competitive evolutionary processes. But this possibility question can be misleading. For example, both Axelrod (2) and Gauthier (3) advanced the discussion of these issues by insisting that there are situations in which cooperative agents responsive to the behavior or disposition of their opponents do as well as or better than straightforwardly competitive agents. Yet each defended a single cooperative strategy—Tit for Tat and Constrained Maximization, respectively—that is not a unique equilibrium. [Of course, in Gauthier's case, critics point out that Constrained Maximization doesn't even claim to be an equilibrium strategy (4). The extended preference concepts discussed below can be seen as a way to avoid this criticism.] Neither attended to the competition between cooperative strategies, and both neglected less "nice" alternative strategies.

In earlier work (5), I extended Gauthier's model to address competition among cooperators, finding that what I called "reciprocal cooperators," who demand discriminatory responsiveness from cooperative partners, exploit some more tolerant cooperators and thereby supplant nicer Constrained Maximizers. More generally, the combination of evolution, altruism, and reciprocity need not result in populations of equal, optimal, and tolerant cooperators. Most generally, because we're inclined to favor cooperation, we need methods that challenge our intuitive biases.† Evolutionary simulation can be a good test because evolution builds in strong competitive pressure. But simulations can easily confirm biases, unless we allow the generator to range quite widely, to include sometimes counterintuitive possibilities. [This is the critical justification for Binmore's (6) harsh criticism of Axelrod (2, 7).] In particular, when we specify the mechanisms of reciprocity, we shall see that they have surprises in store for us (8).

The present paper runs a similar line of inquiry with a new starting point. We begin with Levine's account of human cooperative behavior in experiments using bargaining and public goods games. Although it is commonly agreed that self-interest will not account for these experimental results, Levine goes further and argues that simple altruism is also inadequate. A better fit is obtained from preferences that are sensitive to one's opponents' perceived altruism. Thus, Levine argues that a kind of reciprocity of altruism is required to explain human cooperative behavior. Sethi and Somanathan (9) modify the reciprocal component of Levine's construction to achieve evolutionary stability under a variety of selective regimes. They raise the question of what would happen were various reciprocal cooperators placed in competition with each other.

To attempt to answer this question, we constructed two kinds of models. In the first set of models, agents are capable of quite complex preference interactions. They can be altruistic toward others and react to other agents' altruism and reaction functions. The agents in the second set of models have simpler interactions, as they implement working reciprocal strategy mechanisms rather than preferences. We test these agents in a variety of situations. (We use eight commonly discussed two-by-two games.) Although we confirm the basic result that reciprocity allows agents to cooperate where neither self-interested nor simple altruistic agents would, we also discover that some forms of reciprocity lead to unexpected—emergent—social structures. Some reciprocally altruistic agents can race to the moral high ground and, ironically, treat "less altruistic" agents exploitatively. Others are trapped in cycles of preference change. Finally, although reciprocity is necessary to stable cooperation, it need not be optimal. When we implement one family of simple reciprocal mechanisms, exploitative hierarchies emerge as well as equal cooperative outcomes.

This paper results from the Arthur M. Sackler Colloquium of the National Academy of Sciences, "Adaptive Agents, Intelligence, and Emergent Human Organization: Capturing Complexity through Agent-Based Modeling," held October 4–6, 2001, at the Arnold and Mabel Beckman Center of the National Academies of Science and Engineering in Irvine, CA.

*E-mail: [email protected].

†Although it is not surprising that, as agents, we encourage others to cooperate, it must be stressed that the values cooperation (pareto) optimizes are local (evidently, typically within range of our encouragements). As social scientists, we seek to understand cooperation's causes and constituent mechanisms. As applied social scientists (applied ethicists), we will sometimes seek to encourage cooperation and sometimes to undermine it. (Think of the original story of the prisoner's dilemma.) So we can account for a bias toward cooperation, but it is a bandwagon we should be cautious about riding. We need good tools to disengage our theoretical ethical intuitions from our moralistic responses and then to isolate, test, and improve new moral mechanisms.

Previous Work

In this section, we discuss only the immediate ancestors of the present paper; see ref. 10 for a broader survey of the literature on rationality and evolution. This paper builds on two recent streams of research. First, Levine (11) asks "to what extent can a simple model of players who are not selfish explain the data from a variety of experiments, including both ultimatum and public goods games?" Following Levine, we construct interacting utility functions that have parameters for both pure altruism and a reciprocity factor, λ. "When λ > 0, the model can be regarded as incorporating an element of fairness, . . . One of our major conclusions is that λ = 0 [pure altruism] is not consistent with data from the ultimatum game" (11); therefore, according to Levine, experiments reveal that human players are spiteful as well as altruistic, and this can be accounted for by preferences that reciprocate something. (That is, preferences that are not exogenous but are instead functions of some feature of the opposing player. Throughout this paper, "opposition" denotes pairing in games without prejudice to the players' interests or preferences; opponents need not be competitors.) The question arises: what do they reciprocate?

Second, Sethi and Somanathan (9) introduce an additional test: "survival under evolutionary pressure." Their work takes Guth and Yaari's (12) evolutionary approach to preferences: "instead of assuming that individual preferences are exogenously given, we think of an evolutionary process where preferences are determined as evolutionarily stable strategies." In this approach, "reproductive success is purely a function of the resources earned via strategic interaction," whereas preferences only determine agents' moves. I use this approach in two papers (13, ‡) to evolve both rational preferences that track their interests, and deviant preferences that do not, using only modest assumptions about information. But if we provide more information, agents may evolve preferences that not only need not track their objective interests but may also respond to other agents' preferences. Casting these models in terms of preferences instead of strategies resolves some procedural problems (see, for example, ref. 8). So my agent-based model can be used to extend Sethi and Somanathan's argument.

‡Danielson, P., a paper presented to the Society for Machines and Mentality, Eastern Division of the American Philosophical Association, December 27–30, 2000, New York.

Sethi and Somanathan argue that the evolutionary demands of the situation further structure the preference function. This function must allow spite toward nonaltruistic "materialists," and hence focus on differences in altruism. This paper extends their line of argument, confirming their conjecture: "while a wide range of parameter values is consistent with survival against materialists, a much narrower range may be expected to survive when several members of this class of preferences are in competition with each other" (9).

We show, first, that attending only to differences in altruism leads to invidious races to the "moral high ground." Suggesting further modifications of the function leads to unstable patterns of preference change. Finally, although reciprocity is necessary to stable cooperation, it need not be optimal. Indeed, when we specify a family of reciprocal mechanisms, exploitative hierarchies emerge as well as fair optimal outcomes.

Assumptions

Here we highlight the assumptions we make that are not standard in the economics literature. (i) We assume that preferences are not exogenous but instead vary under evolutionary pressure. Players' choices are fixed by their preferences in the short run of a set of games, but over the longer run of the simulation, the distribution of preferences represented in the population is driven by evolutionary pressures. (ii) Altruistic and reciprocal parameters assume that preferences can be compared (11). (iii) These models assume strong common knowledge. Players know each other's preferences, including their altruistic and reciprocal components. (iv) All players have the same formal preference function, which includes arguments for others' altruistic and reciprocal parameters. Players differ only in the values of their altruistic and reciprocal parameters. This assumption may seem inconsistent with our evolutionary approach. Of course, it would be better to have a more constructive model, where the altruistic and reciprocal apparatus evolves. We move in this direction below.

Obviously ii–iv are strong and unrealistic assumptions. We make them to connect our argument with Levine's and Sethi and Somanathan's interesting work and to make our own simulations tractable. We are, of course, assuming and not recommending comparability, transparency, and homogeneity. Assumption i is weaker than the norm, and we weaken assumption iv toward the end of this article.

Table 1. Eight sequential game outcomes

Game    LL     LR     RL     RR
OR      3,3    1,2    2,1    0,0
CO      3,3    0,0    0,0    2,2
KS      0,3    1,2    2,1    3,0
AG      2,0    1,1    3,3    0,2
PD      2,2    0,3    3,0    1,1
UPD     3,2    0,3    4,0    1,1
BS      1,3    0,0    0,0    2,1
CK      4,4    1,5    5,1    0,0

Each cell gives the material payoffs (P1, P2) at the four outcomes LL, LR, RL, RR of the sequential game tree.
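For concreteness, Table 1 can be encoded directly as data. The following is a minimal sketch in the paper's SCHEME idiom; the representation and the names games and payoffs are ours, not taken from the paper's program.

    ; Table 1 as data: each game maps the four outcomes of the
    ; sequential game tree, in the order (LL LR RL RR), to a
    ; (P1-payoff . P2-payoff) pair.
    (define games
      '((or  ((3 . 3) (1 . 2) (2 . 1) (0 . 0)))
        (co  ((3 . 3) (0 . 0) (0 . 0) (2 . 2)))
        (ks  ((0 . 3) (1 . 2) (2 . 1) (3 . 0)))
        (ag  ((2 . 0) (1 . 1) (3 . 3) (0 . 2)))
        (pd  ((2 . 2) (0 . 3) (3 . 0) (1 . 1)))
        (upd ((3 . 2) (0 . 3) (4 . 0) (1 . 1)))
        (bs  ((1 . 3) (0 . 0) (0 . 0) (2 . 1)))
        (ck  ((4 . 4) (1 . 5) (5 . 1) (0 . 0)))))

    ; look up a game's outcome list by name
    (define (payoffs name) (cadr (assq name games)))

    ; e.g. (list-ref (payoffs 'pd) 0) => (2 . 2), the PD outcome LL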
Modeling Altruism and Reciprocity

Levine's "model can be viewed as a particular parameterization of a class of models in which preferences depend on payoffs to an individual player and to his rivals, as well as depending on his own type and the type of his rivals" (11). Following Sethi and Somanathan (9), where π is the material payoff function of an n-player game x, player i's utility function u_i has the direct component and β weight for the opponents' material payoff shown in Eq. 1. Because we will use only 2-person games, we can simplify and focus on one opponent, j.

    u_i(x) = π_i(x) + Σ_{j≠i} β_ij π_j(x)          [1]

β has two components, a pure altruistic α and a reciprocal λ. Levine relates them by Eq. 2.

    β_ij = (α_i + λ_i α_j) / (1 + λ_i)             [2]

Sethi and Somanathan modify this to allow β to go negative, which they call "spite," in case player j is less altruistic than player i; see Eq. 3.

    β_ij = (α_i + λ_i (α_j − α_i)) / (1 + λ_i)     [3]
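To make the preference machinery concrete, here is a minimal sketch of Eqs. 1–3 in the same SCHEME idiom; the function names are ours, and this is an illustration rather than the paper's own code.

    ; Eq. 2 (Levine): beta-ij = (alpha-i + lambda-i * alpha-j) / (1 + lambda-i)
    (define (beta-levine alpha-i lambda-i alpha-j)
      (/ (+ alpha-i (* lambda-i alpha-j))
         (+ 1 lambda-i)))

    ; Eq. 3 (Sethi and Somanathan): reciprocate the difference in
    ; altruism, so beta can go negative (spite) against a less
    ; altruistic opponent.
    (define (beta-ss alpha-i lambda-i alpha-j)
      (/ (+ alpha-i (* lambda-i (- alpha-j alpha-i)))
         (+ 1 lambda-i)))

    ; Eq. 1, 2-person case: u-i = pi-i + beta-ij * pi-j
    (define (utility pi-i beta-ij pi-j)
      (+ pi-i (* beta-ij pi-j)))

For two (2, 2) reciprocal altruists, (beta-ss 2 2 2) gives 2/3, the 0.67 used in the worked example under Racing to the Top below; against a (0, 0) materialist, (beta-ss 2 2 0) gives -2/3, the spite that lets reciprocators threaten materialists.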

Agent-Based Models

All of the models below are based on a common coevolutionary framework. Each generation is a round-robin tournament in which 60 agents play a round of 8 sequential games, ranging from trivial coordination and constant sum to prisoner's dilemma (PD), battle of the sexes, and chicken (see Table 1). Each player plays both roles, P1 and P2 (see the game tree above Table 1), in each game with each of the other players in each generation. The equilibrium scores (italicized) for one round total 32, and the joint optimal scores (bolded) total 40. Each player's fitness is its score in 8 games × 2 roles × 59 opponents. About half the population is retained (for continuity), and the other half is composed of new genetically programmed offspring (14). In particular, the top 2 players are cloned, 24 players are selected with a chance proportional to their fitness, 30 new players are created by crossover from 15 parents, and 4 new players are single-point mutations of their parents. Parents are selected with a chance proportional to their fitness. The players are lisp functions that output preferences in the first two models and moves in the third. (We used a variant of SCHEME that runs under JAVA on most common computing platforms. Thanks to Ken Anderson, Tim Hickey, and Peter Norvig for this tool. Program available at http://jscheme.sourceforge.net.)

Preference Models

In models 1 and 2, all players interact rationally based on their preferences. Differences between players are caused by different preferences. Although choice is determined by the player's preferences, selection is driven by the material payoffs in the game matrices.

Sethi and Somanathan argue that agents with reciprocal preferences have a "strategic advantage" over altruists or materially selfish agents: "a given player with the former preference obtains a greater payoff than an otherwise identical player with the latter preference at any equilibrium" (9). Agents with reciprocal preferences will therefore be selected under a wide range of evolutionary regimes. The question remains: what happens when agents differ in altruism or reciprocity? We turn to this question of competition between cooperators now.

Racing to the Top

First, we implemented Eq. 3. Because Sethi and Somanathan specify that 0 < α < 1, we needed to limit the range of α. However, to make cooperation possible, our games' payoff ranges required 0 < α < 2. In contrast, λ was not subject to an upper limit but was required to be > 0. Each agent is characterized as a pair of values (α, λ), which, together with the opponent's α, yield a general preference function that can be applied to the payoffs of each game. For example, consider two reciprocal altruists, each with α = 2 and λ = 2, and the prisoner's dilemma in row 5 of Table 1. The material payoffs for P1 and P2 are, in the order (ll, lr, rl, rr), (2, 0, 3, 1) and (2, 3, 0, 1), respectively. The β value for each is 0.67. Applying β to the outcomes, we get (3.33, 2, 3, 1.67) and (3.33, 3, 2, 1.67). Notice that this high β makes P2 an unconditional cooperator, preferring the left branch in each case (3.33 > 3 and 2 > 1.67). It is only the reciprocal element, which requires that P1 also be similarly altruistic, preferring to choose left as well, that leads to a cooperative, rather than an exploitative, outcome.

Given their differences, we should calibrate our implementation against Sethi and Somanathan's claims. First, a pair of "materialists" [in our implementation: (0, 0)] will never cooperate and will achieve an equilibrium score of 32. Second, a pair of reciprocal altruistic (2, 2) agents will cooperate in the four social dilemmas and achieve an optimal score of 40. Finally, the reciprocal (2, 2) agents are spiteful enough to threaten materialists into yielding in many games, with an outcome of 21 to the materialist and 34 to the reciprocators. These outcomes guarantee that reciprocators will invade materialists under individual selection. And this is what happens in our evolutionary simulation.

We use a genetic algorithm to drive our evolutionary simulation.§ To apply a genetic programmer to these simple agents, we only need a set of (closed arithmetic) functions that will raise and lower their parameters. We initialize the population with various pairs of values and let the function set work on these values [e.g., here is a typical player: (INCA (INCA (INCL BOTH))); programming code (scheme functions and variables) is written in all capitals]. Starting with BOTH = (1, 1) and using an incremental value of 0.2 yields this pair: (1.4, 1.2).
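A sketch of how such a function set might be realized follows; only BOTH, INCA, and INCL are named in the text (we read them as incrementing α and λ, respectively), and the concrete definitions, including the assumed lowering counterparts DECA and DECL, are our reconstruction.

    (define step 1/5)               ; the incremental value 0.2, kept exact
    (define both (list 1 1))        ; the seed pair (alpha, lambda) = (1, 1)

    (define (inca p) (list (+ (car p) step) (cadr p)))   ; raise alpha
    (define (incl p) (list (car p) (+ (cadr p) step)))   ; raise lambda
    (define (deca p) (list (- (car p) step) (cadr p)))   ; lower alpha (assumed)
    (define (decl p) (list (car p) (- (cadr p) step)))   ; lower lambda (assumed)

    ; the typical player from the text evaluates as promised:
    (inca (inca (incl both)))       ; => (7/5 6/5), i.e., (1.4, 1.2)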
If we seed a population with all materialists, reciprocators immediately invade it. So far, this finding agrees with Sethi and Somanathan. But if we start with a random population of various reciprocators and materialists, we get a surprising result. Typically, α will climb to the limit 2, and λ will continue to increase; see the left scale on Fig. 1. This increase happens because, just as a higher λ allows reciprocators to be spiteful and invade materialists, it also allows "higher" reciprocators to spitefully invade "lower." This process is self-limiting: once λ gets too high, cooperation between similar reciprocators falls off. Thus the average score per game of the population rises until λ jumps. The equilibrium score is 2.0 and the optimum is 2.5; see the scale on the right side of Fig. 1.

Fig. 1. Rising reciprocity.

I write "higher" and "lower" in quotes above to alert us to the insidious aspect of this process. Given the way Eq. 3 links α and λ, higher values allow one to exploit one's "less" altruistic opponents. Of course, according to the high values in the prevalent reciprocal preference function, one would indeed be more altruistic, were one's opponent worthy; but in fact higher values for these parameters allow one to be less altruistic, indeed spiteful, to "lesser" fellow cooperators. This observation should warn us about interpreting this model of reciprocal altruism. The ease with which one can misinterpret this type of reciprocity may give us pause. It is not unreasonable for well-intentioned agents to give their "betters" more leeway out of respect for their higher values. But, unfortunately, evolution (or learning) will find this out, leading to exploitation of these cooperators and a race to the (formally) "higher" moral ground.

Finally, please note that this implementation is no criticism of Sethi and Somanathan. Their model can be protected from this process by fixing a limit on λ. However, given evolutionary pressures to increase λ, this fix amounts to fixing λ at that limit, and we will have thereby lost the ability to explore the interaction of a variety of reciprocal altruists.

§Admittedly, this is overkill when we only need to vary two parameters. Actually, we use genetic programming (15), which is even more overpowered. The reason is to use a uniform evolutionary device for these simple models as well as the more complex ones that follow and still others explored in refs. 8 and 16.

Reciprocal Preferences Can Be Unstable

To continue our investigation of various reciprocal agents, we fix the parameters in a different way. First, we note that the trick of λ increasing only works because Eq. 3 is not fully reciprocal: Eq. 3 has λ respond to differences between one's own and one's opponent's α, but not to differences in λ. To correct this deficiency, we introduce a variant reciprocity factor, made explicit by using ρ in place of λ, that is sensitive to ρ itself. Second, the bias favoring higher α seems to be a mistake. As we have seen, given a reciprocal linking of α, differences matter, whichever of the opponents has the higher α; this is suggested by our deconstruction of "higher altruism" in the previous section. Unfortunately, introducing differences complicates the model. The problem with differences is that they need a reference point, else they will not generate the range −1 < β < 1 needed for the spiteful survival trait. Eq. 4 is one example of a way to fix a reference point. Here ρ is implicitly altruistic, subject to the absolute difference between the two players' ρ compared with a given threshold, T.

    β_ij = ρ_i (T − |ρ_i − ρ_j|)          [4]
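In the same sketch style (the function name is ours), Eq. 4 can be written and probed as follows:

    ; Eq. 4: beta-ij = rho-i * (T - |rho-i - rho-j|)
    (define (beta-rho rho-i rho-j T)
      (* rho-i (- T (abs (- rho-i rho-j)))))

    ; With T = 2, equal rhos give the largest beta for a given rho-i,
    ; while a difference of more than T makes beta negative (spite):
    ; (beta-rho 1 1 2)   => 2
    ; (beta-rho 1 3.5 2) => -0.5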

In a particularly interesting run, we set the threshold T = 2 and got the results shown in Fig. 2, which plots the ρ value of the best-scoring player in each generation on the left-hand scale. In this case, High ρ invades Low, and Medium ρ invades High. High and Low ρ interact as we expect: High cooperate together and Low do not. High invade Low by virtue of spite; High threaten Low in the game of chicken. The key to the complex dynamics is the ability of Medium ρ to invade High. Medium can invade High only because of the constant sum game included in the set. The altruistic equilibrium will typically be different from the self-interest equilibrium, even with constant sums. Medium exploits High's choice of this equilibrium. Finally, unsurprisingly, Low exploits Medium in the social dilemmas.

Only High/High and Medium/Medium populations cooperate, so this model spends about one-third of its cycle in a noncooperative period. The right-hand scale in Fig. 2 plots the population's mean score. Notice the initial drop as the maladapted initial population is exploited, followed by the quick rise to near optimal (2.5) levels. Each time ρ drops, it is followed by a drop in population mean score, with low points at Generations 17 and 28, where many exploited players score below the equilibrium (2.0) level.

Fig. 2. Unstable preference change.

A limitation of this testing method is the arbitrariness of the reciprocal functions and their parameters, many of which (not discussed here) lead to stable reciprocal cooperation. We try to remove some of this arbitrariness in the next section. However, one source of arbitrariness is to be applauded. The set of eight games was not chosen as especially relevant to the current task; it was simply adopted from earlier, unrelated work, where the games were thought to represent the "interesting" possibilities of social interaction. The present results turn on the variety of games used in the test. Had I created a game suite for the task at hand, I likely would have omitted the constant sum game (KS in Table 1) as irrelevant to the social dilemmas and cooperation. Then I would have missed this interesting unstable ecology, driven by KS at one turn. Indeed, the explicit arbitrariness of our game and function sets is a welcome reminder that this paper only attempts to sample some interesting possibilities of what can happen when different sorts of reciprocal cooperators interact. Were we attempting to demonstrate something stronger, we would need to account for the generator of agent mechanisms and social situations. We would also, we should add, need to run the simulations longer than our sample runs of 40 generations or explain why these short runs suffice.

A Strategy-Based Model

A second way to test reciprocity is to model more explicitly the way the agents reason, allowing reciprocity to enter their strategies. The previous model was built on a common rationality mechanism, which allowed P1 to inform its choices by looking ahead in the game tree and using P2's preferences to predict the outcomes of P2's choices. On top of this (explicitly modeled) rationality component, it permitted (without modeling the mechanisms) access to one's own and one's opponent's parameters α, λ, and ρ as needed. The game apparatus of the present model is much simpler; it omits all of these components. Agents have instead access to functions to compare various material outcomes from various points of view. For example, (P1>= NODE1 NODE2 NODE3) chooses NODE1 or NODE2 according to which is better for P1; NODE3 is selected on equality. Other functions compare nodes from P2's perspective, choose the smaller, or compare from the joint perspective of both players. Thus altruism is possible, but, not surprisingly, lacking any means to discriminate between opponents, it does not evolve in this basic model. In ref. 10 I demonstrate the evolution of rationality with this model. Here I elaborate it to try to evolve reciprocity.

Our primitive functions provide no basis for reciprocity, so we add one. We use a matching function to avoid the computational problem of strategies computing other strategies' reciprocal properties. Our function (MATCH YES NO) compares two players' programs, selecting the YES component if they are identical and the NO component otherwise. But reciprocity is only adaptive if the behavior reciprocated is group adaptive while the alternative is individually adaptive, and MATCH requires exact coordination of these strategies. These are high demands at the crude levels of these agents' code. For example, here is a simple (designed) rational player: (CXM (P1> (P2> (LL) (LR)) (P2> (RL) (RR))) (P2> (LL) (LR)) (P2> (RL) (RR))). Therefore, the evolutionary accessibility of stable matching strategies depends on our making more compact representations available in the function set, for example (IMAX) for the above code and (UMAX) for its jointly maximizing alternative. Given these additional functions, MATCH-based reciprocators [e.g., (MATCH (UMAX) (IMAX))], which fully cooperate with, and only with, those that match them, and play rationally with all others, readily evolve.
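How such primitives might be interpreted can be sketched as follows. This is a speculative reconstruction: the paper gives no definitions, so the node representation, the explicit program arguments to MATCH, and the binary UMAX are all our assumptions.

    ; An outcome node is a (P1-payoff . P2-payoff) pair.
    ; (P1>= NODE1 NODE2 NODE3): choose by P1's material payoff,
    ; falling back to NODE3 on equality, as described in the text.
    (define (p1>= node1 node2 node3)
      (cond ((> (car node1) (car node2)) node1)
            ((> (car node2) (car node1)) node2)
            (else node3)))

    ; a jointly maximizing comparison, one reading of what a compact
    ; (UMAX) token could abbreviate
    (define (joint node) (+ (car node) (cdr node)))
    (define (umax node1 node2)
      (if (>= (joint node1) (joint node2)) node1 node2))

    ; MATCH: play YES against an identical program, NO otherwise;
    ; here the two program texts are passed in explicitly.
    (define (match my-program other-program yes no)
      (if (equal? my-program other-program) yes no))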

Table 2. A prisoner's dilemma

       C      D
C     3.5    4.5
D     0      3

Limits of Matching

However, match-based reciprocators need not be full cooperators. When we actually implement reciprocity as matching, we see that it can reinforce a variety of behaviors, not merely full cooperation. Matching will lock in arbitrary features of the first match-based reciprocal cooperators and often leads to suboptimal semicooperation. The very first run of the model illustrates this. In generation 10 this incomprehensible (to me) program evolved and subsequently took over the population: (BOTH> (P1>= (RL) (MATCH (RR) (P2L)) (CXM (P2L) (LL) (P1>= (LL) (LL) (P2R)))) (P1>= (RL) (P2L) (RL))). It uses low-level components to reach the high, but not quite optimal, score of 39 of a fully cooperative 40 in the eight games, managing to cooperate in the prisoner's dilemma, chicken, and the battle of the sexes. This reciprocator has three features worth our attention: (i) it is highly stable (because MATCH blocks further evolution), (ii) it is non-optimal, and (iii) it is complex.

Finally, the tendency to lock into an arbitrary convention allows the possibility that reciprocal cooperators exploit one another. First, notice that demanding identity for a match is unnecessarily strong. All that is required is that players match in the first YES branch; the second NO branch is irrelevant to ensuring that matchers reciprocate (this means that spite toward nonmatchers is not required). We introduce a new function, YMATCH, which incorporates this weaker test. YMATCH-based reciprocal cooperators proliferate more readily than MATCH-based agents do, because it is easier for YMATCH-based agents to get a successful matching pair to seed the process.

YMATCH will allow some reciprocal cooperators to exploit others. All that is needed is for the functions they coordinate around to generate different behavior based on some (otherwise) irrelevant feature, such as length. Here is an example: (YMATCH (BIGGER (IMAX) (UMAX) (UMAX)) (IMAX)), which cooperates when a similar opponent is the same length or longer, but plays rationally when it is the longer of the two and thereby exploits shorter cooperators. Strategies like this will lead to unequal hierarchies with scores correlated with length; see Fig. 3.

Fig. 3. A hierarchy.

Justifying Exploitation

This exploitation and the resulting inequality among "cooperators" may seem a pointless artifact of my implementation or a mistake on the part of the would-be reciprocal cooperators. But do not dismiss this result so quickly. First, YMATCH takes advantage of the evolutionary dynamics to spread. Thus, given that reciprocity must be implemented by some set of mechanisms, something like our BIGGER function may infect any real reciprocal system subject to evolution or learning. Second, although the unequal scores YMATCH and BIGGER allow serve no useful social purpose in the present model (which fact supports talk of "infection"), this need not be the case.

To illustrate, consider a simple model of local interaction, where agents play the prisoner's dilemma in Table 2 with their immediate neighbors on a line (16, 17). The dynamics are driven by imitation; each player copies the strategy of his immediate neighbor who does best. Let the players consist of Defectors and Reciprocators of two types: R1 cooperates when at least one of its immediate neighbors is the same type; R2 cooperates only when both are the same type. Fig. 4 shows the scores of players in the middle of a longer line, with three R1, then three D, and then three R2. Dark bars are cooperating; light bars are defecting. Notice that the R1 exposed to a D neighbor cooperates (because he has one matching neighbor) and the R2 similarly exposed defects. This allows the edge R2 to exploit his R2 neighbor, who cooperates. Thus again reciprocity allows exploitation between reciprocal cooperators.

Fig. 4. Spatial PD. Solid bars, cooperating; hatched bars, defecting.

Consider now the dynamics of the system, which depend on the outcomes at the edge between the types. The rightmost R1 copies his D neighbor, so D moves left. The rightmost D player copies his R2 neighbor, so R2 moves left. Generally, R1 will lose players to D, although R2 will gain them. So in this model, the spread of cooperation depends on cooperators counting those that exploit them in this edge case as cooperators (the PD payoffs were chosen to generate these dynamics).

So we see that our length-sensitive matchers are not making an obvious mistake, nor is the outcome a pointless artifact. Some learning or evolutionary regimes might require exploitation (which might better be termed "sacrifice" in this case) for the proliferation of cooperation. However, the justification of exploitation and its tendency to evolve are quite distinct. Exploitation is justified only in the spatial case but evolves in both.
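The line model can be sketched as follows. This is a minimal reconstruction under stated assumptions: a short line standing in for the middle of the paper's longer line (so the two end cells are edge-distorted), and Table 2 read so that the focal player earns 3.5 for mutual cooperation, 4.5 for defecting against a cooperator, 3 for mutual defection, and 0 for cooperating against a defector, one way to satisfy the PD ordering.

    ; a minimal sketch, not the paper's code
    (define line '(r1 r1 r1 d d d r2 r2 r2))

    (define (type i) (list-ref line i))

    (define (neighbors i)
      (let ((last (- (length line) 1)))
        (append (if (> i 0) (list (- i 1)) '())
                (if (< i last) (list (+ i 1)) '()))))

    (define (count-if pred lst)
      (if (null? lst)
          0
          (+ (if (pred (car lst)) 1 0) (count-if pred (cdr lst)))))

    ; R1 cooperates given at least one same-type neighbor; R2 only
    ; when all of its neighbors are the same type; D always defects.
    (define (move-of i)
      (let* ((ns (neighbors i))
             (matches (count-if (lambda (j) (eq? (type j) (type i))) ns)))
        (cond ((eq? (type i) 'd) 'defect)
              ((eq? (type i) 'r1) (if (>= matches 1) 'cooperate 'defect))
              (else (if (= matches (length ns)) 'cooperate 'defect)))))

    ; our reading of Table 2: payoff to the focal player
    (define (pd me other)
      (cond ((eq? me 'cooperate) (if (eq? other 'cooperate) 3.5 0))
            (else (if (eq? other 'cooperate) 4.5 3))))

    (define (score i)
      (apply + (map (lambda (j) (pd (move-of i) (move-of j))) (neighbors i))))

    ; imitation: adopt the type of the best scorer among self and neighbors
    (define (best-of is)
      (let loop ((best (car is)) (rest (cdr is)))
        (if (null? rest)
            best
            (loop (if (> (score (car rest)) (score best)) (car rest) best)
                  (cdr rest)))))

    (define (indices n)
      (let loop ((i 0)) (if (= i n) '() (cons i (loop (+ i 1))))))

    (define (step-line)
      (map (lambda (i) (type (best-of (cons i (neighbors i)))))
           (indices (length line))))

With these payoffs, the rightmost R1 (index 2) scores 3.5 while its D neighbor (index 3) scores 7.5, so imitation turns index 2 into a D; and the rightmost D (index 5) scores 6 while its R2 neighbor (index 6) scores 7.5, so index 5 becomes an R2. Defection thus moves left at one boundary and R2 moves left at the other, as in the text.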
Conclusions

Our results support the conclusions of Levine and of Sethi and Somanathan. Reciprocity is necessary to stabilizing cooperation. It allows altruistic cooperators to do better than self-interested rational agents. However, when we move toward implementing reciprocal mechanisms, the issue of competition between cooperators becomes central. At the most abstract level, reciprocal cooperators can compete by having "more altruistic" preferences, and they can find themselves with dynamically unstable preferences. When we implement reciprocal mechanisms, cooperators can be exploited by those with whom they have coordinated on reciprocity. Reciprocity, although stable, need not be optimal and can be opaque in its complexity.

In the wider context of recent work on game theory and ethics, these results are not surprising. Binmore notes, "while the emphasis is usually on the fact that such reciprocity mechanisms can be used to sustain Pareto-efficient equilibria in indefinitely repeated games, it is important to recognize that the same mechanism can also be used to stabilize inefficient equilibria" (18). Boyd and Richerson's apt title, "Punishment allows the evolution of cooperation (or anything else) in sizable groups," warns of similar properties in reciprocal social systems (19).

I thank Rik Blok and Josh Epstein for very helpful comments on earlier versions, and Bob Axtell, Rajiv Sethi, and Brian Skyrms for helpful feedback on this research project.

1. Sober, E. & Wilson, D. S. (1998) Unto Others: The Evolution and Psychology of Unselfish Behavior (Harvard Univ. Press, Cambridge, MA).
2. Axelrod, R. M. (1984) The Evolution of Cooperation (Basic Books, New York).
3. Gauthier, D. (1986) Morals by Agreement (Oxford Univ. Press, Oxford).
4. Binmore, K. (1994) Game Theory and the Social Contract: Playing Fair (MIT Press, Cambridge, MA).
5. Danielson, P. (1992) Artificial Morality: Virtuous Robots for Virtual Games (Routledge, London).
6. Binmore, K. (1998) J. Art. Soc. Social Sim. 1.
7. Axelrod, R. M. (1997) The Complexity of Cooperation (Princeton Univ. Press, Princeton).
8. Danielson, P. (1998) in Modeling Rationality, Morality, and Evolution, ed. Danielson, P. (Oxford Univ. Press, New York), Vol. 7, pp. 423–441.
9. Sethi, R. & Somanathan, E. (2001) J. Econ. Theor. 97, 273–297.
10. Danielson, P. in The Handbook of Rationality, eds. Mele, A. & Rawling, P. (Oxford Univ. Press, Oxford), in press.
11. Levine, D. K. (1998) Review of Economic Dynamics 1, 593–622.
12. Guth, W. & Yaari, M. (1992) in Explaining Process and Change: Approaches to Evolutionary Economics, ed. Witt, U. (Univ. of Michigan Press, Ann Arbor).
13. Danielson, P. (2002) J. Interest Group Formal Appl. Logic, in press.
14. Koza, J. R. (1992) Genetic Programming: On the Programming of Computers by Means of Natural Selection (MIT Press, Cambridge, MA).
15. Danielson, P. (2001) in Practical Rationality and Preference: Essays for David Gauthier, eds. Morris, C. & Ripstein, A. (Cambridge Univ. Press, New York), pp. 173–188.
16. Danielson, P. (1998) Can. J. Philos. 28, 627–652.
17. Eshel, I., Samuelson, L. & Shaked, A. (1998) Am. Econ. Rev. 88, 157–179.
18. Binmore, K. (1998) Game Theory and the Social Contract: Just Playing (MIT Press, Cambridge, MA).
19. Boyd, R. & Richerson, P. J. (1992) Ethol. Sociobiol. 13, 171–195.
