arXiv:1402.4946v1 [cs.GT] 20 Feb 2014 ino ieto nietrcpoiy ndrc no- direct some In on reciprocity. rely indirect models or direct these of of tion most puzzle, this solve [14]. interest group a a and is there individual as between where serves settings conflict game real-world various The natural for (by [22]. metaphor dominant and players become players In defecting selection) cooperative player. mechanism incen- cooperative outperform special the an any would to has of cost player a absence each at the ride” Thus player “free other to defecting. the tive off cooperates player better one is defect if mutual or But than payoff (C) defection. higher gives cooperate where cooperation to game Mutual player whether two (D). simultaneous decides a player is cooperation PD each of 10]. 8, the 6, study [2, to games used widely individuals. (in reward) performing accumulated in- better of by of terms selection abundance are relative natural players the mimics generation, creasing EGT subject Thus each fitness, mutation. of relative to their end to the proportional At reproduced a to [26]. the play according usually game reward agents matrix, receive payoff generation and specified each a games In of number puzzle. finite framework this simple a study societies provides to 30] Evolutionary human [18, 29]. and (EGT) 21, theory animal [7, game ubiquitous in is biolog- settings cooperation numerous eventually where social are should there and oneself contrary, benefits to ical the that cost On action a An at disappear. the comes [1]. to and selection challenge others natural a of presents theory organisms competing ingly ∗ † [email protected] [email protected] lhuhmn ouin aebe u ot ore- to forth put been have solutions many Although most the of one is (PD) dilemma prisoner’s The seem- among behavior cooperative of evolution The xlctmmr fpeiu aepas u oe sbsdon based is c model produces Our that plays. model game a previous developing of in requir memory based interested which explicit reciprocity are interactions, These we previous their paper reciprocity. know and indirect th partner or approaches existing direct the of of Most intelligence. artificial upr h yohssta nqiyaeso rmtscoo promotes cooperat aversion which inequity well under that the hypothesis conditions inequa both the the income considering support present using by of and model effect our population the study We explore we interaction. whereby Here economics, outcomes. behavioral in within introduced concept a vlto fcoeaini ieysuidpolmi biol in problem studied widely a is cooperation of Evolution .INTRODUCTION I. nqiyAeso n h vlto fCooperation of Evolution the and Aversion Inequity nentoa nttt fIfrainTechnology-Hyder Information of Institute International sa Ahmed Asrar rsnrsdilemma prisoner’s etefrDt Engineering Data for Centre ∗ yeaa,India Hyderabad, n aaaa Karlapalem Kamalakar and t yfron oeaypyff ftemtra payoffs material the If players payoff. equal- of monetary income by forgoing enforce proposed by individuals as ity [9], model, Schmidt aversion and inequity Fehr the theoretical In and results models. experimental between anomalies as economics behavioral in- and aversion to within formalized averse been studied are has model widely concept individuals Our This that inequalities. notion explicitly. or come the known type on agent’s be based an to is require interaction not past does its that cooperation of dilemma prisoner’s round sta- single produce game. in can even Experimental model based cooperation tag ble tags. a inter- that similar show their [27] having bias results agents favorably with to cultural use actions or agents traits observable which simple artifacts are Tags 28]. [27, tion oeso oprto sthe is cooperation of models same the in agents other Agents with group. groups. cooperate on and also groups but form individuals selection on that only proposing not by works issues these selection addresses Group 36] [34, individuals. out- unrelated cooperative in kin explain observed not between comes does differentiate and agents to selection non-kin agents and genetic kin requires their towards It In biased favorably relatives. are models. agents 33] selection [13, group and selection cog- dynamic advanced abilities. large require nitive would In assumption an oth- partner interactions. such their to settings previous recognize kind their agents are know assume who and models notion those Both the to on kind ers. relies are 24] individuals 20, agent that [17, other reciprocity the cooper- Indirect (unkind) how cooperative was. on non based or inter- (kind) response each ative their after update and they repeatedly, action interact agents 35] [19, nti ae,w r neetdi eeoigamodel a developing in interested are we paper, this In nitrsigadto oteaoereciprocity-free above the to addition interesting An kin include reciprocity require not do that Approaches tepancoeainrl nsm notion some on rely cooperation explain at eaierltosi mn o kin. non among relationship perative savne ontv blte.I this In abilities. cognitive advanced es oesasm gnsrcgietheir recognize agents assume models niiul aeaotpyffequality payoff about care individuals 9.I a nrdcdt con o outcome for account to introduced was It [9]. o eoe oiat u results Our dominant. becomes ion mxdadtesailystructured spatially the and -mixed ,j i, g,sca cec,eooisand economics , social ogy, iyt ud ate eeto and selection partner guide to lity oeainwtotrqiigany requiring without ooperation h oinof notion the are x † i x , abad j epciey the respectively, , nqiyaversion inequity tag-mediated ate selec- partner , experienced inequity 2 utility (which players maximize) of player i is given by 1 5 0.8 1 Ui(xi, xj )= xi −k1 ·max(xj −xi, 0)−k2 ·max(xi −xj , 0) 0.5

0.6 where k1 < k2 and 0 ≤ k2 ≤ 1 are inequity aversion sen-

(i, j) 0.4

sitivity parameters. It follows from the above equation P that the experienced utility is maximized when inequity 0.2 is zero. The condition k1 < k2 implies that the decrease in a player’s utility is higher when it is behind (in terms 0 −9 −6 −3 0 3 6 9 of payoff) than when it is ahead. Fehr and Schmidt [9] rj show that the above model can account for cooperation observed in various human interactions. FIG. 1. P(i, j) for different values of rj and λi = {5, 1, 0.5} Here we study the effect of using the social paradigm when ri = 0 and λj = 0. of inequity aversion as a criterion for partner selection. Instead of defining experienced utility we simply allow agents to use accumulated payoff to select their game partner. Thus inequity aversion in our context implies Cooperative players provide benefit b to their oppo- that agents avoid interacting with other agents whose nent and incur a cost c (b>c> 0). Defecting players accumulated payoff is higher or lower than their own pay- neither provide benefit nor incur any cost. Given the off. Our experimental results show that cooperation can payoff structure, the dominant strategy for each player, emerge even if individuals receive only rudimentary envi- irrespective of what the other player does, is to defect. ronmental signals about others’ well-being (and not their Thus players end up with a payoff of zero instead of the type or specific behavior). It supports the hypotheses mutually beneficial payoff b − c. This characterizes that inequity aversion promotes cooperation among non the dilemma between individual interest and group well- kin [4]. We also observe a strong correlation between co- being. Without loss of generality, we use the following operation and inequity aversion that indicates a possible normalized payoff matrix: coevolution of these two behaviors [4]. The rest of the paper is organized as follows. We first 1 0 present the model and define the partner selection bias . 1+ c/b c/b as a function of income inequality. We then evaluate the model by considering both the well-mixed and the This allows us to study the game as a function of a spatially structured population and highlight the evolu- single cost-to-benefit parameter, 0 < c/b < 1 [11, 16]. tionary dynamics of cooperation and inequity aversion. Partner selection and interaction. In each generation, We also study the effects of various parameters on the agents are selected sequentially (random order) and play fraction of cooperative players in the environment and a single round of the prisoner’s dilemma game. Players then conclude. use their accumulated payoff and sensitivity parameter to determine their game partner. The search space variable Ns ∈ N gives the subset of agents from which players II. MODEL choose their partner. Denote ri, rj as the accumulated payoff and λi, λj as the sensitivity parameters of players In this section we first introduce the general model i, j, respectively. The probability that player i would where agents interact in a well-mixed population. We accept player j is given by later present the model with spatial constraints. Agents. The model consists of a fixed number of agents −λi|ri−rj | N. Each agent is either the cooperative or defecting type. φ(i, j)= e . The cooperative type always plays cooperate, and the de- Given the search space Ns, player i selects player fecting type always plays defect. Agents have an associ- j such that the above probability is maximized, j = ated parameter λ which gives a measure of how sensitive argmaxj∈Ns φ(i, j). We similarly define φ(j, i) as the or tolerant they are to payoff inequality. A player’s type probability that player j would accept player i as its part- and sensitivity parameter are subjected to evolutionary ner. The probability of interaction between players i and changes (see evolutionary step below). Players also have j is given by an accumulated reward which is initialized to zero at the start of each generation and updated according to the P(i, j)= φ(i, j)φ(j, i) payoff received in each game. = e−(λi+λj )|ri−rj |. (1) Payoff matrix. The payoff received by each player is specified by the standard prisoner’s dilemma game: The interaction is bilateral or with mutual consent since the probability of interaction depends on both play- b − c −c ers’ sensitivity parameters. We set the range of the sen- .  b 0  sitivity parameter λ ∈ [0, 5]. If λi = λj = 0, players are 3

1 200 0.8 0.6 fc 0.4 150 C-C 0.2 0 C-D 750 2250 3750 5250 6750 8250 9750 11250 12750 14250 D-D generation g 100

(a) #Games 5 50 4 3 λ 2 0 1 0 0.5 1.1 1.7 2.3 2.9 3.5 4.1 750 2250 3750 5250 6750 8250 9750 11250 12750 14250 generation g λ (b) (c)

FIG. 2. (a) The fraction of cooperative players fc across generations for a typical run for N = 250, Ns = 8, and c/b = 0.45. (b) The corresponding average λ value of cooperative players. (c) The corresponding number of games with C-C, C-D, and D-D interactions vs λ. indifferent to inequity, and as λ increases, they become Algorithm 1 Algorithm for agent interaction increasingly inequity averse. The upper limit of 5 ensures Require: µ, N, Ns. that the probability P(i, j) is close to zero, even for small while generation g ≤ gmax do payoff differences. To illustrate, Figure. 1 gives the prob- for all agents i ∈ N do ability of interaction between players i and j for different select Ns ∈ N agents randomly j = arg max φ(i, j) values of λi when i’s accumulated reward ri = 0 and j∈Ns With Probability P(i, j), Play(i, j) λj = 0. For rj = 1 and λi = 0.5, the probability of in- teraction P(i, j)=0.6 which decreases to P(i, j)=0.006 Update accumulated payoff ri,rj end for for λi = 5. New Set Of Players N = EvolutionaryStep(N, µ) Once the partner is selected, the PD game is played end while with the probability given in Equation. (1). If both play- ers are cooperative, their accumulated payoff increases by 1. If only one of the players is cooperative, the de- III. RESULTS AND DISCUSSION fecting player’s accumulated payoff increases by 1 + c/b, and the cooperative player’s accumulated payoff remains unchanged. We recall that N is the number of agents, Ns is the We note that, unlike tag-mediated models of cooper- search space, and c/b is the cost-to-benefit ratio. We ation [27] where an agent’s tags remain fixed in a given denote the fraction of cooperative players with fc. C-C generation, in our model the accumulated payoff serves denotes the cooperative-cooperative player interaction. as a dynamic tag which changes after each interaction. We similarly define C-D and D-D interactions. For all Evolutionary step. At the end of each generation experiments, we set µ =0.1 and initialize λ to a uniform agents are reproduced using the binary tournament pro- value between [0, 5]. The initial fraction of cooperative cedure [12, 31]: players is set to 10%. We report the results by averaging across 20 runs, with each run consisting of 15000 gener- (1) Two distinct agents are randomly selected for a tour- ations. Our results cover the following aspects: (1) the nament, and the agent with the higher fitness value is evolutionary dynamics of fc, λ, and the correlation be- declared the winner (we use the accumulated reward tween them, (2) the change in the fraction of games with as the fitness value). C-C, C-D, and D-D interactions as λ value changes, (3) (2) A copy of the winner, called the offspring, is added the effect of number of agents and search space on fc as to the new generation and the above procedure is re- the cost-to-benefit ratio c/b increases, and (4) the effect peated N times. of spatial constraints on fc.

Additionally, a mutation is applied to each offspring. The mutation value gives the probability with which the A. Mixed population offspring’s type is randomly reset to either the cooper- ative or defecting type. Since the sensitivity parameter Figure 2(a) shows the fraction of cooperative players fc is a continuous variable, we apply mutation by adding, across generations for a typical run (with N = 250, Ns = with probability µ, Gaussian noise with mean 0 and de- 8 and c/b = 0.45). Figure 2(b) gives the corresponding viation 1 to the inherited λ value [28]. The accumulated average λ value of cooperative players. We observe high reward of the offspring is set to zero. The algorithm in levels of cooperation fc ≥ 0.75 interrupted by brief pe- Table 1 provides the pseudo code of the model. riods of defection. The generations when the fraction of 4

1 1 1.0 100 6 0.9 300 200 8 0.8 0.8 0.8 300 10 12 250 0.7 0.6 0.6 f f 0.6 c c N 200 0.5 0.4 0.4 0.4 150 0.3 0.2 0.2 0.2 100 0 0 0.1

0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 4 6 8 10 12 0.0 c/b c/b Ns (a) (b) (c)

FIG. 3. (a) fc vs c/b for Ns = 8 and N = {100, 200, 300}. (b) fc vs c/b for N = 150 and Ns = {6, 8, 10, 12}. (c) Color map depicting fc on the Ns − N plane for c/b = 0.4. cooperative players falls significantly are highlighted to mulated payoff increases. The environment faces a slight show the close correlation between fc and λ. Across all selection pressure towards higher tolerance levels (lower simulations we observe that an increase or decrease in λ value). As this happens gradually over generations, at the fraction of cooperative players is preceded by an in- some threshold λ, agents become vulnerable to invasion crease or decrease in the average λ value. For example, by defectors, and the cooperation levels fall sharply. The as shown in Figures 2(a) and 2(b), from generation 7100 system remains in this state until by chance, due to mu- to 7500, the average λ value decreases gradually from 4.2 tation, a few cooperative players with a relatively high λ to 0.3. We see a corresponding decrease in the fc value value emerge and reestablish cooperation. from 0.84 to 0.08. To further validate the correlation between fc and λ, At the start of each generation, as the accumulated Figure 2(c) shows the number of games with C-C, C- payoff of all players is 0, the probability of interaction D, and D-D interactions as a function of λ. For a low λ between any two players is 1, irrespective of λ. So in the value C-D and D-D interactions dominate. As the λ value initial stages, defecting players have a slight advantage increases, cooperative clusters emerge, and C-C becomes (since the cooperative players would not reject them as dominant. a game partner), and their accumulated payoff increases. But after the first few games, the cooperative agents in- The cycles of cooperation and defection are not regu- volved in C-D interactions start forming temporary clus- lar or periodic, and how often the system goes into the ters sharing a common payoff. If these cooperative play- defection state and how quickly it recovers depends on ers are sufficiently inequity averse, in the subsequent it- the parameters N, Ns, and c/b. In general we observe erations, they are more likely to interact within these that as the cost-to-benefit ratio c/b increases or Ns de- clusters, and their payoff increases. The defecting players creases, it takes longer for the system to recover from the who initially exploit the cooperative players face a form of defection state. “social exclusion” from these groups. Thus inequity aver- Figures 3(a) and 3(b) show the fraction of cooperative sion allows cooperative players to seek new partners and players vs the cost-to-benefit ratio for different values of form groups with whom they can share more equitable N and Ns, respectively. Across both simulations, as c/b payoffs. As the interactions proceed, cooperative players increases, fc decreases, and for c/b ≥ 0.5, cooperation that are more tolerant (small λ value) to income inequal- disappears. With respect to N, we observe the fc value ity continue to interact with defecting agents. They are decreases marginally for 0.3 ≤ c/b ≤ 0.5 as N decreases. outperformed by cooperative players with relatively high The change in fc is considerably higher with respect to λ value and die out. Over generations the average λ of change in Ns values. As the search space increases, fc cooperative players increases, which further reduces the increases. The search space affects the probability of likelihood of C-D interactions until cooperation becomes selecting an agent from within the temporary clusters dominant. that emerge. A higher search space value translates to a However, this cooperation due to the temporary clus- higher probability of interaction within the cluster. We tering effect is not permanent. When cooperation is es- also observe a “thresholding” effect; that is, for a fixed tablished, agents have a high λ value. At this stage, change in Ns the increase in the fc value is higher for the cooperative agents with a relatively smaller λ value smaller values of Ns. Figure 3(c) shows a color map of have a slight advantage as they are more likely to toler- fc for different values of Ns (along the X axis) and N ate inequality and play the game. And since cooperative (along the Y axis) with c/b =0.45. For a high (Ns ≥ 12) agents are dominant, with high probability these agents or low search space (Ns ≤ 4), fc does not change with interact with other cooperative agents, and their accu- N. 5

1 1 1 0.8 0.6 12x12 0.4 fc 0.8 0.8 16x16 0.2 20x20 0 24x24 100 300 500 700 900 1100 1300 1500 1700 1900 0.6 Within 0.6 generation g f Across f g c (a) 0.4 0.4

5

 0.2 0.2 3

λ  1 0 0 0 2 8 14 20 26 32 38 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 100 300 500 700 900 1100 1300 1500 1700 1900 generation g generation g c/b (b) (c) (d)

FIG. 4. (a) The fraction of cooperative players fc across generations for a typical run with N = 12 × 12 and c/b = 0.45. (b) The corresponding average λ value of cooperative players. (c) The corresponding fraction of games (fg ) within and outside the clusters for the first few generations. (d) fc vs c/b for different grid sizes.

B. Two-dimensional lattice side the clusters for the first few generations. The cor- responding snapshots for the first 40 generations along We now consider the spatial prisoner’s dilemma game with the size of the largest cooperative cluster (given be- [23, 32] where each player occupies a cell in a square lat- low each plot) are shown in Figure. 5. Defecting players tice. Similar to the well-mixed population model, agents are in black. We observe the cooperative players become are either the cooperative or defecting type and use their dominant by generation 32. accumulated payoff to select their game partner and in- Figure 4(d) shows the fraction of cooperators vs c/b teract. But due to spatial constraints an agent’s search for different grid sizes. We observe fc does not change space is restricted to its four neighboring cells. We also with N, and similar to the well-mixed model, cooperation ≥ change the binary tournament procedure to reflect an disappears for c/b 0.5. agent’s fixed position. For each offspring cell to be added to the new generation, two distinct agents are randomly selected from the neighborhood of the cell. The agent IV. CONCLUSION with the higher fitness values is declared the winner and the offspring inherits the winner’s type and λ value. Mu- We developed a model of agent interaction that is tation is applied as discussed in the previous model. motivated by the social paradigm that individuals are Figures 4(a) and 4(b) show the evolutionary dynam- inequity-averse and prefer to interact with others within ics of fc and λ for a typical run with N = 12 × 12 and the same social strata. We presented the results by con- c/b = 0.45 for the first 2000 generations (similar behav- sidering both the well-mixed and the spatially structured ior is seen in subsequent generations). Across all sim- populations across different parameter values. In general, ulations, we observe that the cooperation in the spatial cooperation becomes dominant when the cost of coopera- prisoner’s dilemma is more “robust” (i.e., unlike the pre- tion is low and is more robust for a structured population. vious model where cooperation almost disappears before Our results support the hypotheses that inequity aver- recovering, here we see stable cooperation). The increase sion promotes cooperation among non kin [4]. It allows and decrease in the fc value are marginal but are still individuals to seek new partners with whom they can closely correlated with the increase and decrease in the share more equitable payoffs. And if the equitable pay- λ value. off increases the relative fitness of such individuals (as is While a single cooperative agent surrounded by defec- the case with the prisoner’s dilemma), natural selection tors is always outperformed, cooperative players that are would guarantee that inequity averse cooperative agents sufficiently inequity averse and adjacent to each other emerge. We also observe a strong correlation between have two advantages: (1) the temporary clustering that inequity aversion and cooperation that points to the co- arises due to sharing a common payoff and (2) the clus- evolution of these behaviors. Brosnan [4] provides an ex- tering provided by spatial constraints. Like the previous tensive discussion of inequity aversion observed in other model, in the initial games of a generation, if coopera- species [15, 25], including capuchin monkeys [3], which tive and defecting agents interact, the inequity in accu- “elucidate evolutionary precursors to inequity aversion” mulated payoff ensures that in the subsequent iterations [5]. the probability of C-D interactions decreases. And as We believe the model presented in this paper is an im- is traditionally the case, the spatial constraint further portant step towards better understanding coevolution improves the performance of cooperative players since of cooperation and inequity aversion. In the future, we players in the interior of the cluster enjoy the benefit of intend to evaluate the model by considering generic ran- mutual cooperation. Figure 4(c) shows the fraction of dom networks and to incorporate other social factors like interactions (for the above sample run) within and out- group membership, dominance rank, context of interac- 6

FIG. 5. Snapshots for first 40 generations on 12 × 12 grid and c/b = 0.45. Defecting players are in dark. The labels give the size of the largest cooperative cluster. tion, etc., which have been shown to effect the overall response to inequity [4].

[1] Robert M. Axelrod. The evolution of cooperation. Basic [9] Ernest Fehr and Klaus M Schmidt. A theory of fairness, Books, New York, 1984. competition, and cooperation. Qarterly Journal of Eco- [2] Robert Boyd. Mistakes allow evolutionary stability in the nomics, 114(3):817–868, August 1999. repeated prisoner’s dilemma game. Journal of Theoretical [10] David B. Fogel. On the relationship between the duration Biology, 136(1):47–56, Jan 1989. of an encounter and the evolution of cooperation in the [3] S. F. Brosnan and F. B. M de Waal. Monkeys reject iterated prisoner’s dilemma. Evol. Comput., 3(3):349– unequal pay. , 424:297–299, 2003. 363, September 1995. [4] Sarah F Brosnan. A hypothesis of the co-evolution of [11] Feng Fu, Martin A. Nowak, and Christoph Hauert. Inva- cooperation and responses to inequity. Frontiers in Neu- sion and expansion of cooperators in lattice populations: roscience, 5(43), 2011. Prisoner’s dilemma vs. snowdrift games. Journal of The- [5] Sarah F. Brosnan and Frans B. M de Waal. Animal oretical Biology, 266(3):358 – 366, 2010. behaviour: Fair refusal by capuchin monkeys. Nature, [12] David E. Goldberg and Kalyanmoy Deb. A comparative 428(6979):140, 2004. analysis of selection schemes used in genetic algorithms. [6] Michael Doebeli and Christoph Hauert. Models of coop- In Foundations of Genetic Algorithms, pages 69–93. Mor- eration based on the Prisoner’s Dilemma and the Snow- gan Kaufmann, 1991. drift game. Ecology Letters, 8(7):748–766, July 2005. [13] W. D. Hamilton. The genetical evolution of social be- [7] L.A. Dugatkin. Cooperation among animals: an evolu- haviour. I. Journal of Theoretical Biology, 7(1):1–16, July tionary perspective. Oxford Series in Ecology and Evolu- 1964. tion Series. , Incorporated, 1997. [14] Garrett Hardin. The tragedy of the commons. Science, [8] J. Farrell and R. Ware. Evolutionary stability in the 162:1243–1248, December 1968. repeated prisoner’s dilemma. Theoretical Population Bi- [15] Fatemeh Heidary, Mohammad Reza Vaeze Mahdavi, Far- ology, 36(2):161–166, Oct 1989. shad Momeni, Bagher Minaii, Mehrdad Rogani, Nader 7

Fallah, Roghayeh Heidary, and Reza Gharebaghi. Food A Study in Conflict and Cooperation. University of inequality negatively impacts cardiac health in rabbits. Michigan Press, Ann Arbor, 1965. PLoS ONE, 3(11):e3705, 11 2008. [27] Rick L. Riolo. The effects and evolution of tag-mediated [16] P Langer, M Nowak, and C Hauert. Spatial invasion of selection of partners in populations playing the iterated cooperation. J Theor Biol, 250:634–641, 2008. prisoner’s dilemma. In Seventh International Conference [17] O. Leimar and P. Hammerstein. Evolution of coopera- on Genetic Algorithms, pages 378–385, San Francisco, tion through indirect reciprocity. Proceedings. Biological CA, USA, 1997. Morgan Kaufmann Publishers Inc. sciences / The Royal Society, 268(1468):745–753, April [28] Rick L. Riolo, Michael D. Cohen, and Robert Axelrod. 2001. Evolution of cooperation without reciprocity. Nature, [18] John Maynard Smith and George R. Price. The Logic 414(6862):441–443, November 2001. of Animal Conflict. Nature, 246(5427):15–18, November [29] K. Sigmund. Games of life: explorations in ecology, evo- 1973. lution, and behaviour. Penguin science. Oxford Univer- [19] Martin Nowak and . A strategy of win- sity Press, 1993. stay, lose-shift that outperforms tit-for-tat in the pris- [30] John Maynard Smith. Evolution and the Theory of oner’s dilemma game. Nature, 364(6432):56–58, 1993. Games. Cambridge University Press, Cambridge, UK, [20] Martin Nowak and Karl Sigmund. Evolution of indirect 1982. reciprocity by image scoring. Nature, 393:573–7, June [31] Shinsuke Suzuki and Eizo Akiyama. Reputation and the 1998. evolution of cooperation in sizable groups. Proceedings of [21] Martin A. Nowak. Evolutionary Dynamics: exploring the the Royal Society B: Biological Sciences, 272(1570):1373– Equations of Life. Belknap Press of 1377, July 2005. Press, September 2006. [32] Gy¨orgy Szab´oand Csaba T˝oke. Evolutionary prisoner’s [22] Martin A. Nowak. Five Rules for the Evolution of Coop- dilemma game on a square lattice. Phys. Rev. E, 58:69– eration. Science, 314(5805):1560–1563, December 2006. 73, Jul 1998. [23] Martin A. Nowak and Robert M. May. Evolutionary [33] P.D. Taylor. Altruism in viscous populations an inclusive games and spatial chaos. Nature, 359(6398):826–829, Oc- fitness model. Evolutionary Ecology, 6(4):352–356, 1992. tober 1992. [34] A. Traulsen and M. A. Nowak. Evolution of coopera- [24] Martin A. Nowak and Karl Sigmund. Evolution of indi- tion by multilevel selection. Proceedings of the National rect reciprocity. Nature, 437:1291–1298, 2005. Academy of Sciences of the United States of America, [25] Friederike Range, Lisa Horn, Zs´ofia Viranyi, and Ludwig 103(29):10952–10955, July 2006. Huber. The absence of reward induces inequity aversion [35] Robert L. Trivers. The Evolution of Reciprocal Altruism. in dogs. Proceedings of the National Academy of Sciences The Quarterly Review of Biology, 46(1):35–57, 1971. of the United States of America, 106(1):340–345, January [36] D. S. Wilson. A theory of . Proceedings 2009. of the National Academy of Sciences of the United States [26] A. Rapoport and A. M. Chammah. Prisoner’s Dilemma: of America, 72(1):143–146, January 1975.