J. theor. Biol. (2002) 218, 187–194 doi:10.1006/yjtbi.3067, available online at http://www.idealibrary.com on

Replicator Dynamics for Optional Games Christoph Hauertwz,SilviaDeMontewy, Josef Hofbauerw and Karl Sigmundnw8

w Institute for Mathematics, University of Vienna, Strudlhofgasse 4, A-1090 Vienna, Austria z Department of Zoology, University of British Columbia, 6270 University Boulevard, Vancouver, BC, Canada V6T1Z4 y Department of Physics, Danish Technical University, DK-2800 Kgs. Lyngby, Denmark and 811ASA, A-2361 Laxenburg, Austria

(Received on 8 October 2001, Accepted in revised form on 12 April 2002)

The public goods game represents a straightforward generalization of the prisoner’s dilemma to an arbitrary number of players. Since the dominant is to defect, both classical and evolutionary predict the asocial that no player contributes to the public goods. In contrast to the compulsory public goods game, optional participation provides a natural way to avoid deadlocks in the state of mutual defection. The three resulting strategiesFcollaboration or defection in the public goods game, as well as not joining at allFare studied by means of a replicator dynamics, which can be completely analysed in spite of the fact that the payoff terms are nonlinear. If is valuable enough, the dynamics exhibits a rock-scissors-paper type of cycling between the three strategies, leading to sizeable average levels of cooperation in the population. Thus, voluntary participation makes cooperation feasible. But for each strategy, the average payoff value remains equal to the earnings of those not participating in the public goods game. r 2002 Elsevier Science Ltd. All rights reserved.

Introduction In this article, we present another mechanism to Most theories on the emergence of cooperation achieve sizeable levels of cooperation in a popula- among selfish individuals are based on kin tion. The following investigation is based on the . selection (Hamilton, 1963), group selection public goods game (see Fehr& G achter, 2002; (Wilson & Sober, 1994) and reciprocal altruism Kagel & Roth, 1995) which represents a natural (Trivers, 1971). In all three models, cooperative extension of the prisoner’s dilemma to an arbitrary behavioris attained throughbasic mechanisms number of players (see Boyd & Richerson, 1988; of discrimination enabling individuals to target Dawes, 1980; Hauert & Schuster, 1997). their altruistic acts towards certain partners In a typical public goods game, an experi- only. menter gives 20 dollars to each of eight players. The players may contribute part or all of their money to some common pool. The experimenter then triples this amount and divides it equally n Corresponding author. Institute for Mathematics, Uni- among the eight players, irrespective of the versity of Vienna, Strudlhofgasse 4, A-1090 Vienna, Austria. Tel.: +43-3-4277-50612; fax: +43-1-4277-9506. amount of theirindividual contribution.If all E-mail address: [email protected] (K. Sigmund). players contribute maximally, they will end up

0022-5193/02/$35.00/0 r 2002 Elsevier Science Ltd. All rights reserved. 188 C. HAUERT ET AL. with 60 dollars each. But each individual is faced whetherto participatein the public goods game with the temptation to exploit, as a free rider, or not. [For a similar approach in the prisoner’s the contributions of the co-players. Hence, the dilemma see Batali & Kitcher(1995); Orbell& dominating strategy is to invest nothing at all. If Dawes (1993)]. Individuals unwilling to join the all players do this, they will not increase their public goods game are termed loners. These initial capital. In this sense, the ‘‘rational’’ players prefer to rely on a small but fixed payoff equilibrium solution prescribed to ‘‘homo oeco- Pl ¼ sc with nomicus’’ leads to economic stalemate. In actual experiments, players tend to invest a lot, how- 0osor À 1; ð2Þ ever: typically, in the first round, they invest 10 dollars or more (Fehr & Gachter,. 2002). such that the members in a group where all Public goods games are abundant in human cooperate are better off than loners, but loners and animal societies, and can be seen as basic are better off than members in a group of examples of economic interactions (see e.g. defectors. Binmore, 1994; Dugatkin, 1997). For the optional public goods game, there are thus three behavioral types in the population: (a) The Model the loners unwilling to join the public goods We considera largepopulation of players. game, (b) the cooperators ready to join the From time to time, N such players are chosen group and to contribute their effort, and (c) the randomly. Within such a group, players can either defectors who join, but do not contribute. contribute some fixed amount c ornothing at all. Assuming that groups form randomly, the pay- The return of the public good, i.e. the payoff to the offs for the different strategies Pc; Pd and Pl are players in the group, depends on the abundance of then determined by the relative frequencies x; y and z of the three strategies. cooperators. If nc denotes theirnumberamong the public goods players, the net payoff for coopera- tors Pc and defectors Pd is given by The Equations of Motion n assumes that a P ¼Àc þ rc c; c N strategy’s payoff determines the growth rate of its frequency within the population. More n precisely, following (Weibull (1995), Schlag P ¼ rc c; d N (1998) and Hofbauer& Sigmund (1998), we postulate in ourmodel that playersusing where r denotes the interest rate on the common strategies i ¼ 1; y; n occasionally compare their pool. Fora public goods game deservingits name, payoff with that of a randomly chosen ‘‘model’’ we must have memberof the population, and adopt the strategy of their model with a probability 1 r N: ð1Þ o o proportional to the difference between the The first inequality states that if all do the same, model’s payoff and theirown, if this is they are better off cooperating than defecting; the positive (and with probability 0 otherwise). In second inequality states that each individual is the continuous time model, the evolution of the betteroff defecting than cooperating.Selfish frequencies xi of the strategies i is given by players will therefore always avoid the cost of X cooperation c; i.e. a collective of selfish players will x’i ¼ xixjðPi À PjÞð3Þ nevercooperate.Defection is the dominating j strategy. Hence both classical and evolutionary game theory predict that all players will defect, and with 1pi; jpn; which reduces to the replicator obtain zero payoff. equation We now extend the public goods game. In this ’ % optional public goods game, players can decide xi ¼ xiðPi À PÞ; ð4Þ OPTIONAL PUBLIC GOOD GAMES 189 P % where P ¼ xjPj is the average payoff in the game. In the case S ¼ 1 (no co-playershows up) population. we assume that the playerhas no otheroption To be precise, we consider a very large, well- than to play as a loner, and obtains payoff s: mixed population with three types of players: This happens with probability zNÀ1: Fora given loners, cooperators and defectors. From time to playerwilling to join the public goods game, the time, sample groups of N players are randomly probability of finding, among the N À 1 other chosen and offered to participate in a single players in the sample, S À 1 co-players joining public goods game. Note that in large popula- the group (S41), is tions, the probability that two players ever ! encounteragain can be neglected. Depending N À 1 ð1 À zÞSÀ1zNÀS: on their type, players either refuse to participate S À 1 orjoin the public goods game. In the lattercase, they eitherdefect, orcooperate.But, their The probability that m of these players are strategies are specified beforehand, and do not cooperators, and S À 1 À m defectors, is depend on the composition of the sampled  ! group. In particular, we do not consider condi- x m y SÀ1Àm S À 1 tional strategies of the type: cooperate if and : x þ y x þ y m only if the group contains more than two cooperators, or the like. In that case, the payoff fordefectorsis rm=S: Each playeris sampled a numberof times, and Hence the expected payoff fora defectorin a obtains an average payoff which depends on his group of S players (S ¼ 2; y; N)is strategy as well as the composition of the entire ! population. This composition changes according  r XSÀ1 x m y SÀ1Àm S À 1 to the replicator dynamics [see eqns (3) and (4)]. m S x þ y x þ y The intuition behind the dynamics is that m¼0 m occasionallyFand independently of the sam- r x pling of the public goods teamsFa randomly ¼ ðS À 1Þ : chosen player A compares his or her payoff with S x þ y that of anotherplayer B (also randomly chosen, Thus, within the entire population and not limited to ! the ‘‘public goods’’ groups to which A belonged), XN NÀ1 x N À 1 and adopts the strategy of B; if it yields a higher Pd ¼ sz þ r 1 À z S À 1 payoff, with a probability proportional to the S¼1 difference in their payoffs. Opportunities for  1 updating, i.e. changing the strategy, are sup- ð1 À zÞSÀ1zNÀS 1 À posed to occurmuch less frequentlythan S opportunities to play in a public goods group. x We emphasize that there are other reasonable ¼ szNÀ1 þ r game dynamics, but forthe replicatorequation 1 À z we can produce a full analysis. "# ! Forsimplicity and without loss of generality, XN N À 1 1 1 À ð1 À zÞSÀ1zNÀS we set the cost c of cooperation equal to 1: The S À 1 S payoff forlonersis then given by the constant S¼1  Pl ¼ s: N À 1 N and using ¼ S=N leads to S À 1 S In order to compute the payoff values for  cooperators and defectors, we first derive the N NÀ1 x 1 À z probability that S of the N sampled individuals Pd ¼ sz þ r 1 À : ð5Þ are actually willing to join the public goods 1 À z Nð1 À zÞ 190 C. HAUERT ET AL.

In a group with S À 1 co-players playing the 0.8 public goods game, switching from cooperation to defection yields 1 À r=S: Hence, 0.6 ! r < 2 XN  0.4 r N À 1 SÀ1 NÀS r = 2 Pd À Pc ¼ 1 À ð1 À zÞ z :

S F(z) S¼2 S À 1 0.2 r > 2

Using the same arguments as before, we obtain 0 r À zN NÀ1 1 − 0.2 Pd À Pc ¼ 1 þðr À 1Þz À ¼: FðzÞ: N 1 À z 0 0.2 0.4 0.6 0.8 1 ð6Þ z Fig. 1. The difference between the payoff of coopera- The advantage of defectors over cooperators tors Pc and defectors Pd is a function of the fraction of depends only on the fraction of individuals loners z: FðzÞ¼Pd À Pc: If almost everybody is participat- - actually willing to play, i.e. on the fraction of ing in the public goods game (z 0) then FðzÞ40 holds and it pays to defect. However, for interest rates r42; if the loners z: At the same time, it is independent of proportion z of loners increases, it eventually pays to the loner’s payoff s: cooperate (FðzÞo0) and the social dilemma disappearsFat least fora while. FðzÞ has eitherno ora unique rootin the The sign of Pd À Pc in fact determines whether interval ð0; 1Þ: it pays to switch from cooperation to defection or vice versa, FðzÞ¼0 being the equilibrium condition. We claim that for rp2; F has no root, x; y; zX0; x þ y þ z ¼ 1g; i.e. the vectors e of # i and for r42 exactly one root z in the interval the standard basis (i ¼ c; d; l in a straightforward ð0; 1Þ: In order to show this, we consider the notation), are obviously fixed points. There are function GðzÞ¼FðzÞð1 À zÞ which has the same no otherfixed points on the boundaryof S3: In roots as FðzÞ in ð0; 1Þ and note that (a) Gð0Þ¼ fact, the edge e e consists of an orbit leading ^ c d 1 À r=N40; (b) Gð1Þ¼0; (c) GðzÞ ð2 À rÞ from e (cooperators only) to e (defectors only), 2 - c d ðN À 1Þð1 À zÞ for z 1; such that in a neigh- the edge e e is an orbit leading to the state borhood of z ¼ 1 GðzÞ is negative for r42; and d l 00 NÀ3 consisting of loners only, and the orbit elec closes (d) G ðzÞ¼z ðN À 1ÞððN À 2Þðr À 1ÞÀzðNrÀ this heteroclinic cycle of rock-scissors-paper type N À rÞÞ changes sign at most once in ð0; 1Þ: Thus, on the boundary. for r42 [which by eqn (1) implies N42] there In order to analyse the dynamics in the # exists a threshold value of the loners frequency z interior, it is useful to show that the replicator above which cooperators fare better than defec- equation, defined on the simplex S ; can be tors (see Fig. 1). 3 % rewritten in the form of a Hamiltonian system, The average population payoff P can now be and thus admits an invariant of motion. Indeed, rewritten using the condition y ¼ 1 À x À z: defining as a new variable f ¼ x=ðx þ yÞ; i.e. the % fraction of cooperators among the individuals P ¼ xPc þ yPd þ zPl actually participating in the public goods game, ¼ xðPc À Pd Þþzðs À Pd ÞþPd we obtain ¼ÀxðP À P Þþð1 À zÞðP À sÞþs: d c d ’ yx’ À xy’ xy f ¼ 2 ¼ 2 ðPc À Pd Þ: Substituting eqns (5) and (6) then yields ðx þ yÞ ðx þ yÞ

P% ¼ s Àð½Š1 À zÞs Àðr À 1Þx ð1 À zNÀ1Þ: ð7Þ This, as well as substituting eqn (7) into the z’ ¼ zðs À P%Þ; yields ’ The Dynamics f ¼Àf ð1 À f ÞFðzÞ; ð8Þ Let us now analyse the replicator dynamics. z’ ¼ ½Šs À f ðr À 1Þ zð1 À zÞð1 À zNÀ1Þð9Þ The corners of the simplex S3 ¼fðx; y; zÞ : OPTIONAL PUBLIC GOOD GAMES 191

2 with ðf ; zÞ on the unit square ð0; 1Þ : Dividing c the right-hand side by the function f ð1 À f Þz ð1 À zÞð1 À zNÀ1Þ; which is positive on the unit square, corresponds to a change in velocity which does not affect the orbits. This yields ÀFðzÞ f’¼ ¼: ÀgðzÞ; zð1 À zÞð1 À zNÀ1Þ

s À f ðr À 1Þ z’ ¼ ¼: lð f Þ: f ð1 À f Þ

Introducing H :¼ G þ L; where GðzÞ and Lð f Þ are primitives of gðzÞ and lð f Þ:   r r GðzÞ¼ 1À log z þ À 1 logð1À zÞþRðzÞ; N 2 e l e d ð10Þ Fig. 2. The three corners ec; ed ; el of S3 are saddle points (but el is not hyperbolic) and the boundary bd S3 Lð f Þ¼s log f þðr À 1 À sÞ log ð1 À f Þð11Þ represents a rock-scissors-paper type heteroclinic cycle. For small interest rates, ro2; no fixed point exists in int S3 and all orbits converge to el : But el is not Lyapunov stable. with RðzÞ bounded on ½0; 1Š; we obtain the Parameters: N ¼ 5; r ¼ 1:8; s ¼ 0:5: Hamiltonian system @H f’¼À ; @z which follows from Pd ¼ Pl: Due to the fact that the system is conservative, and the Hamiltonian @H H attains a strict (global) maximum at ðs= z’ ¼ : ðr À 1Þ; z#Þ; the interior equilibrium Q is a center, @f i.e. it is neutrally stable and surrounded by The actual dynamics of the system depends on closed orbits (see Fig 3). Actually, all interior ð Þ- whetherthe condition Pd ¼ Pc can be satisfied in orbits are closed: eqn (10) shows that G z the interior S ; and hence on the interest rate r: ÀN for z-0; 1if2oroN; and eqn (11) implies 3 - - For rp2 there are no fixed points except that Lðf Þ À N as f 0; 1ifsor À 1: There- H- À N the corners and all trajectories in int S3 are fore, uniformly near the boundary of ½ Š2 homoclinic orbits of el: Thus, the system will 0; 1 and hence all level sets of H are closed display intermittently brief bursts of coopera- curves. In particular, no interior orbit converges tion, but always ends up with no one willing to to the non-hyperbolic equilibrium el: participate in the public goods game, as shown Variations of the three parameters N; r; s in Fig. 2. allow to position Q anywhere in the interior of For r42; eqn (2) implies that there exists a the simplex (see Fig. 4). Note that in general all unique fixed point Q ¼ðx#; y#; z#Þ in int S3 such three parameters must be adjusted to place Q in that Fðz#Þ¼0 and: a particular location. According to eqns (12) and (13), the fixed point Q lies on the line s x# ¼ ð1 À z#Þð12Þ r À 1 s x ¼ y ð14Þ as well as r À 1 À s  s N y# ¼ 1 À ð1 À z#Þð13Þ independent of the group size : Forincreasing r À 1 N; Q moves towards the corner el and in the 192 C. HAUERT ET AL.

c e c

Q

Q

e l e d e l e d

Fig. 3. For r42; the three corners ec; ed ; el are again Fig. 5. In the limiting case r ¼ N; the edge eced is a line saddle points and bd S3 represents a heteroclinic cycle. In of fixed points, stable on ecQ (closed circles) and unstable int S3 a single fixed point Q appears. It is a center surrounded by closed orbits (see text). Parameters: on Qed (open circles). Random drift and occasional N ¼ 5; r ¼ 3; s ¼ 1: appearances of the missing loner strategy will eventually drive the system close to the corner ec with almost everybody cooperating. Parameters: N ¼ 3; r ¼ 3; s ¼ 1:

e c limit N-N homoclinic orbits issuing from and leading to el are obtained. Forthe limiting cases r ¼ N; s ¼ r À 1 and s ¼ 0; Q approaches the edges eced ; elec or eled ; respectively. In particular, for r ¼ N; coopera- tion becomes stable in the sense that, while the state can fluctuate along the edge z ¼ 0by random drift, any small fluctuation introducing σ N the missing loners will be offset in such a way that the loners vanish again and the number of Q r cooperators is larger than previously (see Fig. 5). Although the time averages of the state variables overR an orbit of period T; defined as % T v ¼ð1=TÞ 0 v dt; depend on the initial condi- tions, the following relations hold for every e l e d orbit. First, the average fraction of cooperators among playing individuals corresponds to its Fig. 4. The position of the center Q in S3 depends on the values of the parameters N; r and s: The intersection of value at the equilibrium point Q: the three lines corresponds to N ¼ 5; r ¼ 3; s ¼ 1: Each line indicates the displacement of the centerwhen varyinga x% s single parameter. Increasing the number of potential ¼ : ð15Þ participants N shifts the centeralong the solid line in the x% þ y% r À 1 direction indicated by the arrow, i.e. towards the corner el : Similarly, increasing s shifts the centerupwardson the This means that the time average lies on the solid dashed line z ¼ z# and increasing r moves the centerto the right, along the dash–dotted line. For r-2; the center line in Fig. 4 which connects Q and el: Second, approaches the corner el: the average of the fraction of cooperators among OPTIONAL PUBLIC GOOD GAMES 193 participants in public goods games f%corresponds dilemma. In a public goods game, those players to the fraction of the averages: who are defecting are always better off than those players who are cooperating. Nevertheless, s f%¼ : ð16Þ if the group size S of participating players is less r À 1 than the interest rate r; it pays the individual Surprisingly, perhaps, increasing r always favors player to switch from defection to cooperation. defection, i.e. it decreases the fraction f of If players have the option of an asocial ‘‘fallback cooperators among those actually engaging in solution’’, they can refuse to join the public the public goods game. goods game. If enough players refuse to join, the According to numerical calculations, the time group becomes so small that the game is no average lies on the line segment Qel and longera social dilemma. But then, the higher converges to el as the closed orbit approaches payoff obtained by the cooperators in the public the boundary of S3: We can offeronly a heuristic goods game causes more players to join, and explanation of this observation: The closer the larger groups of public goods players create the periodic orbit is to the boundary the more time it temptation to defect, i.e. the social dilemma. will spend near the degenerate equilibrium el This requires r42; a condition which is similar (both eigenvalues zero) where motion is much to the condition that in the Prisoner’s dilemma slower than close to the hyperbolic equilibria ec game, the benefit exceeds twice the cost: this and ed : condition is essential forthe stability of the Let us show how eqn (15) is deduced Pavlov strategy (Nowak & Sigmund, 1995). It by integrating eqn (9). Remembering that, by may be argued that this condition can also be definition, x ¼ f ð1 À zÞ; and dividing both sides found in Hamilton’s rule for kinship selection. of eqn (9) by z ð1 À zNÀ1Þ; we get Here, the cost-to-benefit ratio should exceed the Z degree of relatedness between donor and reci- T pient, but under‘‘normal’’conditions (no ½Šsð1 À zÞÀðr À 1Þ x dt 0 inbreeding, etc.) this relatedness is at most 1=2: Z The proposed model for an optional public T z’ dt goods game represents one of the rare cases ¼ ¼ ð ÞzðTÞ NÀ1 p z zð0Þ 0 z ð1 À z Þ where a highly nonlinear system of replicator equations can be fully analysed by purely NÀ1 À1 pðzÞ being a primitive of ½zð1 À z ފ : Since analytical means. For small interest rates, rp2; the orbits are closed, the last term vanishes and homoclinic orbits are observed starting in and % % % the proportionality between x and 1 À z; i.e. x þ returning to el; i.e. the state where no one y% follows. The time average (16) follows in participates in the public goods game. For r42 the same way afterdividing eqn (9) by a fixed point occurs in the interior of the simplex NÀ1 zð1 À zÞð1 À z Þ: S3: By reducing the replicator equations to a Due to the properties of the replicator Hamiltonian system, we see that Q is actually equation, the time averages of the payoffs for a centerand that in int S3 only closed orbits the three different strategies are equal and reduce appear. From this follow various conditions on to the payoff of loners s: the time averages of the frequencies and payoffs % % % of the three strategies. For example, the average Pc ¼ Pd ¼ Pl ¼ s: ratio of cooperators and defectors corresponds to the ratio of the averages and is independent of Thus, in the long run, no one does better or the initial configuration and the group size N: It worse than the loners. turns out to be impossible to increase coopera- tion by increasing the interest rate rFon the Discussion contrary, it favors defection and lowers x%=y%: In The oscillations, and thus the recurrent order to promote cooperation, one should rather increase in cooperation, are due to the fact that increase the loner’s payoff s or reduce the group a public goods game needs not always be a social size N: Note that in the lattercase x%=y% still 194 C. HAUERT ET AL. increases even when keeping the profits for each REFERENCES invested dollarconstant ( r=N ¼ const). The fact Batali,J.&Kitcher, P. (1995). Evolution of altruism that cooperation is favored in smaller groups in optional and compulsory games. J. theor. Biol. 175, agrees with other theoretical as well as experi- 161–171. mental results (Bonacich et al., 1976; Boyd & Binmore, K. G. (1994). Playing Fair: Game Theory and the Richerson, 1988; Milinski et al., 1990; Hauert & Social Contract. Cambridge, MA: MIT Press. Bonacich, P., Sure, G. H., Kahan,J.P.&Meeker,R.J. Schuster, 1998). (1976). Cooperation and group size in the n-person We stress that the dynamics obtained in this prisoner’s dilemma. J. Conflict Resolut. 20, 687–705. simple and, we believe, natural model is highly Boyd,R.&Richerson, P. J. (1988). The evolution of reciprocity in sizeable groups. J. theor. Biol. 132, degenerate: it has a center, an invariant of 337–356. motion, a heteroclinic cycle, a non-hyperbolic Daw e s, R. M. (1980). Social dilemmas. Ann. Rev. Psychol. fixed point, and an even numberof Nash 31, 169–193. equilibria. All these properties are non-generic Dugatkin, L. A. (1997). Cooperation Among Animals: an Evolutionary Perspective. Oxford: Oxford University underthe usual assumptions. But as we show in Press. Hauert et al. (2002), a wide variety of adaptive Fehr,E.&GA¨ chter, S. (2002). Altruistic punishment in mechanisms, corresponding to many different humans. Nature 415, 137–140. amilton types of evolutionary game dynamics, lead to H , W. D. (1963). The evolution of altruistic behaviour. Am. Nat. 97, 354–356. persistent oscillations in the frequencies of the Hauert,C.&Schuster, H. G. (1997). Effects of three strategies. increasing the number of players and memory size in The option to drop out from a public goods the iterated prisoner’s dilemma: a numerical approach. Proc. R. Soc. Lond. B 264, 513–519. game, i.e. a social and economic enterprise, Hauert,C.&Schuster, H. G. (1998). Extending the avoids deadlocks in states of mutual defection iterated prisoner’s dilemma without synchrony. J. theor. and economic stalemate. As a prerequisite, the Biol. 192, 155–166. possible gainFi.e. the ‘‘interest’’ rFhas to be Hauert, C., De Monte, S., Hofbauer,J.&Sigmund,K. (2002). Volunteering as Red Queen mechanism for quite large. The enterprise must offer a con- co-operation in public goods games. Science, 296, siderable advantage. In simple societies, such 1129–1132. situations may occurin big game hunting orin Hofbauer,J.&Sigmund, K. (1998). Evolutionary Games war. Small groups of volunteers are known to be and Population Dynamics. Cambridge: Cambridge University Press. efficient fordifficult tasks. This must have been Kagel,J.H.&Roth, A. E., eds (1995). The Handbook known to military people for ages. We only of . Princeton, NJ: Princeton quote Marbot, an officer of Napoleon: ‘‘To face University Press. Milinski, M., Pfluger, D., KU¨ lling,D.&Kettler,R. immense perils, volunteers are infinitely prefer- (1990). Do sticklebacks cooperate repeatedly in recipro- able to bodies of men under orders.’’ Success cal pairs? Behav. Ecol. Sociobiol. 27, 17–21. attracts larger groups of participants, but growth Nowak,M.A.&Sigmund, K. (1995). Invasion dynamics may inherently spell decline. This mechanism of the finitely repeated prisoner’s dilemma. Games Econ. Behav. 11, 364–390. leads to oscillations in the composition of the Orbell,J.H.&Daw e s, R. M. (1993). Social welfare, population. However, the average effect on the cooperators’ advantage, and the option of not playing the individual’s payoff is just the same as if this game. Am. Soc. Rev. 58, 787–800. chlag possibility did not exist and all members of the S , K. (1998). Why imitate, and if so, how? A bounded rational approach to multi-armed bandits. population were loners. J. Econ. Theory 78, 130–156. Trivers, R. L. (1971). The evolution of reciprocal altruism. Ch.H. acknowledges support of the Swiss National Q. Rev. Biol. 46, 35–57. Science Foundation; S.D. acknowledges support of Weibull, J. W. (1995). Evolutionary Game Theory. ! the Department of Mathematics, Universita Statale Cambridge, MA: MIT Press. di Milano; K.S. acknowledges support of the Wilson,D.S.&Sober, E. (1994). Reintroducing group Wissenschaftskolleg WK W008 Differential Equation selection to the human behavioral sciences. Behav. Brain Models in Science and Engineering. Sci. 17, 585–654.