Memory-Two Strategies Forming Symmetric Mutual Reinforcement Learning Equilibrium in Repeated Prisoner's Dilemma Game

Total Page:16

File Type:pdf, Size:1020Kb

Memory-Two Strategies Forming Symmetric Mutual Reinforcement Learning Equilibrium in Repeated Prisoner's Dilemma Game Memory-two strategies forming symmetric mutual reinforcement learning equilibrium in repeated prisoner’s dilemma game Masahiko Ueda Graduate School of Sciences and Technology for Innovation, Yamaguchi University, Yamaguchi 753-8511, Japan Abstract We investigate symmetric equilibria of mutual reinforcement learning when both players alternately learn the optimal memory-two strategies against the oppo- nent in the repeated prisoner’s dilemma game. We provide the necessary condi- tion for memory-two deterministic strategies to form symmetric equilibria. We then provide two examples of memory-two deterministic strategies which form symmetric mutual reinforcement learning equilibria. We also prove that mu- tual reinforcement learning equilibria formed by memory-two strategies are also mutual reinforcement learning equilibria when both players use reinforcement learning of memory-n strategies with n> 2. Keywords: Repeated prisoner’s dilemma game; Reinforcement learning; Memory-two strategies 1. Introduction Learning in game theory has been attracted much attention [1, 2, 3, 4, 5]. Because rationality of human beings is bounded [6], modeling of human beings as learning agents has been used in game theory and theoretical economics. One of the most popular learning methods is reinforcement learning [7]. In reinforce- ment learning, an agent gradually learns the optimal policy against a stationary environment. Mutual reinforcement learning in game theory is a more difficult problem since the existence of multiple agents makes an environment nonsta- arXiv:2108.03258v1 [physics.soc-ph] 5 Aug 2021 tionary [8, 9, 10, 11, 12]. Several methods have been proposed for reinforcement learning with multiple agents [13]. Here, we investigate mutual reinforcement learning in the repeated pris- oner’s dilemma game [14]. More explicitly, we investigate properties of equi- libria formed by learning agents when the two agents alternately learn their optimal strategies against the opponent. In the previous study [15], it was Email address: [email protected] (Masahiko Ueda) Preprint submitted to Elsevier August 10, 2021 found that, among all deterministic memory-one strategies, only the Grim trig- ger strategy, the Win-Stay Lose-Shift strategy, and the All-D strategy can form symmetric equilibrium of mutual reinforcement learning. Recently, memory-n strategies with n > 1 attract much attention in game theory because longer memory enables more complicated behavior [16, 17, 18, 19, 20, 21]. However, even whether the above equilibria formed by memory-one strategies are still equilibria in memory-n settings or not has not been known. In this paper, we extend the analysis of Ref. [15] to memory-two strate- gies. First, we provide the necessary condition for memory-two deterministic strategies to form symmetric equilibria. Then we provide two non-trivial ex- amples of memory-two deterministic strategies which form symmetric mutual reinforcement learning equilibria. Furthermore, we also prove that mutual re- inforcement learning equilibria formed by memory-n′ strategies are also mutual reinforcement learning equilibria when both players use reinforcement learning of memory-n strategies with n>n′. This paper is organized as follows. In Section 2, we introduce the repeated prisoner’s dilemma game with memory-n strategies, and players using reinforce- ment learning. In Section 3, we show that the structure of the optimal strategies is constrained by the Bellman optimality equation. In Section 4, we introduce the concepts of mutual reinforcement learning equilibrium and symmetric equi- librium. We then provide the necessary condition for memory-two deterministic strategies to form symmetric equilibria. In Section 5, we provide two examples of memory-two deterministic strategies which form symmetric mutual reinforce- ment learning equilibria. In Section 6, we show that mutual reinforcement learn- ing equilibria formed by memory-n′ strategies are also mutual reinforcement learning equilibria when both players use reinforcement learning of memory-n strategies with n>n′. Section 7 is devoted to conclusion. 2. Model We introduce the repeated prisoner’s dilemma game [8]. There are two players (1 and 2) in the game. Each player chooses cooperation (C) or defection (D) on every round. The action of player a is written as σa ∈ {C,D}. We collectively write σ := (σ1, σ2), and call σ a state. We also write the space of all possible states as Ω := {C,D}2. The payoff of player a ∈ {1, 2} when the state is σ is described as ra (σ). The payoffs in the prisoner’s dilemma game are given by (r1 (C, C) , r1 (C,D) , r1 (D, C) , r1 (D,D)) = (R,S,T,P ) (1) (r2 (C, C) , r2 (C,D) , r2 (D, C) , r2 (D,D)) = (R,T,S,P ) (2) with T>R>P>S and 2R>T + S. The memory-n strategy (n ≥ 1) n a T σ σ(−m) of player is described as the conditional probability a a| m=1 of n σ n σ(−m) taking action a when the states in the previous rounds are m=1, n where we have introduced the notation σ(−m) := σ(−1), ··· , σ(−n) [21]. m=1 2 We write the length of memory of player a as na and define n := max {n1,n2}. In this paper, we assume that n is finite. Below we introduce the notation −a := {1, 2}\a. We consider the situation that both players learn their optimal strategies against the strategy of the opponent by reinforcement learning [7]. In rein- forcement learning, each player learns mapping (called policy) from the states n σ(−m) n σ m=1 in the previous rounds to his/her action so as to maximize his/her expected future reward. We write the action of player a at round t as σa(t). In addition, we write ra(t) := ra (σ(t)). We define the action-value function of player a as ∞ n n σ(−m) E k σ t−n σ(−m) Qa σa, := γ ra(t + k) σa(t)= σa, [ (s)]s=t−1 = , m=1 " m=1# h i k=0 h i X (3) where γ is a discounting factor satisfying 0 ≤ γ < 1. The action-value function n Q σ , σ(−m) a a a m=1 represents the expected future payoffs of player by tak- n σ n σ(−m) ing action a when states in the previous rounds are m=1. It should be noted that the right-hand side does not depend on t. Due to the property of memory-n strategies, the action-value function Qa obeys the Bellman equation against a fixed strategy T−a of the opponent: n (−m) Qa σa, σ m=1 n h i (−m) = ra (σ) T−a σ−a| σ m=1 σ−a X h i n−1 n n−1 ′ σ σ(−m) σ(−m) ′ σ σ(−m) +γ Ta σa| , T−a σ−a| Qa σa, , . ′ m=1 m=1 m=1 σ σ−a Xa X h i h i h i (4) See Appendix A for the derivation of Eq. (4). It has been known that the ∗ ∗ optimal policy Ta and the optimal action-value function Qa obeys the following Bellman optimality equation: n ∗ σ(−m) Qa σa, m=1 n h i (−m) = ra (σ) T−a σ−a| σ m=1 σ−a X h i n n−1 σ(−m) ∗ σ σ(−m) +γ T−a σ−a| max Qa σ,ˆ , , (5) m=1 σˆ m=1 σ−a X h i h i with the support n n ∗ σ(−m) ∗ σ(−m) suppTa ·| = argmax Qa σ, . (6) m=1 σ m=1 h i h i 3 See Appendix B for the derivation of Eqs. (5) and (6). In other words, in the optimal policy against T−a, player a takes the action σa which maximizes n Q∗ , σ(−m) n the value of a · m=1 when the states at the previous rounds are n σ(−m) m=1. We investigate the situation that players infinitely repeat the infinitely- repeated games and players alternately learn their optimal strategies in each game, as in Ref. [15]. We write the optimal strategy and the corresponding ∗(d) ∗(d) optimal action-value function of player a at d-th game as Ta and Qa , re- ∗(0) spectively. Given an initial strategy T2 of player 2, in the (2l − 1)-th game N ∗(2l−1) ∗(2l−2) ∗(2l−1) (l ∈ ), player 1 learns T1 against T2 by calculating Q1 . In the ∗(2l) ∗(2l−1) ∗(2l) 2l-th game, player 2 learns T2 against T1 by calculating Q2 . We are ∗(∞) ∗(∞) interested in the fixed points of the dynamics, that is, Ta and Qa . In this paper, we mainly investigate situations that the support (6) contains only one action, that is, strategies are deterministic. The number of determinis- 2n tic memory-n strategies in the repeated prisoner’s dilemma game is 22 , which increases rapidly as n increases. 3. Structure of optimal strategies Below we consider only the case n = 2. The Bellman optimality equation (5) for n = 2 is ∗ σ(−1) σ(−2) Qa σa, , (−1) (−2) = ra (σ) T−a σ−a| σ , σ σ−a X σ(−1) σ(−2) ∗ σ σ(−1) +γ T−a σ−a| , max Qa σ,ˆ , (7) σˆ σ−a X with ∗ σ(−1) σ(−2) ∗ σ(−1) σ(−2) suppTa ·| , = argmax Qa σ, , . (8) σ The number of memory-two deterministic strategies is 216, which is quite large, and therefore we cannot investigate all memory-two deterministic strategies as in the case of memory-one deterministic strategies [15]. Instead, we first investigate general properties of optimal strategies. We introduce the matrix representation of a strategy: Ta (σ) Ta (σ| (C, C), (C, C)) Ta ( σ| (C, C), (C,D)) Ta ( σ| (C, C), (D, C)) Ta (σ| (C, C), (D,D)) Ta ( σ| (C,D), (C, C)) Ta (σ| (C,D), (C,D)) Ta (σ| (C,D), (D, C)) Ta (σ| (C,D), (D,D)) := .
Recommended publications
  • Chapter 2 Equilibrium
    Chapter 2 Equilibrium The theory of equilibrium attempts to predict what happens in a game when players be- have strategically. This is a central concept to this text as, in mechanism design, we are optimizing over games to find the games with good equilibria. Here, we review the most fundamental notions of equilibrium. They will all be static notions in that players are as- sumed to understand the game and will play once in the game. While such foreknowledge is certainly questionable, some justification can be derived from imagining the game in a dynamic setting where players can learn from past play. Readers should look elsewhere for formal justifications. This chapter reviews equilibrium in both complete and incomplete information games. As games of incomplete information are the most central to mechanism design, special at- tention will be paid to them. In particular, we will characterize equilibrium when the private information of each agent is single-dimensional and corresponds, for instance, to a value for receiving a good or service. We will show that auctions with the same equilibrium outcome have the same expected revenue. Using this so-called revenue equivalence we will describe how to solve for the equilibrium strategies of standard auctions in symmetric environments. Emphasis is placed on demonstrating the central theories of equilibrium and not on providing the most comprehensive or general results. For that readers are recommended to consult a game theory textbook. 2.1 Complete Information Games In games of compete information all players are assumed to know precisely the payoff struc- ture of all other players for all possible outcomes of the game.
    [Show full text]
  • Uniqueness and Stability in Symmetric Games: Theory and Applications
    Uniqueness and stability in symmetric games: Theory and Applications Andreas M. Hefti∗ November 2013 Abstract This article develops a comparably simple approach towards uniqueness of pure- strategy equilibria in symmetric games with potentially many players by separating be- tween multiple symmetric equilibria and asymmetric equilibria. Our separation approach is useful in applications for investigating, for example, how different parameter constel- lations may affect the scope for multiple symmetric or asymmetric equilibria, or how the equilibrium set of higher-dimensional symmetric games depends on the nature of the strategies. Moreover, our approach is technically appealing as it reduces the complexity of the uniqueness-problem to a two-player game, boundary conditions are less critical compared to other standard procedures, and best-replies need not be everywhere differ- entiable. The article documents the usefulness of the separation approach with several examples, including applications to asymmetric games and to a two-dimensional price- advertising game, and discusses the relationship between stability and multiplicity of symmetric equilibria. Keywords: Symmetric Games, Uniqueness, Symmetric equilibrium, Stability, Indus- trial Organization JEL Classification: C62, C65, C72, D43, L13 ∗Author affiliation: University of Zurich, Department of Economics, Bluemlisalpstr. 10, CH-8006 Zurich. E-mail: [email protected], Phone: +41787354964. Part of the research was accomplished during a research stay at the Department of Economics, Harvard University, Littauer Center, 02138 Cambridge, USA. 1 1 Introduction Whether or not there is a unique (Nash) equilibrium is an interesting and important question in many game-theoretic settings. Many applications concentrate on games with identical players, as the equilibrium outcome of an ex-ante symmetric setting frequently is of self-interest, or comparably easy to handle analytically, especially in presence of more than two players.
    [Show full text]
  • Equilibrium Computation in Normal Form Games
    Tutorial Overview Game Theory Refresher Solution Concepts Computational Formulations Equilibrium Computation in Normal Form Games Costis Daskalakis & Kevin Leyton-Brown Part 1(a) Equilibrium Computation in Normal Form Games Costis Daskalakis & Kevin Leyton-Brown, Slide 1 Tutorial Overview Game Theory Refresher Solution Concepts Computational Formulations Overview 1 Plan of this Tutorial 2 Getting Our Bearings: A Quick Game Theory Refresher 3 Solution Concepts 4 Computational Formulations Equilibrium Computation in Normal Form Games Costis Daskalakis & Kevin Leyton-Brown, Slide 2 Tutorial Overview Game Theory Refresher Solution Concepts Computational Formulations Plan of this Tutorial This tutorial provides a broad introduction to the recent literature on the computation of equilibria of simultaneous-move games, weaving together both theoretical and applied viewpoints. It aims to explain recent results on: the complexity of equilibrium computation; representation and reasoning methods for compactly represented games. It also aims to be accessible to those having little experience with game theory. Our focus: the computational problem of identifying a Nash equilibrium in different game models. We will also more briefly consider -equilibria, correlated equilibria, pure-strategy Nash equilibria, and equilibria of two-player zero-sum games. Equilibrium Computation in Normal Form Games Costis Daskalakis & Kevin Leyton-Brown, Slide 3 Tutorial Overview Game Theory Refresher Solution Concepts Computational Formulations Part 1: Normal-Form Games
    [Show full text]
  • Symmetric Equilibrium Existence and Optimality in Differentiated Product Markets*
    Rcprin~edfrom JOU~NALof Eco~ou~cTHmr Vol 47, No. I. February 1989 All Rlghu Revrwd by Audermc Pw.New York and London Pnrtd u! &lx~um Notes, Comments, and Letters to the Editor Symmetric Equilibrium Existence and Optimality in Differentiated Product Markets* Department of Economics, Columbia University, New York, New York 10027 Received December 20, 1985; revised January 7, 1988 Equilibrium existence and optimality are analysed in a market for products differentiated by their variety. A game of three stages is analysed. Firms enter in the first stage, choose varieties in the second stage, and choose prices in the third stage. The existence of subgame-perfect equilibria is established. At equilibrium products are symmetrically located in the space of characteristics and are offered at equal prices. The surplus maximizing solution is characterized, and it is shown that sur- plus maximizing product diversity is lower than the equilibrium one. Juurnul u/ &conomic Literalure Classification Numbers: 022, 611, 615, 933. C 1989 Audcm~c Preu, lac. Ever since Hotelling's acclaimed "Principle of Minimum Differentiation" [13] attention has been focused on the issue of equilibrium patterns of dif- ferentiated products in duopoly and oligopoly. Hotelling modelled duopoly competition as a two stage game. In the last stage firms played a non- cooperative game in prices taking as given the choices of varieties made in the first stage. In the first stage firms chose varieties non-cooperatively expecting to receive the Nash equilibrium profits of the price game played for the chosen varieties. Hotelling looked for equilibria which (in modern terminology) were subgame-perfect.
    [Show full text]
  • On Uniqueness and Stability of Symmetric Equilibria in Differentiable Symmetric Games
    University of Zurich Department of Economics Working Paper Series ISSN 1664-7041 (print) ISSN1664-705X(online) Working Paper No. 18 On Uniqueness and Stability of symmetric equilibria in differentiable symmetric games Andreas Hefti May 2011 On Uniqueness and Stability of symmetric equilibria in differentiable symmetric games∗ Andreas Hefti† February 2011 Abstract Higher-dimensional symmetric games become of more and more importance for applied micro- and macroeconomic research. Standard approaches to uniqueness of equilibria have the drawback that they are restrictive or not easy to evaluate analytically. In this paper I provide some general but comparably simple tools to verify whether a symmetric game has a unique symmetric equilibrium or not. I distinguish between the possibility of multiple symmetric equilibria and asymmetric equilibria which may be economically interesting and is useful to gain further insights into the causes of asymmetric equilibria in symmetric games with higher-dimensional strategy spaces. Moreover, symmetric games may be used to derive some properties of the equilibrium set of certain asymmetric versions of the symmetric game. I further use my approach to discuss the relationship between stability and (in)existence of multiple symmetric equilibria. While there is an equivalence between stability, inexistence of multiple symmetric equilibria and the unimportance of strategic effects for the compara- tive statics, this relationship breaks down in higher dimensions. Stability under symmetric adjustments is a minimum requirement of a symmetric equilibrium for reasonable compara- tive statics of symmetric changes. Finally, I present an alternative condition for a symmetric equilibrium to be a local contraction which is more general than the conventional approach of diagonal dominance and yet simpler to evaluate than the eigenvalue condition of continuous adjustment processes.
    [Show full text]
  • Sion's Minimax Theorem and Nash
    Munich Personal RePEc Archive Sion’s minimax theorem and Nash equilibrium of symmetric multi-person zero-sum game Satoh, Atsuhiro and Tanaka, Yasuhito 24 October 2017 Online at https://mpra.ub.uni-muenchen.de/82148/ MPRA Paper No. 82148, posted 24 Oct 2017 14:07 UTC Sion’s minimax theorem and Nash equilibrium of symmetric multi-person zero-sum game∗∗ Atsuhiro Satoha,∗, Yasuhito Tanakab,∗∗ aFaculty of Economics, Hokkai-Gakuen University, Toyohira-ku, Sapporo, Hokkaido, 062-8605, Japan. bFaculty of Economics, Doshisha University, Kamigyo-ku, Kyoto, 602-8580, Japan. Abstract We will show that Sion’s minimax theorem is equivalent to the existence of Nash equilibrium in a symmetric multi-person zero-sum game. If a zero-sum game is asymmetric, maximin strategies and minimax strategies of players do not corre- spond to Nash equilibrium strategies. However, if it is symmetric, the maximin strategy and the minimax strategy constitute a Nash equilibrium. Keywords: multi-person zero-sum game, Nash equilibrium, Sion’s minimax theorem. ∗E-mail: [email protected] ∗∗E-mail: [email protected]. Preprint submitted to Elsevier October 24, 2017 1. Introduction We consider the relation between Sion’s minimax theorem and the existence of Nash equilibrium in a symmetric multi-person zero-sum game. We will show that they are equivalent. An example of such a game is a relative profit maximization game in a Cournot oligopoly. Suppose that there are n ≥ 3 firms in an oligopolistic industry. Letπ ¯i be the absolute profit of the i-th firm. Then, its relative profit is 1 n π = π¯ − π¯ .
    [Show full text]
  • I. Non-Cooperative Games. A. Game Theory Can Be Used to Model a Wide
    Rationality and Game Theory (L2, L3, L4) An Introduction to Non-Cooperative Game Theory ii. For example: It is probably fair to say that the application of game theory to economic problems is the most In Cournot duopoly, each firm's profits depend upon its own output decision and active area of theory in modern economics and philosophy. A quick look at any economics that of the other firm in the market. journal published and many philosophy journals in the past decade will reveal a large number In a setting where pure public goods are consumed, one's own consumption of the of articles that rely upon elementary game theory to analyze economic behavior of theoretical public good depends in part on one's own production level of the good, and, in part, and policy interest. on that of all others. After a snow fall, the amount of snow on neighborhood sidewalks depends partly on To some extent, the tradition of game theory in economics is an old one. The Cournot your own efforts at shoveling and partly that of all others in the neighborhood. duopoly model (1838) is an example of a non-cooperative game with a Nash equilibrium. In an election, each candidate's vote maximizing policy position depends in part on Analysis of Stackelberg duopoly and monopolistic competition have always been based on the positions of the other candidate(s). models and intuitions very much like those of game theorists. iii. Game theory models are less interesting in cases where there are no interdependencies. Modern work on: the self-enforcing properties of contracts, credible commitments, the private For example, a case where there is no interdependence it that of a producer or consumer production of public goods, externalities, time inconsistency problems, models of negotiation, in perfectly competitive market.
    [Show full text]
  • Evolutionarily Stable Strategies Consider the Game Hawk Dove Hawk -2,-2 2,0 Dove 0,2 1,1
    Evolutionarily Stable Strategies Consider the game Hawk Dove Hawk -2,-2 2,0 Dove 0,2 1,1 Let σp be the strategy: play Hawk with probability p and Dove with prob- ability (1 − p). Recall that F (p; q) is the expected payoff to Player 1 when Player uses σp and Player 2 uses σq. Definition 0.1 We say that σp is an Evolutionarily Stable Strategy (ESS) if either: i) F (p; p) ≥ F (q; p) for all q 6= p, or ii) F (p; p) = F (q; p) and F (p; q) > F (q; q) for all q 6= p. In case i) we say p is a strong ESS and in case ii) it is a mild ESS. Remarks: • If p is an ESS, then F (p; p) ≥ F (q; p) for all q. Thus (σp; σp) must be a symmetric Nash equilibrium. • Any strong ESS, must be a pure strategy equilibrium. • A strict pure strategy symmetric equilibrium is a strong ESS. [Here strict means that if Player 1 changes her move while Player 2 does not, Player 1's payoff will go down.] In the Hawk{Dove game there is a unique symmetric equilibrium where each player uses the mixed strategy σ1=3, i.e., play H with probability 1/3. We show this is a mild ESS. 1 1 2 4 F ( ; q) = [q(−2) + (1 − q)(2)] + [q(0) + (1 − q)(1)] = − q 3 3 3 3 F (q; q) = q[q(−2) + (1 − q)(2)] + (1 − q)[q(0) + (1 − q)(1)] = 1 − 4q − 3q2 We need F (1=3; q) > F (q; q) for all q 6= p.
    [Show full text]
  • Impossibility of Collusion Under Imperfect Monitoring with Flexible Production
    Impossibility of Collusion under Imperfect Monitoring with Flexible Production By Yuliy Sannikov and Andrzej Skrzypacz* We show that it is impossible to achieve collusion in a duopoly when (a) goods are homogenous and firms compete in quantities; (b) new, noisy information arrives continuously, without sudden events; and (c) firms are able to respond to new information quickly. The result holds even if we allow for asymmetric equilibria or monetary transfers. The intuition is that the flexibility to respond quickly to new information unravels any collusive scheme. Our result applies to both a simple sta- tionary model and a more complicated one, with prices following a mean-reverting Markov process, as well as to models of dynamic cooperation in many other set- tings. (JEL D43, L2, L3) Collusion is a major problem in many mar- We study the scope of collusion in a quan- kets and has been an important topic of study tity-setting duopoly with homogenous goods in both applied and theoretical economics. From and flexible production—that is, if firms can exposed collusive cases, we know how numer- change output flow frequently.2 As in Green and ous real-life cartels have been organized and Porter (984), in our model firms cannot observe what kinds of agreements (either implicit or each other’s production decisions directly. They explicit) are likely to be successful in obtain- observe only noisy market prices/signals that ing collusive profits. At least since George J. depend on the total market supply. We show that Stigler (964), economists have recognized that collusion is impossible to achieve if: imperfect monitoring may destabilize cartels.
    [Show full text]
  • A. the Symmetric Fixed-N Nash Equilibrium
    Online Appendix for “Testing Game Theory in the Field: Swedish LUPI Lottery Games” Robert Östling, Joseph Tao-yi Wang, Eileen Y. Chou and Colin F. Camerer A. The Symmetric Fixed-N Nash Equilibrium Let there be a finite number of N players that each pick an integer between 1 and K. If there are numbers that are only chosen by one player, then the player that picks the lowest such number wins a prize, which we normalize to 1, and all other players get zero. If there is no number that only one player chooses, everybody gets zero. To get some intuition for the equilibrium in the game with many players, we first consider the cases with two and three players. If there are only two players and two numbers to choose from, the game reduces to the following bimatrix game. 1 2 1 0, 0 1, 0 2 0, 1 0, 0 This game has three equilibria. There are two asymmetric equilibria in which one player picks 1 and the other player picks 2, and one symmetric equilibrium in which both players pick 1. Now suppose that there are three players and three numbers to choose from (i.e., N = K = 3). In any pure strategy equilibrium it must be the case that at least one player plays the number 1, but not more than two players play the number 1 (if all three play 1, it is optimal to deviate for one player and pick 2). In pure strategy equilibria where only one player plays 1, the other players can play in any combination of the other two numbers.
    [Show full text]
  • Evolutionary Game Theory: a Renaissance
    games Review Evolutionary Game Theory: A Renaissance Jonathan Newton ID Institute of Economic Research, Kyoto University, Kyoto 606-8501, Japan; [email protected] Received: 23 April 2018; Accepted: 15 May 2018; Published: 24 May 2018 Abstract: Economic agents are not always rational or farsighted and can make decisions according to simple behavioral rules that vary according to situation and can be studied using the tools of evolutionary game theory. Furthermore, such behavioral rules are themselves subject to evolutionary forces. Paying particular attention to the work of young researchers, this essay surveys the progress made over the last decade towards understanding these phenomena, and discusses open research topics of importance to economics and the broader social sciences. Keywords: evolution; game theory; dynamics; agency; assortativity; culture; distributed systems JEL Classification: C73 Table of contents 1 Introduction p. 2 Shadow of Nash Equilibrium Renaissance Structure of the Survey · · 2 Agency—Who Makes Decisions? p. 3 Methodology Implications Evolution of Collective Agency Links between Individual & · · · Collective Agency 3 Assortativity—With Whom Does Interaction Occur? p. 10 Assortativity & Preferences Evolution of Assortativity Generalized Matching Conditional · · · Dissociation Network Formation · 4 Evolution of Behavior p. 17 Traits Conventions—Culture in Society Culture in Individuals Culture in Individuals · · · & Society 5 Economic Applications p. 29 Macroeconomics, Market Selection & Finance Industrial Organization · 6 The Evolutionary Nash Program p. 30 Recontracting & Nash Demand Games TU Matching NTU Matching Bargaining Solutions & · · · Coordination Games 7 Behavioral Dynamics p. 36 Reinforcement Learning Imitation Best Experienced Payoff Dynamics Best & Better Response · · · Continuous Strategy Sets Completely Uncoupled · · 8 General Methodology p. 44 Perturbed Dynamics Further Stability Results Further Convergence Results Distributed · · · control Software and Simulations · 9 Empirics p.
    [Show full text]
  • An Introduction to Game Theory
    Publicly•available solutions for AN INTRODUCTION TO GAME THEORY Publicly•available solutions for AN INTRODUCTION TO GAME THEORY MARTIN J. OSBORNE University of Toronto Copyright © 2012 by Martin J. Osborne All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording, or otherwise, without the prior permission of Martin J. Osborne. This manual was typeset by the author, who is greatly indebted to Donald Knuth (TEX), Leslie Lamport (LATEX), Diego Puga (mathpazo), Christian Schenk (MiKTEX), Ed Sznyter (ppctr), Timothy van Zandt (PSTricks), and others, for generously making superlative software freely available. The main font is 10pt Palatino. Version 6: 2012-4-7 Contents Preface xi 1 Introduction 1 Exercise 5.3 (Altruistic preferences) 1 Exercise 6.1 (Alternative representations of preferences) 1 2 Nash Equilibrium 3 Exercise 16.1 (Working on a joint project) 3 Exercise 17.1 (Games equivalent to the Prisoner’s Dilemma) 3 Exercise 20.1 (Games without conflict) 3 Exercise 31.1 (Extension of the Stag Hunt) 4 Exercise 34.1 (Guessing two-thirds of the average) 4 Exercise 34.3 (Choosing a route) 5 Exercise 37.1 (Finding Nash equilibria using best response functions) 6 Exercise 38.1 (Constructing best response functions) 6 Exercise 38.2 (Dividing money) 7 Exercise 41.1 (Strict and nonstrict Nash equilibria) 7 Exercise 47.1 (Strict equilibria and dominated actions) 8 Exercise 47.2 (Nash equilibrium and weakly dominated
    [Show full text]