Zero-Sum Games Game Theory 2021


Game Theory 2021
Ulle Endriss
Institute for Logic, Language and Computation, University of Amsterdam

Plan for Today

Today we are going to focus on the special case of zero-sum games and discuss two positive results that do not hold for games in general.

• new solution concepts: maximin and minimax solutions
• Minimax Theorem: maximin = minimax = NE for zero-sum games
• fictitious play: a basic model for learning in games
• a convergence result for the case of zero-sum games

The first part of this is also covered in Chapter 3 of the Essentials.

K. Leyton-Brown and Y. Shoham. Essentials of Game Theory: A Concise, Multidisciplinary Introduction. Morgan & Claypool Publishers, 2008. Chapter 3.

Zero-Sum Games

Today we focus on two-player games $\langle N, A, u \rangle$ with $N = \{1, 2\}$.

Notation: Given player $i \in \{1, 2\}$, we refer to her opponent as $-i$.

Recall: A zero-sum game is a two-player normal-form game $\langle N, A, u \rangle$ for which $u_i(a) + u_{-i}(a) = 0$ for all action profiles $a \in A$.

Examples include (but are not restricted to) games in which you can win (+1), lose (−1), or draw (0), such as matching pennies (left); the game on the right is zero-sum without being of the win/lose/draw kind (row player's payoff listed first in each cell):

          H          T                  L          R
  H   ( 1, −1)   (−1,  1)       T   ( 5, −5)   (−3,  3)
  T   (−1,  1)   ( 1, −1)       B   ( 0,  0)   ( 2, −2)

Constant-Sum Games

A constant-sum game is a two-player normal-form game $\langle N, A, u \rangle$ for which there exists a $c \in \mathbb{R}$ such that $u_i(a) + u_{-i}(a) = c$ for all $a \in A$.

Thus: A zero-sum game is a constant-sum game with constant $c = 0$.

Everything about zero-sum games to be discussed today also applies to constant-sum games, but for simplicity we only talk about the former.

Fun Fact: Football is not a constant-sum game, as you get 3 points for a win, 0 for a loss, and 1 for a draw. But prior to 1994, when the "three-points-for-a-win" rule was introduced, World Cup games were constant-sum (with 2, 0, 1 points for win, loss, draw, respectively).

Maximin Strategies

The definitions on this slide apply to arbitrary normal-form games.

Suppose player $i$ wants to maximise her worst-case expected utility (e.g., if all others conspire against her). Then she should play:

  $s_i^\star \in \operatorname*{argmax}_{s_i \in S_i} \; \min_{s_{-i} \in S_{-i}} u_i(s_i, s_{-i})$

Any such $s_i^\star$ is called a maximin strategy (usually there is just one).

Solution concept: assume each player will play a maximin strategy.

Call $\max_{s_i} \min_{s_{-i}} u_i(s_i, s_{-i})$ player $i$'s maximin value (or security level).
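Aside: for a two-player game, a maximin strategy can be computed by linear programming, since for a fixed $s_i$ the inner minimum is always attained at some pure action of the opponent. The sketch below is not part of the original slides; the function name and setup are my own illustrative assumptions (it requires numpy and scipy). It maximises a lower bound $v$ on player $i$'s expected utility, with one constraint per opponent action:

```python
import numpy as np
from scipy.optimize import linprog

def maximin(U):
    """Maximin strategy and security level for the row player of a
    two-player game, where U[a_i, a_j] is the row player's utility.

    LP variables are (v, s_1, ..., s_m): maximise v subject to
    sum_i s_i * U[i, j] >= v for every opponent action j, with s a
    probability distribution. linprog minimises, so we minimise -v.
    """
    m, n = U.shape
    c = np.concatenate(([-1.0], np.zeros(m)))            # objective: -v
    A_ub = np.hstack((np.ones((n, 1)), -U.T))            # v - (s @ U)_j <= 0
    b_ub = np.zeros(n)
    A_eq = np.concatenate(([0.0], np.ones(m)))[None, :]  # probabilities sum to 1
    b_eq = np.array([1.0])
    bounds = [(None, None)] + [(0.0, 1.0)] * m           # v free, s in [0, 1]
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, bounds=bounds)
    return res.x[0], res.x[1:]                           # (security level, strategy)

# Matching pennies from the row player's perspective (she wants to match):
U = np.array([[ 1.0, -1.0],
              [-1.0,  1.0]])
value, strategy = maximin(U)
print(value, strategy)   # ~0.0 and ~[0.5 0.5]
```

For matching pennies this returns security level 0 with the uniform strategy, which is exactly the equilibrium behaviour the Minimax Theorem below will account for.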
Exercise: Maximin and Nash

Consider the following two-player game:

          L          R
  T   ( 8,  2)   ( 0,  0)
  B   ( 0,  0)   ( 8,  4)

What is the maximin solution? How does this relate to Nash equilibria?

Note: This is neither a zero-sum nor a constant-sum game.

Exercise: Maximin and Nash Again

Now consider this fairly similar game, which is zero-sum:

          L          R
  T   ( 8, −8)   ( 0,  0)
  B   ( 0,  0)   ( 8, −8)

What is the maximin solution? How does this relate to Nash equilibria?

Minimax Strategies

Now focus on two-player games only, with players $i$ and $-i$ ...

Suppose player $i$ wants to minimise $-i$'s best-case expected utility (e.g., to punish her). Then $i$ should play:

  $s_i^\star \in \operatorname*{argmin}_{s_i \in S_i} \; \max_{s_{-i} \in S_{-i}} u_{-i}(s_i, s_{-i})$

Remark: For a zero-sum game, an alternative interpretation is that player $i$ has to play first and her opponent $-i$ can respond.

Any such $s_i^\star$ is called a minimax strategy (usually there is just one).

Call $\min_{s_i} \max_{s_{-i}} u_{-i}(s_i, s_{-i})$ player $-i$'s minimax value.

So $i$'s minimax value is $\min_{s_{-i}} \max_{s_i} u_i(s_{-i}, s_i) = \min_{s_{-i}} \max_{s_i} u_i(s_i, s_{-i})$.

Equivalence of Maximin and Minimax Values

Recall: For two-player games, we have seen the following definitions.

• Player $i$'s maximin value is $\max_{s_i} \min_{s_{-i}} u_i(s_i, s_{-i})$.
• Player $i$'s minimax value is $\min_{s_{-i}} \max_{s_i} u_i(s_i, s_{-i})$.

Lemma 1. In a two-player game, maximin and minimax value coincide:

  $\max_{s_i} \min_{s_{-i}} u_i(s_i, s_{-i}) = \min_{s_{-i}} \max_{s_i} u_i(s_i, s_{-i})$

We omit the proof. For the case of two actions per player, there is a helpful visualisation in the Essentials. Note that one direction ($\leq$) is easy: the LHS is what $i$ can achieve when she has to move first, while the RHS is what $i$ can achieve when she can move second. ✓

Remark: The lemma does not hold if we quantify over actions rather than strategies (counterexample: Matching Pennies).

The Minimax Theorem

Recall: A zero-sum game is a two-player game with $u_i(a) + u_{-i}(a) = 0$.

Theorem 2 (Von Neumann, 1928). In a zero-sum game, a strategy profile is a NE iff each player's expected utility equals her minimax value.

Proof: Let $v_i$ be the minimax/maximin value of player $i$ (and $v_{-i} = -v_i$ that of player $-i$).

(1) Suppose $u_i(s_i, s_{-i}) \neq v_i$. Then one player does worse than she could (note that here we use the zero-sum property!). So $(s_i, s_{-i})$ is not a NE. ✓

(2) Suppose $u_i(s_i, s_{-i}) = v_i$. Then one player having a better response would mean the other's security level is actually lower. So $(s_i, s_{-i})$ is a NE. ✓

John von Neumann (1903–1957)

J. von Neumann. Zur Theorie der Gesellschaftsspiele. Mathematische Annalen, 100(1):295–320, 1928.
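The theorem suggests a recipe for solving a zero-sum game: compute a maximin strategy for each player separately and pair them up. As a small illustrative check (reusing the hypothetical maximin helper sketched earlier, so this is not code from the slides), we can verify numerically for matching pennies that the resulting profile gives each player exactly her value and admits no profitable pure deviation:

```python
# Continues the LP sketch above: assumes `np` and the hypothetical
# `maximin` helper are already in scope.
U1 = np.array([[ 1.0, -1.0],
               [-1.0,  1.0]])   # row player's payoffs; the column player gets -U1
U2 = -U1.T                      # column player's payoffs, viewed as a row player

v1, s1 = maximin(U1)
v2, s2 = maximin(U2)
print(v1, v2)                   # ~0.0 ~0.0, i.e. v2 == -v1 as in the proof

eu1 = s1 @ U1 @ s2              # row player's expected utility in the profile
print(np.isclose(eu1, v1))      # True: utilities equal the values, so a NE

# Direct NE check: no profitable pure deviation for either player.
print(all(U1[a, :] @ s2 <= eu1 + 1e-9 for a in range(2)))   # row player
print(all(s1 @ U1[:, a] >= eu1 - 1e-9 for a in range(2)))   # column player
```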
Learning in Games

Suppose you keep playing the same game against the same opponents. You might try to learn their strategies. A good hypothesis might be that the frequency with which player $i$ plays action $a_i$ is approximately her probability of playing $a_i$.

Now suppose you always best-respond to those hypothesised strategies. And suppose everyone else does the same. What will happen?

We are going to see that for zero-sum games this process converges to a NE. This yields a method for computing a NE for the (non-repeated) game: just imagine players engage in such "fictitious play".

Empirical Mixed Strategies

Given a history of actions $H_i^\ell = a_i^0, a_i^1, \ldots, a_i^{\ell-1}$ played by player $i$ in $\ell$ prior plays of game $\langle N, A, u \rangle$, fix her empirical mixed strategy $s_i^\ell \in S_i$:

  $s_i^\ell(a_i) = \frac{1}{\ell} \cdot \#\{k < \ell \mid a_i^k = a_i\}$ for all $a_i \in A_i$

(i.e., the relative frequency of $a_i$ in $H_i^\ell$).

Best Pure Responses

Recall: Strategy $s_i^\star \in S_i$ is a best response for player $i$ to the (partial) strategy profile $s_{-i}$ if $u_i(s_i^\star, s_{-i}) \geq u_i(s_i', s_{-i})$ for all $s_i' \in S_i$.

Due to expected utilities being convex combinations of plain utilities:

Observation 3. For any given (partial) strategy profile $s_{-i}$, the set of best responses for player $i$ must include at least one pure strategy.

So we can restrict attention to best pure responses for player $i$ to $s_{-i}$:

  $a_i^\star \in \operatorname*{argmax}_{a_i \in A_i} u_i(a_i, s_{-i})$

Fictitious Play

Take any action profile $a^0 \in A$ for the normal-form game $\langle N, A, u \rangle$. Fictitious play of $\langle N, A, u \rangle$, starting in $a^0$, is the following process (a concrete code sketch appears at the end of this section):

• In round $\ell = 0$, each player $i \in N$ plays action $a_i^0$.
• In any round $\ell > 0$, each player $i \in N$ plays a best pure response to her opponents' empirical mixed strategies:

  $a_i^\ell \in \operatorname*{argmax}_{a_i \in A_i} u_i(a_i, s_{-i}^\ell)$, where $s_{i'}^\ell(a_{i'}) = \frac{1}{\ell} \cdot \#\{k < \ell \mid a_{i'}^k = a_{i'}\}$ for all $i' \in N$ and $a_{i'} \in A_{i'}$

Assume some deterministic way of breaking ties between maxima.

This yields a sequence $a^0 \to a^1 \to a^2 \to \cdots$ with a corresponding sequence of empirical-mixed-strategy profiles $s^0 \to s^1 \to s^2 \to \cdots$

Question: Does $\lim_{\ell \to \infty} s^\ell$ exist, and is it a meaningful strategy profile?

Example: Matching Pennies

Let's see what happens when we start in the upper lefthand corner HH (and break ties between equally good responses in favour of H):

          H          T
  H   ( 1, −1)   (−1,  1)
  T   (−1,  1)   ( 1, −1)

Any strategy can be represented by a single probability (of playing H). Writing each round as the action profile played, followed by the resulting empirical frequencies of H for the two players:

  HH (1/1, 1/1) → HT (2/2, 1/2) → HT (3/3, 1/3) → TT (3/4, 1/4) → TT (3/5, 1/5) →
  TT (3/6, 1/6) → TH (3/7, 2/7) → TH (3/8, 3/8) → TH (3/9, 4/9) →
  TH (3/10, 5/10) → HH (4/11, 6/11) → HH (5/12, 7/12) → ···

Exercise: Can you guess what this will converge to?

Convergence Profiles are Nash Equilibria

In general, $\lim_{\ell \to \infty} s^\ell$ does not exist (no guaranteed convergence).
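But, as announced in the plan for today, fictitious play does behave well for zero-sum games. The minimal sketch below (again my own illustrative code, not from the slides) implements the process for two-player games; np.argmax breaks ties in favour of the lowest action index, which with action 0 = H matches the H-favouring tie-breaking above. It reproduces the matching-pennies trace and shows the empirical frequencies approaching the mixed NE (1/2, 1/2):

```python
import numpy as np

def fictitious_play(U1, U2, a0, rounds):
    """Fictitious play for a two-player normal-form game.

    U1, U2: payoff matrices, both indexed as [row action, column action].
    a0: initial action profile. np.argmax breaks ties in favour of the
    lowest action index, giving a deterministic tie-breaking rule.
    """
    counts = [np.zeros(U1.shape[0]), np.zeros(U1.shape[1])]
    counts[0][a0[0]] += 1
    counts[1][a0[1]] += 1
    history = [a0]
    for _ in range(1, rounds):
        s1 = counts[0] / counts[0].sum()   # empirical mixed strategies
        s2 = counts[1] / counts[1].sum()
        a1 = int(np.argmax(U1 @ s2))       # best pure response of row player
        a2 = int(np.argmax(s1 @ U2))       # best pure response of column player
        counts[0][a1] += 1
        counts[1][a2] += 1
        history.append((a1, a2))
    return history, counts

# Matching pennies with action 0 = H and action 1 = T, started in HH:
U1 = np.array([[ 1.0, -1.0],
               [-1.0,  1.0]])
U2 = -U1                                   # zero-sum: column player's payoffs
history, counts = fictitious_play(U1, U2, (0, 0), rounds=10_000)
print(history[:6])       # (0,0) (0,1) (0,1) (1,1) (1,1) (1,1) = HH HT HT TT TT TT
print(counts[0] / 10_000, counts[1] / 10_000)   # both approach [0.5 0.5]
```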