Solving Large Extensive-Form Games with Strategy Constraints

Total Page:16

File Type:pdf, Size:1020Kb

Solving Large Extensive-Form Games with Strategy Constraints The Thirty-Third AAAI Conference on Artificial Intelligence (AAAI-19) Solving Large Extensive-Form Games with Strategy Constraints Trevor Davis,1 Kevin Waugh,2 Michael Bowling2,1 1Department of Computing Science, University of Alberta 2DeepMind [email protected], fwaughk, [email protected] Abstract One approach for handling multiple objectives is to use a linear combination of per-objective utilities. This approach Extensive-form games are a common model for multiagent has been used in EFGs to “tilt” poker agents toward taking interactions with imperfect information. In two-player zero- sum games, the typical solution concept is a Nash equilibrium specific actions (Johanson et al. 2011), and to mix between over the unconstrained strategy set for each player. In many cost minimization and risk mitigation in sequential security situations, however, we would like to constrain the set of pos- games (Lisy,´ Davis, and Bowling 2016). However, objec- sible strategies. For example, constraints are a natural way to tives are typically measured on incommensurable scales. This model limited resources, risk mitigation, safety, consistency leads to dubious combinations of weights often selected by with past observations of behavior, or other secondary objec- trial-and-error. tives for an agent. In small games, optimal strategies under linear constraints can be found by solving a linear program; A second approach is to constrain the agents’ strategy however, state-of-the-art algorithms for solving large games spaces directly. For example, rather than minimizing the cannot handle general constraints. In this work we introduce expected cost, we use a hard constraint that disqualifies high- a generalized form of Counterfactual Regret Minimization cost strategies. Using such constraints has been extensively that provably finds optimal strategies under any feasible set of studied in single-agent perfect information settings (Altman convex constraints. We demonstrate the effectiveness of our 1999) and partial information settings (Isom, Meyn, and algorithm for finding strategies that mitigate risk in security games, and for opponent modeling in poker games when given Braatz 2008; Santana, Thiebaux,´ and Williams 2016), as well only partial observations of private information. as in (non-sequential) security games (Brown et al. 2014). Incorporating strategy constraints when solving EFGs 1 Introduction presents a unique challenge. Nash equilibria can be found by Multiagent interactions are often modeled using extensive- solving a linear program (LP) derived using the sequence- form games (EFGs), a powerful framework that incoporates form representation (Koller, Megiddo, and von Stengel 1996). sequential actions, hidden information, and stochastic events. This LP is easily modified to incorporate linear strategy con- Recent research has focused on computing approximately straints; however, LPs do not scale to large games. Special- optimal strategies in large extensive-form games, resulting in ized algorithms for efficiently solving large games, such as a solution to heads-up limit Texas Hold’em, a game with ap- an instantiation of Nesterov’s excessive gap technique (EGT) proximately 1017 states (Bowling et al. 2015), and in two in- (Hoda et al. 2010) as well as counterfactual regret minimiza- dependent super-human computer agents for the much larger tion (CFR) (Zinkevich et al. 2008) and its variants (Lanctot heads-up no-limit Texas Hold’em (Moravcˇ´ık et al. 2017; et al. 2009; Tammelin et al. 2015), cannot integrate arbitrary Brown and Sandholm 2018). strategy constraints directly. Currently, the only large-scale When modeling an interaction with an EFG, for each out- approach is restricted to constraints that consider only indi- come we must specify the agents’ utility, a cardinal measure vidual decisions (Farina, Kroer, and Sandholm 2017). of the outcome’s desirability. Utility is particularly difficult to In this work we present the first scalable algorithm for specify. Take, for example, situations where an agent has mul- solving EFGs with arbitrary convex strategy constraints. Our tiple objectives to balance: a defender in a security game with algorithm, Constrained CFR, provably converges towards the primary objective of protecting a target and a secondary a strategy profile that is minimax optimal under the given objective of minimizing expected cost, or a robot operating p constraints. It does this while retaining the O(1= T ) con- in a dangerous environment with a primary task to complete vergence rate of CFR and requiring additional memory pro- and a secondary objective of minimizing damage to itself and portional to the number of constraints. We demonstrate the others. How these objectives combine into a single value, the empirical effectiveness of Constrained CFR by comparing agent’s utility, is ill-specified and error prone. its solution to that of an LP solver in a security game. We Copyright c 2019, Association for the Advancement of Artificial also present a novel constraint-based technique for opponent Intelligence (www.aaai.org). All rights reserved. modeling with partial observations in a small poker game. 1861 2 Background A strategy σi is an "-best response to the opponent’s strat- 0 Formally, an extensive-form game (Osborne and Rubinstein egy σ−i if ui(σi; σ−i) + " ≥ ui(σi; σ−i) for any alternative 0 1994) is a game tree defined by: strategy σi 2 Σi. A strategy profile is an "-Nash equilibrium when each σ is a "-best response to its opponent; such a • A set of players N. This work focuses on games with two i profile exists for any " ≥ 0. The exploitability of a strategy players, so N = f1; 2g. 1 profile is the smallest " = =2("1 +"2) such that each σi is an • A set of histories H, the tree’s nodes rooted at ;. The leafs, "i-best response. Due to the zero-sum property, the game’s Z ⊆ H, are terminal histories. For any history h 2 H, we Nash equilibria are the saddle-points of the minimax problem 0 0 0 let h @ h denote a prefix h of h, and necessarily h 2 H. max min u(σ1; σ2) = min max u(σ1; σ2): (3) • For each h 2 H n Z, a set of actions A(h). For any a 2 σ12Σ1 σ22Σ2 σ22Σ2 σ12Σ1 A(h), ha 2 H is a child of h. A zero-sum EFG can be represented in sequence form • A player function P : H nZ ! N [fcg defining the player (von Stengel 1996). The sets of sequence-form strategies for to act at h. If P (h) = c then chance acts according to a players 1 and 2 are X and Y respectively. A sequence-form known probability distribution σc(h) 2 ∆ , where jA(h)j strategy x 2 X is a vector indexed by pairs I 2 I1, a 2 A(I). ∆ is the probability simplex of dimension jA(h)j. jA(h)j The entry x(I;a) is the probability of player 1 playing the • A set of utility functions ui : Z ! R, for each player. Out- sequence of actions that reaches I and then playing action a. come z has utility ui(z) for player i. We assume the game A special entry, x; = 1, represents the empty sequence. Any is zero-sum, i.e., u1(z) = −u2(z). Let u(z) = u1(z). behavioral strategy σ1 2 Σ1 has a corresponding sequence- form strategy SEQ(σ1) where • For each player i 2 N, a collection of information sets Ii. Ii partitions Hi, the histories where i acts. Two histories SEQ(σ ) := πσ1 (I)σ (I; a): 8I 2 I ; a 2 A(I) 0 1 (I;a) 1 1 i h; h in an information set I 2 Ii are indistinguishable to i. Necessarily A(h) = A(h0), which we denote by A(I). Player i has a unique sequence to reach any history h 2 H When a player acts they do not observe the history, only and, by perfect recall, any information set I 2 Ii. Let the information set it belongs to, which we denote as I[h]. xh and xI denote the corresponding entries in x. Thus, we are free to write the expected utility as u(x; y) = We assume a further requirement on the information sets Ii P called perfect recall. It requires that players are never forced z2Z πc(z)xzyzu(z). This is bilinear, i.e., there exists a to forget information they once observed. Mathematically payoff matrix A such that u(x; y) = x|Ay. A consequence of perfect recall and the laws of probability is for I 2 I1 this means that all indistinguishable histories share the same P sequence of past information sets and actions for the actor. that xI = a2A(I) x(I;a) and that x ≥ 0. These constraints Although this may seem like a restrictive assumption, some are linear and completely describe the polytope of sequence- perfect recall-like condition is needed to guarantee that an form strategies. Using these together, (3) can be expressed as EFG can be solved in polynommial time, and all sequential a bilinear saddle point problem over the polytopes X and Y: games played by humans exhibit perfect recall. max min x|Ay = min max x|Ay (4) x2X y2Y y2Y x2X 2.1 Strategies A behavioral strategy for player i maps each information For a convex function f : X! R, let rf(x) be any element of the subdifferential @f(x), and let r(I;a)f(x) be set I 2 Ii to a distribution over actions, σi(I) 2 ∆jA(I)j. the (I; a) element of this subgradient. The probability assigned to a 2 A(I) is σi(I; a).A strategy profile, σ = fσ ; σ g, specifies a strategy for each player. 1 2 2.2 Counterfactual regret minimization We label the strategy of the opponent of player i as σ−i.
Recommended publications
  • Evolutionary Game Theory: ESS, Convergence Stability, and NIS
    Evolutionary Ecology Research, 2009, 11: 489–515 Evolutionary game theory: ESS, convergence stability, and NIS Joseph Apaloo1, Joel S. Brown2 and Thomas L. Vincent3 1Department of Mathematics, Statistics and Computer Science, St. Francis Xavier University, Antigonish, Nova Scotia, Canada, 2Department of Biological Sciences, University of Illinois, Chicago, Illinois, USA and 3Department of Aerospace and Mechanical Engineering, University of Arizona, Tucson, Arizona, USA ABSTRACT Question: How are the three main stability concepts from evolutionary game theory – evolutionarily stable strategy (ESS), convergence stability, and neighbourhood invader strategy (NIS) – related to each other? Do they form a basis for the many other definitions proposed in the literature? Mathematical methods: Ecological and evolutionary dynamics of population sizes and heritable strategies respectively, and adaptive and NIS landscapes. Results: Only six of the eight combinations of ESS, convergence stability, and NIS are possible. An ESS that is NIS must also be convergence stable; and a non-ESS, non-NIS cannot be convergence stable. A simple example shows how a single model can easily generate solutions with all six combinations of stability properties and explains in part the proliferation of jargon, terminology, and apparent complexity that has appeared in the literature. A tabulation of most of the evolutionary stability acronyms, definitions, and terminologies is provided for comparison. Key conclusions: The tabulated list of definitions related to evolutionary stability are variants or combinations of the three main stability concepts. Keywords: adaptive landscape, convergence stability, Darwinian dynamics, evolutionary game stabilities, evolutionarily stable strategy, neighbourhood invader strategy, strategy dynamics. INTRODUCTION Evolutionary game theory has and continues to make great strides.
    [Show full text]
  • 8. Maxmin and Minmax Strategies
    CMSC 474, Introduction to Game Theory 8. Maxmin and Minmax Strategies Mohammad T. Hajiaghayi University of Maryland Outline Chapter 2 discussed two solution concepts: Pareto optimality and Nash equilibrium Chapter 3 discusses several more: Maxmin and Minmax Dominant strategies Correlated equilibrium Trembling-hand perfect equilibrium e-Nash equilibrium Evolutionarily stable strategies Worst-Case Expected Utility For agent i, the worst-case expected utility of a strategy si is the minimum over all possible Husband Opera Football combinations of strategies for the other agents: Wife min u s ,s Opera 2, 1 0, 0 s-i i ( i -i ) Football 0, 0 1, 2 Example: Battle of the Sexes Wife’s strategy sw = {(p, Opera), (1 – p, Football)} Husband’s strategy sh = {(q, Opera), (1 – q, Football)} uw(p,q) = 2pq + (1 – p)(1 – q) = 3pq – p – q + 1 We can write uw(p,q) For any fixed p, uw(p,q) is linear in q instead of uw(sw , sh ) • e.g., if p = ½, then uw(½,q) = ½ q + ½ 0 ≤ q ≤ 1, so the min must be at q = 0 or q = 1 • e.g., minq (½ q + ½) is at q = 0 minq uw(p,q) = min (uw(p,0), uw(p,1)) = min (1 – p, 2p) Maxmin Strategies Also called maximin A maxmin strategy for agent i A strategy s1 that makes i’s worst-case expected utility as high as possible: argmaxmin ui (si,s-i ) si s-i This isn’t necessarily unique Often it is mixed Agent i’s maxmin value, or security level, is the maxmin strategy’s worst-case expected utility: maxmin ui (si,s-i ) si s-i For 2 players it simplifies to max min u1s1, s2 s1 s2 Example Wife’s and husband’s strategies
    [Show full text]
  • 14.12 Game Theory Lecture Notes∗ Lectures 7-9
    14.12 Game Theory Lecture Notes∗ Lectures 7-9 Muhamet Yildiz Intheselecturesweanalyzedynamicgames(withcompleteinformation).Wefirst analyze the perfect information games, where each information set is singleton, and develop the notion of backwards induction. Then, considering more general dynamic games, we will introduce the concept of the subgame perfection. We explain these concepts on economic problems, most of which can be found in Gibbons. 1 Backwards induction The concept of backwards induction corresponds to the assumption that it is common knowledge that each player will act rationally at each node where he moves — even if his rationality would imply that such a node will not be reached.1 Mechanically, it is computed as follows. Consider a finite horizon perfect information game. Consider any node that comes just before terminal nodes, that is, after each move stemming from this node, the game ends. If the player who moves at this node acts rationally, he will choose the best move for himself. Hence, we select one of the moves that give this player the highest payoff. Assigning the payoff vector associated with this move to the node at hand, we delete all the moves stemming from this node so that we have a shorter game, where our node is a terminal node. Repeat this procedure until we reach the origin. ∗These notes do not include all the topics that will be covered in the class. See the slides for a more complete picture. 1 More precisely: at each node i the player is certain that all the players will act rationally at all nodes j that follow node i; and at each node i the player is certain that at each node j that follows node i the player who moves at j will be certain that all the players will act rationally at all nodes k that follow node j,...ad infinitum.
    [Show full text]
  • Contemporaneous Perfect Epsilon-Equilibria
    Games and Economic Behavior 53 (2005) 126–140 www.elsevier.com/locate/geb Contemporaneous perfect epsilon-equilibria George J. Mailath a, Andrew Postlewaite a,∗, Larry Samuelson b a University of Pennsylvania b University of Wisconsin Received 8 January 2003 Available online 18 July 2005 Abstract We examine contemporaneous perfect ε-equilibria, in which a player’s actions after every history, evaluated at the point of deviation from the equilibrium, must be within ε of a best response. This concept implies, but is stronger than, Radner’s ex ante perfect ε-equilibrium. A strategy profile is a contemporaneous perfect ε-equilibrium of a game if it is a subgame perfect equilibrium in a perturbed game with nearly the same payoffs, with the converse holding for pure equilibria. 2005 Elsevier Inc. All rights reserved. JEL classification: C70; C72; C73 Keywords: Epsilon equilibrium; Ex ante payoff; Multistage game; Subgame perfect equilibrium 1. Introduction Analyzing a game begins with the construction of a model specifying the strategies of the players and the resulting payoffs. For many games, one cannot be positive that the specified payoffs are precisely correct. For the model to be useful, one must hope that its equilibria are close to those of the real game whenever the payoff misspecification is small. To ensure that an equilibrium of the model is close to a Nash equilibrium of every possible game with nearly the same payoffs, the appropriate solution concept in the model * Corresponding author. E-mail addresses: [email protected] (G.J. Mailath), [email protected] (A. Postlewaite), [email protected] (L.
    [Show full text]
  • Extensive Form Games
    Extensive Form Games Jonathan Levin January 2002 1 Introduction The models we have described so far can capture a wide range of strategic environments, but they have an important limitation. In each game we have looked at, each player moves once and strategies are chosen simultaneously. This misses a common feature of many economic settings (and many classic “games” such as chess or poker). Consider just a few examples of economic environments where the timing of strategic decisions is important: • Compensation and Incentives. Firms first sign workers to a contract, then workers decide how to behave, and frequently firms later decide what bonuses to pay and/or which employees to promote. • Research and Development. Firms make choices about what tech- nologies to invest in given the set of available technologies and their forecasts about the future market. • Monetary Policy. A central bank makes decisions about the money supply in response to inflation and growth, which are themselves de- termined by individuals acting in response to monetary policy, and so on. • Entry into Markets. Prior to competing in a product market, firms must make decisions about whether to enter markets and what prod- ucts to introduce. These decisions may strongly influence later com- petition. These notes take up the problem of representing and analyzing these dy- namic strategic environments. 1 2TheExtensiveForm The extensive form of a game is a complete description of: 1. The set of players 2. Who moves when and what their choices are 3. What players know when they move 4. The players’ payoffs as a function of the choices that are made.
    [Show full text]
  • 8. Noncooperative Games 8.1 Basic Concepts Theory of Strategic Interaction Definition 8.1. a Strategy Is a Plan of Matching Penn
    EC 701, Fall 2005, Microeconomic Theory November 9, 2005 page 351 8. Noncooperative Games 8.1 Basic Concepts How I learned basic game theory in 1986. • Game theory = theory of strategic interaction • Each agent chooses a strategy in order to achieve his goals with • the knowledge that other agents may react to his strategic choice. Definition 8.1. A strategy is a plan of action that specifies what a player will do in every circumstance. EC 701, Fall 2005, Microeconomic Theory November 9, 2005 page 352 Matching Pennies (a zero-sum game) Eva and Esther each places a penny on the table simultaneously. • They each decide in advance whether to put down heads (H )or • tails (T). [This is a choice, NOT a random event as when a coin is flipped.] If both are heads (H ) or both are tails (T), then Eva pays Esther • $1. If one is heads and the other tails, then Esther pays Eva $1. • Esther HT 1 1 H − -1 1 Eva 1 1 T − 1 -1 EC 701, Fall 2005, Microeconomic Theory November 9, 2005 page 353 Esther HT 1 1 H − -1 1 Eva 1 1 T − 1 -1 Eva and Esther are the players. • H and T are strategies,and H , T is each player’s strategy • space. { } 1 and 1 are payoffs. • − Each box corresponds to a strategy profile. • For example, the top-right box represents the strategy profile • H , T h i EC 701, Fall 2005, Microeconomic Theory November 9, 2005 page 354 Deterministic strategies (without random elements) are called • pure strategies.
    [Show full text]
  • Solution Concepts in Cooperative Game Theory
    1 A. Stolwijk Solution Concepts in Cooperative Game Theory Master’s Thesis, defended on October 12, 2010 Thesis Advisor: dr. F.M. Spieksma Mathematisch Instituut, Universiteit Leiden 2 Contents 1 Introduction 7 1.1 BackgroundandAims ................................. 7 1.2 Outline .......................................... 8 2 The Model: Some Basic Concepts 9 2.1 CharacteristicFunction .............................. ... 9 2.2 Solution Space: Transferable and Non-Transferable Utilities . .......... 11 2.3 EquivalencebetweenGames. ... 12 2.4 PropertiesofSolutions............................... ... 14 3 Comparing Imputations 15 3.1 StrongDomination................................... 15 3.1.1 Properties of Strong Domination . 17 3.2 WeakDomination .................................... 19 3.2.1 Properties of Weak Domination . 20 3.3 DualDomination..................................... 22 3.3.1 Properties of Dual Domination . 23 4 The Core 25 4.1 TheCore ......................................... 25 4.2 TheDualCore ...................................... 27 4.2.1 ComparingtheCorewiththeDualCore. 29 4.2.2 Strong ǫ-Core................................... 30 5 Nash Equilibria 33 5.1 Strict Nash Equilibria . 33 5.2 Weak Nash Equilibria . 36 3 4 CONTENTS 6 Stable Sets 39 6.1 DefinitionofStableSets ............................... .. 39 6.2 Stability in A′ ....................................... 40 6.3 ConstructionofStronglyStableSets . ...... 41 6.3.1 Explanation of the Strongly Stable Set: The Standard of Behavior in the 3-personzero-sumgame ............................
    [Show full text]
  • Game Theory -- Lecture 4
    Game Theory -- Lecture 4 Patrick Loiseau EURECOM Fall 2016 1 Lecture 2-3 recap • Proved existence of pure strategy Nash equilibrium in games with compact convex action sets and continuous concave utilities • Defined mixed strategy Nash equilibrium • Proved existence of mixed strategy Nash equilibrium in finite games • Discussed computation and interpretation of mixed strategies Nash equilibrium àNash equilibrium is not the only solution concept àToday: Another solution concept: evolutionary stable strategies 2 Outline • Evolutionary stable strategies 3 Evolutionary game theory • Game theory ßà evolutionary biology • Idea: – Relate strategies to phenotypes of genes – Relate payoffs to genetic fitness – Strategies that do well “grow”, those that obtain lower payoffs “Die out” • Important note: – Strategies are hardwired, they are not chosen by players • Assumptions: – Within species competition: no mixture of population 4 Examples • Using game theory to understand population dynamics – Evolution of species – Groups of lions Deciding whether to attack in group an antelope – Ants DeciDing to responD to an attack of a spiDer – TCP variants, P2P applications • Using evolution to interpret economic actions – Firms in a competitive market – Firms are bounDeD, they can’t compute the best response, but have rules of thumbs and adopt hardwired (consistent) strategies – Survival of the fittest == rise of firms with low costs anD high profits 5 A simple model • Assume simple game: two-player symmetric • Assume random tournaments – Large population
    [Show full text]
  • Navigating the Landscape of Multiplayer Games ✉ Shayegan Omidshafiei 1,4 , Karl Tuyls1,4, Wojciech M
    ARTICLE https://doi.org/10.1038/s41467-020-19244-4 OPEN Navigating the landscape of multiplayer games ✉ Shayegan Omidshafiei 1,4 , Karl Tuyls1,4, Wojciech M. Czarnecki2, Francisco C. Santos3, Mark Rowland2, Jerome Connor2, Daniel Hennes 1, Paul Muller1, Julien Pérolat1, Bart De Vylder1, Audrunas Gruslys2 & Rémi Munos1 Multiplayer games have long been used as testbeds in artificial intelligence research, aptly referred to as the Drosophila of artificial intelligence. Traditionally, researchers have focused on using well-known games to build strong agents. This progress, however, can be better 1234567890():,; informed by characterizing games and their topological landscape. Tackling this latter question can facilitate understanding of agents and help determine what game an agent should target next as part of its training. Here, we show how network measures applied to response graphs of large-scale games enable the creation of a landscape of games, quanti- fying relationships between games of varying sizes and characteristics. We illustrate our findings in domains ranging from canonical games to complex empirical games capturing the performance of trained agents pitted against one another. Our results culminate in a demonstration leveraging this information to generate new and interesting games, including mixtures of empirical games synthesized from real world games. 1 DeepMind, Paris, France. 2 DeepMind, London, UK. 3 INESC-ID and Instituto Superior Técnico, Universidade de Lisboa, Lisboa, Portugal. 4These authors ✉ contributed equally: Shayegan Omidshafiei, Karl Tuyls. email: somidshafi[email protected] NATURE COMMUNICATIONS | (2020) 11:5603 | https://doi.org/10.1038/s41467-020-19244-4 | www.nature.com/naturecommunications 1 ARTICLE NATURE COMMUNICATIONS | https://doi.org/10.1038/s41467-020-19244-4 ames have played a prominent role as platforms for the games.
    [Show full text]
  • Nash Equilibrium and Mechanism Design ✩ ∗ Eric Maskin A,B, a Institute for Advanced Study, United States B Princeton University, United States Article Info Abstract
    Games and Economic Behavior 71 (2011) 9–11 Contents lists available at ScienceDirect Games and Economic Behavior www.elsevier.com/locate/geb Commentary: Nash equilibrium and mechanism design ✩ ∗ Eric Maskin a,b, a Institute for Advanced Study, United States b Princeton University, United States article info abstract Article history: I argue that the principal theoretical and practical drawbacks of Nash equilibrium as a Received 23 December 2008 solution concept are far less troublesome in problems of mechanism design than in most Available online 18 January 2009 other applications of game theory. © 2009 Elsevier Inc. All rights reserved. JEL classification: C70 Keywords: Nash equilibrium Mechanism design Solution concept A Nash equilibrium (called an “equilibrium point” by John Nash himself; see Nash, 1950) of a game occurs when players choose strategies from which unilateral deviations do not pay. The concept of Nash equilibrium is far and away Nash’s most important legacy to economics and the other behavioral sciences. This is because it remains the central solution concept—i.e., prediction of behavior—in applications of game theory to these fields. As I shall review below, Nash equilibrium has some important shortcomings, both theoretical and practical. I will argue, however, that these drawbacks are far less troublesome in problems of mechanism design than in most other applications of game theory. 1. Solution concepts Game-theoretic solution concepts divide into those that are noncooperative—where the basic unit of analysis is the in- dividual player—and those that are cooperative, where the focus is on coalitions of players. John von Neumann and Oskar Morgenstern themselves viewed the cooperative part of game theory as more important, and their seminal treatise, von Neumann and Morgenstern (1944), devoted fully three quarters of its space to cooperative matters.
    [Show full text]
  • Perfect Conditional E-Equilibria of Multi-Stage Games with Infinite Sets
    Perfect Conditional -Equilibria of Multi-Stage Games with Infinite Sets of Signals and Actions∗ Roger B. Myerson∗ and Philip J. Reny∗∗ *Department of Economics and Harris School of Public Policy **Department of Economics University of Chicago Abstract Abstract: We extend Kreps and Wilson’s concept of sequential equilibrium to games with infinite sets of signals and actions. A strategy profile is a conditional -equilibrium if, for any of a player’s positive probability signal events, his conditional expected util- ity is within of the best that he can achieve by deviating. With topologies on action sets, a conditional -equilibrium is full if strategies give every open set of actions pos- itive probability. Such full conditional -equilibria need not be subgame perfect, so we consider a non-topological approach. Perfect conditional -equilibria are defined by testing conditional -rationality along nets of small perturbations of the players’ strate- gies and of nature’s probability function that, for any action and for almost any state, make this action and state eventually (in the net) always have positive probability. Every perfect conditional -equilibrium is a subgame perfect -equilibrium, and, in fi- nite games, limits of perfect conditional -equilibria as 0 are sequential equilibrium strategy profiles. But limit strategies need not exist in→ infinite games so we consider instead the limit distributions over outcomes. We call such outcome distributions per- fect conditional equilibrium distributions and establish their existence for a large class of regular projective games. Nature’s perturbations can produce equilibria that seem unintuitive and so we augment the game with a net of permissible perturbations.
    [Show full text]
  • Extensive-Form Games with Perfect Information
    Extensive-Form Games with Perfect Information Yiling Chen September 22, 2008 CS286r Fall'08 Extensive-Form Games with Perfect Information 1 Logistics In this unit, we cover 5.1 of the SLB book. Problem Set 1, due Wednesday September 24 in class. CS286r Fall'08 Extensive-Form Games with Perfect Information 2 Review: Normal-Form Games A finite n-person normal-form game, G =< N; A; u > I N = f1; 2; :::; ng is the set of players. I A = fA1; A2; :::; Ang is a set of available actions. I u = fu1; u2; :::ung is a set of utility functions for n agents. C D C 5,5 0,8 D 8,0 1,1 Prisoner's Dilemma CS286r Fall'08 Extensive-Form Games with Perfect Information 3 Example 1: The Sharing Game Alice and Bob try to split two indivisible and identical gifts. First, Alice suggests a split: which can be \Alice keeps both", \they each keep one", and \Bob keeps both". Then, Bob chooses whether to Accept or Reject the split. If Bob accepts the split, they each get what the split specifies. If Bob rejects, they each get nothing. Alice 2-0 1-1 0-2 Bob Bob Bob A R A R A R (2,0) (0,0) (1,1) (0,0) (0,2) (0,0) CS286r Fall'08 Extensive-Form Games with Perfect Information 4 Loosely Speaking... Extensive Form I A detailed description of the sequential structure of the decision problems encountered by the players in a game. I Often represented as a game tree Perfect Information I All players know the game structure (including the payoff functions at every outcome).
    [Show full text]