Sound Algorithms in Imperfect Information Games

Total Page:16

File Type:pdf, Size:1020Kb

Sound Algorithms in Imperfect Information Games Extended Abstract AAMAS 2021, May 3-7, 2021, Online Sound Algorithms in Imperfect Information Games Extended Abstract Michal Šustr Martin Schmid Matej Moravčík Czech Technical University DeepMind DeepMind DeepMind [email protected] [email protected] [email protected] Neil Burch Marc Lanctot Michael Bowling DeepMind DeepMind DeepMind [email protected] [email protected] [email protected] ABSTRACT We introduce a framework for evaluating the performance of an Search has played a fundamental role in computer game research online algorithm. Within this framework, we introduce the defini- since the very beginning. And while online search has been com- tion of a sound and n-sound algorithm. Like the exploitability of a monly used in perfect information games such as Chess and Go, strategy in the offline setting, the soundness of an algorithm isa online search methods for imperfect information games have only measure of if its performance against a worst case adversary. Impor- been introduced relatively recently. This paper addresses the ques- tantly, this notion collapses to the previous notion of exploitability tion of what is a sound online algorithm in an imperfect information when the algorithm follows a fixed strategy profile. setting of two-player zero-sum games? We argue that the fixed- We then introduce a consistency framework, a hierarchy that strategy definitions of exploitability and epsilon-Nash equilibria allows us to formally state in what sense an online algorithm plays are ill suited to measure the worst-case performance of an online “consistently” with an n-equilibrium. This allows to state multi- algorithm. We thus formalize epsilon-soundness, a concept that ple bounds on the soundness of the algorithm, based on the n- connects the worst-case performance of an online algorithm to equilibrium and the type of consistency. The stronger the consis- the performance of an epsilon-Nash equilibrium. Our definition tency in our hierarchy, the stronger the bounds. of soundness and the consistency hierarchy finally provide appro- A complete version of this paper can be found on arXiv [9]. priate tools to analyze online algorithms in repeated imperfect information games. We thus inspect some of the previous online 2 MOTIVATIONAL EXAMPLE algorithms in a new light, bringing new insights into their worst Consider a simple online algorithm for (r)ock (p)aper (s)cissors case performance guarantees. game that simply produces the sequence ¹A,?,B,A,?,B,...º. While this algorithm is clearly highly exploitable, the worst-case perfor- KEYWORDS mance of this algorithm can only be properly analysed through the Game Theory; Imperfect Information; Nash Equilibrium; Online repeated game settings as no offline policy captures the dynamics Algorithm; Exploitability; Soundness of this algorithm. ACM Reference Format: Michal Šustr, Martin Schmid, Matej Moravčík, Neil Burch, Marc Lanctot, 3 BACKGROUND and Michael Bowling. 2021. Sound Algorithms in Imperfect Information We present our results using the recent formalism of factored- Games: Extended Abstract. In Proc. of the 20th International Conference on observations games [6], where factored-observations game is a Autonomous Agents and Multiagent Systems (AAMAS 2021), Online, May 3–7, G hN W > A T R Oi 2021, IFAAMAS, 3 pages. tuple = , ,F , , , , 1 INTRODUCTION 3.1 Online Settings Online methods for approximating Nash equilibria in sequential The repeated game ? consists of a finite sequence of : individual imperfect information games appeared only in the last few years [2– matches < = ¹I1,I2,...,I: º, where each match I8 2 Z is a se- ¹ 0 0 1 1 ;8 −1 4, 7, 8]. We thus investigate what it takes for an online algorithm quence of world states and actions I8 = F8 , 08 ,F8 , 08 . , 08 , to be sound in imperfect information settings. While it has been ;8 º F8 . An online algorithm Ω then simply maps an information state known that search with imperfect information is more challenging observed during a match to a strategy, while possibly using its than with perfect information [5, 7], the problem is maybe more internal algorithm state (Def. 1). complex than previously thought. Online algorithms “live” in a : Given two players Ω1, Ω2, we use % to denote the distri- Ω1,Ω2 fundamentally different setting, and they need to be evaluated bution over all the possible repeated games < of length : when appropriately. these two players face each other. The average reward of < is Í: Proc. of the 20th International Conference on Autonomous Agents and Multiagent Systems R= ¹<º = 1/: 8=1 D= ¹I8 º and we denote E<∼%: »R= ¹<º¼ to be Ω1,Ω2 (AAMAS 2021), U. Endriss, A. Nowé, F. Dignum, A. Lomuscio (eds.), May 3–7, 2021, Online. the expected average reward when the players play : matches. © 2021 International Foundation for Autonomous Agents and Multiagent Systems (www.ifaamas.org). All rights reserved. 1674 Extended Abstract AAMAS 2021, May 3-7, 2021, Online Definition 1. Online algorithm Ω is a function S=×Θ 7! Δ¹A= ¹B=ºº× Definition 6. Algorithm Ω is globally consistent with n-equilibria if Θ, that maps an information state B= 2 S= to the strategy f= ¹B=º 2 n 8: 8< = ¹I1,I2,...,I: º 9f 2 NE= 8ℎ 2 H ¹I8 º Δ¹A= ¹B=ºº, while possibly making use of algorithm’s state ) 2 Θ ¹I , ..., I º and updating it. holds that Ω 1 8−1 ¹B¹ℎºº = f ¹B¹ℎºº for 88 2 f1,...,:g. However: 4 SOUNDNESS OF ONLINE ALGORITHM Theorem 7. An algorithm that is globally consistent with n-equilibria Exploitability / n-equilibrium considers the expected utility of a might not be n-sound. fixed strategy against a worst-case adversary in a single match.We thus define a similar concept for the settings of an online algorithm But what if the algorithm keeps on playing the repeated game? in a repeated game: ¹:, nº-soundness. Intuitively, an online algo- While the global consistency with equilibria does not guarantee rithm is ¹:, nº-sound if and only if it is guaranteed the same reward soundness, it guarantees that the expected average reward con- as if it followed a fixed n-equilibrium after : matches. verges to the game value in the limit. Definition 2. For an ¹:, nº-sound online algorithm Ω, the expected Theorem 8. For an algorithm Ω that is globally consistent with average reward against any opponent is at least as good as if it fol- n-equilibria, lowed an n-Nash equilibrium fixed strategy f for any number of S Δ 0 ∗ 1 matches : : 8: 8Ω2 : E<∼%: »R¹<º¼ ≥ D − n − . (2) Ω,Ω2 : 0 8 ≥ 8 0 »R¹ º¼ ≥ 0 »R¹ º¼ : : Ω2 : E<∼%: < E<∼%: < . (1) Corollary 9. An algorithm Ω that is globally consistent with n- Ω,Ω2 f,Ω2 equilibria is ¹:, nº-sound as : ! 1. If algorithm Ω is ¹:, nº-sound for 8: ≥ 1, we say the algorithm is n-sound. 5.3 Strong Global Consistency 5 CONSISTENCY HIERARCHY Strong global consistency additionally guarantees that the game- play itself is generated consistently with an equilibrium; and as As the ¹:, nº-soundness can often be infeasible to compute, we in global consistency, the partial strategies for this game-play also introduce the concept of consistency. This allows to formalize that correspond to an n-equilibrium. In other words, the online algorithm an algorithm plays “consistently” with an n-equilibrium, directly simply exactly follows a predefined equilibrium. bounding the ¹:, nº-soundness. Definition 10. Online algorithm Ω is strongly globally consistent 5.1 Local Consistency with n-equilibrium if n Local consistency simply guarantees that every time we query the 9f 2 NE= 8: 8< = ¹I1,I2,...,I: º 8ℎ 2 H ¹I: º online algorithm, there is an n-equilibrium that has the same local holds that Ω¹I1, ..., I:−1º ¹B¹ℎºº = f ¹B¹ℎºº. behavioral strategy f ¹Bº for the queried state B. Definition 3. Algorithm Ω is locally consistent with n-equilibria if Theorem 11. Online algorithm Ω that is strongly globally consis- tent with n-equilibrium is n-sound. n 8: 8< = ¹I1,I2,...,I º 8ℎ 2 H ¹I º 9f 2 NE : : = Canonical examples of strongly globally consistent online algo- ¹ º holds that Ω I1, ..., I:−1 ¹B¹ℎºº = f ¹B¹ℎºº. rithms are DeepStack/Libratus. In general, an algorithm that uses a notion of safe (continual) resolving is strongly globally consistent We then show that this link has different implication for perfect as it essentially re-solves some n-equilibrium (albeit an unknown and imperfect information games. one) that it follows. Another, more recent example is ReBeL [1], Theorem 4. An algorithm that is locally consistent with n-equilibria as it essentially imitates CFR-D iterations in conjunction with a might not be ¹:, nº-sound. neural network at a test time. 6 REGRET AND ¹:, nº-SOUNDNESS Theorem 5. In perfect information games, an algorithm that is locally consistent with a subgame perfect equilibrium is sound. Finally, the introduced notion of soundness has tight connection to the popular concept of regret when a regret minimizer is used as A particularly interesting example of an algorithm that is only the online algorithm. locally consistent is Online Outcome Sampling [7] (OOS). In the full paper version [9] we show that although this algorithm was Corollary 12. Any regret minimizer with a regret bound of ': ¹ ': º previously considered to be sound, it can produce highly exploitable is :, 2: -sound. strategies in imperfect information games. ACKNOWLEDGMENTS 5.2 Global Consistency Computational resources were supplied by the project "e-Infrastruktura Local consistency guaranteed consistency only for individual states. CZ" (e-INFRA LM2018140). This work was supported by Czech sci- Global consistency is a stronger criterion that guarantees consis- ence foundation grant no. 18-27483Y. tency with some equilibria for all the states in combination.
Recommended publications
  • Lecture 4 Rationalizability & Nash Equilibrium Road
    Lecture 4 Rationalizability & Nash Equilibrium 14.12 Game Theory Muhamet Yildiz Road Map 1. Strategies – completed 2. Quiz 3. Dominance 4. Dominant-strategy equilibrium 5. Rationalizability 6. Nash Equilibrium 1 Strategy A strategy of a player is a complete contingent-plan, determining which action he will take at each information set he is to move (including the information sets that will not be reached according to this strategy). Matching pennies with perfect information 2’s Strategies: HH = Head if 1 plays Head, 1 Head if 1 plays Tail; HT = Head if 1 plays Head, Head Tail Tail if 1 plays Tail; 2 TH = Tail if 1 plays Head, 2 Head if 1 plays Tail; head tail head tail TT = Tail if 1 plays Head, Tail if 1 plays Tail. (-1,1) (1,-1) (1,-1) (-1,1) 2 Matching pennies with perfect information 2 1 HH HT TH TT Head Tail Matching pennies with Imperfect information 1 2 1 Head Tail Head Tail 2 Head (-1,1) (1,-1) head tail head tail Tail (1,-1) (-1,1) (-1,1) (1,-1) (1,-1) (-1,1) 3 A game with nature Left (5, 0) 1 Head 1/2 Right (2, 2) Nature (3, 3) 1/2 Left Tail 2 Right (0, -5) Mixed Strategy Definition: A mixed strategy of a player is a probability distribution over the set of his strategies. Pure strategies: Si = {si1,si2,…,sik} σ → A mixed strategy: i: S [0,1] s.t. σ σ σ i(si1) + i(si2) + … + i(sik) = 1. If the other players play s-i =(s1,…, si-1,si+1,…,sn), then σ the expected utility of playing i is σ σ σ i(si1)ui(si1,s-i) + i(si2)ui(si2,s-i) + … + i(sik)ui(sik,s-i).
    [Show full text]
  • SEQUENTIAL GAMES with PERFECT INFORMATION Example
    SEQUENTIAL GAMES WITH PERFECT INFORMATION Example 4.9 (page 105) Consider the sequential game given in Figure 4.9. We want to apply backward induction to the tree. 0 Vertex B is owned by player two, P2. The payoffs for P2 are 1 and 3, with 3 > 1, so the player picks R . Thus, the payoffs at B become (0, 3). 00 Next, vertex C is also owned by P2 with payoffs 1 and 0. Since 1 > 0, P2 picks L , and the payoffs are (4, 1). Player one, P1, owns A; the choice of L gives a payoff of 0 and R gives a payoff of 4; 4 > 0, so P1 chooses R. The final payoffs are (4, 1). 0 00 We claim that this strategy profile, { R } for P1 and { R ,L } is a Nash equilibrium. Notice that the 0 00 strategy profile gives a choice at each vertex. For the strategy { R ,L } fixed for P2, P1 has a maximal payoff by choosing { R }, ( 0 00 0 00 π1(R, { R ,L }) = 4 π1(R, { R ,L }) = 4 ≥ 0 00 π1(L, { R ,L }) = 0. 0 00 In the same way, for the strategy { R } fixed for P1, P2 has a maximal payoff by choosing { R ,L }, ( 00 0 00 π2(R, {∗,L }) = 1 π2(R, { R ,L }) = 1 ≥ 00 π2(R, {∗,R }) = 0, where ∗ means choose either L0 or R0. Since no change of choice by a player can increase that players own payoff, the strategy profile is called a Nash equilibrium. Notice that the above strategy profile is also a Nash equilibrium on each branch of the game tree, mainly starting at either B or starting at C.
    [Show full text]
  • Economics 201B Economic Theory (Spring 2021) Strategic Games
    Economics 201B Economic Theory (Spring 2021) Strategic Games Topics: terminology and notations (OR 1.7), games and solutions (OR 1.1-1.3), rationality and bounded rationality (OR 1.4-1.6), formalities (OR 2.1), best-response (OR 2.2), Nash equilibrium (OR 2.2), 2 2 examples × (OR 2.3), existence of Nash equilibrium (OR 2.4), mixed strategy Nash equilibrium (OR 3.1, 3.2), strictly competitive games (OR 2.5), evolution- ary stability (OR 3.4), rationalizability (OR 4.1), dominance (OR 4.2, 4.3), trembling hand perfection (OR 12.5). Terminology and notations (OR 1.7) Sets For R, ∈ ≥ ⇐⇒ ≥ for all . and ⇐⇒ ≥ for all and some . ⇐⇒ for all . Preferences is a binary relation on some set of alternatives R. % ⊆ From % we derive two other relations on : — strict performance relation and not  ⇐⇒ % % — indifference relation and ∼ ⇐⇒ % % Utility representation % is said to be — complete if , or . ∀ ∈ % % — transitive if , and then . ∀ ∈ % % % % can be presented by a utility function only if it is complete and transitive (rational). A function : R is a utility function representing if → % ∀ ∈ () () % ⇐⇒ ≥ % is said to be — continuous (preferences cannot jump...) if for any sequence of pairs () with ,and and , . { }∞=1 % → → % — (strictly) quasi-concave if for any the upper counter set ∈ { ∈ : is (strictly) convex. % } These guarantee the existence of continuous well-behaved utility function representation. Profiles Let be a the set of players. — () or simply () is a profile - a collection of values of some variable,∈ one for each player. — () or simply is the list of elements of the profile = ∈ { } − () for all players except . ∈ — ( ) is a list and an element ,whichistheprofile () .
    [Show full text]
  • Pure and Bayes-Nash Price of Anarchy for Generalized Second Price Auction
    Pure and Bayes-Nash Price of Anarchy for Generalized Second Price Auction Renato Paes Leme Eva´ Tardos Department of Computer Science Department of Computer Science Cornell University, Ithaca, NY Cornell University, Ithaca, NY [email protected] [email protected] Abstract—The Generalized Second Price Auction has for advertisements and slots higher on the page are been the main mechanism used by search companies more valuable (clicked on by more users). The bids to auction positions for advertisements on search pages. are used to determine both the assignment of bidders In this paper we study the social welfare of the Nash equilibria of this game in various models. In the full to slots, and the fees charged. In the simplest model, information setting, socially optimal Nash equilibria are the bidders are assigned to slots in order of bids, and known to exist (i.e., the Price of Stability is 1). This paper the fee for each click is the bid occupying the next is the first to prove bounds on the price of anarchy, and slot. This auction is called the Generalized Second Price to give any bounds in the Bayesian setting. Auction (GSP). More generally, positions and payments Our main result is to show that the price of anarchy is small assuming that all bidders play un-dominated in the Generalized Second Price Auction depend also on strategies. In the full information setting we prove a bound the click-through rates associated with the bidders, the of 1.618 for the price of anarchy for pure Nash equilibria, probability that the advertisement will get clicked on by and a bound of 4 for mixed Nash equilibria.
    [Show full text]
  • Extensive Form Games with Perfect Information
    Notes Strategy and Politics: Extensive Form Games with Perfect Information Matt Golder Pennsylvania State University Sequential Decision-Making Notes The model of a strategic (normal form) game suppresses the sequential structure of decision-making. When applying the model to situations in which players move sequentially, we assume that each player chooses her plan of action once and for all. She is committed to this plan, which she cannot modify as events unfold. The model of an extensive form game, by contrast, describes the sequential structure of decision-making explicitly, allowing us to study situations in which each player is free to change her mind as events unfold. Perfect Information Notes Perfect information describes a situation in which players are always fully informed about all of the previous actions taken by all players. This assumption is used in all of the following lecture notes that use \perfect information" in the title. Later we will also study more general cases where players may only be imperfectly informed about previous actions when choosing an action. Extensive Form Games Notes To describe an extensive form game with perfect information we need to specify the set of players and their preferences, just as for a strategic game. In addition, we also need to specify the order of the players' moves and the actions each player may take at each point (or decision node). We do so by specifying the set of all sequences of actions that can possibly occur, together with the player who moves at each point in each sequence. We refer to each possible sequence of actions (a1; a2; : : : ak) as a terminal history and to the function that denotes the player who moves at each point in each terminal history as the player function.
    [Show full text]
  • Games with Hidden Information
    Games with Hidden Information R&N Chapter 6 R&N Section 17.6 • Assumptions so far: – Two-player game : Player A and B. – Perfect information : Both players see all the states and decisions. Each decision is made sequentially . – Zero-sum : Player’s A gain is exactly equal to player B’s loss. • We are going to eliminate these constraints. We will eliminate first the assumption of “perfect information” leading to far more realistic models. – Some more game-theoretic definitions Matrix games – Minimax results for perfect information games – Minimax results for hidden information games 1 Player A 1 L R Player B 2 3 L R L R Player A 4 +2 +5 +2 L Extensive form of game: Represent the -1 +4 game by a tree A pure strategy for a player 1 defines the move that the L R player would make for every possible state that the player 2 3 would see. L R L R 4 +2 +5 +2 L -1 +4 2 Pure strategies for A: 1 Strategy I: (1 L,4 L) Strategy II: (1 L,4 R) L R Strategy III: (1 R,4 L) Strategy IV: (1 R,4 R) 2 3 Pure strategies for B: L R L R Strategy I: (2 L,3 L) Strategy II: (2 L,3 R) 4 +2 +5 +2 Strategy III: (2 R,3 L) L R Strategy IV: (2 R,3 R) -1 +4 In general: If N states and B moves, how many pure strategies exist? Matrix form of games Pure strategies for A: Pure strategies for B: Strategy I: (1 L,4 L) Strategy I: (2 L,3 L) Strategy II: (1 L,4 R) Strategy II: (2 L,3 R) Strategy III: (1 R,4 L) Strategy III: (2 R,3 L) 1 Strategy IV: (1 R,4 R) Strategy IV: (2 R,3 R) R L I II III IV 2 3 L R I -1 -1 +2 +2 L R 4 II +4 +4 +2 +2 +2 +5 +1 L R III +5 +1 +5 +1 IV +5 +1 +5 +1 -1 +4 3 Pure strategies for Player B Player A’s payoff I II III IV if game is played I -1 -1 +2 +2 with strategy I by II +4 +4 +2 +2 Player A and strategy III by III +5 +1 +5 +1 Player B for Player A for Pure strategies IV +5 +1 +5 +1 • Matrix normal form of games: The table contains the payoffs for all the possible combinations of pure strategies for Player A and Player B • The table characterizes the game completely, there is no need for any additional information about rules, etc.
    [Show full text]
  • Introduction to Game Theory: a Discovery Approach
    Linfield University DigitalCommons@Linfield Linfield Authors Book Gallery Faculty Scholarship & Creative Works 2020 Introduction to Game Theory: A Discovery Approach Jennifer Firkins Nordstrom Linfield College Follow this and additional works at: https://digitalcommons.linfield.edu/linfauth Part of the Logic and Foundations Commons, Set Theory Commons, and the Statistics and Probability Commons Recommended Citation Nordstrom, Jennifer Firkins, "Introduction to Game Theory: A Discovery Approach" (2020). Linfield Authors Book Gallery. 83. https://digitalcommons.linfield.edu/linfauth/83 This Book is protected by copyright and/or related rights. It is brought to you for free via open access, courtesy of DigitalCommons@Linfield, with permission from the rights-holder(s). Your use of this Book must comply with the Terms of Use for material posted in DigitalCommons@Linfield, or with other stated terms (such as a Creative Commons license) indicated in the record and/or on the work itself. For more information, or if you have questions about permitted uses, please contact [email protected]. Introduction to Game Theory a Discovery Approach Jennifer Firkins Nordstrom Linfield College McMinnville, OR January 4, 2020 Version 4, January 2020 ©2020 Jennifer Firkins Nordstrom This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License. Preface Many colleges and universities are offering courses in quantitative reasoning for all students. One model for a quantitative reasoning course is to provide students with a single cohesive topic. Ideally, such a topic can pique the curios- ity of students with wide ranging academic interests and limited mathematical background. This text is intended for use in such a course.
    [Show full text]
  • Extensive-Form Games with Perfect Information
    Extensive-Form Games with Perfect Information Jeffrey Ely April 22, 2015 Jeffrey Ely Extensive-Form Games with Perfect Information Extensive-Form Games With Perfect Information Recall the alternating offers bargaining game. It is an extensive-form game. (Described in terms of sequences of moves.) In this game, each time a player moves, he knows everything that has happened in the past. He can compute exactly how the game will end given any profile of continuation strategies Such a game is said to have perfect information. Jeffrey Ely Extensive-Form Games with Perfect Information Simple Extensive-Form Game With Perfect Information 1b ¨ H L ¨ H R ¨¨ HH 2r¨¨ HH2r AB @ C @ D @ @ r @r r @r 1, 0 2, 3 4, 1 −1, 0 Figure: An extensive game Jeffrey Ely Extensive-Form Games with Perfect Information Game Trees Nodes H Initial node. Terminal Nodes Z ⊂ H. Player correspondence N : H ⇒ f1, ... , ng (N(h) indicates who moves at h. They move simultaneously) Actions Ai (h) 6= Æ at each h where i 2 N(h). I Action profiles can be identified with branches in the tree. Payoffs ui : Z ! R. Jeffrey Ely Extensive-Form Games with Perfect Information Strategies A strategy for player i is a function si specifying si (h) 2 Ai (h) for each h such that i 2 N(h). Jeffrey Ely Extensive-Form Games with Perfect Information Strategic Form Given any profile of strategies s, there is a unique terminal node z(s) that will be reached. We define ui (s) = ui (z(s)). We have thus obtained a strategic-form representation of the game.
    [Show full text]
  • Perfect Conditional E-Equilibria of Multi-Stage Games with Infinite Sets
    Perfect Conditional -Equilibria of Multi-Stage Games with Infinite Sets of Signals and Actions∗ Roger B. Myerson∗ and Philip J. Reny∗∗ *Department of Economics and Harris School of Public Policy **Department of Economics University of Chicago Abstract Abstract: We extend Kreps and Wilson’s concept of sequential equilibrium to games with infinite sets of signals and actions. A strategy profile is a conditional -equilibrium if, for any of a player’s positive probability signal events, his conditional expected util- ity is within of the best that he can achieve by deviating. With topologies on action sets, a conditional -equilibrium is full if strategies give every open set of actions pos- itive probability. Such full conditional -equilibria need not be subgame perfect, so we consider a non-topological approach. Perfect conditional -equilibria are defined by testing conditional -rationality along nets of small perturbations of the players’ strate- gies and of nature’s probability function that, for any action and for almost any state, make this action and state eventually (in the net) always have positive probability. Every perfect conditional -equilibrium is a subgame perfect -equilibrium, and, in fi- nite games, limits of perfect conditional -equilibria as 0 are sequential equilibrium strategy profiles. But limit strategies need not exist in→ infinite games so we consider instead the limit distributions over outcomes. We call such outcome distributions per- fect conditional equilibrium distributions and establish their existence for a large class of regular projective games. Nature’s perturbations can produce equilibria that seem unintuitive and so we augment the game with a net of permissible perturbations.
    [Show full text]
  • Extensive-Form Games with Perfect Information
    Extensive-Form Games with Perfect Information Yiling Chen September 22, 2008 CS286r Fall'08 Extensive-Form Games with Perfect Information 1 Logistics In this unit, we cover 5.1 of the SLB book. Problem Set 1, due Wednesday September 24 in class. CS286r Fall'08 Extensive-Form Games with Perfect Information 2 Review: Normal-Form Games A finite n-person normal-form game, G =< N; A; u > I N = f1; 2; :::; ng is the set of players. I A = fA1; A2; :::; Ang is a set of available actions. I u = fu1; u2; :::ung is a set of utility functions for n agents. C D C 5,5 0,8 D 8,0 1,1 Prisoner's Dilemma CS286r Fall'08 Extensive-Form Games with Perfect Information 3 Example 1: The Sharing Game Alice and Bob try to split two indivisible and identical gifts. First, Alice suggests a split: which can be \Alice keeps both", \they each keep one", and \Bob keeps both". Then, Bob chooses whether to Accept or Reject the split. If Bob accepts the split, they each get what the split specifies. If Bob rejects, they each get nothing. Alice 2-0 1-1 0-2 Bob Bob Bob A R A R A R (2,0) (0,0) (1,1) (0,0) (0,2) (0,0) CS286r Fall'08 Extensive-Form Games with Perfect Information 4 Loosely Speaking... Extensive Form I A detailed description of the sequential structure of the decision problems encountered by the players in a game. I Often represented as a game tree Perfect Information I All players know the game structure (including the payoff functions at every outcome).
    [Show full text]
  • Epistemic Game Theory∗
    Epistemic game theory∗ Eddie Dekel and Marciano Siniscalchi Tel Aviv and Northwestern University, and Northwestern University January 14, 2014 Contents 1 Introduction and Motivation2 1.1 Philosophy/Methodology.................................4 2 Main ingredients5 2.1 Notation...........................................5 2.2 Strategic-form games....................................5 2.3 Belief hierarchies......................................6 2.4 Type structures.......................................7 2.5 Rationality and belief...................................9 2.6 Discussion.......................................... 10 2.6.1 State dependence and non-expected utility................... 10 2.6.2 Elicitation...................................... 11 2.6.3 Introspective beliefs................................ 11 2.6.4 Semantic/syntactic models............................ 11 3 Strategic games of complete information 12 3.1 Rationality and Common Belief in Rationality..................... 12 3.2 Discussion.......................................... 14 3.3 ∆-rationalizability..................................... 14 4 Equilibrium Concepts 15 4.1 Introduction......................................... 15 4.2 Subjective Correlated Equilibrium............................ 16 4.3 Objective correlated equilibrium............................. 17 4.4 Nash Equilibrium...................................... 21 4.5 The book-of-play assumption............................... 22 4.6 Discussion.......................................... 23 4.6.1 Condition AI...................................
    [Show full text]
  • Sequential Price of Anarchy for Atomic Congestion Games with Limited Number of Players
    Kiril Kolev Master Thesis Sequential Price of Anarchy for Atomic Congestion Games with Limited Number of Players August 2016 University of Twente Master Applied Mathematics Master Thesis Sequential Price of Anarchy for Atomic Congestion Games with Limited Number of Players Author: Supervisors: Kiril Kolev Marc Uetz Judith Timmer Jasper de Jong, MSc A thesis submitted in fulfillment of the requirements for the degree of Master of Science in the Discrete Mathematics and Mathematical Programming group of the University of Twente August 2016 \It is just as foolish to complain that people are selfish and treacherous as it is to complain that the magnetic field does not increase unless the electric field has a curl. Both are laws of nature." John von Neumann UNIVERSITY OF TWENTE Faculty of Electrical Engineering, Mathematics, and Computer Science Master Applied Mathematics Abstract Sequential Price of Anarchy for Atomic Congestion Games with Limited Number of Players by Kiril Kolev We consider sequentially played atomic congestion games with linear latency functions. To measure the quality of sub-game perfect equilibria of sequentially played games, we analyze the Sequential Price of Anarchy (SPoA), which is defined to be the ratio of the total cost given by the worst sub-game perfect equilibrium to the total cost of the social optimum. In this project, we want to improve lower and upper bounds on the value of SPoA using a linear programming approach. We limit the number of players to 4, 5 and 6 players and consider different combinations of their possible actions and resources they use. Additionaly we consider two player sequentially played non-atomic congestion games with split flow and give new results for the SPoA in all the possible cases.
    [Show full text]