Automating Collusion Detection in Sequential Games
Total Page:16
File Type:pdf, Size:1020Kb
Load more
Recommended publications
-
Game Theory 10
Game Theory 10. Poker Albert-Ludwigs-Universität Freiburg Bernhard Nebel and Robert Mattmüller 1 Motivation Motivation Kuhn Poker Real Poker: Problems and techniques Counterfac- tual regret minimization B. Nebel, R. Mattmüller – Game Theory 3 / 25 Motivation The system Libratus played a Poker tournament (heads up no-limit Texas hold ’em) from January 11 to 31, 2017 Motivation against four world-class Poker players. Kuhn Poker Heads up: One-on-One, i.e., a zero-sum game. Real Poker: No-limit: There is no limit in betting, only the stack the Problems and user has. techniques Texas hold’em: Each player gets two private cards, then Counterfac- tual regret open cards are dealt: first three, then one, and finally minimization another one. One combines the best 5 cards. Betting before the open cards are dealt and in the end: check, call, raise, or fold. Two teams (reversing the dealt cards). Libratus won the tournament with more than 1.7 Million US-$ (which neither the system nor the programming team got). B. Nebel, R. Mattmüller – Game Theory 4 / 25 The humans behind the scene Motivation Kuhn Poker Real Poker: Problems and techniques Counterfac- tual regret minimization Professional player Jason Les and Prof. Tuomas Sandholm (CMU) B. Nebel, R. Mattmüller – Game Theory 5 / 25 2 Kuhn Poker Motivation Kuhn Poker Real Poker: Problems and techniques Counterfac- tual regret minimization B. Nebel, R. Mattmüller – Game Theory 7 / 25 Kuhn Poker Motivation Minimal form of heads-up Poker, with only three cards: Kuhn Poker Real Poker: Jack, Queen, King. Problems and Each player is dealt one card and antes 1 chip (forced bet techniques in the beginning). -
Game Theory Lecture Notes
Game Theory: Penn State Math 486 Lecture Notes Version 2.1.1 Christopher Griffin « 2010-2021 Licensed under a Creative Commons Attribution-Noncommercial-Share Alike 3.0 United States License With Major Contributions By: James Fan George Kesidis and Other Contributions By: Arlan Stutler Sarthak Shah Contents List of Figuresv Preface xi 1. Using These Notes xi 2. An Overview of Game Theory xi Chapter 1. Probability Theory and Games Against the House1 1. Probability1 2. Random Variables and Expected Values6 3. Conditional Probability8 4. The Monty Hall Problem 11 Chapter 2. Game Trees and Extensive Form 15 1. Graphs and Trees 15 2. Game Trees with Complete Information and No Chance 18 3. Game Trees with Incomplete Information 22 4. Games of Chance 24 5. Pay-off Functions and Equilibria 26 Chapter 3. Normal and Strategic Form Games and Matrices 37 1. Normal and Strategic Form 37 2. Strategic Form Games 38 3. Review of Basic Matrix Properties 40 4. Special Matrices and Vectors 42 5. Strategy Vectors and Matrix Games 43 Chapter 4. Saddle Points, Mixed Strategies and the Minimax Theorem 45 1. Saddle Points 45 2. Zero-Sum Games without Saddle Points 48 3. Mixed Strategies 50 4. Mixed Strategies in Matrix Games 53 5. Dominated Strategies and Nash Equilibria 54 6. The Minimax Theorem 59 7. Finding Nash Equilibria in Simple Games 64 8. A Note on Nash Equilibria in General 66 Chapter 5. An Introduction to Optimization and the Karush-Kuhn-Tucker Conditions 69 1. A General Maximization Formulation 70 2. Some Geometry for Optimization 72 3. -
An Essay on Assignment Games
GRAU DE MATEMÀTIQUES Facultat de Matemàtiques i Informàtica Universitat de Barcelona DEGREE PROJECT An essay on assignment games Rubén Ureña Martínez Advisor: Dr. Javier Martínez de Albéniz Dept. de Matemàtica Econòmica, Financera i Actuarial Barcelona, January 2017 Abstract This degree project studies the main results on the bilateral assignment game. This is a part of cooperative game theory and models a market with indivisibilities and money. There are two sides of the market, let us say buyers and sellers, or workers and firms, such that when we match two agents from different sides, a profit is made. We show some good properties of the core of these games, such as its non-emptiness and its lattice structure. There are two outstanding points: the buyers-optimal core allocation and the sellers-optimal core allocation, in which all agents of one sector get their best possible outcome. We also study a related non-cooperative mechanism, an auction, to implement the buyers- optimal core allocation. Resumen Este trabajo de fin de grado estudia los resultados principales acerca de los juegos de asignación bilaterales. Corresponde a una parte de la teoría de juegos cooperativos y proporciona un modelo de mercado con indivisibilidades y dinero. Hay dos lados del mercado, digamos compradores y vendedores, o trabajadores y empresas, de manera que cuando se emparejan dos agentes de distinto lado, se produce un cierto beneficio. Se muestran además algunas buenas propiedades del núcleo de estos juegos, tales como su condición de ser siempre no vacío y su estructura de retículo. Encontramos dos puntos destacados: la distribución óptima para los compradores en el núcleo y la distribución óptima para los vendedores en el núcleo, en las cuales todos los agentes de cada sector obtienen simultáneamente el mejor resultado posible en el núcleo. -
Improving Fictitious Play Reinforcement Learning with Expanding Models
Improving Fictitious Play Reinforcement Learning with Expanding Models Rong-Jun Qin1;2, Jing-Cheng Pang1, Yang Yu1;y 1National Key Laboratory for Novel Software Technology, Nanjing University, China 2Polixir emails: [email protected], [email protected], [email protected]. yTo whom correspondence should be addressed Abstract Fictitious play with reinforcement learning is a general and effective framework for zero- sum games. However, using the current deep neural network models, the implementation of fictitious play faces crucial challenges. Neural network model training employs gradi- ent descent approaches to update all connection weights, and thus is easy to forget the old opponents after training to beat the new opponents. Existing approaches often maintain a pool of historical policy models to avoid the forgetting. However, learning to beat a pool in stochastic games, i.e., a wide distribution over policy models, is either sample-consuming or insufficient to exploit all models with limited amount of samples. In this paper, we pro- pose a learning process with neural fictitious play to alleviate the above issues. We train a single model as our policy model, which consists of sub-models and a selector. Everytime facing a new opponent, the model is expanded by adding a new sub-model, where only the new sub-model is updated instead of the whole model. At the same time, the selector is also updated to mix up the new sub-model with the previous ones at the state-level, so that the model is maintained as a behavior strategy instead of a wide distribution over policy models. -
Strong Stackelberg Reasoning in Symmetric Games: an Experimental
Strong Stackelberg reasoning in symmetric games: An experimental ANGOR UNIVERSITY replication and extension Pulford, B.D.; Colman, A.M.; Lawrence, C.L. PeerJ DOI: 10.7717/peerj.263 PRIFYSGOL BANGOR / B Published: 25/02/2014 Publisher's PDF, also known as Version of record Cyswllt i'r cyhoeddiad / Link to publication Dyfyniad o'r fersiwn a gyhoeddwyd / Citation for published version (APA): Pulford, B. D., Colman, A. M., & Lawrence, C. L. (2014). Strong Stackelberg reasoning in symmetric games: An experimental replication and extension. PeerJ, 263. https://doi.org/10.7717/peerj.263 Hawliau Cyffredinol / General rights Copyright and moral rights for the publications made accessible in the public portal are retained by the authors and/or other copyright owners and it is a condition of accessing publications that users recognise and abide by the legal requirements associated with these rights. • Users may download and print one copy of any publication from the public portal for the purpose of private study or research. • You may not further distribute the material or use it for any profit-making activity or commercial gain • You may freely distribute the URL identifying the publication in the public portal ? Take down policy If you believe that this document breaches copyright please contact us providing details, and we will remove access to the work immediately and investigate your claim. 23. Sep. 2021 Strong Stackelberg reasoning in symmetric games: An experimental replication and extension Briony D. Pulford1, Andrew M. Colman1 and Catherine L. Lawrence2 1 School of Psychology, University of Leicester, Leicester, UK 2 School of Psychology, Bangor University, Bangor, UK ABSTRACT In common interest games in which players are motivated to coordinate their strate- gies to achieve a jointly optimal outcome, orthodox game theory provides no general reason or justification for choosing the required strategies. -
(12) Patent Application Publication (10) Pub. No.: US 2012/0214567 A1 Snow (43) Pub
US 20120214567A1 (19) United States (12) Patent Application Publication (10) Pub. No.: US 2012/0214567 A1 Snow (43) Pub. Date: Aug. 23, 2012 (54) METHOD AND APPARATUS FORVARIANT Publication Classification OF TEXAS HOLDEM POKER (51) Int. Cl. A63F I3/00 (2006.01) (75) Inventor: Roger M. Snow, Las Vegas, NV A63F I/00 (2006.01) (US) (52) U.S. Cl. ........................................... 463/13; 273/292 57 ABSTRACT (73) Assignee: SHUFFLE MASTER, INC., Las (57) Vegas, NV (US) A variant game of Hold Empoker allows for rules of play of one or all of players being allowed to remain in game with an option of checking or making specific wagering amounts in (21) Appl. No.: 13/455.742 first play wagers, being limited in the size of Subsequent available play wagers or prohibited from making additional (22) Filed: Apr. 25, 2012 play wagers ifa first play wager has been made, being limited in the size of available laterplay wagers ifa first or earlier play Related U.S. Application Data wager has been made, and having the opportunity for at least two and as many as three or four distinct opportunities in the (62) Division of application No. 1 1/156.352, filed on Jun. stages in the play of a hand to be able to make one or more 17, 2005. play wagers. 110 Patent Application Publication Aug. 23, 2012 Sheet 1 of 10 US 2012/0214567 A1 Patent Application Publication Aug. 23, 2012 Sheet 2 of 10 US 2012/0214567 A1 a. s&Os Patent Application Publication Aug. 23, 2012 Sheet 3 of 10 US 2012/0214567 A1 Patent Application Publication Aug. -
Economics 201B Economic Theory (Spring 2021) Strategic Games
Economics 201B Economic Theory (Spring 2021) Strategic Games Topics: terminology and notations (OR 1.7), games and solutions (OR 1.1-1.3), rationality and bounded rationality (OR 1.4-1.6), formalities (OR 2.1), best-response (OR 2.2), Nash equilibrium (OR 2.2), 2 2 examples × (OR 2.3), existence of Nash equilibrium (OR 2.4), mixed strategy Nash equilibrium (OR 3.1, 3.2), strictly competitive games (OR 2.5), evolution- ary stability (OR 3.4), rationalizability (OR 4.1), dominance (OR 4.2, 4.3), trembling hand perfection (OR 12.5). Terminology and notations (OR 1.7) Sets For R, ∈ ≥ ⇐⇒ ≥ for all . and ⇐⇒ ≥ for all and some . ⇐⇒ for all . Preferences is a binary relation on some set of alternatives R. % ⊆ From % we derive two other relations on : — strict performance relation and not  ⇐⇒ % % — indifference relation and ∼ ⇐⇒ % % Utility representation % is said to be — complete if , or . ∀ ∈ % % — transitive if , and then . ∀ ∈ % % % % can be presented by a utility function only if it is complete and transitive (rational). A function : R is a utility function representing if → % ∀ ∈ () () % ⇐⇒ ≥ % is said to be — continuous (preferences cannot jump...) if for any sequence of pairs () with ,and and , . { }∞=1 % → → % — (strictly) quasi-concave if for any the upper counter set ∈ { ∈ : is (strictly) convex. % } These guarantee the existence of continuous well-behaved utility function representation. Profiles Let be a the set of players. — () or simply () is a profile - a collection of values of some variable,∈ one for each player. — () or simply is the list of elements of the profile = ∈ { } − () for all players except . ∈ — ( ) is a list and an element ,whichistheprofile () . -
Using Counterfactual Regret Minimization to Create Competitive Multiplayer Poker Agents
Using Counterfactual Regret Minimization to Create Competitive Multiplayer Poker Agents Nick Abou Risk Duane Szafron University of Alberta University of Alberta Department of Computing Science Department of Computing Science Edmonton, AB Edmonton, AB 780-492-5468 780-492-5468 [email protected] [email protected] ABSTRACT terminal choice node exists for each player’s possible decisions. Games are used to evaluate and advance Multiagent and Artificial The directed edges leaving a choice node represent the legal Intelligence techniques. Most of these games are deterministic actions for that player. Each terminal node contains the utilities of with perfect information (e.g. Chess and Checkers). A all players for one potential game result. Stochastic events deterministic game has no chance element and in a perfect (Backgammon dice rolls or Poker cards) are represented as special information game, all information is visible to all players. nodes, for a player called the chance player. The directed edges However, many real-world scenarios with competing agents are leaving a chance node represent the possible chance outcomes. stochastic (non-deterministic) with imperfect information. For Since extensive games can model a wide range of strategic two-player zero-sum perfect recall games, a recent technique decision-making scenarios, they have been used to study poker. called Counterfactual Regret Minimization (CFR) computes Recent advances in computer poker have led to substantial strategies that are provably convergent to an ε-Nash equilibrium. advancements in solving general two player zero-sum perfect A Nash equilibrium strategy is useful in two-player games since it recall extensive games. Koller and Pfeffer [10] used linear maximizes its utility against a worst-case opponent. -
ECONS 424 – STRATEGY and GAME THEORY MIDTERM EXAM #2 – Answer Key
ECONS 424 – STRATEGY AND GAME THEORY MIDTERM EXAM #2 – Answer key Exercise #1. Hawk-Dove game. Consider the following payoff matrix representing the Hawk-Dove game. Intuitively, Players 1 and 2 compete for a resource, each of them choosing to display an aggressive posture (hawk) or a passive attitude (dove). Assume that payoff > 0 denotes the value that both players assign to the resource, and > 0 is the cost of fighting, which only occurs if they are both aggressive by playing hawk in the top left-hand cell of the matrix. Player 2 Hawk Dove Hawk , , 0 2 2 − − Player 1 Dove 0, , 2 2 a) Show that if < , the game is strategically equivalent to a Prisoner’s Dilemma game. b) The Hawk-Dove game commonly assumes that the value of the resource is less than the cost of a fight, i.e., > > 0. Find the set of pure strategy Nash equilibria. Answer: Part (a) • When Player 2 (in columns) chooses Hawk (in the left-hand column), Player 1 (in rows) receives a positive payoff of by paying Hawk, which is higher than his payoff of zero from playing Dove. − Therefore, for Player2 1 Hawk is a best response to Player 2 playing Hawk. Similarly, when Player 2 chooses Dove (in the right-hand column), Player 1 receives a payoff of by playing Hawk, which is higher than his payoff from choosing Dove, ; entailing that Hawk is Player 1’s best response to Player 2 choosing Dove. Therefore, Player 1 chooses2 Hawk as his best response to all of Player 2’s strategies, implying that Hawk is a strictly dominant strategy for Player 1. -
Introduction to Game Theory I
Introduction to Game Theory I Presented To: 2014 Summer REU Penn State Department of Mathematics Presented By: Christopher Griffin [email protected] 1/34 Types of Game Theory GAME THEORY Classical Game Combinatorial Dynamic Game Evolutionary Other Topics in Theory Game Theory Theory Game Theory Game Theory No time, doesn't contain differential equations Notion of time, may contain differential equations Games with finite or Games with no Games with time. Modeling population Experimental / infinite strategy space, chance. dynamics under Behavioral Game but no time. Games with motion or competition Theory Generally two player a dynamic component. Games with probability strategic games Modeling the evolution Voting Theory (either induced by the played on boards. Examples: of strategies in a player or the game). Optimal play in a dog population. Examples: Moves change the fight. Chasing your Determining why Games with coalitions structure of a game brother across a room. Examples: altruism is present in or negotiations. board. The evolution of human society. behaviors in a group of Examples: Examples: animals. Determining if there's Poker, Strategic Chess, Checkers, Go, a fair voting system. Military Decision Nim. Equilibrium of human Making, Negotiations. behavior in social networks. 2/34 Games Against the House The games we often see on television fall into this category. TV Game Shows (that do not pit players against each other in knowledge tests) often require a single player (who is, in a sense, playing against The House) to make a decision that will affect only his life. Prize is Behind: 1 2 3 Choose Door: 1 2 3 1 2 3 1 2 3 Switch: Y N Y N Y N Y N Y N Y N Y N Y N Y N Win/Lose: L W W L W L W L L W W L W L W L L W The Monty Hall Problem Other games against the house: • Blackjack • The Price is Right 3/34 Assumptions of Multi-Player Games 1. -
Arxiv:Quant-Ph/0702167V2 30 Jan 2013
Quantum Game Theory Based on the Schmidt Decomposition Tsubasa Ichikawa∗ and Izumi Tsutsui† Institute of Particle and Nuclear Studies High Energy Accelerator Research Organization (KEK) Tsukuba 305-0801, Japan and Taksu Cheon‡ Laboratory of Physics Kochi University of Technology Tosa Yamada, Kochi 782-8502, Japan Abstract We present a novel formulation of quantum game theory based on the Schmidt de- composition, which has the merit that the entanglement of quantum strategies is manifestly quantified. We apply this formulation to 2-player, 2-strategy symmetric games and obtain a complete set of quantum Nash equilibria. Apart from those available with the maximal entanglement, these quantum Nash equilibria are ex- arXiv:quant-ph/0702167v2 30 Jan 2013 tensions of the Nash equilibria in classical game theory. The phase structure of the equilibria is determined for all values of entanglement, and thereby the possibility of resolving the dilemmas by entanglement in the game of Chicken, the Battle of the Sexes, the Prisoners’ Dilemma, and the Stag Hunt, is examined. We find that entanglement transforms these dilemmas with each other but cannot resolve them, except in the Stag Hunt game where the dilemma can be alleviated to a certain degree. ∗ email: [email protected] † email: [email protected] ‡ email: [email protected] 1. Introduction Quantum game theory, which is a theory of games with quantum strategies, has been attracting much attention among quantum physicists and economists in recent years [1, 2] (for a review, see [2, 3, 4]). There are basically two reasons for this. One is that quantum game theory provides a general basis to treat the quantum information processing and quantum communication in which plural actors try to achieve their objectives such as the increase in communication efficiency and security [5, 6]. -
Approximating Game-Theoretic Optimal Strategies for Full-Scale Poker
Approximating Game-Theoretic Optimal Strategies for Full-scale Poker D. Billings, N. Burch, A. Davidson, R. Holte, J. Schaeffer, T. Schauenberg, and D. Szafron Department of Computing Science, University of Alberta Edmonton, Alberta, T6G 2E8, Canada Email: {darse,burch,davidson,holte,jonathan,terence,duane}@cs.ualberta.ca Abstract Due to the computational limitations involved, only simpli• fied poker variations have been solved in the past (e.g. [Kuhn, The computation of the first complete approxima• 1950; Sakaguchi and Sakai, 19921). While these are of the• tions of game-theoretic optimal strategies for full- oretical interest, the same methods are not feasible for real scale poker is addressed. Several abstraction tech• games, which are too large by many orders of magnitude niques are combined to represent the game of 2- ([Roller and Pfeffer, 1997J). player Texas Hold'em, having size O(1018), using 7 [Shi and Littman, 2001] investigated abstraction tech• closely related models each having size 0(1O ). niques to reduce the large search space and complexity of the Despite the reduction in size by a factor of 100 problem, using a simplified variant of poker. [Takusagawa, billion, the resulting models retain the key prop• 2000] created near-optimal strategies for the play of three erties and structure of the real game. Linear pro• specific Hold'em flops and betting sequences. [Selby, 1999] gramming solutions to the abstracted game are used computed an optimal solution for the abbreviated game of to create substantially improved poker-playing pro• 1 pre flop Hold em. grams, able to defeat strong human players and be Using new abstraction techniques, we have produced vi• competitive against world-class opponents.