Time and Space: Why Imperfect Information Games Are Hard
Total Page:16
File Type:pdf, Size:1020Kb
Recommended publications
-
Game Theory 10
Game Theory 10. Poker. Albert-Ludwigs-Universität Freiburg, Bernhard Nebel and Robert Mattmüller. 1 Motivation. The system Libratus played a poker tournament (heads-up no-limit Texas hold ’em) from January 11 to 31, 2017 against four world-class poker players. Heads-up: one-on-one, i.e., a zero-sum game. No-limit: there is no limit on betting, only the stack the user has. Texas hold ’em: each player gets two private cards, then open cards are dealt: first three, then one, and finally another one. One combines the best 5 cards. Betting before the open cards are dealt and in the end: check, call, raise, or fold. Two teams (reversing the dealt cards). Libratus won the tournament with more than 1.7 million US-$ (which neither the system nor the programming team got). The humans behind the scene: professional player Jason Les and Prof. Tuomas Sandholm (CMU). 2 Kuhn Poker. Minimal form of heads-up poker, with only three cards: Jack, Queen, King. Each player is dealt one card and antes 1 chip (forced bet in the beginning). -
An Essay on Assignment Games
Degree in Mathematics, Facultat de Matemàtiques i Informàtica, Universitat de Barcelona. Degree project: An essay on assignment games. Rubén Ureña Martínez. Advisor: Dr. Javier Martínez de Albéniz, Dept. de Matemàtica Econòmica, Financera i Actuarial. Barcelona, January 2017. Abstract: This degree project studies the main results on the bilateral assignment game. This is a part of cooperative game theory and models a market with indivisibilities and money. There are two sides of the market, let us say buyers and sellers, or workers and firms, such that when we match two agents from different sides, a profit is made. We show some good properties of the core of these games, such as its non-emptiness and its lattice structure. There are two outstanding points: the buyers-optimal core allocation and the sellers-optimal core allocation, in which all agents of one sector get their best possible outcome. We also study a related non-cooperative mechanism, an auction, to implement the buyers-optimal core allocation. Summary: This degree project studies the main results on bilateral assignment games. It belongs to cooperative game theory and provides a model of a market with indivisibilities and money. There are two sides of the market, say buyers and sellers, or workers and firms, such that when two agents from different sides are matched, a certain profit is produced. Some good properties of the core of these games are also shown, such as its being always non-empty and its lattice structure. We find two distinguished points: the buyers-optimal core allocation and the sellers-optimal core allocation, in which all agents of each sector simultaneously obtain the best possible outcome in the core. -
Improving Fictitious Play Reinforcement Learning with Expanding Models
Improving Fictitious Play Reinforcement Learning with Expanding Models. Rong-Jun Qin (1,2), Jing-Cheng Pang (1), Yang Yu (1,†). (1) National Key Laboratory for Novel Software Technology, Nanjing University, China; (2) Polixir. †To whom correspondence should be addressed. Abstract: Fictitious play with reinforcement learning is a general and effective framework for zero-sum games. However, using the current deep neural network models, the implementation of fictitious play faces crucial challenges. Neural network model training employs gradient descent approaches to update all connection weights, and thus easily forgets the old opponents after training to beat the new opponents. Existing approaches often maintain a pool of historical policy models to avoid the forgetting. However, learning to beat a pool in stochastic games, i.e., a wide distribution over policy models, is either sample-consuming or insufficient to exploit all models with a limited amount of samples. In this paper, we propose a learning process with neural fictitious play to alleviate the above issues. We train a single model as our policy model, which consists of sub-models and a selector. Every time the model faces a new opponent, it is expanded by adding a new sub-model, where only the new sub-model is updated instead of the whole model. At the same time, the selector is also updated to mix up the new sub-model with the previous ones at the state level, so that the model is maintained as a behavior strategy instead of a wide distribution over policy models. -
Heads and Tails Unit Overview
6 Heads and tails. Unit Overview. Outcomes: In this unit pupils will: ● Describe wild animals. ● Ask and answer about favourite animals. ● Compare different animals’ bodies. ● Write about a wild animal. Reading: Pupils can: ● recognise cognates and deduce meaning of words in context. ● understand a poem, find specific information and transfer this to a table. ● read and complete a gapped text about two animals. Writing: Pupils can: ● write short descriptive texts following a model. Productive Language. Vocabulary: ● Animals: a crocodile, an elephant, a giraffe, a hippo, a monkey, an ostrich, a pony, a snake. ● Body parts: an arm, fingers, a foot, a head, a leg, a mouth, a nose, a tail, teeth, toes, wings. ● Adjectives: fat, funny, hot, long, short, silly, tall. Phrases: ● It has got/It hasn’t got (a long neck). ● Has it got legs? Yes, it has./No, it hasn’t. Receptive Language. Vocabulary: ● Animals: butterfly, meerkat, penguin, whale. ● Africa, apple, beak, carrot, lake, water, wildlife park. ● Eat, live, fly. Phrases: ● It lives ... Recycled Language. Vocabulary. Strategies: L1 (PB p.35 ex2): Listen for words that are similar in other languages. L2 (AB p.39 ex5): Before you listen, look at the pictures and guess. L3 (PB p.39 ex1): Read the questions before you listen. L4 (PB p.36 ex1): Listen and try to understand the main idea. S3 (PB p.38 ex2): Use the model to help you speak. R2 (AB p.38 ex1): Look for words that are similar in other languages. Use your Picture Dictionary to help you spell. -
Using Counterfactual Regret Minimization to Create Competitive Multiplayer Poker Agents
Using Counterfactual Regret Minimization to Create Competitive Multiplayer Poker Agents. Nick Abou Risk and Duane Szafron, University of Alberta, Department of Computing Science, Edmonton, AB, 780-492-5468. ABSTRACT: Games are used to evaluate and advance Multiagent and Artificial Intelligence techniques. Most of these games are deterministic with perfect information (e.g. Chess and Checkers). A deterministic game has no chance element and in a perfect information game, all information is visible to all players. However, many real-world scenarios with competing agents are stochastic (non-deterministic) with imperfect information. For two-player zero-sum perfect recall games, a recent technique called Counterfactual Regret Minimization (CFR) computes strategies that are provably convergent to an ε-Nash equilibrium. A Nash equilibrium strategy is useful in two-player games since it maximizes its utility against a worst-case opponent. [...] terminal choice node exists for each player’s possible decisions. The directed edges leaving a choice node represent the legal actions for that player. Each terminal node contains the utilities of all players for one potential game result. Stochastic events (Backgammon dice rolls or Poker cards) are represented as special nodes, for a player called the chance player. The directed edges leaving a chance node represent the possible chance outcomes. Since extensive games can model a wide range of strategic decision-making scenarios, they have been used to study poker. Recent advances in computer poker have led to substantial advancements in solving general two player zero-sum perfect recall extensive games. Koller and Pfeffer [10] used linear -
Probability and Counting Rules
Chapter 4: Probability and Counting Rules. Objectives: After completing this chapter, you should be able to: 1 Determine sample spaces and find the probability of an event, using classical probability or empirical probability. 2 Find the probability of compound events, using the addition rules. 3 Find the probability of compound events, using the multiplication rules. 4 Find the conditional probability of an event. 5 Find the total number of outcomes in a sequence of events, using the fundamental counting rule. 6 Find the number of ways that r objects can be selected from n objects, using the permutation rule. 7 Find the number of ways that r objects can be selected from n objects without regard to order, using the combination rule. 8 Find the probability of an event, using the counting rules. Outline: 4–1 Introduction. 4–2 Sample Spaces and Probability. 4–3 The Addition Rules for Probability. 4–4 The Multiplication Rules and Conditional Probability. 4–5 Counting Rules. 4–6 Probability and Counting Rules. 4–7 Summary. Statistics Today: Would You Bet Your Life? Humans not only bet money when they gamble, but also bet their lives by engaging in unhealthy activities such as smoking, drinking, using drugs, and exceeding the speed limit when driving. Many people don’t care about the risks involved in these activities since they do not understand the concepts of probability. -
Stories of the Fallen Willow
Stories of the Fallen Willow by Jessica Noel Casimir. Senior Honors Thesis, Department of English and Comparative Literature, April 2020. Dedication: To my parents who sacrifice without hesitation to support my ambitions, To my professors who invested their time and shared their wisdom, And to my fellow-writer friends made along the way. Thank you for the unconditional support and words of encouragement. Table of contents: Preface; Sellout; Papercuts; Static; Dinosaur Bones; How to Prepare for a Beach Trip in 4 Easy Steps; A Giver; Hereditary; Tomato Soup; A Fair Trade; Fifth Base; Remembering Bennett; Yellow Puddles; Head First. Preface: This introduction is meant to be a moment of honesty. So, I’ll be candid in saying that this is my eighth attempt at writing it. I’ve started and stopped, deleted and retyped, closed my laptop and reopened it. Never in my life have I found it this difficult to write, never in my life has my body physically ached at the thought of sitting down and spending time in my own headspace. Right now, my headspace is the last place on earth I want to be. I’ve decided that this will be my last attempt at writing, and whatever comes out now will remain on the page. I am currently sitting on my couch under a pile of blankets, reclined back as far as my seat will allow me to go. My Amazon Alexa is belting music from an oldies playlist, and my dad is sitting at the kitchen table singing along to “December, 1963” by The Four Seasons as he works. -
Games Lectures.Key
Lecture 10: Imperfect Information. AI For Traditional Games, Prof. Nathan Sturtevant, Winter 2011. Imperfect Information: • So far, all games we’ve developed solutions for have perfect information • No hidden information such as individual cards • Hidden information often represented as chance nodes • Could be a play by one player that is hidden until the end of the game. Example Tree / What is the size of a game with ii?: • Simple betting game (Kuhn Poker) • Ante 1 chip • 2-player game, 3-card deck, 1 card each • First player can check/bet • Second player can bet/check or call/fold • If 2nd player bets, 1st player can call/fold • 3 hands each / 6 total combinations • [Exercise: Draw top portion of tree in class] [Figure: example game tree with leaf payoffs of 1 and -1]. Simple Approach: Perfect-Info Monte-Carlo: • We have good perfect information-solvers • How can we use them for imperfect information games? • Sample all unknown information (eg a world) • For each world: • Solve perfectly with alpha-beta • Take the average best move • If too many worlds, sample a reasonable subset. Drawbacks of Monte-Carlo: • May be too many worlds to sample • May get probabilities on worlds incorrect • World prob. based on previous actions in the game • May reveal information in actions • Good probabilities needed for information hiding • Program has no sense of information seeking/hiding moves • Analysis may be incorrect (see work by Frank and Basin). Strategy Fusion and Non-locality: [Figures: example trees over World 1 and World 2]. Strengths of Monte-Carlo: • Simple to implement • Relatively fast. Analysis of PIMC: • Understanding the Success of Perfect Information Monte Carlo Sampling in Game Tree Search • Jeffrey Long and Nathan R. -
Problem Set 1: Fake It 'Til You Make It
Water, water, everywhere.... Problem Set 1: Fake It ’Til You Make It. Welcome to PCMI. We know you’ll learn a lot of mathematics here, maybe some new tricks, maybe some new perspectives on things with which you’re already familiar. Here’s a few things you should know about how the class is organized. • Don’t worry about answering all the questions. If you’re answering every question, we haven’t written the problem sets correctly. (Some problems are actually unsolved. Participants in this course have settled at least one unsolved problem.) • Don’t worry about getting to a certain problem number. Some participants have been known to spend the entire session working on one problem (and perhaps a few of its extensions or consequences). • Stop and smell the roses. Getting the correct answer to a question is not a be-all and end-all in this course. How does the question relate to others you’ve encountered? How do others think about this question? • Respect everyone’s views. Remember that you have something to learn from everyone else. Remember that everyone works at a different pace. • Teach only if you have to. You may feel tempted to teach others in your group. Fight it! We don’t mean you should ignore people, but don’t step on someone else’s aha moment. If you think it’s a good time to teach your colleagues about Bayes’ Theorem, don’t: problems should lead to appropriate mathematics rather than requiring it. The same goes for technology: problems should lead to using appropriate tools rather than requiring them. -
Chapter 7 Homework Answers
Chapter 7: Probability: Living with the Odds. Unit 7A. Time Out to Think: Pg. 417. Birth orders of BBG, BGB, and GBB are the outcomes that produce the event of two boys in a family of 3. We can represent the probability of two boys as P(2 boys). This problem is covered later in Example 3. Pg. 419. The only change from the Example 2 solution is that the event of a July 4 birthday is only 1 of the 365 possible outcomes, so the probability is 1/365. Pg. 421. (Should be covered only if you covered Unit 6D). The experiment is essentially a hypothesis test. The null hypothesis is that the coins are fair, in which case we should observe the theoretical [...] 9. a. The sum of all the probabilities for a probability distribution is always 1. 10. a. If Triple Treat’s probability of winning is 1/4, then the probability he loses is 3/4. This implies the odds that he wins is (1/4) ÷ (3/4), or 1 to 3, which, in turn, means the odds against are 3 to 1. Does It Make Sense? 7. Makes sense. The outcomes are HTTT, THTT, TTHT, TTTH. 8. Does not make sense. The probability of any event is always between 0 and 1. 9. Makes sense. This is a subjective probability, but it’s as good as any other guess one might make. 10. Does not make sense. The two outcomes in question are not equally likely. 11. Does not make sense. The sum of P(A) and P(not A) is always 1. -
Successful Nash Equilibrium Agent for a 3-Player Imperfect-Information Game
Successful Nash Equilibrium Agent for a 3-Player Imperfect-Information Game. Sam Ganzfried, www.ganzfriedresearch.com. Scope and applicability of game theory: • Strategic multiagent interactions occur in all fields: Economics and business: bidding in auctions, offers in negotiations; Political science/law: fair division of resources, e.g., divorce settlements; Biology/medicine: robust diabetes management (robustness against “adversarial” selection of parameters in MDP); Computer science: theory, AI, PL, systems; national security (e.g., deploying officers to protect ports), cybersecurity (e.g., determining optimal thresholds against phishing attacks), internet phenomena (e.g., ad auctions). Game theory background: Rock-Paper-Scissors payoffs (u1,u2), with rows for player 1 and columns rock, paper, scissors for player 2: Rock: (0,0), (-1,1), (1,-1); Paper: (1,-1), (0,0), (-1,1); Scissors: (-1,1), (1,-1), (0,0). • Players • Actions (aka pure strategies) • Strategy profile: e.g., (R,p) • Utility function: e.g., u1(R,p) = -1, u2(R,p) = 1. Zero-sum game: • Sum of payoffs is zero at each strategy profile: e.g., u1(R,p) + u2(R,p) = 0 • Models purely adversarial settings. Mixed strategies: • Probability distributions over pure strategies • E.g., R with prob. 0.6, P with prob. 0.3, S with prob. 0.1. Best response (aka nemesis): • Any strategy that maximizes payoff against opponent’s strategy • If P2 plays (0.6, 0.3, 0.1) for r,p,s, then a best response for P1 is to play P with probability 1. Nash equilibrium: • Strategy profile where all players simultaneously play a best response • Standard solution concept in game theory; guaranteed to always exist in finite games [Nash 1950] • In Rock-Paper-Scissors, the unique equilibrium is for both players to select each pure strategy with probability 1/3 • Theorem [Nash 1950]: Every game in strategic form G, with a finite number of players and in which every player has a finite number of pure strategies, has an equilibrium in mixed strategies. -
Name of Game Date of Approval Comments Nevada Gaming Commission Approved Gambling Games Effective August 1, 2021
NEVADA GAMING COMMISSION APPROVED GAMBLING GAMES EFFECTIVE AUGUST 1, 2021
NAME OF GAME | DATE OF APPROVAL | COMMENTS
1 – 2 PAI GOW POKER | 11/27/2007 | (V OF PAI GOW POKER)
1 BET THREAT TEXAS HOLD'EM | 9/25/2014 | NEW GAME
1 OFF TIE BACCARAT | 10/9/2018 |
2 – 5 – 7 POKER | 4/7/2009 | (V OF 3 – 5 – 7 POKER)
2 CARD POKER | 11/19/2015 | NEW GAME
2 CARD POKER - VERSION 2 | 2/2/2016 |
2 FACE BLACKJACK | 10/18/2012 | NEW GAME
2 FISTED POKER 21 | 5/1/2009 | (V OF BLACKJACK)
2 TIGERS SUPER BONUS TIE BET | 4/10/2012 | (V OF BACCARAT)
2 WAY WINNER | 1/27/2011 | NEW GAME
2 WAY WINNER - COMMUNITY BONUS | 6/6/2011 |
21 + 3 CLASSIC | 9/27/2000 |
21 + 3 CLASSIC - VERSION 2 | 8/1/2014 |
21 + 3 CLASSIC - VERSION 3 | 8/5/2014 |
21 + 3 CLASSIC - VERSION 4 | 1/15/2019 |
21 + 3 PROGRESSIVE | 1/24/2018 |
21 + 3 PROGRESSIVE - VERSION 2 | 11/13/2020 |
21 + 3 XTREME | 1/19/1999 | (V OF BLACKJACK)
21 + 3 XTREME - (PAYTABLE C) | 2/23/2001 |
21 + 3 XTREME - (PAYTABLES D, E) | 4/14/2004 |
21 + 3 XTREME - VERSION 3 | 1/13/2012 |
21 + 3 XTREME - VERSION 4 | 2/9/2012 |
21 + 3 XTREME - VERSION 5 | 3/6/2012 |
21 MADNESS | 9/19/1996 |
21 MADNESS SIDE BET | 4/1/1998 | (V OF 21 MADNESS)
21 MAGIC | 9/12/2011 | (V OF BLACKJACK)
21 PAYS MORE | 7/3/2012 | (V OF BLACKJACK)
21 STUD | 8/21/1997 | NEW GAME
21 SUPERBUCKS | 9/20/1994 | (V OF 21)
211 POKER | 7/3/2008 | (V OF POKER)
24-7 BLACKJACK | 4/15/2004 |
2G'$ | 12/11/2019 |
2ND CHANCE BLACKJACK | 6/19/2008 | NEW GAME
2ND CHANCE BLACKJACK – VERSION 2 | 9/24/2008 |
2ND CHANCE BLACKJACK – VERSION 3 | 4/8/2010 |
3 CARD | 6/24/2021 | NEW GAME
3 CARD BLITZ | 8/22/2019 | NEW GAME
3 CARD HOLD’EM | 11/21/2008 | NEW GAME
3 CARD HOLD’EM - VERSION 2 | 1/9/2009 |