“Algoritmos Para Um Jogador Inteligente De Poker” Autor: Vinícius

Total Page:16

File Type:pdf, Size:1020Kb

“Algoritmos Para Um Jogador Inteligente De Poker” Autor: Vinícius Universidade Federal De Santa Catarina Centro Tecnológico Bacharelado em Ciências da Computação “Algoritmos para um jogador inteligente de Poker” Autor: Vinícius Sousa Fazio Florianópolis/SC, 2008 Universidade Federal De Santa Catarina Centro Tecnológico Bacharelado em Ciências da Computação “Algoritmos para um jogador inteligente de Poker” Autor: Vinícius Sousa Fazio Orientador: Mauro Roisenberg Banca: Benjamin Luiz Franklin Banca: João Rosaldo Vollertt Junior Banca: Ricardo Azambuja Silveira Florianópolis/SC, 2008 AGRADECIMENTOS Agradeço a todos que me ajudaram no desenvolvimento deste trabalho, em especial ao professor Mauro Roisenberg e aos colegas de trabalho João Vollertt e Benjamin Luiz Franklin e ao Ricardo Azambuja Silveira pela participação na banca avaliadora. Algoritmos para um jogador inteligente de Poker – Vinícius Sousa Fazio 4 RESUMO Poker é um jogo simples de cartas que envolve aposta, blefe e probabilidade de vencer. O objetivo foi inventar e procurar algoritmos para jogadores artificiais evolutivos e estáticos que jogassem bem poker. O jogador evolutivo criado utiliza aprendizado por reforço, onde o jogo é dividido em uma grande matriz de estados com dados de decisões e recompensas. O jogador toma a decisão que obteve a melhor recompensa média em cada estado. Para comparar a eficiência do jogador, várias disputas com jogadores que tomam decisões baseados em fórmulas simples foram feitas. Diversas disputas foram feitas para comparar a eficiência de cada algoritmo e os resultados estão demonstrados graficamente no trabalho. Palavras-Chave: Aprendizado por Reforço, Inteligência Artificial, Poker; Algoritmos para um jogador inteligente de Poker – Vinícius Sousa Fazio 5 SUMÁRIO 1. Introdução........................................................................................................12 2. Fundamentação Teórica..................................................................................15 2.1. Regras do Poker Texas Hold'em................................................................16 2.1.1. Primeira rodada – Pre-flop..............................................................17 2.1.2. Segunda rodada – Flop....................................................................18 2.1.3. Terceira rodada – Turn....................................................................18 2.1.4. Quarta rodada – River.....................................................................19 2.1.5. Fim do jogo – Showdown................................................................19 2.1.6. Rankings..........................................................................................19 2.1.7. Vencedor..........................................................................................21 2.2. Análise do Poker.......................................................................................22 2.3. Jogadores Artificiais Estáticos.................................................................23 2.3.1. Aleatórios........................................................................................23 2.3.2. Constantes.......................................................................................23 2.3.3. Mathematically Fair Strategy - MFS..............................................24 2.3.4. Jean Rachlin e Gary Higgins - RH.................................................24 2.4. Jogadores Artificiais Evolutivos..............................................................25 2.4.1. Aprendizado por Reforço................................................................25 2.4.2. Algoritmo Genético.........................................................................26 2.5. Probabilidade de Vencer...........................................................................26 2.5.1. Força Bruta....................................................................................27 2.5.2. Monte Carlo....................................................................................27 2.6. Outras Soluções.......................................................................................27 3. Metodologia....................................................................................................29 3.1. Ferramenta e linguagem..........................................................................29 3.2. Treinamento.............................................................................................29 3.3. Verificação de desempenho.....................................................................31 3.4. Implementação dos Jogadores Artificiais................................................31 3.4.1. Aleatórios........................................................................................32 Algoritmos para um jogador inteligente de Poker – Vinícius Sousa Fazio 6 3.4.2. Constantes.......................................................................................32 3.4.3. Mathematically Fair Strategy - MFS..............................................32 3.4.4. Jean Rachlin e Gary Higgins – RH.................................................33 3.4.5. Aprendizado por Reforço................................................................34 3.5. Probabilidade de Vencer............................................................................37 4. Análise dos Resultados..................................................................................38 4.1. Verificação do método Monte-Carlo.........................................................38 4.2. Quantidade de mesas necessárias para avaliar um jogador.......................39 4.3. Constantes de configuração e evolução do Jogador por Reforço..............40 4.4. Disputas.....................................................................................................43 4.4.1. MFS x Aleatório................................................................................43 4.4.2. MFS x Constante..............................................................................44 4.4.3. Aleatório x Constante.......................................................................44 4.4.4. RH x Aleatório..................................................................................45 4.4.5. RH x Constante.................................................................................46 4.4.6. RH x MFS.........................................................................................47 4.4.7. Reforço x Aleatório..........................................................................48 4.4.8. Reforço x Constante.........................................................................48 4.4.9. Reforço x MFS..................................................................................49 4.4.10. Reforço x RH..................................................................................50 4.4.11. Disputa entre os melhores de cada.................................................50 4.4.12. Reforço x Humano..........................................................................51 5. Conclusão........................................................................................................52 6. Referências Bibliográficas.............................................................................54 7. Anexos.............................................................................................................56 7.1. playpoker.m..............................................................................................56 7.2. compute_victory.h....................................................................................66 7.3. getpower.h................................................................................................68 7.4. pokeronerank.h.........................................................................................73 7.5. pokerrank.h...............................................................................................75 7.6. test_pokerrank.h.......................................................................................77 7.7. createconstantplayer.h..............................................................................79 Algoritmos para um jogador inteligente de Poker – Vinícius Sousa Fazio 7 7.8. creatediscreteplayer.h...............................................................................80 7.9. createhumanplayer.h................................................................................86 7.10. createmfsplayer.h...................................................................................87 7.11. createrandomplayer.h.............................................................................88 7.12. createrhplayer.h......................................................................................89 7.13. createdatabase.h.....................................................................................90 7.14. traindiscrete.h.........................................................................................91 7.15. filldatabase.m.........................................................................................95 7.16. newsimutationdispute.m........................................................................97 7.17. newsimutationdispute2.m......................................................................99 7.18. simulatealldispute.h.............................................................................100 7.19. simulatehuman.h..................................................................................103 7.20. Artigo...................................................................................................104
Recommended publications
  • Pokerface: Emotion Based Game-Play Techniques for Computer Poker Players
    University of Kentucky UKnowledge University of Kentucky Master's Theses Graduate School 2004 POKERFACE: EMOTION BASED GAME-PLAY TECHNIQUES FOR COMPUTER POKER PLAYERS Lucas Cockerham University of Kentucky, [email protected] Right click to open a feedback form in a new tab to let us know how this document benefits ou.y Recommended Citation Cockerham, Lucas, "POKERFACE: EMOTION BASED GAME-PLAY TECHNIQUES FOR COMPUTER POKER PLAYERS" (2004). University of Kentucky Master's Theses. 224. https://uknowledge.uky.edu/gradschool_theses/224 This Thesis is brought to you for free and open access by the Graduate School at UKnowledge. It has been accepted for inclusion in University of Kentucky Master's Theses by an authorized administrator of UKnowledge. For more information, please contact [email protected]. ABSTRACT OF THESIS POKERFACE: EMOTION BASED GAME-PLAY TECHNIQUES FOR COMPUTER POKER PLAYERS Numerous algorithms/methods exist for creating computer poker players. This thesis compares and contrasts them. A set of poker agents for the system PokerFace are then introduced. A survey of the problem of facial expression recognition is included in the hopes it may be used to build a better computer poker player. KEYWORDS: Poker, Neural Networks, Arti¯cial Intelligence, Emotion Recognition, Facial Action Coding System Lucas Cockerham 7/30/2004 POKERFACE: EMOTION BASED GAME-PLAY TECHNIQUES FOR COMPUTER POKER PLAYERS By Lucas Cockerham Dr. Judy Goldsmith Dr. Grzegorz Wasilkowski 7/30/2004 RULES FOR THE USE OF THESES Unpublished theses submitted for the Master's degree and deposited in the University of Kentucky Library are as a rule open for inspection, but are to be used only with due regard to the rights of the authors.
    [Show full text]
  • An Artificial Intelligence Agent for Texas Hold'em Poker
    AN ARTIFICIAL INTELL IGENCE AGENT FOR TEXAS HOLD’EM PO KER PATRICK MCCURLEY – 0 62 4 91 7 90 2 An Artificial Intelligence Agent for Texas Hold’em Poker I declare that this document represents my own work except where otherwise stated. Signed …………………………………………………………………………. 08/05/ 2009 Patrick McCurley – 062491790 Introduction 3 TABLE OF CONTENTS 1. Introduction ................................................................................................................................................................ 7 1.1 Problem Description...................................................................................................................................... 7 1.2 Aims and Objectives....................................................................................................................................... 7 1.3 Dissertation Outline ....................................................................................................................................... 8 1.4 Ethics .................................................................................................................................................................... 8 2 Background................................................................................................................................................................10 2.1 Artificial Intelligence and Poker .............................................................................................................10 2.1.1 Problem Domain Realization .........................................................................................................10
    [Show full text]
  • Building a Champion Level Computer Poker Player
    University of Alberta Library Release Form Name of Author: Michael Bradley Johanson Title of Thesis: Robust Strategies and Counter-Strategies: Building a Champion Level Computer Poker Player Degree: Master of Science Year this Degree Granted: 2007 Permission is hereby granted to the University of Alberta Library to reproduce single copies of this thesis and to lend or sell such copies for private, scholarly or scientific research purposes only. The author reserves all other publication and other rights in association with the copyright in the thesis, and except as herein before provided, neither the thesis nor any substantial portion thereof may be printed or otherwise reproduced in any material form whatever without the author’s prior written permission. Michael Bradley Johanson Date: Too much chaos, nothing gets finished. Too much order, nothing gets started. — Hexar’s Corollary University of Alberta ROBUST STRATEGIES AND COUNTER-STRATEGIES: BUILDING A CHAMPION LEVEL COMPUTER POKER PLAYER by Michael Bradley Johanson A thesis submitted to the Faculty of Graduate Studies and Research in partial fulfillment of the requirements for the degree of Master of Science. Department of Computing Science Edmonton, Alberta Fall 2007 University of Alberta Faculty of Graduate Studies and Research The undersigned certify that they have read, and recommend to the Faculty of Graduate Studies and Research for acceptance, a thesis entitled Robust Strategies and Counter-Strategies: Building a Champion Level Computer Poker Player submitted by Michael Bradley Johanson in partial fulfillment of the requirements for the degree of Master of Science. Michael Bowling Supervisor Duane Szafron Michael Carbonaro External Examiner Date: To my family: my parents Brad and Sue Johanson, and my brother, Jeff Johanson.
    [Show full text]
  • Author's Personal Copy Artificial Intelligence Computer Poker: a Review
    Author's personal copy Artificial Intelligence 175 (2011) 958–987 Contents lists available at ScienceDirect Artificial Intelligence www.elsevier.com/locate/artint Computer poker: A review Jonathan Rubin, Ian Watson ∗ Department of Computer Science, University of Auckland, New Zealand article info abstract Article history: The game of poker has been identified as a beneficial domain for current AI research Received 19 February 2010 because of the properties it possesses such as the need to deal with hidden information Accepted 21 December 2010 and stochasticity. The identification of poker as a useful research domain has inevitably Available online 21 January 2011 resulted in increased attention from academic researchers who have pursued many separate avenues of research in the area of computer poker. The poker domain has Keywords: Computer poker often featured in previous review papers that focus on games in general, however a Imperfect information games comprehensive review paper with a specific focus on computer poker has so far been Nash equilibrium lacking in the literature. In this paper, we present a review of recent algorithms and Computational game theory approaches in the area of computer poker, along with a survey of the autonomous Opponent modelling poker agents that have resulted from this research. We begin with the first serious attempts to create strong computerised poker players by constructing knowledge-based and simulation-based systems. This is followed by the use of computational game theory to construct robust poker agents and the advances that have been made in this area. Approaches to constructing exploitive agents are reviewed and the challenging problems of creating accurate and dynamic opponent models are addressed.
    [Show full text]
  • On the Construction, Maintenance and Analysis of Case-Based Strategies in Computer Poker
    On the Construction, Maintenance and Analysis of Case-Based Strategies in Computer Poker Jonathan Rubin A thesis submitted in fulfilment of the requirements for the degree of Doctor of Philosophy in Computer Science, The University of Auckland, 2013. Abstract The state-of-the-art within Artificial Intelligence has directly benefited from research con- ducted within the computer poker domain. One such success has been the the advancement of bottom up equilibrium finding algorithms via computational game theory. On the other hand, alternative top down approaches, that attempt to generalise decisions observed within a col- lection of data, have not received as much attention. In this thesis we examine top down ap- proaches that use Case-Based Reasoning in order to construct strategies within the domain of computer poker. Our analysis begins with the development of frameworks to produce static strategies that do not change during game play. We trace the evolution of our case-based ar- chitecture and evaluate the effect that modifications have on strategy performance. The end result of our experimentation is a coherent framework for producing strong case-based strate- gies based on the observation and generalisation of expert decisions. Next, we introduce three augmentation procedures that extend the initial frameworks in order to produce case-based strategies that are able to adapt to changing game conditions and exploit weaknesses of their opponents. Two of the augmentation procedures introduce different forms of opponent mod- elling into the case-based strategies produced. A further extension investigates the use of trans- fer learning in order to leverage information between separate poker sub-domains.
    [Show full text]
  • A Demonstration of the Polaris Poker System
    A Demonstration of the Polaris Poker System Michael Bowling, Nicholas Abou Risk, Nolan Bard, Darse Billings, Neil Burch, Joshua Davidson, John Hawkin, Robert Holte, Michael Johanson, Morgan Kan, Bryce Paradis, Jonathan Schaeffer, David Schnizlein, Duane Szafron, Kevin Waugh, Martin Zinkevich Department of Computing Science University of Alberta Edmonton, AB, Canada T6G 2E8 poker.cs.ualberta.ca ABSTRACT depends on the type of game being played. Two common types Poker games provide a useful testbed for modern Artificial Intel- of betting rounds are limit and no-limit. After the betting round is ligence techniques. Unlike many classical game domains such as finished, three community cards are placed face up for everyone to chess and checkers, poker includes elements of imperfect informa- use to make a hand. This round is called the flop. The flop betting tion, stochastic events, and one or more adversarial agents to inter- round starts from the small blind. Two more rounds follow, the act with. Furthermore, in poker it is possible to win or lose by vary- turn and the river, with one community card each. After the river ing degrees. Therefore, it can be advantageous to adapt ones’ strat- betting round finishes, the player with the best five card poker hand egy to exploit a weak opponent. A poker agent must address these wins the pot. Texas Hold’em refers to the described game when challenges, acting in uncertain environments and exploiting other limit betting is used. agents, in order to be highly successful. Arguably, poker games Over the last three years, the AAAI Computer Poker Compe- more closely resemble many real world problems than games with tition[5] has been used as the measuring stick for progress in the perfect information.
    [Show full text]
  • Automated Lazy Metalearning in Introspective Reasoning Systems Doctor
    Doctoral theses at NTNU, 2020:193 al thesis T or Gunnar Høst Houeland Tor Gunnar Høst Houeland Automated lazy metalearning in introspective reasoning systems Doctor ISBN 978-82-326-4734-7 (printed ver.) ISBN 978-82-326-4735-4 (electronic ver.) 1503-8181 Doctor NTNU al theses at NTNU, 2020:193 al theses at NTNU, echnology Engineering Philosophiae Doctor Thesis for the Degree of Thesis for the Degree Department of Computer Science Faculty of Information Technology and Electrical Faculty of Information Technology Norwegian University of Science and T ,.1++.6/0,1#)+" &#/'/$,.0&##%.##,$&'),/,-&'#,!0,. .,+"&#'*1+# ,.3#%'++'2#./'04,$!'#+!#+"#!&+,),%4 !1)04,$7+$,.*0',+#!&+,),%4+")#!0.'!)+%'+##.'+% #-.0*#+0,$,*-10#.!'#+!# ,.3#%'++'2#./'04,$!'#+!#+"#!&+,),%4 &#/'/$,.0&##%.##,$&'),/,-&'#,!0,. !1)04,$7+$,.*0',+#!&+,),%4+")#!0.'!)+%'+##.'+% #-.0*#+0,$,*-10#.!'#+!# 5,.1++.6/0,1#)+" 7 -.'+0#"2#. 7 #)#!0.,+'!2#. 7 ,!0,.)0&#/#/0 .'+0#" 4.$'/(/#+0#. Abstract Machine learning systems are becoming increasingly widespread and important, and these days machine learning is used in some form in most industries. However, the application of machine learning technology still has a relatively high barrier to entry, requiring both machine learning expertise and domain knowledge. In this thesis, we present a metareasoning approach to multi-method machine learning that allows the system to adapt and optimize learning for a given domain automatically, without requiring human expert judgment. In contrast to popular deep learning methods for similar situations, the approach presented here does not require specialized hardware nor large data sets. Multiple machine learning components are continuously evaluated at run-time while solving problems, using a framework to analyze overall system performance based on observed prediction performance and time spent.
    [Show full text]
  • An Investigation Into Tournament Poker Strategy Using Evolutionary Algorithms
    An Investigation into Tournament Poker Strategy using Evolutionary Algorithms Richard G. Carter I V N E R U S E I T H Y T O H F G E R D I N B U Doctor of Philosophy Centre for Intelligent Systems and their Applications School of Informatics University of Edinburgh 2007 Abstract Poker has become the subject of an increasing amount of study in the computational in- telligence community. The element of imperfect information presents new and greater challenges than those previously posed by games such as checkers and chess. Ad- vances in computer poker have great potential, since reasoning under conditions of uncertainty is typical of many real world problems. To date the focus of computer poker research has centred on the development of ring game players for limit Texas hold’em. For a computer to compete in the most prestigious poker events, however, it will be required to play in a tournament setting with a no-limit betting structure. This thesis is the first academic attempt to investigate the underlying dynamics of successful no-limit tournament poker play. Professional players have proffered advice in the non-academic poker literature on correct strategies for tournament poker play. This study seeks to empirically validate their suggestions on a simplified no-limit Texas hold’em tournament framework. Starting by using exhaustive simulations, we first assess the hypothesis that a strat- egy including information related to game-specific factors performs better than one founded on hand strength knowledge alone. Specifically, we demonstrate that the use of information pertaining to one’s seating position, the opponents’ prior actions, the stage of the tournament, and one’s chip stack size all contribute towards a statistically significant improvement in the number of tournaments won.
    [Show full text]
  • A Learning AI Algorithm for Poker with Embedded Opponent Modeling
    International Journal of Fuzzy Logic and Intelligent Systems, vol. 10, no. 3, September 2010, pp.170-177 DOI : 10.5391/IJFIS.2010.10.3.170 A Learning AI Algorithm for Poker with Embedded Opponent Modeling Seong-gon Kim* and Yong-Gi Kim** *CISE department, University of Florida, USA **Dept of Computer Science, Gyeongsang National University, Korea Corresponding author: [email protected] Abstract Poker is a game of imperfect information where competing players must deal with multiple risk factors stemming from unknown information while making the best decision to win, and this makes it an interesting test-bed for artificial intelligence research. This paper introduces a new learning AI algorithm with embedded opponent modeling that can be used for these types of situations and we use this AI and apply it to a poker program. The new AI will be based on several graphs with each of its nodes representing inputs, and the algorithm will learn the optimal decision to make by updating the weight of the edges connecting these nodes and returning a probability for each action the graphs represent. Key Word : Learning AI Algorithm, Poker, Opponent Modeling 1. Introduction 2. Texas Hold’em Artificial intelligence has long been applied to games such There are several important aspects to consider when as Chess[1], Checkers[2][3], Go[4], Othello[5][6] and other playing Texas Hold’em. The first is the Pre-Flop Evaluation. various board games with positive results in beating human Pre-flop play in Hold’em has been extensively studied in players and has gained much public attention in the process.
    [Show full text]
  • Monte Carlo Tree Search and Opponent Modeling Through Player Clustering in No-Limit Texas Hold’Em Poker
    Monte Carlo Tree Search and Opponent Modeling through Player Clustering in no-limit Texas Hold'em Poker A.A.J. van der Kleij August 2010 Master thesis Artificial Intelligence University of Groningen, The Netherlands Internal supervisors: Dr. M.A. Wiering (Artificial Intelligence, University of Groningen) Prof. Dr. L. Schomaker (Artificial Intelligence, University of Groningen) Abstract Texas Hold'em Poker is a challenging card game in which players have to deal with uncertainty, imperfect information, deception and multiple competing agents. Predicting your opponents' moves and guessing their hidden cards are key elements. A computer Poker player thus requires an opponent modeling component that performs these tasks. We propose a clustering algorithm called "K-models clustering" that clusters players based on playing style. The algorithm does not use features that describe the players, but models that predict players' moves instead. The result is a partitioning of players into clusters and models for each of these clusters. Learning to reason about the hidden cards of a player is complicated by the fact that in over 90% of the games, a player's cards are not revealed. We in- troduce an expectation-maximization based algorithm that iteratively improves beliefs over the hidden cards of players whose cards were not revealed. After convergence, we obtain models that predict players' hidden cards. We built a computer Poker player using these algorithms and Monte Carlo Tree Search: an anytime algorithm that estimates the expected values of moves using simulations. The resulting program can hold its own in 2 player games against both novices and experienced human opponents.
    [Show full text]
  • Building Poker Agent Using Reinforcement Learning with Neural Networks
    Building Poker Agent Using Reinforcement Learning with Neural Networks Annija Rupeneite Faculty of Computing, University of Latvia, 19 Raina blvd., LV-1586 Riga, Latvia Keywords: Poker Game, Reinforcement Learning, Neural Networks. Abstract: Poker is a game with incomplete and imperfect information. The ability to estimate opponent and interpret its actions makes a player as a world class player. Finding optimal game strategy is not enough to win poker game. As in real life as in online poker game, the most time of it consists of opponent analysis. This paper illustrates a development of poker agent using reinforcement learning with neural networks. 1 STAGE OF THE RESEARCH adapt opponent play style and change its strategy during the game. Poker is a game with incomplete and imperfect information. The ability to estimate opponent and interpret its actions makes a player as a world class 3 RESEARCH PROBLEM player. Finding optimal game strategy is not enough to win poker game. As in real life as in online poker Games are to AI as grand prix racing is to game, the most time of it consists of opponent automobile design. analysis. Author is working on development of poker Poker game has become a field of interest for agent that would find optimal game strategy using artificial intelligence technologies. There are some reinforcement learning (RL) in combination with concealed cards in the game that makes impossible artificial neural network (ANN) for value function to calculate the final outcome of the hand. approximation. Therefore, artificial intelligence approach is used more often to develop online poker agents.
    [Show full text]
  • Applying Artificial Intelligence and Machine Learning Techniques To
    Applying Artificial Intelligence and Machine Learning Techniques to Create Varying Play Style in Artificial Game Opponents Nicholas John Sephton Doctor of Engineering University of York Computer Science October 2016 Abstract Artificial Intelligence is quickly becoming an integral part of the modern world, employed in almost every modern industry we interact with. Whether it be self-drive cars, integra- tion with our web clients or the creation of actual intelligent companions such as Xiaoice1, artificial intelligence is now an integrated and critical part of our daily existence. The application of artificial intelligence to games has been explored for several decades, with many agents now competing at a high level in strategic games which prove challenging for human players (e.g. Go [127] and Chess [73]). With artificial intelligence now able to produce strong opponents for many games, we are more concerned with the style of play of artificial agents, rather than simply their strength. Our work here focusses on the modification of artificial game opponents to create varied playstyle in complex games. We explore several techniques of modifying Monte Carlo Tree Search in an attempt to create different styles of play, thus changing the experience for human opponents playing against them. We also explore improving artificial agent strength, both by investigating parallelization of MCTS and by using Association Rule Mining to predict opponent’s choices, thus improving our ability to play well against them. 1http://www.msxiaoice.com/ 2 Contents Abstract 2 List of Contents 3 List of Figures 9 Acknowledgements 13 Author’s Declaration 15 1 Introduction 16 1.1 Background .
    [Show full text]