Minimax AI Within Non-Deterministic Environments

by Joe Dan Davenport, B.S.

A Thesis in Computer Science

Submitted to the Graduate Faculty of Texas Tech University in Partial Fulfillment of the Requirements for the Degree of MASTER OF SCIENCE

Approved

Dr. Nelson J. Rushton, Committee Chair
Dr. Richard Watson
Dr. Yuanlin Zhang

Mark Sheridan, Ph.D., Dean of the Graduate School

August, 2016

Copyright 2016, Joe Dan Davenport

Texas Tech University, Joe Dan Davenport, August 2016

Acknowledgements

There are a number of people to whom I owe my sincerest thanks for helping me arrive at this point in my academic career.

First and foremost, I wish to thank God, without whom many doors that were opened to me would have remained closed. I thank Dr. Nelson Rushton for his guidance both as a scientist and as a person, but even more so for believing in me enough to allow me to pursue my Master’s Degree when my grades did not suggest that I should. I would also like to thank Dr. Richard Watson, who originally enabled me to come to Texas Tech University and guided my path. I would also like to thank Joshua Archer and Bryant Nelson, not only for their companionship, but also for their patience and willingness to listen to my random thoughts; as well as my manager, Eric Hybner, for bearing with me through this.

Finally, I would like to thank some very important people in my life. To my Mother, I thank you for telling me I could be whatever I wanted to be, and meaning it. To Israel Perez, for always being there for all of us. To my children, for teaching me to parent. To my Gran, Doug, Robbi, Dakon, Trey, and Alex, thank you for always being ready to help. I could thank every member of my family for something, so I will say a simple thank you to each of you. Last, I want to thank my wife, Jessica. I thank her last because, after God, she is first in my life, and I would not be here without her love and support standing beside me.
Table of Contents

Acknowledgements
Table of Contents
Abstract
List of Figures
I. Introduction
    Motivation
    Problem Statement
    Overview
    Terms
II. Background
    State of the Art
    Minimax
        Background of Minimax
        Putting Minimax to Use
        Minimax applied in domains of higher complexity
        Improving performance
        Non-deterministic domains
    Expectiminimax
III. System Description
    Types
    Description
IV. Implementation Process
    Prototype 1 – Standard Minimax
    Prototype 2 – Expected Minimax with Static Expected Value
    Results
V. Conclusion & Future Work
    Conclusions
    Future Work
        Future Work – EMPV
        Future Work – Search Optimization
        Future Work – Stress Testing & Experimentation
References

Abstract

True artificial intelligence implementations are rarely seen in modern video games. Most reliable sources attribute this to unreliability in non-deterministic domains and to poor performance. In this research, we used a well-known decision algorithm, minimax, in an attempt to address both of these issues while still providing emergent intelligence to the system. Minimax has existed since 1928 and has been proven sound in games of perfect information such as chess or checkers. We refer to these games as deterministic domains, in which every action has a known outcome. We look to extend minimax to non-deterministic domains through the use of expectiminimax. The result of this research was that minimax and its variant, expectiminimax, were both applicable in a commercial game domain, providing acceptable performance and emergent intelligence in both deterministic and non-deterministic environments.

List of Figures

1. Standard Minimax Prototype UI
2. SMP Deep Copy Constructor
3. SMP AI Class
4. SMP vs EMP Successor Function
5. EMP Avoidance Function
6. EMP Mitigation Function
7. EMP expectedDamage Function
8. SMP vs EMP Time to Depth Comparison

Chapter I
Introduction

Motivation

While the video game industry has boomed over the past decade to an extremely respectable $61 billion in sales in 2015 (DiChristopher 2016), the artificial intelligence (henceforth AI) techniques being used are far from cutting edge. In a 2015 gamasutra.com article, one AI developer stated that his current AI implementations are based mostly around “AI techniques that were well-known in the 1970’s” (Graft 2015). The problem for most AI developers is that game AI needs to fit a commercially proven set of experience expectations. Where we employ techniques that produce emergent intelligence, such as machine learning and decision algorithms, this proves especially difficult. Assume a game designer wants to see a certain type of AI behavior repeated more often. The developer can’t simply step back to the code, change a few settings, and ask if the result is satisfactory. They must instead go back to their data set and attempt to retrain the agent to see whether they can produce the requested behavior.
Therefore, if we wish to introduce more modern AI techniques into modern video games, the proposed techniques must produce accurate and repeatable results that can be designed around. This is highly important to me as I feel games are as much a teaching and learning tool as they are a source of entertainment. Being able to produce more accurate