Enhancing AI-Based Game Playing Using Adaptive Data Structures

Total Pages: 16

File Type: PDF, Size: 1020 KB

Enhancing AI-based Game Playing Using Adaptive Data Structures

By Spencer Polk

A thesis submitted to the Faculty of Graduate Studies and Research in partial fulfilment of the requirements for the degree of Doctor of Philosophy

Ottawa-Carleton Institute for Computer Science
School of Computer Science
Carleton University
Ottawa, Ontario
June 2016

© Copyright 2016, Spencer Polk

The undersigned hereby recommend to the Faculty of Graduate Studies and Research acceptance of the thesis, Enhancing AI-based Game Playing Using Adaptive Data Structures, submitted by Spencer Polk.

Dr. Douglas Howe (Director, School of Computer Science)
Dr. B. John Oommen (Thesis Supervisor)
(External Examiner)

Carleton University, June 2016

Abstract

The design and analysis of strategies for playing strategic board games is a core area of Artificial Intelligence (AI) that has been studied extensively since the inception of the field. However, while two-player board games are very well known, comparatively little research has been done on multi-player games, where the number of self-interested, competing players is greater than two. Furthermore, known strategies for multi-player games have difficulty performing at a level of sophistication comparable to their two-player counterparts.

When progress in a field is stymied, promising approaches can be discovered by drawing inspiration from other, completely unrelated areas of Computer Science. The premise of this thesis is the hypothesis that game playing in general, and the problem of multi-player games in particular, can benefit from efficient ranking mechanisms for moves, board positions, or even players in the multi-player scenario. The research done in this work confirms this hypothesis. Indeed, we have discovered that such ranking information can be applied to improve game tree pruning, through the concept of move ordering, among other possibilities. In this thesis, we observe that the formerly unrelated field of Adaptive Data Structures (ADSs), which provides mechanisms by which a data structure can reorganize itself internally in response to queries, can supply a natural ranking mechanism. The primary motivation of this thesis is to demonstrate that low-cost ADS-based data structures can provide this ranking mechanism to game-playing engines and, furthermore, generate statistically significant improvements in their efficiency.

In this work, we conclusively prove that ADS-based techniques are able to enhance existing multi-player game playing strategies, and to perform competitively with state-of-the-art two-player techniques as well. We demonstrate, through two general-use, domain-independent move ordering heuristics, the Threat-ADS heuristic for multi-player games and the History-ADS heuristic for both two-player and multi-player games, that ADSs are indeed capable of achieving this improvement. We present an examination of their performance across a very wide range of game models and configurations, and with differing ADSs and refinements. We thus conclusively demonstrate that ADSs are able to achieve strong performance in game-playing engines in the vast majority of cases. Our work in this thesis not only provides these domain-independent, formal move ordering heuristics, but also serves as a strong example for future investigation into combinations of the fields of ADSs and game playing.
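To make the abstract's central mechanism concrete, here is a minimal sketch, written for this listing rather than extracted from the thesis, of how an adaptive list can double as a ranking mechanism for move ordering: a Move-to-Front list records "hits" (for example, observed opponent threats or strong moves), and the resulting list order is used to sort moves before the search expands them. The AdaptiveList class, the order_moves helper, and the toy usage are all illustrative assumptions.

```python
class AdaptiveList:
    """A minimal Move-to-Front adaptive list: elements that are accessed
    (hit) migrate to the head, so the current list order doubles as a
    cheap ranking of how recently each element mattered."""

    def __init__(self, elements):
        self._items = list(elements)

    def access(self, element):
        """Record a hit (e.g., an observed threat) and move the element to the front."""
        self._items.remove(element)
        self._items.insert(0, element)

    def rank(self, element):
        """Lower rank means closer to the head, i.e., more recently hit."""
        return self._items.index(element)


def order_moves(moves, ads, key):
    """Sort moves so those associated with highly ranked elements are searched first."""
    return sorted(moves, key=lambda move: ads.rank(key(move)))


# Illustrative usage in a hypothetical four-player game: rank opponents by threat.
opponents = AdaptiveList([1, 2, 3])              # opponent identifiers
opponents.access(3)                              # opponent 3 produced the latest threat
moves = [("capture", 2), ("block", 3), ("advance", 1)]
print(order_moves(moves, opponents, key=lambda m: m[1]))
# -> [('block', 3), ('advance', 1), ('capture', 2)]
```

In the thesis's terms, the Threat-ADS heuristic applies an update of this kind to opponents, ranked by the threats they generate, while the History-ADS heuristic applies it to the moves themselves.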
Dedicated with all my love to my parents and grandparents Nancy, Jeff, Elaine, and Leo

Acknowledgements

Above all, I express my utmost gratitude to my thesis advisor, Professor John Oommen, for his continuing patience, support, and kindness during this part of my voyage through life. Without his encouragement throughout this process, I would be looking back on what I had done with doubt, and forward to my future in academia and beyond with fear. Through his generous guidance and advice, the experience has rather been one of exploration and discovery. I would also like to sincerely thank him for providing me with the opportunity to augment my studies through lecturing. I will be grateful to him for all he has done for me for the rest of my life.

I would like to thank the members of the thesis proposal committee for their feedback on earlier drafts of this work. I am grateful for their experienced advice, which has helped guide me in my research and the design of my experiments, as well as the formulation of this final work.

I would also like to express my thanks to my co-workers and supervisors at Purple Forge for allowing me unprecedented leeway in shaping my work schedule to fit my academic goals. I would like to especially thank the founder and CEO, Brian Hurley, who not only accommodated my unpredictable work/school schedule, but also provided me with valuable advice on the commercial applications of my research in the industry, as well as an opportunity to experience "real-world" AI systems.

I am grateful to the School of Computer Science, its administrative staff, and its faculty for all of their support throughout these long years. In particular, I am thankful to Dr. Douglas Howe for providing me with the opportunity to teach a course during my time as a graduate student.

Finally, I thank my family for encouraging and supporting me well beyond anything I could have asked for through this process, and for showing me infinite patience when it concerned my tendency to ramble on about my research. I am sincerely grateful for being shown such love and patience. Thank you all from the bottom of my heart!

Contents

1 Introduction
  1.1 Problem Overview and Motivation
  1.2 Problem Approach
    1.2.1 Verification of the Approach
    1.2.2 Complexity of the Verification
  1.3 Contributions of the Thesis
  1.4 Structure of the Thesis
2 Survey of the Field
  2.1 Chapter Overview
  2.2 Families of Games
  2.3 Game Playing: Overview
    2.3.1 Historical Overview
    2.3.2 Game Trees
    2.3.3 Branch and Bound
  2.4 Two-Player Game Playing
    2.4.1 The Mini-Max Algorithm
    2.4.2 Alpha-Beta Search
    2.4.3 Move Ordering
      2.4.3.1 Killer Moves and the History Heuristic
    2.4.4 Other Alpha-Beta Related Techniques
      2.4.4.1 Transposition Tables
      2.4.4.2 The Horizon Effect and Quiescence Search
      2.4.4.3 Stochastic Improvements
      2.4.4.4 Handling Imperfect Information
    2.4.5 Non-Alpha-Beta Strategies for Two-Player Games
  2.5 Multi-Player Game Playing
    2.5.1 Challenges of Multi-Player Games
    2.5.2 The Paranoid Algorithm
    2.5.3 The Max-N Algorithm
      2.5.3.1 Extensions to Max-N
    2.5.4 The Best-Reply Search
  2.6 Stochastic Methods
    2.6.1 Monte Carlo Tree Search
    2.6.2 The UCT Algorithm
  2.7 The Improving Agent: Adaptive Data Structures
    2.7.1 Overview of ADSs
    2.7.2 List-Based ADSs
      2.7.2.1 Move-to-Front
      2.7.2.2 Transposition
      2.7.2.3 Move-Ahead-k
      2.7.2.4 POS(k)
      2.7.2.5 Stochastic Variants
    2.7.3 Tree-Based ADSs
  2.8 Chapter Summary
3 Improving the Best-Reply Search using ADS
  3.1 Chapter Overview
  3.2 Move Ordering
    3.2.1 Benefits of Move Ordering
    3.2.2 Examples of Move Ordering
  3.3 Opponent Threat Level
    3.3.1 Properties of Opponent Threat
    3.3.2 Opponent Threat and Move Ordering
  3.4 Managing Opponent Threat Level Using an ADS
  3.5 The Threat-ADS Heuristic
    3.5.1 Developing the Threat-ADS Heuristic
    3.5.2 Salient Features of the Threat-ADS Heuristic
  3.6 Experimental Model
  3.7 Prima Facie Experiments
  3.8 Variation on the Number of Players
    3.8.1 Results
  3.9 Variation on the Starting State of the Game
    3.9.1 Results
  3.10 Evaluation of Results
  3.11 Chapter Conclusions
4 Improvements to the Threat-ADS Heuristic
  4.1 Chapter Overview
  4.2 Open Questions
  4.3 Threat-ADS' Behaviour Using Different ADSs
  4.4 Ergodic versus Absorbing ADSs
  4.5 Investigating the Threat-ADS' Performance at Different Ply Levels
  4.6 Experimental Model
  4.7 Initial Board Position Experiments
  4.8 Midgame Starting Position Experiments
  4.9 Discussion
  4.10 Chapter Conclusions
5 History-Based Move Ordering using ADSs
  5.1 Chapter Overview
  5.2 Motivation
  5.3 Move Ordering Using Move History
  5.4 Managing Move History Using an ADS
  5.5 The History-ADS Heuristic
    5.5.1 Developing the History-ADS Heuristic
    5.5.2 Salient Features of the History-ADS Heuristic
  5.6 Experimental Model
  5.7 Results for Two-Player Games
  5.8 Results for Multi-Player Games
  5.9 Discussion
  5.10 Chapter Conclusions
6 Refinements of the History-ADS Heuristic
  6.1 Chapter Overview
  6.2 Open Questions
Recommended publications
  • Minimax TD-Learning with Neural Nets in a Markov Game
Minimax TD-Learning with Neural Nets in a Markov Game
Fredrik A. Dahl and Ole Martin Halck
Norwegian Defence Research Establishment (FFI), P.O. Box 25, NO-2027 Kjeller, Norway
{Fredrik-A.Dahl, Ole-Martin.Halck}@ffi.no

Abstract. A minimax version of temporal difference learning (minimax TD-learning) is given, similar to minimax Q-learning. The algorithm is used to train a neural net to play Campaign, a two-player zero-sum game with imperfect information of the Markov game class. Two different evaluation criteria for evaluating game-playing agents are used, and their relation to game theory is shown. Practical aspects of linear programming and fictitious play used for solving matrix games are also discussed.

1 Introduction

An important challenge to artificial intelligence (AI) in general, and machine learning in particular, is the development of agents that handle uncertainty in a rational way. This is particularly true when the uncertainty is connected with the behavior of other agents. Game theory is the branch of mathematics that deals with these problems, and indeed games have always been an important arena for testing and developing AI. However, almost all of this effort has gone into deterministic games like chess, go, othello and checkers. Although these are complex problem domains, uncertainty is not their major challenge. With the successful application of temporal difference learning, as defined by Sutton [1], to the dice game of backgammon by Tesauro [2], random games were included as a standard testing ground. But even backgammon features perfect information, which implies that both players always have the same information about the state of the game.
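The abstract names minimax TD-learning only briefly, so a compact sketch may help fix ideas. The following toy Python fragment is written for this listing rather than taken from the paper: it shows a simplified minimax-style temporal-difference backup for a tabular two-player zero-sum Markov game. The state/action sizes are invented, and the pure-strategy max-min used for the state value is a simplification of the mixed-strategy matrix-game solution (via linear programming or fictitious play) that the paper discusses; the paper also uses a neural network rather than a table.

```python
import numpy as np

# Hypothetical sizes for a toy zero-sum Markov game.
N_STATES, N_ACTIONS, N_OPP_ACTIONS = 5, 3, 3

# Q[s, a, o]: value when we play action a and the opponent plays o in state s.
Q = np.zeros((N_STATES, N_ACTIONS, N_OPP_ACTIONS))

def state_value(Q, s):
    """Minimax value of the one-step matrix game at state s.
    Pure-strategy max-min is used here for brevity; the full method
    would solve for a mixed strategy (e.g., by linear programming)."""
    return np.max(np.min(Q[s], axis=1))

def minimax_td_update(Q, s, a, o, reward, s_next, alpha=0.1, gamma=0.95):
    """TD-style backup of Q[s, a, o] toward the minimax value of the successor."""
    target = reward + gamma * state_value(Q, s_next)
    Q[s, a, o] += alpha * (target - Q[s, a, o])

# Example transition: in state 0 we play action 1, the opponent plays 2,
# we receive reward +1 and land in state 3.
minimax_td_update(Q, s=0, a=1, o=2, reward=1.0, s_next=3)
```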
  • Bayesian Games Professors Greenwald 2018-01-31
Bayesian Games
Professor Greenwald, 2018-01-31

We describe incomplete-information, or Bayesian, normal-form games (formally; no examples), and the corresponding equilibrium concepts.

1 A Bayesian Model of Interaction

A Bayesian, or incomplete-information, game is a generalization of a complete-information game. Recall that in a complete-information game, the game is assumed to be common knowledge. In a Bayesian game, in addition to what is common knowledge, players may have private information. This private information is captured by the notion of an epistemic type, which describes a player's knowledge. The Bayesian-game formalism makes two simplifying assumptions:

• Any information that is privy to any of the players pertains only to utilities. In all realizations of a Bayesian game, the number of players and their actions are identical.

• Players maintain beliefs about the game (i.e., about utilities) in the form of a probability distribution over types. Prior to receiving any private information, this probability distribution is common knowledge: i.e., it is a common prior. After receiving private information, players condition on this information to update their beliefs. As a consequence of the common prior assumption, any differences in beliefs can be attributed entirely to differences in information.

Rational players are again assumed to maximize their utility. But further, they are assumed to update their beliefs when they obtain new information via Bayes' rule. Thus, in a Bayesian game, in addition to players, actions, and utilities, there is a type space T = ∏_{i∈[n]} T_i, where T_i is the type space of player i. There is also a common prior F, which is a probability distribution over type profiles.
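The notes define the type space and common prior in prose; the LaTeX block below is a standard formalization added for this listing (the symbols G, A_i, u_i, s, and U_i are introduced here and do not appear verbatim in the excerpt): a Bayesian game as a tuple, and a player's interim expected utility obtained by conditioning the common prior on the player's own type.

```latex
% Requires amsmath/amssymb. A sketch of the standard definitions, not quoted from the notes.
\[
  G \;=\; \bigl( [n],\; A = \textstyle\prod_{i \in [n]} A_i,\;
                 T = \textstyle\prod_{i \in [n]} T_i,\; F,\; (u_i)_{i \in [n]} \bigr),
  \qquad u_i : A \times T \to \mathbb{R}.
\]
% Given a strategy profile s with s_j : T_j -> A_j, player i's interim expected
% utility at type t_i conditions the common prior F on t_i:
\[
  U_i(s \mid t_i) \;=\;
  \mathbb{E}_{t_{-i} \sim F(\cdot \mid t_i)}
  \bigl[\, u_i\bigl(s_i(t_i),\, s_{-i}(t_{-i});\, t \bigr) \,\bigr].
\]
```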
  • Outline for Static Games of Complete Information I
Outline for Static Games of Complete Information

I. Definition of a game
II. Examples
III. Definition of Nash equilibrium
IV. Examples, continued
V. Iterated elimination of dominated strategies
VI. Mixed strategies
VII. Existence theorem on Nash equilibria
VIII. The Hotelling model and extensions

Copyright © 2004 by Lawrence M. Ausubel

Definition: An n-player, static game of complete information consists of an n-tuple of strategy sets and an n-tuple of payoff functions, denoted by G = {S1, …, Sn; u1, …, un}. Si, the strategy set of player i, is the set of all permissible moves for player i. We write si ∈ Si for one of player i's strategies. ui, the payoff function of player i, is the utility, profit, etc. for player i, and depends on the strategies chosen by all the players: ui(s1, …, sn).

Example: Prisoners' Dilemma (rows: Prisoner I, columns: Prisoner II; payoffs listed as (I, II))

                   Remain Silent   Confess
    Remain Silent     -1, -1        -5, 0
    Confess            0, -5        -4, -4

Example: Battle of the Sexes (rows: M, columns: F; payoffs listed as (M, F))

              Boxing   Ballet
    Boxing     2, 1     0, 0
    Ballet     0, 0     1, 2

Definition: A Nash equilibrium of G (in pure strategies) consists of a strategy for every player with the property that no player can improve her payoff by unilaterally deviating: (s1*, …, sn*) with the property that, for every player i,

    ui(s1*, …, si-1*, si*, si+1*, …, sn*) ≥ ui(s1*, …, si-1*, si, si+1*, …, sn*)  for all si ∈ Si.

Equivalently, a Nash equilibrium is a mutual best response. That is, for every player i, si* is a solution to:

    si* ∈ arg max_{si ∈ Si} ui(s1*, …, si-1*, si, si+1*, …, sn*)

Example: Prisoners' Dilemma (Prisoner II
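To accompany the Nash equilibrium definition above, the following short Python sketch, added for this listing and not part of Ausubel's outline, enumerates the pure strategy profiles of a bimatrix game and keeps those satisfying the mutual best response property, using the Prisoners' Dilemma payoffs shown above.

```python
import itertools
import numpy as np

def pure_nash_equilibria(payoff_1, payoff_2):
    """Return all pure-strategy profiles (i, j) at which neither player can
    improve by unilaterally deviating.  payoff_1 and payoff_2 are the row
    and column players' payoff matrices."""
    equilibria = []
    rows, cols = payoff_1.shape
    for i, j in itertools.product(range(rows), range(cols)):
        row_best = payoff_1[i, j] >= payoff_1[:, j].max()   # no better row deviation
        col_best = payoff_2[i, j] >= payoff_2[i, :].max()   # no better column deviation
        if row_best and col_best:
            equilibria.append((i, j))
    return equilibria

# Prisoners' Dilemma: strategy 0 = Remain Silent, 1 = Confess.
u1 = np.array([[-1, -5], [0, -4]])   # Prisoner I's payoffs
u2 = np.array([[-1,  0], [-5, -4]])  # Prisoner II's payoffs
print(pure_nash_equilibria(u1, u2))  # -> [(1, 1)], i.e., (Confess, Confess)
```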
  • Efficiency and Welfare in Economies with Incomplete Information∗
Efficiency and Welfare in Economies with Incomplete Information
George-Marios Angeletos (MIT, NBER, and FRB of Minneapolis) and Alessandro Pavan (Northwestern University)
September 2005

Abstract. This paper examines a class of economies with externalities, strategic complementarity or substitutability, and incomplete information. We first characterize efficient allocations and compare them to equilibrium. We show how the optimal degree of coordination and the efficient use of information depend on the primitives of the environment, and how this relates to the welfare losses due to volatility and heterogeneity. We next examine the social value of information in equilibrium. When the equilibrium is efficient, welfare increases with the transparency of information if and only if agents' actions are strategic complements. When the equilibrium is inefficient, additional effects are introduced by the interaction of externalities and information. We conclude with a few applications, including investment complementarities, inefficient fluctuations, and market competition.

Keywords: Social value of information, coordination, higher-order beliefs, externalities, transparency.

* An earlier version of this paper was entitled "Social Value of Information and Coordination." For useful comments we thank Daron Acemoglu, Gadi Barlevi, Olivier Blanchard, Marco Bassetto, Christian Hellwig, Kiminori Matsuyama, Stephen Morris, Thomas Sargent, and especially Iván Werning. Email addresses: [email protected], [email protected].

1 Introduction

Is coordination socially desirable? Do markets use available information efficiently? Is the private collection of information good from a social perspective? What are the welfare effects of the public information disseminated by prices, market experts, or the media? Should central banks and policy makers disclose the information they collect and the forecasts they make about the economy in a transparent and timely fashion, or is there room for "constructive ambiguity"? These questions are non-trivial.
  • Lecture 3 1 Introduction to Game Theory 2 Games of Complete
CSCI699: Topics in Learning and Game Theory, Lecture 3
Lecturer: Shaddin Dughmi. Scribes: Brendan Avent, Cheng Cheng

1 Introduction to Game Theory

Game theory is the mathematical study of interaction among rational decision makers. The goal of game theory is to predict how agents behave in a game. For instance, poker, chess, and rock-paper-scissors are all forms of widely studied games. To formally define the concepts in game theory, we use Bayesian Decision Theory. Explicitly:

• Ω is the set of future states. For example, in rock-paper-scissors, the future states can be 0 for a tie, 1 for a win, and -1 for a loss.
• A is the set of possible actions. For example, the hand forms of rock, paper, and scissors in the game of rock-paper-scissors.
• For each a ∈ A, there is a distribution x(a) over Ω, and an agent believes he will receive ω ∼ x(a) if he takes action a.
• A rational agent will choose an action according to Expected Utility theory; that is, each agent has their own utility function u : Ω → R, and chooses an action a* ∈ A that maximizes the expected utility.
  – Formally, a* ∈ arg max_{a ∈ A} E_{ω ∼ x(a)}[u(ω)].
  – If there are multiple actions that yield the same maximized expected utility, the agent may randomly choose among them.

2 Games of Complete Information

2.1 Normal Form Games

In games of complete information, players act simultaneously and each player's utility is determined by his actions as well as other players' actions. The payoff structure of the game (i.e., the map from action profiles to utility vectors) is common knowledge to all players in the game.
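The decision rule in the bullets above can be made concrete with a few lines of Python. The outcome distributions below are invented solely for illustration; the snippet simply evaluates each action's expected utility and returns the arg max, mirroring a* ∈ arg max_{a ∈ A} E_{ω ∼ x(a)}[u(ω)].

```python
# Toy expected-utility maximizer; the probabilities are made up for illustration.
outcome_dists = {
    # action -> {outcome: probability}; outcomes: 1 = win, 0 = tie, -1 = loss
    "rock":     {1: 0.30, 0: 0.40, -1: 0.30},
    "paper":    {1: 0.45, 0: 0.20, -1: 0.35},
    "scissors": {1: 0.25, 0: 0.35, -1: 0.40},
}

def utility(outcome):
    """A risk-neutral utility: the outcome value itself."""
    return float(outcome)

def best_action(dists):
    """Return the action maximizing expected utility under its outcome distribution."""
    def expected_utility(action):
        return sum(p * utility(w) for w, p in dists[action].items())
    return max(dists, key=expected_utility)

print(best_action(outcome_dists))  # -> "paper" under these made-up probabilities
```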
  • Bayesian Games with Intentions
Bayesian Games with Intentions
Adam Bjorndahl (Carnegie Mellon University, Philosophy, Pittsburgh, PA 15213), Joseph Y. Halpern (Cornell University, Computer Science, Ithaca, NY 14853), Rafael Pass (Cornell University, Computer Science, Ithaca, NY 14853)
[email protected] [email protected] [email protected]

We show that standard Bayesian games cannot represent the full spectrum of belief-dependent preferences. However, by introducing a fundamental distinction between intended and actual strategies, we remove this limitation. We define Bayesian games with intentions, generalizing both Bayesian games and psychological games [5], and prove that Nash equilibria in psychological games correspond to a special class of equilibria as defined in our setting.

1 Introduction

Type spaces were introduced by John Harsanyi [6] as a formal mechanism for modeling games of incomplete information where there is uncertainty about players' payoff functions. Broadly speaking, types are taken to encode payoff-relevant information, a typical example being how each participant values the items in an auction. An important feature of this formalism is that types also encode beliefs about types. Thus, a type encodes not only a player's beliefs about other players' payoff functions, but a whole belief hierarchy: a player's beliefs about other players' beliefs, their beliefs about other players' beliefs about other players' beliefs, and so on. This latter point has been enthusiastically embraced by the epistemic game theory community, where type spaces have been co-opted for the analysis of games of complete information. In this context, types encode beliefs about the strategies used by other players in the game as well as their types.
  • Part II Dynamic Games of Complete Information
Part II: Dynamic Games of Complete Information

10 Preliminaries

As we have seen, the normal form representation is a very general way of putting a formal structure on strategic situations, thus allowing us to analyze the game and reach some conclusions about what will result from the particular situation at hand. However, one obvious drawback of the normal form is its difficulty in capturing time. That is, there is a sense in which players' strategy sets correspond to what they can do, and how the combination of their actions affects each other's payoffs, but how is the order of moves captured? More important, if there is a well defined order of moves, will this have an effect on what we would label as a reasonable prediction of our model?

Example: Sequencing the Cournot Game: Stackelberg Competition

Consider our familiar Cournot game with demand P = 100 − q, where q = q1 + q2, and ci(qi) = 0 for i ∈ {1, 2}. We have already solved for the best response choices of each firm, by solving their maximum profit function when each takes the other's quantity as fixed; that is, taking qj as fixed, each firm i solves:

    max_{qi} (100 − qi − qj) qi ,

and the best response is then

    qi = (100 − qj) / 2 .

Now let's change a small detail of the game, and assume that first, firm 1 will choose q1, and before firm 2 makes its choice of q2 it will observe the choice made by firm 1. From the fact that firm 2 maximizes its profit when q1 is already known, it should be clear that firm 2 will follow its best response function, since it not only has a belief, but this belief must be correct due to the observation of q1.
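To make the sequencing point concrete, the sketch below, added for this listing and not part of the book excerpt, backward-inducts the Stackelberg outcome for the same demand P = 100 − q with zero costs: firm 2 plays its best response q2 = (100 − q1)/2, and firm 1 chooses q1 anticipating that reaction. A simple grid search stands in for the analytic solution.

```python
def follower_best_response(q1, a=100.0):
    """Firm 2's best response to the leader's quantity under P = a - q, zero costs."""
    return (a - q1) / 2.0

def leader_profit(q1, a=100.0):
    """Firm 1's profit when firm 2 reacts optimally to q1."""
    q2 = follower_best_response(q1, a)
    price = a - q1 - q2
    return price * q1

# Backward induction via a simple grid search over the leader's quantity.
q1_star = max((x / 100.0 for x in range(0, 10001)), key=leader_profit)
q2_star = follower_best_response(q1_star)
print(q1_star, q2_star)  # -> 50.0 and 25.0, the Stackelberg quantities
```

Compared with the simultaneous-move Cournot outcome (100/3 for each firm), moving first lets firm 1 commit to a larger quantity, which is exactly the effect the sequencing question above is pointing at.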
  • Ex-Post Stability in Large Games (Draft, Comments Welcome)
Ex-Post Stability in Large Games (Draft, Comments Welcome)
Ehud Kalai

Abstract. The equilibria of strategic games with many semi-anonymous players have a strong ex-post Nash property. Even with perfect hindsight about the realized types and selected actions of all his opponents, no player has an incentive to revise his own chosen action. This is illustrated for normal form and for one-shot Bayesian games with statistically independent types, provided that a certain continuity condition holds. Implications of this phenomenon include strong robustness properties of such equilibria and a strong purification result for large anonymous games.

1. Introduction and Summary

In a one-shot game with many semi-anonymous players, all the equilibria are approximately ex-post Nash. Even with perfect hindsight about the realized types and actions of all his opponents, no player regrets, or has an incentive to unilaterally change, his own selected action. This phenomenon holds for normal form games and for one-shot Bayesian games with statistically independent types, provided that the payoff functions are continuous. Moreover, the ex-post Nash property is obtained uniformly, simultaneously for all the equilibria of all the games in certain large classes, at an exponential rate in the number of players. When restricted to normal form games, the above means that with probability close to one, the play of any mixed strategy equilibrium must produce a vector of pure strategies that is an epsilon equilibrium of the game. At an equilibrium of a Bayesian game, the vector of realized pure actions must be an epsilon equilibrium of the complete information game in which the realized vector of player types is common knowledge.
  • The Role of Imperfect Information
Chapter 3.1 The Role of Imperfect Information

An obvious source of approaches to strategy formulation is the field of classical strategic games. Classical game-tree search techniques have been highly successful in classical games of strategy such as chess, checkers, othello, backgammon, and the like. However, all of these games are perfect-information games: each player has perfect information about the current state of the game at all points during the game. Unlike classical strategic games, practical adversarial reasoning problems force the decision-maker to solve the problem in the environment of highly imperfect information. Some of the game-tree search techniques used for perfect-information games can also be used in imperfect-information games, but only with substantial modifications. In this chapter,[1] we classify and describe techniques for game-tree search in imperfect information games. In addition, we offer case studies of how these techniques have been applied to two imperfect-information games: Texas Hold'em, which is a well-known poker variant, and kriegspiel chess, an imperfect-information variant of chess that is the progenitor of modern military wargaming [15].

Classical Game-Tree Search

In order to understand how to use game-tree search in imperfect-information games, it is first necessary to understand how it works in perfect-information games. Most game-tree search algorithms have been designed for use on games that satisfy the following assumptions:

[1] This work was supported by the following grants, contracts, and awards: ARO grant DAAD190310202, ARL grants DAAD190320026 and DAAL0197K0135, the ARL CTAs on Telecommunications and Advanced Decision Architectures, NSF grants IIS0329851, 0205489 and IIS0412812, UC Berkeley contract number SA451832441 (subcontract from DARPA's REAL program).
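Since this excerpt presumes familiarity with classical perfect-information game-tree search, a minimal reference sketch may be helpful. The code below is a generic depth-limited minimax with alpha-beta pruning, written for this listing rather than taken from the chapter; the game-state interface it assumes (legal_moves, apply, is_terminal, evaluate) is hypothetical.

```python
import math

def alphabeta(state, depth, alpha=-math.inf, beta=math.inf, maximizing=True):
    """Depth-limited minimax with alpha-beta pruning for a two-player,
    zero-sum, perfect-information game.  `state` is assumed to expose
    legal_moves(), apply(move), is_terminal(), and evaluate(): a
    hypothetical interface used only for this sketch."""
    if depth == 0 or state.is_terminal():
        return state.evaluate(), None

    best_move = None
    if maximizing:
        value = -math.inf
        for move in state.legal_moves():
            child_value, _ = alphabeta(state.apply(move), depth - 1, alpha, beta, False)
            if child_value > value:
                value, best_move = child_value, move
            alpha = max(alpha, value)
            if alpha >= beta:   # remaining siblings cannot affect the result
                break
    else:
        value = math.inf
        for move in state.legal_moves():
            child_value, _ = alphabeta(state.apply(move), depth - 1, alpha, beta, True)
            if child_value < value:
                value, best_move = child_value, move
            beta = min(beta, value)
            if beta <= alpha:
                break
    return value, best_move
```

In imperfect-information games such as those discussed in this chapter, this procedure cannot be applied directly, because the "current state" is not fully observable; that is precisely the modification problem the excerpt goes on to classify.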
  • On Information Design in Games
On Information Design in Games
Laurent Mathevet (New York University), Jacopo Perego (Columbia University), Ina Taneva (University of Edinburgh)
February 6, 2020

Abstract. Information provision in games influences behavior by affecting agents' beliefs about the state, as well as their higher-order beliefs. We first characterize the extent to which a designer can manipulate agents' beliefs by disclosing information. We then describe the structure of optimal belief distributions, including a concave-envelope representation that subsumes the single-agent result of Kamenica and Gentzkow (2011). This result holds under various solution concepts and outcome selection rules. Finally, we use our approach to compute an optimal information structure in an investment game under adversarial equilibrium selection.

Keywords: information design, belief manipulation, belief distributions, extremal decomposition, concavification. JEL Classification: C72, D82, D83.

We are grateful to the editor, Emir Kamenica, and to three anonymous referees for their comments and suggestions, which significantly improved the paper. Special thanks are due to Anne-Katrin Roesler for her insightful comments as a discussant at the 2016 Cowles Foundation conference, and to Philippe Jehiel, Elliot Lipnowski, Stephen Morris, David Pearce, József Sákovics, and Siyang Xiong for their helpful comments and conversations. We are also grateful to Andrew Clausen, Olivier Compte, Jon Eguia, Willemien Kets, Matt Jackson, Qingmin Liu, Efe Ok, Ennio Stacchetti, and Max Stinchcombe, as well as to seminar audiences at Columbia University, Institut d'Anàlisi Econòmica, New York University, Michigan State University, Paris School of Economics, Stony Brook University, University of Cambridge, University of Edinburgh, UT Austin, the SIRE BIC Workshop, the 2016 Decentralization conference, the 2016 Canadian Economic Theory conference, the 2016 Cowles Foundation conference, the 2016 North American Summer Meeting of the Econometric Society, and SAET 2017.
  • Bayesian Action-Graph Games
Bayesian Action-Graph Games
Albert Xin Jiang and Kevin Leyton-Brown
Department of Computer Science, University of British Columbia
[email protected] [email protected]

Abstract. Games of incomplete information, or Bayesian games, are an important game-theoretic model and have many applications in economics. We propose Bayesian action-graph games (BAGGs), a novel graphical representation for Bayesian games. BAGGs can represent arbitrary Bayesian games, and furthermore can compactly express Bayesian games exhibiting commonly encountered types of structure including symmetry, action- and type-specific utility independence, and probabilistic independence of type distributions. We provide an algorithm for computing expected utility in BAGGs, and discuss conditions under which the algorithm runs in polynomial time. Bayes-Nash equilibria of BAGGs can be computed by adapting existing algorithms for complete-information normal form games and leveraging our expected utility algorithm. We show both theoretically and empirically that our approaches improve significantly on the state of the art.

1 Introduction

In the last decade, there has been much research at the interface of computer science and game theory (see e.g. [19, 22]). One fundamental class of computational problems in game theory is the computation of solution concepts of a finite game. Much of current research on computation of solution concepts has focused on complete-information games, in which the game being played is common knowledge among the players. However, in many multi-agent situations, players are uncertain about the game being played. Harsanyi [10] proposed games of incomplete information (or Bayesian games) as a mathematical model of such interactions.
  • Externalities
7 Externalities

7.1 Introduction

An externality is a link between economic agents that lies outside the price system of the economy. Everyday examples include the pollution from a factory that harms a local fishery and the envy that is felt when a neighbor proudly displays a new car. Such externalities are not controlled directly by the choices of those affected: the fishery cannot choose to buy less pollution, nor can you choose to buy your neighbor a worse car. This prevents the efficiency theorems described in chapter 2 from applying. Indeed, the demonstration of market efficiency was based on the following two presumptions:

• The welfare of each consumer depended solely on her own consumption decision.
• The production of each firm depended only on its own input and output choices.

In reality, these presumptions may not be met. A consumer or a firm may be directly affected by the actions of other agents in the economy; that is, there may be external effects from the actions of other consumers or firms. In the presence of such externalities the outcome of a competitive market is unlikely to be Pareto-efficient because agents will not take account of the external effects of their (consumption/production) decisions. Typically the economy will generate too great a quantity of "bad" externalities and too small a quantity of "good" externalities. The control of externalities is an issue of increasing practical importance. Global warming and the destruction of the ozone layer are two of the most significant examples, but there are numerous others, from local to global environmental issues.