Evolving Heuristic Based Game Playing Strategies for Board Games Using Genetic Programming


by Clive Andre' Frankland

Submitted in fulfilment of the academic requirements for the degree of Master of Science in the School of Mathematics, Statistics, and Computer Science, University of KwaZulu-Natal, Pietermaritzburg, January 2018.

As the candidate's supervisor, I have/have not approved this thesis/dissertation for submission.
Name: Prof. Nelishia Pillay
Signed: _______________________ Date: _______________________

Preface

The experimental work described in this dissertation was carried out in the School of Mathematics, Statistics, and Computer Science, University of KwaZulu-Natal, Pietermaritzburg, from February 2015 to January 2018, under the supervision of Professor Nelishia Pillay. These studies represent original work by the author and have not otherwise been submitted in any form for any degree or diploma to any tertiary institution. Where use has been made of the work of others, it is duly acknowledged in the text.

_______________________
Clive Andre' Frankland – Candidate (Student Number 791790207)

_______________________
Prof. Nelishia Pillay – Supervisor

Declaration 1 – Plagiarism

I, Clive Andre' Frankland (Student Number 791790207), declare that:

1. The research reported in this dissertation, except where otherwise indicated or acknowledged, is my original work.
2. This dissertation has not been submitted in full or in part for any degree or examination to any other university.
3. This dissertation does not contain other persons' data, pictures, graphs or other information, unless specifically acknowledged as being sourced from other persons.
4. This dissertation does not contain other persons' writing, unless specifically acknowledged as being sourced from other researchers. Where other written sources have been quoted, then:
   a. their words have been re-written but the general information attributed to them has been referenced;
   b. where their exact words have been used, their writing has been placed inside quotation marks, and referenced.
5. This dissertation does not contain text, graphics or tables copied and pasted from the internet, unless specifically acknowledged, with the source detailed in the dissertation and in the references section.

Signed: _______________________
Clive Andre' Frankland

Declaration 2 – Publications

Details of contribution to publications that form part of and/or include research presented in this thesis:

Publication 1: C. Frankland, N. Pillay, "Evolving Game Playing Strategies for Othello", in Proceedings of the 2015 IEEE Congress on Evolutionary Computation (CEC 2015), 25-28 May 2015, Sendai, Japan, pp. 1498-1504, 2015.

Publication 2: C. Frankland, N. Pillay, "Evolving Game Playing Strategies for Othello Incorporating Reinforcement Learning and Mobility", in Proceedings of the 2015 Annual Research Conference of the South African Institute of Computer Scientists and Information Technologists, 2015.

Publication 3: C. Frankland, N. Pillay, "Evolving Heuristic Based Game Playing Strategies for Checkers Incorporating Reinforcement Learning", in NaBIC 2015, 7th World Conference on Nature and Biologically Inspired Computing, pp. 165-178, 2015.

Signed: _______________________ _______________________
Clive Andre' Frankland    Prof. Nelishia Pillay

Abstract

Computerized board games have, since the early 1940s, provided researchers in the field of artificial intelligence with ideal test-beds for studying computational intelligence theories and artificially intelligent behavior. More recently, genetic programming (GP), an evolutionary algorithm, has gained popularity in this field for the induction of complex game playing strategies for different types of board games.
Studies show that research in this area has focused primarily on using GP to evolve board evaluation functions that are used in combination with other search techniques to produce intelligent game-playing agents. In addition, the intelligence of these agents is often guided by large game-specific knowledge bases. The research presented in this dissertation is unique in that it investigates the use of GP for evolving heuristic based game playing strategies for board games of different complexities. Each strategy represents a computer player, and the heuristics making up the strategy determine which game moves are made. Unlike other studies, where game playing strategies are created offline, in this study strategies are evolved in real time, during game play. Furthermore, this work investigates the incorporation of reinforcement learning and mobility strategies into the GP approach to enhance the performance of the evolved strategies. The performance of the genetic programming approach was evaluated for three board games: Othello, representing a game of low complexity; checkers, a game of medium complexity; and chess, a game of high complexity. For Othello, the performance of the GP approach without reinforcement learning was compared to that of a pseudo-random move player encoded with limited Othello strategies, the GP approach incorporating reinforcement learning, the GP approach incorporating mobility, and the GP approach incorporating both reinforcement learning and mobility. Genetic programming evolved players without reinforcement learning and mobility outperformed the pseudo-random move player, and the GP approach incorporating just reinforcement learning performed the best of the three GP combinations. For checkers, the approach without reinforcement learning was compared to the performance of a pseudo-random move player encoded with limited checkers strategies and the GP approach incorporating reinforcement learning.
Genetic programming evolved players outperformed the pseudo-random move player, and players induced by combining GP and reinforcement learning outperformed the GP players without reinforcement learning. For chess, the performance of the GP approach incorporating reinforcement learning was evaluated in three different endgame configuration playouts against a pseudo-random move player encoded with limited chess strategies. The results demonstrated success in producing chess strategies that allowed GP evolved players to complete all three chess endgame configurations by checkmating the opposing King. Future work will investigate other options for incorporating reinforcement learning into the evolutionary process, as well as combining online and offline induction of game playing strategies as a means of balancing generally strong strategies against strategies appropriate for the current scenario of game play. In addition, developing general game players capable of playing well in more than one type of game will be investigated.

Acknowledgements

The author acknowledges the financial assistance of the National Research Foundation (NRF) towards this research. Opinions expressed and conclusions arrived at are those of the author and cannot be attributed to the NRF. I thank my supervisor, Professor Nelishia Pillay, for her guidance, constant support and encouragement. Special thanks go to my wife, Tammy, and my daughter, Tarryn, for their love and endless support during this study.

Table of Contents

Preface
Declaration 1 – Plagiarism
Declaration 2 – Publications
Abstract
Acknowledgements
Table of Contents
List of Figures
List of Tables
List of Algorithms
Chapter 1: Introduction
  1.1 Purpose of this Study
  1.2 Objectives
  1.3 Contributions
  1.4 Dissertation Layout
Chapter 2: Artificial
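The abstract describes using genetic programming to evolve heuristic-based game playing strategies. As a rough, generic illustration of the kind of evolutionary loop involved (this is not the dissertation's actual system: the tuple-based tree representation, truncation selection, mutation-only variation and error-based fitness below are all simplifying assumptions of ours):

```python
import random

# Primitive operators available at internal nodes of a heuristic tree.
OPS = {"+": lambda a, b: a + b, "-": lambda a, b: a - b, "*": lambda a, b: a * b}

def random_tree(depth, n_features, rng):
    """Grow a random expression tree over board features f[0..n-1]."""
    if depth == 0 or rng.random() < 0.3:
        return ("f", rng.randrange(n_features))       # leaf: a board feature
    op = rng.choice(list(OPS))
    return (op, random_tree(depth - 1, n_features, rng),
                random_tree(depth - 1, n_features, rng))

def evaluate(tree, features):
    """Evaluate a heuristic tree on one position's feature vector."""
    if tree[0] == "f":
        return features[tree[1]]
    return OPS[tree[0]](evaluate(tree[1], features), evaluate(tree[2], features))

def mutate(tree, n_features, rng):
    """Replace one randomly chosen subtree with a freshly grown one."""
    if tree[0] == "f" or rng.random() < 0.2:
        return random_tree(2, n_features, rng)
    i = rng.randrange(1, 3)                           # descend into one child
    kids = list(tree[1:])
    kids[i - 1] = mutate(kids[i - 1], n_features, rng)
    return (tree[0], *kids)

def evolve(positions, target, generations=30, pop_size=40, seed=1):
    """Evolve an evaluator whose scores track 'target' on sample positions
    (fitness = total absolute error, lower is better; elitist selection)."""
    rng = random.Random(seed)
    n_features = len(positions[0])
    def fitness(t):
        return sum(abs(evaluate(t, p) - target(p)) for p in positions)
    pop = [random_tree(3, n_features, rng) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness)
        survivors = pop[: pop_size // 2]              # truncation selection
        pop = survivors + [mutate(rng.choice(survivors), n_features, rng)
                           for _ in range(pop_size - len(survivors))]
    return min(pop, key=fitness)
```

In the dissertation's setting the fitness of a strategy would come from actual game play rather than from agreement with a reference heuristic; the error-based fitness here just keeps the sketch self-contained.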
Recommended publications
  • Best-First Minimax Search, Richard E. Korf and David Maxwell Chickering
    Best-First Minimax Search. Richard E. Korf and David Maxwell Chickering, Computer Science Department, University of California, Los Angeles, CA 90024, USA. Artificial Intelligence 84 (1996) 299-337. Received September 1994; revised May 1995.
    Abstract: We describe a very simple selective search algorithm for two-player games, called best-first minimax. It always expands next the node at the end of the expected line of play, which determines the minimax value of the root. It uses the same information as alpha-beta minimax, and takes roughly the same time per node generation. We present an implementation of the algorithm that reduces its space complexity from exponential to linear in the search depth, but at significant time cost. Our actual implementation saves the subtree generated for one move that is still relevant after the player and opponent move, pruning subtrees below moves not chosen by either player. We also show how to efficiently generate a class of incremental random game trees. On uniform random game trees, best-first minimax outperforms alpha-beta when both algorithms are given the same amount of computation. On random trees with random branching factors, best-first outperforms alpha-beta at shallow depths, but eventually loses at greater depths. We obtain similar results in the game of Othello. Finally, we present a hybrid best-first extension algorithm that combines alpha-beta with best-first minimax, and performs significantly better than alpha-beta in both domains, even at greater depths. In Othello, it beats alpha-beta in two out of three games.
    1. Introduction and overview: The best chess machines, such as Deep Blue [10], are competitive with the best humans, but generate billions of positions per move.
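Korf and Chickering's best-first minimax, as summarised above, always expands the leaf at the end of the expected line of play and backs the new values up to the root. A toy sketch of that idea on a randomly generated game tree (the Node class, the random static evaluations and all names are illustrative assumptions of ours, not the paper's implementation):

```python
import random

class Node:
    """Game-tree node; 'value' is a static evaluation, positive favours Max."""
    def __init__(self, value, max_to_move, depth):
        self.value = value
        self.max_to_move = max_to_move
        self.depth = depth
        self.children = []

def expand(node, branching=2, max_depth=4):
    """Generate children with random static evaluations (toy game tree)."""
    if node.depth < max_depth:
        for _ in range(branching):
            node.children.append(
                Node(random.uniform(-1, 1), not node.max_to_move, node.depth + 1))

def best_first_minimax(root, expansions=50):
    """Repeatedly expand the leaf at the end of the current principal
    variation, then back the new minimax values up along that path."""
    expand(root)
    for _ in range(expansions):
        # Walk the principal variation: at each node, follow the child
        # whose value determines that node's minimax value.
        path, node = [root], root
        while node.children:
            pick = max if node.max_to_move else min
            node = pick(node.children, key=lambda c: c.value)
            path.append(node)
        expand(node)               # grow the tree at the critical leaf
        if not node.children:      # leaf at maximum depth: stop searching
            break
        for n in reversed(path):   # back up minimax values along the path
            if n.children:
                pick = max if n.max_to_move else min
                n.value = pick(c.value for c in n.children)
    return root.value
```

Unlike alpha-beta, which expands a whole depth-limited tree, this grows the tree only where the current expected line of play ends.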
  • A World-Championship-Level Othello Program
    A World-Championship-Level Othello Program*. Paul S. Rosenbloom, Department of Computer Science, Carnegie-Mellon University, Pittsburgh, PA 15213, U.S.A. Recommended by Jacques Pitrat.
    Abstract: Othello is a recent addition to the collection of games that have been examined within artificial intelligence. Advances have been rapid, yielding programs that have reached the level of world-championship play. This article describes the current champion Othello program, Iago. The work described here includes: (1) a task analysis of Othello; (2) the implementation of a program based on this analysis and state-of-the-art AI game-playing techniques; and (3) an evaluation of the program's performance through games played against other programs and comparisons with expert human play.
    1. Introduction: The game of Othello is a modern variation of the 19th century board game Reversi. It was developed in its current form in 1974 by Goro Hasegawa (see [16] for more details), and up until recently it has been played primarily in Japan, where it is the second most popular board game (next to Go). Othello is similar to backgammon in its appeal; the rules are simple, an enjoyable game can be played by beginners, and there is a great deal of complexity that must
    *This research was sponsored by the Defense Advanced Research Projects Agency (DOD), ARPA Order No. 3597, monitored by the Air Force Avionics Laboratory under Contract F33615-78-C-1551. The views and conclusions contained in this document are those of the author and should not be interpreted as representing the official policies, either expressed or implied, of the Defense Advanced Research Projects Agency or the US Government.
  • Playing Othello by Deep Learning Neural Network
    Playing Othello by Deep Learning Neural Network. Argens Ng, BEng (Comp. Sc.), The University of Hong Kong (UID: 3035072143).
    Author note: this paper was prepared in partial fulfilment of the Final Year Project (COMP 4801) required by the curriculum of the Bachelor of Engineering (Computer Science).
    Table of contents (excerpt): Abstract; Acknowledgement; Playing Othello by Deep Learning Neural Network; Previous Work – AlphaGo: Overall Structure; Artificial Neural Network: Value Network, Policy Network, Rollout Policy.
  • Scalable Search in Computer Chess, Ernst A. Heinz (Computational Intelligence series, edited by Wolfgang Bibel and Rudolf Kruse)
    Ernst A. Heinz, Scalable Search in Computer Chess. Computational Intelligence, edited by Wolfgang Bibel and Rudolf Kruse.
    The books in this series contribute to the long-range goal of understanding and realizing intelligent behaviour in some environment. Thus they cover topics from the disciplines of Artificial Intelligence and Cognitive Science, together also called Intellectics, as well as from fields interdisciplinarily related with these. Computational Intelligence comprises basic knowledge as well as applications.
    Titles in the series include: Das rechnende Gehirn by Patricia S. Churchland and Terrence J. Sejnowski; Neuronale Netze und Fuzzy-Systeme by Detlef Nauck, Frank Klawonn and Rudolf Kruse; Fuzzy-Clusteranalyse by Frank Höppner, Frank Klawonn and Rudolf Kruse; Einführung in Evolutionäre Algorithmen by Volker Nissen; Neuronale Netze by Andreas Scherer; Sehen und die Verarbeitung visueller Informationen by Hanspeter A. Mallot; Betriebswirtschaftliche Anwendungen des Soft Computing by Biethahn et al. (Eds.); Fuzzy Theorie und Stochastik by Rudolf Seising (Ed.); Multiobjective Heuristic Search by Pallab Dasgupta, P. P. Chakrabarti and S. C. DeSarkar; The Efficiency of Theorem Proving Strategies by David A. Plaisted and Yunshan Zhu; and Scalable Search in Computer Chess by Ernst A. Heinz.
    Among others, the following books were published in the series Artificial Intelligence: Automated Theorem Proving by Wolfgang Bibel (out of print); Fuzzy Sets and Fuzzy Logic: Foundations of Application - from a Mathematical Point of View by Siegfried Gottwald; Fuzzy Systems in Computer Science edited by Rudolf Kruse, Jörg Gebhardt and Rainer Palm; Automatische Spracherkennung by Ernst Günter Schukat-Talamazzini; Deduktive Datenbanken by Armin B. Cremers, Ulrike Griefahn and Ralf Hinze; and Wissensrepräsentation und Inferenz by Wolfgang Bibel, Steffen Hölldobler and Torsten Schaub.
  • Understanding Sampling-Based Adversarial Search Methods
    Understanding Sampling-Based Adversarial Search Methods. A Dissertation Presented to the Faculty of the Graduate School of Cornell University in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy, by Raghuram Ramanujan, August 2012. © 2012 Raghuram Ramanujan. All rights reserved.
    Abstract: Until 2007, the best computer programs for playing the board game Go performed at the level of a weak amateur, while employing the same Minimax algorithm that had proven so successful in other games such as Chess and Checkers. Thanks to a revolutionary new sampling-based planning approach named Upper Confidence bounds applied to Trees (UCT), today's best Go programs play at a master level on full-sized 19 × 19 boards. Intriguingly, UCT's spectacular success in Go has not been replicated in domains that have been the traditional stronghold of Minimax-style approaches. The focus of this thesis is on understanding this phenomenon. We begin with a thorough examination of the various facets of UCT in the games of Chess and Mancala, where we can contrast the behavior of UCT with that of the better understood Minimax approach. We then introduce the notion of shallow search traps - positions in games where short winning strategies for the opposing player exist - and demonstrate that these are distributed very differently in different games, and that this has a significant impact on the performance of UCT. Finally, we study UCT and Minimax in two novel synthetic game settings that permit mathematical analysis.
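The UCT approach studied in the dissertation above selects moves with the UCB1 rule: average reward plus an exploration bonus that shrinks as a move is sampled more often. A minimal sketch of just the selection step (function names and the statistics layout are our own; real UCT embeds this in a full Monte Carlo tree search loop with playouts and backpropagation):

```python
import math

def uct_score(wins, visits, parent_visits, c=1.41):
    """UCB1 score: mean reward plus an exploration bonus; c trades off
    exploitation against exploration."""
    if visits == 0:
        return float("inf")            # always try unvisited moves first
    return wins / visits + c * math.sqrt(math.log(parent_visits) / visits)

def uct_select(children, c=1.41):
    """Select the move with the highest UCT score; 'children' maps a
    move to its (wins, visits) statistics."""
    parent_visits = sum(v for _, v in children.values()) or 1
    return max(children, key=lambda m: uct_score(*children[m], parent_visits, c))
```

With statistics `{"a": (9, 10), "b": (1, 10)}` the selector favours "a" (high mean, equal exploration bonus), while any unvisited move is always tried before revisiting a sampled one.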
  • Perfect Analysis in Miniature Othello
    IPSJ SIG Technical Report: Perfect Analysis in Miniature Othello. Yuki Takeshita, Satoshi Ikeda, Makoto Sakamoto, Takao Ito.
    Abstract: More than 20 years have passed since the mathematician J. Feinstein (1993) found that perfect play on the 6×6 Othello board gives a 16-20 win for the second player. Computer Othello now far surpasses human play, but the standard 8×8 board remains unsolved: it is not yet known whether perfect play yields a first-player win, a second-player win, or a draw. In this paper we present perfect analyses of miniature Othello (the 4×4, 4×6, 4×8, 4×10 and 6×6 boards) and, from these results, discuss the properties of boards of size 8×8 and larger. In addition, we replicate Feinstein's analysis of the 6×6 board.
    1. Introduction (translated from the Japanese): Othello is classified as a two-player zero-sum finite deterministic perfect-information game [1]; every game in this class is necessarily a first-player win, a second-player win, or a draw [2]. In 1993 Feinstein discovered the 6×6 second-player win and the perfect-play score of black 16, white 20 [3], searching roughly 40 billion positions in about two weeks with alpha-beta search using move ordering. Because the result was reported in the British Othello Federation's newsletter, it does not survive as a paper. Whereas Reversi begins with the two players freely placing two discs each on the central four squares, Othello has a fixed initial position (Fig. 2-1: (a) Othello board; (b) Reversi board; initial placement).
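Feinstein's 6×6 Othello analysis used alpha-beta search with move ordering. The following toy negamax illustrates the same pruning-plus-ordering idea on a much simpler solved game (take-1-to-3 Nim, where the player facing a multiple of four loses), not on Othello:

```python
def solve(heap, alpha=-1, beta=1):
    """Negamax alpha-beta for take-1-to-3 Nim: return +1 if the player to
    move wins with perfect play, -1 otherwise.  Taking the last object wins."""
    if heap == 0:
        return -1  # previous player took the last object and won
    # Move ordering heuristic: try the move that leaves a multiple of 4
    # first, since that is the known winning reply when it is available.
    for take in sorted((1, 2, 3), key=lambda t: (heap - t) % 4):
        if take > heap:
            continue
        score = -solve(heap - take, -beta, -alpha)
        if score > alpha:
            alpha = score
        if alpha >= beta:
            break  # cutoff: a refutation was found, skip remaining moves
    return alpha
```

Good ordering makes the winning move the first one examined, so the beta cutoff fires immediately; Feinstein's search relied on the same effect at a vastly larger scale.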
  • The History of Computer Games
    City University of New York (CUNY), CUNY Academic Works, Publications and Research, CUNY Graduate Center, 2006. Available at: https://academicworks.cuny.edu/gc_pubs/174
    The History of Computer Games, by Jill Cirasella, Computational Sciences Specialist, Library, Brooklyn College, and IM Dr. Danny Kopec, Associate Professor, Computer and Information Science, Brooklyn College.
    Milestones in Computer Backgammon:
    1979: BKG 9.8, the first strong backgammon player (written by Hans Berliner of Carnegie Mellon University), defeated world champion Luigi Villa in an exhibition match. However, it is widely accepted that Villa played better and that the computer got better rolls.
    1989: Gerald Tesauro's neural network-based Neurogammon, which was trained with a database of games played by expert human players, won the backgammon championship at the 1989 International Computer Olympiad. All top backgammon programs since Neurogammon have been based on neural networks. Search-based algorithms are not currently feasible for backgammon because the game has a branching factor of several hundred.
    1991: Gerald Tesauro's TD-Gammon debuted. Instead of being trained with a database of moves, TD-Gammon was trained by playing itself. This approach was challenging because individual moves are not rewarded: the reward is delayed until the end of the game, and credit for winning must then be distributed among the various moves.
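TD-Gammon's key idea, as described above, is that the reward arrives only at the end of the game, and temporal-difference updates propagate credit back through the earlier moves. A minimal tabular TD(0) sketch on a toy random-walk task shows the mechanism (TD-Gammon itself used a neural network over backgammon positions; the task, state layout and parameters here are illustrative assumptions):

```python
import random

def td_learn(n_states=5, episodes=2000, alpha=0.1, seed=0):
    """TD(0) on a 1-D random walk: non-terminal states 1..n_states, with
    terminal states 0 (reward 0) and n_states+1 (reward 1).  Although the
    reward arrives only at the end of each episode, the bootstrapped
    updates gradually push credit back to the earlier states."""
    rng = random.Random(seed)
    V = [0.0] * (n_states + 2)        # V[0] and V[-1] are terminal, fixed
    for _ in range(episodes):
        s = (n_states + 1) // 2       # start each walk in the middle
        while 0 < s < n_states + 1:
            s2 = s + rng.choice((-1, 1))
            reward = 1.0 if s2 == n_states + 1 else 0.0
            terminal = s2 in (0, n_states + 1)
            target = reward if terminal else V[s2]
            V[s] += alpha * (target - V[s])   # TD(0) update toward target
            s = s2
    return V[1:-1]                     # learned values of states 1..n_states
```

For this walk the true values are i/6 for state i, so after training the learned values rise monotonically from left to right, mirroring how TD-Gammon learned that positions closer to a win are worth more.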