1 Introduction

Total Page:16

File Type:pdf, Size:1020Kb

1 Introduction An Investigation of an Evolutionary Approach to the Opening of Go Graham Kendall, Razali Yaakob Philip Hingston School of Computer Science and IT School of Computer and Information Science University of Nottingham, UK Edith Cowan University, WA, Australia {gxk|rby}@cs.nott.ac.uk [email protected] Abstract – The game of Go can be divided into three stages; even told how it had performed in each individual game. the opening, the middle, and the end game. In this paper, The conclusion of Fogel and Chellapilla is that, without evolutionary neural networks, evolved via an evolutionary any expert knowledge, a program can learn to play a game strategy, are used to develop opening game playing strategies at an expert level using a co-evolutionary approach. for the game. Go is typically played on one of three different board sizes, i.e., 9x9, 13x13 and 19x19. A 19x19 board is the Additional information about this work can be found in standard size for tournament play but 9x9 and 13x13 boards [19, 20, 21, 22]. are usually used by less-experienced players or for faster Go is considered more complex than chess or checkers. games. This paper focuses on the opening, using a 13x13 One measure of this complexity is the number of positions board. A feed forward neural network player is played against that can be reached from the starting position which Bouzy a static player (Gondo), for the first 30 moves. Then Gondo and Cazenave estimate to be 10160 [23]. Chess and takes the part of both players to play out the remainder of the checkers are about 1050 and 1017, respectively [23]. game. Two experiments are presented which indicate that The size of the search space is the primary reason why a learning is taking place. good automated Go player is difficult to produce. To reduce the search space, the board can be scaled down I. INTRODUCTION either by reducing the board size or by dividing the board Go is an ancient game, which originated in China at into sub-boards (decomposition). These techniques can least 2000 years ago and was known originally as Weiqi reduce the computational effort required for playing the [1]. The name Go came from the shortened Japanese word game, which is one of the main constraints in producing an Igo. Automated Go programs began appearing in the 1970s effective automated Go player. The early work for small [2] and 1980s [3, 4, 5, 6]. Since that time, Go has been the boards can be found in [24, 25] and decomposition subject of research areas such as machine learning [7, 8, approaches are reported in [26]. 9], cognitive science [10], and evolutionary computing The game of Go can be divided into three general [11, 12], but automated Go programs are still far from the phases: the opening game, the middle game, and the standard of human experts [13]. endgame [27]. Researchers have applied a variety of The games of Go, chess, and checkers are similar, in techniques to the middle game, which is considered the that they are finite, perfect information games, played most difficult – yet important – phase [27]. However, between two players. One of the best-known chess Ishigure [28] states that the opening game is the most programs, Deep Blue [14], achieved world champion important phase. In order to make territory, the best place status when, in 1997, it beat Garry Kasparov by a score of to move is in a corner and the least valuable moves are 3.5-2.5. Utilising custom-built hardware [15], Deep Blue those in the centre of the board. The number of plays in the executed sophisticated search algorithms to analyse up to opening is not predetermined [28]. Aggressive player may 200 million positions per second. start to attack the opponent (thus entering the middle Chinook, a checkers program, developed by Jonathan game) just a few turns after the start of the game, whilst Schaeffer’s team at The University of Alberta won the other openings may last longer. world checkers title in 1994 [16, 17]. Chinook used an Playing just the opening is the approach we adopt here. opening, and endgame database and an extensive checkers In this paper, we are not attempting to reduce the size of knowledge base. the search space or incorporate any Go knowledge. The Chellapilla and Fogel [18], also considered checkers, objective of this research is to investigate a population of developing Blondie24 - a checkers player, which learned evolutionary neural networks, which play the opening of its strategy using a co-evolutionary approach. The current Go and compete to survive to the next generation. A board position in Blondie24 is evaluated using an artificial feedforward neural network player is played against a neural network with the weights of the network being static player, Gondo (written by Jeffrey Greenberg [29]). evolved via an evolutionary strategy. The output of the We play against Gondo for the first 30 turns (15 turns for network is used in a minimax search. Blondie24 was not each player). At that time Gondo takes the part of both provided with any domain knowledge. It received points players and plays out the remainder of the game to based on whether it won, lost, or drew games and was not determine the outcome. An evolutionary strategy is used to Differences in playing strength can be balanced by a evolve the networks. We use a 13x13 board. handicap system, which allows the Black player to place Previous authors have investigated evolving neural several stones on the board before the start of the game. networks to play Go. Richard et al. evolved neural Go is played between two players, Black and White. networks using the SANE1 (Symbiotic, Adaptive Neuro- The game starts with an empty board. Stones are placed Evolution [30]) method [27]. Board sizes of 5x5, 7x7, and on any empty intersection on the board with players 9x9 were used in evolving networks to play on small making alternate moves. Black plays first. Go players can boards against a simple computer opponent called Wally2. pass their turn at any time but consecutive passes ends the Each intersection on the board has two input units and one game. output unit. Any positive output indicates a good move, the A stone on an intersection has a set of liberties (empty larger the value, the better the move. If none of its outputs intersections next to a stone). Figure 1a shows a white is positive, the network will pass its turn. SANE required stone with four liberties. When the opponent places a stone 20 generations to defeat Wally on a 5x5 board, 50 on any neighbouring intersection, white will lose one generations on a 7x7 board, and on a 9x9 board, 260 liberty. In Figure 1b, black stones surround the white generations were needed. Donelly et al. also evolved stone, which means white has no liberties and is captured neural networks via self-play [31]. and removed from the board (Figure 1c). This territory Other researchers have used neural networks to learn to now belongs to black. play Go using machine learning methods such as supervised learning (training on moves from professional games) [32] and reinforcement learning [33, 34]. Yen et al. took a different approach to learning to play Go openings with Jimmy 4.0, which has been ranked 4 3 kyu when they adopted 100 questions from Go books to test the rank of their opening game [35]. Jimmy plays the (a) (b) (c) opening stage by trying to occupy the corners (called joseki - a series of standard moves in corners or sometimes Figure 1: (a) White stone has 4 liberties, (b) Black stones are on edges), extending the edges, and dealing with Moyo surrounding a white stone, (c) White stone has been removed and the black player gets 1 point of territory. (large potential territories). Pattern matching techniques are used to occupy the corners and generate the Moyo moves. In extending the edges, two techniques are used. Placing a stone that does not make a capture, and has no Pattern matching generates moves near the existing stones liberties, is illegal. For example, the white player cannot on the board. A “split” method places a stone equidistant place a stone in the centre of black territory in Figure 1c. between two stones, which are near opposite corners. Placing a stone that results in a repeating position seen For an excellent review of the state of the art in after the same player’s previous play is also illegal. This is computer Go programming, we recommend Müller [36] called the ko-rule. and as a starting point for further reading, Enzenberger’s Figure 2a shows a marked white stone that has only one online bibliography [37]. Other reviews focussing on AI liberty. Black captures the marked white stone by playing approaches may be found in [38, 39]. at position k. After the marked white stone has been removed from the board, the white player cannot place a II. RULES OF GO stone on r, Figure 2b, before the board position has changed (i.e. white must play on another intersection The aim of Go is to gain as much territory as possible. before being able to place a stone on r). Players will try to capture an opponent’s stones and also define areas as their own territory. At the end of the game, the winner will be identified by counting territories using one of two scoring methods; territory scoring or area scoring. Japanese rules use territory scoring. Surrounded k r territory is counted, plus the number of captured opponent stones.
Recommended publications
  • FUEGO—An Open-Source Framework for Board Games and Go Engine Based on Monte Carlo Tree Search Markus Enzenberger, Martin Müller, Broderick Arneson, and Richard Segal
    IEEE TRANSACTIONS ON COMPUTATIONAL INTELLIGENCE AND AI IN GAMES, VOL. 2, NO. 4, DECEMBER 2010 259 FUEGO—An Open-Source Framework for Board Games and Go Engine Based on Monte Carlo Tree Search Markus Enzenberger, Martin Müller, Broderick Arneson, and Richard Segal Abstract—FUEGO is both an open-source software framework with new algorithms that would not be possible otherwise be- and a state-of-the-art program that plays the game of Go. The cause of the cost of implementing a complete state-of-the-art framework supports developing game engines for full-information system. Successful previous examples show the importance of two-player board games, and is used successfully in a substantial open-source software: GNU GO [19] provided the first open- number of projects. The FUEGO Go program became the first program to win a game against a top professional player in 9 9 source Go program with a strength approaching that of the best Go. It has won a number of strong tournaments against other classical programs. It has had a huge impact, attracting dozens programs, and is competitive for 19 19 as well. This paper gives of researchers and hundreds of hobbyists. In Chess, programs an overview of the development and current state of the UEGO F such as GNU CHESS,CRAFTY, and FRUIT have popularized in- project. It describes the reusable components of the software framework and specific algorithms used in the Go engine. novative ideas and provided reference implementations. To give one more example, in the field of domain-independent planning, Index Terms—Computer game playing, Computer Go, FUEGO, systems with publicly available source code such as Hoffmann’s man-machine matches, Monte Carlo tree search, open source soft- ware, software frameworks.
    [Show full text]
  • Reinforcement Learning of Local Shape in the Game of Go
    Reinforcement Learning of Local Shape in the Game of Go David Silver, Richard Sutton, and Martin Muller¨ Department of Computing Science University of Alberta Edmonton, Canada T6G 2E8 {silver, sutton, mmueller}@cs.ualberta.ca Abstract effective. They are fast to compute; easy to interpret, modify and debug; and they have good convergence properties. We explore an application to the game of Go of Secondly, weights are trained by temporal difference learn- a reinforcement learning approach based on a lin- ing and self-play. The world champion Checkers program ear evaluation function and large numbers of bi- Chinook was hand-tuned by expert players over 5 years. nary features. This strategy has proved effective When weights were trained instead by self-play using a tem- in game playing programs and other reinforcement poral difference learning algorithm, the program equalled learning applications. We apply this strategy to Go the performance of the original version [7]. A similar ap- by creating over a million features based on tem- proach attained master level play in Chess [1]. TD-Gammon plates for small fragments of the board, and then achieved world class Backgammon performance after train- use temporal difference learning and self-play. This ingbyTD(0)andself-play[13]. A program trained by method identifies hundreds of low level shapes with TD(λ) and self-play outperformed an expert, hand-tuned ver- recognisable significance to expert Go players, and sion at the card game Hearts [11]. Experience generated provides quantitive estimates of their values. We by self-play was also used to train the weights of the world analyse the relative contributions to performance of champion Othello and Scrabble programs, using least squares templates of different types and sizes.
    [Show full text]
  • Curriculum Guide for Go in Schools
    Curriculum Guide 1 Curriculum Guide for Go In Schools by Gordon E. Castanza, Ed. D. October 19, 2011 Published By: Rittenberg Consulting Group 7806 108th St. NW Gig Harbor, WA 98332 253-853-4831 © 2005 by Gordon E. Castanza, Ed. D. Curriculum Guide 2 Table of Contents Acknowledgements ......................................................................................................................... 4 Purpose and Rationale..................................................................................................................... 5 About this curriculum guide ................................................................................................... 7 Introduction ..................................................................................................................................... 8 Overview ................................................................................................................................. 9 Building Go Instructor Capacity ........................................................................................... 10 Developing Relationships and Communicating with the Community ................................. 10 Using Resources Effectively ................................................................................................. 11 Conclusion ............................................................................................................................ 11 Major Trends and Issues ..........................................................................................................
    [Show full text]
  • FUEGO – an Open-Source Framework for Board Games and Go Engine Based on Monte-Carlo Tree Search
    FUEGO – An Open-source Framework for Board Games and Go Engine Based on Monte-Carlo Tree Search Markus Enzenberger, Martin Muller,¨ Broderick Arneson and Richard Segal Abstract—FUEGO is both an open-source software frame- available source code such as Hoffmann’s FF [20] have had work and a state of the art program that plays the game of a similarly massive impact, and have enabled much followup Go. The framework supports developing game engines for full- research. information two-player board games, and is used successfully in UEGO a substantial number of projects. The FUEGO Go program be- F contains a game-independent, state of the art came the first program to win a game against a top professional implementation of MCTS with many standard enhancements. player in 9×9 Go. It has won a number of strong tournaments It implements a coherent design, consistent with software against other programs, and is competitive for 19 × 19 as well. engineering best practices. Advanced features include a lock- This paper gives an overview of the development and free shared memory architecture, and a flexible and general current state of the FUEGO project. It describes the reusable components of the software framework and specific algorithms plug-in architecture for adding domain-specific knowledge in used in the Go engine. the game tree. The FUEGO framework has been proven in applications to Go, Hex, Havannah and Amazons. I. INTRODUCTION The main innovation of the overall FUEGO framework may lie not in the novelty of any of its specific methods and Research in computing science is driven by the interplay algorithms, but in the fact that for the first time, a state of of theory and practice.
    [Show full text]
  • Go Books Detail
    Evanston Go Club Ian Feldman Lending Library A Compendium of Trick Plays Nihon Ki-in In this unique anthology, the reader will find the subject of trick plays in the game of go dealt with in a thorough manner. Practically anything one could wish to know about the subject is examined from multiple perpectives in this remarkable volume. Vital points in common patterns, skillful finesse (tesuji) and ordinary matters of good technique are discussed, as well as the pitfalls that are concealed in seemingly innocuous positions. This is a gem of a handbook that belongs on the bookshelf of every go player. Chapter 1 was written by Ishida Yoshio, former Meijin-Honinbo, who intimates that if "joseki can be said to be the highway, trick plays may be called a back alley. When one masters the alleyways, one is on course to master joseki." Thirty-five model trick plays are presented in this chapter, #204 and exhaustively analyzed in the style of a dictionary. Kageyama Toshiro 7 dan, one of the most popular go writers, examines the subject in Chapter 2 from the standpoint of full board strategy. Chapter 3 is written by Mihori Sho, who collaborated with Sakata Eio to produce Killer of Go. Anecdotes from the history of go, famous sayings by Sun Tzu on the Art of Warfare and contemporary examples of trickery are woven together to produce an entertaining dialogue. The final chapter presents twenty-five problems for the reader to solve, using the knowledge gained in the preceding sections. Do not be surprised to find unexpected booby traps lurking here also.
    [Show full text]
  • Active Opening Book Application for Monte-Carlo Tree Search in 19×19 Go
    Active Opening Book Application for Monte-Carlo Tree Search in 19×19 Go Hendrik Baier and Mark H. M. Winands Games and AI Group, Department of Knowledge Engineering Maastricht University, Maastricht, The Netherlands Abstract The dominant approach for programs playing the Asian board game of Go is nowadays Monte-Carlo Tree Search (MCTS). However, MCTS does not perform well in the opening phase of the game, as the branching factor is high and consequences of moves can be far delayed. Human knowledge about Go openings is typically captured in joseki, local sequences of moves that are considered optimal for both players. The choice of the correct joseki in a given whole-board position, however, is difficult to formalize. This paper presents an approach to successfully apply global as well as local opening moves, extracted from databases of high-level game records, in the MCTS framework. Instead of blindly playing moves that match local joseki patterns (passive opening book application), knowledge about these moves is integrated into the search algorithm by the techniques of move pruning and move biasing (active opening book application). Thus, the opening book serves to nudge the search into the direction of tried and tested local moves, while the search is able to filter out locally optimal, but globally problematic move choices. In our experiments, active book application outperforms passive book application and plain MCTS in 19×19 Go. 1 Introduction Since the early days of Artificial Intelligence (AI) as a scientific field, games have been used as testbed and benchmark for AI architectures and algorithms. The board game of Go, for example, has stimulated con- siderable research efforts over the last decades, second only to computer chess.
    [Show full text]
  • The Official AGA Tournament Guide
    The Official AGA Tournament Guide by Ken Koester Copyright © The American Go Association 2006 Table of Contents Part I: The Tasks Introduction........................................................................................................................................... v Chapter 1. Planning.............................................................................................................................1 Theme and System ................................................................................................................1 Budgeting, or, The Tournament on $15 and $20 a Day .................................................... 6 In Need of Assistance ...........................................................................................................8 Timetables ...............................................................................................................................9 Chapter 2. Preparation .....................................................................................................................11 Time and Place .....................................................................................................................11 Getting the Word Out ..........................................................................................................12 Food for Thought .................................................................................................................15 The Prize Is Right .................................................................................................................16
    [Show full text]
  • By Arend Bayer, Daniel Bump, Evan Berggren Daniel, David Denholm
    Documentation for the GNU Go Project Edition 3.8 July, 2003 By Arend Bayer, Daniel Bump, Evan Berggren Daniel, David Denholm, Jerome Dumonteil, Gunnar Farneb¨ack, Paul Pogonyshev, Thomas Traber, Tanguy Urvoy, Inge Wallin GNU Go 3.8 Copyright c 1999, 2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007 and 2008 Free Software Foundation, Inc. This is Edition 3.8 of The GNU Go Project documentation, for the 3.8 version of the GNU Go program. Published by the Free Software Foundation Inc 51 Franklin Street, Fifth Floor Boston, MA 02110-1301 USA Tel: 617-542-5942 Permission is granted to make and distribute verbatim or modified copies of this manual is given provided that the terms of the GNU Free Documentation License (see Section A.5 [GFDL], page 218, version 1.3 or any later version) are respected. Permission is granted to make and distribute verbatim or modified copies of the program GNU Go is given provided the terms of the GNU General Public License (see Section A.1 [GPL], page 207, version 3 or any later version) are respected. i Table of Contents ::::::::::::::::::::::::::::::::::::::::::::::::::::::: 1 1 Introduction::::::::::::::::::::::::::::::::::::: 2 1.1 About GNU Go and this Manual ::::::::::::::::::::::::::::::: 2 1.2 Copyrights ::::::::::::::::::::::::::::::::::::::::::::::::::::: 2 1.3 Authors :::::::::::::::::::::::::::::::::::::::::::::::::::::::: 3 1.4 Thanks :::::::::::::::::::::::::::::::::::::::::::::::::::::::: 3 1.5 Development ::::::::::::::::::::::::::::::::::::::::::::::::::: 3 2 Installation ::::::::::::::::::::::::::::::::::::::
    [Show full text]
  • Going-Insurrection-Read.Pdf
    Introduction The game of Go originated in China or Tibet at least 3500 years ago, and in its simplicity and complexity, it remains the greatest strategic game that exists. Part of its interest is that it is quite abstract, just stones on a grid, and so it lends itself well to interpretation. The most obvious analogy for the game is war, but Go is not chess, where the pieces have military names and are lined up facing each other, making the war analogy inescapable. In fact, in many ways, the traditional image of war as opposing nation states advancing on each other is not applicable to Go. The lines are not so clearly drawn, and rather than starting with a full army that gets picked apart, the Go board begins empty and the players create the geography of the game together. Through its simplicity, Go can become a metaphor for thinking about conflict and struggle more generally. In modern North American society, conflict is everywhere, and overwhelmingly it is a one-sided battle constantly waged by the economic and political elites against everyone else. This conflict is visible in the spread of security cameras and other technologies of surveillance; in the growth of prisons and the expansion of police forces; in the ongoing wars of occupation waged by imperialist nations to secure access to resources; in the ongoing colonization carried out against Indigenous Peoples to undercut their resistance and steal their territories; in the threat of being fired or evicted if we aren't subservient enough; in the mass media that teaches us to submit; and in our relationships where we exploit each other, mirroring the systems of domination we were raised to identify with.
    [Show full text]
  • Machine Learning, Game Play, and Go
    Machine Learning, Game Play, and Go David Stoutamire Abstract The game of go is an ideal problem domain for exploring machine learning: it is easy to define and there are many human experts, yet existing programs have failed to emulate their level of play to date. Existing literature on go playing programs and applications of machine learning to games are surveyed. An error function based on a database of master games is defined which is used to formulate the learning of go as an optimization problem. A classification technique called pattern preference is presented which is able to automatically derive patterns representative of good moves; a hashing technique allows pattern preference to run efficiently on conventional hardware with graceful degradation as memory size decreases. 1 Contents 1 Machine Learning 6 1.1 What is machine learning? X X XXX XXX XX XXX XXX XXX XXX XXX X 6 1.2 Learning as optimization X XX XXX XXX XX XXX XXX XXX XXX XXX X 7 1.3 The Bias of Generalization XX XXX XXX XX XXX XXX XXX XXX XXX X 10 1.4 Two graphical examples X XXX XXX XXX XX XXX XXX XXX XXX XXX X 11 2 Computer Gamesmanship and Go 14 2.1 Computers playing games X XX XXX XXX XX XXX XXX XXX XXX XXX X 14 2.1.1 Simple state games X X XXX XXX XX XXX XXX XXX XXX XXX X 14 2.1.2 Computer chess X XXX XXX XXX XX XXX XXX XXX XXX XXX X 17 2.2 The game of go X X XXX XXX XXX XXX XX XXX XXX XXX XXX XXX X 18 2.3 Go programs X XXX XXX XXX XXX XXX XX XXX XXX XXX XXX XXX X 21 2.3.1 David Fotland X X XXX XXX XXX XX XXX XXX XXX XXX XXX X 22 2.3.2 Ken Chen X XXX XXX XXX XXX XX
    [Show full text]
  • Gentle Joseki, Part I by Pieter Mioch the Patterns
    Gentle Joseki, part I by Pieter Mioch The patterns Dia 1 An opening move at the 4-3 point (komoku) is basically a typical way of not regarding the center important as yet. Komoku had for centuries been the focus of almost every game in Japan and profound research dealing with the 4-3 point was constantly going on. (this explains why even today there are more komoku joseki's than all the other opening moves put together) In dia 1 black's aim is to make a corner enclosure (shimari) with A, B, or C. Black thus spending two moves in the corner gets cash profit but in doing so it'll take longer before he can start to emphasize the center. This is because the orthodox and still popular way of developing in the opening is to first claim the corners, then the sides, and last, as an afterthought the center. It should be kept in mind that either making a shimari yourself or preventing your opponent from making a shimari is regarded of equal value (miai). It is in most cases, if not all very hard to say that making a shimari is clearly better than preventing your opponent from making one. A good idea for any player, be it dan or kyu, is to experiment freely with both styles of playing. Games will most likely develop completely different, providing an opportunity to be amazed and learn, which is all it takes to improve rapidly. Dia 2 and dia 3 show moves which prevent black from making a shimari.
    [Show full text]
  • User Manual for Smartgo 1
    User Manual © 2002-2004 Smart Go, Inc. All rights reserved. SmartGo™ 1.5 User Manual Contents 1 Table of Contents Part 1 Introduction 6 1 What is SmartGo?........ ........................................................................................................................... 6 2 Notation and Conventions...................... ............................................................................................................. 7 3 Screen Layout.. ................................................................................................................................. 7 4 Using Toolbars... ................................................................................................................................ 8 5 Tree of Moves.. ................................................................................................................................. 8 6 Game Collections........ ........................................................................................................................... 9 7 Keyboard Focus....... ............................................................................................................................ 10 Part 2 Common Tasks 11 1 Play Against. .the..... .Computer................. .[Player............ .only]......... ................................................................................... 11 2 Replay a Game..... .............................................................................................................................. 11 3 Study
    [Show full text]