General Game Playing Michael Thielscher, Dresden

Total Pages: 16

File Type: PDF, Size: 1020 KB

General Game Playing. Michael Thielscher, Dresden. AAAI'08 Tutorial. Some of the material presented in this tutorial originates in work by Michael Genesereth and the Stanford Logic Group. We greatly appreciate their contribution.

Chess Players: The Turk (18th century); Alan Turing and Claude Shannon (~1950); Deep Blue beats the world champion (1997).

In the early days, game-playing machines were considered a key to Artificial Intelligence (AI). But chess computers are highly specialized systems, and Deep Blue's intelligence was limited: it couldn't even play a decent game of Tic-Tac-Toe or Rock-Paper-Scissors. With General Game Playing, many of the original expectations of game-playing machines are revived.

Traditional research on game playing focuses on constructing specific evaluation functions and building libraries for specific games; the intelligence lies with the programmer, not with the program. A General Game Player, by contrast, is a system that understands formal descriptions of arbitrary strategy games and learns to play these games well without human intervention. It therefore needs to exhibit much broader intelligence: abstract thinking, strategic planning, and learning.

General Game Playing and AI. Rather than being concerned with a specialized solution to a narrow problem, General Game Playing encompasses a variety of AI areas: game playing, knowledge representation, planning and search, and learning. The games and agent settings it covers range from deterministic, complete-information games (competitive environments) over nondeterministic, partially observable games (uncertain environments) and games whose rules are partially unknown (unknown environment models) to robotic players (real-world environments). General Game Playing is considered a grand AI Challenge.

Application (1): Commercially available chess computers can't be used for a game of Bughouse Chess. An adaptable game computer would allow the user to modify the rules for arbitrary variants of a game.

Application (2): Economics. A General Game Playing system could be used for negotiations, marketing strategies, pricing, etc. It can be easily adapted to changes in the business processes and rules, new competitors, etc. The rules of an electronic marketplace can be formalized as a game, so that agents can automatically learn how to participate.
Example Games: single-player, deterministic; two-player, zero-sum, deterministic; two-player, zero-sum, nondeterministic; n-player, deterministic; and n-player, incomplete-information, nondeterministic games.

General Game Playing Initiative (games.stanford.edu): a game description language, a variety of games and actual matches, a basic player available for download, and an annual world cup at AAAI (since 2005) with prize money of US$ 10,000. The competition currently covers deterministic games with complete information only.

Roadmap: the Game Description Language (GDL) builds on Knowledge Representation; how to make legal moves builds on Automated Reasoning; how to solve simple games builds on Planning & Search; how to play well builds on Learning.

The Game Description Language. Every finite game can be modeled as a state transition system, but a direct encoding is impossible in practice: Tic-Tac-Toe already has 19,683 states, and chess has roughly 10^43 legal positions.

Modular State Representation: Fluents. A chess position, for example, is described by fluents:
  cell(X,Y,C)     X ∈ {a,...,h}, Y ∈ {1,...,8}, C ∈ {whiteKing,...,blank}
  control(P)      P ∈ {white,black}
  canCastle(P,S)  P ∈ {white,black}, S ∈ {kingsSide,queensSide}
  enPassant(C)    C ∈ {a,...,h}

Actions. Moves are likewise represented by terms:
  move(U,V,X,Y)   U,X ∈ {a,...,h}, V,Y ∈ {1,...,8}
  promote(X,Y,P)  X ∈ {a,...,h}, P ∈ {whiteQueen,...}

Game Rules (I).
  Players:          roles([white,black])
  Initial position: init(cell(a,1,whiteRook))  ...
  Legal moves:      legal(white,promote(X,Y,P)) <= true(cell(X,7,whitePawn)) ∧ ...

Game Rules (II).
  Position updates: next(cell(X,Y,C)) <= does(P,move(U,V,X,Y)) ∧ true(cell(U,V,C))  ...
  End of game:      terminal <= checkmate ∨ stalemate
  Result:           goal(white,100) <= true(control(black)) ∧ checkmate
                    goal(white,50)  <= stalemate

Clausal Logic. Variables: X, Y, Z. Constants: a, b, c. Functions: f, g, h. Predicates: p, q, r, =. Logical operators: ¬, ∧, ∨, <=. Terms: X, Y, Z, a, b, c, f(a), g(a,X), h(a,b,f(Y)). Atoms: p(a,b). Literals: p(a,b), ¬q(X,f(a)). Clauses: Head <= Body, where the head is a relational sentence and the body is a logical sentence built from ∧, ∨, and literals.

Game-Independent Vocabulary. Relations: roles(list-of(player)), init(fluent), true(fluent), does(player,move), next(fluent), legal(player,move), goal(player,value), terminal.

Axiomatizing Tic-Tac-Toe: Fluents.
  cell(X,Y,M)  X,Y ∈ {1,2,3}, M ∈ {x,o,b}
  control(P)   P ∈ {xplayer,oplayer}

Axiomatizing Tic-Tac-Toe: Actions.
  mark(X,Y)    X,Y ∈ {1,2,3}
  noop

Tic-Tac-Toe: Vocabulary.
  Constants:  xplayer, oplayer (players); x, o, b (marks)
  Functions:  cell(number,number,mark) (fluent), control(player) (fluent), mark(number,number) (action)
  Predicates: row(number,mark), column(number,mark), diagonal(mark), line(mark), open

Players and Initial Position.
  roles([xplayer,oplayer])
  init(cell(1,1,b))  init(cell(1,2,b))  init(cell(1,3,b))
  init(cell(2,1,b))  init(cell(2,2,b))  init(cell(2,3,b))
  init(cell(3,1,b))  init(cell(3,2,b))  init(cell(3,3,b))
  init(control(xplayer))

Preconditions.
  legal(P,mark(X,Y))   <= true(cell(X,Y,b)) ∧ true(control(P))
  legal(xplayer,noop)  <= true(cell(X,Y,b)) ∧ true(control(oplayer))
  legal(oplayer,noop)  <= true(cell(X,Y,b)) ∧ true(control(xplayer))

Update.
  next(cell(M,N,x)) <= does(xplayer,mark(M,N))
  next(cell(M,N,o)) <= does(oplayer,mark(M,N))
  next(cell(M,N,W)) <= true(cell(M,N,W)) ∧ ¬W=b
  next(cell(M,N,b)) <= true(cell(M,N,b)) ∧ does(P,mark(J,K)) ∧ (¬M=J ∨ ¬N=K)
  next(control(xplayer)) <= true(control(oplayer))
  next(control(oplayer)) <= true(control(xplayer))

Termination.
  terminal <= line(x) ∨ line(o)
  terminal <= ¬open
  line(W) <= row(M,W)
  line(W) <= column(N,W)
  line(W) <= diagonal(W)
  open <= true(cell(M,N,b))
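The Tic-Tac-Toe rules above completely determine which moves are legal, how the position changes, and when the game ends. As an illustration only, a minimal Python sketch of that state-transition reading might look as follows, with a state kept as a set of fluent tuples; the representation and the names (legal_moves, next_state, terminal) are ad hoc choices, not part of the tutorial material.

    # State = frozenset of fluent tuples, e.g. ("cell", 1, 1, "b") or ("control", "xplayer").
    INIT = frozenset(
        {("cell", x, y, "b") for x in (1, 2, 3) for y in (1, 2, 3)}
        | {("control", "xplayer")}
    )

    def legal_moves(state, player):
        # legal(P,mark(X,Y)) <= true(cell(X,Y,b)) ∧ true(control(P)); the other player may only noop.
        if ("control", player) in state:
            return [("mark", x, y)
                    for x in (1, 2, 3) for y in (1, 2, 3)
                    if ("cell", x, y, "b") in state]
        return [("noop",)]

    def next_state(state, moves):
        # moves is a joint move, one entry per role, applied synchronously as in GDL.
        mover = "xplayer" if ("control", "xplayer") in state else "oplayer"
        mark = "x" if mover == "xplayer" else "o"
        new = set()
        for fluent in state:
            if fluent[0] == "cell":
                _, x, y, m = fluent
                if m == "b" and moves[mover] == ("mark", x, y):
                    new.add(("cell", x, y, mark))        # the marked cell changes
                else:
                    new.add(fluent)                      # frame rules: every other cell persists
            elif fluent == ("control", "xplayer"):
                new.add(("control", "oplayer"))          # control alternates
            elif fluent == ("control", "oplayer"):
                new.add(("control", "xplayer"))
        return frozenset(new)

    def line(state, mark):
        rows = any(all(("cell", m, n, mark) in state for n in (1, 2, 3)) for m in (1, 2, 3))
        cols = any(all(("cell", m, n, mark) in state for m in (1, 2, 3)) for n in (1, 2, 3))
        diags = (all(("cell", i, i, mark) in state for i in (1, 2, 3))
                 or all(("cell", i, 4 - i, mark) in state for i in (1, 2, 3)))
        return rows or cols or diags

    def terminal(state):
        open_cell = any(f[0] == "cell" and f[3] == "b" for f in state)
        return line(state, "x") or line(state, "o") or not open_cell

    # Example: xplayer marks the centre while oplayer noops.
    s1 = next_state(INIT, {"xplayer": ("mark", 2, 2), "oplayer": ("noop",)})

A general game player obtains exactly this information for an arbitrary game by reasoning over the rules themselves rather than relying on hand-written code like this.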
Supporting Concepts.
  row(M,W)    <= true(cell(M,1,W)) ∧ true(cell(M,2,W)) ∧ true(cell(M,3,W))
  column(N,W) <= true(cell(1,N,W)) ∧ true(cell(2,N,W)) ∧ true(cell(3,N,W))
  diagonal(W) <= true(cell(1,1,W)) ∧ true(cell(2,2,W)) ∧ true(cell(3,3,W))
  diagonal(W) <= true(cell(1,3,W)) ∧ true(cell(2,2,W)) ∧ true(cell(3,1,W))

Goals.
  goal(xplayer,100) <= line(x)
  goal(xplayer,50)  <= ¬line(x) ∧ ¬line(o) ∧ ¬open
  goal(xplayer,0)   <= line(o)
  goal(oplayer,100) <= line(o)
  goal(oplayer,50)  <= ¬line(x) ∧ ¬line(o) ∧ ¬open
  goal(oplayer,0)   <= line(x)

Finite Games. A finite environment is a game "world" with finitely many states, one initial state and one or more terminal states, and a fixed finite number of players, each with finitely many "percepts" and "actions". The causal model is simple: the environment changes only in response to moves, and actions are synchronous.

Games as State Machines. (The slides illustrate this with state-machine diagrams: states a through k, a marked initial state and terminal states, and edges labelled with simultaneous joint actions such as a/a, a/b, b/a, b/b.)

Game Model. An n-player game is a structure with the following components:
  S                          a set of states
  A1, ..., An                n sets of actions, one for each player
  l1, ..., ln                the legality relations, with li ⊆ Ai × S
  u: S × A1 × ... × An → S   the update function
  s1 ∈ S                     the initial game state
  t ⊆ S                      the terminal states
  g1, ..., gn                the goal relations, with gi ⊆ S × ℕ

GDL for Trading Games: Example (English Auction).
  role(bidder_1)  ...  role(bidder_n)
  init(highestBid(0))
  init(round(0))
  legal(P,bid(X)) <= true(highestBid(Y)) ∧ greaterthan(X,Y)
  legal(P,noop)
  next(winner(P))     <= does(P,bid(X)) ∧ bestbid(X)
  next(highestBid(X)) <= does(P,bid(X)) ∧ bestbid(X)
  next(winner(P))     <= true(winner(P)) ∧ not bid
  next(highestBid(X)) <= true(highestBid(X)) ∧ not bid
  next(round(N))      <= true(round(M)) ∧ successor(M,N)
  bid        <= does(P,bid(X))
  bestbid(X) <= does(P,bid(X)) ∧ not overbid(X)
  overbid(X) <= does(P,bid(Y)) ∧ greaterthan(Y,X)
  terminal   <= true(round(10))

Try it Yourself: Play this Game!
  role(you)
  init(step(1))
  init(cell(1,onecoin))
  init(cell(Y,onecoin)) <= succ(X,Y)
  succ(1,2)  succ(2,3)  ...  succ(7,8)
  legal(you,jump(X,Y)) <= true(cell(X,onecoin)) ∧ true(cell(Y,onecoin)) ∧
                          (twobetween(X,Y) ∨ twobetween(Y,X))
  next(step(Y))           <= true(step(X)) ∧ succ(X,Y)
  next(cell(X,zerocoins)) <= does(you,jump(X,Y))
  next(cell(Y,twocoins))  <= does(you,jump(X,Y))
  next(cell(X,C)) <= does(you,jump(Y,Z)) ∧ true(cell(X,C)) ∧ distinct(X,Y) ∧ distinct(X,Z)
  zerobetween(X,Y) <= succ(X,Y)
  zerobetween(X,Y) <= succ(X,Z) ∧ true(cell(Z,zerocoins)) ∧ zerobetween(Z,Y)
  onebetween(X,Y)  <= succ(X,Z) ∧ true(cell(Z,zerocoins)) ∧ onebetween(Z,Y)
  onebetween(X,Y)  <= succ(X,Z) ∧ true(cell(Z,onecoin)) ∧ zerobetween(Z,Y)
  twobetween(X,Y)  <= succ(X,Z) ∧ true(cell(Z,zerocoins)) ∧ twobetween(Z,Y)
  twobetween(X,Y)  <= succ(X,Z) ∧ true(cell(Z,onecoin)) ∧ onebetween(Z,Y)
  twobetween(X,Y)  <= succ(X,Z) ∧ true(cell(Z,twocoins)) ∧ zerobetween(Z,Y)
  terminal    <= ¬continuable
  continuable <= legal(you,M)
  goal(you,100) <= true(step(5))
  goal(you,0)   <= true(cell(X,onecoin))

Automated Reasoning. Background: Reasoning about Actions. Game descriptions are a good example of knowledge representation with formal logic; the classical starting point here is McCarthy's Situation Calculus (1963).
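The roadmap lists "how to solve simple games" under Planning & Search, and the coin-jumping puzzle above is a single-player game small enough to be solved exhaustively. As a sketch only, under an assumed interface in which a reasoner exposes the legal, update, terminal and goal relations of the game model as Python functions (these names are invented here), a plain depth-first solver could look like this:

    from typing import Callable, Hashable, Iterable, Optional

    State = Hashable
    Move = Hashable

    def solve_single_player(
        initial: State,
        legal: Callable[[State], Iterable[Move]],     # legal moves in a state
        update: Callable[[State, Move], State],       # successor state
        terminal: Callable[[State], bool],            # terminal test
        goal: Callable[[State], int],                 # goal value of a terminal state (0..100)
        target: int = 100,
    ) -> Optional[list]:
        """Depth-first search for a move sequence reaching a terminal state with the target value."""
        def dfs(state, path, seen):
            if terminal(state):
                return list(path) if goal(state) == target else None
            for move in legal(state):
                successor = update(state, move)
                if successor in seen:
                    continue                          # prune states explored on another path
                seen.add(successor)
                plan = dfs(successor, path + [move], seen)
                if plan is not None:
                    return plan
            return None

        return dfs(initial, [], {initial})

Applied to the puzzle above, such a search would look for a sequence of four jumps that leaves no single coin, i.e. a terminal state with goal value 100.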
Recommended publications
  • Ai12-General-Game-Playing-Pre-Handout
    Artificial Intelligence 12. General Game Playing: One AI to Play All Games and Win Them All. Jana Koehler and Álvaro Torralba, Summer Term 2019. Thanks to Dr. Peter Kissmann for slide sources.
    Agenda: 1 Introduction; 2 The Game Description Language (GDL); 3 Playing General Games; 4 Learning Evaluation Functions: Alpha Zero; 5 Conclusion.
    Deep Blue versus Garry Kasparov (1997). Games that Deep Blue can play: 1. Chess.
    Chinook versus Marion Tinsley (1992). Games that Chinook can play: 1. Checkers.
    Games that a General Game Player can play: 1. Chess, 2. Checkers, 3. Chinese Checkers, 4. Connect Four, 5. Tic-Tac-Toe, 6. Othello, 7. Nine Men's Morris, 8. 15-Puzzle, 9. Nim, 10. Sudoku, 11. Pentago, 12. Blocker, 13. Breakthrough, 14. Lights Out, 15. Amazons, 16. Knightazons, 17. Blocksworld, 18. Zhadu, 19. Pancakes, 20. Quarto, 21. Knight's Tour, 22. n-Queens, 23. Blob Wars, 24. Bomberman (simplified), 25. Catch a Mouse, 26. Chomp, 27. Gomoku, 28. Hex, 29. Cubicup, 30. ...
  • Towards General Game-Playing Robots: Models, Architecture and Game Controller
    Towards General Game-Playing Robots: Models, Architecture and Game Controller. David Rajaratnam and Michael Thielscher, The University of New South Wales, Sydney, NSW 2052, Australia.
    Abstract. General Game Playing aims at AI systems that can understand the rules of new games and learn to play them effectively without human intervention. Our paper takes the first step towards general game-playing robots, which extend this capability to AI systems that play games in the real world. We develop a formal model for general games in physical environments and provide a systems architecture that allows the embedding of existing general game players as the "brain" and suitable robotic systems as the "body" of a general game-playing robot. We also report on an initial robot prototype that can understand the rules of arbitrary games and learns to play them in a fixed physical game environment.
    1 Introduction. General game playing is the attempt to create a new generation of AI systems that can understand the rules of new games and then learn to play these games without human intervention [5]. Unlike specialised systems such as the chess program Deep Blue, a general game player cannot rely on algorithms that have been designed in advance for specific games. Rather, it requires a form of general intelligence that enables the player to autonomously adapt to new and possibly radically different problems. General game-playing systems therefore are a quintessential example of a new generation of software that end users can customise for their own specific tasks and special needs.
  • Artificial General Intelligence in Games: Where Play Meets Design and User Experience
    ISSN 2186-7437. NII Shonan Meeting Report No. 130: Artificial General Intelligence in Games: Where Play Meets Design and User Experience. March 18-21, 2019, National Institute of Informatics, 2-1-2 Hitotsubashi, Chiyoda-Ku, Tokyo, Japan.
    Organizers: Ruck Thawonmas (Ritsumeikan University, Japan), Julian Togelius (New York University, USA), Georgios N. Yannakakis (University of Malta, Malta).
    Plain English summary (lay summary): Arguably the grand goal of artificial intelligence (AI) research is to produce machines that can solve multiple problems, not just one. Until recently, almost all research projects in the game AI field, however, have been very specific in that they focus on one particular way in which intelligence can be applied to video games. Most published work describes a particular method, or a comparison of two or more methods, for performing a single task in a single game. If an AI approach is only tested on a single task for a single game, how can we argue that such a practice advances the scientific study of AI? And how can we argue that it is a useful method for a game designer or developer, who is likely working on a completely different game than the method was tested on? This Shonan meeting aims to discuss three aspects of how to generalize AI in games: how to play any games, how to model any game players, and how to generate any games, plus their potential applications.
  • PAI2013 Popularize Artificial Intelligence
    Matteo Baldoni, Federico Chesani, Paola Mello, Marco Montali (Eds.): PAI2013, Popularize Artificial Intelligence. AI*IA National Workshop "Popularize Artificial Intelligence", held in conjunction with AI*IA 2013, Turin, Italy, December 5, 2013. Proceedings.
    PAI 2013 home page: http://aixia2013.i-learn.unito.it/course/view.php?id=4
    Copyright © 2013 for the individual papers by the papers' authors. Copying permitted for private and academic purposes. Re-publication of material from this volume requires permission by the copyright owners. Sponsoring Institutions: Associazione Italiana per l'Intelligenza Artificiale.
    Editors' addresses: University of Turin, DI - Dipartimento di Informatica, Corso Svizzera 185, 10149 Torino, Italy; University of Bologna, DISI - Dipartimento di Informatica - Scienza e Ingegneria, Viale Risorgimento 2, 40136 Bologna, Italy; Free University of Bozen-Bolzano, Piazza Domenicani 3, 39100 Bolzano, Italy.
    Preface. The 2nd Workshop on Popularize Artificial Intelligence (PAI 2013) follows the successful experience of the 1st edition, held in Rome 2012 to celebrate the 100th anniversary of Alan Turing's birth. It is organized as part of the XIII Conference of the Italian Association for Artificial Intelligence (AI*IA), to celebrate another important event, namely the 25th anniversary of AI*IA. In the same spirit of the first edition, PAI 2013 aims at divulging the practical uses of Artificial Intelligence among researchers, practitioners, teachers and students. 13 contributions were submitted, and accepted after a reviewing process that produced from 2 to 3 reviews per paper. Papers have been grouped into three main categories: student experiences inside AI courses (8 contributions), research and academic experiences (4 contributions), and industrial experiences (1 contribution).
  • From Chess and Atari to StarCraft and Beyond: How Game AI is Driving the World of AI (arXiv:2002.10433v1 [cs.AI], 24 Feb 2020)
    From Chess and Atari to StarCraft and Beyond: How Game AI is Driving the World of AI. Sebastian Risi and Mike Preuss.
    Abstract: This paper reviews the field of Game AI, which not only deals with creating agents that can play a certain game, but also with areas as diverse as creating game content automatically, game analytics, or player modelling. While Game AI was for a long time not very well recognized by the larger scientific community, it has established itself as a research area for developing and testing the most advanced forms of AI algorithms, and articles covering advances in mastering video games such as StarCraft 2 and Quake III appear in the most prestigious journals. Because of the growth of the field, a single review cannot cover it completely. Therefore, we put a focus on important recent developments, including that advances in Game AI are starting to be extended to areas outside of games, such as robotics or the synthesis of chemicals.
    From the introduction: the main arguments (e.g. [45]) have been these: first, by tackling game problems as comparably cheap, simplified representatives of real world tasks, we can improve AI algorithms much easier than by modeling reality ourselves; second, games resemble formalized (hugely simplified) models of reality, and by solving problems on these we learn how to solve problems in reality. Both arguments have at first nothing to do with games themselves but see them as modeling / benchmarking tools. In our view, they are more valid than ever.
  • Conference Program
    AAAI-07 / IAAI-07 Conference Program. Twenty-Second AAAI Conference on Artificial Intelligence (AAAI-07) and Nineteenth Conference on Innovative Applications of Artificial Intelligence (IAAI-07), July 22-26, 2007, Hyatt Regency Vancouver, Vancouver, British Columbia, Canada. Cover based on an original photograph courtesy of Tourism Vancouver.
    Sponsored by the Association for the Advancement of Artificial Intelligence. Cosponsored by the National Science Foundation, Microsoft Research, Michael Genesereth, Google, Inc., NASA Ames Research Center, Yahoo! Research Labs, Intel, USC/Information Sciences Institute, AICML/University of Alberta, ACM/SIGART, The Boeing Company, IISI/Cornell University, IBM Research, and the University of British Columbia.
    Tutorial Forum Cochairs: Carla Gomes (Cornell University), Andrea Danyluk (Williams College). Workshop Cochairs: Bob Givan (Purdue University), Simon Parsons (Brooklyn College, CUNY). Doctoral Consortium Chair and Cochair: Terran Lane (The University of New Mexico), Colleen van Lent (California State University Long Beach). Student Abstract and Poster Cochairs: Mehran Sahami (Google Inc.), Kiri Wagstaff (Jet Propulsion Laboratory), Matt Gaston (University of Maryland, Baltimore County). Intelligent Systems Demonstrations: ...
    Awards: All AAAI-07, IAAI-07, and AAAI Special Awards, and the IJCAI-JAIR Best Paper Prize will be presented Tuesday, July 24, 8:30-9:00 AM, in the Regency Ballroom on the Convention Level. AAAI-07 Awards presented by Robert C. Holte and Adele Howe, AAAI-07 program chairs. AAAI-07 Outstanding Paper Awards: PLOW: A Collaborative Task Learning Agent (James Allen, Nathanael Chambers, George Ferguson, Lucian Galescu, Hyuckchul ...).
    Contents: Acknowledgments / 2; AI Video Competition / 18; Awards / 2-3; Competitions / 18-19; Computer Poker Competition / 18; Conference at a Glance / 5; Doctoral Consortium / 4; Exhibition / 16; General Game Playing Competition / 18; General Information / 19-20; IAAI-07 Program / 8-13; Intelligent Systems Demos / 17; Invited Talks / 3, 7; Man vs. ...
  • Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm
    Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm. David Silver, Thomas Hubert, Julian Schrittwieser, Ioannis Antonoglou, Matthew Lai, Arthur Guez, Marc Lanctot, Laurent Sifre, Dharshan Kumaran, Thore Graepel, Timothy Lillicrap, Karen Simonyan, Demis Hassabis. DeepMind, 6 Pancras Square, London N1C 4AG. David Silver, Thomas Hubert and Julian Schrittwieser contributed equally to this work.
    Abstract: The game of chess is the most widely-studied domain in the history of artificial intelligence. The strongest programs are based on a combination of sophisticated search techniques, domain-specific adaptations, and handcrafted evaluation functions that have been refined by human experts over several decades. In contrast, the AlphaGo Zero program recently achieved superhuman performance in the game of Go by tabula rasa reinforcement learning from games of self-play. In this paper, we generalise this approach into a single AlphaZero algorithm that can achieve, tabula rasa, superhuman performance in many challenging games. Starting from random play, and given no domain knowledge except the game rules, AlphaZero achieved within 24 hours a superhuman level of play in the games of chess and shogi (Japanese chess) as well as Go, and convincingly defeated a world-champion program in each case.
    The study of computer chess is as old as computer science itself. Babbage, Turing, Shannon, and von Neumann devised hardware, algorithms and theory to analyse and play the game of chess. Chess subsequently became the grand challenge task for a generation of artificial intelligence researchers, culminating in high-performance computer chess programs that perform at superhuman level (9, 14).
  • General Game Playing and the Game Description Language
    General Game Playing and the Game Description Language. Abdallah Saffidine.
    In Artificial Intelligence (AI), General Game Playing (GGP) consists in constructing artificial agents that are able to act rationally in previously unseen environments. Typically, an agent is presented with a description of a new environment and is requested to select an action among a given set of legal actions. Since the programmers do not have prior access to the description of the environment, they cannot encode domain-specific knowledge in the agent, and the artificial agent must be able to infer, reason, and learn about its environment on its own.
    The class of environments an agent may face is called Multi-Agent Environments (MAEs); it features many interacting agents in a deterministic and discrete setting. The agents can take turns or act synchronously and can cooperate or have conflicting objectives. For instance, the Travelling Salesman Problem, Chess, the Prisoner's Dilemma, and six-player Chinese Checkers all fall within the MAE framework.
    The environment is described to the agents in the Game Description Language (GDL), a declarative and rule-based language related to DATALOG. A state of the MAE corresponds to a set of ground terms, and some distinguished predicates allow to compute the legal actions for each agent, the transition function through the effects of the actions, or the rewards each agent is awarded when a final state has been reached. State-of-the-art GGP players currently process GDL through a reduction to PROLOG and using a PROLOG interpreter. As a result, the raw speed of GGP engines is several orders of magnitude slower than that of an engine specialised to one type of environment (e.g. ...
  • Rolling Horizon NEAT for General Video Game Playing
    Rolling Horizon NEAT for General Video Game Playing. Diego Perez-Liebana, Muhammad Sajid Alam, Raluca D. Gaina (Game AI Group, Queen Mary University of London, London, UK).
    Abstract: This paper presents a new Statistical Forward Planning (SFP) method, Rolling Horizon NeuroEvolution of Augmenting Topologies (rhNEAT). Unlike traditional Rolling Horizon Evolution, where an evolutionary algorithm is in charge of evolving a sequence of actions, rhNEAT evolves weights and connections of a neural network in real-time, planning several steps ahead before returning an action to execute in the game. Different versions of the algorithm are explored in a collection of 20 GVGAI games, and compared with other SFP methods and state of the art results. Although results are overall not better than other SFP methods, the nature of rhNEAT to adapt to changing game features has allowed to establish new state of ...
    We present rhNEAT, an algorithm that takes concepts from the existing NeuroEvolution of Augmenting Topologies (NEAT) and RHEA. We analyze different variants of the algorithm and compare its performance with state of the art results. A second contribution of this paper is to start opening a line of research that explores alternative representations for Rolling Horizon Evolutionary Algorithms: while traditionally RHEA evolves a sequence of actions, rhNEAT evolves neural network weights and configurations, which in turn generate action sequences used to evaluate individual fitness.
  • Automatic Construction of a Heuristic Search Function for General Game Playing
    Automatic Construction of a Heuristic Search Function for General Game Playing. Stephan Schiffel and Michael Thielscher, Department of Computer Science, Dresden University of Technology, {stephan.schiffel,mit}@inf.tu-dresden.de. Student Paper.
    Abstract: General Game Playing (GGP) is the art of designing programs that are capable of playing previously unknown games of a wide variety by being told nothing but the rules of the game. This is in contrast to traditional computer game players like Deep Blue, which are designed for a particular game and can't adapt automatically to modifications of the rules, let alone play completely different games. General Game Playing is intended to foster the development of integrated cognitive information processing technology. In this article we present an approach to General Game Playing using a novel way of automatically constructing a search heuristics from a formal game description. Our system is being tested with a wide range of different games.
    Our focus is on techniques for constructing a search heuristics by the automated analysis of game specifications. More specifically, we describe the following functionalities of a complete General Game Player: 1. Determining legal moves and their effects from a formal game description requires reasoning about actions. We use the Fluent Calculus and its Prolog-based implementation FLUX [Thielscher, 2005]. 2. To search a game tree, we use non-uniform depth-first search with iterative deepening and general pruning techniques. 3. Games which cannot be fully searched require automatically constructing evaluation functions from formal game specifications. We give the details of a method ...
  • Simulation-Based Approach to General Game Playing
    Proceedings of the Twenty-Third AAAI Conference on Artificial Intelligence (2008). Simulation-Based Approach to General Game Playing. Hilmar Finnsson and Yngvi Björnsson, School of Computer Science, Reykjavík University, Iceland.
    Abstract: The aim of General Game Playing (GGP) is to create intelligent agents that automatically learn how to play many different games at an expert level without any human intervention. The most successful GGP agents in the past have used traditional game-tree search combined with an automatically learned heuristic function for evaluating game states. In this paper we describe a GGP agent that instead uses a Monte Carlo/UCT simulation technique for action selection, an approach recently popularized in computer Go. Our GGP agent has proven its effectiveness by winning last year's AAAI GGP Competition. Furthermore, we introduce and empirically evaluate a new scheme for automatically learning search-control knowledge for guiding the simulation playouts, showing that it offers significant benefits for a variety of games.
    There is an inherent risk with that approach though. In practice, because of how disparate the games and their playing strategies can be, the pre-chosen set of generic features may fail to capture some essential game properties. On top of that, the relative importance of the features can often be only roughly approximated because of strict online time constraints. Consequently, the resulting heuristic evaluations may become highly inaccurate and, in the worst case, even strive for the wrong objectives. Such heuristics are of little (or even decremental) value for lookahead search. In here we describe a simulation-based approach to general game playing that does not require any a priori domain knowledge. It is based on Monte-Carlo (MC) simulations and the Upper Confidence-bounds applied to Trees (UCT) algorithm for guiding the simulation playouts (Koc- ...
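    The Monte-Carlo/UCT technique described here chooses which move to sample next by balancing average playout reward against an exploration bonus. A minimal single-agent sketch of that selection principle, assuming a game interface with legal, update, terminal and goal functions (names invented for the example, not taken from the paper):

        import math
        import random

        def ucb1_choice(stats, c=1.4):
            # stats maps move -> (visits, total_reward); unvisited moves are tried first.
            parent_visits = sum(n for n, _ in stats.values())
            def score(item):
                n, total = item[1]
                if n == 0:
                    return float("inf")
                return total / n + c * math.sqrt(math.log(parent_visits) / n)
            return max(stats.items(), key=score)[0]

        def monte_carlo_move(state, legal, update, terminal, goal, playouts=1000):
            # Evaluate each legal move by random playouts, picking which move to sample via UCB1.
            stats = {m: (0, 0.0) for m in legal(state)}
            for _ in range(playouts):
                move = ucb1_choice(stats)
                s = update(state, move)
                while not terminal(s):
                    s = update(s, random.choice(list(legal(s))))
                n, total = stats[move]
                stats[move] = (n + 1, total + goal(s))
            return max(stats, key=lambda m: stats[m][0])   # play the most-sampled move

    Full UCT additionally grows a search tree and applies the same selection rule at every node it visits; the sketch above corresponds to the flat, one-level case.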
  • A General Reinforcement Learning Algorithm That Masters Chess, Shogi and Go Through Self-Play
    A general reinforcement learning algorithm that masters chess, shogi and Go through self-play. David Silver, Thomas Hubert, Julian Schrittwieser, Ioannis Antonoglou, Matthew Lai, Arthur Guez, Marc Lanctot, Laurent Sifre, Dharshan Kumaran, Thore Graepel, Timothy Lillicrap, Karen Simonyan, Demis Hassabis. DeepMind, 6 Pancras Square, London N1C 4AG; University College London, Gower Street, London WC1E 6BT. David Silver, Thomas Hubert and Julian Schrittwieser contributed equally to this work.
    Abstract: The game of chess is the longest-studied domain in the history of artificial intelligence. The strongest programs are based on a combination of sophisticated search techniques, domain-specific adaptations, and handcrafted evaluation functions that have been refined by human experts over several decades. By contrast, the AlphaGo Zero program recently achieved superhuman performance in the game of Go by reinforcement learning from self-play. In this paper, we generalize this approach into a single AlphaZero algorithm that can achieve superhuman performance in many challenging games. Starting from random play and given no domain knowledge except the game rules, AlphaZero convincingly defeated a world champion program in the games of chess and shogi (Japanese chess) as well as Go.
    The study of computer chess is as old as computer science itself. Charles Babbage, Alan Turing, Claude Shannon, and John von Neumann devised hardware, algorithms and theory to analyse and play the game of chess. Chess subsequently became a grand challenge task for a generation of artificial intelligence researchers, culminating in high-performance computer chess programs that play at a super-human level (1, 2).
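    The core idea the abstract describes, improving an evaluation purely from the outcomes of self-play games, can be illustrated in a drastically simplified tabular form. The sketch below assumes a small two-player, turn-taking, zero-sum game given by initial, legal, update, terminal and score functions (invented names); it is a toy stand-in, not the AlphaZero algorithm, which couples a deep network with Monte-Carlo tree search.

        import random
        from collections import defaultdict

        def self_play_values(initial, legal, update, terminal, score,
                             episodes=5000, eps=0.1, alpha=0.1):
            # Learn V(state) from the perspective of the player to move, purely from self-play outcomes.
            value = defaultdict(float)
            for _ in range(episodes):
                trajectory = [initial]
                state = initial
                while not terminal(state):
                    moves = list(legal(state))
                    if random.random() < eps:
                        move = random.choice(moves)          # occasional exploration
                    else:
                        # The opponent moves next, so prefer successors that look bad for them.
                        move = min(moves, key=lambda m: value[update(state, m)])
                    state = update(state, move)
                    trajectory.append(state)
                outcome = score(state)                        # e.g. +1 / 0 / -1 for the first player
                sign = 1 if len(trajectory) % 2 == 1 else -1  # side to move in the final state
                for s in reversed(trajectory):
                    value[s] += alpha * (sign * outcome - value[s])
                    sign = -sign
            return value

    AlphaZero replaces the value table with a deep network, the one-ply greedy choice with a full tree search, and the plain outcome backup with network training on search-improved targets.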