Heuristic Evaluation Functions for General Game Playing

University of California
Los Angeles

A dissertation submitted in partial satisfaction of the requirements for the degree Doctor of Philosophy in Computer Science

by

James Edmond Clune III

2008

© Copyright by James Edmond Clune III 2008

The dissertation of James Edmond Clune III is approved.

Adnan Y. Darwiche
Thomas S. Ferguson
Alan C. Kay
Todd D. Millstein
Richard E. Korf, Committee Chair

University of California, Los Angeles
2008

Table of Contents

1 Introduction
  1.1 General Game Playing: The Problem
  1.2 The Importance of General Game Playing
    1.2.1 Competitive Performance Metric
    1.2.2 Flexible Software Capabilities
    1.2.3 Game-Oriented Programming
    1.2.4 Generality as a Key Aspect of Intelligence
    1.2.5 A Timely Problem
  1.3 Overview of the Project
  1.4 Overview of the Dissertation
2 Philosophy and Vision
  2.1 The Ubiquity of Game Models
  2.2 Game-Oriented Programming
3 Literature Review
  3.1 General Game Playing
    3.1.1 Barney Pell’s Metagamer
    3.1.2 AAAI General Game Playing Competition
    3.1.3 Other General Game Playing Systems
  3.2 Automated Planning
    3.2.1 Classical Planning
    3.2.2 International Planning Competition
    3.2.3 Planning Techniques
  3.3 Discovery Systems
    3.3.1 Feature Discovery
    3.3.2 AM and Eurisko
    3.3.3 Learning Heuristic Evaluation Functions
4 General Game Playing Framework
  4.1 Game Description Language (GDL)
  4.2 GGP Protocol
  4.3 AAAI GGP Competition
5 Abstract-Model Based Heuristic Evaluation Functions
  5.1 Overview
  5.2 Feature Identification
    5.2.1 Candidate Expressions
    5.2.2 Expression Interpretations
    5.2.3 Stability
  5.3 Abstract Model
    5.3.1 Payoff
    5.3.2 Mobility
    5.3.3 Termination
  5.4 Heuristic Evaluation Function
  5.5 Anytime Algorithm
  5.6 Use of Evaluation Function in Game-Play
  5.7 Evaluation Function Results
    5.7.1 Racetrack Corridor
    5.7.2 Othello
    5.7.3 Chess
    5.7.4 Chinese Checkers
6 Techniques Specific to Single-Player Games
  6.1 Motivation and Overview
  6.2 Heuristic Evaluation Function Construction
  6.3 Search Algorithms
    6.3.1 Uninformed Search
    6.3.2 Informed Search
  6.4 Algorithm Composition
  6.5 Summary
7 Rollout-Based Monte Carlo Methods
  7.1 Introduction
  7.2 Use of Heuristic Evaluation Functions
  7.3 Use of Action Heuristics
  7.4 Automatic Construction of Action Heuristics
8 Alpha-Beta Minimax versus Monte Carlo Methods
  8.1 Overview
  8.2 Randomly Generated Synthetic Games
  8.3 Experiments
  8.4 Results
  8.5 Real Games
9 Empirical Results: AAAI GGP Competitions
  9.1 First Annual GGP Competition
  9.2 Second Annual GGP Competition
  9.3 Third Annual GGP Competition
  9.4 Summary
10 Discussion and Conclusions
  10.1 Summary
  10.2 Discussion and Future Work
  10.3 Conclusion
A Interpreting Heuristic Evaluation Functions
  A.1 Chess
  A.2 Chinese Checkers
  A.3 Othello
B Engineering Considerations
  B.1 Reasoning Module
  B.2 Multi-Processor Utilization
References

List of Figures

4.1 A State in Tic-Tac-Toe
5.1 Overall Evaluation Function
5.2 Racetrack Corridor (initial position)
5.3 Racetrack Corridor (with some walls placed)
5.4 Chinese Checkers
6.1 Flow-Chart for Solving Single-Player Problems
8.1 Two-Player Games from AAAI Tournaments
8.2 Results by Branching Factor

List of Tables

5.1 Abstract Model Parameters
8.1 Zero-Sum Games
8.2 Varying Time Per Move
8.3 Nonzero-Sum Games
8.4 Games from AAAI Tournaments
9.1 AAAI 2006 GGP Competition Leaderboard
9.2 AAAI 2007 GGP Competition: Results of Preliminary Rounds

Acknowledgments

I am indebted to my adviser, Rich Korf, for his advice and encouragement. He was supportive from the project’s conception and remained so throughout. Many technical discussions with Rich contributed to this work by improving my understanding of central issues. He read multiple writeups and provided insightful comments and constructive criticism. He steered me in helpful directions, yet gave me freedom to pursue my own ideas. He also sponsored me as a research assistant, which helped me finish this dissertation.

I would like to thank Michael Genesereth for sponsoring the AAAI general game playing competition, without which this work would not have been possible. Also, thanks to the Stanford General Game Playing group including Nat Love, Eric Schkufza, David Haley, and Tim Hinrichs. I’d like to also thank the other competitors in the AAAI general game playing tournaments for open discussions amidst friendly competition.

Thanks to Alan Kay for encouraging discussions, especially in helping broaden my vision beyond “games” in the traditional sense. The philosophical issues explored in Chapter 2, particularly the view of game-oriented programming languages, came largely from interactions with Alan.

I’m grateful to the past and present UCLA AI grads who have helped me through numerous discussions and proofreading, particularly Alex Dow and Alex Fukunaga.

Thanks to Barney Pell for an encouraging and helpful discussion early in the project.

Abstract of the Dissertation

by James Edmond Clune III
Doctor of Philosophy in Computer Science
University of California, Los Angeles, 2008
Professor Richard E. Korf, Chair

A general game-playing program plays games that it has not previously encountered.
A game manager program sends the game-playing programs a description of a game’s rules and objectives in a game description language. The game-playing programs compete by sending messages over a network indicating their moves until the game is completed. The class of games covered is intentionally broad, including games of one or more players with alternating or simultaneous moves, with arbitrary numeric payoffs.

This research explores the problem of constructing an effective general game-playing program, with an emphasis on techniques for automatically constructing effective heuristic evaluation functions from game descriptions. A technique based on abstract models of games is presented. The abstract model treats mobility, payoff and termination as the most salient elements of a game. Each of these aspects is quantified in terms of stable features. Evidence is presented that the technique produces heuristic evaluation functions that are both comprehensible and effective.

Empirical work includes a series of general game-playing programs that placed first or second for the three consecutive years of the AAAI General Game-Playing Competition.

CHAPTER 1
Introduction

1.1 General Game Playing: The Problem

The idea of general game playing (GGP) is to create a computer program that effectively plays games that it has not previously encountered. A game manager program sends the game playing programs a description of a game in a well-defined game description language. The description specifies the goal of the game, the legal moves, the initial game state, and the termination conditions. The game manager also sends information about what role the program will play (black or white, naughts or crosses, etc.), a start time (time allowed for pre-game analysis), and a move time (time allowed per move once game play begins). The game playing programs compete by sending messages over a network indicating their moves until the game is completed.
The class of games covered is intentionally broad, including games of one or more players with alternating or simultaneous moves, with arbitrary numeric payoffs.

The immediate goal of the research is to develop techniques that allow us to create GGP programs that win games. The techniques are implemented and their effectiveness is evaluated empirically by competing against programs embodying alternative techniques.

My work on general game playing techniques emphasizes heuristic evaluation functions. These are functions from game states to numbers used to assess the desirability of non-terminal states for particular players. In my opinion, automatic construction of effective heuristic evaluation functions from game descriptions is the central challenge of general game playing. However, it is a subject that is best dealt with in the context of a complete general game player employing some type of game-tree search. In recognition of this, search techniques are covered in this dissertation as well.

1.2 The Importance of General Game Playing

Here I describe several reasons why GGP is an important research problem.

1.2.1 Competitive Performance Metric

In his doctoral dissertation [Pel93], Barney Pell argues that a primary reason for computer game playing research has been the competitive performance metric for intelligence, namely the presumed link between winning games and intelligent behavior. Pell describes a problem with this presumption:

    Unfortunately, the use of such a link has proved problematic: we have been able to produce strong programs for some games through specialized engineering methods, the extreme case being special-purpose hardware, and through analysis of the games by humans instead of by programs themselves. Consequently, it now appears that increased understanding and automation of intelligent processing is neither necessary nor sufficient for strong performance in game-playing. That is, it appears that we can construct strong game-playing programs without doing much of interest from an AI perspective, and conversely, we can make significant advances in AI that do not result in strong game-playing programs.
