AAAI'08 Tutorial

General Game Playing

Michael Thielscher, Dresden

Some of the material presented in this tutorial originates in work by Michael Genesereth and the Stanford Logic Group. We greatly appreciate their contribution.

The Turk (18th century)
Alan Turing & Claude Shannon (~1950)
Deep-Blue beats the World Champion (1997)

In the early days, game-playing machines were considered a key to Artificial Intelligence (AI).

But chess computers are highly specialized systems. Deep-Blue's intelligence was limited. It couldn't even play a decent game of Tic-Tac-Toe or Rock-Paper-Scissors.

General Game Playing revives many of the original expectations associated with game-playing machines.

Traditional research on game playing focuses on
- constructing specific evaluation functions
- building libraries for specific games
The intelligence lies with the programmer, not with the program!

A General Game Player is a system that
- understands formal descriptions of arbitrary strategy games
- learns to play these games well without human intervention

A General Game Player needs to exhibit much broader intelligence:
- abstract thinking
- strategic planning
- learning

General Game Playing and AI

Rather than being concerned with a specialized solution to a narrow problem, General Game Playing encompasses a variety of AI areas:

Game playing vs. general agents:
- Knowledge Representation: deterministic, complete-information games vs. competitive environments
- Planning and Search: nondeterministic, partially observable games vs. uncertain environments
- Learning: partially unknown rules vs. unknown environment models
- Robotic players vs. real-world environments

General Game Playing is considered a grand AI Challenge

Application (1)

Commercially available chess computers can't be used for a game of Bughouse Chess. An adaptable game computer would allow the user to modify the rules for arbitrary variants of a game.

Application (2): Economics

A General Game Playing system could be used for negotiations, marketing strategies, pricing, etc. It can easily be adapted to changes in the business processes and rules, to new competitors, etc. The rules of an e-marketplace can be formalized as a game, so that agents can automatically learn how to participate.

Example Games

- Single-Player, Deterministic
- Two-Player, Zero-Sum, Deterministic
- Two-Player, Zero-Sum, Nondeterministic
- n-Player, Deterministic
- n-Player, Incomplete Information, Nondeterministic

General Game Playing Initiative

games.stanford.edu
- The Game Description Language (GDL)
- Variety of games / actual matches
- Basic player available for download
- Annual world cup @AAAI (since 2005), prize money US$ 10,000

Roadmap
- Game description language GDL: Knowledge Representation
- How to make legal moves: Automated Reasoning
- How to solve simple games: Planning & Search
- How to play well: Learning
(deterministic games with complete information only)

Every finite game can be modeled as a state transition system
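Such a state transition system can be sketched directly in code. The following is a minimal Python illustration; the field names and the toy one-player game are invented for this sketch and are not part of GDL:

```python
# A minimal sketch of a finite game as a state transition system.
from dataclasses import dataclass
from typing import Callable, FrozenSet, List, Tuple

@dataclass
class Game:
    roles: List[str]                               # players
    initial: str                                   # initial state
    terminal: FrozenSet[str]                       # terminal states
    legal: Callable[[str, str], List[str]]         # role, state -> moves
    update: Callable[[str, Tuple[str, ...]], str]  # state, joint move -> state
    goal: Callable[[str, str], int]                # role, terminal state -> value

# A toy one-player game: walk from state "a" to "c".
toy = Game(
    roles=["you"],
    initial="a",
    terminal=frozenset({"c"}),
    legal=lambda role, s: ["step"] if s != "c" else [],
    update=lambda s, jm: {"a": "b", "b": "c"}[s],
    goal=lambda role, s: 100,
)

# Play the toy game to its terminal state.
state = toy.initial
while state not in toy.terminal:
    move = toy.legal("you", state)[0]
    state = toy.update(state, (move,))
print(state, toy.goal("you", state))  # -> c 100
```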

Game Description Language

But a direct encoding is impossible in practice:
- Tic-Tac-Toe: 19,683 states
- Chess: ~10^43 legal positions

Modular State Representation: Fluents

Fluent Representation for Chess
- cell(X,Y,C), X ∈ {a,...,h}, Y ∈ {1,...,8}, C ∈ {whiteKing,...,blank}
- canCastle(P,S), P ∈ {white,black}, S ∈ {kingsSide,queensSide}
- enPassant(C), C ∈ {a,...,h}
- control(P), P ∈ {white,black}

Actions

Actions for Chess:
- move(U,V,X,Y), U,X ∈ {a,...,h}, V,Y ∈ {1,...,8}
- promote(X,Y,P), X,Y ∈ {a,...,h}, P ∈ {whiteQueen,...}

Game Rules (I)
- Players: roles([white,black])
- Initial position: init(cell(a,1,whiteRook)), ...
- Legal moves: legal(white,promote(X,Y,P)) <= true(cell(X,7,whitePawn)) ∧ ...

Game Rules (II)
- Position updates: next(cell(X,Y,C)) <= does(P,move(U,V,X,Y)) ∧ true(cell(U,V,C))
- End of game: terminal <= checkmate ∨ stalemate
- Result: goal(white,100) <= true(control(black)) ∧ checkmate
  goal(white,50) <= stalemate

Clausal Logic
- Variables: X, Y, Z; Constants: a, b, c; Functions: f, g, h; Predicates: p, q, r, =
- Logical operators: ¬, ∧, ∨, <=
- Terms: X, Y, Z, a, b, c, f(a), g(a,X), h(a,b,f(Y))
- Atoms: p(a,b); Literals: p(a,b), ¬q(X,f(a))
- Clauses: Head <= Body, where the Head is a relational sentence and the Body is a logical sentence built from ∧, ∨, ¬, and literals

Game-Independent Vocabulary

Relations:
- roles(list-of(player))
- init(fluent)
- true(fluent)
- does(player,move)
- next(fluent)
- legal(player,move)
- goal(player,value)
- terminal

Axiomatizing Tic-Tac-Toe: Fluents
- cell(X,Y,M), X,Y ∈ {1,2,3}, M ∈ {x,o,b}
- control(P), P ∈ {xplayer,oplayer}

Axiomatizing Tic-Tac-Toe: Actions
- mark(X,Y), X,Y ∈ {1,2,3}
- noop

Tic-Tac-Toe: Vocabulary
- Constants: xplayer, oplayer (players); x, o, b (marks)
- Functions: cell(number,number,mark) (fluent), control(player) (fluent), mark(number,number) (action)
- Predicates: row(number,mark), column(number,mark), diagonal(mark), line(mark), open

Players and Initial Position

roles([xplayer,oplayer])
init(cell(1,1,b))  init(cell(1,2,b))  init(cell(1,3,b))
init(cell(2,1,b))  init(cell(2,2,b))  init(cell(2,3,b))
init(cell(3,1,b))  init(cell(3,2,b))  init(cell(3,3,b))
init(control(xplayer))

Preconditions

legal(P,mark(X,Y)) <= true(cell(X,Y,b)) ∧ true(control(P))
legal(xplayer,noop) <= true(cell(X,Y,b)) ∧ true(control(oplayer))
legal(oplayer,noop) <= true(cell(X,Y,b)) ∧ true(control(xplayer))

Update

next(cell(M,N,x)) <= does(xplayer,mark(M,N))
next(cell(M,N,o)) <= does(oplayer,mark(M,N))
next(cell(M,N,W)) <= true(cell(M,N,W)) ∧ ¬W=b
next(cell(M,N,b)) <= true(cell(M,N,b)) ∧ does(P,mark(J,K)) ∧ (¬M=J ∨ ¬N=K)
next(control(xplayer)) <= true(control(oplayer))
next(control(oplayer)) <= true(control(xplayer))

Termination

terminal <= line(x) ∨ line(o)
terminal <= ¬open
line(W) <= row(M,W)
line(W) <= column(N,W)
line(W) <= diagonal(W)
open <= true(cell(M,N,b))
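These clauses translate almost mechanically into code. Below is a Python sketch in which a state is the set of its true fluents, as in GDL; the function names are invented, and only the clauses shown above (plus the line/terminal predicates) are implemented:

```python
def init_state():
    cells = {("cell", x, y, "b") for x in (1, 2, 3) for y in (1, 2, 3)}
    return frozenset(cells | {("control", "xplayer")})

def legal(state, role):
    # legal(P,mark(X,Y)) <= true(cell(X,Y,b)) and true(control(P)); else noop
    if ("control", role) in state:
        return [("mark", f[1], f[2]) for f in state
                if f[0] == "cell" and f[3] == "b"]
    return [("noop",)]

def next_state(state, moves):
    # moves: dict role -> move; implements the next(...) clauses above
    mover = "xplayer" if ("control", "xplayer") in state else "oplayer"
    symbol = "x" if mover == "xplayer" else "o"
    _, j, k = moves[mover]
    new = set()
    for f in state:
        if f[0] == "cell":
            m, n = f[1], f[2]
            new.add(("cell", m, n, symbol) if (m, n) == (j, k) else f)
    new.add(("control", "oplayer" if mover == "xplayer" else "xplayer"))
    return frozenset(new)

def line(state, w):
    return (any(all(("cell", m, n, w) in state for n in (1, 2, 3)) for m in (1, 2, 3))
            or any(all(("cell", m, n, w) in state for m in (1, 2, 3)) for n in (1, 2, 3))
            or all(("cell", i, i, w) in state for i in (1, 2, 3))
            or all(("cell", i, 4 - i, w) in state for i in (1, 2, 3)))

def terminal(state):
    open_ = any(f[0] == "cell" and f[3] == "b" for f in state)
    return line(state, "x") or line(state, "o") or not open_

s = init_state()
s = next_state(s, {"xplayer": ("mark", 2, 2), "oplayer": ("noop",)})
assert ("cell", 2, 2, "x") in s and ("control", "oplayer") in s
```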

Supporting Concepts

row(M,W) <= true(cell(M,1,W)) ∧ true(cell(M,2,W)) ∧ true(cell(M,3,W))
column(N,W) <= true(cell(1,N,W)) ∧ true(cell(2,N,W)) ∧ true(cell(3,N,W))
diagonal(W) <= true(cell(1,1,W)) ∧ true(cell(2,2,W)) ∧ true(cell(3,3,W))
diagonal(W) <= true(cell(1,3,W)) ∧ true(cell(2,2,W)) ∧ true(cell(3,1,W))

Goals

goal(xplayer,100) <= line(x)
goal(xplayer,50) <= ¬line(x) ∧ ¬line(o) ∧ ¬open
goal(xplayer,0) <= line(o)
goal(oplayer,100) <= line(o)
goal(oplayer,50) <= ¬line(x) ∧ ¬line(o) ∧ ¬open
goal(oplayer,0) <= line(x)

Finite Games

Games as State Machines

Finite Environment:
- Game "world" with finitely many states
- One initial state and one or more terminal states
- Fixed finite number of players, each with finitely many "percepts" and "actions", and each with one or more goal states

Causal Model:
- Environment changes only in response to moves
- Synchronous actions

Initial State and Terminal States

Simultaneous Actions

(State-transition diagrams over states a-k, with single-action labels and simultaneous joint-action labels such as a/a, a/b, b/a, b/b.)

Game Model

An n-player game is a structure with the following components:
- S – a set of states
- A1, ..., An – n sets of actions, one for each player
- l1, ..., ln – where li ⊆ Ai × S, the legality relations
- u: S × A1 × ... × An → S – the update function
- s1 ∈ S – the initial game state
- t ⊆ S – the terminal states
- g1, ..., gn – where gi ⊆ S × ℕ, the goal relations

GDL for Trading Games: Example (English Auction)

role(bidder_1) ... role(bidder_n)
init(highestBid(0))
init(round(0))
legal(P,bid(X)) <= true(highestBid(Y)) ∧ greaterthan(X,Y)
legal(P,noop)
next(winner(P)) <= does(P,bid(X)) ∧ bestbid(X)
next(highestBid(X)) <= does(P,bid(X)) ∧ bestbid(X)
next(winner(P)) <= true(winner(P)) ∧ ¬bid
next(highestBid(X)) <= true(highestBid(X)) ∧ ¬bid
next(round(N)) <= true(round(M)) ∧ successor(M,N)
bid <= does(P,bid(X))
bestbid(X) <= does(P,bid(X)) ∧ ¬overbid(X)
overbid(X) <= does(P,bid(Y)) ∧ greaterthan(Y,X)
terminal <= true(round(10))

Try it Yourself: Play this Game!

role(you)
init(step(1))
init(cell(1,onecoin))
init(cell(Y,onecoin)) <= succ(X,Y)
succ(1,2)  succ(2,3)  ...  succ(7,8)

legal(you,jump(X,Y)) <= true(cell(X,onecoin)) ∧ true(cell(Y,onecoin)) ∧
                        (twobetween(X,Y) ∨ twobetween(Y,X))

next(step(Y)) <= true(step(X)) ∧ succ(X,Y)
next(cell(X,zerocoins)) <= does(you,jump(X,Y))
next(cell(Y,twocoins)) <= does(you,jump(X,Y))
next(cell(X,C)) <= does(you,jump(Y,Z)) ∧ true(cell(X,C)) ∧
                   distinct(X,Y) ∧ distinct(X,Z)

terminal <= ¬continuable
continuable <= legal(you,M)
goal(you,100) <= true(step(5))
goal(you,0) <= true(cell(X,onecoin))

zerobetween(X,Y) <= succ(X,Y)
zerobetween(X,Y) <= succ(X,Z) ∧ true(cell(Z,zerocoins)) ∧ zerobetween(Z,Y)
onebetween(X,Y) <= succ(X,Z) ∧ true(cell(Z,zerocoins)) ∧ onebetween(Z,Y)
onebetween(X,Y) <= succ(X,Z) ∧ true(cell(Z,onecoin)) ∧ zerobetween(Z,Y)
twobetween(X,Y) <= succ(X,Z) ∧ true(cell(Z,zerocoins)) ∧ twobetween(Z,Y)
twobetween(X,Y) <= succ(X,Z) ∧ true(cell(Z,onecoin)) ∧ onebetween(Z,Y)
twobetween(X,Y) <= succ(X,Z) ∧ true(cell(Z,twocoins)) ∧ zerobetween(Z,Y)

Automated Reasoning

Background: Reasoning about Actions

Game descriptions are a good example of knowledge representation with formal logic. Automated reasoning about actions is necessary to
- determine legal moves
- update positions
- recognize the end of the game

McCarthy's Situation Calculus (1963): the initial situation s0 and its successors do(A1,s0), ..., do(An,s0), ..., do(Aj,do(Ai,s0)), ...

Reasoning about Actions using Situations

The Frame Problem

Effect axioms:
(∀S)(∀M,N) cell(M,N,x,do(xplayer,mark(M,N),S))

The Frame Problem (McCarthy & Hayes, 1969) arises because mere effect axioms do not suffice to infer non-effects. How does cell(2,2,o,s) imply cell(2,2,o,do(xplayer,mark(3,3),s))?

A frame axiom for Tic-Tac-Toe:
(∀S)(∀...) cell(M,N,W,do(P,mark(J,K),S)) <= cell(M,N,W,S) ∧ (¬M=J ∨ ¬N=K)

Compare this to the GDL axioms:
next(cell(M,N,W)) <= true(cell(M,N,W)) ∧ ¬W=b
next(cell(M,N,b)) <= true(cell(M,N,b)) ∧ does(P,mark(J,K)) ∧ (¬M=J ∨ ¬N=K)

In a domain with m actions and n fluents, on the order of n·m frame axioms are needed.

Successor State Axioms

"If AI can be said to have a classic problem, then the Frame Problem is it. Like all good open problems it is subtle, challenging, and it has led to significant new technical and conceptual developments in the field." (Reiter, 1991)

A successor state axiom (Reiter, 1991) for every fluent Φ avoids extra frame axioms:

(∀P,A,S) Φ(do(P,A,S)) <=> Φ+ ∨ (Φ(S) ∧ ¬Φ-)

where Φ+ collects the reasons for Φ to become true and Φ- the reasons for Φ to become false.

Successor State Axioms for Tic-Tac-Toe:

(∀P,A,S)(∀...) cell(M,N,W,do(P,A,S)) <=>
      W=x ∧ P=xplayer ∧ A=mark(M,N)
    ∨ W=o ∧ P=oplayer ∧ A=mark(M,N)
    ∨ cell(M,N,W,S) ∧ ¬A=mark(M,N)

(∀P,A,S)(∀R) control(R,do(P,A,S)) <=>
      R=xplayer ∧ control(oplayer,S)
    ∨ R=oplayer ∧ control(xplayer,S)

The Computational Frame Problem

Fluent Calculus

Even with one successor state axiom per fluent, updating a state with fluents F1, ..., F10 requires evaluating an axiom for every fluent in every situation S0, S1, S2, S3, ...

A state update axiom (Thielscher, 1999) for every action avoids separate update axioms for every fluent:

  Δ1(S) → state(do(P,α,S)) = state(S) - θ1- + θ1+
∧ ...
∧ Δk(S) → state(do(P,α,S)) = state(S) - θk- + θk+

where θ+ are the fluents that become true and θ- the fluents that become false (the subtraction z - θ- and addition z + θ+ are axiomatically defined).

Game Tree Search (General Concept)

Planning and Search

Breadth-First Search vs. Depth-First Search

Example tree: root a with children b, c, d; b has children e, f; c has g, h; d has i, j.
- Breadth-first visit order: a b c d e f g h i j
- Depth-first visit order: a b e f c g h d i j

Breadth-First:
- Advantage: finds the shortest solution
- Disadvantage: consumes a large amount of space

Depth-First:
- Advantage: small intermediate storage
- Disadvantage: susceptible to garden paths
- Disadvantage: susceptible to infinite loops

Time and Space Comparison

Iterative Deepening

Worst case for search depth d, solution at depth k.

Time (binary / branching factor b):
- Depth-First: 2^d − 2^(d−k)   /   (b^d − b^(d−k)) / (b − 1)
- Breadth-First: 2^k − 1   /   (b^k − 1) / (b − 1)

Iterative deepening: run depth-limited search repeatedly,
- starting with a small initial depth d,
- incrementing the depth on each iteration (d := d + 1),
- until success or until running out of alternatives.
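The procedure can be sketched in a few lines of Python over the example tree from this section (a with children b, c, d, etc.); the function names are invented:

```python
TREE = {"a": ["b", "c", "d"], "b": ["e", "f"], "c": ["g", "h"], "d": ["i", "j"]}

def depth_limited(node, goal, limit, visited):
    visited.append(node)                       # record the visit order
    if node == goal:
        return True
    if limit == 0:
        return False
    return any(depth_limited(child, goal, limit - 1, visited)
               for child in TREE.get(node, []))

def iterative_deepening(start, goal, max_depth=10):
    for limit in range(max_depth + 1):         # d := 0, 1, 2, ...
        visited = []
        if depth_limited(start, goal, limit, visited):
            return visited                     # nodes visited in the last iteration
    return None

print(iterative_deepening("a", "h"))  # -> ['a', 'b', 'e', 'f', 'c', 'g', 'h']
```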

Space (binary / branching factor b):
- Depth-First: d   /   (b − 1)(d − 1) + 1
- Breadth-First: 2^(k−1)   /   b^(k−1)

Example Time Comparison

Worst case for branching factor 2:

Depth | Iterative Deepening | Depth-First
1     | 1                   | 1
2     | 4                   | 3
3     | 11                  | 7
4     | 26                  | 15
5     | 57                  | 31
n     | 2^(n+1) − n − 2     | 2^n − 1

Iterative deepening on the example tree:
d = 1: a
d = 2: a b c d
d = 3: a b e f c g h d i j

Iterative deepening combines the advantages:
- small intermediate storage
- finds the shortest solution
- not susceptible to garden paths
- not susceptible to infinite loops

Theorem: The cost of iterative deepening search is b/(b−1) times the cost of depth-first search (where b is the branching factor).

Game Rules

legal(P,mark(X,Y)) <= true(cell(X,Y,b)) ∧ true(control(P))
next(cell(M,N,x)) <= does(xplayer,mark(M,N))
next(cell(M,N,W)) <= true(cell(M,N,W)) ∧ ¬W=b
terminal <= line(x) ∨ line(o)
goal(xplayer,100) <= line(x)

Basic Subroutines for Search

function legals(role, node)
    findall(X, legal(role,X), node.position ∪ gamerules)
function simulate(node, moves)
    findall(true(P), next(P), node.position ∪ moves ∪ gamerules)
function terminal(node)
    prove(terminal, node.position ∪ gamerules)
function goal(role, node)
    findone(X, goal(role,X), node.position ∪ gamerules)

A General Architecture

Game Description → Compiled Theory → Reasoner, which provides the Move List, State Update, and Termination & Goal to the Search.

Node Expansion (Single-Player Games)

function expand(node)
begin
    al := [];
    for a in legals(role,node) do
        data := simulate(node, {does(role,a)});
        new := create_node(data);
        al := {(a,new)} ∪ al
    end-for;
    return al
end

Best Move (Single-Player Games)

State-Space Search with Multiple Players

function bestmove(node)
begin
    max := 0;
    best := head(node.actionlist);
    for a in node.actionlist do
        score := maxscore(a.new.alist);
        if score = 100 then return a;
        if score > max then max := score; best := a end-if
    end-for;
    return best
end

function maxscore(alist)
    % returns the best score among the alist actions

(Diagram: state machine over states s1-s4 and e-k with simultaneous-move labels omitted.)

Single-Player Game Graph

Multiple-Player Game Graph

(Diagrams: a single-player game graph over states s1-s4 with moves a, b; the corresponding multiple-player game graph with joint moves aa, ab, ba, bb; and its bipartite version.)

Bipartite Game Graph

Move Lists

Simple move list:
[(a,s2),(b,s3)]

Multiple-player move list:
[([a,a],s2),([a,b],s1),([b,a],s3),([b,b],s4)]

Bipartite move list:
[(a,[([a,a],s2),([a,b],s1)]), (b,[([b,a],s3),([b,b],s4)])]

Multiple-Player Node Expansion

function expand(node)
begin
    al := [];
    for a in legals(role,node) do
        jl := [];
        for j in joints(role,a,node) do
            data := simulate(node, jointactions(j));
            new := create_node(data);
            jl := {(j,new)} ∪ jl
        end-for;
        al := {(a,jl)} ∪ al
    end-for;
    return al
end

function joints(role,action)
    % returns the combinatorial list of all legal joint actions where role does action
function jointactions(j)
    % returns the set of does atoms for joint action j

Best Move

function bestmove(node)
begin
    max := 0;
    (best,jl) := head(node.alist);
    for (a,jl) in node.alist do
        score := minscore(jl);
        if score = 100 then return a;
        if score > max then max := score; best := a end-if
    end-for;
    return best
end

Note: This makes the paranoid assumption that the other players make the joint move that is most harmful for us.

Minimax for Two-Person Zero-Sum Games

The α-β-Principle: β-Cutoffs

(Minimax example: a max node over three min nodes with leaf values 75 40 50 / 80 40 60 / 35 20 10; the min values are 40, 40, 10, so the max value is 40. With β-cutoffs, the remaining leaves of a min node are pruned as soon as its value can no longer exceed the best max value found so far.)

The α-β-Principle: α- and β-Cutoffs

State Collapse

(α-β search example with the α- and β-values at the inner nodes omitted.)

- The game tree for Tic-Tac-Toe has approximately 700,000 nodes, but there are only approximately 5,000 distinct states.
- Searching the tree requires 140 times more work than searching the graph.
- Recognizing a repeated state takes time that varies with the size of the graph seen thus far.
- Solution: transposition tables
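As an illustration, here is a small Python sketch of minimax with α-β cutoffs and a transposition table, run on the example tree above (leaf values 75 40 50 / 80 40 60 / 35 20 10). Caching is simplified for the sketch: a real player must distinguish exact values from bounds when combining cutoffs with stored values.

```python
import math

def alphabeta(node, maximizing, alpha=-math.inf, beta=math.inf, table=None):
    """node is either a leaf value (int) or a tuple of child nodes."""
    if table is None:
        table = {}
    if isinstance(node, int):
        return node
    if (node, maximizing) in table:      # transposition table: collapse repeats
        return table[(node, maximizing)]
    value = -math.inf if maximizing else math.inf
    for child in node:
        score = alphabeta(child, not maximizing, alpha, beta, table)
        if maximizing:
            value = max(value, score)
            alpha = max(alpha, value)
        else:
            value = min(value, score)
            beta = min(beta, value)
        if beta <= alpha:                # alpha-/beta-cutoff
            break
    table[(node, maximizing)] = value
    return value

tree = ((75, 40, 50), (80, 40, 60), (35, 20, 10))
print(alphabeta(tree, True))  # -> 40
```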

Symmetry

Reflectional Symmetry

Symmetries can be logically derived from the rules of a game.

A symmetry relation over the elements of a domain is an equivalence relation such that two symmetric states are
- either both terminal or both non-terminal
- if they are terminal, they have the same goal value
- if they are non-terminal, the legal moves in each of them are symmetric and yield symmetric states

Connect-3

Rotational Symmetry

Factoring: Example

Hodgepodge = Chess + Othello
- Branching factor of Chess: a; branching factor of Othello: b
- Branching factor of Hodgepodge as given to the players: a · b
- Fringe of the tree at depth n as given: (a · b)^n
- Fringe of the tree at depth n, factored: a^n + b^n

Other factorable examples: Capture Go, Double Tic-Tac-Toe.

Game Factoring and its Use

A set Φ of fluents and moves is a behavioral factor if and only if there are no connections between the fluents and moves in Φ and those outside of Φ.

1. Compute factors (behavioral factoring, goal factoring)
2. Play the factors
3. Reassemble the solution (append plans, interleave plans, or parallelize plans with simultaneous actions)

Double Tic-Tac-Toe:
- Branching factor: 81, 64, 49, 36, 25, 16, 9, 4, 1
- Branching factor (factored): 9, 8, 7, 6, 5, 4, 3, 2, 1 (times 2)

Competition vs. Cooperation

Mathematical Game Theory: Strategies

The "paranoid" assumption says that opponents choose the joint move that is most harmful for us. This is usually too pessimistic for games other than zero-sum games and for games with n > 2 players: a rational opponent chooses the move that is best for him rather than the one that is worst for us. Moreover, from a game-theoretic point of view it is incorrect to model simultaneous moves as a sequence of our move followed by the joint moves of our opponents. (Example: Rock-Paper-Scissors)

Recall the game model: S (states), A1, ..., An (one set of actions per player), l1, ..., ln with li ⊆ Ai × S (legality relations), g1, ..., gn with gi ⊆ S × ℕ (goal relations).

A strategy xi for player i maps every state to a legal move for i:
xi: S → Ai such that (xi(S),S) ∈ li

(Remark: The set of strategies is always finite in a finite game. However, there are more strategies in Chess than atoms in the universe.)

Games in Normal Form

An n-player game in normal form is an (n+1)-tuple
Γ = (X1, ..., Xn, u)
where Xi is the set of strategies for player i and
u = (u1, ..., un): X1 × ... × Xn → ℝ^n
gives the utilities of the players for each n-tuple of strategies.

(Remark: Each n-tuple of strategies directly determines the outcome of a match, even if the match consists of sequences of moves.)

Equilibria

Let Γ = (X1, ..., Xn, u) be an n-player game. (x1*, ..., xn*) is an equilibrium if for all i = 1, ..., n and all xi ∈ Xi:

ui(x1*, ..., xi-1*, xi, xi+1*, ..., xn*) ≤ ui(x1*, ..., xn*)

An equilibrium is a tuple of optimal strategies: no player has a reason to deviate from his or her strategy, given the opponents' strategies.
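The equilibrium condition can be checked mechanically. The Python sketch below (invented helper names; Matching Pennies as the example game, which famously has no pure-strategy equilibrium) tests whether a pure-strategy profile is an equilibrium:

```python
from itertools import product

def is_equilibrium(strategies, profile, u):
    """strategies: list of strategy sets; profile: tuple; u(i, profile) -> payoff."""
    for i in range(len(profile)):
        for xi in strategies[i]:
            deviation = profile[:i] + (xi,) + profile[i + 1:]
            if u(i, deviation) > u(i, profile):
                return False                 # player i would deviate
    return True

# Matching Pennies: player 0 wants a match, player 1 wants a mismatch.
def u(i, prof):
    match = prof[0] == prof[1]
    return (1 if match else 0) if i == 0 else (0 if match else 1)

strategy_sets = [["heads", "tails"], ["heads", "tails"]]
pure = [p for p in product(["heads", "tails"], repeat=2)
        if is_equilibrium(strategy_sets, p, u)]
print(pure)  # -> []
```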

Dominance

Dominance: Example

A strategy x ∈ Xi dominates a strategy y ∈ Xi if

ui(x1, ..., xi-1, x, xi+1, ..., xn) ≥ ui(x1, ..., xi-1, y, xi+1, ..., xn)

for all (x1, ..., xi-1, xi+1, ..., xn) ∈ X1 × ... × Xi-1 × Xi+1 × ... × Xn.

A strategy x ∈ Xi strongly dominates a strategy y ∈ Xi if x dominates y and y does not dominate x.

Assume that opponents are rational: they do not choose a strongly dominated strategy.

Example: both players have strategies {a, b, c, d, e}, with the goal values given by a 5×5 bimatrix (each cell lists both players' payoffs; matrix omitted). Iteratively removing strongly dominated strategies eliminates rows and columns in turn, until only rows a, c and columns b, c remain.
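Iterated elimination of strongly dominated pure strategies can be sketched as follows in Python; the payoff matrices here are the Prisoner's Dilemma, not the 5×5 example from the slides, and the function names are invented:

```python
def dominated(payoff, own, other):
    """Return one of `own` strongly dominated by another (payoff[x][o] lookups)."""
    for y in own:
        for x in own:
            if x != y and all(payoff[x][o] >= payoff[y][o] for o in other) \
                      and any(payoff[x][o] > payoff[y][o] for o in other):
                return y
    return None

def eliminate(u1, u2, rows, cols):
    rows, cols = set(rows), set(cols)
    changed = True
    while changed:
        changed = False
        r = dominated(u1, rows, cols)
        if r is not None:
            rows.discard(r)
            changed = True
        # player 2's strategies are the columns: transpose the lookup
        u2t = {c: {r: u2[r][c] for r in rows} for c in cols}
        c = dominated(u2t, cols, rows)
        if c is not None:
            cols.discard(c)
            changed = True
    return rows, cols

# Prisoner's Dilemma: "d" (defect) strongly dominates "c" (cooperate).
u1 = {"c": {"c": 3, "d": 0}, "d": {"c": 5, "d": 1}}
u2 = {"c": {"c": 3, "d": 5}, "d": {"c": 0, "d": 1}}
print(eliminate(u1, u2, ["c", "d"], ["c", "d"]))  # -> ({'d'}, {'d'})
```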

Dominance: Example (ctd)

(The intermediate elimination steps on the 5×5 bimatrix omitted.)

Game Tree Search with Dominance

(Game tree with simultaneous moves and payoff pairs (75,25), (40,40), (50,30), (80,40), (40,40), (60,50), (35,60), (20,60), (10,50) at the leaves; Player 1 chooses among the subgame values (40,40), (60,50), (20,60), so eliminating dominated moves yields (60,50).)

The α-β-Principle does not Apply

(In the same tree, knowing that the first subtree is worth (40,40) does not bound the values of the remaining subtrees, so no cutoff is possible.)

Mixed Strategies

Let (X1, ..., Xn, u) be an n-player game. Its mixed extension is

Γ = (P1, ..., Pn, (e1, ..., en))

where for each i = 1, ..., n

Pi = {pi : pi is a probability measure over Xi}

and for each (p1, ..., pn) ∈ P1 × ... × Pn

ei(p1, ..., pn) = Σ_{x1 ∈ X1} ... Σ_{xn ∈ Xn} ui(x1, ..., xn) · p1(x1) · ... · pn(xn)

Nash's Theorem: Every mixed extension of an n-player game has at least one equilibrium.

Iterated Row Dominance for Mixed Strategies Iterated Row Dominance for Mixed Strategies (ctd)

Let a zero-sum game be given by a b c a b c a 10 0 8 a 10 0 8 b 6 4 4 b 6 4 4 c 3 8 7 c 3 8 7 ' ( ' ( Now p = 1 , 1 , 0 dominates p ' = (0,0,1). 1 ,0 , 1 2 2 2 2 Then p1 = 2 2 dominates p1' = (0,1,0). ¥ Hence, for all (pa', pb', pc') P1 with pb' > 0 there exists ¥ a dominating strategy (pa, 0, pc) P1. Iterated Row Dominance for Mixed Strategies (ctd)

a b c a 10 0 8 Learning c 3 8 7 ) * ' ( ' ( 1 ,0 , 2 , 1 , 1 ,0 . The unique equilibrium is 3 3 2 2
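The two dominance claims can be verified numerically; a quick Python check, reading the matrix above as payoffs with the row player maximizing and the column player minimizing:

```python
A = {"a": {"a": 10, "b": 0, "c": 8},
     "b": {"a": 6,  "b": 4, "c": 4},
     "c": {"a": 3,  "b": 8, "c": 7}}

cols = ["a", "b", "c"]

# p1 = (1/2, 0, 1/2) dominates the pure row b:
mix_row = {c: 0.5 * A["a"][c] + 0.5 * A["c"][c] for c in cols}
assert all(mix_row[c] >= A["b"][c] for c in cols)   # 6.5, 4.0, 7.5 vs 6, 4, 4

# After removing row b, p2 = (1/2, 1/2, 0) dominates column c
# (the column player minimizes, so dominance means <=):
rows = ["a", "c"]
mix_col = {r: 0.5 * A[r]["a"] + 0.5 * A[r]["b"] for r in rows}
assert all(mix_col[r] <= A[r]["c"] for r in rows)   # 5.0, 5.5 vs 8, 7
print("dominance claims hold")
```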

Roadmap

Complete vs. Incomplete Search

Heuristics:
- Detecting Structures
- Generating Evaluation Functions
- The Viking Method

Simple games like Tic-Tac-Toe and Rock-Paper-Scissors can be searched completely. "Real" games like Peg Jumping, Chinese Checkers, or Chess cannot.

Incomplete Search: Towards Good Play

Besides efficient inference and search algorithms, the ability to automatically generate a good evaluation function distinguishes good from bad General Game Playing programs.

Existing approaches:
- Mobility and Novelty heuristics
- Structure Detection with Fuzzy Goal Evaluation
- The Viking Method: Monte-Carlo Tree Search

Incomplete search requires automatically generated evaluation functions.

Mobility

More moves means a better state.

Advantage: in many games, being cornered or forced into making a move is quite bad.
- In Chess, having fewer moves means having fewer pieces, pieces of lower value, or less control of the board.
- In Chess, when you are in check, you can do relatively few things compared to when you are not in check.
- In Othello, having few moves means you have little control of the board.

Disadvantage: mobility is bad for some games.

(Constructing an Evaluation Function: World Cup 2006, Cluneplayer vs. Fluxplayer)

Inverse Mobility

Having fewer things to do is better. This works in some games, like Nothello and Suicide Chess, where you might in fact want to lose pieces. But how do we decide between the mobility and inverse-mobility heuristics?

Novelty

Changing the game state is better.

Advantage:
- Changing things as much as possible can help avoid getting stuck.
- When it is unclear what to do, maybe the best thing is to throw in some directed randomness.

Disadvantage:
- "Changing the game state" can also mean throwing away your own pieces.
- It is unclear whether novelty per se leads anywhere useful.

Designing Evaluation Functions

Evaluation functions are typically designed by programmers: a great deal of thought and empirical testing goes into choosing one or more good functions, e.g. piece count and piece values in Chess, or holding corners in Othello. But this requires knowledge of the game's structure, semantics, play order, etc.

Identifying Domains

Identifying Structures: Relations

The domains of fluents are identified by a dependency graph, e.g. from

init(step(0))
next(step(X)) <= true(step(Y)) ∧ succ(Y,X)
succ(0,1)  succ(1,2)  succ(2,3)

the domain {0, 1, 2, 3} is derived for step/1 (via the arguments of succ).

A successor relation is a binary relation that is antisymmetric, functional, and injective.
Examples: succ(1,2), succ(2,3), succ(3,4), ...; next(a,b), next(b,c), next(c,d), ...

An order relation is a binary relation that is antisymmetric and transitive.
Example:
lessthan(A,B) <= succ(A,B)
lessthan(A,C) <= succ(A,B) ∧ lessthan(B,C)
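Testing the successor-relation properties on the ground facts of a game description is straightforward; a Python sketch (the function name is invented):

```python
def is_successor_relation(pairs):
    """Check antisymmetry, functionality, and injectivity of a set of pairs."""
    pairs = set(pairs)
    antisymmetric = all((y, x) not in pairs for (x, y) in pairs if x != y)
    functional = len({x for x, _ in pairs}) == len(pairs)  # one successor each
    injective = len({y for _, y in pairs}) == len(pairs)   # one predecessor each
    return antisymmetric and functional and injective

print(is_successor_relation({(0, 1), (1, 2), (2, 3)}))  # -> True
print(is_successor_relation({(0, 1), (0, 2)}))          # -> False: not functional
```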

Boards and Pieces

An (m-dimensional) board is an n-ary fluent (n ≥ m+1) with
- m arguments whose domains are successor relations
- 1 output argument
Example: cell(a,1,whiterook), cell(b,1,whiteknight), ...

A marker is an element of the domain of a board's output argument. A piece is a marker which is in at most one board cell at a time.
Examples: the pebbles in Othello, the White King in Chess.

Fuzzy Goal Evaluation: Example

goal(xplayer,100) <= line(x)
line(P) <= row(P) ∨ col(P) ∨ diag(P)

Value of an intermediate state = the degree to which it satisfies the goal.

Full Goal Specification

goal(xplayer,100) <= line(x)
line(P) <= row(P) ∨ col(P) ∨ diag(P)
row(P) <= true(cell(1,Y,P)) ∧ true(cell(2,Y,P)) ∧ true(cell(3,Y,P))
col(P) <= true(cell(X,1,P)) ∧ true(cell(X,2,P)) ∧ true(cell(X,3,P))
diag(P) <= true(cell(1,1,P)) ∧ true(cell(2,2,P)) ∧ true(cell(3,3,P))
diag(P) <= true(cell(3,1,P)) ∧ true(cell(2,2,P)) ∧ true(cell(1,3,P))

After Unfolding

goal(x,100) <=   true(cell(1,Y,x)) ∧ true(cell(2,Y,x)) ∧ true(cell(3,Y,x))
               ∨ true(cell(X,1,x)) ∧ true(cell(X,2,x)) ∧ true(cell(X,3,x))
               ∨ true(cell(1,1,x)) ∧ true(cell(2,2,x)) ∧ true(cell(3,3,x))
               ∨ true(cell(3,1,x)) ∧ true(cell(2,2,x)) ∧ true(cell(1,3,x))

- 3 literals are true after does(x,mark(1,1))
- 2 literals are true after does(x,mark(1,2))
- 4 literals are true after does(x,mark(2,2))

Evaluating Goal Formulas (cont'd)

Degree to which f(x,a) is true, given that f(x,b) holds:

(1 − p) − (1 − p) · Δ(b,a) / |dom(f(x))|

Our t-norms are instances of the Yager family (with parameter q):

S(a,b) = (a^q + b^q)^(1/q)
T(a,b) = 1 − S(1 − a, 1 − b)

Evaluation function for formulas:

eval(f ∧ g) = T'(eval(f), eval(g))
eval(f ∨ g) = S'(eval(f), eval(g))
eval(¬f) = 1 − eval(f)

Example (a board with cells (e,5), (f,10), (j,5)): with p = 0.9, eval(cell(green,e,5)) is
- 0.082 if true(cell(green,f,10))
- 0.085 if true(cell(green,j,5))

Advanced Fuzzy Goal Evaluation: Example

Identifying Metrics
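A Python sketch of this evaluation scheme, using the Yager-family norms with the usual cap at 1; the formula representation and the atom degrees are invented for illustration:

```python
def S(a, b, q=2.0):                      # t-conorm, used for disjunction
    return min(1.0, (a**q + b**q) ** (1.0 / q))

def T(a, b, q=2.0):                      # t-norm, used for conjunction
    return 1.0 - S(1.0 - a, 1.0 - b, q)

def eval_formula(f, values, q=2.0):
    """f is ('and', f1, f2), ('or', f1, f2), ('not', f1), or an atom key."""
    op = f[0] if isinstance(f, tuple) else None
    if op == "and":
        return T(eval_formula(f[1], values, q), eval_formula(f[2], values, q), q)
    if op == "or":
        return S(eval_formula(f[1], values, q), eval_formula(f[2], values, q), q)
    if op == "not":
        return 1.0 - eval_formula(f[1], values, q)
    return values[f]

# degree of "p and (q or not r)" given fuzzy degrees for the atoms
v = {"p": 0.9, "q": 0.3, "r": 0.8}
score = eval_formula(("and", "p", ("or", "q", ("not", "r"))), v)
assert 0.0 <= score <= 1.0
```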

Order relations are binary, antisymmetric, functional, and injective, e.g.

succ(1,2). succ(2,3). succ(3,4). file(a,b). file(b,c). file(c,d).

Order relations define a metric Δ on functional features, e.g.

Δ(cell(green,j,13), cell(green,e,5)) = 13

Truth degree of a goal literal = (distance to the current value)^(−1)

Example:
init(cell(green,j,13))
goal(green,100) <= true(cell(green,e,5)) ∧ ...

A General Architecture (with evaluation)

Game Description → Compiled Theory → Reasoner, which provides the Move List, State Update, Termination & Goal, and an Evaluation Function to the Search.

Assessment

An Alternative Approach: The Viking Method

Fuzzy goal evaluation works particularly well for games with
- independent sub-goals
- convergence to the goal (15-Puzzle, Chinese Checkers, Othello)
- a quantitative goal
- partial goals (Peg Jumping, Chinese Checkers with more than 2 players)

The Viking Method, Monte-Carlo tree search, is used by Cadiaplayer (Reykjavik University).

Monte Carlo Tree Search

- Play one random game for each move.
- Value of a move = the average score returned by the simulations.

Confidence Bounds

For the next simulation, choose the move

argmax_i ( v_i + C · sqrt(log n / n_i) )

where v_i is the average score of move i so far, n_i the number of simulations of move i, n the total number of simulations, and the square-root term the confidence bound.
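The selection rule can be sketched as follows in Python; the statistics are the example numbers from the tree diagram, and the constant C is a tunable assumption:

```python
import math

def select(stats, C=1.4):
    """stats: list of (v_i, n_i); pick argmax of v_i + C*sqrt(log n / n_i)."""
    n = sum(ni for _, ni in stats)
    def ucb(item):
        v, ni = item[1]
        return v + C * math.sqrt(math.log(n) / ni)
    return max(enumerate(stats), key=ucb)[0]

stats = [(20, 4), (65, 24), (80, 32)]   # (average score, simulations) per move
print(select(stats))  # -> 2
```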

(Tree diagram with example visit counts and average values, e.g. n1 = 4, v1 = 20; n2 = 24, v2 = 65; n3 = 32, v3 = 80, omitted.)

Assessment

Monte Carlo Tree Search works particularly well for games which
- converge to the goal (e.g. Checkers)
- reward greedy behavior
- have a large branching factor
- do not admit a good heuristic

The World Cup

Game Master

The Game Master communicates with the players (Player 1, Player 2, ..., Player n):
1. It sends the game description, each player's role, the time to think (e.g. 1,800 sec), and the time per move (e.g. 45 sec).
2. "Start"
3. "Your move, please"
4. The players reply with their individual moves.
5. The Game Master broadcasts the joint move; steps 3-5 repeat until the end of the game.
6. End of game.

1st World Championship 2005 in Pittsburgh
1. UCLA (Clune)
2. Florida
3. Fluxplayer, UT Austin

2nd World Championship 2006 in Boston: Final Leaderboard

Player           | Points
1. Fluxplayer    | 2690.75
2. UCLA          | 2573.75
3. UT Austin     | 2370.50
4. Florida       | 1948.25
5. TU Dresden II | 1575.00

3rd World Championship 2007 in Vancouver: Final Leaderboard

Player        | Points
1. Reykjavik  | 2724
2. Fluxplayer | 2356
3. Paris      | 2253
4. UCLA       | 2122
5. UT Austin  | 1798

The GGP Challenge

Summary

Much like RoboCup, General Game Playing
- combines a variety of AI areas
- fosters developmental research
- has great public appeal
- has the potential to significantly advance AI

In contrast to RoboCup, GGP has the advantage that it
- focuses on high-level intelligence
- has low entry cost
- makes a great hands-on course for AI students

A Vision for GGP

- Uncertainty: nondeterministic games with incomplete information
- Natural Language Understanding: the rules of a game given in natural language
- Vision: a system that sees the board, pieces, cards, rule book, ...
- A robot playing the actual, physical game

Resources

- Stanford GGP initiative: games.stanford.edu (GDL specification, basic player)
- GGP in Germany: general-game-playing.de (game master)
- Palamedes: palamedes-ide.sourceforge.net (GGP/GDL development tool)

Recommended Papers

- J. Clune: Heuristic evaluation functions for general game playing. AAAI 2007.
- H. Finnsson, Y. Björnsson: Simulation-based approach to general game playing. AAAI 2008.
- M. Genesereth, N. Love, B. Pell: General game playing. AI Magazine 26(2), 2005.
- G. Kuhlmann, K. Dresner, P. Stone: Automatic heuristic construction in a complete general game player. AAAI 2006.
- S. Schiffel, M. Thielscher: Fluxplayer: a successful general game player. AAAI 2007.