Playing "Invisible " with Information-Theoretic Advisors

A.E. Bud, D.W. Albrecht, A.E. Nicholson and I. Zukerman {bud,dwa,annn,ingrid}@csse.monash, edu. au School of Computer Science and Software Engineering, Monash University Clayton, Victoria 3800, AUSTRALIA phone: +61 3 9905-5225 fax: +61 3 9905-5146

From: AAAI Technical Report SS-01-03. Compilation copyright © 2001, AAAI (www.aaai.org). All rights reserved.

Abstract history of use in the AI communitybecause of its strategic complexity, and well-studied and understood properties. Makingdecisions under uncertainty remains one of the cen- Despite these advantages, chess has two signi cant - tral problemsin AI research. Unfortunately, most uncertain real-world problemsare so complexthat any progress in them backs as a general AI testbed: the rst is the success of is extremelydif cult. Gamesmodel some elements of the real players that use hard-coded, domain specif- world, and offer a morecontrolled environmentfor exploring ic rules and strategies to play; and the second is the fact that methodsfor dealing with uncertainty. Chess and chess-like standard chess is a perfect information domain. gameshave long been used as a strategically complextest- A number of researchers have tackled the rst of these draw- bed for general AI research, and we extend that tradition by backs. For example (Berliner 1974) investigated generalised introducing an imperfect information variant of chess with strategies used in chess play as a model for problem solving, someuseful properties such as the ability to scale the amount of uncertainty in the game.We discuss the complexityof this and (Pell 1993) introduced metagame. Metagame is a sys- gamewhich we call invisible chess, and present results outlin- tem for generating new games arbitrarily generated from a ing the basic values of invisible pieces in this game.We mo- set of games known as Symmetric Chess-Like games (SCL tivate and describe the implementationand application of two Games). Thus there is no point hand-training a computer information-theoreticadvisors that assist a player of invisible program to play one particular game, as each game requires chess to control the uncertainty in the game.We describe our different strategic planning and position evaluation. A good decision-theoretic approachto combiningthese information- computer metagame player has to somehow encapsulate a theoretic advisors with a basic strategic advisor. Finally we higher level of "strategic knowledge"than is possible in a discuss promisingpreliminary results that wehave obtained single gamesuch as chess. with these advisors. In this paper, we address the second drawback of chess as a general AI testbed -- that of perfect information. Wede- 1 Introduction scribe a missing (or imperfect) information variant of stan- dard chess which we call invisible chess (Bud et al. 1999).1 Making decisions under uncertainty remains one of the cen- Invisible chess involves a con gurable numberof invisible tral problems in AI research. An agent in an uncertain world pieces, i.e., pieces that a player’s opponent cannot see. In- needs to select actions from the action search space -- the visible chess is thus a representative of the general class set of all possible actions in that world. As the uncertain- of strategically complex, imperfect information, two play- ty increases, this task can becomeincreasingly dif cult. The er, zero-sum games. numberof possible actions may increase, the numberof pos- Manyresearchers have investigated games with missing in- sible situations in which those actions may be applied may formation including poker ((Findler 1977), (Korb, Nichol- increase, or both. The effects of these growing search spaces son, & Jitnah 1999), (Koller &Pfeffer 1997)), bridge ((Gam- are ampli ed as the agent tries to search further ahead. Each biick, Rayner, & Pell 1991), (Ginsberg 1999), (Smith, action on each possible world state requires more possible & Throop 1996)), and multi-user domains (Albrecht et al. world states to be evaluated for future moves. This property 1997). With the exception of Albrecht et al., whouse a large of imperfect information domains makes tackling real-world uncontrollable domain, all of these domainsare strategically problems extremely dif cult. simple given perfect information. Games and game theory model some of these properties On the other hand, invisible chess retains all of the strategic of real-world situations in a more controlled environment complexity of standard chess with the addition of a con- and thereby allow analysis and empirical testing of decision- trollable element of missing information. Invisible chess making strategies in these domains. In this capacity, games have long been used as a testbed for general arti cial intel- ligence research and ideas. In particular, chess has a long 1Invisible chess is in the set of Invisible SCLGames, an exten- sion to the set of SCLGames introduced by Pell. Wehave chosen Copyright(~) 2001, AmericanAssociation for Arti cial Intelli- invisible chess as the rst gameto explore becauseof the availabil- gence (www.aaai.org).All rights reserved. ity of standard chess programs.

Nolicense: PDFproduced by PStin (c) F, Siegert - http://www.this.net/~frank/pstill.html is related to ((Li 1994) and (Ciancarini, Dalla- ions do not tend to occur in a signi cant numberof nodes Libera, & Maran1997)), a chess variant in which all the in the gametree, 2 and therefore partition search is not like- opponent’spieces are invisible and a third party referee de- ly to be very effective on any variants of chess. Morere- termines whetheror not each moveis valid. cently (Ginsberg 1999) included MonteCarlo methods In addition to the complexityof playing standard chess, a his bridge playing programto simulate manypossible out- player of invisible chess mustmaintain the possible posit- comes,choosing the action with the highest expectedutility ions of the opponent’sinvisible pieces. These positions may over the simulations. Ginsbergclaims that trying to glean be representedas a probability distribution over the possible or hide information from an opponentis probably not use- squares on the chess board. Section 4 describes our design ful for bridge. In contrast, wepresent results that showthat that enables a central moduleto maintain approximationsto both informationhiding and gleaning can be useful in invis- the invisible piece probability distributions for both players. ible chess. Section 5 presents a brief analysis of the advantagesof play- (Frank & Basin 1998) and (Frank & Basin 2000) have ing with various invisible pieces. Interestingly, we show formeda detailed investigation of search in imperfectinfor- that the relative value of the minorinvisible pieces differs mation games. They have concentrated almost exclusive- from the standard ranking. Wereport on the ly on bridge. Frank & Basin focus on a numberof search uncertainty causedby the different invisible pieces and the techniques including attening the gametree which they effect that those pieces have on the opponent’sability to have shownto be an NP-Completeproblem, and MonteCar- play strategically. Section 6 discusses the relationship be- lo methods. Their investigation suggests that MonteCar- tween uncertainty and the information-theoretic concept of lo methodsare not appropriate for imperfect information entropy,and the applicationof entropyto our results. It also games.This problemis exacerbated in invisible chess given motivates the design of information-theoreticadvisors, de- the relative lack of symmetryand the size of the gametree. scribes these advisors and presents results that demonstrate In the closest workto that presentedhere, (Ciancarini, Dalla- the ef cac y of our approach.Section 7 contains conclusions Libera, & Maran 1997) considered and end and ideas for further work. gamesin the gameof kriegspiel. Kriegspiel is an existing chess variant where neither player can see the opponent’s 2 Related Work pieces or moves. They used a game-theoretic approach in- volving substantive rationality and have shownpromising Since the 1950s, when Shannonand Turing designed the results in this trivial versionof kriegspiel. Thusfar they have rst chess playing programs (Russell & Norvig 1995), not applied their approachto a completegame of kriegspiel. computershave becomebetter at playing certain gamessuch Kriegspiel differs frominvisible chess in a numberof impor- as chess using large amountsof hand-codeddomain specif- tant areas. All the opponent’spieces are invisible to a player ic information. Continuingthe tradition of using gamesas of kriegspiel. Thusthere is no wayto reduce the uncertainty a testbed for general AI investigation, (Berliner 1974)pro- in the domainwithout reducing the strategic complexity.By poseda tactical analyser for chess whichused strategies and contrast, in invisible chess, the numberand types of invisi- tactics, but did not play as well as the existing hard-coded ble pieces is con gurable. As all pieces are invisible, every systems. moveinvolves substantial increases in uncertainty. In invis- In answer to the success of hard-coded algorithms, ible chess, a player maychoose whetherto movea visible (Pell 1993) introduced a class of gamesknown as Symmetric or invisible piece. Finally, in kriegspiel, a player attempts Chess Like (SCL) Games,and a system called Metagamer movesuntil a legal moveis performed.A third party arbitra- that plays gamesarbitrarily generatedfrom this class using a tor indicates whetheror not each moveis legal. In invisible set of advisorsrepresenting strategies in the class of games. chess, the player is given specie information whena move Peli did not consider imperfect information games.Howev- is impossibleor illegal, and maymiss a turn as a result of er, his class of SCLGames is easily extendedto the class attempting an impossiblemove (see Section 3). of Invisible SCLGames, where one or more of a player’s pieces is hiddenfrom their opponent. Thesedifferences makekriegspiel a substantially morecom- plex and less controllable domainthan invisible chess. (Koller & Pfeffer 1995)investigated simple imperfectinfor- Speci cally, exploring the relationship betweenuncertainty mation gameswith an initial goal of solving them. Howev- and strategic play wouldbe moredif cult in kriegspiel. Ul- er, their approachdoes not scale up to morecomplex games timately, the pathological case of invisible chess whereall such as invisible chess. pieces are invisible approachesthe complexityof kriegspiel. (Smith, Nan, & Throop1996) wrote a bridge playing pro- Thusinvisible chess provides a convenientstepping stone to gram using a modi ed form of gametree with enumerated a muchmore difcult problem. strategies rather than actions, effecting a formof forward pruning of the gametree. Smith et al. stated that forward To our knowledgethere exist no computerkriegspiel players pruningworks well for bridge, but not for chess. (Ginsberg that are able to play a full gameof kriegspiel with whichto 1996)introduced partition search to reducethe effective size compareour system. of the gametree. He showedthat this approach workswell for bridge and other gameswith a high degree of symmetry. Partly becausein chess pawnsonly moveforward, and partly 2In fact if the sameposition occurs three times in a gameof becauseof the strategic nature of the game,the sameposit- chess,the gameis declareda draw.

7 Nolicense: PDFproduced by PStill (c) F. Siegert - http://www.this.net/~frank/pstill.html 3 Domain: Invisible Chess quently the invisible black on c4 is revealed, and white must supply another move. Invisible chess is based on standard chess with the following difference. In invisible chess we differentiate betweenvisi- 4: If a player is not in , and attempts to moveinto ble and invisible pieces and de ne themas follows: a visible check, the invisible piece causing the checkis revealed, piece is one that both players can see; an invisible piece is and the player’s tumis forfeited. In standard chess, the one that a player’s opponentcannot see. Thus, every time a movewould be disallowed and the player wouldprovide visible piece is moved,the board is updatedas per standard an alternate move. In Figure 4, white attempts to move chess. However,when a player movesan invisible piece, their king from dl to d2. However,black’s invisible bish- their opponentis informedwhich invisible piece has moved, op on a5 is threatening d2; consequently the invisible but not where it has movedfrom or to. This information black bishop on a5 is revealed, and white forfeits their enables each player to maintaina probability distribution of turn. their opponent’sinvisible pieces across the squares on the 5: Ifa player attempts to castle throughcheck from an invis- board. In this paper, we frame invisible chess pieces ~ to ible piece or throughan invisible piece itself, the invisible distinguish them fromvisible chess pieces (~). pieceis revealedand the turn is forfeited. Wede ne three terms for referring to movesin invisible chess: A possible moveis any legal chess movegiven com- 3.1 Domain Complexity plete information about the board. That is, no invisible The complexityof standard chess is well understood. Chess pieces are in the wayof the move,and the piece can moveto has an average branching factor of around 35. A typical the desired position. An impossiblemove is an attempted gamelasts approximately 50 movesper player so the en- movethat violates the laws of chess because of an incor- tire gametree has approximately 351°° nodes ((Russell rect assumptionas to the whereaboutsof an invisible piece. Norvig1995), p 123). For example, a player attempts to movea piece "through" In the trivial case wherethe opponenthas no invisible pieces, an invisible piece, or attempts to use a pawnto capture an the gametree for invisible chessis exactlyas large as that for invisible piece that is not diagonallyin front of it. See Fig- standard chess. Onceinvisible pieces are introducedinto the ure 1. An illegal moveis an impossible movethat would game,each node of the gametree must be expandedfor each not allow the gameto continue. For example,a player is not possible movefor each possible combinationof positions of allowedto movetheir king into checkby an invisible piece. the opponent’sinvisible pieces. In a gameof invisible chess See Figure 3. the player and opponent each have m invisible pieces. If Therules of invisible chess are based on the . each invisible piece has a positive probability of occupying Theonly modications pertain directly to invisible pieces n~ squares, then the branching factor is approximatelythe and their impact on the game.In general if a moveis pos- branching factor of chess multiplied by the combinationof sible then it is accepted; if a moveis impossiblethen it is invisible squares that could be occupiedas shownby the fol- rejected and the player’s turn is forfeited; and if a moveis lowing formula: illegal, someinformation regarding the reason the moveis illegal is revealedto the player attemptingthe move,and the Branching Factor = 35 x nl x n2 x ... x nra (1) player must supply another move.Note that we do not allow If each player has two invisible pieces, and each of those invisible kings in this basic invisible chess, as a version of pieces has an average of only four squares with a posi- invisible chess with invisible kings woulddrastically modify tive probability of occupation, then the average approxi- the goal of the original game. mate branching factor of that gameof invisible chess is (Bud et al. 2000) gives details of all scenarios wherenew 35 x 16 = 560. For three invisible pieces each, the branching rules comeinto effect, but in summary,only the rules for the factor is around 35 × 64 = 2240. Assumingthat the invis- followingscenarios differ fromthose of standard chess. ible pieces are on the board and movingfor approximately 1: Impossiblemoves are disallowed, the position of the rst half the game, then the complete expandedgame tree for invisible piece that caused the moveto be impossible three invisible pieces each is in the order of 22405°which is is revealed, and the player’s turn is forfeited. Figure 1 around 1015° nodes. showsan exampleof an impossible move. To makethe domaineven morecomplex, chess has virtually 2: If a player’s king is threatened,the invisible piece caus- no axes of symmetrythat allow the size of the gametree ing the checkis revealed, and the threatenedplayer moves to be reduced as in highly symmetricalgames such as tic- normally. Additionally, the player can infer that there tac-toe, and Smith and Nan (Smith & Nan 1993) claim that are no invisible pieces betweenthe threatening piece forward pruningto reduce the branchingfactor of chess has and their king (unless the king is being threatened by been shownto be relatively ineffective. a ). Figure 2 showsan exampleof this scenario. In addition to the combinatorially expansive nature of the Blackis told that the invisible whitebishop is on g5. invisible chess gametree, in order to play invisible chess, a 3: If a player is in check, and attempts to moveinto check player mustmaintain beliefs about the positions of the op- again or does not moveout of check, any invisible pieces ponent’sinvisible pieces. causing the check are revealed, and the player must sup- A player using a simplistic belief updating schemesuch as ply another move.In Figure 3, white is in checkand at- assumingthe most likely destination will ignore manyprob- temptsto capture the invisible black knight on fl. Conse- able squares. Further, a player that uses any belief updating

8

Nolicense: PDFproduced by PSfill (c)F. Siegert - http://www.this.net/-frank/pstill.html L...... Y/_//~....~ .,,.’//////,~ ......

3 2 1 a b c d e f g h

Figure 1: An impossible move. The black Figure 2: Aninvisible piece is revealed to bishop on b2 is invisible. Whiteattempts warnof the check. Black is in check be- an impossible moveby trying to movethe causeof the invisible white bishop on g5. bishopon cl to a3. schemethat does not take into account every possible des- as an approximationto the joint distribution of all invisible tination square whenan invisible piece is moved,will lose pieces across all squares. Thus, the calculation of each in- important information about the o w of the game. Suppose visible piece’s moveis reducedto the calculation to be per- a player incorrectly assumesthat the opponent’sinvisible formedwhen there is only one invisible piece. Becausethe piece is on a certain square; if the opponentthen movesa resulting approximationis moreaccurate than the distribu- visible piece to that square, the player nowhas no Wayof tion that could be calculated by a real player of invisible backtracking, and has no information about the location of chess, our results slightly underestimatethe advantagesof the invisible piece. playing chess with invisible pieces. For the purposesof this Assumingno strategic knowledge(i.e., each square that an research, using this methodto maintain probability distribu- invisible piece can moveto is as likely to be visited as any tions allowsus to focus our efforts on the effects of manip- other), and only one invisible piece, a player can easily ulating the amountof uncertainty inherent in these distribu- maintainthe precise distribution for that piece (see (Budet tions. al. 2000)). In addition, wheneverany piece (other than knight) moves,the probability distribution for each invisible 4 Basic Design piece maybe updated to re ect the fact that other squares In Section 3.1 we estimate the branchingfactor of the game in the movingpiece’s path are vacant. Similarly, whenever tree for invisible chess with 2 invisible pieces to be around a player’s king is threatened by any piece (except a knight), 560, and for 3 invisible pieces to be in the order of 2000, visible or invisible, the probabilitydistributions for all invis- comparedto the branching factor of standard chess whichis ible pieces maybe updated to re ect the knownvacancies about 35 ((Russell & Norvig 1995), p. 123). To cope betweenthe piece causing check, and the threatened king. this combinatorial explosion, and the strategic complexity However,for multiple invisible pieces, the positions of in- of invisible chess, we employa "divide and conquer" ap- visible pieces are not conditionally independentwith respect proach. Wesplit the problemof choosingthe next moveinto to each other, so the probability of one invisible piece oc- a numberof simpler sub-problems,and then use utility the- cupyingany particular squareaffects the probability of an- ory (Raiffa 1968) to recombinethe calculations performed other invisible piece occupyingthat square. Maintainingthe for these sub-problems into a move. Weuse information- probability distributions of multipleinvisible pieces involves theoretic ideas (R6nyi1984) to deal with the uncertainty combinatorialcalculations in the numberof invisible pieces the domain, and standard chess reasoning to deal with the or the storage of combinatoriallylarge amountsof data in strategic elements. order to maintainthe completejoint distributions of all in- This modular, hybrid approach is implementedwith advi- visible pieces over the entire chess board. sors or experts connected and controlled by a GameCon- Weaddress this problemby utilising a central moduleto ob- troller and Distribution MaintenanceModule (GCDMM). tain an approximationof the distribution that can be calcu- the current implementation,we include three advisors: (1) lated quickly enoughto use in real time (Section 4). The the strategic advisor; (2) the movehide advisor; and (3) GCDMMcalculates the distributions for both players, and the move seek advisor. The GCDMMcontrols the game has access to the true positionsof all invisible pieces at all state, has knowledgeof the positions of all invisible pieces times. Weuse the true positions of other invisible pieces and maintainsdistributions of invisible pieces on behalf of

Nolicense: PDFproduced by PStin (c) F. Siegert - http:flwww.this,nel/~frank/pstill.html 61...... NN®N ...... N N N 67 ,/,.A~ ~ ~ ...... t 4 3 3 2 2 1 1 a b c d e f g h a b c d e f g h

Figure 3: An illegal move. White is in Figure 4: Attempting to moveinto check. check fromthe black bishop on a5, and at- Whiteattempts to movethe king from dl to tempts to capture the invisible black knight d2. This is impossiblebecause it results in on fl whereit would be in check from the the king being in check fromthe invisible invisible black bishop on c4. black bishop on a5.

8 x IStrat (W1:1) - [[~]{Pos Hide Seek Total N N N Move b4 a3 c5 d6 EU W2:5 Ws:5 7 d~5’ 4 -32 378 -24 81.50 0 0.56 84,30 4 399 -28 -24 87.75 0 0.56~ 90.55 Ib2a3 6 °N a2a4 25 -32 376 61 107.50 0 0 107150 a2a3l 21 -32 376 64 107,25 0 0.77 111.10 5 e2h5 -24 386 374 21 189.25 0.77 0 193,10 c3o41 4 397 363 29 198.25 0 0 198,25 4 e2fl 24 384 376 54 209.50 0.77 3 0 213.35 30,395 374 55 213.50 0 O, 213.50 Ib263 2 albl , 26 395 376 58 213.75 0 0 213.75 glh3 33 397 374 52 214.00 0 0~ 214,00 It 1 e2e4 24 387 376 54 210.25 0.77 0 214,10 dld3[ 31~395377 62 216.25 -0.22 O! 215,15 h2h3 31 395 376 63 216.25 0 0 216.25 a b c d e f g h ~ e2a6l 241 397 376 54 212.75 0.77 0l 216.60 glf3 34 394 376 64 217.00 0 0 217.00 Figure 5: Aninvisible bishop each. White e2d3I 32 395 377 63 216.75 0.77 0 ] 220.60 believes the invisible black bishopon b4 is on one of a3, b4, c5 or d6. Black believes Figure 6: Possible Movessorted by their the invisible white bishop on e2 has possi- relative total scores. ble squares e2, d3, c4, b5 and a6. the two players. The GCDMMis responsible for deciding and their utilities (evaluatedto a speci ed depth), gives the whethera moveis legal, impossibleor illegal and ensuring ExpectedUtility (EU) of each movefrom a strategic per- that the gameprogresses correctly. Whenit is a player’s spective.3 The EUfor each moveor action (A) is calculat- turn to move, the GCDMMrequests the next movefrom the ed by multiplyingits utility value by the probability of the Maximiserfor that player. The Maximiser,responsible for gamestate (X) for whichthe utility value was calculated, choosingthe best movesuggested by the available advisors, generatesall possible boardsand all possible moves,and re- 3Notethat GNUChess has no knowledgeof invisible pieces. quests utility values fromeach of the advisorsfor each move. Themodi cations to GNUChess are to performx ed depthsearch- ing, andto return all movesand utilities rather thanonly the best Each advisor evaluates the possible moves, across as many move.In standardchess, a poormove may be discountedearly in boards as possible in the available time, accordingto an in- the search so that the searchon the gametree can focus on more ternal evaluation function. The strategic advisor, a modi- promisingmoves. However, in invisible chess, a movethat is poor ed version of GNUChess whichreturns all possible moves in oneboard position may be goodin other positions.

10 Nolicense: PDFproduced by PSIJll (c) F. Siegert http:llwww, this.netl-franldpstill.html and summingthis across all possible gamestates (G) as per Equation2. 1B X 61.0 65.0 35.0 17.2 41.5 43.6 89.8 IN 39.0 X 53.4 39.2 13.2 37.4 27.6 89.0 EU(A) = ~ (Utility(AlX) x Prob(X)) (2) 1R 35.0 46.6 X 38.2 25.4 34.4 27.4 93.0 VX6 G 1Q 65.0 60.8 61.8 X 43.8 41.4 47.6 93.0 The moveseek advisor scores movesaccording to howmuch 2B 82.8 86.8 74.6 56.2 X 72.0 57.6 95.6 they reduce the player’s uncertainty. For example, a move 2N 58.5 62.6 65.6 58.6 28.0 X 45.8 96.2 that fails becauseit "collides" with an invisible piece re- 2R 56.4 72.4 72.6 52.4 42.4 54.2 X 98.2 movesthe uncertainty as to the position of that piece. Simi- Y.I. 10.2 11.0 7.0 7.0 4.4 3.8 1.8 X laxly, the movehide advisor scores movesaccording to how muchthey increase the opponent’suncertainty. For example, Table 1: Winpercentages for different combinationsof in- movinga previouslyrevealed invisible piece will greatly in- visible pieces. crease the opponent’s uncertainty. The movehide and seek advisors are describedin Section 6. possible destination squares need to be addedto its distri- Eachadvisor has a weightassociated with it that re ects the bution, and regardless of wherethe invisible white bishop relative value of its advice. The Maximisermultiplies the movesto (unless it causes check), the opponent’suncertain- value returned from each of the advisors by its weight (W0 ty is increased. Onthe other hand, whena visible piece and sumsthese values. The movewith the highest overall traverses a square that has a positive probability of occu- value is passed back to the GCDMMwhich implementsthat pation by an invisible piece, the opponent’suncertainty de- moveand requests a movefrom the other player. If there are creases as they can infer that the traversed squareis actually multiple moveswith equal highest score, one of these moves vacant. For example,the movedld3 tells the opponentthat is chosen at randomby the Maximiser. the square d3 is empty. The Maximisermultiplies each movevalue from each advi- 4.1 Example sor by the advisor’s weight(Wi) and sumsover all the advi- Figure 5 shows a typical position in a gameof invisible sors, giving the "Total" columnin Figure 6. Notice that the chess with each player playing with one invisible bishop; movewith the highest overall value is nowe2d3 (220.60). The extra added uncertainty has been enoughin this case to it is white’s turn to move.Black’s belief aboutthe position makethis movebetter than the strategic choice of glf3. of white’s invisible bishopis representedby the probability distribution: e2 0.2, d3 0.2, c4 0.2, ]35 0.2, a6 0.2, and white believes that black’s invisible bishop 5 The Strategic Advisor has the probability distribution: a3 0.25, b4 0.25, In this section wepresent results for playing invisible chess c5 0.25, d6 0.25. White knows that the invisible with a single strategic advisor for each player. Eachplayer black bishop cannot be on e7 as the black traversed played with a different con guration of invisible pieces, us- e7 to get to its current position at h4. TheGCDMM asks the ing only a strategic advisor to decide the next move.Each white Maximiserfor its next move. The white Maximiser result is obtained from500 gamesrun with a particular con- passes each possible board position, i.e., the current board guration of invisible pieces that ended with a win.4 White with black’s invisible black bishop in each of b4, a3, c5 has a slight advantagein standard chess, and the strategic and d6 to the strategic advisor. Figure 6 showssome possi- advisor movespieces differently dependingon whetherit is ble movesacross the four possible boardstogether with their playing white or black. To removecolour biases fromthe re- utilities and the EUof eachmove from the strategic advisor’s suits, each set of 500 gameswas broken into two runs of 250 perspective. Noticethat the moveglf3 has the highest strat- gameseach, the invisible piece con gurations were swapped egic EUof 217. However,moving the knight on gl has no betweenwhite and black for each run, and the results were effect on either player’s information.If there wereno other averagedbetween the two runs. advisors present, the Maximiserwould choose this move. Table 1 shows results for gamesplayed with major (non- Somemoves do assist in discovering the location of the in- pawn)invisible pieces against each other and against no in- visible black bishop, e.g. d4c5, a2a3 and b2a3, howev- visible pieces (N.I.). Readingacross a row, it showsthe per- er, noneof these movesis strategically strong. For this rea- centage of gameswon by the combinationof invisible pieces son, and becauseof the small numberof squares in the invis- in the rowheading against the invisible piece combinationof ible black bishop’s distribution, the moveseek advisor has a particular column.For example,one invisible bishop (1B) little effect at this point in the game.Later in the game,once won65% of the time against one invisible (1R), but the invisible black bishop has movedseveral times, many only 43.6%of the time against two invisible rooks (2R). squares will have a probability of being occupiedby the in- visible black bishop, and the moveseek advisor will have Theseresults showthat the values of several invisible piece moreeffect. combinationsdiffer betweeninvisible chess and standard chess. For example,in standard chess, a rook is considered The movehide advisor evaluates each movebased on the morevaluable than a bishop. However,in invisible chess, increase in the opponent’suncertainty. Whenthe invisible white bishopmoves, its destination fromthe opponent’sper- 4Drawngames are largely the result of repeatingmoves contin- spective, maybe any square accessible fromany square cur- uously.Consisting of less than 10%of gamesplayed, draws are rently in the invisible white bishop’s distribution. Thusall not countedin theseresults.

11 Nolicense: PDFproduced by PStill (¢) F. Siegert - http:flwww.this.net/~frank/pstill.html one bishop beat one rook, and two bishops beat every other Further examination of our results shows that the more a combinationof invisible pieces consideredincluding rooks. player movestheir invisible pieces, the moreimpossible This apparent anomalyis due to the bishop’s early involve- movestheir opponent attempts (see (Bud et aL 2000) for ment in the game, while rooks tend to comeinto play lat- details). This movementof invisible pieces and the corre- er. By causing uncertainty early in the game,a player with sponding opponentuncertainty can be summarisedusing en- two invisible bishopshas an early advantageagainst a player tropy to quantify each player’s uncertainty in a game.Fig- with two invisible rooks. Further analysis of these results is ures 7 and 8 showthe uncertainty of each player regarding presented in (Budet aL 2000). the positions of the opponent’sinvisible pieces for games played with one invisible bishop against two invisible bish- ops.6 For each game,the entropy of each player’s distribu- 6 Building Information Theoretic Advisors tion of invisible pieces is summedacross all moves.Thus the A reasonable inference fromthe results shownabove is that graphs showthe total amountof uncertainty each player had players with more information about the gametend to win to deal with over the course of each game.Figure 7 shows moreoften; i.e., the closer a player’sbelief abouta boardpo- each player’s uncertainty in the 190 gamesthat werewon by sition is to the true boardposition, the morelikely the player white; the solid line showsblack’s uncertainty, sorted by en- is to play well strategically. In this section, wedescribe our tropy, whilethe dashedline showswhite’s uncertainty in the use of informationtheory (R6nyi 1984) to quantify a play- corresponding game. Figure 8 shows each player’s uncer- er’s uncertainty about the positions of an opponent’sinvis- tainty in the 60 gamesthat were wonby black; the dashed ible pieces (Section 6.1), present two information-theoretic line showswhite’s uncertainty, sorted by entropy, while the advisors (Section 6.2), and discuss somepreliminary results solid line showsblack’s uncertainty in the corresponding obtainedwith these advisors (Section 6.3). game. As can be seen from Figure 7, all the gameswon by white, except for one (game number86) showgreater un- certainty for black. The gameswon by black (Figure 8) are 6.1 Uncertainty and Entropy muchcloser in uncertainty and there are manygames where 7 Information theory provides a measurefor quantifying in- white had greater uncertainty than black. This examplecor- formation represented by a sequence of symbols. That is, roborates our intuition that players with less uncertainty are using information theory we can determine the minimum morelikely to win, and underpinsour information-theoretic numberof bits required to transmit the sequenceof symbols advisors. to someoneelse. This numberrepresents a measureof the amountof informationintrinsic to the sequenceof symbols. 6.2 Information-Theoretic Advisors Thecalculation of this numberrequires a distribution of the Followingan analysis of the results in Section 5, we have probability that any particular symbolto be transmitted will implementedtwo information-theoretic advisors. These ad- occur. Thus any probability distribution can be said to have visors are movehide and moveseek. an entropy or information measureassociated with it. This entropy measureis boundedfrom belowby zero, whenthere is only one possibility and therefore no uncertainty, and in- Move Hide Advisor. Working on the premise that the creases as the distribution spreads. moreuncertain the opponentis, the worse they will play, In invisible chess, given a probability distribution of each the movehide advisor advises a player to perform moves of the opponent’sinvisible pieces, it is possible to derive a that hide informationfrom the opponent. That is, each move probability5 distribution across all possible boardpositions. is scored accordingto its expectedeffect on the opponent’s Thus we can calculate the entropy (H) of a set of board perceiveduncertainty about the positions of the player’s in- visible pieces. This effect maybe an increase, a decrease states together with their associatedprobabilities as follows: or no change in the opponent’s uncertainty. Movesby in- H = - E Prob(X) × log2(Prob(X)) (3) visible pieces that do not cause checkresult in an increase

YXEG in the opponent’s uncertainty. Movesthat cause check or movesby visible pieces maycause a decrease in the oppo- whereX represents a single gamestate, from the set of pos- nent’s uncertainty by revealing vacant squares or invisible sible gamestates (G), whichis one possible pieces themselves. As discussed in Section 6.1, we use the positions of the opponent’s invisible pieces, and Prob(X) entropyof the distribution of the positions of the opponent’s represents the probability of that gamestate. invisible pieces as a measureof the opponent’suncertainty. As invisible pieces move, the numberof squares they may In order to calculate this entropy, the movehide advisor occupyincreases. This leads to an increase in the numberof needs to modelthe opponent’sdistribution update strategy. possible boardstates, and thereforethe entropyof the distri- Of course a player can use the real positions of each non- bution of thoseboard states, i.e., it increasesthe opponent’s movinginvisible piece and thereby avoid the cost of storing uncertainty. 6Themany games with zero entropyfor whiteare generallydue 5Assumingthat the piece distributions are independent,this to the invisiblebishop (queen’s bishop) capturing a pieceon its rst canbe calculatedby combinatoriallycycling through the invisible moveand subsequentlybeing captured without moving again. piecepositions. In ourimplementation, these distributions of invis- 7Notethat the samerelationship betweenentropy and wincan ible piecesare independentbecause of the waythey are maintained be seen for gamesbetween more evenly matched invisible pieces (Budet al. 2000). (Budet al. 2000).

12 No license: PDFproduced by PSdU(c) F. Siegert - http:l/www.this.neff~frank/pstill.html aO

Figure8: Uncertainty(as entropy) in gameswith one in- Figure7: Uncertainty(as entropy) in gameswith one in- visible black bishop against two invisible white bishops visible black bishop against two invisible white bishops that were wonby black. that were wonby white. or calculating the completejoint distribution of the posit- ]Weight] Hide ] Seek J ions of all invisible pieces. This methodprovides a modelof QvsQ B vsB Qvs Q BvsB the best distribution the opponentcould have without taking 0.5 57.2 77.2 47.2 51.0 strategic informationinto account. To incorporate strategic 5.0 43.6 54.6 45.0 71.6 informationinto the model,a player could use the combined 50.0 13.6 30.3 47.6 67.0 expectedutility for each possible destination squarethat an invisible piece could moveto. Table 2: Theeffect of the movehide and moveseek advisors. For a proposedmove, the movehide advisor uses Equation3 to calculate the entropy of the opponentmodel of the distri- of this expected changein entropy is returned as the move bution of the player’s invisible pieces prior to and following seekutility. the move. The exponential of the perceived change in en- tropy is returnedas the movehide utility. Theexponential is 6.3 Advisor Results taken in order to allow a comparisonbetween the log based This section presents and discusses somepreliminary re- utilities returned by the movehide and moveseek advisors, sults that showthe individual effects of the movehide and and the utility returnedby the strategic advisor whichscores moveseek advisors with varying weights. Each result was moveson a linear scale. obtained by playing 500 gamesseparated into runs of 250 MoveSeek Advisor. The moveseek advisor suggests that gamesas before, with one player using the strategic advi- a player performmoves that are morelikely to discover in- sor only, against an opponentusing one of the information- formation about the positions of the opponent’s invisible theoretic advisors and the strategic advisor. The invisi- pieces. That is, each moveis scoredaccording to the expect- ble piece con gurations in this section were chosen be- ed decreasein the entropy of the opponent’sinvisible pieces cause their results are typical of those obtained using our following the move.This expected decrease in entropy must information-theoreticadvisors. be greater than or equal to zero as no movecan makea play- er less certain aboutthe position of the opponent’sinvisible The move hide column of Table 2 shows the results of pieces. adding the movehide advisor to an invisible queen each (Q vs Q) and an invisible white bishopversus an invisible black A movethat covers a large numberof squares with a positive bishop (B vs B). The moveseek columnshows the results probability of occupationby the opponent’sinvisible pieces of adding the moveseek advisor to an invisible queen each will yield a certain amountof informationwhether it is suc- and and one invisible white bishop versus one invisible black cessful or not. Clearly this type of movewill yield more bishop (B vs B). The rst column headed "Weight" shows informationif it is successful, as the player nowknows that the weightof the information-theoretic advisor relative to all of those squares are vacant. Onfailure, a player can only the strategic advisor.Thus, with a weight of 0.5, in games concludethat at least one of those squares is occupied. On played with an invisible queeneach, the player playing with the other hand, a movethat traverses only one square with a the movehide advisor won57.2% of the games. positive probability of occupationby the opponent’sinvisi- ble pieces will yield moreinformation if unsuccessful. That MoveHide Results. With a weight of 0.5, the movehide is, the player nowknows that an invisible piece is on that advisor signi cantly assists, increasingthe win rate to 57.2% square. Thus, the moveseek advisor multiplies the project- and 72.2%for queen versus queen and bishop versus bish- ed decrease in entropy for each outcomeby the probability op respectively. However,as the movehide advisor weight of that outcometo get an expected utility. Theexponential increases past 0.5, the winrate decreasesquite rapidly. This

13 Nolicense: PDF produced by PStill (c) F. Siegert- http://www.this.neff~frank/pstill.html behaviouris typical of all observedmove hide runs, and re- visible queenrather than movingtheir owninvisible queen. sults fromthe player taking less notice of the strategic advi- Onlymoves that traverse squares that have a positive prob- sor’s advice. Althoughthe opponentmay be slightly more ability of occupationby the opponent’sinvisible pieces are uncertain whenthe movehide advisor is weighted highly, valued by the moveseek advisor. These movesmay or may the player is making enoughstrategically poor movesto not correspond to good strategic moves. The more a play- counter that advantage. Nonetheless, the movehide advi- er listens to the moveseek advisor, the fewer goodstrategic sor is de nitely helpful with small weight, and the results movesthey are able to perform. The difference betweenthe shownfor a weightof 0.5 are statistically signi cantly dif- bishop-seeking behaviour and the queen-seeking behaviour ferent from the base case of 50%. dependson the difculty of nding the opponent’sinvisible Figure 9 showsthe difference in entropy betweenwhite and piece. Aninvisible bishopcan have a positive probability of black, each playing with an invisible queen, averagedover occupyingat most half the available squares on the board, the entire gamefor each of 250 games.To makethe trends while an invisible queen can have a positive probability of clearer, the data points are sorted by entropy. The mid- occupyingall the available squares. Thus,the adviceprovid- die curve represents gamesplayed betweeninvisible queens ed by the moveseek advisor often aids in the early capture of with no information-theoretic advisors present, and shows the invisible bishop, thereby removingthe uncertainty from that the entropydifference is fairly even. In approximately the game.In contrast, following this advice whenthe in- half the gamesit is negative, andin the other half it is pos- visible piece is an invisible queen leads to wasted moves. itive, and it ranges betweenaround -1 and +1. The bottom Thus the moveseek advisor assists whenthe uncertainty in curve represents gameswhere white played with the move the gameis low, but maybe useless or detrimental whenthe hide advisor. This represents an entropy advantageto white. uncertaintyis high. The difference betweenwhite and black’s uncertainty ranges from under -2 to around +1. The rest of the curve is also 7 Conclusion and Future Work shifted downwardsindicating the relative increase in black’s uncertainty. This increase in black’s uncertainty leads to Theresults presentedin this paper are preliminaryand fur- morewins for white while using the movehide advisor. ther exploration of the domainis required. Thereare a num- Whenan invisible piece moves,the uncertainty increases in ber of areas that require morein depth investigation. Specif- the same mannerregardless of the nal destination of the ically, moreaccurate prediction of the likely positioning of invisible piece. Thusthe movehide advisor is mostvaluable the opponent’sinvisible pieces is needed. This prediction whenany strategically advantageousmove by an invisible could take the formof using strategic informationabout the piece is possible. likely destination of a movinginvisible piece, or involve evaluating the completesearch tree for more ply. Improv- ing this distribution wouldreduce its entropy and therefore MoveSeek Results. Table 2 (column 5) shows the move improvestrategic performance.A side effect of this type of seek advisor’s effectiveness against an opponent’sinvisible distribution improvementis the possibility of incorporating bishop. As the weightincreases to around5.0, the percent- blufng into invisible chess. That is, movingan invisible age of wins increases up to 71.6%.This is largely a result of piece to an unlikely position in order to confuse an oppo- the moveseek advisor assisting the player to nd and cap- nent. ture the opponent’sinvisible bishop early in the game.The As indicated above, one wayto improvethe prediction of player then has an informationadvantage equivalent to an in- the positions of the opponent’sinvisible pieces wouldbe to visible bishopversus no invisible pieces and is very likely to modelthe uncertainty regarding a player’s pieces from the win. Figure 10 showsthe entropy advantageof playing with opponent’sperspective for moreply. However,the problem the moveseek advisor. Thedata points are sorted by entropy of the combinatorial expansionin the search required as a to clarify the trends. Thetop curveshows the entropydiffer- result of this prediction needs to be resolved. The mostob- ence with no moveseek advisor, and the bottomcurve shows vious wayto managethis explosion is to nd effective ways that the entropy is signi cantly lower whenplaying with a to prune the gametree. moveseek advisor. In this example, the moveseek advi- As our system currently stands, a non-integrated player of sor assists white to reduceuncertainty over the courseof the invisible chess (whether humanor machine)could not have gameand therefore play strategically better than black. the bene t of our GCDMMwhich maintains the approxi- Table 2 (column4) showsthe moveseek advisor as applied mationto the distribution of the positions of the opponent’s to a gamebetween two invisible queens. In this situation, invisible pieces by using the true positions of all other in- the moveseek advisor is muchless effective and wouldap- visible pieces whenupdating the distribution. Sucha player pear to probablybe detrimentalin somecases. It seemslike- wouldneed to nd other waysof approximatingthat distri- ly that the large numberof squares an invisible queen may bution. have a positive probability of occupying,and the frequen- In summary,we have presented and discussed a complex, cy of queen movement,make it dif cult for the opponent but controlled domainfor exploring automatedreasoning in to nd an invisible queen. The top curve in Figure 9 repre- an uncertain environmentwith a high degree of strategic sents gameswhere white played with the moveseek advisor. complexity. Wehave motivated and introduced the use of Anexamination of the entropy difference slightly favours information-theoretic advisors in the strategically complex black comparedto the base case. This is almost certainly imperfect information domainof invisible chess. Wehave because white is spending movestrying to nd black’s in- shownthat our distributed-advisor approachusing a combi-

14 Nolicense: PDFproduced by PStill (c) F. Siegert - http://www.this.nel/~frank/pstill.htnd Invisible ~e*n vs Invlsible Oug~. Diff*r*ncs in Entropy blt~en Black and Whir* 1 Znv Black Bishop vl 1 inv whirs Bishop, Dlff In Entropy beb~sn Black & Whirs 2 g 2 ¯ [n~ropy Diff*’rsn=s with No ~dvi=orl m EntropyDl f f@renc’ewl~h ~ito s~k m ~ntropy Difference with ~Thits Mova Seeks" ® ~tropy Difference with No Advienrn .... 1.5 1.5

O.5 ...... gn~rop¥~i~Dlff*r*n¢~with~cnlts HoveHidg ...... ,"""" ...... r" 0 o ...... ,...... -,.’..... ~-0.~ ~ ~ -0.5 i -1.5 i -1.5

-2.5 1 I ! | -a.5 lOO 15o 2oo 25o

Figure 9: Entropy difference between white and black Figure 10: The difference in entropy between white and for a base case, move seek (weight 1.0) and move hide black for a base case, and with move seek, sorted by (weight 0.5), sorted by entropy. entropy.

nation of information-theoretic and strategic aspects of the Gamb~ick,B.; Rayner,M.; and Pell, B. 1991. Anarchitecture for a domain lead to performance advantages compared to using sophisticated mechanicalBridge player. In Levy, D. N., and Beal, strategic expertise alone. Given the simplicity and general- D. F., eds., Heuristic Programmingin Arti cial Intelligence -- ity of this approach, our results point towards the potential The Second ComputerOlympiad 2, 87-107. Chichester, England: Ellis Horwood. applicability to a range of strategically complex imperfect information domains. Ginsberg,M. 1996. Partition search. In Proceedingsof the Thir- teenth National Conferenceon Arti cial Intelligence, 228-233. AAAIPress / MITPress. Acknowledgements Ginsberg, M. 1999. GIB:steps toward an expert-level bridge- The authors would like to thank Chris Wallace for his in- playing program. In Proceedings of the Sixteenth Internation- sight, advice and patience. al Joint Conferenceon Arti cial Intelligence, 584-589. Morgan Kaufmann. References Koller, D., and Pfeffer, A. 1995. Generating and solving imper- Albrecht, D. W.; Zukerman,I.; Nicholson, A. E.; and Bud, A. fect information games.In Proceedingsof the Fourteenth Inter- 1997. Towardsa bayesian model for keyhole plan recognition national Joint Conferenceon Arti cial Intelligence, 1185-1192. Morgan Kaufmann. in large domains. In User Modelling. Proceedings of the Sixth International Conference UM97,number 383 in CISMCourses Koller, D., and Pfeffer, A. 1997.Representations and solutions for and Lectures, 365-376. Springer Wien, NewYork NYUSA. game-theoretic problems. Arti cial Intelligence 94(1):167-215. Berliner, H.J. 1974. Chess as ProblemSolving. Ph.D. Disserta- Korb, K. B.; Nicholson, A. E.; and Jitnah, N. 1999. Bayesian tion, CarnegieMellon University, Pittsburgh, PA. poker. In Laskey, K. B., and Prade, H., eds., Proceedings of the Fifteenth Conferenceon Uncertaintyin Arti cial Intelligence, Bud, A.; Nicholson, A.; Zukerman,I.; and Albrecht, D. 1999. A hybrid architecture for strategically compleximperfect infor- 343-350. Stockholm, Sweden: MorganKaufmann. mation games. In Proceedings of the 3rd International Confer- Li, D. 1994. Kriegspiel: Chess Under Uncertainty. Premier ence on Knowledge-BasedIntelligent Information Engineering Publishing. Systems., 42--45. IEEEPress. Pell, B. 1993. Strategy Generation and Evaluation for Meta- Bud, A.; Zukerman,I.; Albrecht, D.; and Nicholson, A. 2000. GamePlaying. Ph.D. Dissertation, University of Cambridge. Invisible chess: Rules, complexity and implementation. Tech- Raiffa, H. 1968. Decision analysis: introductory lectures on nical Report forthcoming, Dept. of ComputerScience, Monash choices under uncertainty. Addison-Wesley. University. R6nyi, A. 1984. A diary on information theory. John Wiley and Ciancarini, P.; DallaLibera, E; and Maran, E 1997. Decision sons. Makingunder Uncertainty: A Rational Approachto Kriegspiel. Russell, S., and Norvig,P. 1995. Arti cial Intelligence: A Modern In van den Herik, J., and Uiterwijk, J., eds., Advancesin Comput- Approach.Prentice Hall. er Chess 8, 277-298. Univ. of Rulimburg. Smith, S. J. J., and Nau, D.S. 1993. Strategic planning for Findler, N. V. 1977. Studies in machinecognition using the game imperfect-information games. In Games: Planning and Learn- of poker. Communicationsof the ACM20(4):230-245. ing, Papers from the 1993 Fall Symposium,84-91. AAAIPress. Frank, I., and Basin, D. 1998. Search in gameswith incomplete Smith, S. J. J.; Nau, D. S.; and Throop, T. A. 1996. Total-order information:A case study using bridge card play. Arti cial Intel- multi-agent task-network planning for contract bridge. In Pro- ligence 100:87-123. ceedings of the Thirteenth National Conferenceon Arti cial In- Frank, I., and Basin, D. 2000.A theoretical and empirical inves- telligence, 108-113. AAAIPress / MITPress. tigation of search in imperfect informationgames. In Theoretical ComputerScience -- To appear.

15 Nolicense: PDF produced by PStill(c) F. Siegert- http://www.this.net/~frank/pstill.html