The Ninth AAAI Symposium on Educational Advances in Artificial Intelligence (EAAI-19)

Machine Learning Based Heuristic Search Algorithms to Solve Birds of a Feather Card Game

Bryon Kucharski, Azad Deihim, Mehmet Ergezer
Wentworth Institute of Technology
550 Huntington Ave, Boston, MA 02115
kucharskib, deihima, [email protected]

Copyright 2019, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.

Abstract

This research was conducted by an interdisciplinary team of two undergraduate students and a faculty member to explore solutions to the Birds of a Feather (BoF) Research Challenge. BoF is a newly designed perfect-information solitaire-type game. The focus of the study was to design and implement different algorithms and evaluate their effectiveness. The team compared the provided depth-first search (DFS) to heuristic algorithms such as Monte Carlo tree search (MCTS), as well as a novel heuristic search algorithm guided by machine learning. Since all of the studied algorithms converge to a solution from a solvable deal, the effectiveness of each approach was measured by how quickly a solution was reached and how many nodes were traversed until a solution was reached. The employed methods have the potential to provide artificial intelligence enthusiasts with a better understanding of BoF and novel ways to solve perfect-information games and puzzles in general. The results indicate that the proposed heuristic search algorithms guided by machine learning provide a significant improvement over the provided DFS algorithm in terms of the number of nodes traversed.

1 Introduction

This research was conducted by an interdisciplinary team of two undergraduate students and a faculty member to explore solutions to the Birds of a Feather (BoF) Research Challenge proposed by (Neller 2016). BoF is a perfect-information solitaire game, comparable to FreeCell solitaire.

BoF is played with a standard 52-card deck. Each game begins with a starting deal of sixteen random cards organized into a 4-by-4 grid. The player must select an individual card and move it on top of another card in the selected card's row or column, provided that one of the following conditions is met: (1) Both cards have the same suit. (2) Both cards have the same rank. (3) Both cards have adjacent ranks. For example, a King of Hearts can be placed on top of a Queen of Clubs since their ranks are adjacent, or that King of Hearts can be placed on top of a Nine of Hearts since they have the same suit. The game concludes when fifteen moves have been made, resulting in one stack of all sixteen cards.

Newly discovered perfect-information puzzles offer a plethora of terrain for exploration and allow for fun and creative solutions. The nature of BoF allows for all possible moves to be organized into a decision tree, which can be traversed with various types of algorithms. Depth-first search (DFS) and breadth-first search (BFS) are two rudimentary approaches to tree traversal that are straightforward to implement and can solve the game whenever a solution exists. However, there is still room for improving their performance using auxiliary algorithms. As part of the challenge proposed by Neller, teams were asked to explore potential heuristics to guide the performance of these graph algorithms. This compelled our team to delve into more intelligent solutions such as heuristic-based traversal algorithms. There are several features that can be extracted from any state of a BoF game and used directly as a heuristic to guide the traversal. This abundance of applicable features facilitated a machine learning approach: suitable features can be used as an input, and the output can be applied as a heuristic, which allows the traversal to be directed by multiple features rather than just one. The team compared the provided depth-first search to heuristic algorithms such as Monte Carlo tree search (MCTS), as well as a novel heuristic search algorithm guided by machine learning.

The applications of machine learning to solving games have been well studied. IBM engineers have applied linear approaches as well as alpha-beta pruning to solve the game of checkers (Samuel 1959; Samuel 1967). (Yan et al. 2005) proposed a heuristic approach to Klondike solitaire that solved twice as many games on average as an expert human player. Machine learning based solutions to FreeCell have also been explored. (Chan 2006) compared the performance of multiple algorithms, including heuristic approaches, neural networks, Bayesian learning and decision trees.

Besides machine learning algorithms, randomized search-based techniques are employed to decrease the number of nodes that are evaluated to solve a game. Alpha-beta pruning (Knuth and Moore 1975) has been at the forefront for finite, zero-sum, two-player games with perfect information. (Coulom 2006) introduced a new algorithm, Monte Carlo tree search, which combines Monte Carlo methods with tree search to solve these types of games. MCTS has been widely used in solving various games, including a 9 x 9 computer Go program named Crazy Stone (Coulom 2007), Othello (Nijssen 2007), and Tic Tac Toe (Auger 2011). Google's DeepMind employed a combination of deep learning and MCTS to create AlphaZero, an algorithm that mastered Chess, Shogi (Silver et al. 2017a), and 19 x 19 Go (Silver et al. 2017b).

MCTS has also been implemented in single-player games such as Sudoku (Cazenave 2009) and SameGame (Schadd et al. 2008). In our research, MCTS is applied to Birds of a Feather, a finite, single-player game with perfect information.

The paper is organized as follows: Section 2 provides a brief introduction to the heuristic search algorithms employed in this research. Section 3 defines the proposed machine learning based heuristic search algorithms. Section 4 introduces the experimental settings and the evaluation criteria. Section 5 discusses the empirical results obtained. Finally, Section 6 states the conclusion and suggests future work.

2 Search Algorithms

This section reviews the provided solution to the BoF challenge and introduces the heuristic search algorithms employed in the proposed solution.

Depth-first search

Depth-first search is an algorithm that allows for simple traversal through graphs. DFS visits each node once and only once by beginning at the root and exploring as far down the tree as possible before backtracking. DFS can be used to solve Birds of a Feather because all possible moves can be organized into a decision tree. Any solution to a solvable deal of Birds of a Feather must reside in the deepest level of the tree, which makes DFS a desirable method of traversal.

Another algorithm that can be used to solve Birds of a Feather is breadth-first search, which prioritizes finding new paths by searching all of a node's children prior to advancing to the next level of the tree. This method of traversal would be undesirable because it would search every node in the top fifteen levels before reaching the final level of the tree, where every possible solution must reside.

Linear Regression

Linear regression (LR) is a supervised learning algorithm employed to forecast a continuous outcome. It generally takes a structured array of real numbers, x, and predicts a vector of real numbers, y.

The predicted output, $\hat{y}$, relies on a hypothesis function, $h(x)$:

$$y \approx \hat{y} = h(x).$$

To minimize the error, the partial derivatives of the error $E_l$ with respect to both variables, $w$ and $b$, can be set to 0. A closed-form solution is:

$$w = \frac{\sum_{i=0}^{n} (x_i - x_m)(y_i - y_m)}{\sum_{i=0}^{n} (x_i - x_m)^2}$$

$$b = y_m - w x_m$$

where $x_m$ and $y_m$ are the means of the x and y vectors, respectively (Abu-Mostafa, Magdon-Ismail, and Lin 2012).

The LR model yields a continuous value $\hat{y} \in \mathbb{R}^1$. In our case, y represents the solvability of a given game state, and LR predicts whether the game is solvable or not. Thus, a higher value for $\hat{y}$ indicates a larger confidence in solving the game, and a lower value suggests a potentially unsolvable game.

Monte Carlo Tree Search

MCTS is a tree search algorithm that computes the best move to make in a given state. The algorithm consists of a selection phase, an expansion phase, a simulation phase, and a back-propagation phase (Coulom 2006). In the selection phase, the algorithm starts at the root of the tree and traverses down fully expanded nodes until a leaf node that is not fully expanded is encountered. In the expansion phase, a child of the leaf node that has not been visited is chosen, and in the simulation phase it is simulated n times. The simulation produces an outcome, which is then propagated back to the root node.

3 Machine learning based heuristic search algorithms

The BoF Research Challenge provided an implementation of DFS as well as a sample script for generating solvability data. Our team leveraged this file and created additional features to collect data across 10,000 seeds. The data was split into test and train sets, and models were created as part of a heuristic search algorithm. At every iteration of the heuristic, the features were generated for the current node. The probability of solvability is predicted using a linear regression model for every child of the current node, and the children are then arranged from highest probability to lowest probability. The highest probability child is selected first as the move to make, and the heuristic repeats. If a solution is not found, the second highest probability child is selected, and so on, until the game is solved or all the nodes are exhausted. A priority queue was employed to implement this behavior.
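The priority-queue search described above can be illustrated with a minimal Python sketch. It is not the authors' implementation. The state encoding (a 16-tuple of top cards), the two features (number of legal moves, number of remaining stacks), the weights and bias in `predicted_solvability`, and the rule that aces are low with no wrap-around are all assumptions made for illustration. The sketch also uses one global priority queue, which is only one possible reading of the text.

```python
import heapq
from itertools import count

RANKS = "A23456789TJQK"  # assumption: aces low, no wrap-around
SIZE = 4                 # 4-by-4 grid, cells indexed 0..15 in row-major order


def compatible(a, b):
    """Cards such as 'KH' match on suit, or on same/adjacent rank."""
    same_suit = a[1] == b[1]
    rank_gap = abs(RANKS.index(a[0]) - RANKS.index(b[0]))
    return same_suit or rank_gap <= 1


def legal_moves(state):
    """Yield (src, dst) pairs that share a row or column and are compatible."""
    for src, a in enumerate(state):
        for dst, b in enumerate(state):
            if a is None or b is None or src == dst:
                continue
            same_line = src // SIZE == dst // SIZE or src % SIZE == dst % SIZE
            if same_line and compatible(a, b):
                yield src, dst


def apply_move(state, move):
    """Move the stack at src onto dst; only the top card of a stack matters."""
    src, dst = move
    cells = list(state)
    cells[dst], cells[src] = cells[src], None
    return tuple(cells)


def predicted_solvability(state, weights=(0.1, -0.05), bias=0.5):
    """Hypothetical linear model over two illustrative features."""
    n_moves = sum(1 for _ in legal_moves(state))
    n_stacks = sum(card is not None for card in state)
    return bias + weights[0] * n_moves + weights[1] * n_stacks


def heuristic_search(start):
    """Best-first search: always expand the state with the highest score."""
    tie = count()  # unique tie-breaker so the heap never compares states
    frontier = [(-predicted_solvability(start), next(tie), start, [])]
    seen = {start}
    expanded = 0
    while frontier:
        _, _, state, path = heapq.heappop(frontier)
        expanded += 1
        if sum(card is not None for card in state) == 1:
            return path, expanded  # one stack left: solved
        for move in legal_moves(state):
            child = apply_move(state, move)
            if child not in seen:
                seen.add(child)
                score = predicted_solvability(child)
                heapq.heappush(frontier, (-score, next(tie), child, path + [move]))
    return None, expanded  # frontier exhausted: no solution found
```

Scores are negated because `heapq` is a min-heap, so the state with the highest predicted solvability is popped first. Calling `heuristic_search(deal)` on a 16-tuple of card strings returns the move list (or `None`) together with the number of nodes expanded, which corresponds to the node-count measure the paper uses for comparison.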