SmartK: An Efficient, Scalable, and Winning Parallel Monte Carlo Tree Search

Michael S. Davinroy, Shawn Pan, Bryce Wiedenbeck, and Tia Newhall
Department, Swarthmore College, Swarthmore, PA 19081

Motivations

● MCTS (Monte Carlo Tree Search) applications:
  ○ Auction bidding
  ○ Program analysis
  ○ Neural network architecture search
  ○ Radiation therapy design
● Large search spaces require parallel solutions
● State-of-the-art approaches do not scale beyond shared memory
● Our novel SmartK MCTS Algorithm:
  ○ Scales using distributed memory
  ○ Directs processes to independently explore promising paths in the tree

MCTS Algorithm

● Non-deterministic, four-stage algorithm:
  ○ Selection: traverses known tree, trading off exploration and exploitation
  ○ Expansion: adds one node to the tree
  ○ Simulation: plays until the end of the game to generate a result estimate
  ○ Backpropagation: updates values for all selected/expanded nodes

Other Approaches

● Root Parallel: independent MCTS / creates redundant work
● Tree Parallel: shared / requires shared data

SmartK Algorithm

Processes on cluster nodes independently compute MCTS

1st Phase: processes compute MCTS in parallel

2nd Phase: for k rounds
  1. Processes assigned promising tree nodes using UCB weight from previous round
  2. Run MCTS locally; share weights for next round

Results

SmartK is a better Hex player than Root

Experiments run on Comet Cluster at the San Diego Supercomputer Center [1]

Num Procs   Board Size   SmartK 1st Player   Root 1st Player   SmartK 2nd Player   Root 2nd Player
128         8x8          90%                 42%               58%                 10%
256         11x11        74%                 56%               44%                 26%
512         14x14        68%                 50%               50%                 32%
(Averages over 50 runs)

● Higher win percentage as 1st Player
● Higher win percentage as 2nd Player (Hex has a 1st mover advantage)

SmartK scales to large size problems

Algorithm   Board Size   Time per Rollout   Time per Move   Memory Space Used   Nodes Expanded
SmartK      8x8          0.3 ms             24.3 s          1.21 GB/P           P*R
Root        8x8          0.5 ms             35.4 s          1.37 GB/P           P*R
Seq         8x8          0.3 ms             20.2 s          1.55 GB/P           R
SmartK      11x11        0.3 ms             21.6 s          1.22 GB/P           P*R
Root        11x11        0.4 ms             27.7 s          1.49 GB/P           P*R
Seq         11x11        0.3 ms             20.7 s          1.78 GB/P           R
P: # of processes    R: # of rollouts

● Computation and space efficient
● Uses no more resources than Root

References:

1. XSEDE: John Towns, Timothy Cockerill, Maytal Dahan, Ian Foster, Kelly Gaither, Andrew Grimshaw, Victor Hazlewood, Scott Lathrop, Dave Lifka, Gregory D. Peterson, Ralph Roskies, J. Ray Scott, Nancy Wilkins-Diehr, "XSEDE: Accelerating Scientific Discovery", Computing in Science & Engineering, vol. 16, no. 5, 2014.
2. Comet Supercomputer. San Diego Supercomputer Center. http://www.sdsc.edu
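Illustrative sketch of the four-stage MCTS algorithm

The four MCTS stages described above (selection, expansion, simulation, and updating values along the selected path) can be illustrated with a minimal sequential sketch. This is not the authors' SmartK or Hex implementation. It assumes a toy Nim game (take 1 to 3 stones, whoever takes the last stone wins) in place of Hex, the standard UCB1 formula with an assumed exploration constant of 1.4, and hypothetical names throughout (Node, ucb1, rollout, mcts).

```python
import math
import random


class Node:
    """Search-tree node for a toy Nim game (assumed stand-in for Hex)."""

    def __init__(self, stones, player, parent=None, move=None):
        self.stones = stones    # stones left on the table
        self.player = player    # player to move at this node (0 or 1)
        self.parent = parent
        self.move = move        # move that led from the parent to this node
        self.children = []
        self.untried = [m for m in (1, 2, 3) if m <= stones]
        self.visits = 0
        self.wins = 0.0         # wins for the player who moved into this node


def ucb1(child, parent_visits, c=1.4):
    """UCB1 weight: average win rate plus an exploration bonus."""
    exploit = child.wins / child.visits
    explore = c * math.sqrt(math.log(parent_visits) / child.visits)
    return exploit + explore


def rollout(stones, player, rng):
    """Simulation: play random moves to the end of the game, return the winner."""
    while stones > 0:
        stones -= rng.choice([m for m in (1, 2, 3) if m <= stones])
        player = 1 - player
    # The player to move with no stones left has lost.
    return 1 - player


def mcts(stones, rollouts, seed=0):
    """Run sequential MCTS and return the most-visited move for player 0."""
    rng = random.Random(seed)
    root = Node(stones, player=0)
    for _ in range(rollouts):
        node = root
        # Selection: descend through fully expanded nodes by UCB1 weight.
        while not node.untried and node.children:
            node = max(node.children, key=lambda ch: ucb1(ch, node.visits))
        # Expansion: add one new node to the tree.
        if node.untried:
            move = node.untried.pop()
            child = Node(node.stones - move, 1 - node.player, node, move)
            node.children.append(child)
            node = child
        # Simulation: random playout from the new node.
        winner = rollout(node.stones, node.player, rng)
        # Backpropagation: update every node on the selected path.
        while node is not None:
            node.visits += 1
            if node.parent is not None and winner == node.parent.player:
                node.wins += 1
            node = node.parent
    return max(root.children, key=lambda ch: ch.visits).move


if __name__ == "__main__":
    print("suggested move from 5 stones:", mcts(5, 2000))
```

In this toy game the winning strategy is to leave the opponent a multiple of four stones, so a converged search should prefer taking one stone from a pile of five. A root-parallel variant would run mcts independently in each process and combine the root statistics; how SmartK assigns tree nodes to processes between rounds is not specified in enough detail here to sketch.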