The Monte-Carlo Revolution in Go

The Monte-Carlo Revolution in Go

The Monte-Carlo Revolution in Go R´emiCoulom October, 2015 Introduction Monte-Carlo Tree Search Game Complexity History How can we deal with complexity ? Conclusion Game Complexity Game Complexity∗ Status Tic-tac-toe 103 Solved manually Connect 4 1014 Solved in 1988 Checkers 1020 Solved in 2007 Chess 1050 Programs > best humans Go 10171 Programs best humans ∗Complexity: number of board configurations R´emiCoulom The Monte Carlo Revolution in Go 2 / 13 When formal methods fail Approximate evaluation Reasoning with uncertainty Introduction Monte-Carlo Tree Search Game Complexity History How can we deal with complexity ? Conclusion How can we deal with complexity ? Some formal methods Use symmetries Use transpositions Combinatorial game theory R´emiCoulom The Monte Carlo Revolution in Go 3 / 13 Introduction Monte-Carlo Tree Search Game Complexity History How can we deal with complexity ? Conclusion How can we deal with complexity ? Some formal methods Use symmetries Use transpositions Combinatorial game theory When formal methods fail Approximate evaluation Reasoning with uncertainty R´emiCoulom The Monte Carlo Revolution in Go 3 / 13 E E E E E E E E E Classical approach = depth limit + pos. evaluation (E) (chess, shogi, . ) Monte-Carlo approach = random playouts Introduction Monte-Carlo Tree Search Game Complexity History How can we deal with complexity ? Conclusion Dealing with Huge Trees Full tree R´emiCoulom The Monte Carlo Revolution in Go 4 / 13 Monte-Carlo approach = random playouts Introduction Monte-Carlo Tree Search Game Complexity History How can we deal with complexity ? Conclusion Dealing with Huge Trees E E E E E E E E E Classical approach = depth limit + pos. evaluation (E) (chess, shogi, . ) Full tree R´emiCoulom The Monte Carlo Revolution in Go 4 / 13 Introduction Monte-Carlo Tree Search Game Complexity History How can we deal with complexity ? Conclusion Dealing with Huge Trees E E E E E E E E E Classical approach = depth limit + pos. evaluation (E) (chess, shogi, . ) Full tree Monte-Carlo approach = random playouts R´emiCoulom The Monte Carlo Revolution in Go 4 / 13 Introduction Principle of Monte-Carlo Evaluation Monte-Carlo Tree Search Monte-Carlo Tree Search History Patterns Conclusion A Random Playout R´emiCoulom The Monte Carlo Revolution in Go 5 / 13 Introduction Principle of Monte-Carlo Evaluation Monte-Carlo Tree Search Monte-Carlo Tree Search History Patterns Conclusion Principle of Monte-Carlo Evaluation Root Position Random Playouts MC Evaluation + + = R´emiCoulom The Monte Carlo Revolution in Go 6 / 13 Problems Evaluation may be wrong For instance, if all moves lose immediately, except one that wins immediately. Introduction Principle of Monte-Carlo Evaluation Monte-Carlo Tree Search Monte-Carlo Tree Search History Patterns Conclusion Basic Monte-Carlo Move Selection Algorithm N playouts for every move Pick the best winning rate 5,000 playouts/s on 19x19 9/10 3/10 4/10 R´emiCoulom The Monte Carlo Revolution in Go 7 / 13 Introduction Principle of Monte-Carlo Evaluation Monte-Carlo Tree Search Monte-Carlo Tree Search History Patterns Conclusion Basic Monte-Carlo Move Selection Algorithm N playouts for every move Pick the best winning rate 5,000 playouts/s on 19x19 Problems Evaluation may be wrong For instance, if all moves lose immediately, except one that wins immediately. 9/10 3/10 4/10 R´emiCoulom The Monte Carlo Revolution in Go 7 / 13 Introduction Principle of Monte-Carlo Evaluation Monte-Carlo Tree Search Monte-Carlo Tree Search History Patterns Conclusion Monte-Carlo Tree Search Principle More playouts to best moves Apply recursively Under some simple conditions: proven convergence to optimal move when #playouts! 1 9/15 2/6 3/9 R´emiCoulom The Monte Carlo Revolution in Go 8 / 13 Introduction Principle of Monte-Carlo Evaluation Monte-Carlo Tree Search Monte-Carlo Tree Search History Patterns Conclusion Incorporating Domain Knowledge with Patterns Patterns Library of local shapes Automatically generated Used for playouts Cut branches in the tree Examples (out of ∼30k) Good Bad to move R´emiCoulom The Monte Carlo Revolution in Go 9 / 13 Victories against classical programs 2006: Crazy Stone (Coulom) wins 9 × 9 Computer Olympiad 2007: MoGo (Wang, Gelly, Munos, . ) wins 19 × 19 Introduction Monte-Carlo Tree Search History Conclusion History (1/2) Pioneers 1993: Br¨ugmann:first MC program, not taken seriously 2000: The Paris School: Bouzy, Cazenave, Helmstetter R´emiCoulom The Monte Carlo Revolution in Go 10 / 13 Introduction Monte-Carlo Tree Search History Conclusion History (1/2) Pioneers 1993: Br¨ugmann:first MC program, not taken seriously 2000: The Paris School: Bouzy, Cazenave, Helmstetter Victories against classical programs 2006: Crazy Stone (Coulom) wins 9 × 9 Computer Olympiad 2007: MoGo (Wang, Gelly, Munos, . ) wins 19 × 19 R´emiCoulom The Monte Carlo Revolution in Go 10 / 13 Introduction Monte-Carlo Tree Search History Conclusion History (2/2) Games Against Strong Professionals 2008-08: MoGo beats Myungwan Kim (9p), H9 2012-03: Zen beats Masaki Takemiya (9p), H4 2013-03: CrazyStone beats Yoshio Ishida (9p), H4 2014-03: CrazyStone beats Norimoto Yoda (9p), H4 2015-03: CrazyStone loses to Chikun Cho (9p), H3 R´emiCoulom The Monte Carlo Revolution in Go 11 / 13 Difficulties Tree search can't handle all the threats. Must decompose into local problems. Introduction Monte-Carlo Tree Search History Conclusion Limits of the Current MC Programs R´emiCoulom The Monte Carlo Revolution in Go 12 / 13 Introduction Monte-Carlo Tree Search History Conclusion Limits of the Current MC Programs Difficulties Tree search can't handle all the threats. Must decompose into local problems. R´emiCoulom The Monte Carlo Revolution in Go 12 / 13 Perspectives Policy gradient for adaptive playouts Deep convolutional neural networks for clever patterns Introduction Monte-Carlo Tree Search History Conclusion Conclusion Summary of Monte-Carlo Tree Search A major breakthrough for computer Go Works similar games (Hex, Amazons) and automated planning R´emiCoulom The Monte Carlo Revolution in Go 13 / 13 Introduction Monte-Carlo Tree Search History Conclusion Conclusion Summary of Monte-Carlo Tree Search A major breakthrough for computer Go Works similar games (Hex, Amazons) and automated planning Perspectives Policy gradient for adaptive playouts Deep convolutional neural networks for clever patterns R´emiCoulom The Monte Carlo Revolution in Go 13 / 13.

View Full Text

Details

  • File Type
    pdf
  • Upload Time
    -
  • Content Languages
    English
  • Upload User
    Anonymous/Not logged-in
  • File Pages
    20 Page
  • File Size
    -

Download

Channel Download Status
Express Download Enable

Copyright

We respect the copyrights and intellectual property rights of all users. All uploaded documents are either original works of the uploader or authorized works of the rightful owners.

  • Not to be reproduced or distributed without explicit permission.
  • Not used for commercial purposes outside of approved use cases.
  • Not used to infringe on the rights of the original creators.
  • If you believe any content infringes your copyright, please contact us immediately.

Support

For help with questions, suggestions, or problems, please contact us