Neurogammon Wins Computer Olympiad

NOTE. Communicated by Terrence Sejnowski

Gerald Tesauro
IBM Thomas J. Watson Research Center, P.O. Box 704, Yorktown Heights, NY 10598 USA

Neurogammon 1.0 is a backgammon program which uses multilayer neural networks to make move decisions and doubling decisions. The networks learned to play backgammon by backpropagation training on expert data sets. At the recently held First Computer Olympiad in London, Neurogammon won the backgammon competition with a perfect record of five wins and no losses, thereby becoming the first learning program ever to win a tournament.

Neural network learning procedures are being widely investigated for many classes of practical applications. Board games such as chess, go, and backgammon provide a fertile testing ground because performance measures are clear and well defined. Furthermore, expert-level play can be of tremendous complexity. Learning programs have been studied in games environments for many years, but heretofore have not reached significant levels of performance.

Neurogammon 1.0 represents the culmination of previous research in backgammon learning networks (Tesauro and Sejnowski 1989; Tesauro 1988; Tesauro 1989) in the form of a fully functioning program. Neurogammon contains one network which makes doubling cube decisions and a set of six networks which make move decisions in different phases of the game. Each network has a standard fully-connected feed-forward architecture with a single hidden layer, and was trained by the well-known backpropagation algorithm (Rumelhart et al. 1986). The move-making networks were trained on a set of positions from 400 games in which the author played both sides.
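
As a rough illustration of the architecture just described (a fully connected feed-forward network with a single hidden layer, trained by backpropagation), the following Python sketch may help. The note does not give layer sizes, activation functions, the loss, or the learning rate, so the sigmoid units, squared-error loss, and all numeric values here are assumptions chosen only for illustration, and the class name `OneHiddenLayerNet` is hypothetical.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class OneHiddenLayerNet:
    """Fully connected feed-forward net with one hidden layer.

    Assumptions (not from the note): sigmoid units, a single output,
    squared-error loss, and small random initial weights.
    """

    def __init__(self, n_in, n_hidden, seed=0):
        rng = random.Random(seed)
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
                   for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
        self.b2 = 0.0

    def forward(self, x):
        # Hidden activations, then the single output activation.
        hidden = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
                  for row, b in zip(self.w1, self.b1)]
        out = sigmoid(sum(w * h for w, h in zip(self.w2, hidden)) + self.b2)
        return hidden, out

    def score(self, x):
        return self.forward(x)[1]

    def train_step(self, x, target, lr=0.5):
        """One backpropagation step; returns the loss before the update."""
        hidden, out = self.forward(x)
        # Error signal at the output for E = 0.5 * (out - target)^2.
        delta_out = (out - target) * out * (1.0 - out)
        # Propagate the error back through the (pre-update) output weights.
        delta_hidden = [delta_out * w * h * (1.0 - h)
                        for w, h in zip(self.w2, hidden)]
        for j, h in enumerate(hidden):
            self.w2[j] -= lr * delta_out * h
        self.b2 -= lr * delta_out
        for j, dh in enumerate(delta_hidden):
            for i, xi in enumerate(x):
                self.w1[j][i] -= lr * dh * xi
            self.b1[j] -= lr * dh
        return 0.5 * (out - target) ** 2


# Toy usage: the input vector stands in for an encoded board position.
net = OneHiddenLayerNet(n_in=3, n_hidden=4, seed=1)
position = [1.0, 0.0, 1.0]
first_loss = net.train_step(position, target=1.0)
for _ in range(200):
    last_loss = net.train_step(position, target=1.0)
print(first_loss, last_loss)
```

In the program described here, separate networks of this general shape handle the doubling decision and the different phases of move selection; the sketch shows only one generic network.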
A "comparison paradigm," described in (Tesauro 1989), was used to teach the networks that the move selected by the expert should score higher than each of the other possible legal moves. The doubling network was trained on a separate set of about 3000 positions which were classified according to a crude nine-point ranking scale of doubling strength. The training of each network proceeded until maximum generalization performance was obtained, as measured by performance on a set of test positions not used in training. The resulting program appears to play at a substantially higher level than conventional backgammon programs.

Neural Computation 1, 321-323 (1989) © 1989 Massachusetts Institute of Technology

At the Computer Olympiad in London, held on August 9-15, 1989, and organized by David Levy, Neurogammon competed against five other opponents: three commercial programs (Video Gammon/USA, Mephisto Backgammon/W. Germany, and Saitek Backgammon/Netherlands) and two non-commercial programs (Backbrain/Sweden and AI Backgammon/USA). Hans Berliner's BKG program was not entered in the competition. In matches to 11 points, Neurogammon defeated Video Gammon by 12-7, Mephisto by 12-5, Saitek by 12-9, Backbrain by 11-4, and AI Backgammon by 16-1, to take the gold medal in the backgammon competition. Also, in unofficial matches to 15 points against two other commercial programs, Fidelity Backgammon Challenger and Sun Microsystems' Gammontool, Neurogammon won by scores of 16-3 and 15-8 respectively. There were also a number of unofficial matches against intermediate-level humans at the Olympiad. Neurogammon won three of these and lost one. Finally, in an official challenge match on the last day of the Olympiad, Neurogammon put up a good fight but lost to a human expert, Ossi Weiner of West Germany, by a score of 2-7.
Weiner said that he was surprised at how much the program plays like a human, how rarely it makes mistakes, and that he had to play extremely carefully in order to beat it.

In summary, Neurogammon's victory at the Computer Olympiad demonstrates, along with similar recent advances in fields such as speech recognition (Lippmann 1989) and optical character recognition (LeCun et al. in press), that neural networks can be practical learning devices for tackling hard computational tasks. It also suggests that machine learning procedures of this type might be useful in other games. However, there is still much work to be done both in extracting additional information from the data sets within the existing approach, as well as in developing new approaches such as unsupervised learning based on outcome, which would supplement what can be achieved with supervised learning from expert data.

References

Tesauro, G., and Sejnowski, T.J. 1989. A parallel network that learns to play backgammon. Artificial Intelligence 39, 357-390.
Tesauro, G. 1988. Neural network defeats creator in backgammon match. Tech. Rep. no. CCSR-88-6, Center for Complex Systems Research, University of Illinois at Urbana-Champaign.
Tesauro, G. 1989. Connectionist learning of expert preferences by comparison training. In D. Touretzky, (Ed.), Advances in Neural Information Processing Systems, 99-106. Morgan Kaufmann Publishers.
Rumelhart, D.E., et al. 1986. Learning representations by backpropagating errors. Nature 323, 533-536.
Lippmann, R.P. 1989. Review of neural networks for speech recognition. Neural Comp. 1, 1-38.
LeCun, Y., Boser, B., Denker, J.S., Henderson, D., Howard, R.E., Hubbard, W., and Jackel, L.D. (in press). Backpropagation applied to handwritten zip code recognition. Neural Computation.

Received 30 August 1989; accepted 30 August 1989.
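
The comparison idea mentioned in the note (the expert's move should score higher than every other legal move in the same position) can be illustrated with a deliberately simplified stand-in. The sketch below is not the method of (Tesauro 1989): it swaps in a linear scorer trained with a pairwise logistic loss, and the feature vectors, function names, learning rate, and epoch count are all hypothetical, chosen only to show the shape of the training signal.

```python
import math


def score(weights, features):
    # Linear stand-in for a network's evaluation of the position after a move.
    return sum(w * f for w, f in zip(weights, features))


def comparison_step(weights, expert, other, lr=0.1):
    """One gradient step on log(1 + exp(-(s(expert) - s(other)))).

    The update raises the expert move's score relative to one
    alternative legal move.
    """
    margin = score(weights, expert) - score(weights, other)
    # d(loss)/d(margin); the clamp only guards against overflow in exp.
    grad_scale = -1.0 / (1.0 + math.exp(min(margin, 50.0)))
    return [w - lr * grad_scale * (e - o)
            for w, e, o in zip(weights, expert, other)]


def train(positions, n_features, epochs=50, lr=0.1):
    """positions: list of (expert_move_features, [alternative_features, ...])."""
    weights = [0.0] * n_features
    for _ in range(epochs):
        for expert, alternatives in positions:
            for other in alternatives:
                weights = comparison_step(weights, expert, other, lr)
    return weights


def choose_move(weights, candidates):
    # Play the highest-scoring candidate, returned as an index.
    return max(range(len(candidates)),
               key=lambda i: score(weights, candidates[i]))


# Hypothetical toy data: two positions, each with the expert's move and
# the other legal moves, all encoded as two made-up features.
toy_positions = [
    ([1.0, 0.0], [[0.0, 1.0], [0.5, 0.5]]),
    ([0.8, 0.1], [[0.2, 0.9]]),
]
learned = train(toy_positions, n_features=2)
print(choose_move(learned, [[0.0, 1.0], [1.0, 0.0], [0.5, 0.5]]))
```

Training on pairs rather than on absolute targets means the expert data only has to say which move was chosen, not how good each position is in absolute terms.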
Downloaded from http://direct.mit.edu/neco/article-pdf/1/3/321/811855/neco.1989.1.3.321.pdf by guest on 24 September 2021.
Recommended publications
  • Table of Contents 129
    TABLE OF CONTENTS
    Science and Checkers (H.J. van den Herik)
    Searching Solitaire in Real Time (R. Bjarnason, P. Tadepalli, and A. Fern)
    An Efficient Approach to Solve Mastermind Optimally (L-T. Huang, S-T. Chen, S-Ch. Huang, and S.-S. Lin)
    Note:
      Gentlemen, Stop your Engines! (G. McC. Haworth)
    Information for Contributors
    News, Information, Tournaments, and Reports:
      The 12th Computer Olympiad (Continued) (H.J. van den Herik, M.H.M. Winands, and J. Hellemons)
      DAM 2.2 Wins Draughts Tournament (T. Tillemans)
  • MoHex 2.0: a Pattern-Based MCTS Hex Player
    MoHex 2.0: a pattern-based MCTS Hex player
    Shih-Chieh Huang (1,2), Broderick Arneson (2), Ryan B. Hayward (2), Martin Müller (2), and Jakub Pawlewicz (3)
    (1) DeepMind Technologies; (2) Computing Science, University of Alberta; (3) Institute of Informatics, University of Warsaw
    Abstract. In recent years the Monte Carlo tree search revolution has spread from computer Go to many areas, including computer Hex. MCTS Hex players now outperform traditional knowledge-based alpha-beta search players, and the reigning Computer Olympiad Hex gold medallist is the MCTS player MoHex. In this paper we show how to strengthen MoHex, and observe that (as in computer Go) using learned patterns in priors and replacing a hand-crafted simulation policy with a softmax policy that uses learned patterns can significantly increase playing strength. The result is MoHex 2.0, about 250 Elo stronger than MoHex on the 11×11 board, and 300 Elo stronger on 13×13.
    1 Introduction
    In the 1940s Piet Hein [22] and independently John Nash [26–28] invented Hex, the classic two-player alternate-turn connection game. The game is easy to implement (in the 1950s Claude Shannon and E.F. Moore built an analogue Hex player based on electrical circuits [29]) but difficult to master, and has often been used as a testbed for artificial intelligence research. Around 2006 Monte Carlo tree search appeared in Go [11] and soon spread to other domains. The four newest Olympiad Hex competitors (MoHex from 2008 [4], Yopt from 2009 [3], MIMHex from 2010 [5], Panoramex from 2011 [20]) all use MCTS.
  • Rules for the 17th World Computer-Chess
    RULES FOR THE 17th WORLD COMPUTER-CHESS CHAMPIONSHIP
    Pamplona, Spain, May 11-18, 2009
    The Board of ICGA
    The 17th World Computer-Chess Championship will take place from May 11-18, 2009 in Pamplona, Spain. Here we recall that the Maastricht Triennial Meeting in 2002, i.e., the ICGA meeting, decided that the WCCC should be held annually without distinguishing any type of machines. The observation was clear: all kinds of differences between microcomputers, personal computers, "normal" computers, and supercomputers were in some sense obsolete and the classification thus was considered artificial. So was the division into the classes of single processors and multiprocessors. For 2009 we are introducing a new rule on a somewhat experimental basis. For this year's WCCC a limit is being placed on the number of cores that a computer system may use for the tournament. The longer-term future of this rule is currently under discussion in various computer chess forums and will be debated by the contestants during this year's World Championship, which might lead to changes for future years. Another division considered obsolete since 2002 is that between amateur and professional. Is not the real amateur a professional? Or the other way round? For organizational matters we have kept this difference, since for amateurs the cost of travelling and housing is already expensive. Being treated as a professional may be agreeable, but if you have to pay for it then it might be less agreeable. As in previous years we have maintained three groups here, viz.
  • Computer Chinese Chess
    Computer Chinese Chess
    Tsan-sheng Hsu
    http://www.iis.sinica.edu.tw/~tshsu
    Abstract
    An introduction to research problems and opportunities in Computer Games.
      • Using Computer Chinese chess (象棋) as examples.
      • Show how theoretical research can help in solving the problems.
    Data-intensive computing: tradeoff between computing on the spot and using pre-stored knowledge.
    Phases of games
      • Open game (開局): database
      • Middle game (中局): search
      • End game (殘局): knowledge
    Topics:
      • Introduction
      • Construction of a huge knowledge base that is consistent
      • Playing rules for repetition of positions
      • Construction of huge endgame databases
      • Benchmark
    TCG: Computer Chinese Chess, 20141224, Tsan-sheng Hsu ©
    Introduction
    Why study Computer Games:
      • Intelligence requires knowledge.
      • Games hold an inexplicable fascination for many people, and the notion that computers might play games has existed at least as long as computers.
      • Reasons why games appeared to be a good domain in which to explore machine intelligence. They provide a structured task in which it is very easy to measure success or failure. They did not obviously require large amount of knowledge.
    A course on teaching computers to play games was introduced at NTU in 2007.
    Predictions for 2010: Status
    My personal opinion about the status of Prediction-2010 [van den Herik 2002] at October, 2010, right after the Computer Olympiad held in Kanazawa, Japan.
    solved over champion world champion grand master amateur Awari Chess Go (9 × 9) Bridge Go (19 × 19) Othello Draughts (10 × 10) Chinese chess Shogi Checkers (8 × 8) Scrabble Hex Backgammon Amazons Lines of Action
  • Computer Olympiad
    The 7th Computer Olympiad
    Mark Winands
    IKAT, Universiteit Maastricht
    From July 5 to 11 2002 the Institute for Knowledge and Agent Technology (IKAT) organised the 7th Computer Olympiad at the Universiteit Maastricht (UM). Together with the Olympiad a Computer-Games workshop was organised. This event took place from July 6 to 8. Both events are described in this report.
    The Computer Olympiad
    The Computer Olympiad is a multi-games event in which all of the participants are computer programs. The Olympiad is a brainchild of David Levy, who organised this tournament in 1989 (London) for the first time. The next five editions were held in 1990 (London), 1991 (Maastricht), 1992 (London), 2000 (London) and 2001 (Maastricht). This year was the third time that the event was held in Maastricht. IKAT was responsible for the organisation. Similar to last year, Jaap van den Herik was the tournament director. The purpose of the Olympiad is to determine the strongest program for each game. The Olympiad has grown to a social event, as the authors of the programs are not bound to silence during the play as in human tournaments. The event is a reunion where programmers meet, discuss ideas and renew acquaintances. Some teams arrive with the clear goal of winning, some just come to participate, some to test new ideas under tournament conditions. The Olympiad is a truly international event. This year, participants came from all over the world: USA, Canada, Japan, Taiwan, Israel and the European Union. The event was held under the auspices of the ICCA (International Computer Chess Association), which gave it an official status.
  • Crazy Stone Wins First UEC Cup 1
    Crazy Stone wins First UEC Cup
    Rémi Coulom
    Université Charles de Gaulle, Lille, France
    The First UEC Cup took place on December 1–2, 2007, at the University of Electro-Communications, in Tokyo, Japan. It is a new computer-Go tournament, that was set up after the cancellation of the Gifu Challenge. The Gifu Challenge had been a yearly computer-Go tournament in Japan between 2003 and 2006, but was cancelled in 2007 because of lack of support by sponsors. With 27 programs participating, the First UEC Cup was the largest computer-Go tournament in a very long time. According to Nick Wedd's list at http://www.computer-go.info/events/, it is the third in history, after the 1997 FOST Cup (40), and the 1999 CGF Cup (28). All participants were from Japan, except two invited programs from France, MoGo and Crazy Stone, that played with local operators. The tournament was organized in two phases. On the first day, a 5-round Swiss tournament selected the top 16 programs. On the second day, a 4-round knockout tournament ranked the 16 selected programs. Tables 1 and 2 summarize the results. Games were played with a time control of 40 minutes, sudden death, with Japanese rules and 6.5 points of komi. Game records may be downloaded from http://jsb.cs.uec.ac.jp/~igo/result.html, and http://jsb.cs.uec.ac.jp/~igo/result2.html. This first edition of the UEC cup confirmed the strength of Monte-Carlo programs. MoGo and Crazy Stone, who were first and second in the 2007 Computer Olympiad in Amsterdam, took the third and first places.
  • CONFERENCE on COMPUTERS and GAMES 2008 Beijing, China
    CALL FOR PAPERS: CONFERENCE ON COMPUTERS AND GAMES 2008
    Beijing, China, September-October, 2008
    Professor Xinhe Xu, Professor Zhongmin Ma (Beijing, China); Professor H.J. van den Herik, and Dr. M.H.M. Winands (Maastricht, The Netherlands)
    The Beijing Longlife Group is enabling the organization of the Conference on Computers and Games 2008 (CG2008), the 16th World Computer-Chess Championship (WCCC) and the 13th Computer Olympiad (CO) (September-October, 2008) to be held in Beijing, China. Location: the Beijing Golden Century Golf club, Fangshan, Beijing, China. For the current information on the 16th WCCC and the 13th CO, see www.icga.org. A Chinese website will be opened as well. Below we focus on the CG2008. The start of the event will be at the Beijing Golden Century Golf club by the President of the Beijing Longlife Group. The conference will take place on three consecutive days, each day from 8.30 h till 12.30 h. The conference aims in the first place at providing an international forum for computer-games researchers presenting new results on ongoing work. Hence, we invite contributors to submit papers on all aspects of research related to computers and games. Relevant topics include, but are not limited to: (1) the current state of game-playing programs, (2) new theoretical developments in game-related research, (3) general scientific contributions produced by the study of games, and (4) (adaptive) game AI. Moreover, researchers of topics such as (5) cognitive research of how humans play games, and (6) issues related to networked games are invited to submit their contribution as well.
  • Olympiads in Informatics 4
    Olympiads in Informatics, Volume 4, 2010
    B.A. BURTON. Encouraging algorithmic thinking without a computer
    V.M. KIRYUKHIN. Mutual influence of the national educational standard and olympiad in informatics contents
    V.M. KIRYUKHIN, M.S. TSVETKOVA. Strategy for ICT skills teachers and informatics olympiad coaches development
    M. KUBICA, J. RADOSZEWSKI. Algorithms without programming
    I.W. KURNIA, B. MARSHAL. Indonesian olympiad in informatics
    K. MANEV, B. YOVCHEVA, M. YANKOV, P. PETROV. Testing of programs with random generated test cases
    B. MERRY. Performance analysis of sandboxes for reactive tasks
    P.S. PANKOV. Real processes as sources for tasks in informatics
    M. PHILLIPPS. The New Zealand experience of finding informatics talent
    T. TOCHEV, T. BOGDANOV. Validating the security and stability of the grader
  • Computer Shogi 2000 Through 2004
    Computer Shogi 2000 through 2004
    Takenobu Takizawa
    Since the first computer shogi program was developed by the author in 1974, thirty years have passed. During that time, shogi programming has attracted both researchers and commercial programmers and playing strength has improved steadily. Currently, the best programs have a level that is comparable to that of a very strong amateur player (about 5-dan). In the near future, a good program will beat a professional player. The basic structure of strong shogi programs is similar to that of chess programs. However, the differences between chess and shogi have led to the development of some shogi-specific methods. In this paper the author will give an overview of the history of computer shogi, summarize the most successful techniques and give some ideas for future directions of research in computer shogi.
    1. Introduction
    Shogi, or Japanese chess, is a two-player complete information game similar to chess. As in chess, the goal of the game is to capture the opponent's king. However, there are a number of differences between chess and shogi: the shogi board is slightly bigger than the chess board (9x9 instead of 8x8), there are different pieces that are relatively weak compared to the pieces in chess (no queens, but gold generals, silver generals and lances) and the number of pieces in shogi is larger (40 instead of 32). But the most important difference between chess and shogi is the possibility to re-use captured pieces.
  • Computer Games Workshop 2007
    Computer Games Workshop 2007
    Amsterdam, June 15-17, 2007
    Preface
    We are pleased to present the proceedings of the Computer Games Workshop 2007, Amsterdam, June 15-17, 2007. This workshop will be held in conjunction with the 12th Computer Olympiad and the 15th World Computer-Chess Championship. Although the announcement was quite late, we were pleased to receive no less than 24 contributions. After a "light" refereeing process 22 papers were accepted. We believe that they present a nice overview of state-of-the-art research in the field of computer games. The 22 accepted papers can be categorized into five groups, according to the type of games used.
    Chess and Chess-like Games
    In this group we have included two papers on Chess, one on Kriegspiel, and three on Shogi (Japanese Chess). Matej Guid and Ivan Bratko investigate in Factors Affecting Diminishing Returns for Searching Deeper the phenomenon of diminishing returns for additional search effort. Using the chess programs Crafty and Rybka on a large set of grandmaster games, they show that diminishing returns depend on (a) the value of positions, (b) the quality of the evaluation function, and (c) the phase of the game and the amount of material on the board. Matej Guid, Aritz Pérez, and Ivan Bratko in How Trustworthy is Crafty's Analysis of Chess Champions? again used Crafty in an attempt at an objective assessment of the strength of chess grandmasters of different times. They show that their analysis is trustworthy, and hardly depends on the strength of the chess program used, the search depth applied, or the size of the sets of positions used.
  • Fuego at the Computer Olympiad in Pamplona 2009: a Tournament Report∗
    Fuego at the Computer Olympiad in Pamplona 2009: a Tournament Report
    Martin Müller
    May 28, 2009
    Abstract
    The 14th Computer Olympiad took place in Pamplona, Spain from May 11-18, 2009. The Fuego program won the 9 × 9 Go competition, and took second place in 19 × 19 Go. This report provides some analysis of the games played by Fuego in the competition.
    1 Introduction
    This was the second Olympiad for Fuego. In its first participation, 2008 in Beijing, Fuego ended up in fourth place in both 9 × 9 and 19 × 19. This year's competition took place in the beautiful historic Palacio del Contestable in the center of Pamplona. Komi was 7.5 and Chinese rules were used in both tournaments. The Fuego team, software and hardware were as follows: Markus Enzenberger is the creator of Fuego and its lead programmer. Martin Müller is team leader, programmer and opening book author. Broderick Arneson is a programmer. Gerry Tesauro and Richard Segal of IBM Watson Research contributed the distributed memory implementation. The software used was the current svn version of Fuego, with two experimental additions: First, the distributed memory implementation allowed the program to run on a cluster of eight 8-core machines provided by IBM and operated by Rich Segal. Second, for the 19 × 19 competition, an experimental version called FuegoEx was used. This version of Fuego employs knowledge of the classical Go program Explorer for move pruning in core parts of the UCT tree.
    2 9 × 9 Tournament
    The 9 × 9 competition was played on May 11 - 13.
  • Game Theory, Alive Anna R
    Game Theory, Alive
    Anna R. Karlin, Yuval Peres
    10.1090/mbk/101
    AMERICAN MATHEMATICAL SOCIETY, Providence, Rhode Island
    2010 Mathematics Subject Classification. Primary 91A10, 91A12, 91A18, 91B12, 91A24, 91A43, 91A26, 91A28, 91A46, 91B26.
    For additional information and updates on this book, visit www.ams.org/bookpages/mbk-101
    Library of Congress Cataloging-in-Publication Data
    Names: Karlin, Anna R. | Peres, Y. (Yuval)
    Title: Game theory, alive / Anna Karlin, Yuval Peres.
    Description: Providence, Rhode Island : American Mathematical Society, [2016] | Includes bibliographical references and index.
    Identifiers: LCCN 2016038151 | ISBN 9781470419820 (alk. paper)
    Subjects: LCSH: Game theory. | AMS: Game theory, economics, social and behavioral sciences – Game theory – Noncooperative games. msc | Game theory, economics, social and behavioral sciences – Game theory – Cooperative games. msc | Game theory, economics, social and behavioral sciences – Game theory – Games in extensive form. msc | Game theory, economics, social and behavioral sciences – Mathematical economics – Voting theory. msc | Game theory, economics, social and behavioral sciences – Game theory – Positional games (pursuit and evasion, etc.). msc | Game theory, economics, social and behavioral sciences – Game theory – Games involving graphs. msc | Game theory, economics, social and behavioral sciences – Game theory – Rationality, learning. msc | Game theory, economics, social and behavioral sciences – Game theory – Signaling,