Artificial Intelligence as Structural Estimation: Economic Interpretations of Deep Blue, Bonanza, and AlphaGo∗

Mitsuru Igami†

March 1, 2018

Abstract

Artificial intelligence (AI) has achieved superhuman performance in a growing number of tasks, but understanding and explaining AI remain challenging. This paper clarifies the connections between machine-learning algorithms to develop AIs and the econometrics of dynamic structural models through the case studies of three famous game AIs. Chess-playing Deep Blue is a calibrated value function, whereas shogi-playing Bonanza is an estimated value function via Rust’s (1987) nested fixed-point method. AlphaGo’s “supervised-learning policy network” is a deep neural network implementation of Hotz and Miller’s (1993) conditional choice probability estimation; its “reinforcement-learning value network” is equivalent to Hotz, Miller, Sanders, and Smith’s (1994) conditional choice simulation method. Relaxing these AIs’ implicit econometric assumptions would improve their structural interpretability.

Keywords: Artificial intelligence, Conditional choice probability, Deep neural network, Dynamic game, Dynamic structural model, Simulation estimator.

JEL classifications: A12, C45, C57, C63, C73.

arXiv:1710.10967v3 [econ.EM] 1 Mar 2018

∗ First version: October 30, 2017. This paper benefited from seminar comments at Riken AIP, Georgetown, Tokyo, Osaka, Harvard, Johns Hopkins, and The Third Cambridge Area Economics and Computation Day conference at Microsoft Research New England, as well as conversations with Susan Athey, Xiaohong Chen, Jerry Hausman, Greg Lewis, Robert Miller, Yusuke Narita, Aviv Nevo, Anton Popov, John Rust, Takuo Sugaya, Elie Tamer, and Yosuke Yasuda.

† Yale Department of Economics and MIT Department of Economics. E-mail: [email protected].
1 Introduction

Artificial intelligence (AI) has achieved human-like performance in a growing number of tasks, such as visual recognition and natural language processing.¹ The classical games of chess, shogi (Japanese chess), and Go were once thought to be too complicated and intractable for AI, but computer scientists have overcome these challenges. In chess, IBM’s computer system named Deep Blue defeated Grandmaster Garry Kasparov in 1997. In shogi, a machine-learning-based program called Bonanza challenged (and was defeated by) Ryūō champion Akira Watanabe in 2007, but one of its successors (Ponanza) played against Meijin champion Amahiko Satoh and won in 2017. In Go, Google DeepMind developed AlphaGo, a deep-learning-based program, which beat the 2-dan European champion Fan Hui in 2015, a 9-dan (highest rank) professional Lee Sedol in 2016, and the world’s best player Ke Jie in 2017.

¹ The formal definition of AI seems contentious, partly because scholars have not agreed on the definition of intelligence in the first place. This paper follows a broad definition of AI as computer systems able to perform tasks that traditionally required human intelligence.

Despite such remarkable achievements, one of the lingering criticisms of AI is its lack of transparency. The internal mechanism seems like a black box to most people, including the human experts of the relevant tasks,² which raises concerns about accountability and responsibility. The desire to understand and explain the functioning of AI is not limited to the scientific community. For example, the US Department of Defense airs its concern that “the effectiveness of these systems is limited by the machine’s current inability to explain their decisions and actions to human users,” which led it to host the Explainable AI (XAI) program aimed at developing “understandable” and “trustworthy” machine learning.³

² For example, Yoshiharu Habu, the strongest shogi player in recent history, states he does not understand certain board-evaluation functions of computer shogi programs (Habu and NHK [2017]).

³ See https://www.darpa.mil/program/explainable-artificial-intelligence (accessed on October 17, 2017).

This paper examines three prominent game AIs in recent history: Deep Blue, Bonanza, and AlphaGo. I have chosen to study this category of AIs because board games represent an archetypical task that has required human intelligence, including cognitive skills, decision-making, and problem-solving. They are also well-defined problems for which economic interpretations are more natural than for, say, visual recognition and natural language processing. The main finding from this paper’s case studies is that these AIs’ key components are mathematically equivalent to well-known econometric methods to estimate dynamic structural models.

Chess experts and IBM’s engineers manually adjusted thousands of parameters in Deep Blue’s “evaluation function,” which quantifies the probability of eventual winning as a function of the current positions of pieces (i.e., state of the game) and therefore could be interpreted as an approximate value function. Deep Blue is a calibrated value function with a linear functional form.

By contrast, the developer of Bonanza constructed a dataset of professional shogi games, and used a discrete-choice regression and a backward-induction algorithm to determine the parameters of its value function. Hence, his method of “supervised learning” is equivalent to Rust’s (1987) nested fixed-point (NFXP) algorithm, which combined a discrete-choice model with dynamic programming (DP) in the maximum likelihood estimation (MLE) framework. Bonanza is an empirical model of human shogi players that is estimated by this direct (or “full-solution”) method.
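The contrast between a calibrated and an estimated value function can be illustrated with a minimal Python sketch. This is not Deep Blue's or Bonanza's actual code. It assumes a linear value function V(s; θ) = θ·x(s) over two hypothetical features, and a logit choice rule over the successor states of the legal moves. Calibration would mean setting θ by hand, whereas estimation picks θ to maximize the likelihood of observed expert moves. The backward-induction step that the text pairs with the discrete-choice regression is deliberately omitted and replaced by a one-step lookahead, so this is only the discrete-choice part, not NFXP itself. The feature names and the toy data are invented for illustration.

```python
import math

# Hypothetical features of a board state, e.g. x = [material_diff, mobility].
# Both names are illustrative assumptions, not taken from any real engine.

def value(theta, features):
    """Linear value function: V(s; theta) = sum_k theta_k * x_k(s)."""
    return sum(t * x for t, x in zip(theta, features))

def choice_probs(theta, successor_features):
    """Logit probabilities over legal moves.

    successor_features holds one feature vector per legal move, describing
    the state reached after that move (one-step lookahead, an assumption).
    """
    vals = [value(theta, x) for x in successor_features]
    top = max(vals)  # subtract the max for numerical stability
    weights = [math.exp(v - top) for v in vals]
    total = sum(weights)
    return [w / total for w in weights]

def log_likelihood(theta, data):
    """Sum of log-probabilities of the moves the experts actually chose."""
    return sum(math.log(choice_probs(theta, succ)[chosen])
               for succ, chosen in data)

def estimate(data, n_params, step=0.5, iters=500):
    """Maximize the logit log-likelihood by plain gradient ascent."""
    theta = [0.0] * n_params
    for _ in range(iters):
        grad = [0.0] * n_params
        for succ, chosen in data:
            probs = choice_probs(theta, succ)
            for k in range(n_params):
                expected = sum(p * x[k] for p, x in zip(probs, succ))
                # Score of the conditional logit: chosen minus expected feature.
                grad[k] += succ[chosen][k] - expected
        theta = [t + step * g / len(data) for t, g in zip(theta, grad)]
    return theta

# Toy "expert games": (feature vectors of the successor states, chosen index).
toy_data = [
    ([[1.0, 0.0], [0.0, 1.0]], 0),
    ([[2.0, 1.0], [0.0, 0.0]], 0),
    ([[0.0, 1.0], [1.0, 0.0]], 1),
    ([[1.0, 0.0], [0.0, 0.5]], 1),
]

theta_hat = estimate(toy_data, n_params=2)
print("estimated theta:", theta_hat)
print("log-likelihood:", log_likelihood(theta_hat, toy_data))
```

In this sketch a calibrated counterpart would simply pass a hand-chosen vector (say, `[1.0, 0.0]`) to `choice_probs` instead of `theta_hat`; the functional form is the same and only the source of the parameters differs.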
Google DeepMind’s AlphaGo (its original version) embodies an alternative approach to estimating dynamic structural models: two-step estimation.⁴ Its first component, the “supervised-learning (SL) policy network,” predicts the moves of human experts as a function of the board state. It is an empirical policy function with a class of nonparametric basis functions (DNN: deep neural network) that is estimated by MLE, using data from online Go games. Thus, the SL policy network is a DNN implementation of Hotz and Miller’s (1993) first-stage conditional choice probability (CCP) estimation.

⁴ This paper focuses on the original version of AlphaGo, published in 2016, and distinguishes it from its later version, “AlphaGo Zero,” published in 2017. The latter version contains few econometric elements, and is not an immediate subject of my case study, although I discuss some of its interesting features in section 5.

AlphaGo’s value function, called “reinforcement-learning (RL) value network,” is constructed by simulating many games based on the self-play of the SL policy network and estimating another DNN model that maps state to the probability of winning. This procedure is equivalent to the second-stage conditional choice simulation (CCS) estimation, proposed by Hotz, Miller, Sanders, and Smith (1994) for single-agent DP, and by Bajari, Benkard, and Levin (2007) for dynamic games.

Thus, these leading game AIs and the core algorithms for their development turn out to be successful applications of the empirical methods to implement dynamic structural models.

After introducing basic notations in section 2, I describe the main components of Deep Blue, Bonanza, and AlphaGo in sections 3, 4, and 5, respectively, and explain their structural interpretations. Section 6 clarifies some of the implicit assumptions underlying these AIs, such as (the absence of) unobserved heterogeneity, strategic interactions, and various constraints human players are facing in real games. Section 7 concludes by suggesting that relaxing some of these assumptions and explicitly incorporating more realistic features of the data-generating process could help make AIs both more human-like (if needed) and more amenable to structural interpretations.

Literature

This paper clarifies the equivalence between some of the algorithms for developing game AI and the aforementioned econometric methods for estimating dynamic models. As such, the most closely related papers are Rust (1987), Hotz and Miller (1993), and Hotz, Miller, Sanders, and Smith (1994). The game AIs I analyze in this paper are probably the most successful (or at least the most popular) empirical applications of these methods. For a historical review of numerical methods for dynamic programming, see Rust (2017).

At a higher level, the purpose of this paper is to clarify the connections between machine learning and econometrics in certain areas. Hence, the paper shares the spirit of, for example, Belloni, Chernozhukov, and Hansen (2014), Varian (2014), Athey (2017), and Mullainathan and Spiess (2017), among many others in the rapidly growing literature on data analysis at the intersection of computer science and economics.

2 Model

Rules

Chess, shogi, and Go belong to the same class of games, with two players ($i = 1, 2$), discrete time ($t = 1, 2, \ldots$), alternating moves (players 1 and 2 choose their actions, $a_t$, in odd and even periods, respectively), perfect information, and deterministic state transition,

$$ s_{t+1} = f(s_t, a_t), \tag{1} $$

where both the transition, $f(\cdot)$, and the initial state, $s_1$, are completely determined by the rule of each game.⁵

⁵ This setup abstracts from the time constraints in official games because the developers of game AIs typically do not incorporate them at the data-analysis stage. Hence, t represents turn-to-move,

Action space is finite and is defined by the rule as “legal moves,”

$$ a_t \in A(s_t). \tag{2} $$

State space is finite as well, and consists of four mutually exclusive subsets:

$$ s_t \in S = S_{\text{cont}} \sqcup S_{\text{win}} \sqcup S_{\text{loss}} \sqcup S_{\text{draw}}, \tag{3} $$
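The ingredients in equations (1) to (3) can be sketched in a few lines of Python. Tic-tac-toe is used here purely as a small stand-in, which is an assumption of this sketch: it shares the structure described above (two players, alternating moves, perfect information, deterministic transition) while being small enough to check by hand. The names `transition`, `legal_moves`, and `classify` are hypothetical labels for f, A(s_t), and the partition of S, and outcomes are recorded from player 1's point of view, which is also an assumption.

```python
from typing import List, Tuple

# A state is the 9 cells of the board: 0 = empty, 1 = player 1, 2 = player 2.
State = Tuple[int, ...]

# The eight winning lines (rows, columns, diagonals).
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
         (0, 3, 6), (1, 4, 7), (2, 5, 8),
         (0, 4, 8), (2, 4, 6)]

INITIAL_STATE: State = (0,) * 9  # s_1, fixed by the rules


def player_to_move(s: State) -> int:
    """Player 1 moves in odd periods, player 2 in even periods."""
    return 1 if s.count(0) % 2 == 1 else 2


def classify(s: State) -> str:
    """Return the subset of S that s belongs to: cont, win, loss, or draw."""
    for i, j, k in LINES:
        if s[i] != 0 and s[i] == s[j] == s[k]:
            return "win" if s[i] == 1 else "loss"
    return "draw" if 0 not in s else "cont"


def legal_moves(s: State) -> List[int]:
    """A(s_t): the empty cells while the game continues, nothing afterwards."""
    if classify(s) != "cont":
        return []
    return [i for i, cell in enumerate(s) if cell == 0]


def transition(s: State, a: int) -> State:
    """Deterministic transition s_{t+1} = f(s_t, a_t)."""
    if a not in legal_moves(s):
        raise ValueError(f"illegal move {a}")
    board = list(s)
    board[a] = player_to_move(s)
    return tuple(board)


def play(actions: List[int]) -> State:
    """Apply a sequence of actions starting from the initial state."""
    s = INITIAL_STATE
    for a in actions:
        s = transition(s, a)
    return s


if __name__ == "__main__":
    print(classify(INITIAL_STATE), legal_moves(INITIAL_STATE))
    print(classify(play([0, 3, 1, 4, 2])))  # player 1 completes the top row
```

Because `transition` contains no randomness, a game is fully described by its sequence of actions, which is what makes the recorded moves of past games usable as data on players' choices.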