AlphaZero to Alpha Hero: A Pre-Study on Additional Tree Sampling within Self-Play Reinforcement Learning


DEGREE PROJECT IN COMPUTER ENGINEERING, FIRST CYCLE, 15 CREDITS
STOCKHOLM, SWEDEN 2019

AlphaZero to Alpha Hero
A Pre-Study on Additional Tree Sampling within Self-Play Reinforcement Learning

FREDRIK CARLSSON
JOEY ÖHMAN

KTH ROYAL INSTITUTE OF TECHNOLOGY
SCHOOL OF ELECTRICAL ENGINEERING AND COMPUTER SCIENCE

Bachelor in Computer Science
Date: June 5, 2019
Supervisor: Jörg Conradt
Examiner: Örjan Ekeberg
Swedish title: Från AlphaZero till alfahjälte - En förstudie om inklusion av additionella trädobservationer i straffinlärning

Abstract

In self-play reinforcement learning an agent plays games against itself and, with the help of hindsight and retrospection, improves its policy over time. Using this premise, AlphaZero famously became the strongest known Go, Shogi, and Chess entity by training a deep neural network on data collected solely from self-play. AlphaZero couples this deep neural network with a Monte Carlo Tree Search algorithm that drastically improves the network's initial policy and state evaluation. During training, AlphaZero relies on the final outcome of each game to generate training labels. By altering the learning target to instead use the improved state evaluation obtained after the tree search, it becomes possible to create training labels for states visited exclusively during tree search. We propose the extension Additional Tree Sampling, which exploits this change of learning target, and provide theoretical arguments and counter-arguments for the validity of this approach. Further, an empirical analysis is performed on the game Connect Four, yielding results that justify the change in learning target. The altered learning target shows no negative impact on final player strength or on the behavior of the learning algorithm over time. Based on these positive results, we encourage further research on Additional Tree Sampling in order to validate or reject the usefulness of this method.

Sammanfattning

In self-play reinforcement learning an agent plays against itself. With the help of sophisticated algorithms and retrospection, the agent can learn a good policy over time. Using this method, AlphaZero became the world's strongest player in Go, Shogi, and Chess by training a deep neural network on data collected solely from self-play. AlphaZero combines this deep neural network with a Monte Carlo Tree Search algorithm that greatly strengthens the network's evaluation of a board. The original version of AlphaZero generates training data using the final outcome of a game as the learning target. By instead changing this learning target to the result of the tree search, it becomes possible to create training data from boards that were only discovered through tree search. We propose an extension, Additional Tree Sampling, which exploits this change of learning target. This is followed by theoretical arguments for and against this extension of AlphaZero. Further, an empirical analysis is performed on the game Connect Four, which supports the claim that the modification of the learning target is reasonable. The altered learning target shows no signs of degrading the final player's strength or the learning algorithm's behavior during training. Based on these positive results, we encourage further research on Additional Tree Sampling to see whether this method would improve AlphaZero.
Acknowledgements

We both feel a strong need to personally thank all of the people who have supported, helped, and guided us throughout the work of this thesis. We are truly appreciative of all the good folks who have aided us along the way. Special thanks go to senior AI researcher Lars Rasmusson, who showed great interest in our problem and was of great help during many discussions, culminating in many hours spent in front of a whiteboard. Jörg Conradt, our supervisor, aided us greatly by spending much of his time and energy to supply us with much of the needed hardware. His devotion and constant availability were equally commendable; rarely has a humanoid been witnessed to respond so quickly to emails. Finally, a warm thank you is reserved for Sverker Jansson and RISE SICS for giving us access to their offices and supplying us with many much-needed GPUs.

Contents

1 Introduction
  1.1 Problem Statement
  1.2 Scope
  1.3 Purpose
2 Background
  2.1 Reinforcement Learning
    2.1.1 Exploration versus Exploitation
    2.1.2 Model-based Reinforcement Learning
    2.1.3 State Evaluation
  2.2 Deep Learning
    2.2.1 Artificial Neural Networks
    2.2.2 Training Neural Networks
    2.2.3 Skewed Datasets & Data Augmentation
    2.2.4 Summary
  2.3 Monte Carlo Tree Search
    2.3.1 Selection
    2.3.2 Evaluation
    2.3.3 Expansion
    2.3.4 Backpropagation
    2.3.5 Monte Carlo Tree Search as a Function
  2.4 AlphaGo & AlphaZero Overview
    2.4.1 AlphaZero Algorithm
    2.4.2 Self-Play
    2.4.3 Generating Training Samples
    2.4.4 Replay Buffer
    2.4.5 Supervised Training
    2.4.6 Network Architecture
  2.5 Environment
    2.5.1 Connect Four
3 Hypothesis
  3.1 The Learning Target
  3.2 Additional Tree Sampling
  3.3 Potential Issues
  3.4 Quality of Data
  3.5 Skewing the Dataset
4 Method
  4.1 Self-Play
  4.2 Replay Buffer
  4.3 Pre-processing Data
  4.4 Supervised Training
  4.5 Network Architecture
  4.6 State Representation
  4.7 Experiment Setup
  4.8 Evaluating Agents
    4.8.1 Generalization Performance
    4.8.2 Player Strength
    4.8.3 Agent Behavior
5 Results
  5.1 General Domain Performance
  5.2 Player Strength Comparison
  5.3 Behavior Over Generations
6 Discussion
  6.1 General Domain Performance
  6.2 Player Strength Comparison
  6.3 Behavior Over Generations
  6.4 Limitations
  6.5 Future Research
7 Conclusions
References
Appendix A The network used in the empirical test provided for Connect Four
Appendix B Played Games Against Optimal
Appendix C GitHub: Implementation & Pre-trained Agents

1 Introduction

Thanks to their predefined rules and limited state spaces, games have throughout history served as a testbed for Artificial Intelligence. Their bounded environmental complexity and clear goals often give a good indication of how well a particular agent is performing and act as a metric of success. Additionally, the artificial nature of most games makes them easy to simulate, removing the need for hardware agents to interact with the real world. Historically, most research within the field of game-related AI has been devoted to solving classical two-player board games such as Chess, Backgammon, and Go. One recent milestone for AI in such domains is AlphaZero [1], which achieves superhuman strength at Go, Chess, and Shogi by only playing against itself.
The predecessor AlphaGo [2] was produced and optimized for the game of Go and famously managed to defeat one of the world's best Go players, Lee Sedol. Although AlphaGo, unlike AlphaZero, bootstrapped from human knowledge, AlphaZero decisively outperformed all versions of AlphaGo. Perhaps even more impressively, AlphaZero also managed to defeat the previously strongest chess AI, Stockfish [3], after only 4 hours of self-play, even though Stockfish had been developed and hand-tuned by AI researchers and chess grandmasters over several years.

During self-play, AlphaZero performs a tree search, exploring possible future states before deciding what move to play. After making a move, it switches sides and repeats the procedure, playing as the opponent. When a game finishes, a policy label and a value label are created for every visited state, where the value label is taken from the final outcome of the game. By doing this, AlphaZero learns to predict the final outcome of a game from any given state. Since the game outcome is only applicable to states actually visited, this limits the number of training labels that can be created from a single game, excluding states seen only during the intrinsic tree search.

Changing the evaluation to instead target information collected during the tree search would break the dependency imposed on label creation and would, in theory, allow for the creation of additional tree labels. This thesis proposes the extension "Additional Tree Sampling" and provides theoretical arguments for the benefits and viability of this extension. Further, an empirical analysis is performed on the viability of using evaluation labels that do not utilize the final outcome of the game, as this is needed for the introduction of additional tree samples. To our knowledge, the extension of additional tree sampling and its effects have not yet been proposed and properly analyzed, and as of today there exists very little documentation regarding the effects of altering the evaluation labels.

1.1 Problem Statement

AlphaZero currently relies on the final outcome of a game for the creation of evaluation labels. As this outcome is only available for states visited during that specific game, it enforces a strong dependency on which states are possible candidates for label creation. This, among other factors, creates a need to explore a vast number of games, as only so much information can be gathered from a single game. We propose an extension to the AlphaZero algorithm in which altering the evaluation label allows for the creation of additional training samples.

Theoretical arguments and counter-arguments are provided for utilizing additional tree samples. An empirical analysis is performed, providing insights into the effects caused by alternative evaluation labels. The non-theoretical analysis is realized by implementing the original algorithm as well as the modification, allowing for a comparison of player strength and overall domain generalization.
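To make the proposed change concrete, the sketch below illustrates, under assumed interfaces, how value labels could be produced from one self-play game: the standard variant uses the final game outcome as the value target, while the altered target uses the post-search root evaluation, which in turn makes it possible to also label states that were only visited inside the search tree (Additional Tree Sampling). This is a minimal illustration only; the names run_mcts, GameState, player_sign, and the sample format are hypothetical placeholders, not the implementation used in this thesis.

```python
# Minimal sketch (assumed interfaces, not the thesis implementation) of label
# generation for one self-play game, with and without the altered value target.
import random
from dataclasses import dataclass

@dataclass
class Sample:
    state: "GameState"   # hypothetical board representation fed to the network
    policy: dict         # visit-count distribution over moves from the search
    value: float         # learning target for the value head

def self_play_episode(root_state, run_mcts, use_tree_target=False,
                      sample_tree_states=False):
    """Play one game against itself and return training samples."""
    trajectory, extra_samples = [], []
    state = root_state
    while not state.is_terminal():
        # The tree search improves the network's raw policy and value estimates.
        policy, root_value, visited_tree_states = run_mcts(state)
        trajectory.append((state, policy, root_value))

        if sample_tree_states:
            # Additional Tree Sampling: states seen only inside the search tree
            # are also labeled, using their own search-derived value estimates.
            for s, p, v in visited_tree_states:
                extra_samples.append(Sample(s, p, v))

        # Sample the next move from the search policy and switch sides.
        move = random.choices(list(policy), weights=policy.values())[0]
        state = state.apply(move)

    outcome = state.final_outcome()  # +1 / 0 / -1 from the first player's view
    samples = []
    for s, p, v in trajectory:
        # Standard AlphaZero: the value target is the final game outcome,
        # sign-adjusted (assumed helper) to the player to move in state s.
        # Altered target: the value target is the post-search root evaluation.
        target = v if use_tree_target else outcome * s.player_sign()
        samples.append(Sample(s, p, target))
    return samples + extra_samples
```

Note that additional tree samples are only meaningful under the altered target: states inside the search tree never receive a final game outcome, so their labels must come from the search itself.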