Creating the Engine for Scientific Discovery
Total Page:16
File Type:pdf, Size:1020Kb
Load more
Recommended publications
-
A Comprehensive Workplace Environment Based on a Deep
International Journal on Advances in Software, vol 11 no 3 & 4, year 2018, http://www.iariajournals.org/software/ 358 A Comprehensive Workplace Environment based on a Deep Learning Architecture for Cognitive Systems Using a multi-tier layout comprising various cognitive channels, reflexes and reasoning Thorsten Gressling Veronika Thurner Munich University of Applied Sciences ARS Computer und Consulting GmbH Department of Computer Science and Mathematics Munich, Germany Munich, Germany e-mail: [email protected] e-mail: [email protected] Abstract—Many technical work places, such as laboratories or All these settings share a number of commonalities. For test beds, are the setting for well-defined processes requiring both one thing, within each of these working settings a human being high precision and extensive documentation, to ensure accuracy interacts extensively with technical devices, such as measuring and support accountability that often is required by law, science, instruments or sensors. For another thing, processing follows or both. In this type of scenario, it is desirable to delegate certain a well-defined routine, or even precisely specified interaction routine tasks, such as documentation or preparatory next steps, to protocols. Finally, to ensure that results are reproducible, some sort of automated assistant, in order to increase precision and reduce the required amount of manual labor in one fell the different steps and achieved results usually have to be swoop. At the same time, this automated assistant should be able documented extensively and in a precise way. to interact adequately with the human worker, to ensure that Especially in scenarios that execute a well-defined series of the human worker receives exactly the kind of support that is actions, it is desirable to delegate certain routine tasks, such required in a certain context. -
Reinforcement Learning (1) Key Concepts & Algorithms
Winter 2020 CSC 594 Topics in AI: Advanced Deep Learning 5. Deep Reinforcement Learning (1) Key Concepts & Algorithms (Most content adapted from OpenAI ‘Spinning Up’) 1 Noriko Tomuro Reinforcement Learning (RL) • Reinforcement Learning (RL) is a type of Machine Learning where an agent learns to achieve a goal by interacting with the environment -- trial and error. • RL is one of three basic machine learning paradigms, alongside supervised learning and unsupervised learning. • The purpose of RL is to learn an optimal policy that maximizes the return for the sequences of agent’s actions (i.e., optimal policy). https://en.wikipedia.org/wiki/Reinforcement_learning 2 • RL have recently enjoyed a wide variety of successes, e.g. – Robot controlling in simulation as well as in the real world Strategy games such as • AlphaGO (by Google DeepMind) • Atari games https://en.wikipedia.org/wiki/Reinforcement_learning 3 Deep Reinforcement Learning (DRL) • A policy is essentially a function, which maps the agent’s each action to the expected return or reward. Deep Reinforcement Learning (DRL) uses deep neural networks for the function (and other components). https://spinningup.openai.com/en/latest/spinningup/rl_intro.html 4 5 Some Key Concepts and Terminology 1. States and Observations – A state s is a complete description of the state of the world. For now, we can think of states belonging in the environment. – An observation o is a partial description of a state, which may omit information. – A state could be fully or partially observable to the agent. If partial, the agent forms an internal state (or state estimation). https://spinningup.openai.com/en/latest/spinningup/rl_intro.html 6 • States and Observations (cont.) – In deep RL, we almost always represent states and observations by a real-valued vector, matrix, or higher-order tensor. -
Build Your Trust Advantage, Leadership in the Era of Data
Global C-suite Study 20th Edition Build Your Trust Advantage Leadership in the era of data and AI everywhere This report is IBM’s fourth Global C-suite Study and the 20th Edition in the ongoing IBM CxO Study series developed by the IBM Institute for Business Value (IBV). We have now collected data and insights from more than 50,000 interviews dating back to 2003. This report was authored in collaboration with leading academics, futurists, and technology visionaries. In this report, we present our key findings of CxO insights, experiences, and sentiments based on analysis as described in the research methodology on page 44. Build Your Trust Advantage | 1 Build Your Trust Advantage Leadership in the era of data and AI everywhere Global C-suite Study 20th Edition Our latest study draws on input from 13,484 respondents across 6 C-suite roles, 20 industries, and 98 countries. 2,131 2,105 2,118 2,924 2,107 2,099 Chief Chief Chief Chief Chief Chief Executive Financial Human Information Marketing Operations Officers Officers Resources Officers Officers Officers Officers 3,363 Europe 1,910 Greater China 3,755 North America 858 Japan 915 Middle East and Africa 1,750 Asia Pacific 933 Latin America 2 | Global C-suite Study Table of contents Executive summary 3 Introduction 4 Chapter 1 Customers: How to win in the trust economy 8 Action guide 19 Chapter 2 Enterprises: How to build the human-tech partnership 20 Action guide 31 Chapter 3 Ecosystems: How to share data in the platform era 32 Action guide 41 Conclusion: Return on trust 42 Acknowledgments 43 Related IBV studies 43 Research methodology 44 Notes and sources 45 Build Your Trust Advantage | 3 Executive summary More than 13,000 C-suite executives worldwide their data scientists, to uncover insights from data. -
OLIVAW: Mastering Othello with Neither Humans Nor a Penny Antonio Norelli, Alessandro Panconesi Dept
1 OLIVAW: Mastering Othello with neither Humans nor a Penny Antonio Norelli, Alessandro Panconesi Dept. of Computer Science, Universita` La Sapienza, Rome, Italy Abstract—We introduce OLIVAW, an AI Othello player adopt- of companies. Another aspect of the same problem is the ing the design principles of the famous AlphaGo series. The main amount of training needed. AlphaGo Zero required 4.9 million motivation behind OLIVAW was to attain exceptional competence games played during self-play. While to attain the level of in a non-trivial board game at a tiny fraction of the cost of its illustrious predecessors. In this paper, we show how the grandmaster for games like Starcraft II and Dota 2 the training AlphaGo Zero’s paradigm can be successfully applied to the required 200 years and more than 10,000 years of gameplay, popular game of Othello using only commodity hardware and respectively [7], [8]. free cloud services. While being simpler than Chess or Go, Thus one of the major problems to emerge in the wake Othello maintains a considerable search space and difficulty of these breakthroughs is whether comparable results can be in evaluating board positions. To achieve this result, OLIVAW implements some improvements inspired by recent works to attained at a much lower cost– computational and financial– accelerate the standard AlphaGo Zero learning process. The and with just commodity hardware. In this paper we take main modification implies doubling the positions collected per a small step in this direction, by showing that AlphaGo game during the training phase, by including also positions not Zero’s successful paradigm can be replicated for the game played but largely explored by the agent. -
A.I. Technologies Applied to Naval CAD/CAM/CAE
A.I. Technologies Applied to Naval CAD/CAM/CAE Jesus A. Muñoz Herrero, SENER, Madrid/Spain, [email protected] Rodrigo Perez Fernandez, SENER, Madrid/Spain, [email protected] Abstract Artificial Intelligence is one of the most enabling technologies of digital transformation in the industry, but it is also one of the technologies that most rapidly spreads in our daily activity. Increasingly, elements and devices that integrate artificial intelligence features, appear in our everyday lives. These characteristics are different, depending on the devices that integrate them, or the aim they pursue. The methods and processes that are carried out in Marine Engineering cannot be left out of this technology, but the peculiarities of the profession and the people that take part in it must be taken into account. There are many aspects in which artificial intelligence can be applied in the field of our profession. The management and access to all the information necessary for the correct and efficient execution of a naval project is one of the aspects where this technology can have a very positive impact. Access all the rules, rules, design guides, good practices, lessons learned, etc., in a fast and intelligent way, understanding the natural language of the people, identifying the most appropriate to the process that is being carried out and above all. Learning as we go through the design, is one of the characteristics that will increase the application of this technology in the professional field. This article will describe the evolution of this technology and the current situation of it in the different areas of application. -
Towards Incremental Agent Enhancement for Evolving Games
Evaluating Reinforcement Learning Algorithms For Evolving Military Games James Chao*, Jonathan Sato*, Crisrael Lucero, Doug S. Lange Naval Information Warfare Center Pacific *Equal Contribution ffi[email protected] Abstract games in 2013 (Mnih et al. 2013), Google DeepMind devel- oped AlphaGo (Silver et al. 2016) that defeated world cham- In this paper, we evaluate reinforcement learning algorithms pion Lee Sedol in the game of Go using supervised learning for military board games. Currently, machine learning ap- and reinforcement learning. One year later, AlphaGo Zero proaches to most games assume certain aspects of the game (Silver et al. 2017b) was able to defeat AlphaGo with no remain static. This methodology results in a lack of algorithm robustness and a drastic drop in performance upon chang- human knowledge and pure reinforcement learning. Soon ing in-game mechanics. To this end, we will evaluate general after, AlphaZero (Silver et al. 2017a) generalized AlphaGo game playing (Diego Perez-Liebana 2018) AI algorithms on Zero to be able to play more games including Chess, Shogi, evolving military games. and Go, creating a more generalized AI to apply to differ- ent problems. In 2018, OpenAI Five used five Long Short- term Memory (Hochreiter and Schmidhuber 1997) neural Introduction networks and a Proximal Policy Optimization (Schulman et al. 2017) method to defeat a professional DotA team, each AlphaZero (Silver et al. 2017a) described an approach that LSTM acting as a player in a team to collaborate and achieve trained an AI agent through self-play to achieve super- a common goal. AlphaStar used a transformer (Vaswani et human performance. -
Ilearn Goals
Combining AI and collective intelligence to create a GPS for knowledge CRI Paris We experiment at frontiers of learning, life and digital From babies to lifelong learning LMD students Learning by doing Interdisciplinarity Sustainable Development goals A changing job market Too much! Fake news Recruting for skills on the digital job market Can we reinvent learning with artificial intelligence? What can you (almost) do with AI today? Deep Learning Classification Generation Who are these persons? www.thispersondoesnotexist.com Based on GAN (Generative Adversarial Networks) A.I. image generation From text to image From text to image Text generation Automatic Q&A generation from Wikipedia Debating with A.I. February 12th, 2019 IBM Project Debater vs. World Debating Champion Motion: “We should subsidise preschool” ● 15 mins to prepare arguments ● 4-minute opening statement ● 4-minute rebuttal ● 2-minute summary Predictive tools Pneumonia detection from thorax radiographies (2017) Deepfake A.I. lip reading Conversational interfaces May 2018 Google Duplex iLearn goals Map all learning resources available on the Internet Build learning profiles of our users Provide them maps of their own learnings Match learners with adapted resources Match learners with mentors or co-learners iLearn Knowledge maps + = + Collective Artificial intelligence intelligence Learning groups iLearn iLearn Tags an online Users improve document as a qualification Concepts extraction useful learning (concepts and resource difficulty) Artificial Collective Intelligence Intelligence -
Understanding & Generalizing Alphago Zero
Under review as a conference paper at ICLR 2019 UNDERSTANDING &GENERALIZING ALPHAGO ZERO Anonymous authors Paper under double-blind review ABSTRACT AlphaGo Zero (AGZ) (Silver et al., 2017b) introduced a new tabula rasa rein- forcement learning algorithm that has achieved superhuman performance in the games of Go, Chess, and Shogi with no prior knowledge other than the rules of the game. This success naturally begs the question whether it is possible to develop similar high-performance reinforcement learning algorithms for generic sequential decision-making problems (beyond two-player games), using only the constraints of the environment as the “rules.” To address this challenge, we start by taking steps towards developing a formal understanding of AGZ. AGZ includes two key innovations: (1) it learns a policy (represented as a neural network) using super- vised learning with cross-entropy loss from samples generated via Monte-Carlo Tree Search (MCTS); (2) it uses self-play to learn without training data. We argue that the self-play in AGZ corresponds to learning a Nash equilibrium for the two-player game; and the supervised learning with MCTS is attempting to learn the policy corresponding to the Nash equilibrium, by establishing a novel bound on the difference between the expected return achieved by two policies in terms of the expected KL divergence (cross-entropy) of their induced distributions. To extend AGZ to generic sequential decision-making problems, we introduce a robust MDP framework, in which the agent and nature effectively play a zero-sum game: the agent aims to take actions to maximize reward while nature seeks state transitions, subject to the constraints of that environment, that minimize the agent’s reward. -
Efficiently Mastering the Game of Nogo with Deep Reinforcement
electronics Article Efficiently Mastering the Game of NoGo with Deep Reinforcement Learning Supported by Domain Knowledge Yifan Gao 1,*,† and Lezhou Wu 2,† 1 College of Medicine and Biological Information Engineering, Northeastern University, Liaoning 110819, China 2 College of Information Science and Engineering, Northeastern University, Liaoning 110819, China; [email protected] * Correspondence: [email protected] † These authors contributed equally to this work. Abstract: Computer games have been regarded as an important field of artificial intelligence (AI) for a long time. The AlphaZero structure has been successful in the game of Go, beating the top professional human players and becoming the baseline method in computer games. However, the AlphaZero training process requires tremendous computing resources, imposing additional difficulties for the AlphaZero-based AI. In this paper, we propose NoGoZero+ to improve the AlphaZero process and apply it to a game similar to Go, NoGo. NoGoZero+ employs several innovative features to improve training speed and performance, and most improvement strategies can be transferred to other nonspecific areas. This paper compares it with the original AlphaZero process, and results show that NoGoZero+ increases the training speed to about six times that of the original AlphaZero process. Moreover, in the experiment, our agent beat the original AlphaZero agent with a score of 81:19 after only being trained by 20,000 self-play games’ data (small in quantity compared with Citation: Gao, Y.; Wu, L. Efficiently 120,000 self-play games’ data consumed by the original AlphaZero). The NoGo game program based Mastering the Game of NoGo with on NoGoZero+ was the runner-up in the 2020 China Computer Game Championship (CCGC) with Deep Reinforcement Learning limited resources, defeating many AlphaZero-based programs. -
AI Chips: What They Are and Why They Matter
APRIL 2020 AI Chips: What They Are and Why They Matter An AI Chips Reference AUTHORS Saif M. Khan Alexander Mann Table of Contents Introduction and Summary 3 The Laws of Chip Innovation 7 Transistor Shrinkage: Moore’s Law 7 Efficiency and Speed Improvements 8 Increasing Transistor Density Unlocks Improved Designs for Efficiency and Speed 9 Transistor Design is Reaching Fundamental Size Limits 10 The Slowing of Moore’s Law and the Decline of General-Purpose Chips 10 The Economies of Scale of General-Purpose Chips 10 Costs are Increasing Faster than the Semiconductor Market 11 The Semiconductor Industry’s Growth Rate is Unlikely to Increase 14 Chip Improvements as Moore’s Law Slows 15 Transistor Improvements Continue, but are Slowing 16 Improved Transistor Density Enables Specialization 18 The AI Chip Zoo 19 AI Chip Types 20 AI Chip Benchmarks 22 The Value of State-of-the-Art AI Chips 23 The Efficiency of State-of-the-Art AI Chips Translates into Cost-Effectiveness 23 Compute-Intensive AI Algorithms are Bottlenecked by Chip Costs and Speed 26 U.S. and Chinese AI Chips and Implications for National Competitiveness 27 Appendix A: Basics of Semiconductors and Chips 31 Appendix B: How AI Chips Work 33 Parallel Computing 33 Low-Precision Computing 34 Memory Optimization 35 Domain-Specific Languages 36 Appendix C: AI Chip Benchmarking Studies 37 Appendix D: Chip Economics Model 39 Chip Transistor Density, Design Costs, and Energy Costs 40 Foundry, Assembly, Test and Packaging Costs 41 Acknowledgments 44 Center for Security and Emerging Technology | 2 Introduction and Summary Artificial intelligence will play an important role in national and international security in the years to come. -
ELF Opengo: an Analysis and Open Reimplementation of Alphazero
ELF OpenGo: An Analysis and Open Reimplementation of AlphaZero Yuandong Tian 1 Jerry Ma * 1 Qucheng Gong * 1 Shubho Sengupta * 1 Zhuoyuan Chen 1 James Pinkerton 1 C. Lawrence Zitnick 1 Abstract However, these advances in playing ability come at signifi- The AlphaGo, AlphaGo Zero, and AlphaZero cant computational expense. A single training run requires series of algorithms are remarkable demonstra- millions of selfplay games and days of training on thousands tions of deep reinforcement learning’s capabili- of TPUs, which is an unattainable level of compute for the ties, achieving superhuman performance in the majority of the research community. When combined with complex game of Go with progressively increas- the unavailability of code and models, the result is that the ing autonomy. However, many obstacles remain approach is very difficult, if not impossible, to reproduce, in the understanding of and usability of these study, improve upon, and extend. promising approaches by the research commu- In this paper, we propose ELF OpenGo, an open-source nity. Toward elucidating unresolved mysteries reimplementation of the AlphaZero (Silver et al., 2018) and facilitating future research, we propose ELF algorithm for the game of Go. We then apply ELF OpenGo OpenGo, an open-source reimplementation of the toward the following three additional contributions. AlphaZero algorithm. ELF OpenGo is the first open-source Go AI to convincingly demonstrate First, we train a superhuman model for ELF OpenGo. Af- superhuman performance with a perfect (20:0) ter running our AlphaZero-style training software on 2,000 record against global top professionals. We ap- GPUs for 9 days, our 20-block model has achieved super- ply ELF OpenGo to conduct extensive ablation human performance that is arguably comparable to the 20- studies, and to identify and analyze numerous in- block models described in Silver et al.(2017) and Silver teresting phenomena in both the model training et al.(2018). -
Applying Deep Double Q-Learning and Monte Carlo Tree Search to Playing Go
CS221 FINAL PAPER 1 Applying Deep Double Q-Learning and Monte Carlo Tree Search to Playing Go Booher, Jonathan [email protected] De Alba, Enrique [email protected] Kannan, Nithin [email protected] I. INTRODUCTION the current position. By sampling states from the self play OR our project we replicate many of the methods games along with their respective rewards, the researchers F used in AlphaGo Zero to make an optimal Go player; were able to train a binary classifier to predict the outcome of a however, we modified the learning paradigm to a version of game with a certain confidence. Then based on the confidence Deep Q-Learning which we believe would result in better measures, the optimal move was taken. generalization of the network to novel positions. We depart from this method of training and use Deep The modification of Deep Q-Learning that we use is Double Q-Learning instead. We use the same concept of called Deep Double Q-Learning and will be described later. sampling states and their rewards from the games of self play, The evaluation metric for the success of our agent is the but instead of feeding this information to a binary classifier, percentage of games that are won against our Oracle, a we feed the information to a modified Q-Learning formula, Go-playing bot available in the OpenAI Gym. Since we are which we present shortly. implementing a version of reinforcement learning, there is no data that we will need other than the simulator. By training III. CHALLENGES on the games that are generated from self-play, our player The main challenge we faced was the computational com- will output a policy that is learned at the end of training by plexity of the game of Go.