An Executive's Guide to AI
Total Page:16
File Type:pdf, Size:1020Kb
Load more
Recommended publications
-
Neural Networks (AI) (WBAI028-05) Lecture Notes
Herbert Jaeger Neural Networks (AI) (WBAI028-05) Lecture Notes V 1.5, May 16, 2021 (revision of Section 8) BSc program in Artificial Intelligence Rijksuniversiteit Groningen, Bernoulli Institute Contents 1 A very fast rehearsal of machine learning basics 7 1.1 Training data. .............................. 8 1.2 Training objectives. ........................... 9 1.3 The overfitting problem. ........................ 11 1.4 How to tune model flexibility ..................... 17 1.5 How to estimate the risk of a model .................. 21 2 Feedforward networks in machine learning 24 2.1 The Perceptron ............................. 24 2.2 Multi-layer perceptrons ......................... 29 2.3 A glimpse at deep learning ....................... 52 3 A short visit in the wonderland of dynamical systems 56 3.1 What is a “dynamical system”? .................... 58 3.2 The zoo of standard finite-state discrete-time dynamical systems .. 64 3.3 Attractors, Bifurcation, Chaos ..................... 78 3.4 So far, so good ... ............................ 96 4 Recurrent neural networks in deep learning 98 4.1 Supervised training of RNNs in temporal tasks ............ 99 4.2 Backpropagation through time ..................... 107 4.3 LSTM networks ............................. 112 5 Hopfield networks 118 5.1 An energy-based associative memory ................. 121 5.2 HN: formal model ............................ 124 5.3 Geometry of the HN state space .................... 126 5.4 Training a HN .............................. 127 5.5 Limitations .............................. -
Backpropagation and Deep Learning in the Brain
Backpropagation and Deep Learning in the Brain Simons Institute -- Computational Theories of the Brain 2018 Timothy Lillicrap DeepMind, UCL With: Sergey Bartunov, Adam Santoro, Jordan Guerguiev, Blake Richards, Luke Marris, Daniel Cownden, Colin Akerman, Douglas Tweed, Geoffrey Hinton The “credit assignment” problem The solution in artificial networks: backprop Credit assignment by backprop works well in practice and shows up in virtually all of the state-of-the-art supervised, unsupervised, and reinforcement learning algorithms. Why Isn’t Backprop “Biologically Plausible”? Why Isn’t Backprop “Biologically Plausible”? Neuroscience Evidence for Backprop in the Brain? A spectrum of credit assignment algorithms: A spectrum of credit assignment algorithms: A spectrum of credit assignment algorithms: How to convince a neuroscientist that the cortex is learning via [something like] backprop - To convince a machine learning researcher, an appeal to variance in gradient estimates might be enough. - But this is rarely enough to convince a neuroscientist. - So what lines of argument help? How to convince a neuroscientist that the cortex is learning via [something like] backprop - What do I mean by “something like backprop”?: - That learning is achieved across multiple layers by sending information from neurons closer to the output back to “earlier” layers to help compute their synaptic updates. How to convince a neuroscientist that the cortex is learning via [something like] backprop 1. Feedback connections in cortex are ubiquitous and modify the -
Implementing Machine Learning & Neural Network Chip
Implementing Machine Learning & Neural Network Chip Architectures Using Network-on-Chip Interconnect IP Implementing Machine Learning & Neural Network Chip Architectures USING NETWORK-ON-CHIP INTERCONNECT IP TY GARIBAY Chief Technology Officer [email protected] Title: Implementing Machine Learning & Neural Network Chip Architectures Using Network-on-Chip Interconnect IP Primary author: Ty Garibay, Chief Technology Officer, ArterisIP, [email protected] Secondary author: Kurt Shuler, Vice President, Marketing, ArterisIP, [email protected] Abstract: A short tutorial and overview of machine learning and neural network chip architectures, with emphasis on how network-on-chip interconnect IP implements these architectures. Time to read: 10 minutes Time to present: 30 minutes Copyright © 2017 Arteris www.arteris.com 1 Implementing Machine Learning & Neural Network Chip Architectures Using Network-on-Chip Interconnect IP What is Machine Learning? “ Field of study that gives computers the ability to learn without being explicitly programmed.” Arthur Samuel, IBM, 1959 • Machine learning is a subset of Artificial Intelligence Copyright © 2017 Arteris 2 • Machine learning is a subset of artificial intelligence, and explicitly relies upon experiential learning rather than programming to make decisions. • The advantage to machine learning for tasks like automated driving or speech translation is that complex tasks like these are nearly impossible to explicitly program using rule-based if…then…else statements because of the large solution space. • However, the “answers” given by a machine learning algorithm have a probability associated with them and can be non-deterministic (meaning you can get different “answers” given the same inputs during different runs) • Neural networks have become the most common way to implement machine learning. -
Improving Efficiency of Map Reduce Paradigm with ANFIS for Big Data (IJSTE/ Volume 1 / Issue 12 / 015)
IJSTE - International Journal of Science Technology & Engineering | Volume 1 | Issue 12 | June 2015 ISSN (online): 2349-784X Improving Efficiency of Map Reduce Paradigm with ANFIS for Big Data Gor Vatsal H. Prof. Vatika Tayal Department of Computer Science and Engineering Department of Computer Science and Engineering NarNarayan Shashtri Institute of Technology Jetalpur , NarNarayan Shashtri Institute of Technology Jetalpur , Ahmedabad , India Ahmedabad , India Abstract As all we know that map reduce paradigm is became synonyms for computing big data problems like processing, generating and/or deducing large scale of data sets. Hadoop is a well know framework for these types of problems. The problems for solving big data related problems are varies from their size , their nature either they are repetitive or not etc., so depending upon that various solutions or way have been suggested for different types of situations and problems. Here a hybrid approach is used which combines map reduce paradigm with anfis which is aimed to boost up such problems which are likely to repeat whole map reduce process multiple times. Keywords: Big Data, fuzzy Neural Network, ANFIS, Map Reduce, Hadoop ________________________________________________________________________________________________________ I. INTRODUCTION Initially, to solve problem various problems related to large crawled documents, web requests logs, row data , etc a computational processing model is suggested by jeffrey Dean and Sanjay Ghemawat is Map Reduce in 2004[1]. MapReduce programming model is inspired by map and reduce primitives which are available in Lips and many other functional languages. It is like a De Facto standard and widely used for solving big data processing and related various operations. -
Predrnn: Recurrent Neural Networks for Predictive Learning Using Spatiotemporal Lstms
PredRNN: Recurrent Neural Networks for Predictive Learning using Spatiotemporal LSTMs Yunbo Wang Mingsheng Long∗ School of Software School of Software Tsinghua University Tsinghua University [email protected] [email protected] Jianmin Wang Zhifeng Gao Philip S. Yu School of Software School of Software School of Software Tsinghua University Tsinghua University Tsinghua University [email protected] [email protected] [email protected] Abstract The predictive learning of spatiotemporal sequences aims to generate future images by learning from the historical frames, where spatial appearances and temporal vari- ations are two crucial structures. This paper models these structures by presenting a predictive recurrent neural network (PredRNN). This architecture is enlightened by the idea that spatiotemporal predictive learning should memorize both spatial ap- pearances and temporal variations in a unified memory pool. Concretely, memory states are no longer constrained inside each LSTM unit. Instead, they are allowed to zigzag in two directions: across stacked RNN layers vertically and through all RNN states horizontally. The core of this network is a new Spatiotemporal LSTM (ST-LSTM) unit that extracts and memorizes spatial and temporal representations simultaneously. PredRNN achieves the state-of-the-art prediction performance on three video prediction datasets and is a more general framework, that can be easily extended to other predictive learning tasks by integrating with other architectures. 1 Introduction -
Identificação De Textos Em Imagens CAPTCHA Utilizando Conceitos De
Identificação de Textos em Imagens CAPTCHA utilizando conceitos de Aprendizado de Máquina e Redes Neurais Convolucionais Relatório submetido à Universidade Federal de Santa Catarina como requisito para a aprovação da disciplina: DAS 5511: Projeto de Fim de Curso Murilo Rodegheri Mendes dos Santos Florianópolis, Julho de 2018 Identificação de Textos em Imagens CAPTCHA utilizando conceitos de Aprendizado de Máquina e Redes Neurais Convolucionais Murilo Rodegheri Mendes dos Santos Esta monografia foi julgada no contexto da disciplina DAS 5511: Projeto de Fim de Curso e aprovada na sua forma final pelo Curso de Engenharia de Controle e Automação Prof. Marcelo Ricardo Stemmer Banca Examinadora: André Carvalho Bittencourt Orientador na Empresa Prof. Marcelo Ricardo Stemmer Orientador no Curso Prof. Ricardo José Rabelo Responsável pela disciplina Flávio Gabriel Oliveira Barbosa, Avaliador Guilherme Espindola Winck, Debatedor Ricardo Carvalho Frantz do Amaral, Debatedor Agradecimentos Agradeço à minha mãe Terezinha Rodegheri, ao meu pai Orlisses Mendes dos Santos e ao meu irmão Camilo Rodegheri Mendes dos Santos que sempre estiveram ao meu lado, tanto nos momentos de alegria quanto nos momentos de dificuldades, sempre me deram apoio, conselhos, suporte e nunca duvidaram da minha capacidade de alcançar meus objetivos. Agradeço aos meus colegas Guilherme Cornelli, Leonardo Quaini, Matheus Ambrosi, Matheus Zardo, Roger Perin e Victor Petrassi por me acompanharem em toda a graduação, seja nas disciplinas, nos projetos, nas noites de estudo, nas atividades extracurriculares, nas festas, entre outros desafios enfrentados para chegar até aqui. Agradeço aos meus amigos de infância Cássio Schmidt, Daniel Lock, Gabriel Streit, Gabriel Cervo, Guilherme Trevisan, Lucas Nyland por proporcionarem momentos de alegria mesmo a distância na maior parte da caminhada da graduação. -
Getting Started with Machine Learning
Getting Started with Machine Learning CSC131 The Beauty & Joy of Computing Cornell College 600 First Street SW Mount Vernon, Iowa 52314 September 2018 ii Contents 1 Applications: where machine learning is helping1 1.1 Sheldon Branch............................1 1.2 Bram Dedrick.............................4 1.3 Tony Ferenzi.............................5 1.3.1 Benefits of Machine Learning................5 1.4 William Golden............................7 1.4.1 Humans: The Teachers of Technology...........7 1.5 Yuan Hong..............................9 1.6 Easton Jensen............................. 11 1.7 Rodrigo Martinez........................... 13 1.7.1 Machine Learning in Medicine............... 13 1.8 Matt Morrical............................. 15 1.9 Ella Nelson.............................. 16 1.10 Koichi Okazaki............................ 17 1.11 Jakob Orel.............................. 19 1.12 Marcellus Parks............................ 20 1.13 Lydia Sanchez............................. 22 1.14 Tiff Serra-Pichardo.......................... 24 1.15 Austin Stala.............................. 25 1.16 Nicole Trenholm........................... 26 1.17 Maddy Weaver............................ 28 1.18 Peter Weber.............................. 29 iii iv CONTENTS 2 Recommendations: How to learn more about machine learning 31 2.1 Sheldon Branch............................ 31 2.1.1 Course 1: Machine Learning................. 31 2.1.2 Course 2: Robotics: Vision Intelligence and Machine Learn- ing............................... 33 2.1.3 Course -
Almost Unsupervised Text to Speech and Automatic Speech Recognition
Almost Unsupervised Text to Speech and Automatic Speech Recognition Yi Ren * 1 Xu Tan * 2 Tao Qin 2 Sheng Zhao 3 Zhou Zhao 1 Tie-Yan Liu 2 Abstract 1. Introduction Text to speech (TTS) and automatic speech recognition (ASR) are two popular tasks in speech processing and have Text to speech (TTS) and automatic speech recog- attracted a lot of attention in recent years due to advances in nition (ASR) are two dual tasks in speech pro- deep learning. Nowadays, the state-of-the-art TTS and ASR cessing and both achieve impressive performance systems are mostly based on deep neural models and are all thanks to the recent advance in deep learning data-hungry, which brings challenges on many languages and large amount of aligned speech and text data. that are scarce of paired speech and text data. Therefore, However, the lack of aligned data poses a ma- a variety of techniques for low-resource and zero-resource jor practical problem for TTS and ASR on low- ASR and TTS have been proposed recently, including un- resource languages. In this paper, by leveraging supervised ASR (Yeh et al., 2019; Chen et al., 2018a; Liu the dual nature of the two tasks, we propose an et al., 2018; Chen et al., 2018b), low-resource ASR (Chuang- almost unsupervised learning method that only suwanich, 2016; Dalmia et al., 2018; Zhou et al., 2018), TTS leverages few hundreds of paired data and extra with minimal speaker data (Chen et al., 2019; Jia et al., 2018; unpaired data for TTS and ASR. -
The Deep Learning Revolution and Its Implications for Computer Architecture and Chip Design
The Deep Learning Revolution and Its Implications for Computer Architecture and Chip Design Jeffrey Dean Google Research [email protected] Abstract The past decade has seen a remarkable series of advances in machine learning, and in particular deep learning approaches based on artificial neural networks, to improve our abilities to build more accurate systems across a broad range of areas, including computer vision, speech recognition, language translation, and natural language understanding tasks. This paper is a companion paper to a keynote talk at the 2020 International Solid-State Circuits Conference (ISSCC) discussing some of the advances in machine learning, and their implications on the kinds of computational devices we need to build, especially in the post-Moore’s Law-era. It also discusses some of the ways that machine learning may also be able to help with some aspects of the circuit design process. Finally, it provides a sketch of at least one interesting direction towards much larger-scale multi-task models that are sparsely activated and employ much more dynamic, example- and task-based routing than the machine learning models of today. Introduction The past decade has seen a remarkable series of advances in machine learning (ML), and in particular deep learning approaches based on artificial neural networks, to improve our abilities to build more accurate systems across a broad range of areas [LeCun et al. 2015]. Major areas of significant advances include computer vision [Krizhevsky et al. 2012, Szegedy et al. 2015, He et al. 2016, Real et al. 2017, Tan and Le 2019], speech recognition [Hinton et al. -
Anna Lysyanskaya Curriculum Vitae
Anna Lysyanskaya Curriculum Vitae Computer Science Department, Box 1910 Brown University Providence, RI 02912 (401) 863-7605 email: [email protected] http://www.cs.brown.edu/~anna Research Interests Cryptography, privacy, computer security, theory of computation. Education Massachusetts Institute of Technology Cambridge, MA Ph.D. in Computer Science, September 2002 Advisor: Ronald L. Rivest, Viterbi Professor of EECS Thesis title: \Signature Schemes and Applications to Cryptographic Protocol Design" Massachusetts Institute of Technology Cambridge, MA S.M. in Computer Science, June 1999 Smith College Northampton, MA A.B. magna cum laude, Highest Honors, Phi Beta Kappa, May 1997 Appointments Brown University, Providence, RI Fall 2013 - Present Professor of Computer Science Brown University, Providence, RI Fall 2008 - Spring 2013 Associate Professor of Computer Science Brown University, Providence, RI Fall 2002 - Spring 2008 Assistant Professor of Computer Science UCLA, Los Angeles, CA Fall 2006 Visiting Scientist at the Institute for Pure and Applied Mathematics (IPAM) Weizmann Institute, Rehovot, Israel Spring 2006 Visiting Scientist Massachusetts Institute of Technology, Cambridge, MA 1997 { 2002 Graduate student IBM T. J. Watson Research Laboratory, Hawthorne, NY Summer 2001 Summer Researcher IBM Z¨urich Research Laboratory, R¨uschlikon, Switzerland Summers 1999, 2000 Summer Researcher 1 Teaching Brown University, Providence, RI Spring 2008, 2011, 2015, 2017, 2019; Fall 2012 Instructor for \CS 259: Advanced Topics in Cryptography," a seminar course for graduate students. Brown University, Providence, RI Spring 2012 Instructor for \CS 256: Advanced Complexity Theory," a graduate-level complexity theory course. Brown University, Providence, RI Fall 2003,2004,2005,2010,2011 Spring 2007, 2009,2013,2014,2016,2018 Instructor for \CS151: Introduction to Cryptography and Computer Security." Brown University, Providence, RI Fall 2016, 2018 Instructor for \CS 101: Theory of Computation," a core course for CS concentrators. -
Reinforcement Learning (1) Key Concepts & Algorithms
Winter 2020 CSC 594 Topics in AI: Advanced Deep Learning 5. Deep Reinforcement Learning (1) Key Concepts & Algorithms (Most content adapted from OpenAI ‘Spinning Up’) 1 Noriko Tomuro Reinforcement Learning (RL) • Reinforcement Learning (RL) is a type of Machine Learning where an agent learns to achieve a goal by interacting with the environment -- trial and error. • RL is one of three basic machine learning paradigms, alongside supervised learning and unsupervised learning. • The purpose of RL is to learn an optimal policy that maximizes the return for the sequences of agent’s actions (i.e., optimal policy). https://en.wikipedia.org/wiki/Reinforcement_learning 2 • RL have recently enjoyed a wide variety of successes, e.g. – Robot controlling in simulation as well as in the real world Strategy games such as • AlphaGO (by Google DeepMind) • Atari games https://en.wikipedia.org/wiki/Reinforcement_learning 3 Deep Reinforcement Learning (DRL) • A policy is essentially a function, which maps the agent’s each action to the expected return or reward. Deep Reinforcement Learning (DRL) uses deep neural networks for the function (and other components). https://spinningup.openai.com/en/latest/spinningup/rl_intro.html 4 5 Some Key Concepts and Terminology 1. States and Observations – A state s is a complete description of the state of the world. For now, we can think of states belonging in the environment. – An observation o is a partial description of a state, which may omit information. – A state could be fully or partially observable to the agent. If partial, the agent forms an internal state (or state estimation). https://spinningup.openai.com/en/latest/spinningup/rl_intro.html 6 • States and Observations (cont.) – In deep RL, we almost always represent states and observations by a real-valued vector, matrix, or higher-order tensor. -
What Every Computer Scientist Should Know About Floating-Point Arithmetic
What Every Computer Scientist Should Know About Floating-Point Arithmetic DAVID GOLDBERG Xerox Palo Alto Research Center, 3333 Coyote Hill Road, Palo Alto, CalLfornLa 94304 Floating-point arithmetic is considered an esotoric subject by many people. This is rather surprising, because floating-point is ubiquitous in computer systems: Almost every language has a floating-point datatype; computers from PCs to supercomputers have floating-point accelerators; most compilers will be called upon to compile floating-point algorithms from time to time; and virtually every operating system must respond to floating-point exceptions such as overflow This paper presents a tutorial on the aspects of floating-point that have a direct impact on designers of computer systems. It begins with background on floating-point representation and rounding error, continues with a discussion of the IEEE floating-point standard, and concludes with examples of how computer system builders can better support floating point, Categories and Subject Descriptors: (Primary) C.0 [Computer Systems Organization]: General– instruction set design; D.3.4 [Programming Languages]: Processors —compders, optirruzatzon; G. 1.0 [Numerical Analysis]: General—computer arithmetic, error analysis, numerzcal algorithms (Secondary) D. 2.1 [Software Engineering]: Requirements/Specifications– languages; D, 3.1 [Programming Languages]: Formal Definitions and Theory —semantZcs D ,4.1 [Operating Systems]: Process Management—synchronization General Terms: Algorithms, Design, Languages Additional Key Words and Phrases: denormalized number, exception, floating-point, floating-point standard, gradual underflow, guard digit, NaN, overflow, relative error, rounding error, rounding mode, ulp, underflow INTRODUCTION tions of addition, subtraction, multipli- cation, and division. It also contains Builders of computer systems often need background information on the two information about floating-point arith- methods of measuring rounding error, metic.