Distributed Deep Q-Learning

Kevin Chavez¹, Hao Yi Ong¹, and Augustus Hong¹

¹ K. Chavez, H. Y. Ong, and A. Hong are with the Departments of Electrical Engineering, Mechanical Engineering, and Computer Science, respectively, at Stanford University, Stanford, CA 94305, USA. {kjchavez, haoyi, auhong}@stanford.edu

Abstract. We propose a distributed deep learning model to successfully learn control policies directly from high-dimensional sensory input using reinforcement learning. The model is based on the deep Q-network, a convolutional neural network trained with a variant of Q-learning. Its input is raw pixels and its output is a value function estimating future rewards from taking an action given a system state. To distribute the deep Q-network training, we adapt the DistBelief software framework to the context of efficiently training reinforcement learning agents. As a result, the method is completely asynchronous and scales well with the number of machines. We demonstrate that the deep Q-network agent, receiving only the pixels and the game score as inputs, was able to achieve reasonable success on a simple game with minimal parameter tuning.

I. INTRODUCTION

Reinforcement learning (RL) agents face a tremendous challenge in optimizing their control of a system approaching real-world complexity: they must derive efficient representations of the environment from high-dimensional sensory inputs and use these to generalize past experience to new situations. While past work in RL has shown that with good hand-crafted features agents are able to learn good control policies, their applicability has been limited to domains where such features have been discovered, or to domains with fully observed, low-dimensional state spaces [1]–[3].

We consider the problem of efficiently scaling a deep learning algorithm to control a complicated system with high-dimensional sensory inputs. The basis of our algorithm is an RL agent called a deep Q-network (DQN) [4], [5] that combines RL with a class of artificial neural networks known as deep neural networks [6]. DQN uses an architecture called the deep convolutional network, which utilizes hierarchical layers of tiled convolutional filters to exploit the local spatial correlations present in images. As a result, this architecture is robust to natural transformations such as changes of viewpoint and scale [7].

In practice, increasing the scale of deep learning with respect to the number of training examples or the number of model parameters can drastically improve the performance of deep neural networks [8], [9]. To train a deep network with many parameters on multiple machines efficiently, we adapt a software framework called DistBelief to the context of the training of RL agents [10]. Our new framework supports data parallelism, thereby allowing us to potentially utilize computing clusters with thousands of machines for large-scale distributed training, as shown in [10] in the context of unsupervised image classification. To achieve model parallelism, we use Caffe, a deep learning framework developed for image recognition that distributes training across multiple processor cores [11].

The contributions of this paper are twofold. First, we develop and implement a software framework that supports model and data parallelism for DQN. Second, we demonstrate and analyze the performance of our distributed RL agent. The rest of this paper is organized as follows. Section II introduces the background on the class of machine learning problem our algorithm solves. This is followed by Section III and Section IV, which detail the serial DQN and our approach to distributing the training. Section V discusses our experiments on a classic video game, and Section VI draws some concluding remarks and mentions future work.

II. BACKGROUND

We begin with a brief review of Markov decision processes (MDPs) and reinforcement learning (RL).

A. Markov decision process

In an MDP, an agent chooses action $a_t$ at time $t$ after observing state $s_t$. The agent then receives reward $r_t$, and the state evolves probabilistically based on the current state-action pair. The explicit assumption that the next state only depends on the current state-action pair is referred to as the Markov assumption. An MDP can be defined by the tuple $(S, A, T, R)$, where $S$ and $A$ are the sets of all possible states and actions, respectively, $T$ is a probabilistic transition function, and $R$ is a reward function. $T$ gives the probability of transitioning into state $s'$ from taking action $a$ at the current state $s$, and is often denoted $T(s, a, s')$. $R$ gives a scalar value indicating the immediate reward received for taking action $a$ at the current state $s$ and is denoted $R(s, a)$.

To solve an MDP, we compute a policy $\pi^\star$ that, if followed, maximizes the expected sum of immediate rewards from any given state. The optimal policy is related to the optimal state-action value function $Q^\star(s, a)$, which is the expected value when starting in state $s$, taking action $a$, and then following actions dictated by $\pi^\star$. Mathematically, it obeys the Bellman recursion

$$Q^\star(s, a) = R(s, a) + \sum_{s' \in S} T(s, a, s') \max_{a' \in A} Q^\star(s', a').$$

The state-action value function can be computed using a dynamic programming algorithm called value iteration. To obtain the optimal policy for state $s$, we compute

$$\pi^\star(s) = \arg\max_{a \in A} Q^\star(s, a).$$

B. Reinforcement learning

The problem reinforcement learning seeks to solve differs from the standard MDP in that the state space and transition and reward functions are unknown to the agent. The goal of the agent is thus to both build an internal representation of the world and select actions that maximize cumulative future reward. To do this, the agent interacts with an environment through a sequence of observations, actions, and rewards and learns from past experience.

In our algorithm, the deep Q-network builds its internal representation of its environment by explicitly approximating the state-action value function $Q^\star$ via a deep neural network. Here, the basic idea is to estimate

$$Q^\star(s, a) = \max_{\pi} \mathbb{E}\left[R_t \mid s_t = s, a_t = a, \pi\right],$$

where $\pi$ maps states to actions (or distributions over actions), with the additional knowledge that the optimal value function obeys the Bellman equation

$$Q^\star(s, a) = \mathbb{E}_{s' \sim \mathcal{E}}\left[r + \max_{a'} Q^\star(s', a') \mid s, a\right],$$

where $\mathcal{E}$ is the MDP environment.

III. APPROACH

This section presents the general approach adapted from the serial deep Q-learning in [4], [5] to our purpose. In particular, we discuss the neural network architecture, the iterative training algorithm, and a mechanism that improves training convergence stability.

A. Preprocessing and network architecture

Working directly with raw video game frames can be computationally demanding. Our algorithm applies a basic preprocessing step aimed at reducing the input dimensionality. Here, the raw frames are gray-scaled from their RGB representation and down-sampled to a fixed size for input to the neural network. For this paper, the preprocessing function applies this step to the last four frames of a sequence and stacks them to produce the input to the state-action value function $Q$.

We use an architecture in which there is a separate output unit for each possible action, and only the state representation is an input to the neural network; i.e., the preprocessed four-frame sequence. The outputs correspond to the predicted Q-values of the individual actions for the input state. The main advantage of this type of architecture is the ability to compute Q-values for all possible actions in a given state with only a single forward pass through the network.

[…] rectifier nonlinearity

$$f(x) = \max(0, x),$$

which was empirically observed to model real/integer valued inputs well [12], [13], as is required in our case. The remaining layers are fully-connected linear layers with a single output for each valid action. The number of valid actions varies with the game application. The neural network is implemented in Caffe [11], which is a versatile deep learning framework that allows us to define the network architecture and training parameters freely. And because Caffe is designed to take advantage of all available computing resources on a machine, we can easily achieve model parallelism using the software.

B. Q-learning

We parameterize the approximate value function $Q(s, a; \theta)$ using the deep convolutional network as described above, in which $\theta$ are the parameters of the Q-network. These parameters are iteratively updated by the minimizers of the loss function

$$L_i(\theta_i) = \mathbb{E}_{s, a \sim \rho(\cdot)}\left[\left(y_i - Q(s, a; \theta_i)\right)^2\right], \qquad (1)$$

with iteration number $i$, target $y_i = \mathbb{E}_{s' \sim \mathcal{E}}\left[r + \max_{a'} Q(s', a'; \theta_{i-1}) \mid s, a\right]$, and "behavior distribution" (exploration policy) $\rho(s, a)$. The optimizers of the Q-network loss function can be computed by gradient descent

$$Q(s, a; \theta) := Q(s, a; \theta) + \alpha \nabla_\theta Q(s, a; \theta),$$

with learning rate $\alpha$.

For computational expedience, the parameters are updated after every time step; i.e., with every new experience. Our algorithm also avoids computing full expectations, and we train on single samples from $\rho$ and $\mathcal{E}$. This results in the Q-learning update

$$Q(s, a) := Q(s, a) + \alpha\left(r + \max_{a'} Q(s', a') - Q(s, a)\right).$$

The procedure is an off-policy training method [14] that learns the policy $a = \arg\max_a Q(s, a; \theta)$ while using an exploration policy or behavior distribution selected by an $\epsilon$-greedy strategy.

The target network parameters used to compute $y$ in Eq. (1) are only updated with the Q-network parameters every $C$ steps and are held fixed between individual updates. These staggered updates stabilize the learning process compared to the standard Q-learning process, where an update that increases $Q(s_t, a_t)$ often also increases $Q(s_{t+1}, a)$ for all $a$ and hence also increases the target $y$.
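The staggered target update described above can be illustrated with a small tabular sketch. This is not the authors' Caffe implementation: it replaces the deep convolutional network with a lookup table so the arithmetic can be checked by hand. The names `epsilon_greedy`, `q_learning_with_target`, and `sync_every` (standing in for the paper's C) are hypothetical, and the `gamma` discount parameter is an assumption added here; setting it to 1.0 matches the update as written in the text.

```python
import random
from typing import List, Sequence, Tuple

# One experience: (state, action, reward, next_state, done).
Transition = Tuple[int, int, float, int, bool]


def epsilon_greedy(q_values: Sequence[float], epsilon: float,
                   rng: random.Random) -> int:
    """Pick a uniformly random action with probability epsilon, else a greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    # Ties resolve to the lowest action index.
    return max(range(len(q_values)), key=lambda a: q_values[a])


def q_learning_with_target(
    transitions: Sequence[Transition],
    n_states: int,
    n_actions: int,
    alpha: float = 0.5,   # learning rate
    gamma: float = 1.0,   # discount factor (assumption, not stated in the text)
    sync_every: int = 2,  # "C": steps between target-table refreshes
) -> List[List[float]]:
    """Tabular Q-learning whose targets come from a periodically synced copy."""
    q = [[0.0] * n_actions for _ in range(n_states)]
    target_q = [row[:] for row in q]  # frozen copy, used only to build targets
    for step, (s, a, r, s_next, done) in enumerate(transitions, start=1):
        # The target y reads the frozen table, not the table being updated.
        y = r if done else r + gamma * max(target_q[s_next])
        q[s][a] += alpha * (y - q[s][a])
        if step % sync_every == 0:
            target_q = [row[:] for row in q]
    return q


# Two hand-written experiences, replayed twice: state 1 reaches a terminal
# state with reward 1, and state 0 moves to state 1 with reward 0.
replay = [(1, 1, 1.0, 2, True), (0, 1, 0.0, 1, False)] * 2
staggered = q_learning_with_target(replay, 3, 2, gamma=0.9, sync_every=2)
every_step = q_learning_with_target(replay, 3, 2, gamma=0.9, sync_every=1)
print(round(staggered[0][1], 3), round(every_step[0][1], 3))
```

With these four experiences, the estimate for state 0 is about 0.225 when the target table is refreshed every two steps and about 0.45 when it is refreshed after every step. The frozen copy makes the target lag behind the freshly updated values, which is the damping effect the staggered updates rely on. `epsilon_greedy` is included only to show how an exploration policy would choose actions; the replay list here is fixed so that the result is reproducible.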