Deep Learning Unsupervised Learning

Total Page:16

File Type:pdf, Size:1020Kb

Deep Learning Unsupervised Learning Deep Learning Unsupervised Learning Russ Salakhutdinov Machine Learning Department Carnegie Mellon University Canadian Institute for Advanced Research Tutorial Roadmap Part 1: Supervised (Discriminave) Learning: Deep Networks Part 2: Unsupervised Learning: Deep Generave Models Part 3: Open Research Ques>ons Unsupervised Learning Non-probabilis>c Models Probabilis>c (Generave) Ø Sparse Coding Models Ø Autoencoders Ø Others (e.g. k-means) Non-Tractable Models Tractable Models Ø Generave Adversarial Ø Fully observed Ø Boltzmann Machines Networks Ø Belief Nets Variaonal Ø Moment Matching Ø NADE Autoencoders Networks Ø PixelRNN Ø Helmholtz Machines Ø Many others… Explicit Density p(x) Implicit Density Tutorial Roadmap • Basic Building Blocks: Ø Sparse Coding Ø Autoencoders • Deep Generave Models Ø Restricted Boltzmann Machines Ø Deep Boltzmann Machines Ø Helmholtz Machines / Variaonal Autoencoders • Generave Adversarial Networks Tutorial Roadmap • Basic Building Blocks: Ø Sparse Coding Ø Autoencoders • Deep Generave Models Ø Restricted Boltzmann Machines Ø Deep Boltzmann Machines Ø Helmholtz Machines / Variaonal Autoencoders • Generave Adversarial Networks Sparse Coding • Sparse coding (Olshausen & Field, 1996). Originally developed to explain early visual processing in the brain (edge detec>on). • Objecve: Given a set of input data vectors learn a dic>onary of bases such that: Sparse: mostly zeros • Each data vector is represented as a sparse linear combinaon of bases. Sparse Coding Natural Images Learned bases: “Edges” New example = 0.8 * + 0.3 * + 0.5 * x = 0.8 * + 0.3 * + 0.5 * [0, 0, … 0.8, …, 0.3, …, 0.5, …] = coefficients (feature representaon) Slide Credit: Honglak Lee Sparse Coding: Training • Input image patches: • Learn dic>onary of bases: Reconstruc>on error Sparsity penalty • Alternang Op>mizaon: 1. Fix dic>onary of bases and solve for ac>vaons a (a standard Lasso problem). 2. Fix ac>vaons a, op>mize the dic>onary of bases (convex QP problem). Sparse Coding: Tes>ng Time • Input: a new image patch x* , and K learned bases • Output: sparse representaon a of an image patch x*. = 0.8 * + 0.3 * + 0.5 * x* = 0.8 * + 0.3 * + 0.5 * [0, 0, … 0.8, …, 0.3, …, 0.5, …] = coefficients (feature representaon) Image Classificaon Evaluated on Caltech101 object category dataset. Classification Algorithm (SVM) Learned Input Image bases Features (coefficients) 9K images, 101 classes Algorithm Accuracy Baseline (Fei-Fei et al., 2004) 16% PCA 37% Sparse Coding 47% Slide Credit: Honglak Lee Lee, Bale, Raina, Ng, 2006 Interpre>ng Sparse Coding a Sparse features a Implicit g(a) Explicit f(x) Linear nonlinear Decoding encoding x’ x • Sparse, over-complete representaon a. • Encoding a = f(x) is implicit and nonlinear func>on of x. • Reconstruc>on (or decoding) x’ = g(a) is linear and explicit. Autoencoder Feature Representation Feed-back, Feed-forward, generative, bottom-up top-down Decoder Encoder path Input Image • Details of what goes insider the encoder and decoder maer! • Need constraints to avoid learning an iden>ty. Autoencoder Binary Features z Decoder Encoder filters D filters W. Dz z=σ(Wx) Linear Sigmoid function function path Input Image x Autoencoder Binary Features z • An autoencoder with D inputs, D outputs, and K hidden units, with K<D. Dz z=σ(Wx) • Given an input x, its reconstruc>on is given by: Input Image x Decoder Encoder Autoencoder Binary Features z • An autoencoder with D inputs, D outputs, and K hidden units, with K<D. Dz z=σ(Wx) Input Image x • We can determine the network parameters W and D by minimizing the reconstruc>on error: Autoencoder Linear Features z • If the hidden and output layers are linear, it will learn hidden units that are a linear func>on of the data and minimize the squared error. Wz z=Wx • The K hidden units will span the same space as the first k principal components. The weight vectors Input Image x may not be orthogonal. • With nonlinear hidden units, we have a nonlinear generalizaon of PCA. Another Autoencoder Model Binary Features z Encoder filters W. σ(WTz) z=σ(Wx) Decoder Sigmoid filters D function path Binary Input x • Need addi>onal constraints to avoid learning an iden>ty. • Relates to Restricted Boltzmann Machines (later). Predic>ve Sparse Decomposi>on Binary Features z L Sparsity Encoder 1 filters W. Dz z=σ(Wx) Decoder Sigmoid filters D function path Real-valued Input x At training time path Decoder Encoder Kavukcuoglu, Ranzato, Fergus, LeCun, 2009 Stacked Autoencoders Class Labels Decoder Encoder Features Sparsity Decoder Encoder Features Sparsity Decoder Encoder Input x Stacked Autoencoders Class Labels Decoder Encoder Features Sparsity Decoder Encoder Features Greedy Layer-wise Learning. Sparsity Decoder Encoder Input x Stacked Autoencoders Class Labels • Remove decoders and use feed-forward part. Encoder • Standard, or Features convolu>onal neural network architecture. Encoder • Parameters can be Features fine-tuned using backpropagaon. Encoder Input x Deep Autoencoders Decoder 30 W 4 Top 500 RBM T T W 1 W 1 + 8 2000 2000 T T 500 W 2 W 2 + 7 1000 1000 W 3 1000 W T W T RBM 3 3 + 6 500 500 W T T 4 W 4 + 5 1000 30 Code layer 30 W 4 W 44+ W 2 2000 500 500 RBM W 3 W 33+ 1000 1000 W 2 W 22+ 2000 2000 2000 W 1 W 1 W 11+ RBM Encoder Pretraining Unrolling Finetuning Deep Autoencoders • 25x25 – 2000 – 1000 – 500 – 30 autoencoder to extract 30-D real- valued codes for Olive face patches. • Top: Random samples from the test dataset. • Middle: Reconstruc>ons by the 30-dimensional deep autoencoder. • Boom: Reconstruc>ons by the 30-dimen>noal PCA. Informaon Retrieval European Community Interbank Markets Monetary/Economic 2-D LSA space Energy Markets Disasters and Accidents Leading Legal/Judicial Economic Indicators Government Accounts/ Borrowings Earnings • The Reuters Corpus Volume II contains 804,414 newswire stories (randomly split into 402,207 training and 402,207 test). • “Bag-of-words” representaon: each ar>cle is represented as a vector containing the counts of the most frequently used 2000 words. (Hinton and Salakhutdinov, Science 2006) Tutorial Roadmap • Basic Building Blocks: Ø Sparse Coding Ø Autoencoders • Deep Generave Models Ø Restricted Boltzmann Machines Ø Deep Boltzmann Machines Ø Helmholtz Machines / Variaonal Autoencoders • Generave Adversarial Networks BRIEF ARTICLE THE AUTHOR Maximum likelihood ✓⇤Fully Observed Models = arg max Ex pdata log pmodel(x ✓) ✓ ⇠ | Fully-visible belief• Explicitly model condi>onal probabili>es: net n pmodel(x)=pmodel(x1) pmodel(xi x1,...,xi 1) | − Yi=2 Each condi>onal can be a complicated neural network • A number of successful models, including Ø NADE, RNADE (Larochelle, et.al. 20011) Ø Pixel CNN (van den Ord et. al. 2016) Ø Pixel RNN (van den Ord et. al. 2016) Pixel CNN 1 Restricted Boltzmann Machines hidden variables Pair-wise Unary Graphical Models: Powerful Feature Detectors framework for represen>ng dependency structure between random variables. Image visible variables RBM is a Markov Random Field with: • Stochas>c binary visible variables • Stochas>c binary hidden variables • Bipar>te connec>ons. Markov random fields, Boltzmann machines, log-linear models. Learning Features Observed Data Learned W: “edges” Subset of 25,000 characters Subset of 1000 features Sparse New Image: representaons = …. Logis>c Func>on: Suitable for modeling binary images RBMs for Real-valued & Count Data Learned features (out of 10,000) 4 million unlabelled images Learned features: ``topics’’ russian clinton computer trade stock Reuters dataset: russia house system country wall 804,414 unlabeled moscow president product import street newswire stories yeltsin bill sovware world point dow Bag-of-Words soviet congress develop economy Collaborave Filtering Binary hidden: user preferences h Learned features: ``genre’’ W1 Fahrenheit 9/11 Independence Day v Bowling for Columbine The Day Aver Tomorrow The People vs. Larry Flynt Con Air Mul>nomial visible: user rangs Canadian Bacon Men in Black II La Dolce Vita Men in Black Nexlix dataset: 480,189 users Friday the 13th Scary Movie The Texas Chainsaw Massacre Naked Gun 17,770 movies Children of the Corn Hot Shots! Over 100 million rangs Child's Play American Pie The Return of Michael Myers Police Academy State-of-the-art performance on the Nexlix dataset. (Salakhutdinov, Mnih, Hinton, ICML 2007) Different Data Modali>es • Binary/Gaussian/ Sovmax RBMs: All have binary hidden variables but use them to model different kinds of data. Binary 0 0 0 1 0 Real-valued 1-of-K • It is easy to infer the states of the hidden variables: Product of Experts The joint distribu>on is given by: Marginalizing over hidden variables: Product of Experts government clinton bribery mafia stock … authority house corrupon business wall power president dishonesty gang street empire bill corrupt mob point federaon congress fraud insider dow Topics “government”, ”corrupon” and ”mafia” can combine to give very high probability to a word “Silvio Silvio Berlusconi Berlusconi”. Product of Experts The joint distribu>on is given by: Marginalizing over hidden variables: Product of Experts 50 Replicated 40 Softmax 50−D government clinton bribery mafia stock … authority house 30 corrupon business wall street power president dishonesty LDA 50−D gang point empire bill 20 corrupt mob federaon congress fraud insider dow Precision (%) 10 Topics “government”, ”corrupon” and ”mafia” can combine to give very 0.001 0.006 0.051 0.4 1.6 6.4 25.6 100 Recall high probability to a word “(%) Silvio Silvio Berlusconi Berlusconi”. Deep Boltzmann Machines Low-level features: Edges Built from unlabeled inputs. Input: Pixels Image (Salakhutdinov 2008, Salakhutdinov & Hinton 2012) Deep Boltzmann Machines Learn simpler representaons, then compose more complex ones Higher-level features: Combinaon of edges Low-level features: Edges Built from unlabeled inputs. Input: Pixels Image (Salakhutdinov 2008, Salakhutdinov & Hinton 2009) Model Formulaon h3 Same as RBMs W3 model parameters h2 • Dependencies between hidden variables.
Recommended publications
  • Self-Discriminative Learning for Unsupervised Document Embedding
    Self-Discriminative Learning for Unsupervised Document Embedding Hong-You Chen∗1, Chin-Hua Hu∗1, Leila Wehbe2, Shou-De Lin1 1Department of Computer Science and Information Engineering, National Taiwan University 2Machine Learning Department, Carnegie Mellon University fb03902128, [email protected], [email protected], [email protected] Abstract ingful a document embedding as they do not con- sider inter-document relationships. Unsupervised document representation learn- Traditional document representation models ing is an important task providing pre-trained such as Bag-of-words (BoW) and TF-IDF show features for NLP applications. Unlike most competitive performance in some tasks (Wang and previous work which learn the embedding based on self-prediction of the surface of text, Manning, 2012). However, these models treat we explicitly exploit the inter-document infor- words as flat tokens which may neglect other use- mation and directly model the relations of doc- ful information such as word order and semantic uments in embedding space with a discrimi- distance. This in turn can limit the models effec- native network and a novel objective. Exten- tiveness on more complex tasks that require deeper sive experiments on both small and large pub- level of understanding. Further, BoW models suf- lic datasets show the competitiveness of the fer from high dimensionality and sparsity. This is proposed method. In evaluations on standard document classification, our model has errors likely to prevent them from being used as input that are relatively 5 to 13% lower than state-of- features for downstream NLP tasks. the-art unsupervised embedding models. The Continuous vector representations for docu- reduction in error is even more pronounced in ments are being developed.
    [Show full text]
  • Q-Learning in Continuous State and Action Spaces
    -Learning in Continuous Q State and Action Spaces Chris Gaskett, David Wettergreen, and Alexander Zelinsky Robotic Systems Laboratory Department of Systems Engineering Research School of Information Sciences and Engineering The Australian National University Canberra, ACT 0200 Australia [cg dsw alex]@syseng.anu.edu.au j j Abstract. -learning can be used to learn a control policy that max- imises a scalarQ reward through interaction with the environment. - learning is commonly applied to problems with discrete states and ac-Q tions. We describe a method suitable for control tasks which require con- tinuous actions, in response to continuous states. The system consists of a neural network coupled with a novel interpolator. Simulation results are presented for a non-holonomic control task. Advantage Learning, a variation of -learning, is shown enhance learning speed and reliability for this task.Q 1 Introduction Reinforcement learning systems learn by trial-and-error which actions are most valuable in which situations (states) [1]. Feedback is provided in the form of a scalar reward signal which may be delayed. The reward signal is defined in relation to the task to be achieved; reward is given when the system is successfully achieving the task. The value is updated incrementally with experience and is defined as a discounted sum of expected future reward. The learning systems choice of actions in response to states is called its policy. Reinforcement learning lies between the extremes of supervised learning, where the policy is taught by an expert, and unsupervised learning, where no feedback is given and the task is to find structure in data.
    [Show full text]
  • Lecture 9 Recurrent Neural Networks “I’M Glad That I’M Turing Complete Now”
    Lecture 9 Recurrent Neural Networks “I’m glad that I’m Turing Complete now” Xinyu Zhou Megvii (Face++) Researcher [email protected] Nov 2017 Raise your hand and ask, whenever you have questions... We have a lot to cover and DON’T BLINK Outline ● RNN Basics ● Classic RNN Architectures ○ LSTM ○ RNN with Attention ○ RNN with External Memory ■ Neural Turing Machine ■ CAVEAT: don’t fall asleep ● Applications ○ A market of RNNs RNN Basics Feedforward Neural Networks ● Feedforward neural networks can fit any bounded continuous (compact) function ● This is called Universal approximation theorem https://en.wikipedia.org/wiki/Universal_approximation_theorem Cybenko, George. "Approximation by superpositions of a sigmoidal function." Mathematics of Control, Signals, and Systems (MCSS) 2.4 (1989): 303-314. Bounded Continuous Function is NOT ENOUGH! How to solve Travelling Salesman Problem? Bounded Continuous Function is NOT ENOUGH! How to solve Travelling Salesman Problem? We Need to be Turing Complete RNN is Turing Complete Siegelmann, Hava T., and Eduardo D. Sontag. "On the computational power of neural nets." Journal of computer and system sciences 50.1 (1995): 132-150. Sequence Modeling Sequence Modeling ● How to take a variable length sequence as input? ● How to predict a variable length sequence as output? RNN RNN Diagram A lonely feedforward cell RNN Diagram Grows … with more inputs and outputs RNN Diagram … here comes a brother (x_1, x_2) comprises a length-2 sequence RNN Diagram … with shared (tied) weights x_i: inputs y_i: outputs W: all the
    [Show full text]
  • Reinforcement Learning Data Science Africa 2018 Abuja, Nigeria (12 Nov - 16 Nov 2018)
    Reinforcement Learning Data Science Africa 2018 Abuja, Nigeria (12 Nov - 16 Nov 2018) Chika Yinka-Banjo, PhD Ayorkor Korsah, PhD University of Lagos Ashesi University Nigeria Ghana Outline • Introduction to Machine learning • Reinforcement learning definitions • Example reinforcement learning problems • The Markov decision process • The optimal policy • Value function & Q-value function • Bellman Equation • Q-learning • Building a simple Q-learning agent (coding) • Recap • Where to go from here? Introduction to Machine learning • Artificial Intelligence (AI) is the study and design of Intelligent agents. • An Intelligent agent can perceive its environment through sensors and it can act on its environment through actuators. • E.g. Agent: Humanoid robot • Environment: Earth? • Sensors: Camera, tactile sensor etc. • Actuators: Motors, grippers etc. • Machine learning is a subfield of Artificial Intelligence Branches of AI Introduction to Machine learning • Machine learning techniques learn from data without being explicitly programmed to do so. • Machine learning models enable the agent to learn from its own experience by extracting useful information from feedback from its environment. • Three types of learning feedback: • Supervised learning • Unsupervised learning • Reinforcement learning Branches of Machine learning Supervised learning • Supervised learning: the machine learning model is trained on many labelled examples of input-output pairs. • Such that when presented with a novel input, the model can estimate accurately what the correct output should be. • Data(x, y): x is input data, y is label Supervised learning task in the form of classification • Goal: learn a function to map x -> y • Examples include; Classification, regression object detection, image captioning etc. Unsupervised learning • Unsupervised learning: here the model extract useful information from unlabeled and unstructured data.
    [Show full text]
  • A Review of Unsupervised Artificial Neural Networks with Applications
    A REVIEW OF UNSUPERVISED ARTIFICIAL NEURAL NETWORKS WITH APPLICATIONS Samson Damilola Fabiyi Department of Electronic and Electrical Engineering, University of Strathclyde 204 George Street, G1 1XW, Glasgow, United Kingdom [email protected] ABSTRACT designer) who uses his or her knowledge of the environment to Artificial Neural Networks (ANNs) are models formulated to train the network with labelled data sets [7]. Hence, the mimic the learning capability of human brains. Learning in artificial neural networks learn by receiving input and target ANNs can be categorized into supervised, reinforcement and pairs of several observations from the labelled data sets, unsupervised learning. Application of supervised ANNs is processing the input, comparing the output with the target, limited to when the supervisor’s knowledge of the environment computing the error between the output and target, and using is sufficient to supply the networks with labelled datasets. the error signal and the concept of backward propagation to Application of unsupervised ANNs becomes imperative in adjust the weights interconnecting the network’s neurons with situations where it is very difficult to get labelled datasets. This the aim of minimising the error and optimising performance [6, paper presents the various methods, and applications of 7]. Fine-tuning of the network continues until the set of weights unsupervised ANNs. In order to achieve this, several secondary that minimise the discrepancy between the output and the sources of information, including academic journals and desired output is obtained. Figure 1 shows the block diagram conference proceedings, were selected. Autoencoders, self- which conceptualizes supervised learning in ANNs.
    [Show full text]
  • Neural Turing Machines Arxiv:1410.5401V2 [Cs.NE]
    Neural Turing Machines Alex Graves [email protected] Greg Wayne [email protected] Ivo Danihelka [email protected] Google DeepMind, London, UK Abstract We extend the capabilities of neural networks by coupling them to external memory re- sources, which they can interact with by attentional processes. The combined system is analogous to a Turing Machine or Von Neumann architecture but is differentiable end-to- end, allowing it to be efficiently trained with gradient descent. Preliminary results demon- strate that Neural Turing Machines can infer simple algorithms such as copying, sorting, and associative recall from input and output examples. 1 Introduction Computer programs make use of three fundamental mechanisms: elementary operations (e.g., arithmetic operations), logical flow control (branching), and external memory, which can be written to and read from in the course of computation (Von Neumann, 1945). De- arXiv:1410.5401v2 [cs.NE] 10 Dec 2014 spite its wide-ranging success in modelling complicated data, modern machine learning has largely neglected the use of logical flow control and external memory. Recurrent neural networks (RNNs) stand out from other machine learning methods for their ability to learn and carry out complicated transformations of data over extended periods of time. Moreover, it is known that RNNs are Turing-Complete (Siegelmann and Sontag, 1995), and therefore have the capacity to simulate arbitrary procedures, if properly wired. Yet what is possible in principle is not always what is simple in practice. We therefore enrich the capabilities of standard recurrent networks to simplify the solution of algorithmic tasks. This enrichment is primarily via a large, addressable memory, so, by analogy to Turing’s enrichment of finite-state machines by an infinite memory tape, we 1 dub our device a “Neural Turing Machine” (NTM).
    [Show full text]
  • A Hybrid Model Consisting of Supervised and Unsupervised Learning for Landslide Susceptibility Mapping
    remote sensing Article A Hybrid Model Consisting of Supervised and Unsupervised Learning for Landslide Susceptibility Mapping Zhu Liang 1, Changming Wang 1,* , Zhijie Duan 2, Hailiang Liu 1, Xiaoyang Liu 1 and Kaleem Ullah Jan Khan 1 1 College of Construction Engineering, Jilin University, Changchun 130012, China; [email protected] (Z.L.); [email protected] (H.L.); [email protected] (X.L.); [email protected] (K.U.J.K.) 2 State Key Laboratory of Hydroscience and Engineering Tsinghua University, Beijing 100084, China; [email protected] * Correspondence: [email protected]; Tel.: +86-135-0441-8751 Abstract: Landslides cause huge damage to social economy and human beings every year. Landslide susceptibility mapping (LSM) occupies an important position in land use and risk management. This study is to investigate a hybrid model which makes full use of the advantage of supervised learning model (SLM) and unsupervised learning model (ULM). Firstly, ten continuous variables were used to develop a ULM which consisted of factor analysis (FA) and k-means cluster for a preliminary landslide susceptibility map. Secondly, 351 landslides with “1” label were collected and the same number of non-landslide samples with “0” label were selected from the very low susceptibility area in the preliminary map, constituting a new priori condition for a SLM, and thirteen factors were used for the modeling of gradient boosting decision tree (GBDT) which represented for SLM. Finally, the performance of different models was verified using related indexes. The results showed that the performance of the pretreated GBDT model was improved with sensitivity, specificity, accuracy Citation: Liang, Z.; Wang, C.; Duan, and the area under the curve (AUC) values of 88.60%, 92.59%, 90.60% and 0.976, respectively.
    [Show full text]
  • 4 Perceptron Learning
    4 Perceptron Learning 4.1 Learning algorithms for neural networks In the two preceding chapters we discussed two closely related models, McCulloch–Pitts units and perceptrons, but the question of how to find the parameters adequate for a given task was left open. If two sets of points have to be separated linearly with a perceptron, adequate weights for the comput- ing unit must be found. The operators that we used in the preceding chapter, for example for edge detection, used hand customized weights. Now we would like to find those parameters automatically. The perceptron learning algorithm deals with this problem. A learning algorithm is an adaptive method by which a network of com- puting units self-organizes to implement the desired behavior. This is done in some learning algorithms by presenting some examples of the desired input- output mapping to the network. A correction step is executed iteratively until the network learns to produce the desired response. The learning algorithm is a closed loop of presentation of examples and of corrections to the network parameters, as shown in Figure 4.1. network test input-output compute the examples error fix network parameters Fig. 4.1. Learning process in a parametric system R. Rojas: Neural Networks, Springer-Verlag, Berlin, 1996 78 4 Perceptron Learning In some simple cases the weights for the computing units can be found through a sequential test of stochastically generated numerical combinations. However, such algorithms which look blindly for a solution do not qualify as “learning”. A learning algorithm must adapt the network parameters accord- ing to previous experience until a solution is found, if it exists.
    [Show full text]
  • Unsupervised Pre-Training of a Deep LSTM-Based Stacked Autoencoder for Multivariate Time Series Forecasting Problems Alaa Sagheer 1,2,3* & Mostafa Kotb2,3
    www.nature.com/scientificreports OPEN Unsupervised Pre-training of a Deep LSTM-based Stacked Autoencoder for Multivariate Time Series Forecasting Problems Alaa Sagheer 1,2,3* & Mostafa Kotb2,3 Currently, most real-world time series datasets are multivariate and are rich in dynamical information of the underlying system. Such datasets are attracting much attention; therefore, the need for accurate modelling of such high-dimensional datasets is increasing. Recently, the deep architecture of the recurrent neural network (RNN) and its variant long short-term memory (LSTM) have been proven to be more accurate than traditional statistical methods in modelling time series data. Despite the reported advantages of the deep LSTM model, its performance in modelling multivariate time series (MTS) data has not been satisfactory, particularly when attempting to process highly non-linear and long-interval MTS datasets. The reason is that the supervised learning approach initializes the neurons randomly in such recurrent networks, disabling the neurons that ultimately must properly learn the latent features of the correlated variables included in the MTS dataset. In this paper, we propose a pre-trained LSTM- based stacked autoencoder (LSTM-SAE) approach in an unsupervised learning fashion to replace the random weight initialization strategy adopted in deep LSTM recurrent networks. For evaluation purposes, two diferent case studies that include real-world datasets are investigated, where the performance of the proposed approach compares favourably with the deep LSTM approach. In addition, the proposed approach outperforms several reference models investigating the same case studies. Overall, the experimental results clearly show that the unsupervised pre-training approach improves the performance of deep LSTM and leads to better and faster convergence than other models.
    [Show full text]
  • Optimal Path Routing Using Reinforcement Learning
    OPTIMAL PATH ROUTING USING REINFORCEMENT LEARNING Rajasekhar Nannapaneni Sr Principal Engineer, Solutions Architect Dell EMC [email protected] Knowledge Sharing Article © 2020 Dell Inc. or its subsidiaries. The Dell Technologies Proven Professional Certification program validates a wide range of skills and competencies across multiple technologies and products. From Associate, entry-level courses to Expert-level, experience-based exams, all professionals in or looking to begin a career in IT benefit from industry-leading training and certification paths from one of the world’s most trusted technology partners. Proven Professional certifications include: • Cloud • Converged/Hyperconverged Infrastructure • Data Protection • Data Science • Networking • Security • Servers • Storage • Enterprise Architect Courses are offered to meet different learning styles and schedules, including self-paced On Demand, remote-based Virtual Instructor-Led and in-person Classrooms. Whether you are an experienced IT professional or just getting started, Dell Technologies Proven Professional certifications are designed to clearly signal proficiency to colleagues and employers. Learn more at www.dell.com/certification 2020 Dell Technologies Proven Professional Knowledge Sharing 2 Abstract Optimal path management is key when applied to disk I/O or network I/O. The efficiency of a storage or a network system depends on optimal routing of I/O. To obtain optimal path for an I/O between source and target nodes, an effective path finding mechanism among a set of given nodes is desired. In this article, a novel optimal path routing algorithm is developed using reinforcement learning techniques from AI. Reinforcement learning considers the given topology of nodes as the environment and leverages the given latency or distance between the nodes to determine the shortest path.
    [Show full text]
  • Memory Augmented Matching Networks for Few-Shot Learnings
    International Journal of Machine Learning and Computing, Vol. 9, No. 6, December 2019 Memory Augmented Matching Networks for Few-Shot Learnings Kien Tran, Hiroshi Sato, and Masao Kubo broken down when they have to learn a new concept on the Abstract—Despite many successful efforts have been made in fly or to deal with the problems which information/data is one-shot/few-shot learning tasks recently, learning from few insufficient (few or even a single example). These situations data remains one of the toughest areas in machine learning. are very common in the computer vision field, such as Regarding traditional machine learning methods, the more data gathered, the more accurate the intelligent system could detecting a new object with only a few available captured perform. Hence for this kind of tasks, there must be different images. For those purposes, specialists define the approaches to guarantee systems could learn even with only one one-shot/few-shot learning algorithms that could solve a task sample per class but provide the most accurate results. Existing with only one or few known samples (e.g., less than 20 solutions are ranged from Bayesian approaches to examples per category). meta-learning approaches. With the advances in Deep learning One-shot learning/few-shot learning is a challenging field, meta-learning methods with deep networks could be domain for neural networks since gradient-based learning is better such as recurrent networks with enhanced external memory (Neural Turing Machines - NTM), metric learning often too slow to quickly cope with never-before-seen classes, with Matching Network framework, etc.
    [Show full text]
  • Training Robot Policies Using External Memory Based Networks Via Imitation
    Training Robot Policies using External Memory Based Networks Via Imitation Learning by Nambi Srivatsav A Thesis Presented in Partial Fulfillment of the Requirements for the Degree Master of Science Approved September 2018 by the Graduate Supervisory Committee: Heni Ben Amor, Chair Siddharth Srivastava HangHang Tong ARIZONA STATE UNIVERSITY December 2018 ABSTRACT Recent advancements in external memory based neural networks have shown promise in solving tasks that require precise storage and retrieval of past information. Re- searchers have applied these models to a wide range of tasks that have algorithmic properties but have not applied these models to real-world robotic tasks. In this thesis, we present memory-augmented neural networks that synthesize robot naviga- tion policies which a) encode long-term temporal dependencies b) make decisions in partially observed environments and c) quantify the uncertainty inherent in the task. We extract information about the temporal structure of a task via imitation learning from human demonstration and evaluate the performance of the models on control policies for a robot navigation task. Experiments are performed in partially observed environments in both simulation and the real world. i TABLE OF CONTENTS Page LIST OF FIGURES . iii CHAPTER 1 INTRODUCTION . 1 2 BACKGROUND .................................................... 4 2.1 Neural Networks . 4 2.2 LSTMs . 4 2.3 Neural Turing Machines . 5 2.3.1 Controller . 6 2.3.2 Memory . 6 3 METHODOLOGY . 7 3.1 Neural Turing Machine for Imitation Learning . 7 3.1.1 Reading . 8 3.1.2 Writing . 9 3.1.3 Addressing Memory . 10 3.2 Uncertainty Estimation . 12 4 EXPERIMENTAL RESULTS .
    [Show full text]