Unsupervised Learning: Deep Auto-Encoder

Credits
• This lecture contains public domain materials from
  - Guy Nolan
  - Stefania Raimondo
  - Jason Su
  - Hung-yi Lee

Auto-encoder
• NN Encoder: turns the input object (e.g. a 28 x 28 = 784-pixel image) into a code, a compact representation of the input object. The code usually has fewer than 784 dimensions.
• NN Decoder: takes the code and can reconstruct the original object.
• The encoder and the decoder are learned together.

Recap: PCA
• Encode: $c = W x$. Decode: $\hat{x} = W^T c$.
• Minimize $\lVert x - \hat{x} \rVert^2$, so that $\hat{x}$ is as close as possible to $x$.
• Network view: input layer → hidden layer (linear) → output layer. The hidden layer is the bottleneck layer.
• Output of the hidden layer is the code.

Deep Auto-encoder
• Of course, the auto-encoder can be deep: input layer → layers with weights $W_1, W_2, \dots$ → bottleneck layer (code) → layers with weights $\dots, W_2^T, W_1^T$ → output layer, with the output $\hat{x}$ as close as possible to the input $x$.
• A symmetric structure is not necessary.
• Initialize by RBM, layer-by-layer.
Reference: Hinton, Geoffrey E., and Ruslan R. Salakhutdinov. "Reducing the dimensionality of data with neural networks." Science 313.5786 (2006): 504-507.

[Figure: original images compared with reconstructions from PCA (784 → 30 → 784) and from a deep auto-encoder (784 → 1000 → 500 → 250 → 30 → 250 → 500 → 1000 → 784).]

DEEP AUTOENCODER EXAMPLE
https://cs.stanford.edu/people/karpathy/convnetjs/demo/autoencoder.html - By Andrej Karpathy

[Figure: 2-dimensional codes from PCA (784 → 2 → 784) and from a deep auto-encoder (784 → 1000 → 500 → 250 → 2 → 250 → 500 → 1000 → 784).]

Auto-encoder
• De-noising auto-encoder: add noise to $x$ to get $x'$, encode $x'$ into the code $c$, decode $c$ into $\hat{x}$, and train so that $\hat{x}$ is as close as possible to the original $x$.
  Ref: Vincent, Pascal, et al. "Extracting and composing robust features with denoising autoencoders." ICML, 2008.
• More: Contractive auto-encoder
  Ref: Rifai, Salah, et al. "Contractive auto-encoders: Explicit invariance during feature extraction." Proceedings of the 28th International Conference on Machine Learning (ICML-11). 2011.

Deep Auto-encoder - Example
• An NN encoder maps each image to a code $c$.
• [Figure: t-SNE (t-Distributed Stochastic Neighbor Embedding) visualizations; panel labels: "Pixel -> tSNE", "PCA 32-dim", NN encoder code.]

Auto-encoder – Text Retrieval
• Vector Space Model: the query and each document are points in a vector space.
• Bag-of-word: a word string is represented by a vector with one entry per vocabulary word. For the query "This is an apple": this 1, is 1, a 0, an 1, apple 1, pen 0, …
• Semantics are not considered.
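The bag-of-word vector from the text retrieval slide can be sketched in a few lines. This is a minimal illustration, not code from the lecture: the function names `bag_of_words` and `cosine`, the whitespace tokenizer, and the second sentence "This is a pen" are assumptions made for the example. Only the six-word vocabulary and the query come from the slide.

```python
import math

def bag_of_words(text, vocabulary):
    """Count how often each vocabulary word occurs in the text."""
    tokens = text.lower().split()  # naive whitespace tokenizer (assumption)
    return [tokens.count(word) for word in vocabulary]

def cosine(u, v):
    """Cosine similarity between two count vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Vocabulary and query taken from the slide.
vocabulary = ["this", "is", "a", "an", "apple", "pen"]
query = bag_of_words("This is an apple", vocabulary)
print(query)  # [1, 1, 0, 1, 1, 0]

# Hypothetical document, added only for illustration.
document = bag_of_words("This is a pen", vocabulary)
print(cosine(query, document))  # 0.5
```

Every word is its own independent dimension here, so the similarity of 0.5 comes only from the shared words "this" and "is". Nothing in the representation knows what an apple or a pen is, which is the sense in which semantics are not considered.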
Auto-encoder – Text Retrieval
• The documents talking about the same thing will have close codes.
• Network: bag-of-word (document or query), 2000 → 500 → 250 → 125 → 2 (code).
• LSA: project documents to 2 latent topics.

Auto-encoder for CNN
• Encoder: Convolution → Pooling → Convolution → Pooling → code.
• Decoder: code → Deconvolution → Unpooling → Deconvolution → Unpooling → Deconvolution.
• The output should be as close as possible to the input.

CNN - Unpooling
• 14 x 14 → 28 x 28
• Alternative: simply repeat the values.
Source of image: https://leonardoaraujosantos.gitbooks.io/artificial-inteligence/content/image_segmentation.html

CNN - Deconvolution
• Actually, deconvolution is convolution.

Auto-encoder – Pre-training DNN
• Greedy Layer-wise Pre-training again
• Target network: Input 784 → 1000 → 1000 → 500 → output 10.
  1. Train an auto-encoder 784 → 1000 → 784 with weights W1 and W1', so that $\hat{x}$ is as close as possible to the input $x$.
  2. Fix W1. Train an auto-encoder 1000 → 1000 → 1000 with weights W2 and W2' to reconstruct the first-layer output $a^1$ as $\hat{a}^1$.
  3. Fix W1 and W2. Train an auto-encoder 1000 → 500 → 1000 with weights W3 and W3' to reconstruct $a^2$ as $\hat{a}^2$.
  4. Use W1, W2 and W3 to initialize the target network, randomly initialize W4 (500 → output 10), and fine-tune by backpropagation.

SIMPLE LATENT SPACE INTERPOLATION - KERAS
• Encode two inputs with the encoder to obtain $z_1$ and $z_2$.
• Interpolate between the two codes: $z_i = \alpha z_1 + (1 - \alpha) z_2$.
• Decode $z_i$ with the decoder.
• [Slides: Keras code example and interpolation results; images not recovered.]

DENOISING AUTOENCODERS
Intuition:
- We still aim to encode the input and NOT to mimic the identity function.
- We try to undo the effect of a corruption process stochastically applied to the input.
- The result is a more robust model.
[Figure: Noisy Input → Encoder → Latent space representation → Decoder → Denoised Input]

DENOISING AUTOENCODERS
Use Case:
- Extract a robust representation for a NN classifier.
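To make the idea of a stochastically applied corruption process concrete, here is a minimal sketch in plain Python. It assumes two common corruption choices, masking noise and additive Gaussian noise, and uses the squared error as the loss. The helper names, the toy input, and the `identity` stand-in for an autoencoder are all hypothetical and only serve the illustration.

```python
import random

def mask_noise(x, v, rng):
    """Masking noise: set each input to 0 independently with probability v."""
    return [0.0 if rng.random() < v else xi for xi in x]

def gaussian_noise(x, sigma, rng):
    """Additive Gaussian noise with standard deviation sigma."""
    return [xi + rng.gauss(0.0, sigma) for xi in x]

def squared_error(x, x_hat):
    """One possible choice of loss: the squared error."""
    return sum((a - b) ** 2 for a, b in zip(x, x_hat))

def dae_loss(x, x_tilde, autoencoder):
    """Reconstruct from the corrupted x_tilde, but compare with the clean x."""
    return squared_error(x, autoencoder(x_tilde))

def identity(values):
    """Hypothetical stand-in for an autoencoder that just copies its input."""
    return list(values)

rng = random.Random(0)
x = [0.2, 0.9, 0.5, 0.7]                          # toy input (assumption)
x_masked = mask_noise(x, v=0.5, rng=rng)
x_noisy = gaussian_noise(x, sigma=0.1, rng=rng)

print("clean:   ", x)
print("masked:  ", x_masked)
print("gaussian:", x_noisy)
print("identity loss on masked input:", dae_loss(x, x_masked, identity))
```

Because the loss is measured against the clean input, a model that copies its corrupted input is penalized whenever the corruption changed something. That is why the denoising objective discourages the identity function.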
[Figure: Noisy Input → Encoder → Latent space representation]

DENOISING AUTOENCODERS
Instead of trying to mimic the identity function by minimizing
    $L(x, g(f(x)))$
where $L$ is some loss function, a DAE minimizes
    $L(x, g(f(\tilde{x})))$
where $\tilde{x}$ is a copy of $x$ that has been corrupted by some form of noise.

DENOISING AUTOENCODERS
Idea: A robust representation against noise:
- Random assignment of a subset of inputs to 0, with probability $v$.
- Gaussian additive noise.
[Figure: $x$ → ($w$) → $f(x)$ → ($w'$) → $\hat{x}$]

DENOISING AUTOENCODERS
• The reconstruction $\hat{x}$ is computed from the corrupted input $\tilde{x}$.
• The loss function compares the reconstruction $\hat{x}$ with the noiseless $x$.
❖ The autoencoder cannot fully trust each feature of $x$ independently, so it must learn the correlations of $x$'s features.
❖ Based on those relations we can obtain a model that is less prone to changes.
➢ We are forcing the hidden layer to learn a generalized structure of the data.
[Figure: $x$ → noise process $p(\tilde{x} \mid x)$ → $\tilde{x}$ (some inputs set to 0) → ($w$) → $f(\tilde{x})$ → ($w'$) → $\hat{x}$]

DENOISING AUTOENCODERS - PROCESS
1. Take some input $x$.
2. Apply noise to obtain $\tilde{x}$.
3. Encode and decode $\tilde{x}$ with the DAE: $g(f(\tilde{x}))$.
4. The output is the reconstruction $\hat{x} = g(f(\tilde{x}))$.
5. Compare $\hat{x}$ with $x$.

DENOISING CONVOLUTIONAL AE – KERAS
- 50 epochs.
- Noise factor 0.5
- 92% accuracy on validation set.

STACKED AE
- Motivation:
❑ We want to harness the feature extraction quality of an AE to our advantage.
❑ For example: we can build a deep supervised classifier whose input is the output of a SAE.
❑ The benefit: our deep model's W are not randomly initialized but are rather "smartly selected".
❑ Also, using this unsupervised technique lets us use a larger unlabeled dataset.

STACKED AE
- Building a SAE consists of two phases:
1. Train each AE layer one after the other.
2. Connect any classifier (SVM / FC NN layer etc.)
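The first phase, training each AE layer one after the other, can be sketched with two tiny linear autoencoders in plain Python. This is an illustration under strong simplifying assumptions, not the lecture's Keras code: the layers are linear, training is plain per-sample gradient descent on the squared error, the toy data is synthetic, and all names (`train_linear_ae`, `reconstruction_error`, and so on) are hypothetical.

```python
import math
import random

def matvec(M, v):
    """Multiply matrix M (a list of rows) by vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def train_linear_ae(data, code_dim, epochs=200, lr=0.01, seed=0):
    """Train a linear autoencoder x -> z = W x -> x_hat = V z with SGD."""
    rng = random.Random(seed)
    d = len(data[0])
    W = [[rng.uniform(-0.1, 0.1) for _ in range(d)] for _ in range(code_dim)]
    V = [[rng.uniform(-0.1, 0.1) for _ in range(code_dim)] for _ in range(d)]
    for _ in range(epochs):
        for x in data:
            z = matvec(W, x)                         # encode
            x_hat = matvec(V, z)                     # decode
            err = [a - b for a, b in zip(x_hat, x)]  # x_hat - x
            # Error propagated back to the code, using V before its update.
            back = [sum(V[i][j] * err[i] for i in range(d))
                    for j in range(code_dim)]
            for i in range(d):
                for j in range(code_dim):
                    V[i][j] -= lr * 2 * err[i] * z[j]
            for j in range(code_dim):
                for i in range(d):
                    W[j][i] -= lr * 2 * back[j] * x[i]
    return W, V

def reconstruction_error(data, W, V):
    """Mean squared reconstruction error of the autoencoder (W, V)."""
    total = 0.0
    for x in data:
        x_hat = matvec(V, matvec(W, x))
        total += sum((a - b) ** 2 for a, b in zip(x, x_hat))
    return total / len(data)

def mean_squared_norm(data):
    """Error of the trivial all-zero reconstruction, used as a baseline."""
    return sum(sum(v * v for v in x) for x in data) / len(data)

# Synthetic 4-D data that lies on a 2-D subspace (assumption for the demo).
data = []
for t in range(40):
    a, b = math.sin(0.7 * t), math.cos(1.3 * t)
    data.append([a, b, a + b, a - b])

# Phase 1a: train AE 1 on the raw input (4 -> 2 -> 4).
W1, V1 = train_linear_ae(data, code_dim=2)
baseline1, err1 = mean_squared_norm(data), reconstruction_error(data, W1, V1)

# Phase 1b: freeze W1 and train AE 2 on the codes z1 (2 -> 1 -> 2).
codes1 = [matvec(W1, x) for x in data]
W2, V2 = train_linear_ae(codes1, code_dim=1)
baseline2, err2 = mean_squared_norm(codes1), reconstruction_error(codes1, W2, V2)

# Stacked encoder: these features would be the input of any classifier.
features = [matvec(W2, matvec(W1, x)) for x in data]
print("AE 1 error:", err1, "baseline:", baseline1)
print("AE 2 error:", err2, "baseline:", baseline2)
```

The decoders `V1` and `V2` are only needed during training. For the second phase, only the stacked encoders are kept and their output is fed to the chosen classifier.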
STACKED AE
[Figure: $x$ → SAE → Classifier → $y$]

STACKED AE – TRAIN PROCESS
First Layer Training (AE 1):
    $x$ → $f_1(x) = z_1$ → $g_1(z_1) = \hat{x}$
Second Layer Training (AE 2):
    $x$ → $f_1(x) = z_1$ → $f_2(z_1) = z_2$ → $g_2(z_2) = \hat{z}_1$
Add any classifier:
    $x$ → $f_1(x) = z_1$ → $f_2(z_1) = z_2$ → Classifier → Output

https://mi.eng.cam.ac.uk/projects/segnet/#demo

U-Net
• Current state of the art
  - Very popular in MICCAI 2016
  - Works well with low data
• Influenced by the previous
  - Up-conv 2x2 = bilinear up-sampling then 2x2 convolution
  - 2D slices
  - 3x3 convolutions
https://lmb.informatik.uni-freiburg.de/people/ronneber/u-net/