Unsupervised Learning: Autoencoders
Yunsheng Bai


Roadmap

1. Introduction to Autoencoders
2. Sparse Autoencoders (SAE) (2008)
3. Denoising Autoencoders (DAE) (2008)
4. Contractive Autoencoders (CAE) (2011)
5. Stacked Convolutional Autoencoders (SCAE) (2011)
6. Recursive Autoencoders (RAE) (2011)
7. Variational Autoencoders (VAE) (2013)
8. Adversarial Autoencoders (AAE) (2015)
9. Wasserstein Autoencoders (WAE) (2017)
10. Autoencoders for Graphs

Introduction to Autoencoders

Principal component analysis (PCA) is a statistical procedure that uses an orthogonal transformation to convert a set of observations of possibly correlated variables into a set of values of linearly uncorrelated variables called principal components.

Figure: PCA of a Gaussian scatter. https://en.wikipedia.org/wiki/Principal_component_analysis#/media/File:GaussianScatterPCA.svg
Figure labels: Change of basis; Inner product between them.
Source: https://www.cs.toronto.edu/~urtasun/courses/CSC411/14_pca.pdf

PCA ≈ Autoencoder with Linear Activation Function

- Not necessarily orthogonal
- Could have many layers, but as long as the activation is linear → a single W and a single V

Sources: Hands-On Machine Learning with Scikit-Learn and TensorFlow: Concepts, Tools, and Techniques to Build Intelligent Systems; https://www.cs.toronto.edu/~urtasun/courses/CSC411/14_pca.pdf

PCA vs Autoencoder

- Autoencoders are much more flexible than PCA.
- NN activation functions introduce “non-linearities” in encoding, but PCA only does linear transformation.
- We can stack autoencoders to form a deep autoencoder network.

Sources: https://towardsdatascience.com/autoencoders-are-essential-in-deep-neural-nets-f0365b2d1d7c; Hands-On Machine Learning with Scikit-Learn and TensorFlow: Concepts, Tools, and Techniques to Build Intelligent Systems

Figure: Stacked autoencoder, Layer 1 through Layer 4.
Source: Pro Deep Learning with TensorFlow: A Mathematical Approach to Advanced Artificial Intelligence in Python

Goal: Learn Useful Features from Data

We’ve seen that autoencoders can do PCA, but fundamentally, why does an autoencoder work?

Source: https://hackernoon.com/autoencoders-deep-learning-bits-1-11731e200694

Goal: Feature/Representation Learning

Why can’t an autoencoder simply copy input to output through identity functions?

Figure: Overcomplete autoencoder in which the encoder f and the decoder g are both drawn as identity matrices, with the objective min ||x - g(f(x))||_2.
Source: Pro Deep Learning with TensorFlow: A Mathematical Approach to Advanced Artificial Intelligence in Python

To Achieve Feature Learning, Conflicting Goals

Autoencoders are designed to be unable to learn to copy perfectly. Usually they are restricted in ways that allow them to copy only approximately. Because the model is forced to prioritize which aspects of the input should be copied, it often learns useful properties of the data.
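To make the identity-copy question above concrete, here is a minimal sketch in plain Python. It assumes a purely linear encoder and decoder and a squared reconstruction error; the helper names (`matvec`, `padded_identity`, `reconstruction_error`) and the toy input are hypothetical and exist only for this illustration. With an overcomplete hidden layer, identity-like weights copy the input exactly, while the same weights in an undercomplete layer must drop information.

```python
# Hypothetical illustration: names and data are made up for this sketch.

def matvec(M, v):
    """Multiply matrix M (a list of rows) by vector v."""
    return [sum(m_ij * v_j for m_ij, v_j in zip(row, v)) for row in M]

def padded_identity(rows, cols):
    """rows x cols matrix with ones on the main diagonal and zeros elsewhere."""
    return [[1.0 if i == j else 0.0 for j in range(cols)] for i in range(rows)]

def reconstruction_error(x, W, V):
    """Squared error ||x - V (W x)||^2 for a linear encoder W and decoder V."""
    x_hat = matvec(V, matvec(W, x))
    return sum((a - b) ** 2 for a, b in zip(x, x_hat))

x = [0.5, -1.0, 2.0, 3.0]           # a 4-dimensional input

# Overcomplete: 6 hidden units for 4 inputs, so a padded identity copies x.
W_over = padded_identity(6, 4)      # encoder f
V_over = padded_identity(4, 6)      # decoder g
err_over = reconstruction_error(x, W_over, V_over)

# Undercomplete: 2 hidden units for 4 inputs, so two coordinates are lost.
W_under = padded_identity(2, 4)
V_under = padded_identity(4, 2)
err_under = reconstruction_error(x, W_under, V_under)

print(err_over, err_under)
```

The overcomplete pair reconstructs the input without having learned anything about the data, which is the failure mode that the restrictions discussed here are meant to rule out.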
Source: Deep Learning (Adaptive Computation and Machine Learning series) (Ian Goodfellow, Yoshua Bengio, Aaron Courville)

“If you could speak only a few words per month, you would probably try to make them worth listening to.”

Undercomplete Autoencoders

Figure: Undercomplete autoencoder with hidden code h.
Caveat: Encoders and decoders are too powerful :(
Sources: Pro Deep Learning with TensorFlow: A Mathematical Approach to Advanced Artificial Intelligence in Python; http://rgraphgallery.blogspot.com/2013/04/rg-3d-scatter-plots-with-vertical-lines.html

Regularized Autoencoders

Regularized autoencoders use a loss function that encourages the model to have other properties besides the ability to copy its input to its output. These other properties include sparsity of the representation, smallness of the derivative of the representation, and robustness to noise or to missing inputs. A regularized autoencoder can be nonlinear and overcomplete but still learn something useful about the data distribution, even if the model capacity is great enough to learn a trivial identity function.

2008: Sparse Autoencoders (SAE)
2008: Denoising Autoencoders (DAE)
2011: Contractive Autoencoders (CAE)
2011: Stacked Convolutional Autoencoders (SCAE)
2011: Recursive Autoencoders (RAE)
2013: Variational Autoencoders (VAE)
2015: Adversarial Autoencoders (AAE)
2017: Wasserstein Autoencoders (WAE)

→ These variants introduce new things to the loss
→ They are just different regularizers

Source: Deep Learning (Adaptive Computation and Machine Learning series) (Ian Goodfellow, Yoshua Bengio, Aaron Courville)

Properties of Autoencoders (Ideally)

1. Learn useful features from data (effective representations)
   a. Capture the intrinsic properties of data → feed them into downstream applications
   b. Can be thought of as patterns in data → generate new data
2. Produce low-dimensional vectors (efficient/compact representations)
   a. Efficient for storage
   b. Efficient for downstream models
   c. May be free of noise in input
   d. Easier to visualize than high-dimensional data
3. Are flexible: can be modified/guided/regularized in various ways:
   a. Input data, e.g. add noise
   b. Output data, e.g. something different from the input
   c. Architecture, e.g. fully connected layer → convolutional layer
   d. Loss, e.g. add additional loss terms → capture other useful information from input
   e. Latent space, e.g. Gaussian (more later in VAE)
      i. Enforce certain prior knowledge, usually through additional loss terms
      ii. Analyzing the latent space/representations is a trend (?), e.g. debiasing word embeddings
   f. … (Be creative! This is where research comes from)

Source: Hands-On Machine Learning with Scikit-Learn and TensorFlow: Concepts, Tools, and Techniques to Build Intelligent Systems

History of Autoencoders

10 years ago, we thought that deep nets would also need an unsupervised cost, like the autoencoder cost, to regularize them. Today, we know we are able to recognize images just by using backprop on the supervised cost as long as there is enough labeled data. There are other tasks where we do still use autoencoders, but they’re not the fundamental solution to training deep nets that people once thought they were going to be. (Ian Goodfellow, 2016)

Side note: Humans can learn from very few labeled examples. Why? One popular hypothesis is that the brain can leverage unsupervised or semi-supervised learning.

Sources: https://www.quora.com/Why-are-autoencoders-considered-a-failure-What-are-their-alternatives; https://www.doc.ic.ac.uk/~js4416/163/website/nlp/#XGlorot2011; Deep Learning (Adaptive Computation and Machine Learning series) (Ian Goodfellow, Yoshua Bengio, Aaron Courville)

Applications of Autoencoders

1. Data Compression for Storage
   a. It is difficult to train an autoencoder that does better than a basic algorithm like JPEG
   b. Autoencoders are data-specific: they may be hard to generalize to unseen data
2. Dimensionality Reduction for Data Visualization
   a. t-SNE is good, but typically requires relatively low-dimensional data
      i. For high-dimensional data, first use an autoencoder, then use t-SNE
   b. Latent space visualization (more later)

Sources: https://blog.keras.io/building-autoencoders-in-keras.html; https://www.doc.ic.ac.uk/~js4416/163/website/nlp/#XVincent2008; https://hackernoon.com/latent-space-visualization-deep-learning-bits-2-bd09a46920df

3. Unsupervised Pretraining
   a. Greedy Layer-Wise Unsupervised Pretraining: train each layer of a feedforward net greedily; continue stacking layers; the output of prior layers is the input for the next one; fine-tune
   b. Today, we have random weight initialization, rectified linear units (ReLUs) (2011), dropout (2012), batch normalization (2015), residual learning (2015) + large labeled datasets
   c. Still useful
      i. Train a deep autoencoder
      ii. Train an autoencoder on an unlabeled dataset, and reuse the lower layers to create a new network trained on the labeled data (~supervised pretraining)
      iii. Train an autoencoder on an unlabeled dataset, and use the learned representations in downstream tasks (see more in 4)

Sources: https://blog.keras.io/building-autoencoders-in-keras.html; https://www.doc.ic.ac.uk/~js4416/163/website/nlp/#XVincent2008

Greedy Layer-Wise Unsupervised Pretraining for Training Deep Autoencoders

Figure only. Source: Hands-On Machine Learning with Scikit-Learn and TensorFlow: Concepts, Tools, and Techniques to Build Intelligent Systems

Unsupervised Pretraining for Supervised Tasks

Figure only. Source: Hands-On Machine Learning with Scikit-Learn and TensorFlow: Concepts, Tools, and Techniques to Build Intelligent Systems
Downside: two-staged → hyperparameter tuning :(

Supervised Pretraining

Figure only. Source: Hands-On Machine Learning with Scikit-Learn and TensorFlow: Concepts, Tools, and Techniques to Build Intelligent Systems
Related settings: Multi-Task Learning, Transfer Learning, Domain Adaptation, supervised pretraining. https://www.youtube.com/watch?v=R3DNKE3zKFk

Multi-Task Learning

Figure only. Source: https://www.youtube.com/watch?v=R3DNKE3zKFk

Applications of Autoencoders

4. Generate Representations for Downstream Tasks
   a. Special case of unsupervised pre-training (3.c.iii)
   b. Useful when the initial representation is poor, and there is a lot of unlabeled data
      i. Word embeddings (better than one-hot representations)
      ii. Graph node embeddings
      iii. Image embeddings (Images already lie in a rich vector space? Check out puppy image embeddings!)
      iv. Semantic hashing: turn database entries (text, image, etc.) into low-dimensional and binary
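The greedy layer-wise pretraining procedure listed under application 3 can be sketched in a few lines of plain Python. This is only an illustration under loud assumptions: the layers are linear (no activation function), training is plain stochastic gradient descent on squared reconstruction error, the final fine-tuning step is omitted, and every name (`train_layer`, `greedy_pretrain`, the toy data) is hypothetical rather than taken from the slides or from any library.

```python
import random

def matvec(M, v):
    """Multiply matrix M (a list of rows) by vector v."""
    return [sum(a * b for a, b in zip(row, v)) for row in M]

def reconstruction_mse(data, W, V):
    """Mean squared reconstruction error of decoder V applied to encoder W."""
    total = 0.0
    for x in data:
        x_hat = matvec(V, matvec(W, x))
        total += sum((a - b) ** 2 for a, b in zip(x, x_hat))
    return total / len(data)

def train_layer(data, hidden, epochs=300, lr=0.01, seed=0):
    """Train one linear autoencoder layer with plain SGD; return (W, V)."""
    rng = random.Random(seed)
    d = len(data[0])
    W = [[rng.uniform(-0.5, 0.5) for _ in range(d)] for _ in range(hidden)]
    V = [[rng.uniform(-0.5, 0.5) for _ in range(hidden)] for _ in range(d)]
    for _ in range(epochs):
        for x in data:
            h = matvec(W, x)
            err = [xh - xi for xh, xi in zip(matvec(V, h), x)]
            # Back-propagate the error through the decoder before updating it.
            grad_h = [sum(V[i][k] * err[i] for i in range(d)) for k in range(hidden)]
            for i in range(d):
                for k in range(hidden):
                    V[i][k] -= lr * err[i] * h[k]
            for k in range(hidden):
                for j in range(d):
                    W[k][j] -= lr * grad_h[k] * x[j]
    return W, V

def greedy_pretrain(data, layer_sizes):
    """Train layers one at a time; each layer sees the previous layer's codes."""
    encoders, inputs = [], data
    for depth, hidden in enumerate(layer_sizes):
        W, _ = train_layer(inputs, hidden, seed=depth)
        encoders.append(W)
        inputs = [matvec(W, x) for x in inputs]   # codes feed the next layer
    return encoders

# Toy data: 4-dimensional points that lie on a 2-dimensional plane.
rng = random.Random(42)
data = []
for _ in range(50):
    a, b = rng.uniform(-1, 1), rng.uniform(-1, 1)
    data.append([a, b, a + b, a - b])

W0, V0 = train_layer(data, 3, epochs=0)       # untrained baseline, same seed
W1, V1 = train_layer(data, 3)                 # first layer after training
encoders = greedy_pretrain(data, [3, 2])      # stacked 4 -> 3 -> 2 encoder

codes = data
for W in encoders:
    codes = [matvec(W, x) for x in codes]
print(reconstruction_mse(data, W0, V0), reconstruction_mse(data, W1, V1), len(codes[0]))
```

In a real setting the stacked encoders would be followed by fine-tuning of the whole network, and the layers would use a non-linear activation; the sketch keeps only the "train one layer, freeze it, feed its codes to the next" structure.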