Symmetric Variational Autoencoder and Connections to Adversarial Learning



Symmetric Variational Autoencoder and Connections to Adversarial Learning

Liqun Chen1, Shuyang Dai1, Yunchen Pu1, Erjin Zhou4, Chunyuan Li1, Qinliang Su2, Changyou Chen3, Lawrence Carin1
1Duke University, 2Sun Yat-Sen University, China, 3University at Buffalo, SUNY, 4Megvii Research
[email protected]

Proceedings of the 21st International Conference on Artificial Intelligence and Statistics (AISTATS) 2018, Lanzarote, Spain. PMLR: Volume 84. Copyright 2018 by the author(s).

Abstract

A new form of the variational autoencoder (VAE) is proposed, based on the symmetric Kullback-Leibler divergence. It is demonstrated that learning of the resulting symmetric VAE (sVAE) has close connections to previously developed adversarial-learning methods. This relationship helps unify the previously distinct techniques of VAE and adversarial learning, and provides insights that allow us to ameliorate shortcomings with some previously developed adversarial methods. In addition to an analysis that motivates and explains the sVAE, an extensive set of experiments validate the utility of the approach.

1 Introduction

Generative models [Pu et al., 2015, 2016b] that are descriptive of data have been widely employed in statistics and machine learning. Factor models (FMs) represent one commonly used generative model [Tipping and Bishop, 1999], and mixtures of FMs have been employed to account for more-general data distributions [Ghahramani and Hinton, 1997]. These models typically have latent variables (e.g., factor scores) that are inferred given observed data; the latent variables are often used for a down-stream goal, such as classification [Carvalho et al., 2008]. After training, such models are useful for inference tasks given subsequent observed data. However, when one draws from such models, by drawing latent variables from the prior and pushing them through the model to synthesize data, the synthetic data typically do not appear to be realistic. This suggests that while these models may be useful for analyzing observed data in terms of inferred latent variables, they are also capable of describing a large set of data that do not appear to be real.

The generative adversarial network (GAN) [Goodfellow et al., 2014] represents a significant recent advance toward development of generative models that are capable of synthesizing realistic data. Such models also employ latent variables, drawn from a simple distribution analogous to the aforementioned prior, and these random variables are fed through a (deep) neural network. The neural network acts as a functional transformation of the original random variables, yielding a model capable of representing sophisticated distributions. Adversarial learning discourages the network from yielding synthetic data that are unrealistic, from the perspective of a learned neural-network-based classifier. However, GANs are notoriously difficult to train, and multiple generalizations and techniques have been developed to improve learning performance [Salimans et al., 2016], for example the Wasserstein GAN (WGAN) [Arjovsky and Bottou, 2017, Arjovsky et al., 2017] and the energy-based GAN (EB-GAN) [Zhao et al., 2017].
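To make the adversarial-learning idea above concrete, the following is a minimal sketch that evaluates the classic GAN value function V(D, G) = E_{q(x)}[log D(x)] + E_{p(z)}[log(1 − D(G(z)))] on toy data; the linear "generator", the logistic-regression "discriminator", the dimensions, and all variable names are illustrative assumptions, not the architectures used in the works cited above.

```python
import numpy as np

rng = np.random.default_rng(1)
D_dim, K = 4, 2                                # toy data and noise dimensionality
G = rng.normal(size=(D_dim, K))                # "generator": a toy linear map from noise to data space
w, b = rng.normal(size=D_dim), 0.0             # "discriminator": logistic-regression parameters

def discriminator(x):
    """D(x) = sigmoid(w.x + b): estimated probability that x is real."""
    return 1.0 / (1.0 + np.exp(-(x @ w + b)))

# Real samples from q(x) (here just an offset Gaussian) and synthetic samples G(z), z ~ p(z).
x_real = rng.normal(loc=2.0, size=(256, D_dim))
x_fake = rng.normal(size=(256, K)) @ G.T

# GAN value function: the discriminator is trained to increase it,
# while the generator is trained to decrease it (adversarial learning).
v = (np.mean(np.log(discriminator(x_real) + 1e-12))
     + np.mean(np.log(1.0 - discriminator(x_fake) + 1e-12)))
print(f"Monte Carlo estimate of V(D, G): {v:.3f}")
```

In a full GAN both players are deep networks and are updated alternately by stochastic gradient steps on this objective.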
While the original GAN and variants were capable of synthesizing highly realistic data (e.g., images), the models lacked the ability to infer the latent variables given observed data. This limitation has been mitigated recently by methods like adversarially learned inference (ALI) [Dumoulin et al., 2017], and related approaches. However, ALI appears to be inadequate from the standpoint of inference, in that, given observed data and associated inferred latent variables, the subsequently synthesized data often do not look particularly close to the original data.

The variational autoencoder (VAE) [Kingma and Welling, 2014] is a class of generative models that precedes GAN. VAE learning is based on optimizing a variational lower bound, connected to inferring an approximate posterior distribution on latent variables; such learning is typically not performed in an adversarial manner. VAEs have been demonstrated to be effective models for inferring latent variables, in that the reconstructed data do typically look like the original data, albeit in a blurry manner [Dumoulin et al., 2017, Pu et al., 2016a, 2017a]. The form of the VAE has been generalized recently, in terms of the adversarial variational Bayes (AVB) framework [Mescheder et al., 2016]. This model yields general forms of encoders and decoders, but it is based on the original variational Bayes (VB) formulation. The original VB framework yields a lower bound on the log likelihood of the observed data, and therefore model learning is connected to maximum-likelihood (ML) approaches. From the perspective of designing generative models, it has been recognized recently that ML-based learning has limitations [Arjovsky and Bottou, 2017]: such learning tends to yield models that match observed data, but also have a high probability of generating unrealistic synthetic data.

The original VAE employs the Kullback-Leibler divergence to constitute the variational lower bound. As is well known, the KL divergence is asymmetric. We demonstrate that this asymmetry encourages design of decoders (generators) that often yield unrealistic synthetic data when the latent variables are drawn from the prior. From a different but related perspective, the encoder infers latent variables (across all training data) that only encompass a subset of the prior. As demonstrated below, these limitations of the encoder and decoder within conventional VAE learning are intertwined.

We consequently propose a new symmetric VAE (sVAE), based on a symmetric form of the KL divergence and associated variational bound. The proposed sVAE is learned using an approach related to that employed in the AVB [Mescheder et al., 2016], but in a new manner connected to the symmetric variational bound. Analysis of the sVAE demonstrates that it has close connections to ALI [Dumoulin et al., 2017], WGAN [Arjovsky et al., 2017] and the original GAN [Goodfellow et al., 2014] framework; in fact, ALI is recovered exactly, as a special case of the proposed sVAE. This provides a new and explicit linkage between the VAE (after it is made symmetric) and a wide class of adversarially trained generative models. Additionally, with this insight, we are able to ameliorate much of the aforementioned limitations of ALI, from the perspective of data reconstruction. In addition to analyzing properties of the sVAE, we demonstrate excellent performance on an extensive set of experiments.

2 Review of Variational Autoencoder

2.1 Background

Assume observed data samples x ∼ q(x), where q(x) is the true and unknown distribution we wish to approximate. Consider p_θ(x|z), a model with parameters θ and latent code z. With prior p(z) on the codes, the modeled generative process is x ∼ p_θ(x|z), with z ∼ p(z). We may marginalize out the latent codes, and hence the model is x ∼ p_θ(x) = ∫ dz p_θ(x|z) p(z). To learn θ, we typically seek to maximize the expected log likelihood: θ̂ = argmax_θ E_{q(x)} log p_θ(x), where one typically invokes the approximation E_{q(x)} log p_θ(x) ≈ (1/N) Σ_{n=1}^N log p_θ(x_n), assuming N iid observed samples {x_n}_{n=1,N}.

It is typically intractable to evaluate p_θ(x) directly, as ∫ dz p_θ(x|z) p(z) generally does not have a closed form. Consequently, a typical approach is to consider a model q_φ(z|x) for the posterior of the latent code z given observed x, characterized by parameters φ. Distribution q_φ(z|x) is often termed an encoder, and p_θ(x|z) is a decoder [Kingma and Welling, 2014]; both are here stochastic, vis-à-vis their deterministic counterparts associated with a traditional autoencoder [Vincent et al., 2010]. Consider the variational expression

    L_x(θ, φ) = E_{q(x)} E_{q_φ(z|x)} log [ p_θ(x|z) p(z) / q_φ(z|x) ]    (1)

In practice the expectation wrt x ∼ q(x) is evaluated via sampling, assuming N observed samples {x_n}_{n=1,N}. One typically must also utilize sampling from q_φ(z|x) to evaluate the corresponding expectation in (1). Learning is effected as (θ̂, φ̂) = argmax_{θ,φ} L_x(θ, φ), and a model so learned is termed a variational autoencoder (VAE) [Kingma and Welling, 2014].
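As a concrete illustration of how (1) is estimated by sampling, below is a minimal sketch that computes a single-sample Monte Carlo estimate of log[p_θ(x|z) p(z) / q_φ(z|x)] with a diagonal-Gaussian encoder, a Bernoulli decoder, and a standard Gaussian prior p(z); the toy linear maps W_mu, W_lv, W_dec and the random binary data are hypothetical stand-ins for the encoder/decoder networks and the samples from q(x), not the models used in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
D, K = 8, 2                                           # toy data and latent dimensionality
W_mu = 0.1 * rng.normal(size=(K, D))                  # encoder (phi): mean head
W_lv = 0.1 * rng.normal(size=(K, D))                  # encoder (phi): log-variance head
W_dec = 0.1 * rng.normal(size=(D, K))                 # decoder (theta): Bernoulli logits

def log_normal(z, mu, log_var):
    """Log density of a diagonal Gaussian, summed over dimensions."""
    return -0.5 * np.sum(log_var + (z - mu) ** 2 / np.exp(log_var) + np.log(2 * np.pi))

def bound_single_sample(x):
    """One-sample estimate of log[p_theta(x|z) p(z) / q_phi(z|x)], the integrand of (1)."""
    mu, log_var = W_mu @ x, W_lv @ x                      # q_phi(z|x) = N(mu, diag(exp(log_var)))
    z = mu + np.exp(0.5 * log_var) * rng.normal(size=K)   # reparameterized draw z ~ q_phi(z|x)
    logits = W_dec @ z                                    # p_theta(x|z) = Bernoulli(sigmoid(logits))
    log_px_z = np.sum(x * logits - np.logaddexp(0.0, logits))
    log_pz = log_normal(z, np.zeros(K), np.zeros(K))      # standard Gaussian prior p(z)
    log_qz_x = log_normal(z, mu, log_var)
    return log_px_z + log_pz - log_qz_x

# Averaging over N observed samples approximates L_x(theta, phi); in a real VAE the
# parameters would then be updated by gradient ascent on this estimate.
X = (rng.random((100, D)) < 0.5).astype(float)
print(np.mean([bound_single_sample(x) for x in X]))
```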
It is well known that L_x(θ, φ) = E_{q(x)}[log p_θ(x) − KL(q_φ(z|x) ‖ p_θ(z|x))] ≤ E_{q(x)}[log p_θ(x)]. Alternatively, the variational expression may be represented as

    L_x(θ, φ) = −KL(q_φ(x, z) ‖ p_θ(x, z)) + C_x    (2)

where q_φ(x, z) = q(x) q_φ(z|x), p_θ(x, z) = p(z) p_θ(x|z) and C_x = E_{q(x)} log q(x). One may readily show that

    KL(q_φ(x, z) ‖ p_θ(x, z)) = E_{q(x)} KL(q_φ(z|x) ‖ p_θ(z|x)) + KL(q(x) ‖ p_θ(x))    (3)
                              = E_{q_φ(z)} KL(q_φ(x|z) ‖ p_θ(x|z)) + KL(q_φ(z) ‖ p(z))    (4)

where q_φ(z) = ∫ q(x) q_φ(z|x) dx. To maximize L_x(θ, φ), we seek minimization of KL(q_φ(x, z) ‖ p_θ(x, z)). Hence, from (3) the goal is to align p_θ(x) with q(x), while from (4) the goal is to align q_φ(z) with p(z). The other terms seek to match the respective conditional distributions. All of these conditions are implied by minimizing KL(q_φ(x, z) ‖ p_θ(x, z)). However, the KL divergence is asymmetric, which yields limitations wrt the learned model.

2.2 Limitations of the VAE

The support S_{p(z)} of a distribution p(z) is defined as the member of the set { S̃_{p(z)} : ∫_{S̃_{p(z)}} p(z) dz = 1 − ε } with minimum size ‖S̃_{p(z)}‖ ≜ ∫_{S̃_{p(z)}} dz. We are typically interested in ε → 0+. For notational convenience we replace S^ε_{p(z)} with S_{p(z)}, with the understanding that ε is small. We also define S̄_{p(z)} as the largest set for which ∫_{S̄_{p(z)}} p(z) dz = ε, and hence ∫_{S_{p(z)}} p(z) dz + ∫_{S̄_{p(z)}} p(z) dz = 1.
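The practical consequence of this asymmetry is easy to see numerically. The short sketch below compares the two directions of the KL divergence for a broad univariate Gaussian standing in for the prior p(z) and a narrow Gaussian standing in for an aggregate posterior q_φ(z) that covers only part of S_{p(z)}; the closed-form Gaussian KL is standard, and the specific parameter values are arbitrary choices for illustration, not quantities from the paper.

```python
import numpy as np

def kl_gauss(mu_a, var_a, mu_b, var_b):
    """KL(N(mu_a, var_a) || N(mu_b, var_b)) for univariate Gaussians (closed form)."""
    return 0.5 * (np.log(var_b / var_a) + (var_a + (mu_a - mu_b) ** 2) / var_b - 1.0)

mu_p, var_p = 0.0, 1.0     # broad "prior" p(z)
mu_q, var_q = 0.5, 0.05    # narrow "aggregate posterior" q_phi(z) hiding inside p's support

kl_qp = kl_gauss(mu_q, var_q, mu_p, var_p)   # KL(q||p): modest when q sits inside p
kl_pq = kl_gauss(mu_p, var_p, mu_q, var_q)   # KL(p||q): large when q misses part of p's support
print(f"KL(q||p) = {kl_qp:.2f}, KL(p||q) = {kl_pq:.2f}, symmetric sum = {kl_qp + kl_pq:.2f}")
```

The direction KL(q_φ(z) ‖ p(z)) that appears in (4) stays comparatively small even when q_φ(z) occupies only a subset of the prior's support, whereas the reverse direction grows rapidly; a symmetric combination of the two therefore penalizes exactly the encoder/decoder mismatch described in Section 2.2, which is the motivation for the sVAE developed in the sequel.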
Recommended publications
  • Disentangled Variational Auto-Encoder for Semi-Supervised Learning
Information Sciences 482 (2019) 73–85
Yang Li a, Quan Pan a, Suhang Wang c, Haiyun Peng b, Tao Yang a, Erik Cambria b,∗
a School of Automation, Northwestern Polytechnical University, China; b School of Computer Science and Engineering, Nanyang Technological University, Singapore; c College of Information Sciences and Technology, Pennsylvania State University, USA
Article history: Received 5 February 2018; Revised 23 December 2018; Accepted 24 December 2018; Available online 3 January 2019.
Keywords: Semi-supervised learning; Variational Auto-encoder; Disentangled representation; Neural networks
Abstract: Semi-supervised learning is attracting increasing attention due to the fact that datasets of many domains lack enough labeled data. Variational Auto-Encoder (VAE), in particular, has demonstrated the benefits of semi-supervised learning. The majority of existing semi-supervised VAEs utilize a classifier to exploit label information, where the parameters of the classifier are introduced to the VAE. Given the limited labeled data, learning the parameters for the classifiers may not be an optimal solution for exploiting label information. Therefore, in this paper, we develop a novel approach for semi-supervised VAE without classifier. Specifically, we propose a new model called Semi-supervised Disentangled VAE (SDVAE), which encodes the input data into disentangled representation and non-interpretable representation, then the category information is directly utilized to regularize the disentangled representation via the equality constraint.
  • A Deep Hierarchical Variational Autoencoder
NVAE: A Deep Hierarchical Variational Autoencoder
Arash Vahdat, Jan Kautz (NVIDIA), {avahdat, jkautz}@nvidia.com
Abstract: Normalizing flows, autoregressive models, variational autoencoders (VAEs), and deep energy-based models are among competing likelihood-based frameworks for deep generative learning. Among them, VAEs have the advantage of fast and tractable sampling and easy-to-access encoding networks. However, they are currently outperformed by other models such as normalizing flows and autoregressive models. While the majority of the research in VAEs is focused on the statistical challenges, we explore the orthogonal direction of carefully designing neural architectures for hierarchical VAEs. We propose Nouveau VAE (NVAE), a deep hierarchical VAE built for image generation using depth-wise separable convolutions and batch normalization. NVAE is equipped with a residual parameterization of Normal distributions and its training is stabilized by spectral regularization. We show that NVAE achieves state-of-the-art results among non-autoregressive likelihood-based models on the MNIST, CIFAR-10, CelebA 64, and CelebA HQ datasets and it provides a strong baseline on FFHQ. For example, on CIFAR-10, NVAE pushes the state-of-the-art from 2.98 to 2.91 bits per dimension, and it produces high-quality images on CelebA HQ as shown in Fig. 1. To the best of our knowledge, NVAE is the first successful VAE applied to natural images as large as 256×256 pixels. The source code is available at https://github.com/NVlabs/NVAE.
1 Introduction
The majority of the research efforts on improving VAEs [1, 2] is dedicated to the statistical challenges, such as reducing the gap between approximate and true posterior distributions [3, 4, 5, 6, 7, 8, 9, 10], formulating tighter bounds [11, 12, 13, 14], reducing the gradient noise [15, 16], extending VAEs to discrete variables [17, 18, 19, 20, 21, 22, 23], or tackling posterior collapse [24, 25, 26, 27].
  • Explainable Deep Learning Models in Medical Image Analysis
    Journal of Imaging Review Explainable Deep Learning Models in Medical Image Analysis Amitojdeep Singh 1,2,* , Sourya Sengupta 1,2 and Vasudevan Lakshminarayanan 1,2 1 Theoretical and Experimental Epistemology Laboratory, School of Optometry and Vision Science, University of Waterloo, Waterloo, ON N2L 3G1, Canada; [email protected] (S.S.); [email protected] (V.L.) 2 Department of Systems Design Engineering, University of Waterloo, Waterloo, ON N2L 3G1, Canada * Correspondence: [email protected] Received: 28 May 2020; Accepted: 17 June 2020; Published: 20 June 2020 Abstract: Deep learning methods have been very effective for a variety of medical diagnostic tasks and have even outperformed human experts on some of those. However, the black-box nature of the algorithms has restricted their clinical use. Recent explainability studies aim to show the features that influence the decision of a model the most. The majority of literature reviews of this area have focused on taxonomy, ethics, and the need for explanations. A review of the current applications of explainable deep learning for different medical imaging tasks is presented here. The various approaches, challenges for clinical deployment, and the areas requiring further research are discussed here from a practical standpoint of a deep learning researcher designing a system for the clinical end-users. Keywords: explainability; explainable AI; XAI; deep learning; medical imaging; diagnosis 1. Introduction Computer-aided diagnostics (CAD) using artificial intelligence (AI) provides a promising way to make the diagnosis process more efficient and available to the masses. Deep learning is the leading artificial intelligence (AI) method for a wide range of tasks including medical imaging problems.
  • Double Backpropagation for Training Autoencoders Against Adversarial Attack
Chengjin Sun, Sizhe Chen, and Xiaolin Huang, Senior Member, IEEE
Abstract—Deep learning, as widely known, is vulnerable to adversarial samples. This paper focuses on the adversarial attack on autoencoders. Safety of the autoencoders (AEs) is important because they are widely used as a compression scheme for data storage and transmission; however, the current autoencoders are easily attacked, i.e., one can slightly modify an input but obtain totally different codes. The vulnerability is rooted in the sensitivity of the autoencoders, and to enhance the robustness we propose to adopt double backpropagation (DBP) to secure autoencoders such as VAE and DRAW. We restrict the gradient from the reconstructed image to the original one so that the autoencoder is not sensitive to trivial perturbations produced by the adversarial attack. After smoothing the gradient by DBP, we further smooth the label by a Gaussian Mixture Model (GMM), aiming for accurate and robust classification. We demonstrate on MNIST, CelebA, and SVHN that our method leads to a robust autoencoder resistant to attack and, if combined with GMM, a robust classifier able to perform image transition and immune to adversarial attack.
Index Terms—double backpropagation, autoencoder, network robustness, GMM.
1 INTRODUCTION
In the past few years, deep neural networks have been greatly developed and successfully used in a vast range of fields, such as pattern recognition, intelligent robots, automatic control, and medicine [1]. Despite the great success, researchers have found the vulnerability of deep neural networks to … feature [9], [10], [11], [12], [13], or network structure [3], [14], [15]. Adversarial attack and its defense revolve around a small ∆x and a big resulting difference between f(x + ∆x) and f(x).
  • Generative Models
Lecture 11: Generative Models
Fei-Fei Li, Justin Johnson & Serena Yeung, May 9, 2019
Administrative
● A3 is out. Due May 22.
● Milestone is due next Wednesday. Read the Piazza post for milestone requirements; data preprocessing and initial results need to be finished by then.
● Don't discuss the exam yet since people are still taking it.
Overview
● Unsupervised Learning
● Generative Models: PixelRNN and PixelCNN, Variational Autoencoders (VAE), Generative Adversarial Networks (GAN)
Supervised vs Unsupervised Learning
● Supervised Learning. Data: (x, y), where x is data and y is a label. Goal: learn a function to map x -> y. Examples: classification (e.g., "Cat"), object detection (e.g., "DOG, DOG, CAT"), semantic segmentation (e.g., "GRASS, CAT, TREE, SKY"), regression, image captioning, etc.
  • Infinite Variational Autoencoder for Semi-Supervised Learning
M. Ehsan Abbasnejad, Anthony Dick, Anton van den Hengel
The University of Adelaide
{ehsan.abbasnejad, anthony.dick, anton.vandenhengel}@adelaide.edu.au
Abstract: This paper presents an infinite variational autoencoder (VAE) whose capacity adapts to suit the input data. This is achieved using a mixture model where the mixing coefficients are modeled by a Dirichlet process, allowing us to integrate over the coefficients when performing inference. Critically, this then allows us to automatically vary the number of autoencoders in the mixture based on the data. Experiments show the flexibility of our method, particularly for semi-supervised learning, where only a small number of training samples are available.
1. Introduction
The Variational Autoencoder (VAE) [18] is a newly introduced tool for unsupervised learning of a distribution … with whatever labelled data is available to train a discriminative model for classification. We demonstrate that our infinite VAE outperforms both the classical VAE and standard classification methods, particularly when the number of available labelled samples is small. This is because the infinite VAE is able to more accurately capture the distribution of the unlabelled data. It therefore provides a generative model that allows the discriminative model, which is trained based on its output, to be more effectively learnt using a small number of samples. The main contribution of this paper is twofold: (1) we provide a Bayesian non-parametric model for combining autoencoders, in particular variational autoencoders; this bridges the gap between non-parametric Bayesian methods and the deep neural networks; (2) we provide a semi-supervised learning approach that utilizes the infinite mixture of autoencoders learned by our model for prediction from a small number of labeled examples.
  • Variational Autoencoder for End-To-End Control of Autonomous Driving with Novelty Detection and Training De-Biasing
Alexander Amini1, Wilko Schwarting1, Guy Rosman2, Brandon Araki1, Sertac Karaman3, Daniela Rus1
Abstract— This paper introduces a new method for end-to-end training of deep neural networks (DNNs) and evaluates it in the context of autonomous driving. DNN training has been shown to result in high accuracy for perception-to-action learning given sufficient training data. However, the trained models may fail without warning in situations with insufficient or biased training data. In this paper, we propose and evaluate a novel architecture for self-supervised learning of latent variables to detect the insufficiently trained situations. Our method also addresses training data imbalance, by learning a set of underlying latent variables that characterize the training data and evaluate potential biases. We show how these latent distributions can be leveraged to adapt and accelerate the training pipeline by training on only a fraction of the total dataset. We evaluate our approach on a challenging dataset for driving. The data is collected from a full-scale autonomous vehicle. Our method provides qualitative explanation for the latent variables learned in the model. Finally, we show how our model can be additionally trained as an end-to-end controller, directly outputting a steering control command for an autonomous vehicle.
[Fig. 1: Semi-supervised end-to-end control. An encoder … Panel labels: Uncertainty Estimation & Novelty Detection; Dataset Debiasing (Weather (Snow), Adjacent Vehicles, Road Surface); Resampled data distribution; Unsupervised Latent Variables; Sample Efficient Accelerated Training; Steering Control Command.]
  • An Introduction to Variational Autoencoders Full Text Available At
An Introduction to Variational Autoencoders
Diederik P. Kingma (Google, [email protected]) and Max Welling (University of Amsterdam and Qualcomm, [email protected])
Foundations and Trends® in Machine Learning. Published, sold and distributed by now Publishers Inc., Boston — Delft. Full text available at: http://dx.doi.org/10.1561/2200000056
The preferred citation for this publication is: D. P. Kingma and M. Welling. An Introduction to Variational Autoencoders. Foundations and Trends® in Machine Learning, vol. 12, no. 4, pp. 307–392, 2019. ISBN: 978-1-68083-623-3.
  • Exploring Semi-Supervised Variational Autoencoders for Biomedical Relation Extraction
    Exploring Semi-supervised Variational Autoencoders for Biomedical Relation Extraction Yijia Zhanga,b and Zhiyong Lua* a National Center for Biotechnology Information (NCBI), National Library of Medicine (NLM), National Institutes of Health (NIH), Bethesda, Maryland 20894, USA b School of Computer Science and Technology, Dalian University of Technology, Dalian, Liaoning 116023, China Corresponding author: Zhiyong Lu ([email protected]) Abstract The biomedical literature provides a rich source of knowledge such as protein-protein interactions (PPIs), drug-drug interactions (DDIs) and chemical-protein interactions (CPIs). Biomedical relation extraction aims to automatically extract biomedical relations from biomedical text for various biomedical research. State-of-the-art methods for biomedical relation extraction are primarily based on supervised machine learning and therefore depend on (sufficient) labeled data. However, creating large sets of training data is prohibitively expensive and labor-intensive, especially so in biomedicine as domain knowledge is required. In contrast, there is a large amount of unlabeled biomedical text available in PubMed. Hence, computational methods capable of employing unlabeled data to reduce the burden of manual annotation are of particular interest in biomedical relation extraction. We present a novel semi-supervised approach based on variational autoencoder (VAE) for biomedical relation extraction. Our model consists of the following three parts, a classifier, an encoder and a decoder. The classifier is implemented using multi-layer convolutional neural networks (CNNs), and the encoder and decoder are implemented using both bidirectional long short-term memory networks (Bi-LSTMs) and CNNs, respectively. The semi-supervised mechanism allows our model to learn features from both the labeled and unlabeled data.
  • Deep Variational Autoencoder: an Efficient Tool for PHM Frameworks
Ryad Zemouri ∗, Mélanie Lévesque †, Normand Amyot †, Claude Hudon † and Olivier Kokoko †
∗ Cedric-Lab, CNAM, HESAM université, Paris (France), [email protected]
† Institut de Recherche d'Hydro-Québec (IREQ), Varennes, QC, J3X 1S1, Canada
To cite this version: Ryad Zemouri, Mélanie Lévesque, Normand Amyot, Claude Hudon, Olivier Kokoko. Deep Variational Autoencoder: An Efficient Tool for PHM Frameworks. 2020 Prognostics and Health Management Conference (PHM-Besançon), May 2020, Besançon, France. pp. 235-240, 10.1109/PHM-Besancon49106.2020.00046. HAL Id: hal-02868384, https://hal-cnam.archives-ouvertes.fr/hal-02868384, submitted on 7 Jul 2020. (HAL is a multi-disciplinary open access archive for the deposit and dissemination of scientific research documents, whether they are published or not. The documents may come from teaching and research institutions in France or abroad, or from public or private research centers.)
Abstract—Deep learning (DL) has been recently used in several applications of machine health monitoring systems. Unfortunately, most of these DL models are considered as black-boxes with low interpretability. In this research, we propose an original PHM framework based on visual data analysis. The most suitable space dimension for the data visualization is the 2D-space, which necessarily involves a significant reduction from a high-dimensional to a low-dimensional data space.
[Figure: Variational Autoencoder, with Encoder and Decoder blocks.]
  • A Generative Model for Semi-Supervised Learning
    Iowa State University Capstones, Theses and Creative Components Dissertations Fall 2019 A Generative Model for Semi-Supervised Learning Shang Da Follow this and additional works at: https://lib.dr.iastate.edu/creativecomponents Part of the Artificial Intelligence and Robotics Commons Recommended Citation Da, Shang, "A Generative Model for Semi-Supervised Learning" (2019). Creative Components. 382. https://lib.dr.iastate.edu/creativecomponents/382 This Creative Component is brought to you for free and open access by the Iowa State University Capstones, Theses and Dissertations at Iowa State University Digital Repository. It has been accepted for inclusion in Creative Components by an authorized administrator of Iowa State University Digital Repository. For more information, please contact [email protected]. A Generative Model for Semi-Supervised Learning Shang Da A Creative Component submitted to the graduate faculty in partial fulfillment of the requirements for the degree of Master of Science in Computer Science Program of Study Committee: Dr. Jin Tian, Major Professor Dr. Jia Liu, Committee Member Dr. Wensheng Zhang, Committee Member The student author, whose presentation of the scholarship herein was approved by the program of study committee, is solely responsible for the content of this dissertation. The Graduate College will ensure this dissertation is globally accessible and will not permit alterations after a degree is conferred. Iowa State University Ames, Iowa 2019 Copyright © Cy Cardinal, 2019. All rights reserved. ii TABLE
  • Latent Property State Abstraction for Reinforcement Learning
John Burden (Centre for the Study of Existential Risk, University of Cambridge, [email protected]), Sajjad Kamali Siahroudi (L3S Research Center, Hanover, [email protected]), Daniel Kudenko (L3S Research Center, Hanover, [email protected])
ABSTRACT
Potential Based Reward Shaping has proven itself to be an effective method for improving the learning rate for Reinforcement Learning algorithms — especially when the potential function is derived from the solution to an Abstract Markov Decision Process (AMDP) encapsulating an abstraction of the desired task. The provenance of the AMDP is often a domain expert. In this paper we introduce a novel method for the full automation of creating and solving an AMDP to induce a potential function. We then show empirically that the potential function our method creates improves the sample efficiency of DQN in the domain in which we test our approach.
… the construction of RS functions. This would allow for reaping the benefits of reward shaping without the onus of manually encoding domain knowledge or abstract tasks. Our main contribution is a novel RL technique, called Latent Property State Abstraction (LPSA), that creates its own intrinsic reward function for use with reward shaping, which can speed up the learning process of Deep RL algorithms. In our work we demonstrate this speed-up on the widely popular DQN algorithm [13]. Our method works by learning its own abstract representation of environment states using an auto-encoder. Then it learns an abstract value function over the environment before utilising this to increase the learning rate on the ground level of the domain using reward shaping.
KEYWORDS