
CS7015 (Deep Learning) : Lecture 7
Autoencoders and relation to PCA, Regularization in autoencoders, Denoising autoencoders, Sparse autoencoders, Contractive autoencoders

Mitesh M. Khapra

Department of Computer Science and Engineering, Indian Institute of Technology Madras

Module 7.1: Introduction to Autoencoders

An autoencoder is a special type of feedforward neural network which does the following:

It encodes its input $x_i$ into a hidden representation $h$:
$$h = g(Wx_i + b)$$

It decodes the input again from this hidden representation:
$$\hat{x}_i = f(W^*h + c)$$

The model is trained to minimize a certain loss function which will ensure that $\hat{x}_i$ is close to $x_i$ (we will see some such loss functions soon).

[Figure: a network with input layer $x_i$, encoder weights $W$, hidden layer $h$, decoder weights $W^*$, and reconstruction $\hat{x}_i$.]
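As a concrete illustration of the two maps above, here is a minimal NumPy sketch of a single-hidden-layer autoencoder forward pass. The function names and dimensions are illustrative assumptions, not part of the lecture; $g$ is taken to be the sigmoid and $f$ the identity for now (both choices are discussed below).

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def autoencoder_forward(x, W, b, W_star, c):
    """Encode x into h, then decode h back into x_hat.

    h     = g(W x + b)      (encoder)
    x_hat = f(W* h + c)     (decoder; f = identity in this sketch)
    """
    h = sigmoid(W @ x + b)      # hidden representation
    x_hat = W_star @ h + c      # reconstruction
    return h, x_hat

# Illustrative shapes: n-dimensional input, k-dimensional hidden layer
n, k = 5, 3
rng = np.random.default_rng(0)
W, b = rng.normal(size=(k, n)), np.zeros(k)
W_star, c = rng.normal(size=(n, k)), np.zeros(n)

x = rng.normal(size=n)
h, x_hat = autoencoder_forward(x, W, b, W_star, c)
print(h.shape, x_hat.shape)     # (3,) (5,)
```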

Let us consider the case where $\dim(h) < \dim(x_i)$.

If we are still able to reconstruct $\hat{x}_i$ perfectly from $h$, then what does it say about $h$? It says that $h$ is a loss-free encoding of $x_i$: it captures all the important characteristics of $x_i$. Do you see an analogy with PCA?

An autoencoder where $\dim(h) < \dim(x_i)$ is called an undercomplete autoencoder.

Let us consider the case when $\dim(h) \geq \dim(x_i)$.

In such a case the autoencoder could learn a trivial encoding by simply copying $x_i$ into $h$ and then copying $h$ into $\hat{x}_i$. Such an identity encoding is useless in practice as it does not really tell us anything about the important characteristics of the data.

An autoencoder where $\dim(h) \geq \dim(x_i)$ is called an overcomplete autoencoder.

The Road Ahead:
Choice of $f(x_i)$ and $g(x_i)$
Choice of loss function

Suppose all our inputs are binary (each $x_{ij} \in \{0, 1\}$), e.g. $x_i = (0, 1, 1, 0, 1)$. Which of the following functions would be most apt for the decoder?

$$\hat{x}_i = \tanh(W^*h + c)$$
$$\hat{x}_i = W^*h + c$$
$$\hat{x}_i = \text{logistic}(W^*h + c)$$

The logistic function, as it naturally restricts all outputs to be between 0 and 1.

$g$ is typically chosen as the sigmoid function.

Now suppose all our inputs are real valued (each $x_{ij} \in \mathbb{R}$), e.g. $x_i = (0.25, 0.5, 1.25, 3.5, 4.5)$. Which of the same three functions would be most apt for the decoder?

What will logistic and tanh do? They will restrict the reconstructed $\hat{x}_i$ to lie in $[0, 1]$ or $[-1, 1]$, whereas we want $\hat{x}_i \in \mathbb{R}^n$. So the linear decoder $\hat{x}_i = W^*h + c$ is the apt choice here.

Again, $g$ is typically chosen as the sigmoid function.
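A small sketch (an illustrative assumption, not from the slides) contrasting the two decoder choices just discussed: a logistic decoder for binary inputs and a linear decoder for real-valued inputs.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def decode(h, W_star, c, output="real"):
    """Decoder x_hat = f(W* h + c) with f chosen to match the input type."""
    a = W_star @ h + c
    if output == "binary":
        return sigmoid(a)   # outputs in (0, 1), interpretable as probabilities
    return a                # linear decoder: outputs anywhere in R^n

rng = np.random.default_rng(0)
h = rng.normal(size=3)
W_star, c = rng.normal(size=(5, 3)), np.zeros(5)
print(decode(h, W_star, c, output="binary"))  # all entries in (0, 1)
print(decode(h, W_star, c, output="real"))    # unbounded real values
```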

The Road Ahead: Choice of loss function

Consider the case when the inputs are real valued.

The objective of the autoencoder is to reconstruct $\hat{x}_i$ to be as close to $x_i$ as possible. This can be formalized using the following objective function:

$$\min_{W, W^*, c, b}\ \frac{1}{m}\sum_{i=1}^{m}\sum_{j=1}^{n}(\hat{x}_{ij} - x_{ij})^2$$

i.e.,

$$\min_{W, W^*, c, b}\ \frac{1}{m}\sum_{i=1}^{m}(\hat{x}_i - x_i)^T(\hat{x}_i - x_i)$$

We can then train the autoencoder just like a regular feedforward network using backpropagation. All we need is a formula for $\frac{\partial \mathscr{L}(\theta)}{\partial W^*}$ and $\frac{\partial \mathscr{L}(\theta)}{\partial W}$, which we will see now.
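The squared-error objective above can be written directly in NumPy. The sketch below (illustrative, with assumed variable names) computes the loss over a batch of $m$ examples stored as rows of a matrix.

```python
import numpy as np

def squared_error_loss(X, X_hat):
    """(1/m) * sum_i sum_j (x_hat_ij - x_ij)^2 for an (m, n) batch."""
    m = X.shape[0]
    return np.sum((X_hat - X) ** 2) / m

rng = np.random.default_rng(0)
X = rng.normal(size=(4, 5))                   # m = 4 examples, n = 5 dimensions
X_hat = X + 0.1 * rng.normal(size=X.shape)    # stand-in reconstructions
print(squared_error_loss(X, X_hat))
```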

Note that the loss function is shown for only one training example:

$$\mathscr{L}(\theta) = (\hat{x}_i - x_i)^T(\hat{x}_i - x_i)$$

Here the network layers are $h_0 = x_i$, then $a_1$ and $h_1$ (computed with the encoder weights $W$), then $a_2$ and $h_2 = \hat{x}_i$ (computed with the decoder weights $W^*$). By the chain rule,

$$\frac{\partial \mathscr{L}(\theta)}{\partial W^*} = \frac{\partial \mathscr{L}(\theta)}{\partial h_2}\frac{\partial h_2}{\partial a_2}\frac{\partial a_2}{\partial W^*}$$

$$\frac{\partial \mathscr{L}(\theta)}{\partial W} = \frac{\partial \mathscr{L}(\theta)}{\partial h_2}\frac{\partial h_2}{\partial a_2}\frac{\partial a_2}{\partial h_1}\frac{\partial h_1}{\partial a_1}\frac{\partial a_1}{\partial W}$$

We have already seen how to calculate the expressions in the boxes (the remaining factors in these chain rules) when we learnt backpropagation. The new term is

$$\frac{\partial \mathscr{L}(\theta)}{\partial h_2} = \frac{\partial \mathscr{L}(\theta)}{\partial \hat{x}_i} = \nabla_{\hat{x}_i}\left\{(\hat{x}_i - x_i)^T(\hat{x}_i - x_i)\right\} = 2(\hat{x}_i - x_i)$$

Consider the case when the inputs are binary. We use a sigmoid decoder, which will produce outputs between 0 and 1, and these can be interpreted as probabilities.

For a single $n$-dimensional $i$-th input we can use the following (cross-entropy) loss function:

$$\min\left\{-\sum_{j=1}^{n}\left(x_{ij}\log\hat{x}_{ij} + (1 - x_{ij})\log(1 - \hat{x}_{ij})\right)\right\}$$

What value of $\hat{x}_{ij}$ will minimize this function? If $x_{ij} = 1$? If $x_{ij} = 0$? Indeed, the above function is minimized when $\hat{x}_{ij} = x_{ij}$!

Again, all we need is a formula for $\frac{\partial\mathscr{L}(\theta)}{\partial W^*}$ and $\frac{\partial\mathscr{L}(\theta)}{\partial W}$ to use backpropagation.
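A small numerical check of this loss (illustrative, with assumed names): the cross-entropy is lowest when each $\hat{x}_{ij}$ matches $x_{ij}$.

```python
import numpy as np

def binary_cross_entropy(x, x_hat, eps=1e-12):
    """-(x log x_hat + (1 - x) log(1 - x_hat)), summed over dimensions."""
    x_hat = np.clip(x_hat, eps, 1.0 - eps)   # avoid log(0)
    return -np.sum(x * np.log(x_hat) + (1.0 - x) * np.log(1.0 - x_hat))

x = np.array([0, 1, 1, 0, 1], dtype=float)
print(binary_cross_entropy(x, np.full(5, 0.5)))                       # mediocre
print(binary_cross_entropy(x, np.array([0.1, 0.9, 0.9, 0.1, 0.9])))   # better
print(binary_cross_entropy(x, x))                                     # (near) perfect: ~0
```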

For this loss (again shown for one training example),

$$\mathscr{L}(\theta) = -\sum_{j=1}^{n}\left(x_{ij}\log\hat{x}_{ij} + (1 - x_{ij})\log(1 - \hat{x}_{ij})\right)$$

the chain rule gives, as before,

$$\frac{\partial\mathscr{L}(\theta)}{\partial W^*} = \frac{\partial\mathscr{L}(\theta)}{\partial h_2}\frac{\partial h_2}{\partial a_2}\frac{\partial a_2}{\partial W^*}$$

$$\frac{\partial\mathscr{L}(\theta)}{\partial W} = \frac{\partial\mathscr{L}(\theta)}{\partial h_2}\frac{\partial h_2}{\partial a_2}\frac{\partial a_2}{\partial h_1}\frac{\partial h_1}{\partial a_1}\frac{\partial a_1}{\partial W}$$

We have already seen how to calculate the expressions in the square boxes (the remaining factors) when we learnt backpropagation. The first two terms on the RHS can be computed as:

$$\frac{\partial\mathscr{L}(\theta)}{\partial h_{2j}} = -\frac{x_{ij}}{\hat{x}_{ij}} + \frac{1 - x_{ij}}{1 - \hat{x}_{ij}}, \qquad \frac{\partial h_{2j}}{\partial a_{2j}} = \sigma(a_{2j})(1 - \sigma(a_{2j}))$$

where

$$\frac{\partial\mathscr{L}(\theta)}{\partial h_2} = \begin{bmatrix} \frac{\partial\mathscr{L}(\theta)}{\partial h_{21}} \\ \frac{\partial\mathscr{L}(\theta)}{\partial h_{22}} \\ \vdots \\ \frac{\partial\mathscr{L}(\theta)}{\partial h_{2n}} \end{bmatrix}$$

Module 7.2: Link between PCA and Autoencoders

We will now see that the encoder part of an autoencoder is equivalent to PCA if we
use a linear encoder,
use a linear decoder,
use the squared error loss function, and
normalize the inputs to
$$\hat{x}_{ij} = \frac{1}{\sqrt{m}}\left(x_{ij} - \frac{1}{m}\sum_{k=1}^{m}x_{kj}\right)$$

[Figure: the autoencoder $x_i \to h \to \hat{x}_i$ shown alongside the PCA picture with principal directions $u_1, u_2$; PCA finds a projection matrix $P$ such that $P^TX^TXP = D$.]

First let us consider the implication of normalizing the inputs to

$$\hat{x}_{ij} = \frac{1}{\sqrt{m}}\left(x_{ij} - \frac{1}{m}\sum_{k=1}^{m}x_{kj}\right)$$

The operation in the bracket ensures that the data now has zero mean along each dimension $j$ (we are subtracting the mean).

Let $X'$ be this zero-mean data matrix; then what the above normalization gives us is $X = \frac{1}{\sqrt{m}}X'$.

Now $X^TX = \frac{1}{m}(X')^TX'$ is the covariance matrix (recall that the covariance matrix plays an important role in PCA).
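A quick numerical check of this claim (illustrative, with assumed names): after the normalization above, $X^TX$ equals the covariance matrix of the raw data, using the $\frac{1}{m}$ convention.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 100, 4
X_raw = rng.normal(size=(m, n)) * np.array([1.0, 2.0, 0.5, 3.0])

# Normalize: subtract the per-dimension mean, then scale by 1/sqrt(m)
X = (X_raw - X_raw.mean(axis=0)) / np.sqrt(m)

X_centered = X_raw - X_raw.mean(axis=0)
cov = (X_centered.T @ X_centered) / m     # covariance with the 1/m convention

print(np.allclose(X.T @ X, cov))          # True
```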

First we will show that if we use a linear decoder and a squared error loss function, then the optimal solution to the following objective function

$$\frac{1}{m}\sum_{i=1}^{m}\sum_{j=1}^{n}(x_{ij} - \hat{x}_{ij})^2$$

is obtained when we use a linear encoder.

18/55 Mitesh M. Khapra CS7015 (Deep Learning) : Lecture 7 v u m n ∗ 2 uX X 2 min (kX − HW kF ) kAkF = t aij W ∗H i=1 j=1

(just writing the expression (1) in matrix form and using the definition of ||A||F ) (we are ignoring the biases)

This is equivalent to

From SVD we know that optimal solution to the above problem is given by

∗ T HW = U.,≤kΣk,kV.,≤k

By matching variables one possible solution is

H = U.,≤kΣk,k ∗ T W = V.,≤k

m n X X 2 min (xij − xˆij) (1) θ i=1 j=1

19/55 Mitesh M. Khapra CS7015 (Deep Learning) : Lecture 7 v u m n ∗ 2 uX X 2 min (kX − HW kF ) kAkF = t aij W ∗H i=1 j=1

(just writing the expression (1) in matrix form and using the definition of ||A||F ) (we are ignoring the biases) From SVD we know that optimal solution to the above problem is given by

∗ T HW = U.,≤kΣk,kV.,≤k

By matching variables one possible solution is

H = U.,≤kΣk,k ∗ T W = V.,≤k

m n X X 2 min (xij − xˆij) (1) θ i=1 j=1 This is equivalent to

19/55 Mitesh M. Khapra CS7015 (Deep Learning) : Lecture 7 v u m n uX X 2 kAkF = t aij i=1 j=1

(just writing the expression (1) in matrix form and using the definition of ||A||F ) (we are ignoring the biases) From SVD we know that optimal solution to the above problem is given by

∗ T HW = U.,≤kΣk,kV.,≤k

By matching variables one possible solution is

H = U.,≤kΣk,k ∗ T W = V.,≤k

m n X X 2 min (xij − xˆij) (1) θ i=1 j=1 This is equivalent to

∗ 2 min (kX − HW kF ) W ∗H

19/55 Mitesh M. Khapra CS7015 (Deep Learning) : Lecture 7 (just writing the expression (1) in matrix form and using the definition of ||A||F ) (we are ignoring the biases) From SVD we know that optimal solution to the above problem is given by

∗ T HW = U.,≤kΣk,kV.,≤k

By matching variables one possible solution is

H = U.,≤kΣk,k ∗ T W = V.,≤k

m n X X 2 min (xij − xˆij) (1) θ i=1 j=1 This is equivalent to v u m n ∗ 2 uX X 2 min (kX − HW kF ) kAkF = t aij W ∗H i=1 j=1

19/55 Mitesh M. Khapra CS7015 (Deep Learning) : Lecture 7 From SVD we know that optimal solution to the above problem is given by

∗ T HW = U.,≤kΣk,kV.,≤k

By matching variables one possible solution is

H = U.,≤kΣk,k ∗ T W = V.,≤k

m n X X 2 min (xij − xˆij) (1) θ i=1 j=1 This is equivalent to v u m n ∗ 2 uX X 2 min (kX − HW kF ) kAkF = t aij W ∗H i=1 j=1

(just writing the expression (1) in matrix form and using the definition of ||A||F ) (we are ignoring the biases)

19/55 Mitesh M. Khapra CS7015 (Deep Learning) : Lecture 7 By matching variables one possible solution is

H = U.,≤kΣk,k ∗ T W = V.,≤k

m n X X 2 min (xij − xˆij) (1) θ i=1 j=1 This is equivalent to v u m n ∗ 2 uX X 2 min (kX − HW kF ) kAkF = t aij W ∗H i=1 j=1

(just writing the expression (1) in matrix form and using the definition of ||A||F ) (we are ignoring the biases) From SVD we know that optimal solution to the above problem is given by

∗ T HW = U.,≤kΣk,kV.,≤k

19/55 Mitesh M. Khapra CS7015 (Deep Learning) : Lecture 7 m n X X 2 min (xij − xˆij) (1) θ i=1 j=1 This is equivalent to v u m n ∗ 2 uX X 2 min (kX − HW kF ) kAkF = t aij W ∗H i=1 j=1

(just writing the expression (1) in matrix form and using the definition of ||A||F ) (we are ignoring the biases) From SVD we know that optimal solution to the above problem is given by

∗ T HW = U.,≤kΣk,kV.,≤k

By matching variables one possible solution is

H = U.,≤kΣk,k ∗ T W = V.,≤k
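The sketch below (illustrative; it assumes numpy.linalg.svd and made-up data) forms the truncated-SVD factors $U_{.,\leq k}$, $\Sigma_{k,k}$, $V_{.,\leq k}$ and checks that $HW^*$ is the rank-$k$ reconstruction of $X$ given by the SVD.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n, k = 20, 6, 2
X = rng.normal(size=(m, n))

# Full SVD: X = U Sigma V^T
U, s, Vt = np.linalg.svd(X, full_matrices=False)

# Truncated factors (top-k singular values/vectors)
H = U[:, :k] * s[:k]          # H  = U_{.,<=k} Sigma_{k,k}
W_star = Vt[:k, :]            # W* = V_{.,<=k}^T

X_rank_k = H @ W_star         # rank-k approximation of X (optimal in Frobenius norm)
print(X_rank_k.shape)                      # (20, 6)
print(np.linalg.matrix_rank(X_rank_k))     # 2
print(np.linalg.norm(X - X_rank_k))        # reconstruction error
```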

19/55 Mitesh M. Khapra CS7015 (Deep Learning) : Lecture 7 H = U.,≤kΣk,k T T −1 T T −1 = (XX )(XX ) U.,≤K Σk,k (pre-multiplying (XX )(XX ) = I) T T T T T −1 T = (XV Σ U )(UΣV V Σ U ) U.,≤kΣk,k (using X = UΣV ) T T T T −1 T = XV Σ U (UΣΣ U ) U.,≤kΣk,k (V V = I) T T T −1 T −1 −1 −1 −1 = XV Σ U U(ΣΣ ) U U.,≤kΣk,k ((ABC) = C B A ) T T −1 T T = XV Σ (ΣΣ ) U U.,≤kΣk,k (U U = I) T T −1 −1 T −1 −1 −1 = XV Σ Σ Σ U U.,≤kΣk,k ((AB) = B A ) −1 T = XV Σ I.,≤kΣk,k (U U.,≤k = I.,≤k) −1 −1 = XVI.,≤k (Σ I.,≤k = Σk,k)

H = XV.,≤k

We will now show that H is a linear encoding and find an expression for the encoder weights W

20/55 Mitesh M. Khapra CS7015 (Deep Learning) : Lecture 7 T T −1 T T −1 = (XX )(XX ) U.,≤K Σk,k (pre-multiplying (XX )(XX ) = I) T T T T T −1 T = (XV Σ U )(UΣV V Σ U ) U.,≤kΣk,k (using X = UΣV ) T T T T −1 T = XV Σ U (UΣΣ U ) U.,≤kΣk,k (V V = I) T T T −1 T −1 −1 −1 −1 = XV Σ U U(ΣΣ ) U U.,≤kΣk,k ((ABC) = C B A ) T T −1 T T = XV Σ (ΣΣ ) U U.,≤kΣk,k (U U = I) T T −1 −1 T −1 −1 −1 = XV Σ Σ Σ U U.,≤kΣk,k ((AB) = B A ) −1 T = XV Σ I.,≤kΣk,k (U U.,≤k = I.,≤k) −1 −1 = XVI.,≤k (Σ I.,≤k = Σk,k)

H = XV.,≤k

We will now show that H is a linear encoding and find an expression for the encoder weights W

H = U.,≤kΣk,k

20/55 Mitesh M. Khapra CS7015 (Deep Learning) : Lecture 7 T T T T T −1 T = (XV Σ U )(UΣV V Σ U ) U.,≤kΣk,k (using X = UΣV ) T T T T −1 T = XV Σ U (UΣΣ U ) U.,≤kΣk,k (V V = I) T T T −1 T −1 −1 −1 −1 = XV Σ U U(ΣΣ ) U U.,≤kΣk,k ((ABC) = C B A ) T T −1 T T = XV Σ (ΣΣ ) U U.,≤kΣk,k (U U = I) T T −1 −1 T −1 −1 −1 = XV Σ Σ Σ U U.,≤kΣk,k ((AB) = B A ) −1 T = XV Σ I.,≤kΣk,k (U U.,≤k = I.,≤k) −1 −1 = XVI.,≤k (Σ I.,≤k = Σk,k)

H = XV.,≤k

We will now show that H is a linear encoding and find an expression for the encoder weights W

H = U.,≤kΣk,k T T −1 T T −1 = (XX )(XX ) U.,≤K Σk,k (pre-multiplying (XX )(XX ) = I)

20/55 Mitesh M. Khapra CS7015 (Deep Learning) : Lecture 7 T T T T −1 T = XV Σ U (UΣΣ U ) U.,≤kΣk,k (V V = I) T T T −1 T −1 −1 −1 −1 = XV Σ U U(ΣΣ ) U U.,≤kΣk,k ((ABC) = C B A ) T T −1 T T = XV Σ (ΣΣ ) U U.,≤kΣk,k (U U = I) T T −1 −1 T −1 −1 −1 = XV Σ Σ Σ U U.,≤kΣk,k ((AB) = B A ) −1 T = XV Σ I.,≤kΣk,k (U U.,≤k = I.,≤k) −1 −1 = XVI.,≤k (Σ I.,≤k = Σk,k)

H = XV.,≤k

We will now show that H is a linear encoding and find an expression for the encoder weights W

H = U.,≤kΣk,k T T −1 T T −1 = (XX )(XX ) U.,≤K Σk,k (pre-multiplying (XX )(XX ) = I) T T T T T −1 T = (XV Σ U )(UΣV V Σ U ) U.,≤kΣk,k (using X = UΣV )

20/55 Mitesh M. Khapra CS7015 (Deep Learning) : Lecture 7 T T T −1 T −1 −1 −1 −1 = XV Σ U U(ΣΣ ) U U.,≤kΣk,k ((ABC) = C B A ) T T −1 T T = XV Σ (ΣΣ ) U U.,≤kΣk,k (U U = I) T T −1 −1 T −1 −1 −1 = XV Σ Σ Σ U U.,≤kΣk,k ((AB) = B A ) −1 T = XV Σ I.,≤kΣk,k (U U.,≤k = I.,≤k) −1 −1 = XVI.,≤k (Σ I.,≤k = Σk,k)

H = XV.,≤k

We will now show that H is a linear encoding and find an expression for the encoder weights W

H = U.,≤kΣk,k T T −1 T T −1 = (XX )(XX ) U.,≤K Σk,k (pre-multiplying (XX )(XX ) = I) T T T T T −1 T = (XV Σ U )(UΣV V Σ U ) U.,≤kΣk,k (using X = UΣV ) T T T T −1 T = XV Σ U (UΣΣ U ) U.,≤kΣk,k (V V = I)

20/55 Mitesh M. Khapra CS7015 (Deep Learning) : Lecture 7 T T −1 T T = XV Σ (ΣΣ ) U U.,≤kΣk,k (U U = I) T T −1 −1 T −1 −1 −1 = XV Σ Σ Σ U U.,≤kΣk,k ((AB) = B A ) −1 T = XV Σ I.,≤kΣk,k (U U.,≤k = I.,≤k) −1 −1 = XVI.,≤k (Σ I.,≤k = Σk,k)

H = XV.,≤k

We will now show that H is a linear encoding and find an expression for the encoder weights W

H = U.,≤kΣk,k T T −1 T T −1 = (XX )(XX ) U.,≤K Σk,k (pre-multiplying (XX )(XX ) = I) T T T T T −1 T = (XV Σ U )(UΣV V Σ U ) U.,≤kΣk,k (using X = UΣV ) T T T T −1 T = XV Σ U (UΣΣ U ) U.,≤kΣk,k (V V = I) T T T −1 T −1 −1 −1 −1 = XV Σ U U(ΣΣ ) U U.,≤kΣk,k ((ABC) = C B A )

20/55 Mitesh M. Khapra CS7015 (Deep Learning) : Lecture 7 T T −1 −1 T −1 −1 −1 = XV Σ Σ Σ U U.,≤kΣk,k ((AB) = B A ) −1 T = XV Σ I.,≤kΣk,k (U U.,≤k = I.,≤k) −1 −1 = XVI.,≤k (Σ I.,≤k = Σk,k)

H = XV.,≤k

We will now show that H is a linear encoding and find an expression for the encoder weights W

H = U.,≤kΣk,k T T −1 T T −1 = (XX )(XX ) U.,≤K Σk,k (pre-multiplying (XX )(XX ) = I) T T T T T −1 T = (XV Σ U )(UΣV V Σ U ) U.,≤kΣk,k (using X = UΣV ) T T T T −1 T = XV Σ U (UΣΣ U ) U.,≤kΣk,k (V V = I) T T T −1 T −1 −1 −1 −1 = XV Σ U U(ΣΣ ) U U.,≤kΣk,k ((ABC) = C B A ) T T −1 T T = XV Σ (ΣΣ ) U U.,≤kΣk,k (U U = I)

20/55 Mitesh M. Khapra CS7015 (Deep Learning) : Lecture 7 −1 T = XV Σ I.,≤kΣk,k (U U.,≤k = I.,≤k) −1 −1 = XVI.,≤k (Σ I.,≤k = Σk,k)

H = XV.,≤k

We will now show that H is a linear encoding and find an expression for the encoder weights W

H = U.,≤kΣk,k T T −1 T T −1 = (XX )(XX ) U.,≤K Σk,k (pre-multiplying (XX )(XX ) = I) T T T T T −1 T = (XV Σ U )(UΣV V Σ U ) U.,≤kΣk,k (using X = UΣV ) T T T T −1 T = XV Σ U (UΣΣ U ) U.,≤kΣk,k (V V = I) T T T −1 T −1 −1 −1 −1 = XV Σ U U(ΣΣ ) U U.,≤kΣk,k ((ABC) = C B A ) T T −1 T T = XV Σ (ΣΣ ) U U.,≤kΣk,k (U U = I) T T −1 −1 T −1 −1 −1 = XV Σ Σ Σ U U.,≤kΣk,k ((AB) = B A )

20/55 Mitesh M. Khapra CS7015 (Deep Learning) : Lecture 7 −1 −1 = XVI.,≤k (Σ I.,≤k = Σk,k)

H = XV.,≤k

We will now show that H is a linear encoding and find an expression for the encoder weights W

H = U.,≤kΣk,k T T −1 T T −1 = (XX )(XX ) U.,≤K Σk,k (pre-multiplying (XX )(XX ) = I) T T T T T −1 T = (XV Σ U )(UΣV V Σ U ) U.,≤kΣk,k (using X = UΣV ) T T T T −1 T = XV Σ U (UΣΣ U ) U.,≤kΣk,k (V V = I) T T T −1 T −1 −1 −1 −1 = XV Σ U U(ΣΣ ) U U.,≤kΣk,k ((ABC) = C B A ) T T −1 T T = XV Σ (ΣΣ ) U U.,≤kΣk,k (U U = I) T T −1 −1 T −1 −1 −1 = XV Σ Σ Σ U U.,≤kΣk,k ((AB) = B A ) −1 T = XV Σ I.,≤kΣk,k (U U.,≤k = I.,≤k)

20/55 Mitesh M. Khapra CS7015 (Deep Learning) : Lecture 7 H = XV.,≤k

We will now show that H is a linear encoding and find an expression for the encoder weights W

H = U.,≤kΣk,k T T −1 T T −1 = (XX )(XX ) U.,≤K Σk,k (pre-multiplying (XX )(XX ) = I) T T T T T −1 T = (XV Σ U )(UΣV V Σ U ) U.,≤kΣk,k (using X = UΣV ) T T T T −1 T = XV Σ U (UΣΣ U ) U.,≤kΣk,k (V V = I) T T T −1 T −1 −1 −1 −1 = XV Σ U U(ΣΣ ) U U.,≤kΣk,k ((ABC) = C B A ) T T −1 T T = XV Σ (ΣΣ ) U U.,≤kΣk,k (U U = I) T T −1 −1 T −1 −1 −1 = XV Σ Σ Σ U U.,≤kΣk,k ((AB) = B A ) −1 T = XV Σ I.,≤kΣk,k (U U.,≤k = I.,≤k) −1 −1 = XVI.,≤k (Σ I.,≤k = Σk,k)

20/55 Mitesh M. Khapra CS7015 (Deep Learning) : Lecture 7 We will now show that H is a linear encoding and find an expression for the encoder weights W

H = U.,≤kΣk,k T T −1 T T −1 = (XX )(XX ) U.,≤K Σk,k (pre-multiplying (XX )(XX ) = I) T T T T T −1 T = (XV Σ U )(UΣV V Σ U ) U.,≤kΣk,k (using X = UΣV ) T T T T −1 T = XV Σ U (UΣΣ U ) U.,≤kΣk,k (V V = I) T T T −1 T −1 −1 −1 −1 = XV Σ U U(ΣΣ ) U U.,≤kΣk,k ((ABC) = C B A ) T T −1 T T = XV Σ (ΣΣ ) U U.,≤kΣk,k (U U = I) T T −1 −1 T −1 −1 −1 = XV Σ Σ Σ U U.,≤kΣk,k ((AB) = B A ) −1 T = XV Σ I.,≤kΣk,k (U U.,≤k = I.,≤k) −1 −1 = XVI.,≤k (Σ I.,≤k = Σk,k)

H = XV.,≤k

20/55 Mitesh M. Khapra CS7015 (Deep Learning) : Lecture 7 We will now show that H is a linear encoding and find an expression for the encoder weights W

H = U.,≤kΣk,k T T −1 T T −1 = (XX )(XX ) U.,≤K Σk,k (pre-multiplying (XX )(XX ) = I) T T T T T −1 T = (XV Σ U )(UΣV V Σ U ) U.,≤kΣk,k (using X = UΣV ) T T T T −1 T = XV Σ U (UΣΣ U ) U.,≤kΣk,k (V V = I) T T T −1 T −1 −1 −1 −1 = XV Σ U U(ΣΣ ) U U.,≤kΣk,k ((ABC) = C B A ) T T −1 T T = XV Σ (ΣΣ ) U U.,≤kΣk,k (U U = I) T T −1 −1 T −1 −1 −1 = XV Σ Σ Σ U U.,≤kΣk,k ((AB) = B A ) −1 T = XV Σ I.,≤kΣk,k (U U.,≤k = I.,≤k) −1 −1 = XVI.,≤k (Σ I.,≤k = Σk,k)

H = XV.,≤k

Thus H is a linear transformation of X and W = V.,≤k
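The following sketch (illustrative, with assumed names) checks both claims numerically: $U_{.,\leq k}\,\Sigma_{k,k}$ equals $XV_{.,\leq k}$, and $V$ (from the SVD of $X$) contains the eigenvectors of $X^TX$, which for data normalized as above is the covariance matrix used in PCA.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n, k = 200, 5, 2
X_raw = rng.normal(size=(m, n)) @ rng.normal(size=(n, n))   # correlated data

# Normalize as on the earlier slide: zero mean per dimension, scale by 1/sqrt(m)
X = (X_raw - X_raw.mean(axis=0)) / np.sqrt(m)

U, s, Vt = np.linalg.svd(X, full_matrices=False)
V = Vt.T

# Claim 1: H = U_{.,<=k} Sigma_{k,k} is the same as X V_{.,<=k}
H = U[:, :k] * s[:k]
print(np.allclose(H, X @ V[:, :k]))          # True

# Claim 2: columns of V are eigenvectors of X^T X (the covariance matrix here)
eigvals, eigvecs = np.linalg.eigh(X.T @ X)
top = eigvecs[:, ::-1][:, :k]                # top-k eigenvectors (PCA directions)
match = [np.allclose(np.abs(V[:, j]), np.abs(top[:, j])) for j in range(k)]
print(all(match))                            # True (up to sign)
```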

We have encoder $W = V_{.,\leq k}$.

From SVD, we know that $V$ is the matrix of eigenvectors of $X^TX$.

From PCA, we know that $P$ is the matrix of eigenvectors of the covariance matrix.

We saw earlier that, if the entries of $X$ are normalized by

$$\hat{x}_{ij} = \frac{1}{\sqrt{m}}\left(x_{ij} - \frac{1}{m}\sum_{k=1}^{m}x_{kj}\right)$$

then $X^TX$ is indeed the covariance matrix.

Thus, the encoder matrix for the linear autoencoder ($W$) and the projection matrix ($P$) for PCA could indeed be the same. Hence proved.

21/55 Mitesh M. Khapra CS7015 (Deep Learning) : Lecture 7

Remember
The encoder of a linear autoencoder is equivalent to PCA if we
use a linear encoder
use a linear decoder
use a squared error loss function
and normalize the inputs to

\hat{x}_{ij} = \frac{1}{\sqrt{m}} \left( x_{ij} - \frac{1}{m} \sum_{k=1}^{m} x_{kj} \right)
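As a sanity check, here is a minimal numpy sketch of this equivalence (not part of the lecture; the sizes m, n, k and variable names are illustrative). After the normalization above, the top-k right singular vectors of X coincide, up to sign, with the top-k eigenvectors of the covariance matrix X^T X, so the linear-AE code H = X V_{.,\le k} matches the PCA projection.

```python
import numpy as np

# Toy data: m examples, n features, k retained dimensions (illustrative sizes)
rng = np.random.default_rng(0)
m, n, k = 200, 10, 3
X_raw = rng.normal(size=(m, n)) @ rng.normal(size=(n, n))

# Normalize as on the slide: subtract column means, scale by 1/sqrt(m)
X = (X_raw - X_raw.mean(axis=0)) / np.sqrt(m)

# SVD of the normalized data: X = U Sigma V^T
_, _, Vt = np.linalg.svd(X, full_matrices=False)
V_k = Vt[:k].T                                   # V_{., <= k}

# PCA: eigenvectors of the covariance matrix X^T X, largest eigenvalues first
_, eigvecs = np.linalg.eigh(X.T @ X)
P_k = eigvecs[:, ::-1][:, :k]

# The columns agree up to a sign flip, so compare absolute cosines and scores
print(np.abs(np.sum(V_k * P_k, axis=0)))                    # ~[1. 1. 1.]
print(np.max(np.abs(np.abs(X @ V_k) - np.abs(X @ P_k))))    # ~0
```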

22/55 Mitesh M. Khapra CS7015 (Deep Learning) : Lecture 7

Module 7.3: Regularization in autoencoders (Motivation)

23/55 Mitesh M. Khapra CS7015 (Deep Learning) : Lecture 7

Figure: autoencoder x_i → h → x̂_i (weights W, W^*)

While poor generalization could happen even in undercomplete autoencoders, it is an even more serious problem for overcomplete autoencoders
Here, (as stated earlier) the model can simply learn to copy x_i to h and then h to x̂_i
To avoid poor generalization, we need to introduce regularization

24/55 Mitesh M. Khapra CS7015 (Deep Learning) : Lecture 7

The simplest solution is to add an L2-regularization term to the objective function

\min_{\theta = \{W, W^*, b, c\}} \; \frac{1}{m} \sum_{i=1}^{m} \sum_{j=1}^{n} (\hat{x}_{ij} - x_{ij})^2 + \lambda \|\theta\|^2

This is very easy to implement and just adds a term \lambda W to the gradient \partial\mathscr{L}(\theta)/\partial W (and similarly for the other parameters)
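As a rough illustration (a minimal numpy sketch with assumed shapes and a made-up helper name, not the lecture's code), the regularized objective looks like this; the only change to the gradients is the extra term proportional to λW.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def l2_regularized_loss(X, W, W_star, b, c, lam):
    """Squared-error reconstruction loss plus an L2 penalty on the weights.

    X: (m, n) inputs, W: (k, n) encoder, W_star: (n, k) decoder (assumed shapes).
    """
    m = X.shape[0]
    H = sigmoid(X @ W.T + b)             # h = g(W x_i + b), one row per example
    X_hat = H @ W_star.T + c             # x_hat_i = f(W* h + c), with f = identity here
    recon = np.sum((X_hat - X) ** 2) / m
    return recon + lam * (np.sum(W ** 2) + np.sum(W_star ** 2))

# Gradient of the penalty alone: d(lam * ||W||^2)/dW = 2 * lam * W,
# i.e. the extra lambda*W term on the slide (the factor 2 is usually folded into lambda).
```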

25/55 Mitesh M. Khapra CS7015 (Deep Learning) : Lecture 7

Another trick is to tie the weights of the encoder and decoder, i.e., W^* = W^T
This effectively reduces the capacity of the autoencoder and acts as a regularizer
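A minimal sketch of weight tying (illustrative only; the shapes and the activation choice are assumptions): the decoder reuses the transpose of the encoder matrix, so only one weight matrix is stored and learned.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def tied_forward(x, W, b, c):
    """Forward pass of a tied-weight autoencoder.

    W: (k, n) encoder matrix. The decoder uses W.T, so W* = W^T is never a
    separate parameter and the model has roughly half the weights to learn.
    """
    h = sigmoid(W @ x + b)     # h = g(W x + b)
    x_hat = W.T @ h + c        # x_hat = f(W^T h + c), with f = identity
    return h, x_hat
```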

26/55 Mitesh M. Khapra CS7015 (Deep Learning) : Lecture 7

Module 7.4: Denoising Autoencoders

27/55 Mitesh M. Khapra CS7015 (Deep Learning) : Lecture 7

Figure: denoising autoencoder x_i → x̃_i (corruption P(x̃_ij | x_ij)) → h → x̂_i

A denoising autoencoder simply corrupts the input data using a probabilistic process P(x̃_ij | x_ij) before feeding it to the network
A simple P(x̃_ij | x_ij) used in practice is the following

P(x̃_ij = 0 | x_ij) = q
P(x̃_ij = x_ij | x_ij) = 1 - q

In other words, with probability q the input is flipped to 0 and with probability (1 - q) it is retained as it is
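A minimal sketch of this masking corruption (illustrative numpy; the function name is made up):

```python
import numpy as np

def corrupt_mask(x, q, rng):
    """Masking noise: each entry x_ij is set to 0 with probability q, kept otherwise."""
    keep = rng.random(x.shape) >= q              # True with probability (1 - q)
    return x * keep

rng = np.random.default_rng(0)
x = rng.random((4, 8))
x_tilde = corrupt_mask(x, q=0.25, rng=rng)       # roughly 25% of the entries zeroed
```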

28/55 Mitesh M. Khapra CS7015 (Deep Learning) : Lecture 7

How does this help ?
This helps because the objective is still to reconstruct the original (uncorrupted) x_i

\arg\min_{\theta} \frac{1}{m} \sum_{i=1}^{m} \sum_{j=1}^{n} (\hat{x}_{ij} - x_{ij})^2

It no longer makes sense for the model to copy the corrupted x̃_i into h(x̃_i) and then into x̂_i (the objective function will not be minimized by doing so)
Instead the model will now have to capture the characteristics of the data correctly.
For example, it will have to learn to reconstruct a corrupted x_ij correctly by relying on its interactions with other elements of x_i

29/55 Mitesh M. Khapra CS7015 (Deep Learning) : Lecture 7

We will now see a practical application in which AEs are used and then compare Denoising Autoencoders with regular autoencoders

30/55 Mitesh M. Khapra CS7015 (Deep Learning) : Lecture 7

Task: Hand-written digit recognition
|x_i| = 784 = 28 × 28
Figure: MNIST Data (28 × 28 images)
Figure: Basic approach (we use raw data as input features; the outputs are the 10 digit classes 0 to 9)

31/55 Mitesh M. Khapra CS7015 (Deep Learning) : Lecture 7

x̂_i ∈ R^{784}, h ∈ R^d
Figure: AE approach (first learn important characteristics of data)

32/55 Mitesh M. Khapra CS7015 (Deep Learning) : Lecture 7

Figure: AE approach (and then train a classifier on top of this hidden representation)

33/55 Mitesh M. Khapra CS7015 (Deep Learning) : Lecture 7

We will now see a way of visualizing AEs and use this visualization to compare different AEs

34/55 Mitesh M. Khapra CS7015 (Deep Learning) : Lecture 7

We can think of each neuron as a filter which will fire (or get maximally activated) for a certain input configuration x_i
For example,

h_1 = \sigma(W_1^T x_i)   [ignoring bias b]

where W_1 is the trained vector of weights connecting the input to the first hidden neuron
What values of x_i will cause h_1 to be maximum (or maximally activated)?
Suppose we assume that our inputs are normalized so that \|x_i\| = 1

\max_{x_i} \; W_1^T x_i \quad \text{s.t.} \quad \|x_i\|^2 = x_i^T x_i = 1

Solution: x_i = \frac{W_1}{\sqrt{W_1^T W_1}}

35/55 Mitesh M. Khapra CS7015 (Deep Learning) : Lecture 7

Thus the inputs

x_i = \frac{W_1}{\sqrt{W_1^T W_1}}, \frac{W_2}{\sqrt{W_2^T W_2}}, \ldots, \frac{W_n}{\sqrt{W_n^T W_n}}

will respectively cause hidden neurons 1 to n to maximally fire
Let us plot these images (x_i's) which maximally activate the first k neurons of the hidden representations learned by a vanilla autoencoder and different denoising autoencoders
These x_i's are computed by the above formula using the weights (W_1, W_2, ..., W_k) learned by the respective autoencoders
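A rough sketch of this visualization step (illustrative; it assumes the learned encoder weights are available as a (k, 784) array with one row per hidden neuron):

```python
import numpy as np

def maximally_activating_inputs(W):
    """For each hidden neuron l, return the unit-norm input W_l / sqrt(W_l^T W_l)
    that maximizes W_l^T x subject to ||x|| = 1."""
    norms = np.sqrt(np.sum(W ** 2, axis=1, keepdims=True))
    return W / norms                      # shape (k, 784), one "filter" per row

# To inspect neuron 0 of a trained MNIST autoencoder (assuming matplotlib):
# filters = maximally_activating_inputs(W_learned)
# plt.imshow(filters[0].reshape(28, 28), cmap="gray"); plt.show()
```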

36/55 Mitesh M. Khapra CS7015 (Deep Learning) : Lecture 7

Figure: Vanilla AE (No noise)   Figure: 25% Denoising AE (q=0.25)   Figure: 50% Denoising AE (q=0.5)

The vanilla AE does not learn many meaningful patterns
The hidden neurons of the denoising AEs seem to act like pen-stroke detectors (for example, in the highlighted neuron the black region is a stroke that you would expect in a '0' or a '2' or a '3' or a '8' or a '9')
As the noise increases, the filters become wider because the neuron has to rely on more adjacent pixels to feel confident about a stroke

37/55 Mitesh M. Khapra CS7015 (Deep Learning) : Lecture 7

Figure: denoising autoencoder x_i → x̃_i (corruption P(x̃_ij | x_ij)) → h → x̂_i

We saw one form of P(x̃_ij | x_ij) which flips a fraction q of the inputs to zero
Another way of corrupting the inputs is to add Gaussian noise to the input

x̃_ij = x_ij + N(0, 1)

We will now use such a denoising AE on a different dataset and see its performance
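The additive-Gaussian corruption is a one-liner; a minimal sketch (illustrative shapes):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.random((4, 8))
x_tilde = x + rng.normal(loc=0.0, scale=1.0, size=x.shape)   # x_ij + N(0, 1)
```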

38/55 Mitesh M. Khapra CS7015 (Deep Learning) : Lecture 7

Figure: Weight decay filters   Figure: AE filters   Figure: Data

The hidden neurons essentially behave like edge detectors
PCA does not give such edge detectors

39/55 Mitesh M. Khapra CS7015 (Deep Learning) : Lecture 7

Module 7.5: Sparse Autoencoders

40/55 Mitesh M. Khapra CS7015 (Deep Learning) : Lecture 7

A hidden neuron with sigmoid activation will have values between 0 and 1
We say that the neuron is activated when its output is close to 1 and not activated when its output is close to 0.
A sparse autoencoder tries to ensure that the neuron is inactive most of the time.

41/55 Mitesh M. Khapra CS7015 (Deep Learning) : Lecture 7

The average value of the activation of a neuron l is given by

\hat{\rho}_l = \frac{1}{m} \sum_{i=1}^{m} h(x_i)_l

If the neuron l is sparse (i.e. mostly inactive) then \hat{\rho}_l \to 0
A sparse autoencoder uses a sparsity parameter \rho (typically very close to 0, say, 0.005) and tries to enforce the constraint \hat{\rho}_l = \rho
One way of ensuring this is to add the following term to the objective function

\Omega(\theta) = \sum_{l=1}^{k} \rho \log\frac{\rho}{\hat{\rho}_l} + (1-\rho) \log\frac{1-\rho}{1-\hat{\rho}_l}

When will this term reach its minimum value and what is the minimum value? Let us plot it and check.
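As a sketch (illustrative numpy; names assumed), the penalty is just a sum of KL divergences between Bernoulli(ρ) and Bernoulli(ρ̂_l), computed from a batch of hidden activations:

```python
import numpy as np

def sparsity_penalty(H, rho):
    """Omega(theta) for hidden activations H of shape (m, k).

    rho_hat_l is the mean activation of neuron l over the batch; the penalty is
    sum_l KL( Bernoulli(rho) || Bernoulli(rho_hat_l) ), which is 0 iff rho_hat_l = rho.
    """
    rho_hat = H.mean(axis=0)                      # shape (k,)
    return np.sum(rho * np.log(rho / rho_hat)
                  + (1 - rho) * np.log((1 - rho) / (1 - rho_hat)))
```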

42/55 Mitesh M. Khapra CS7015 (Deep Learning) : Lecture 7

Figure: plot of \Omega(\theta) versus \hat{\rho}_l for \rho = 0.2 (the curve attains its minimum at \hat{\rho}_l = 0.2)

The function will reach its minimum value(s) when \hat{\rho}_l = \rho.

43/55 Mitesh M. Khapra CS7015 (Deep Learning) : Lecture 7

Now,

\hat{\mathscr{L}}(\theta) = \mathscr{L}(\theta) + \Omega(\theta)

\mathscr{L}(\theta) is the squared error loss or cross entropy loss and \Omega(\theta) is the sparsity constraint.
We already know how to calculate \frac{\partial \mathscr{L}(\theta)}{\partial W}
Let us see how to calculate \frac{\partial \Omega(\theta)}{\partial W}.

\Omega(\theta) = \sum_{l=1}^{k} \rho \log\frac{\rho}{\hat{\rho}_l} + (1-\rho) \log\frac{1-\rho}{1-\hat{\rho}_l}

Can be re-written as

\Omega(\theta) = \sum_{l=1}^{k} \rho\log\rho - \rho\log\hat{\rho}_l + (1-\rho)\log(1-\rho) - (1-\rho)\log(1-\hat{\rho}_l)

By Chain rule:

\frac{\partial \Omega(\theta)}{\partial W} = \frac{\partial \Omega(\theta)}{\partial \hat{\rho}} \cdot \frac{\partial \hat{\rho}}{\partial W}

\frac{\partial \Omega(\theta)}{\partial \hat{\rho}} = \left[ \frac{\partial \Omega(\theta)}{\partial \hat{\rho}_1}, \frac{\partial \Omega(\theta)}{\partial \hat{\rho}_2}, \ldots, \frac{\partial \Omega(\theta)}{\partial \hat{\rho}_k} \right]^T

For each neuron l \in 1 \ldots k in the hidden layer, we have

\frac{\partial \Omega(\theta)}{\partial \hat{\rho}_l} = -\frac{\rho}{\hat{\rho}_l} + \frac{1-\rho}{1-\hat{\rho}_l}

and \frac{\partial \hat{\rho}_l}{\partial W} = x_i (g'(W^T x_i + b))^T (see next slide)

Finally,

\frac{\partial \hat{\mathscr{L}}(\theta)}{\partial W} = \frac{\partial \mathscr{L}(\theta)}{\partial W} + \frac{\partial \Omega(\theta)}{\partial W}

(and we know how to calculate both terms on the R.H.S.)

44/55 Mitesh M. Khapra CS7015 (Deep Learning) : Lecture 7
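A minimal numpy sketch of this gradient (illustrative; the shapes, names, and the sigmoid choice of g are assumptions, using the slide's convention h = g(W^T x + b) with W of shape (n, k)), together with a finite-difference check:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def omega(W, X, b, rho):
    """Sparsity penalty Omega(theta) for the encoder h = g(W^T x + b), g = sigmoid."""
    rho_hat = sigmoid(X @ W + b).mean(axis=0)          # shape (k,)
    return np.sum(rho * np.log(rho / rho_hat)
                  + (1 - rho) * np.log((1 - rho) / (1 - rho_hat)))

def omega_grad(W, X, b, rho):
    """dOmega/dW via the chain rule on the slide:
    dOmega/drho_hat_l = -rho/rho_hat_l + (1-rho)/(1-rho_hat_l)
    drho_hat_l/dW_jl  = (1/m) sum_i g'(W_{:,l}^T x_i + b_l) x_ij."""
    m = X.shape[0]
    H = sigmoid(X @ W + b)                             # (m, k)
    rho_hat = H.mean(axis=0)
    d = -rho / rho_hat + (1 - rho) / (1 - rho_hat)     # (k,)
    return (X.T @ (H * (1 - H) * d)) / m               # (n, k)

# Finite-difference check on a tiny random problem (illustrative sizes)
rng = np.random.default_rng(0)
m, n, k, rho = 50, 6, 4, 0.05
X, W, b = rng.normal(size=(m, n)), 0.1 * rng.normal(size=(n, k)), np.zeros(k)
G = omega_grad(W, X, b, rho)
eps = 1e-6
W2 = W.copy(); W2[2, 1] += eps
print(G[2, 1], (omega(W2, X, b, rho) - omega(W, X, b, rho)) / eps)  # should agree
```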

44/55 Mitesh M. Khapra CS7015 (Deep Learning) : Lecture 7 Derivation ∂ρˆ =  ∂ρˆ1 ∂ρˆ2 ... ∂ρˆk  ∂W ∂W ∂W ∂W

∂ρˆl For each element in the above equation we can calculate ∂W (which is the partial derivative of a scalar w.r.t. a matrix = matrix). For a single element of a matrix Wjl:-

h 1 Pm T i ∂ρˆ ∂ m i=1 g W:,lxi + bl l = ∂Wjl ∂Wjl h i m ∂ gW T x + b  1 X :,l i l = m ∂W i=1 jl m 1 X = g0W T x + b x m :,l i l ij i=1 So in matrix notation we can write it as : ∂ρˆ l = x (g0(W T x + b))T ∂W i i 45/55 Mitesh M. Khapra CS7015 (Deep Learning) : Lecture 7 Module 7.6: Contractive Autoencoders

46/55 Mitesh M. Khapra CS7015 (Deep Learning) : Lecture 7

A contractive autoencoder also tries to prevent an overcomplete autoencoder from learning the identity function.
It does so by adding the following regularization term to the loss function

\Omega(\theta) = \|J_x(h)\|_F^2

where J_x(h) is the Jacobian of the encoder.
Let us see what it looks like.

47/55 Mitesh M. Khapra CS7015 (Deep Learning) : Lecture 7

If the input has n dimensions and the hidden layer has k dimensions then

J_x(h) = \begin{bmatrix} \frac{\partial h_1}{\partial x_1} & \cdots & \frac{\partial h_1}{\partial x_n} \\ \frac{\partial h_2}{\partial x_1} & \cdots & \frac{\partial h_2}{\partial x_n} \\ \vdots & \ddots & \vdots \\ \frac{\partial h_k}{\partial x_1} & \cdots & \frac{\partial h_k}{\partial x_n} \end{bmatrix}

In other words, the (l, j) entry of the Jacobian captures the variation in the output of the l-th neuron with a small variation in the j-th input.

\|J_x(h)\|_F^2 = \sum_{j=1}^{n} \sum_{l=1}^{k} \left( \frac{\partial h_l}{\partial x_j} \right)^2
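For a sigmoid encoder h = σ(Wx + b) the Jacobian has the closed form J_lj = h_l(1 - h_l) W_lj, which makes the penalty cheap to compute. Here is a minimal sketch (shapes and names assumed), with a numerically estimated Jacobian as a sanity check:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def contractive_penalty(x, W, b):
    """||J_x(h)||_F^2 for h = sigmoid(W x + b), using J_lj = h_l (1 - h_l) W_lj."""
    h = sigmoid(W @ x + b)                                   # shape (k,)
    return np.sum((h * (1 - h)) ** 2 * np.sum(W ** 2, axis=1))

# Compare against a finite-difference Jacobian
rng = np.random.default_rng(0)
n, k = 5, 3
x, W, b = rng.normal(size=n), rng.normal(size=(k, n)), rng.normal(size=k)
eps = 1e-6
J_num = np.zeros((k, n))
for j in range(n):
    x_eps = x.copy(); x_eps[j] += eps
    J_num[:, j] = (sigmoid(W @ x_eps + b) - sigmoid(W @ x + b)) / eps
print(contractive_penalty(x, W, b), np.sum(J_num ** 2))      # should be close
```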

48/55 Mitesh M. Khapra CS7015 (Deep Learning) : Lecture 7

What is the intuition behind this ?

\|J_x(h)\|_F^2 = \sum_{j=1}^{n} \sum_{l=1}^{k} \left( \frac{\partial h_l}{\partial x_j} \right)^2

Consider \frac{\partial h_1}{\partial x_1}: what does it mean if \frac{\partial h_1}{\partial x_1} = 0?
It means that this neuron is not very sensitive to variations in the input x_1.
But doesn't this contradict our other goal of minimizing \mathscr{L}(\theta), which requires h to capture variations in the input?

49/55 Mitesh M. Khapra CS7015 (Deep Learning) : Lecture 7

Indeed it does, and that's the idea
By putting these two contradicting objectives against each other, we ensure that h is sensitive to only very important variations as observed in the training data.
\mathscr{L}(\theta) - capture important variations in data
\Omega(\theta) - do not capture variations in data
Tradeoff - capture only very important variations in the data

50/55 Mitesh M. Khapra CS7015 (Deep Learning) : Lecture 7 Let us try to understand this with the help of an illustration.

51/55 Mitesh M. Khapra CS7015 (Deep Learning) : Lecture 7 y

Figure: data points in the x-y plane with directions u_1 (the direction of large variation) and u_2 (the direction of small variation)

Consider the variations in the data along directions u_1 and u_2
It makes sense to maximize a neuron to be sensitive to variations along u_1
At the same time it makes sense to inhibit a neuron from being sensitive to variations along u_2 (as it seems to be small noise and unimportant for reconstruction)
By doing so we can balance between the contradicting goals of good reconstruction and low sensitivity.
What does this remind you of ?

52/55 Mitesh M. Khapra CS7015 (Deep Learning) : Lecture 7

Module 7.7 : Summary

53/55 Mitesh M. Khapra CS7015 (Deep Learning) : Lecture 7 ˆx y PCA

Figure: linear autoencoder x → h → x̂ shown alongside the PCA directions u_1, u_2 (the code h corresponds to the projection onto u_1, u_2)

PCA: P^T X^T X P = D
Linear autoencoder: \min_{\theta} \|X - HW^*\|_F^2, with X factorized as U\Sigma V^T (SVD)

54/55 Mitesh M. Khapra CS7015 (Deep Learning) : Lecture 7 Regularization

Figure: denoising autoencoder x_i → x̃_i (corruption P(x̃_ij | x_ij)) → h → x̂_i

\Omega(\theta) = \lambda \|\theta\|^2   (weight decay)

\Omega(\theta) = \sum_{l=1}^{k} \rho \log\frac{\rho}{\hat{\rho}_l} + (1-\rho) \log\frac{1-\rho}{1-\hat{\rho}_l}   (sparse)

\Omega(\theta) = \sum_{j=1}^{n} \sum_{l=1}^{k} \left( \frac{\partial h_l}{\partial x_j} \right)^2   (contractive)

55/55 Mitesh M. Khapra CS7015 (Deep Learning) : Lecture 7