
CS7015 (Deep Learning): Lecture 7
Autoencoders and relation to PCA, Regularization in autoencoders, Denoising autoencoders, Sparse autoencoders, Contractive autoencoders
Mitesh M. Khapra
Department of Computer Science and Engineering, Indian Institute of Technology Madras

Module 7.1: Introduction to Autoencoders

An autoencoder is a special type of feedforward neural network which does the following:

It encodes its input $x_i$ into a hidden representation $h$:
$$h = g(W x_i + b)$$

It then decodes the input again from this hidden representation:
$$\hat{x}_i = f(W^* h + c)$$

The model is trained to minimize a certain loss function which will ensure that $\hat{x}_i$ is close to $x_i$ (we will see some such loss functions soon).
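To make the architecture concrete, here is a minimal sketch in PyTorch. The input dimension of 784 (e.g., flattened 28x28 images) and hidden dimension of 32 are illustrative assumptions rather than values from the lecture, and sigmoid is just one common choice for $g$ and $f$:

```python
import torch
import torch.nn as nn

class Autoencoder(nn.Module):
    def __init__(self, input_dim=784, hidden_dim=32):
        super().__init__()
        self.encoder = nn.Linear(input_dim, hidden_dim)  # parameters W, b
        self.decoder = nn.Linear(hidden_dim, input_dim)  # parameters W*, c

    def forward(self, x):
        h = torch.sigmoid(self.encoder(x))       # h = g(W x + b)
        x_hat = torch.sigmoid(self.decoder(h))   # x_hat = f(W* h + c)
        return x_hat, h
```

Training would then minimize a reconstruction loss, such as the squared error between x_hat and x.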
Let us consider the case where $\dim(h) < \dim(x_i)$. An autoencoder where $\dim(h) < \dim(x_i)$ is called an undercomplete autoencoder.

If we are still able to reconstruct $\hat{x}_i$ perfectly from $h$, then what does it say about $h$? $h$ is a loss-free encoding of $x_i$: it captures all the important characteristics of $x_i$. Do you see an analogy with PCA? (A small numerical check of this analogy is sketched below.)
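The analogy can be made precise: with linear $g$ and $f$ and a squared-error loss, the undercomplete autoencoder learns to project onto the same subspace that PCA finds. The sketch below is an assumption-laden numerical check of this claim on toy data, not part of the original slides; the dimensions and training settings are arbitrary choices:

```python
import torch
import numpy as np

torch.manual_seed(0)
X = torch.randn(500, 10) @ torch.randn(10, 10)  # toy correlated data
X = X - X.mean(dim=0)                           # centre the data, as PCA does

enc = torch.nn.Linear(10, 3, bias=False)        # dim(h) = 3 < dim(x) = 10
dec = torch.nn.Linear(3, 10, bias=False)
opt = torch.optim.Adam(list(enc.parameters()) + list(dec.parameters()), lr=1e-2)
for _ in range(2000):
    opt.zero_grad()
    loss = ((dec(enc(X)) - X) ** 2).mean()      # squared-error reconstruction loss
    loss.backward()
    opt.step()

# Top-3 principal directions from an SVD of the centred data
_, _, Vt = np.linalg.svd(X.numpy(), full_matrices=False)
pcs = Vt[:3]                                    # shape (3, 10)

# Fraction of the decoder's column space lying in the principal subspace;
# a value close to 1 indicates the two subspaces coincide
W_dec = dec.weight.detach().numpy()             # shape (10, 3)
proj = pcs.T @ (pcs @ W_dec)
print(np.linalg.norm(proj) / np.linalg.norm(W_dec))
```

Note that the learned weights need not equal the principal components themselves; the linear autoencoder is only expected to recover the subspace they span.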
Let us consider the case when $\dim(h) \geq \dim(x_i)$. An autoencoder where $\dim(h) \geq \dim(x_i)$ is called an overcomplete autoencoder.

In such a case the autoencoder could learn a trivial encoding by simply copying $x_i$ into $h$ and then copying $h$ into $\hat{x}_i$. Such an identity encoding is useless in practice as it does not really tell us anything about the important characteristics of the data.
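To see how easy the trivial solution is, the hedged sketch below constructs such copying weights by hand, assuming linear $g$ and $f$ (an illustrative assumption; the slide does not fix the activations). The encoder copies $x$ into the first $\dim(x)$ coordinates of $h$, and the decoder copies them back, giving perfect reconstruction without learning anything about the data:

```python
import torch

n, k = 4, 6                        # dim(x) = 4, dim(h) = 6, so dim(h) >= dim(x)
W = torch.zeros(k, n)
W[:n, :n] = torch.eye(n)           # h = W x copies x into the first n coordinates
W_star = torch.zeros(n, k)
W_star[:n, :n] = torch.eye(n)      # x_hat = W* h copies those coordinates back

x = torch.randn(n)
h = W @ x                          # identity encoding, zero-padded to length k
x_hat = W_star @ h
print(torch.allclose(x, x_hat))    # True: perfect reconstruction, useless encoding
```

This is why overcomplete autoencoders need some form of regularization; the denoising, sparse, and contractive autoencoders named in the lecture title address exactly this.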