
CS273B lecture 5: RNN and autoencoder
James Zou
October 10, 2016

Recap: feedforward and convnets
Main take-aways:
• Composition. Units/layers of a NN are modular and can be composed to form complex architectures.
• Weight-sharing. Enforcing that the weights be equal across a set of units can dramatically decrease the number of parameters.

What are the limitations of convnets?
• Fixed input length.
• Unclear how to adapt to time-series data.
• Convolution corresponds to a strong prior that is not appropriate for many biological settings.
• Could require many labeled training examples (high sample complexity).

Recurrent neural network
• Hidden units: h_t = f(W x_t + U h_{t-1} + b)
• Output: y_t = V h_t + c
[Figure: input x_t feeds into hidden units h_t, which feed back into themselves and produce output y_t.]

What does an RNN remind you of?
[Figure: RNN unrolled over time steps t-1, t, t+1.]

Vanilla RNN: lacks long-term memory
[Figure: unrolled RNN over time steps t-1, t, t+1; the hidden state is rewritten at every step, so information from distant time steps is hard to retain.]

LSTM network
[Figure: chain of LSTM cells over time steps t-1, t, t+1.]
Hochreiter and Schmidhuber, 1997.

LSTM: inside the hood
• Cell state (memory): C_t = f_t ∗ C_{t-1} + i_t ∗ C̃_t
• Forget gate: f_t = σ(W_f · [h_{t-1}, x_t] + b_f)
• New memory: C̃_t = tanh(W_C · [h_{t-1}, x_t] + b_C)
• Input gate (weight of new memory): i_t = σ(W_i · [h_{t-1}, x_t] + b_i)
• Output gate: o_t = σ(W_o · [h_{t-1}, x_t] + b_o)
• Output: h_t = o_t ∗ tanh(C_t)
Figures adapted from Olah's blog.

LSTM summary
• LSTM is a variant of RNN that makes it easier to retain long-range interactions.
• Parameters of the LSTM: forget gate W_f, new memory W_C, weight of new memory (input gate) W_i, output gate W_o.

LSTM application: enhancer/TF prediction
• Input: 200bp sequence.
• Similar convolutional architecture as before.
• Bi-directional LSTM.
• Output: 919-dimensional binary vector for the presence of TF binding/chromatin marks.
Quang and Xie. DanQ. 2016.

Deep supervised learning
• Feedforward, convnets, RNN/LSTM: learning a nonlinear mapping from inputs to outputs.
• Predicting: TF binding, gene expression, disease status from images, risk from SNPs, protein structure, …

Deep unsupervised learning
• Nonlinear dimensionality reduction and pattern mining.
• In many settings, we have more unlabeled examples than labeled ones.
• Learn useful representations from unlabeled data.
• Better representations may improve prediction accuracy.

Low-dimensional structure
What is the latent dimensionality of each row of images?
Urtasun and Zemel.

Autoencoder
• Encoding: h(x) = f(W x + b)
• Decoding: x̂ = g(W' h(x) + b')
• W, W' = arg min Σ_i ||x_i − x̂_i||²
• Train with backprop as before.

Autoencoder
• If encoding and decoding are linear, then W, W' = arg min Σ_i ||x_i − W' W x_i||².
• What does this remind you of?
• The linear autoencoder is basically just PCA!
• General f and g correspond to nonlinear dimensionality reduction.

What is wrong with this picture?
• If the hidden layer is as large as the input, h(x) can just copy x exactly! Overcomplete.
• Need to impose sparsity on h.

Denoising autoencoder
• Corrupt the input x with independent noise (for example, randomly set entries to 0).
• Train the autoencoder to reconstruct the clean x from the corrupted input (see the code sketch after these slides).

Illustration of denoising autoencoder
Figure from Hugo Larochelle.

Filters from denoising autoencoder
[Figure: basis learned by a denoising autoencoder vs. basis learned by a weight-decay autoencoder.]

Deep autoencoder
[Figure: autoencoder with multiple encoding and decoding layers.]

Deep autoencoder example
[Figure: original images alongside reconstructions from a deep autoencoder (DAE) and from PCA; a second panel compares PCA and deep autoencoder representations of the data.]
Hinton and Salakhutdinov. Science. 2006.
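To make the autoencoder equations h(x) = f(W x + b), x̂ = g(W' h(x) + b') and the denoising idea concrete, here is a minimal sketch of a one-hidden-layer denoising autoencoder in plain numpy. It is not code from the lecture: the layer sizes, masking probability, and learning rate are illustrative choices, the encoder uses a sigmoid, and the decoder is taken to be linear for simplicity.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

class DenoisingAutoencoder:
    """One-hidden-layer denoising autoencoder:
    h(x) = sigmoid(W x + b), x_hat = W' h(x) + b',
    trained to reconstruct the clean x from a corrupted copy."""

    def __init__(self, n_in, n_hidden):
        self.W = rng.normal(0.0, 0.1, size=(n_in, n_hidden))    # encoder weights
        self.b = np.zeros(n_hidden)
        self.W2 = rng.normal(0.0, 0.1, size=(n_hidden, n_in))   # decoder weights
        self.b2 = np.zeros(n_in)

    def encode(self, X):
        return sigmoid(X @ self.W + self.b)

    def decode(self, H):
        return H @ self.W2 + self.b2

    def train_step(self, X, corrupt_prob=0.3, lr=0.01):
        # Corrupt the input with independent masking noise (set entries to 0).
        X_tilde = X * (rng.random(X.shape) > corrupt_prob)

        # Forward pass: encode the corrupted input, decode, compare to clean X.
        H = self.encode(X_tilde)
        X_hat = self.decode(H)
        err = X_hat - X                        # d(loss)/d(X_hat), up to a factor of 2
        loss = np.mean(np.sum(err ** 2, axis=1))

        # Backward pass for the squared-error loss with sigmoid hidden units.
        n = X.shape[0]
        dW2 = H.T @ err / n
        db2 = err.mean(axis=0)
        dZ = (err @ self.W2.T) * H * (1.0 - H)   # backprop through the sigmoid
        dW = X_tilde.T @ dZ / n
        db = dZ.mean(axis=0)

        # Plain gradient-descent update.
        self.W -= lr * dW;   self.b -= lr * db
        self.W2 -= lr * dW2; self.b2 -= lr * db2
        return loss

# Toy usage: 1,000 examples with 50 features compressed to 10 hidden units.
X = rng.random((1000, 50))
dae = DenoisingAutoencoder(n_in=50, n_hidden=10)
for epoch in range(200):
    loss = dae.train_step(X)
```

Masking noise is used here; Gaussian corruption would work as well. The essential point is that the reconstruction target is the clean x, not the corrupted copy the encoder sees.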
Application: deep patient
• Each patient = vector of 41k clinical descriptors.
• Stack of 3 denoising autoencoders.
• 500-dimensional representation of each patient.
• Random forest on the 500-dimensional representation to predict future disease (a rough code sketch follows below).
Miotto et al. DeepPatient. 2016.
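A rough sketch of the DeepPatient pipeline above, assuming a TensorFlow/Keras and scikit-learn stack rather than the authors' original implementation: three denoising-autoencoder layers are trained greedily, each reconstructing its clean input from a randomly masked copy, and a random forest is then trained on the resulting representation. The toy data, labels, corruption level, per-layer sizes, and training settings are placeholders; only the overall structure (a stack of 3 denoising autoencoders producing a 500-dimensional patient representation that feeds a random forest) comes from the slide.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from tensorflow import keras
from tensorflow.keras import layers

rng = np.random.default_rng(0)

def fit_denoising_layer(X, n_hidden, corrupt_prob=0.05, epochs=10):
    """Train one denoising-autoencoder layer: reconstruct the clean X from a
    randomly masked copy, then return a model that outputs the hidden code."""
    n_in = X.shape[1]
    X_noisy = X * (rng.random(X.shape) > corrupt_prob)

    enc_layer = layers.Dense(n_hidden, activation="sigmoid")
    ae = keras.Sequential([
        keras.Input(shape=(n_in,)),
        enc_layer,               # encoder
        layers.Dense(n_in),      # linear decoder
    ])
    ae.compile(optimizer="adam", loss="mse")
    ae.fit(X_noisy, X, epochs=epochs, batch_size=64, verbose=0)

    # Wrap the trained encoder layer as a standalone model.
    inp = keras.Input(shape=(n_in,))
    return keras.Model(inp, enc_layer(inp))

# Toy stand-ins for the DeepPatient data: each row would really be a patient's
# ~41k clinical descriptors; 1,000 features keeps this sketch lightweight.
X = rng.random((2000, 1000)).astype("float32")
y = rng.integers(0, 2, size=2000)            # placeholder future-disease labels

# Greedy layer-wise stack of 3 denoising autoencoders, ending at 500 dims.
H = X
for n_hidden in (500, 500, 500):
    encoder = fit_denoising_layer(H, n_hidden)
    H = encoder.predict(H, verbose=0)

# H is the learned patient representation; train a random forest on it.
clf = RandomForestClassifier(n_estimators=200).fit(H, y)
```

In practice the three trained encoders would be kept so that new patients can be mapped into the same 500-dimensional space before running the classifier.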