
Lecture 19: Introduction to Deep Learning
Juan Carlos Niebles and Ranjay Krishna
Stanford Vision and Learning Lab

Slides adapted from Justin Johnson

So far this quarter

● Edge detection
● SIFT
● RANSAC
● K-Means
● Mean-shift
● Linear classifiers
● Image features
● PCA / Eigenfaces

Current Research

ECCV 2016 (photo credit: Justin Johnson)

Current Research

● Convolutional neural networks
● Recurrent neural networks
● Fully convolutional networks
● Multilayer perceptrons

ECCV 2016 (photo credit: Justin Johnson)

What are these things?

Why are they important?

Deep Learning

● Learning hierarchical representations from data
● End-to-end learning: raw inputs to predictions
● Can use a small set of simple tools to solve many problems
● Has led to rapid progress on many problems
● (Very loosely!) Inspired by the brain
● Today: 1 hour introduction to deep learning fundamentals
● To learn more take CS 231n in the Spring

Deep learning for many different problems

Slide credit: CS 231n, Lecture 1, 2016

[Charts: results on the ImageNet Large Scale Visual Recognition Challenge; human benchmark ≈5.1]
Human benchmark from Russakovsky et al, “ImageNet Large Scale Visual Recognition Challenge”, IJCV 2015

Slide credit: Kaiming He, ICCV 2015

Deep Learning for Object Detection

Current SOTA: 85.6
He et al, “Deep Residual Learning for Image Recognition”, CVPR 2016

Slide credit: Ross Girshick, ICCV 2015

Object Segmentation

Figure credit: Dai, He, and Sun, “Instance-aware Semantic Segmentation via Multi-task Network Cascades”, CVPR 2016

Pose Estimation

Figure credit: Cao et al, “Realtime Multi-Person 2D Pose Estimation using Part Affinity Fields”, arXiv 2016

Deep Learning for Image Captioning

Figure credit: Karpathy and Fei-Fei, “Deep Visual-Semantic Alignments for Generating Image Descriptions”, CVPR 2015

Dense Image Captioning

Figure credit: Johnson*, Karpathy*, and Fei-Fei, “DenseCap: Fully Convolutional Localization Networks for Dense Captioning”, CVPR 2016

Visual Question Answering

Figure credit: Agrawal et al, “VQA: Visual Question Answering”, ICCV 2015

Figure credit: Zhu et al, “Visual7W: Grounded Question Answering in Images”, CVPR 2016

Image Super-Resolution

Figure credit: Ledig et al, “Photo-Realistic Single Image Super-Resolution Using a Generative Adversarial Network”, arXiv 2016

Generating Art

Figure credit: Mordvintsev, Olah, and Tyka, “Inceptionism: Going Deeper into Neural Networks”, https://research.googleblog.com/2015/06/inceptionism-going-deeper-into-neural.html

Figure credit: Gatys, Ecker, and Bethge, “Image Style Transfer using Convolutional Neural Networks”, CVPR 2016

Figure credit: Johnson, Alahi, and Fei-Fei, “Perceptual Losses for Real-Time Style Transfer and Super-Resolution”, ECCV 2016, https://github.com/jcjohnson/fast-neural-style

Outside Computer Vision

Machine Translation

Figure credit: Bahdanau, Cho, and Bengio, “Neural Machine Translation by Jointly Learning to Align and Translate”, ICLR 2015

Speech Recognition
Figure credit: Amodei et al, “Deep Speech 2: End-to-End Speech Recognition in English and Mandarin”, arXiv 2015

Text Synthesis
Figure credit: Sutskever, Martens, and Hinton, “Generating Text with Recurrent Neural Networks”, ICML 2011

Speech Synthesis
Figure credit: van den Oord et al, “WaveNet: A Generative Model for Raw Audio”, arXiv 2016, https://deepmind.com/blog/wavenet-generative-model-raw-audio/

Deep Reinforcement Learning

Playing Atari games
Mnih et al, “Human-level control through deep reinforcement learning”, Nature 2015

AlphaGo beats Lee Sedol
Silver et al, “Mastering the game of Go with deep neural networks and tree search”, Nature 2016
Image credit: http://www.newyorker.com/tech/elements/alphago-lee-sedol-and-the-reassuring-future-of-humans-and-machines

Outline

● Applications
● Motivation
● Supervised learning
● Gradient descent
● Backpropagation
● Convolutional Neural Networks

Recall: Image Classification

Write a function that maps images to labels (also called object recognition)

Dataset: ETH-80, by B. Leibe; Slide credit: L. Lazebnik

No obvious way to implement this function!

Slide credit: CS 231n, Lecture 2, 2016
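To make the difficulty concrete, here is the function we would like to write, with a purely hypothetical signature (not from the slides):

```python
def classify_image(image):
    """Map an H x W x 3 array of pixel values to a label such as "cat".

    Hypothetical stub for illustration: there is no obvious sequence of
    hand-written rules to put in the body, which is exactly the point.
    """
    raise NotImplementedError("No obvious way to implement this by hand!")
```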

Recall: Image Classification

Example training set

Slide credit: CS 231n, Lecture 2, 2016

Recall: Image Classification

Features remain fixed; the classifier is learned from data

Problem: How do we know which features to use? We may need different features for each problem!
Solution: Learn the features jointly with the classifier!

Figure credit: D. Hoiem, L. Lazebnik

Image Classification

[Pipeline diagram: training images + training labels → learned model (learned features + learned classifier); at test time, an image passes through the learned features and learned classifier to produce a prediction]

Image Classification: Deep Learning

[Pipeline diagram: training images + training labels → learned model]
The model is deep: it has many layers (low-level features → mid-level features → high-level features → classifier).
Models can learn hierarchical features.

Figure credit: Zeiler and Fergus, “Visualizing and Understanding Convolutional Networks”, ECCV 2014

Outline

● Applications
● Motivation
● Supervised learning
● Gradient descent
● Backpropagation
● Convolutional Neural Networks

Supervised Learning in 3 easy steps
How to learn models from data

Step 1: Define Model
The model structure and model weights map input data (image pixels) to a predicted output (image label)

Step 2: Collect data
Training inputs paired with their true outputs

Step 3: Learn the model
Minimize the average loss over the training set to obtain the learned weights
Loss function: measures “badness” of a prediction
Regularizer: penalizes complex models
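The three steps can be written as a single optimization problem; a hedged sketch (the slide's own notation was an image, so the symbols here are assumptions):

```latex
% Supervised learning: average loss over the training set plus a regularizer,
% minimized over the model weights W to produce the learned weights W*.
W^{*} \;=\; \arg\min_{W}\; \frac{1}{N}\sum_{i=1}^{N} L\bigl(f(x_i; W),\, y_i\bigr) \;+\; \lambda\, R(W)
```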

Supervised Learning: Linear Regression
Sometimes called “Ridge Regression” when used with a regularizer

Input and output are vectors
Model is just a matrix multiply
Loss is Euclidean distance
Regularizer is the Frobenius norm of the matrix (sum of squares of entries)

Learning Problem: minimize the regularized loss over the training set

Supervised Learning: Neural Network
Sometimes called “Fully-Connected Network” or “Multilayer Perceptron”

Input and output are vectors; loss is Euclidean distance

Linear Regression: model is just a matrix multiply
New Model: model is two matrix multiplies

Question: Is the new model “more powerful” than Linear Regression? Can it represent any functions that Linear Regression cannot?
Answer: NO! We can write the two matrices as a single matrix and recover Linear Regression.

Neural Network: model is two matrix multiplies, with an elementwise nonlinearity in between
Common nonlinearities: Rectified Linear (ReLU), Logistic Sigmoid

Neural Network is more powerful than Linear Regression!
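In symbols, with ReLU shown as an assumed choice of nonlinearity (the slide's equations were images):

```latex
\text{Linear regression:}\quad \hat{y} = W x
\\[2pt]
\text{Two matrix multiplies:}\quad \hat{y} = W_2 (W_1 x) = (W_2 W_1)\, x \quad \text{(collapses to one matrix, still linear)}
\\[2pt]
\text{Neural network:}\quad \hat{y} = W_2\, \sigma(W_1 x), \qquad \sigma \text{ applied elementwise}
\\[2pt]
\text{ReLU: } \sigma(z) = \max(0, z) \qquad\quad \text{Logistic sigmoid: } \sigma(z) = \frac{1}{1 + e^{-z}}
```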

Neural Networks with many layers
Sometimes called “Fully-Connected Network” or “Multilayer Perceptron”

Two Layer network
Three Layer network

Hidden layers are learned feature representations of the input!
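A sketch of how the layers stack (same assumed nonlinearity σ as above):

```latex
\text{Two-layer:}\quad \hat{y} = W_2\, \sigma(W_1 x)
\qquad\quad
\text{Three-layer:}\quad \hat{y} = W_3\, \sigma\bigl(W_2\, \sigma(W_1 x)\bigr)
\\[2pt]
h_1 = \sigma(W_1 x), \quad h_2 = \sigma(W_2 h_1) \quad \text{are the hidden layers: learned feature representations of } x
```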

Outline

● Applications
● Motivation
● Supervised learning
● Gradient descent
● Backpropagation
● Convolutional Neural Networks

How to find the best weights?

Idea #1: Closed Form Solution
Works for some models (linear regression) but in general very hard (even for a one-layer net)

Idea #2: Random Search
Try a bunch of random values for w, keep the best one. Extremely inefficient in high dimensions

Idea #3: Go downhill
Start somewhere random, take tiny steps downhill, and repeat. Good idea!

Slide credit: CS 231n, Lecture 3

Which way is downhill?

Recall from single-variable calculus: if w is a scalar, the derivative tells us how much g will change if w changes a little bit.

And from multivariable calculus: if w is a vector, we can take partial derivatives with respect to each component. The gradient is the vector of all partial derivatives; it has the same shape as w.

Useful fact: the gradient of g at w gives the direction of steepest increase.
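In symbols, for a d-dimensional weight vector w (a minimal sketch of the definitions the slide states in words):

```latex
\nabla_w\, g(w) = \left( \frac{\partial g}{\partial w_1},\, \ldots,\, \frac{\partial g}{\partial w_d} \right)
\qquad
\text{direction of steepest decrease} = -\nabla_w\, g(w)
```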

Gradient descent
The simple algorithm behind all deep learning

Initialize w randomly
While true:
    Compute the gradient at the current point
    Move downhill a little bit

Learning rate: how big each step should be

In practice we estimate the gradient using a small minibatch of data (stochastic gradient descent)

The update rule is the formula for updating the weights at each iteration; in practice people use fancier rules (momentum, RMSProp, Adam, etc.)

Image credit: Alec Radford
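A minimal code sketch of the loop above, assuming we are handed a function grad_fn(w, batch) that returns a (minibatch) gradient; the names and defaults are illustrative, not from the slide:

```python
import numpy as np

def gradient_descent(grad_fn, w_init, learning_rate=1e-2, num_steps=1000,
                     batch_sampler=None):
    w = np.array(w_init, dtype=float)         # initialize w (caller supplies a random init)
    for _ in range(num_steps):                # the slide's "while true", made finite here
        batch = batch_sampler() if batch_sampler is not None else None
        grad = grad_fn(w, batch)              # compute (minibatch) gradient at the current point
        w = w - learning_rate * grad          # move downhill a little bit
    return w

# Example: minimize g(w) = ||w||^2, whose gradient is 2w (no minibatches needed).
w_star = gradient_descent(lambda w, batch: 2 * w, w_init=np.random.randn(5))
```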

Outline

● Applications
● Motivation
● Supervised learning
● Gradient descent
● Backpropagation
● Convolutional Neural Networks

How to compute gradients?

Work it out on paper?

Linear Regression: doable for simple models
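For example, differentiating the ridge-regression objective sketched earlier (an assumption about the exact form on the slide) really is doable on paper:

```latex
\nabla_W \left[ \frac{1}{N}\sum_{i=1}^{N} \lVert W x_i - y_i \rVert_2^2 + \lambda\, \lVert W \rVert_F^2 \right]
= \frac{2}{N}\sum_{i=1}^{N} (W x_i - y_i)\, x_i^{\top} \;+\; 2\lambda W
```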

How to compute gradients?

Work it out on paper? Three layer network

Gets hairy for more complex models

Backpropagation: Computational graph

e.g. x = -2, y = 5, z = -4

Want: the gradients of the output with respect to x, y, and z

Chain rule: multiply local gradients backward through the graph, one node at a time

Slide credit: CS 231n, Lecture 4

Backpropagation

Computational graph nodes can be vectors or matrices or n-dimensional tensors:

x → [ * W1 ] → [ * W2 ] → y

Forward pass: run the graph “forward” to compute the loss
Backward pass: run the graph “backward” to compute gradients with respect to the loss

Easily compute gradients for big, complex models!

Tensors flow along edges in the graph

A two-layer neural network in 25 lines of code

The code on the slide, annotated step by step:
● Randomly initialize weights
● Get a batch of (random) data
● Forward pass: compute loss (ReLU nonlinearity; loss is Euclidean distance)
● Backward pass: compute gradients
● Update weights
● When you run this code: loss goes down!
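The code itself was an image on the slide; the following numpy version is a hedged reconstruction following the annotations above (the layer sizes and learning rate are assumptions):

```python
import numpy as np

N, D_in, H, D_out = 64, 1000, 100, 10    # batch size, input dim, hidden dim, output dim

# Randomly initialize weights
w1 = np.random.randn(D_in, H)
w2 = np.random.randn(H, D_out)

# Get a batch of (random) data
x = np.random.randn(N, D_in)
y = np.random.randn(N, D_out)

learning_rate = 1e-6
for t in range(500):
    # Forward pass: compute loss (ReLU nonlinearity, Euclidean loss)
    h = x.dot(w1)
    h_relu = np.maximum(h, 0)
    y_pred = h_relu.dot(w2)
    loss = np.square(y_pred - y).sum()

    # Backward pass: compute gradients of the loss with respect to w1 and w2
    grad_y_pred = 2.0 * (y_pred - y)
    grad_w2 = h_relu.T.dot(grad_y_pred)
    grad_h_relu = grad_y_pred.dot(w2.T)
    grad_h = grad_h_relu * (h > 0)
    grad_w1 = x.T.dot(grad_h)

    # Update weights: the loss goes down as t increases
    w1 -= learning_rate * grad_w1
    w2 -= learning_rate * grad_w2
```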

Outline

● Applications
● Motivation
● Supervised learning
● Gradient descent
● Backpropagation
● Convolutional Neural Networks

Convolution Layer

32x32x3 image: 32 height, 32 width, 3 depth
5x5x3 filter

Convolve the filter with the image, i.e. “slide over the image spatially, computing dot products”

Filters always extend the full depth of the input volume

Slide credit: CS 231n, Lecture 7

Convolution Layer

1 number: the result of taking a dot product between the filter and a small 5x5x3 chunk of the image (i.e. 5*5*3 = 75-dimensional dot product + bias)

Slide credit: CS 231n, Lecture 7

Convolution Layer

Convolve (slide) the filter over all spatial locations: a 5x5x3 filter on a 32x32x3 image produces a 28x28x1 activation map

Translation Invariance!

Slide credit: CS 231n, Lecture 7

Convolution Layer

Consider a second, green filter: each filter produces its own 28x28x1 activation map

Slide credit: CS 231n, Lecture 7

Convolution Layer

For example, if we had 6 5x5 filters, we’ll get 6 separate activation maps. We stack these up to get a “new image” of size 28x28x6!

Slide credit: CS 231n, Lecture 7
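A slow but explicit sketch of exactly this operation, assuming no padding and stride 1 (the function and variable names are illustrative):

```python
import numpy as np

def conv_layer(image, filters, biases):
    """Slide each filter over the image, taking a dot product (+ bias) at every location."""
    H, W, _ = image.shape                      # e.g. 32, 32, 3
    num_filters, fh, fw, _ = filters.shape     # e.g. 6, 5, 5, 3
    out = np.zeros((H - fh + 1, W - fw + 1, num_filters))   # e.g. 28 x 28 x 6
    for k in range(num_filters):
        for i in range(out.shape[0]):
            for j in range(out.shape[1]):
                chunk = image[i:i + fh, j:j + fw, :]         # a 5x5x3 chunk of the image
                out[i, j, k] = np.sum(chunk * filters[k]) + biases[k]  # 75-dim dot product + bias
    return out

image = np.random.randn(32, 32, 3)
filters = np.random.randn(6, 5, 5, 3)
biases = np.zeros(6)
activation_maps = conv_layer(image, filters, biases)   # shape (28, 28, 6)
```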

Convolutional Neural Networks (CNN)

A CNN is a sequence of convolution layers and nonlinearities:

32x32x3 → CONV (6 filters of 5x5x3), ReLU → 28x28x6 → CONV (10 filters of 5x5x6), ReLU → 24x24x10 → ….

Slide credit: CS 231n, Lecture 7
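The spatial sizes above follow a simple rule for valid convolutions with stride 1; a tiny check (the helper name is mine, not from the slide):

```python
def conv_output_size(input_size, filter_size):
    # Valid convolution, stride 1: output = input - filter + 1
    return input_size - filter_size + 1

assert conv_output_size(32, 5) == 28   # first CONV: 6 filters of 5x5x3  -> 28x28x6
assert conv_output_size(28, 5) == 24   # second CONV: 10 filters of 5x5x6 -> 24x24x10
```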

Case Study: GoogLeNet [Szegedy et al., 2014]

Inception module

ILSVRC 2014 winner (6.7% top 5 error)

Slide credit: CS 231n, Lecture 7

Recap

● Applications
● Motivation
● Supervised learning
● Gradient descent
● Backpropagation
● Convolutional Neural Networks