CS 556: Computer Vision Lecture 10
Prof. Sinisa Todorovic
[email protected]

Deep Convolutional Neural Networks (DCNN)

A DCNN performs feature extraction using convolutional layers, followed by classification using a multilayer perceptron (MLP).

DCNN — Classification — Logistic Regression

[Figure: a logistic-regression unit. The input vector (x1, x2, x3) and a bias (+1) feed a sigmoid function; the weights are the learned parameters of logistic regression.] The unit computes y = sigma(w'x + b), where sigma(z) = 1/(1 + exp(-z)).

DCNN — Classification — Multilayer Perceptron

Example: a 4-layer MLP with 2 output units for predicting 2 classes. [Figure: inputs x1, x2, x3 plus a bias (+1) at each of Layers 1-4.] These layers are also called fully connected layers, with the final softmax layer producing the class predictions.

DCNN — Convolutional Layer

Feature extraction is performed by the convolutional layers.

Convolution

[Figure: a filter convolved with the output of the previous level.]

DCNN — Convolutional Layer — Three Stages

[Figure, built up over three slides: the three stages of a convolutional layer, typically convolution, a nonlinearity, and pooling.]

DCNN — Convolutional Layer — Multiple Filters

[Figure, built up over two slides: multiple filters applied in parallel within one convolutional layer.]

Fully Connected Layer

Examples of Learned Filters in DCNN

[Figure: filters learned for faces, cars, elephants, and chairs.] Higher layers learn more meaningful (abstract) features.

Training Neural Nets — Two Stages — MATLAB

1. Unsupervised training of each individual layer using an autoencoder
2. Fine-tuning of all layers using backpropagation

Example neural network for classifying 10 classes: 28x28 = 784 input pixels, 100 hidden nodes, 50 hidden nodes, 10 output nodes.

Autoencoder — MATLAB — Example

Each autoencoder is trained by minimizing the mean squared error between its input and its output.

% Load the training data into memory
[xTrainImages, tTrain] = digittrain_dataset;

% Display some of the training images
clf
for i = 1:20
    subplot(4,5,i);
    imshow(xTrainImages{i});
end

rng('default')      % set the random number generator seed
hiddenSize1 = 100;  % set the number of hidden nodes in Layer 1
autoenc1 = trainAutoencoder(xTrainImages, hiddenSize1, ...
    'MaxEpochs', 400, ...
    'L2WeightRegularization', 0.004, ...
    'SparsityRegularization', 4, ...
    'SparsityProportion', 0.15, ...
    'ScaleData', false);
plotWeights(autoenc1);

Autoencoder — MATLAB — Example — Next Layer

The outputs of the 100 hidden nodes become the training data for the next autoencoder, which has 50 hidden nodes:

feat1 = encode(autoenc1, xTrainImages);
hiddenSize2 = 50;   % set the number of hidden nodes in Layer 2
autoenc2 = trainAutoencoder(feat1, hiddenSize2, ...
    'MaxEpochs', 100, ...
    'L2WeightRegularization', 0.002, ...
    'SparsityRegularization', 4, ...
    'SparsityProportion', 0.1, ...
    'ScaleData', false);
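The slides state the training objective only in words; as a sketch reconstructed from the option names above (not shown on the original slides; see the trainAutoencoder documentation for the exact form), the cost being minimized is the mean squared reconstruction error plus the two regularizers:

E = \frac{1}{N}\sum_{n=1}^{N}\|x_n - \hat{x}_n\|^{2}
    + \lambda\,\Omega_{\text{weights}}
    + \beta\,\Omega_{\text{sparsity}},
\qquad
\Omega_{\text{sparsity}} = \sum_{i}\mathrm{KL}\big(\rho \,\|\, \hat{\rho}_i\big)

Here \hat{x}_n is the autoencoder's reconstruction of input x_n, \lambda corresponds to 'L2WeightRegularization', \beta to 'SparsityRegularization', \rho to 'SparsityProportion', and \hat{\rho}_i is the average activation of hidden unit i.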
After training autoenc2, continue training the next layer.

Softmax Layer — MATLAB — Example

The outputs of the 50 hidden nodes feed the 10 output nodes:

feat2 = encode(autoenc2, feat1);
softnet = trainSoftmaxLayer(feat2, tTrain, 'MaxEpochs', 400);
deepnet = stack(autoenc1, autoenc2, softnet);  % stack all layers
view(deepnet)

Classification using Neural Nets — MATLAB — Example

% Load the test images and reshape each one into a column vector
[xTestImages, tTest] = digittest_dataset;
xTest = zeros(28*28, numel(xTestImages));
for i = 1:numel(xTestImages)
    xTest(:,i) = xTestImages{i}(:);
end
y = deepnet(xTest);
plotconfusion(tTest, y);

Classification using Neural Nets after Fine-Tuning

% Perform fine-tuning; xTrain holds the training images reshaped
% into column vectors, in the same way as xTest above
deepnet = train(deepnet, xTrain, tTrain);
y = deepnet(xTest);
plotconfusion(tTest, y);

Fine-Tuning using Error Backpropagation

Goal: minimize the error function between the k-th target t_k and the k-th output y_k:

E(w) = \frac{1}{2}\sum_{k}(t_k - y_k)^2

Gradient descent over all parameters w:

w \leftarrow w - \eta\,\frac{\partial E}{\partial w}

Chain Rule for Level L

Consider a single parameter w_{kj}^{(L)} from the j-th node at level L-1 to the k-th node at level L. [Figure, built up over eight slides: the path from w_{kj}^{(L)} to the k-th output in a network with inputs x1, x2, x3 and bias +1.] With z_k^{(L)} = \sum_j w_{kj}^{(L)} y_j^{(L-1)} and y_k = f(z_k^{(L)}), the chain rule gives

\frac{\partial E}{\partial w_{kj}^{(L)}}
  = \frac{\partial E}{\partial y_k}\,
    \frac{\partial y_k}{\partial z_k^{(L)}}\,
    \frac{\partial z_k^{(L)}}{\partial w_{kj}^{(L)}}
  = \underbrace{(y_k - t_k)\, f'(z_k^{(L)})}_{\delta_k^{(L)}}\; y_j^{(L-1)}

Fine-Tuning for Level L

Gradient descent for the parameters of level L:

w_{kj}^{(L)} \leftarrow w_{kj}^{(L)} - \eta\,\delta_k^{(L)}\, y_j^{(L-1)}

Chain Rule for Level L-1

Consider a single parameter w_{ji}^{(L-1)} from the i-th node at level L-2 to the j-th node at level L-1. [Figure, built up over five slides: the error now reaches w_{ji}^{(L-1)} through every output node.] Because every output depends on y_j^{(L-1)}, the delta sums over the outputs:

\delta_j^{(L-1)} = f'(z_j^{(L-1)}) \sum_k w_{kj}^{(L)}\, \delta_k^{(L)}

and gradient descent gives

w_{ji}^{(L-1)} \leftarrow w_{ji}^{(L-1)} - \eta\,\delta_j^{(L-1)}\, y_i^{(L-2)}

Fine-Tuning for Any Level l

Gradient descent:

w_{ji}^{(l)} \leftarrow w_{ji}^{(l)} - \eta\,\delta_j^{(l)}\, y_i^{(l-1)}

where the deltas can be back-propagated:

\delta_j^{(l)} = f'(z_j^{(l)}) \sum_k w_{kj}^{(l+1)}\, \delta_k^{(l+1)}

Error Backpropagation — Example

[Figure, built up over fourteen slides: a step-by-step numeric example of error backpropagation on a small network.]
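The numeric example survives only as slide headings, so here is a minimal sketch with made-up numbers (not the slides' own): one backpropagation step for a tiny 3-2-2 network of sigmoid units, using the update rules derived above.

% One backpropagation step for a 3-2-2 sigmoid network,
% with error E = 1/2 * sum_k (t_k - y_k)^2 and learning rate eta
sigm = @(z) 1 ./ (1 + exp(-z));
rng('default')
x  = [0.5; -0.2; 0.1];                 % input (3 nodes)
t  = [1; 0];                           % targets for the 2 output nodes
W1 = 0.1*randn(2,3); b1 = zeros(2,1);  % level L-1 parameters
W2 = 0.1*randn(2,2); b2 = zeros(2,1);  % level L parameters
eta = 0.5;

% forward pass
h = sigm(W1*x + b1);                   % hidden activations y^(L-1)
y = sigm(W2*h + b2);                   % network outputs y^(L)

% backward pass: delta at level L, then back-propagated to level L-1
% (for the sigmoid, f'(z) = f(z)*(1 - f(z)))
deltaL  = (y - t) .* y .* (1 - y);
deltaL1 = (W2' * deltaL) .* h .* (1 - h);

% gradient-descent update of all parameters
W2 = W2 - eta * (deltaL  * h');  b2 = b2 - eta * deltaL;
W1 = W1 - eta * (deltaL1 * x');  b1 = b1 - eta * deltaL1;

Repeating the forward pass after the update should give a smaller error 1/2*sum((t - y).^2); running the step in a loop is ordinary gradient-descent training of this tiny network.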