
CS 556:

Lecture 10

Prof. Sinisa Todorovic


1 Deep Convolutional Neural Networks (DCNN)

Feature extraction using convolutional layers, followed by classification using a multilayer perceptron (MLP)

2 DCNN — Classification — Logistic Regression

[Figure: a single neuron with input vector (x1, x2, x3) and a bias unit (+1); its weights are the learned parameters of logistic regression]

3 DCNN — Classification — Multilayer Perceptron

Example: a 4-layer MLP with 2 output units for predicting 2 classes:

[Figure: 4-layer MLP with inputs x1, x2, x3; Layers 1 through 4, each fed by a bias unit (+1)]

Also called fully connected layers with the final softmax layer
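As a quick illustration of what such a stack of fully connected layers computes (this sketch is not from the lecture; the sizes and random weights below are placeholders), the forward pass of a small 3-input, 4-hidden, 2-output MLP with a softmax output can be written directly in MATLAB:

% Toy forward pass: 3 inputs -> 4 hidden units -> 2 softmax outputs
x  = [0.5; -1.2; 0.3];               % input vector (x1, x2, x3)
W1 = randn(4,3);  b1 = randn(4,1);   % Layer 1 parameters (random placeholders)
W2 = randn(2,4);  b2 = randn(2,1);   % Layer 2 parameters (random placeholders)

h = 1 ./ (1 + exp(-(W1*x + b1)));    % hidden layer with sigmoid activation
z = W2*h + b2;                       % output pre-activations
p = exp(z - max(z)) ./ sum(exp(z - max(z)))   % softmax: two class probabilities

The bias units (+1) in the slide's diagram correspond to the b1 and b2 terms here.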

4 DCNN — Convolutional Layer

Feature extraction using convolutional layers

5 DCNN — Convolutional Layer

[Figure: a filter applied to the previous-level feature map]
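A minimal sketch of what the figure shows (the map and filter below are random placeholders, not the lecture's): the filter is slid across the previous-level map, and at every position a dot product is taken, which is a 2-D convolution:

% One filter applied to a previous-level feature map
prevMap = rand(28,28);    % previous-level activations (placeholder)
filt    = randn(5,5);     % 5x5 learned filter (random placeholder)

featMap = conv2(prevMap, filt, 'valid');   % 24x24 map of filter responses

Note that conv2 implements true convolution (the filter is flipped); many CNN implementations use cross-correlation instead, which only relabels the learned filter.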

6 DCNN — Convolutional Layer — Three Stages

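The names of the three stages are not in the extracted text; in the usual formulation they are convolution, a pointwise nonlinearity (the detector stage), and pooling. A sketch under that assumption, with placeholder sizes:

% Stage 1: convolution of the previous-level map with a learned filter
prevMap  = rand(28,28);
filt     = randn(5,5);
conv_out = conv2(prevMap, filt, 'valid');

% Stage 2: pointwise nonlinearity (ReLU shown here)
relu_out = max(conv_out, 0);

% Stage 3: 2x2 max pooling over non-overlapping blocks
[h, w] = size(relu_out);
pooled = zeros(floor(h/2), floor(w/2));
for r = 1:floor(h/2)
    for c = 1:floor(w/2)
        block = relu_out(2*r-1:2*r, 2*c-1:2*c);
        pooled(r,c) = max(block(:));
    end
end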

9 DCNN — Convolutional Layer — Multiple Filters

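With a bank of K filters, each filter produces its own feature map, so one convolutional layer turns a single input map into K output channels. A sketch with arbitrary sizes (nothing here is from the slides):

% A bank of K filters produces K feature maps
K       = 8;
prevMap = rand(28,28);
filters = randn(5,5,K);            % K learned 5x5 filters (random placeholders)

featMaps = zeros(24,24,K);
for k = 1:K
    featMaps(:,:,k) = max(conv2(prevMap, filters(:,:,k), 'valid'), 0);   % convolve + ReLU
end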

11 Fully Connected Layer

Examples of Learned Filters in DCNN

[Figure: example filters learned for faces, cars, elephants, and chairs]

Higher layers learn more meaningful (abstract) features

13 Training Neural Nets — Two Stages — MATLAB

1. Unsupervised training of each individual layer using an autoencoder
2. Fine-tuning of all layers using error backpropagation

Example Neural Network for Classifying 10 classes:

[Figure: input images of 28x28 = 784 pixels → 100 hidden nodes → 50 hidden nodes → 10 output nodes]

14 Autoencoder — MATLAB — Example

training by minimizing mean squared error between input and output

15 Autoencoder — MATLAB — Example

training by minimizing mean squared error between input and output


% Load the training data into memory
[xTrainImages, tTrain] = digittrain_dataset;

% Display some of the training images
clf
for i = 1:20
    subplot(4,5,i);
    imshow(xTrainImages{i});
end

16 Autoencoder — MATLAB — Example

training by minimizing mean squared error between input and output

rng('default')      % set the random number generator seed
hiddenSize1 = 100;  % set the number of hidden nodes in Layer 1
autoenc1 = trainAutoencoder(xTrainImages, hiddenSize1, ...
    'MaxEpochs', 400, ...
    'L2WeightRegularization', 0.004, ...
    'SparsityRegularization', 4, ...
    'SparsityProportion', 0.15, ...
    'ScaleData', false);
plotWeights(autoenc1);


continue training the next layer

18 Autoencoder — MATLAB — Example — Next Layer

training by minimizing mean squared error between input and output

[Figure: the outputs of the 100 hidden nodes are the input to a second autoencoder with 50 hidden nodes]

feat1 = encode(autoenc1,xTrainImages);

hiddenSize2 = 50; % set the number of hidden nodes in Layer 2

autoenc2 = trainAutoencoder(feat1, hiddenSize2, ...
    'MaxEpochs', 100, ...
    'L2WeightRegularization', 0.002, ...
    'SparsityRegularization', 4, ...
    'SparsityProportion', 0.1, ...
    'ScaleData', false);

continue training the next layer

19 Softmax Layer — MATLAB — Example

[Figure: the outputs of the 50 hidden nodes feed the 10 output nodes]

feat2 = encode(autoenc2, feat1);
softnet = trainSoftmaxLayer(feat2, tTrain, 'MaxEpochs', 400);
deepnet = stack(autoenc1, autoenc2, softnet);   % stack all layers
view(deepnet)

20 Classification using Neural Nets — MATLAB — Example


% Load the test images
[xTestImages, tTest] = digittest_dataset;

% Stack the 28x28 test images into a 784-by-N matrix for the network
xTest = zeros(28*28, numel(xTestImages));
for i = 1:numel(xTestImages), xTest(:,i) = xTestImages{i}(:); end

y = deepnet(xTest);

plotconfusion(tTest, y);
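Besides the confusion matrix, the overall test accuracy can be read off directly from the network outputs; this small addition is not part of the original example:

% Overall accuracy: compare predicted and true class indices
[~, predClass] = max(y, [], 1);       % index of the largest output for each image
[~, trueClass] = max(tTest, [], 1);   % index of the 1 in each one-hot label
accuracy = mean(predClass == trueClass)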

21 Classification using Neural Nets after Fine-Tuning


% Stack the 28x28 training images into a 784-by-N matrix, as for the test set
xTrain = zeros(28*28, numel(xTrainImages));
for i = 1:numel(xTrainImages), xTrain(:,i) = xTrainImages{i}(:); end

% Perform fine tuning
deepnet = train(deepnet, xTrain, tTrain);

y = deepnet(xTest);

plotconfusion(tTest, y);

22 Fine-Tuning using Error Backpropagation

Goal: minimize the error function, where t_k is the k-th target and y_k is the k-th network output:

\[ E(\mathbf{w}) \;=\; \tfrac{1}{2} \sum_k \bigl(t_k - y_k\bigr)^2 \]

23 Fine-Tuning using Error Backpropagation

Gradient descent over all parameters \mathbf{w}, with t_k the k-th target and y_k the k-th output:

\[ \mathbf{w} \;\leftarrow\; \mathbf{w} - \eta \, \nabla_{\mathbf{w}} E(\mathbf{w}) \]

24 Chain Rule for Level L

Single parameter w_{jk}^{(L)} from the j-th node at level L-1 to the k-th node at level L.

[Figure: network with inputs x1, x2, x3 and bias units (+1); the highlighted weight connects node j at level L-1 to node k at level L]
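The equations on this and the following slides did not survive extraction; what follows is the standard chain-rule expansion for this weight, in notation introduced here (z_k^{(L)} is the pre-activation of node k at level L, y_k = f(z_k^{(L)}) its output, and x_j^{(L-1)} the output of node j at level L-1), assuming the squared error defined above:

\[
\frac{\partial E}{\partial w_{jk}^{(L)}}
  \;=\; \frac{\partial E}{\partial y_k}\,
        \frac{\partial y_k}{\partial z_k^{(L)}}\,
        \frac{\partial z_k^{(L)}}{\partial w_{jk}^{(L)}}
  \;=\; \underbrace{-(t_k - y_k)\, f'\!\bigl(z_k^{(L)}\bigr)}_{\textstyle \delta_k^{(L)}}\; x_j^{(L-1)}
\]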


32 Fine-Tuning for Level L

Gradient descent for every level-L parameter, with t_k the k-th target and y_k the k-th output:

\[ w_{jk}^{(L)} \;\leftarrow\; w_{jk}^{(L)} - \eta \, \frac{\partial E}{\partial w_{jk}^{(L)}} \]

33 Chain Rule for Level L-1

Single parameter w_{ij}^{(L-1)} from the i-th node at level L-2 to the j-th node at level L-1.

[Figure: network with inputs x1, x2, x3 and bias units (+1) at each layer; the highlighted weight connects node i at level L-2 to node j at level L-1]

34 Chain Rule for Level L-1

Gradient descent for this parameter:

\[ w_{ij}^{(L-1)} \;\leftarrow\; w_{ij}^{(L-1)} - \eta \, \frac{\partial E}{\partial w_{ij}^{(L-1)}} \]


38 Fine-Tuning for Any Level l

Gradient descent:

where the deltas can be back-propagated:
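The slide's own equations are not recoverable; in the notation used above, the standard form of the update and of the back-propagated deltas is:

\[
w_{ij}^{(l)} \;\leftarrow\; w_{ij}^{(l)} - \eta\,\delta_j^{(l)}\,x_i^{(l-1)},
\qquad
\delta_j^{(l)} \;=\; f'\!\bigl(z_j^{(l)}\bigr)\sum_k w_{jk}^{(l+1)}\,\delta_k^{(l+1)},
\qquad
\delta_k^{(L)} \;=\; -(t_k - y_k)\, f'\!\bigl(z_k^{(L)}\bigr)
\]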

39 Error Backpropagation — Example
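The numbers of the worked example on slides 39-52 were lost in extraction; as a stand-in, here is a small MATLAB sketch of one backpropagation step on a 2-3-2 network with sigmoid units and the squared error above (all values below are invented):

% One gradient-descent step of error backpropagation on a tiny 2-3-2 network
x  = [0.4; 0.7];                                             % input
t  = [1; 0];                                                 % target
W1 = [0.1 -0.2; 0.3 0.4; -0.5 0.2];  b1 = [0.1; 0.1; 0.1];   % level L-1: 2 -> 3
W2 = [0.2 -0.1 0.4; 0.3 0.2 -0.3];   b2 = [0; 0];            % level L:   3 -> 2
eta = 0.5;                                                   % learning rate
sig = @(z) 1 ./ (1 + exp(-z));                               % sigmoid activation

% Forward pass
z1 = W1*x + b1;   h = sig(z1);                      % hidden activations
z2 = W2*h + b2;   y = sig(z2);                      % network outputs

% Backward pass: delta at level L, then back-propagated to level L-1
deltaL  = -(t - y) .* y .* (1 - y);                 % dE/dz2 for E = 0.5*sum((t - y).^2)
deltaL1 = (W2' * deltaL) .* h .* (1 - h);           % dE/dz1

% Gradient-descent updates of all weights and biases
W2 = W2 - eta * (deltaL  * h');   b2 = b2 - eta * deltaL;
W1 = W1 - eta * (deltaL1 * x');   b1 = b1 - eta * deltaL1;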

