CS4501: Introduction to Computer Vision
Neural Networks (NNs), Artificial Neural Networks (ANNs), Multi-layer Perceptrons (MLPs)

Previous Class
• Softmax Classifier
• Inference vs. Training
• Gradient Descent (GD)
• Stochastic Gradient Descent (SGD)
• Mini-batch Stochastic Gradient Descent (SGD)
• Max-Margin Classifier
• Regression vs. Classification
• Issues with Generalization / Overfitting
• Regularization / Momentum

Today's Class: Neural Networks
• The Perceptron Model
• The Multi-layer Perceptron (MLP)
• Forward pass in an MLP (Inference)
• Backward pass in an MLP (Backpropagation)

Perceptron Model
Frank Rosenblatt (1957), Cornell University. The perceptron multiplies each input x_i by a weight w_i, adds a bias b, and passes the sum through a step activation function:

    f(x) = 1 if Σ_{i=1}^{n} w_i x_i + b > 0, and f(x) = 0 otherwise.

[Figure: inputs x_1 … x_n feeding a weighted sum and the step activation function.]
More: https://en.wikipedia.org/wiki/Perceptron
(A code sketch of this decision rule appears after the forward-pass section below.)

Activation Functions
• Step(x)
• Sigmoid(x)
• Tanh(x)
• ReLU(x) = max(0, x)

Two-layer Multi-layer Perceptron (MLP)
[Figure: inputs x_1 … x_4 feed a "hidden" layer of units a_1 … a_4, whose outputs feed an output unit producing the prediction ŷ, which is compared against the label y by a loss / criterion.]

Linear Softmax
Take an input x_i = [x_i1  x_i2  x_i3  x_i4] with one-hot label y_i = [1 0 0] and prediction ŷ_i = [a  b  c]. First compute one score per class:

    f_a = w_a1 x_i1 + w_a2 x_i2 + w_a3 x_i3 + w_a4 x_i4 + b_a
    f_b = w_b1 x_i1 + w_b2 x_i2 + w_b3 x_i3 + w_b4 x_i4 + b_b
    f_c = w_c1 x_i1 + w_c2 x_i2 + w_c3 x_i3 + w_c4 x_i4 + b_c

then turn the scores into probabilities with the softmax:

    a = e^{f_a} / (e^{f_a} + e^{f_b} + e^{f_c})
    b = e^{f_b} / (e^{f_a} + e^{f_b} + e^{f_c})
    c = e^{f_c} / (e^{f_a} + e^{f_b} + e^{f_c})

Stacking the weights into a matrix W (one row per class) and the biases into a vector b, the three score equations collapse into a single matrix-vector product:

    f = W x_i^T + b^T
    ŷ_i = softmax(f)

or, in one line: ŷ = softmax(W x^T + b^T). (A worked sketch of this computation follows below.)

Two-layer MLP + Softmax
Insert a hidden layer with a sigmoid activation before the softmax:

    a_1 = sigmoid(W[1] x^T + b[1]^T)
    ŷ = softmax(W[2] a_1 + b[2]^T)

N-layer MLP + Softmax
Stack as many hidden layers as desired:

    a_1 = sigmoid(W[1] x^T + b[1]^T)
    a_2 = sigmoid(W[2] a_1 + b[2]^T)
    …
    a_i = sigmoid(W[i] a_{i-1} + b[i]^T)
    …
    ŷ = softmax(W[n] a_{n-1} + b[n]^T)

Forward pass (Forward-propagation)
For the two-layer network drawn above, inference is simply evaluating the equations from the inputs through to the loss:

    z_j = Σ_i w^{(1)}_{ji} x_i + b^{(1)}_j      a_j = sigmoid(z_j)
    z = Σ_j w^{(2)}_j a_j + b^{(2)}             ŷ = sigmoid(z)
    Loss = L(ŷ, y)
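To make the perceptron and the activation functions concrete, here is a minimal NumPy sketch (mine, not from the slides; the AND-gate weights are an illustrative assumption):

```python
import numpy as np

def step(x):
    return np.where(x > 0, 1.0, 0.0)   # strict inequality, as in the slide

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def tanh(x):
    return np.tanh(x)

def relu(x):
    return np.maximum(0.0, x)          # ReLU(x) = max(0, x)

def perceptron(x, w, b):
    # f(x) = 1 if sum_i w_i * x_i + b > 0, else 0
    return step(np.dot(w, x) + b)

# Example: a perceptron computing logical AND of two binary inputs
w = np.array([1.0, 1.0])
b = -1.5
for x in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(x, "->", perceptron(np.array(x, dtype=float), w, b))
```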
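The linear softmax computation above can be written in a few lines. This is a hedged sketch with the slide's shapes (4-dimensional input, 3 classes); the input values and random weights are made up for illustration:

```python
import numpy as np

def softmax(f):
    # Subtracting the max is a standard numerical-stability trick;
    # it does not change e^{f_k} / sum_j e^{f_j}.
    e = np.exp(f - np.max(f))
    return e / np.sum(e)

x_i = np.array([0.5, -1.0, 2.0, 0.1])   # x_i = [x_i1 x_i2 x_i3 x_i4]
W = np.random.randn(3, 4)               # one row of weights per class
b = np.random.randn(3)                  # one bias per class

f = W @ x_i + b                         # f = W x^T + b
y_hat = softmax(f)                      # y_hat_i = [a b c], sums to 1
print(y_hat, y_hat.sum())
```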
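Likewise, the two-layer MLP + softmax forward pass follows directly from its equations. A minimal sketch, assuming a hidden width of 5 (my choice, not specified in the slides):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def softmax(f):
    e = np.exp(f - np.max(f))
    return e / np.sum(e)

def two_layer_mlp(x, W1, b1, W2, b2):
    a1 = sigmoid(W1 @ x + b1)        # a_1 = sigmoid(W[1] x^T + b[1])
    return softmax(W2 @ a1 + b2)     # y_hat = softmax(W[2] a_1 + b[2])

x = np.array([0.5, -1.0, 2.0, 0.1])
W1, b1 = np.random.randn(5, 4), np.random.randn(5)  # input (4) -> hidden (5)
W2, b2 = np.random.randn(3, 5), np.random.randn(3)  # hidden (5) -> classes (3)
print(two_layer_mlp(x, W1, b1, W2, b2))
```

The N-layer version is the same pattern in a loop: apply sigmoid(W[i] a + b[i]) for each hidden layer, then softmax at the end.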
How to train the parameters?
We can still use SGD, but now we need the gradient of the loss with respect to every parameter in every layer:

    ∂L/∂W[i]_jk and ∂L/∂b[i]_j,   where L = Loss(ŷ, y).

By the chain rule, the gradient for a weight in layer i is a product of factors flowing back from the loss through every later layer:

    ∂L/∂W[i]_jk = (∂L/∂a_{n-1}) (∂a_{n-1}/∂a_{n-2}) ⋯ (∂a_{i+1}/∂a_i) (∂a_i/∂W[i]_jk)

Backward pass (Back-propagation)
Applying the chain rule to the forward-pass equations of the two-layer network:

    ∂L/∂w^{(2)}_j = (∂L/∂ŷ) (∂ŷ/∂z) (∂z/∂w^{(2)}_j)
    ∂L/∂a_j = (∂L/∂ŷ) (∂ŷ/∂z) (∂z/∂a_j)
    ∂L/∂w^{(1)}_ji = (∂L/∂a_j) (∂a_j/∂z_j) (∂z_j/∂w^{(1)}_ji)

Each factor is cheap to compute locally, e.g. ∂z/∂w^{(2)}_j = a_j and ∂z/∂a_j = w^{(2)}_j. (A worked backpropagation sketch follows below.)

(mini-batch) Stochastic Gradient Descent (SGD)
For the Softmax classifier, with learning rate λ = 0.01 and mini-batch loss L(w, b) = Σ_{i∈B} −log softmax(x_i; w, b):

    Initialize w and b randomly
    for e = 0, num_epochs do
        for b = 0, num_batches do
            Compute: ∂L(w,b)/∂w and ∂L(w,b)/∂b
            Update w: w = w − λ ∂L(w,b)/∂w
            Update b: b = b − λ ∂L(w,b)/∂b
            Print: L(w, b)   // Useful to see if this is becoming smaller or not.
        end
    end

Automatic Differentiation
You only need to write code for the forward pass; the backward pass is computed automatically. Frameworks such as PyTorch "record" the operations performed on tensors and compute gradients through the "recorded" operations when requested.
• PyTorch (Facebook -- mostly): https://pytorch.org/
• TensorFlow (Google -- mostly): https://www.tensorflow.org/
• DyNet (team includes UVA Prof. Yangfeng Ji): http://dynet.io/

Example
• Provided in Assignment 3 and Assignment 4.
• Let's dissect Assignment 3…

[The remaining slides step through the Assignment 3 code, whose screenshots did not survive extraction: defining a Linear Softmax classifier, using it, training it (and what trainLoader is), an improved training loop (with automatic differentiation, the model-specific gradient code is no longer needed), and defining a two-layer neural network. Hedged sketches of these pieces appear below.]

Questions?
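To make the chain rule concrete, here is a hedged backpropagation sketch for the two-layer MLP + softmax above, using the cross-entropy loss L = −Σ_k y_k log ŷ_k. It relies on two standard facts not derived in the slides: for softmax + cross-entropy, ∂L/∂f = ŷ − y, and sigmoid'(z) = a(1 − a) where a = sigmoid(z).

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def softmax(f):
    e = np.exp(f - np.max(f))
    return e / np.sum(e)

# Forward pass (two-layer MLP + softmax, as in the slides)
x = np.array([0.5, -1.0, 2.0, 0.1])
y = np.array([1.0, 0.0, 0.0])              # one-hot label
W1, b1 = np.random.randn(5, 4), np.zeros(5)
W2, b2 = np.random.randn(3, 5), np.zeros(3)

a1 = sigmoid(W1 @ x + b1)
f = W2 @ a1 + b2
y_hat = softmax(f)
loss = -np.sum(y * np.log(y_hat))

# Backward pass: apply the chain rule from the loss backwards
df = y_hat - y                  # dL/df for softmax + cross-entropy
dW2 = np.outer(df, a1)          # dL/dW[2]
db2 = df                        # dL/db[2]
da1 = W2.T @ df                 # dL/da_1
dz1 = da1 * a1 * (1.0 - a1)     # multiply by sigmoid'(z) = a(1 - a)
dW1 = np.outer(dz1, x)          # dL/dW[1]
db1 = dz1                       # dL/db[1]
print(loss, dW1.shape, dW2.shape)
```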
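The SGD pseudocode above translates almost line for line into PyTorch. This is a minimal sketch, not the assignment's actual code: the synthetic data stands in for the real trainLoader, and the 4-input/3-class linear model matches the running example in these slides:

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

# Synthetic stand-in data (the assignment uses a real trainLoader)
X = torch.randn(64, 4)
y = torch.randint(0, 3, (64,))
train_loader = DataLoader(TensorDataset(X, y), batch_size=8, shuffle=True)

model = nn.Linear(4, 3)                      # scores f = W x + b
criterion = nn.CrossEntropyLoss()            # -log softmax of the true class
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

for epoch in range(10):                      # num_epochs
    for inputs, labels in train_loader:      # num_batches mini-batches
        optimizer.zero_grad()                # clear previous gradients
        loss = criterion(model(inputs), labels)
        loss.backward()                      # compute dL/dw and dL/db
        optimizer.step()                     # w = w - lr * dL/dw; b likewise
    print(loss.item())                       # useful to see if this is becoming smaller
```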
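A minimal illustration of the "recording" behavior described under Automatic Differentiation: PyTorch tracks operations on tensors created with requires_grad=True and fills in their .grad fields when .backward() is called. Only the forward pass is written by hand:

```python
import torch

w = torch.tensor([1.0, -2.0], requires_grad=True)
b = torch.tensor(0.5, requires_grad=True)
x = torch.tensor([3.0, 4.0])

# Forward pass only; each operation is recorded in a graph
loss = torch.sigmoid(w @ x + b)

loss.backward()         # gradients computed through the recorded operations
print(w.grad, b.grad)   # dloss/dw and dloss/db, no manual chain rule needed
```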
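Finally, the two models the assignment walkthrough defines would look roughly like the following in PyTorch. This is a sketch under my own assumptions (layer sizes, MNIST-style 784-dimensional inputs, and class names are all hypothetical; the actual Assignment 3 code is not reproduced in these notes). Note that nn.CrossEntropyLoss applies the softmax itself, so both modules return raw scores:

```python
import torch
import torch.nn as nn

class LinearSoftmax(nn.Module):
    """Linear softmax classifier: f = W x + b (softmax applied by the loss)."""
    def __init__(self, in_dim=784, num_classes=10):
        super().__init__()
        self.linear = nn.Linear(in_dim, num_classes)

    def forward(self, x):
        return self.linear(x.view(x.size(0), -1))   # flatten, then score

class TwoLayerNet(nn.Module):
    """Two-layer MLP: a_1 = sigmoid(W[1] x + b[1]); f = W[2] a_1 + b[2]."""
    def __init__(self, in_dim=784, hidden=128, num_classes=10):
        super().__init__()
        self.fc1 = nn.Linear(in_dim, hidden)
        self.fc2 = nn.Linear(hidden, num_classes)

    def forward(self, x):
        a1 = torch.sigmoid(self.fc1(x.view(x.size(0), -1)))
        return self.fc2(a1)
```

Either model drops into the SGD training loop sketched earlier unchanged, which is the point of the "improved" training loop slide: the loop no longer depends on the model.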
