Learning From Data, Lecture 20: Multilayer Perceptron

Multiple Layers · Universal Approximation · The Neural Network
M. Magdon-Ismail, CSCI 4100/6100

recap: Unsupervised Learning (2/18)
k-Means Clustering gives a 'hard' partition of the data into k clusters; the Gaussian Mixture Model gives a 'soft' probability density estimate P(x).
[Figure: a k-means partition shown next to a Gaussian mixture density P(x).]

The Neural Network: Biologically Inspired (3/18)
Engineering success may start with biological inspiration, but then take a totally different path.

Planes Don't Flap Wings to Fly (4/18)
Engineering success may start with biological inspiration, but then take a totally different path.

xor: A Limitation of the Linear Model (5/18)
[Figure: the xor target in the (x1, x2) plane; the +1 and −1 regions sit in diagonally opposite quadrants, so no single linear separator can classify them.]

Decomposing xor (6/18)
xor can be built from two linear separators:

    f = h1h̄2 + h̄1h2,    h1(x) = sign(w1ᵀx),    h2(x) = sign(w2ᵀx),

where the bar denotes negation, multiplication denotes and, and addition denotes or.

Perceptrons for or and and (7/18)
For ±1-valued inputs, or and and are themselves perceptrons:

    or(x1, x2) = sign(x1 + x2 + 1.5)
    and(x1, x2) = sign(x1 + x2 − 1.5)

Representing f Using or and and (8/18)
f = h1h̄2 + h̄1h2 is an or of two ands: an output node with bias +1.5 combines the two signals h1h̄2 and h̄1h2.

Representing f Using or and and (9/18)
Expand the ands: each and is a node with bias −1.5 that receives one of h1, h2 with weight +1 and the other with weight −1 (negating a ±1 signal is multiplying it by −1).

Representing f Using or and and (10/18)
Expand h1 and h2: each is itself a perceptron on the raw inputs, h1 = sign(w1ᵀx) and h2 = sign(w2ᵀx).

The Multilayer Perceptron (MLP) (11/18)
Chaining these perceptrons gives a layered network in which every node computes sign(wᵀx) on the outputs of the layer before it. More layers allow us to implement f; these additional layers are called hidden layers. A code sketch of the full xor network follows.
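Putting slides 6 through 11 together: the sketch below is our own illustrative Python, not from the lecture. For concreteness it fixes the two linear separators to h1 = sign(x1) and h2 = sign(x2), which realize the xor target of slide 5, whereas the slides leave the weight vectors w1, w2 general.

    import numpy as np

    def sign(s):
        # hard threshold with outputs in {-1, +1}
        return np.where(s >= 0, 1.0, -1.0)

    def xor_mlp(x1, x2):
        # Hidden layer: two linear separators (illustrative choice).
        h1 = sign(x1)
        h2 = sign(x2)
        # Second layer: the two ands of slide 9, bias -1.5,
        # with negation realized as a -1 weight.
        a1 = sign(h1 - h2 - 1.5)     # and(h1, not h2)
        a2 = sign(-h1 + h2 - 1.5)    # and(not h1, h2)
        # Output layer: the or of slide 8, bias +1.5.
        return sign(a1 + a2 + 1.5)   # f = h1*~h2 + ~h1*h2

    # f is +1 exactly when x1 and x2 have opposite signs,
    # matching the xor target of slide 5.
    for x in [(-1, 1), (1, 1), (-1, -1), (1, -1)]:
        print(x, xor_mlp(*x))        # +1, -1, -1, +1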
Universal Approximation (12/18)
Any target function f that can be decomposed into linear separators can be implemented by a 3-layer MLP.

Universal Approximation (13/18)
A sufficiently smooth separator can "essentially" be decomposed into linear separators.
[Figure: a circular target boundary approximated by 8 perceptrons and, more closely, by 16 perceptrons.]

Approximation Versus Generalization (14/18)
The size of the MLP controls the approximation-generalization tradeoff: more nodes per hidden layer ⇒ approximation ↑ and generalization ↓.

Minimizing Ein (15/18)
Minimizing Ein is a combinatorial problem, even harder for the MLP than for the single perceptron. Ein is not smooth (because of the sign function), so we cannot use gradient descent directly. Approximating the hard threshold by a soft one, sign(x) ≈ tanh(x), makes Ein differentiable, and gradient descent can then be used to minimize it.

The Neural Network (16/18)
Replacing every hard threshold with the soft threshold θ(s) turns the MLP into the neural network.
[Figure: the network h(x), with input layer ℓ = 0, hidden layers 0 < ℓ < L, and output layer ℓ = L; inset, the soft threshold θ(s) plotted against s.]

Zooming into a Hidden Node (17/18)
[Figure: a node in layer ℓ receives x(ℓ−1) through the weights W(ℓ), sums to the signal s(ℓ), and applies θ; its output feeds layer ℓ + 1 through W(ℓ+1).]

Layer ℓ parameters, for layers ℓ = 0, 1, 2, ..., L:

    dimension      layer ℓ has "dimension" d(ℓ), i.e. d(ℓ) + 1 nodes
    signals in     s(ℓ), a d(ℓ)-dimensional input vector
    outputs        x(ℓ), a (d(ℓ) + 1)-dimensional output vector
    weights in     W(ℓ), a (d(ℓ−1) + 1) × d(ℓ) matrix
    weights out    W(ℓ+1), a (d(ℓ) + 1) × d(ℓ+1) matrix

The Neural Network (18/18)
Biology → Engineering: the network began as a model of the biological neuron, but the engineered version stands on its own.
[Figure: the same network again, from input layer ℓ = 0 through hidden layers 0 < ℓ < L to output layer ℓ = L.]
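The bookkeeping on slide 17 translates directly into a forward pass: each layer prepends the constant node 1, forms the signal s(ℓ) = W(ℓ)ᵀ x(ℓ−1), and outputs θ(s(ℓ)). The sketch below is our own minimal Python rendering; the lecture specifies only the matrix dimensions, the choice θ = tanh follows the approximation on slide 15, and all names are ours.

    import numpy as np

    def forward(x, weights):
        # weights = [W1, ..., WL], where W(l) has shape
        # (d(l-1) + 1) x d(l), exactly as on slide 17.
        x_l = np.concatenate(([1.0], x))      # x(0): input plus constant node
        for W in weights:
            s_l = W.T @ x_l                   # signals in: s(l) = W(l)^T x(l-1)
            x_l = np.concatenate(([1.0], np.tanh(s_l)))  # outputs: x(l)
        return x_l[1]                         # h(x): the single output node

    # Example: d(0) = 2 inputs, one hidden layer with d(1) = 3, output d(2) = 1.
    rng = np.random.default_rng(0)
    weights = [rng.standard_normal((3, 3)),   # W(1): (d(0)+1) x d(1)
               rng.standard_normal((4, 1))]   # W(2): (d(1)+1) x d(2)
    print(forward(np.array([0.5, -0.2]), weights))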
