Learning From Data, Lecture 20: Multilayer Perceptron
Multiple Layers · Universal Approximation · The Neural Network
M. Magdon-Ismail, CSCI 4100/6100

Recap: Unsupervised Learning
k-Means Clustering: a 'hard' partition into k clusters.
Gaussian Mixture Model: a 'soft' probability density estimate P(x).
© AM L Creator: Malik Magdon-Ismail

The Neural Network - Biologically Inspired
Engineering success may start with biological inspiration, but then take a totally different path.
Planes Don't Flap Wings to Fly
xor: A Limitation of the Linear Model
[Figure: the xor target on the (x1, x2) plane: diagonally opposite +1 and −1 regions that no single linear separator can classify.]
Decomposing xor
f = h1h̄2 + h̄1h2,  where  h1(x) = sign(w1ᵀx)  and  h2(x) = sign(w2ᵀx)

[Figure: the xor target split into two regions, each carved out by one of the linear separators h1, h2.]
Perceptrons for or and and
or(x1, x2) = sign(x1 + x2 + 1.5)        and(x1, x2) = sign(x1 + x2 − 1.5)
[Figure: perceptron diagrams for or and and: unit weights on x1 and x2, bias +1.5 for or and −1.5 for and.]
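These two formulas can be checked over all four ±1 input pairs. A minimal sketch (one assumption: sign here maps nonpositive values to −1, since sign(0) is a convention):

```python
def sign(s):
    # Convention assumed here: sign(s) = -1 for s <= 0.
    return 1 if s > 0 else -1

def OR(x1, x2):            # unit weights, bias +1.5
    return sign(x1 + x2 + 1.5)

def AND(x1, x2):           # unit weights, bias -1.5
    return sign(x1 + x2 - 1.5)

# Verify against the boolean truth tables (+1 = true, -1 = false).
for x1 in (-1, 1):
    for x2 in (-1, 1):
        assert OR(x1, x2) == (1 if (x1 == 1 or x2 == 1) else -1)
        assert AND(x1, x2) == (1 if (x1 == 1 and x2 == 1) else -1)
print("or and and truth tables verified")
```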
Representing f Using or and and
f = h1h̄2 + h̄1h2
[Figure: an or perceptron (unit weights, bias +1.5) combines h1h̄2 and h̄1h2 to output f.]
Expanding the ands
f = h1h̄2 + h̄1h2

[Figure: each and is expanded into a perceptron with bias −1.5 and weights ±1, feeding the or node (bias +1.5) that outputs f.]
Expanding h1 and h2
f = h1h̄2 + h̄1h2

[Figure: the complete network: x1, x2 feed sign(w1ᵀx) and sign(w2ᵀx), whose outputs pass through the two and perceptrons (bias −1.5, weights ±1) and the final or perceptron (bias +1.5) to produce f.]
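The whole construction can be run end to end. The slide reads w1 and w2 off its figure; for concreteness this sketch assumes h1(x) = sign(x1) and h2(x) = sign(x2), which are illustrative choices (they make f the xor of the signs of x1 and x2):

```python
def sign(s):
    return 1 if s > 0 else -1

# Layer 1: two linear separators (illustrative w1, w2 as noted above).
def layer1(x1, x2):
    return sign(x1), sign(x2)          # h1(x), h2(x)

# Layer 2: the two ands (a -1 weight implements negation of a ±1 input).
def layer2(h1, h2):
    u1 = sign(h1 - h2 - 1.5)           # h1 AND (NOT h2)
    u2 = sign(-h1 + h2 - 1.5)          # (NOT h1) AND h2
    return u1, u2

# Layer 3: the or that combines them into f.
def f(x1, x2):
    u1, u2 = layer2(*layer1(x1, x2))
    return sign(u1 + u2 + 1.5)

for x in (-1, 1):
    for y in (-1, 1):
        print((x, y), "->", f(x, y))   # +1 exactly when x1, x2 differ in sign
```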
The Multilayer Perceptron (MLP)
[Figure: the same network for f, with every node computing sign(wᵀx) from a bias weight w0 and input weights w1, w2.]
More layers allow us to implement f. These additional layers are called hidden layers.
Universal Approximation
Any target function f that can be decomposed into linear separators can be implemented by a 3-layer MLP.
Circle Example
A sufficiently smooth separator can “essentially” be decomposed into linear separators.
[Figure: a circular +/− target, and its approximation by MLPs built from 8 and from 16 perceptrons; the 16-perceptron boundary hugs the circle more closely.]
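One concrete way to see this decomposition (a sketch, not necessarily the slide's exact construction): the and of m perceptrons, each tangent to the unit circle, carves out a regular m-gon that shrinks onto the disc as m grows. Estimating the disagreement by sampling:

```python
import math, random

def inside_polygon(x1, x2, m):
    """AND of m perceptrons, each a half-plane tangent to the unit circle."""
    for i in range(m):
        a = 2 * math.pi * i / m
        if x1 * math.cos(a) + x2 * math.sin(a) > 1:   # outside tangent line i
            return -1
    return 1

def target(x1, x2):                     # the circular +/- region
    return 1 if x1 * x1 + x2 * x2 <= 1 else -1

random.seed(0)
pts = [(random.uniform(-1.5, 1.5), random.uniform(-1.5, 1.5))
       for _ in range(100000)]
errs = {}
for m in (8, 16):
    errs[m] = sum(inside_polygon(x1, x2, m) != target(x1, x2)
                  for x1, x2 in pts) / len(pts)
    print(f"{m:2d} perceptrons: disagreement with the circle = {errs[m]:.4f}")
```

Doubling the number of perceptrons roughly quarters the disagreement, since the gap between an m-gon and the disc shrinks like 1/m².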
Approximation Versus Generalization
The size of the MLP controls the approximation-generalization tradeoff.
More nodes per hidden layer =⇒ better approximation but worse generalization.
Minimizing Ein
Minimizing Ein is a combinatorial problem, even harder for the MLP than for the single perceptron.
Ein is not smooth (because of the sign function), so we cannot use gradient descent directly. Replacing sign(x) ≈ tanh(x) gives a smooth approximation, and gradient descent can then minimize Ein.
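A minimal sketch of this idea on a single smoothed perceptron: the task (learning or with a squared-error Ein), the step size, and the iteration count are all illustrative choices, not the lecture's prescription.

```python
import math

def grad_Ein(w, data):
    """Gradient of Ein(w) = sum over data of (tanh(w·x) - y)^2."""
    g = [0.0] * len(w)
    for x, y in data:
        t = math.tanh(sum(wi * xi for wi, xi in zip(w, x)))
        c = 2 * (t - y) * (1 - t * t)    # uses tanh'(s) = 1 - tanh(s)^2
        for i, xi in enumerate(x):
            g[i] += c * xi
    return g

# Illustrative task: learn or(x1, x2), with x = (1, x1, x2) for the bias.
data = [((1, x1, x2), 1 if (x1 == 1 or x2 == 1) else -1)
        for x1 in (-1, 1) for x2 in (-1, 1)]
w = [0.1, 0.1, 0.1]
for _ in range(5000):                    # fixed step size, illustrative
    g = grad_Ein(w, data)
    w = [wi - 0.05 * gi for wi, gi in zip(w, g)]

# The hard-threshold version of the trained node now matches or().
for x, y in data:
    s = sum(wi * xi for wi, xi in zip(w, x))
    assert (1 if s > 0 else -1) == y
```

The smooth surrogate is what makes the weight update well defined; thresholding the trained node recovers the ±1 classifier.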
The Neural Network
[Figure: the network redrawn with soft-threshold nodes: inputs x1, ..., xd feed layers of nodes computing θ(s), producing the output h(x).]
input layer ℓ = 0        hidden layers 0 < ℓ < L        output layer ℓ = L

Zooming into a Hidden Node

[Figure: a node in layer ℓ: the outputs x(ℓ−1) of layer ℓ−1 enter through the weights W(ℓ) to form the signal s(ℓ); θ is applied; the result x(ℓ) exits through W(ℓ+1) to layer ℓ+1.]

layer ℓ parameters:
layers ℓ = 0, 1, 2, ..., L
layer ℓ has "dimension" d(ℓ) =⇒ d(ℓ) + 1 nodes
signals in: s(ℓ), a d(ℓ)-dimensional input vector
outputs: x(ℓ), a (d(ℓ) + 1)-dimensional output vector
weights in: W(ℓ) = [w1(ℓ) w2(ℓ) · · · wd(ℓ)(ℓ)], a (d(ℓ−1) + 1) × d(ℓ) matrix
weights out: W(ℓ+1), a (d(ℓ) + 1) × d(ℓ+1) matrix

The Neural Network

Biology −→ Engineering

[Figure: the network again, with inputs x1, ..., xd, hidden layers of θ nodes, and output h(x).]
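With these conventions, forward propagation computes x(0) = [1; x], then s(ℓ) = W(ℓ)ᵀ x(ℓ−1) and x(ℓ) = [1; θ(s(ℓ))] layer by layer, and h(x) is read off the output layer. A sketch with θ = tanh and random weights; the layer sizes are illustrative, and for simplicity the output layer is treated like any other layer with a single node:

```python
import numpy as np

rng = np.random.default_rng(0)

def forward(x, Ws, theta=np.tanh):
    """Forward propagation: x(0) = [1; x]; s(l) = W(l)ᵀ x(l-1); x(l) = [1; θ(s(l))]."""
    xl = np.concatenate(([1.0], x))                # x(0), d(0)+1 dimensional
    for W in Ws:
        s = W.T @ xl                               # s(l), d(l) dimensional
        xl = np.concatenate(([1.0], theta(s)))     # x(l), d(l)+1 dimensional
    return float(xl[1])                            # h(x): the single output node

# Illustrative architecture: d(0)=2, d(1)=3, d(2)=2, d(3)=1.
d = [2, 3, 2, 1]
# W(l) is (d(l-1) + 1) x d(l), matching the dimensions listed above.
Ws = [rng.standard_normal((d[l - 1] + 1, d[l])) for l in range(1, len(d))]
print(forward(np.array([0.5, -1.0]), Ws))
```

Because θ = tanh, the output always lies in (−1, 1); replacing θ with sign at every node would recover the hard-threshold MLP.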