Learning From Data, Lecture 20: Multilayer Perceptron
Multiple Layers · Universal Approximation · The Neural Network
M. Magdon-Ismail, CSCI 4100/6100

Recap: Unsupervised Learning
k-Means Clustering: a 'hard' partition into k clusters.
Gaussian Mixture Model: a 'soft' probability density estimate P(x).
© AM L Creator: Malik Magdon-Ismail

The Neural Network - Biologically Inspired
Engineering success may start with biological inspiration, but then take a totally different path.
Planes Don't Flap Wings to Fly
xor: A Limitation of the Linear Model
[Figure: the xor target on the (x1, x2) plane: diagonally opposite +1 and −1 regions that no single linear separator can classify.]
Decomposing xor
f = h1h̄2 + h̄1h2,  where  h1(x) = sign(w1ᵀx)  and  h2(x) = sign(w2ᵀx)

[Figure: the xor target split into two regions, each carved out by one of the linear separators h1, h2.]
Perceptrons for or and and
or(x1, x2) = sign(x1 + x2 + 1.5)        and(x1, x2) = sign(x1 + x2 − 1.5)
[Figure: perceptron diagrams for or and and: unit weights on x1 and x2, bias +1.5 for or and −1.5 for and.]
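These two formulas can be checked over all four ±1 input pairs. A minimal sketch (one assumption: sign here maps nonpositive values to −1, since sign(0) is a convention):

```python
def sign(s):
    # Convention assumed here: sign(s) = -1 for s <= 0.
    return 1 if s > 0 else -1

def OR(x1, x2):            # unit weights, bias +1.5
    return sign(x1 + x2 + 1.5)

def AND(x1, x2):           # unit weights, bias -1.5
    return sign(x1 + x2 - 1.5)

# Verify against the boolean truth tables (+1 = true, -1 = false).
for x1 in (-1, 1):
    for x2 in (-1, 1):
        assert OR(x1, x2) == (1 if (x1 == 1 or x2 == 1) else -1)
        assert AND(x1, x2) == (1 if (x1 == 1 and x2 == 1) else -1)
print("or and and truth tables verified")
```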
Representing f Using or and and
f = h1h̄2 + h̄1h2
[Figure: an or perceptron (unit weights, bias +1.5) combines h1h̄2 and h̄1h2 to output f.]
Expanding the ands
f = h1h̄2 + h̄1h2

[Figure: each and is expanded into a perceptron with bias −1.5 and weights ±1, feeding the or node (bias +1.5) that outputs f.]
Expanding h1 and h2
f = h1h̄2 + h̄1h2

[Figure: the complete network: x1, x2 feed sign(w1ᵀx) and sign(w2ᵀx), whose outputs pass through the two and perceptrons (bias −1.5, weights ±1) and the final or perceptron (bias +1.5) to produce f.]
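The whole construction can be run end to end. The slide reads w1 and w2 off its figure; for concreteness this sketch assumes h1(x) = sign(x1) and h2(x) = sign(x2), which are illustrative choices (they make f the xor of the signs of x1 and x2):

```python
def sign(s):
    return 1 if s > 0 else -1

# Layer 1: two linear separators (illustrative w1, w2 as noted above).
def layer1(x1, x2):
    return sign(x1), sign(x2)          # h1(x), h2(x)

# Layer 2: the two ands (a -1 weight implements negation of a ±1 input).
def layer2(h1, h2):
    u1 = sign(h1 - h2 - 1.5)           # h1 AND (NOT h2)
    u2 = sign(-h1 + h2 - 1.5)          # (NOT h1) AND h2
    return u1, u2

# Layer 3: the or that combines them into f.
def f(x1, x2):
    u1, u2 = layer2(*layer1(x1, x2))
    return sign(u1 + u2 + 1.5)

for x in (-1, 1):
    for y in (-1, 1):
        print((x, y), "->", f(x, y))   # +1 exactly when x1, x2 differ in sign
```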
The Multilayer Perceptron (MLP)
[Figure: the same network for f, with every node computing sign(wᵀx) from a bias weight w0 and input weights w1, w2.]
More layers allow us to implement f. These additional layers are called hidden layers.
Universal Approximation
Any target function f that can be decomposed into linear separators can be implemented by a 3-layer MLP.
Circle Example
A sufficiently smooth separator can “essentially” be decomposed into linear separators.
[Figure: a circular +/− target, and its approximation by MLPs built from 8 and from 16 perceptrons; the 16-perceptron boundary hugs the circle more closely.]
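One concrete way to see this decomposition (a sketch, not necessarily the slide's exact construction): the and of m perceptrons, each tangent to the unit circle, carves out a regular m-gon that shrinks onto the disc as m grows. Estimating the disagreement by sampling:

```python
import math, random

def inside_polygon(x1, x2, m):
    """AND of m perceptrons, each a half-plane tangent to the unit circle."""
    for i in range(m):
        a = 2 * math.pi * i / m
        if x1 * math.cos(a) + x2 * math.sin(a) > 1:   # outside tangent line i
            return -1
    return 1

def target(x1, x2):                     # the circular +/- region
    return 1 if x1 * x1 + x2 * x2 <= 1 else -1

random.seed(0)
pts = [(random.uniform(-1.5, 1.5), random.uniform(-1.5, 1.5))
       for _ in range(100000)]
errs = {}
for m in (8, 16):
    errs[m] = sum(inside_polygon(x1, x2, m) != target(x1, x2)
                  for x1, x2 in pts) / len(pts)
    print(f"{m:2d} perceptrons: disagreement with the circle = {errs[m]:.4f}")
```

Doubling the number of perceptrons roughly quarters the disagreement, since the gap between an m-gon and the disc shrinks like 1/m².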
Approximation Versus Generalization
The size of the MLP controls the approximation-generalization tradeoff.
More nodes per hidden layer =⇒ better approximation but worse generalization.
Minimizing Ein
Minimizing Ein is a combinatorial problem, even harder for the MLP than for the single perceptron.
Ein is not smooth (because of the sign function), so we cannot use gradient descent directly. Replacing sign(x) ≈ tanh(x) gives a smooth approximation, and gradient descent can then minimize Ein.
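A minimal sketch of this idea on a single smoothed perceptron: the task (learning or with a squared-error Ein), the step size, and the iteration count are all illustrative choices, not the lecture's prescription.

```python
import math

def grad_Ein(w, data):
    """Gradient of Ein(w) = sum over data of (tanh(w·x) - y)^2."""
    g = [0.0] * len(w)
    for x, y in data:
        t = math.tanh(sum(wi * xi for wi, xi in zip(w, x)))
        c = 2 * (t - y) * (1 - t * t)    # uses tanh'(s) = 1 - tanh(s)^2
        for i, xi in enumerate(x):
            g[i] += c * xi
    return g

# Illustrative task: learn or(x1, x2), with x = (1, x1, x2) for the bias.
data = [((1, x1, x2), 1 if (x1 == 1 or x2 == 1) else -1)
        for x1 in (-1, 1) for x2 in (-1, 1)]
w = [0.1, 0.1, 0.1]
for _ in range(5000):                    # fixed step size, illustrative
    g = grad_Ein(w, data)
    w = [wi - 0.05 * gi for wi, gi in zip(w, g)]

# The hard-threshold version of the trained node now matches or().
for x, y in data:
    s = sum(wi * xi for wi, xi in zip(w, x))
    assert (1 if s > 0 else -1) == y
```

The smooth surrogate is what makes the weight update well defined; thresholding the trained node recovers the ±1 classifier.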
The Neural Network
[Figure: the network redrawn with soft-threshold nodes: inputs x1, ..., xd feed layers of nodes computing θ(s), producing the output h(x).]
input layer ℓ = 0        hidden layers 0 < ℓ < L        output layer ℓ = L

Zooming into a Hidden Node

[Figure: a node in layer ℓ: the outputs x(ℓ−1) of layer ℓ−1 enter through the weights W(ℓ) to form the signal s(ℓ); θ is applied; the result x(ℓ) exits through W(ℓ+1) to layer ℓ+1.]

layer ℓ parameters:
layers ℓ = 0, 1, 2, ..., L
layer ℓ has "dimension" d(ℓ) =⇒ d(ℓ) + 1 nodes
signals in: s(ℓ), a d(ℓ)-dimensional input vector
outputs: x(ℓ), a (d(ℓ) + 1)-dimensional output vector
weights in: W(ℓ) = [w1(ℓ) w2(ℓ) · · · wd(ℓ)(ℓ)], a (d(ℓ−1) + 1) × d(ℓ) matrix
weights out: W(ℓ+1), a (d(ℓ) + 1) × d(ℓ+1) matrix

The Neural Network

Biology −→ Engineering

[Figure: the network again, with inputs x1, ..., xd, hidden layers of θ nodes, and output h(x).]
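With these conventions, forward propagation computes x(0) = [1; x], then s(ℓ) = W(ℓ)ᵀ x(ℓ−1) and x(ℓ) = [1; θ(s(ℓ))] layer by layer, and h(x) is read off the output layer. A sketch with θ = tanh and random weights; the layer sizes are illustrative, and for simplicity the output layer is treated like any other layer with a single node:

```python
import numpy as np

rng = np.random.default_rng(0)

def forward(x, Ws, theta=np.tanh):
    """Forward propagation: x(0) = [1; x]; s(l) = W(l)ᵀ x(l-1); x(l) = [1; θ(s(l))]."""
    xl = np.concatenate(([1.0], x))                # x(0), d(0)+1 dimensional
    for W in Ws:
        s = W.T @ xl                               # s(l), d(l) dimensional
        xl = np.concatenate(([1.0], theta(s)))     # x(l), d(l)+1 dimensional
    return float(xl[1])                            # h(x): the single output node

# Illustrative architecture: d(0)=2, d(1)=3, d(2)=2, d(3)=1.
d = [2, 3, 2, 1]
# W(l) is (d(l-1) + 1) x d(l), matching the dimensions listed above.
Ws = [rng.standard_normal((d[l - 1] + 1, d[l])) for l in range(1, len(d))]
print(forward(np.array([0.5, -1.0]), Ws))
```

Because θ = tanh, the output always lies in (−1, 1); replacing θ with sign at every node would recover the hard-threshold MLP.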