
Learning From Data, Lecture 20: The Multilayer Perceptron

Multiple Layers
Universal Approximation
The Neural Network

M. Magdon-Ismail, CSCI 4100/6100

Recap: k-Means Clustering and the Gaussian Mixture Model

[Figure: a density estimate P(x) versus x]

k-Means Clustering: 'hard' partition into k clusters.
Gaussian Mixture Model: 'soft' probability density estimation.

Creator: Malik Magdon-Ismail (Multilayer Perceptron, 18 slides)

The Neural Network: Biologically Inspired

Engineering success may start with biological inspiration, but then take a totally different path.

Planes Don't Flap Wings to Fly

Engineering success may start with biological inspiration, but then take a totally different path.

xor: A Limitation of the Linear Model

[Figure: the xor target on the (x1, x2) plane: +1 in two opposite quadrants, −1 in the other two. No single linear separator classifies all four regions correctly.]

Decomposing xor

f = h1h̄2 + h̄1h2

[Figure: the xor regions split into two linearly separable pieces, one per hidden hypothesis]

h1(x) = sign(w1ᵀx)        h2(x) = sign(w2ᵀx)

Perceptrons for or and and

or(x1, x2) = sign(x1 + x2 + 1.5)
and(x1, x2) = sign(x1 + x2 − 1.5)

[Figure: two perceptron diagrams. Inputs x1, x2 each enter with weight 1; the bias weight is +1.5 for or(x1, x2) and −1.5 for and(x1, x2).]
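These two gates can be checked directly; a minimal sketch in Python (treating sign(0) as +1 is a convention chosen here):

```python
import numpy as np

def sign(s):
    # sign activation; treat 0 as +1 for definiteness
    return np.where(s >= 0, 1, -1)

def perceptron(x, w):
    # w = [bias, w1, w2]; inputs x1, x2 take values in {-1, +1}
    return sign(w[0] + w[1] * x[0] + w[2] * x[1])

OR_W  = np.array([ 1.5, 1.0, 1.0])   # or(x1, x2)  = sign(x1 + x2 + 1.5)
AND_W = np.array([-1.5, 1.0, 1.0])   # and(x1, x2) = sign(x1 + x2 - 1.5)

for x1 in (-1, 1):
    for x2 in (-1, 1):
        print(x1, x2, perceptron((x1, x2), OR_W), perceptron((x1, x2), AND_W))
```

With the ±1 encoding, or is −1 only when both inputs are −1, and and is +1 only when both inputs are +1, which is exactly what the two bias values enforce.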

Representing f Using or and and

f = h1h̄2 + h̄1h2

[Figure: an or-perceptron combining the two terms: inputs h1h̄2 and h̄1h2 with weight 1 each, bias weight +1.5, output f.]

Representing f Using or and and: Expanding the ands

f = h1h̄2 + h̄1h2

[Figure: the or-node's two inputs replaced by and-perceptrons: and(h1, h̄2) with weights (−1.5, 1, −1) and and(h̄1, h2) with weights (−1.5, −1, 1), feeding the or-node with weights (1.5, 1, 1) to produce f.]

Representing f Using or and and: Expanding h1 and h2

f = h1h̄2 + h̄1h2

[Figure: the full network: inputs x1, x2 feed two first-layer perceptrons computing sign(w1ᵀx) and sign(w2ᵀx), whose outputs feed the two and-nodes and then the or-node that outputs f.]

The Multilayer Perceptron (MLP)

[Figure: the same network, with a zoom into a single perceptron node: inputs 1, x1, x2 with weights w0, w1, w2 producing sign(wᵀx).]

More layers allow us to implement f. These additional layers are called hidden layers.
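Chaining the pieces above gives a 3-layer MLP that computes xor. The first-layer choice below (h1 = sign(x1), h2 = sign(x2)) is one convenient assumption; the slides leave w1, w2 generic, and any pair satisfying f = h1h̄2 + h̄1h2 works:

```python
import numpy as np

def sign(s):
    return np.where(s >= 0, 1, -1)

def mlp_xor(x1, x2):
    # Layer 1: two linear separators (this particular w1, w2 is an assumption)
    h1 = sign(x1)                      # h1(x) = sign(w1' x), w1 = (0, 1, 0)
    h2 = sign(x2)                      # h2(x) = sign(w2' x), w2 = (0, 0, 1)
    # Layer 2: the two ands, and(h1, -h2) and and(-h1, h2)
    a1 = sign( h1 - h2 - 1.5)          # +1 iff h1 = +1 and h2 = -1
    a2 = sign(-h1 + h2 - 1.5)          # +1 iff h1 = -1 and h2 = +1
    # Layer 3: or the two ands together
    return sign(a1 + a2 + 1.5)

for x1 in (-1, 1):
    for x2 in (-1, 1):
        print(x1, x2, '->', mlp_xor(x1, x2))
```

The two middle nodes are the hidden layer: they compute the terms h1h̄2 and h̄1h2 that no single perceptron on x1, x2 can represent.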

Universal Approximation

Any target function f that can be decomposed into linear separators can be implemented by a 3-layer MLP.

Universal Approximation: Circle Example

A sufficiently smooth separator can “essentially” be decomposed into linear separators.

[Figure: a circular +1 region inside a −1 background (the target), next to its approximations by MLPs built from 8 perceptrons and from 16 perceptrons.]
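One way to see the shrinking error is to build such an approximation explicitly. The sketch below (an illustrative construction, not necessarily the one behind the slide's figure) ands together m perceptrons whose decision lines are tangent to the circle, and estimates the disagreement with the circular target on random points:

```python
import numpy as np

rng = np.random.default_rng(0)

def target(X, r=1.0):
    # circular target: +1 inside radius r, -1 outside
    return np.where(np.linalg.norm(X, axis=1) <= r, 1, -1)

def polygon_mlp(X, m, r=1.0):
    # AND of m perceptrons, each a tangent line to the circle:
    # h_k(x) = sign(r - u_k . x), with unit normals u_k at equal angles.
    angles = 2 * np.pi * np.arange(m) / m
    U = np.stack([np.cos(angles), np.sin(angles)], axis=1)   # (m, 2)
    H = np.where(X @ U.T <= r, 1, -1)                        # (n, m)
    return np.where(H.sum(axis=1) == m, 1, -1)               # and of all m

X = rng.uniform(-2, 2, size=(100000, 2))
y = target(X)
for m in (8, 16):
    err = np.mean(polygon_mlp(X, m) != y)
    print(f"{m} perceptrons: disagreement = {err:.4f}")
```

The approximating region is the circumscribed m-gon, so the disagreement is the thin sliver between polygon and circle, and doubling m roughly quarters it.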

Approximation Versus Generalization

The size of the MLP controls the approximation-generalization tradeoff.

More nodes per hidden layer =⇒ approximation↑ and generalization↓

Minimizing Ein

Minimizing Ein is a combinatorial problem, and it is even harder for the MLP than for the single perceptron.

Ein is not smooth (due to the sign function), so we cannot use gradient descent to minimize it.

Approximating sign(x) ≈ tanh(x) gives a smooth Ein that we can minimize by gradient descent.
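The substitution sign(s) ≈ tanh(s) is what makes gradient descent possible: tanh agrees with sign for large |s| but is differentiable everywhere. A small sketch (the squared-error form is one common choice, assumed here):

```python
import numpy as np

# tanh is a smooth surrogate for sign:
# tanh(s) -> +/-1 as |s| grows, and d/ds tanh(s) = 1 - tanh(s)^2.
s = np.linspace(-4, 4, 9)
print(np.round(np.tanh(s), 3))
print(np.sign(s))

def error_and_gradient(w, x, y):
    # per-example smooth error e = (tanh(w.x) - y)^2 and its gradient in w
    t = np.tanh(w @ x)
    e = (t - y) ** 2
    grad = 2 * (t - y) * (1 - t ** 2) * x   # chain rule through tanh
    return e, grad
```

Because the gradient exists everywhere, a step w ← w − η·grad decreases this smooth Ein, which sign's flat-with-jumps error surface never allows.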

The Neural Network

[Figure: the neural network. Inputs x1, ..., xd (plus a constant-1 node in each layer) feed hidden layers of nodes, each applying a soft threshold θ(s) to its incoming signal s; the final node outputs h(x). Input layer: ℓ = 0; hidden layers: 0 < ℓ < L; output layer: ℓ = L.]

Zooming into a Hidden Node

[Figure: zooming into one hidden node in layer ℓ. The outputs x(ℓ−1) of layer ℓ−1 are combined through the incoming weights W(ℓ) into the signal s(ℓ); the node applies θ to produce the output x(ℓ), which feeds layer ℓ+1 through the outgoing weights W(ℓ+1).]

Layer ℓ parameters (layers ℓ = 0, 1, 2, ..., L; layer ℓ has "dimension" d(ℓ) =⇒ d(ℓ) + 1 nodes):

signals in:    s(ℓ), a d(ℓ)-dimensional input vector
outputs:       x(ℓ), a (d(ℓ) + 1)-dimensional output vector
weights in:    W(ℓ) = [w1(ℓ) w2(ℓ) ··· wd(ℓ)(ℓ)], a (d(ℓ−1) + 1) × d(ℓ) dimensional matrix
weights out:   W(ℓ+1), a (d(ℓ) + 1) × d(ℓ+1) dimensional matrix
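With these conventions, forward propagation is a few lines. A sketch assuming θ = tanh and that the output node also applies θ (a modeling choice):

```python
import numpy as np

def forward(x, Ws, theta=np.tanh):
    """Forward propagation through the network.

    Ws[l-1] is W(l), a (d(l-1)+1) x d(l) matrix, so that
      s(l) = W(l)^T x(l-1)   and   x(l) = [1; theta(s(l))].
    """
    xl = np.concatenate(([1.0], x))              # x(0): prepend the bias node
    for W in Ws:
        s = W.T @ xl                             # signals into this layer
        xl = np.concatenate(([1.0], theta(s)))   # outputs, with bias node
    return xl[1:]                                # h(x): drop the bias coordinate

# A tiny 2-3-1 network with random weights; the dimensions are the point:
# W(1) is (2+1) x 3 and W(2) is (3+1) x 1, matching the table above.
rng = np.random.default_rng(1)
Ws = [rng.normal(size=(3, 3)), rng.normal(size=(4, 1))]
print(forward(np.array([0.5, -0.2]), Ws))
```

Each matrix multiply consumes the previous layer's (d(ℓ−1)+1)-vector and produces the d(ℓ) signals, exactly as the dimensions in the table require.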

The Neural Network

Biology −→ Engineering

[Figure: the neural network diagram, as on the previous slide.]
