
Introduction to Neural Computation

Prof. Michale Fee MIT BCS 9.40 — 2018

Lecture 16: Networks, Matrices, and Basis Sets
Seeing in high dimensions

[Screenshot: Google Embedding Projector (https://research.googleblog.com/2016/12/open-sourcing-embedding-projector-tool.html). Image © Embedding Projector Project; excluded from the OCW Creative Commons license (see https://ocw.mit.edu/help/faq-fair-use/).]


Learning Objectives for Lecture 16

• More on two-layer feed-forward networks

• Matrix transformations (rotated transformations)

• Basis sets

• Linear independence

• Change of basis

Two-layer feed-forward network

• We can expand our set of output neurons to make a more general network…

input firing rates (indexed by $b = 1, 2, 3, 4, \dots$): $\vec{u} = \left[\, u_1, u_2, u_3, \dots, u_{n_b} \right]$

Lots of synaptic weights! $W_{ab}$

output firing rates (indexed by $a = 1, 2, 3, 4, \dots$): $\vec{v} = \left[\, v_1, v_2, v_3, \dots, v_{n_a} \right]$

Two-layer feed-forward network

• We now have a weight from each of our input neurons onto each of our output neurons!

• We write the weights as a matrix.

weight matrix:
$$W_{ab} = \begin{pmatrix} w_{11} & w_{12} & w_{13} \\ w_{21} & w_{22} & w_{23} \\ w_{31} & w_{32} & w_{33} \end{pmatrix} = \begin{pmatrix} \vec{w}_{a=1} \\ \vec{w}_{a=2} \\ \vec{w}_{a=3} \end{pmatrix}$$
The row index $a$ labels the postsynaptic (output) neuron; the column index $b$ labels the presynaptic (input) neuron.

Two-layer feed-forward network

• We can write down the firing rates of our output neurons as a matrix multiplication:
$$\vec{v} = W\vec{u} \qquad\qquad v_a = \sum_b W_{ab}\, u_b$$


  ⎡ ⎤ ⎡ ⎤ ⎡ ⎤ ⎡ ⎤ v1 w11 w12 w13 u1 wa =1 ⋅u ⎢ ⎥ ⎢ ⎥⎢ ⎥ ⎢   ⎥  v = ⎢ v2 ⎥ = ⎢ w21 w22 w23 ⎥⎢ u2 ⎥ = ⎢ wa =2 ⋅ u ⎥ ⎢ ⎥ ⎢ ⎥ ⎢ ⎥ ⎢   ⎥ w31 w32 w33 u3 v3 ⎣ ⎦ ⎣ ⎦ wa =3 ⋅u ⎣ ⎦ ⎣ ⎦

This is the 'row' (dot product) interpretation of matrix multiplication: each output firing rate is the dot product of one row of $W$ with the input vector $\vec{u}$.

Two-layer feed-forward network

• There is another way to think about what the weight matrix means…

$$\vec{v} = W\vec{u} = \begin{pmatrix} w_{11} & w_{12} & w_{13} \\ w_{21} & w_{22} & w_{23} \\ w_{31} & w_{32} & w_{33} \end{pmatrix} \begin{pmatrix} u_1 \\ u_2 \\ u_3 \end{pmatrix} \qquad W = \begin{pmatrix} \vec{w}^{(1)} & \vec{w}^{(2)} & \vec{w}^{(3)} \end{pmatrix}$$
Each column $\vec{w}^{(b)}$ is the vector of weights from input neuron $b$ onto all of the output neurons. For example:
$$W = \begin{pmatrix} 1 & 0 & 1 \\ 1 & 1 & 1 \\ 0 & 0 & 1 \end{pmatrix}$$

Two-layer feed-forward network

• There is another way to think about what the weight matrix means…

$$\vec{v} = W\vec{u} = \begin{pmatrix} \vec{w}^{(1)} & \vec{w}^{(2)} & \vec{w}^{(3)} \end{pmatrix} \vec{u}$$

• What is the output if only input neuron 1 is active?

$$\vec{v} = \begin{pmatrix} w_{11} & w_{12} & w_{13} \\ w_{21} & w_{22} & w_{23} \\ w_{31} & w_{32} & w_{33} \end{pmatrix} \begin{pmatrix} u_1 \\ 0 \\ 0 \end{pmatrix} = u_1 \begin{pmatrix} w_{11} \\ w_{21} \\ w_{31} \end{pmatrix} = u_1 \vec{w}^{(1)} = u_1 \begin{pmatrix} 1 \\ 1 \\ 0 \end{pmatrix}$$

Two-layer feed-forward network

• More generally, with all inputs active:
$$\vec{v} = u_1 \begin{pmatrix} w_{11} \\ w_{21} \\ w_{31} \end{pmatrix} + u_2 \begin{pmatrix} w_{12} \\ w_{22} \\ w_{32} \end{pmatrix} + u_3 \begin{pmatrix} w_{13} \\ w_{23} \\ w_{33} \end{pmatrix} = u_1\vec{w}^{(1)} + u_2\vec{w}^{(2)} + u_3\vec{w}^{(3)}$$

The output pattern is a weighted sum of contributions from each of the input neurons!
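To make the two pictures concrete, here is a minimal NumPy sketch (not part of the original slides; the input firing rates are arbitrary illustrative values) checking that the row (dot product) view and the column (weighted sum) view give the same output:

```python
import numpy as np

# Example weight matrix from the slides: column b holds the weights
# from input neuron b onto all three output neurons.
W = np.array([[1, 0, 1],
              [1, 1, 1],
              [0, 0, 1]], dtype=float)

u = np.array([2.0, 0.5, 1.0])    # input firing rates (hypothetical values)

v = W @ u                        # output firing rates, v = W u

# Row view: each output is the dot product of one row of W with u.
v_rows = np.array([W[a, :] @ u for a in range(W.shape[0])])

# Column view: the output is a weighted sum of the columns of W.
v_cols = sum(u[b] * W[:, b] for b in range(W.shape[1]))

assert np.allclose(v, v_rows) and np.allclose(v, v_cols)
print(v)                         # [3.  3.5 1. ]
```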

Examples of simple networks

• Each input neuron connects to one neuron in the output layer, with a weight of one.
$$W = I = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}$$

$$\vec{v} = W\vec{u} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} u_1 \\ u_2 \\ u_3 \end{pmatrix} = \begin{pmatrix} u_1 \\ u_2 \\ u_3 \end{pmatrix} \qquad \vec{v} = \vec{u}$$

Examples of simple networks

• Each input neuron connects to one neuron in the output layer, with an arbitrary weight.
$$W = \Lambda = \begin{pmatrix} \lambda_1 & 0 & 0 \\ 0 & \lambda_2 & 0 \\ 0 & 0 & \lambda_3 \end{pmatrix}$$

$$\vec{v} = \Lambda\vec{u} = \begin{pmatrix} \lambda_1 & 0 & 0 \\ 0 & \lambda_2 & 0 \\ 0 & 0 & \lambda_3 \end{pmatrix} \begin{pmatrix} u_1 \\ u_2 \\ u_3 \end{pmatrix} = \begin{pmatrix} \lambda_1 u_1 \\ \lambda_2 u_2 \\ \lambda_3 u_3 \end{pmatrix}$$

Examples of simple networks

• Input neurons connect to output neurons with a weight matrix that corresponds to a rotation.


$$W = \Phi \qquad \Phi = \begin{pmatrix} \cos\phi & -\sin\phi \\ \sin\phi & \cos\phi \end{pmatrix}$$

$$\vec{v} = \Phi\vec{u} = \begin{pmatrix} \cos\phi & -\sin\phi \\ \sin\phi & \cos\phi \end{pmatrix} \begin{pmatrix} u_1 \\ u_2 \end{pmatrix} = \begin{pmatrix} u_1\cos\phi - u_2\sin\phi \\ u_1\sin\phi + u_2\cos\phi \end{pmatrix}$$

Examples of simple networks

• Let’s look at an example rotation matrix (ϕ=-45°)

$$\Phi(-45^\circ) = \begin{pmatrix} \cos(-\tfrac{\pi}{4}) & -\sin(-\tfrac{\pi}{4}) \\ \sin(-\tfrac{\pi}{4}) & \cos(-\tfrac{\pi}{4}) \end{pmatrix} = \begin{pmatrix} \tfrac{1}{\sqrt{2}} & \tfrac{1}{\sqrt{2}} \\ -\tfrac{1}{\sqrt{2}} & \tfrac{1}{\sqrt{2}} \end{pmatrix}$$

$$\vec{v} = \Phi\vec{u} = \begin{pmatrix} \tfrac{1}{\sqrt{2}} & \tfrac{1}{\sqrt{2}} \\ -\tfrac{1}{\sqrt{2}} & \tfrac{1}{\sqrt{2}} \end{pmatrix} \begin{pmatrix} u_1 \\ u_2 \end{pmatrix} = \frac{1}{\sqrt{2}} \begin{pmatrix} u_2 + u_1 \\ u_2 - u_1 \end{pmatrix}$$
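A short NumPy sketch of this rotation network (illustrative, not from the slides): the weight matrix $\Phi(-45^\circ)$ maps a vector lying along the 45° diagonal onto the first output axis.

```python
import numpy as np

def rotation(phi_deg):
    """2-D rotation matrix [[cos, -sin], [sin, cos]]."""
    phi = np.deg2rad(phi_deg)
    return np.array([[np.cos(phi), -np.sin(phi)],
                     [np.sin(phi),  np.cos(phi)]])

Phi = rotation(-45.0)            # the weight matrix of the network
u = np.array([1.0, 1.0])         # an input along the 45-degree diagonal

print(Phi @ u)                   # ~[1.414, 0.0]: the diagonal lands on the v1 axis
```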

Examples of simple networks

• Rotation matrices can be very useful when different directions in feature space carry different useful information.

$$\Phi(-45^\circ) = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ -1 & 1 \end{pmatrix}$$

[Figure: cats and dogs plotted in input coordinates ($u_1$ = 'doggy breath', $u_2$ = 'furry') and, after the rotation $\Phi$, in output coordinates $(v_1, v_2)$.]

Examples of simple networks

• Rotation matrices can be very useful when different directions in feature space carry different useful information.
$$\vec{v} = \frac{1}{\sqrt{2}} \begin{pmatrix} u_1 + u_2 \\ u_2 - u_1 \end{pmatrix}$$

[Figure: light intensities at two wavelengths, $I(\lambda_1)$ and $I(\lambda_2)$, rotate into 'brightness' and 'color' axes $(v_1, v_2)$; right- and left-ear amplitudes $A_R$ and $A_L$ rotate into 'loudness' and 'elevation' axes.]


Matrix transformations

• In general, a matrix $A$ maps the set of vectors in $\mathbb{R}^2$ onto another set of vectors in $\mathbb{R}^2$:
$$\vec{y} = A\vec{x}$$

Matrix transformations

• In general, a matrix $A$ maps the set of vectors in $\mathbb{R}^2$ onto another set of vectors in $\mathbb{R}^2$:
$$\vec{y} = A\vec{x}$$

• The inverse matrix $A^{-1}$ maps back:
$$\vec{x} = A^{-1}\vec{y}$$

Matrix transformations ($\vec{y} = A\vec{x}$)

• Perturbations from the identity:

$$A = \begin{pmatrix} 1+\delta & 0 \\ 0 & 1+\delta \end{pmatrix} \qquad A = \begin{pmatrix} 1+\delta & 0 \\ 0 & 1 \end{pmatrix} \qquad A = \begin{pmatrix} 1 & 0 \\ 0 & 1+\delta \end{pmatrix} \qquad A = \begin{pmatrix} 1+\delta & 0 \\ 0 & 1-\delta \end{pmatrix}$$

$$A = \begin{pmatrix} -1 & 0 \\ 0 & 1 \end{pmatrix} \qquad A = \begin{pmatrix} -1 & 0 \\ 0 & -1 \end{pmatrix}$$

These are all diagonal matrices:
$$\Lambda = \begin{pmatrix} a & 0 \\ 0 & b \end{pmatrix}$$

$$\Lambda^{-1} = \begin{pmatrix} a^{-1} & 0 \\ 0 & b^{-1} \end{pmatrix}$$

Rotation matrix

• Rotation in 2 dimensions:
$$\Phi(\theta) = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}$$

[Figure: rotations by θ = 10°, 25°, 45°, 90°.]

• Does a rotation matrix have an inverse? $\det(\Phi) = 1$, so yes.

• The inverse of a rotation matrix is just its transpose:

$$\Phi^{-1}(\theta) = \Phi(-\theta) = \Phi^T(\theta)$$

Rotated transformations

• Let's construct a matrix that produces a stretch along a 45° angle…

$$\Phi = \begin{pmatrix} \cos 45^\circ & -\sin 45^\circ \\ \sin 45^\circ & \cos 45^\circ \end{pmatrix} = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & -1 \\ 1 & 1 \end{pmatrix}$$

[Figure: apply $\Phi^T = \Phi(-45^\circ)$ to rotate the $45^\circ$ direction onto the x-axis, stretch with $\Lambda$, then rotate back with $\Phi(+45^\circ)$.]

• We do each of these steps by multiplying our matrices together

$$\Phi\,\Lambda\,\Phi^T\,\vec{x}$$

Rotated transformations

• Let's construct a matrix that produces a stretch along a 45° angle…
$$\vec{x} \;\rightarrow\; \Phi^T\vec{x} \;\rightarrow\; \Lambda\Phi^T\vec{x} \;\rightarrow\; \Phi\Lambda\Phi^T\vec{x}$$

$$\Phi^T = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ -1 & 1 \end{pmatrix} \qquad \Lambda = \begin{pmatrix} 2 & 0 \\ 0 & 1 \end{pmatrix} \qquad \Phi = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & -1 \\ 1 & 1 \end{pmatrix}$$

$$\Phi\Lambda\Phi^T = \frac{1}{2} \begin{pmatrix} 3 & 1 \\ 1 & 3 \end{pmatrix}$$
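A quick NumPy check of this construction (a sketch, not part of the lecture): $\Phi\Lambda\Phi^T$ stretches the $+45^\circ$ direction by 2 and leaves the $-45^\circ$ direction unchanged.

```python
import numpy as np

phi = np.deg2rad(45.0)
Phi = np.array([[np.cos(phi), -np.sin(phi)],
                [np.sin(phi),  np.cos(phi)]])
Lam = np.diag([2.0, 1.0])            # stretch by 2 along the first axis

A = Phi @ Lam @ Phi.T                # rotate, stretch, rotate back
print(A)                             # [[1.5 0.5], [0.5 1.5]] = (1/2)[[3,1],[1,3]]

print(A @ np.array([1.0, 1.0]))      # [2. 2.]: the +45-degree direction is doubled
print(A @ np.array([1.0, -1.0]))     # [ 1. -1.]: the -45-degree direction is unchanged
```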

Inverse of matrix products

• We can undo our transformation by taking the inverse:
$$\left[\Phi\Lambda\Phi^T\right]^{-1}$$

• How do you take the inverse of a sequence of matrix multiplications $ABC$?
$$[ABC]^{-1}ABC = C^{-1}B^{-1}A^{-1}ABC = C^{-1}B^{-1}BC = C^{-1}C = I \qquad\Rightarrow\qquad [ABC]^{-1} = C^{-1}B^{-1}A^{-1}$$

• Thus…
$$\left[\Phi\Lambda\Phi^T\right]^{-1} = \left[\Phi^T\right]^{-1}\Lambda^{-1}\Phi^{-1}$$

$$\left[\Phi\Lambda\Phi^T\right]^{-1} = \Phi\Lambda^{-1}\Phi^T \qquad \Lambda = \begin{pmatrix} 2 & 0 \\ 0 & 1 \end{pmatrix} \quad\Rightarrow\quad \Lambda^{-1} = \begin{pmatrix} 1/2 & 0 \\ 0 & 1 \end{pmatrix}$$
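A tiny NumPy check (random illustrative matrices, not from the lecture) of the inverse-of-a-product rule used in this step:

```python
import numpy as np

rng = np.random.default_rng(0)
A, B, C = (rng.standard_normal((2, 2)) for _ in range(3))   # generic invertible matrices

lhs = np.linalg.inv(A @ B @ C)
rhs = np.linalg.inv(C) @ np.linalg.inv(B) @ np.linalg.inv(A)
print(np.allclose(lhs, rhs))         # True: (ABC)^-1 = C^-1 B^-1 A^-1
```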

Rotated transformations

• Let's construct a matrix that undoes a stretch along a 45° angle…
$$\vec{x} \;\rightarrow\; \Phi^T\vec{x} \;\rightarrow\; \Lambda^{-1}\Phi^T\vec{x} \;\rightarrow\; \Phi\Lambda^{-1}\Phi^T\vec{x}$$

$$\Phi^T = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ -1 & 1 \end{pmatrix} \qquad \Lambda^{-1} = \begin{pmatrix} 1/2 & 0 \\ 0 & 1 \end{pmatrix} \qquad \Phi(+45^\circ) = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & -1 \\ 1 & 1 \end{pmatrix}$$

$$\Phi\Lambda^{-1}\Phi^T = \frac{1}{4} \begin{pmatrix} 3 & -1 \\ -1 & 3 \end{pmatrix}$$

Rotated transformations

• Construct a matrix that does compression along a -45° angle…

Here $\Phi = \Phi(-45^\circ)$, so $\Phi^T$ rotates by $+45^\circ$, $\Lambda$ compresses by a factor of 0.2 along the x-axis, and $\Phi$ rotates back by $-45^\circ$:
$$\Phi^T = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & -1 \\ 1 & 1 \end{pmatrix} \qquad \Lambda = \begin{pmatrix} 0.2 & 0 \\ 0 & 1 \end{pmatrix} \qquad \Phi = \Phi(-45^\circ) = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ -1 & 1 \end{pmatrix}$$

$$\Phi\Lambda\Phi^T = \begin{pmatrix} 0.6 & 0.4 \\ 0.4 & 0.6 \end{pmatrix}$$

Transformations that can't be undone

• Some transformation matrices have no inverse…
$$\vec{x} \;\rightarrow\; \Phi^T\vec{x} \;\rightarrow\; \Lambda\Phi^T\vec{x} \;\rightarrow\; \Phi\Lambda\Phi^T\vec{x}$$

$$\Phi^T = \Phi(45^\circ) = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & -1 \\ 1 & 1 \end{pmatrix} \qquad \Lambda = \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix} \qquad \Phi = \Phi(-45^\circ) = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ -1 & 1 \end{pmatrix}$$

$$\Phi\Lambda\Phi^T = \begin{pmatrix} 0.5 & 0.5 \\ 0.5 & 0.5 \end{pmatrix} \qquad \det(\Lambda) = 0 \qquad \det\!\left(\Phi\Lambda\Phi^T\right) = 0$$
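A brief NumPy sketch (not from the slides) confirming that this matrix cannot be undone: its determinant is zero and its rank is 1, so every input lands on a one-dimensional subspace.

```python
import numpy as np

phi = np.deg2rad(-45.0)
Phi = np.array([[np.cos(phi), -np.sin(phi)],
                [np.sin(phi),  np.cos(phi)]])
Lam = np.diag([0.0, 1.0])             # one direction is squashed to zero

A = Phi @ Lam @ Phi.T
print(np.round(A, 3))                 # [[0.5 0.5], [0.5 0.5]]
print(np.linalg.det(A))               # ~0: singular, so no inverse exists
print(np.linalg.matrix_rank(A))       # 1: all outputs lie on a single line
```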


Basics of basis sets

• We can think of vectors as abstract ‘directions’ in space. But in order to specify the elements of a vector, we need to choose a coordinate system.

• To do this, we write our vector as a linear combination of a set of special vectors called the ‘basis set.’

$$\vec{v} = \begin{pmatrix} x \\ y \\ z \end{pmatrix} = x \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix} + y \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix} + z \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix} = x\hat{e}_1 + y\hat{e}_2 + z\hat{e}_3$$

• The numbers x, y, z are called the coordinates of the vector.

• The vectors $\{\hat{e}_1, \hat{e}_2, \hat{e}_3\}$ are called the 'basis vectors', in this case in three dimensions.

Basics of basis sets

• In order to describe an arbitrary vector in the space of real numbers in n dimensions ($\mathbb{R}^n$), our basis vectors need to have n numbers.

• In order to describe an arbitrary vector in $\mathbb{R}^n$, we need to have n basis vectors.

• The basis set we wrote down earlier, $\{\hat{e}_1, \hat{e}_2, \hat{e}_3\}$, is called the 'standard basis'. Each vector has one element that's a one and the rest are zeros.

$$\hat{e}_1 = \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix} \qquad \hat{e}_2 = \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix} \qquad \hat{e}_3 = \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix}$$

Orthonormal basis

• In addition, the standard basis has the interesting property that

each vector is a unit vector:
$$\hat{e}_i \cdot \hat{e}_i = 1$$

• Each vector is orthogonal to all the other vectors:
$$\hat{e}_i \cdot \hat{e}_j = 0 \;\;(i \neq j) \qquad \hat{e}_1 \cdot \hat{e}_2 = 0 \quad \hat{e}_1 \cdot \hat{e}_3 = 0 \quad \hat{e}_2 \cdot \hat{e}_3 = 0$$

• These properties can be written compactly as

$$\hat{e}_i \cdot \hat{e}_j = \delta_{ij} \qquad \delta_{ij} = \begin{cases} 1 & \text{if } i = j \\ 0 & \text{if } i \neq j \end{cases}$$

• Any basis set with these properties is called 'orthonormal'.

Basics of basis sets

• The standard basis is not the only orthonormal basis. Consider a different set of orthogonal unit vectors $\{\hat{f}_1, \hat{f}_2\}$:
$$\vec{v} = (\vec{v}\cdot\hat{f}_1)\,\hat{f}_1 + (\vec{v}\cdot\hat{f}_2)\,\hat{f}_2$$

$$\vec{v}_f = \begin{pmatrix} \vec{v}\cdot\hat{f}_1 \\ \vec{v}\cdot\hat{f}_2 \end{pmatrix}$$

• The vector coordinates are given by the dot products of the vector $\vec{v}$ with each of the basis vectors.

Non-orthonormal basis sets

• Vectors can also be written as a linear combination of (almost) any vectors, not just orthonormal basis vectors

$$\vec{v} = c_1\vec{f}_1 + c_2\vec{f}_2$$

[Figure: $\vec{v}$ drawn as the sum of $c_1\vec{f}_1$ and $c_2\vec{f}_2$ along two non-orthogonal directions.]

Basics of basis sets

• Let's decompose an arbitrary vector $\vec{v}$ in a basis set $\{\vec{f}_1, \vec{f}_2\}$:
$$\vec{v} = c_1\vec{f}_1 + c_2\vec{f}_2$$

• The coefficients $c_1$ and $c_2$ are called the 'coordinates' of the vector $\vec{v}$ in the basis $\{\vec{f}_1, \vec{f}_2\}$.

• The vector $\vec{v}_f = \begin{pmatrix} c_1 \\ c_2 \end{pmatrix}$ is called the 'coordinate vector' of $\vec{v}$ in the basis $\{\vec{f}_1, \vec{f}_2\}$.

Basics of basis sets

• Let’s look at an example. Consider the basis

  ⎪⎧⎛ 1⎞ ⎛ −2⎞ ⎪⎫ { f1 , f2 } = ⎨⎜ ⎟ , ⎜ ⎟ ⎬ ⎝ 3⎠ ⎝ 1 ⎠ ⎩⎪ ⎭⎪

and the vector $\vec{v} = \begin{pmatrix} 3 \\ 5 \end{pmatrix}$ in the standard basis.

• Find the vector coordinates of $\vec{v}$ in the new basis.

• Write $\vec{v}$ as a linear combination of the new basis vectors, $c_1\vec{f}_1 + c_2\vec{f}_2 = \vec{v}$, which gives a system of equations:

$$c_1 \begin{pmatrix} 1 \\ 3 \end{pmatrix} + c_2 \begin{pmatrix} -2 \\ 1 \end{pmatrix} = \begin{pmatrix} 3 \\ 5 \end{pmatrix} \qquad \begin{aligned} c_1 - 2c_2 &= 3 \\ 3c_1 + c_2 &= 5 \end{aligned}$$

Basics of basis sets

• We can write this system of equations in matrix notation:

$$\begin{aligned} c_1 - 2c_2 &= 3 \\ 3c_1 + c_2 &= 5 \end{aligned} \qquad\Longleftrightarrow\qquad F\,\vec{v}_f = \vec{v}$$

where
$$F = \begin{pmatrix} 1 & -2 \\ 3 & 1 \end{pmatrix} \qquad \vec{v}_f = \begin{pmatrix} c_1 \\ c_2 \end{pmatrix} \qquad \vec{v} = \begin{pmatrix} 3 \\ 5 \end{pmatrix}$$

• Now solve for $\vec{v}_f$ by multiplying both sides of the equation by the inverse of the matrix $F$:
$$F^{-1}F\,\vec{v}_f = F^{-1}\vec{v} \qquad\Rightarrow\qquad \vec{v}_f = F^{-1}\vec{v}$$

Basics of basis sets

• We can find the inverse of $F$:
$$F^{-1} = \frac{1}{7} \begin{pmatrix} 1 & 2 \\ -3 & 1 \end{pmatrix}$$

 −1 1 ⎛ 1 2 ⎞ ⎛ 3 ⎞ vf = F v = ⎜ ⎟ ⎜ ⎟ 7 ⎝ −3 1 ⎠ ⎝ 5 ⎠

$$= \frac{1}{7} \begin{pmatrix} 3 + 10 \\ -9 + 5 \end{pmatrix} = \frac{1}{7} \begin{pmatrix} 13 \\ -4 \end{pmatrix}$$

• Thus, we find the coordinate vector of $\vec{v}$ in the basis $\{\vec{f}_1, \vec{f}_2\}$:
$$\vec{v}_f = \begin{pmatrix} 13/7 \\ -4/7 \end{pmatrix}$$

Basics of basis sets

• In summary: to find the coordinate vector for $\vec{v}$ in the basis $\{\vec{f}_1, \vec{f}_2\}$, we construct a matrix $F$ whose columns are just the elements of the basis vectors:
$$F = \begin{pmatrix} \vec{f}_1 & \vec{f}_2 \end{pmatrix} \qquad F = \begin{pmatrix} \vec{f}_1 & \vec{f}_2 & \vec{f}_3 & \dots & \vec{f}_n \end{pmatrix} \qquad \text{such that } \vec{v} = F\,\vec{v}_f$$

• We can solve for $\vec{v}_f$ by multiplying both sides of the equation by the inverse of the matrix $F$:

$$\vec{v}_f = F^{-1}\vec{v} \qquad \text{('change of basis')}$$

• But this only works if $F$ has an inverse!
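A minimal NumPy version of the worked example above (a sketch, not part of the original slides). Using np.linalg.solve rather than forming $F^{-1}$ explicitly is just a standard numerical choice; the result matches the coordinates found by hand.

```python
import numpy as np

# Basis vectors from the example, stacked as the columns of F.
F = np.column_stack([np.array([1.0, 3.0]),     # f1
                     np.array([-2.0, 1.0])])   # f2

v = np.array([3.0, 5.0])             # the vector in the standard basis

v_f = np.linalg.solve(F, v)          # coordinates in the new basis, v_f = F^-1 v
print(v_f)                           # [ 1.857... -0.571...] = [13/7, -4/7]

print(F @ v_f)                       # [3. 5.]: reconstructs v, since v = F v_f
```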


Subspaces

• We need n vectors in $\mathbb{R}^n$ to form a basis in $\mathbb{R}^n$. But not any set of n vectors will do the trick!

• Consider the following set of vectors

$$\vec{f}_1 = \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix} \qquad \vec{f}_2 = \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix} \qquad \vec{f}_3 = \begin{pmatrix} 1 \\ 1 \\ 0 \end{pmatrix}$$

• Note that any linear combination of $\{\vec{f}_1, \vec{f}_2, \vec{f}_3\}$ will always lie in the (x, y) plane:
$$\vec{v} = c_1\vec{f}_1 + c_2\vec{f}_2 + c_3\vec{f}_3 = \begin{pmatrix} c_1 + c_3 \\ c_2 + c_3 \\ 0 \end{pmatrix}$$

• Thus, the set of vectors $\{\vec{f}_1, \vec{f}_2, \vec{f}_3\}$ doesn't span all of $\mathbb{R}^3$. It only spans the x-y plane, a subspace of $\mathbb{R}^3$.

Linear independence

$$\vec{f}_1 = \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix} \qquad \vec{f}_2 = \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix} \qquad \vec{f}_3 = \begin{pmatrix} 1 \\ 1 \\ 0 \end{pmatrix}$$

• Note that we can write any of these vectors as a linear combination of the other two:
$$\vec{f}_3 = \vec{f}_1 + \vec{f}_2 \qquad \vec{f}_2 = \vec{f}_3 - \vec{f}_1 \qquad \vec{f}_1 = \vec{f}_3 - \vec{f}_2$$

• Thus, this set of vectors is called ‘linearly dependent’.

• Any set of n linearly dependent vectors cannot form a basis in $\mathbb{R}^n$.

• How do you know if a set of vectors is linearly dependent? Form the matrix whose columns are the vectors and check its determinant:
$$F = \begin{pmatrix} \vec{f}_1 & \vec{f}_2 & \vec{f}_3 & \dots & \vec{f}_n \end{pmatrix} \qquad \det(F) = 0$$

Linear dependence

• If $\det(F) = 0$, then $F$ maps $\vec{v}_f$ into a subspace of $\mathbb{R}^n$.

$$F = \begin{pmatrix} 0.5 & 0.5 \\ 0.5 & 0.5 \end{pmatrix}$$

• If F maps onto a subspace, then the mapping is not reversible!

$$\det(F) = 0$$
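A short NumPy sketch (illustrative, not from the slides) applying this determinant test to the three vectors from the subspace example:

```python
import numpy as np

# Columns are f1, f2, f3 from the subspace example.
F = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0],
              [0.0, 0.0, 0.0]])

print(np.linalg.det(F))              # 0.0: the vectors are linearly dependent
print(np.linalg.matrix_rank(F))      # 2: they span only the x-y plane of R^3
```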

Geometric interpretation of the determinant

• The determinant is the 'volume' of a unit cube after transformation (the area of the unit square in two dimensions).

[Figure: the unit square (area = 1) is mapped by $A$ to a shape with area = 0.5.]

det(A) = 0.5

• A pure rotation matrix has a determinant of one.

[Figure: the unit square (area = 1) is mapped by a rotation $A$ to a square with area = 1.]

det(A) = 1


Change of basis

$$\{\vec{f}_1, \vec{f}_2, \dots, \vec{f}_n\} \qquad F = \begin{pmatrix} \vec{f}_1 & \vec{f}_2 & \dots & \vec{f}_n \end{pmatrix}$$

• If $\det(F) \neq 0$, then the vectors $\{\vec{f}_1, \vec{f}_2, \dots, \vec{f}_n\}$

o are linearly independent

o form a complete basis set in $\mathbb{R}^n$

• Then the matrix F implements a ‘change of basis’

From the standard basis to $f$: $\vec{v}_f = F^{-1}\vec{v}$. Or from $f$ to the standard basis: $\vec{v} = F\,\vec{v}_f$.

Change of basis

• The change of basis is easy if $\{\hat{f}_1, \hat{f}_2\}$ is an orthonormal basis…

$$F = \begin{pmatrix} \hat{f}_1 & \hat{f}_2 \end{pmatrix} \qquad F^T = \begin{pmatrix} \hat{f}_1 \\ \hat{f}_2 \end{pmatrix}$$

$$F^T F = \begin{pmatrix} \hat{f}_1 \\ \hat{f}_2 \end{pmatrix} \begin{pmatrix} \hat{f}_1 & \hat{f}_2 \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} = I$$

Thus… $F^T = F^{-1}$. $F$ is just a rotation matrix!

Change of basis

• With an orthonormal basis set, the coordinates are just given by the dot product with the basis vectors!

$$F = \begin{pmatrix} \hat{f}_1 & \hat{f}_2 \end{pmatrix} \qquad F^{-1} = F^T = \begin{pmatrix} \hat{f}_1 \\ \hat{f}_2 \end{pmatrix}$$

$$\vec{v}_f = F^{-1}\vec{v} \qquad \vec{v}_f = F^T\vec{v} = \begin{pmatrix} \hat{f}_1 \\ \hat{f}_2 \end{pmatrix} \vec{v} = \begin{pmatrix} \vec{v}\cdot\hat{f}_1 \\ \vec{v}\cdot\hat{f}_2 \end{pmatrix}$$

Change of basis

• In two dimensions, there is a family of orthonormal basis sets

$$F = \begin{pmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{pmatrix} \qquad \hat{f}_1 = \begin{pmatrix} \cos\theta \\ -\sin\theta \end{pmatrix} \qquad \hat{f}_2 = \begin{pmatrix} \sin\theta \\ \cos\theta \end{pmatrix}$$

$$\vec{v} = (\vec{v}\cdot\hat{f}_1)\,\hat{f}_1 + (\vec{v}\cdot\hat{f}_2)\,\hat{f}_2$$

$$\vec{v}_f = F^T\vec{v} = \begin{pmatrix} \vec{v}\cdot\hat{f}_1 \\ \vec{v}\cdot\hat{f}_2 \end{pmatrix}$$

• The vector coordinates are given by the dot products of the vector $\vec{v}$ with each of the rotated basis vectors.
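A small NumPy sketch of an orthonormal change of basis (the angle and vector are arbitrary illustrative choices): the inverse of $F$ is its transpose, and the new coordinates are just dot products with the rotated basis vectors.

```python
import numpy as np

theta = np.deg2rad(30.0)
F = np.array([[ np.cos(theta), np.sin(theta)],
              [-np.sin(theta), np.cos(theta)]])   # columns are f1_hat and f2_hat

# Orthonormal basis: F^T F = I, so F^-1 = F^T.
assert np.allclose(F.T @ F, np.eye(2))
assert np.allclose(np.linalg.inv(F), F.T)

v = np.array([2.0, 1.0])

v_f = F.T @ v                                        # coordinates in the rotated basis
assert np.allclose(v_f, [v @ F[:, 0], v @ F[:, 1]])  # = dot products with f1_hat, f2_hat

print(F @ v_f)                                       # [2. 1.]: back in the standard basis
```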

Seeing in high dimensions

[Screenshot: Google Embedding Projector (https://research.googleblog.com/2016/12/open-sourcing-embedding-projector-tool.html). Image © Embedding Projector Project; excluded from the OCW Creative Commons license (see https://ocw.mit.edu/help/faq-fair-use/).]


MIT OpenCourseWare
https://ocw.mit.edu/

9.40 Introduction to Neural Computation Spring 2018

For information about citing these materials or our Terms of Use, visit: https://ocw.mit.edu/terms.