
Introduction to Neural Computation

Prof. Michale Fee MIT BCS 9.40 — 2018

Lecture 16: Networks, Matrices, and Basis Sets
Seeing in high dimensions

[Screenshot: Google Embedding Projector (https://research.googleblog.com/2016/12/open-sourcing-embedding-projector-tool.html). Image © Embedding Projector Project; excluded from the OCW Creative Commons license (see https://ocw.mit.edu/help/faq-fair-use/).]


Learning Objectives for Lecture 16

• More on two-layer feed-forward networks

• Matrix transformations (rotated transformations)

• Basis sets

• Linear independence

• Change of basis

Two-layer feed-forward network

• We can expand our set of output neurons to make a more general network…

input firing rates (indexed by $b = 1, 2, 3, 4, \dots$): $\vec{u} = \left[\, u_1, u_2, u_3, \dots, u_{n_b} \right]$

Lots of synaptic weights! $W_{ab}$

output firing rates (indexed by $a = 1, 2, 3, 4, \dots$): $\vec{v} = \left[\, v_1, v_2, v_3, \dots, v_{n_a} \right]$

Two-layer feed-forward network

• We now have a weight from each of our input neurons onto each of our output neurons!

• We write the weights as a matrix.

weight matrix:
$$W_{ab} = \begin{pmatrix} w_{11} & w_{12} & w_{13} \\ w_{21} & w_{22} & w_{23} \\ w_{31} & w_{32} & w_{33} \end{pmatrix} = \begin{pmatrix} \vec{w}_{a=1} \\ \vec{w}_{a=2} \\ \vec{w}_{a=3} \end{pmatrix}$$
The row index $a$ labels the postsynaptic (output) neuron; the column index $b$ labels the presynaptic (input) neuron.

Two-layer feed-forward network

• We can write down the firing rates of our output neurons as a matrix multiplication:
$$\vec{v} = W\vec{u} \qquad\qquad v_a = \sum_b W_{ab}\, u_b$$


  ⎡ ⎤ ⎡ ⎤ ⎡ ⎤ ⎡ ⎤ v1 w11 w12 w13 u1 wa =1 ⋅u ⎢ ⎥ ⎢ ⎥⎢ ⎥ ⎢   ⎥  v = ⎢ v2 ⎥ = ⎢ w21 w22 w23 ⎥⎢ u2 ⎥ = ⎢ wa =2 ⋅ u ⎥ ⎢ ⎥ ⎢ ⎥ ⎢ ⎥ ⎢   ⎥ w31 w32 w33 u3 v3 ⎣ ⎦ ⎣ ⎦ wa =3 ⋅u ⎣ ⎦ ⎣ ⎦

This is the 'row' (dot product) interpretation of matrix multiplication: each output firing rate is the dot product of one row of $W$ with the input vector $\vec{u}$.

Two-layer feed-forward network

• There is another way to think about what the weight matrix means…

$$\vec{v} = W\vec{u} = \begin{pmatrix} w_{11} & w_{12} & w_{13} \\ w_{21} & w_{22} & w_{23} \\ w_{31} & w_{32} & w_{33} \end{pmatrix} \begin{pmatrix} u_1 \\ u_2 \\ u_3 \end{pmatrix} \qquad W = \begin{pmatrix} \vec{w}^{(1)} & \vec{w}^{(2)} & \vec{w}^{(3)} \end{pmatrix}$$
Each column $\vec{w}^{(b)}$ is the vector of weights from input neuron $b$ onto all of the output neurons. For example:
$$W = \begin{pmatrix} 1 & 0 & 1 \\ 1 & 1 & 1 \\ 0 & 0 & 1 \end{pmatrix}$$

Two-layer feed-forward network

• There is another way to think about what the weight matrix means…

$$\vec{v} = W\vec{u} = \begin{pmatrix} \vec{w}^{(1)} & \vec{w}^{(2)} & \vec{w}^{(3)} \end{pmatrix} \vec{u}$$

• What is the output if only input neuron 1 is active?

$$\vec{v} = \begin{pmatrix} w_{11} & w_{12} & w_{13} \\ w_{21} & w_{22} & w_{23} \\ w_{31} & w_{32} & w_{33} \end{pmatrix} \begin{pmatrix} u_1 \\ 0 \\ 0 \end{pmatrix} = u_1 \begin{pmatrix} w_{11} \\ w_{21} \\ w_{31} \end{pmatrix} = u_1 \vec{w}^{(1)} = u_1 \begin{pmatrix} 1 \\ 1 \\ 0 \end{pmatrix}$$

Two-layer feed-forward network

• More generally, with all inputs active:
$$\vec{v} = u_1 \begin{pmatrix} w_{11} \\ w_{21} \\ w_{31} \end{pmatrix} + u_2 \begin{pmatrix} w_{12} \\ w_{22} \\ w_{32} \end{pmatrix} + u_3 \begin{pmatrix} w_{13} \\ w_{23} \\ w_{33} \end{pmatrix} = u_1\vec{w}^{(1)} + u_2\vec{w}^{(2)} + u_3\vec{w}^{(3)}$$

The output pattern is a weighted sum of contributions from each of the input neurons!
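To make the two pictures concrete, here is a minimal NumPy sketch (not part of the original slides; the input firing rates are arbitrary illustrative values) checking that the row (dot product) view and the column (weighted sum) view give the same output:

```python
import numpy as np

# Example weight matrix from the slides: column b holds the weights
# from input neuron b onto all three output neurons.
W = np.array([[1, 0, 1],
              [1, 1, 1],
              [0, 0, 1]], dtype=float)

u = np.array([2.0, 0.5, 1.0])    # input firing rates (hypothetical values)

v = W @ u                        # output firing rates, v = W u

# Row view: each output is the dot product of one row of W with u.
v_rows = np.array([W[a, :] @ u for a in range(W.shape[0])])

# Column view: the output is a weighted sum of the columns of W.
v_cols = sum(u[b] * W[:, b] for b in range(W.shape[1]))

assert np.allclose(v, v_rows) and np.allclose(v, v_cols)
print(v)                         # [3.  3.5 1. ]
```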

Examples of simple networks

• Each input neuron connects to one neuron in the output layer, with a weight of one.
$$W = I = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}$$

$$\vec{v} = W\vec{u} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} u_1 \\ u_2 \\ u_3 \end{pmatrix} = \begin{pmatrix} u_1 \\ u_2 \\ u_3 \end{pmatrix} \qquad \vec{v} = \vec{u}$$

Examples of simple networks

• Each input neuron connects to one neuron in the output layer, with an arbitrary weight.
$$W = \Lambda = \begin{pmatrix} \lambda_1 & 0 & 0 \\ 0 & \lambda_2 & 0 \\ 0 & 0 & \lambda_3 \end{pmatrix}$$

$$\vec{v} = \Lambda\vec{u} = \begin{pmatrix} \lambda_1 & 0 & 0 \\ 0 & \lambda_2 & 0 \\ 0 & 0 & \lambda_3 \end{pmatrix} \begin{pmatrix} u_1 \\ u_2 \\ u_3 \end{pmatrix} = \begin{pmatrix} \lambda_1 u_1 \\ \lambda_2 u_2 \\ \lambda_3 u_3 \end{pmatrix}$$

Examples of simple networks

• Input neurons connect to output neurons with a weight matrix that corresponds to a rotation.


$$W = \Phi \qquad \Phi = \begin{pmatrix} \cos\phi & -\sin\phi \\ \sin\phi & \cos\phi \end{pmatrix}$$

$$\vec{v} = \Phi\vec{u} = \begin{pmatrix} \cos\phi & -\sin\phi \\ \sin\phi & \cos\phi \end{pmatrix} \begin{pmatrix} u_1 \\ u_2 \end{pmatrix} = \begin{pmatrix} u_1\cos\phi - u_2\sin\phi \\ u_1\sin\phi + u_2\cos\phi \end{pmatrix}$$

Examples of simple networks

• Let’s look at an example rotation matrix (ϕ=-45°)

$$\Phi(-45^\circ) = \begin{pmatrix} \cos(-\tfrac{\pi}{4}) & -\sin(-\tfrac{\pi}{4}) \\ \sin(-\tfrac{\pi}{4}) & \cos(-\tfrac{\pi}{4}) \end{pmatrix} = \begin{pmatrix} \tfrac{1}{\sqrt{2}} & \tfrac{1}{\sqrt{2}} \\ -\tfrac{1}{\sqrt{2}} & \tfrac{1}{\sqrt{2}} \end{pmatrix}$$

$$\vec{v} = \Phi\vec{u} = \begin{pmatrix} \tfrac{1}{\sqrt{2}} & \tfrac{1}{\sqrt{2}} \\ -\tfrac{1}{\sqrt{2}} & \tfrac{1}{\sqrt{2}} \end{pmatrix} \begin{pmatrix} u_1 \\ u_2 \end{pmatrix} = \frac{1}{\sqrt{2}} \begin{pmatrix} u_2 + u_1 \\ u_2 - u_1 \end{pmatrix}$$
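A short NumPy sketch of this rotation network (illustrative, not from the slides): the weight matrix $\Phi(-45^\circ)$ maps a vector lying along the 45° diagonal onto the first output axis.

```python
import numpy as np

def rotation(phi_deg):
    """2-D rotation matrix [[cos, -sin], [sin, cos]]."""
    phi = np.deg2rad(phi_deg)
    return np.array([[np.cos(phi), -np.sin(phi)],
                     [np.sin(phi),  np.cos(phi)]])

Phi = rotation(-45.0)            # the weight matrix of the network
u = np.array([1.0, 1.0])         # an input along the 45-degree diagonal

print(Phi @ u)                   # ~[1.414, 0.0]: the diagonal lands on the v1 axis
```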

Examples of simple networks

• Rotation matrices can be very useful when different directions in feature space carry different useful information.

$$\Phi(-45^\circ) = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ -1 & 1 \end{pmatrix}$$

[Figure: cats and dogs plotted in input coordinates ($u_1$ = 'doggy breath', $u_2$ = 'furry') and, after the rotation $\Phi$, in output coordinates $(v_1, v_2)$.]

Examples of simple networks

• Rotation matrices can be very useful when different directions in feature space carry different useful information.
$$\vec{v} = \frac{1}{\sqrt{2}} \begin{pmatrix} u_1 + u_2 \\ u_2 - u_1 \end{pmatrix}$$

[Figure: light intensities at two wavelengths, $I(\lambda_1)$ and $I(\lambda_2)$, rotate into 'brightness' and 'color' axes $(v_1, v_2)$; right- and left-ear amplitudes $A_R$ and $A_L$ rotate into 'loudness' and 'elevation' axes.]


Matrix transformations

• In general, a matrix $A$ maps the set of vectors in $\mathbb{R}^2$ onto another set of vectors in $\mathbb{R}^2$:
$$\vec{y} = A\vec{x}$$

Matrix transformations

• In general, a matrix $A$ maps the set of vectors in $\mathbb{R}^2$ onto another set of vectors in $\mathbb{R}^2$:
$$\vec{y} = A\vec{x}$$

• The inverse matrix $A^{-1}$ maps back:
$$\vec{x} = A^{-1}\vec{y}$$

Matrix transformations ($\vec{y} = A\vec{x}$)

• Perturbations from the identity:

$$A = \begin{pmatrix} 1+\delta & 0 \\ 0 & 1+\delta \end{pmatrix} \qquad A = \begin{pmatrix} 1+\delta & 0 \\ 0 & 1 \end{pmatrix} \qquad A = \begin{pmatrix} 1 & 0 \\ 0 & 1+\delta \end{pmatrix} \qquad A = \begin{pmatrix} 1+\delta & 0 \\ 0 & 1-\delta \end{pmatrix}$$

$$A = \begin{pmatrix} -1 & 0 \\ 0 & 1 \end{pmatrix} \qquad A = \begin{pmatrix} -1 & 0 \\ 0 & -1 \end{pmatrix}$$

These are all diagonal matrices:
$$\Lambda = \begin{pmatrix} a & 0 \\ 0 & b \end{pmatrix}$$

$$\Lambda^{-1} = \begin{pmatrix} a^{-1} & 0 \\ 0 & b^{-1} \end{pmatrix}$$

Rotation matrix

• Rotation in 2 dimensions:
$$\Phi(\theta) = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}$$

[Figure: rotations by θ = 10°, 25°, 45°, 90°.]

• Does a rotation matrix have an inverse? $\det(\Phi) = 1$, so yes.

• The inverse of a rotation matrix is just its transpose:

$$\Phi^{-1}(\theta) = \Phi(-\theta) = \Phi^T(\theta)$$

Rotated transformations

• Let's construct a matrix that produces a stretch along a 45° angle…

$$\Phi = \begin{pmatrix} \cos 45^\circ & -\sin 45^\circ \\ \sin 45^\circ & \cos 45^\circ \end{pmatrix} = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & -1 \\ 1 & 1 \end{pmatrix}$$

[Figure: apply $\Phi^T = \Phi(-45^\circ)$ to rotate the $45^\circ$ direction onto the x-axis, stretch with $\Lambda$, then rotate back with $\Phi(+45^\circ)$.]

• We do each of these steps by multiplying our matrices together

$$\Phi\,\Lambda\,\Phi^T\,\vec{x}$$

Rotated transformations

• Let's construct a matrix that produces a stretch along a 45° angle…
$$\vec{x} \;\rightarrow\; \Phi^T\vec{x} \;\rightarrow\; \Lambda\Phi^T\vec{x} \;\rightarrow\; \Phi\Lambda\Phi^T\vec{x}$$

$$\Phi^T = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ -1 & 1 \end{pmatrix} \qquad \Lambda = \begin{pmatrix} 2 & 0 \\ 0 & 1 \end{pmatrix} \qquad \Phi = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & -1 \\ 1 & 1 \end{pmatrix}$$

$$\Phi\Lambda\Phi^T = \frac{1}{2} \begin{pmatrix} 3 & 1 \\ 1 & 3 \end{pmatrix}$$
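A quick NumPy check of this construction (a sketch, not part of the lecture): $\Phi\Lambda\Phi^T$ stretches the $+45^\circ$ direction by 2 and leaves the $-45^\circ$ direction unchanged.

```python
import numpy as np

phi = np.deg2rad(45.0)
Phi = np.array([[np.cos(phi), -np.sin(phi)],
                [np.sin(phi),  np.cos(phi)]])
Lam = np.diag([2.0, 1.0])            # stretch by 2 along the first axis

A = Phi @ Lam @ Phi.T                # rotate, stretch, rotate back
print(A)                             # [[1.5 0.5], [0.5 1.5]] = (1/2)[[3,1],[1,3]]

print(A @ np.array([1.0, 1.0]))      # [2. 2.]: the +45-degree direction is doubled
print(A @ np.array([1.0, -1.0]))     # [ 1. -1.]: the -45-degree direction is unchanged
```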

Inverse of matrix products

• We can undo our transformation by taking the inverse:
$$\left[\Phi\Lambda\Phi^T\right]^{-1}$$

• How do you take the inverse of a sequence of matrix multiplications $ABC$?
$$[ABC]^{-1}ABC = C^{-1}B^{-1}A^{-1}ABC = C^{-1}B^{-1}BC = C^{-1}C = I \qquad\Rightarrow\qquad [ABC]^{-1} = C^{-1}B^{-1}A^{-1}$$

• Thus…
$$\left[\Phi\Lambda\Phi^T\right]^{-1} = \left[\Phi^T\right]^{-1}\Lambda^{-1}\Phi^{-1}$$

$$\left[\Phi\Lambda\Phi^T\right]^{-1} = \Phi\Lambda^{-1}\Phi^T \qquad \Lambda = \begin{pmatrix} 2 & 0 \\ 0 & 1 \end{pmatrix} \quad\Rightarrow\quad \Lambda^{-1} = \begin{pmatrix} 1/2 & 0 \\ 0 & 1 \end{pmatrix}$$
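A tiny NumPy check (random illustrative matrices, not from the lecture) of the inverse-of-a-product rule used in this step:

```python
import numpy as np

rng = np.random.default_rng(0)
A, B, C = (rng.standard_normal((2, 2)) for _ in range(3))   # generic invertible matrices

lhs = np.linalg.inv(A @ B @ C)
rhs = np.linalg.inv(C) @ np.linalg.inv(B) @ np.linalg.inv(A)
print(np.allclose(lhs, rhs))         # True: (ABC)^-1 = C^-1 B^-1 A^-1
```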

Rotated transformations

• Let's construct a matrix that undoes a stretch along a 45° angle…
$$\vec{x} \;\rightarrow\; \Phi^T\vec{x} \;\rightarrow\; \Lambda^{-1}\Phi^T\vec{x} \;\rightarrow\; \Phi\Lambda^{-1}\Phi^T\vec{x}$$

$$\Phi^T = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ -1 & 1 \end{pmatrix} \qquad \Lambda^{-1} = \begin{pmatrix} 1/2 & 0 \\ 0 & 1 \end{pmatrix} \qquad \Phi(+45^\circ) = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & -1 \\ 1 & 1 \end{pmatrix}$$

$$\Phi\Lambda^{-1}\Phi^T = \frac{1}{4} \begin{pmatrix} 3 & -1 \\ -1 & 3 \end{pmatrix}$$

Rotated transformations

• Construct a matrix that does compression along a -45° angle…

Here $\Phi = \Phi(-45^\circ)$, so $\Phi^T$ rotates by $+45^\circ$, $\Lambda$ compresses by a factor of 0.2 along the x-axis, and $\Phi$ rotates back by $-45^\circ$:
$$\Phi^T = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & -1 \\ 1 & 1 \end{pmatrix} \qquad \Lambda = \begin{pmatrix} 0.2 & 0 \\ 0 & 1 \end{pmatrix} \qquad \Phi = \Phi(-45^\circ) = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ -1 & 1 \end{pmatrix}$$

$$\Phi\Lambda\Phi^T = \begin{pmatrix} 0.6 & 0.4 \\ 0.4 & 0.6 \end{pmatrix}$$

Transformations that can't be undone

• Some transformation matrices have no inverse…
$$\vec{x} \;\rightarrow\; \Phi^T\vec{x} \;\rightarrow\; \Lambda\Phi^T\vec{x} \;\rightarrow\; \Phi\Lambda\Phi^T\vec{x}$$

$$\Phi^T = \Phi(45^\circ) = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & -1 \\ 1 & 1 \end{pmatrix} \qquad \Lambda = \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix} \qquad \Phi = \Phi(-45^\circ) = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ -1 & 1 \end{pmatrix}$$

$$\Phi\Lambda\Phi^T = \begin{pmatrix} 0.5 & 0.5 \\ 0.5 & 0.5 \end{pmatrix} \qquad \det(\Lambda) = 0 \qquad \det\!\left(\Phi\Lambda\Phi^T\right) = 0$$
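A brief NumPy sketch (not from the slides) confirming that this matrix cannot be undone: its determinant is zero and its rank is 1, so every input lands on a one-dimensional subspace.

```python
import numpy as np

phi = np.deg2rad(-45.0)
Phi = np.array([[np.cos(phi), -np.sin(phi)],
                [np.sin(phi),  np.cos(phi)]])
Lam = np.diag([0.0, 1.0])             # one direction is squashed to zero

A = Phi @ Lam @ Phi.T
print(np.round(A, 3))                 # [[0.5 0.5], [0.5 0.5]]
print(np.linalg.det(A))               # ~0: singular, so no inverse exists
print(np.linalg.matrix_rank(A))       # 1: all outputs lie on a single line
```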


Basics of basis sets

• We can think of vectors as abstract ‘directions’ in space. But in order to specify the elements of a vector, we need to choose a coordinate system.

• To do this, we write our vector as a linear combination of a set of special vectors called the ‘basis set.’

$$\vec{v} = \begin{pmatrix} x \\ y \\ z \end{pmatrix} = x \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix} + y \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix} + z \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix} = x\hat{e}_1 + y\hat{e}_2 + z\hat{e}_3$$

• The numbers x, y, z are called the coordinates of the vector.

• The vectors $\{\hat{e}_1, \hat{e}_2, \hat{e}_3\}$ are called the 'basis vectors', in this case in three dimensions.

Basics of basis sets

• In order to describe an arbitrary vector in the space of real numbers in n dimensions ($\mathbb{R}^n$), our basis vectors need to have n numbers.

• In order to describe an arbitrary vector in $\mathbb{R}^n$, we need to have n basis vectors.

• The basis set we wrote down earlier, $\{\hat{e}_1, \hat{e}_2, \hat{e}_3\}$, is called the 'standard basis'. Each vector has one element that's a one and the rest are zeros.

$$\hat{e}_1 = \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix} \qquad \hat{e}_2 = \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix} \qquad \hat{e}_3 = \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix}$$

Orthonormal basis

• In addition, the standard basis has the interesting property that

each vector is a unit vector:
$$\hat{e}_i \cdot \hat{e}_i = 1$$

• Each vector is orthogonal to all the other vectors:
$$\hat{e}_i \cdot \hat{e}_j = 0 \;\;(i \neq j) \qquad \hat{e}_1 \cdot \hat{e}_2 = 0 \quad \hat{e}_1 \cdot \hat{e}_3 = 0 \quad \hat{e}_2 \cdot \hat{e}_3 = 0$$

• These properties can be written compactly as

$$\hat{e}_i \cdot \hat{e}_j = \delta_{ij} \qquad \delta_{ij} = \begin{cases} 1 & \text{if } i = j \\ 0 & \text{if } i \neq j \end{cases}$$

• Any basis set with these properties is called 'orthonormal'.

Basics of basis sets

• The standard basis is not the only orthonormal basis. Consider a different set of orthogonal unit vectors $\{\hat{f}_1, \hat{f}_2\}$:
$$\vec{v} = (\vec{v}\cdot\hat{f}_1)\,\hat{f}_1 + (\vec{v}\cdot\hat{f}_2)\,\hat{f}_2$$

$$\vec{v}_f = \begin{pmatrix} \vec{v}\cdot\hat{f}_1 \\ \vec{v}\cdot\hat{f}_2 \end{pmatrix}$$

• The vector coordinates are given by the dot products of the vector $\vec{v}$ with each of the basis vectors.

Non-orthonormal basis sets

• Vectors can also be written as a linear combination of (almost) any vectors, not just orthonormal basis vectors

$$\vec{v} = c_1\vec{f}_1 + c_2\vec{f}_2$$

[Figure: $\vec{v}$ drawn as the sum of $c_1\vec{f}_1$ and $c_2\vec{f}_2$ along two non-orthogonal directions.]

Basics of basis sets

• Let's decompose an arbitrary vector $\vec{v}$ in a basis set $\{\vec{f}_1, \vec{f}_2\}$:
$$\vec{v} = c_1\vec{f}_1 + c_2\vec{f}_2$$

• The coefficients $c_1$ and $c_2$ are called the 'coordinates' of the vector $\vec{v}$ in the basis $\{\vec{f}_1, \vec{f}_2\}$.

• The vector $\vec{v}_f = \begin{pmatrix} c_1 \\ c_2 \end{pmatrix}$ is called the 'coordinate vector' of $\vec{v}$ in the basis $\{\vec{f}_1, \vec{f}_2\}$.

Basics of basis sets

• Let’s look at an example. Consider the basis

  ⎪⎧⎛ 1⎞ ⎛ −2⎞ ⎪⎫ { f1 , f2 } = ⎨⎜ ⎟ , ⎜ ⎟ ⎬ ⎝ 3⎠ ⎝ 1 ⎠ ⎩⎪ ⎭⎪

and the vector $\vec{v} = \begin{pmatrix} 3 \\ 5 \end{pmatrix}$ in the standard basis.

• Find the vector coordinates of $\vec{v}$ in the new basis.

• Write $\vec{v}$ as a linear combination of the new basis vectors, $c_1\vec{f}_1 + c_2\vec{f}_2 = \vec{v}$, which gives a system of equations:

$$c_1 \begin{pmatrix} 1 \\ 3 \end{pmatrix} + c_2 \begin{pmatrix} -2 \\ 1 \end{pmatrix} = \begin{pmatrix} 3 \\ 5 \end{pmatrix} \qquad \begin{aligned} c_1 - 2c_2 &= 3 \\ 3c_1 + c_2 &= 5 \end{aligned}$$

Basics of basis sets

• We can write this system of equations in matrix notation:

$$\begin{aligned} c_1 - 2c_2 &= 3 \\ 3c_1 + c_2 &= 5 \end{aligned} \qquad\Longleftrightarrow\qquad F\,\vec{v}_f = \vec{v}$$

where
$$F = \begin{pmatrix} 1 & -2 \\ 3 & 1 \end{pmatrix} \qquad \vec{v}_f = \begin{pmatrix} c_1 \\ c_2 \end{pmatrix} \qquad \vec{v} = \begin{pmatrix} 3 \\ 5 \end{pmatrix}$$

• Now solve for $\vec{v}_f$ by multiplying both sides of the equation by the inverse of the matrix $F$:
$$F^{-1}F\,\vec{v}_f = F^{-1}\vec{v} \qquad\Rightarrow\qquad \vec{v}_f = F^{-1}\vec{v}$$

Basics of basis sets

• We can find the inverse of $F$:
$$F^{-1} = \frac{1}{7} \begin{pmatrix} 1 & 2 \\ -3 & 1 \end{pmatrix}$$

 −1 1 ⎛ 1 2 ⎞ ⎛ 3 ⎞ vf = F v = ⎜ ⎟ ⎜ ⎟ 7 ⎝ −3 1 ⎠ ⎝ 5 ⎠

$$= \frac{1}{7} \begin{pmatrix} 3 + 10 \\ -9 + 5 \end{pmatrix} = \frac{1}{7} \begin{pmatrix} 13 \\ -4 \end{pmatrix}$$

• Thus, we find the coordinate vector of $\vec{v}$ in the basis $\{\vec{f}_1, \vec{f}_2\}$:
$$\vec{v}_f = \begin{pmatrix} 13/7 \\ -4/7 \end{pmatrix}$$

Basics of basis sets

• In summary: to find the coordinate vector for $\vec{v}$ in the basis $\{\vec{f}_1, \vec{f}_2\}$, we construct a matrix $F$ whose columns are just the elements of the basis vectors:
$$F = \begin{pmatrix} \vec{f}_1 & \vec{f}_2 \end{pmatrix} \qquad F = \begin{pmatrix} \vec{f}_1 & \vec{f}_2 & \vec{f}_3 & \dots & \vec{f}_n \end{pmatrix} \qquad \text{such that } \vec{v} = F\,\vec{v}_f$$

• We can solve for $\vec{v}_f$ by multiplying both sides of the equation by the inverse of the matrix $F$:

$$\vec{v}_f = F^{-1}\vec{v} \qquad \text{('change of basis')}$$

• But this only works if $F$ has an inverse!
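A minimal NumPy version of the worked example above (a sketch, not part of the original slides). Using np.linalg.solve rather than forming $F^{-1}$ explicitly is just a standard numerical choice; the result matches the coordinates found by hand.

```python
import numpy as np

# Basis vectors from the example, stacked as the columns of F.
F = np.column_stack([np.array([1.0, 3.0]),     # f1
                     np.array([-2.0, 1.0])])   # f2

v = np.array([3.0, 5.0])             # the vector in the standard basis

v_f = np.linalg.solve(F, v)          # coordinates in the new basis, v_f = F^-1 v
print(v_f)                           # [ 1.857... -0.571...] = [13/7, -4/7]

print(F @ v_f)                       # [3. 5.]: reconstructs v, since v = F v_f
```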


Subspaces

• We need n vectors in $\mathbb{R}^n$ to form a basis in $\mathbb{R}^n$. But not any set of n vectors will do the trick!

• Consider the following set of vectors

$$\vec{f}_1 = \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix} \qquad \vec{f}_2 = \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix} \qquad \vec{f}_3 = \begin{pmatrix} 1 \\ 1 \\ 0 \end{pmatrix}$$

• Note that any linear combination of $\{\vec{f}_1, \vec{f}_2, \vec{f}_3\}$ will always lie in the (x, y) plane:
$$\vec{v} = c_1\vec{f}_1 + c_2\vec{f}_2 + c_3\vec{f}_3 = \begin{pmatrix} c_1 + c_3 \\ c_2 + c_3 \\ 0 \end{pmatrix}$$

• Thus, the set of vectors $\{\vec{f}_1, \vec{f}_2, \vec{f}_3\}$ doesn't span all of $\mathbb{R}^3$. It only spans the x-y plane, a subspace of $\mathbb{R}^3$.

Linear independence

$$\vec{f}_1 = \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix} \qquad \vec{f}_2 = \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix} \qquad \vec{f}_3 = \begin{pmatrix} 1 \\ 1 \\ 0 \end{pmatrix}$$

• Note that we can write any of these vectors as a linear combination of the other two:
$$\vec{f}_3 = \vec{f}_1 + \vec{f}_2 \qquad \vec{f}_2 = \vec{f}_3 - \vec{f}_1 \qquad \vec{f}_1 = \vec{f}_3 - \vec{f}_2$$

• Thus, this set of vectors is called ‘linearly dependent’.

• Any set of n linearly dependent vectors cannot form a basis in $\mathbb{R}^n$.

• How do you know if a set of vectors is linearly dependent? Form the matrix whose columns are the vectors and check its determinant:
$$F = \begin{pmatrix} \vec{f}_1 & \vec{f}_2 & \vec{f}_3 & \dots & \vec{f}_n \end{pmatrix} \qquad \det(F) = 0$$

Linear dependence

• If $\det(F) = 0$, then $F$ maps $\vec{v}_f$ into a subspace of $\mathbb{R}^n$.

$$F = \begin{pmatrix} 0.5 & 0.5 \\ 0.5 & 0.5 \end{pmatrix}$$

• If F maps onto a subspace, then the mapping is not reversible!

$$\det(F) = 0$$
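A short NumPy sketch (illustrative, not from the slides) applying this determinant test to the three vectors from the subspace example:

```python
import numpy as np

# Columns are f1, f2, f3 from the subspace example.
F = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0],
              [0.0, 0.0, 0.0]])

print(np.linalg.det(F))              # 0.0: the vectors are linearly dependent
print(np.linalg.matrix_rank(F))      # 2: they span only the x-y plane of R^3
```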

Geometric interpretation of the determinant

• The determinant is the 'volume' of a unit cube after transformation (the area of the unit square in two dimensions).

[Figure: the unit square (area = 1) is mapped by $A$ to a shape with area = 0.5.]

det(A) = 0.5

• A pure rotation matrix has a determinant of one.

[Figure: the unit square (area = 1) is mapped by a rotation $A$ to a square with area = 1.]

det(A) = 1


Change of basis

$$\{\vec{f}_1, \vec{f}_2, \dots, \vec{f}_n\} \qquad F = \begin{pmatrix} \vec{f}_1 & \vec{f}_2 & \dots & \vec{f}_n \end{pmatrix}$$

• If $\det(F) \neq 0$, then the vectors $\{\vec{f}_1, \vec{f}_2, \dots, \vec{f}_n\}$

o are linearly independent

o form a complete basis set in $\mathbb{R}^n$

• Then the matrix F implements a ‘change of basis’

From the standard basis to $f$: $\vec{v}_f = F^{-1}\vec{v}$. Or from $f$ to the standard basis: $\vec{v} = F\,\vec{v}_f$.

Change of basis

• The change of basis is easy if $\{\hat{f}_1, \hat{f}_2\}$ is an orthonormal basis…

$$F = \begin{pmatrix} \hat{f}_1 & \hat{f}_2 \end{pmatrix} \qquad F^T = \begin{pmatrix} \hat{f}_1 \\ \hat{f}_2 \end{pmatrix}$$

$$F^T F = \begin{pmatrix} \hat{f}_1 \\ \hat{f}_2 \end{pmatrix} \begin{pmatrix} \hat{f}_1 & \hat{f}_2 \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} = I$$

Thus… $F^T = F^{-1}$. $F$ is just a rotation matrix!

Change of basis

• With an orthonormal basis set, the coordinates are just given by the dot product with the basis vectors!

$$F = \begin{pmatrix} \hat{f}_1 & \hat{f}_2 \end{pmatrix} \qquad F^{-1} = F^T = \begin{pmatrix} \hat{f}_1 \\ \hat{f}_2 \end{pmatrix}$$

$$\vec{v}_f = F^{-1}\vec{v} \qquad \vec{v}_f = F^T\vec{v} = \begin{pmatrix} \hat{f}_1 \\ \hat{f}_2 \end{pmatrix} \vec{v} = \begin{pmatrix} \vec{v}\cdot\hat{f}_1 \\ \vec{v}\cdot\hat{f}_2 \end{pmatrix}$$

Change of basis

• In two dimensions, there is a family of orthonormal basis sets

$$F = \begin{pmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{pmatrix} \qquad \hat{f}_1 = \begin{pmatrix} \cos\theta \\ -\sin\theta \end{pmatrix} \qquad \hat{f}_2 = \begin{pmatrix} \sin\theta \\ \cos\theta \end{pmatrix}$$

$$\vec{v} = (\vec{v}\cdot\hat{f}_1)\,\hat{f}_1 + (\vec{v}\cdot\hat{f}_2)\,\hat{f}_2$$

$$\vec{v}_f = F^T\vec{v} = \begin{pmatrix} \vec{v}\cdot\hat{f}_1 \\ \vec{v}\cdot\hat{f}_2 \end{pmatrix}$$

• The vector coordinates are given by the dot products of the vector $\vec{v}$ with each of the rotated basis vectors.
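A small NumPy sketch of an orthonormal change of basis (the angle and vector are arbitrary illustrative choices): the inverse of $F$ is its transpose, and the new coordinates are just dot products with the rotated basis vectors.

```python
import numpy as np

theta = np.deg2rad(30.0)
F = np.array([[ np.cos(theta), np.sin(theta)],
              [-np.sin(theta), np.cos(theta)]])   # columns are f1_hat and f2_hat

# Orthonormal basis: F^T F = I, so F^-1 = F^T.
assert np.allclose(F.T @ F, np.eye(2))
assert np.allclose(np.linalg.inv(F), F.T)

v = np.array([2.0, 1.0])

v_f = F.T @ v                                        # coordinates in the rotated basis
assert np.allclose(v_f, [v @ F[:, 0], v @ F[:, 1]])  # = dot products with f1_hat, f2_hat

print(F @ v_f)                                       # [2. 1.]: back in the standard basis
```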

Seeing in high dimensions

[Screenshot: Google Embedding Projector (https://research.googleblog.com/2016/12/open-sourcing-embedding-projector-tool.html). Image © Embedding Projector Project; excluded from the OCW Creative Commons license (see https://ocw.mit.edu/help/faq-fair-use/).]


MIT OpenCourseWare
https://ocw.mit.edu/

9.40 Introduction to Neural Computation Spring 2018

For information about citing these materials or our Terms of Use, visit: https://ocw.mit.edu/terms.