
Chapter 1

Introduction to Linear Algebra

1.1 Vector Operations

• Scalar $x$: a simple numeric value/variable, e.g. $x = 2.5$, $x = \pi$, $x = 10^5$

• N-dimensional column vector $\vec{v}$ with elements $v_i$:
\[
\vec{v} = \begin{pmatrix} v_1 \\ v_2 \\ \vdots \\ v_N \end{pmatrix} \tag{1.1}
\]

The transpose of $\vec{v}$, denoted $\vec{v}^\top$, is a row vector:

\[
\vec{v}^\top = (v_1, v_2, \ldots, v_N) \tag{1.2}
\]

• Addition of vectors: The sum $\vec{a} + \vec{b}$ is a vector with elements $(\vec{a} + \vec{b})_i = a_i + b_i$.

• The dot-product (inner product) of two vectors gives a scalar value:

\[
\vec{a} \cdot \vec{b} = \vec{a}^\top \vec{b} = \vec{b}^\top \vec{a} = \sum_i a_i b_i \tag{1.3}
\]
The norm (length) of vector $\vec{v}$ is given by $|\vec{v}| = \sqrt{\vec{v} \cdot \vec{v}} = \sqrt{\sum_i v_i^2}$.

The unit/normalized vector $\hat{v}$ of a non-zero vector $\vec{v}$ is $\hat{v} = \frac{\vec{v}}{|\vec{v}|}$, with $|\hat{v}| = 1$.

The dot-product of two vectors has an interesting geometric interpretation:

\[
\vec{a} \cdot \vec{b} = |\vec{a}| \cdot |\vec{b}| \cdot \cos(\theta) \tag{1.4}
\]

Here $\theta$ is the angle between the two vectors. Two vectors are orthogonal if $\theta = \frac{\pi}{2}$, i.e. if $\vec{a} \cdot \vec{b} = 0$. The projection of $\vec{a}$ onto $\vec{b}$ (fig. 1.1) is given by:

\[
\vec{a} \cdot \hat{b} = |\vec{a}| \cdot \cos(\theta) \tag{1.5}
\]


Figure 1.1: Projection of $\vec{a}$ onto $\vec{b}$.

• Multiplication by a scalar $k$: $k \cdot \vec{v} = (k \cdot v_1, k \cdot v_2, \ldots, k \cdot v_N)^\top$. The length of the vector scales with the factor $|k|$:
\[
|k \cdot \vec{v}| = \sqrt{\sum_i (k \cdot v_i)^2} = \sqrt{k^2 \cdot \sum_i v_i^2} = |k| \cdot |\vec{v}| \tag{1.6}
\]
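For readers who want to check these operations numerically, here is a minimal sketch in Python with NumPy (an assumption of these notes, not part of the original text); the vectors are arbitrary example values.

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0])
b = np.array([2.0, 3.0, 1.0])

# Vector addition and scalar multiplication are element-wise
print(a + b)                  # [3. 5. 4.]
print(2 * a)                  # [2. 4. 6.]

# Dot-product (eq. 1.3) and norm
dot = np.dot(a, b)            # sum_i a_i * b_i
norm_a = np.linalg.norm(a)    # |a| = sqrt(a . a)

# Unit vector and the geometric interpretation (eq. 1.4)
a_hat = a / norm_a
cos_theta = dot / (norm_a * np.linalg.norm(b))
theta = np.arccos(cos_theta)  # angle between a and b

# Projection of a onto b (eq. 1.5)
b_hat = b / np.linalg.norm(b)
proj_len = np.dot(a, b_hat)   # |a| * cos(theta)

print(dot, norm_a, np.degrees(theta), proj_len)
```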

Exercise 1.1.1. Calculate all the vector products and the lengths for the vectors: $\vec{v}_1 = (1, 2, 3)$, $\vec{v}_2 = (2, 3, 1)$, $\vec{v}_3 = (-8, -5, 31)$.

Exercise 1.1.2. Try to explain in your own words why $\vec{v}_2 \cdot \vec{v}_3 = 0$ for any vectors $\vec{v}_1$ and $\vec{v}_2$, if $\vec{v}_3 = \vec{v}_1 \cdot |\vec{v}_2|^2 - \vec{v}_2 \cdot (\vec{v}_1 \cdot \vec{v}_2)$.

Exercise 1.1.3. Prove that $\vec{a} \cdot \vec{b} = |\vec{a}| \cdot |\vec{b}| \cdot \cos(\theta)$, e.g. using the law of cosines.

1.1.1 The Linear Neuron

Imagine a neuron A receiving input from N sensory neurons. Each synapse has a weight or efficacy $w_i$ and the activity of each pre-synaptic neuron is described by the firing rate $x_i$. Synaptic weights with $w_i > 0$ correspond to excitatory synapses, whereas weights with $w_i < 0$ represent inhibitory synapses. In the case of a linear neuron, the firing rate $x_A$ of neuron A depends linearly on its input, i.e. its firing rate is a weighted sum of its inputs:
\[
x_A = w_1 x_1 + w_2 x_2 + \ldots + w_N x_N = \sum_i w_i x_i \tag{1.7}
\]
If we describe the neuronal inputs and synaptic weights by vectors $\vec{x}$ and $\vec{w}$, respectively, then we can write eq. 1.7 for the firing rate $x_A$ more compactly as:

\[
x_A = \vec{w} \cdot \vec{x} \tag{1.8}
\]
The output of the linear neuron A is zero precisely if the input vector $\vec{x}$ is orthogonal to the weight vector $\vec{w}$. The set of input vectors that are orthogonal to the weight vector forms a so-called hyperplane in the input space. In other words, our linear neuron is a detector which is maximally sensitive to inputs parallel to a particular direction in the input space and minimally sensitive to inputs lying on the $(N-1)$-dimensional hyperplane orthogonal to this direction.
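As a concrete illustration of eq. 1.8, the sketch below (Python/NumPy, with made-up weights and firing rates) computes the output of a linear neuron and checks that an input orthogonal to the weight vector gives zero output.

```python
import numpy as np

w = np.array([0.5, -1.0, 2.0])   # synaptic weights (positive: excitatory, negative: inhibitory)
x = np.array([10.0, 4.0, 3.0])   # pre-synaptic firing rates

x_A = np.dot(w, x)               # eq. 1.8: weighted sum of the inputs
print(x_A)                       # 0.5*10 - 1.0*4 + 2.0*3 = 7.0

# An input orthogonal to w produces zero output:
x_orth = np.array([2.0, 1.0, 0.0])   # w . x_orth = 1.0 - 1.0 + 0.0 = 0
print(np.dot(w, x_orth))             # 0.0
```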

1.2 Linear Mappings of Vectors

Consider a mapping $M(\vec{v})$ that maps an N-dimensional vector $\vec{v}$ to a P-dimensional vector $M(\vec{v}) = (M_1(\vec{v}), M_2(\vec{v}), \ldots, M_P(\vec{v}))^\top$. This mapping is linear if and only if:

1. For all scalars $k$: $M(k \cdot \vec{v}) = k \cdot M(\vec{v})$

2. For all pairs of vectors $\vec{a}$ and $\vec{b}$: $M(\vec{a} + \vec{b}) = M(\vec{a}) + M(\vec{b})$

This means that each element of $M(\vec{v})$ is determined by a linear combination of the elements of $\vec{v}$. Hence, for each element $M_i(\vec{v})$ we can find some scalars $M_{ij}$ such that:
\[
M_i(\vec{v}) = M_{i1} v_1 + M_{i2} v_2 + \ldots + M_{iN} v_N = \sum_j M_{ij} v_j \tag{1.9}
\]

We arrange the scalars $M_{ij}$ into a $P \times N$-matrix $M$ and define the product $M \cdot \vec{v}$ of matrix $M$ with column vector $\vec{v}$ by:
\[
(M \cdot \vec{v})_i = \sum_j M_{ij} v_j \tag{1.10}
\]
and the product $\vec{v}^\top \cdot M$ of matrix $M$ with row vector $\vec{v}^\top$ is given by:

\[
(\vec{v}^\top \cdot M)_j = \sum_i v_i M_{ij} \tag{1.11}
\]

This motivates the definition of matrices and the matrix-vector product. Thus, each possible linear mapping of a vector can be described by multiplying the vector with a corresponding matrix. We say that matrix multiplication of a vector corresponds to a linear transformation of the vector.
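A quick numerical check of the two linearity conditions for the mapping $\vec{v} \mapsto M \cdot \vec{v}$ is shown below; the matrix and vectors are arbitrary example values (Python/NumPy sketch, not part of the original notes).

```python
import numpy as np

M = np.array([[1.0, 2.0, 0.0],
              [0.0, 1.0, 3.0]])      # a P x N = 2 x 3 matrix
a = np.array([1.0, -1.0, 2.0])
b = np.array([0.5,  3.0, 1.0])
k = 4.0

# Condition 1: M(k*v) = k*M(v)
print(np.allclose(M @ (k * a), k * (M @ a)))    # True

# Condition 2: M(a + b) = M(a) + M(b)
print(np.allclose(M @ (a + b), M @ a + M @ b))  # True
```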

1.3 Matrix Operations

• A $P \times N$-matrix $M$ has $P$ rows and $N$ columns and elements $M_{ij}$, where $i$ indicates the row index and $j$ represents the column index:
\[
M = \begin{pmatrix} M_{11} & M_{12} & \cdots & M_{1N} \\ M_{21} & M_{22} & \cdots & M_{2N} \\ \vdots & \vdots & \ddots & \vdots \\ M_{P1} & M_{P2} & \cdots & M_{PN} \end{pmatrix} \tag{1.12}
\]

The transpose of $M$, $M^\top$, is the matrix with elements $(M^\top)_{ij} = M_{ji}$, i.e. the columns and rows of $M$ are flipped:
\[
M^\top = \begin{pmatrix} M_{11} & M_{21} & \cdots & M_{P1} \\ M_{12} & M_{22} & \cdots & M_{P2} \\ \vdots & \vdots & \ddots & \vdots \\ M_{1N} & M_{2N} & \cdots & M_{PN} \end{pmatrix} \tag{1.13}
\]

• Multiplication by a scalar k:

The matrix $k \cdot M = M \cdot k$ has the elements $(k \cdot M)_{ij} = k \cdot M_{ij}$.

• Addition of matrices: $A + B$ is a matrix with elements $(A + B)_{ij} = A_{ij} + B_{ij}$.

• The matrix-product of an $M \times N$-matrix $A$ with an $N \times P$-matrix $B$ is defined as follows:
\[
A \cdot B = \begin{pmatrix} A_{11} & A_{12} & \cdots & A_{1N} \\ A_{21} & A_{22} & \cdots & A_{2N} \\ \vdots & \vdots & \ddots & \vdots \\ A_{M1} & A_{M2} & \cdots & A_{MN} \end{pmatrix} \cdot \begin{pmatrix} B_{11} & B_{12} & \cdots & B_{1P} \\ B_{21} & B_{22} & \cdots & B_{2P} \\ \vdots & \vdots & \ddots & \vdots \\ B_{N1} & B_{N2} & \cdots & B_{NP} \end{pmatrix}
= \begin{pmatrix} \vec{A}_1 \\ \vec{A}_2 \\ \vdots \\ \vec{A}_M \end{pmatrix} \cdot \begin{pmatrix} \vec{B}_1 & \vec{B}_2 & \cdots & \vec{B}_P \end{pmatrix}
\]
\[
= \begin{pmatrix} \vec{A}_1 \cdot \vec{B}_1 & \vec{A}_1 \cdot \vec{B}_2 & \cdots & \vec{A}_1 \cdot \vec{B}_P \\ \vec{A}_2 \cdot \vec{B}_1 & \vec{A}_2 \cdot \vec{B}_2 & \cdots & \vec{A}_2 \cdot \vec{B}_P \\ \vdots & \vdots & \ddots & \vdots \\ \vec{A}_M \cdot \vec{B}_1 & \vec{A}_M \cdot \vec{B}_2 & \cdots & \vec{A}_M \cdot \vec{B}_P \end{pmatrix}
= \begin{pmatrix} \sum_i A_{1i} B_{i1} & \sum_i A_{1i} B_{i2} & \cdots & \sum_i A_{1i} B_{iP} \\ \sum_i A_{2i} B_{i1} & \sum_i A_{2i} B_{i2} & \cdots & \sum_i A_{2i} B_{iP} \\ \vdots & \vdots & \ddots & \vdots \\ \sum_i A_{Mi} B_{i1} & \sum_i A_{Mi} B_{i2} & \cdots & \sum_i A_{Mi} B_{iP} \end{pmatrix} \tag{1.14}
\]
For each row of $A$ we calculate the dot-product with each column of $B$. Note, in general the matrix-product is not commutative: $AB \neq BA$ (see the numerical sketch after the exercises below).

• An $N \times N$-matrix is a square matrix. A square matrix $M$ is called symmetric if $M = M^\top$. This means $M_{ij} = M_{ji}$ for all $i$ and $j$.

• The identity matrix $\mathbb{1}$ is a matrix with $M_{ii} = 1$ on the diagonal and $M_{ij} = 0$ for $i \neq j$ otherwise.

• The inverse of a square matrix $M$ is a matrix $M^{-1}$ satisfying:
\[
M^{-1} \cdot M = M \cdot M^{-1} = \mathbb{1} \tag{1.15}
\]
Note, not all matrices have an inverse, but if the inverse exists, it is unique. If the inverse $M^{-1}$ exists, the matrix $M$ is called invertible.

Exercise 1.3.1. Calculate the following products: $A\vec{v}$, $\vec{v}^\top B$, $AB$ and $BA$ for:
\[
\vec{v} = (1, 1, 1)^\top, \quad A = \begin{pmatrix} 1 & 5 & 6 \\ 3 & 2 & 5 \\ 4 & 1 & 7 \end{pmatrix}, \quad B = \begin{pmatrix} 4 & 1 & 3 \\ 2 & 1 & 1 \\ 3 & 1 & 2 \end{pmatrix}
\]

Exercise 1.3.2. Show that $(AB)^\top = B^\top A^\top$.

Exercise 1.3.3. Show that $(A^\top)^{-1} = (A^{-1})^\top$.

Exercise 1.3.4. Suppose $A$ and $B$ are both invertible $N \times N$-matrices. Show that $(AB)^{-1} = B^{-1} A^{-1}$.
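The matrix operations of this section can be checked numerically. Below is a minimal sketch in Python/NumPy (not part of the original notes) with arbitrary example matrices; it illustrates the transpose, the non-commutativity of the matrix product, and the defining property of the inverse.

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0]])
B = np.array([[0.0, 1.0],
              [1.0, 1.0]])

# Transpose: (A^T)_ij = A_ji
print(A.T)

# Matrix product (eq. 1.14): row-times-column dot-products
print(A @ B)
print(np.allclose(A @ B, B @ A))      # False: the product is not commutative

# Identity and inverse (eq. 1.15)
I = np.eye(2)
A_inv = np.linalg.inv(A)
print(np.allclose(A_inv @ A, I), np.allclose(A @ A_inv, I))   # True True

# The rule of exercise 1.3.2: (AB)^T = B^T A^T
print(np.allclose((A @ B).T, B.T @ A.T))   # True
```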

1.4 Linear Equations

A central problem of linear algebra is to solve systems of linear equations (SLE) with several unknowns. Simple SLE can be solved by substitution and elimination. For example, consider the following SLE:

\[
\begin{aligned}
2x + 3y &= 6 \\
4x + 9y &= 15
\end{aligned}
\]

1. We solve the top equation for $x$ in terms of $y$: $x = 3 - \frac{3}{2} y$

2. Then we substitute the expression for x into the bottom equation:

\[
4 \left( 3 - \tfrac{3}{2} y \right) + 9y = 15
\]

3. Now we solve this equation for $y$ and get $y = 1$. This in turn we substitute into the expression from the first step and get: $x = 3 - \frac{3}{2} \cdot 1 = \frac{3}{2}$.

However, for more complicated SLE with more equations and more unknowns we need a more systematic approach. A method that is particularly useful and efficient for numerical solutions of SLE is Gaussian elimination. We will discuss Gaussian elimination by solving the following SLE:

\[
\begin{aligned}
v_1 + v_2 + v_3 &= 0 \\
4v_1 + 2v_2 + v_3 &= 1 \\
9v_1 + 3v_2 + v_3 &= 3
\end{aligned}
\]

1. Write the SLE in matrix form $M \cdot \vec{v} = \vec{b}$ and generate the extended coefficient matrix:

\[
\left(\, M \mid \vec{b} \,\right) = \left(\begin{array}{ccc|c} 1 & 1 & 1 & 0 \\ 4 & 2 & 1 & 1 \\ 9 & 3 & 1 & 3 \end{array}\right)
\]

2. The goal is to turn M into the identity matrix by

• swapping rows
• multiplying rows by a scalar value
• adding/subtracting rows from each other

\[
\left(\begin{array}{ccc|c} 1 & 1 & 1 & 0 \\ 4 & 2 & 1 & 1 \\ 9 & 3 & 1 & 3 \end{array}\right)
\xrightarrow[R_3 - 9 \cdot R_1]{R_2 - 4 \cdot R_1}
\left(\begin{array}{ccc|c} 1 & 1 & 1 & 0 \\ 0 & -2 & -3 & 1 \\ 0 & -6 & -8 & 3 \end{array}\right)
\]

\[
\xrightarrow[-\frac{1}{2} \cdot R_2]{R_3 - 3 \cdot R_2}
\left(\begin{array}{ccc|c} 1 & 1 & 1 & 0 \\ 0 & 1 & \frac{3}{2} & -\frac{1}{2} \\ 0 & 0 & 1 & 0 \end{array}\right)
\xrightarrow[R_2 - \frac{3}{2} \cdot R_3]{R_1 - R_3}
\left(\begin{array}{ccc|c} 1 & 1 & 0 & 0 \\ 0 & 1 & 0 & -\frac{1}{2} \\ 0 & 0 & 1 & 0 \end{array}\right)
\xrightarrow{R_1 - R_2}
\left(\begin{array}{ccc|c} 1 & 0 & 0 & \frac{1}{2} \\ 0 & 1 & 0 & -\frac{1}{2} \\ 0 & 0 & 1 & 0 \end{array}\right)
\]

3. By substituting the coefficients back into our SLE we get:
\[
v_1 = \tfrac{1}{2}, \quad v_2 = -\tfrac{1}{2}, \quad v_3 = 0 \tag{1.16}
\]

Exercise 1.4.1. Solve the following SLE:
\[
\begin{aligned}
10y - z + w &= 10 \\
2x - 2y - 4z &= -3 \\
4x + 2y + 4w &= 5 \\
3x + 2y + 3w &= 4
\end{aligned}
\]
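For numerical work one rarely performs the elimination by hand. The sketch below (Python/NumPy, an assumption of these notes) solves the worked example $M \cdot \vec{v} = \vec{b}$ from above and reproduces eq. 1.16.

```python
import numpy as np

M = np.array([[1.0, 1.0, 1.0],
              [4.0, 2.0, 1.0],
              [9.0, 3.0, 1.0]])
b = np.array([0.0, 1.0, 3.0])

# np.linalg.solve uses an LU factorization, i.e. a form of Gaussian elimination
v = np.linalg.solve(M, b)
print(v)                          # [ 0.5 -0.5  0. ]  ->  v1 = 1/2, v2 = -1/2, v3 = 0
print(np.allclose(M @ v, b))      # True
```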

1.5 Invertible Matrices

An $N \times N$-matrix $M$ is called an invertible matrix if there exists a matrix $M^{-1}$ satisfying:
\[
M^{-1} \cdot M = M \cdot M^{-1} = \mathbb{1} \tag{1.17}
\]
If a matrix is not invertible, it is called a singular matrix. Note, the definition of the inverse $M^{-1}$ corresponds to a system of linear equations (SLE). Therefore $M^{-1}$ can be calculated by using Gaussian elimination:

1. Suppose $M = \begin{pmatrix} 1 & 2 & 0 \\ 2 & 3 & 0 \\ 3 & 4 & 1 \end{pmatrix}$. What is the inverse $M^{-1}$ of $M$?

2. Let's write the SLE in matrix form and generate the corresponding extended coefficient matrix:
\[
\left(\, M \mid \mathbb{1} \,\right) = \left(\begin{array}{ccc|ccc} 1 & 2 & 0 & 1 & 0 & 0 \\ 2 & 3 & 0 & 0 & 1 & 0 \\ 3 & 4 & 1 & 0 & 0 & 1 \end{array}\right)
\]

3. As before we use Gaussian elimination (see section 1.4) in order to turn $M$ into the identity matrix:
\[
\left(\begin{array}{ccc|ccc} 1 & 2 & 0 & 1 & 0 & 0 \\ 2 & 3 & 0 & 0 & 1 & 0 \\ 3 & 4 & 1 & 0 & 0 & 1 \end{array}\right)
\xrightarrow[R_3 - 3 \cdot R_1]{R_2 - 2 \cdot R_1}
\left(\begin{array}{ccc|ccc} 1 & 2 & 0 & 1 & 0 & 0 \\ 0 & -1 & 0 & -2 & 1 & 0 \\ 0 & -2 & 1 & -3 & 0 & 1 \end{array}\right)
\]
\[
\xrightarrow{-1 \cdot R_2}
\left(\begin{array}{ccc|ccc} 1 & 2 & 0 & 1 & 0 & 0 \\ 0 & 1 & 0 & 2 & -1 & 0 \\ 0 & -2 & 1 & -3 & 0 & 1 \end{array}\right)
\xrightarrow[R_3 + 2 \cdot R_2]{R_1 - 2 \cdot R_2}
\left(\begin{array}{ccc|ccc} 1 & 0 & 0 & -3 & 2 & 0 \\ 0 & 1 & 0 & 2 & -1 & 0 \\ 0 & 0 & 1 & 1 & -2 & 1 \end{array}\right)
\]
The right-hand block is the inverse, i.e. $M^{-1} = \begin{pmatrix} -3 & 2 & 0 \\ 2 & -1 & 0 \\ 1 & -2 & 1 \end{pmatrix}$.
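The same inverse can be obtained numerically; the following sketch (Python/NumPy, an assumption of these notes) checks the result of the elimination above against eq. 1.17.

```python
import numpy as np

M = np.array([[1.0, 2.0, 0.0],
              [2.0, 3.0, 0.0],
              [3.0, 4.0, 1.0]])

M_inv = np.linalg.inv(M)
print(M_inv)   # approximately [[-3, 2, 0], [2, -1, 0], [1, -2, 1]]

# Definition of the inverse (eq. 1.17): M^-1 M = M M^-1 = identity
print(np.allclose(M_inv @ M, np.eye(3)))   # True
print(np.allclose(M @ M_inv, np.eye(3)))   # True
```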

Figure 1.2: Addition and scalar multiplication of vectors.

Exercise 1.5.1. Calculate the inverse of the following matrix:

2 2 −1 1 1 3 0 1 A =   (1.18) 2 4 −1 2 1 2 −1 1

Exercise 1.5.2. Prove that:

\[
A^{-1} = \begin{pmatrix} a & b \\ c & d \end{pmatrix}^{-1} = \frac{1}{ad - bc} \begin{pmatrix} d & -b \\ -c & a \end{pmatrix} \tag{1.19}
\]

The denominator in eq. 1.19 has an interesting interpretation. As we will see later, the determinant of a $2 \times 2$-matrix $A$ is $\det(A) = ad - bc$.
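The formula of eq. 1.19 is easy to verify numerically for a concrete matrix; the sketch below (Python/NumPy, arbitrary example values) compares it with the general-purpose inverse routine.

```python
import numpy as np

a, b, c, d = 2.0, 3.0, 1.0, 4.0
A = np.array([[a, b],
              [c, d]])

det = a * d - b * c                          # determinant of a 2x2 matrix
A_inv_formula = np.array([[ d, -b],
                          [-c,  a]]) / det   # eq. 1.19

print(np.allclose(A_inv_formula, np.linalg.inv(A)))   # True
print(np.isclose(det, np.linalg.det(A)))              # True
```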

1.6 Vector Spaces and Subspaces

An N-dimensional vector space $\mathbb{R}^N$ is simply a collection of N-dimensional vectors along with two methods of superposition (fig. 1.2):

1. Addition of vectors: $\vec{v} + \vec{w}$

2. Scalar multiplication of vectors, i.e. scaling by a scalar $k$: $k \cdot \vec{w}$

A vector space has to contain two special elements:

1. Identity element of addition $\vec{0}$: $\vec{v} + \vec{0} = \vec{v}$

2. Identity element of scalar multiplication $1$: $1 \cdot \vec{v} = \vec{v}$

Examples of vector spaces:

• $\mathbb{R}^1$: Vector space of all real numbers.

• $\mathbb{R}^3$: 3-dimensional space.

But how could we visualize an N-dimensional space? One way is to pack/collapse $(N-1)$ dimensions into one "pseudo"-axis and visualize it in a two-dimensional plot.

Figure 1.3: Visualization of an N-dimensional space.

On the left in fig. 1.3 we see, for example, the outline of a 3-dim. vector space $\mathbb{R}^3$, with dimensions or axes $R_1$, $R_2$, $R_3$. The first two dimensions $R_1$ and $R_2$ build a 2-dim. vector space $\mathbb{R}^2$. By turning the figure and looking along the axis $R_2$ one gets a two-dimensional representation with the remaining axis $R_3$ and a "pseudo"-axis $R_2$. On the right we see a corresponding two-dimensional representation of an N-dim. space $\mathbb{R}^N$ with a remaining axis $R_N$ and a "pseudo"-axis $R_{N-1}$.

1.6.1 Linear Subspaces

A subspace of a vector space $\mathbb{R}^N$ is a subset of vectors $V$ in $\mathbb{R}^N$ that build a vector space by themselves. For example, every 2-dim. plane and every 1-dim. line intersecting with the origin is a subspace of $\mathbb{R}^3$ (see fig. 1.4). A subset of vectors $V$ is a linear subspace of $\mathbb{R}^N$ if and only if:

1. The zero-vector $\vec{0}$ is in $V$.

2. Any linear combination of elements of $V$ is again an element of $V$, i.e.:

• If two elements $\vec{v}$ and $\vec{w}$ are in $V$, then the sum $\vec{v} + \vec{w}$ is in $V$.
• If $\vec{v}$ is in $V$, the scaled vector $k \cdot \vec{v}$ is in $V$.

A K-dimensional subspace $V$ can be described by a set of $K$ linearly independent vectors $\vec{b}_1, \vec{b}_2, \ldots, \vec{b}_K$ that span $V$. Such a set of linearly independent vectors is called a basis $B$ of subspace $V$. A basis $B$ of a K-dim. subspace $V$ has the following properties:

• Linear independency: If $x_1 \cdot \vec{b}_1 + x_2 \cdot \vec{b}_2 + \ldots + x_K \cdot \vec{b}_K = 0$, it necessarily follows that $x_1 = x_2 = \ldots = x_K = 0$. This means that each linearly independent vector contains unique information that is not contained in any of the other vectors, and therefore linearly independent vectors cannot cancel each other out.

• Spanning property: For every $\vec{v} \in V$ we can find $x_1, x_2, \ldots, x_K \in \mathbb{R}$ such that:
\[
\vec{v} = x_1 \cdot \vec{b}_1 + x_2 \cdot \vec{b}_2 + \ldots + x_K \cdot \vec{b}_K \tag{1.20}
\]

This means that every element of the subspace can be represented by a linear combination of basis vectors.

Figure 1.4: A plane intersecting with the origin is a 2-dim. subspace of $\mathbb{R}^3$.

For example, consider $\mathbb{R}^2$, the vector space of all coordinates $(a, b)$ where both $a$ and $b$ are real numbers (see fig. 1.5). Then a natural and simple basis consists of the vectors $\vec{e}_1 = (1, 0)$ and $\vec{e}_2 = (0, 1)$. Any vector $\vec{v} = (a, b)$ in $\mathbb{R}^2$ can be expressed as:

\[
\vec{v} = a \cdot (1, 0) + b \cdot (0, 1) = a\,\vec{e}_1 + b\,\vec{e}_2 \tag{1.21}
\]

For $\vec{e}_1$ and $\vec{e}_2$ we immediately see that both the linear independency and the spanning property are fulfilled. Note, a natural basis where the basis vectors are zero everywhere except in one dimension is called the standard basis. But any two linearly independent vectors, like $\vec{v}_1 = (1, 1)$ and $\vec{v}_2 = \left(\tfrac{1}{2}, -1\right)$, will also form a basis of $\mathbb{R}^2$.

• Linear independency: Suppose we have $a, b$ such that:
\[
a \cdot (1, 1) + b \cdot \left(\tfrac{1}{2}, -1\right) = 0
\]
Then we have the following linear equations:
\[
\begin{aligned}
a + \tfrac{1}{2} b &= 0 \\
a - b &= 0
\end{aligned}
\]

This means that $a = -\tfrac{1}{2} b$ and $a = b$, and this can only hold true if $a = b = 0$.

• Spanning property: Let $(a, b)$ be an arbitrary vector in $\mathbb{R}^2$. Now we have to show that there exist numbers $x_1, x_2$ such that:
\[
x_1 \cdot (1, 1) + x_2 \cdot \left(\tfrac{1}{2}, -1\right) = (a, b)
\]

Figure 1.5: Examples of a standard basis and an arbitrary basis of linearly independent vectors spanning $\mathbb{R}^2$.

Again, this leads us to a system of linear equations:

\[
\begin{aligned}
x_1 + \tfrac{1}{2} x_2 &= a \\
x_1 - x_2 &= b
\end{aligned}
\]

Hence, with $x_1 = b + x_2$ we get:
\[
b + x_2 + \tfrac{1}{2} x_2 = a \;\rightarrow\; x_2 = \tfrac{2}{3}(a - b), \quad x_1 = \tfrac{2}{3} a + \tfrac{1}{3} b
\]
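Finding the coordinates $(x_1, x_2)$ of a vector with respect to a basis amounts to solving a small SLE. The sketch below (Python/NumPy, with an arbitrary example vector) does this for the basis $\vec{v}_1 = (1, 1)$, $\vec{v}_2 = (\tfrac{1}{2}, -1)$ and checks the result against the formulas just derived.

```python
import numpy as np

v1 = np.array([1.0, 1.0])
v2 = np.array([0.5, -1.0])
target = np.array([2.0, 5.0])        # the vector (a, b) we want to represent

# Columns of B are the basis vectors; solve B @ [x1, x2] = target
B = np.column_stack([v1, v2])
x1, x2 = np.linalg.solve(B, target)

print(x1, x2)                                  # x1 = 2/3*a + 1/3*b = 3, x2 = 2/3*(a - b) = -2
print(np.allclose(x1 * v1 + x2 * v2, target))  # True
```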

In the following we will discuss some interesting examples of subspaces:

• All solutions of a system of homogeneous linear equations with $N$ unknowns form a subspace of $\mathbb{R}^N$. "Homogeneous" means that all equations are equal to 0.

\[
\begin{aligned}
a_{11} x_1 + a_{12} x_2 + \cdots + a_{1N} x_N &= 0 \\
a_{21} x_1 + a_{22} x_2 + \cdots + a_{2N} x_N &= 0 \\
&\;\;\vdots \\
a_{M1} x_1 + a_{M2} x_2 + \cdots + a_{MN} x_N &= 0
\end{aligned}
\]

As we know, this can be written as a single matrix equation: $A \cdot \vec{x} = \vec{0}$. The set of solutions to this matrix equation is called the null space of matrix $A$. The null space is the set of vectors that are mapped to $\vec{0}$ by matrix $A$. Suppose we have the following matrix:

\[
A = \begin{pmatrix} 1 & 1 & 1 \\ 0 & 1 & 1 \end{pmatrix}
\]

What is the null space of $A$, i.e. what are the solutions to $A\vec{v} = \vec{0}$? Using Gaussian elimination (see section 1.4) we get:
\[
\left(\begin{array}{ccc|c} 1 & 1 & 1 & 0 \\ 0 & 1 & 1 & 0 \end{array}\right)
\xrightarrow{R_1 - R_2}
\left(\begin{array}{ccc|c} 1 & 0 & 0 & 0 \\ 0 & 1 & 1 & 0 \end{array}\right)
\]

From this we get $x = 0$, $y = t$, $z = -t$, where $t$ is any real number. Hence, the solutions to $A\vec{v} = \vec{0}$ form a subspace with basis $\vec{b} = (0, -1, 1)$, which is a line in a 3-dim. vector space. Every $\vec{v} = t \cdot \vec{b}$ lies on this line and is an element of the null space of $A$ (see the numerical sketch at the end of this section).

• Basis matrix: A set of linearly independent vectors can be arranged as column vectors (or row vectors) of a matrix. This matrix is called a basis matrix. By Gaussian elimination we can turn any matrix into a basis matrix. Suppose we have the following matrix $A$ with row vectors $\vec{v}_1 = (1, 2, 3)$, $\vec{v}_2 = (0, 1, 2)$, $\vec{v}_3 = (2, 6, 10)$:

1 2 3  A = 0 1 2  2 6 10

Using Gaussian elimination we will now turn $A$ into a matrix with linearly independent rows:
\[
\begin{pmatrix} 1 & 2 & 3 \\ 0 & 1 & 2 \\ 2 & 6 & 10 \end{pmatrix}
\xrightarrow{R_3 - 2 \cdot R_1}
\begin{pmatrix} 1 & 2 & 3 \\ 0 & 1 & 2 \\ 0 & 2 & 4 \end{pmatrix}
\xrightarrow[R_1 - 2 \cdot R_2]{R_3 - 2 \cdot R_2}
\begin{pmatrix} 1 & 0 & -1 \\ 0 & 1 & 2 \\ 0 & 0 & 0 \end{pmatrix}
\]

Note, row 3 now consists of zeros, i.e. the row was eliminated. What remain are two linearly independent vectors $\vec{w}_1 = (1, 0, -1)$ and $\vec{w}_2 = (0, 1, 2)$.

• Orthogonal basis: A set of vectors that are all orthogonal to each other, i.e. $\vec{v}_i \cdot \vec{v}_j = 0$ if $i \neq j$.

Exercise 1.6.1. Determine if the vectors $\vec{v} = (1, 3, -1, 4)$, $\vec{w} = (3, 8, -5, 7)$, $\vec{u} = (2, 9, 4, 23)$ are linearly dependent or independent.

Exercise 1.6.2. Find a basis for the null space of matrix $A$:

2 4 −5 3 A = 3 6 −7 4 5 10 −11 6

Exercise 1.6.3. Prove the following: If a matrix $A$ is invertible, then its rows and columns form a basis.
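The rank and null-space computations of this section can also be carried out numerically. The sketch below (Python/NumPy; the SVD-based null-space helper is our own illustration, not part of the original notes) checks linear independence via the matrix rank and recovers a basis of the null space of the example matrix from the null-space discussion above.

```python
import numpy as np

def null_space_basis(A, tol=1e-10):
    # Rows of V^T belonging to (near-)zero singular values span the null space
    _, s, vt = np.linalg.svd(A)
    rank = np.sum(s > tol)
    return vt[rank:].T            # each column is one basis vector of the null space

# Linear independence: vectors are independent iff the rank equals their number
V = np.array([[1.0, 2.0, 3.0],
              [0.0, 1.0, 2.0],
              [2.0, 6.0, 10.0]])  # rows v1, v2, v3 from the basis-matrix example
print(np.linalg.matrix_rank(V))   # 2 -> the three rows are linearly dependent

# Null space of A from the example above (a line spanned by (0, -1, 1) up to scaling)
A = np.array([[1.0, 1.0, 1.0],
              [0.0, 1.0, 1.0]])
print(null_space_basis(A))        # one column, proportional to (0, -1, 1)
```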