
EECS 16A Designing Information Devices and Systems I Spring 2016 Official Lecture Notes Note 21

Introduction

In this lecture note, we will introduce the last topics of this semester: change of basis and diagonalization. We will introduce the mathematical foundations for these two topics. Although we will not go through their applications this semester, these are important concepts that we will use to analyze signals and linear time-invariant systems in EE16B.

By the end of these notes, you should be able to:

1. Change a vector from one basis to another.
2. Take a representation for a linear transformation in one basis and express that linear transformation in another basis.
3. Understand the importance of a diagonalizing basis and its properties.
4. Identify whether a matrix is diagonalizable and, if so, diagonalize it.

Change of Basis for Vectors

Previously, we have seen that matrices can be interpreted as linear transformations between vector spaces. In particular, an $m \times n$ matrix $A$ can be viewed as a linear transformation $A : U \to V$ mapping a vector $\vec{u}$ in $U = \mathbb{R}^n$ to the vector $A\vec{u}$ in $V = \mathbb{R}^m$. In this note, we explore a different interpretation of square, invertible matrices: as a change of basis.

Let's first start with an example. Consider the vector
$$\vec{u} = \begin{bmatrix} u_1 \\ u_2 \end{bmatrix} = \begin{bmatrix} 4 \\ 3 \end{bmatrix}.$$
When we write a vector in this form, implicitly we are representing it in the standard basis for $\mathbb{R}^2$,
$$\vec{e}_1 = \begin{bmatrix} 1 \\ 0 \end{bmatrix} \quad \text{and} \quad \vec{e}_2 = \begin{bmatrix} 0 \\ 1 \end{bmatrix}.$$
This means that we can write $\vec{u} = 4\vec{e}_1 + 3\vec{e}_2$. Geometrically, $\vec{e}_1$ and $\vec{e}_2$ define a grid in $\mathbb{R}^2$, and $\vec{u}$ is represented by its coordinates in that grid, as shown in the figure below:

What if we want to represent $\vec{u}$ as a linear combination of another set of basis vectors, say
$$\vec{a}_1 = \begin{bmatrix} 1 \\ 1 \end{bmatrix} \quad \text{and} \quad \vec{a}_2 = \begin{bmatrix} 0 \\ 1 \end{bmatrix}?$$
This means that we need to find scalars $u_{a_1}$ and $u_{a_2}$ such that $\vec{u} = u_{a_1}\vec{a}_1 + u_{a_2}\vec{a}_2$. We can write this equation in matrix form:
$$\begin{bmatrix} | & | \\ \vec{a}_1 & \vec{a}_2 \\ | & | \end{bmatrix} \begin{bmatrix} u_{a_1} \\ u_{a_2} \end{bmatrix} = \vec{u}$$
$$\begin{bmatrix} 1 & 0 \\ 1 & 1 \end{bmatrix} \begin{bmatrix} u_{a_1} \\ u_{a_2} \end{bmatrix} = \begin{bmatrix} 4 \\ 3 \end{bmatrix}$$
Thus we can find $u_{a_1}$ and $u_{a_2}$ by solving a system of linear equations. Since the inverse of $\begin{bmatrix} 1 & 0 \\ 1 & 1 \end{bmatrix}$ is $\begin{bmatrix} 1 & 0 \\ -1 & 1 \end{bmatrix}$, we get $u_{a_1} = 4$ and $u_{a_2} = -1$. Then we can write $\vec{u}$ as $4\vec{a}_1 - \vec{a}_2$. Geometrically, $\vec{a}_1$ and $\vec{a}_2$ define a skewed grid from which the new coordinates are computed.

Notice that the same vector $\vec{u}$ can be represented in multiple ways. In the standard basis $\vec{e}_1, \vec{e}_2$, the coordinates for $\vec{u}$ are $4, 3$. In the skewed basis $\vec{a}_1, \vec{a}_2$, the coordinates for $\vec{u}$ are $4, -1$. It is the same vector geometrically, but with different coordinates.

In general, suppose we are given a vector $\vec{u} \in \mathbb{R}^n$ in the standard basis and want to change to a different basis with linearly independent basis vectors $\vec{a}_1,\cdots,\vec{a}_n$. If we denote the vector in the new basis as $\vec{u}_a = \begin{bmatrix} u_{a_1} \\ \vdots \\ u_{a_n} \end{bmatrix}$, we solve the equation $A\vec{u}_a = \vec{u}$, where $A$ is the matrix $\begin{bmatrix} \vec{a}_1 & \cdots & \vec{a}_n \end{bmatrix}$. Therefore the change of basis is given by:

$$\vec{u}_a = A^{-1}\vec{u}$$

Now that we know how to change a vector from the standard basis to any basis, how do we change a vector $\vec{u}_a$ in the basis $\vec{a}_1,\cdots,\vec{a}_n$ back to a vector $\vec{u}$ in the standard basis? We simply reverse the change of basis transformation: $\vec{u} = A\vec{u}_a$.
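To make this concrete, here is a minimal NumPy sketch of the $2 \times 2$ example worked out above; the variable names and the use of numpy.linalg.solve are our own choices, not part of the notes.

import numpy as np

# A minimal sketch of the change of basis worked out above.
# The columns of A are the basis vectors a1 = [1, 1] and a2 = [0, 1].
A = np.array([[1.0, 0.0],
              [1.0, 1.0]])
u = np.array([4.0, 3.0])        # coordinates of u in the standard basis

u_a = np.linalg.solve(A, u)     # coordinates in the {a1, a2} basis: [ 4., -1.]
u_back = A @ u_a                # back to the standard basis: [4., 3.]
print(u_a, u_back)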

Pictorially, the relationship between any two bases and the standard basis is given by:

$$\text{Basis } \vec{a}_1,\cdots,\vec{a}_n \;\underset{A^{-1}}{\overset{A}{\rightleftarrows}}\; \text{Standard Basis} \;\underset{B}{\overset{B^{-1}}{\rightleftarrows}}\; \text{Basis } \vec{b}_1,\cdots,\vec{b}_n$$

For example, given a vector $\vec{u} = u_{a_1}\vec{a}_1 + \cdots + u_{a_n}\vec{a}_n$ represented as a linear combination of the basis vectors $\vec{a}_1,\cdots,\vec{a}_n$, we can represent it as a different linear combination of the basis vectors $\vec{b}_1,\cdots,\vec{b}_n$, namely $\vec{u} = u_{b_1}\vec{b}_1 + \cdots + u_{b_n}\vec{b}_n$, by writing:
$$B\vec{u}_b = u_{b_1}\vec{b}_1 + \cdots + u_{b_n}\vec{b}_n = \vec{u} = u_{a_1}\vec{a}_1 + \cdots + u_{a_n}\vec{a}_n = A\vec{u}_a$$
$$\vec{u}_b = B^{-1}A\vec{u}_a$$

Thus the change of basis transformation from basis $\vec{a}_1,\cdots,\vec{a}_n$ to basis $\vec{b}_1,\cdots,\vec{b}_n$ is given by $B^{-1}A$.
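As a small sketch of going between two non-standard bases, the snippet below reuses the basis $\vec{a}_1, \vec{a}_2$ from the earlier example; the second basis $\vec{b}_1 = [1, 0]^T$, $\vec{b}_2 = [1, 1]^T$ is a hypothetical choice made just for this illustration.

import numpy as np

A = np.array([[1.0, 0.0],       # columns are a1, a2 from the example above
              [1.0, 1.0]])
B = np.array([[1.0, 1.0],       # columns are the hypothetical b1 = [1, 0], b2 = [1, 1]
              [0.0, 1.0]])
u_a = np.array([4.0, -1.0])     # coordinates of u in the a-basis

# u_b = B^{-1} A u_a: go back to the standard basis, then re-express in the b-basis.
u_b = np.linalg.solve(B, A @ u_a)
print(u_b, B @ u_b)             # B @ u_b recovers the original u = [4, 3]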

Change of Basis for Linear Transformations

Now that we know how to change the basis of vectors, let's shift our attention to linear transformations. We will answer these questions in this section: how do we change the basis of a linear transformation, and what does this mean? First let's review linear transformations. Suppose we have a linear transformation $T$ represented by an $n \times n$ matrix that transforms $\vec{u} \in \mathbb{R}^n$ to $\vec{v} \in \mathbb{R}^n$:
$$\vec{v} = T\vec{u}$$

The important thing to notice is that $T$ maps vectors to vectors in a linear manner. It also happens to be represented as a matrix. Implicit in this representation is the choice of a coordinate system. Unless stated otherwise, we always assume that the coordinate system is the one defined by the standard basis vectors $\vec{e}_1, \vec{e}_2, \ldots, \vec{e}_n$.

What is this transformation $T$ in another basis? Suppose we have basis vectors $\vec{a}_1,\cdots,\vec{a}_n \in \mathbb{R}^n$. If $\vec{u}$ is represented by $\vec{u}_a$ with $\vec{u} = u_{a_1}\vec{a}_1 + \cdots + u_{a_n}\vec{a}_n$, and $\vec{v}$ is represented by $\vec{v}_a$ with $\vec{v} = v_{a_1}\vec{a}_1 + \cdots + v_{a_n}\vec{a}_n$ in this basis, what is the matrix $T_a$ representing the transformation in this basis, such that $T_a\vec{u}_a = \vec{v}_a$? If we define the matrix $A = \begin{bmatrix} \vec{a}_1 & \cdots & \vec{a}_n \end{bmatrix}$, we can write our vectors $\vec{u}$ and $\vec{v}$ as linear combinations of the basis vectors: $\vec{u} = A\vec{u}_a$ and $\vec{v} = A\vec{v}_a$. This is exactly the change of basis between the standard basis and the $\vec{a}_1,\cdots,\vec{a}_n$ basis discussed above.

$$T\vec{u} = \vec{v}$$

$$TA\vec{u}_a = A\vec{v}_a$$
$$A^{-1}TA\vec{u}_a = \vec{v}_a$$

Since we want $T_a\vec{u}_a = \vec{v}_a$, we have $T_a = A^{-1}TA$. In fact, the correspondences stated above are all represented in the following diagram.

$$\begin{array}{ccc}
\vec{u} & \overset{T}{\longrightarrow} & \vec{v} \\
A^{-1}\downarrow\uparrow A & & A^{-1}\downarrow\uparrow A \\
\vec{u}_a & \underset{T_a}{\longrightarrow} & \vec{v}_a
\end{array}$$

For example, there are two ways to get from $\vec{u}_a$ to $\vec{v}_a$. The first is the transformation $T_a$. Second, we can take the longer path, applying the transformations $A$, $T$, and $A^{-1}$ in order. This is represented in matrix form as $T_a = A^{-1}TA$ (note that the order is reversed since this matrix acts on a vector $\vec{u}_a$ on the right).

By the same logic, we can go in the reverse direction: $T = AT_aA^{-1}$. Notice that this makes sense. $A^{-1}$ takes a vector written in the standard basis and returns its coordinates in the $A$ basis. $T_a$ then acts on it and returns new coordinates in the $A$ basis. Multiplying by $A$ returns a weighted sum of the columns of $A$, in other words giving us the vector back in the original standard basis.
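As a quick sketch of these relationships, the check below verifies that $T_a = A^{-1}TA$ acts on $a$-basis coordinates the same way $T$ acts on standard coordinates. The particular matrix $T$ used here is our own example, not from the notes; $A$ is the basis from earlier.

import numpy as np

T = np.array([[2.0, 1.0],       # an example transformation in the standard basis
              [0.0, 3.0]])
A = np.array([[1.0, 0.0],       # columns are the basis vectors a1, a2
              [1.0, 1.0]])

T_a = np.linalg.inv(A) @ T @ A  # the same transformation, written in the a-basis

u = np.array([4.0, 3.0])
u_a = np.linalg.solve(A, u)     # u in the a-basis
# Transforming in either basis describes the same vector:
assert np.allclose(A @ (T_a @ u_a), T @ u)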

A Diagonalizing Basis

Now that we know what a linear transformation looks like under a different basis, is there a special basis under which the transformation attains a nice form? Let's suppose that the basis vectors $\vec{a}_1,\cdots,\vec{a}_n$ are linearly independent eigenvectors of $T$, with eigenvalues $\lambda_1,\cdots,\lambda_n$. What does $T\vec{u}$ look like now? Recall that $\vec{u}$ can be written in the new basis: $\vec{u} = u_{a_1}\vec{a}_1 + \cdots + u_{a_n}\vec{a}_n$.

$$\begin{aligned}
T\vec{u} &= T(u_{a_1}\vec{a}_1 + \cdots + u_{a_n}\vec{a}_n) \\
&= u_{a_1}T\vec{a}_1 + \cdots + u_{a_n}T\vec{a}_n \\
&= u_{a_1}\lambda_1\vec{a}_1 + \cdots + u_{a_n}\lambda_n\vec{a}_n \\
&= \begin{bmatrix} | & \cdots & | \\ \vec{a}_1 & \cdots & \vec{a}_n \\ | & \cdots & | \end{bmatrix} \begin{bmatrix} \lambda_1 & & \\ & \ddots & \\ & & \lambda_n \end{bmatrix} \begin{bmatrix} u_{a_1} \\ \vdots \\ u_{a_n} \end{bmatrix} \\
&= AD\vec{u}_a = ADA^{-1}\vec{u}
\end{aligned}$$

where $D$ is the diagonal matrix of eigenvalues and $A$ is the matrix with the corresponding eigenvectors as its columns. Thus we have shown that in an eigenvector basis, $T = ADA^{-1}$. In particular, $T_a$, the counterpart of $T$ in the eigenvector basis, is a diagonal matrix.

This gives us many useful properties. Recall that matrix multiplication does not commute in general, but multiplication of diagonal matrices does. Can we construct non-diagonal matrices that commute under multiplication? Suppose $T_1$ and $T_2$ can both be diagonalized in the same basis, that is, they have the same eigenvectors but possibly different eigenvalues. Then $T_1 = AD_1A^{-1}$ and $T_2 = AD_2A^{-1}$. We show that $T_1$ and $T_2$ commute:

$$\begin{aligned}
T_1T_2 &= (AD_1A^{-1})(AD_2A^{-1}) \\
&= AD_1D_2A^{-1} \\
&= AD_2D_1A^{-1} \\
&= (AD_2)(A^{-1}A)(D_1A^{-1}) \\
&= (AD_2A^{-1})(AD_1A^{-1}) \\
&= T_2T_1
\end{aligned}$$
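A quick numerical sanity check of this commuting property, with eigenvector and eigenvalue matrices chosen arbitrarily for illustration:

import numpy as np

A = np.array([[1.0, 1.0],       # shared eigenvectors (any invertible matrix works)
              [0.0, 1.0]])
D1 = np.diag([2.0, 3.0])        # eigenvalues of T1
D2 = np.diag([5.0, 7.0])        # eigenvalues of T2

T1 = A @ D1 @ np.linalg.inv(A)
T2 = A @ D2 @ np.linalg.inv(A)
assert np.allclose(T1 @ T2, T2 @ T1)   # T1 and T2 commute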

One of the other great properties has to do with raising a matrix to a power. If we can diagonalize a matrix, then this is really easy. Can you see for yourself why the argument above tells us that if $T = ADA^{-1}$, then $T^k = AD^kA^{-1}$? (Hint: all the terms in the middle cancel.) But raising a diagonal matrix to a power is just raising its individual elements to that same power.
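Here is a sketch of computing a matrix power this way, using the matrix that appears in Example 21.3 below; the exponent $k = 5$ is an arbitrary choice of ours.

import numpy as np

T = np.array([[3.0, 2.0, 4.0],
              [2.0, 0.0, 2.0],
              [4.0, 2.0, 3.0]])
eigvals, A = np.linalg.eig(T)          # columns of A are eigenvectors of T
k = 5
Tk = A @ np.diag(eigvals ** k) @ np.linalg.inv(A)   # T^k = A D^k A^{-1}
assert np.allclose(Tk, np.linalg.matrix_power(T, k))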

Diagonalization

We saw in the previous section the usefulness of representing a matrix (i.e., a linear transformation) in a basis in which it is diagonal. So under what circumstances is a matrix diagonalizable? Recall from before that an $n \times n$ matrix $T$ is diagonalizable if it has $n$ linearly independent eigenvectors. If it has $n$ linearly independent eigenvectors $\vec{a}_1,\cdots,\vec{a}_n$ with eigenvalues $\lambda_1,\cdots,\lambda_n$, then we can write:

$$T = ADA^{-1}$$
where $A = \begin{bmatrix} \vec{a}_1 & \cdots & \vec{a}_n \end{bmatrix}$ and $D$ is a diagonal matrix of eigenvalues.

Example 21.1 (A 2 × 2 matrix that is diagonalizable): Let $T = \begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix}$. To find the eigenvalues, we solve the equation:

$$\det(T - \lambda I) = 0$$
$$(1 - \lambda)(-1 - \lambda) = 0$$
$$\lambda = 1, -1$$
The eigenvector corresponding to $\lambda = 1$ is $\vec{a}_1 = \begin{bmatrix} 1 \\ 0 \end{bmatrix}$, and the eigenvector corresponding to $\lambda = -1$ is $\vec{a}_2 = \begin{bmatrix} 0 \\ 1 \end{bmatrix}$. Now we have the diagonalization $T = ADA^{-1}$ where $A = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$, $D = \begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix}$, and $A^{-1} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$.
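If you want to verify this example numerically, a short check with numpy.linalg.eig does the job (keep in mind that the eigenvectors are returned only up to scaling and ordering):

import numpy as np

T = np.array([[1.0, 0.0],
              [0.0, -1.0]])
eigvals, eigvecs = np.linalg.eig(T)
print(eigvals)     # [ 1. -1.]
print(eigvecs)     # columns are a1 = [1, 0] and a2 = [0, 1]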

" # 1 0 Example 21.2 (A 2 × 2 matrix that is not diagonalizable): Let T = . To find the eigenvalues, −1 1 we solve the equation:

$$\det(T - \lambda I) = 0$$
$$(1 - \lambda)^2 = 0$$
$$\lambda = 1$$
The eigenvector corresponding to $\lambda = 1$ is $\vec{a} = \begin{bmatrix} 0 \\ 1 \end{bmatrix}$. Since this matrix only has one linearly independent eigenvector, it is not diagonalizable.

Example 21.3 (Diagonalization of a 3 × 3 matrix):
$$T = \begin{bmatrix} 3 & 2 & 4 \\ 2 & 0 & 2 \\ 4 & 2 & 3 \end{bmatrix}$$

To find the eigenvalues, we solve the equation:

$$\det(T - \lambda I) = 0$$
$$\lambda^3 - 6\lambda^2 - 15\lambda - 8 = 0$$
$$(\lambda - 8)(\lambda + 1)^2 = 0$$
$$\lambda = -1, 8$$

If $\lambda = -1$, we need to find $\vec{a}$ such that:
$$\begin{bmatrix} 4 & 2 & 4 \\ 2 & 1 & 2 \\ 4 & 2 & 4 \end{bmatrix}\vec{a} = 0 \implies \begin{bmatrix} 2 & 1 & 2 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}\vec{a} = 0$$

Thus the dimension of the nullspace of $T - (-1)I$ is 2, and we can find two linearly independent vectors in this nullspace:
$$\vec{a}_1 = \begin{bmatrix} 1 \\ 0 \\ -1 \end{bmatrix} \quad \vec{a}_2 = \begin{bmatrix} 1 \\ -2 \\ 0 \end{bmatrix}$$

If $\lambda = 8$, we need to find $\vec{a}$ such that:
$$\begin{bmatrix} -5 & 2 & 4 \\ 2 & -8 & 2 \\ 4 & 2 & -5 \end{bmatrix}\vec{a} = 0 \implies \begin{bmatrix} 1 & 0 & -1 \\ 0 & 2 & -1 \\ 0 & 0 & 0 \end{bmatrix}\vec{a} = 0$$

Thus the dimension of the nullspace of $T - 8I$ is 1, and we find the vector in the nullspace:
$$\vec{a}_3 = \begin{bmatrix} 2 \\ 1 \\ 2 \end{bmatrix}$$

Now we define:
$$A = \begin{bmatrix} 1 & 1 & 2 \\ 0 & -2 & 1 \\ -1 & 0 & 2 \end{bmatrix} \quad D = \begin{bmatrix} -1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 8 \end{bmatrix} \quad A^{-1} = \frac{1}{9}\begin{bmatrix} 4 & 2 & -5 \\ 1 & -4 & 1 \\ 2 & 1 & 2 \end{bmatrix}$$

Then $T$ can be diagonalized as $T = ADA^{-1}$.
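A quick numerical check of this diagonalization (the comparison uses NumPy's default tolerances in allclose):

import numpy as np

T = np.array([[3.0, 2.0, 4.0],
              [2.0, 0.0, 2.0],
              [4.0, 2.0, 3.0]])
A = np.array([[ 1.0,  1.0, 2.0],
              [ 0.0, -2.0, 1.0],
              [-1.0,  0.0, 2.0]])
D = np.diag([-1.0, -1.0, 8.0])
assert np.allclose(A @ D @ np.linalg.inv(A), T)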

(Extra) Application - the DFT Basis

We would like to offer a sneak peek into why diagonalization can be useful in signals and linear time-invariant (LTI) systems. You will not be responsible for the material introduced in this section; we just want to give you some motivation behind the topic.

Any finite-time linear time-invariant system can be described by a matrix of the form
$$S = \begin{bmatrix} s_0 & s_{n-1} & s_{n-2} & \cdots & s_1 \\ s_1 & s_0 & s_{n-1} & \cdots & s_2 \\ s_2 & s_1 & s_0 & \cdots & s_3 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ s_{n-1} & s_{n-2} & s_{n-3} & \cdots & s_0 \end{bmatrix} \tag{1}$$

These kinds of matrices are called circulant matrices. Observe that the second column is the first column cyclically shifted by one time step, the third column is the first column shifted by two time steps, ..., and the $n$th column is the first column shifted by $n-1$ time steps. As we can see, we need exactly $n$ parameters $\vec{s} = \begin{bmatrix} s_0 & s_1 & \ldots & s_{n-1} \end{bmatrix}^T$ to uniquely define the system.
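As a sketch, a circulant matrix can be built from $\vec{s}$ by cyclically shifting the first column; the helper function below is our own illustration (SciPy also ships scipy.linalg.circulant, which builds the same matrix from its first column).

import numpy as np

def circulant(s):
    # Column j is the vector s cyclically shifted down by j time steps.
    return np.column_stack([np.roll(s, j) for j in range(len(s))])

s = np.array([1.0, 2.0, 3.0])   # s0, s1, s2
S = circulant(s)
print(S)
# [[1. 3. 2.]
#  [2. 1. 3.]
#  [3. 2. 1.]]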

For example, if the input signal to the system is $\vec{x} = \begin{bmatrix} 1 & 0 & \ldots & 0 \end{bmatrix}^T$, then the output of the system would be
$$S\vec{x} = \begin{bmatrix} s_0 & s_{n-1} & s_{n-2} & \cdots & s_1 \\ s_1 & s_0 & s_{n-1} & \cdots & s_2 \\ s_2 & s_1 & s_0 & \cdots & s_3 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ s_{n-1} & s_{n-2} & s_{n-3} & \cdots & s_0 \end{bmatrix} \begin{bmatrix} 1 \\ 0 \\ \vdots \\ 0 \end{bmatrix} = \begin{bmatrix} s_0 \\ s_1 \\ \vdots \\ s_{n-1} \end{bmatrix}. \tag{2}$$

For a general signal $\vec{x} = \begin{bmatrix} x_0 & x_1 & \ldots & x_{n-1} \end{bmatrix}^T$, the output signal would be
$$S\vec{x} = \begin{bmatrix} s_0 & s_{n-1} & s_{n-2} & \cdots & s_1 \\ s_1 & s_0 & s_{n-1} & \cdots & s_2 \\ s_2 & s_1 & s_0 & \cdots & s_3 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ s_{n-1} & s_{n-2} & s_{n-3} & \cdots & s_0 \end{bmatrix} \begin{bmatrix} x_0 \\ x_1 \\ \vdots \\ x_{n-1} \end{bmatrix}. \tag{3}$$

However, doing such a matrix multiplication is tedious. Is there a way to transform $S$ and $\vec{x}$ to a different basis so that in this basis $S$ becomes a diagonal matrix $S'$ and $\vec{x}$ becomes $\vec{x}'$? In that case, computing the matrix multiplication is simple:

 0    0  s0 x0 s0x0 0 0  s  x1   s x1  0~0  1    1  S x =  .  .  =  . . (4)  ..  .   .   0    0  sn−1 xn−1 sn−1xn−1

In other words, we would like to be able to diagonalize $S$, i.e., find $U$ such that $S = US'U^{-1}$, where the columns of $U$ are the desired basis vectors.

To find $U$, we need to find the eigenvectors and eigenvalues of $S$. Recall that we can usually start by finding the eigenvalues by solving the equation $\det(S - \lambda I) = 0$; however, in this case, this does not give us much information. Instead, let's try to identify some patterns by solving the $n = 3$ case:

$$S = \begin{bmatrix} s_0 & s_2 & s_1 \\ s_1 & s_0 & s_2 \\ s_2 & s_1 & s_0 \end{bmatrix} \tag{5}$$

Let $\vec{v} = [w_0, w_1, w_2]^T$. Then
$$S\vec{v} = \begin{bmatrix} s_0w_0 + s_1w_2 + s_2w_1 \\ s_0w_1 + s_1w_0 + s_2w_2 \\ s_0w_2 + s_1w_1 + s_2w_0 \end{bmatrix} \tag{6}$$

For $\vec{v}$ to be an eigenvector, $S\vec{v} = \lambda\vec{v}$. What properties should $w_0$, $w_1$, and $w_2$ have so that $\vec{v}$ is an eigenvector? We see an intriguing pattern in $S\vec{v}$. If $w_i$ has the following two properties:

1. $w_iw_j = w_{i+j \pmod{n}}$

2. $w_0 = 1$

Then we can factor out the eigenvalue (note that $w_0w_0 = w_2w_1 = w_1w_2 = w_0 = 1$ by the two properties, so multiplying each entry by these products changes nothing):
$$S\vec{v} = \begin{bmatrix} (s_0w_0 + s_1w_2 + s_2w_1)w_0w_0 \\ (s_0w_1 + s_1w_0 + s_2w_2)w_2w_1 \\ (s_0w_2 + s_1w_1 + s_2w_0)w_1w_2 \end{bmatrix} = \begin{bmatrix} (s_0w_0 + s_1w_2 + s_2w_1)w_0 \\ (s_0w_0 + s_1w_2 + s_2w_1)w_1 \\ (s_0w_0 + s_1w_2 + s_2w_1)w_2 \end{bmatrix} = (s_0w_0 + s_1w_2 + s_2w_1)\begin{bmatrix} w_0 \\ w_1 \\ w_2 \end{bmatrix} \tag{7}$$

Therefore $\vec{v}$ is an eigenvector with eigenvalue $s_0w_0 + s_1w_2 + s_2w_1$.

This guessing and checking method we used might seem quite magical, but it is often a great way to find eigenvectors of matrices with a lot of symmetry.

To really find the eigenvectors, we need to find a set of numbers $w_0, w_1, w_2$ with the above-mentioned properties. Real numbers wouldn't do the trick, as they don't have the "wrap-around" property needed. For all $m = 0, 1, \ldots, n-1$, complex exponentials of the form $w_k = e^{i\frac{2\pi m}{n}k}$ have all the required properties.

Why? Because $e^{i\frac{2\pi}{n}k}$ steps along the $n$-th roots of unity as we let $k$ vary from $0, 1, \ldots, n-1$. This is a simple consequence of Euler's identity $e^{i\theta} = \cos\theta + i\sin\theta$ geometrically, or the fact that $(e^{i\frac{2\pi}{n}})^n = e^{i2\pi} = 1$.

How could we have come up with this guess? The 3×3 case is not impossible to solve directly. We could've just written out the matrix for $s_0 = s_2 = 0$ and $s_1 = 1$. (This is the shift-by-one matrix that simply cyclically delays the signal it acts on by one time step.) At this point, you could explicitly evaluate $\det(S - \lambda I)$ to find the characteristic polynomial and see that we need to find the roots of $\lambda^3 - 1$. This implies that the eigenvalues are the 3rd roots of unity. From there, you can find the eigenvectors by direct computation and you would find:

The 3 eigenvectors are:
$$\vec{v}_0 = \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}, \quad \vec{v}_1 = \begin{bmatrix} 1 \\ e^{\frac{2\pi i}{3}} \\ e^{\frac{4\pi i}{3}} \end{bmatrix}, \quad \vec{v}_2 = \begin{bmatrix} 1 \\ e^{\frac{4\pi i}{3}} \\ e^{\frac{2\pi i}{3}} \end{bmatrix} \tag{8}$$

You would then notice that the same eigenvectors work for the case $s_0 = s_1 = 0$ and $s_2 = 1$. This is the delay-by-2 matrix which, by inspection, can be seen to be $D^2$ if $D$ is the delay-by-one matrix. At this point, you know that since every 3×3 circulant matrix is a linear combination of the identity (no delay) and these two pure delay matrices, and all three of these matrices share the same eigenvectors, these must in fact be the eigenvectors for all 3×3 circulant matrices.

Generalizing, if we have $n$-dimensional signals, all LTI systems that act on such signals have the same $n$ eigenvectors:

$$\vec{v}_l = [\omega^{l\cdot 0}, \omega^{l\cdot 1}, \ldots, \omega^{l\cdot(n-1)}]^T \tag{9}$$

where $\omega = e^{\frac{2\pi i}{n}}$. To normalize, we take $\vec{u}_i = \frac{1}{\sqrt{n}}\vec{v}_i$. As we will see, the extra $\frac{1}{\sqrt{n}}$ term makes the norm of each basis vector be 1.

Therefore we can write $U$ as:
$$U = \begin{bmatrix} \vec{u}_0 & \vec{u}_1 & \cdots & \vec{u}_{n-1} \end{bmatrix} = \frac{1}{\sqrt{n}}\begin{bmatrix} 1 & 1 & \cdots & 1 & 1 \\ 1 & \omega & \cdots & \omega^{n-2} & \omega^{n-1} \\ \vdots & \vdots & \ddots & \vdots & \vdots \\ 1 & \omega^{n-2} & \cdots & \omega^{(n-2)(n-2)} & \omega^{(n-1)(n-2)} \\ 1 & \omega^{n-1} & \cdots & \omega^{(n-2)(n-1)} & \omega^{(n-1)(n-1)} \end{bmatrix} \tag{10}$$

The basis in which $S$ is diagonalized is called the Fourier basis or DFT basis (also called the frequency domain), and $U$ as a linear transformation is called the inverse discrete Fourier transform, which brings signals from the frequency domain into the time domain. The conjugate of $U$, namely $U^*$, maps signals from the time domain to the frequency domain, and as a transformation it is called the discrete Fourier transform or DFT.
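To close the loop, here is a small numerical sketch showing that the matrix $U$ from Equation (10) diagonalizes a circulant matrix, for $n = 3$ and an arbitrary $\vec{s}$ of our choosing.

import numpy as np

n = 3
s = np.array([1.0, 2.0, 3.0])
S = np.column_stack([np.roll(s, j) for j in range(n)])   # a 3x3 circulant matrix

omega = np.exp(2j * np.pi / n)
# Column l of U is the normalized eigenvector v_l from Equation (9).
U = np.array([[omega ** (k * l) for l in range(n)] for k in range(n)]) / np.sqrt(n)

S_prime = np.conj(U).T @ S @ U   # U is unitary, so U^{-1} = U^*
print(np.round(S_prime, 10))     # diagonal; the entries are the eigenvalues of S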
