
Maths for Signals and Systems in Engineering

Lectures 8-9, Friday 30th October 2015
DR TANIA STATHAKI
READER (ASSOCIATE PROFESSOR) IN SIGNAL PROCESSING
IMPERIAL COLLEGE LONDON

Semi-orthogonal matrices with more rows than columns

• The column vectors $q_1, \dots, q_n$ are orthogonal if $q_i^T q_j = 0$ for $i \neq j$.
• In order for a set of $n$ vectors to satisfy the above, their dimension $m$ must be at least $n$, i.e., $m \geq n$. This is because the maximum number of mutually orthogonal $m$-dimensional vectors is $m$.
• If their lengths are all 1, then the vectors are called orthonormal:
$$q_i^T q_j = \begin{cases} 0 & \text{when } i \neq j \quad (\textbf{orthogonal} \text{ vectors}) \\ 1 & \text{when } i = j \quad (\textbf{unit} \text{ vectors: } \|q_i\| = 1) \end{cases}$$
• I assign to a matrix with $n$ orthonormal $m$-dimensional columns the special letter $Q_{m \times n}$. From now on I will drop the subscript, because no one uses it.
• I wish to deal first with the case where $Q$ is strictly non-square (it is rectangular), and therefore $m > n$.
• The matrix $Q$ is called semi-orthogonal.

Problem: Consider a semi-orthogonal matrix $Q$ with real entries, where the number of rows $m$ exceeds the number of columns $n$ and the columns are orthonormal vectors. Prove that $Q^T Q = I_{n \times n}$.

Solution:

$$Q^T Q = \begin{bmatrix} q_1^T \\ q_2^T \\ \vdots \\ q_n^T \end{bmatrix} \begin{bmatrix} q_1 & q_2 & \dots & q_n \end{bmatrix} = I_{n \times n}.$$

. We see that $Q^T$ is only an inverse from the left.
. This is because there isn't a matrix $Q'$ for which $QQ' = I_{m \times m}$. This would imply that we could find $m$ independent vectors of dimension $n$, with $m > n$. This is not possible.
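A minimal numerical sketch of this asymmetry (my own hand-picked $Q$, using NumPy; not from the lecture itself): $Q^T Q$ gives the small identity, while $QQ^T$ cannot equal the large one.

```python
import numpy as np

# A 3x2 matrix with orthonormal columns (m = 3 > n = 2): semi-orthogonal.
Q = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [0.0, 0.0]])

left = Q.T @ Q    # equals the 2x2 identity: Q^T acts as a left inverse
right = Q @ Q.T   # a 3x3 matrix of rank 2, so it cannot be I_3

print(np.allclose(left, np.eye(2)))    # True
print(np.allclose(right, np.eye(3)))   # False
```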

Semi-orthogonal matrices: Generalization

• In linear algebra, a semi-orthogonal matrix is a non-square matrix with real entries where: if the number of rows exceeds the number of columns, then the columns are orthonormal vectors; but if the number of columns exceeds the number of rows, then the rows are orthonormal vectors.

• Equivalently, a rectangular matrix of dimension $m \times n$ is semi-orthogonal if
$$Q^T Q = I_{n \times n} \ (m > n) \qquad \text{or} \qquad QQ^T = I_{m \times m} \ (n > m)$$

• The above formulas yield the terms left-invertible and right-invertible.

• In the above cases, the left or right inverse is the transpose of the matrix. For that reason, a rectangular orthogonal matrix is also called semi-unitary. (To remind you: a unitary matrix is one whose inverse is its conjugate transpose; for real entries, its transpose.)


Problem 1: Show that for left-invertible, semi-orthogonal matrices of dimension $m \times n$, $m > n$:
$$\|Qx\| = \|x\| \text{ for every } n\text{-dimensional vector } x.$$

Solution: $\|Qx\|^2 = (Qx)^T (Qx) = x^T Q^T Q x = x^T I x = x^T x \Rightarrow \|Qx\|^2 = \|x\|^2 \Rightarrow \|Qx\| = \|x\|.$

Problem 2: Show that for right-invertible, semi-orthogonal matrices of dimension $m \times n$, $m < n$:
$$\|Q^T x\| = \|x\| \text{ for every } m\text{-dimensional vector } x.$$

Solution: $\|Q^T x\|^2 = (Q^T x)^T (Q^T x) = x^T Q Q^T x = x^T I x = x^T x \Rightarrow \|Q^T x\|^2 = \|x\|^2 \Rightarrow \|Q^T x\| = \|x\|.$
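Both identities can be sanity-checked numerically; the sketch below (my own example, using NumPy's reduced QR to manufacture a semi-orthogonal matrix) verifies the left-invertible case.

```python
import numpy as np

rng = np.random.default_rng(0)
# Reduced QR of a random 4x2 matrix yields a 4x2 Q with orthonormal columns.
Q, _ = np.linalg.qr(rng.standard_normal((4, 2)))

x = rng.standard_normal(2)
# Multiplication by Q preserves the norm because Q^T Q = I.
print(np.isclose(np.linalg.norm(Q @ x), np.linalg.norm(x)))  # True
```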

Orthogonal matrices

Problem 1: Extend the relationship $Q^T Q = I_{n \times n}$ to the case where $Q$ is a square matrix of dimension $n \times n$ with orthonormal columns.

Solution: $Q^T Q = I_{n \times n} \Rightarrow Q^{-1} = Q^T$. The inverse is the transpose.

Problem 2: Prove that $QQ^T = I_{n \times n}$.

Solution: Since $Q$ is a full-rank square matrix, we can find a $Q'$ such that $QQ' = I_{n \times n}$. This gives:
$$Q^T Q Q' = Q^T I_{n \times n} \Rightarrow I_{n \times n} Q' = Q^T \Rightarrow Q' = Q^T$$
Therefore, we see that $Q^T$ is the two-sided inverse of $Q$.

Examples of elementary orthogonal matrices

• Rotation matrices:
$$Q = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix} \quad \text{and} \quad Q^T = Q^{-1} = \begin{bmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{bmatrix}$$
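A quick numerical sketch of the rotation matrix (my own example, with $\theta = 90°$) shows both the orthogonality of $Q$ and its rotating action:

```python
import numpy as np

theta = np.pi / 2   # rotate by 90 degrees
Q = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

i = np.array([1.0, 0.0])
j = np.array([0.0, 1.0])

print(np.allclose(Q.T @ Q, np.eye(2)))  # True: Q^T = Q^{-1}
print(np.round(Q @ i))                  # i is rotated onto j
print(np.round(Q @ j))                  # j is rotated onto -i
```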

Problem
. Show that the columns of $Q$ are orthogonal (straightforward).
. Show that the columns of $Q$ are unit vectors (straightforward).
. Explain the effect that the rotation matrix has on the vectors $j = \begin{bmatrix} 0 \\ 1 \end{bmatrix}$ and $i = \begin{bmatrix} 1 \\ 0 \end{bmatrix}$ when it multiplies them from the left.
. The matrix causes rotation of the vectors by the angle $\theta$.

Examples of elementary orthogonal matrices: Permutation

• Permutation matrices:
$$Q = \begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 1 & 0 & 0 \end{bmatrix} \quad \text{and} \quad Q = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}. \quad Q^T = Q^{-1} \text{ in both cases.}$$
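A minimal sketch (my own test vector) of the re-ordering effect of the $3 \times 3$ permutation matrix above:

```python
import numpy as np

P = np.array([[0, 1, 0],
              [0, 0, 1],
              [1, 0, 0]])
v = np.array([10, 20, 30])

print(P @ v)                               # [20 30 10]: the entries are re-ordered
print(np.allclose(P.T, np.linalg.inv(P)))  # True: P^T = P^{-1}
```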

Problem
. Show that the columns of $Q$ are orthogonal (straightforward).
. Show that the columns of $Q$ are unit vectors (straightforward).
. Explain the effect that the permutation matrices have on a random vector $\begin{bmatrix} x \\ y \\ z \end{bmatrix}$ or $\begin{bmatrix} x \\ y \end{bmatrix}$ when they multiply the vector from the left.
. The matrices cause re-ordering of the elements of these vectors.

Examples of elementary orthogonal matrices: Householder

• Householder reflection matrices: $Q = I - 2uu^T$, with $u$ any vector that satisfies the condition $\|u\|^2 = 1$ (unit vector).
$$Q^T = (I - 2uu^T)^T = I^T - 2(uu^T)^T = I - 2uu^T = Q$$
$$Q^T Q = Q^2 = (I - 2uu^T)(I - 2uu^T) = I - 4uu^T + 4u(u^T u)u^T = I$$

Problem
. For $u_1 = \begin{bmatrix} 1 & 0 \end{bmatrix}^T$ and $u_2 = \begin{bmatrix} 1/\sqrt{2} & -1/\sqrt{2} \end{bmatrix}^T$, find $Q_i = I - 2u_i u_i^T$, $i = 1, 2$.
. Explain the effect that matrix $Q_1$ has on the vector $\begin{bmatrix} x \\ y \end{bmatrix}$ when it multiplies the vector from the left.
. Explain the effect that matrix $Q_2$ has on the vector $\begin{bmatrix} x \\ y \end{bmatrix}$ when it multiplies the vector from the left.
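A sketch of the two matrices in the problem, computed numerically (the interpretive comments are mine): $Q_1$ flips the sign of the first coordinate, while $Q_2$ swaps the two coordinates, i.e., reflects across the line $y = x$.

```python
import numpy as np

u1 = np.array([1.0, 0.0])
u2 = np.array([1.0, -1.0]) / np.sqrt(2)

Q1 = np.eye(2) - 2 * np.outer(u1, u1)   # [[-1, 0], [0, 1]]
Q2 = np.eye(2) - 2 * np.outer(u2, u2)   # [[ 0, 1], [1, 0]]

v = np.array([3.0, 4.0])
print(Q1 @ v)   # [-3.  4.]: reflection across the y-axis
print(Q2 @ v)   # [ 4.  3.]: reflection across the line y = x
```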

• A generalized definition is $Q = I - 2\dfrac{vv^T}{\|v\|^2}$, with $v$ any column vector.

The Gram-Schmidt process

• The goal here is to start with three independent vectors $a, b, c$ and construct three orthogonal vectors $A, B, C$, and finally three orthonormal vectors.

$$q_1 = A/\|A\|, \quad q_2 = B/\|B\|, \quad q_3 = C/\|C\|$$
• We begin by choosing $A = a$. This first direction is accepted.
• The next direction $B$ must be perpendicular to $A$. Start with $b$ and subtract its projection along $A$. This leaves the perpendicular part, which is the orthogonal vector $B$ (what we knew before as the error!), defined as:
$$B = b - \frac{AA^T}{A^T A} b$$

Problem: Show that $A$ and $B$ are orthogonal.
Problem: Show that if $a$ and $b$ are independent then $B$ is not zero.

• The third direction starts with $c$. This is not a combination of $A$ and $B$.
• Most likely $c$ is not perpendicular to $A$ and $B$.
• Therefore, subtract its components in those two directions to get $C$:
$$C = c - \frac{AA^T}{A^T A} c - \frac{BB^T}{B^T B} c$$
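The steps above can be sketched directly in code (my own example vectors; note that for column vectors $\frac{AA^T}{A^TA}b$ is the same as $\frac{A^Tb}{A^TA}A$):

```python
import numpy as np

def gram_schmidt(a, b, c):
    """Orthogonalize three independent vectors, then normalize."""
    A = a
    B = b - (A @ b) / (A @ A) * A                          # subtract projection onto A
    C = c - (A @ c) / (A @ A) * A - (B @ c) / (B @ B) * B  # subtract both projections
    return (A / np.linalg.norm(A),
            B / np.linalg.norm(B),
            C / np.linalg.norm(C))

a = np.array([1.0, 1.0, 0.0])
b = np.array([1.0, 0.0, 1.0])
c = np.array([0.0, 1.0, 1.0])
q1, q2, q3 = gram_schmidt(a, b, c)

Q = np.column_stack([q1, q2, q3])
print(np.allclose(Q.T @ Q, np.eye(3)))   # True: the columns are orthonormal
```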

The Gram-Schmidt process: Generalization

• In general we subtract from every new vector its projections in the directions already set.
• If we had a fourth vector $d$, we would subtract three projections onto $A$, $B$, $C$ to get $D$.
• We make the resulting vectors orthonormal by dividing each vector by its magnitude.

The factorization $A = QR$ (QR decomposition)

• Assume matrix 퐴 whose columns are 푎, 푏, 푐.

• Assume a matrix $Q$ whose columns are the vectors $q_1, q_2, q_3$ defined previously.
• We are looking for a matrix $R$ such that $A = QR$. Since $Q$ is an orthogonal matrix, we have $R = Q^T A$:
$$R = Q^T A = \begin{bmatrix} q_1^T \\ q_2^T \\ q_3^T \end{bmatrix} \begin{bmatrix} a & b & c \end{bmatrix} = \begin{bmatrix} q_1^T a & q_1^T b & q_1^T c \\ q_2^T a & q_2^T b & q_2^T c \\ q_3^T a & q_3^T b & q_3^T c \end{bmatrix}$$
• From the method that was used to construct the $q_i$ we know that $q_2^T a = 0$, $q_3^T a = 0$, $q_3^T b = 0$, and therefore
$$R = \begin{bmatrix} q_1^T a & q_1^T b & q_1^T c \\ 0 & q_2^T b & q_2^T c \\ 0 & 0 & q_3^T c \end{bmatrix}$$
• QR decomposition can facilitate the solution of the system $Ax = b$, since $Ax = b \Rightarrow QRx = b \Rightarrow Rx = Q^T b$. The latter system is easy to solve due to the upper triangular form of $R$.
• So far you have learnt two types of decompositions: the LU and the QR.
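A sketch of this use of QR (my own random system; NumPy's `qr` plays the role of the Gram-Schmidt construction here):

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((3, 3))
b = rng.standard_normal(3)

Q, R = np.linalg.qr(A)            # A = QR, with R upper triangular
x = np.linalg.solve(R, Q.T @ b)   # solve Rx = Q^T b (a triangular system)

print(np.allclose(A @ x, b))      # True
```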

Determinants

• The determinant is a crucial number associated with square matrices only.

• It is denoted by $\det(A) = |A|$. These are two different symbols we use for the determinant.

• If a matrix 퐴 is invertible, that means det 퐴 ≠ 0.

• Furthermore, det 퐴 ≠ 0 means that matrix 퐴 is invertible.

• For a $2 \times 2$ matrix the determinant is defined as
$$\begin{vmatrix} a & b \\ c & d \end{vmatrix} = ad - bc.$$
This formula is explicitly associated with the solution of the system $Ax = b$ where $A$ is a $2 \times 2$ matrix.

Properties of determinants

1. det 퐼 = 1. This is easy to show in the case of a 2 × 2 matrix using the formula of the previous slide.

2. If we exchange two rows of a matrix the sign of the determinant reverses. Therefore:
• If we perform an even number of row exchanges the determinant remains the same.
• If we perform an odd number of row exchanges the determinant changes sign.
• Hence, the determinant of a permutation matrix is $1$ or $-1$.
$$\begin{vmatrix} 1 & 0 \\ 0 & 1 \end{vmatrix} = 1 \quad \text{and} \quad \begin{vmatrix} 0 & 1 \\ 1 & 0 \end{vmatrix} = -1 \text{ as expected.}$$


3a. If a row is multiplied with a scalar $t$, the determinant is multiplied with that scalar too, i.e.,
$$\begin{vmatrix} ta & tb \\ c & d \end{vmatrix} = t \begin{vmatrix} a & b \\ c & d \end{vmatrix}.$$

3b.
$$\begin{vmatrix} a + a' & b + b' \\ c & d \end{vmatrix} = \begin{vmatrix} a & b \\ c & d \end{vmatrix} + \begin{vmatrix} a' & b' \\ c & d \end{vmatrix}$$
Note that $\det(A + B) \neq \det(A) + \det(B)$ in general. I observe linearity only for a single row.

4. Two equal rows lead to $\det = 0$.
. As mentioned, if I exchange rows the sign of the determinant changes.
. In this case the matrix after the exchange is the same, and therefore the determinant should remain the same.
. Therefore, the determinant must be zero.
. This is also expected from the fact that the matrix is not invertible.


5.
$$\begin{vmatrix} a & b \\ c - la & d - lb \end{vmatrix} = \begin{vmatrix} a & b \\ c & d \end{vmatrix} + \begin{vmatrix} a & b \\ -la & -lb \end{vmatrix} = \begin{vmatrix} a & b \\ c & d \end{vmatrix} - l \begin{vmatrix} a & b \\ a & b \end{vmatrix} = \begin{vmatrix} a & b \\ c & d \end{vmatrix}$$
Therefore, the determinant after elimination remains the same.

6. A row of zeros leads to $\det = 0$. This can be verified as follows for any matrix:
$$\begin{vmatrix} 0 & 0 \\ c & d \end{vmatrix} = \begin{vmatrix} 0 \cdot a & 0 \cdot b \\ c & d \end{vmatrix} = 0 \begin{vmatrix} a & b \\ c & d \end{vmatrix} = 0$$

7. Consider an upper triangular matrix ($*$ is a random element):
$$\begin{vmatrix} d_1 & * & \dots & * \\ 0 & d_2 & \dots & * \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \dots & d_n \end{vmatrix} = d_1 d_2 \cdots d_n$$
I can easily show the above using the following steps:
. I transform the upper triangular matrix to a diagonal one using elimination.
. I use property 3a $n$ times.
. I end up with the determinant $\prod_{i=1}^{n} d_i \det(I) = \prod_{i=1}^{n} d_i$.
. The same comments are valid for a lower triangular matrix.


8. det 퐴 = 0 when 퐴 is singular. This is because if 퐴 is singular I get a row of zeros by elimination. Using the same concept I can say that if 퐴 is invertible then det 퐴 ≠ 0.

In general I have 퐴 → 푈 → 퐷, det 퐴 = 푑1푑2 … 푑푛 =product of pivots.

9. $\det(AB) = \det(A)\det(B)$
$$\det(A^{-1}) = \frac{1}{\det(A)}, \qquad \det(A^2) = [\det(A)]^2, \qquad \det(2A) = 2^n \det(A) \text{ where } A \text{ is } n \times n$$

10. $\det(A^T) = \det(A)$.
. In order to show that, we use the LU decomposition of $A$ and the above properties: $A = LU$ and therefore $A^T = U^T L^T$. The determinant is always the product of pivots, and transposing a triangular factor does not change its diagonal.
. This property can also be proved by the use of induction.
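Properties 9 and 10 are easy to verify numerically; a sketch with a random $4 \times 4$ matrix (my own check, not from the slides):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 4
A = rng.standard_normal((n, n))
B = rng.standard_normal((n, n))
dA = np.linalg.det(A)

print(np.isclose(np.linalg.det(A @ B), dA * np.linalg.det(B)))  # det(AB) = det(A)det(B)
print(np.isclose(np.linalg.det(np.linalg.inv(A)), 1.0 / dA))    # det(A^-1) = 1/det(A)
print(np.isclose(np.linalg.det(2 * A), 2**n * dA))              # det(2A) = 2^n det(A)
print(np.isclose(np.linalg.det(A.T), dA))                       # det(A^T) = det(A)
```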

Determinant of a 2 × 2 matrix

• The goal is to find the determinant of a $2 \times 2$ matrix $\begin{bmatrix} a & b \\ c & d \end{bmatrix}$ using the properties described previously.
• We know that $\begin{vmatrix} 1 & 0 \\ 0 & 1 \end{vmatrix} = 1$ and $\begin{vmatrix} 0 & 1 \\ 1 & 0 \end{vmatrix} = -1$.
$$\begin{vmatrix} a & b \\ c & d \end{vmatrix} = \begin{vmatrix} a & 0 \\ c & d \end{vmatrix} + \begin{vmatrix} 0 & b \\ c & d \end{vmatrix} = \begin{vmatrix} a & 0 \\ c & 0 \end{vmatrix} + \begin{vmatrix} a & 0 \\ 0 & d \end{vmatrix} + \begin{vmatrix} 0 & b \\ c & 0 \end{vmatrix} + \begin{vmatrix} 0 & b \\ 0 & d \end{vmatrix} = 0 + ad \begin{vmatrix} 1 & 0 \\ 0 & 1 \end{vmatrix} + bc \begin{vmatrix} 0 & 1 \\ 1 & 0 \end{vmatrix} + 0 = ad - bc$$
• I can repeat the above analysis for $3 \times 3$ matrices.
• I break the determinant of a $2 \times 2$ matrix into 4 determinants of simpler matrices.
• In the case of a $3 \times 3$ matrix I break it into 27 determinants.
• And so on.

Determinant of any matrix

• For the case of a $2 \times 2$ matrix we got:
$$\begin{vmatrix} a & b \\ c & d \end{vmatrix} = \begin{vmatrix} a & 0 \\ c & 0 \end{vmatrix} + \begin{vmatrix} a & 0 \\ 0 & d \end{vmatrix} + \begin{vmatrix} 0 & b \\ c & 0 \end{vmatrix} + \begin{vmatrix} 0 & b \\ 0 & d \end{vmatrix} = 0 + \begin{vmatrix} a & 0 \\ 0 & d \end{vmatrix} + \begin{vmatrix} 0 & b \\ c & 0 \end{vmatrix} + 0$$

• The determinants which survive have strictly one entry from each row and each column.

• The above is a universal conclusion.

• For the case of a $3 \times 3$ matrix $\begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix}$ we got:
$$\begin{vmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{vmatrix} = \begin{vmatrix} a_{11} & 0 & 0 \\ 0 & a_{22} & 0 \\ 0 & 0 & a_{33} \end{vmatrix} + \begin{vmatrix} a_{11} & 0 & 0 \\ 0 & 0 & a_{23} \\ 0 & a_{32} & 0 \end{vmatrix} + \dots = a_{11}a_{22}a_{33} - a_{11}a_{23}a_{32} + \dots$$

• As mentioned, the determinants which survive have strictly one entry from each row and each column.

• For the case of a $2 \times 2$ matrix the determinant has 2 surviving terms.
• For the case of a $3 \times 3$ matrix the determinant has 6 surviving terms.
• For the case of a $4 \times 4$ matrix the determinant has 24 surviving terms.
• For the case of an $n \times n$ matrix the determinant has $n!$ surviving terms.
. The element from the first row can be chosen in $n$ different ways.
. The element from the second row can be chosen in $(n - 1)$ different ways.
. and so on…

Problem
Find the determinant of the following matrix:
$$\begin{bmatrix} 0 & 0 & 1 & 1 \\ 0 & 1 & 1 & 0 \\ 1 & 1 & 0 & 0 \\ 1 & 0 & 0 & 1 \end{bmatrix}$$
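A numerical check of this problem (the dependency observation in the comment is my own, not from the slide):

```python
import numpy as np

M = np.array([[0.0, 0.0, 1.0, 1.0],
              [0.0, 1.0, 1.0, 0.0],
              [1.0, 1.0, 0.0, 0.0],
              [1.0, 0.0, 0.0, 1.0]])

# Rows 1 and 3 sum to (1,1,1,1), and so do rows 2 and 4, so the rows
# are linearly dependent and the determinant must be zero.
print(round(np.linalg.det(M)))  # 0
```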

Big Formula for the determinant

• For the case of a 푛 × 푛 matrix the determinant has 푛! terms.

$$\det(A) = \sum_{n! \text{ terms}} \pm a_{1a} a_{2b} a_{3c} \cdots a_{nz}$$

. $a, b, c, \dots, z$ are different columns.
. In the above summation, half of the terms have a plus sign and half of them have a minus sign.
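The big formula can be sketched directly with `itertools.permutations` (my own illustrative implementation; the sign of each term is the sign of the column permutation):

```python
import numpy as np
from itertools import permutations

def det_big_formula(A):
    """Sum over n! terms, one entry from each row and each column."""
    n = A.shape[0]
    total = 0.0
    for cols in permutations(range(n)):
        # Parity of the permutation = parity of its number of inversions.
        inversions = sum(cols[i] > cols[j]
                         for i in range(n) for j in range(i + 1, n))
        term = 1.0
        for row, col in enumerate(cols):
            term *= A[row, col]
        total += (-1) ** inversions * term
    return total

A = np.array([[1.0, 2.0], [3.0, 4.0]])
print(det_big_formula(A))  # -2.0 (= 1*4 - 2*3)
```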

Big Formula for the determinant

• For the case of an $n \times n$ matrix, cofactors constitute a method which helps us connect a determinant to determinants of smaller matrices.

$$\det(A) = \sum_{n! \text{ terms}} \pm a_{1a} a_{2b} a_{3c} \cdots a_{nz}$$

• For a $3 \times 3$ matrix we have $\det(A) = a_{11}(a_{22}a_{33} - a_{23}a_{32}) + \dots$
• $a_{22}a_{33} - a_{23}a_{32}$ is the determinant of a $2 \times 2$ matrix which is a sub-matrix of the original matrix.

Cofactors

• The cofactor of element $a_{ij}$ is defined as follows:
$$C_{ij} = \pm \det\big[(n-1) \times (n-1) \text{ matrix } A_{ij}\big]$$

. $A_{ij}$ is the $(n-1) \times (n-1)$ matrix that is obtained from the original matrix if row $i$ and column $j$ are eliminated.
. We keep the $+$ if $(i + j)$ is even.
. We keep the $-$ if $(i + j)$ is odd.

• Cofactor formula along row 1: $\det(A) = a_{11}C_{11} + a_{12}C_{12} + \dots + a_{1n}C_{1n}$

• Generalization:

. Cofactor formula along row $i$: $\det(A) = a_{i1}C_{i1} + a_{i2}C_{i2} + \dots + a_{in}C_{in}$
. Cofactor formula along column $j$: $\det(A) = a_{1j}C_{1j} + a_{2j}C_{2j} + \dots + a_{nj}C_{nj}$

• The cofactor formula along any row or column can be used to compute the determinant.
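The cofactor formula along row 1 translates into a short recursive sketch (my own implementation; exponential cost, so for illustration only):

```python
import numpy as np

def det_cofactor(A):
    """Determinant by cofactor expansion along the first row."""
    n = A.shape[0]
    if n == 1:
        return A[0, 0]
    total = 0.0
    for j in range(n):
        # A_1j: delete row 1 and column j+1 (0-indexed: row 0, column j).
        minor = np.delete(np.delete(A, 0, axis=0), j, axis=1)
        total += (-1) ** j * A[0, j] * det_cofactor(minor)   # sign of C_1j
    return total

A = np.array([[2.0, 1.0], [5.0, 3.0]])
print(det_cofactor(A))  # 1.0
```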

Computation of the inverse $A^{-1}$ using cofactors

• For a $2 \times 2$ matrix it is quite easy to show that
$$\begin{bmatrix} a & b \\ c & d \end{bmatrix}^{-1} = \frac{1}{ad - bc} \begin{bmatrix} d & -b \\ -c & a \end{bmatrix}$$
• Big formula for $A^{-1}$:
$$A^{-1} = \frac{1}{\det(A)} C^T, \qquad AC^T = \det(A) \cdot I$$

• $C_{ij}$ is the cofactor of $a_{ij}$, which is a sum of products of $n - 1$ entries.
• In general,
$$\begin{bmatrix} a_{11} & \dots & a_{1n} \\ \vdots & & \vdots \\ a_{n1} & \dots & a_{nn} \end{bmatrix} \begin{bmatrix} C_{11} & \dots & C_{n1} \\ \vdots & & \vdots \\ C_{1n} & \dots & C_{nn} \end{bmatrix} = \det(A) \cdot I$$
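A sketch of $A^{-1} = C^T / \det(A)$ (my own example matrix; the cofactors are computed by brute force):

```python
import numpy as np

def cofactor_matrix(A):
    """Matrix of cofactors C_ij = (-1)^(i+j) det(A_ij)."""
    n = A.shape[0]
    C = np.empty((n, n))
    for i in range(n):
        for j in range(n):
            minor = np.delete(np.delete(A, i, axis=0), j, axis=1)
            C[i, j] = (-1) ** (i + j) * np.linalg.det(minor)
    return C

A = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])
C = cofactor_matrix(A)
A_inv = C.T / np.linalg.det(A)            # A^{-1} = C^T / det(A)

print(np.allclose(A @ A_inv, np.eye(3)))  # True
```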

Solve 퐴푥 = 푏 when 퐴 is square and invertible

• The solution of the system $Ax = b$ when $A$ is square and invertible can now be obtained from
$$x = A^{-1}b = \frac{1}{\det(A)} C^T b$$
• Cramer's rule:
. The first element of vector $x$ is $x_1 = \frac{\det(B_1)}{\det(A)}$. Then $x_2 = \frac{\det(B_2)}{\det(A)}$, and so on.

. What are these matrices $B_i$? $B_1 = \begin{bmatrix} b & \text{last } n-1 \text{ columns of } A \end{bmatrix}$; that is, $B_1$ is obtained from $A$ by replacing the first column with $b$. In general, $B_i$ is obtained from $A$ by replacing the $i$th column with $b$.
. In practice we must find $(n + 1)$ determinants.
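Cramer's rule as a sketch (my own $2 \times 2$ system):

```python
import numpy as np

def cramer(A, b):
    """Solve Ax = b via x_i = det(B_i)/det(A), with B_i = A with column i replaced by b."""
    dA = np.linalg.det(A)
    x = np.empty(len(b))
    for i in range(len(b)):
        Bi = A.copy()
        Bi[:, i] = b                      # replace the i-th column with b
        x[i] = np.linalg.det(Bi) / dA
    return x

A = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([9.0, 8.0])
print(cramer(A, b))  # [2. 3.]
```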

The determinant is the volume of a box

• Consider $A$ to be a matrix of size $3 \times 3$.
• Observe the three-dimensional box (parallelepiped) formed from the three rows or columns of $A$.
• It can be proven that $\text{abs}(|A|) \equiv$ volume of the box.

$\det(A) \equiv$ volume of a box

• Take 퐴 = 퐼. Then the box mentioned previously is the unit cube and its volume is 1.
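A sketch of the volume interpretation (my own edge vectors; for a $3 \times 3$ matrix the box volume is the absolute scalar triple product of the rows):

```python
import numpy as np

# Rows of A are the three edge vectors of the box (parallelepiped).
A = np.array([[1.0, 0.0, 0.0],
              [1.0, 2.0, 0.0],
              [0.0, 1.0, 3.0]])

volume = abs(A[0] @ np.cross(A[1], A[2]))         # |a1 . (a2 x a3)|
print(volume)                                     # 6.0
print(np.isclose(volume, abs(np.linalg.det(A))))  # True
```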

Problem: Consider an orthogonal square matrix 푄. Prove that det 푄 = 1 or −1.

Solution:
$$Q^T Q = I \Rightarrow \det(Q^T Q) = \det(Q^T)\det(Q) = \det(I) = 1$$
But $\det(Q^T) = \det(Q) \Rightarrow [\det(Q)]^2 = 1 \Rightarrow \det(Q) = \pm 1$

• The above result also verifies the fact that the absolute determinant of a $3 \times 3$ matrix is the volume of the box formed by the rows of the matrix. This is because if you consider $A = Q$ with $Q$ being an orthogonal matrix, the related box is a rotated version of the unit cube in 3D space. Its volume is again 1.

• Take 푄 and double one of its vectors. The cube’s volume doubles (you have two cubes sitting on top of each other.) The determinant doubles as well (property 3a).