Consider the linear combination c1a1 + c2a2 + … + cmam, where c1,…,cm are scalars and a1,…,am are n-dimensional vectors (n ≠ m in general).
Next consider the equation
c1a1 + c2a2 + … + cmam = 0    (1)
Definition: The set of vectors a1,…,am is linearly independent if the only solution to (1) is c1 = c2 = … = cm = 0. Otherwise, the vectors are linearly dependent.
Remark: Linear dependence allows us to write (at least) one of the vectors as a linear combination of the others. Suppose c1 ≠ 0; then
a1 = −(c2/c1)a2 − (c3/c1)a3 − … − (cm/c1)am
Example: The three vectors
a1 = [3 0 2 2], a2 = [−6 42 24 54], a3 = [21 −21 0 −15]
are linearly dependent because 6a1 − (1/2)a2 − a3 = 0.
5.1
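The dependence relation above is easy to check componentwise. The deck verifies results in MATLAB; as an illustrative aside, here is the same check as a Python sketch using exact rational arithmetic (the variable names are our own, not from the slides):

```python
from fractions import Fraction

# Vectors from the example, kept as exact integers/rationals
a1 = [3, 0, 2, 2]
a2 = [-6, 42, 24, 54]
a3 = [21, -21, 0, -15]

# Check the dependence relation 6*a1 - (1/2)*a2 - a3 = 0 componentwise
combo = [6 * x - Fraction(1, 2) * y - z for x, y, z in zip(a1, a2, a3)]
print(combo)  # every component is 0
```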
Matrix Rank
Definition: The maximum number of linearly independent row vectors of a matrix A = [ajk] is called the rank of A and is denoted by rank A.
Example: The matrix
A = [  3    0    2    2
      −6   42   24   54
      21  −21    0  −15 ]
has rank 2, or rank A = 2. (How do we know that it is not less than 2?)
MATLAB Verification:
>> A = [3 0 2 2; -6 42 24 54; 21 -21 0 -15]
>> rank(A)
5.2
Matrix Rank
Theorem: The rank of a matrix A equals the number of linearly independent column vectors. Hence, rank A = rank A^T.
Proof: Let A = [ajk] be an m × n matrix and let rank A = r. Then A has a linearly independent set of row vectors v1,…,vr, and all row vectors of A, a1,…,am, are linear combinations of these vectors, so we may write
a1 = c11 v1 + c12 v2 + … + c1r vr
a2 = c21 v1 + c22 v2 + … + c2r vr
⋮
am = cm1 v1 + cm2 v2 + … + cmr vr
We may now write equations for the k-th component of the aj in terms of the k-th components of the vj as
a1k = c11 v1k + c12 v2k + … + c1r vrk
a2k = c21 v1k + c22 v2k + … + c2r vrk
⋮
amk = cm1 v1k + cm2 v2k + … + cmr vrk
5.3
Matrix Rank
Proof (continued):
a1k = c11 v1k + c12 v2k + … + c1r vrk
a2k = c21 v1k + c22 v2k + … + c2r vrk
⋮
amk = cm1 v1k + cm2 v2k + … + cmr vrk
which we may also write
[a1k; a2k; …; amk] = v1k [c11; c21; …; cm1] + v2k [c12; c22; …; cm2] + … + vrk [c1r; c2r; …; cmr]
We see that [a1k a2k … amk]^T is the k-th column vector of A. Hence, each column vector is a linear combination of the r m-dimensional column vectors cj, where cj = [c1j c2j … cmj]^T. Consequently, the number of linearly independent column vectors is ≤ r.
5.4
Matrix Rank
Proof (continued): We may carry out the same argument with A^T. Since the rows of A are the columns of A^T and vice versa, we obtain the reverse inequality, and hence the number of linearly independent columns equals r.
Example: Consider again
A = [  3    0    2    2
      −6   42   24   54
      21  −21    0  −15 ]
We see
[2; 24; 0] = (2/3)[3; −6; 21] + (2/3)[0; 42; −21]   and   [2; 54; −15] = (2/3)[3; −6; 21] + (29/21)[0; 42; −21]
So, only two column vectors are linearly independent (rank A = 2, again).
How do we determine the rank? Gauss elimination… but how do we prove that?
5.5
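The two column relations above can be verified with exact arithmetic. A minimal Python sketch (an illustrative aside, not from the slides):

```python
from fractions import Fraction as F

# Columns of A = [3 0 2 2; -6 42 24 54; 21 -21 0 -15]
c1, c2 = [3, -6, 21], [0, 42, -21]
c3, c4 = [2, 24, 0], [2, 54, -15]

# Verify c3 = (2/3)c1 + (2/3)c2 and c4 = (2/3)c1 + (29/21)c2 exactly
lhs3 = [F(2, 3) * x + F(2, 3) * y for x, y in zip(c1, c2)]
lhs4 = [F(2, 3) * x + F(29, 21) * y for x, y in zip(c1, c2)]
print(lhs3 == c3, lhs4 == c4)  # True True
```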
Finite-Dimensional, Real Vector Spaces
Definition: A non-empty set V of elements a, b, … is called a real vector space, and these elements are called vectors, if in V there are defined two algebraic operations (vector addition and scalar multiplication) over the field of real numbers that satisfy the relations:
(I) Vector addition associates to every pair of elements a and b a unique element of V that is denoted a + b
(I.1) Commutativity: For any a, b: a + b = b + a
(I.2) Associativity: For any three vectors a, b, c: (a + b) + c = a + (b + c)
(I.3) There is a unique 0 vector in V, such that a + 0 = a
(I.4) For every a, there is a unique vector −a, such that a + (−a) = 0
5.6
Finite-Dimensional, Real Vector Spaces
Definition (continued): A non-empty set V of elements a, b, … is called a real vector space, and these elements are called vectors, if in V there are defined two algebraic operations (vector addition and scalar multiplication) over the field of real numbers that satisfy the relations:
(II) Scalar multiplication associates to every a ∈ V and c ∈ ℝ a unique element of V that is denoted ca
(II.1) Distributivity: For any c, a, b: c(a + b) = ca + cb
(II.2) Distributivity: For any c, k, a: (c + k)a = ca + ka
(II.3) Associativity: For any c, k, a: c(ka) = (ck)a
(II.4) For every a: 1a = a
The vector space is finite-dimensional or n-dimensional if each vector has n elements, where n is a positive integer.
5.7
Finite-Dimensional, Real Vector Spaces
Definition: A field is a set of mathematical “objects” (the standard terminology of mathematical texts) that is endowed with two mathematical operations, addition (+) and multiplication (·), that satisfy the following axioms:
Axiom 1. (Commutative laws): x + y = y + x, x·y = y·x
Axiom 2. (Associative laws): x + (y + z) = (x + y) + z, x·(y·z) = (x·y)·z
Axiom 3. (Identity elements): There are unique elements 0 and 1 such that x + 0 = x and x·1 = x
Axiom 4. (Inverses): For each x, there is a unique element −x, such that x + (−x) = 0; for each x ≠ 0, there is a unique element x⁻¹, such that x·x⁻¹ = 1
Axiom 5. (Distributive law): x·(y + z) = x·y + x·z
Remark: The only fields that are currently important in applications are the real and complex numbers
5.8
Finite-Dimensional, Real Vector Spaces
Remarks:
(1) The scalar field can be any arithmetic field that obeys the field axioms. The complex field is often used. Quaternions are sometimes used.
(2) A vector space V can consist of any objects that obey the axioms. These can be column vectors, row vectors, or matrices (of the same size), among other possibilities. This concept is much more general than the concept of row vectors or column vectors!
(3) The most important abstract example of a real, finite-dimensional vector space is ℝ^n, the space of all ordered n-tuples of real numbers
(4) The powerful concept of a vector space can be (and will be) extended to infinite dimensions. An important example: all functions whose squares are Lebesgue-integrable between 0 and 1 [denoted L2(0,1)]
5.9
Finite-Dimensional, Real Vector Spaces
Definitions:
(1) The maximum number of linearly independent vectors in V is called the dimension of V and is denoted dim V. In a finite-dimensional vector space, this number is always a non-negative integer.
(2) A linearly independent set of vectors that has the largest possible number of elements is called a basis for V. The number of these elements equals dim V.
(3) The set of all linear combinations of a non-empty set of vectors a1, …, ap ∈ V is called the span of these vectors. They define a subspace of V and constitute a vector space. The subspace’s dimension is less than or equal to dim V.
Example: The span of the three vectors from the example of slide 5.1
a1 = [3 0 2 2], a2 = [−6 42 24 54], a3 = [21 −21 0 −15]
is a vector space of dimension 2. Any two of these vectors are linearly independent and may be chosen as a basis, e.g., a1 and a2
5.10
Matrix Theorems
Theorem 1: Row-equivalent matrices have the same rank. Consequently, we may determine the rank by reducing the matrix to upper triangular (echelon) form. The number of non-zero rows equals the rank.
Example: Consider again
A = [  3    0    2    2
      −6   42   24   54
      21  −21    0  −15 ]
We find
[  3    0    2    2        [ 3    0    2    2       [ 3    0    2    2
  −6   42   24   54   →     0   42   28   58   →    0   42   28   58
  21  −21    0  −15 ]       0  −21  −14  −29 ]      0    0    0    0 ]
and conclude once again that the matrix has rank 2.
5.11
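The reduction above can be automated. A minimal Python sketch of rank-by-Gauss-elimination with exact rational arithmetic (an illustrative translation of the procedure, with our own function name `rank`):

```python
from fractions import Fraction

def rank(rows):
    """Rank via Gauss elimination: reduce to echelon form, count non-zero rows."""
    m = [[Fraction(x) for x in row] for row in rows]
    r, col = 0, 0
    while r < len(m) and col < len(m[0]):
        # Find a pivot in the current column at or below row r
        piv = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if piv is None:
            col += 1
            continue
        m[r], m[piv] = m[piv], m[r]          # row interchange
        for i in range(r + 1, len(m)):       # eliminate below the pivot
            f = m[i][col] / m[r][col]
            m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        r, col = r + 1, col + 1
    return r

A = [[3, 0, 2, 2], [-6, 42, 24, 54], [21, -21, 0, -15]]
print(rank(A))  # 2
```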
Matrix Theorems
Theorem 2: Linear dependence and independence. The p vectors x1,…,xp with n components each are linearly independent if the matrix with row vectors x1,…,xp has rank p. They are linearly dependent if the rank is less than p.
Theorem 3: A set of p vectors with n components each is always linearly dependent when n < p.
Theorem 4: The vector space ℝ^n, which consists of all vectors with n components, has dimension n.
5.12
Matrix Theorems
Theorem 5: Fundamental theorem for linear systems.
(a) Existence. A linear system of m equations with n unknowns x1,…,xn has solutions if and only if the coefficient matrix A and the augmented matrix Â have the same rank. We recall (slide 3.6) that the system is Ax = b with
A = [ a11 a12 … a1n        x = [ x1        b = [ b1
      a21 a22 … a2n              x2              b2
       ⋮                          ⋮               ⋮
      am1 am2 … amn ]            xn ]            bm ]
and the augmented matrix is Â = [A | b], i.e., A with b appended as an extra column.
(b) Uniqueness. The system has precisely one solution if and only if rank A = rank Â = n.
(c) Infinitely many solutions. If r = rank A = rank Â < n, the system has infinitely many solutions. We may choose n − r of the unknowns arbitrarily. Once that is done, the remaining r unknowns are determined.
5.13
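The existence criterion in (a) can be exercised on a tiny system. A Python sketch (our own example system, not from the slides), reusing the rank-by-elimination idea:

```python
from fractions import Fraction

def rank(rows):
    # Echelon-form rank via Gauss elimination, exact arithmetic
    m = [[Fraction(x) for x in row] for row in rows]
    r, col = 0, 0
    while r < len(m) and col < len(m[0]):
        piv = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if piv is None:
            col += 1
            continue
        m[r], m[piv] = m[piv], m[r]
        for i in range(r + 1, len(m)):
            f = m[i][col] / m[r][col]
            m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        r, col = r + 1, col + 1
    return r

# x + y = 2, 2x + 2y = 4: rank A = rank [A|b] = 1 < n = 2, so infinitely many solutions
A = [[1, 1], [2, 2]]
print(rank(A), rank([[1, 1, 2], [2, 2, 4]]))  # 1 1
# Same A with b = [2, 5]: rank [A|b] = 2 > rank A = 1, so no solution
print(rank([[1, 1, 2], [2, 2, 5]]))           # 2
```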
Matrix Theorems
Theorem 5 (continued): Fundamental theorem for linear systems. (d) Gauss elimination. If solutions exist, they can all be obtained by Gauss elimination.
Definition: The system Ax = b is called homogeneous if b = 0. Otherwise, it is called inhomogeneous.
Theorem 6. Homogeneous systems. A homogeneous system always has the trivial solution x1 = 0, x2 = 0,…, xn = 0. Non-trivial solutions exist if and only if r = rank A < n. In this case, the solutions, together with x = 0, form a vector space of dimension n − r, called the solution space. In particular, if x1 and x2 are solution vectors, then x = c1x1 + c2x2, where c1 and c2 are scalars, is a solution vector.
Proof: The solution vectors form a vector space because if x1 and x2 are among them, then Ax = A(c1x1 + c2x2) = c1Ax1 + c2Ax2 = 0. If r < n, then 5(c) implies that we may arbitrarily choose n − r unknowns, xr+1,…,xn, and every solution can be obtained in this way.
5.14
Matrix Theorems
Theorem 6. Homogeneous systems. A homogeneous system always has the trivial solution x1 = 0, x2 = 0,…, xn = 0. Non-trivial solutions exist if and only if r = rank A < n. In this case, the solutions, together with x = 0, form a vector space of dimension n − r, called the solution space. In particular, if x1 and x2 are solution vectors, then x = c1x1 + c2x2, where c1 and c2 are scalars, is a solution vector.
Proof: The solution vectors form a vector space because if x1 and x2 are among them, then Ax = A(c1x1 + c2x2) = c1Ax1 + c2Ax2 = 0. If r < n, then 5(c) implies that we may arbitrarily choose n − r unknowns, xr+1,…,xn, and every solution can be obtained in this way. We may obtain a basis for the solution space by choosing xr+j = 1 and all the other xr+1,…,xn equal to zero. Thus, the solution space has dimension n − r.
Definition: This solution space is called the null space because Ax = 0 for every element in the space. The dimension of the space is called the nullity of A. We find rank A + nullity A = n.
5.15
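The closure property in Theorem 6 is easy to see numerically. A small Python sketch with a hypothetical 2 × 4 matrix of rank 2 (so the null space has dimension n − r = 2); the matrix and solution vectors are our own illustration, not from the slides:

```python
# A hypothetical 2x4 matrix of rank 2; its null space has dimension 4 - 2 = 2
A = [[1, 0, 1, 0],
     [0, 1, 0, 1]]

def matvec(M, x):
    # Plain matrix-vector product
    return [sum(m * xi for m, xi in zip(row, x)) for row in M]

x1 = [-1, 0, 1, 0]   # two independent solutions of Ax = 0
x2 = [0, -1, 0, 1]
c1, c2 = 5, -3       # arbitrary scalars
x = [c1 * u + c2 * v for u, v in zip(x1, x2)]
print(matvec(A, x))  # [0, 0]: the linear combination is again a solution
```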
Matrix Theorems
We can now completely characterize the inhomogeneous solutions.
Theorem 7. Inhomogeneous systems. If an inhomogeneous system Ax = b has solutions, then these solutions have the form x = x0 + xh, where x0 is any fixed solution (a particular solution) of the system and xh runs through all the solutions of the homogeneous system.
Proof: Let x be any solution of the system and x0 a particular solution; then Ax = b and Ax0 = b, so that A(x − x0) = 0 and thus x − x0 = xh, an element of the null space. Running through all elements of the null space, we obtain all the solutions.
5.16
Determinants and Cramer’s Rule
Determinants play an important theoretical role in the study of n × n matrices and can be used to solve n equations with n unknowns, although it is not an efficient way to do it.
Definitions: A determinant of order n is a scalar associated with an n × n matrix A = [ajk], which is written
D = det A = | a11 a12 … a1n |
            | a21 a22 … a2n |  = Σ_σ (sgn σ) a1σ1 a2σ2 ⋯ anσn
            |  ⋮             |
            | an1 an2 … ann |
where σ indicates a permutation of [1 2 3 … n], Σ_σ indicates the sum over all permutations, and sgn σ = ±1 is the sign of the permutation.
Remark: We denote a permutation by the two-row notation σ = (1 2 … n; σ1 σ2 … σn)
Example: σ = (1 2 3 4; 2 3 4 1), i.e., σ1 = 2, σ2 = 3, σ3 = 4, σ4 = 1
5.17
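The permutation-sum definition translates directly into code. A Python sketch (an illustrative aside; as the next slide notes, this O(n!) method is not practical for large n):

```python
from itertools import permutations
from math import prod

def sign(p):
    # Parity via inversion count: sgn = (-1)^(number of inversions)
    inv = sum(p[i] > p[j] for i in range(len(p)) for j in range(i + 1, len(p)))
    return -1 if inv % 2 else 1

def det(A):
    # D = sum over permutations s of (sgn s) * a[1,s1] * ... * a[n,sn]
    n = len(A)
    return sum(sign(s) * prod(A[j][s[j]] for j in range(n))
               for s in permutations(range(n)))

print(det([[3, 2, 1], [5, 6, 4], [7, 8, 9]]))  # 30, matching the MATLAB check on slide 5.27
```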
Determinants and Cramer’s Rule
Properties of permutations:
(1) There are n! permutations of dimension n. Hence our definition is not a practical way to compute a determinant if n > 3. (Use LU factorization instead.)
(2) Every permutation can be factored into at most n − 1 transpositions. Example: σ = (1,4)(1,3)(1,2)
(3) Every transposition can be factored into an odd number of elementary transpositions of the form (i, i + 1). Example: (1,3) = (2,3)(1,2)(2,3) = (1,2)(2,3)(1,2)
(4) The number of transpositions in a permutation is not unique, but the number is always even or odd. Hence, each permutation has a well-defined parity and sgn σ is unambiguous.
(5) The permutations of dimension n constitute a group: (a) there is an identity permutation; (b) each permutation has a unique inverse; (c) permutations are associative
5.18
Determinants and Cramer’s Rule
Properties of permutations:
(6) If we let τ = σρ, then sgn τ = (sgn σ)(sgn ρ). In particular, sgn σ⁻¹ = sgn σ
n = 2 Determinants:
D = det A = | a11 a12 | = a11a22 − a21a12   (two terms)
            | a21 a22 |
n = 3 Determinants:
D = det A = | a11 a12 a13 |
            | a21 a22 a23 | = a11a22a33 + a12a23a31 + a13a21a32
            | a31 a32 a33 |   − a11a23a32 − a12a21a33 − a13a22a31   (six terms)
These expressions are very useful in practice!
5.19
Determinants and Cramer’s Rule
Properties of determinants:
Theorem 1: det I = 1.
Proof: The only permutation that does not produce a zero multiplicand is the identity permutation, and all of its multiplicands are 1.
Theorem 2: If B = Pjk A, where Pjk is an elementary permutation matrix, i.e., we transpose two rows of A, then det B = −det A.
Proof: Let τ be the transposition of j and k, so that b_{i,m} = a_{τ(i),m}. Since sgn σ = −sgn(στ), we have
det B = Σ_σ (sgn σ) b1σ1 ⋯ bnσn = Σ_σ (sgn σ) a1,(στ)1 ⋯ an,(στ)n = −Σ_σ (sgn στ) a1,(στ)1 ⋯ an,(στ)n = −det A
Corollary: The determinant of a matrix with two identical rows is zero.
Theorem 3: det cA = c^n det A.
Proof: Each term in the sum contains the factor c exactly n times.
5.20
Determinants and Cramer’s Rule
Properties of determinants:
Theorem 4: Let A = [ajk] with alk = blk + clk for the l-th row. Let B = [bjk] and C = [cjk], where bjk = cjk = ajk when j ≠ l. Then det A = det B + det C.
Proof:
det A = Σ_σ (sgn σ)(bl,σl + cl,σl) Π_{j≠l} aj,σj = Σ_σ (sgn σ) b1σ1 ⋯ bnσn + Σ_σ (sgn σ) c1σ1 ⋯ cnσn = det B + det C
Example:
| a11        a12        |   | a11 a12 |   | a11 a12 |
| b21 + c21  b22 + c22  | = | b21 b22 | + | c21 c22 |
Corollary: Adding a multiple of one row to another row does not change the determinant.
We have shown that the elementary row operations do not change a non-zero determinant to zero or vice versa.
5.21
Determinants and Cramer’s Rule
Properties of determinants:
Theorem 5: Gauss elimination without pivoting does not change det A. Gauss elimination with pivoting yields ±det A, where the sign depends on whether the number of row interchanges is even or odd.
Corollary: det A = (sgn τ) Π_{j=1}^n ujj, where the ujj are the diagonal elements of the upper triangular matrix that results from Gauss elimination and τ is the corresponding row permutation.
Remark: Gauss elimination provides an effective way to calculate the determinant.
Theorem 6: det A^T = det A.
Proof:
det A^T = Σ_σ (sgn σ) aσ1,1 aσ2,2 ⋯ aσn,n = Σ_σ (sgn σ) a1,σ⁻¹1 ⋯ an,σ⁻¹n = Σ_σ (sgn σ⁻¹) a1,σ⁻¹1 ⋯ an,σ⁻¹n = det A
using sgn σ⁻¹ = sgn σ.
Corollary: All row theorems apply equally well to columns
5.22
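Theorem 5 gives the practical algorithm: eliminate, track the sign of the row interchanges, and multiply the pivots. A Python sketch with exact rational arithmetic (an illustrative translation, with our own function name `det`):

```python
from fractions import Fraction

def det(A):
    """Determinant by Gauss elimination with partial pivoting:
    det A = (sign from row interchanges) * (product of the pivots)."""
    m = [[Fraction(x) for x in row] for row in A]
    n, sgn = len(m), 1
    for k in range(n):
        piv = next((i for i in range(k, n) if m[i][k] != 0), None)
        if piv is None:
            return Fraction(0)           # no pivot: the determinant is zero
        if piv != k:
            m[k], m[piv] = m[piv], m[k]  # each interchange flips the sign
            sgn = -sgn
        for i in range(k + 1, n):
            f = m[i][k] / m[k][k]
            m[i] = [a - f * b for a, b in zip(m[i], m[k])]
    p = Fraction(1)
    for k in range(n):
        p *= m[k][k]                     # product of diagonal elements u_jj
    return sgn * p

print(det([[3, 2, 1], [5, 6, 4], [7, 8, 9]]))  # 30
```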
Determinants and Cramer’s Rule
Properties of determinants:
Theorem 6: det AB = (det A)(det B)
Proof: We first suppose that A = D is diagonal. Then each element of the first row of B is multiplied by d11. Hence, the factor d11 will appear in det DB. The second row of DB is multiplied by d22. Hence, the factor d22 also appears in det DB. Continuing in this way, we find
det DB = Σ_σ (sgn σ) (d11 b1σ1)(d22 b2σ2) ⋯ (dnn bnσn) = (Π_j djj) Σ_σ (sgn σ) b1σ1 ⋯ bnσn = det D det B
By a combination of Gauss elimination of elements below the diagonal and Gauss–Jordan elimination of elements above the diagonal, we can transform any matrix A to a diagonal matrix D without changing its determinant, except for perhaps a sign. We apply the same operations, along with the same pivots, to AB, which transforms this matrix to DB, without changing its determinant except for the same sign as appears in det D. The theorem follows.
Corollary: det A⁻¹ = 1/det A
5.23
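The product theorem can be spot-checked numerically with the matrices from the MATLAB slide (5.27). A Python sketch using the six-term n = 3 formula from slide 5.19:

```python
def det3(M):
    # Six-term n = 3 determinant formula from slide 5.19
    (a, b, c), (d, e, f), (g, h, i) = M
    return a*e*i + b*f*g + c*d*h - a*f*h - b*d*i - c*e*g

A = [[3, 2, 1], [5, 6, 4], [7, 8, 9]]
B = [[1, 0, 0], [0, 2, 0], [0, 0, 3]]
AB = [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
      for i in range(3)]
print(det3(A), det3(B), det3(AB))  # 30 6 180: det AB = (det A)(det B)
```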
Determinants and Cramer’s Rule
An alternative definition:
Theorem 7: We may write
D = det A = Σ_{k=1}^n (−1)^{j+k} ajk Mjk (for any fixed j) = Σ_{j=1}^n (−1)^{j+k} ajk Mjk (for any fixed k)
where Mjk is the determinant of the (n − 1) × (n − 1) matrix that is obtained by removing the j-th row and k-th column from A.
Proof: We first observe that a11 appears in all permutations of the form [1 σ2 σ3 … σn], where [σ2 σ3 … σn] is a permutation of [2 3 … n]. Hence, we conclude that the term in D that is proportional to a11 must equal a11M11. The factor a12 appears in all permutations of the form (1,2)[1 σ2 σ3 … σn], where we transpose the first and second columns. That changes the sign of the permutations, so the term proportional to a12 must equal −a12M12. The factor a13 appears in all permutations of the form (3,2)(2,1)[1 σ2 σ3 … σn], which has two transpositions and contributes +a13M13.
5.24
Determinants and Cramer’s Rule
An alternative definition:
Theorem 7: We may write
D = det A = Σ_{k=1}^n (−1)^{j+k} ajk Mjk = Σ_{j=1}^n (−1)^{j+k} ajk Mjk
where Mjk is the determinant of the (n − 1) × (n − 1) matrix obtained by removing the j-th row and k-th column from A.
Proof (continued): Continuing in this way, we find D = det A = Σ_{k=1}^n (−1)^{1+k} a1k M1k. To obtain the result for any j, we transpose rows. To obtain the second equality, we use the corollary to Theorem 6 on slide 5.22.
Remark: This theorem is often used as the definition of determinants.
Notation: The quantity Mjk is called the minor of ajk in D. The quantity Cjk = (−1)^{j+k} Mjk is called the cofactor.
5.25
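Theorem 7 is naturally recursive: expand along the first row, and each minor is a smaller determinant. A Python sketch (illustrative; like the permutation sum, this is exponential-time and not how determinants are computed in practice):

```python
def det(A):
    """Determinant by cofactor expansion along the first row (Theorem 7)."""
    n = len(A)
    if n == 1:
        return A[0][0]
    total = 0
    for k in range(n):
        # Minor: delete row 0 and column k (0-based, so the sign is (-1)**k)
        minor = [row[:k] + row[k + 1:] for row in A[1:]]
        total += (-1) ** k * A[0][k] * det(minor)
    return total

print(det([[3, 2, 1], [5, 6, 4], [7, 8, 9]]))  # 30
```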
Determinants and Cramer’s Rule
Cramer’s Rule:
Theorem 8: If the determinant is non-zero, then a unique solution to the linear equation Ax = b exists and is given by xk = Dk/D, k = 1,…,n, where D = det A and Dk is the determinant obtained from A by replacing the k-th column of A with b.
Proof: Not given! This approach to solving linear equations is useless in practice.
Inverse Rule:
Theorem 9: The inverse of a non-singular matrix is given by
A⁻¹ = [Cjk]^T / det A
where Cjk is the cofactor of ajk, and we note that we use the transpose. The n = 2 version is sometimes useful:
A⁻¹ = (1/det A) [  a22  −a12
                  −a21   a11 ]
5.26
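The n = 2 formula is short enough to code and check directly. A Python sketch with exact rationals (the matrix is our own example; the check verifies A·A⁻¹ = I):

```python
from fractions import Fraction

def inv2(A):
    """2x2 inverse from the cofactor formula: swap the diagonal,
    negate the off-diagonal, divide by the determinant."""
    (a, b), (c, d) = A
    det = a * d - b * c
    assert det != 0, "matrix is singular"
    f = Fraction(1, det)
    return [[ f * d, -f * b],
            [-f * c,  f * a]]

A = [[3, 2], [1, 4]]
Ainv = inv2(A)
# Check that A * Ainv is the identity matrix
prod = [[sum(A[i][k] * Ainv[k][j] for k in range(2)) for j in range(2)]
        for i in range(2)]
print(prod)  # [[1, 0], [0, 1]]
```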
Determinants and Cramer’s Rule
Determinants in MATLAB:
>> syms a11 a12 a21 a22
>> A = [a11 a12; a21 a22]
>> dA = det(A)
>> iA = inv(A)
>> A = [3 2 1; 5 6 4; 7 8 9]
>> dA = det(A)
>> dAT = 3*6*9 + 2*4*7 + 1*5*8 - 7*6*1 - 8*4*3 - 9*5*2
>> B = [1 0 0; 0 2 0; 0 0 3]
>> dB = det(B)
>> dAB = det(A*B)
>> dABt = dA*dB
>> doc det
5.27