18.024 SPRING OF 2008

SYSTEMS OF LINEAR EQUATIONS

A set of equations of the form
$$
(1)\qquad
\begin{cases}
a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n = c_1\\
a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n = c_2\\
\quad\vdots\\
a_{m1}x_1 + a_{m2}x_2 + \cdots + a_{mn}x_n = c_m
\end{cases}
$$
is called a system of $m$ linear equations in $n$ unknowns. Here the $a_{ij}$, $i = 1, \dots, m$ and $j = 1, \dots, n$, and $c_1, \dots, c_m$ are numbers, and $x_1, \dots, x_n$ are regarded as unknowns. The system (1) is conveniently written in matrix notation as
$$
(2)\qquad AX = C,
$$
where $A = (a_{ij})$ is an $m$ by $n$ matrix, called the coefficient matrix, and
$$
X = \begin{bmatrix} x_1 \\ \vdots \\ x_n \end{bmatrix}
\qquad\text{and}\qquad
C = \begin{bmatrix} c_1 \\ \vdots \\ c_m \end{bmatrix}
$$
are, respectively, an $n$ by $1$ column vector and an $m$ by $1$ column vector. By a solution of (1) we mean an $n$-vector $(x_1, \dots, x_n)$ for which all the equations in (1) are satisfied simultaneously. The solution set of (1) consists of all such $n$-vectors; it is naturally a subset of $V_n$.

The theory of systems of linear equations forms a major branch of linear algebra. Computational algorithms for finding solutions of linear systems have important applications to real-world problems in engineering, business, and science, especially the social sciences. A system of nonlinear equations can often be approximated by a linear system (a process called linearization), which is a useful technique in designing a mathematical model of a relatively complex system. The present chapter provides a crash course on solving systems of linear equations (without proofs); an extensive theory would take many chapters to develop completely. Those who are interested in studying the proofs (and are already familiar with linear algebra) are referred to [Mun] or [Str].

Solution set. The solution set of (1) satisfies one of the following:

S1. The solution set is empty. In this case we say the system is inconsistent.

S2. The solution set consists of a single point. Then we say the solution is unique.

S3. The solution set is a $k$-dimensional plane (called a hyperplane) of $V_n$ for some $k > 0$, that is, a $k$-dimensional subspace of $V_n$ translated by an $n$-vector. We say in this case that the system has infinitely many solutions.

Example 1. (1) The system
$$
\begin{cases}
x + y = 1\\
x + y = 2
\end{cases}
$$
has no solution, since the sum of two numbers cannot be both 1 and 2.

(2) The system
$$
\begin{cases}
x + y = 1\\
x - y = 0
\end{cases}
$$
has a unique solution $(x, y) = (1/2, 1/2)$.

(3) The system
$$
\begin{cases}
x + y = 1\\
2x + 2y = 2
\end{cases}
$$
has infinitely many solutions. Indeed, any two numbers whose sum is 1 give a solution. We may express the general solution of this system as
$$
(x, y) = (0, 1) + t(1, -1),
$$
where $t$ is an arbitrary scalar. Thus the solution set is a 1-plane in $V_2$. More precisely, it is the line in $V_2$ through $(0, 1)$ determined by the direction $(1, -1)$.

Geometrically interpreted, each linear equation in the two unknowns $x$ and $y$ determines a straight line in the $(x, y)$-plane. Since a solution to a linear system must satisfy all equations of the system, it must lie on the intersection of these lines; therefore, the solution set of a linear system of two equations in two unknowns is either (1) the empty set, (2) a single point, or (3) a line.

The proof of the trichotomy result uses Gauss-Jordan elimination. The following theorem states the crucial result which we shall use later to actually solve systems.

Theorem 2. Consider the system of linear equations $AX = C$, where $A$ is an $m$ by $n$ matrix and $C$ is an $m$ by $1$ matrix. Let $B$ be the matrix obtained by applying an elementary row operation to $A$, and let $C'$ be the matrix obtained by applying the same elementary row operation to $C$. Then the solution set of the system $BX = C'$ is the same as the solution set of $AX = C$.
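As a quick illustration (not part of the original notes), the following sketch checks the unique solution of Example 1(2) numerically and illustrates Theorem 2 by encoding an elementary row operation as left multiplication by a matrix; it assumes NumPy is available.

```python
import numpy as np

# Example 1(2): the system x + y = 1, x - y = 0.
A = np.array([[1.0,  1.0],
              [1.0, -1.0]])
C = np.array([1.0, 0.0])

# Unique solution, expected to be (1/2, 1/2).
print(np.linalg.solve(A, C))        # [0.5 0.5]

# Theorem 2: apply the same elementary row operation (subtract row 1
# from row 2) to both A and C; the solution set is unchanged.
E = np.array([[ 1.0, 0.0],
              [-1.0, 1.0]])         # the row operation, written as a matrix
B, C_prime = E @ A, E @ C
print(np.linalg.solve(B, C_prime))  # again [0.5 0.5]
```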
Homogeneous systems. To (1) we can associate another system, $AX = O$, obtained by replacing each $c_i$ in (1) by $0$. This is called the homogeneous system corresponding to (1). If $C \neq O$ then (1) is called inhomogeneous. The homogeneous system always has at least one solution, namely $X = O$, which is called the trivial solution. It may have others. Furthermore, the solution set of $AX = O$ is a linear subspace of $V_n$, called the null space of $A$. Indeed, if $X_1$ and $X_2$ are solutions of $AX = O$ then so are $X_1 + X_2$ and $cX_1$ for any scalar $c$. We wish to determine the dimension of this solution space and to find a basis for it.

Definition 3. The column [row] rank of a matrix $A$ is the maximal number of linearly independent columns [rows] of $A$. The column rank and the row rank are always equal, and thus they are simply called the rank* of $A$.

*An alternative definition, viewing $A$ as a linear transformation, is given in [Apo, 16.3].

Theorem 4. Let $A$ be an $m$ by $n$ matrix and let $r$ be the rank of $A$. The solution space of the system of linear equations $AX = O$ is a subspace of $V_n$ of dimension $n - r$. In particular, if the rows of $A$ are independent then the solution space of the system $AX = O$ has dimension $n - m$.

A proof is in [Mun, B, Theorem 3].

Example 5. Let
$$
(3)\qquad
A = \begin{bmatrix}
0 & 1 & 4 & 1 & 2\\
-1 & -2 & 0 & 9 & -1\\
1 & 2 & 0 & -6 & 1\\
2 & 5 & 4 & -10 & 4
\end{bmatrix},
$$
so that $AX = O$ is a system of 4 equations in 5 unknowns. Now, $A$ is reduced via Gauss-Jordan elimination to
$$
(4)\qquad
D = \begin{bmatrix}
1 & 0 & -8 & 0 & -3\\
0 & 1 & 4 & 0 & 2\\
0 & 0 & 0 & 1 & 0\\
0 & 0 & 0 & 0 & 0
\end{bmatrix}.
$$
We apply Theorem 2 in our situation to assert that the solution set of $AX = O$ agrees with that of $DX = O$. In view of Theorem 4, the solution set of $DX = O$ (and hence also the solution set of $AX = O$) has dimension $5 - 3 = 2$.

We now find the general solution of $AX = O$ (or, equivalently, $DX = O$). Observe that in the system $DX = O$ the unknowns $x_1$, $x_2$, and $x_4$ each appear in exactly one equation. We solve for these unknowns in terms of the others:
$$
x_1 = 8x_3 + 3x_5, \qquad x_2 = -4x_3 - 2x_5, \qquad x_4 = 0.
$$
The general solution thus can be written as
$$
X = (8x_3 + 3x_5,\ -4x_3 - 2x_5,\ x_3,\ 0,\ x_5)
  = (8x_3, -4x_3, x_3, 0, 0) + (3x_5, -2x_5, 0, 0, x_5)
  = x_3(8, -4, 1, 0, 0) + x_5(3, -2, 0, 0, 1).
$$
The solution space is therefore spanned by the two vectors $(8, -4, 1, 0, 0)$ and $(3, -2, 0, 0, 1)$.

The procedure we followed in the above example can be followed in general. We write $X$ as a vector each of whose components is a linear combination of the free unknowns $x_i$, and then as a linear combination, with the $x_i$ as coefficients, of vectors in $V_n$. There are of course $n - r$ of the free unknowns $x_i$, and hence $n - r$ of these vectors. These vectors are linearly independent. (Why?)

Solving inhomogeneous systems. We now turn to solving (1) with allowance for an inhomogeneous term $C \neq O$.

Theorem 6. (a) The solution set of (2) is given by
$$
\{P + X_h : AX_h = O\},
$$
where $P$ is a solution to the inhomogeneous system $AX = C$.
(b) Let $r$ be the rank of $A$. If $r < m$ then for some $C \in V_m$ there is no solution of (2). If $r = m$ then (2) always has a solution.

The proof of (a) is in [Apo, Theorem 16.18] and the proof of (b) is in [Mun, B, Theorem 6].

Example 7 (continued). The system
$$
DX = \begin{bmatrix} 0\\ 0\\ 0\\ 1 \end{bmatrix},
$$
where $D$ is given in (4), has no solution, since the last equation of the system is
$$
0x_1 + 0x_2 + 0x_3 + 0x_4 + 0x_5 = 1.
$$
On the other hand, the system
$$
DX = \begin{bmatrix} -1\\ 3\\ 7\\ 0 \end{bmatrix}
$$
does have a solution. Indeed, a particular solution satisfies
$$
x_1 = -1 + 8x_3 + 3x_5, \qquad x_2 = 3 - 4x_3 - 2x_5, \qquad x_4 = 7.
$$
One such solution is $X = (-1, 3, 0, 7, 0)$.
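For readers who want to reproduce Examples 5 and 7 by machine, here is a small sketch (not from the original notes) using SymPy's exact row reduction; the matrices are those of (3) and (4), and rref and nullspace are SymPy's built-in methods.

```python
from sympy import Matrix

# The 4-by-5 matrix A of Example 5.
A = Matrix([[ 0,  1, 4,   1,  2],
            [-1, -2, 0,   9, -1],
            [ 1,  2, 0,  -6,  1],
            [ 2,  5, 4, -10,  4]])

# Gauss-Jordan elimination: rref() returns the reduced matrix D of (4)
# together with the indices of the pivot columns.
D, pivots = A.rref()
print(D)            # rows (1,0,-8,0,-3), (0,1,4,0,2), (0,0,0,1,0), (0,0,0,0,0)
print(len(pivots))  # rank r = 3

# Theorem 4: the null space has dimension n - r = 5 - 3 = 2.
for v in A.nullspace():
    print(v.T)      # the spanning vectors (8,-4,1,0,0) and (3,-2,0,0,1)

# Example 7: the inhomogeneous system DX = (-1, 3, 7, 0) has the
# particular solution P = (-1, 3, 0, 7, 0).
C = Matrix([-1, 3, 7, 0])
P = Matrix([-1, 3, 0, 7, 0])
print(D * P == C)   # True
```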
The general solution is thus the 2-plane in $V_5$ represented by the parametric equation
$$
X = (-1, 3, 0, 7, 0) + x_3(8, -4, 1, 0, 0) + x_5(3, -2, 0, 0, 1).
$$

Solving the system $AX = C$ in practice involves applying elementary row operations to $A$ and applying the same operations to $C$. A convenient way to perform these is to form a new matrix from $A$ by adjoining $C$ as an additional column. The matrix so obtained is often called the augmented matrix of the system. One then applies the elementary row operations to this matrix, dealing with $A$ and $C$ simultaneously. This procedure is explained in [Apo, 16.18].

Remarks on inverses of matrices. We must take into account that matrix multiplication is not commutative. For example, for
$$
A = \begin{bmatrix} 1 & 1 & 2\\ 0 & 1 & 3 \end{bmatrix},
\qquad
B = \begin{bmatrix} 0 & 0\\ 3 & -2\\ -1 & 1 \end{bmatrix},
$$
it is straightforward to verify that $AB = I_2$ but $BA \neq I_3$.

Definition 8. Given $A \in M_{m \times n}$, a matrix $B \in M_{n \times m}$ is called a left [right] inverse of $A$ if $BA = I_n$ [$AB = I_m$], respectively, where $I_n$ is the identity matrix of dimension $n$. A matrix $B \in M_{n \times m}$ is called an inverse of $A$ if $BA = I_n$ and $AB = I_m$.

Theorem 9. A matrix $A \in M_{m \times n}$ has an inverse if and only if $m = n = \operatorname{rank} A$.
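The following sketch (again not part of the original notes, and assuming NumPy) checks the one-sided inverse example above and illustrates the rank condition of Theorem 9 on a hypothetical square matrix S chosen for illustration.

```python
import numpy as np

# The matrices from the remark: A is 2-by-3, B is 3-by-2.
A = np.array([[1, 1, 2],
              [0, 1, 3]])
B = np.array([[ 0,  0],
              [ 3, -2],
              [-1,  1]])

print(A @ B)   # the 2-by-2 identity: B is a right inverse of A
print(B @ A)   # a 3-by-3 matrix that is not the identity

# Theorem 9: a (two-sided) inverse exists only for a square matrix of full rank.
S = np.array([[1, 1],
              [1, -1]])
print(np.linalg.matrix_rank(S))   # 2 = m = n, so S is invertible
print(np.linalg.inv(S) @ S)       # the 2-by-2 identity (up to rounding)
```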