A. Mathematical Preliminaries
In this appendix, we present some of the miscellaneous bits of mathematical knowledge that are not topics of advanced linear algebra themselves, but are nevertheless useful and might be missing from the reader's toolbox. We also review some basic linear algebra from an introductory course that the reader may have forgotten.

A.1 Review of Introductory Linear Algebra

Here we review some of the basics of linear algebra that we expect the reader to be familiar with throughout the main text. We present some of the key results of introductory linear algebra, but we do not present any proofs or much context. For a more thorough presentation of these results and concepts, the reader is directed to an introductory linear algebra textbook like [Joh20].

A.1.1 Systems of Linear Equations

One of the first objects typically explored in introductory linear algebra is a system of linear equations (or a linear system for short), which is a collection of one or more linear equations (i.e., equations in which variables can only be added to each other and/or multiplied by scalars) in the same variables, like

\[
\begin{aligned}
y + 3z &= 3 \\
2x + y - z &= 1 \\
x + y + z &= 2.
\end{aligned}
\tag{A.1.1}
\]

Linear systems can have zero, one, or infinitely many solutions, and one particularly useful method of finding these solutions (if they exist) starts by placing the coefficients of the linear system into a rectangular array called a matrix. The rows of the matrix represent the equations in the linear system, and its columns represent the variables as well as the coefficients on the right-hand side. For example, the matrix associated with the linear system (A.1.1) is

\[
\left[\begin{array}{rrr|r}
0 & 1 & 3 & 3 \\
2 & 1 & -1 & 1 \\
1 & 1 & 1 & 2
\end{array}\right].
\tag{A.1.2}
\]

We then use a method called Gaussian elimination or row reduction, which works by using the three following elementary row operations to simplify this matrix as much as possible:

• Multiplication. Multiplying row $j$ by a non-zero scalar $c \in \mathbb{R}$, which we denote by $cR_j$.
• Swap. Swapping rows $i$ and $j$, which we denote by $R_i \leftrightarrow R_j$.
• Addition. Replacing row $i$ by (row $i$) + $c$(row $j$), which we denote by $R_i + cR_j$.

In particular, we can use these three elementary row operations to put any matrix into reduced row echelon form (RREF), which means that it has the following three properties:

• all rows consisting entirely of zeros are below the non-zero rows,
• in each non-zero row, the first non-zero entry (called the leading entry) is to the left of any leading entries below it, and
• each leading entry equals 1 and is the only non-zero entry in its column.

Any matrix with the first two of these three properties is said to be in (not-necessarily-reduced) row echelon form.

For example, we can put the matrix (A.1.2) into reduced row echelon form via the following sequence of elementary row operations:

\[
\left[\begin{array}{rrr|r}
0 & 1 & 3 & 3 \\
2 & 1 & -1 & 1 \\
1 & 1 & 1 & 2
\end{array}\right]
\xrightarrow{R_1 \leftrightarrow R_3}
\left[\begin{array}{rrr|r}
1 & 1 & 1 & 2 \\
2 & 1 & -1 & 1 \\
0 & 1 & 3 & 3
\end{array}\right]
\xrightarrow{R_2 - 2R_1}
\left[\begin{array}{rrr|r}
1 & 1 & 1 & 2 \\
0 & -1 & -3 & -3 \\
0 & 1 & 3 & 3
\end{array}\right]
\]
\[
\xrightarrow{-R_2}
\left[\begin{array}{rrr|r}
1 & 1 & 1 & 2 \\
0 & 1 & 3 & 3 \\
0 & 1 & 3 & 3
\end{array}\right]
\xrightarrow[R_3 - R_2]{R_1 - R_2}
\left[\begin{array}{rrr|r}
1 & 0 & -2 & -1 \\
0 & 1 & 3 & 3 \\
0 & 0 & 0 & 0
\end{array}\right].
\]

Every matrix can be converted into one, and only one, reduced row echelon form; however, there may be many different sequences of row operations that get there.

One of the useful features of reduced row echelon form is that the solutions of the corresponding linear system can be read off from it directly. For example, if we interpret the reduced row echelon form above as a linear system, the bottom row simply says $0x + 0y + 0z = 0$ (so we ignore it), the second row says that $y + 3z = 3$, and the top row says that $x - 2z = -1$. If we just move the "$z$" term in each of these equations over to the other side, we see that every solution of this linear system has $x = 2z - 1$ and $y = 3 - 3z$, where $z$ is arbitrary (we thus call $z$ a free variable, and $x$ and $y$ leading variables).
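The row-reduction procedure above is straightforward to mechanize. The following is a minimal Python sketch of Gaussian elimination to reduced row echelon form (our own illustration, not part of the original text; the function name `rref`, the list-of-lists representation, and the `1e-12` zero tolerance are arbitrary choices, and a production implementation would need more careful pivoting):

```python
def rref(matrix):
    """Row-reduce `matrix` (a list of lists of numbers) to its RREF.

    A bare-bones sketch of Gaussian elimination; real implementations
    need more careful pivoting and tolerance handling.
    """
    A = [[float(x) for x in row] for row in matrix]   # work on a copy
    rows, cols = len(A), len(A[0])
    pivot_row = 0
    for col in range(cols):
        # Find a row at or below `pivot_row` with a non-zero entry here.
        pivot = next((r for r in range(pivot_row, rows)
                      if abs(A[r][col]) > 1e-12), None)
        if pivot is None:
            continue                                   # no pivot in this column
        A[pivot_row], A[pivot] = A[pivot], A[pivot_row]     # Swap
        c = A[pivot_row][col]
        A[pivot_row] = [x / c for x in A[pivot_row]]        # Multiplication
        for r in range(rows):
            if r != pivot_row and abs(A[r][col]) > 1e-12:
                c = A[r][col]                               # Addition
                A[r] = [x - c * y for x, y in zip(A[r], A[pivot_row])]
        pivot_row += 1
        if pivot_row == rows:
            break
    return A

# The augmented matrix (A.1.2) of the linear system (A.1.1):
M = [[0, 1,  3, 3],
     [2, 1, -1, 1],
     [1, 1,  1, 2]]
for row in rref(M):
    print(row)   # [1.0, 0.0, -2.0, -1.0], [0.0, 1.0, 3.0, 3.0], [0.0, 0.0, 0.0, 0.0]
```

Running this on (A.1.2) reproduces the reduced row echelon form computed by hand above, even though the intermediate row operations it chooses differ from ours.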
A.1.2 Matrices as Linear Transformations

One of the central features of linear algebra is that there is a one-to-one correspondence between matrices and linear transformations. That is, every $m \times n$ matrix $A \in \mathcal{M}_{m,n}$ can be thought of as a function that sends $\mathbf{x} \in \mathbb{R}^n$ to the vector $A\mathbf{x} \in \mathbb{R}^m$. Conversely, every linear transformation $T : \mathbb{R}^n \to \mathbb{R}^m$ (i.e., every function with the property that $T(\mathbf{x} + c\mathbf{y}) = T(\mathbf{x}) + cT(\mathbf{y})$ for all $\mathbf{x}, \mathbf{y} \in \mathbb{R}^n$ and $c \in \mathbb{R}$) can be represented by a matrix: there is a unique matrix $A \in \mathcal{M}_{m,n}$ with the property that $A\mathbf{x} = T(\mathbf{x})$ for all $\mathbf{x} \in \mathbb{R}^n$. We thus think of matrices and linear transformations as the "same thing". In fact, vectors and matrices do not even need to have real entries; their entries can come from any "field" (see the upcoming Appendix A.4).

Linear transformations are special for the fact that they are determined completely by how they act on the standard basis vectors $\mathbf{e}_1, \mathbf{e}_2, \ldots, \mathbf{e}_n$, which are the vectors with all entries equal to 0, except for a single entry equal to 1 in the location indicated by the subscript (e.g., in $\mathbb{R}^3$ there are three standard basis vectors: $\mathbf{e}_1 = (1,0,0)$, $\mathbf{e}_2 = (0,1,0)$, and $\mathbf{e}_3 = (0,0,1)$). A linear transformation is also completely determined by how it acts on any other basis of $\mathbb{R}^n$. In particular, $A\mathbf{e}_1, A\mathbf{e}_2, \ldots, A\mathbf{e}_n$ are exactly the $n$ columns of $A$, and those $n$ vectors form the sides of the parallelogram/parallelepiped/hyperparallelepiped that the unit square/cube/hypercube is mapped to by $A$ (see Figure A.1). In particular, linear transformations act "uniformly" in the sense that they send a unit square/cube/hypercube grid to a parallelogram/parallelepiped/hyperparallelepiped grid without distorting any particular region of space more than other regions of space. Squares are mapped to parallelograms in $\mathbb{R}^2$, cubes are mapped to parallelepipeds in $\mathbb{R}^3$, and so on.

[Figure A.1: A matrix $A \in \mathcal{M}_2$ acts as a linear transformation on $\mathbb{R}^2$ that transforms a square grid with sides $\mathbf{e}_1$ and $\mathbf{e}_2$ into a grid made up of parallelograms with sides $A\mathbf{e}_1$ and $A\mathbf{e}_2$ (i.e., the columns of $A$). Importantly, $A$ preserves which cell of the grid each vector is in: in this case, $\mathbf{v}$ is in the 2nd square to the right and 3rd up, and $A\mathbf{v}$ is similarly in the 2nd parallelogram in the direction of $A\mathbf{e}_1$ and 3rd in the direction of $A\mathbf{e}_2$.]
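To see the matrix–transformation correspondence concretely, here is a small NumPy sketch (our own addition, not from the text; the matrix $A$ and the shear map `T` are arbitrary examples) verifying that $A\mathbf{e}_j$ is the $j$-th column of $A$, and rebuilding the matrix of a linear map from its action on the standard basis:

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [0.0, 1.0]])          # an arbitrary example matrix in M_2

e1 = np.array([1.0, 0.0])           # standard basis vectors of R^2
e2 = np.array([0.0, 1.0])

# A e_1 and A e_2 are exactly the two columns of A:
assert np.allclose(A @ e1, A[:, 0])
assert np.allclose(A @ e2, A[:, 1])

# Conversely, a linear transformation T determines its matrix: apply T
# to each standard basis vector and use the results as the columns.
def T(x):
    """An example linear map R^2 -> R^2 (a horizontal shear)."""
    return np.array([x[0] + 2.0 * x[1], x[1]])

n = 2
A_rebuilt = np.column_stack([T(np.eye(n)[:, j]) for j in range(n)])
assert np.allclose(A_rebuilt, A)    # same matrix as above
```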
A.1.3 The Inverse of a Matrix

The inverse of a square matrix $A \in \mathcal{M}_n$ is a matrix $A^{-1} \in \mathcal{M}_n$ for which $AA^{-1} = A^{-1}A = I$ (the identity matrix). In the $n = 1$ case, the inverse of a $1 \times 1$ matrix (i.e., a scalar) $a$ is just $1/a$. The inverse of a matrix is unique when it exists, but not all matrices have inverses (even in the $n = 1$ case, the scalar 0 does not have an inverse). The inverse of a matrix can be computed by using Gaussian elimination to row-reduce the block matrix $[\,A \mid I\,]$ into its reduced row echelon form $[\,I \mid A^{-1}\,]$. Furthermore, if the reduced row echelon form of $[\,A \mid I\,]$ has anything other than $I$ in the left block, then $A$ is not invertible.

A one-sided inverse of a matrix is automatically two-sided (that is, if either of the equations $AB = I$ or $BA = I$ holds, then the other necessarily holds as well, so we can deduce that $B = A^{-1}$ based on just one of the two defining equations). We show in Remark 1.2.2 that one-sided inverses are not necessarily two-sided in the infinite-dimensional case. We can get some intuition for why this fact is true by thinking geometrically: if $AB = I$ then, as linear transformations, $A$ simply undoes whatever $B$ does to $\mathbb{R}^n$, and it perhaps seems believable that $B$ similarly undoes whatever $A$ does (see Figure A.2).

[Figure A.2: As a linear transformation, $A^{-1}$ undoes what $A$ does to vectors in $\mathbb{R}^n$. That is, $A^{-1}A\mathbf{v} = \mathbf{v}$ for all $\mathbf{v} \in \mathbb{R}^n$ (and $AA^{-1}\mathbf{v} = \mathbf{v}$ too).]

Row-reducing a matrix is equivalent to multiplication on the left by an invertible matrix, in the sense that $B \in \mathcal{M}_{m,n}$ can be obtained from $A \in \mathcal{M}_{m,n}$ via a sequence of elementary row operations if and only if there is an invertible matrix $P \in \mathcal{M}_m$ such that $B = PA$. The fact that every matrix can be row-reduced to a (unique) matrix in reduced row echelon form is thus equivalent to the following fact:

For every $A \in \mathcal{M}_{m,n}$, there exists an invertible $P \in \mathcal{M}_m$ such that $A = PR$, where $R$ is the RREF of $A$.
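As a concrete illustration of this last fact (our own sketch, not from the text), the following code represents each elementary row operation used on the matrix (A.1.2) in Section A.1.1 as an elementary matrix (the result of applying that operation to the identity), accumulates them into an invertible $Q$ with $R = QA$, and then takes $P = Q^{-1}$ so that $A = PR$:

```python
import numpy as np

# Each elementary row operation is left-multiplication by an elementary
# matrix: the matrix obtained by applying that operation to the identity.
def swap(n, i, j):
    E = np.eye(n); E[[i, j]] = E[[j, i]]; return E   # R_i <-> R_j

def scale(n, i, c):
    E = np.eye(n); E[i, i] = c; return E             # cR_i

def add_row(n, i, j, c):
    E = np.eye(n); E[i, j] = c; return E             # R_i + cR_j

A = np.array([[0.0, 1.0,  3.0, 3.0],
              [2.0, 1.0, -1.0, 1.0],
              [1.0, 1.0,  1.0, 2.0]])                # the matrix (A.1.2)

# The row operations used in Section A.1.1 (0-indexed rows):
ops = [swap(3, 0, 2),          # R1 <-> R3
       add_row(3, 1, 0, -2),   # R2 - 2R1
       scale(3, 1, -1),        # -R2
       add_row(3, 0, 1, -1),   # R1 - R2
       add_row(3, 2, 1, -1)]   # R3 - R2

Q = np.eye(3)
for E in ops:
    Q = E @ Q                  # accumulate Q = E_5 E_4 E_3 E_2 E_1

R = Q @ A                      # row reduction = left-multiplication: R = QA
P = np.linalg.inv(Q)           # Q is invertible, so A = PR with P = Q^{-1}
assert np.allclose(A, P @ R)
print(R)                       # the RREF computed in Section A.1.1
```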
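Finally, the recipe described earlier in this subsection, computing $A^{-1}$ by row-reducing $[\,A \mid I\,]$ to $[\,I \mid A^{-1}\,]$, can be sketched the same way (again our own illustration with an arbitrary example matrix; in practice `np.linalg.inv` is the tool to use):

```python
import numpy as np

def inverse_via_row_reduction(A):
    """Invert A by row-reducing [ A | I ] to [ I | A^{-1} ].

    A minimal sketch of the recipe above; returns None when the left
    block cannot be reduced to I (i.e., when A is not invertible).
    """
    n = A.shape[0]
    M = np.hstack([A.astype(float), np.eye(n)])   # the block matrix [ A | I ]
    for col in range(n):
        # Swap: bring a row with a (large) non-zero pivot into position `col`.
        pivot = col + np.argmax(np.abs(M[col:, col]))
        if np.isclose(M[pivot, col], 0.0):
            return None                            # no pivot => not invertible
        M[[col, pivot]] = M[[pivot, col]]          # R_col <-> R_pivot
        M[col] /= M[col, col]                      # Multiplication: (1/c)R_col
        for r in range(n):
            if r != col:
                M[r] -= M[r, col] * M[col]         # Addition: R_r - cR_col
    return M[:, n:]                                # the right block is A^{-1}

A = np.array([[1.0, 2.0],
              [3.0, 4.0]])
A_inv = inverse_via_row_reduction(A)
assert np.allclose(A @ A_inv, np.eye(2))   # AA^{-1} = I
assert np.allclose(A_inv @ A, np.eye(2))   # A^{-1}A = I (inverses are two-sided)
```

The two asserts at the end also check, on this example, the fact that a one-sided inverse is automatically two-sided.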