Basic Concepts of Linear Algebra by Jim Carrell Department of Mathematics University of British Columbia 2 Chapter 1

Total Page:16

File Type:pdf, Size:1020Kb

Basic Concepts of Linear Algebra by Jim Carrell Department of Mathematics University of British Columbia 2 Chapter 1 1 Basic Concepts of Linear Algebra by Jim Carrell Department of Mathematics University of British Columbia 2 Chapter 1 Introductory Comments to the Student This textbook is meant to be an introduction to abstract linear algebra for first, second or third year university students who are specializing in math- ematics or a closely related discipline. We hope that parts of this text will be relevant to students of computer science and the physical sciences. While the text is written in an informal style with many elementary examples, the propositions and theorems are carefully proved, so that the student will get experince with the theorem-proof style. We have tried very hard to em- phasize the interplay between geometry and algebra, and the exercises are intended to be more challenging than routine calculations. The hope is that the student will be forced to think about the material. The text covers the geometry of Euclidean spaces, linear systems, ma- trices, fields (Q; R, C and the finite fields Fp of integers modulo a prime p), vector spaces over an arbitrary field, bases and dimension, linear transfor- mations, linear coding theory, determinants, eigen-theory, projections and pseudo-inverses, the Principal Axis Theorem for unitary matrices and ap- plications, and the diagonalization theorems for complex matrices such as the Jordan decomposition. The final chapter gives some applications of symmetric matrices positive definiteness. We also introduce the notion of a graph and study its adjacency matrix. Finally, we prove the convergence of the QR algorithm. The proof is based on the fact that the unitary group is compact. Although, most of the topics listed above are found in a standard course on linear algebra, some of the topics such as fields and linear coding theory are seldom treated in such a course. Our feeling is, however, that because 3 4 coding theory is such an important component of the gadgets we use ev- eryday, such as personal computers, CD players, modems etc., and because linear coding theory gives such a nice illustration of how the basic results of linear algebra apply, including it in a basic course is clearly appropriate. Since the vector spaces in coding theory are defined over the prime fields, the students get to see explicit situations where vector space structures which don't involve the real numbers are important. This text also improves on the standard treatment of the determinant, where either its existence in the n n case for n > 3 is simply assumed or it × is defined inductively by the Laplace expansion, an d the student is forced to believe that all Laplace expansions agree. We use the classical definition as a sum over all permutations. This allows one to give a quite easy proof of the Laplace expansion, for example. Much of this material here can be covered in a 13-15 week semester. Throughout the text, we have attempted to make the explanations clear, so that students who want to read further will be able to do so. Contents 1 Introductory Comments to the Student 3 2 Euclidean Spaces and Their Geometry 11 2.1 Rn and the Inner Product. 11 2.1.1 Vectors and n-tuples . 11 2.1.2 Coordinates . 12 2.1.3 The Vector Space Rn . 14 2.1.4 The dot product . 15 2.1.5 Orthogonality and projections . 16 2.1.6 The Cauchy-Schwartz Inequality and Cosines . 19 2.1.7 Examples . 21 2.2 Lines and planes. 24 2.2.1 Lines in Rn . 24 2.2.2 Planes in R3 . 25 2.2.3 The distance from a point to a plane . 26 2.3 The Cross Product . 31 2.3.1 The Basic Definition . 31 2.3.2 Some further properties . 32 2.3.3 Examples and applications . 33 3 Linear Equations and Matrices 37 3.1 Linear equations: the beginning of algebra . 37 3.1.1 The Coefficient Matrix . 39 3.1.2 Gaussian reduction . 40 3.1.3 Elementary row operations . 41 3.2 Solving Linear Systems . 42 3.2.1 Equivalent Systems . 42 3.2.2 The Homogeneous Case . 43 3.2.3 The Non-homogeneous Case . 45 5 6 3.2.4 Criteria for Consistency and Uniqueness . 47 3.3 Matrix Algebra . 50 3.3.1 Matrix Addition and Multiplication . 50 3.3.2 Matrices Over F2: Lorenz Codes and Scanners . 51 3.3.3 Matrix Multiplication . 53 3.3.4 The Transpose of a Matrix . 54 3.3.5 The Algebraic Laws . 55 3.4 Elementary Matrices and Row Operations . 58 3.4.1 Application to Linear Systems . 60 3.5 Matrix Inverses . 63 3.5.1 A Necessary and Sufficient for Existence . 63 3.5.2 Methods for Finding Inverses . 65 3.5.3 Matrix Groups . 67 3.6 The LP DU Decomposition . 73 3.6.1 The Basic Ingredients: L; P; D and U . 73 3.6.2 The Main Result . 75 3.6.3 The Case P = In . 77 3.6.4 The symmetric LDU decomposition . 78 3.7 Summary . 84 4 Fields and vector spaces 85 4.1 Elementary Properties of Fields . 85 4.1.1 The Definition of a Field . 85 4.1.2 Examples . 88 4.1.3 An Algebraic Number Field . 89 4.1.4 The Integers Modulo p . 90 4.1.5 The characteristic of a field . 93 4.1.6 Polynomials . 94 4.2 The Field of Complex Numbers . 97 4.2.1 The Definition . 97 4.2.2 The Geometry of C . 99 4.3 Vector spaces . 102 4.3.1 The notion of a vector space . 102 4.3.2 Inner product spaces . 105 4.3.3 Subspaces and Spanning Sets . 107 4.3.4 Linear Systems and Matrices Over an Arbitrary Field 108 4.4 Summary . 112 7 5 The Theory of Finite Dimensional Vector Spaces 113 5.1 Some Basic concepts . 113 5.1.1 The Intuitive Notion of Dimension . 113 5.1.2 Linear Independence . 114 5.1.3 The Definition of a Basis . 116 5.2 Bases and Dimension . 119 5.2.1 The Definition of Dimension . 119 5.2.2 Some Examples . 120 5.2.3 The Dimension Theorem . 121 5.2.4 Some Applications and Further Properties . 123 5.2.5 Extracting a Basis Constructively . 124 5.2.6 The Row Space of A and the Rank of AT . 125 5.3 Some General Constructions of Vector Spaces . 130 5.3.1 Intersections . 130 5.3.2 External and Internal Sums . 130 5.3.3 The Hausdorff Interesction Formula . 131 5.3.4 Internal Direct Sums . 133 5.4 Vector Space Quotients . 135 5.4.1 Equivalence Relations . 135 5.4.2 Cosets . 136 5.5 Summary . 139 6 Linear Coding Theory 141 6.1 Introduction . 141 6.2 Linear Codes . 142 6.2.1 The Notion of a Code . 142 6.2.2 The International Standard Book Number . 144 6.3 Error detecting and correcting codes . 145 6.3.1 Hamming Distance . 145 6.3.2 The Main Result . 147 6.3.3 Perfect Codes . 148 6.3.4 A Basic Problem . 149 6.3.5 Linear Codes Defined by Generating Matrices . 150 6.4 Hadamard matrices (optional) . 153 6.4.1 Hadamard Matrices . 153 6.4.2 Hadamard Codes . 154 6.5 The Standard Decoding Table, Cosets and Syndromes . 155 6.5.1 The Nearest Neighbour Decoding Scheme . 155 6.5.2 Cosets . 156 6.5.3 Syndromes . 157 8 6.6 Perfect linear codes . 161 6.6.1 Testing for perfection . 162 6.6.2 The hat problem . 163 7 Linear Transformations 167 7.1 Definitions and examples . 167 7.1.1 The Definition of a Linear Transformation . 168 7.1.2 Some Examples . 168 7.1.3 The Algebra of Linear Transformations . 170 7.2 Matrix Transformations and Multiplication . 172 7.2.1 Matrix Linear Transformations . 172 7.2.2 Composition and Multiplication . 173 7.2.3 An Example: Rotations of R2 . 174 7.3 Some Geometry of Linear Transformations on Rn . 177 7.3.1 Transformations on the Plane . 177 7.3.2 Orthogonal Transformations . 178 7.3.3 Gradients and differentials . 181 7.4 Matrices With Respect to an Arbitrary Basis . 184 7.4.1 Coordinates With Respect to a Basis . 184 7.4.2 Change of Basis for Linear Transformations . 187 7.5 Further Results on Linear Transformations . 190 7.5.1 An Existence Theorem . 190 7.5.2 The Kernel and Image of a Linear Transformation . 191 7.5.3 Vector Space Isomorphisms . 192 7.6 Summary . 197 8 An Introduction to the Theory of Determinants 199 8.1 Introduction . 199 8.2 The Definition of the Determinant . 199 8.2.1 Some comments . 199 8.2.2 The 2 2 case . 200 × 8.2.3 Some Combinatorial Preliminaries . 200 8.2.4 Permutations and Permutation Matrices . 203 8.2.5 The General Definition of det(A) . 204 8.2.6 The Determinant of a Permutation Matrix . 205 8.3 Determinants and Row Operations . 207 8.3.1 The Main Result . 208 8.3.2 Properties and consequences . 211 8.3.3 The determinant of a linear transformation . 214 8.3.4 The Laplace Expansion . 215 9 8.3.5 Cramer's rule . 217 8.4 Geometric Applications of the Determinant . 220 8.4.1 Cross and vector products . 220 8.4.2 Determinants and volumes . 220 8.4.3 Change of variables formula . ..
Recommended publications
  • Chapter 8. Eigenvalues: Further Applications and Computations
    8.1 Diagonalization of Quadratic Forms 1 Chapter 8. Eigenvalues: Further Applications and Computations 8.1. Diagonalization of Quadratic Forms Note. In this section we define a quadratic form and relate it to a vector and ma- trix product. We define diagonalization of a quadratic form and give an algorithm to diagonalize a quadratic form. The fact that every quadratic form can be diago- nalized (using an orthogonal matrix) is claimed by the “Principal Axis Theorem” (Theorem 8.1). It is applied in Section 8.2 to address quadratic surfaces. Definition. A quadratic form f(~x) = f(x1, x2,...,xn) in n variables is a polyno- mial that can be written as n f(~x)= u x x X ij i j i,j=1; i≤j where not all uij are zero. Note. For n = 2, a quadratic form is f(x, y)= ax2 + bxy + xy2. The graph of such an equation in the xy-plane is a conic section (ellipse, hyperbola, or parabola). For n = 3, we have 2 2 2 f(x1, x2, x3)= u11x1 + u12x1x2 + u13x1x3 + u22x2 + u23x2x3 + u33x2. We’ll see in the next section that the graph of f(x1, x2, x3) is a quadric surface. We 8.1 Diagonalization of Quadratic Forms 2 can easily verify that u11 u12 u13 x1 [f(x1, x2, x3)]= [x1, x2, x3] 0 u22 u23 x2 . 0 0 u33 x3 Definition. In the quadratic form ~xT U~x, the matrix U is the upper-triangular coefficient matrix of the quadratic form. Example. Page 417 Number 4(a).
    [Show full text]
  • Chapter 6 Eigenvalues and Eigenvectors
    Chapter 6 Eigenvalues and Eigenvectors Po-Ning Chen, Professor Department of Electrical and Computer Engineering National Chiao Tung University Hsin Chu, Taiwan 30010, R.O.C. 6.1 Introduction to eigenvalues 6-1 Motivations • The static system problem of Ax = b has now been solved, e.g., by Gauss- Jordan method or Cramer’s rule. • However, a dynamic system problem such as Ax = λx cannot be solved by the static system method. • To solve the dynamic system problem, we need to find the static feature of A that is “unchanged” with the mapping A. In other words, Ax maps to itself with possibly some stretching (λ>1), shrinking (0 <λ<1), or being reversed (λ<0). • These invariant characteristics of A are the eigenvalues and eigenvectors. Ax maps a vector x to its column space C(A). We are looking for a v ∈ C(A) such that Av aligns with v. The collection of all such vectors is the set of eigen- vectors. 6.1 Eigenvalues and eigenvectors 6-2 Conception (Eigenvalues and eigenvectors): The eigenvalue-eigenvector pair (λi, vi) of a square matrix A satisfies Avi = λivi (or equivalently (A − λiI)vi = 0) where vi = 0 (but λi can be zero), where v1, v2,..., are linearly independent. How to derive eigenvalues and eigenvectors? • For 3 × 3 matrix A, we can obtain 3 eigenvalues λ1,λ2,λ3 by solving 3 2 det(A − λI)=−λ + c2λ + c1λ + c0 =0. • If all λ1,λ2,λ3 are unequal, then we continue to derive: Av1 = λ1v1 (A − λ1I)v1 = 0 v1 =theonly basis of the nullspace of (A − λ1I) Av λ v ⇔ A − λ I v ⇔ v A − λ I 2 = 2 2 ( 2 ) 2 = 0 2 =theonly basis of the nullspace of ( 2 ) Av3 = λ3v3 (A − λ3I)v3 = 0 v3 =theonly basis of the nullspace of (A − λ3I) and the resulting v1, v2 and v3 are linearly independent to each other.
    [Show full text]
  • Principal Component Analysis
    02.01.2019 Prncpal component analyss - Wkpeda Principal component analysis Principal component analysis (PCA) is a statistical procedure that uses an orthogonal transformation to convert a set of observations of possibly correlated variables (entities each of which takes on various numerical values) into a set of values of linearly uncorrelated variables called principal components. If there are observations with variables, then the number of distinct principal components is . This transformation is defined in such a way that the first principal component has the largest possible variance (that is, accounts for as much of the variability in the data as possible), and each succeeding component in turn has the highest variance possible under the constraint that it is orthogonal to the preceding components. The resulting vectors (each being a linear combination of the variables and containing n PCA of a multivariate Gaussian observations) are an uncorrelated orthogonal basis set. PCA is sensitive to distribution centered at (1,3) with a the relative scaling of the original variables. standard deviation of 3 in roughly the (0.866, 0.5) direction and of 1 in PCA was invented in 1901 by Karl Pearson,[1] as an analogue of the the orthogonal direction. The principal axis theorem in mechanics; it was later independently developed vectors shown are the eigenvectors of the covariance matrix scaled by and named by Harold Hotelling in the 1930s.[2] Depending on the field of the square root of the corresponding application, it is also named the discrete Karhunen–Loève transform eigenvalue, and shifted so their tails (KLT) in signal processing, the Hotelling transform in multivariate quality are at the mean.
    [Show full text]
  • Chapter 4 Symmetric Matrices and the Second Derivative Test
    Symmetric matrices and the second derivative test 1 Chapter 4 Symmetric matrices and the second derivative test In this chapter we are going to ¯nish our description of the nature of nondegenerate critical points. But ¯rst we need to discuss some fascinating and important features of square matrices. A. Eigenvalues and eigenvectors Suppose that A = (aij) is a ¯xed n £ n matrix. We are going to discuss linear equations of the form Ax = ¸x; where x 2 Rn and ¸ 2 R. (We sometimes will allow x 2 Cn and ¸ 2 C.) Of course, x = 0 is always a solution of this equation, but not an interesting one. We say x is a nontrivial solution if it satis¯es the equation and x 6= 0. DEFINITION. If Ax = ¸x and x 6= 0, we say that ¸ is an eigenvalue of A and that the vector x is an eigenvector of A corresponding to ¸. µ ¶ 0 3 EXAMPLE. Let A = . Then we notice that 1 2 µ ¶ µ ¶ µ ¶ 1 3 1 A = = 3 ; 1 3 1 µ ¶ 1 so is an eigenvector corresponding to the eigenvalue 3. Also, 1 µ ¶ µ ¶ µ ¶ 3 ¡3 3 A = = ¡ ; ¡1 1 ¡1 µ ¶ 3 so is an eigenvector corresponding to the eigenvalue ¡1. ¡1 µ ¶ µ ¶ µ ¶ µ ¶ 2 1 1 1 1 EXAMPLE. Let A = . Then A = 2 , so 2 is an eigenvalue, and A = µ ¶ 0 0 0 0 ¡2 0 , so 0 is also an eigenvalue. 0 REMARK. The German word for eigenvalue is eigenwert. A literal translation into English would be \characteristic value," and this phrase appears in a few texts.
    [Show full text]
  • Geometric Number Theory Lenny Fukshansky
    Geometric Number Theory Lenny Fukshansky Minkowki's creation of the geometry of numbers was likened to the story of Saul, who set out to look for his father's asses and discovered a Kingdom. J. V. Armitage Contents Chapter 1. Geometry of Numbers 1 1.1. Introduction 1 1.2. Lattices 2 1.3. Theorems of Blichfeldt and Minkowski 10 1.4. Successive minima 13 1.5. Inhomogeneous minimum 18 1.6. Problems 21 Chapter 2. Discrete Optimization Problems 23 2.1. Sphere packing, covering and kissing number problems 23 2.2. Lattice packings in dimension 2 29 2.3. Algorithmic problems on lattices 34 2.4. Problems 38 Chapter 3. Quadratic forms 39 3.1. Introduction to quadratic forms 39 3.2. Minkowski's reduction 46 3.3. Sums of squares 49 3.4. Problems 53 Chapter 4. Diophantine Approximation 55 4.1. Real and rational numbers 55 4.2. Algebraic and transcendental numbers 57 4.3. Dirichlet's Theorem 61 4.4. Liouville's theorem and construction of a transcendental number 65 4.5. Roth's theorem 67 4.6. Continued fractions 70 4.7. Kronecker's theorem 76 4.8. Problems 79 Chapter 5. Algebraic Number Theory 82 5.1. Some field theory 82 5.2. Number fields and rings of integers 88 5.3. Noetherian rings and factorization 97 5.4. Norm, trace, discriminant 101 5.5. Fractional ideals 105 5.6. Further properties of ideals 109 5.7. Minkowski embedding 113 5.8. The class group 116 5.9. Dirichlet's unit theorem 119 v vi CONTENTS 5.10.
    [Show full text]
  • The Principal Axis Theorem and Sylvester's Law of Inertia
    The principal axis theorem and Sylvester's law of inertia Jordan Bell [email protected] Department of Mathematics, University of Toronto April 3, 2014 The principal axis theorem is the statement that for any quadratic form on Rn there are coordinates in which the quadratic form is diagonal. Let Q(x) be a quadratic form on Rn. Then for some n × n real symmetric matrix A we have Q(x) = xT Ax for all x 2 Rn. If A is an n × n real symmetric matrix then by the spectral theorem there is an orthonormal basis of Rn each element of which is an eigenvector of A. Let λ1; : : : ; λn be the eigenvalues of A and let p1; : : : ; pn be the corresponding basis. Take P to be a matrix whose jth column is pj. Since the eigenvectors −1 T are orthonormal, it follows that P = P . Letting D = diag(λ1; : : : ; λn), we see that A = P DP T . We now apply the result of the spectral theorem to the quadratic form Q(x), getting Q(x) = xT P DP T x = (P T x)T D(P T x); which can be written as n T X 2 Q(P y) = y Dy = λjyj : j=1 T The coordinate vector of x with respect to the basis p1; : : : ; pn is y = P x, and Pn 2 we can write Q(x) in terms of y as j=1 λjyj . Sylvester's law of inertia tells us that if T T T Q(x) = (R x) diag(µ1; : : : ; µn)(R x) for some matrix R then jfµj : µj > 0gj = jfλj : λj > 0gj, jfµj : µj = 0gj = jfλj : λj = 0gj, and jfµj : µj < 0gj = jfλj : λj < 0gj.
    [Show full text]
  • Principal Component Analysis (Pca)
    A SURVEY: PRINCIPAL COMPONENT ANALYSIS (PCA) Sumanta Saha1, Sharmistha Bhattacharya (Halder)2 1Department of IT, Tripura University, Agartala,(India) 2Department of Mathematics, Tripura University, Agartala, (India) ABSTRACT Principal component analysis (PCA) is one of the most widely used multivariate techniques in statistics. It is commonly used to reduce the dimensionality of data in order to examine its underlying structure and the covariance/correlation structure of a set of variables. It is a statistical procedure that uses an orthogonal transformation to convert a set of observations of possibly correlated variables into a set of values of linearly uncorrelated variables called principal components. The number of principal components is less than or equal to the number of original variables. It is a way of identifying patterns in data, and expressing the data in such a way as to highlight their similarities and differences. This is the survey paper of the existing theory and techniques for PCA with example. Keywords: Correlation structure, Covariance structure, Multivariate Technique, Orthogonal Transformation, Principal Component. I. INTRODUCTION OF PRINCIPAL COMPONENTS ANALYSIS (PCA) Principal component analysis (PCA) is a statistical procedure that uses orthogonal transformation to convert a set of observations of possibly correlated variables into a set of values of linearly uncorrelated variables are called principal components. The number of principal components are less than or equal to the number of original variables. This transformation is defined in such a way that the first principal component has the largest possible variance (that is, accounts for as much of the variability in the data as possible), and each succeeding component in turn has the highest variance possible under the constraint that it be orthogonal to (i.e., uncorrelated with) the preceding components [3].
    [Show full text]
  • The Principal Axis Theorem for Holomorphic Functions
    PROCEEDINGS OF THE AMERICAN MATHEMATICAL SOCIETY Volume 128, Number 2, Pages 325{335 S 0002-9939(99)05451-9 Article electronically published on September 27, 1999 THE PRINCIPAL AXIS THEOREM FOR HOLOMORPHIC FUNCTIONS JOACHIM GRATER¨ AND MARKUS KLEIN (Communicated by Steven R. Bell) Abstract. An algebraic approach to Rellich’s theorem is given which states that any analytic family of matrices which is normal on the real axis can be diagonalized by an analytic family of matrices which is unitary on the real axis. We show that this result is a special version of a purely algebraic theorem on the diagonalization of matrices over fields with henselian valuations. 1. Introduction This paper aims at reviewing some more or less well known material from analysis and algebra which hopefully is put into some new perspective (at least it was new for the authors) by systematically exploring the relations between algebra and analysis. It is motivated by a well known theorem of analytic perturbation theory due to Rellich [Re]: Near z = 0, the eigenvalues of an analytic family A(z) of matrices which is hermitian (resp. normal) for real values of z can be chosen to be analytic, even in the case of eigenvalue degeneracies. This theorem is presented in several textbooks (see [Ba, Ka, RS]) where it is proven by use of a rather strong algebraic result: The branching behaviour of the eigenvalues of any analytic family of matrices is explicitly given in terms of Puiseux series. The proof of Rellich’s result then boils down to the observation that hermiticity (or normality) of the family excludes branching.
    [Show full text]
  • Differential Geometry Notes
    Mathematics 433/533; Class Notes Richard Koch March 24, 2005 ii Contents Preface vii 1 Curves 1 1.1 Introduction .................................... 1 1.2 Arc Length .................................... 2 1.3 Reparameterization ................................ 3 1.4 Curvature ..................................... 5 1.5 The Moving Frame ................................ 7 1.6 The Frenet-Serret Formulas ........................... 8 1.7 The Goal of Curve Theory ............................ 12 1.8 The Fundamental Theorem ........................... 12 1.9 Computers .................................... 14 1.10 The Possibility that κ = 0 ............................ 16 1.11 Formulas for γ(t) ................................. 20 1.12 Higher Dimensions ................................ 21 2 Surfaces 25 2.1 Parameterized Surfaces .............................. 25 2.2 Tangent Vectors ................................. 28 2.3 Directional Derivatives .............................. 31 2.4 Vector Fields ................................... 32 2.5 The Lie Bracket .................................. 32 2.6 The Metric Tensor ................................ 34 2.7 Geodesics ..................................... 38 2.8 Energy ....................................... 40 2.9 The Geodesic Equation ............................. 42 2.10 An Example .................................... 46 2.11 Geodesics on a Sphere .............................. 49 2.12 Geodesics on a Surface of Revolution ...................... 50 2.13 Geodesics on a Poincare Disk .........................
    [Show full text]
  • Orthogonality
    Part VII Orthogonality 547 Section 31 Orthogonal Diagonalization Focus Questions By the end of this section, you should be able to give precise and thorough answers to the questions listed below. You may want to keep these questions in mind to focus your thoughts as you complete the section. What does it mean for a matrix to be orthogonally diagonalizable and why • is this concept important? What is a symmetric matrix and what important property related to diago- • nalization does a symmetric matrix have? What is the spectrum of a matrix? • Application: The Multivariable Second Derivative Test In single variable calculus, we learn that the second derivative can be used to classify a critical point of the type where the derivative of a function is 0 as a local maximum or minimum. Theorem 31.1 (The Second Derivative Test for Single-Variable Functions). If a is a critical number of a function f so that f 0(a) = 0 and if f 00(a) exists, then if f (a) < 0, then f(a) is a local maximum value of f, • 00 if f (a) > 0, then f(a) is a local minimum value of f, and • 00 if f (a) = 0, this test yields no information. • 00 In the two-variable case we have an analogous test, which is usually seen in a multivariable calculus course. Theorem 31.2 (The Second Derivative Test for Functions of Two Variables). Suppose (a; b) is a critical point of the function f for which fx(a; b) = 0 and fy(a; b) = 0.
    [Show full text]
  • Topics in Matrix Theory Singular Value Decomposition
    Topics in Matrix Theory Singular Value Decomposition Tang Chiu Chak The University of Hong Kong 3rd July, 2014 Contents 1 Introduction 2 2 Review on linear algebra 2 2.1 Definition . .2 2.1.1 Operation for Matrices . .2 2.1.2 Notation and terminology . .3 2.2 Some basic results . .5 3 Singular value Decomposition 6 3.1 Schur's Theorem and Spectral Theorem . .6 3.2 Unitarily diagonalizable . .7 3.3 Singular value decomposition . .9 3.4 Brief disscussion on SVD . 10 3.4.1 Singular vector . 10 3.4.2 Reduced SVD . 11 3.4.3 Pseudoinverse . 11 4 Reference 12 1 1 Introduction In linear algebra, we have come across some useful theories related to matrices. For instance, eigenvalues of squared matrices, which in some sense translate com- plicated matrix multiplication to simple scalar multiplication and provide us a way to factorise some matrices into a nice form. However, most of the discussions in the courses are restricted in squared matrix with real entries. Will the theorems become different if we allow complex matirces? Can we extend the concept of eigenvalue to rectangular matrices? In response to these question, here we would like to first have a quick review on what we learned in linear algebra (but in the contest of complex matrices) and then study upon the idea of singular value{the "eigenvalue" for non-squared matrices. 2 Review on linear algebra 2.1 Definition An m by n matrix is a rectantgular array with m rows and n columns (and hence mn entries in total) and sometimes we will entriewise write an m by n ma- trix whose (i; j)-entry is aij as [aij]m×n.
    [Show full text]
  • Linear Algebra
    Linear Algebra Chapter 8: Eigenvalues: Further Applications and Computations Section 8.2. Applications to Geometry—Proofs of Theorems May 1, 2018 () Linear Algebra May 1, 2018 1 / 8 Table of contents 1 Theorem 8.2. Classification of Second-Degree Plane Curves 2 Page 430 Number 16 () Linear Algebra May 1, 2018 2 / 8 Theorem 8.2. Classification of Second-Degree Plane Curves Theorem 8.2 Theorem 8.2. Classification of Second-Degree Plane Curves. Every equation of the form ax2 + bxy + xy 2 + dx + ey + f = 0 for a, b, c not all zero can be reduced to an equation of the form 2 2 λ1t1 + λ2t + gt1 + ht2 + k = 0 by means of an orthogonal substitution corresponding to a rotation of the plane. The coefficients λ1 and λ2 in the second equation are the eigenvalues of the symmetric coefficient matrix of the quadratic-form portion of the first equation. The curve describes a (possibly degenerate or empty) ellipse if λ1λ2 > 0 hyperbola if λ1λ2 < 0 parabola if λ1λ2 = 0. () Linear Algebra May 1, 2018 3 / 8 So the original equation can be reduced to the equation 2 2 λ1t1 + λ2t2 + gt1 + ht2 + k = 0. Theorem 8.2. Classification of Second-Degree Plane Curves Theorem 8.2 (continued 1) Proof. By Theorem 8.1, “Principal Axis Theorem,” there is a 2 × 2 x t orthogonal matrix C with det(C) = 1 such that = C 1 and y t2 2 2 2 2 ax + bxy + cy = λ1t1 + λ2t2 where λ1 and λ2 are eigenvalues of the symmetric coefficient matrix of the quadratic form ax2 + bxy + xy 2.
    [Show full text]