Optimization Algorithms on Matrix Manifolds

Total Pages: 16

File Type: PDF, Size: 1020 KB

Optimization Algorithms on Matrix Manifolds

P.-A. Absil, Robert Mahony, Rodolphe Sepulchre
Princeton University Press, Princeton and Oxford

Copyright © 2008 by Princeton University Press. Published by Princeton University Press, 41 William Street, Princeton, New Jersey 08540. In the United Kingdom: Princeton University Press, 3 Market Place, Woodstock, Oxfordshire OX20 1SY. All Rights Reserved.
ISBN-13: 978-0-691-13298-3
ISBN-10: 0-691-13298-4
British Library Cataloging-in-Publication Data is available. This book has been composed in Computer Modern in LaTeX. The publisher would like to acknowledge the authors of this volume for providing the camera-ready copy from which this book was printed. Printed on acid-free paper. press.princeton.edu. Printed in the United States of America.

To our parents

Contents

Foreword, by Paul Van Dooren
Notation conventions

1. Introduction

2. Motivation and Applications
   2.1 A case study: the eigenvalue problem
       2.1.1 The eigenvalue problem as an optimization problem
       2.1.2 Some benefits of an optimization framework
   2.2 Research problems
       2.2.1 Singular value problem
       2.2.2 Matrix approximations
       2.2.3 Independent component analysis
       2.2.4 Pose estimation and motion recovery
   2.3 Notes and references

3. Matrix Manifolds: First-Order Geometry
   3.1 Manifolds
       3.1.1 Definitions: charts, atlases, manifolds
       3.1.2 The topology of a manifold*
       3.1.3 How to recognize a manifold
       3.1.4 Vector spaces as manifolds
       3.1.5 The manifolds R^{n×p} and R_*^{n×p}
       3.1.6 Product manifolds
   3.2 Differentiable functions
       3.2.1 Immersions and submersions
   3.3 Embedded submanifolds
       3.3.1 General theory
       3.3.2 The Stiefel manifold
   3.4 Quotient manifolds
       3.4.1 Theory of quotient manifolds
       3.4.2 Functions on quotient manifolds
       3.4.3 The real projective space RP^{n−1}
       3.4.4 The Grassmann manifold Grass(p, n)
   3.5 Tangent vectors and differential maps
       3.5.1 Tangent vectors
       3.5.2 Tangent vectors to a vector space
       3.5.3 Tangent bundle
       3.5.4 Vector fields
       3.5.5 Tangent vectors as derivations*
       3.5.6 Differential of a mapping
       3.5.7 Tangent vectors to embedded submanifolds
       3.5.8 Tangent vectors to quotient manifolds
   3.6 Riemannian metric, distance, and gradients
       3.6.1 Riemannian submanifolds
       3.6.2 Riemannian quotient manifolds
   3.7 Notes and references

4. Line-Search Algorithms on Manifolds
   4.1 Retractions
       4.1.1 Retractions on embedded submanifolds
       4.1.2 Retractions on quotient manifolds
       4.1.3 Retractions and local coordinates*
   4.2 Line-search methods
   4.3 Convergence analysis
       4.3.1 Convergence on manifolds
       4.3.2 A topological curiosity*
       4.3.3 Convergence of line-search methods
   4.4 Stability of fixed points
   4.5 Speed of convergence
       4.5.1 Order of convergence
       4.5.2 Rate of convergence of line-search methods*
   4.6 Rayleigh quotient minimization on the sphere
       4.6.1 Cost function and gradient calculation
       4.6.2 Critical points of the Rayleigh quotient
       4.6.3 Armijo line search
       4.6.4 Exact line search
       4.6.5 Accelerated line search: locally optimal conjugate gradient
       4.6.6 Links with the power method and inverse iteration
   4.7 Refining eigenvector estimates
   4.8 Brockett cost function on the Stiefel manifold
       4.8.1 Cost function and search direction
       4.8.2 Critical points
   4.9 Rayleigh quotient minimization on the Grassmann manifold
       4.9.1 Cost function and gradient calculation
       4.9.2 Line-search algorithm
   4.10 Notes and references

5. Matrix Manifolds: Second-Order Geometry
   5.1 Newton's method in R^n
   5.2 Affine connections
   5.3 Riemannian connection
       5.3.1 Symmetric connections
       5.3.2 Definition of the Riemannian connection
       5.3.3 Riemannian connection on Riemannian submanifolds
       5.3.4 Riemannian connection on quotient manifolds
   5.4 Geodesics, exponential mapping, and parallel translation
   5.5 Riemannian Hessian operator
   5.6 Second covariant derivative*
   5.7 Notes and references

6. Newton's Method
   6.1 Newton's method on manifolds
   6.2 Riemannian Newton method for real-valued functions
   6.3 Local convergence
       6.3.1 Calculus approach to local convergence analysis
   6.4 Rayleigh quotient algorithms
       6.4.1 Rayleigh quotient on the sphere
       6.4.2 Rayleigh quotient on the Grassmann manifold
       6.4.3 Generalized eigenvalue problem
       6.4.4 The nonsymmetric eigenvalue problem
       6.4.5 Newton with subspace acceleration: the Jacobi-Davidson approach
   6.5 Analysis of Rayleigh quotient algorithms
       6.5.1 Convergence analysis
       6.5.2 Numerical implementation
   6.6 Notes and references

7. Trust-Region Methods
   7.1 Models
       7.1.1 Models in R^n
       7.1.2 Models in general Euclidean spaces
       7.1.3 Models on Riemannian manifolds
   7.2 Trust-region methods
       7.2.1 Trust-region methods in R^n
       7.2.2 Trust-region methods on Riemannian manifolds
   7.3 Computing a trust-region step
       7.3.1 Computing a nearly exact solution
       7.3.2 Improving on the Cauchy point
   7.4 Convergence analysis
       7.4.1 Global convergence
       7.4.2 Local convergence
       7.4.3 Discussion
   7.5 Applications
       7.5.1 Checklist
       7.5.2 Symmetric eigenvalue decomposition
       7.5.3 Computing an extreme eigenspace
   7.6 Notes and references

8. A Constellation of Superlinear Algorithms
   8.1 Vector transport
       8.1.1 Vector transport and affine connections
       8.1.2 Vector transport by differentiated retraction
       8.1.3 Vector transport on Riemannian submanifolds
       8.1.4 Vector transport on quotient manifolds
   8.2 Approximate Newton methods
       8.2.1 Finite difference approximations
       8.2.2 Secant methods
   8.3 Conjugate gradients
       8.3.1 Application: Rayleigh quotient minimization
   8.4 Least-square methods
       8.4.1 Gauss-Newton methods
       8.4.2 Levenberg-Marquardt methods
   8.5 Notes and references

A. Elements of Linear Algebra, Topology, and Calculus
   A.1 Linear algebra
   A.2 Topology
   A.3 Functions
   A.4 Asymptotic notation
   A.5 Derivatives
   A.6 Taylor's formula

Bibliography
Index

Foreword

Constrained optimization is quite well established as an area of research, and there exist several powerful techniques that address general problems in that area. In this book a special class of constraints is considered, called geometric constraints, which express that the solution of the optimization problem lies on a manifold. This is a recent area of research that provides powerful alternatives to the more general constrained optimization methods. Classical constrained optimization techniques work in an embedded space that can be of a much larger dimension than that of the manifold. Optimization algorithms that work on the manifold have therefore a lower complexity and quite often also have better numerical properties (see, e.g., the numerical integration schemes that preserve invariants such as energy). The authors refer to this as unconstrained optimization in a constrained search space.

The idea that one can describe difference or differential equations whose solution lies on a manifold originated in the work of Brockett, Flaschka, and Rutishauser. They described, for example, isospectral flows that yield time-varying matrices which are all similar to each other and eventually converge to diagonal matrices of ordered eigenvalues. These ideas did not get as much attention in the numerical linear algebra community as in the area of dynamical systems because the resulting difference and differential equations did not lead immediately to efficient algorithmic implementations. An important book synthesizing several of these ideas is Optimization and Dynamical Systems (Springer, 1994), by Helmke and Moore, which focuses on dynamical systems related to gradient flows that converge exponentially to a stationary point that is the solution of some optimization problem. The corresponding discrete-time version of this algorithm would then have linear convergence, which seldom compares favorably with state-of-the-art eigenvalue solvers.

The formulation of higher-order optimization methods on manifolds grew out of these ideas. Some of the people that applied these techniques to basic linear algebra problems include Absil, Arias, Chu, Dehaene, Edelman, Elden, Gallivan, Helmke, Hüper, Lippert, Mahony, Manton, Moore, Sepulchre, Smith, and Van Dooren. It is interesting to see, on the other hand, that several basic ideas in this area were also proposed by Luenberger and Gabay in the optimization literature in the early 1980s, and this without any use of dynamical systems.
Recommended publications
  • The Grassmann Manifold
The Grassmann Manifold

1. For vector spaces V and W denote by L(V, W) the vector space of linear maps from V to W. Thus L(R^k, R^n) may be identified with the space R^{n×k} of n × k matrices. An injective linear map u : R^k → V is called a k-frame in V. The set

    GF_{k,n} = {u ∈ L(R^k, R^n) : rank(u) = k}

of k-frames in R^n is called the Stiefel manifold. Note that the special case k = n is the general linear group:

    GL_k = {a ∈ L(R^k, R^k) : det(a) ≠ 0}.

The set of all k-dimensional (vector) subspaces λ ⊂ R^n is called the Grassmann manifold of k-planes in R^n and denoted by GR_{k,n}, or sometimes GR_{k,n}(R) or GR_k(R^n). Let

    π : GF_{k,n} → GR_{k,n},   π(u) = u(R^k)

denote the map which assigns to each k-frame u the subspace u(R^k) it spans. For λ ∈ GR_{k,n} the fiber (preimage) π⁻¹(λ) consists of those k-frames which form a basis for the subspace λ, i.e. for any u ∈ π⁻¹(λ) we have

    π⁻¹(λ) = {u ∘ a : a ∈ GL_k}.

Hence we can (and will) view GR_{k,n} as the orbit space of the group action

    GF_{k,n} × GL_k → GF_{k,n} : (u, a) ↦ u ∘ a.

The exercises below will prove the following

Theorem 2. The Stiefel manifold GF_{k,n} is an open subset of the set R^{n×k} of all n × k matrices. There is a unique differentiable structure on the Grassmann manifold GR_{k,n} such that the map π is a submersion.
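To make the frame/subspace distinction concrete, here is a small NumPy sketch (my own illustration, not part of the excerpt): it represents a k-frame as an n × k matrix and checks that u and u ∘ a span the same k-plane whenever a is invertible, i.e. that they lie in the same fiber of π.

```python
import numpy as np

def random_k_frame(n, k, rng=np.random.default_rng(0)):
    """Draw a random k-frame in R^n: an injective linear map R^k -> R^n,
    represented as an n-by-k matrix of full column rank."""
    u = rng.standard_normal((n, k))
    assert np.linalg.matrix_rank(u) == k   # generically true for Gaussian entries
    return u

def same_subspace(u, v, tol=1e-10):
    """Check whether two k-frames span the same k-plane, i.e. whether they map
    to the same point of the Grassmann manifold under pi."""
    qu, _ = np.linalg.qr(u)
    qv, _ = np.linalg.qr(v)
    # Compare the orthogonal projectors onto the two column spaces.
    return np.linalg.norm(qu @ qu.T - qv @ qv.T) < tol

n, k = 5, 2
u = random_k_frame(n, k)
a = np.array([[2.0, 1.0], [0.0, -3.0]])   # an element of GL_k (invertible)
print(same_subspace(u, u @ a))            # True: u and u∘a lie in the same fiber of pi
print(same_subspace(u, random_k_frame(n, k, np.random.default_rng(1))))  # almost surely False
```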
  • Cheap Orthogonal Constraints in Neural Networks: a Simple Parametrization of the Orthogonal and Unitary Group
Cheap Orthogonal Constraints in Neural Networks: A Simple Parametrization of the Orthogonal and Unitary Group
Mario Lezcano-Casado¹, David Martínez-Rubio²

Abstract
We introduce a novel approach to perform first-order optimization with orthogonal and unitary constraints. This approach is based on a parametrization stemming from Lie group theory through the exponential map. The parametrization transforms the constrained optimization problem into an unconstrained one over a Euclidean space, for which common first-order optimization methods can be used. The theoretical results presented are general enough to cover the special orthogonal group, the unitary group and, in general, any connected compact Lie group. We discuss how this and other parametrizations can be computed efficiently through an implementation trick, making numerically complex parametrizations usable…

… Recurrent Neural Networks (RNNs). In RNNs the eigenvalues of the gradient of the recurrent kernel explode or vanish exponentially fast with the number of time-steps whenever the recurrent kernel does not have unitary eigenvalues (Arjovsky et al., 2016). This behavior is the same as the one encountered when computing the powers of a matrix, and results in very slow convergence (vanishing gradient) or a lack of convergence (exploding gradient). In the seminal paper (Arjovsky et al., 2016), they note that unitary matrices have properties that would solve the exploding and vanishing gradient problems. These matrices form a group called the unitary group and they have been studied extensively in the fields of Lie group theory and Riemannian geometry. Optimization methods over the unitary and orthogonal group have found rather fruitful applications in RNNs in recent years (cf.
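As a rough illustration of the parametrization idea described above (not the authors' actual implementation or their efficiency trick), the following sketch maps an unconstrained parameter vector to a special orthogonal matrix through the exponential of a skew-symmetric matrix, using scipy.linalg.expm; the function name and shapes are my own.

```python
import numpy as np
from scipy.linalg import expm

def orthogonal_from_parameters(theta):
    """Map an unconstrained parameter vector (n(n-1)/2 entries) to a special
    orthogonal matrix via the exponential of a skew-symmetric matrix."""
    n = int((1 + np.sqrt(1 + 8 * len(theta))) / 2)
    A = np.zeros((n, n))
    A[np.triu_indices(n, 1)] = theta
    A = A - A.T                      # skew-symmetric: A^T = -A
    return expm(A)                   # expm of a skew-symmetric matrix is orthogonal, det = +1

theta = np.random.default_rng(0).standard_normal(3)  # n = 3 here
Q = orthogonal_from_parameters(theta)
print(np.allclose(Q.T @ Q, np.eye(3)), np.isclose(np.linalg.det(Q), 1.0))  # True True
```

Because theta lives in a plain Euclidean space, any standard first-order optimizer can update it directly while Q stays exactly orthogonal by construction.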
  • What's in a Name? The Matrix as an Introduction to Mathematics
St. John Fisher College, Fisher Digital Publications
Mathematical and Computing Sciences Faculty/Staff Publications, 9-2008

What's in a Name? The Matrix as an Introduction to Mathematics
Kris H. Green, St. John Fisher College, [email protected]

Follow this and additional works at: https://fisherpub.sjfc.edu/math_facpub (Part of the Mathematics Commons). How has open access to Fisher Digital Publications benefited you?

Publication Information: Green, Kris H. (2008). "What's in a Name? The Matrix as an Introduction to Mathematics." Math Horizons 16.1, 18-21. Please note that the Publication Information provides general citation information and may not be appropriate for your discipline. To receive help in creating a citation based on your discipline, please visit http://libguides.sjfc.edu/citations. This document is posted at https://fisherpub.sjfc.edu/math_facpub/12 and is brought to you for free and open access by Fisher Digital Publications at St. John Fisher College. For more information, please contact [email protected].

Abstract
In lieu of an abstract, here is the article's first paragraph: In my classes on the nature of scientific thought, I have often used the movie The Matrix to illustrate the nature of evidence and how it shapes the reality we perceive (or think we perceive). As a mathematician, I usually field questions related to the movie whenever the subject of linear algebra arises, since this field is the study of matrices and their properties. So it is natural to ask, why does the movie title reference a mathematical object?

Disciplines: Mathematics
Comments: Article copyright 2008 by Math Horizons.
  • Multivector Differentiation and Linear Algebra, 17th Santaló Summer School
Multivector Differentiation and Linear Algebra
17th Santaló Summer School 2016, Santander
Joan Lasenby, Signal Processing Group, Engineering Department, Cambridge, UK and Trinity College Cambridge
[email protected], www-sigproc.eng.cam.ac.uk
23 August 2016

Overview: the multivector derivative; examples of differentiation wrt multivectors; linear algebra: matrices and tensors as linear functions mapping between elements of the algebra; functional differentiation (very briefly); summary.

The Multivector Derivative. Recall our definition of the directional derivative in the a direction:

    a · ∇F(x) = lim_{ε→0} [F(x + εa) − F(x)] / ε

We now want to generalise this idea to enable us to find the derivative of F(X) in the A 'direction', where X is a general mixed-grade multivector (so F(X) is a general multivector-valued function of X). Let us use ∗ to denote taking the scalar part, i.e. P ∗ Q ≡ ⟨PQ⟩. Then, provided A has the same grades as X, it makes sense to define:

    A ∗ ∂_X F(X) = lim_{t→0} [F(X + tA) − F(X)] / t
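A quick numerical sketch of the limit that defines these derivatives, using plain matrices in place of multivectors (an assumption made only for illustration): the finite-difference quotient below mirrors lim_{t→0}[F(X + tA) − F(X)]/t and is checked against the known directional derivative of F(X) = trace(XᵀX).

```python
import numpy as np

def directional_derivative(F, X, A, t=1e-6):
    """Finite-difference approximation of the derivative of F at X in the
    'direction' A, mirroring lim_{t->0} (F(X + t A) - F(X)) / t."""
    return (F(X + t * A) - F(X)) / t

# Example: F(X) = trace(X^T X); its directional derivative at X along A is 2 trace(X^T A).
F = lambda X: np.trace(X.T @ X)
rng = np.random.default_rng(0)
X, A = rng.standard_normal((3, 3)), rng.standard_normal((3, 3))
print(directional_derivative(F, X, A))   # approximately equal to the line below
print(2 * np.trace(X.T @ A))
```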
  • Handout 9 More Matrix Properties; the Transpose
Handout 9: More matrix properties; the transpose

Square matrix properties. These properties only apply to a square matrix, i.e. n × n.
• The leading diagonal is the diagonal line consisting of the entries a11, a22, a33, ..., ann.
• A diagonal matrix has zeros everywhere except the leading diagonal.
• The identity matrix I has zeros off the leading diagonal, and 1 for each entry on the diagonal. It is a special case of a diagonal matrix, and A I = I A = A for any n × n matrix A.
• An upper triangular matrix has all its non-zero entries on or above the leading diagonal.
• A lower triangular matrix has all its non-zero entries on or below the leading diagonal.
• A symmetric matrix has the same entries below and above the diagonal: aij = aji for any values of i and j between 1 and n.
• An antisymmetric or skew-symmetric matrix has the opposite entries below and above the diagonal: aij = −aji for any values of i and j between 1 and n. This automatically means the diagonal entries must all be zero.

Transpose. To transpose a matrix, we reflect it across the line given by the leading diagonal a11, a22, etc. In general the result is a different shape to the original matrix:

    A = | a11  a12  a13 |      A^T = | a11  a21 |      [A^T]_ij = A_ji.
        | a21  a22  a23 |            | a12  a22 |
                                     | a13  a23 |

• If A is m × n then A^T is n × m.
• The transpose of a symmetric matrix is itself: A^T = A (recalling that only square matrices can be symmetric).
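A short NumPy check of these definitions (the example matrices are arbitrary, chosen only for illustration):

```python
import numpy as np

A = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])        # a 2 x 3 matrix
At = A.T                               # its 3 x 2 transpose: (A^T)_ij = A_ji
print(At.shape)                        # (3, 2)

S = np.array([[0.0, 2.0], [2.0, 5.0]])     # symmetric: S^T = S
K = np.array([[0.0, 3.0], [-3.0, 0.0]])    # skew-symmetric: K^T = -K, zero diagonal
print(np.array_equal(S.T, S), np.array_equal(K.T, -K))   # True True
```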
  • Building Deep Networks on Grassmann Manifolds
Building Deep Networks on Grassmann Manifolds
Zhiwu Huang†, Jiqing Wu†, Luc Van Gool†‡
†Computer Vision Lab, ETH Zurich, Switzerland; ‡VISICS, KU Leuven, Belgium
{zhiwu.huang, jiqing.wu, [email protected]}

Abstract
Learning representations on Grassmann manifolds is popular in quite a few visual recognition tasks. In order to enable deep learning on Grassmann manifolds, this paper proposes a deep network architecture by generalizing the Euclidean network paradigm to Grassmann manifolds. In particular, we design full rank mapping layers to transform input Grassmannian data to more desirable ones, exploit re-orthonormalization layers to normalize the resulting matrices, study projection pooling layers to reduce the model complexity in the Grassmannian context, and devise projection mapping layers to respect Grassmannian geometry and meanwhile achieve Euclidean forms for regular output layers. To train the Grassmann networks, we exploit a stochastic gradient descent setting…

The popular applications of Grassmannian data motivate us to build a deep neural network architecture for Grassmannian representation learning. To this end, the new network architecture is designed to accept Grassmannian data directly as input, and learns new favorable Grassmannian representations that are able to improve the final visual recognition tasks. In other words, the new network aims to deeply learn Grassmannian features on their underlying Riemannian manifolds in an end-to-end learning architecture. In summary, two main contributions are made by this paper:
• We explore a novel deep network architecture in the context of Grassmann manifolds, where it has not been possible to apply deep neural networks.
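The following sketch is one plausible reading of the "full rank mapping" followed by "re-orthonormalization" steps named above, using a thin QR factorization to restore orthonormal columns; it is my own toy illustration, not the paper's actual layers or training code, and the function names and shapes are invented.

```python
import numpy as np

def full_rank_mapping(X, W):
    """Map an orthonormal-basis representation X (n x p) of a subspace through a
    learned full-rank matrix W (m x n); the result generally loses orthonormality."""
    return W @ X

def re_orthonormalization(Y):
    """Restore an orthonormal basis via a thin QR factorization, so the output
    again represents a point on a Grassmann manifold."""
    Q, _ = np.linalg.qr(Y)
    return Q

rng = np.random.default_rng(0)
X = np.linalg.qr(rng.standard_normal((8, 3)))[0]   # an input Grassmannian datum
W = rng.standard_normal((6, 8))                    # hypothetical layer weights
Z = re_orthonormalization(full_rank_mapping(X, W))
print(np.allclose(Z.T @ Z, np.eye(3)))             # True: columns are orthonormal again
```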
  • Geometric-Algebra Adaptive Filters
Geometric-Algebra Adaptive Filters
Wilder B. Lopes∗, Member, IEEE, Cassio G. Lopes†, Senior Member, IEEE

Abstract—This paper presents a new class of adaptive filters, namely Geometric-Algebra Adaptive Filters (GAAFs). They are generated by formulating the underlying minimization problem (a deterministic cost function) from the perspective of Geometric Algebra (GA), a comprehensive mathematical language well-suited for the description of geometric transformations. Also, differently from standard adaptive-filtering theory, Geometric Calculus (the extension of GA to differential calculus) allows for applying the same derivation techniques regardless of the type (subalgebra) of the data, i.e., real, complex numbers, quaternions, etc. Relying on those characteristics (among others), a deterministic quadratic cost function is posed, from which the GAAFs are devised, providing a generalization of regular adaptive filters to subalgebras of GA. From the obtained update rule, it is shown how to recover the following least-mean squares (LMS) adaptive filter variants: real-entries LMS, complex LMS, and quaternions LMS. Mean-square analysis and simulations in a system identification scenario are provided, showing very good agreement for different levels of measurement noise.

Index Terms—Adaptive filtering, geometric algebra, quaternions.

Fig. 1. A polyhedron (3-dimensional polytope) can be completely described by the geometric multiplication of its edges (oriented lines, vectors), which generate the faces and hypersurfaces (in the case of a general n-dimensional polytope). (Panel labels: Faces; Edges, i.e. directed lines.)

… perform calculus with hypercomplex quantities, i.e., elements that generalize complex numbers for higher dimensions [2]–[10]. GA-based AFs were first introduced in [11], [12], where they were successfully employed to estimate the geometric transformation (rotation and translation) that aligns a pair of
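The abstract states that the GAAF update rule recovers classical LMS variants such as the complex LMS. For reference, here is a minimal sketch of that classical complex LMS baseline (not the geometric-algebra generalization) in a toy system-identification setting; the tap values, step size, and noise level are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
num_taps, mu = 4, 0.05
h = np.array([0.5 + 0.2j, -0.3j, 0.1 + 0j, 0.05 - 0.1j])       # unknown tap weights
x = (rng.standard_normal(2000) + 1j * rng.standard_normal(2000)) / np.sqrt(2)

w = np.zeros(num_taps, dtype=complex)
for n in range(num_taps, len(x)):
    u = x[n - num_taps:n][::-1]                  # regressor: most recent samples first
    d = np.vdot(h, u) + 0.01 * (rng.standard_normal() + 1j * rng.standard_normal())
    e = d - np.vdot(w, u)                        # a-priori error, with estimate y = w^H u
    w = w + mu * np.conj(e) * u                  # complex LMS update

print(np.round(w, 2))                            # approximately equal to h
```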
  • Determinants Math 122 Calculus III D Joyce, Fall 2012
Determinants
Math 122 Calculus III, D Joyce, Fall 2012

What they are. A determinant is a value associated to a square array of numbers, that square array being called a square matrix. For example, here are determinants of a general 2 × 2 matrix and a general 3 × 3 matrix.

    | a  b |
    | c  d |  =  ad − bc.

    | a  b  c |
    | d  e  f |  =  aei + bfg + cdh − ceg − afh − bdi.
    | g  h  i |

The determinant of a matrix A is usually denoted |A| or det(A). You can think of the rows of the determinant as being vectors. For the 3 × 3 matrix above, the vectors are u = (a, b, c), v = (d, e, f), and w = (g, h, i). Then the determinant is a value associated to n vectors in R^n. There's a general definition for n × n determinants. It's a particular signed sum of products of n entries in the matrix where each product is of one entry in each row and column. The two ways you can choose one entry in each row and column of the 2 × 2 matrix give you the two products ad and bc. There are six ways of choosing one entry in each row and column in a 3 × 3 matrix, and generally, there are n! ways in an n × n matrix. Thus, the determinant of a 4 × 4 matrix is the signed sum of 24, which is 4!, terms. In this general definition, half the terms are taken positively and half negatively. In class, we briefly saw how the signs are determined by permutations.
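A small script that evaluates these formulas and the general signed-sum definition over permutations, and compares them with numpy.linalg.det; the test matrices are arbitrary.

```python
import numpy as np
from itertools import permutations

def det2(m):
    (a, b), (c, d) = m
    return a * d - b * c

def det3(m):
    (a, b, c), (d, e, f), (g, h, i) = m
    return a*e*i + b*f*g + c*d*h - c*e*g - a*f*h - b*d*i

def det_leibniz(M):
    """General n x n case: the signed sum over all n! ways of picking one entry
    from each row and column, with the sign determined by the permutation."""
    n = len(M)
    total = 0.0
    for p in permutations(range(n)):
        sign = 1
        for i in range(n):
            for j in range(i + 1, n):
                if p[i] > p[j]:     # count inversions to get the permutation's sign
                    sign = -sign
        total += sign * np.prod([M[i, p[i]] for i in range(n)])
    return total

M2 = np.array([[1.0, 2.0], [3.0, 4.0]])
M3 = np.array([[2.0, 0.0, 1.0], [1.0, 3.0, -1.0], [0.0, 5.0, 4.0]])
print(det2(M2), np.linalg.det(M2))                     # both approximately -2
print(det3(M3), det_leibniz(M3), np.linalg.det(M3))    # all approximately 39
```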
  • Matrices and Tensors
APPENDIX: MATRICES AND TENSORS

A.1. INTRODUCTION AND RATIONALE
The purpose of this appendix is to present the notation and most of the mathematical techniques that are used in the body of the text. The audience is assumed to have been through several years of college-level mathematics, which included the differential and integral calculus, differential equations, functions of several variables, partial derivatives, and an introduction to linear algebra. Matrices are reviewed briefly, and determinants, vectors, and tensors of order two are described. The application of this linear algebra to material that appears in undergraduate engineering courses on mechanics is illustrated by discussions of concepts like the area and mass moments of inertia, Mohr's circles, and the vector cross and triple scalar products. The notation, as far as possible, will be a matrix notation that is easily entered into existing symbolic computational programs like Maple, Mathematica, Matlab, and Mathcad. The desire to represent the components of three-dimensional fourth-order tensors that appear in anisotropic elasticity as the components of six-dimensional second-order tensors, and thus represent these components in matrices of tensor components in six dimensions, leads to the nontraditional part of this appendix. This is also one of the nontraditional aspects in the text of the book, but a minor one. This is described in §A.11, along with the rationale for this approach.

A.2. DEFINITION OF SQUARE, COLUMN, AND ROW MATRICES
An r-by-c matrix, M, is a rectangular array of numbers consisting of r rows and c columns.
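One common convention for flattening a fourth-order elasticity tensor into a 6 × 6 matrix of components is Voigt notation; the sketch below is an illustrative guess at that kind of index pairing and may well differ from the specific convention the appendix adopts in §A.11.

```python
import numpy as np

# Voigt pairing of symmetric index pairs (i, j) with a single index 0..5
VOIGT = {(0, 0): 0, (1, 1): 1, (2, 2): 2,
         (1, 2): 3, (2, 1): 3, (0, 2): 4, (2, 0): 4, (0, 1): 5, (1, 0): 5}

def stiffness_to_voigt(C):
    """Flatten a 3x3x3x3 elasticity tensor with the usual minor symmetries into a
    6x6 matrix of components: M[VOIGT[i,j], VOIGT[k,l]] = C[i,j,k,l]."""
    M = np.zeros((6, 6))
    for (i, j), p in VOIGT.items():
        for (k, l), q in VOIGT.items():
            M[p, q] = C[i, j, k, l]
    return M

# Isotropic example: C_ijkl = lam * d_ij d_kl + mu * (d_ik d_jl + d_il d_jk)
lam, mu = 1.0, 0.5
d = np.eye(3)
C = lam * np.einsum('ij,kl->ijkl', d, d) + mu * (np.einsum('ik,jl->ijkl', d, d)
                                                 + np.einsum('il,jk->ijkl', d, d))
print(stiffness_to_voigt(C))   # lam+2mu on the first three diagonal entries, mu on the last three
```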
  • 28. Exterior Powers
28. Exterior powers

28.1 Desiderata
28.2 Definitions, uniqueness, existence
28.3 Some elementary facts
28.4 Exterior powers ∧^i f of maps
28.5 Exterior powers of free modules
28.6 Determinants revisited
28.7 Minors of matrices
28.8 Uniqueness in the structure theorem
28.9 Cartan's lemma
28.10 Cayley-Hamilton Theorem
28.11 Worked examples

While many of the arguments here have analogues for tensor products, it is worthwhile to repeat these arguments with the relevant variations, both for practice, and to be sensitive to the differences.

1. Desiderata
Again, we review missing items in our development of linear algebra. We are missing a development of determinants of matrices whose entries may be in commutative rings, rather than fields. We would like an intrinsic definition of determinants of endomorphisms, rather than one that depends upon a choice of coordinates, even if we eventually prove that the determinant is independent of the coordinates. We anticipate that Artin's axiomatization of determinants of matrices should be mirrored in much of what we do here. We want a direct and natural proof of the Cayley-Hamilton theorem. Linear algebra over fields is insufficient, since the introduction of the indeterminate x in the definition of the characteristic polynomial takes us outside the class of vector spaces over fields. We want to give a conceptual proof for the uniqueness part of the structure theorem for finitely-generated modules over principal ideal domains. Multi-linear algebra over fields is surely insufficient for this.

2. Definitions, uniqueness, existence
Let R be a commutative ring with 1.
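As a concrete companion to "exterior powers of maps", "minors of matrices", and "determinants revisited", here is a small sketch (my own, working over R with the matrix interpretation rather than general commutative rings) that builds the matrix of ∧²A from 2 × 2 minors and checks functoriality together with the fact that the top exterior power recovers the determinant.

```python
import numpy as np
from itertools import combinations

def exterior_power(A, k):
    """Matrix of the induced map on the k-th exterior power: its entries are the
    k x k minors of A, indexed by k-element row and column subsets (lexicographic)."""
    n = A.shape[0]
    idx = list(combinations(range(n), k))
    return np.array([[np.linalg.det(A[np.ix_(r, c)]) for c in idx] for r in idx])

rng = np.random.default_rng(0)
A, B = rng.standard_normal((3, 3)), rng.standard_normal((3, 3))

# Functoriality (Cauchy-Binet): wedge^2(AB) = wedge^2(A) wedge^2(B)
print(np.allclose(exterior_power(A @ B, 2), exterior_power(A, 2) @ exterior_power(B, 2)))
# The top exterior power is 1x1 and recovers the determinant
print(np.allclose(exterior_power(A, 3), np.linalg.det(A)))
```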
  • Lesson 1: Solutions to Polynomial Equations
NYS Common Core Mathematics Curriculum, Precalculus and Advanced Topics, Module 3

Lesson 1: Solutions to Polynomial Equations

Classwork

Opening Exercise. How many solutions are there to the equation x² = 1? Explain how you know.

Example 1: Prove that a Quadratic Equation Has Only Two Solutions over the Set of Complex Numbers. Prove that 1 and −1 are the only solutions to the equation x² = 1. Let x = a + bi be a complex number so that x² = 1.
a. Substitute a + bi for x in the equation x² = 1.
b. Rewrite both sides in standard form for a complex number.
c. Equate the real parts on each side of the equation, and equate the imaginary parts on each side of the equation.
d. Solve for a and b, and find the solutions for x = a + bi.

Exercises

Find the product.
1. (z − 2)(z + 2)
2. (z + 3i)(z − 3i)

Write each of the following quadratic expressions as the product of two linear factors.
3. z² − 4
4. z² + 4
5. z² − 4i
6. z² + 4i

This work is derived from Eureka Math™ and licensed by Great Minds, ©2015 Great Minds (eureka-math.org), under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License. This file derived from PreCal-M3-TE-1.3.0-08.2015.
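A short SymPy sketch (my own, not part of the lesson materials) that walks the same steps as Example 1 and factors one of the exercise expressions; the variable names follow the lesson.

```python
import sympy as sp

a, b = sp.symbols('a b', real=True)
x = a + b * sp.I
expanded = sp.expand(x**2)                       # (a + bi)^2 = a^2 - b^2 + 2abi
real_eq = sp.Eq(sp.re(expanded), 1)              # equate real parts:      a^2 - b^2 = 1
imag_eq = sp.Eq(sp.im(expanded), 0)              # equate imaginary parts: 2ab = 0
print(sp.solve([real_eq, imag_eq], [a, b]))      # [(-1, 0), (1, 0)], i.e. x = -1 or x = 1

# One of the exercise-style factorizations over the complex numbers:
z = sp.symbols('z')
print(sp.factor(z**2 + 4, gaussian=True))        # (z - 2*I)*(z + 2*I)
```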
  • Stiefel Manifolds and Polygons
Stiefel Manifolds and Polygons
Clayton Shonkwiler, Department of Mathematics, Colorado State University; [email protected]

Abstract
Polygons are compound geometric objects, but when trying to understand the expected behavior of a large collection of random polygons – or even to formalize what a random polygon is – it is convenient to interpret each polygon as a point in some parameter space, essentially trading the complexity of the object for the complexity of the space. In this paper I describe such an interpretation where the parameter space is an abstract but very nice space called a Stiefel manifold and show how to exploit the geometry of the Stiefel manifold both to generate random polygons and to morph one polygon into another.

Introduction
The goal is to describe an identification of polygons with points in certain nice parameter spaces. For example, every triangle in the plane can be identified with a pair of orthonormal vectors in 3-dimensional space R^3, and hence planar triangle shapes correspond to points in the space St_2(R^3) of all such pairs, which is an example of a Stiefel manifold. While this space is defined somewhat abstractly and is hard to visualize – for example, it cannot be embedded in Euclidean space of fewer than 5 dimensions [9] – it is easy to visualize and manipulate points in it: they're just pairs of perpendicular unit vectors. Moreover, this is a familiar and well-understood space in the setting of differential geometry. For example, the group SO(3) of rotations of 3-space acts transitively on it, meaning that there is a rotation of space which transforms any triangle into any other triangle.
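A tiny NumPy sketch (my own illustration, not the paper's construction) of how easy it is to produce points of St_2(R^3): take the Q factor of a random Gaussian 3 × 2 matrix, which yields a pair of orthonormal vectors; the sign fix is a standard trick to make the sampling uniform.

```python
import numpy as np

def random_stiefel_point(n=3, k=2, rng=np.random.default_rng(0)):
    """Sample a point of the Stiefel manifold St_k(R^n), i.e. k orthonormal
    vectors in R^n, via the QR factorization of a Gaussian matrix."""
    Q, R = np.linalg.qr(rng.standard_normal((n, k)))
    return Q * np.sign(np.diag(R))     # fix column signs so the distribution is uniform

U = random_stiefel_point()
print(np.allclose(U.T @ U, np.eye(2)))   # True: the columns form a 2-frame in R^3
```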