Exponential Matrix and Their Properties

Total Page:16

File Type:pdf, Size:1020Kb

Exponential Matrix and Their Properties International Journal of Scientific and Innovative Mathematical Research (IJSIMR) Volume 4, Issue 1, January 2016, PP 53-63 ISSN 2347-307X (Print) & ISSN 2347-3142 (Online) www.arcjournals.org Exponential Matrix and Their Properties Mohammed Abdullah Saleh Salman1,2 Dr. V.C.Borkar College of Education & languages, Yeshwant Mahavidyalaya, Department of Mathematics & Statistics, Department of Mathematics & Statistics, University of Amran. Swami Ramanand Teerth Marthwada Amran, Yemen. University, Nanded, India [email protected] [email protected] Abstract: The matrix exponential is a very important subclass of matrix functions. In this paper, we discuss some of the more common matrix exponential and some methods for computing it. In principle, the matrix exponential could be calculated in different methods some of the methods are preferable to others but none are entirely satisfactory. Due to that, we discussed computations of the matrix exponential using Taylor Series, Scaling and Squaring, Eigenvectors, and the Schur decomposition methods theoretically. Keywords: Matrix Exponential, Commuting Matrix, Non-commuting Matrix. 1. INTRODUCTION The purpose of this note is matrix functions, The theory of matrix functions was subsequently developed by many mathematicians over the ensuing 100 years. Today, matrices of functions are widely used in science and engineering and are of growing interest, due to the succinct way they allow solutions to be expressed and recent advances in numerical algorithms for computing them [ ]. In general is an interesting area in linear algebra, matrix analysis and are used in many areas especially matrix Exponential .The matrix exponential is a very important subclass of functions of matrices that has been studied extensively in the last 50 years [ ]. The computation of matrix functions has been one of the most challenging problems in numerical linear algebra. Among the matrix functions one of the most interesting is the matrix exponential. A large number of methods has been proposed for the matrix exponential, many of them of pedagogic interest only or of dubious numerical stability. Some of the more computationally useful methods are surveyed in [ ] In principle, the matrix exponential could be computed in many ways and many different methods to calculate matrix exponential [ ,9]. In practice, some of the methods are preferable to others, but none are completely satisfactory. 2. DEFINITIONS OF EXP(A): The functions of a matrix in which we are interested can be defined in various ways. In mathematics, the matrix exponential is a function on square matrices analogous to the ordinary A exponential function [1, , , , 7]. Let A ∈ Mn. The exponential of A, denoted by e or exp(A) , is the n × n matrix given by the power series k 2 n 1 A A A A e I A ...... (`1) k 0 k! 2! (n 1)! Where A0 = I Note that this is the generalization of the Taylor series expansion of the standard Exponential n n function. The series (1) converges absolutely for all A C has radius of convergence equal to +1), so the exponential of A is well-defined. To prove the Convergence of the series, we have the following theorem. ©ARC Page | 53 Mohammed Abdullah Saleh Salman & Dr. V.C.Borkar Theorem (2.1) for more detail in [6]: The series (1) converges absolutely for all A M n . Furthermore, let be a normalized sub multiplicative norm on M n . Then eA e A (2) Proof: The nth partial sum is Ak Sn k 0 k! So k k Ak m Ak Ak A A e A S n k! k! k! k! k! k 0 k 0 k m 1 k m 1 k m 1 Since A is a real number and the right-hand side is a part of the convergent series of real numbers A k e A k 0 k! then this equation is convergent, if > 0 there is an N such that for m > n, A k e A k! k m 1 This is sufficient to prove that S n is convergent. Furthermore, note that k Ak k A A A A e e k 0 k! k 0 k! k 0 k! In some cases, it is a simple matter to express the matrix exponential of an n n complex A matrix A shall be denoted by e and can be defined in a number of equivalent ways [ ]: 1 e At e zt (zI A) 1 dz (3) 2 i Or At At k e lim (1 ) (4) k k Or At dx e AX(t) , X(0) 1 (5) dt For details see [7], and we have other definitions but we leave it to reader to collect them. 3. COMPUTATION OF EXPONENTIAL MATRIX There are many methods used to compute the exponential of a matrix. Approximation Theory, differential equations, the matrix eigenvalues, and the matrix characteristic Polynomials are some of the various methods used. we will outline various simplistic Methods for finding the exponential of a matrix. The methods examined are given by the type of matrix [ , ,8,9]. International Journal of Scientific and Innovative Mathematical Research (IJSIMR) Page 54 Exponential Matrix and Their Properties 3.1- Computing Matrix Exponential for Diagonal Matrix and for Diagonalizable Matrices if A is a diagonal matrix having diagonal entries then we have ea1 ea 2 e A a e n Now, Let be A R n n symmetric and has a complete set of linear independent Eigenvectors v1 ,v2 ,......., vn such that Av k k vk k 1,2,...........n (6) let us define the matrix T = [ ] whose columns are the eigenvector of A corresponding to the eigenvalues of A, we have AT Av1, Av 2 ,......., Avn v 1 , v 2 ,........ v 3 T 1 Since A T T 1 where A 2 n Now using to compute and we can write it as follow Ak 1 1 eA (TΛT 1)k T( Λk )T 1 TeΛT 1 k 0 k! k! k 0 k! And hence e 1 e 2 e A T T 1 e A e n Example: Consider the matrix 3 0 0 A 0 5 0 0 0 1 then by using the above formula for diagonal form we get the exponential matrix is e3 0 0 e A 0 e5 0 0 0 e1 For diagonalizable matrix we give this example International Journal of Scientific and Innovative Mathematical Research (IJSIMR) Page 55 Mohammed Abdullah Saleh Salman & Dr. V.C.Borkar Example: Let 5 1 A 2 2 after found the eigenvalues and eigenvectors and construct matrix T we use this formula to compute as follow 1 1 e4 0 2 1 2e 4 - e3 e4 e3 e A 1 - 2 0 e3 1 -1 2e3 2e4 2e 3 e4 3.2- Computing Matrix Exponential for General Square Matrices 3.2.1- Using Jordan Normal Form Suppose A is not diagonalizable matrix which it is not possible to find n linearly independent eigenvectors of the matrix A, In this case can use the Jordan form of A. Suppose j is the Jordan form of A, with P the transition matrix. Then A j 1 e Te T Where j diag( j1 1, j2 2 ,....., j1 k ) diag( j1 1 j2 2 ..... j1 ) 1 AThenT T J j j j e (e 1 1 e 2 2 ...... e k k ) Thus, the problem is to find the matrix exponential of a Jordan block where the Jordan block k has the form Jk ( ) k Nk M k and in general N as ones on the k th upper diagonal and is the null matrix if k n the dimension of the matrix. by using the above expression we have k k J k ( ) 1 k 1 k 1 k j j e J ( I N) N k 0 k! k 0 k! k 0 k! j 0 J This can be written 2 n 1 j A N N e e e I N ....... 2! (n 1)! Example: 21 17 6 A 5 -1 - 6 4 4 16 Then we calculate the eigenvalues of A which are [ ] We have A PJP 1 then we calculate P which is 1 5 2 4 4 1 1 P - 2 4 4 0 4 0 And International Journal of Scientific and Innovative Mathematical Research (IJSIMR) Page 56 Exponential Matrix and Their Properties 16 1 J (4) 0 16 Therefore, by using the Jordan canonical form to compute the exponential of matrix A is 13e16 e4 13e16 5e4 2e16 2e4 1 e A 9e16 e4 - 9e16 5e4 - 2e16 2e4 4 16e16 16e16 4e16 3.2.2- Using Hamilton Theorem Cayley Theorem 3.1 (Cayley Hamilton) Let A a square matrix and ( ) A I its characteristic polynomial then (A) 0. Proof: Consider a n n square matrix A and a polynomial p(x) and (x) be the characteristic polynomial of A. Then write p(x) in the form p(x) (x)q(x) r(x) by Cayley-Hamilton (x) 0 , then p(A) = r(A) such that we can write polynomial 1 X k r (X ) k! k x k Where r (x) is the remainder of long division of by , Then the matrix exponential k k! can be written as n 1 Ak e A r (A) k! k k 0 k 0 thus e A is a polynomial of A of degree less than n n 1 k A e ak A k 0 Consider now an eigenvector v with the corresponding eigenvalue , Then A 1 k 1 k e v A v v e v k 0 k! k 0 k! Analogously n 1 n 1 k k ak A v ( ak )v k 0 k 0 and thus if we have n distinct eigenvalues so that the above equation is an interpolation problem which can be used to compute the coefficients a k .
Recommended publications
  • Linear Systems and Control: a First Course (Course Notes for AAE 564)
    Linear Systems and Control: A First Course (Course notes for AAE 564) Martin Corless School of Aeronautics & Astronautics Purdue University West Lafayette, Indiana [email protected] August 25, 2008 Contents 1 Introduction 1 1.1 Ingredients..................................... 4 1.2 Somenotation................................... 4 1.3 MATLAB ..................................... 4 2 State space representation of dynamical systems 5 2.1 Linearexamples.................................. 5 2.1.1 Afirstexample .............................. 5 2.1.2 Theunattachedmass........................... 6 2.1.3 Spring-mass-damper ........................... 6 2.1.4 Asimplestructure ............................ 7 2.2 Nonlinearexamples............................... 8 2.2.1 Afirstnonlinearsystem ......................... 8 2.2.2 Planarpendulum ............................. 9 2.2.3 Attitudedynamicsofarigidbody. .. 9 2.2.4 Bodyincentralforcemotion. 10 2.2.5 Ballisticsindrag ............................. 11 2.2.6 Doublependulumoncart . .. .. 12 2.2.7 Two-linkroboticmanipulator . .. 13 2.3 Discrete-timeexamples . ... 14 2.3.1 Thediscreteunattachedmass . 14 2.4 Generalrepresentation . ... 15 2.4.1 Continuous-time ............................. 15 2.4.2 Discrete-time ............................... 18 2.4.3 Exercises.................................. 20 2.5 Vectors....................................... 22 2.5.1 Vector spaces and IRn .......................... 22 2.5.2 IR2 andpictures.............................. 24 2.5.3 Derivatives ...............................
    [Show full text]
  • Quantum Information
    Quantum Information J. A. Jones Michaelmas Term 2010 Contents 1 Dirac Notation 3 1.1 Hilbert Space . 3 1.2 Dirac notation . 4 1.3 Operators . 5 1.4 Vectors and matrices . 6 1.5 Eigenvalues and eigenvectors . 8 1.6 Hermitian operators . 9 1.7 Commutators . 10 1.8 Unitary operators . 11 1.9 Operator exponentials . 11 1.10 Physical systems . 12 1.11 Time-dependent Hamiltonians . 13 1.12 Global phases . 13 2 Quantum bits and quantum gates 15 2.1 The Bloch sphere . 16 2.2 Density matrices . 16 2.3 Propagators and Pauli matrices . 18 2.4 Quantum logic gates . 18 2.5 Gate notation . 21 2.6 Quantum networks . 21 2.7 Initialization and measurement . 23 2.8 Experimental methods . 24 3 An atom in a laser field 25 3.1 Time-dependent systems . 25 3.2 Sudden jumps . 26 3.3 Oscillating fields . 27 3.4 Time-dependent perturbation theory . 29 3.5 Rabi flopping and Fermi's Golden Rule . 30 3.6 Raman transitions . 32 3.7 Rabi flopping as a quantum gate . 32 3.8 Ramsey fringes . 33 3.9 Measurement and initialisation . 34 1 CONTENTS 2 4 Spins in magnetic fields 35 4.1 The nuclear spin Hamiltonian . 35 4.2 The rotating frame . 36 4.3 On-resonance excitation . 38 4.4 Excitation phases . 38 4.5 Off-resonance excitation . 39 4.6 Practicalities . 40 4.7 The vector model . 40 4.8 Spin echoes . 41 4.9 Measurement and initialisation . 42 5 Photon techniques 43 5.1 Spatial encoding .
    [Show full text]
  • MATH 237 Differential Equations and Computer Methods
    Queen’s University Mathematics and Engineering and Mathematics and Statistics MATH 237 Differential Equations and Computer Methods Supplemental Course Notes Serdar Y¨uksel November 19, 2010 This document is a collection of supplemental lecture notes used for Math 237: Differential Equations and Computer Methods. Serdar Y¨uksel Contents 1 Introduction to Differential Equations 7 1.1 Introduction: ................................... ....... 7 1.2 Classification of Differential Equations . ............... 7 1.2.1 OrdinaryDifferentialEquations. .......... 8 1.2.2 PartialDifferentialEquations . .......... 8 1.2.3 Homogeneous Differential Equations . .......... 8 1.2.4 N-thorderDifferentialEquations . ......... 8 1.2.5 LinearDifferentialEquations . ......... 8 1.3 Solutions of Differential equations . .............. 9 1.4 DirectionFields................................. ........ 10 1.5 Fundamental Questions on First-Order Differential Equations............... 10 2 First-Order Ordinary Differential Equations 11 2.1 ExactDifferentialEquations. ........... 11 2.2 MethodofIntegratingFactors. ........... 12 2.3 SeparableDifferentialEquations . ............ 13 2.4 Differential Equations with Homogenous Coefficients . ................ 13 2.5 First-Order Linear Differential Equations . .............. 14 2.6 Applications.................................... ....... 14 3 Higher-Order Ordinary Linear Differential Equations 15 3.1 Higher-OrderDifferentialEquations . ............ 15 3.1.1 LinearIndependence . ...... 16 3.1.2 WronskianofasetofSolutions . ........ 16 3.1.3 Non-HomogeneousProblem
    [Show full text]
  • The Exponential of a Matrix
    5-28-2012 The Exponential of a Matrix The solution to the exponential growth equation dx kt = kx is given by x = c e . dt 0 It is natural to ask whether you can solve a constant coefficient linear system ′ ~x = A~x in a similar way. If a solution to the system is to have the same form as the growth equation solution, it should look like At ~x = e ~x0. The first thing I need to do is to make sense of the matrix exponential eAt. The Taylor series for ez is ∞ n z z e = . n! n=0 X It converges absolutely for all z. It A is an n × n matrix with real entries, define ∞ n n At t A e = . n! n=0 X The powers An make sense, since A is a square matrix. It is possible to show that this series converges for all t and every matrix A. Differentiating the series term-by-term, ∞ ∞ ∞ ∞ n−1 n n−1 n n−1 n−1 m m d At t A t A t A t A At e = n = = A = A = Ae . dt n! (n − 1)! (n − 1)! m! n=0 n=1 n=1 m=0 X X X X At ′ This shows that e solves the differential equation ~x = A~x. The initial condition vector ~x(0) = ~x0 yields the particular solution At ~x = e ~x0. This works, because e0·A = I (by setting t = 0 in the power series). Another familiar property of ordinary exponentials holds for the matrix exponential: If A and B com- mute (that is, AB = BA), then A B A B e e = e + .
    [Show full text]
  • Scientific Computing: an Introductory Survey
    Eigenvalue Problems Existence, Uniqueness, and Conditioning Computing Eigenvalues and Eigenvectors Scientific Computing: An Introductory Survey Chapter 4 – Eigenvalue Problems Prof. Michael T. Heath Department of Computer Science University of Illinois at Urbana-Champaign Copyright c 2002. Reproduction permitted for noncommercial, educational use only. Michael T. Heath Scientific Computing 1 / 87 Eigenvalue Problems Existence, Uniqueness, and Conditioning Computing Eigenvalues and Eigenvectors Outline 1 Eigenvalue Problems 2 Existence, Uniqueness, and Conditioning 3 Computing Eigenvalues and Eigenvectors Michael T. Heath Scientific Computing 2 / 87 Eigenvalue Problems Eigenvalue Problems Existence, Uniqueness, and Conditioning Eigenvalues and Eigenvectors Computing Eigenvalues and Eigenvectors Geometric Interpretation Eigenvalue Problems Eigenvalue problems occur in many areas of science and engineering, such as structural analysis Eigenvalues are also important in analyzing numerical methods Theory and algorithms apply to complex matrices as well as real matrices With complex matrices, we use conjugate transpose, AH , instead of usual transpose, AT Michael T. Heath Scientific Computing 3 / 87 Eigenvalue Problems Eigenvalue Problems Existence, Uniqueness, and Conditioning Eigenvalues and Eigenvectors Computing Eigenvalues and Eigenvectors Geometric Interpretation Eigenvalues and Eigenvectors Standard eigenvalue problem : Given n × n matrix A, find scalar λ and nonzero vector x such that A x = λ x λ is eigenvalue, and x is corresponding eigenvector
    [Show full text]
  • The Exponential Function for Matrices
    The exponential function for matrices Matrix exponentials provide a concise way of describing the solutions to systems of homoge- neous linear differential equations that parallels the use of ordinary exponentials to solve simple differential equations of the form y0 = λ y. For square matrices the exponential function can be defined by the same sort of infinite series used in calculus courses, but some work is needed in order to justify the construction of such an infinite sum. Therefore we begin with some material needed to prove that certain infinite sums of matrices can be defined in a mathematically sound manner and have reasonable properties. Limits and infinite series of matrices Limits of vector valued sequences in Rn can be defined and manipulated much like limits of scalar valued sequences, the key adjustment being that distances between real numbers that are expressed in the form js−tj are replaced by distances between vectors expressed in the form jx−yj. 1 Similarly, one can talk about convergence of a vector valued infinite series n=0 vn in terms of n the convergence of the sequence of partial sums sn = i=0 vk. As in the case of ordinary infinite series, the best form of convergence is absolute convergence, which correspondsP to the convergence 1 P of the real valued infinite series jvnj with nonnegative terms. A fundamental theorem states n=0 1 that a vector valued infinite series converges if the auxiliary series jvnj does, and there is P n=0 a generalization of the standard M-test: If jvnj ≤ Mn for all n where Mn converges, then P n vn also converges.
    [Show full text]
  • Approximating the Exponential from a Lie Algebra to a Lie Group
    MATHEMATICS OF COMPUTATION Volume 69, Number 232, Pages 1457{1480 S 0025-5718(00)01223-0 Article electronically published on March 15, 2000 APPROXIMATING THE EXPONENTIAL FROM A LIE ALGEBRA TO A LIE GROUP ELENA CELLEDONI AND ARIEH ISERLES 0 Abstract. Consider a differential equation y = A(t; y)y; y(0) = y0 with + y0 2 GandA : R × G ! g,whereg is a Lie algebra of the matricial Lie group G. Every B 2 g canbemappedtoGbythematrixexponentialmap exp (tB)witht 2 R. Most numerical methods for solving ordinary differential equations (ODEs) on Lie groups are based on the idea of representing the approximation yn of + the exact solution y(tn), tn 2 R , by means of exact exponentials of suitable elements of the Lie algebra, applied to the initial value y0. This ensures that yn 2 G. When the exponential is difficult to compute exactly, as is the case when the dimension is large, an approximation of exp (tB) plays an important role in the numerical solution of ODEs on Lie groups. In some cases rational or poly- nomial approximants are unsuitable and we consider alternative techniques, whereby exp (tB) is approximated by a product of simpler exponentials. In this paper we present some ideas based on the use of the Strang splitting for the approximation of matrix exponentials. Several cases of g and G are considered, in tandem with general theory. Order conditions are discussed, and a number of numerical experiments conclude the paper. 1. Introduction Consider the differential equation (1.1) y0 = A(t; y)y; y(0) 2 G; with A : R+ × G ! g; where G is a matricial Lie group and g is the underlying Lie algebra.
    [Show full text]
  • Diagonalizable Matrix - Wikipedia, the Free Encyclopedia
    Diagonalizable matrix - Wikipedia, the free encyclopedia http://en.wikipedia.org/wiki/Matrix_diagonalization Diagonalizable matrix From Wikipedia, the free encyclopedia (Redirected from Matrix diagonalization) In linear algebra, a square matrix A is called diagonalizable if it is similar to a diagonal matrix, i.e., if there exists an invertible matrix P such that P −1AP is a diagonal matrix. If V is a finite-dimensional vector space, then a linear map T : V → V is called diagonalizable if there exists a basis of V with respect to which T is represented by a diagonal matrix. Diagonalization is the process of finding a corresponding diagonal matrix for a diagonalizable matrix or linear map.[1] A square matrix which is not diagonalizable is called defective. Diagonalizable matrices and maps are of interest because diagonal matrices are especially easy to handle: their eigenvalues and eigenvectors are known and one can raise a diagonal matrix to a power by simply raising the diagonal entries to that same power. Geometrically, a diagonalizable matrix is an inhomogeneous dilation (or anisotropic scaling) — it scales the space, as does a homogeneous dilation, but by a different factor in each direction, determined by the scale factors on each axis (diagonal entries). Contents 1 Characterisation 2 Diagonalization 3 Simultaneous diagonalization 4 Examples 4.1 Diagonalizable matrices 4.2 Matrices that are not diagonalizable 4.3 How to diagonalize a matrix 4.3.1 Alternative Method 5 An application 5.1 Particular application 6 Quantum mechanical application 7 See also 8 Notes 9 References 10 External links Characterisation The fundamental fact about diagonalizable maps and matrices is expressed by the following: An n-by-n matrix A over the field F is diagonalizable if and only if the sum of the dimensions of its eigenspaces is equal to n, which is the case if and only if there exists a basis of Fn consisting of eigenvectors of A.
    [Show full text]
  • Singles out a Specific Basis
    Quantum Information and Quantum Noise Gabriel T. Landi University of Sao˜ Paulo July 3, 2018 Contents 1 Review of quantum mechanics1 1.1 Hilbert spaces and states........................2 1.2 Qubits and Bloch’s sphere.......................3 1.3 Outer product and completeness....................5 1.4 Operators................................7 1.5 Eigenvalues and eigenvectors......................8 1.6 Unitary matrices.............................9 1.7 Projective measurements and expectation values............ 10 1.8 Pauli matrices.............................. 11 1.9 General two-level systems....................... 13 1.10 Functions of operators......................... 14 1.11 The Trace................................ 17 1.12 Schrodinger’s¨ equation......................... 18 1.13 The Schrodinger¨ Lagrangian...................... 20 2 Density matrices and composite systems 24 2.1 The density matrix........................... 24 2.2 Bloch’s sphere and coherence...................... 29 2.3 Composite systems and the almighty kron............... 32 2.4 Entanglement.............................. 35 2.5 Mixed states and entanglement..................... 37 2.6 The partial trace............................. 39 2.7 Reduced density matrices........................ 42 2.8 Singular value and Schmidt decompositions.............. 44 2.9 Entropy and mutual information.................... 50 2.10 Generalized measurements and POVMs................ 62 3 Continuous variables 68 3.1 Creation and annihilation operators................... 68 3.2 Some important
    [Show full text]
  • Mata Glossary of Common Terms
    Title stata.com Glossary Description Commonly used terms are defined here. Mata glossary arguments The values a function receives are called the function’s arguments. For instance, in lud(A, L, U), A, L, and U are the arguments. array An array is any indexed object that holds other objects as elements. Vectors are examples of 1- dimensional arrays. Vector v is an array, and v[1] is its first element. Matrices are 2-dimensional arrays. Matrix X is an array, and X[1; 1] is its first element. In theory, one can have 3-dimensional, 4-dimensional, and higher arrays, although Mata does not directly provide them. See[ M-2] Subscripts for more information on arrays in Mata. Arrays are usually indexed by sequential integers, but in associative arrays, the indices are strings that have no natural ordering. Associative arrays can be 1-dimensional, 2-dimensional, or higher. If A were an associative array, then A[\first"] might be one of its elements. See[ M-5] asarray( ) for associative arrays in Mata. binary operator A binary operator is an operator applied to two arguments. In 2-3, the minus sign is a binary operator, as opposed to the minus sign in -9, which is a unary operator. broad type Two matrices are said to be of the same broad type if the elements in each are numeric, are string, or are pointers. Mata provides two numeric types, real and complex. The term broad type is used to mask the distinction within numeric and is often used when discussing operators or functions.
    [Show full text]
  • Matrix Exponential Updates for On-Line Learning and Bregman
    Matrix Exponential Updates for On-line Learning and Bregman Projection ¥ ¤£ Koji Tsuda ¢¡, Gunnar Ratsch¨ and Manfred K. Warmuth Max Planck Institute for Biological Cybernetics Spemannstr. 38, 72076 Tubingen,¨ Germany ¡ AIST CBRC, 2-43 Aomi, Koto-ku, Tokyo, 135-0064, Japan £ Fraunhofer FIRST, Kekulestr´ . 7, 12489 Berlin, Germany ¥ University of Calif. at Santa Cruz ¦ koji.tsuda,gunnar.raetsch § @tuebingen.mpg.de, [email protected] Abstract We address the problem of learning a positive definite matrix from exam- ples. The central issue is to design parameter updates that preserve pos- itive definiteness. We introduce an update based on matrix exponentials which can be used as an on-line algorithm or for the purpose of finding a positive definite matrix that satisfies linear constraints. We derive this update using the von Neumann divergence and then use this divergence as a measure of progress for proving relative loss bounds. In experi- ments, we apply our algorithms to learn a kernel matrix from distance measurements. 1 Introduction Most learning algorithms have been developed to learn a vector of parameters from data. However, an increasing number of papers are now dealing with more structured parameters. More specifically, when learning a similarity or a distance function among objects, the parameters are defined as a positive definite matrix that serves as a kernel (e.g. [12, 9, 11]). Learning is typically formulated as a parameter updating procedure to optimize a loss function. The gradient descent update [5] is one of the most commonly used algorithms, but it is not appropriate when the parameters form a positive definite matrix, because the updated parameter is not necessarily positive definite.
    [Show full text]
  • Matrix Exponentiated Gradient Updates for On-Line Learning and Bregman Projection
    JournalofMachineLearningResearch6(2005)995–1018 Submitted 11/04; Revised 4/05; Published 6/05 Matrix Exponentiated Gradient Updates for On-line Learning and Bregman Projection Koji Tsuda [email protected] Max Planck Institute for Biological Cybernetics Spemannstrasse 38 72076 Tubingen,¨ Germany, and Computational Biology Research Center National Institute of Advanced Science and Technology (AIST) 2-42 Aomi, Koto-ku, Tokyo 135-0064, Japan Gunnar Ratsch¨ [email protected] Friedrich Miescher Laboratory of the Max Planck Society Spemannstrasse 35 72076 Tubingen,¨ Germany Manfred K. Warmuth [email protected] Computer Science Department University of California Santa Cruz, CA 95064, USA Editor: Yoram Singer Abstract We address the problem of learning a symmetric positive definite matrix. The central issue is to de- sign parameter updates that preserve positive definiteness. Our updates are motivated with the von Neumann divergence. Rather than treating the most general case, we focus on two key applications that exemplify our methods: on-line learning with a simple square loss, and finding a symmetric positive definite matrix subject to linear constraints. The updates generalize the exponentiated gra- dient (EG) update and AdaBoost, respectively: the parameter is now a symmetric positive definite matrix of trace one instead of a probability vector (which in this context is a diagonal positive def- inite matrix with trace one). The generalized updates use matrix logarithms and exponentials to preserve positive definiteness. Most importantly, we show how the derivation and the analyses of the original EG update and AdaBoost generalize to the non-diagonal case. We apply the resulting matrix exponentiated gradient (MEG) update and DefiniteBoost to the problem of learning a kernel matrix from distance measurements.
    [Show full text]