Bayesian Inference of Log Determinants


Jack Fitzsimons¹, Kurt Cutajar², Michael Osborne¹, Stephen Roberts¹, Maurizio Filippone²
¹ Information Engineering, University of Oxford, UK
² Department of Data Science, EURECOM, France

arXiv:1704.01445v1 [stat.ML] 5 Apr 2017

Abstract

The log-determinant of a kernel matrix appears in a variety of machine learning problems, ranging from determinantal point processes and generalized Markov random fields, through to the training of Gaussian processes. Exact calculation of this term is often intractable when the size of the kernel matrix exceeds a few thousands. In the spirit of probabilistic numerics, we reinterpret the problem of computing the log-determinant as a Bayesian inference problem. In particular, we combine prior knowledge in the form of bounds from matrix theory and evidence derived from stochastic trace estimation to obtain probabilistic estimates for the log-determinant and its associated uncertainty within a given computational budget. Beyond its novelty and theoretic appeal, the performance of our proposal is competitive with state-of-the-art approaches to approximating the log-determinant, while also quantifying the uncertainty due to budget-constrained evidence.

1 INTRODUCTION

Developing scalable learning models without compromising performance is at the forefront of machine learning research. The scalability of several learning models is predominantly hindered by linear algebraic operations having large computational complexity, among which is the computation of the log-determinant of a matrix (Golub & Van Loan, 1996). The latter term features heavily in the machine learning literature, with applications including spatial models (Aune et al., 2014; Rue & Held, 2005), kernel-based models (Davis et al., 2007; Rasmussen & Williams, 2006), and Bayesian learning (Mackay, 2003).

The standard approach for evaluating the log-determinant of a positive definite matrix involves the use of Cholesky decomposition (Golub & Van Loan, 1996), which is employed in various applications of statistical models such as kernel machines. However, the use of Cholesky decomposition for general dense matrices requires O(n^3) operations, whilst also entailing memory requirements of O(n^2). In view of this computational bottleneck, various models requiring the log-determinant for inference bypass the need to compute it altogether (Anitescu et al., 2012; Stein et al., 2013; Cutajar et al., 2016; Filippone & Engler, 2015).
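For reference, this Cholesky baseline reads the log-determinant directly off the triangular factor, since log Det(A) = 2 Σ_i log(L_ii) for A = LL^T. The sketch below is not from the paper; it assumes NumPy and a dense symmetric positive definite input, and simply illustrates the O(n^3) reference computation that the methods discussed here aim to avoid.

```python
import numpy as np

def cholesky_logdet(A):
    """Exact log-determinant of a symmetric positive definite matrix.

    Uses the identity log Det(A) = 2 * sum(log(diag(L))) for A = L L^T.
    Costs O(n^3) time and O(n^2) memory, which is the bottleneck for
    large dense matrices.
    """
    L = np.linalg.cholesky(A)
    return 2.0 * np.sum(np.log(np.diag(L)))

# Usage: small random positive definite matrix (illustrative only).
rng = np.random.default_rng(0)
X = rng.standard_normal((5, 5))
A = X @ X.T + 5.0 * np.eye(5)
print(cholesky_logdet(A), np.linalg.slogdet(A)[1])  # the two values should agree
```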
Alternatively, several methods exploit sparsity and structure within the matrix itself to accelerate computations. For example, sparsity in Gaussian Markov random fields (GMRFs) arises from encoding conditional independence assumptions that are readily available when considering low-dimensional problems. For such matrices, the Cholesky decomposition can be computed in fewer than O(n^3) operations (Rue & Held, 2005; Rue et al., 2009). Similarly, Kronecker-based linear algebra techniques may be employed for kernel matrices computed on regularly spaced inputs (Saatçi, 2011). While these ideas have proven successful for a variety of specific applications, they cannot be extended to the case of general dense matrices without assuming special forms or structures for the available data.

To this end, general approximations to the log-determinant frequently build upon stochastic trace estimation techniques using iterative methods (Avron & Toledo, 2011). Two of the most widely-used polynomial approximations for large-scale matrices are the Taylor and Chebyshev expansions (Aune et al., 2014; Han et al., 2015). A more recent approach draws from the possibility of estimating the trace of functions using stochastic Lanczos quadrature (Ubaru et al., 2016), which has been shown to outperform polynomial approximations from both a theoretic and empirical perspective.
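These trace-based approximations build on stochastic trace estimation in the style of Hutchinson (1990), which approximates Tr(B) by averaging z^T B z over random probe vectors z satisfying E[zz^T] = I. The following is a minimal illustrative sketch of that building block, assuming NumPy and Rademacher probes; it is not the paper's implementation.

```python
import numpy as np

def hutchinson_trace(matvec, n, num_probes=30, rng=None):
    """Unbiased stochastic estimate of Tr(B) from matrix-vector products.

    matvec: callable returning B @ z for a vector z (B itself is never formed).
    Uses Rademacher probe vectors, for which E[z^T B z] = Tr(B).
    """
    rng = np.random.default_rng() if rng is None else rng
    estimates = []
    for _ in range(num_probes):
        z = rng.choice([-1.0, 1.0], size=n)   # Rademacher probe
        estimates.append(z @ matvec(z))
    return np.mean(estimates)

# Usage: estimate the trace of a dense matrix through its matvec only.
rng = np.random.default_rng(1)
X = rng.standard_normal((200, 200))
B = X @ X.T
print(hutchinson_trace(lambda z: B @ z, 200, num_probes=100, rng=rng), np.trace(B))
# The two values should be close; the error shrinks as the number of probes grows.
```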
Inspired by recent developments in the field of probabilistic numerics (Hennig et al., 2015), in this work we propose an alternative approach for calculating the log-determinant of a matrix by expressing this computation as a Bayesian quadrature problem. In doing so, we reformulate the problem of computing an intractable quantity into an estimation problem, where the goal is to infer the correct result using tractable computations that can be carried out within a given time budget. In particular, we model the eigenvalues of a matrix A from noisy observations of Tr(A^k) obtained through stochastic trace estimation using the Taylor approximation method (Zhang & Leithead, 2007). Such a model can then be used to make predictions on the infinite series of the Taylor expansion, yielding the estimated value of the log-determinant. Aside from permitting a probabilistic approach for predicting the log-determinant, this approach inherently yields uncertainty estimates for the predicted value, which in turn serve as an indicator of the quality of our approximation.

Our contributions are as follows.

1. We propose a probabilistic approach for computing the log-determinant of a matrix which blends different elements from the literature on estimating log-determinants under a Bayesian framework.

2. We demonstrate how bounds on the expected value of the log-determinant improve our estimates by constraining the probability distribution to lie between designated lower and upper bounds.

3. Through rigorous numerical experiments on synthetic and real data, we demonstrate how our method can yield superior approximations to competing approaches, while also having the additional benefit of uncertainty quantification.

4. Finally, in order to demonstrate how this technique may be useful within a practical scenario, we employ our method to carry out parameter selection for a large-scale determinantal point process.

To the best of our knowledge, this is the first time that the approximation of log-determinants is viewed as a Bayesian inference problem, with the resulting quantification of uncertainty hitherto unexplored.

1.1 RELATED WORK

The most widely-used approaches for estimating log-determinants involve extensions of iterative algorithms, such as the Conjugate-Gradient and Lanczos methods, to obtain estimates of functions of matrices (Chen et al., 2011; Han et al., 2015) or their trace (Ubaru et al., 2016). The idea is to rewrite log-determinants as the trace of the logarithm of the matrix, and employ trace estimation techniques (Hutchinson, 1990) to obtain unbiased estimates of these. Chen et al. (2011) propose an iterative algorithm to efficiently compute the product of the logarithm of a matrix with a vector, which is achieved by computing a spline approximation to the logarithm function. A similar idea using Chebyshev polynomials has been developed by Han et al. (2015). Most recently, the Lanczos method has been extended to handle stochastic estimates of the trace and obtain probabilistic error bounds for the approximation (Ubaru et al., 2016). Blocking techniques, such as in Ipsen & Lee (2011) and Ambikasaran et al. (2016), have also been proposed.

In our work, we similarly strive to use a small number of matrix-vector products for approximating log-determinants. However, we show that by taking a Bayesian approach we can combine priors with the evidence gathered from the intermediate results of matrix-vector products involved in the aforementioned methods to obtain more accurate results. Most importantly, our proposal has the considerable advantage that it provides a full distribution on the approximated value.

Our proposal allows for the inclusion of explicit bounds on log-determinants to constrain the posterior distribution over the estimated log-determinant (Bai & Golub, 1997). Nyström approximations can also be used to bound the log-determinant, as shown by Bardenet & Titsias (2015). Similarly, Gaussian processes (Rasmussen & Williams, 2006) have been formulated directly using the eigendecomposition of their spectrum, where eigenvectors are approximated using the Nyström method (Peng & Qi, 2015). There has also been work on estimating the distribution of kernel eigenvalues by analyzing the spectrum of linear operators (Braun, 2006; Wathen & Zhu, 2015), as well as bounds on the spectra of matrices, with particular emphasis on deriving the largest eigenvalue (Wolkowicz & Styan, 1980; Braun, 2006). In this work, we directly consider bounds on the log-determinants of matrices (Bai & Golub, 1997).

2 BACKGROUND

As highlighted in the introduction, several approaches for approximating the log-determinant of a matrix rely on stochastic trace estimation for accelerating computations. This comes about as a result of the relationship between the log-determinant of a matrix and the corresponding trace of the log-matrix, whereby

\log \mathrm{Det}(A) = \mathrm{Tr}\,\log(A). \qquad (1)

Provided the matrix log(A) can be efficiently sampled, this simple identity enables the use of stochastic trace estimation techniques (Avron & Toledo, 2011; Fitzsimons et al., 2016). We elaborate further on this concept below.

2.1 STOCHASTIC TRACE ESTIMATION

[Figure 1: Expected absolute error of truncated Taylor series for stationary ν-continuous kernel matrices, plotted against the order of truncation for ν = 1, 10, 20, 30, 40, 50. The dashed grey lines indicate O(n^{-1}).]

[...] we know that the smallest eigenvalue is bounded by zero, λ_n ≥ 0. Motivated by the identity presented in (1), the Taylor series expansion (Barry & Pace, 1999; Zhang & Leithead, 2007) may be employed for evaluating the log-determinant of matrices having eigenvalues bounded between zero and one. In particular, this approach relies on the following logarithm identity,

\log(I - A) = -\sum_{k=1}^{\infty} \frac{A^k}{k}. \qquad (3)

While the infinite summation is not explicitly computable in finite time, it may be approximated by computing a truncated series instead. Furthermore, given that the trace of matrices is additive, we find

\mathrm{Tr}\,\log(I - A) \approx -\sum_{k=1}^{m} \frac{\mathrm{Tr}(A^k)}{k}. \qquad (4)

The Tr(A^k) term can be computed efficiently and recursively by propagating O(n^2) vector-matrix multiplications in a stochastic trace estimation scheme.
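Combining the truncated expansion in (4) with Hutchinson-style probes yields a log-determinant estimator that relies only on matrix-vector products: each probe z is propagated recursively as Az, A^2 z, ..., so that z^T A^k z is accumulated without ever forming A^k. The sketch below, which assumes NumPy, Rademacher probes, and eigenvalues of A in [0, 1), illustrates this classical Taylor estimator only; it is not the Bayesian model the paper builds on top of it.

```python
import numpy as np

def taylor_logdet(A, order=30, num_probes=30, rng=None):
    """Stochastic truncated-Taylor estimate of log Det(I - A).

    Implements log Det(I - A) = Tr(log(I - A)) ≈ -sum_{k=1}^m Tr(A^k) / k,
    with each Tr(A^k) replaced by a Hutchinson estimate. Powers of A are
    never formed explicitly: each probe is propagated by repeated
    matrix-vector products, costing O(n^2) per product for dense A.
    Assumes the eigenvalues of A lie in [0, 1) so the series converges.
    """
    n = A.shape[0]
    rng = np.random.default_rng() if rng is None else rng
    total = 0.0
    for _ in range(num_probes):
        z = rng.choice([-1.0, 1.0], size=n)   # Rademacher probe
        v = z.copy()
        for k in range(1, order + 1):
            v = A @ v                          # v = A^k z, built recursively
            total -= (z @ v) / k               # contributes -z^T A^k z / k
    return total / num_probes

# Usage: compare against the exact value on a small symmetric matrix.
rng = np.random.default_rng(2)
Q, _ = np.linalg.qr(rng.standard_normal((100, 100)))
A = (Q * rng.uniform(0.0, 0.9, size=100)) @ Q.T   # eigenvalues in [0, 0.9]
exact = np.linalg.slogdet(np.eye(100) - A)[1]
print(taylor_logdet(A, order=50, num_probes=50, rng=rng), exact)
```

Per-order stochastic trace estimates of this kind are exactly the noisy observations of Tr(A^k) that the proposed Bayesian model treats as evidence when inferring the remainder of the series.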