Fast Orthogonal Projection Based on Kronecker Product

Total Page:16

File Type:pdf, Size:1020Kb

Fast Orthogonal Projection Based on Kronecker Product Fast Orthogonal Projection Based on Kronecker Product Xu Zhang1;2, Felix X. Yu2;3, Ruiqi Guo3, Sanjiv Kumar3, Shengjin Wang1, Shih-Fu Chang2 1Tsinghua University, 2Columbia University, 3 Google Research Abstract projection to the input data before converting it into a bi- nary or a non-binary code. For instance, a k-bit binary code We propose a family of structured matrices to speed is simply given as, up orthogonal projections for high-dimensional data com- h(x) = sign(Rx): (1) monly seen in computer vision applications. In this, a structured matrix is formed by the Kronecker product of However, the projection operation becomes expensive as the a series of smaller orthogonal matrices. This achieves input dimensionality d increases. In practice, to achieve O(d log d) computational complexity and O(log d) space high recall in retrieval tasks, it is often desirable to use long complexity for d-dimensional data, a drastic improvement codes with a large k such that k = O(d) [22, 8, 29]. In this over the standard unstructured projections whose compu- case, the space and computational complexity of projection tational and space complexities are both O(d2). We also is O(d2), and such a high cost often becomes the bottleneck introduce an efficient learning procedure for optimizing at both learning and prediction time. For instance, when such matrices in a data dependent fashion. We demon- k = d = 50; 000, projection matrix alone takes 10 GB (sin- strate the significant advantages of the proposed approach gle precision) and projecting one vector can take 800 ms on in solving the approximate nearest neighbor (ANN) image a single core. search problem with both binary embedding and quantiza- In addition, in many applications, it is desired that the tion. Comprehensive experiments show that the proposed projection matrix is orthogonal. An orthogonal transfor- approach can achieve similar or better accuracy as the ex- mation preserves the Euclidean distance between points. isting state-of-the-art but with significantly less time and And it can also distribute the variance more evenly across memory. the dimensions. These properties are important to make several well-known techniques perform well on real-world data [10, 26]. Another motivation for orthogonal projec- 1. Introduction tion comes from the goal of learning maximally uncor- related bits while learning data-dependent binary codes. Linear projection is one of the most widely used opera- One way to achieve that is by imposing orthogonal (or tions, fundamental to many algorithms in computer vision. near orthogonal) constraints on the projections [10, 37]. Given a vector x 2 Rd, and a projection matrix R 2 Rk×d, k In binary embedding, many independent empirical exper- the linear projection computes h(x) 2 R : iments [18, 15, 31, 10] have shown that imposing orthogo- h(x) = Rx: nality constraint on the projection achieves better results for approximate nearest neighbor search. Ji et al. [19] provided In the area of large-scale search and retrieval in computer theoretical analysis to support the use of orthogonal pro- vision, linear projection is usually followed by quantiza- jection. While in quantization, the orthogonality constraint tion to convert high dimensional image features into bi- of the projection matrix makes some critical optimization nary embeddings [15, 23, 34, 25] or product codes [16, 26]. problems possible to solve [26, 35]. However, the main These compact codes have been used to speed up search challenge is that the computational complexity of building and reduce storage in image retrieval [32], feature match- an orthogonal matrix is O(d3) while space and projection ing [31], attribute recognition [28], and object categoriza- complexity is O(d2), both of which are expensive for large tion [6] among others. For example, the popularly used Lo- d. cality Sensitive Hashing (LSH) technique applies a linear In order to speed up projection for high-dimensional This work was supported by the National Science and Technology data, researchers have studied various types of structured Support Program under Grant No. 2013BAK02B04 and Initiative Scientic matrices, including Hadamard matrix, Toeplitz matrix and Research Program of Tsinghua University under Grant No. 20141081253. Felix X. Yu was supported in part by the IBM PhD Fellowship Award when Circulant matrix. The idea behind projecting with struc- working on this project. tured matrices instead of the traditional unstructured matri- ces is that one can exploit the structure to achieve better Method Time Space Time (Learning) space and computational complexity than quadratic. Pro- Unstructured [10] O(d2) O(d2) O(Nd2) jecting vectors with structured matrices have been studied Bilinear [8] O(d1:5) O(d) O(Nd1:5) in a variety of contexts. In dimensionality reduction, Alon Circulant [37] O(d log d) O(d) O(Nd log d) 2 and Chazelle [1] studied projection using a sparse Gaussian Kronecker (ours) O(d log d) O(log d) O(Nd log d) matrix paired with a Hadamard matrix. This was followed Table 1. Computational and space costs. d: data dimensionality, et al by Dasgupta . [5] who used a combination of permu- N: number of training samples. The space and time complexities tation and diagonal matrices along with Hadamard matrix de×de of Kronecker projection are based on Aj 2 R , 8j, where de in LSH. These variants of Hadamard matrices were fur- is a small constant. ther used by Jacques et al. in compressed sensing [14] and Le et al. [21] in kernel approximation. These works uti- lize the well-known fast Johnson-Lindenstrauss transform to compute Kronecker projection with a time complexity of to achieve O(d log d) time complexity. Researchers have O(d log d); 3) By changing the sizes of the small orthog- also used Toeplitz-structured matrices [2, 11] and circulant onal matrices, the resulting matrix has varying number of matrices [12, 37, 38, 4] for projection, which also obtains parameters (degrees of freedom), making it easier to con- O(d log d) time complexity. trol performance-speed trade-off; 4) The space complexity However, the main problem with the commonly used is O(log d) in comparison to O(d) for most other structured structured matrices is that they are not orthogonal. Al- matrices. though the Hadamard matrix is orthogonal by itself, it is We study Kronecker projection in two application set- typically used in combination with other matrices (e.g., tings: binary embedding, and quantization (Section 2). sparse Gaussian or diagonal/permutation matrices), which We propose a randomized version (Section 4) and a convert it into a non-orthogonal matrix. Besides this, there data-dependent learned version (Section 5) of such ma- is another problem with using the Hadamard matrix di- trices. We conduct extensive image retrieval experiments rectly: there are no free-parameters in Hadamard matri- on ImageNet-32768, ImageNet-16384, and Flickr-16384 ces. Thus, one cannot learn an Hadamard matrix in a data- datasets (Section 6). The results show that with fixed num- dependent fashion. ber of bits, the method needs much less space and time than In this work we introduce a very flexible family of or- the state-of-the-art methods to achieve similar or better per- thogonal structured matrices formed by Kronecker prod- formance. uct of small element matrices, leading to substantially re- duced space and computational complexity. One can vary 2. Background the number of free parameters in these matrices to adapt to We begin by reviewing the two settings where the fast or- the needs of a given application. The most related work to thogonal projection based on Kronecker Product is applied: our proposed method is the bilinear projection [8], which is binary embedding, and quantization. also orthogonal and faster than quadratic. We show that the bilinear method can be viewed as a special case of the pro- 2.1. Binary Embedding posed method. Moreover, our structure is more flexible and 1:5 Binary embedding methods map original vectors into k- has lower computational complexity than O(d ) of the bi- k linear method. Table 1 summarizes the space and time com- bit binary vectors such that h(x) 2 f+1; −1g . Since data- plexity of the proposed method in comparison with other points are stored as binary codes, the storage cost is reduced structured matrices. significantly even when k = O(d). The approximate near- est neighbors are retrieved using Hamming distance in the 1.1. Our Contributions binary code space, which can be computed very efficiently using table lookup, or the POPCNT instruction on modern In this work, we propose a novel method to construct a computer architectures. family of orthogonal matrices by using the Kronecker prod- Locality Sensitive Hashing (LSH) is a popular method uct of a series of small orthogonal matrices. Formally, the for generating binary codes that preserves cosine dis- Kronecker projection matrix is defined as tance [3, 27] and typically uses randomized projections in R = A1 ⊗ A2 ⊗ · · · ⊗ AM ; (1) to generate binary codes. However, many works have where Aj, j = 1; ··· ;M are small orthogonal matrices. shown the advantages of learning data-dependent binary We term them as the element matrices. Such Kronecker codes by optimizing the projection matrix R in (1) instead product matrices have the following unique advantages: of using the randomized ones [20, 25, 36, 24, 8]. Specifi- 1) They satisfy orthogonality constraint and therefore pre- cally, Iterative Quantization (ITQ) [10] showed that by us- serve Euclidean distance in the original space; 2) Similar to ing a PCA projection followed by a learned orthogonal pro- Hadamard and circulant matrices, there exists fast algorithm jection, the resulting binary embedding outperforms non- orthogonal or randomized orthogonal projection in retrieval 3.
Recommended publications
  • Block Kronecker Products and the Vecb Operator* CORE View
    View metadata, citation and similar papers at core.ac.uk brought to you by CORE provided by Elsevier - Publisher Connector Block Kronecker Products and the vecb Operator* Ruud H. Koning Department of Economics University of Groningen P.O. Box 800 9700 AV, Groningen, The Netherlands Heinz Neudecker+ Department of Actuarial Sciences and Econometrics University of Amsterdam Jodenbreestraat 23 1011 NH, Amsterdam, The Netherlands and Tom Wansbeek Department of Economics University of Groningen P.O. Box 800 9700 AV, Groningen, The Netherlands Submitted hv Richard Rrualdi ABSTRACT This paper is concerned with two generalizations of the Kronecker product and two related generalizations of the vet operator. It is demonstrated that they pairwise match two different kinds of matrix partition, viz. the balanced and unbalanced ones. Relevant properties are supplied and proved. A related concept, the so-called tilde transform of a balanced block matrix, is also studied. The results are illustrated with various statistical applications of the five concepts studied. *Comments of an anonymous referee are gratefully acknowledged. ‘This work was started while the second author was at the Indian Statistiral Institute, New Delhi. LINEAR ALGEBRA AND ITS APPLICATIONS 149:165-184 (1991) 165 0 Elsevier Science Publishing Co., Inc., 1991 655 Avenue of the Americas, New York, NY 10010 0024-3795/91/$3.50 166 R. H. KONING, H. NEUDECKER, AND T. WANSBEEK INTRODUCTION Almost twenty years ago Singh [7] and Tracy and Singh [9] introduced a generalization of the Kronecker product A@B. They used varying notation for this new product, viz. A @ B and A &3B. Recently, Hyland and Collins [l] studied the-same product under rather restrictive order conditions.
    [Show full text]
  • The Kronecker Product a Product of the Times
    The Kronecker Product A Product of the Times Charles Van Loan Department of Computer Science Cornell University Presented at the SIAM Conference on Applied Linear Algebra, Monterey, Califirnia, October 26, 2009 The Kronecker Product B C is a block matrix whose ij-th block is b C. ⊗ ij E.g., b b b11C b12C 11 12 C = b b ⊗ 21 22 b21C b22C Also called the “Direct Product” or the “Tensor Product” Every bijckl Shows Up c11 c12 c13 b11 b12 c21 c22 c23 b21 b22 ⊗ c31 c32 c33 = b11c11 b11c12 b11c13 b12c11 b12c12 b12c13 b11c21 b11c22 b11c23 b12c21 b12c22 b12c23 b c b c b c b c b c b c 11 31 11 32 11 33 12 31 12 32 12 33 b c b c b c b c b c b c 21 11 21 12 21 13 22 11 22 12 22 13 b21c21 b21c22 b21c23 b22c21 b22c22 b22c23 b21c31 b21c32 b21c33 b22c31 b22c32 b22c33 Basic Algebraic Properties (B C)T = BT CT ⊗ ⊗ (B C) 1 = B 1 C 1 ⊗ − − ⊗ − (B C)(D F ) = BD CF ⊗ ⊗ ⊗ B (C D) =(B C) D ⊗ ⊗ ⊗ ⊗ C B = (Perfect Shuffle)T (B C)(Perfect Shuffle) ⊗ ⊗ R.J. Horn and C.R. Johnson(1991). Topics in Matrix Analysis, Cambridge University Press, NY. Reshaping KP Computations 2 Suppose B, C IRn n and x IRn . ∈ × ∈ The operation y =(B C)x is O(n4): ⊗ y = kron(B,C)*x The equivalent, reshaped operation Y = CXBT is O(n3): y = reshape(C*reshape(x,n,n)*B’,n,n) H.V.
    [Show full text]
  • The Partition Algebra and the Kronecker Product (Extended Abstract)
    FPSAC 2013, Paris, France DMTCS proc. (subm.), by the authors, 1–12 The partition algebra and the Kronecker product (Extended abstract) C. Bowman1yand M. De Visscher2 and R. Orellana3z 1Institut de Mathematiques´ de Jussieu, 175 rue du chevaleret, 75013, Paris 2Centre for Mathematical Science, City University London, Northampton Square, London, EC1V 0HB, England. 3 Department of Mathematics, Dartmouth College, 6188 Kemeny Hall, Hanover, NH 03755, USA Abstract. We propose a new approach to study the Kronecker coefficients by using the Schur–Weyl duality between the symmetric group and the partition algebra. Resum´ e.´ Nous proposons une nouvelle approche pour l’etude´ des coefficients´ de Kronecker via la dualite´ entre le groupe symetrique´ et l’algebre` des partitions. Keywords: Kronecker coefficients, tensor product, partition algebra, representations of the symmetric group 1 Introduction A fundamental problem in the representation theory of the symmetric group is to describe the coeffi- cients in the decomposition of the tensor product of two Specht modules. These coefficients are known in the literature as the Kronecker coefficients. Finding a formula or combinatorial interpretation for these coefficients has been described by Richard Stanley as ‘one of the main problems in the combinatorial rep- resentation theory of the symmetric group’. This question has received the attention of Littlewood [Lit58], James [JK81, Chapter 2.9], Lascoux [Las80], Thibon [Thi91], Garsia and Remmel [GR85], Kleshchev and Bessenrodt [BK99] amongst others and yet a combinatorial solution has remained beyond reach for over a hundred years. Murnaghan discovered an amazing limiting phenomenon satisfied by the Kronecker coefficients; as we increase the length of the first row of the indexing partitions the sequence of Kronecker coefficients obtained stabilises.
    [Show full text]
  • Kronecker Products
    Copyright ©2005 by the Society for Industrial and Applied Mathematics This electronic version is for personal use and may not be duplicated or distributed. Chapter 13 Kronecker Products 13.1 Definition and Examples Definition 13.1. Let A ∈ Rm×n, B ∈ Rp×q . Then the Kronecker product (or tensor product) of A and B is defined as the matrix a11B ··· a1nB ⊗ = . ∈ Rmp×nq A B . .. . (13.1) am1B ··· amnB Obviously, the same definition holds if A and B are complex-valued matrices. We restrict our attention in this chapter primarily to real-valued matrices, pointing out the extension to the complex case only where it is not obvious. Example 13.2. = 123 = 21 1. Let A 321and B 23. Then 214263 B 2B 3B 234669 A ⊗ B = = . 3B 2BB 634221 694623 Note that B ⊗ A = A ⊗ B. ∈ Rp×q ⊗ = B 0 2. For any B , I2 B 0 B . Replacing I2 by In yields a block diagonal matrix with n copies of B along the diagonal. 3. Let B be an arbitrary 2 × 2 matrix. Then b11 0 b12 0 0 b11 0 b12 B ⊗ I2 = . b21 0 b22 0 0 b21 0 b22 139 “ajlbook” — 2004/11/9 — 13:36 — page 139 — #147 From "Matrix Analysis for Scientists and Engineers" Alan J. Laub. Buy this book from SIAM at www.ec-securehost.com/SIAM/ot91.html Copyright ©2005 by the Society for Industrial and Applied Mathematics This electronic version is for personal use and may not be duplicated or distributed. 140 Chapter 13. Kronecker Products The extension to arbitrary B and In is obvious.
    [Show full text]
  • A Constructive Arbitrary-Degree Kronecker Product Decomposition of Tensors
    A CONSTRUCTIVE ARBITRARY-DEGREE KRONECKER PRODUCT DECOMPOSITION OF TENSORS KIM BATSELIER AND NGAI WONG∗ Abstract. We propose the tensor Kronecker product singular value decomposition (TKPSVD) that decomposes a real k-way tensor A into a linear combination of tensor Kronecker products PR (d) (1) with an arbitrary number of d factors A = j=1 σj Aj ⊗ · · · ⊗ Aj . We generalize the matrix (i) Kronecker product to tensors such that each factor Aj in the TKPSVD is a k-way tensor. The algorithm relies on reshaping and permuting the original tensor into a d-way tensor, after which a polyadic decomposition with orthogonal rank-1 terms is computed. We prove that for many (1) (d) different structured tensors, the Kronecker product factors Aj ;:::; Aj are guaranteed to inherit this structure. In addition, we introduce the new notion of general symmetric tensors, which includes many different structures such as symmetric, persymmetric, centrosymmetric, Toeplitz and Hankel tensors. Key words. Kronecker product, structured tensors, tensor decomposition, TTr1SVD, general- ized symmetric tensors, Toeplitz tensor, Hankel tensor AMS subject classifications. 15A69, 15B05, 15A18, 15A23, 15B57 1. Introduction. Consider the singular value decomposition (SVD) of the fol- lowing 16 × 9 matrix A~ 0 1:108 −0:267 −1:192 −0:267 −1:192 −1:281 −1:192 −1:281 1:102 1 B 0:417 −1:487 −0:004 −1:487 −0:004 −1:418 −0:004 −1:418 −0:228C B C B−0:127 1:100 −1:461 1:100 −1:461 0:729 −1:461 0:729 0:940 C B C B−0:748 −0:243 0:387 −0:243 0:387 −1:241 0:387 −1:241 −1:853C B C B 0:417
    [Show full text]
  • Introduction to Kronecker Products
    Math 515 Fall, 2010 Introduction to Kronecker Products If A is an m × n matrix and B is a p × q matrix, then the Kronecker product of A and B is the mp × nq matrix 2 3 a11B a12B ··· a1nB 6 7 6 a21B a22B ··· a2nB 7 A ⊗ B = 6 . 7 6 . 7 4 . 5 am1B am2B ··· amnB Note that if A and B are large matrices, then the Kronecker product A⊗B will be huge. MATLAB has a built-in function kron that can be used as K = kron(A, B); However, you will quickly run out of memory if you try this for matrices that are 50 × 50 or larger. Fortunately we can exploit the block structure of Kronecker products to do many compu- tations involving A ⊗ B without actually forming the Kronecker product. Instead we need only do computations with A and B individually. To begin, we state some very simple properties of Kronecker products, which should not be difficult to verify: • (A ⊗ B)T = AT ⊗ BT • If A and B are square and nonsingular, then (A ⊗ B)−1 = A−1 ⊗ B−1 • (A ⊗ B)(C ⊗ D) = AC ⊗ BD The next property we want to consider involves the matrix-vector multiplication y = (A ⊗ B)x; where A 2 Rm×n and B 2 Rp×q. Thus A ⊗ B 2 Rmp×nq, x 2 Rnq, and y 2 Rmp. Our goal is to exploit the block structure of the Kronecker product matrix to compute y without explicitly forming (A ⊗ B). To see how this can be done, first partition the vectors x and y as 2 3 2 3 x1 y1 6 x 7 6 y 7 6 2 7 q 6 2 7 p x = 6 .
    [Show full text]
  • Kronecker Products and Matrix Calculus in System Theory
    772 IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS, VOL. CAS-25, NO. 9, ISEFTEMBER 1978 Kronecker Products and Matrix Calculus in System Theory JOHN W. BREWER I Absfrucr-The paper begins with a review of the algebras related to and are based on the developments of Section IV. Also, a Kronecker products. These algebras have several applications in system novel and interesting operator notation will ble introduced theory inclluding the analysis of stochastic steady state. The calculus of matrk valued functions of matrices is reviewed in the second part of the in Section VI. paper. This cakuhs is then used to develop an interesting new method for Concluding comments are placed in Section VII. the identification of parameters of linear time-invariant system models. II. NOTATION I. INTRODUCTION Matrices will be denoted by upper case boldface (e.g., HE ART of differentiation has many practical ap- A) and column matrices (vectors) will be denoted by T plications in system analysis. Differentiation is used lower case boldface (e.g., x). The kth row of a matrix such. to obtain necessaryand sufficient conditions for optimiza- as A will be denoted A,. and the kth colmnn will be tion in a.nalytical studies or it is used to obtain gradients denoted A.,. The ik element of A will be denoted ujk. and Hessians in numerical optimization studies. Sensitiv- The n x n unit matrix is denoted I,,. The q-dimensional ity analysis is yet another application area for. differentia- vector which is “I” in the kth and zero elsewhereis called tion formulas. In turn, sensitivity analysis is applied to the the unit vector and is denoted design of feedback and adaptive feedback control sys- tems.
    [Show full text]
  • Representing Words in a Geometric Algebra
    Representing Words in a Geometric Algebra Arjun Mani Princeton University [email protected] Abstract Word embeddings, or the mapping of words to vectors, demonstrate interesting properties such as the ability to solve analogies. However, these models fail to capture many key properties of words (e.g. hierarchy), and one might ask whether there are objects that can better capture the complexity associated with words. Geometric algebra is a branch of mathematics that is characterized by algebraic operations on objects having geometric meaning. In this system objects are linear combinations of subspaces rather than vectors, and certain operations have the significance of intersecting or unioning these subspaces. In this paper we introduce and motivate geometric algebra as a better representation for word embeddings. Next we describe how to implement the geometric product and interestingly show that neural networks can learn this product. We then introduce a model that represents words as objects in this algebra and benchmark it on large corpuses; our results show some promise on traditional word embedding tasks. Thus, we lay the groundwork for further investigation of geometric algebra in word embeddings. 1 Motivation Words are complex creatures; a single word has one (or multiple) semantic meanings, plays a particular syntactic role in a sentence, and has complex relationships with different words (simple examples of such relationships are synonyms and antonyms). The representations of words as vectors, and the ability of these models to capture some of this complexity, has been a seminal advance in the field of natural language processing. Such word embedding models (first introduced by Mikolov et al.
    [Show full text]
  • Spectral Properties of Sums of Certain Kronecker Products
    View metadata, citation and similar papers at core.ac.uk brought to you by CORE provided by Elsevier - Publisher Connector Linear Algebra and its Applications 431 (2009) 1691–1701 Contents lists available at ScienceDirect Linear Algebra and its Applications journal homepage: www.elsevier.com/locate/laa Spectral properties of sums of certain Kronecker products ∗ Jun-e Feng a, James Lam b, , Yimin Wei c a School of Mathematics, Shandong University, Jinan 250100, People’s Republic of China b Department of Mechanical Engineering, University of Hong Kong, Pokfulam Road, Hong Kong c School of Mathematics, Fudan University, Shanghai 20043, People’s Republic of China ARTICLE INFO ABSTRACT Article history: We introduce two kinds of sums of Kronecker products, and their Received 22 September 2008 induced operators. We study the algebraic properties of these two Accepted 4 June 2009 kinds of matrices and their associated operators; the properties Available online 4 July 2009 include their eigenvalues, their eigenvectors, and the relationships Submitted by C.K. Li between their spectral radii or spectral abscissae. Furthermore, two projected matrices of these Kronecker products and their induced Keywords: operators are also studied. Eigenvalue © 2009 Elsevier Inc. All rights reserved. Eigenvector Spectrum Kronecker product Spectral abscissa Spectral radius 1. Introduction Kronecker product, also called direct product or tensor product, is an operation which owes its origin from group theory and has important applications in physics [6]. Techniques involving Kronecker products have been successfully applied in many fields of matrix theory because of its computational and notational advantages, see [1,2,5,7,8,12,13] and references therein.
    [Show full text]
  • Determinants in the Kronecker Product of Matrices: the Incidence Matrix of a Complete Graph
    DETERMINANTS IN THE KRONECKER PRODUCT OF MATRICES: THE INCIDENCE MATRIX OF A COMPLETE GRAPH CHRISTOPHER R.H. HANUSA AND THOMAS ZASLAVSKY Abstract. We investigate the least common multiple of all sub- determinants, lcmd(A⊗ B), of a Kronecker product of matrices, of which one is an integral matrix A with two columns and the other is the incidence matrix of a complete graph with n vertices. We prove that this quantity is the least common multiple of lcmd(A) to the power n − 1 and certain binomial functions of the entries of A. 1. Introduction In a study of non-attacking placements of chess pieces, Chaiken, Hanusa, and Zaslavsky [1] were led to a quasipolynomial formula that depends in part on the least common multiple of the determinants of all square submatrices of a certain Kronecker product matrix, namely, the Kronecker product of an integral 2 ×2 matrix A with the incidence matrix of a complete graph. We give a compact expression for the least common multiple of the subdeterminants of this product matrix, generalized to A of order m × 2. 2. Background Kronecker product. For matrices A =(aij)m×k and B =(bij)n×l, the Kronecker product A⊗B is defined to be the mn×kl block matrix a11B ··· a1kB . . .. . am1B ··· amkB It is known (see [2], for example) that when A and B are square matri- ces of orders m and n, respectively, then det(A⊗B)=det(A)n det(B)m. 2000 Mathematics Subject Classification. 15A15, 05C50, 15A57. Key words and phrases. Kronecker product, determinant, least common multi- ple, incidence matrix of complete graph, matrix minor.
    [Show full text]
  • Matrix Differential Calculus with Applications to Simple, Hadamard, and Kronecker Products
    JOURNAL OF MATHEMATICAL PSYCHOLOGY 29, 414492 (1985) Matrix Differential Calculus with Applications to Simple, Hadamard, and Kronecker Products JAN R. MAGNUS London School of Economics AND H. NEUDECKER University of Amsterdam Several definitions are in use for the derivative of an mx p matrix function F(X) with respect to its n x q matrix argument X. We argue that only one of these definitions is a viable one, and that to study smooth maps from the space of n x q matrices to the space of m x p matrices it is often more convenient to study the map from nq-space to mp-space. Also, several procedures exist for a calculus of functions of matrices. It is argued that the procedure based on differentials is superior to other methods of differentiation, and leads inter alia to a satisfactory chain rule for matrix functions. c! 1985 Academic Press, Inc 1. INTRODUCTION In mathematical psychology (among other disciplines) matrix calculus has become an important tool. Several definitions are in use for the derivative of a matrix function F(X) with respect to its matrix argument X (see Dwyer & MacPhail, 1948; Dwyer, 1967; Neudecker, 1969; McDonald & Swaminathan, 1973; MacRae, 1974; Balestra, 1976; Nel, 1980; Rogers, 1980; Polasek, 1984). We shall see that only one of these definitions is appropriate. Also, several procedures exist for a calculus of functions of matrices. We shall argue that the procedure based on differentials is superior to other methods of differentiation. These, then, are the two purposes of this paper. We have been inspired to write this paper by Bentler and Lee’s (1978) note.
    [Show full text]
  • Hadamard, Khatri-Rao, Kronecker and Other Matrix Products
    INTERNATIONAL JOURNAL OF °c 2008 Institute for Scienti¯c INFORMATION AND SYSTEMS SCIENCES Computing and Information Volume 4, Number 1, Pages 160{177 HADAMARD, KHATRI-RAO, KRONECKER AND OTHER MATRIX PRODUCTS SHUANGZHE LIU AND GOTZÄ TRENKLER This paper is dedicated to Heinz Neudecker, with a®ection and appreciation Abstract. In this paper we present a brief overview on Hadamard, Khatri- Rao, Kronecker and several related non-simple matrix products and their prop- erties. We include practical applications, in di®erent areas, of some of the al- gebraic results. Key Words. Hadamard (or Schur) product, Khatri-Rao product, Tracy- Singh product, Kronecker product, vector cross product, positive de¯nite vari- ance matrix, multi-way model, linear matrix equation, signal processing. 1. Introduction Matrices and matrix operations are studied and applied in many areas such as engineering, natural and social sciences. Books written on these topics include Berman et al. (1989), Barnett (1990), LÄutkepohl (1996), Schott (1997), Magnus and Neudecker (1999), Hogben (2006), Schmidt and Trenkler (2006), and espe- cially Bernstein (2005) with a number of referred books classi¯ed as for di®erent areas in the preface. In this paper, we will in particular overview briefly results (with practical applications) involving several non-simple matrix products, as they play a very important role nowadays and deserve to receive such attention. The Hadamard (or Schur) and Kronecker products are studied and applied widely in matrix theory, statistics, system theory and other areas; see, e.g. Styan (1973), Neudecker and Liu (1995), Neudecker and Satorra (1995), Trenkler (1995), Rao and Rao (1998), Zhang (1999), Liu (2000a, 2000b) and Van Trees (2002).
    [Show full text]