arXiv:1511.01306v2 [cs.NA] 3 Feb 2016

About Notations in Multiway Array Processing

Jérémy E. Cohen

Abstract—This paper gives an overview of notations used in multiway array processing. We redefine the vectorization and matricization operators to comply with some properties of the Kronecker product. The tensor product and the Kronecker product are also represented with two different symbols, and it is shown how these notations lead to clearer expressions for multiway array operations. Finally, the paper recalls the useful yet little-known properties of the array normal law with the suggested notations.

INTRODUCTION

Since tensors have become a popular topic in data science and signal processing, a large number of papers has been published to describe their properties [1]–[7]. Yet a consensus on notations has not been found. Some authors never use these notations [2], [8], while others make use of the Tucker and Kruskal operators [3], [9]. There are at least three different n-way product notations in the literature [1], [3], [10], and two different unfolding methods [10]. The Kronecker product alone is sometimes referred to as the tensor product in the community [1], [10], [11], even though the tensor product is the very foundation of tensor algebra; instead, authors sometimes refer to the outer product.

The tensor product itself is almost never used, even though it is a wider generalization, since it applies to tensors of any order and without referring to a basis for each vector space. The Kronecker product, which may be seen as the expression of the tensor product of vectors in the canonical basis, applies to arrays of coordinates only. Some authors suggest different symbols for the two products, e.g. vec(a ∘ b) = b ⊠ a [11], but the same symbol ⊗ has been used historically and through the literature for both [12], which leads to confusion for matrices when a single symbol is given to both products.

The main goal of this paper is therefore to set notations for array operations that are consistent with one another and with well-known manipulations on arrays. This also leads to redefining some of the usual operators, namely the vectorization and the matricization. Making the difference between a general tensor product and a basis-dependent Kronecker product is of crucial importance: using multilinear operators while manipulating arrays with matricizations and vectorizations at the same time otherwise leads to confusion. We recommend that the vectorization and matricization operators are computed so that tensor products and Kronecker products do not have to be swapped versions of each other, which leads to definitions compatible with the notations used in quantum physics for decades [13] and with algebraic geometry [10]. In other words, the suggested notations will be compatible with definitions from other fields.
The first section introduces some notations for all operators used extensively in multiway array processing. Then some useful properties of the suggested notations are given in Section II. Finally, Section III exposes the array normal law as defined by Hoff [9]. It is shown that, using the suggested notations, the array normal law can be a handy tool for multiway array analysis.

I. A FEW DEFINITIONS

A. About tensor products

First let us define the tensor product as given in [14]. We also give the definition of the outer product of two vectors in a given basis, and of the Kronecker product [3], [12].

Definition 1: Let E and F be two vector spaces on a field K. There is a vector space E ⊗ F, called the tensor product space, and a bilinear mapping ⊗ : E × F → E ⊗ F, so that for every vector space G and for all bilinear mappings g from E × F to G, there is one and only one linear mapping h from E ⊗ F to G defined by

$$g(x, y) = h(x \otimes y).$$

This definition is not basis dependent, which means that the tensor product applies not solely to arrays, and that real-valued tensors are items from a tensor space $\bigotimes_{i \leq N} \mathbb{R}^{n_i}$. When a basis is given for the tensor space, the tensor is represented by an array of coordinates. Two well-known basis-dependent representations of the tensor product are the outer product and the Kronecker product.

Definition 2: The outer product of two vectors a ∈ R^{p_1} and b ∈ R^{p_2} is denoted by a ⊗ b and is defined by:

$$(a \otimes b)_{ij} = a_i b_j.$$

Definition 3: The Kronecker product of two arrays A ∈ R^{p_1 × q_1} and B ∈ R^{p_2 × q_2} is denoted by A ⊠ B and is defined by:

$$A \boxtimes B = \begin{bmatrix} a_{11}B & a_{12}B & \cdots & a_{1 q_1}B \\ a_{21}B & a_{22}B & \cdots & a_{2 q_1}B \\ \vdots & & & \vdots \\ a_{p_1 1}B & a_{p_1 2}B & \cdots & a_{p_1 q_1}B \end{bmatrix}.$$

These definitions can be extended in a trivial manner to more than two arrays. Ambiguity happens when the used tensor product maps to the matrix algebra but no basis is given: the Kronecker product coincides with the tensor product only once a basis is provided. We then need the distinct symbol ⊠ for the basis-dependent product; otherwise the same symbol denotes two different operations, which leads to unnecessarily complicated equations.
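The distinction between the two products is easy to visualize numerically. Below is a minimal NumPy sketch (not from the paper; the vectors are arbitrary) showing that the Kronecker product of two vectors contains the same coordinates as their outer product, read in lexicographic order:

```python
import numpy as np

a = np.array([1.0, 2.0])         # vector in R^2
b = np.array([3.0, 4.0, 5.0])    # vector in R^3

# Outer product (a ⊗ b) expressed in the canonical basis: (a ⊗ b)_ij = a_i b_j
outer = np.outer(a, b)           # 2 x 3 array of coordinates

# Kronecker product a ⊠ b: the same coordinates stacked into one vector
kron = np.kron(a, b)             # length-6 vector

# Reading the outer product in lexicographic order (last index varying fastest)
# recovers the Kronecker product with no permutation of a and b.
assert np.array_equal(kron, outer.reshape(-1))
```

This is precisely the convention adopted by the suggested vectorization: row-major reading of the array of coordinates never swaps the factors.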
Moreover, the Kronecker product and the tensor product in matrix algebra coincide when a basis is provided. It is shown in the rest of this paper that this notation eases array manipulation significantly.

On the other hand, the outer product does not need a third symbol, since there is no ambiguity with the tensor product: whenever dealing with arrays, the tensor product of vectors is to be read as an outer product, and in any other case the outer product makes no sense. To comply with usual notations, the symbol ∘, both proposed and later abandoned by Comon [15], [16], should be dropped, since it also refers to composition. Also, the usual notation for the tensor product is ⊗, except in part of the multiway array processing community.

Multiway array processing is based on applying multilinear operators, which are themselves higher order tensors (see Appendix A), on data arrays to mine these data. A well-known example is the multilinear operator obtained by High Order Singular Value Decomposition (HOSVD) [3], yielding compression for the data array. However, when fixing a basis, there is not a single definition of a product of two tensors. In particular, if the U_i are matrices in R^{q_i × n_i}, the operator ⊗_{i≤N} U_i acting on tensors in ⊗_{i≤N} R^{n_i} cannot be expressed in a given basis with outer products and Kronecker products [17]. Still, for computing this product of tensors, some authors defined the k-way product [2], [3]:

$$(Y \bullet_k U)_{i_1, \dots, i_N} = \sum_{j=1}^{n_k} Y_{i_1, \dots, i_{k-1}, j, i_{k+1}, \dots, i_N}\, U_{i_k j}.$$

The bullet symbol • should be preferred to the multiplication symbol ×, since the latter may already have other meanings. Moreover, the k-way product is the contraction of a matrix along its second mode with a tensor along its k-th mode. Now, operators ⊗_{i≤N} U_i are already extensively used in quantum mechanics among others, and their action on a tensor is denoted with an implicit multilinear product:

$$\left( \bigotimes_{i=1}^{N} U_i \right) Y := Y \bullet_1 U_1 \cdots \bullet_N U_N,$$

which is to be read as N contractions of the second mode of each matrix in the tensor product with the i-th mode of the tensor. This also provides a computation mean for the application of an operator on multiway arrays: one can simply sequentially unfold the tensor in each mode, multiply the resulting matrix with the matrix operator U_i expressed in the right basis, and refold the tensor. For matrices, the 2-mode product is indeed well known:

$$M \bullet_1 U_1 \bullet_2 U_2 = U_1 M U_2^{T}.$$
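These contraction rules are easy to mirror in software. Below is a minimal NumPy sketch (not from the paper; the function name `kway`, the 0-based mode indices, and the test matrices are illustrative choices) of the k-way product, checked against the well-known 2-mode identity above:

```python
import numpy as np

def kway(Y, U, k):
    """k-way product Y •_k U: contract the second mode of matrix U with
    mode k of array Y (0-based mode index)."""
    # tensordot sums over axis k of Y and axis 1 of U, appending the new
    # axis last; moveaxis restores it to position k.
    return np.moveaxis(np.tensordot(Y, U, axes=(k, 1)), -1, k)

# Sanity check on matrices: M •_1 U1 •_2 U2 == U1 M U2^T
M = np.arange(6.0).reshape(2, 3)
U1 = np.random.randn(4, 2)
U2 = np.random.randn(5, 3)
out = kway(kway(M, U1, 0), U2, 1)   # modes 1 and 2 of the text, 0-based here
assert np.allclose(out, U1 @ M @ U2.T)
```

Chaining `kway` over all modes then implements the implicit multilinear product of the operator ⊗ U_i on an array.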

B. About manipulation of arrays

A major claim of this paper is to modify the usual definition of the vectorization operator for arrays [18]. For now, the usual definition states that an array should be vectorized by choosing elements in the reverse lexicographic order of indices, i.e. columns after columns in the first mode, sliding along the second mode, then the third mode and so on (see Figure 1a).

Yet every tensor can be expressed as a sum of R tensor products of vectors by the following model, called the Canonical Polyadic (CP) decomposition, or PARAFAC [19], [20]:

$$Y = \sum_{r=1}^{R} \bigotimes_{i=1}^{N} a_r^{(i)},$$

and using the columnwise vectorization is not compatible with the way the Kronecker product and the CP decomposition are defined. To see this, take a matrix M defined by an outer product M = a ⊗ b. When vectorizing M columnwise, it is well known that vectors a and b are swapped to express M as a Kronecker product, since in its definition the Kronecker product makes the second index vary first: vec(M) = b ⊠ a. The point here is that once the Kronecker product has been defined, it fixes the direction for the vectorization of any array, in order to get a direct property like vec(⊗_i a_i) = ⊠_i a_i. Thus we give the following definition of the vectorization operator of an array, along with some basic properties in Table I.

Definition 4: The vectorization of an array Y, denoted vec(Y), is the isomorphism defined as the following function of its CP decomposition:

$$\mathrm{vec} : \mathbb{R}^{n_1 \times \cdots \times n_N} \to \mathbb{R}^{\prod_i n_i \times 1}, \qquad \sum_{r=1}^{R} \bigotimes_{i=1}^{N} a_r^{(i)} \;\mapsto\; \sum_{r=1}^{R} \mathop{\boxtimes}_{i=1}^{N} a_r^{(i)}.$$

This differs from the usual equality vec(a ⊗ b) = b ⊠ a. This definition means the elements in the array are taken in the lexicographic order of their index, i.e. along the last mode, then along the mode N − 1 and so on (see Figure 1b). In the same spirit, we define the matricization (or unfolding) of an array similarly to what is called flattening in algebraic geometry [10]. Some properties are given in Table I.

Definition 5: The matricization of an array Y along the i-th mode, denoted [Y]_{(i)}, is the isomorphism defined as the following function of its CP decomposition:

$$[\,\cdot\,]_{(i)} : \mathbb{R}^{n_1 \times \cdots \times n_N} \to \mathbb{R}^{n_i \times \prod_{j \neq i} n_j}, \qquad \sum_{r=1}^{R} \bigotimes_{j=1}^{N} a_r^{(j)} \;\mapsto\; \sum_{r=1}^{R} a_r^{(i)} \left( \mathop{\boxtimes}_{j \neq i} a_r^{(j)} \right)^{T}.$$

One last useful product for multiway array processing is the (basis-dependent) Khatri–Rao product:

$$A \odot B = \left[ A_{:1} \boxtimes B_{:1}, \dots, A_{:p} \boxtimes B_{:p} \right].$$

Definition 5 can be exploited to obtain the unfoldings of a three-way array Y as shown in Figure 2. For example, a compact MATLAB R2014b code for computing the three unfoldings is:

Y1 = reshape(permute(Y,[1,3,2]), n1, n2*n3);
Y2 = reshape(permute(Y,[2,3,1]), n2, n1*n3);
Y3 = reshape(permute(Y,[3,2,1]), n3, n1*n2);
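The same operators become one-liners in a row-major language. The sketch below (not from the paper; the function names `vec` and `unfold` and the 0-based modes are illustrative) implements the suggested definitions with NumPy and checks the defining property vec(a ⊗ b ⊗ c) = a ⊠ b ⊠ c:

```python
import numpy as np

def vec(Y):
    """Suggested vectorization: lexicographic order, last mode varying fastest.
    This is exactly NumPy's default row-major (C-order) raveling."""
    return Y.reshape(-1)

def unfold(Y, i):
    """Suggested i-mode matricization (0-based): mode i indexes the rows, the
    remaining modes index the columns in lexicographic order."""
    return np.moveaxis(Y, i, 0).reshape(Y.shape[i], -1)

a, b, c = np.array([1.0, 2.0]), np.array([1.0, 3.0]), np.array([1.0, 5.0])
Y = np.einsum('i,j,k->ijk', a, b, c)     # outer product a ⊗ b ⊗ c

# vec(a ⊗ b ⊗ c) = a ⊠ b ⊠ c, with no swapping of the factors
assert np.array_equal(vec(Y), np.kron(np.kron(a, b), c))

# [Y]_(1) = a (b ⊠ c)^T, again in the same mode order (mode 0 here)
assert np.array_equal(unfold(Y, 0), np.outer(a, np.kron(b, c)))
```

With the columnwise (Fortran-order) convention one would instead obtain c ⊠ b ⊠ a, which is the permutation the paper argues against.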

Fig. 1. Two possibilities for vectorizing an array (figure omitted). If Y_{ijk} is an element of the array Y of size I × J × K, then in the vectorized form y it is located at: a) y_{(k−1)IJ + (j−1)I + i} for the columnwise vectorization; b) y_{(i−1)JK + (j−1)K + k} for the suggested vectorization.

Fig. 2. Unfoldings of a three-way array Y of size I × J × K (figure omitted): Y_{(1)} = [M_1 | … | M_J], Y_{(2)} = [N_1 | … | N_I], Y_{(3)} = [N_1^T | … | N_I^T], where M_j ∈ R^{I×K} and N_i ∈ R^{J×K} are the slices of Y.

II. SOME USEFUL PROPERTIES

When manipulating multiway arrays, there are multiple properties that may simplify calculations by wide margins. Formula (16) is useful for obtaining a compact formula for the Jacobian of a CP model, or for obtaining bounds on estimation errors. Table I provides overall well-known results [12], with our suggested notations. Note that permutations of vectors or operators when switching from tensor products to vectorized arrays do not appear anymore. In what follows, some well-studied decompositions are revisited with the notations we promote.

a) Example 1: The Tucker decomposition of a tensor Y,

$$Y_{i_1 \dots i_N} = \sum_{r_1 \dots r_N}^{R_1 \dots R_N} \left( \prod_{k=1}^{N} (U_k)_{i_k r_k} \right) G_{r_1 \dots r_N},$$

can be expressed independently of any fixed basis as the following:

$$Y = \left( \bigotimes_{i=1}^{N} U_i \right) G,$$

where the operators U_i may be supposed unitary for identifiability of the model. Thus finding the Tucker decomposition of a tensor means finding linear operators U_i in R^{n_i × R_i} so that the multilinear operator ⊗_{i=1}^N U_i^T projects the tensor on its true subspace ⊗_{i≤N} R^{R_i}, in the same spirit as Principal Component Analysis.

Also from Table I, unfoldings can be easily obtained:

$$Y_{(i)} = U_i\, G_{(i)} \left( \mathop{\boxtimes}_{j \neq i} U_j \right)^{T}.$$

Of course, for the actual computation of the action of ⊗_{i=1}^N U_i^T on Y, all the i-way products •_i U_i^T may first be computed sequentially before matricizing the tensor.
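The unfolding identity above can be checked on a small Tucker model. This NumPy sketch (not from the paper; `kway`, `unfold`, the random dimensions, and the 0-based modes are illustrative choices) applies the multilinear operator mode by mode and compares the first unfolding against U_1 G_(1) (U_2 ⊠ U_3)^T:

```python
import numpy as np

def kway(Y, U, k):
    """k-way product Y •_k U (0-based mode index)."""
    return np.moveaxis(np.tensordot(Y, U, axes=(k, 1)), -1, k)

def unfold(Y, i):
    """Suggested i-mode matricization (0-based)."""
    return np.moveaxis(Y, i, 0).reshape(Y.shape[i], -1)

# A small Tucker model Y = (U1 ⊗ U2 ⊗ U3) G with random factors
rng = np.random.default_rng(0)
G = rng.standard_normal((2, 3, 2))            # core tensor
U = [rng.standard_normal((4, 2)),
     rng.standard_normal((5, 3)),
     rng.standard_normal((3, 2))]

Y = G
for k, Uk in enumerate(U):                    # sequential •_k products
    Y = kway(Y, Uk, k)

# Unfolding identity: Y_(1) = U1 G_(1) (U2 ⊠ U3)^T  (mode 0 in 0-based terms)
assert np.allclose(unfold(Y, 0), U[0] @ unfold(G, 0) @ np.kron(U[1], U[2]).T)
```

Note that no permutation of U_2 and U_3 is needed on the right-hand side, which is the point of the suggested conventions.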

b) Example 2: In the first section, the CP decomposition of a tensor was defined. With our notations, this model can be easily expressed as follows:

$$Y = \sum_{r=1}^{R} \bigotimes_{i=1}^{N} a_r^{(i)} = \left( \bigotimes_{i=1}^{N} A_i \right) I_N,$$

where A_i = [a_1^{(i)} … a_R^{(i)}] and I_N is the diagonal tensor with entries equal to 1. The CP model can be described with unfoldings and in a vectorized form using (15) from Table I:

$$Y_{(i)} = A_i\, (I_N)_{(i)} \left( \mathop{\boxtimes}_{j \neq i} A_j \right)^{T} = A_i \left( \mathop{\odot}_{j \neq i} A_j \right)^{T},$$

$$\mathrm{vec}(Y) = \sum_{r=1}^{R} \mathop{\boxtimes}_{i=1}^{N} a_r^{(i)} = \left( \mathop{\odot}_{i=1}^{N} A_i \right) \mathbf{1},$$

where 1 is a vector of ones.

Applying a multilinear transformation to the tensor results in operating ⊗_{i=1}^N W_i on Y, which has the following CP model:

$$\left( \bigotimes_{i=1}^{N} W_i \right) Y = \sum_{r=1}^{R} \bigotimes_{i=1}^{N} W_i a_r^{(i)}.$$

This kind of preprocessing is very useful if the observation noise is correlated, or in some applications, e.g. flexible coupled tensor decomposition [21], where the decomposition may be computed in a transformed domain for one of the modes.
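The CP identities above can likewise be checked numerically. In this NumPy sketch (not from the paper; the helper `khatri_rao` and the random factors are illustrative), a rank-2 three-way tensor is built from its factor matrices and compared against the Khatri–Rao expressions for its vectorization and first unfolding:

```python
import numpy as np

rng = np.random.default_rng(1)
# Factor matrices A_i of a rank-2 CP model, one per mode
A = [rng.standard_normal((4, 2)),
     rng.standard_normal((5, 2)),
     rng.standard_normal((3, 2))]

# Y = sum_r a_r^(1) ⊗ a_r^(2) ⊗ a_r^(3)
Y = np.einsum('ir,jr,kr->ijk', *A)

def khatri_rao(X, Z):
    """Columnwise Kronecker (Khatri-Rao) product."""
    return np.einsum('ir,jr->ijr', X, Z).reshape(-1, X.shape[1])

# vec(Y) = (A1 ⊙ A2 ⊙ A3) 1, in the same mode order
KR = khatri_rao(khatri_rao(A[0], A[1]), A[2])
assert np.allclose(Y.reshape(-1), KR @ np.ones(2))

# Y_(1) = A1 (A2 ⊙ A3)^T  (mode 0 in 0-based terms)
assert np.allclose(Y.reshape(4, -1), A[0] @ khatri_rao(A[1], A[2]).T)
```

Again the factors appear in their natural mode order on both sides, with no swapping.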

III. SECOND ORDER STATISTICS FOR TENSORS

When dealing with Gaussian non-i.i.d. multiway arrays, the most natural way of describing the distribution is to give a definition of array normal laws. This has been done for matrices [22], and recently for arrays of any order [9]. The definition from [9] is the following:

Definition 6: Let X be a multivariate random variable in R^{n_1 × ⋯ × n_N}. We say that X follows an array normal law of mean M and tensor covariance Γ = ⊗_{i=1}^N Σ_i if and only if

$$p(X \mid M, \Gamma) = \frac{\exp\left( -\tfrac{1}{2} \left\| \Gamma^{-\frac{1}{2}} (X - M) \right\|^2 \right)}{(2\pi)^{\frac{\prod_i n_i}{2}}\, |\Gamma|^{\frac{1}{2}}},$$

where Γ^{−1/2} = ⊗_{i=1}^N Σ_i^{−1/2} and the Σ_i are symmetric. This is noted X ∼ AN(M, Γ).

Array normal distributions are useful when manipulating data arrays before decomposing them. They suppose the covariance is expressed in every mode, i.e. the covariance of the vectorized tensor can be expressed as a Kronecker product of N symmetric matrices; in other words, the covariance is separable.

For example, say it is needed to preprocess some tensor Y, a noisy version of X corrupted by i.i.d. Gaussian noise, with the multilinear operator ⊗_{i≤N} U_i. Then the array T = (⊗_{i≤N} U_i) Y follows the array normal law

$$T \sim \mathcal{AN}\left( \left( \bigotimes_{i=1}^{N} U_i \right) X, \; \bigotimes_{i=1}^{N} U_i U_i^{T} \right).$$

Now with notations from two-way array processing [22], the matrix law of the i-th unfolding is given by:

$$T_{(i)} \sim \mathcal{MN}\left( U_i X_{(i)} \left( \mathop{\boxtimes}_{j \neq i} U_j \right)^{T}, \; U_i U_i^{T}, \; \mathop{\boxtimes}_{j \neq i} U_j U_j^{T} \right),$$

and the normal law of the vectorized tensor is:

$$\mathrm{vec}(T) \sim \mathcal{N}\left( \left( \mathop{\boxtimes}_{i=1}^{N} U_i \right) \mathrm{vec}(X), \; \mathop{\boxtimes}_{i=1}^{N} U_i U_i^{T} \right).$$

A direct consequence is that preprocessing one mode of a noisy tensor modifies the covariance of the noise in this mode, but does not affect the other modes. In a least squares procedure for fitting the CP decomposition, this results in modifying the maximum likelihood estimate for the modified mode only.

CONCLUDING REMARKS

This paper proposed a reworked set of notations for the multiway array processing community. The notations commonly used are sometimes not compatible with those of other communities, and the author believes the suggested notations simplify calculations with arrays. The permutation induced by the columnwise vectorization operator is a source of error, as is the confusion between tensor products and the Kronecker product. Moreover, unfolding operators given in the literature suggest a similar permutation, which is also unnecessary. With the proposed notations, there are no more permutations anywhere in the computation of matricized and vectorized tensors. Finally, this framework enables a simple definition of the multivariate normal distribution for arrays of any order, which is useful for example in computing decompositions of noisy preprocessed tensors.

Notations in this paper are of course one choice among others, and some definitions are not the ones usually used. For digging further into tensors, some nice surveys are available [1], [11], [23].

TABLE I
SOME USEFUL PROPERTIES

(1) vec(a ⊗ b) = a ⊠ b
(2) vec(A X B^T) = (A ⊠ B) vec(X)
(3) (A ⊗ B)(C ⊗ D) = AC ⊗ BD, if compatible
(4) (⊗_{i=1}^N U_i)(⊗_{i=1}^N a^{(i)}) = ⊗_{i=1}^N U_i a^{(i)}
(5) (A ⊠ B)(C ⊙ D) = AC ⊙ BD, if compatible
(6) (A ⊗ B)^T = A^T ⊗ B^T
(7) AX + XB = D ⇔ ((A ⊠ I) + (I ⊠ B^T)) vec(X) = vec(D)
(8) vec(⊗_{i=1}^N a^{(i)}) = ⊠_{i=1}^N a^{(i)}, in the same order
(9) Y = X •_i U ⇔ Y_{(i)} = U X_{(i)}
(10) vec((⊗_{i=1}^N U_i) Y) = (⊠_{i=1}^N U_i) vec(Y)
(11) [(⊗_{j=1}^N U_j) Y]_{(i)} = U_i Y_{(i)} (⊠_{j≠i} U_j)^T
(12) [(⊗_{j=1}^N U_j) Y]_{(i)} = U_i Y_{(i)} (⊗_{j≠i} U_j)^T, the tensor product of matrices being expressed in the induced bases
(13) Y = Σ_{r=1}^R ⊗_{j=1}^N a_r^{(j)} ⇔ Y_{(i)} = Σ_{r=1}^R a_r^{(i)} (⊠_{j≠i} a_r^{(j)})^T
(14) Σ_{r=1}^R a_r ⊠ b_r = (A ⊙ B) 1, where A = [a_1 … a_R]
(15) Y = Σ_{r=1}^R ⊗_{j=1}^N a_r^{(j)} ⇒ Y_{(i)} = A_i (⊙_{j≠i} A_j)^T
(16) ∂(a_1 ⊠ ⋯ ⊠ a_N)/∂a_i = a_1 ⊠ ⋯ ⊠ a_{i−1} ⊠ I ⊠ a_{i+1} ⊠ ⋯ ⊠ a_N
(17) |⊗_{i=1}^N U_i| = Π_{i=1}^N |U_i|^{Π_{j≠i} n_j}
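Property (10), which underlies the vectorized form of the array normal law, can be checked directly. This NumPy sketch (not from the paper; the dimensions and operators are arbitrary) applies a multilinear operator mode by mode and compares with a single Kronecker-structured matrix acting on the vectorization:

```python
import numpy as np

def kway(Y, U, k):
    """k-way product Y •_k U (0-based mode index)."""
    return np.moveaxis(np.tensordot(Y, U, axes=(k, 1)), -1, k)

rng = np.random.default_rng(2)
X = rng.standard_normal((2, 3, 4))
U = [rng.standard_normal((2, 2)),
     rng.standard_normal((3, 3)),
     rng.standard_normal((4, 4))]

# T = (U1 ⊗ U2 ⊗ U3) X computed by sequential contractions
T = X
for k, Uk in enumerate(U):
    T = kway(T, Uk, k)

# Property (10): vec((⊗ U_i) X) = (U1 ⊠ U2 ⊠ U3) vec(X), no permutation needed
big = np.kron(np.kron(U[0], U[1]), U[2])
assert np.allclose(T.reshape(-1), big @ X.reshape(-1))
```

The same Kronecker structure is why the covariance of vec(T) in Section III is the separable matrix ⊠ U_i U_i^T.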

APPENDIX A
LINEAR OPERATORS ACTING ON TENSORS

We state and prove that the space of linear operators acting on a tensor space of finite-dimensional linear spaces is itself a tensor space. This justifies the notation ⊗_{i=1}^N U_i suggested in this paper. A deeper proof including infinite dimensions can be found in the excellent book by Hackbusch [24].

Let E, E′, F and F′ be four finite-dimensional vector spaces on a field K, of respective dimensions n, n′ and m, m′. Let ⊗ and ⊗′ be two tensor products on E × F and E′ × F′, and consider the following mapping:

$$\otimes_p : L(E, E') \times L(F, F') \to L(E \otimes F, E' \otimes' F'), \qquad (u, v) \mapsto u \otimes_p v : (x \otimes y) \mapsto u(x) \otimes' v(y),$$

where L(E, E′) is the linear space of linear operators mapping E to E′.

Theorem 1: L(E ⊗ F, E′ ⊗′ F′) associated with the bilinear mapping ⊗_p is a tensor space, i.e.

$$L(E \otimes F, E' \otimes' F') = L(E, E') \otimes_p L(F, F').$$

To prove Theorem 1, we only need to check that ⊗_p maps a basis of L(E, E′) × L(F, F′) to a free family of L(E ⊗ F, E′ ⊗′ F′), which yields injectivity. We will then be able to conclude by arguing that these two spaces have the same finite dimension, so that the injective mapping ⊗_p is a bijective mapping from the cartesian product space to a linear space, which is exactly what a tensor product is [14].

Let {e_i}_i, {e′_i}_i, {f_j}_j and {f′_j}_j be some bases of E, E′, F and F′. Define the following basis for L(E, E′):

$$u_i(e_j) = \delta_{ij}\, e'_i,$$

where δ_ij is the Kronecker symbol, equal to 1 if and only if i equals j. Define the basis {v_j}_j similarly. Let us prove that {u_i ⊗_p v_j}_{ij} is a free family. Given some {λ_ij}_{ij}, suppose that for any x ⊗ y in E ⊗ F,

$$\sum_{i,j} \lambda_{ij}\, u_i(x) \otimes' v_j(y) = 0.$$

This is equivalent to

$$\sum_{i,j} \lambda_{ij}\, \nu_i \nu'_j\, e'_i \otimes' f'_j = 0,$$

where ν_i and ν′_j are the coefficients of x and y in the bases {e_i}_i and {f_j}_j. Since this is true for any x, y, and since {e′_i ⊗′ f′_j}_{ij} is a basis of E′ ⊗′ F′, all the λ_ij have to be zero, thus proving Theorem 1.

ACKNOWLEDGMENT

The author would like to thank Pierre Comon and Rodrigo Cabral Farias for useful discussions.

REFERENCES

[1] T. Kolda and B. Bader, "Tensor decompositions and applications," SIAM Rev., vol. 51, no. 3, pp. 455–500, 2009.
[2] T. G. Kolda, Multilinear operators for higher-order decompositions. United States Department of Energy, 2006.
[3] L. De Lathauwer, B. De Moor, and J. Vandewalle, "A multilinear singular value decomposition," SIAM J. Matrix Anal. Appl., vol. 21, no. 4, pp. 1253–1278, 2000.
[4] A. Cichocki, R. Zdunek, A. H. Phan, and S.-i. Amari, Nonnegative Matrix and Tensor Factorizations: Applications to Exploratory Multi-way Data Analysis and Blind Source Separation. John Wiley & Sons, 2009.
[5] R. Bro, "Multi-way analysis in the food industry: models, algorithms, and applications," Ph.D. dissertation, University of Amsterdam, The Netherlands, 1998.
[6] H. A. Kiers, "Towards a standardized notation and terminology in multiway analysis," Journal of Chemometrics, vol. 14, no. 3, pp. 105–122, 2000.
[7] P. M. Kroonenberg, Three-Mode Principal Component Analysis: Theory and Applications. DSWO Press, 1983, vol. 2.
[8] J. B. Kruskal, "Three-way arrays: rank and uniqueness of trilinear decompositions, with application to arithmetic complexity and statistics," Linear Algebra and its Applications, vol. 18, no. 2, pp. 95–138, 1977.
[9] P. D. Hoff, "Separable covariance arrays via the Tucker product, with applications to multivariate relational data," Bayesian Analysis, vol. 6, no. 2, pp. 179–196, 2011.
[10] J. M. Landsberg, Tensors: Geometry and Applications. American Mathematical Society, 2012.
[11] P. Comon, "Tensors: a brief introduction," IEEE Sig. Process. Mag., vol. 31, no. 3, pp. 44–53, 2014.
[12] J. W. Brewer, "Kronecker products and matrix calculus in system theory," IEEE Transactions on Circuits and Systems, vol. 25, no. 9, pp. 772–781, 1978.
[13] M. A. Nielsen and I. L. Chuang, Quantum Computation and Quantum Information. Cambridge University Press, 2010.
[14] N. Bourbaki, Algebra I: Chapters 1–3. Springer Science & Business Media, 1998.
[15] P. Comon, "Blind channel identification and extraction of more sources than sensors," in SPIE's International Symposium on Optical Science, Engineering, and Instrumentation. International Society for Optics and Photonics, 1998, pp. 2–13.
[16] A. Stegeman and P. Comon, "Subtracting a best rank-1 approximation may increase tensor rank," Linear Algebra and its Applications, vol. 433, no. 7, pp. 1276–1300, 2010.
[17] L.-H. Lim, "The role of tensors in numerical mathematics," Winter school on Latent Variable Analysis, 2015.
[18] H. Neudecker, "Some theorems on matrix differentiation with special reference to Kronecker matrix products," Journal of the American Statistical Association, vol. 64, no. 327, pp. 953–963, 1969.
[19] J. D. Carroll and J.-J. Chang, "Analysis of individual differences in multidimensional scaling via an n-way generalization of Eckart–Young decomposition," Psychometrika, vol. 35, no. 3, pp. 283–319, 1970.
[20] R. A. Harshman, "Foundations of the PARAFAC procedure: Models and conditions for an 'explanatory' multi-modal factor analysis," 1970.
[21] R. Cabral Farias, J. E. Cohen, C. Jutten, and P. Comon, "Joint decompositions with flexible couplings," hal repository, Tech. Rep., 2015. [Online]. Available: https://hal.archives-ouvertes.fr/hal-01135920
[22] A. K. Gupta and D. K. Nagar, Matrix Variate Distributions. CRC Press, 1999, vol. 104.
[23] L. Grasedyck, D. Kressner, and C. Tobler, "A literature survey of low-rank tensor approximation techniques," arXiv preprint arXiv:1302.7121, 2013.
[24] W. Hackbusch, Tensor Spaces and Numerical Tensor Calculus. Springer Science & Business Media, 2012, vol. 42.