Hyperdeterminant and Tensor Rank

Hyperdeterminant and Tensor Rank

Hyperdeterminant and Tensor Rank Lek-Heng Lim Workshop on Tensor Decompositions and Applications CIRM, Luminy, France August 29–September 2, 2005 Thanks: NSF DMS 01-01364 Collaborators Pierre Comon Laboratoire I3S Universit´ede Nice Sophia Antipolis Vin de Silva Department of Mathematics Stanford University 2 Matrix Multiplication Let f : U → V and g : V → W be linear maps; U, V, W vector spaces over R of dimensions n, m, l. With choice of bases on U, V, W , g, f have matrix representations l×m m×n A = [aij] ∈ R ,B = [bjk] ∈ R . The matrix representation of h = g ◦ f (ie. h(x) := g(f(x))) is l×n then C = [cik] ∈ R where n c := X a b . ik j=1 ij jk Similarly for bilinear g : V1 × V2 → R and linear f1 : U1 → V1, f2 : d ×d d ×s U2 → V2 with matrix representations A ∈ R 1 2, B1 ∈ R 1 1, d ×s B2 ∈ R 2 2. The composite map h, where h(x, y) := g(f1(x), f2(y)), has ma- trix representation > s1×s2 C = B2 AB1 ∈ R . 3 Multilinear Matrix Multiplication Do the same for multilinear map g : V1 × · · · × Vk → R and linear maps f1 : U1 → V1, . , fk : Uk → Vk; dim(Vi) = si, dim(Ui) = di. With choice of bases on Vi’s and Ui’s, g is represented by A = d1×···×dk and by = [ 1 ] d1×s1 = aj1···j ∈ R f1, . , fk M1 mj i ∈ R ,...,Mk J kK 1 1 [mk ] ∈ dk×sk. jkik R If we compose g by f1, . , fk to get h : U1 × · · · × Uk → R defined by h(x1,..., xk) = g(f(x1), . , f(xk)), s1×···×sk then h is represented by ci1···i ∈ R where J kK c := Pd1 ··· Pdk a m1 ··· mk . (1) i1···ik j1=1 jk=1 j1···jk j1i1 jkik The covariant multilinear matrix multiplication will be written s1×···×sk A(M1,...,Mk) := ci1···i ∈ R . J kK 4 Contravariant Version The contravariant multilinear matrix multiplication of aj1···jk ∈ d1×···×dk by matrices L = [`1 ] ∈ r1×d1,...,L =J [`k ]K ∈ R 1 i1j1 R k ikjk r ×d R k k is defined by r1×···×rk (L1,...,Lk)A = bi1···i ∈ R , J kK b := Pd1 ··· Pdk `1 ··· `k a . (2) i1···ik j1=1 jk=1 i1j1 ikjk j1···jk ∗ This comes from the composition of a multilinear map g : V1 × ∗ · · · × Vk → R by linear maps f1 : V1 → U1, . , fk : Vk → Uk. Simple relation if we disregard covariance/contravariance: > > (L1,...,Lk)A = A(L1 ,...,Lk ) > > A(M1,...,Mk) = (M1 ,...,Mk )A. > † Works over C too (replace Li by Li ). 5 Properties d ×···×d r ×d • Let A, B ∈ R 1 k and λ, µ ∈ R. Let L1 ∈ R 1 1,..., Lk ∈ r ×d R k k. Then (L1,...,Lk)(λA + µB) = λ(L1,...,Lk)A + µ(L1,...,Lk)B. d ×···×d r ×d r ×d • Let A ∈ R 1 k. Let L1 ∈ R 1 1,..., Lk ∈ R k k, and s ×r s ×r M1 ∈ R 1 1,..., Mk ∈ R k k. Then (M1,..., Mk)(L1,..., Lk)A = (M1L1,..., MkLk)A s ×d where MiLi ∈ R i i is simply the matrix-matrix product of Mi and Li. d ×···×d r ×d • Let A ∈ R 1 k and λ, µ ∈ R. Let L1 ∈ R 1 1,..., Lj, Mj ∈ rj×dj r ×d R ,...,Lk ∈ R k k. Then A(L1, . , λLj + µMj,...,Lk) = λ(L1,..., Lj,...,Lk)A + µ(L1,..., Mj,...,Lk)A. 6 Aside: Relation with Kronecker Product d ×···×d d ···d Forgetful map R 1 k → R 1 k, A 7→ vec(A) (‘forgets’ the multilinear structure), then vec((L1,...,Lk)A) = L1 ⊗ · · · ⊗ Lk vec(A). d ···d ×d ···d where L1 ⊗ · · · ⊗ Lk ∈ R 1 k 1 k is the Kronecker product of L1,...,Lk. 7 Matrix Techniques m×n×l m×n×l Start with R and C . l = 2 is well understood, may m×n 2 m×n 2 be regarded as pairs of matrices (A, B) ∈ (C ) or (R ) , m×n or equivalently, as a matrix pencil λA + µB ∈ C[λ, µ] or m×n R[λ, µ] . Kronecker-Weierstrass Theory. There exist S ∈ GL(m),T ∈ GL(n) such that (SAT, SBT ) can be decomposed into block pairs of the following forms 1 0 0 1 1 0 .. .. (p+1)×p 0 . , 1 . ∈ R , . . .. 1 .. 0 0 1 1 0 0 1 1 0 0 1 , ∈ q×(q+1), ... ... ... ... R 1 0 0 1 1 0 −a0 . 1 1 .. , ∈ r×r. ... .. R . 0 −ar−2 1 1 −ar−1 Likewise for C. Similar but simpler results obtained by Jos ten Berge for generic pairs. 8 Larger Sizes and Higher Orders Want to obtain results as general as possible — for tensors of arbitrary size and order over both R and C. For larger values of k or d1, . , dk, techniques relying on multilinear matrix multipli- cations become increasingly less effective. Inherent limitation: d1×···×d k dim(R k) = d1 ····· dk = O(d ) while 2 2 2 dim(GL(d1) × · · · × GL(dk)) = d1 + ··· + dk = O(kd ) and 2 dim(O(d1)×· · ·×O(dk)) = d1(d1−1)/2+···+dk(dk−1)/2 = O(kd ). d ×···×d The action of GL(d1)×· · ·×GL(dk) on R 1 k has uncountably many orbits, {(L1,...,Lk)A | (L1,...,Lk) ∈ GL(d1) × · · · × GL(dk)}, as soon as di > 2, k > 4. 9 Multilinear Functional and its Gradient d ×···×d Multilinear functional associated with A ∈ R 1 k, ie. d1 d fA : R × · · · × R k → R, (3) Pd1 Pdk 1 k (x1,..., x ) 7→ ··· a x ··· x , k j1=1 jk=1 j1···jk j1 jk can be written as fA(x1,..., xk) = A(x1,..., xk) (4) where the rhs is the right multilinear multiplication by xi = (xi , . , xi )>, regarded as a d × 1 matrix. 1 di i Gradient of fA may be written as ∇fA = (∇x1fA,..., ∇xkfA) where ∂fA ∂fA ∇ if (x1,..., x ) = ,..., = A(x1,..., xi−1,I , x +1,..., x ). x A k ∂xi ∂xi di i k 1 di denotes identity matrix. Idi di × di 10 Hyperdeterminant (d +1)×···×(d +1) Work in C 1 k for the time being (di ≥ 1). Consider (d1+1)×···×(d +1) S := {A ∈ C k | ∇fA(x1,..., xk) = 0 for some non-zero (x1,..., xk)}. Theorem (Gelfand, Kapranov, Zelevinsky, 1992). S is a hypersurface if and only if X dj ≤ di i6=j for all j = 1, . , k. Let ∆ be the equation of the hypersurface, ie. a multivariate polynomial in the entries of A such that (d +1)×···×(d +1) S = {A ∈ C 1 k | ∆(A) = 0}. Then ∆ may be chosen to have integer coefficients. m×n For C , the condition becomes m ≤ n and n ≤ m — that’s why matrix determinants is only defined for square matrices. Since ∆ has integer coefficients, ∆(A) is real-valued for A ∈ (d +1)×···×(d +1) R 1 k . 11 Geometric View (d +1)×···×(d +1) d +1 Let X = {x1 ⊗ · · · ⊗ xk ∈ C 1 k | xi ∈ C i } be the (smooth) manifold of decomposable tensors (X oftened called the Segre variety). (d +1)×···×(d +1) Let A ∈ C 1 k . Then the condition ∇fA(x1,..., xk) = 0 for some non-zero (x1,..., xk) is equivalent to saying that the hyperplane orthogonal to A, ie. (d1+1)×···×(d +1) HA := {B ∈ C k | hA, Bi = 0} contains a tangent to X at the point x1 ⊗ · · · ⊗ xk. This may also be taken as an alternative definition of the hyperdeterminant ∆(A). Projective duality: X∗ = S. 12 ¢(A) = 0 rank(A) = 1 ¢(B) ≠ 0 rank(B) = 2 rank(B) = 2 ¢(A) = 0 ¢(B) = 0 rank(A) = 1 Minor Inaccuracy (d +1)×···×(d +1) Should really be working in projective spaces P (C 1 k ) = (d +1)···(d +1)−1 P 1 k . This is the set of equivalence classes (d +1)×···×(d +1) × [A] := {λA ∈ C 1 k | λ ∈ C }. (d +1)×···×(d +1) Thing to note is that the for any A ∈ C 1 k and × λ ∈ C , rank⊗(λA) = rank⊗(A). (d +1)×···×(d +1) So outer-product rank is well-defined in P (C 1 k ), (d +1)×···×(d +1) ie. given [A] ∈ P (C 1 k ) define rank⊗([A]) = rank⊗(A) for any A ∈ [A]. 13 Examples A. Cayley, “On the theory of linear transformation,” Cambridge Math. J., 4 (1845), pp. 193–209. 2×2×2 Hyperdeterminant of A = [[aijk]] ∈ R is " " # " #! 1 a a a a ∆(A) = det 000 010 + 100 110 4 a001 a011 a101 a111 " # " #!#2 a a a a − det 000 010 − 100 110 a001 a011 a101 a111 " # " # a a a a − 4 det 000 010 det 100 110 . a001 a011 a101 a111 A result that parallels the matrix case is the following: the system 14 of bilinear equations a000x0y0 + a010x0y1 + a100x1y0 + a110x1y1 = 0, a001x0y0 + a011x0y1 + a101x1y0 + a111x1y1 = 0, a000x0z0 + a001x0z1 + a100x1z0 + a101x1z1 = 0, a010x0z0 + a011x0z1 + a110x1z0 + a111x1z1 = 0, a000y0z0 + a001y0z1 + a010y1z0 + a011y1z1 = 0, a100y0z0 + a101y0z1 + a110y1z0 + a111y1z1 = 0, has a non-trivial solution iff ∆(A) = 0. Examples 2×2×3 Hyperdeterminant of A = [[aijk]] ∈ R is a000 a001 a002 a100 a101 a102 ∆(A) = det a100 a101 a102 det a010 a011 a012 a010 a011 a012 a110 a111 a112 a000 a001 a002 a000 a001 a002 − det a100 a101 a102 det a010 a011 a012 a110 a111 a112 a110 a111 a112 Again, the following is true: a000x0y0 + a010x0y1 + a100x1y0 + a110x1y1 = 0, a001x0y0 + a011x0y1 + a101x1y0 + a111x1y1 = 0, a002x0y0 + a012x0y1 + a102x1y0 + a112x1y1 = 0, a000x0z0 + a001x0z1 + a002x0z2 + a100x1z0 + a101x1z1 + a102x1z2 = 0, a010x0z0 + a011x0z1 + a012x0z2 + a110x1z0 + a111x1z1 + a112x1z2 = 0, a000y0z0 + a001y0z1 + a002y0z2 + a010y1z0 + a011y1z1 + a012y1z2 = 0, a100y0z0 + a101y0z1 + a102y0z2 + a110y1z0 + a111y1z1 + a112y1z2 = 0, has a non-trivial solution iff ∆(A) = 0.

View Full Text

Details

  • File Type
    pdf
  • Upload Time
    -
  • Content Languages
    English
  • Upload User
    Anonymous/Not logged-in
  • File Pages
    31 Page
  • File Size
    -

Download

Channel Download Status
Express Download Enable

Copyright

We respect the copyrights and intellectual property rights of all users. All uploaded documents are either original works of the uploader or authorized works of the rightful owners.

  • Not to be reproduced or distributed without explicit permission.
  • Not used for commercial purposes outside of approved use cases.
  • Not used to infringe on the rights of the original creators.
  • If you believe any content infringes your copyright, please contact us immediately.

Support

For help with questions, suggestions, or problems, please contact us