Spectrum and Pseudospectrum of a Tensor

Lek-Heng Lim

University of California, Berkeley

July 11, 2008

L.-H. Lim (Berkeley) Spectrum and Pseudospectrum of a Tensor July 11, 2008 1 / 27 Matrix eigenvalues and eigenvectors

One of the most important ideas ever invented.

I R. Coifman et. al.: “Eigenvector magic: eigenvectors as an extension of Newtonian calculus.” Normal/Hermitian A

I Invariant subspace: Ax = λx. > > I Rayleigh quotient: x Ax/x x. > 2 I Lagrange multipliers: x Ax − λ(kxk − 1). > I Best rank-1 approximation: minkxk=1kA − λxx k. Nonnormal A −1 −1 I Pseudospectrum: σε(A) = {λ ∈ C | k(A − λI ) k > ε }. ∗ I Numerical range: W (A) = {x Ax ∈ C | kxk = 1}. ∗ I Irreducible representations of C (A) with natural Borel structure. ∗ I Primitive ideals of C (A) with hull-kernel topology. How can one define these for tensors?

L.-H. Lim (Berkeley) Spectrum and Pseudospectrum of a Tensor July 11, 2008 2 / 27 DARPA mathematical challenge eight One of the twenty three mathematical challenges announced at DARPA Tech 2007. Problem Beyond convex optimization: can linear algebra be replaced by algebraic geometry in a systematic way?

Algebraic geometry in a slogan: polynomials are to algebraic geometry what matrices are to linear algebra.

Polynomial f ∈ R[x1,..., xn] of degree d can be expressed as

> > f (x) = a0 + a1 x + x A2x + A3(x, x, x) + ··· + Ad (x,..., x).

n n×n n×n×n n×···×n a0 ∈ R, a1 ∈ R , A2 ∈ R , A3 ∈ R ,..., Ad ∈ R . : d = 2. Numerical multilinear algebra: d > 2.

L.-H. Lim (Berkeley) Spectrum and Pseudospectrum of a Tensor July 11, 2008 3 / 27 Hypermatrices

Totally ordered finite sets: [n] = {1 < 2 < ··· < n}, n ∈ N. Vector or n-tuple f :[n] → R. > n If f (i) = ai , then f is represented by a = [a1,..., an] ∈ R . Matrix f :[m] × [n] → R. m,n m×n If f (i, j) = aij , then f is represented by A = [aij ]i,j=1 ∈ R . Hypermatrix (order 3)

f :[l] × [m] × [n] → R.

l,m,n l×m×n If f (i, j, k) = aijk , then f is represented by A = aijk i,j,k=1 ∈ R . J K X [n] [m]×[n] [l]×[m]×[n] Normally R = {f : X → R}. Ought to be R , R , R .

L.-H. Lim (Berkeley) Spectrum and Pseudospectrum of a Tensor July 11, 2008 4 / 27 Hypermatrices and tensors

Up to choice of bases n a ∈ R can represent a vector in V (contravariant) or a linear functional in V ∗ (covariant). m×n A ∈ R can represent a bilinear form V × W → R (contravariant), ∗ ∗ a bilinear form V × W → R (covariant), or a linear V → W (mixed). l×m×n A ∈ R can represent trilinear form U × V × W → R (contravariant), bilinear operators V × W → U (mixed), etc. A hypermatrix is the same as a tensor if

1 we give it coordinates (represent with respect to some bases); 2 we ignore covariance and contravariance.

L.-H. Lim (Berkeley) Spectrum and Pseudospectrum of a Tensor July 11, 2008 5 / 27 Basic operation on a hypermatrix

m×n A matrix can be multiplied on the left and right: A ∈ R , p×m q×n X ∈ R , Y ∈ R ,

> p×q (X , Y ) · A = XAY = [cαβ] ∈ R

where Xm,n cαβ = xαi yβj aij . i,j=1 l×m×n A hypermatrix can be multiplied on three sides: A = aijk ∈ R , p×l q×m r×n J K X ∈ R , Y ∈ R , Z ∈ R ,

p×q×r (X , Y , Z) ·A = cαβγ ∈ R J K where Xl,m,n cαβγ = xαi yβj zγk aijk . i,j,k=1

L.-H. Lim (Berkeley) Spectrum and Pseudospectrum of a Tensor July 11, 2008 6 / 27 Numerical range of a tensor Covariant version: A· (X >, Y >, Z >) := (X , Y , Z) ·A. Gives convenient notations for multilinear functionals and multilinear l m n operators. For x ∈ R , y ∈ R , z ∈ R , Xl,m,n A(x, y, z) := A· (x, y, z) = aijk xi yj zk , i,j,k=1 Xm,n A(I , y, z) := A· (I , y, z) = aijk yj zk . j,k=1 n×n Numerical range of square matrix A ∈ C , ∗ ∗ W (A) = {x Ax ∈ C | kxk2 = 1} = {A(x, x ) ∈ C | kxk2 = 1}. n×···×n Plausible generalization to cubical hypermatrix A ∈ C , ( {A(x, x∗,..., x) ∈ | kxk = 1} odd order, W (A) = C k ∗ ∗ {A(x, x ,..., x ) ∈ C | kxkk = 1} even order.

L.-H. Lim (Berkeley) Spectrum and Pseudospectrum of a Tensor July 11, 2008 7 / 27 Symmetric hypermatrices

n×n×n Cubical hypermatrix aijk ∈ R is symmetric if J K aijk = aikj = ajik = ajki = akij = akji .

Invariant under all permutations σ ∈ Sk on indices. k n S (R ) denotes set of all order-k symmetric hypermatrices.

Example Higher order derivatives of multivariate functions.

Example

Moments of a random vector x = (X1,..., Xn): Z Z n  n mk (x) = E(xi1 xi2 ··· xik ) = ··· xi1 xi2 ··· xik dµ(xi1 ) ··· dµ(xik ) . i1,...,ik =1 i1,...,ik =1

L.-H. Lim (Berkeley) Spectrum and Pseudospectrum of a Tensor July 11, 2008 8 / 27 Symmetric hypermatrices

Example

Cumulants of a random vector x = (X1,..., Xn):  n     X p−1 Q Q κk (x) =  (−1) (p − 1)!E xi ··· E xi  . i∈A1 i∈Ap A1t···tAp ={i1,...,ik } i1,...,ik =1

For n = 1, κk (x) for k = 1, 2, 3, 4 are the expectation, variance, skewness, and kurtosis.

Important in Independent Component Analysis (ICA).

L.-H. Lim (Berkeley) Spectrum and Pseudospectrum of a Tensor July 11, 2008 9 / 27 Multilinear

n×n×n Let A ∈ R (easier if A symmetric). 1 How should one define its eigenvalues and eigenvectors? 2 What is a decomposition that generalizes the eigenvalue decomposition of a matrix? l×m×n Let A ∈ R 1 How should one define its singualr values and singular vectors? 2 What is a decomposition that generalizes the singular value decomposition of a matrix? Somewhat surprising: (1) and (2) have different answers.

L.-H. Lim (Berkeley) Spectrum and Pseudospectrum of a Tensor July 11, 2008 10 / 27 Tensor ranks (Hitchcock, 1927) m×n Matrix rank. A ∈ R . rank(A) = dim(span {A ,..., A }) (column rank) R •1 •n = dim(span {A ,..., A }) (row rank) R 1• m• Pr T = min{r | A = i=1ui vi } (outer product rank). Multilinear rank. A ∈ l×m×n. rank (A) = (r (A), r (A), r (A)), R  1 2 3 r (A) = dim(span {A ,..., A }) 1 R 1•• l•• r (A) = dim(span {A ,..., A }) 2 R •1• •m• r (A) = dim(span {A ,..., A }) 3 R ••1 ••n l×m×n Outer product rank. A ∈ R . Pr rank⊗(A) = min{r | A = i=1ui ⊗ vi ⊗ wi } l,m,n where u ⊗ v ⊗ w : = ui vj wk i,j,k=1. J K L.-H. Lim (Berkeley) Spectrum and Pseudospectrum of a Tensor July 11, 2008 11 / 27 Matrix EVD and SVD

Rank revealing decompositions. 2 n Symmetric eigenvalue decomposition of A ∈ S (R ),

> Xr A = V ΛV = λi vi ⊗ vi , i=1 where rank(A) = r, V ∈ O(n) eigenvectors, Λ eigenvalues. m×n Singular value decomposition of A ∈ R ,

> Xr A = UΣV = σi ui ⊗ vi i=1 where rank(A) = r, U ∈ O(m) left singular vectors, V ∈ O(n) right singular vectors, Σ singular values.

L.-H. Lim (Berkeley) Spectrum and Pseudospectrum of a Tensor July 11, 2008 12 / 27 One plausible EVD and SVD for hypermatrices

Rank revealing decompositions associated with the outer product rank. 3 n Symmetric outer product decomposition of A ∈ S (R ), Xr A = λi vi ⊗ vi ⊗ vi i=1

where rankS(A) = r, vi unit vector, λi ∈ R. l×m×n Outer product decomposition of A ∈ R , Xr A = σi ui ⊗ vi ⊗ wi i=1

l m n where rank⊗(A) = r, ui ∈ R , vi ∈ R , wi ∈ R unit vectors, σi ∈ R.

L.-H. Lim (Berkeley) Spectrum and Pseudospectrum of a Tensor July 11, 2008 13 / 27 Another plausible EVD and SVD for hypermatrices

Rank revealing decompositions associated with the multilinear rank. l×m×n Singular value decomposition of A ∈ R ,

A = (U, V , W ) ·C

where rank (A) = (r , r , r ), U ∈ l×r1 , V ∈ m×r2 , W ∈ n×r3  1 2 3 R R R r ×r ×r have orthonormal columns and C ∈ R 1 2 3 . 3 n Symmetric eigenvalue decomposition of A ∈ S (R ),

A = (U, U, U) ·C

where rank (A) = (r, r, r), U ∈ n×r has orthonormal columns and  R 3 r C ∈ S (R ).

L.-H. Lim (Berkeley) Spectrum and Pseudospectrum of a Tensor July 11, 2008 14 / 27 Variational approach to eigenvalues/vectors

m×n A ∈ R symmetric. Eigenvalues and eigenvectors are critical values and critical points of

> 2 x Ax/kxk2.

Equivalently, critical values/points of x>Ax constrained to unit sphere. Lagrangian: > 2 L(x, λ) = x Ax − λ(kxk2 − 1). n Vanishing of ∇L at critical (xc , λc ) ∈ R × R yields familiar

Axc = λc xc .

L.-H. Lim (Berkeley) Spectrum and Pseudospectrum of a Tensor July 11, 2008 15 / 27 Variational approach to singular values/vectors

m×n A ∈ R . Singular values and singular vectors are critical values and critical points of > x Ay/kxk2kyk2. Lagrangian:

> L(x, y, σ) = x Ay − σ(kxk2kyk2 − 1).

m n At critical (xc , yc , σc ) ∈ R × R × R,

> Ayc /kyc k2 = σc xc /kxc k2, A xc /kxc k2 = σc yc /kyc k2.

Writing uc = xc /kxc k2 and vc = yc /kyc k2 yields familiar

> Avc = σc uc , A uc = σc vc .

L.-H. Lim (Berkeley) Spectrum and Pseudospectrum of a Tensor July 11, 2008 16 / 27 Eigenvalues/vectors of a tensor

Extends to hypermatrices. > n p p p > For x = [x1,..., xn] ∈ R , write x := [x1 ,..., xn ] . k k k 1/k Define the ‘` -norm’ kxkk = (x1 + ··· + xn ) . k n Define eigenvalues/vectors of A ∈ S (R ) as critical values/points of the multilinear Rayleigh quotient

k A(x,..., x)/kxkk .

Lagrangian k L(x, λ) := A(x,..., x) − λ(kxkk − 1). At a critical point k−1 A(In, x,..., x) = λx .

L.-H. Lim (Berkeley) Spectrum and Pseudospectrum of a Tensor July 11, 2008 17 / 27 Eigenvalues/vectors of a tensor

If A is symmetric,

A(In, x, x,..., x) = A(x, In, x,..., x) = ··· = A(x, x,..., x, In).

Also obtained by Liqun Qi independently:

I L. Qi, “Eigenvalues of a real supersymmetric tensor,” J. Symbolic Comput., 40 (2005), no. 6.

I L, “Singular values and eigenvalues of tensors: a variational approach,” Proc. IEEE Int. Workshop on Computational Advances in Multi-Sensor Adaptive Processing, 1 (2005). For unsymmetric hypermatrices — get different eigenpairs for different modes (unsymmetric matrix have different left/right eigenvectors). Falls outside Classical Invariant Theory — not invariant under Q ∈ O(n), ie. kQxk2 = kxk2.

Invariant under Q ∈ GL(n) with kQxkk = kxkk .

L.-H. Lim (Berkeley) Spectrum and Pseudospectrum of a Tensor July 11, 2008 18 / 27 Singular values/vectors of a tensor

l×m×n Likewise for singular values/vectors of A ∈ R . Lagrangian is

L(x, y, z, σ) = A(x, y, z) − σ(kxkkykkzk − 1)

where σ ∈ R is the Lagrange multiplier. At a critical point,

A (Il , y/kyk, z/kzk) = σx/kxk,

A (x/kxk, Im, z/kzk) = σy/kyk,

A (x/kxk, y/kyk, In) = σz/kzk.

Normalize to get

A(Il , v, w) = σu, A(u, Im, w) = σv, A(u, v, In) = σw.

L.-H. Lim (Berkeley) Spectrum and Pseudospectrum of a Tensor July 11, 2008 19 / 27 Pseudospectrum of a tensor

n×n Pseudospectrum of square matrix A ∈ C ,

−1 −1 σε(A) = {λ ∈ C | k(A − λI ) k2 > ε } = {λ ∈ C | σmin(A − λI ) < ε}.

n×···×n One plausible generalization to cubical hypermatrix A ∈ C ,

Σ σε (A) = {λ ∈ C | σmin(A − λI) < ε}.

Another plausible generalization,

∆ −1 σε (A) = {λ ∈ C | infDetn,...,n(B)=0kA − λI − BkF < ε }.

Fact: hyperdeterminant Detn,...,n(B) = 0 iff 0 is a singular value of B.

L.-H. Lim (Berkeley) Spectrum and Pseudospectrum of a Tensor July 11, 2008 20 / 27 Perron-Frobenius theorem for hypermatrices

k n An order-k cubical hypermatrix A ∈ T (R ) is reducible if there exist a permutation σ ∈ Sn such that the permuted hypermatrix

bi1···ik = aσ(j1)···σ(j ) J K J k K

has the property that for some m ∈ {1,..., n − 1}, bi1···ik = 0 for all i1 ∈ {1,..., n − m} and all i2,..., ik ∈ {1,..., m}. We say that A is irreducible if it is not reducible. In particular, if A > 0, then it is irreducible.

Theorem (L) k n Let 0 ≤ A = aj1···jk ∈ T (R ) be irreducible. Then A has J K 1 a positive real eigenvalue λ with an eigenvector x; 2 x may be chosen to have all entries non-negative; 3 if µ is an eigenvalue of A, then |µ| ≤ λ.

L.-H. Lim (Berkeley) Spectrum and Pseudospectrum of a Tensor July 11, 2008 21 / 27 Hypergraphs

G = (V , E) is 3-hypergraph.

I V is the finite set of vertices. I E is the subset of hyperedges, ie. 3-element subsets of V . Write elements of E as [x, y, z](x, y, z ∈ V ). G is undirected, so [x, y, z] = [y, z, x] = ··· = [z, y, x]. Hyperedge is said to degenerate if of the form [x, x, y] or [x, x, x] (hyperloop at x). We do not exclude degenerate hyperedges. G is m-regular if every v ∈ V is adjacent to exactly m hyperedges. G is r-uniform if every edge contains exactly r vertices. Good reference: D. Knuth, The art of computer programming, 4, pre-fascicle 0a, 2008.

L.-H. Lim (Berkeley) Spectrum and Pseudospectrum of a Tensor July 11, 2008 22 / 27 Spectral hypergraph theory

Define the order-3 adjacency hypermatrix A = aijk by J K ( 1 if [x, y, z] ∈ E, axyz = 0 otherwise.

|V |×|V |×|V | A ∈ R nonnegative symmetric hypermatrix. Consider cubic form X A(f , f , f ) = axyz f (x)f (y)f (z), x,y,z

V where f ∈ R . Eigenvalues (resp. eigenvectors) of A are the critical values (resp. critical points) of A(f , f , f ) constrained to the f ∈ `3(V ), ie. X f (x)3 = 1. x∈V

L.-H. Lim (Berkeley) Spectrum and Pseudospectrum of a Tensor July 11, 2008 23 / 27 Spectral hypergraph theory

We have the following. Lemma (L) Let G be an m-regular 3-hypergraph. A its adjacency hypermatrix. Then

1 m is an eigenvalue of A; 2 if λ is an eigenvalue of A, then |λ| ≤ m; 3 λ has multiplicity 1 if and only if G is connected.

Related work: J. Friedman, A. Wigderson, “On the second eigenvalue of hypergraphs,” Combinatorica, 15 (1995), no. 1.

L.-H. Lim (Berkeley) Spectrum and Pseudospectrum of a Tensor July 11, 2008 24 / 27 Spectral hypergraph theory A hypergraph G = (V , E) is said to be k-partite or k-colorable if there exists a partition of the vertices V = V1 ∪ · · · ∪ Vk such that for any k vertices u, v,..., z with auv···z 6= 0, u, v,..., z must each lie in a distinct Vi (i = 1,..., k).

Lemma (L) Let G be a connected m-regular k-partite k-hypergraph on n vertices. Then

1 If k ≡ 1 mod 4, then every eigenvalue of G occurs with multiplicity a multiple of k. 2 If k ≡ 3 mod 4, then the spectrum of G is symmetric, ie. if λ is an eigenvalue, then so is −λ. 3 Furthermore, every eigenvalue of G occurs with multiplicity a multiple of k/2, ie. if λ is an eigenvalue of G, then λ and −λ occurs with the same multiplicity.

L.-H. Lim (Berkeley) Spectrum and Pseudospectrum of a Tensor July 11, 2008 25 / 27 To do

Cases k ≡ 0, 2 mod 4 Cheeger type isoperimetric inequalities Expander hypergraphs Algorithms for eigenvalues/vectors of a hypermatrix

L.-H. Lim (Berkeley) Spectrum and Pseudospectrum of a Tensor July 11, 2008 26 / 27 Advertisement

Geometry and representation theory of tensors for computer science, statistics, and other areas

1 MSRI Summer Graduate Workshop

I July 7 to July 18, 2008 I Organized by J.M. Landsberg, L.-H. Lim, J. Morton

I Mathematical Sciences Research Institute, Berkeley, CA I http://msri.org/calendar/sgw/WorkshopInfo/451/show sgw

2 AIM Workshop

I July 21 to July 25, 2008 I Organized by J.M. Landsberg, L.-H. Lim, J. Morton, J. Weyman I American Institute of Mathematics, Palo Alto, CA I http://aimath.org/ARCC/workshops/repnsoftensors.html

L.-H. Lim (Berkeley) Spectrum and Pseudospectrum of a Tensor July 11, 2008 27 / 27