Singular Values and Eigenvalues of Tensors: a Variational Approach

Lek-Heng Lim
Stanford University
Institute for Computational and Mathematical Engineering
Gates Building 2B, Room 286, Stanford, CA 94305

This work appeared in: Proceedings of the IEEE International Workshop on Computational Advances in Multi-Sensor Adaptive Processing (CAMSAP '05), 1 (2005), pp. 129–132.

ABSTRACT

We propose a theory of eigenvalues, eigenvectors, singular values, and singular vectors for tensors based on a constrained variational approach much like the Rayleigh quotient for symmetric matrix eigenvalues. These notions are particularly useful in generalizing certain areas where the spectral theory of matrices has traditionally played an important role. For illustration, we will discuss a multilinear generalization of the Perron-Frobenius theorem.

1. INTRODUCTION

It is well known that the eigenvalues and eigenvectors of a symmetric matrix $A$ are the critical values and critical points of its Rayleigh quotient, $x^{\top} A x / \lVert x \rVert_2^2$, or equivalently, the critical values and points of the quadratic form $x^{\top} A x$ constrained to vectors with unit $l^2$-norm, $\{x \mid \lVert x \rVert_2 = 1\}$. If $L : \mathbb{R}^n \times \mathbb{R} \to \mathbb{R}$ is the associated Lagrangian with Lagrange multiplier $\lambda$,

\[
L(x, \lambda) = x^{\top} A x - \lambda(\lVert x \rVert_2^2 - 1),
\]

then the vanishing of $\nabla L$ at a critical point $(x_c, \lambda_c) \in \mathbb{R}^n \times \mathbb{R}$ yields the familiar defining condition for eigenpairs

\[
A x_c = \lambda_c x_c. \tag{1}
\]

Note that this approach does not work if $A$ is nonsymmetric: the critical points of $L$ would in general be different from the solutions of (1).

A little less widely known is an analogous variational approach to the singular values and singular vectors of a matrix $A \in \mathbb{R}^{m \times n}$, with $x^{\top} A y / (\lVert x \rVert_2 \lVert y \rVert_2)$ assuming the role of the Rayleigh quotient. The associated Lagrangian function $L : \mathbb{R}^m \times \mathbb{R}^n \times \mathbb{R} \to \mathbb{R}$ is now

\[
L(x, y, \sigma) = x^{\top} A y - \sigma(\lVert x \rVert_2 \lVert y \rVert_2 - 1).
\]

$L$ is continuously differentiable for non-zero $x, y$. The first order condition yields

\[
A y_c / \lVert y_c \rVert_2 = \sigma_c x_c / \lVert x_c \rVert_2, \qquad A^{\top} x_c / \lVert x_c \rVert_2 = \sigma_c y_c / \lVert y_c \rVert_2,
\]

at a critical point $(x_c, y_c, \sigma_c) \in \mathbb{R}^m \times \mathbb{R}^n \times \mathbb{R}$. Writing $u_c = x_c / \lVert x_c \rVert_2$ and $v_c = y_c / \lVert y_c \rVert_2$, we get the familiar

\[
A v_c = \sigma_c u_c, \qquad A^{\top} u_c = \sigma_c v_c. \tag{2}
\]

Although it is not immediately clear how the usual definitions of eigenvalues and singular values via (1) and (2) may be generalized to tensors of order $k \ge 3$ (a matrix is regarded as an order-2 tensor), the constrained variational approach generalizes in a straightforward manner: one simply replaces the bilinear functional $x^{\top} A y$ (resp. quadratic form $x^{\top} A x$) by the multilinear functional (resp. homogeneous polynomial) associated with a tensor (resp. symmetric tensor) of order $k$. The constrained critical values/points then yield a notion of singular values/vectors (resp. eigenvalues/vectors) for order-$k$ tensors.

An important point of distinction between the order-2 and order-$k$ cases is in the choice of norm for the constraints. At first glance, it may appear that we should retain the $l^2$-norm. However, the criticality conditions so obtained are no longer scale invariant (i.e. they lose the property that $x_c$ in (1) or $(u_c, v_c)$ in (2) may be replaced by $\alpha x_c$ or $(\alpha u_c, \alpha v_c)$ without affecting the validity of the equations). To preserve the scale invariance of eigenvectors and singular vectors for tensors of order $k \ge 3$, the $l^2$-norm must be replaced by the $l^k$-norm (where $k$ is the order of the tensor),

\[
\lVert x \rVert_k = (|x_1|^k + \cdots + |x_n|^k)^{1/k}.
\]

The consideration of eigenvalues and singular values with respect to $l^p$-norms where $p \ne 2$ is prompted by recent works [1, 2] of Choulakian, who studied such notions for matrices.

Nevertheless, we shall not insist on having scale invariance. Instead, we will define eigenpairs and singular pairs of tensors with respect to any $l^p$-norm ($p > 1$), as they can be interesting even when $p \ne k$. For example, when $p = 2$, our defining equations for singular values/vectors (6) become the equations obtained in the best rank-1 approximations of tensors studied by Comon [3] and de Lathauwer et al. [4]. For the special case of symmetric tensors, our equations for eigenvalues/vectors for $p = 2$ and $p = k$ define, respectively, the Z-eigenvalues/vectors and H-eigenvalues/vectors in the soon-to-appear paper [5] of Qi. For simplicity, we will restrict our study to integer-valued $p$ in this paper.

We thank Gunnar Carlsson, Pierre Comon, Lieven de Lathauwer, Vin de Silva, and Gene Golub for helpful discussions. We would also like to thank Liqun Qi for sending us an advance copy of his very relevant preprint.

2. TENSORS AND MULTILINEAR FUNCTIONALS

A $k$-array of real numbers representing an order-$k$ tensor will be denoted by $A = \llbracket a_{j_1 \cdots j_k} \rrbracket \in \mathbb{R}^{d_1 \times \cdots \times d_k}$. Just as an order-2 tensor (i.e. matrix) may be multiplied on the left and right by a pair of matrices (of consistent dimensions), an order-$k$ tensor may be 'multiplied on $k$ sides' by $k$ matrices. The covariant multilinear matrix multiplication of $A$ by matrices $M_1 = [m^{(1)}_{j_1 i_1}] \in \mathbb{R}^{d_1 \times s_1}, \dots, M_k = [m^{(k)}_{j_k i_k}] \in \mathbb{R}^{d_k \times s_k}$ is defined by

\[
A(M_1, \dots, M_k) := \Bigl\llbracket \sum_{j_1=1}^{d_1} \cdots \sum_{j_k=1}^{d_k} a_{j_1 \cdots j_k}\, m^{(1)}_{j_1 i_1} \cdots m^{(k)}_{j_k i_k} \Bigr\rrbracket \in \mathbb{R}^{s_1 \times \cdots \times s_k}.
\]

This operation arises from the way a multilinear functional transforms under compositions with linear maps. In particular, the multilinear functional associated with a tensor $A \in \mathbb{R}^{d_1 \times \cdots \times d_k}$ and its gradient may be succinctly expressed via covariant multilinear multiplication:

\[
A(x_1, \dots, x_k) = \sum_{j_1=1}^{d_1} \cdots \sum_{j_k=1}^{d_k} a_{j_1 \cdots j_k}\, x^{(1)}_{j_1} \cdots x^{(k)}_{j_k}, \tag{3}
\]

\[
\nabla_{x_i} A(x_1, \dots, x_k) = A(x_1, \dots, x_{i-1}, I_{d_i}, x_{i+1}, \dots, x_k).
\]

Note that we have slightly abused notation by using $A$ to denote both the tensor and its associated multilinear functional.

An order-$k$ tensor $\llbracket a_{j_1 \cdots j_k} \rrbracket \in \mathbb{R}^{n \times \cdots \times n}$ is called symmetric if $a_{j_{\sigma(1)} \cdots j_{\sigma(k)}} = a_{j_1 \cdots j_k}$ for any permutation $\sigma \in S_k$. The homogeneous polynomial associated with a symmetric tensor $A = \llbracket a_{j_1 \cdots j_k} \rrbracket$ and its gradient can again be conveniently expressed as

\[
A(x, \dots, x) = \sum_{j_1=1}^{n} \cdots \sum_{j_k=1}^{n} a_{j_1 \cdots j_k}\, x_{j_1} \cdots x_{j_k}, \tag{4}
\]

\[
\nabla A(x, \dots, x) = k\, A(I_n, x, \dots, x).
\]

Observe that for a symmetric tensor $A$,

\[
A(I_n, x, x, \dots, x) = A(x, I_n, x, \dots, x) = \cdots = A(x, x, \dots, x, I_n). \tag{5}
\]

The preceding discussion is entirely algebraic, but we will now introduce norms on the respective spaces. Let $\lVert \cdot \rVert_{\alpha_i}$ be a norm on $\mathbb{R}^{d_i}$, $i = 1, \dots, k$. Then the norm (cf. [6]) of the multilinear functional $A : \mathbb{R}^{d_1} \times \cdots \times \mathbb{R}^{d_k} \to \mathbb{R}$ induced by $\lVert \cdot \rVert_{\alpha_1}, \dots, \lVert \cdot \rVert_{\alpha_k}$ is defined as

\[
\lVert A \rVert_{\alpha_1, \dots, \alpha_k} := \sup \frac{|A(x_1, \dots, x_k)|}{\lVert x_1 \rVert_{\alpha_1} \cdots \lVert x_k \rVert_{\alpha_k}},
\]

where the supremum is taken over all non-zero $x_i \in \mathbb{R}^{d_i}$, $i = 1, \dots, k$. We will be interested in the case where the $\lVert \cdot \rVert_{\alpha_i}$'s are $l^p$-norms. Recall that for $1 < p < \infty$, the $l^p$-norm is a continuously differentiable function on $\mathbb{R}^n \setminus \{0\}$. For $x = [x_1, \dots, x_n]^{\top} \in \mathbb{R}^n$, we will write

\[
x^p := [x_1^p, \dots, x_n^p]^{\top}
\]

(i.e. taking the $p$th power coordinatewise) and

\[
\varphi_p(x) := [\operatorname{sgn}(x_1)\,|x_1|^p, \dots, \operatorname{sgn}(x_n)\,|x_n|^p]^{\top}.
\]

3. SINGULAR VALUES AND SINGULAR VECTORS

Let $A \in \mathbb{R}^{d_1 \times \cdots \times d_k}$. Then $A$ defines a multilinear functional $A : \mathbb{R}^{d_1} \times \cdots \times \mathbb{R}^{d_k} \to \mathbb{R}$ via (3). Let us equip $\mathbb{R}^{d_i}$ with the $l^{p_i}$-norm, $\lVert \cdot \rVert_{p_i}$, $i = 1, \dots, k$. We will define the singular values and singular vectors of $A$ as the critical values and critical points of $A(x_1, \dots, x_k) / (\lVert x_1 \rVert_{p_1} \cdots \lVert x_k \rVert_{p_k})$, suitably normalized. Taking a constrained variational approach, we let $L : \mathbb{R}^{d_1} \times \cdots \times \mathbb{R}^{d_k} \times \mathbb{R} \to \mathbb{R}$ be

\[
L(x_1, \dots, x_k, \sigma) := A(x_1, \dots, x_k) - \sigma(\lVert x_1 \rVert_{p_1} \cdots \lVert x_k \rVert_{p_k} - 1).
\]

$L$ is continuously differentiable when $x_i \ne 0$, $i = 1, \dots, k$. The vanishing of the gradient,

\[
\nabla L = (\nabla_{x_1} L, \dots, \nabla_{x_k} L, \nabla_{\sigma} L) = (0, \dots, 0, 0),
\]

gives

\[
\begin{aligned}
A(I_{d_1}, x_2, x_3, \dots, x_k) &= \sigma \varphi_{p_1 - 1}(x_1), \\
A(x_1, I_{d_2}, x_3, \dots, x_k) &= \sigma \varphi_{p_2 - 1}(x_2), \\
&\;\;\vdots \\
A(x_1, x_2, \dots, x_{k-1}, I_{d_k}) &= \sigma \varphi_{p_k - 1}(x_k),
\end{aligned}
\tag{6}
\]

at a critical point $(x_1, \dots, x_k, \sigma)$. As in the derivation of (2), one also gets the unit norm condition

\[
\lVert x_1 \rVert_{p_1} = \cdots = \lVert x_k \rVert_{p_k} = 1.
\]

The unit vector $x_i$ and the scalar $\sigma$ in (6) will be called the mode-$i$ singular vector, $i = 1, \dots, k$, and the singular value of $A$, respectively. Note that the mode-$i$ singular vectors are simply the order-$k$ equivalent of left- and right-singular vectors for order 2 (a matrix has two 'sides' or modes while an order-$k$ tensor has $k$).

We will use the name $l^{p_1, \dots, p_k}$-singular values/vectors if we wish to emphasize the dependence of these notions on $\lVert \cdot \rVert_{p_i}$, $i = 1, \dots, k$. If $p_1 = \cdots = p_k = p$, then we will use the shorter name $l^p$-singular values/vectors. Two particular choices of $p$ will be of interest to us: $p = 2$ and $p = k$, both of which reduce to the matrix case when $k = 2$ (not so for other choices of $p$). The former yields

\[
A(x_1, \dots, x_{i-1}, I_{d_i}, x_{i+1}, \dots, x_k) = \sigma x_i, \qquad i = 1, \dots, k,
\]

while the latter yields a homogeneous system of equations that is invariant under scaling of $(x_1, \dots, x_k)$. In fact, when $k$ is even, the $l^k$-singular values/vectors are solutions to

\[
A(x_1, \dots, x_{i-1}, I_{d_i}, x_{i+1}, \dots, x_k) = \sigma x_i^{k-1}, \qquad i = 1, \dots, k.
\]
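
As a rough numerical illustration of the $p = 2$ system (not code from this paper), the sketch below evaluates $A(x_1, \dots, x_{i-1}, I_{d_i}, x_{i+1}, \dots, x_k)$ for a tensor stored as nested lists and looks for one critical point by alternately normalizing each mode. That alternating power iteration is an assumption of this sketch, borrowed from the kind of scheme commonly used for best rank-1 approximation; it finds at most one singular tuple and is not guaranteed to find the largest one in general. The names `contract_all_but`, `l2_singular_tuple`, `residual`, and the test tensor `T` are hypothetical.

```python
import math
import random


def contract_all_but(A, xs, skip, mode=0):
    """Return the vector A(x_1, ..., I, ..., x_k) with the identity in slot `skip`.

    A is an order-k tensor stored as nested lists; xs is a list of k vectors.
    """
    if not isinstance(A, list):              # scalar entry: nothing left to contract
        return A
    parts = [contract_all_but(sub, xs, skip, mode + 1) for sub in A]
    if mode == skip:                         # leave this index free
        return parts
    if isinstance(parts[0], list):           # free index lies deeper: sum vectors
        return [sum(x * p[t] for x, p in zip(xs[mode], parts))
                for t in range(len(parts[0]))]
    return sum(x * p for x, p in zip(xs[mode], parts))


def unit(v):
    """Scale v to unit l2-norm."""
    n = math.sqrt(sum(t * t for t in v))
    return [t / n for t in v]


def shape(A):
    """Dimensions d_1, ..., d_k of a nested-list tensor."""
    dims = []
    while isinstance(A, list):
        dims.append(len(A))
        A = A[0]
    return dims


def l2_singular_tuple(A, sweeps=200, seed=0):
    """Alternating power iteration (assumed scheme) for one l2-singular tuple."""
    rng = random.Random(seed)
    xs = [unit([rng.gauss(0, 1) for _ in range(d)]) for d in shape(A)]
    for _ in range(sweeps):
        for i in range(len(xs)):
            # Set x_i proportional to A(x_1, ..., I, ..., x_k), then normalize.
            xs[i] = unit(contract_all_but(A, xs, i))
    v = contract_all_but(A, xs, 0)
    sigma = sum(a * b for a, b in zip(v, xs[0]))   # sigma = A(x_1, ..., x_k)
    return xs, sigma


def residual(A, xs, sigma):
    """Largest entry of A(..., I, ...) - sigma * x_i over all modes i."""
    return max(abs(g - sigma * t)
               for i, x in enumerate(xs)
               for g, t in zip(contract_all_but(A, xs, i), x))


# Hypothetical test tensor: 5 * (a x b x c) plus a small deterministic perturbation.
a, b, c = unit([1.0, 2.0, 2.0]), unit([3.0, 4.0]), unit([1.0, 0.0, 1.0])
T = [[[5.0 * ai * bj * cl + 0.01 * math.sin(i + 2 * j + 3 * l)
       for l, cl in enumerate(c)]
      for j, bj in enumerate(b)]
     for i, ai in enumerate(a)]

xs, sigma = l2_singular_tuple(T)
print("sigma =", sigma, " residual =", residual(T, xs, sigma))
```

For an order-2 input the same routine reduces to the usual power iteration for the leading singular triple in (2), which gives a quick sanity check of the contraction code. Handling $p \ne 2$ would require replacing the normalization step with one based on $\varphi_{p-1}$ and is not attempted here.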
