arXiv:2108.09502v1 [math.NA] 21 Aug 2021

Perturbation analysis of third-order tensor eigenvalue problem based on tensor-tensor multiplication

Changxin Mo∗    Weiyang Ding†

June 8, 2021

Abstract

Perturbation analysis has been primarily considered to be one of the main issues in many fields, and considerable progress, especially getting involved with matrices, has been made from then to now. In this paper, we pay our attention to the perturbation analysis subject on tensor eigenvalues under tensor-tensor multiplication sense, and also to the ε-pseudospectra theory for third-order tensors. The definition of the generalized T-eigenvalue of third-order tensors is given. Several classical results, such as the Bauer-Fike theorem and its general case, the Gershgorin circle theorem and the Kahan theorem, are extended from the matrix to the tensor case. The study of ε-pseudospectra of tensors is presented, together with various pseudospectra properties and numerical examples which show the boundaries of the ε-pseudospectra of certain tensors under different levels.

Keywords: perturbation theory, tensor-tensor multiplication, tensor eigenvalues, Bauer-Fike theorem, ε-pseudospectra theory, Gershgorin circle theorem, Kahan theorem, multidimensional ordinary differential equation, multi-linear time invariant system.

AMS subject classifications: 15A18, 15A69

∗ E-mail: [email protected]. School of Mathematical Sciences, Chongqing Normal University, Chongqing, 400047, P. R. of China.

† Corresponding author. E-mail: [email protected]. Institute of Science and Technology for Brain-Inspired Intelligence, Fudan University, Shanghai, China; Key Laboratory of Computational Neuroscience and Brain-Inspired Intelligence (Fudan University), Ministry of Education, China; MOE Frontiers Center for Brain Science, Fudan University, Shanghai, China; Zhangjiang Fudan International Innovation Center. W. Ding's research is supported by the National Natural Science Foundation of China under Grant 11801479, Shanghai Municipal Science and Technology Major Project (No. 2018SHZDZX01), ZJ Lab, and Shanghai Center for Brain Science and Brain-Inspired Technology, the 111 Project (No. B18015).

1 Introduction
Perturbation theory, which has been studied for more than ninety years, stems from the ideas of Rayleigh [16] and Schrödinger [18], who studied eigenvalue problems in vibrating systems and quantum mechanics, respectively. Since then, extensive research has been conducted, and the pioneering works can be found in celebrated monographs such as Perturbation theory of eigenvalue problems by Rellich [17], Perturbation Theory for Linear Operator by Kato [7], and Perturbation Analysis of Matrix by Sun [20], to name a few.

Tensors, which can be regarded as a generalization of matrices to the higher-order case, have recently attracted considerable attention in different fields of science. As one of the most important and basic operations, analogous to matrix multiplication, tensor multiplication has received much attention from researchers. In 2008, a new type of tensor multiplication, which allows a third-order tensor to be written as a product of third-order tensors, was proposed by Kilmer et al. [10] when they considered the problem of generalizing the matrix SVD to the tensor case; it is termed the tensor-tensor multiplication.

Three operators are closely involved in the definition of this new multiplication. For a third-order tensor A ∈ C^{n1×n2×n3}, we obtain its frontal slices, denoted by A::k or Ak for short, by fixing the last index. It therefore has n3 frontal slices A1,...,An3, which are matrices of size n1 × n2. The first operation creates a block circulant matrix from the frontal slices of a tensor. That is,
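\[
\operatorname{bcirc}(\mathcal{A}) =
\begin{bmatrix}
A_1 & A_{n_3} & A_{n_3-1} & \cdots & A_2 \\
A_2 & A_1 & A_{n_3} & \cdots & A_3 \\
\vdots & \ddots & \ddots & \ddots & \vdots \\
A_{n_3} & A_{n_3-1} & \cdots & A_2 & A_1
\end{bmatrix}
\tag{1.1}
\]
with size (n1n3) × (n2n3). The other two operations can be regarded as the "inverse" of each other. They are the unfold and fold commands, defined as follows:
\[
\operatorname{unfold}(\mathcal{A}) =
\begin{bmatrix} A_1 \\ A_2 \\ \vdots \\ A_{n_3} \end{bmatrix},
\qquad
\operatorname{fold}(\operatorname{unfold}(\mathcal{A})) = \mathcal{A}.
\]
On the basis of the above three operations, the tensor-tensor multiplication of any two tensors B ∈ C^{n1×p×n3} and C ∈ C^{p×n2×n3} is defined as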
A = B ∗ C = fold(bcirc(B) · unfold(C)). (1.2)
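As an illustration of (1.1) and (1.2), the three operators and the product can be sketched in a few lines of NumPy. This is only a minimal sketch: the function names `bcirc`, `unfold`, `fold` and `t_product`, the storage of frontal slices along the last array axis, and the random test tensors are assumptions made here, not part of the original text.

```python
import numpy as np

def bcirc(A):
    """Block circulant matrix (1.1) built from the frontal slices A[:, :, k]."""
    n1, n2, n3 = A.shape
    M = np.zeros((n1 * n3, n2 * n3), dtype=A.dtype)
    for i in range(n3):            # block row
        for j in range(n3):        # block column
            M[i * n1:(i + 1) * n1, j * n2:(j + 1) * n2] = A[:, :, (i - j) % n3]
    return M

def unfold(A):
    """Stack the frontal slices vertically into an (n1*n3) x n2 matrix."""
    return np.concatenate([A[:, :, k] for k in range(A.shape[2])], axis=0)

def fold(M, n3):
    """Inverse of unfold: split M into n3 row blocks and restack them as slices."""
    return np.stack(np.split(M, n3, axis=0), axis=2)

def t_product(B, C):
    """Tensor-tensor product (1.2): fold(bcirc(B) @ unfold(C))."""
    return fold(bcirc(B) @ unfold(C), B.shape[2])

# Hypothetical sizes n1 = 2, p = 3, n2 = 5, n3 = 4, chosen only for illustration.
rng = np.random.default_rng(0)
B = rng.standard_normal((2, 3, 4))
C = rng.standard_normal((3, 5, 4))
A = t_product(B, C)
print(A.shape)
```

Written slice by slice, the k-th frontal slice of the product is the circular convolution sum of B_{(k-j) mod n3} C_j over j, which is what the block rows of bcirc(B) encode.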
One can easily check that A ∈ C^{n1×n2×n3}. From the sizes of the three tensors involved in (1.2), we see that special attention should be paid to the first two indices of B and C, since their frontal slices need to be conformable for matrix multiplication.

The tensor-tensor multiplication (1.2) has demonstrated its usefulness in many areas, including, but not limited to, image processing (such as image deblurring and compression, object and facial recognition), tensor principal component analysis, tensor completion, and pattern recognition; see [8, 11, 15, 22, 23] and the references therein. On the basis of this tensor-tensor multiplication, many researchers have considered functions of multidimensional arrays. These can be regarded as generalizations of functions of matrices, and many nice properties have been given; more details are discussed in [12, 13, 14]. In particular, one basic but important concept, the T-eigenvalue, was proposed by Miao et al. [14], and the stability of T-eigenvalues was then introduced in [6] for studying the tensor Lyapunov equation that appears in spatially invariant systems. Moreover, some results, such as Weyl's and Cauchy's interlacing theorems, have been carried over from the matrix case to the tensor case [6].

Motivated by the research mentioned above, in this paper we turn our attention to the perturbation analysis of third-order tensors under the tensor-tensor multiplication (1.2). Many classical results of the matrix case are generalized to the tensor case. Moreover, the pseudospectra theory for third-order tensors is also considered.

The remainder of this paper is organized as follows. In Section 2 we describe notation that is used throughout and revisit several basic concepts as well as fundamental results. Section 3 is one of the main parts and focuses on perturbation analysis results for third-order tensors; several classical theorems on matrices are extended to the tensor case. The ε-pseudospectra theory of tensors, which is the other main part, is studied in Section 4. Section 5 presents the multidimensional ordinary differential equation and investigates its close relationship with T-eigenvalues; some properties of the multidimensional ordinary differential equation are also given. The paper ends with conclusions and remarks.
2 Preliminaries
In this section, we introduce the notation used throughout the paper and review basic concepts of tensors, such as the identity tensor, the transpose of a tensor, F-diagonal tensors, and orthogonal tensors. Generally, scalars are denoted by lowercase letters, e.g., a. Vectors and matrices are denoted by boldface lowercase letters and capital letters, respectively, e.g., v and A. Euler script letters are used to denote higher-order tensors, e.g., A. Frontal slices of a tensor T ∈ C^{n1×n2×n3} are denoted by T1,...,Tn3. Some basic concepts of tensors are revisited next.
Definition 2.1. ([9, Definition 3.14]). Let A ∈ C^{n1×n2×n3}. Then the transposed tensor A^⊤ ∈ C^{n2×n1×n3} (conjugate transposed tensor A^H ∈ C^{n2×n1×n3}) is obtained by taking the transpose (conjugate transpose) of each frontal slice and then reversing the order of the transposed frontal slices 2 through n3.
Unlike the transposed tensor which is well-defined for any n1 × n2 × n3 tensor, the identity tensor, orthogonal tensor and the inverse of a tensor are only applicable for the tensors with square frontal slices.
Definition 2.2. ([9, Definition 3.4], identity tensor). Let I_{mmℓ} ∈ C^{m×m×ℓ}. If its frontal slice I1 is the identity matrix of size m × m and its other frontal slices I2,...,Iℓ are all zero, then we call I_{mmℓ} an identity tensor.

Definition 2.3. ([9, Definition 3.5], inverse of a tensor). Let A ∈ C^{m×m×ℓ}. We call a tensor B an inverse of A if it satisfies the following two equalities:
\[
\mathcal{A} * \mathcal{B} = \mathcal{I}_{mm\ell}, \qquad \mathcal{B} * \mathcal{A} = \mathcal{I}_{mm\ell}.
\]

Definition 2.4. ([9, Definition 3.18], orthogonal and unitary tensor). Let Q ∈ R^{m×m×ℓ}. We call Q an orthogonal tensor provided that Q^⊤ ∗ Q = Q ∗ Q^⊤ = I_{mmℓ}. If Q ∈ C^{m×m×ℓ} and Q^H ∗ Q = Q ∗ Q^H = I_{mmℓ}, then we call it a unitary tensor.

Based on the above definitions, a tensor A ∈ C^{m×m×n} is said to be symmetric if A = A^⊤, or Hermitian if A = A^H [6]. We call a third-order tensor D ∈ C^{m×m×ℓ} an F-diagonal tensor if all its frontal slices D1,...,Dℓ are diagonal matrices.

Definition 2.5. ([12, 14], F-diagonalizable tensor). Assume that A ∈ C^{m×m×ℓ} can be written as A = P ∗ D ∗ P^{−1}. Then we call A an F-diagonalizable tensor if D is an F-diagonal tensor.

Some useful lemmas are recalled as follows.

Lemma 2.1. ([6, 12, 13]). The following results hold for third-order tensors A ∈ C^{m×n×p}:
(a) The operator bcirc defined in (1.1) is a linear operator, i.e., bcirc(αA + βB) = α bcirc(A) + β bcirc(B), where B has the same size as A and α, β are constants.
(b) bcirc(A ∗ B) = bcirc(A) bcirc(B), where B ∈ C^{n×s×p}.
(c) bcirc(A^⊤) = (bcirc(A))^⊤ and bcirc(A^H) = (bcirc(A))^H.
(d) If A is invertible, then its inverse tensor is unique and bcirc(A^{−1}) = (bcirc(A))^{−1}.

Lemma 2.2. ([6, Theorems 2.7 and 2.8]). Let A ∈ C^{m×m×n}. Some fundamental results involving Hermitian or symmetric tensors are:
(a) The tensor A is symmetric if and only if bcirc(A) = (bcirc(A))^⊤.
(b) The tensor A is Hermitian if and only if bcirc(A) = (bcirc(A))^H.
(c) All T-eigenvalues (cf. (3.3)) of a Hermitian tensor A are real.
Lemma 2.3. ([14, Lemma 4]). Suppose A1, ··· , Ap, B1, ··· , Bp ∈ C^{n×n} are matrices satisfying
\[
\begin{bmatrix}
A_1 & A_p & A_{p-1} & \cdots & A_2 \\
A_2 & A_1 & A_p & \cdots & A_3 \\
\vdots & \ddots & \ddots & \ddots & \vdots \\
A_p & A_{p-1} & A_{p-2} & \cdots & A_1
\end{bmatrix}
= (F_p \otimes I_n)
\begin{bmatrix}
B_1 & & & \\
& B_2 & & \\
& & \ddots & \\
& & & B_p
\end{bmatrix}
(F_p^H \otimes I_n),
\]
where Fp is the normalized discrete Fourier matrix of size p × p. Then B1, ··· , Bp are diagonal (sub-diagonal, upper-triangular, lower-triangular) matrices if and only if A1, ··· , Ap are diagonal (sub-diagonal, upper-triangular, lower-triangular) matrices.

As in the matrix case, two tensors A, B ∈ C^{m×m×n} are said to commute if A ∗ B = B ∗ A. A tensor A ∈ C^{m×m×n} is normal if A ∗ A^H = A^H ∗ A, that is, if A commutes with its conjugate transpose in the sense of tensor-tensor multiplication. Obviously, real symmetric tensors and Hermitian tensors are normal. A normal tensor can always be F-diagonalized by a unitary tensor, as stated next.

Lemma 2.4. ([14]). Let A ∈ C^{m×m×n} be a normal tensor. Then there exists a unitary tensor U ∈ C^{m×m×n} such that U ∗ A ∗ U^H = D, where D is an F-diagonal tensor.
3 Perturbation analysis on third-order tensors
In this section, we first give a definition of the generalized tensor-eigenvalue of third-order tensors. Then some classical results that are well known in matrix theory, such as the Gershgorin circle theorem [4], the Bauer-Fike theorem and its general case [1, 3, 19], and the Kahan theorem [20], are extended to the tensor case.
3.1 Tensor eigenvalues under tensor-tensor multiplication

First, we give the definition of the generalized tensor-eigenvalue for third-order tensors under tensor-tensor multiplication.

Definition 3.1. (Generalized T-eigenvalue of tensors). Let A, B ∈ C^{m×m×ℓ}. If there are a scalar λ ∈ C and a nonzero tensor X ∈ C^{m×1×ℓ} such that
\[
\mathcal{A} * \mathcal{X} = \lambda(\mathcal{B} * \mathcal{X}), \tag{3.1}
\]
then λ is called a generalized T-eigenvalue of A relative to B, and X is a T-eigenvector associated with λ.

Remark 3.1. By the definition of the tensor-tensor multiplication given in (1.2), the equality (3.1) is equivalent to
\[
\operatorname{bcirc}(\mathcal{A})\operatorname{unfold}(\mathcal{X}) = \lambda \cdot \bigl(\operatorname{bcirc}(\mathcal{B})\operatorname{unfold}(\mathcal{X})\bigr). \tag{3.2}
\]
Note that unfold(X) is a vector of size mℓ, so this generalized eigenvalue problem based on the tensor-tensor product is closely related to the classical generalized matrix eigenvalue problem. Hence, there are mℓ eigenvalues for the problem (3.1) if and only if rank(B) := rank(bcirc(B)) = mℓ. If B is rank deficient, then the set of all generalized T-eigenvalues of A relative to B may be finite, empty or infinite.

Remark 3.2. If we choose B as the identity tensor in (3.1), then we get
\[
\mathcal{A} * \mathcal{X} = \lambda \mathcal{X}. \tag{3.3}
\]
This is the case given in [6, Definition 2.5], which defines the T-eigenvalue of a third-order tensor. An equivalent definition based on the tensor singular value decomposition is also given in [14]. Therefore Definition 3.1 generalizes the definitions given in [6, Definition 2.5] and [14]. In the same way that (3.2) is equivalent to (3.1), (3.3) can be transformed into
\[
\operatorname{bcirc}(\mathcal{A})\operatorname{unfold}(\mathcal{X}) = \lambda \cdot \operatorname{unfold}(\mathcal{X}).
\]
That is to say, all T-eigenvalues of the tensor A are eigenvalues of the block circulant matrix bcirc(A), and vice versa.
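Remark 3.2 suggests a direct way to compute T-eigenvalues numerically. The sketch below, under the assumption that frontal slices are stored along the last NumPy axis, compares the eigenvalues of bcirc(A) with the eigenvalues of the slices obtained by a discrete Fourier transform along the third mode (the diagonal blocks that appear when bcirc(A) is block-diagonalized). The helper names and the random test tensor are illustrative only.

```python
import numpy as np

def bcirc(A):
    """Block circulant matrix (1.1) built from the frontal slices A[:, :, k]."""
    n1, n2, n3 = A.shape
    M = np.zeros((n1 * n3, n2 * n3), dtype=A.dtype)
    for i in range(n3):
        for j in range(n3):
            M[i * n1:(i + 1) * n1, j * n2:(j + 1) * n2] = A[:, :, (i - j) % n3]
    return M

def t_eigenvalues(A):
    """T-eigenvalues as the eigenvalues of bcirc(A) (Remark 3.2)."""
    return np.linalg.eigvals(bcirc(A))

def t_eigenvalues_fft(A):
    """Same set, computed slice by slice after an FFT along the third mode."""
    A_hat = np.fft.fft(A, axis=2)
    return np.concatenate(
        [np.linalg.eigvals(A_hat[:, :, k]) for k in range(A.shape[2])]
    )

rng = np.random.default_rng(1)
A = rng.standard_normal((3, 3, 4))      # hypothetical 3 x 3 x 4 test tensor
ev_b = t_eigenvalues(A)
ev_f = t_eigenvalues_fft(A)

# Compare the two sets without relying on any ordering of complex eigenvalues.
gap = np.abs(ev_b[:, None] - ev_f[None, :])
print(gap.min(axis=1).max(), gap.min(axis=0).max())
```

The slice-wise route only needs ℓ eigenvalue problems of size m instead of one of size mℓ, which is the usual reason to prefer it in practice.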
3.2 Gershgorin circle theorem for tensors

We first consider the simple but fundamental Gershgorin circle theorem in this subsection.

Let A ∈ C^{m×m×ℓ}. With the help of the normalized discrete Fourier transform matrix, bcirc(A) can be block-diagonalized. Furthermore, by a sequence of similarity transformations, it can be made "more diagonal". In this case, we can use the diagonal entries to approximate the T-eigenvalues of the original tensor A.

Theorem 3.1. (Gershgorin circle theorem for tensors). Let A ∈ C^{m×m×ℓ} and assume
\[
X^{-1}(F_m \otimes I_n)\operatorname{bcirc}(\mathcal{A})(F_m^H \otimes I_n)X = D + F,
\]
where X is the transformation matrix, D = diag(d1,...,dmℓ) and F has zero diagonal entries. Then we have
\[
\Lambda(\mathcal{A}) \subseteq \bigcup_{i=1}^{m\ell} \Theta_i,
\]
where Λ(A) denotes the set of T-eigenvalues of A and
\[
\Theta_i = \Bigl\{ z \in \mathbb{C} : |z - d_i| \le \sum_{j=1}^{m\ell} |f_{ij}| \Bigr\}, \qquad i = 1, \ldots, m\ell.
\]

Proof. Let λ ∈ Λ(A). Without loss of generality, we suppose that λ ≠ di for i = 1,...,mℓ. Notice that T-eigenvalues are not affected by similarity transformations, so the matrix I − (λI − D)^{−1}F is singular. Therefore,
\[
1 \le \bigl\| (D - \lambda I)^{-1} F \bigr\|_\infty \le \frac{1}{|d_k - \lambda|} \sum_{j=1}^{m\ell} |f_{kj}|
\]
for some k. The above inequalities imply that |λ − dk| ≤ Σ_{j=1}^{mℓ} |fkj|, which further implies λ ∈ Θk. Since λ is arbitrary, the proof is complete.

A different version of the Gershgorin circle theorem for tensors has been considered at the same time in [2]. In our theorem, the circles are based on the diagonal entries of the transformed matrix, whereas in Theorem 5.2 of [2] the circles are based on the entries of the original tensor.
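The discs of Theorem 3.1 can be checked numerically. The following sketch takes the simplest choice X = I, so D + F is just the Fourier-transformed block circulant matrix. It is written under stated assumptions: the Fourier factor is taken as the ℓ × ℓ normalized DFT matrix Kronecker the m × m identity so that its size matches bcirc(A), and the helper names and the random test tensor are hypothetical.

```python
import numpy as np

def bcirc(A):
    """Block circulant matrix (1.1) built from the frontal slices A[:, :, k]."""
    n1, n2, n3 = A.shape
    M = np.zeros((n1 * n3, n2 * n3), dtype=A.dtype)
    for i in range(n3):
        for j in range(n3):
            M[i * n1:(i + 1) * n1, j * n2:(j + 1) * n2] = A[:, :, (i - j) % n3]
    return M

def gershgorin_discs(A):
    """Centers d_i and radii sum_j |f_ij| of the discs, with X = I."""
    m, _, l = A.shape
    F = np.fft.fft(np.eye(l)) / np.sqrt(l)      # normalized DFT matrix
    K = np.kron(F, np.eye(m))                   # unitary Fourier factor
    M = K @ bcirc(A) @ K.conj().T               # block diagonal, equals D + F
    centers = np.diag(M)
    radii = np.abs(M).sum(axis=1) - np.abs(centers)
    return centers, radii

rng = np.random.default_rng(2)
A = rng.standard_normal((3, 3, 4))              # hypothetical test tensor
centers, radii = gershgorin_discs(A)
eigs = np.linalg.eigvals(bcirc(A))              # T-eigenvalues of A

# Every T-eigenvalue should lie in at least one disc (small slack for rounding).
covered = [bool((np.abs(lam - centers) <= radii + 1e-9).any()) for lam in eigs]
print(all(covered))
```

A nontrivial transformation X, for instance a diagonal scaling, would shrink some radii at the cost of enlarging others; the sketch leaves that choice open.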
3.3 Bauer-Fike theorem for tensors

It is well known that the Bauer-Fike theorem is a classical result for complex-valued diagonalizable matrices. It concerns the perturbation theory of eigenvalues. More specifically, it states that the deviation of an eigenvalue of the perturbed matrix from a properly chosen eigenvalue of the exact matrix is bounded above by the product of the condition number of the eigenvector matrix and the norm of the perturbation [1].

For non-diagonalizable matrices, the Bauer-Fike theorem has been generalized in [3, 4]. Moreover, a version of this result concerning only part of the spectrum of a matrix was also considered in [3]. We generalize this celebrated result to the third-order tensor case as follows. Note that the Bauer-Fike theorem considered at the same time in Theorem 5.3 of [2] is a special case of the next result, since only the 2-norm is considered there.
Theorem 3.2 (Bauer-Fike Theorem for Tensors). Let A ∈ C^{m×m×n} be an F-diagonalizable tensor. That is,
\[
\mathcal{P}^{-1} * \mathcal{A} * \mathcal{P} = \mathcal{D}, \tag{3.4}
\]
where D is an F-diagonal tensor. Suppose that µ is a T-eigenvalue of A + δA, where δA ∈ C^{m×m×n} is a small perturbation. Then, for the spectral norm or the Frobenius norm, there exists a T-eigenvalue λ of A such that
\[
|\lambda - \mu| \le \kappa_p(\mathcal{P}) \|\delta\mathcal{A}\|_p, \qquad p = 2, F.
\]
Moreover, for the 1- and ∞-norms, we have
\[
|\lambda - \mu| \le \kappa_p(\mathcal{P}) \kappa_p(F_m \otimes I_n) \|\delta\mathcal{A}\|_p, \qquad p = 1, \infty.
\]
In the above expressions, Fm is the normalized discrete Fourier transform matrix, κp(P) = ‖bcirc(P)‖p ‖bcirc(P^{−1})‖p is the condition number of P, and ‖A‖p = ‖bcirc(A)‖p.

Proof. By Lemma 2.1, we know that (3.4) is equivalent to
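\[
(\operatorname{bcirc}(\mathcal{P}))^{-1} \operatorname{bcirc}(\mathcal{A}) \operatorname{bcirc}(\mathcal{P}) = \operatorname{bcirc}(\mathcal{D}).
\]
The right-hand side of the above equality is a block circulant matrix, so it can be block-diagonalized by the discrete Fourier transform matrix. That is to say,
\[
(F_m \otimes I_n) \operatorname{bcirc}(\mathcal{D}) (F_m^H \otimes I_n) = D =
\begin{bmatrix}
D^{(1)} & & & \\
& D^{(2)} & & \\
& & \ddots & \\
& & & D^{(n)}
\end{bmatrix}.
\]
By Lemma 2.3, the matrices D^{(1)}, D^{(2)},...,D^{(n)} are diagonal since D is an F-diagonal tensor. It is not hard to see that all diagonal entries of D^{(1)}, D^{(2)},...,D^{(n)} are the T-eigenvalues of the tensor A. Also note that bcirc(A + δA) = bcirc(A) + bcirc(δA). Combining the two arguments above, we obtain
\[
\begin{aligned}
&(F_m \otimes I_n)\bigl[(\operatorname{bcirc}(\mathcal{P}))^{-1} \operatorname{bcirc}(\mathcal{A} + \delta\mathcal{A}) \operatorname{bcirc}(\mathcal{P})\bigr](F_m^H \otimes I_n) \\
&\quad = (F_m \otimes I_n)\bigl[(\operatorname{bcirc}(\mathcal{P}))^{-1} \operatorname{bcirc}(\mathcal{A}) \operatorname{bcirc}(\mathcal{P})\bigr](F_m^H \otimes I_n) \\
&\qquad + (F_m \otimes I_n)\bigl[(\operatorname{bcirc}(\mathcal{P}))^{-1} \operatorname{bcirc}(\delta\mathcal{A}) \operatorname{bcirc}(\mathcal{P})\bigr](F_m^H \otimes I_n) \\
&\quad = D + (F_m \otimes I_n)\bigl[(\operatorname{bcirc}(\mathcal{P}))^{-1} \operatorname{bcirc}(\delta\mathcal{A}) \operatorname{bcirc}(\mathcal{P})\bigr](F_m^H \otimes I_n).
\end{aligned}
\tag{3.5}
\]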
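Without loss of generality, we assume that µ ∉ Λ(A); otherwise the result is trivially true. Let µ be a T-eigenvalue of A + δA. Then µ is an eigenvalue of bcirc(A + δA), and therefore det(bcirc(A + δA) − µI_{mn}) = 0. By (3.5), one finds that
\[
\begin{aligned}
0 &= \det\bigl(\operatorname{bcirc}(\mathcal{A}) + \operatorname{bcirc}(\delta\mathcal{A}) - \mu I_{mn}\bigr) \\
&= \det(F_m \otimes I_n) \cdot \det\bigl((\operatorname{bcirc}(\mathcal{P}))^{-1}\bigr) \cdot \det\bigl(\operatorname{bcirc}(\mathcal{A}) + \operatorname{bcirc}(\delta\mathcal{A}) - \mu I_{mn}\bigr) \\
&\qquad \cdot \det(\operatorname{bcirc}(\mathcal{P})) \cdot \det(F_m^H \otimes I_n) \\
&= \det\bigl(D + (F_m \otimes I_n)\bigl[(\operatorname{bcirc}(\mathcal{P}))^{-1} \operatorname{bcirc}(\delta\mathcal{A}) \operatorname{bcirc}(\mathcal{P})\bigr](F_m^H \otimes I_n) - \mu I_{mn}\bigr) \\
&= \det(D - \mu I_{mn}) \\
&\qquad \cdot \det\bigl((D - \mu I_{mn})^{-1}(F_m \otimes I_n)\bigl[(\operatorname{bcirc}(\mathcal{P}))^{-1} \operatorname{bcirc}(\delta\mathcal{A}) \operatorname{bcirc}(\mathcal{P})\bigr](F_m^H \otimes I_n) + I_{mn}\bigr).
\end{aligned}
\tag{3.6}
\]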
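The assumption that µ ∉ Λ(A) implies that
\[
\det\bigl((D - \mu I)^{-1}(F_m \otimes I_n)\bigl[(\operatorname{bcirc}(\mathcal{P}))^{-1} \operatorname{bcirc}(\delta\mathcal{A}) \operatorname{bcirc}(\mathcal{P})\bigr](F_m^H \otimes I_n) + I_{mn}\bigr) = 0,
\]
which shows that −1 is an eigenvalue of the matrix
\[
(D - \mu I)^{-1}(F_m \otimes I_n)\bigl[(\operatorname{bcirc}(\mathcal{P}))^{-1} \operatorname{bcirc}(\delta\mathcal{A}) \operatorname{bcirc}(\mathcal{P})\bigr](F_m^H \otimes I_n).
\]
Since all p-norms (p = 1, 2, F, ∞) are consistent matrix norms, we have
\[
\begin{aligned}
|-1| &\le \bigl\|(D - \mu I)^{-1}(F_m \otimes I_n)\bigl[(\operatorname{bcirc}(\mathcal{P}))^{-1} \operatorname{bcirc}(\delta\mathcal{A}) \operatorname{bcirc}(\mathcal{P})\bigr](F_m^H \otimes I_n)\bigr\|_p \\
&\le \bigl\|(D - \mu I)^{-1}\bigr\|_p \|F_m \otimes I_n\|_p \|(\operatorname{bcirc}(\mathcal{P}))^{-1}\|_p \|\operatorname{bcirc}(\delta\mathcal{A})\|_p \|\operatorname{bcirc}(\mathcal{P})\|_p \|F_m^H \otimes I_n\|_p \\
&= \bigl\|(D - \mu I)^{-1}\bigr\|_p \cdot \kappa_p(\operatorname{bcirc}(\mathcal{P})) \cdot \kappa_p(F_m \otimes I_n) \cdot \|\operatorname{bcirc}(\delta\mathcal{A})\|_p \\
&= \bigl\|(D - \mu I)^{-1}\bigr\|_p \, \kappa_p(\mathcal{P}) \, \kappa_p(F_m \otimes I_n) \, \|\operatorname{bcirc}(\delta\mathcal{A})\|_p.
\end{aligned}
\]
Notice that (D − µI)^{−1} is a diagonal matrix, so for p = 1, 2, ∞ we have
\[
\bigl\|(D - \mu I)^{-1}\bigr\|_p = \max_{\|x\|_p \ne 0} \frac{\|(D - \mu I)^{-1} x\|_p}{\|x\|_p} = \max_{\lambda \in \Lambda(\mathcal{A})} \frac{1}{|\lambda - \mu|} = \frac{1}{\min_{\lambda \in \Lambda(\mathcal{A})} |\lambda - \mu|}. \tag{3.7}
\]
Combining (3.7) with the preceding chain of inequalities yields
\[
\min_{\lambda \in \Lambda(\mathcal{A})} |\lambda - \mu| \le \kappa_p(\mathcal{P}) \kappa_p(F_m \otimes I_n) \|\delta\mathcal{A}\|_p. \tag{3.8}
\]
Finally, for the 2-norm case, we obtain
\[
\kappa_2(F_m \otimes I_n) = \|F_m \otimes I_n\|_2 \|F_m^H \otimes I_n\|_2 = 1
\]
since Fm ⊗ In and Fm^H ⊗ In are unitary. The result for the Frobenius norm follows immediately, since the spectral norm of a matrix is not larger than its Frobenius norm. The proof is completed.

The following conclusion describes the relationship between the variation of the T-spectrum and the difference of two tensors. We omit the proof since it follows easily from the above theorem.

Corollary 3.1. Let A, B ∈ C^{m×m×n}, where A is an F-diagonalizable tensor with decomposition
\[
\mathcal{P}^{-1} * \mathcal{A} * \mathcal{P} = \mathcal{D}.
\]
Let sA(B) be the distance between the two T-spectral sets ΛA = {λi}_{i=1}^{mn} and ΛB = {µi}_{i=1}^{mn}, defined by
\[
s_{\mathcal{A}}(\mathcal{B}) = \max_{1 \le j \le mn} \Bigl\{ \min_{1 \le i \le mn} |\lambda_i - \mu_j| \Bigr\}.
\]
Then we have sA(B) ≤ κ2(P)‖B − A‖2. If A is normal, then sA(B) ≤ ‖B − A‖2.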
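The spectral-norm bound of Theorem 3.2 and Corollary 3.1 can be checked on a small example. In the sketch below, an F-diagonalizable tensor is generated in the bcirc domain as bcirc(P) bcirc(D) bcirc(P)^{-1}, which by Lemma 2.1 is the block circulant matrix of P ∗ D ∗ P^{-1}. The sizes, the random seed and the perturbation level 1e-3 are arbitrary assumptions made for illustration.

```python
import numpy as np

def bcirc(A):
    """Block circulant matrix (1.1) built from the frontal slices A[:, :, k]."""
    n1, n2, n3 = A.shape
    M = np.zeros((n1 * n3, n2 * n3), dtype=A.dtype)
    for i in range(n3):
        for j in range(n3):
            M[i * n1:(i + 1) * n1, j * n2:(j + 1) * n2] = A[:, :, (i - j) % n3]
    return M

rng = np.random.default_rng(3)
m, n = 3, 4
P = rng.standard_normal((m, m, n))               # transformation tensor
D = np.zeros((m, m, n))
for k in range(n):
    D[:, :, k] = np.diag(rng.standard_normal(m))  # F-diagonal tensor
dA = 1e-3 * rng.standard_normal((m, m, n))        # perturbation tensor

bP = bcirc(P)
bA = bP @ bcirc(D) @ np.linalg.inv(bP)            # bcirc(P * D * P^{-1})
lam = np.linalg.eigvals(bA)                       # T-eigenvalues of A
mu = np.linalg.eigvals(bA + bcirc(dA))            # T-eigenvalues of A + dA

kappa = np.linalg.cond(bP, 2)                     # kappa_2(P)
bound = kappa * np.linalg.norm(bcirc(dA), 2)      # kappa_2(P) * ||dA||_2

# s_A(B) from Corollary 3.1: worst distance from a perturbed T-eigenvalue
# to the nearest unperturbed one.
s = np.abs(lam[:, None] - mu[None, :]).min(axis=0).max()
print(s, bound)
```

The bound is typically far from sharp when κ2(P) is large, which is consistent with its role as a worst-case estimate.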
Generally, most tensors are not F-diagonalizable, which means that (3.4) is not satisfied. Therefore, we cannot obtain an F-diagonal tensor by a transformation under tensor-tensor multiplication. In this case, we have the following decomposition.

Lemma 3.1. [14] (T-Schur decomposition). Let A ∈ C^{m×m×n}. Then there exists a unitary tensor Q such that
\[
\mathcal{Q}^{-1} * \mathcal{A} * \mathcal{Q} = \mathcal{T} = \mathcal{D} + \mathcal{N}, \tag{3.9}
\]
where D is an F-diagonal tensor and each frontal slice of N is strictly upper triangular.

In the following, we present two generalizations of the Bauer-Fike theorem for tensors (i.e., Theorem 3.2). They are based on the T-Schur decomposition and can be viewed as generalizations of the matrix results for non-diagonalizable matrices given in [3, 4].

Theorem 3.3. (Generalization of Bauer-Fike theorem). Let Q^{−1} ∗ A ∗ Q = D + N be a T-Schur decomposition of A ∈ C^{m×m×n} as given in (3.9). Let B ∈ C^{m×m×n} and let ϵ be a small scalar. If µ is a T-eigenvalue of A + ϵB and q is the smallest positive integer such that |N|^q = 0, where
\[
N :=
\begin{bmatrix}
N^{(1)} & & & \\
& N^{(2)} & & \\
& & \ddots & \\
& & & N^{(n)}
\end{bmatrix}
= (F_m \otimes I_n) \cdot \operatorname{bcirc}(\mathcal{N}) \cdot (F_m^H \otimes I_n)
\]
and |N| = (|Nij|) denotes the element-wise absolute value of a matrix, then for the spectral and Frobenius norms we have
\[
\min_{\lambda \in \Lambda(\mathcal{A})} |\lambda - \mu| \le \max\bigl\{\theta, \theta^{1/q}\bigr\}, \tag{3.10}
\]
in which
\[
\theta = \|\operatorname{bcirc}(\epsilon\mathcal{B})\|_p \sum_{k=0}^{q-1} \|N\|_p^k, \qquad p = 2, F. \tag{3.11}
\]
For the 1- and ∞-norms, we get
\[
\min_{\lambda \in \Lambda(\mathcal{A})} |\lambda - \mu| \le \max\bigl\{\theta_p, \theta_p^{1/q}\bigr\}, \tag{3.12}
\]
where
\[
\theta_p = \|\operatorname{bcirc}(\epsilon\mathcal{B})\|_p \, \kappa_p(\mathcal{Q}) \, \kappa_p(F_m \otimes I_n) \sum_{k=0}^{q-1} \|N\|_2^k, \qquad p = 1, \infty.
\]

Proof. The theorem is clearly true if µ ∈ Λ(A), as the left-hand sides of (3.10) and (3.12) vanish. Therefore we assume that µ ∉ Λ(A). By Lemma 2.1, we can see that
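\[
\mu I_{mn} - \operatorname{bcirc}(\mathcal{A} + \epsilon\mathcal{B}) = \mu I_{mn} - \operatorname{bcirc}(\mathcal{A}) - \operatorname{bcirc}(\epsilon\mathcal{B}),
\]
and moreover this matrix is singular. This means that
\[
(F_m \otimes I_n) \operatorname{bcirc}(\mathcal{Q})^{-1} \bigl[\mu I_{mn} - \operatorname{bcirc}(\mathcal{A}) - \operatorname{bcirc}(\epsilon\mathcal{B})\bigr] \operatorname{bcirc}(\mathcal{Q}) (F_m^H \otimes I_n) \tag{3.13}
\]
is also singular, since the matrices multiplied on the left and right sides are nonsingular. Notice that
\[
\begin{aligned}
(F_m \otimes I_n) \cdot \operatorname{bcirc}(\mathcal{Q})^{-1} \operatorname{bcirc}(\mathcal{A}) \operatorname{bcirc}(\mathcal{Q}) \cdot (F_m^H \otimes I_n)
&= (F_m \otimes I_n) \cdot [\operatorname{bcirc}(\mathcal{D}) + \operatorname{bcirc}(\mathcal{N})] \cdot (F_m^H \otimes I_n) \\
&= \begin{bmatrix} D^{(1)} & & \\ & \ddots & \\ & & D^{(n)} \end{bmatrix}
 + \begin{bmatrix} N^{(1)} & & \\ & \ddots & \\ & & N^{(n)} \end{bmatrix} \\
&:= D + N.
\end{aligned}
\]
Therefore (3.13) can be rewritten as
\[
\mu I_{mn} - D - N - (F_m \otimes I_n) \cdot \operatorname{bcirc}(\mathcal{Q})^{-1} \operatorname{bcirc}(\epsilon\mathcal{B}) \operatorname{bcirc}(\mathcal{Q}) \cdot (F_m^H \otimes I_n),
\]
and then the matrix
\[
I_{mn} - (\mu I_{mn} - D - N)^{-1} (F_m \otimes I_n) \cdot \operatorname{bcirc}(\mathcal{Q})^{-1} \operatorname{bcirc}(\epsilon\mathcal{B}) \operatorname{bcirc}(\mathcal{Q}) \cdot (F_m^H \otimes I_n) \tag{3.14}
\]
is singular.

By the assumption that |N|^q = 0, and noting that µI_{mn} − D is diagonal, it follows that ((µI_{mn} − D)^{−1}N)^q = 0. Hence,
\[
\bigl((\mu I_{mn} - D) - N\bigr)^{-1} = \sum_{k=0}^{q-1} \bigl((\mu I_{mn} - D)^{-1} N\bigr)^k (\mu I_{mn} - D)^{-1}
\]
and
\[
\bigl\|\bigl((\mu I_{mn} - D) - N\bigr)^{-1}\bigr\| \le \frac{1}{\min_{\lambda \in \Lambda(\mathcal{A})} |\lambda - \mu|} \sum_{k=0}^{q-1} \left( \frac{\|N\|}{\min_{\lambda \in \Lambda(\mathcal{A})} |\lambda - \mu|} \right)^{k}
\]
for the 1-, 2- and ∞-norms. If min_{λ∈Λ(A)} |λ − µ| ≥ 1, then
\[
\bigl\|\bigl((\mu I_{mn} - D) - N\bigr)^{-1}\bigr\| \le \frac{1}{\min_{\lambda \in \Lambda(\mathcal{A})} |\lambda - \mu|} \sum_{k=0}^{q-1} \|N\|^k,
\]
and if min_{λ∈Λ(A)} |λ − µ| < 1, then
\[
\bigl\|\bigl((\mu I_{mn} - D) - N\bigr)^{-1}\bigr\| \le \frac{1}{\bigl(\min_{\lambda \in \Lambda(\mathcal{A})} |\lambda - \mu|\bigr)^{q}} \sum_{k=0}^{q-1} \|N\|^k.
\]
By (3.14), we obtain
\[
\begin{aligned}
1 &\le \bigl\|(\mu I_{mn} - D - N)^{-1} (F_m \otimes I_n) \cdot \operatorname{bcirc}(\mathcal{Q})^{-1} \operatorname{bcirc}(\epsilon\mathcal{B}) \operatorname{bcirc}(\mathcal{Q}) \cdot (F_m^H \otimes I_n)\bigr\| \\
&\le \bigl\|(\mu I_{mn} - D - N)^{-1}\bigr\| \, \|\operatorname{bcirc}(\epsilon\mathcal{B})\| \, \|F_m \otimes I_n\| \, \|F_m^H \otimes I_n\| \, \|\operatorname{bcirc}(\mathcal{Q})^{-1}\| \, \|\operatorname{bcirc}(\mathcal{Q})\|,
\end{aligned}
\]
and for the spectral norm, in which the unitary factors have norm one, we get
\[
\min_{\lambda \in \Lambda(\mathcal{A})} |\lambda - \mu| \le \|\operatorname{bcirc}(\epsilon\mathcal{B})\|_2 \sum_{k=0}^{q-1} \|N\|_2^k
\quad \text{or} \quad
\Bigl(\min_{\lambda \in \Lambda(\mathcal{A})} |\lambda - \mu|\Bigr)^{q} \le \|\operatorname{bcirc}(\epsilon\mathcal{B})\|_2 \sum_{k=0}^{q-1} \|N\|_2^k
\]
for min_{λ∈Λ(A)} |λ − µ| ≥ 1 or min_{λ∈Λ(A)} |λ − µ| < 1, respectively. Let θ = ‖bcirc(ϵB)‖2 Σ_{k=0}^{q−1} ‖N‖2^k. Then we get the result (3.10) for the spectral norm. The Frobenius norm case follows easily. For the 1- and ∞-norms, by using (3.14) again we get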
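\[
\min_{\lambda \in \Lambda(\mathcal{A})} |\lambda - \mu| \le \max\bigl\{\theta_p, \theta_p^{1/q}\bigr\},
\]
where
\[
\theta_p = \|\operatorname{bcirc}(\epsilon\mathcal{B})\|_p \, \kappa_p(\mathcal{Q}) \, \kappa_p(F_m \otimes I_n) \sum_{k=0}^{q-1} \|N\|_2^k,
\]
and the proof is completed.

Now, we give one more general version of Theorem 3.2. Unlike the above theorem, which involves an F-diagonal tensor, the next result considers the block-diagonal case.

Let A ∈ C^{m×m×n}. Notice that bcirc(A) can be block-diagonalized as follows:
\[
(F_m \otimes I_n) \operatorname{bcirc}(\mathcal{A}) (F_m^H \otimes I_n) =
\begin{bmatrix}
A^{(1)} & & & \\
& A^{(2)} & & \\
& & \ddots & \\
& & & A^{(n)}
\end{bmatrix}.
\]
By Lemma 2.3, the matrices A^{(i)}, i = 1,...,n, may not be diagonal, since in general the tensor A is not F-diagonal. For each matrix A^{(i)}, let X^{(i)} be a transformation matrix such that (X^{(i)})^{−1} A^{(i)} X^{(i)} = diag(A^{(i)}_{k_i}), where each A^{(i)}_{k_i} is in triangular Schur form with
\[
A^{(i)}_{k_i} = D^{(i)}_{k_i} + N^{(i)}_{k_i}, \qquad k_i = 1, \ldots, \ell_i.
\]
Denote bcirc(X) = (Fm^H ⊗ In) diag(X^{(1)}, X^{(2)},...,X^{(n)}) (Fm ⊗ In). Then we get
\[
\begin{aligned}
&(F_m \otimes I_n) \cdot \operatorname{bcirc}(\mathcal{X})^{-1} \operatorname{bcirc}(\mathcal{A}) \operatorname{bcirc}(\mathcal{X}) \cdot (F_m^H \otimes I_n) \\
&\quad = \operatorname{diag}\bigl(\operatorname{diag}(A^{(1)}_{k_1}), \operatorname{diag}(A^{(2)}_{k_2}), \ldots, \operatorname{diag}(A^{(n)}_{k_n})\bigr) \\
&\quad = \operatorname{diag}\bigl(A^{(1)}_{1}, \ldots, A^{(1)}_{\ell_1}, \ldots, A^{(n)}_{1}, \ldots, A^{(n)}_{\ell_n}\bigr) \\
&\quad = \operatorname{diag}\bigl(D^{(1)}_{1}, \ldots, D^{(1)}_{\ell_1}, \ldots, D^{(n)}_{1}, \ldots, D^{(n)}_{\ell_n}\bigr)
 + \operatorname{diag}\bigl(N^{(1)}_{1}, \ldots, N^{(1)}_{\ell_1}, \ldots, N^{(n)}_{1}, \ldots, N^{(n)}_{\ell_n}\bigr) \\
&\quad := \tilde{D} + \tilde{N}.
\end{aligned}
\]
By the above analysis, we have the following conclusion.

Theorem 3.4. If µ is a T-eigenvalue of A + ϵB and q is the dimension of A^{(i)}_{k_i}, then we have
\[
\min_{\lambda \in \Lambda(\mathcal{A})} |\lambda - \mu| \le \max\bigl\{\theta_1, \theta_1^{1/q}\bigr\},
\]
where
\[
\theta_1 = C \epsilon \|\operatorname{bcirc}(\mathcal{B})\|_p \, \kappa_p(\mathcal{X}), \qquad p = 2, F,
\]
and C = Σ_{k=0}^{q−1} ‖N^{(i)}_{k_i}‖2^k, provided that max_j ‖(A^{(i)}_j − µI)^{−1}‖ is attained at j = k_i.

For the 1- and ∞-norms, under the above condition, we get
\[
\min_{\lambda \in \Lambda(\mathcal{A})} |\lambda - \mu| \le \max\bigl\{\theta_2, \theta_2^{1/q}\bigr\},
\]
where θ2 = Cϵ‖bcirc(B)‖p κp(X) κp(Fm ⊗ In).

Proof. We only need to consider the case that µ is not a T-eigenvalue of A. Hence µI_{mn} − D̃ − Ñ is nonsingular. Similarly to (3.14), the matrix
\[
I_{mn} - (\mu I_{mn} - \tilde{D} - \tilde{N})^{-1} (F_m \otimes I_n) \cdot \operatorname{bcirc}(\mathcal{X})^{-1} \operatorname{bcirc}(\epsilon\mathcal{B}) \operatorname{bcirc}(\mathcal{X}) \cdot (F_m^H \otimes I_n)
\]
is singular. The conclusion then follows by the same argument as in the proof of Theorem 3.3.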
3.4 Kahan theorem for tensors

The case of a Hermitian tensor perturbed by a Hermitian tensor is studied in [6]. Next, we give a result for a Hermitian tensor perturbed by an arbitrary tensor.

Theorem 3.5. (Kahan theorem for tensors). Let A ∈ R^{m×m×n} be a Hermitian tensor. Suppose that its set of T-eigenvalues is denoted by ΛA = {λi}_{i=1}^{mn}, with the T-eigenvalues arranged in non-increasing order:
\[
\lambda_{\max} = \lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_{mn-1} \ge \lambda_{mn} = \lambda_{\min}. \tag{3.15}
\]
Suppose that B = A + E and let ΛB = {βk + iγk}_{k=1}^{mn} be such that β1 ≥ β2 ≥ ··· ≥ β_{mn−1} ≥ β_{mn}. Let
\[
E_y = \frac{\operatorname{bcirc}(\mathcal{E}) - \operatorname{bcirc}(\mathcal{E})^H}{2i}
\]
and σk = {β + iγ ∈ C : |β + iγ − λk| ≤ ‖E‖2, |γ| ≤ ‖Ey‖2}. Then
\[
\Lambda_{\mathcal{B}} \subset \bigcup_{k=1}^{mn} \sigma_k.
\]

Proof. On the one hand, by Lemma 2.4, a Hermitian tensor is F-diagonalizable by a unitary tensor. According to Corollary 3.1, for any given T-eigenvalue β + iγ of B there exists a λk such that |β + iγ − λk| ≤ ‖E‖2.

On the other hand, suppose that B ∗ X = (β + iγ)X. By the definition of tensor-tensor multiplication, this is equivalent to
\[
\operatorname{bcirc}(\mathcal{B}) \operatorname{unfold}(\mathcal{X}) = (\beta + i\gamma) \operatorname{unfold}(\mathcal{X}).
\]
Since X is nonzero, the vector unfold(X) is also nonzero, so we may assume that ‖unfold(X)‖2 = 1. Therefore,
\[
(\operatorname{unfold}(\mathcal{X}))^H \operatorname{bcirc}(\mathcal{B}) \operatorname{unfold}(\mathcal{X}) = \beta + i\gamma
\quad \text{and} \quad
(\operatorname{unfold}(\mathcal{X}))^H \operatorname{bcirc}(\mathcal{B})^H \operatorname{unfold}(\mathcal{X}) = \beta - i\gamma,
\]
which, since bcirc(A) is Hermitian and hence bcirc(B) − bcirc(B)^H = bcirc(E) − bcirc(E)^H, implies that
\[
\gamma = \frac{(\operatorname{unfold}(\mathcal{X}))^H \bigl[\operatorname{bcirc}(\mathcal{B}) - \operatorname{bcirc}(\mathcal{B})^H\bigr] \operatorname{unfold}(\mathcal{X})}{2i} = (\operatorname{unfold}(\mathcal{X}))^H E_y \operatorname{unfold}(\mathcal{X}).
\]
Thus |γ| ≤ ‖Ey‖2. Our conclusion follows by combining these two parts.
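The two inclusions in Theorem 3.5 can be illustrated numerically. The sketch below builds a Hermitian tensor as (G + G^H)/2 using the conjugate transpose of Definition 2.1, perturbs it by a small non-Hermitian tensor, and checks both conditions defining the sets σk. The helper `t_herm`, the complex test data and the perturbation size are assumptions made for illustration (a complex Hermitian tensor is used here rather than a real one).

```python
import numpy as np

def bcirc(A):
    """Block circulant matrix (1.1) built from the frontal slices A[:, :, k]."""
    n1, n2, n3 = A.shape
    M = np.zeros((n1 * n3, n2 * n3), dtype=A.dtype)
    for i in range(n3):
        for j in range(n3):
            M[i * n1:(i + 1) * n1, j * n2:(j + 1) * n2] = A[:, :, (i - j) % n3]
    return M

def t_herm(A):
    """Conjugate transpose of Definition 2.1: transpose-conjugate every slice,
    then reverse the order of slices 2 through n3."""
    AH = np.conj(np.transpose(A, (1, 0, 2)))
    return np.concatenate([AH[:, :, :1], AH[:, :, :0:-1]], axis=2)

rng = np.random.default_rng(4)
m, n = 3, 4
G = rng.standard_normal((m, m, n)) + 1j * rng.standard_normal((m, m, n))
A = (G + t_herm(G)) / 2                           # Hermitian tensor
E = 1e-2 * (rng.standard_normal((m, m, n)) + 1j * rng.standard_normal((m, m, n)))

bA, bE = bcirc(A), bcirc(E)
lam = np.linalg.eigvalsh(bA)                      # real T-eigenvalues of A
mu = np.linalg.eigvals(bA + bE)                   # T-eigenvalues beta + i*gamma of B
Ey = (bE - bE.conj().T) / 2j                      # E_y from Theorem 3.5

r = np.linalg.norm(bE, 2)                         # ||E||_2
h = np.linalg.norm(Ey, 2)                         # ||E_y||_2
dist = np.abs(mu[:, None] - lam[None, :]).min(axis=1)
print(dist.max() <= r, np.abs(mu.imag).max() <= h)
```

The second condition is usually the sharper one for the imaginary parts, because ‖Ey‖2 only measures the skew-Hermitian part of the perturbation.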
4 Pseudospectra of third-order tensors

Pseudospectra of finite-dimensional matrices have been thoroughly investigated in the classical book by Trefethen [21]. There, three definitions of pseudospectra for an arbitrary norm and one definition for the spectral norm are given, and those definitions are equivalent under certain conditions. Many properties and representative numerical results are illustrated with numerous figures.

In this section, we study the pseudospectra theory of third-order tensors in the sense of tensor-tensor multiplication. The definition of pseudospectra on the basis of the T-eigenvalue defined in Definition 3.1 is given first, followed by some properties based on this definition.
4.1 Pseudospectra of third-order tensors under tensor-tensor multiplication

First, we give the definition of the ε-pseudospectra of an m × m × n tensor A. If the norm ‖·‖p is not specified, we take the convention that p = 1, 2, ∞.

Definition 4.1 (ε-pseudospectra of a Tensor). Let A ∈ C^{m×m×n}. Then the block circulant matrix bcirc(A) generated by the tensor A can be factored as follows:
\[
\operatorname{bcirc}(\mathcal{A}) = (F_m^H \otimes I_n) \cdot
\begin{bmatrix}
A^{(1)} & & & \\
& A^{(2)} & & \\
& & \ddots & \\
& & & A^{(n)}
\end{bmatrix}
\cdot (F_m \otimes I_n). \tag{4.1}
\]
For each square matrix A^{(i)}, where i ∈ [n] := {1,...,n}, we have
\[
\Lambda_\varepsilon(A^{(i)}) := \bigl\{ z \in \mathbb{C} : \bigl\|(zI_m - A^{(i)})^{-1}\bigr\| \ge \varepsilon^{-1} \bigr\},
\]
where ε is a positive scalar. If zIm − A^{(i)} is singular for some i, we define ‖(zIm − A^{(i)})^{−1}‖ = ∞. In this definition, we denote the block diagonal matrix in (4.1) by A := diag(A^{(1)},...,A^{(n)}).

(I) We call
\[
\Lambda_\varepsilon(\mathcal{A}) := \Bigl\{ z \in \mathbb{C} : \max_{i \in [n]} \bigl\|(zI_m - A^{(i)})^{-1}\bigr\| \ge \varepsilon^{-1} \Bigr\}
\]
the ε-pseudospectra of the tensor A.

(II) Λε(A) = {z ∈ C : z ∈ Λ(A + E) for some E ∈ C^{mn×mn} with ‖E‖ ≤ ε}, where A on the right-hand side is the block diagonal matrix.

(III) Λε(A) = {z ∈ C : there exists v ∈ C^{mn} with ‖v‖ = 1 such that ‖(A − zI_{mn})v‖ ≤ ε}, where A on the right-hand side is again the block diagonal matrix.

Theorem 4.1. The three definitions (I), (II) and (III) are equivalent.

Proof. For the block diagonal matrix A, we have ‖A‖ = max_{i∈[n]} ‖A^{(i)}‖, and the inverse of A can be obtained by computing the inverse of each A^{(i)}. Therefore,
\[
\max_{i \in [n]} \bigl\|(zI_m - A^{(i)})^{-1}\bigr\| = \bigl\|(zI_{mn} - A)^{-1}\bigr\|
\]
and thus
\[
\Lambda_\varepsilon(\mathcal{A}) = \bigl\{ z \in \mathbb{C} : \bigl\|(zI_{mn} - A)^{-1}\bigr\| \ge \varepsilon^{-1} \bigr\}. \tag{4.2}
\]
The equivalence of (4.2) with (II) and (III) follows directly from the matrix case [21].

Remark 4.1. Notice that z ∈ C is a variable, while each A^{(i)} is fixed once the tensor A is given. Therefore, by definition (I), we can also see that
\[
\Lambda_\varepsilon(\mathcal{A}) = \bigcup_{i=1}^{n} \Lambda_\varepsilon(A^{(i)}) = \bigcup_{i=1}^{n} \bigl\{ z \in \mathbb{C} : \bigl\|(zI_m - A^{(i)})^{-1}\bigr\| \ge \varepsilon^{-1} \bigr\}.
\]

For the spectral norm, we give the following definition.

Definition 4.2. Let A ∈ C^{m×m×n}. Then the block circulant matrix bcirc(A) generated by the tensor A can be factored as in (4.1). For each square matrix A^{(i)}, where i ∈ [n], we have
\[
\Lambda_\varepsilon(A^{(i)}) = \bigl\{ z \in \mathbb{C} : \sigma_{\min}(zI_m - A^{(i)}) \le \varepsilon \bigr\},
\]
where ε is a positive scalar and σmin(·) denotes the minimum singular value. We call
\[
\Lambda_\varepsilon(\mathcal{A}) = \bigcup_{i=1}^{n} \Lambda_\varepsilon(A^{(i)})
\]
the ε-pseudospectra of the tensor A.

By Remark 4.1, we get the following conclusion.

Theorem 4.2. The three definitions given in Definition 4.1 and the one in Definition 4.2 are also equivalent.
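Definition 4.2 translates directly into a membership test that can be evaluated on a grid of points z. The sketch below assumes that the diagonal blocks A^{(i)} in (4.1) coincide, up to ordering, with the slices of the FFT of the tensor along its third mode, and it compares the slice-wise minimum singular value with the one computed from the full matrix bcirc(A). Function names and test data are hypothetical.

```python
import numpy as np

def bcirc(A):
    """Block circulant matrix (1.1) built from the frontal slices A[:, :, k]."""
    n1, n2, n3 = A.shape
    M = np.zeros((n1 * n3, n2 * n3), dtype=A.dtype)
    for i in range(n3):
        for j in range(n3):
            M[i * n1:(i + 1) * n1, j * n2:(j + 1) * n2] = A[:, :, (i - j) % n3]
    return M

def slice_sigma_min(A, z):
    """min over i of sigma_min(z I_m - A^(i)), with A^(i) taken as FFT slices."""
    A_hat = np.fft.fft(A, axis=2)
    m = A.shape[0]
    return min(
        np.linalg.svd(z * np.eye(m) - A_hat[:, :, k], compute_uv=False)[-1]
        for k in range(A.shape[2])
    )

def in_pseudospectrum(A, z, eps):
    """Membership test of Definition 4.2 (spectral norm)."""
    return bool(slice_sigma_min(A, z) <= eps)

rng = np.random.default_rng(5)
A = rng.standard_normal((3, 3, 4))                # hypothetical test tensor
bA = bcirc(A)
zs = [0.1 + 0.2j, -1.5 + 0.3j, 2.0 - 0.7j]
pairs = []
for z in zs:
    full = np.linalg.svd(z * np.eye(bA.shape[0]) - bA, compute_uv=False)[-1]
    pairs.append((slice_sigma_min(A, z), full))
print(pairs)
print(in_pseudospectrum(A, np.linalg.eigvals(bA)[0], 1e-6))
```

Evaluating `slice_sigma_min` on a grid and drawing its level curves at several values of ε is one common way to visualize the boundaries of the ε-pseudospectra.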
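4.2 Properties of pseudospectra of tensors

We study the properties of pseudospectra for third-order tensors in this subsection. Many fundamental results are collected in the next theorem.

Theorem 4.3. Let A ∈ C^{m×m×n} and let ε > 0 be arbitrary.
(1) The set Λε(A) is nonempty, open, and bounded. Moreover, it has at most nm connected components, each containing one or more T-eigenvalues of A.
(2) For any c ∈ C, we have Λε(A + c) = c + Λε(A), where A + c is shorthand for A + cI and I is the identity tensor with the same size as A.
(3) For any nonzero c ∈ C, we have Λ_{|c|ε}(cA) = cΛε(A).
(4) If the spectral norm is applied, then
\[
\Lambda_\varepsilon(\mathcal{A}^H) = \overline{\Lambda_\varepsilon(\mathcal{A})}.
\]

Proof. To prove assertion (1), one can use the fact that each Λε(A^{(i)}) of the matrix A^{(i)} is nonempty, open, and bounded, with at most m connected components, each containing one or more eigenvalues of A^{(i)}. The same properties then also hold for the given tensor A. Moreover, there are at most nm connected components since
\[
\Lambda_\varepsilon(\mathcal{A}) = \bigcup_{i=1}^{n} \Lambda_\varepsilon(A^{(i)}).
\]
We now come to part (2). First, note that
\[
\begin{aligned}
\operatorname{bcirc}(\mathcal{A} + c\mathcal{I})
&= \begin{bmatrix}
A_1 + cI_m & A_n & A_{n-1} & \cdots & A_2 \\
A_2 & A_1 + cI_m & A_n & \cdots & A_3 \\
\vdots & \ddots & \ddots & \ddots & \vdots \\
A_n & A_{n-1} & \cdots & A_2 & A_1 + cI_m
\end{bmatrix} \\
&= \operatorname{bcirc}(\mathcal{A}) + c \operatorname{bcirc}(\mathcal{I}) \\
&= (F_m^H \otimes I_n) \cdot A \cdot (F_m \otimes I_n) + c\bigl[(F_m^H \otimes I_n) \cdot I_{mn} \cdot (F_m \otimes I_n)\bigr] \\
&= (F_m^H \otimes I_n) \cdot (A + cI_{mn}) \cdot (F_m \otimes I_n) \\
&= (F_m^H \otimes I_n) \cdot
\begin{bmatrix}
A^{(1)} + cI_m & & & \\
& A^{(2)} + cI_m & & \\
& & \ddots & \\
& & & A^{(n)} + cI_m
\end{bmatrix}
\cdot (F_m \otimes I_n).
\end{aligned}
\]
Therefore,
\[
\Lambda_\varepsilon(\mathcal{A} + c) = \bigcup_{i=1}^{n} \Lambda_\varepsilon(A^{(i)} + cI_m) = \bigcup_{i=1}^{n} \bigl[\Lambda_\varepsilon(A^{(i)}) + c\bigr] = c + \Lambda_\varepsilon(\mathcal{A})
\]
for any c ∈ C, which completes the proof of this part.

For part (3), by Lemma 2.1, we know that
\[
\operatorname{bcirc}(c\mathcal{A}) = c \operatorname{bcirc}(\mathcal{A}) = c\bigl[(F_m^H \otimes I_n) \cdot A \cdot (F_m \otimes I_n)\bigr] = (F_m^H \otimes I_n) \cdot (cA) \cdot (F_m \otimes I_n),
\]
which implies that
\[
\Lambda_{|c|\varepsilon}(c\mathcal{A}) = \bigcup_{i=1}^{n} \Lambda_{|c|\varepsilon}(cA^{(i)}) = \bigcup_{i=1}^{n} c\Lambda_\varepsilon(A^{(i)}) = c \bigcup_{i=1}^{n} \Lambda_\varepsilon(A^{(i)}),
\]
since for any nonzero c ∈ C and any matrix A ∈ C^{m×m} the equality
\[
\Lambda_{|c|\varepsilon}(cA) = c\Lambda_\varepsilon(A)
\]
holds. Thus we get Λ_{|c|ε}(cA) = cΛε(A) for any nonzero c ∈ C.

Now, we prove the last part of this theorem. By Lemma 2.1, we know that
\[
\operatorname{bcirc}(\mathcal{A}^H) = (F_m^H \otimes I_n) \cdot A^H \cdot (F_m \otimes I_n).
\]
Therefore,
\[
\Lambda_\varepsilon(\mathcal{A}^H) = \bigcup_{i=1}^{n} \Lambda_\varepsilon\bigl((A^{(i)})^H\bigr) = \bigcup_{i=1}^{n} \overline{\Lambda_\varepsilon(A^{(i)})} = \overline{\bigcup_{i=1}^{n} \Lambda_\varepsilon(A^{(i)})} = \overline{\Lambda_\varepsilon(\mathcal{A})},
\]
where the identity Λε(A^H) = \overline{Λε(A)}, valid under the two-norm for any matrix A ∈ C^{m×m}, is applied in the second equality.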
Remark 4.2. By the results above, we can see that the function of pseudospectra on the tensor A is linear.
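Parts (2) to (4) of Theorem 4.3 can be restated in terms of the function g(A, z) = min_i σmin(zI_m − A^{(i)}), since z ∈ Λε(A) exactly when g(A, z) ≤ ε in the spectral norm. The sketch below checks the three corresponding identities at one sample point. The function name `g`, the helper `t_herm`, the use of FFT slices for the blocks A^{(i)}, and the sample values of c and z are assumptions made for illustration.

```python
import numpy as np

def g(A, z):
    """min over i of sigma_min(z I_m - A^(i)), with A^(i) taken as FFT slices."""
    A_hat = np.fft.fft(A, axis=2)
    m = A.shape[0]
    return min(
        np.linalg.svd(z * np.eye(m) - A_hat[:, :, k], compute_uv=False)[-1]
        for k in range(A.shape[2])
    )

def t_herm(A):
    """Conjugate transpose of Definition 2.1."""
    AH = np.conj(np.transpose(A, (1, 0, 2)))
    return np.concatenate([AH[:, :, :1], AH[:, :, :0:-1]], axis=2)

rng = np.random.default_rng(6)
m, n = 3, 4
A = rng.standard_normal((m, m, n))            # hypothetical test tensor
I = np.zeros((m, m, n))
I[:, :, 0] = np.eye(m)                        # identity tensor (Definition 2.2)

c = 0.7 - 0.3j                                # sample scalar
z = 0.2 + 0.5j                                # sample point in the complex plane

shift = (g(A + c * I, z + c), g(A, z))             # part (2): translation
scale = (g(c * A, c * z), abs(c) * g(A, z))        # part (3): scaling by |c|
conj = (g(t_herm(A), np.conj(z)), g(A, z))         # part (4): conjugation
print(shift, scale, conj)
```

Each pair should agree up to rounding, which is the pointwise form of the set identities in the theorem.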
The properties of pseudospectra on normal tensors are given next.
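Theorem 4.4. (Pseudospectra of a normal tensor). Let ∆ε be the open ε-ball, that is, ∆ε = {z ∈ C : |z| < ε}. A sum of sets is defined as
\[
\Lambda(\mathcal{A}) + \Delta_\varepsilon = \{ z : z = z_1 + z_2,\ z_1 \in \Lambda(\mathcal{A}),\ z_2 \in \Delta_\varepsilon \},
\]
where Λ(A) is the T-spectrum (the set of T-eigenvalues) of the tensor A. Then for any tensor A ∈ C^{m×m×n}, we have
\[
\Lambda_\varepsilon(\mathcal{A}) \supseteq \Lambda(\mathcal{A}) + \Delta_\varepsilon \qquad \forall \varepsilon > 0. \tag{4.3}
\]
Moreover, if A is normal and ‖·‖ = ‖·‖2, then
\[
\Lambda_\varepsilon(\mathcal{A}) = \Lambda(\mathcal{A}) + \Delta_\varepsilon \qquad \forall \varepsilon > 0. \tag{4.4}
\]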
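Proof. If λ is a T-eigenvalue of the tensor A, then it is an eigenvalue of the matrix bcirc(A). Therefore λ + µ is an eigenvalue of bcirc(A) + µI for any µ ∈ C. Note that ‖µI‖ = |µ|; then by the definition of pseudospectra of tensors, we obtain λ + µ ∈ Λε(A) for any |µ| < ε. This proves (4.3).

For the normal tensor case, by Lemma 2.4 and Lemma 2.1, we have
\[
\operatorname{bcirc}(\mathcal{U}) \operatorname{bcirc}(\mathcal{A}) (\operatorname{bcirc}(\mathcal{U}))^H = \operatorname{bcirc}(\mathcal{D})
\]
and
\[
(F_m \otimes I_n) \operatorname{bcirc}(\mathcal{D}) (F_m^H \otimes I_n) =
\begin{bmatrix}
D^{(1)} & & & \\
& D^{(2)} & & \\
& & \ddots & \\
& & & D^{(n)}
\end{bmatrix}
:= D, \tag{4.5}
\]
in which D^{(i)} is diagonal for i = 1, ··· , n by Lemma 2.3. Since ‖·‖ = ‖·‖2 is unitarily invariant, we may assume directly that A is F-diagonal. Therefore, the diagonal entries of D are equal to the T-eigenvalues. It is well known that for any normal matrix the ε-pseudospectrum is just the union of the open ε-balls about the points of the spectrum; equivalently, we have
\[
\bigl\|(z - \operatorname{bcirc}(\mathcal{A}))^{-1}\bigr\|_2 = \frac{1}{\operatorname{dist}(z, \Lambda(\operatorname{bcirc}(\mathcal{A})))}, \tag{4.6}
\]
which implies dist(z, Λ(bcirc(A))) < ε by the definition of the ε-pseudospectrum of tensors. We get the conclusion since Λ(A) + ∆ε is the same as {z : dist(z, Λ(A)) < ε}.

Theorem 4.5. (Bauer-Fike Theorem). Suppose the tensor A ∈ C^{m×m×n} is F-diagonalizable, i.e., it has the decomposition (3.4). If the spectral norm is applied, then for each ε > 0 we have
\[
\Lambda(\mathcal{A}) + \Delta_\varepsilon \subseteq \Lambda_\varepsilon(\mathcal{A}) \subseteq \Lambda(\mathcal{A}) + \Delta_{\varepsilon\kappa_2(\mathcal{P})}.
\]

Proof. We only need to prove the second inclusion. By the definition of pseudospectra, Lemma 2.1 and the decompositions (3.4) and (4.5), it is not hard to find that
\[
\begin{aligned}
\bigl\|(zI_{mn} - A)^{-1}\bigr\| &= \bigl\|\operatorname{bcirc}(\mathcal{P}) (zI_{mn} - \operatorname{bcirc}(\mathcal{D}))^{-1} \operatorname{bcirc}(\mathcal{P})^{-1}\bigr\| \\
&\le \frac{\kappa(\mathcal{P})}{\operatorname{dist}(z, \Lambda(\operatorname{bcirc}(\mathcal{D})))} = \frac{\kappa(\mathcal{P})}{\operatorname{dist}(z, \Lambda(\mathcal{D}))} = \frac{\kappa(\mathcal{P})}{\operatorname{dist}(z, \Lambda(\mathcal{A}))}.
\end{aligned}
\]
As in the matrix case, we get our conclusion.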
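Both theorems can be phrased through the smallest singular value of zI − bcirc(·) in the spectral norm: for a normal tensor it equals dist(z, Λ), which is (4.6), and for an F-diagonalizable tensor it lies between dist(z, Λ)/κ2(P) and dist(z, Λ), which is Theorem 4.5. The sketch below checks this at a few sample points, using an F-diagonal tensor D as the normal example and bcirc(P) bcirc(D) bcirc(P)^{-1} as the F-diagonalizable one. Sizes, seed and sample points are arbitrary assumptions.

```python
import numpy as np

def bcirc(A):
    """Block circulant matrix (1.1) built from the frontal slices A[:, :, k]."""
    n1, n2, n3 = A.shape
    M = np.zeros((n1 * n3, n2 * n3), dtype=A.dtype)
    for i in range(n3):
        for j in range(n3):
            M[i * n1:(i + 1) * n1, j * n2:(j + 1) * n2] = A[:, :, (i - j) % n3]
    return M

def sigma_min(M, z):
    """Smallest singular value of z I - M, i.e. 1 / ||(z I - M)^{-1}||_2."""
    return np.linalg.svd(z * np.eye(M.shape[0]) - M, compute_uv=False)[-1]

rng = np.random.default_rng(7)
m, n = 3, 4
D = np.zeros((m, m, n))
for k in range(n):
    D[:, :, k] = np.diag(rng.standard_normal(m))  # F-diagonal, hence normal
P = rng.standard_normal((m, m, n))

bD, bP = bcirc(D), bcirc(P)
bA = bP @ bD @ np.linalg.inv(bP)                  # bcirc(P * D * P^{-1})
kappa = np.linalg.cond(bP, 2)                     # kappa_2(P)
lam = np.linalg.eigvals(bD)                       # T-eigenvalues shared by D and A

rows = []
for z in [0.3 + 0.2j, -1.0 + 0.5j, 2.0 - 1.0j]:
    dist = np.abs(z - lam).min()
    rows.append((dist, sigma_min(bD, z), sigma_min(bA, z)))
print(kappa, rows)
```

In each row the second entry should match the first, while the third is expected to fall between the first divided by κ2(P) and the first itself.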
4.3 An example of the ε-pseudospectrum

Example 4.1. Let A be a third-order tensor with three frontal slices A1, A2 and A3. First, we consider an example in which all frontal slices are the same and each is a tridiagonal Toeplitz matrix, that is, A1 = A2 = A3 = Tpz, where
\[
T_{pz} =
\begin{bmatrix}
0 & 1 & & & \\
\tfrac{1}{4} & 0 & 1 & & \\
& \ddots & \ddots & \ddots & \\
& & \tfrac{1}{4} & 0 & 1 \\
& & & \tfrac{1}{4} & 0
\end{bmatrix}
\in \mathbb{R}^{N \times N}.
\]
We denote this tensor by A0, and we can see that the size of bcirc(A0) is 3N × 3N. However, it is non-symmetric. Notice that Tpz can be symmetrized by the diagonal similarity transformation
\[
D T_{pz} D^{-1} = S,
\]
where D = diag(2, 4,..., 2^N) and