
arXiv:2108.09502v1 [math.NA] 21 Aug 2021

Perturbation analysis of third-order tensor eigenvalue problem based on tensor-tensor multiplication

Changxin Mo∗    Weiyang Ding†

June 8, 2021

Abstract

Perturbation analysis has been primarily considered to be one of the main issues in many fields and considerable progress, especially getting involved with matrices, has been made from then to now. In this paper, we pay our attention to the perturbation analysis subject on tensor eigenvalues under tensor-tensor multiplication sense; and also ε-pseudospectra theory for third-order tensors. The definition of the generalized T-eigenvalue of third-order tensors is given. Several classical results, such as the Bauer-Fike theorem and its general case, Gershgorin circle theorem and Kahan theorem, are extended from matrix to tensor case. The study of ε-pseudospectra of tensors is presented, together with various pseudospectra properties and numerical examples which show the boundaries of the ε-pseudospectra of certain tensors under different levels.

Keywords: perturbation theory, tensor-tensor multiplication, tensor eigenvalues, Bauer-Fike theorem, ε-pseudospectra theory, Gershgorin circle theorem, Kahan theorem, multidimensional ordinary differential equation, multi-linear time invariant system.

AMS subject classifications: 15A18, 15A69

∗ E-mail: [email protected]. School of Mathematical Sciences, Chongqing Normal University, Chongqing, 400047, P. R. of China.

† Corresponding author. E-mail: [email protected]. Institute of Science and Technology for Brain-Inspired Intelligence, Fudan University, Shanghai, China; Key Laboratory of Computational Neuroscience and Brain-Inspired Intelligence (Fudan University), Ministry of Education, China; MOE Frontiers Center for Brain Science, Fudan University, Shanghai, China; Zhangjiang Fudan International Innovation Center. W. Ding's research is supported by the National Natural Science Foundation of China under Grant 11801479, Shanghai Municipal Science and Technology Major Project (No. 2018SHZDZX01), ZJ Lab, and Shanghai Center for Brain Science and Brain-Inspired Technology, the 111 Project (No. B18015).

1 Introduction

Perturbation theory, which has been studied for more than ninety years, stems from the ideas of Rayleigh [16] and Schrödinger [18], who studied eigenvalue problems in vibrating systems and in quantum mechanics, respectively. Since then, extensive research has been conducted, and the pioneering works can be found in celebrated monographs such as Perturbation theory of eigenvalue problems by Rellich [17], Perturbation Theory for Linear Operators by Kato [7] and Perturbation Analysis of Matrix by Sun [20], to name a few.

Tensors, which can be regarded as a generalization of matrices to the higher-order case, have attracted considerable attention in different fields of science recently. As one of the most important and basic operations, tensor multiplication has received great attention from researchers. In 2008, a new type of tensor multiplication, which allows a third-order tensor to be written as a product of third-order tensors, was proposed by Kilmer et al. [10] when they considered the problem of generalizing the matrix SVD to the tensor case; it is termed tensor-tensor multiplication.

Three operators are closely involved in the definition of this multiplication. For a third-order tensor $\mathcal{A} \in \mathbb{C}^{n_1\times n_2\times n_3}$, we get its frontal slices, denoted by $\mathcal{A}_{::k}$ or $A_k$ for short, by fixing the last index. It therefore has $n_3$ frontal slices $A_1,\dots,A_{n_3}$, which are matrices of size $n_1\times n_2$. The first operation creates a block circulant matrix from the frontal slices of a tensor. That is,

\[
\operatorname{bcirc}(\mathcal{A}) =
\begin{bmatrix}
A_1 & A_{n_3} & A_{n_3-1} & \cdots & A_2\\
A_2 & A_1 & A_{n_3} & \cdots & A_3\\
\vdots & \vdots & \vdots & \ddots & \vdots\\
A_{n_3} & A_{n_3-1} & \cdots & A_2 & A_1
\end{bmatrix}
\tag{1.1}
\]

with size $(n_1n_3)\times(n_2n_3)$. The other two operations can be regarded as the "inverse" of each other. They are the unfold and fold commands, defined as follows:

\[
\operatorname{unfold}(\mathcal{A}) =
\begin{bmatrix} A_1\\ A_2\\ \vdots\\ A_{n_3}\end{bmatrix},
\qquad
\operatorname{fold}(\operatorname{unfold}(\mathcal{A})) = \mathcal{A}.
\]

On the basis of the above three operations, the tensor-tensor multiplication of any two tensors $\mathcal{B}\in\mathbb{C}^{n_1\times p\times n_3}$ and $\mathcal{C}\in\mathbb{C}^{p\times n_2\times n_3}$ is defined as

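\[
\mathcal{A} = \mathcal{B} * \mathcal{C} = \operatorname{fold}\bigl(\operatorname{bcirc}(\mathcal{B})\cdot\operatorname{unfold}(\mathcal{C})\bigr). \tag{1.2}
\]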
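As an illustration of (1.1) and (1.2), the following is a minimal NumPy sketch. The function names `bcirc`, `unfold`, `fold` and `t_product` mirror the operators above, but their implementation, the storage of a tensor as an array of shape `(n1, n2, n3)`, and the test sizes are assumptions of this sketch. The slice-wise product in the Fourier domain is used only as a cross-check.

```python
import numpy as np

def bcirc(A):
    """Block circulant matrix (1.1) of a tensor with shape (n1, n2, n3)."""
    n1, n2, n3 = A.shape
    M = np.zeros((n1 * n3, n2 * n3), dtype=A.dtype)
    for i in range(n3):          # block row
        for j in range(n3):      # block column
            M[i*n1:(i+1)*n1, j*n2:(j+1)*n2] = A[:, :, (i - j) % n3]
    return M

def unfold(A):
    """Stack the frontal slices A_1, ..., A_{n3} vertically."""
    return np.concatenate([A[:, :, k] for k in range(A.shape[2])], axis=0)

def fold(M, n3):
    """Inverse of unfold: split M into n3 row blocks and use them as slices."""
    n1 = M.shape[0] // n3
    return np.stack([M[k*n1:(k+1)*n1, :] for k in range(n3)], axis=2)

def t_product(B, C):
    """Tensor-tensor multiplication (1.2)."""
    return fold(bcirc(B) @ unfold(C), B.shape[2])

rng = np.random.default_rng(0)
B = rng.standard_normal((2, 3, 4))   # n1 x p x n3
C = rng.standard_normal((3, 5, 4))   # p x n2 x n3
A = t_product(B, C)                  # n1 x n2 x n3

# Cross-check: the same product computed slice by slice in the Fourier domain.
B_hat, C_hat = np.fft.fft(B, axis=2), np.fft.fft(C, axis=2)
A_fft = np.fft.ifft(np.einsum("ipk,pjk->ijk", B_hat, C_hat), axis=2).real
```

The explicit block circulant matrix is convenient for checking identities on small examples; for large third dimensions the Fourier-domain route avoids forming it.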

One can easily check that $\mathcal{A}\in\mathbb{C}^{n_1\times n_2\times n_3}$. From the sizes of the three tensors involved in (1.2), we see that special attention should be paid to the first two indices of $\mathcal{B}$ and $\mathcal{C}$, since their frontal slices need to be consistent with matrix multiplication.

The tensor-tensor multiplication (1.2) has demonstrated its usefulness in many areas, including, but not limited to, image processing (such as image deblurring and compression, object and facial recognition), tensor principal component analysis, tensor completion, and pattern recognition; see [8, 11, 15, 22, 23] and the references therein.

On the basis of this tensor-tensor multiplication, many researchers have considered functions of multidimensional arrays. These can be regarded as generalizations of functions of matrices, and many nice properties have been given; more details are discussed in the articles [12, 13, 14]. In particular, one basic but important concept, the T-eigenvalue, was proposed by Miao et al. [14], and the stability of T-eigenvalues was then introduced in [6] for studying the tensor Lyapunov equation that appears in spatially invariant systems. Moreover, some results, such as Weyl's and Cauchy's interlacing theorems, have been extended from the matrix case to the tensor case [6].

Motivated by the research mentioned above, we turn our attention in this paper to the perturbation analysis of third-order tensors under the tensor-tensor multiplication (1.2). Many classical results of the matrix case will be generalized to the tensor case. Moreover, the pseudospectra theory for third-order tensors is also considered.

The remainder of this paper is organized as follows. In Section 2 we describe the notation used in the sequel and revisit several basic concepts as well as fundamental results. Section 3 is one of the main parts and focuses on perturbation analysis results for third-order tensors; several classical theorems on matrices are extended to the tensor case. The ε-pseudospectra theory of tensors, which is the other main part, is studied in Section 4. Section 5 presents the multidimensional ordinary differential equation and investigates its close relationship with T-eigenvalues. Some properties of the multidimensional ordinary differential equation are also given. The paper culminates with conclusions and remarks.

2 Preliminaries

In this section, we introduce the notation used throughout the paper and review the basic concepts of tensors, such as the identity tensor, the inverse of a tensor, F-diagonal tensors, and orthogonal tensors.

Generally, scalars are denoted by lowercase letters, e.g., $a$. Vectors and matrices are denoted by boldface lowercase letters and capital letters, respectively, e.g., $\mathbf{v}$ and $A$. Euler script letters are used to denote higher-order tensors, e.g., $\mathcal{A}$. Frontal slices of a tensor $\mathcal{T}\in\mathbb{C}^{n_1\times n_2\times n_3}$ are denoted by $T_1,\dots,T_{n_3}$. Some basic concepts of tensors are revisited next.

Definition 2.1. ([9, Definition 3.14]). Let $\mathcal{A}\in\mathbb{C}^{n_1\times n_2\times n_3}$. Then the transposed tensor $\mathcal{A}^{\top}\in\mathbb{C}^{n_2\times n_1\times n_3}$ (conjugate transposed tensor $\mathcal{A}^{H}\in\mathbb{C}^{n_2\times n_1\times n_3}$) is obtained by taking the transpose (conjugate transpose) of each frontal slice and then reversing the order of the transposed frontal slices 2 through $n_3$.

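Unlike the transposed tensor, which is well defined for any $n_1\times n_2\times n_3$ tensor, the identity tensor, the orthogonal tensor and the inverse of a tensor are defined only for tensors with square frontal slices.

Definition 2.2. ([9, Definition 3.4], identity tensor). Let $\mathcal{I}_{mm\ell}\in\mathbb{C}^{m\times m\times\ell}$. If its first frontal slice $I_1$ is the identity matrix of size $m\times m$ and its other frontal slices $I_2,\dots,I_\ell$ are all zeros, then we call $\mathcal{I}_{mm\ell}$ an identity tensor.

Definition 2.3. ([9, Definition 3.5], inverse of a tensor). Let $\mathcal{A}\in\mathbb{C}^{m\times m\times\ell}$. We call a tensor $\mathcal{B}$ an inverse of $\mathcal{A}$ if it satisfies the following two equalities:

\[
\mathcal{A} * \mathcal{B} = \mathcal{I}_{mm\ell} \quad\text{and}\quad \mathcal{B} * \mathcal{A} = \mathcal{I}_{mm\ell}.
\]

Definition 2.4. ([9, Definition 3.18], orthogonal and unitary tensor). Let $\mathcal{Q}\in\mathbb{R}^{m\times m\times\ell}$. We call $\mathcal{Q}$ an orthogonal tensor provided that $\mathcal{Q}^{\top} * \mathcal{Q} = \mathcal{Q} * \mathcal{Q}^{\top} = \mathcal{I}_{mm\ell}$. If $\mathcal{Q}\in\mathbb{C}^{m\times m\times\ell}$ and $\mathcal{Q}^{H} * \mathcal{Q} = \mathcal{Q} * \mathcal{Q}^{H} = \mathcal{I}_{mm\ell}$, then we call it a unitary tensor.

Based on the above definitions, a tensor $\mathcal{A}\in\mathbb{C}^{m\times m\times n}$ is said to be symmetric if $\mathcal{A} = \mathcal{A}^{\top}$, or Hermitian if $\mathcal{A} = \mathcal{A}^{H}$ [6]. We call a third-order tensor $\mathcal{D}\in\mathbb{C}^{m\times m\times\ell}$ an F-diagonal tensor if all its frontal slices $D_1,\dots,D_\ell$ are diagonal matrices.

Definition 2.5. ([12, 14], F-diagonalizable tensor). Assume that $\mathcal{A}\in\mathbb{C}^{m\times m\times\ell}$ satisfies $\mathcal{A} = \mathcal{P} * \mathcal{D} * \mathcal{P}^{-1}$. Then we call $\mathcal{A}$ an F-diagonalizable tensor if $\mathcal{D}$ is an F-diagonal tensor.

Some useful lemmas are recalled as follows.

Lemma 2.1. ([6, 12, 13]). The following results hold for third-order tensors $\mathcal{A}\in\mathbb{C}^{m\times n\times p}$:
(a) The operator bcirc defined in (1.1) is a linear operator, i.e., $\operatorname{bcirc}(\alpha\mathcal{A}+\beta\mathcal{B}) = \alpha\operatorname{bcirc}(\mathcal{A}) + \beta\operatorname{bcirc}(\mathcal{B})$, where $\mathcal{B}$ has the same size as $\mathcal{A}$ and $\alpha,\beta$ are constants.
(b) $\operatorname{bcirc}(\mathcal{A} * \mathcal{B}) = \operatorname{bcirc}(\mathcal{A})\operatorname{bcirc}(\mathcal{B})$, where $\mathcal{B}\in\mathbb{C}^{n\times s\times p}$.
(c) $\operatorname{bcirc}(\mathcal{A}^{\top}) = (\operatorname{bcirc}(\mathcal{A}))^{\top}$ and $\operatorname{bcirc}(\mathcal{A}^{H}) = (\operatorname{bcirc}(\mathcal{A}))^{H}$.
(d) If $\mathcal{A}$ is invertible, then its inverse tensor is unique and $\operatorname{bcirc}(\mathcal{A}^{-1}) = (\operatorname{bcirc}(\mathcal{A}))^{-1}$.

Lemma 2.2. ([6, Theorems 2.7 and 2.8]). Let $\mathcal{A}\in\mathbb{C}^{m\times m\times n}$. Some fundamental results involving Hermitian or symmetric tensors are:
(a) The tensor $\mathcal{A}$ is symmetric if and only if $\operatorname{bcirc}(\mathcal{A}) = (\operatorname{bcirc}(\mathcal{A}))^{\top}$.
(b) The tensor $\mathcal{A}$ is Hermitian if and only if $\operatorname{bcirc}(\mathcal{A}) = (\operatorname{bcirc}(\mathcal{A}))^{H}$.
(c) All T-eigenvalues (cf. (3.3)) of a Hermitian tensor $\mathcal{A}$ are real.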
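The identities of Lemma 2.1 are easy to check numerically on small random tensors. The sketch below is only an illustration under stated assumptions: the helper names (`bcirc`, `t_product`, `t_conj_transpose`, `identity_tensor`), the array layout `(m, m, l)`, and the way the inverse tensor is read off from the first block column of $(\operatorname{bcirc}(\mathcal{A}))^{-1}$ are choices of this sketch.

```python
import numpy as np

def bcirc(A):
    """Block circulant matrix (1.1) of a tensor with shape (n1, n2, n3)."""
    n1, n2, n3 = A.shape
    M = np.zeros((n1 * n3, n2 * n3), dtype=A.dtype)
    for i in range(n3):
        for j in range(n3):
            M[i*n1:(i+1)*n1, j*n2:(j+1)*n2] = A[:, :, (i - j) % n3]
    return M

def t_product(B, C):
    """Tensor-tensor product (1.2): fold(bcirc(B) @ unfold(C))."""
    n1, n3 = B.shape[0], B.shape[2]
    M = bcirc(B) @ np.concatenate([C[:, :, k] for k in range(n3)], axis=0)
    return np.stack([M[k*n1:(k+1)*n1, :] for k in range(n3)], axis=2)

def t_conj_transpose(A):
    """Conjugate-transpose each slice, then reverse slices 2..n3 (Definition 2.1)."""
    AH = np.conj(np.transpose(A, (1, 0, 2)))
    return np.concatenate([AH[:, :, :1], AH[:, :, :0:-1]], axis=2)

def identity_tensor(m, l):
    """Identity tensor of Definition 2.2."""
    I = np.zeros((m, m, l))
    I[:, :, 0] = np.eye(m)
    return I

rng = np.random.default_rng(1)
m, l = 3, 4
A = rng.standard_normal((m, m, l)) + 1j * rng.standard_normal((m, m, l))
B = rng.standard_normal((m, m, l)) + 1j * rng.standard_normal((m, m, l))

# Lemma 2.1(b): bcirc is multiplicative with respect to the t-product.
check_b = np.allclose(bcirc(t_product(A, B)), bcirc(A) @ bcirc(B))
# Lemma 2.1(c): bcirc commutes with the conjugate transpose.
check_c = np.allclose(bcirc(t_conj_transpose(A)), bcirc(A).conj().T)
# Lemma 2.1(d): the first block column of bcirc(A)^{-1} holds the slices of A^{-1}.
inv_mat = np.linalg.inv(bcirc(A))
A_inv = np.stack([inv_mat[k*m:(k+1)*m, :m] for k in range(l)], axis=2)
check_d = np.allclose(t_product(A, A_inv), identity_tensor(m, l))
```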

Lemma 2.3. ([14, Lemma 4]). Suppose $A_1,\dots,A_p,B_1,\dots,B_p\in\mathbb{C}^{n\times n}$ are matrices satisfying

\[
\begin{bmatrix}
A_1 & A_p & A_{p-1} & \cdots & A_2\\
A_2 & A_1 & A_p & \cdots & A_3\\
\vdots & \vdots & \vdots & \ddots & \vdots\\
A_p & A_{p-1} & A_{p-2} & \cdots & A_1
\end{bmatrix}
= (F_p\otimes I_n)
\begin{bmatrix}
B_1 & & & \\
 & B_2 & & \\
 & & \ddots & \\
 & & & B_p
\end{bmatrix}
\left(F_p^H\otimes I_n\right),
\]

where $F_p$ is the normalized discrete Fourier matrix of size $p\times p$. Then $B_1,\dots,B_p$ are diagonal (sub-diagonal, upper-triangular, lower-triangular) matrices if and only if $A_1,\dots,A_p$ are diagonal (sub-diagonal, upper-triangular, lower-triangular) matrices.

As in the matrix case, two tensors $\mathcal{A},\mathcal{B}\in\mathbb{C}^{m\times m\times n}$ are said to commute if $\mathcal{A} * \mathcal{B} = \mathcal{B} * \mathcal{A}$. A tensor $\mathcal{A}\in\mathbb{C}^{m\times m\times n}$ is normal if $\mathcal{A} * \mathcal{A}^{H} = \mathcal{A}^{H} * \mathcal{A}$, that is, if $\mathcal{A}$ commutes with its conjugate transpose in the sense of tensor-tensor multiplication. Obviously, symmetric and Hermitian tensors are normal. A normal tensor can always be F-diagonalized by a unitary tensor, as stated next.

Lemma 2.4. ([14]). Let $\mathcal{A}\in\mathbb{C}^{m\times m\times n}$ be a normal tensor. Then there exists a unitary tensor $\mathcal{U}\in\mathbb{C}^{m\times m\times n}$ such that $\mathcal{U} * \mathcal{A} * \mathcal{U}^{H} = \mathcal{D}$, where $\mathcal{D}$ is an F-diagonal tensor.
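The Fourier block diagonalization behind Lemma 2.3 can be checked numerically. In the sketch below, which is an illustration and not part of the original development, the normalized DFT matrix is built with `np.fft.fft` and taken of size $p$ (the number of frontal slices), with the identity factor of size $m$ (the slice dimension); with this sign convention the block diagonal form is obtained as $(F\otimes I)\operatorname{bcirc}(\mathcal{A})(F^H\otimes I)$. The helper names and test sizes are assumptions.

```python
import numpy as np

def bcirc(A):
    """Block circulant matrix (1.1) of a tensor with shape (n1, n2, n3)."""
    n1, n2, n3 = A.shape
    M = np.zeros((n1 * n3, n2 * n3), dtype=A.dtype)
    for i in range(n3):
        for j in range(n3):
            M[i*n1:(i+1)*n1, j*n2:(j+1)*n2] = A[:, :, (i - j) % n3]
    return M

m, p = 3, 4
rng = np.random.default_rng(2)
A = rng.standard_normal((m, m, p))

F = np.fft.fft(np.eye(p)) / np.sqrt(p)     # normalized DFT matrix (unitary)
T = np.kron(F, np.eye(m))
D = T @ bcirc(A) @ T.conj().T              # expected to be block diagonal

# The diagonal blocks should coincide with the FFT of A along the third index.
A_hat = np.fft.fft(A, axis=2)
blocks = [D[k*m:(k+1)*m, k*m:(k+1)*m] for k in range(p)]

# Norm of all entries lying outside the diagonal blocks.
mask = np.kron(np.eye(p), np.ones((m, m))).astype(bool)
off_block_norm = np.linalg.norm(D[~mask])

# F-diagonal input: diagonal frontal slices give diagonal blocks (Lemma 2.3).
Dg = np.zeros((m, m, p))
for k in range(p):
    Dg[:, :, k] = np.diag(rng.standard_normal(m))
Dg_blocks = T @ bcirc(Dg) @ T.conj().T
off_diag_norm = np.linalg.norm(Dg_blocks - np.diag(np.diag(Dg_blocks)))
```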

3 Perturbation analysis on third-order tensors

In this section, we first give a definition of the generalized tensor eigenvalue of third-order tensors. Then some classical results that are well known in matrix theory, such as the Gershgorin circle theorem [4], the Bauer-Fike theorem and its general case [1, 3, 19], and the Kahan theorem [20], are extended to the tensor case.

3.1 Tensor eigenvalues under tensor-tensor multiplication

First, we give the definition of the generalized tensor eigenvalue for third-order tensors under tensor-tensor multiplication.

Definition 3.1. (Generalized T-eigenvalue of tensors). Let $\mathcal{A},\mathcal{B}\in\mathbb{C}^{m\times m\times\ell}$. If there exist a $\lambda\in\mathbb{C}$ and a nonzero tensor $\mathcal{X}\in\mathbb{C}^{m\times 1\times\ell}$ such that

\[
\mathcal{A} * \mathcal{X} = \lambda(\mathcal{B} * \mathcal{X}), \tag{3.1}
\]

then $\lambda$ is called a generalized T-eigenvalue of $\mathcal{A}$ relative to $\mathcal{B}$, and $\mathcal{X}$ is a T-eigenvector associated with $\lambda$.

Remark 3.1. By the definition of the tensor-tensor multiplication given in (1.2), the equality (3.1) is equivalent to

\[
\operatorname{bcirc}(\mathcal{A})\operatorname{unfold}(\mathcal{X}) = \lambda\cdot\bigl(\operatorname{bcirc}(\mathcal{B})\operatorname{unfold}(\mathcal{X})\bigr). \tag{3.2}
\]

Note that $\operatorname{unfold}(\mathcal{X})$ is a vector of size $m\ell$, so this generalized eigenvalue problem based on the tensor-tensor product is closely related to the classical generalized matrix eigenvalue problem. Hence, there are $m\ell$ eigenvalues for the problem (3.1) if and only if $\operatorname{rank}(\mathcal{B}) := \operatorname{rank}(\operatorname{bcirc}(\mathcal{B})) = m\ell$. If $\mathcal{B}$ is rank deficient, then the set of all generalized T-eigenvalues of $\mathcal{A}$ relative to $\mathcal{B}$ may be finite, empty or infinite.

Remark 3.2. If we choose $\mathcal{B}$ as the identity tensor in (3.1), then we get

\[
\mathcal{A} * \mathcal{X} = \lambda\mathcal{X}. \tag{3.3}
\]

This is the case given in [6, Definition 2.5], which defines the T-eigenvalue of a third-order tensor. An equivalent definition based on the tensor decomposition is also given in [14]. Therefore Definition 3.1 generalizes the definitions given in [6, Definition 2.5] and [14]. Similarly to the equivalent form (3.2) of (3.1), we obtain that (3.3) can be transformed into

\[
\operatorname{bcirc}(\mathcal{A})\operatorname{unfold}(\mathcal{X}) = \lambda\cdot\operatorname{unfold}(\mathcal{X}).
\]

That is to say, all T-eigenvalues of the tensor $\mathcal{A}$ are eigenvalues of the block circulant matrix $\operatorname{bcirc}(\mathcal{A})$, and vice versa.
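Remark 3.2 suggests two ways to compute T-eigenvalues numerically: from $\operatorname{bcirc}(\mathcal{A})$ directly, or slice by slice after a Fourier transform along the third index. The following sketch compares the two on a random tensor. The helper names `t_eigenvalues` and `set_distance`, the use of `np.fft.fft` for the Fourier-domain slices, and the test sizes are assumptions of this illustration.

```python
import numpy as np

def bcirc(A):
    """Block circulant matrix (1.1) of a tensor with shape (n1, n2, n3)."""
    n1, n2, n3 = A.shape
    M = np.zeros((n1 * n3, n2 * n3), dtype=A.dtype)
    for i in range(n3):
        for j in range(n3):
            M[i*n1:(i+1)*n1, j*n2:(j+1)*n2] = A[:, :, (i - j) % n3]
    return M

def t_eigenvalues(A):
    """T-eigenvalues of A, collected from the Fourier-domain frontal slices."""
    A_hat = np.fft.fft(A, axis=2)
    return np.concatenate(
        [np.linalg.eigvals(A_hat[:, :, k]) for k in range(A.shape[2])]
    )

def set_distance(x, y):
    """Largest distance from a point of x to its nearest point in y."""
    return np.max(np.min(np.abs(x[:, None] - y[None, :]), axis=1))

rng = np.random.default_rng(3)
m, l = 3, 4
A = rng.standard_normal((m, m, l))

lam_slices = t_eigenvalues(A)
lam_bcirc, V = np.linalg.eig(bcirc(A))

# The eigenvector x plays the role of unfold(X) in the matrix form of (3.3).
x = V[:, 0]
residual = np.linalg.norm(bcirc(A) @ x - lam_bcirc[0] * x)

# Reshaping x gives the corresponding T-eigenvector X of size m x 1 x l.
X = x.reshape(l, m).T[:, None, :]
```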

3.2 Gershgorin circle theorem for tensors

First, we consider the simple but fundamental Gershgorin circle theorem in this subsection. Let $\mathcal{A}\in\mathbb{C}^{m\times m\times\ell}$. With the help of the normalized discrete Fourier transform matrix, $\operatorname{bcirc}(\mathcal{A})$ can be block-diagonalized. Furthermore, by a sequence of similarity transformations, it can be made "more diagonal". In this case, we can use the diagonal entries to approximate the T-eigenvalues of the original tensor $\mathcal{A}$.

Theorem 3.1. (Gershgorin circle theorem for tensors). Let $\mathcal{A}\in\mathbb{C}^{m\times m\times\ell}$ and assume

\[
X^{-1}(F_m\otimes I_n)\operatorname{bcirc}(\mathcal{A})(F_m^H\otimes I_n)X = D + F,
\]

where $X$ is a nonsingular matrix, $D = \operatorname{diag}(d_1,\dots,d_{m\ell})$ and $F$ has zero diagonal entries. Then we have

\[
\Lambda(\mathcal{A}) \subseteq \bigcup_{i=1}^{m\ell}\Theta_i,
\]

where $\Lambda(\mathcal{A})$ denotes the set of T-eigenvalues of $\mathcal{A}$ and

\[
\Theta_i = \Bigl\{ z\in\mathbb{C} : |z-d_i| \le \sum_{j=1}^{m\ell}|f_{ij}| \Bigr\}, \qquad i = 1,\dots,m\ell.
\]

Proof. Let $\lambda\in\Lambda(\mathcal{A})$ and, without loss of generality, suppose that $\lambda\ne d_i$ for $i=1,\dots,m\ell$. Notice that T-eigenvalues are not affected by similarity transformations, so the matrix $I-(\lambda I-D)^{-1}F$ is singular. Therefore,

\[
1 \le \bigl\|(D-\lambda I)^{-1}F\bigr\|_{\infty} \le \frac{1}{|d_k-\lambda|}\sum_{j=1}^{m\ell}|f_{kj}|
\]

for some $k$. The above inequalities imply that $|\lambda-d_k|\le\sum_{j=1}^{m\ell}|f_{kj}|$, which further implies $\lambda\in\Theta_k$. Since $\lambda$ is arbitrary, the proof is complete.

We note that a different version of the Gershgorin circle theorem for tensors has been considered at the same time in [2]. In our theorem, the circles are based on the diagonal entries of the transformed form. In Theorem 5.2 of [2], however, the circles are based on the entries of the original tensor.
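A small numerical illustration of Theorem 3.1 follows. It is a sketch under stated assumptions: the transformation $X$ is taken to be the identity, the Fourier factor is built as $F_\ell\otimes I_m$ with $\ell$ the number of frontal slices, and the diagonally heavy first slice is an arbitrary choice made only so that the discs are reasonably small.

```python
import numpy as np

def bcirc(A):
    """Block circulant matrix (1.1) of a tensor with shape (n1, n2, n3)."""
    n1, n2, n3 = A.shape
    M = np.zeros((n1 * n3, n2 * n3), dtype=A.dtype)
    for i in range(n3):
        for j in range(n3):
            M[i*n1:(i+1)*n1, j*n2:(j+1)*n2] = A[:, :, (i - j) % n3]
    return M

rng = np.random.default_rng(4)
m, l = 3, 4
A = rng.standard_normal((m, m, l))
A[:, :, 0] += 5.0 * np.diag(np.arange(1, m + 1))   # illustrative choice

Fl = np.fft.fft(np.eye(l)) / np.sqrt(l)            # normalized DFT matrix
T = np.kron(Fl, np.eye(m))
M = T @ bcirc(A) @ T.conj().T                      # D + F with X = identity

centers = np.diag(M)                                   # d_i
radii = np.sum(np.abs(M), axis=1) - np.abs(centers)    # sum_j |f_ij|

eigs = np.linalg.eigvals(bcirc(A))                 # T-eigenvalues of A
in_some_disc = [
    bool(np.any(np.abs(z - centers) <= radii + 1e-8)) for z in eigs
]
```

Choosing a nontrivial $X$, for example a diagonal scaling, changes the radii but not the T-eigenvalues, which is the freedom the theorem leaves open.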

3.3 Bauer-Fike theorem for tensors

It is well known that the Bauer-Fike theorem is a classical result for complex-valued diagonalizable matrices. It concerns the perturbation theory of eigenvalues. More specifically, it states that an absolute upper bound for the deviation of one perturbed matrix eigenvalue from a properly chosen eigenvalue of the exact matrix can be estimated by the product of the condition number of the eigenvector matrix and the norm of the perturbation [1].

For non-diagonalizable matrices, the Bauer-Fike theorem has been generalized in [3, 4]. Moreover, this result on only part of the spectrum of a matrix was also considered in [3]. We generalize this celebrated result to the third-order tensor case as follows. Note that the Bauer-Fike theorem considered at the same time in Theorem 5.3 of [2] is a special case of the next result, since only the 2-norm case is considered there.

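Theorem 3.2 (Bauer-Fike Theorem for Tensors). Let $\mathcal{A}\in\mathbb{C}^{m\times m\times n}$ be an F-diagonalizable tensor. That is,

\[
\mathcal{P}^{-1} * \mathcal{A} * \mathcal{P} = \mathcal{D}, \tag{3.4}
\]

where $\mathcal{D}$ is an F-diagonal tensor. Suppose that $\mu$ is a T-eigenvalue of $\mathcal{A}+\delta\mathcal{A}$, in which $\delta$ is a small number. Then, for the spectral norm or the Frobenius norm, there exists a T-eigenvalue $\lambda$ of $\mathcal{A}$ such that

\[
|\lambda-\mu| \le \kappa_p(\mathcal{P})\|\delta\mathcal{A}\|_p, \qquad p = 2, F.
\]

Moreover, for the 1- and ∞-norms, we have

\[
|\lambda-\mu| \le \kappa_p(\mathcal{P})\,\kappa_p(F_m\otimes I_n)\,\|\delta\mathcal{A}\|_p, \qquad p = 1,\infty.
\]

In the above expressions, $F_m$ is the normalized discrete Fourier transform matrix, $\kappa_p(\mathcal{P}) = \|\operatorname{bcirc}(\mathcal{P})\|_p\,\|\operatorname{bcirc}(\mathcal{P}^{-1})\|_p$ is the condition number of $\mathcal{P}$, and $\|\mathcal{A}\|_p = \|\operatorname{bcirc}(\mathcal{A})\|_p$.

Proof. By Lemma 2.1, we know that (3.4) is equivalent to

\[
(\operatorname{bcirc}(\mathcal{P}))^{-1}\operatorname{bcirc}(\mathcal{A})\operatorname{bcirc}(\mathcal{P}) = \operatorname{bcirc}(\mathcal{D}).
\]

The right-hand side of the above equality is a block circulant matrix, so it can be block-diagonalized by the discrete Fourier transform matrix. That is to say,

\[
(F_m\otimes I_n)\operatorname{bcirc}(\mathcal{D})(F_m^H\otimes I_n) = D =
\begin{bmatrix}
D^{(1)} & & & \\
 & D^{(2)} & & \\
 & & \ddots & \\
 & & & D^{(n)}
\end{bmatrix}.
\]

By Lemma 2.3, $D^{(1)},D^{(2)},\dots,D^{(n)}$ are diagonal matrices since $\mathcal{D}$ is an F-diagonal tensor. It is not hard to see that all diagonal entries of the matrices $D^{(1)},D^{(2)},\dots,D^{(n)}$ are the T-eigenvalues of the tensor $\mathcal{A}$. Also note that $\operatorname{bcirc}(\mathcal{A}+\delta\mathcal{A}) = \operatorname{bcirc}(\mathcal{A}) + \operatorname{bcirc}(\delta\mathcal{A})$. From these two observations we obtain

\[
\begin{aligned}
&(F_m\otimes I_n)\bigl[(\operatorname{bcirc}(\mathcal{P}))^{-1}\operatorname{bcirc}(\mathcal{A}+\delta\mathcal{A})\operatorname{bcirc}(\mathcal{P})\bigr](F_m^H\otimes I_n)\\
&\quad= (F_m\otimes I_n)\bigl[(\operatorname{bcirc}(\mathcal{P}))^{-1}\operatorname{bcirc}(\mathcal{A})\operatorname{bcirc}(\mathcal{P})\bigr](F_m^H\otimes I_n)\\
&\qquad+ (F_m\otimes I_n)\bigl[(\operatorname{bcirc}(\mathcal{P}))^{-1}\operatorname{bcirc}(\delta\mathcal{A})\operatorname{bcirc}(\mathcal{P})\bigr](F_m^H\otimes I_n)\\
&\quad= D + (F_m\otimes I_n)\bigl[(\operatorname{bcirc}(\mathcal{P}))^{-1}\operatorname{bcirc}(\delta\mathcal{A})\operatorname{bcirc}(\mathcal{P})\bigr](F_m^H\otimes I_n).
\end{aligned}
\tag{3.5}
\]

Without loss of generality, we assume that $\mu\notin\Lambda(\mathcal{A})$; otherwise the result is trivially true. Let $\mu$ be a T-eigenvalue of $\mathcal{A}+\delta\mathcal{A}$. Then $\mu$ is an eigenvalue of $\operatorname{bcirc}(\mathcal{A}+\delta\mathcal{A})$, and therefore $\det(\operatorname{bcirc}(\mathcal{A}+\delta\mathcal{A})-\mu I_{mn}) = 0$. By (3.5), one finds that

\[
\begin{aligned}
0 &= \det\bigl(\operatorname{bcirc}(\mathcal{A})+\operatorname{bcirc}(\delta\mathcal{A})-\mu I_{mn}\bigr)\\
&= \det(F_m\otimes I_n)\cdot\det\bigl((\operatorname{bcirc}(\mathcal{P}))^{-1}\bigr)\cdot\det\bigl(\operatorname{bcirc}(\mathcal{A})+\operatorname{bcirc}(\delta\mathcal{A})-\mu I_{mn}\bigr)\cdot\det(\operatorname{bcirc}(\mathcal{P}))\cdot\det(F_m^H\otimes I_n)\\
&= \det\bigl(D + (F_m\otimes I_n)\bigl[(\operatorname{bcirc}(\mathcal{P}))^{-1}\operatorname{bcirc}(\delta\mathcal{A})\operatorname{bcirc}(\mathcal{P})\bigr](F_m^H\otimes I_n) - \mu I_{mn}\bigr)\\
&= \det(D-\mu I_{mn})\cdot\det\bigl((D-\mu I_{mn})^{-1}(F_m\otimes I_n)\bigl[(\operatorname{bcirc}(\mathcal{P}))^{-1}\operatorname{bcirc}(\delta\mathcal{A})\operatorname{bcirc}(\mathcal{P})\bigr](F_m^H\otimes I_n) + I_{mn}\bigr).
\end{aligned}
\tag{3.6}
\]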

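The assumption that $\mu\notin\Lambda(\mathcal{A})$ implies that

\[
\det\bigl((D-\mu I)^{-1}(F_m\otimes I_n)\bigl[(\operatorname{bcirc}(\mathcal{P}))^{-1}\operatorname{bcirc}(\delta\mathcal{A})\operatorname{bcirc}(\mathcal{P})\bigr](F_m^H\otimes I_n) + I_{mn}\bigr) = 0,
\]

which shows that $-1$ is an eigenvalue of the matrix

\[
(D-\mu I)^{-1}(F_m\otimes I_n)\bigl[(\operatorname{bcirc}(\mathcal{P}))^{-1}\operatorname{bcirc}(\delta\mathcal{A})\operatorname{bcirc}(\mathcal{P})\bigr](F_m^H\otimes I_n).
\]

Since all $p$-norms ($p = 1,2,F,\infty$) are consistent matrix norms, we have

\[
\begin{aligned}
|-1| &\le \bigl\|(D-\mu I)^{-1}(F_m\otimes I_n)\bigl[(\operatorname{bcirc}(\mathcal{P}))^{-1}\operatorname{bcirc}(\delta\mathcal{A})\operatorname{bcirc}(\mathcal{P})\bigr](F_m^H\otimes I_n)\bigr\|_p\\
&\le \bigl\|(D-\mu I)^{-1}\bigr\|_p\,\|F_m\otimes I_n\|_p\,\|(\operatorname{bcirc}(\mathcal{P}))^{-1}\|_p\,\|\operatorname{bcirc}(\delta\mathcal{A})\|_p\,\|\operatorname{bcirc}(\mathcal{P})\|_p\,\|F_m^H\otimes I_n\|_p\\
&= \bigl\|(D-\mu I)^{-1}\bigr\|_p\cdot\kappa_p(\operatorname{bcirc}(\mathcal{P}))\cdot\kappa_p(F_m\otimes I_n)\cdot\|\operatorname{bcirc}(\delta\mathcal{A})\|_p\\
&= \bigl\|(D-\mu I)^{-1}\bigr\|_p\,\kappa_p(\mathcal{P})\,\kappa_p(F_m\otimes I_n)\,\|\operatorname{bcirc}(\delta\mathcal{A})\|_p.
\end{aligned}
\]

Notice that $(D-\mu I)^{-1}$ is a diagonal matrix, so for $p = 1,2,\infty$ we have

\[
\bigl\|(D-\mu I)^{-1}\bigr\|_p = \max_{\|x\|_p\ne 0}\frac{\|(D-\mu I)^{-1}x\|_p}{\|x\|_p} = \max_{\lambda\in\Lambda(\mathcal{A})}\frac{1}{|\lambda-\mu|} = \frac{1}{\min_{\lambda\in\Lambda(\mathcal{A})}|\lambda-\mu|}. \tag{3.7}
\]

Combining (3.7) with the preceding chain of inequalities gives

\[
\min_{\lambda\in\Lambda(\mathcal{A})}|\lambda-\mu| \le \kappa_p(\mathcal{P})\,\kappa_p(F_m\otimes I_n)\,\|\delta\mathcal{A}\|_p. \tag{3.8}
\]

Finally, for the 2-norm case, we obtain

\[
\kappa_2(F_m\otimes I_n) = \|F_m\otimes I_n\|_2\,\|F_m^H\otimes I_n\|_2 = 1,
\]

since $F_m\otimes I_n$ and $F_m^H\otimes I_n$ are unitary. The result for the Frobenius norm is immediate since the spectral norm of a matrix is not larger than its Frobenius norm. The proof is completed.

The following conclusion describes the relationship between the variation of the T-spectrum and the difference of two tensors. We omit the proof since it follows easily from the above theorem.

Corollary 3.1. Let $\mathcal{A},\mathcal{B}\in\mathbb{C}^{m\times m\times n}$ and let $\mathcal{A}$ be an F-diagonalizable tensor with decomposition

\[
\mathcal{P}^{-1} * \mathcal{A} * \mathcal{P} = \mathcal{D}.
\]

Let $s_{\mathcal{A}}(\mathcal{B})$ be the distance between the two T-spectral sets $\Lambda_{\mathcal{A}} = \{\lambda_i\}_{i=1}^{mn}$ and $\Lambda_{\mathcal{B}} = \{\mu_i\}_{i=1}^{mn}$, defined by

\[
s_{\mathcal{A}}(\mathcal{B}) = \max_{1\le j\le mn}\Bigl\{\min_{1\le i\le mn}|\lambda_i-\mu_j|\Bigr\}.
\]

Then we have $s_{\mathcal{A}}(\mathcal{B}) \le \kappa_2(\mathcal{P})\|\mathcal{B}-\mathcal{A}\|_2$. If $\mathcal{A}$ is normal, then $s_{\mathcal{A}}(\mathcal{B}) \le \|\mathcal{B}-\mathcal{A}\|_2$.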
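The 2-norm bound of Theorem 3.2 and Corollary 3.1 can be observed numerically. The sketch below is an illustration under assumptions of its own: the tensors $\mathcal{P}$, $\mathcal{D}$ and the perturbation are random, $\mathcal{P}$ is chosen close to the identity tensor so that it is well conditioned, and all computations are carried out on the block circulant matrices, which by Lemma 2.1 represent the corresponding tensor operations.

```python
import numpy as np

def bcirc(A):
    """Block circulant matrix (1.1) of a tensor with shape (n1, n2, n3)."""
    n1, n2, n3 = A.shape
    M = np.zeros((n1 * n3, n2 * n3), dtype=A.dtype)
    for i in range(n3):
        for j in range(n3):
            M[i*n1:(i+1)*n1, j*n2:(j+1)*n2] = A[:, :, (i - j) % n3]
    return M

rng = np.random.default_rng(5)
m, n = 3, 4

# F-diagonal tensor D (diagonal frontal slices) and a tensor P near the identity.
D = np.zeros((m, m, n))
for k in range(n):
    D[:, :, k] = np.diag(rng.standard_normal(m))
P = 0.1 * rng.standard_normal((m, m, n))
P[:, :, 0] += np.eye(m)

bP = bcirc(P)
bA = bP @ bcirc(D) @ np.linalg.inv(bP)             # bcirc(A), A = P * D * P^{-1}
bE = bcirc(1e-3 * rng.standard_normal((m, m, n)))  # bcirc of the perturbation

lam = np.linalg.eigvals(bA)            # T-eigenvalues of A
mu = np.linalg.eigvals(bA + bE)        # T-eigenvalues of the perturbed tensor

kappa = np.linalg.cond(bP, 2)          # kappa_2(P)
bound = kappa * np.linalg.norm(bE, 2)  # right-hand side of the 2-norm bound

# s_A(B) of Corollary 3.1: worst distance from a perturbed T-eigenvalue to Lambda(A).
s = np.max(np.min(np.abs(mu[:, None] - lam[None, :]), axis=1))
```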

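In general, tensors need not be F-diagonalizable, which means that (3.4) may not be satisfied. In that case we cannot obtain an F-diagonal tensor by a transformation under tensor-tensor multiplication, and we have the following decomposition instead.

Lemma 3.1. [14] (T-Schur decomposition). Let $\mathcal{A}\in\mathbb{C}^{m\times m\times n}$. Then there exists a unitary tensor $\mathcal{Q}$ such that

\[
\mathcal{Q}^{-1} * \mathcal{A} * \mathcal{Q} = \mathcal{T} = \mathcal{D} + \mathcal{N}, \tag{3.9}
\]

where $\mathcal{D}$ is an F-diagonal tensor and each frontal slice of $\mathcal{N}$ is strictly upper triangular.

In the following, we present two general cases of the Bauer-Fike theorem for tensors (i.e., Theorem 3.2). They are based on the T-Schur decomposition and can be viewed as generalizations of the matrix results for non-diagonalizable matrices given in [3, 4].

Theorem 3.3. (Generalization of Bauer-Fike theorem). Let $\mathcal{Q}^{-1} * \mathcal{A} * \mathcal{Q} = \mathcal{D}+\mathcal{N}$ be a T-Schur decomposition of $\mathcal{A}\in\mathbb{C}^{m\times m\times n}$ as given in (3.9). Let $\mathcal{B}\in\mathbb{C}^{m\times m\times n}$ and let $\epsilon$ be a small scalar. If $\mu$ is a T-eigenvalue of $\mathcal{A}+\epsilon\mathcal{B}$ and $q$ is the smallest positive number such that $|N|^q = 0$, where

\[
N :=
\begin{bmatrix}
N^{(1)} & & & \\
 & N^{(2)} & & \\
 & & \ddots & \\
 & & & N^{(n)}
\end{bmatrix}
= (F_m\otimes I_n)\cdot\operatorname{bcirc}(\mathcal{N})\cdot(F_m^H\otimes I_n)
\]

and $|N| = (|N_{ij}|)$ denotes the element-wise absolute value of a matrix, then for the spectral and Frobenius norms we have

\[
\min_{\lambda\in\Lambda(\mathcal{A})}|\lambda-\mu| \le \max\bigl\{\theta,\theta^{1/q}\bigr\}, \tag{3.10}
\]

in which

\[
\theta = \|\operatorname{bcirc}(\epsilon\mathcal{B})\|_p\sum_{k=0}^{q-1}\|N\|_p^k, \qquad p = 2, F. \tag{3.11}
\]

For the 1- and ∞-norms, we get

\[
\min_{\lambda\in\Lambda(\mathcal{A})}|\lambda-\mu| \le \max\bigl\{\theta_p,\theta_p^{1/q}\bigr\}, \tag{3.12}
\]

where

\[
\theta_p = \|\operatorname{bcirc}(\epsilon\mathcal{B})\|_p\,\kappa_p(\mathcal{Q})\,\kappa_p(F_m\otimes I_n)\sum_{k=0}^{q-1}\|N\|_2^k, \qquad p = 1,\infty.
\]

Proof. The theorem is clearly true if $\mu\in\Lambda(\mathcal{A})$, as the left-hand sides of (3.10) and (3.12) vanish. Therefore we assume that $\mu\notin\Lambda(\mathcal{A})$. By Lemma 2.1, we can see that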

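\[
\mu I_{mn} - \operatorname{bcirc}(\mathcal{A}+\epsilon\mathcal{B}) = \mu I_{mn} - \operatorname{bcirc}(\mathcal{A}) - \operatorname{bcirc}(\epsilon\mathcal{B}),
\]

and moreover this matrix is singular. This means that

\[
(F_m\otimes I_n)\operatorname{bcirc}(\mathcal{Q})^{-1}\bigl[\mu I_{mn} - \operatorname{bcirc}(\mathcal{A}) - \operatorname{bcirc}(\epsilon\mathcal{B})\bigr]\operatorname{bcirc}(\mathcal{Q})(F_m^H\otimes I_n) \tag{3.13}
\]

is also singular, since the matrices multiplied on the left and right sides are nonsingular. Notice that

\[
\begin{aligned}
&(F_m\otimes I_n)\cdot\operatorname{bcirc}(\mathcal{Q})^{-1}\operatorname{bcirc}(\mathcal{A})\operatorname{bcirc}(\mathcal{Q})\cdot(F_m^H\otimes I_n)\\
&\quad= (F_m\otimes I_n)\cdot[\operatorname{bcirc}(\mathcal{D})+\operatorname{bcirc}(\mathcal{N})]\cdot(F_m^H\otimes I_n)\\
&\quad= \operatorname{diag}\bigl(D^{(1)},D^{(2)},\dots,D^{(n)}\bigr) + \operatorname{diag}\bigl(N^{(1)},N^{(2)},\dots,N^{(n)}\bigr)\\
&\quad:= D + N.
\end{aligned}
\]

Therefore (3.13) can be rewritten as

\[
\mu I_{mn} - D - N - (F_m\otimes I_n)\cdot\operatorname{bcirc}(\mathcal{Q})^{-1}\operatorname{bcirc}(\epsilon\mathcal{B})\operatorname{bcirc}(\mathcal{Q})\cdot(F_m^H\otimes I_n),
\]

and then the matrix

\[
I_{mn} - (\mu I_{mn}-D-N)^{-1}(F_m\otimes I_n)\cdot\operatorname{bcirc}(\mathcal{Q})^{-1}\operatorname{bcirc}(\epsilon\mathcal{B})\operatorname{bcirc}(\mathcal{Q})\cdot(F_m^H\otimes I_n) \tag{3.14}
\]

is singular.

By the assumption that $|N|^q = 0$, and noting that $\mu I_{mn}-D$ is diagonal, it follows that $\bigl((\mu I_{mn}-D)^{-1}N\bigr)^q = 0$. Hence,

\[
\bigl((\mu I_{mn}-D)-N\bigr)^{-1} = \sum_{k=0}^{q-1}\bigl((\mu I_{mn}-D)^{-1}N\bigr)^k(\mu I_{mn}-D)^{-1}
\]

and

\[
\bigl\|\bigl((\mu I_{mn}-D)-N\bigr)^{-1}\bigr\| \le \frac{1}{\min_{\lambda\in\Lambda(\mathcal{A})}|\lambda-\mu|}\sum_{k=0}^{q-1}\left(\frac{\|N\|}{\min_{\lambda\in\Lambda(\mathcal{A})}|\lambda-\mu|}\right)^{k}
\]

for the 1-, 2- and ∞-norms. If $\min_{\lambda\in\Lambda(\mathcal{A})}|\lambda-\mu|\ge 1$, then

\[
\bigl\|\bigl((\mu I_{mn}-D)-N\bigr)^{-1}\bigr\| \le \frac{1}{\min_{\lambda\in\Lambda(\mathcal{A})}|\lambda-\mu|}\sum_{k=0}^{q-1}\|N\|^k,
\]

and if $\min_{\lambda\in\Lambda(\mathcal{A})}|\lambda-\mu| < 1$, then

\[
\bigl\|\bigl((\mu I_{mn}-D)-N\bigr)^{-1}\bigr\| \le \frac{1}{\bigl(\min_{\lambda\in\Lambda(\mathcal{A})}|\lambda-\mu|\bigr)^{q}}\sum_{k=0}^{q-1}\|N\|^k.
\]

By (3.14), we obtain

\[
\begin{aligned}
1 &\le \bigl\|(\mu I_{mn}-D-N)^{-1}(F_m\otimes I_n)\cdot\operatorname{bcirc}(\mathcal{Q})^{-1}\operatorname{bcirc}(\epsilon\mathcal{B})\operatorname{bcirc}(\mathcal{Q})\cdot(F_m^H\otimes I_n)\bigr\|\\
&\le \bigl\|(\mu I_{mn}-D-N)^{-1}\bigr\|\,\|\operatorname{bcirc}(\epsilon\mathcal{B})\|\,\|F_m\otimes I_n\|\,\|F_m^H\otimes I_n\|\,\|\operatorname{bcirc}(\mathcal{Q})^{-1}\|\,\|\operatorname{bcirc}(\mathcal{Q})\|,
\end{aligned}
\]

and for the spectral norm, we get

\[
\min_{\lambda\in\Lambda(\mathcal{A})}|\lambda-\mu| \le \|\operatorname{bcirc}(\epsilon\mathcal{B})\|_2\sum_{k=0}^{q-1}\|N\|_2^k
\quad\text{or}\quad
\Bigl(\min_{\lambda\in\Lambda(\mathcal{A})}|\lambda-\mu|\Bigr)^{q} \le \|\operatorname{bcirc}(\epsilon\mathcal{B})\|_2\sum_{k=0}^{q-1}\|N\|_2^k
\]

for $\min_{\lambda\in\Lambda(\mathcal{A})}|\lambda-\mu|\ge 1$ or $\min_{\lambda\in\Lambda(\mathcal{A})}|\lambda-\mu| < 1$, respectively. Let $\theta = \|\operatorname{bcirc}(\epsilon\mathcal{B})\|_2\sum_{k=0}^{q-1}\|N\|_2^k$. Then we get the result (3.10) for the spectral norm. The Frobenius norm case follows easily. For the 1- and ∞-norms, by using (3.14) again we get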

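\[
\min_{\lambda\in\Lambda(\mathcal{A})}|\lambda-\mu| \le \max\bigl\{\theta_p,\theta_p^{1/q}\bigr\},
\]

where

\[
\theta_p = \|\operatorname{bcirc}(\epsilon\mathcal{B})\|_p\,\kappa_p(\mathcal{Q})\,\kappa_p(F_m\otimes I_n)\sum_{k=0}^{q-1}\|N\|_2^k,
\]

and the proof is completed.

Now, we give one more general result extending Theorem 3.2. Different from the above theorem, which involves an F-diagonal tensor, in the next result we consider the block-diagonal case. Let $\mathcal{A}\in\mathbb{C}^{m\times m\times n}$. Notice that $\operatorname{bcirc}(\mathcal{A})$ can be block-diagonalized as follows,

\[
(F_m\otimes I_n)\operatorname{bcirc}(\mathcal{A})(F_m^H\otimes I_n) =
\begin{bmatrix}
A^{(1)} & & & \\
 & A^{(2)} & & \\
 & & \ddots & \\
 & & & A^{(n)}
\end{bmatrix}.
\]

By Lemma 2.3, we know that $A^{(i)}$, $i = 1,\dots,n$, may not be diagonal, since in general the tensor $\mathcal{A}$ is not F-diagonal. For each matrix $A^{(i)}$, let $X^{(i)}$ be a transformation matrix such that $(X^{(i)})^{-1}A^{(i)}X^{(i)} = \operatorname{diag}(A^{(i)}_{k_i})$, where $A^{(i)}_{k_i}$ is in triangular Schur form with

\[
A^{(i)}_{k_i} = D^{(i)}_{k_i} + N^{(i)}_{k_i}, \qquad k_i = 1,\dots,\ell_i.
\]

Denote $\operatorname{bcirc}(\mathcal{X}) = (F_m^H\otimes I_n)\operatorname{diag}(X^{(1)},X^{(2)},\dots,X^{(n)})(F_m\otimes I_n)$. Then we get

\[
\begin{aligned}
&(F_m\otimes I_n)\cdot\operatorname{bcirc}(\mathcal{X})^{-1}\operatorname{bcirc}(\mathcal{A})\operatorname{bcirc}(\mathcal{X})\cdot(F_m^H\otimes I_n)\\
&\quad= \operatorname{diag}\bigl(\operatorname{diag}(A^{(1)}_{k_1}),\operatorname{diag}(A^{(2)}_{k_2}),\dots,\operatorname{diag}(A^{(n)}_{k_n})\bigr)\\
&\quad= \operatorname{diag}\bigl(A^{(1)}_{1},\dots,A^{(1)}_{\ell_1},\dots,A^{(n)}_{1},\dots,A^{(n)}_{\ell_n}\bigr)\\
&\quad= \operatorname{diag}\bigl(D^{(1)}_{1},\dots,D^{(1)}_{\ell_1},\dots,D^{(n)}_{1},\dots,D^{(n)}_{\ell_n}\bigr) + \operatorname{diag}\bigl(N^{(1)}_{1},\dots,N^{(1)}_{\ell_1},\dots,N^{(n)}_{1},\dots,N^{(n)}_{\ell_n}\bigr)\\
&\quad:= \tilde{D} + \tilde{N}.
\end{aligned}
\]

By the above analysis, we have the following conclusion.

Theorem 3.4. If $\mu$ is a T-eigenvalue of $\mathcal{A}+\epsilon\mathcal{B}$ and $q$ is the dimension of $A^{(i)}_{k_i}$, then we have

\[
\min_{\lambda\in\Lambda(\mathcal{A})}|\lambda-\mu| \le \max\bigl\{\theta_1,\theta_1^{1/q}\bigr\},
\]

where

\[
\theta_1 = C\epsilon\,\|\operatorname{bcirc}(\mathcal{B})\|_p\,\kappa_p(\mathcal{X}), \qquad p = 2, F,
\]

and $C = \sum_{k=0}^{q-1}\bigl\|N^{(i)}_{k_i}\bigr\|_2^{k}$, provided that $\max_j\bigl\|(A^{(i)}_j-\mu I)^{-1}\bigr\|$ occurs at $j = k_i$.

For the 1- and ∞-norms, under the above condition, we get

\[
\min_{\lambda\in\Lambda(\mathcal{A})}|\lambda-\mu| \le \max\bigl\{\theta_2,\theta_2^{1/q}\bigr\},
\]

where $\theta_2 = C\epsilon\,\|\operatorname{bcirc}(\mathcal{B})\|_p\,\kappa_p(\mathcal{X})\,\kappa_p(F_m\otimes I_n)$.

Proof. We only need to consider the case that $\mu$ is not a T-eigenvalue of $\mathcal{A}$. Hence $\mu I_{mn}-\tilde{D}-\tilde{N}$ is nonsingular. Similarly to (3.14), the matrix

\[
I_{mn} - (\mu I_{mn}-\tilde{D}-\tilde{N})^{-1}(F_m\otimes I_n)\cdot\operatorname{bcirc}(\mathcal{X})^{-1}\operatorname{bcirc}(\epsilon\mathcal{B})\operatorname{bcirc}(\mathcal{X})\cdot(F_m^H\otimes I_n)
\]

is singular. The conclusion then follows by an argument similar to the proof of Theorem 3.3.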

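3.4 Kahan theorem for tensors

The case of a Hermitian tensor perturbed by a Hermitian tensor is studied in [6]. Next, we give a result for a Hermitian tensor perturbed by an arbitrary tensor.

Theorem 3.5. (Kahan theorem for tensors). Let $\mathcal{A}\in\mathbb{R}^{m\times m\times n}$ be a Hermitian tensor. Suppose that its set of T-eigenvalues is denoted by $\Lambda_{\mathcal{A}} = \{\lambda_i\}_{i=1}^{mn}$, with the T-eigenvalues arranged in non-increasing order:

\[
\lambda_{\max} = \lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_{mn-1} \ge \lambda_{mn} = \lambda_{\min}. \tag{3.15}
\]

Suppose that $\mathcal{B} = \mathcal{A}+\mathcal{E}$ and let $\Lambda_{\mathcal{B}} = \{\beta_k+\mathrm{i}\gamma_k\}_{k=1}^{mn}$ be such that $\beta_1\ge\beta_2\ge\cdots\ge\beta_{mn-1}\ge\beta_{mn}$. Let

\[
E_y = \frac{\operatorname{bcirc}(\mathcal{E}) - \operatorname{bcirc}(\mathcal{E})^{H}}{2\mathrm{i}}
\]

and $\sigma_k = \{\beta+\mathrm{i}\gamma\in\mathbb{C} : |\beta+\mathrm{i}\gamma-\lambda_k| \le \|\mathcal{E}\|_2,\ |\gamma| \le \|E_y\|_2\}$. Then

\[
\Lambda_{\mathcal{B}} \subset \bigcup_{k=1}^{mn}\sigma_k.
\]

Proof. On the one hand, by Lemma 2.4, a Hermitian tensor is F-diagonalizable by a unitary tensor. According to Corollary 3.1, for any given T-eigenvalue $\beta+\mathrm{i}\gamma$ of $\mathcal{B}$ there exists a $\lambda_k$ such that $|\beta+\mathrm{i}\gamma-\lambda_k| \le \|\mathcal{E}\|_2$.

On the other hand, suppose that $\mathcal{B} * \mathcal{X} = (\beta+\mathrm{i}\gamma)\mathcal{X}$. By the definition of tensor-tensor multiplication, this is equivalent to

\[
\operatorname{bcirc}(\mathcal{B})\operatorname{unfold}(\mathcal{X}) = (\beta+\mathrm{i}\gamma)\operatorname{unfold}(\mathcal{X}).
\]

Since $\mathcal{X}$ is nonzero, the vector $\operatorname{unfold}(\mathcal{X})$ is also nonzero, so we may assume that $\|\operatorname{unfold}(\mathcal{X})\|_2 = 1$. Therefore,

\[
(\operatorname{unfold}(\mathcal{X}))^{H}\operatorname{bcirc}(\mathcal{B})\operatorname{unfold}(\mathcal{X}) = \beta+\mathrm{i}\gamma
\quad\text{and}\quad
(\operatorname{unfold}(\mathcal{X}))^{H}\operatorname{bcirc}(\mathcal{B})^{H}\operatorname{unfold}(\mathcal{X}) = \beta-\mathrm{i}\gamma,
\]

which implies that

\[
\gamma = \frac{(\operatorname{unfold}(\mathcal{X}))^{H}\bigl[\operatorname{bcirc}(\mathcal{B})-\operatorname{bcirc}(\mathcal{B})^{H}\bigr]\operatorname{unfold}(\mathcal{X})}{2\mathrm{i}} = (\operatorname{unfold}(\mathcal{X}))^{H}E_y\operatorname{unfold}(\mathcal{X}),
\]

where the last equality uses that $\operatorname{bcirc}(\mathcal{A})$ is Hermitian. Thus $|\gamma| \le \|E_y\|_2$. Our conclusion follows by combining these two parts.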
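The two bounds that define the sets $\sigma_k$ in Theorem 3.5 can be checked on a random example. The following is a sketch under stated assumptions: the Hermitian tensor is built as the symmetric part of a random real tensor with the helper `t_conj_transpose` (a name introduced here), the perturbation is an arbitrary small complex tensor, and $\|\mathcal{E}\|_2$ is evaluated as $\|\operatorname{bcirc}(\mathcal{E})\|_2$ as in Theorem 3.2.

```python
import numpy as np

def bcirc(A):
    """Block circulant matrix (1.1) of a tensor with shape (n1, n2, n3)."""
    n1, n2, n3 = A.shape
    M = np.zeros((n1 * n3, n2 * n3), dtype=A.dtype)
    for i in range(n3):
        for j in range(n3):
            M[i*n1:(i+1)*n1, j*n2:(j+1)*n2] = A[:, :, (i - j) % n3]
    return M

def t_conj_transpose(A):
    """Conjugate-transpose each slice, then reverse slices 2..n3 (Definition 2.1)."""
    AH = np.conj(np.transpose(A, (1, 0, 2)))
    return np.concatenate([AH[:, :, :1], AH[:, :, :0:-1]], axis=2)

rng = np.random.default_rng(6)
m, n = 3, 4
G = rng.standard_normal((m, m, n))
A = 0.5 * (G + t_conj_transpose(G))            # real Hermitian (symmetric) tensor
E = 0.05 * (rng.standard_normal((m, m, n)) + 1j * rng.standard_normal((m, m, n)))

bA, bE = bcirc(A), bcirc(E)
lam = np.linalg.eigvalsh(bA)                   # real T-eigenvalues of A
mu = np.linalg.eigvals(bA + bE)                # beta + i*gamma, T-eigenvalues of B

E_y = (bE - bE.conj().T) / 2j
norm_E = np.linalg.norm(bE, 2)
norm_Ey = np.linalg.norm(E_y, 2)

# Distance from each T-eigenvalue of B to the nearest T-eigenvalue of A.
dist_to_spectrum = np.min(np.abs(mu[:, None] - lam[None, :]), axis=1)
```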

4 Pseudospectra of third-order tensors

Pseudospectra of finite-dimensional matrices have been thoroughly investigated in the classical book by Trefethen [21]. Three definitions of pseudospectra for any norm and one definition for the spectral norm are given there, and those definitions are equivalent under certain conditions. Many properties and representative numerical results are presented there with numerous figures.

In this section, we study the pseudospectra theory of third-order tensors in the sense of tensor-tensor multiplication. The definition of pseudospectra on the basis of the T-eigenvalue defined in Definition 3.1 is given first. Some properties based on the definition are given afterwards.

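4.1 Pseudospectra of third-order tensors under tensor-tensor multiplication

First, we give the definition of the ε-pseudospectra of an $m\times m\times n$ tensor $\mathcal{A}$. If the norm $\|\cdot\|_p$ is not specified, we take the convention that $p = 1,2,\infty$.

Definition 4.1 (ε-pseudospectra of a Tensor). Let $\mathcal{A}\in\mathbb{C}^{m\times m\times n}$. Then the block circulant matrix $\operatorname{bcirc}(\mathcal{A})$ generated by the tensor $\mathcal{A}$ can be factored as follows,

\[
\operatorname{bcirc}(\mathcal{A}) = \left(F_m^H\otimes I_n\right)\cdot
\begin{bmatrix}
A^{(1)} & & & \\
 & A^{(2)} & & \\
 & & \ddots & \\
 & & & A^{(n)}
\end{bmatrix}
\cdot(F_m\otimes I_n). \tag{4.1}
\]

For each $A^{(i)}$, where $i\in[n] := \{1,\dots,n\}$, we have

\[
\Lambda_\varepsilon(A^{(i)}) := \bigl\{ z\in\mathbb{C} : \bigl\|(zI_m-A^{(i)})^{-1}\bigr\| \ge \varepsilon^{-1} \bigr\},
\]

where $\varepsilon$ is a positive scalar. If $zI_m-A^{(i)}$ is singular for some $i$, we define $\bigl\|(zI_m-A^{(i)})^{-1}\bigr\| = \infty$. In this definition, we denote the block diagonal matrix in (4.1) by $\operatorname{diag}(A^{(1)},\dots,A^{(n)}) := A$.

(I) We call

\[
\Lambda_\varepsilon(\mathcal{A}) := \Bigl\{ z\in\mathbb{C} : \max_{i\in[n]}\bigl\|(zI_m-A^{(i)})^{-1}\bigr\| \ge \varepsilon^{-1} \Bigr\}
\]

the ε-pseudospectra of the tensor $\mathcal{A}$.

(II) $\Lambda_\varepsilon(\mathcal{A}) = \{ z\in\mathbb{C} : z\in\Lambda(A+E) \text{ for some } E\in\mathbb{C}^{mn\times mn} \text{ with } \|E\|\le\varepsilon \}$.

(III) $\Lambda_\varepsilon(\mathcal{A}) = \{ z\in\mathbb{C} : \text{there exists } v\in\mathbb{C}^{mn} \text{ with } \|v\| = 1 \text{ such that } \|(A-zI_{mn})v\|\le\varepsilon \}$.

Theorem 4.1. The three definitions (I), (II) and (III) are equivalent.

Proof. For a block diagonal matrix, we find that $\|A\| = \max_{i\in[n]}\|A^{(i)}\|$, and the inverse of $A$ can be obtained by computing the inverse of each $A^{(i)}$. Therefore,

\[
\max_{i\in[n]}\bigl\|(zI_m-A^{(i)})^{-1}\bigr\| = \bigl\|(zI_{mn}-A)^{-1}\bigr\|
\]

and thus

\[
\Lambda_\varepsilon(\mathcal{A}) = \bigl\{ z\in\mathbb{C} : \bigl\|(zI_{mn}-A)^{-1}\bigr\| \ge \varepsilon^{-1} \bigr\}. \tag{4.2}
\]

The equivalence of (4.2) with (II) and (III) follows from the matrix case [21].

Remark 4.1. Notice that $z\in\mathbb{C}$ is variable, while $A^{(i)}$ is fixed once the tensor $\mathcal{A}$ is given. Therefore by definition (I), we can also see that

\[
\Lambda_\varepsilon(\mathcal{A}) = \bigcup_{i=1}^{n}\Lambda_\varepsilon(A^{(i)}) = \bigcup_{i=1}^{n}\bigl\{ z\in\mathbb{C} : \bigl\|(zI_m-A^{(i)})^{-1}\bigr\| \ge \varepsilon^{-1} \bigr\}.
\]

For the spectral norm, we give the following definition.

Definition 4.2. Let $\mathcal{A}\in\mathbb{C}^{m\times m\times n}$. Then the block circulant matrix $\operatorname{bcirc}(\mathcal{A})$ generated by the tensor $\mathcal{A}$ can be factored as in (4.1). For each square matrix $A^{(i)}$, where $i\in[n]$, we have

\[
\Lambda_\varepsilon(A^{(i)}) = \bigl\{ z\in\mathbb{C} : \sigma_{\min}(zI_m-A^{(i)}) \le \varepsilon \bigr\},
\]

where $\varepsilon$ is a positive scalar and $\sigma_{\min}(\cdot)$ denotes the minimum singular value. We call

\[
\Lambda_\varepsilon(\mathcal{A}) = \bigcup_{i=1}^{n}\Lambda_\varepsilon(A^{(i)})
\]

the ε-pseudospectra of the tensor $\mathcal{A}$.

By Remark 4.1, we get the following conclusion.

Theorem 4.2. The three definitions given in Definition 4.1 and the one in Definition 4.2 are also equivalent.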
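Definition 4.2 leads directly to a membership test for the spectral norm: a point $z$ belongs to $\Lambda_\varepsilon(\mathcal{A})$ when the smallest singular value of $zI_m-A^{(i)}$ is at most $\varepsilon$ for some $i$. The sketch below illustrates this. The function names `sigma_min_slices` and `in_pseudospectrum`, the identification of the blocks $A^{(i)}$ with the slices of `np.fft.fft(A, axis=2)`, and the test points are assumptions of this illustration.

```python
import numpy as np

def bcirc(A):
    """Block circulant matrix (1.1) of a tensor with shape (n1, n2, n3)."""
    n1, n2, n3 = A.shape
    M = np.zeros((n1 * n3, n2 * n3), dtype=A.dtype)
    for i in range(n3):
        for j in range(n3):
            M[i*n1:(i+1)*n1, j*n2:(j+1)*n2] = A[:, :, (i - j) % n3]
    return M

def sigma_min_slices(A, z):
    """min over i of sigma_min(z I_m - A^(i)), using the Fourier-domain slices."""
    m = A.shape[0]
    A_hat = np.fft.fft(A, axis=2)
    return min(
        np.linalg.svd(z * np.eye(m) - A_hat[:, :, i], compute_uv=False)[-1]
        for i in range(A.shape[2])
    )

def in_pseudospectrum(A, z, eps):
    """Membership test following Definition 4.2 (spectral norm)."""
    return bool(sigma_min_slices(A, z) <= eps)

rng = np.random.default_rng(7)
m, n = 3, 4
A = rng.standard_normal((m, m, n))

# The slice-wise quantity agrees with sigma_min of z I - bcirc(A), as in (4.2).
z = 0.3 + 0.7j
s_slices = sigma_min_slices(A, z)
s_bcirc = np.linalg.svd(z * np.eye(m * n) - bcirc(A), compute_uv=False)[-1]

t_eig = np.linalg.eigvals(bcirc(A))[0]      # one T-eigenvalue of A
```

Evaluating `sigma_min_slices` on a grid of points $z$ and drawing its level curves is one way to visualize the boundaries of the ε-pseudospectra for several values of ε.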

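4.2 Properties of pseudospectra of tensors

We study the properties of pseudospectra for third-order tensors in this subsection. Many fundamental results are given in the next theorem.

Theorem 4.3. Let $\mathcal{A}\in\mathbb{C}^{m\times m\times n}$ and suppose that the positive scalar $\varepsilon$ is given arbitrarily.
(1) The set $\Lambda_\varepsilon(\mathcal{A})$ is nonempty, open, and bounded. Moreover, there are at most $nm$ connected components, each containing one or more T-eigenvalues of $\mathcal{A}$.
(2) For any $c\in\mathbb{C}$, we have $\Lambda_\varepsilon(\mathcal{A}+c) = c+\Lambda_\varepsilon(\mathcal{A})$, where $\mathcal{A}+c$ is shorthand for $\mathcal{A}+c\mathcal{I}$ and $\mathcal{I}$ is the identity tensor with the same size as $\mathcal{A}$.
(3) For any nonzero $c\in\mathbb{C}$, we have $\Lambda_{|c|\varepsilon}(c\mathcal{A}) = c\Lambda_\varepsilon(\mathcal{A})$.
(4) If the spectral norm is applied, then $\Lambda_\varepsilon(\mathcal{A}^{H}) = \overline{\Lambda_\varepsilon(\mathcal{A})}$.

Proof. To prove assertion (1), one can use the fact that each $\Lambda_\varepsilon(A^{(i)})$ of the matrix $A^{(i)}$ is nonempty, open, and bounded, with at most $m$ connected components, each containing one or more eigenvalues of $A^{(i)}$. The same properties then hold for the given tensor $\mathcal{A}$. Moreover, there are at most $nm$ connected components since

\[
\Lambda_\varepsilon(\mathcal{A}) = \bigcup_{i=1}^{n}\Lambda_\varepsilon(A^{(i)}).
\]

We now come to part (2). First, note that

\[
\begin{aligned}
\operatorname{bcirc}(\mathcal{A}+c\mathcal{I}) &=
\begin{bmatrix}
A_1+cI_m & A_n & A_{n-1} & \cdots & A_2\\
A_2 & A_1+cI_m & A_n & \cdots & A_3\\
\vdots & \vdots & \vdots & \ddots & \vdots\\
A_n & A_{n-1} & \cdots & A_2 & A_1+cI_m
\end{bmatrix}\\
&= \operatorname{bcirc}(\mathcal{A}) + c\operatorname{bcirc}(\mathcal{I})\\
&= \left(F_m^H\otimes I_n\right)\cdot A\cdot(F_m\otimes I_n) + c\bigl[\left(F_m^H\otimes I_n\right)\cdot I_{mn}\cdot(F_m\otimes I_n)\bigr]\\
&= \left(F_m^H\otimes I_n\right)\cdot(A+cI_{mn})\cdot(F_m\otimes I_n)\\
&= \left(F_m^H\otimes I_n\right)\cdot
\begin{bmatrix}
A^{(1)}+cI_m & & & \\
 & A^{(2)}+cI_m & & \\
 & & \ddots & \\
 & & & A^{(n)}+cI_m
\end{bmatrix}
\cdot(F_m\otimes I_n).
\end{aligned}
\]

Therefore,

\[
\Lambda_\varepsilon(\mathcal{A}+c) = \bigcup_{i=1}^{n}\Lambda_\varepsilon(A^{(i)}+cI_m) = \bigcup_{i=1}^{n}\bigl[\Lambda_\varepsilon(A^{(i)})+c\bigr] = c+\Lambda_\varepsilon(\mathcal{A})
\]

for any $c\in\mathbb{C}$, which completes the proof of this part.

For part (3), by Lemma 2.1, we know that

\[
\operatorname{bcirc}(c\mathcal{A}) = c\operatorname{bcirc}(\mathcal{A}) = c\bigl[\left(F_m^H\otimes I_n\right)\cdot A\cdot(F_m\otimes I_n)\bigr] = \left(F_m^H\otimes I_n\right)\cdot(cA)\cdot(F_m\otimes I_n),
\]

which implies that

\[
\Lambda_{|c|\varepsilon}(c\mathcal{A}) = \bigcup_{i=1}^{n}\Lambda_{|c|\varepsilon}(cA^{(i)}) = \bigcup_{i=1}^{n}c\Lambda_\varepsilon(A^{(i)}) = c\bigcup_{i=1}^{n}\Lambda_\varepsilon(A^{(i)}),
\]

since for any nonzero $c\in\mathbb{C}$ and any matrix $A\in\mathbb{C}^{m\times m}$ the equality

\[
\Lambda_{|c|\varepsilon}(cA) = c\Lambda_\varepsilon(A)
\]

holds. Thus we get the result that $\Lambda_{|c|\varepsilon}(c\mathcal{A}) = c\Lambda_\varepsilon(\mathcal{A})$ for any nonzero $c\in\mathbb{C}$.

Now, we prove the last part of this theorem. By Lemma 2.1, we know that

\[
\operatorname{bcirc}(\mathcal{A}^{H}) = \left(F_m^H\otimes I_n\right)\cdot A^{H}\cdot(F_m\otimes I_n).
\]

Therefore,

\[
\Lambda_\varepsilon(\mathcal{A}^{H}) = \bigcup_{i=1}^{n}\Lambda_\varepsilon\bigl((A^{(i)})^{H}\bigr) = \bigcup_{i=1}^{n}\overline{\Lambda_\varepsilon(A^{(i)})} = \overline{\bigcup_{i=1}^{n}\Lambda_\varepsilon(A^{(i)})} = \overline{\Lambda_\varepsilon(\mathcal{A})},
\]

where the conclusion $\Lambda_\varepsilon(A^{H}) = \overline{\Lambda_\varepsilon(A)}$ under the 2-norm for any matrix $A\in\mathbb{C}^{m\times m}$ is applied in the second equality.

Remark 4.2. By the results above, we can see that the function of pseudospectra on the tensor $\mathcal{A}$ is linear.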

The properties of pseudospectra on normal tensors are given next.

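Theorem 4.4. (Pseudospectra of a normal tensor). Let $\Delta_\varepsilon$ be the open $\varepsilon$-ball; that is, $\Delta_\varepsilon = \{z \in \mathbb{C} : |z| < \varepsilon\}$. A sum of sets is defined as

\[
\Lambda(\mathcal{A}) + \Delta_\varepsilon = \{z : z = z_1 + z_2,\ z_1 \in \Lambda(\mathcal{A}),\ z_2 \in \Delta_\varepsilon\},
\]

where $\Lambda(\mathcal{A})$ is the T-spectrum (the set of T-eigenvalues) of the tensor $\mathcal{A}$. Then for any tensor $\mathcal{A} \in \mathbb{C}^{m \times m \times n}$, we have

\[
\Lambda_\varepsilon(\mathcal{A}) \supseteq \Lambda(\mathcal{A}) + \Delta_\varepsilon \quad \forall \varepsilon > 0. \tag{4.3}
\]

Moreover, if $\mathcal{A}$ is normal and $\|\cdot\| = \|\cdot\|_2$, then

\[
\Lambda_\varepsilon(\mathcal{A}) = \Lambda(\mathcal{A}) + \Delta_\varepsilon \quad \forall \varepsilon > 0. \tag{4.4}
\]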

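Proof. If $\lambda$ is a T-eigenvalue of the tensor $\mathcal{A}$, then it is an eigenvalue of the matrix $\mathrm{bcirc}(\mathcal{A})$. Therefore $\lambda + \mu$ is an eigenvalue of $\mathrm{bcirc}(\mathcal{A}) + \mu I$ for any $\mu \in \mathbb{C}$. Since $\|\mu I\| = |\mu|$, the definition of pseudospectra of tensors gives $\lambda + \mu \in \Lambda_\varepsilon(\mathcal{A})$ for any $|\mu| < \varepsilon$. This proves (4.3).

For the normal tensor case, by Lemma 2.4 and Lemma 2.1, we have

\[
\mathrm{bcirc}(\mathcal{U})\,\mathrm{bcirc}(\mathcal{A})\,(\mathrm{bcirc}(\mathcal{U}))^H = \mathrm{bcirc}(\mathcal{D})
\]

and

\[
(F_m \otimes I_n)\,\mathrm{bcirc}(\mathcal{D})\,(F_m^H \otimes I_n) =
\begin{bmatrix}
D^{(1)} & & & \\
& D^{(2)} & & \\
& & \ddots & \\
& & & D^{(n)}
\end{bmatrix} := D \tag{4.5}
\]

in which $D^{(i)}$ is diagonal for $i = 1, \ldots, n$ by Lemma 2.3. Since $\|\cdot\| = \|\cdot\|_2$ and the spectral norm is unitarily invariant, we may assume directly that $\mathcal{A}$ is F-diagonal. Therefore, the diagonal entries of $\mathrm{bcirc}(\mathcal{A})$ are equal to the T-eigenvalues. It is well known that, for any normal matrix, the $\varepsilon$-pseudospectrum is just the union of the open $\varepsilon$-balls about the points of the spectrum; equivalently, we have

\[
\left\|(z - \mathrm{bcirc}(\mathcal{A}))^{-1}\right\|_2 = \frac{1}{\mathrm{dist}(z, \Lambda(\mathrm{bcirc}(\mathcal{A})))}, \tag{4.6}
\]

which implies $\mathrm{dist}(z, \Lambda(\mathrm{bcirc}(\mathcal{A}))) < \varepsilon$ by the definition of the $\varepsilon$-pseudospectrum of tensors. We get the conclusion since $\Lambda(\mathcal{A}) + \Delta_\varepsilon$ is the same as $\{z : \mathrm{dist}(z, \Lambda(\mathcal{A})) < \varepsilon\}$.

Theorem 4.5. (Bauer-Fike Theorem). Suppose the tensor $\mathcal{A} \in \mathbb{C}^{m \times m \times n}$ is F-diagonalizable, i.e., it has the decomposition (3.4). If the spectral norm is applied, then for each positive scalar $\varepsilon$, we have

\[
\Lambda(\mathcal{A}) + \Delta_\varepsilon \subseteq \Lambda_\varepsilon(\mathcal{A}) \subseteq \Lambda(\mathcal{A}) + \Delta_{\varepsilon \kappa_2(\mathcal{P})}.
\]

Proof. We only need to prove the second inclusion. By the definition of pseudospectra, Lemma 2.1 and the decompositions (3.4) and (4.5), it is not hard to find that

\[
\begin{aligned}
\left\|(zI_{mn} - \mathrm{bcirc}(\mathcal{A}))^{-1}\right\|
&= \left\|\mathrm{bcirc}(\mathcal{P})\,(zI_{mn} - \mathrm{bcirc}(\mathcal{D}))^{-1}\,\mathrm{bcirc}(\mathcal{P})^{-1}\right\| \\
&\le \frac{\kappa_2(\mathcal{P})}{\mathrm{dist}(z, \Lambda(\mathrm{bcirc}(\mathcal{D})))}
= \frac{\kappa_2(\mathcal{P})}{\mathrm{dist}(z, \Lambda(D))}
= \frac{\kappa_2(\mathcal{P})}{\mathrm{dist}(z, \Lambda(\mathcal{A}))}.
\end{aligned}
\]

As in the matrix case, we get our conclusion.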
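The two inclusions of the Bauer-Fike theorem, and the equality (4.4) for the normal case, can be illustrated at the level of $\mathrm{bcirc}$ matrices. The sketch below assumes NumPy, takes "F-diagonal" to mean that every frontal slice is diagonal, and takes $\kappa_2(\mathcal{P})$ to be the spectral condition number of $\mathrm{bcirc}(\mathcal{P})$; these readings, the helper names, and the random data are assumptions made for illustration only.

```python
import numpy as np

def bcirc(A):
    """Block-circulant matrix whose first block column stacks the frontal slices of A."""
    m, p, n = A.shape
    M = np.zeros((m * n, p * n), dtype=A.dtype)
    for i in range(n):
        for j in range(n):
            M[i * m:(i + 1) * m, j * p:(j + 1) * p] = A[:, :, (i - j) % n]
    return M

def smin(M, z):
    """Smallest singular value of z*I - M, i.e. 1 / ||(zI - M)^{-1}||_2."""
    return np.linalg.svd(z * np.eye(M.shape[0]) - M, compute_uv=False)[-1]

rng = np.random.default_rng(1)
m, n = 3, 4
P = rng.standard_normal((m, m, n))
D = np.zeros((m, m, n))
for k in range(n):
    D[:, :, k] = np.diag(rng.standard_normal(m))   # every frontal slice diagonal

BP, BD = bcirc(P), bcirc(D)
M = BP @ BD @ np.linalg.inv(BP)        # bcirc of A = P * D * P^{-1}
kappa = np.linalg.cond(BP, 2)          # assumed meaning of kappa_2(P)

# T-eigenvalues: diagonal entries of the Fourier-domain slices of D.
D_hat = np.fft.fft(D, axis=2)
spectrum = np.array([D_hat[i, i, k] for i in range(m) for k in range(n)])

def dist(z):
    """Distance from z to the T-spectrum."""
    return np.min(np.abs(z - spectrum))

zs = rng.standard_normal(50) + 1j * rng.standard_normal(50)
tol = 1e-8
# Second inclusion: sigma_min >= dist / kappa, so sigma_min < eps forces dist < eps * kappa.
lower_ok = all(dist(z) / kappa <= smin(M, z) + tol for z in zs)
# First inclusion: sigma_min <= dist, so dist < eps forces sigma_min < eps.
upper_ok = all(smin(M, z) <= dist(z) + tol for z in zs)
# Normal case (here the F-diagonal tensor D itself): sigma_min equals dist.
normal_ok = all(abs(smin(BD, z) - dist(z)) <= tol for z in zs)
print(lower_ok, upper_ok, normal_ok)
```

The checks are performed only at randomly sampled points, so they illustrate the inequalities rather than prove them.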

4.3 Examples of the ε-pseudospectrum

Example 4.1. Let $\mathcal{A}$ be a third-order tensor with three frontal faces $A_1$, $A_2$ and $A_3$. Firstly, we consider an example in which all frontal faces are the same and each is a tridiagonal Toeplitz matrix, that is, $A_1 = A_2 = A_3 = T_{pz}$ where

\[
T_{pz} =
\begin{bmatrix}
0 & 1 & & & \\
\tfrac{1}{4} & 0 & 1 & & \\
& \ddots & \ddots & \ddots & \\
& & \tfrac{1}{4} & 0 & 1 \\
& & & \tfrac{1}{4} & 0
\end{bmatrix}
\in \mathbb{R}^{N \times N}.
\]

We denote this tensor by $\mathcal{A}_0$; the size of $\mathrm{bcirc}(\mathcal{A}_0)$ is $3N \times 3N$. However, it is nonsymmetric. Notice that $T_{pz}$ can be symmetrized by the diagonal similarity transformation

\[
D\,T_{pz}\,D^{-1} = S,
\]

where $D = \mathrm{diag}(2, 4, \ldots, 2^N)$ and

\[
S =
\begin{bmatrix}
0 & \tfrac{1}{2} & & & \\
\tfrac{1}{2} & 0 & \tfrac{1}{2} & & \\
& \ddots & \ddots & \ddots & \\
& & \tfrac{1}{2} & 0 & \tfrac{1}{2} \\
& & & \tfrac{1}{2} & 0
\end{bmatrix}
\in \mathbb{R}^{N \times N}.
\]

Let $D_b$ be the block diagonal matrix $D_b = \mathrm{diag}(D, D, D)$. Then

\[
D_b\,\mathrm{bcirc}(\mathcal{A}_0)\,D_b^{-1} = S_b
\]

is symmetric, and thus all its eigenvalues are real. This means that all the T-eigenvalues of the given tensor $\mathcal{A}_0$ are real and lie on the real axis, which agrees with Lemma 2.2.
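The symmetrization argument of this example is easy to reproduce. The following minimal sketch assumes NumPy; the variable names are illustrative. It builds $T_{pz}$ for $N = 20$, applies the diagonal similarity with $D = \mathrm{diag}(2, 4, \ldots, 2^N)$, and repeats the step for $\mathrm{bcirc}(\mathcal{A}_0)$, which in this special case consists of nine identical blocks.

```python
import numpy as np

N = 20
off = np.ones(N - 1)
# Tridiagonal Toeplitz matrix: 0 on the diagonal, 1 above it, 1/4 below it.
T = np.diag(off, 1) + 0.25 * np.diag(off, -1)

# Diagonal similarity D T D^{-1} with D = diag(2, 4, ..., 2^N).
d = 2.0 ** np.arange(1, N + 1)
S = (d[:, None] * T) / d[None, :]
S_expected = 0.5 * (np.diag(off, 1) + np.diag(off, -1))

# bcirc(A_0) for A_1 = A_2 = A_3 = T_pz: every one of the 3 x 3 blocks equals T_pz.
B = np.kron(np.ones((3, 3)), T)
d_b = np.tile(d, 3)                       # D_b = diag(D, D, D)
S_b = (d_b[:, None] * B) / d_b[None, :]

# S_b is symmetric, so its eigenvalues (the T-eigenvalues of A_0) are real.
t_eigs = np.linalg.eigvalsh(S_b)
print(np.allclose(S, S_expected), np.allclose(S_b, S_b.T), t_eigs.min(), t_eigs.max())
```

Because the entries of $D$ are powers of two, the similarity is exact in floating point, which is convenient for this check even though $D$ itself is badly scaled.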

(a) CONTOUR only. (b) SURF and CONTOUR.

Figure 1: Boundaries of pseudospectra $\Lambda_\varepsilon(\mathcal{A}_0)$ for $\varepsilon = 10^{-1}, 10^{-2}, \ldots, 10^{-10}$ under the condition that $A_1 = A_2 = A_3 = T_{pz}$. The T-eigenvalues of tensor $\mathcal{A}_0$ are plotted as crosses '×'. (Left: CONTOUR only; Right: SURF and CONTOUR)

However, the pseudospectra of $\mathcal{A}_0$ extend far from the real axis. We consider the case $N = 20$; the results are given in Figure 1, in which the T-eigenvalues of tensor $\mathcal{A}_0$ are plotted as crosses '×'. The spectral norm is chosen here and we set $\varepsilon = 10^{-1}, 10^{-2}, \ldots, 10^{-10}$. The results show that the boundaries of $\Lambda_\varepsilon(\mathcal{A}_0)$ lie far beyond the real axis.
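A plot of this kind can be approximated by evaluating $\sigma_{\min}(zI - \mathrm{bcirc}(\mathcal{A}_0))$ on a grid and contouring the result at the chosen levels of $\varepsilon$. The sketch below assumes NumPy and this standard grid-based approach; it is not necessarily how the figures in this paper were produced, and the grid bounds and the test point $z = 1.5i$ are arbitrary choices. Plotting itself (for example with a contour routine) is left out.

```python
import numpy as np

N = 20
off = np.ones(N - 1)
T = np.diag(off, 1) + 0.25 * np.diag(off, -1)   # T_pz from Example 4.1
B = np.kron(np.ones((3, 3)), T)                 # bcirc(A_0) with A_1 = A_2 = A_3 = T_pz

def smin(M, z):
    """Smallest singular value of z*I - M; z is in the eps-pseudospectrum iff this is < eps."""
    return np.linalg.svd(z * np.eye(M.shape[0]) - M, compute_uv=False)[-1]

# Coarse grid over part of the complex plane (refine for smoother contours).
xs = np.linspace(-4.0, 4.0, 17)
ys = np.linspace(-3.0, 3.0, 13)
sig = np.array([[smin(B, x + 1j * y) for x in xs] for y in ys])

# Contouring log10(sig) at levels -1, -2, ..., -10 would give the pseudospectral boundaries.
log_sig = np.log10(sig)

# A point at distance 1.5 from the (real) T-spectrum that is still in a small pseudospectrum.
s_off_axis = smin(B, 1.5j)
print(sig.shape, s_off_axis)
```

The last value is much smaller than the distance 1.5 from the test point to the real axis, which is the non-normal effect visible in Figure 1.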

(a) CONTOUR only. (b) SURF and CONTOUR.

Figure 2: Boundaries of pseudospectra $\Lambda_\varepsilon(\mathcal{A}_1)$ for $\varepsilon = 10^{-1}, 10^{-2}, \ldots, 10^{-10}$ under the condition that $A_1 = T_{pz}$ and $A_2 = A_3 = 2T_{pz}$. The T-eigenvalues of tensor $\mathcal{A}_1$ are plotted as crosses '×'. (Left: CONTOUR only; Right: SURF and CONTOUR)

Many T-eigenvalues of the given tensor $\mathcal{A}_0$ are zero. Another tensor $\mathcal{A}_1$, where $A_1 = T_{pz}$ and $A_2 = A_3 = 2T_{pz}$, is considered next. The corresponding matrix $\mathrm{bcirc}(\mathcal{A}_1)$ is nonsymmetric but all its eigenvalues are real. Similar results are given in Figure 2. A case in which the tensor has complex T-eigenvalues is considered next. By setting $A_3 = 2A_2 = 4A_1 = T_{pz}$, we get a new tensor $\mathcal{A}_2$. The results are given in Figure 3.

(a) CONTOUR only. (b) PCOLOR and CONTOUR.

Figure 3: Boundaries of pseudospectra $\Lambda_\varepsilon(\mathcal{A}_2)$ for $\varepsilon = 10^{-1}, 10^{-2}, \ldots, 10^{-10}$ under the condition that $A_3 = 2A_2 = 4A_1 = T_{pz}$. The T-eigenvalues of tensor $\mathcal{A}_2$ are plotted as crosses '×'. (Left: CONTOUR only; Right: PCOLOR and CONTOUR)

Finally, we consider an example where $A_1 = T_{pz}$, $A_2 = 10T_{pz}$ and $A_3 = \mathrm{eye}(N)$, and we denote this tensor by $\mathcal{A}_3$. With similar parameter settings, we obtain the boundaries shown in Figure 4.

(a) CONTOUR only. (b) PCOLOR and CONTOUR.

Figure 4: Boundaries of pseudospectra $\Lambda_\varepsilon(\mathcal{A}_3)$ for $\varepsilon = 10^{-1}, 10^{-2}, \ldots, 10^{-10}$ under the condition that $A_1 = T_{pz}$, $A_2 = 10T_{pz}$ and $A_3 = \mathrm{eye}(N)$. The T-eigenvalues of tensor $\mathcal{A}_3$ are plotted as crosses '×'. (Left: CONTOUR only; Right: PCOLOR and CONTOUR)

5 Multidimensional ordinary differential equation

We consider the first-order multidimensional ordinary differential equation in this section. Along the way, we investigate the relationship between the differential equation and T-eigenvalues. Before studying the differential equation, we first suppose that the coefficient tensor $\mathcal{A} \in \mathbb{C}^{m \times m \times n}$ has square frontal faces. The unknown function $\mathcal{Y}$ is defined as follows,

\[
\mathcal{Y} : [0, \infty) \to \mathbb{C}^{m \times s \times n}. \tag{5.1}
\]

Let the operator $\frac{d}{dt}$ act elementwise. Then, under the above assumptions, we consider the following differential equation [12],

\[
\frac{d\mathcal{Y}}{dt}(t) = \mathcal{A} * \mathcal{Y}(t) \tag{5.2}
\]

with the initial condition $\mathcal{Y}(0) = \mathcal{Y}_0$ being given. By unfolding both sides of (5.2), we get an equivalent matrix form,

\[
\frac{d}{dt}
\begin{bmatrix} Y_1(t) \\ \vdots \\ Y_n(t) \end{bmatrix}
= \mathrm{bcirc}(\mathcal{A})
\begin{bmatrix} Y_1(t) \\ \vdots \\ Y_n(t) \end{bmatrix}. \tag{5.3}
\]

Furthermore, according to the theory of ordinary differential equations and the definition of the tensor T-function [12], the solution of (5.2) can be represented as

\[
\mathcal{Y}(t) = \mathrm{fold}\big(\exp(t\,\mathrm{bcirc}(\mathcal{A})) \cdot \mathrm{unfold}(\mathcal{Y}(0))\big) = \exp(\mathcal{A}t) * \mathcal{Y}_0. \tag{5.4}
\]
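The representation (5.4) can be exercised with a few lines of code. The sketch below assumes NumPy; `bcirc`, `unfold`, `fold`, `tprod`, `tprod_fft`, `expm_eig` and `solve` are hypothetical helper names. The matrix exponential is computed through an eigendecomposition, which assumes that $\mathrm{bcirc}(\mathcal{A})$ is diagonalizable (true for the random example used here, but not in general). A central finite difference is used to check that the computed $\mathcal{Y}(t)$ satisfies (5.2).

```python
import numpy as np

def bcirc(A):
    """Block-circulant matrix whose first block column stacks the frontal slices of A."""
    m, p, n = A.shape
    M = np.zeros((m * n, p * n), dtype=A.dtype)
    for i in range(n):
        for j in range(n):
            M[i * m:(i + 1) * m, j * p:(j + 1) * p] = A[:, :, (i - j) % n]
    return M

def unfold(Y):
    """Stack the frontal slices of an m x s x n tensor into an (mn) x s matrix."""
    return np.concatenate([Y[:, :, k] for k in range(Y.shape[2])], axis=0)

def fold(V, n):
    """Inverse of unfold: split an (mn) x s matrix back into n frontal slices."""
    m = V.shape[0] // n
    return np.stack([V[k * m:(k + 1) * m, :] for k in range(n)], axis=2)

def tprod(A, Y):
    """Tensor-tensor product A * Y via the block-circulant matrix."""
    return fold(bcirc(A) @ unfold(Y), A.shape[2])

def tprod_fft(A, Y):
    """Same product computed slice-wise in the Fourier domain."""
    A_hat, Y_hat = np.fft.fft(A, axis=2), np.fft.fft(Y, axis=2)
    return np.fft.ifft(np.einsum('ijk,jlk->ilk', A_hat, Y_hat), axis=2)

def expm_eig(M):
    """Matrix exponential via eigendecomposition (assumes M is diagonalizable)."""
    w, V = np.linalg.eig(M)
    return (V * np.exp(w)) @ np.linalg.inv(V)

def solve(A, Y0, t):
    """Y(t) = fold(exp(t bcirc(A)) unfold(Y0)), as in (5.4)."""
    return fold(expm_eig(t * bcirc(A)) @ unfold(Y0), A.shape[2])

rng = np.random.default_rng(2)
m, s, n = 3, 2, 4
A = 0.5 * rng.standard_normal((m, m, n))
Y0 = rng.standard_normal((m, s, n))

t, h = 0.8, 1e-5
dY = (solve(A, Y0, t + h) - solve(A, Y0, t - h)) / (2 * h)   # finite-difference derivative
residual = np.max(np.abs(dY - tprod(A, solve(A, Y0, t))))
print(residual)
```

For large problems the Fourier-domain route in `tprod_fft` avoids forming the $mn \times mn$ matrix; the block-circulant route is kept here because it follows (5.3) literally.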

In equation (5.4), the matrix made up of $Y_1, \ldots, Y_n$ is of size $mn \times s$. For simplicity, we only consider the one-column case, that is,

\[
\frac{d\mathcal{X}}{dt}(t) = \mathcal{A} * \mathcal{X}(t), \quad \text{where } \mathcal{X} : [0, \infty) \to \mathbb{C}^{m \times 1 \times n}. \tag{5.5}
\]

Therefore, we have

\[
\frac{d}{dt}
\begin{bmatrix} X_1(t) \\ \vdots \\ X_n(t) \end{bmatrix}
= \mathrm{bcirc}(\mathcal{A})
\begin{bmatrix} X_1(t) \\ \vdots \\ X_n(t) \end{bmatrix}. \tag{5.6}
\]

With the initial value $\mathcal{X}(0)$ and by the theory of differential equations, we find that

\[
\begin{bmatrix} X_1(t) \\ \vdots \\ X_n(t) \end{bmatrix}
= \exp(t\,\mathrm{bcirc}(\mathcal{A}))
\begin{bmatrix} X_1(0) \\ \vdots \\ X_n(0) \end{bmatrix}. \tag{5.7}
\]

Notice that the left-hand side of (5.7) is a vector of length $mn$. It is therefore reasonable to assume that a solution of (5.6) has the form $\xi \exp(\lambda t)$, where the exponent $\lambda$ and the vector $\xi$ are to be determined. Substituting this form into both sides of (5.6), we get

\[
(\mathrm{bcirc}(\mathcal{A}) - \lambda I_{mn})\,\xi = 0 \tag{5.8}
\]

and then

\[
\mathcal{A} * \mathrm{fold}(\xi) = \lambda \cdot \mathrm{fold}(\xi), \quad \text{where } \mathrm{fold}(\xi) \in \mathbb{C}^{m \times 1 \times n},
\]

which shows that, by Definition 3.1, $\lambda$ is a T-eigenvalue of the given tensor $\mathcal{A}$ with T-eigenvector $\mathrm{fold}(\xi)$.

Based on the above analysis, we conclude that the investigation of T-eigenvalues and T-eigenvectors plays an important part in analyzing the solution of the multidimensional ordinary differential equation (5.5). Furthermore, by the theories of matrix functions [5] and tensor T-functions [12, 13], we can see that the T-eigenvalues defined in Definition 3.1 are also crucial and basic ingredients in studying these matrix or tensor functions. More generally, the differential equation (5.5) is a special case of

\[
\frac{d\mathcal{X}}{dt}(t) = \mathcal{A}(t) * \mathcal{X}(t) + \mathcal{G}(t) \tag{5.9}
\]

in which $\mathcal{G}$ has the same size as $\mathcal{X}$. Then, similarly to the above process, we get

\[
\frac{d}{dt}
\begin{bmatrix} X_1(t) \\ \vdots \\ X_n(t) \end{bmatrix}
= \mathrm{bcirc}(\mathcal{A}(t))
\begin{bmatrix} X_1(t) \\ \vdots \\ X_n(t) \end{bmatrix}
+ \begin{bmatrix} G_1(t) \\ \vdots \\ G_n(t) \end{bmatrix}. \tag{5.10}
\]

By the theory of systems of first-order linear differential equations, it is not hard to get the existence and uniqueness theorem for (5.9). We omit its proof here.

Theorem 5.1. Let the $m^2 n$ functions $\mathcal{A}(i, j, k)(t)$ and the $mn$ functions $\mathcal{G}(i, 1, k)(t)$, where $i, j \in [m]$ and $k \in [n]$, be continuous on an interval $(t_s, t_e)$. Suppose that the initial value is $\mathcal{X}(t_0) = \mathcal{X}_0$, in which $t_0$ is a specified value in $(t_s, t_e)$. Then there exists a unique solution $\mathcal{X}(t) = (\mathcal{X}(i, 1, k)(t))$ of system (5.9) that satisfies the initial condition given above. Moreover, the solution exists throughout the assumed interval $(t_s, t_e)$.

Instead of considering the general differential equation (5.9) directly, we mainly pay our attention to

\[
\frac{d\mathcal{X}}{dt}(t) = \mathcal{A}(t) * \mathcal{X}(t) \tag{5.11}
\]

obtained by setting $\mathcal{G}(t) = 0$ in (5.9). Throughout this section, we assume that the continuity and initial conditions in Theorem 5.1 are satisfied. Suppose that $\mathcal{X}_1(t)$ and $\mathcal{X}_2(t)$ are two solutions of equation (5.11); in other words, we have

\[
\frac{d\mathcal{X}_1}{dt}(t) = \mathcal{A}(t) * \mathcal{X}_1(t), \qquad \frac{d\mathcal{X}_2}{dt}(t) = \mathcal{A}(t) * \mathcal{X}_2(t).
\]

Then, as in the matrix case, we can generate more solutions by forming linear combinations of $\mathcal{X}_1(t)$ and $\mathcal{X}_2(t)$, as stated in the following theorem.

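Theorem 5.2. (Principle of Superposition). If the tensor-valued functions $\mathcal{X}_1(t)$ and $\mathcal{X}_2(t)$ are two solutions of equation (5.11), then, for any scalars $c_1$ and $c_2$, the linear combination $c_1\mathcal{X}_1(t) + c_2\mathcal{X}_2(t)$ is also a solution.

Proof. Notice that

\[
\begin{aligned}
\frac{d(c_1\mathcal{X}_1(t) + c_2\mathcal{X}_2(t))}{dt}
&= c_1 \frac{d\mathcal{X}_1}{dt}(t) + c_2 \frac{d\mathcal{X}_2}{dt}(t) \\
&= c_1\,\mathrm{fold}(\mathrm{bcirc}(\mathcal{A}(t)) \cdot \mathrm{unfold}(\mathcal{X}_1(t))) + c_2\,\mathrm{fold}(\mathrm{bcirc}(\mathcal{A}(t)) \cdot \mathrm{unfold}(\mathcal{X}_2(t))) \\
&= \mathrm{fold}\big(\mathrm{bcirc}(\mathcal{A}(t)) \cdot (c_1\,\mathrm{unfold}(\mathcal{X}_1(t)) + c_2\,\mathrm{unfold}(\mathcal{X}_2(t)))\big) \\
&= \mathcal{A}(t) * (c_1\mathcal{X}_1(t) + c_2\mathcal{X}_2(t)),
\end{aligned}
\]

and we get the result.

By repeated application of the above theorem, it is easy to see that each finite linear combination of solutions of equation (5.11) is also a solution. Moreover, by analogy with the matrix case, we can prove that each solution of (5.11) is a linear combination of a fixed set of solutions.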
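The eigen-solutions $\xi \exp(\lambda t)$ from (5.8) and the superposition principle can be illustrated together in the constant-coefficient case. The sketch below assumes NumPy; the random coefficient tensor, the choice of the first two eigenpairs, and the scalars `c1`, `c2` are arbitrary illustrative choices, and everything is carried out on the unfolded system (5.6).

```python
import numpy as np

def bcirc(A):
    """Block-circulant matrix whose first block column stacks the frontal slices of A."""
    m, p, n = A.shape
    M = np.zeros((m * n, p * n), dtype=A.dtype)
    for i in range(n):
        for j in range(n):
            M[i * m:(i + 1) * m, j * p:(j + 1) * p] = A[:, :, (i - j) % n]
    return M

rng = np.random.default_rng(3)
A = rng.standard_normal((3, 3, 4))
M = bcirc(A)

# Each eigenpair (lam, xi) of bcirc(A) gives a T-eigenvalue and an unfolded T-eigenvector.
lam, V = np.linalg.eig(M)
xi1, xi2 = V[:, 0], V[:, 1]
eig_residual = np.linalg.norm(M @ xi1 - lam[0] * xi1)

def x(t, c1, c2):
    """Linear combination of the two eigen-solutions xi * exp(lam * t)."""
    return c1 * xi1 * np.exp(lam[0] * t) + c2 * xi2 * np.exp(lam[1] * t)

def dx(t, c1, c2):
    """Exact time derivative of x(t, c1, c2)."""
    return (c1 * lam[0] * xi1 * np.exp(lam[0] * t)
            + c2 * lam[1] * xi2 * np.exp(lam[1] * t))

t, c1, c2 = 0.7, 2.0, -1.5 + 0.5j
# Superposition: the combination still satisfies x'(t) = bcirc(A) x(t).
ode_residual = np.linalg.norm(dx(t, c1, c2) - M @ x(t, c1, c2))
print(eig_residual, ode_residual)
```

Both residuals are at the level of rounding error, so any linear combination of eigen-solutions is again a solution of the unfolded system.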

Theorem 5.3. Suppose that $\mathcal{X}_1, \ldots, \mathcal{X}_{mn}$ are solutions of (5.11) and that, under the unfold operation, the $mn$ vectors $\mathrm{unfold}(\mathcal{X}_1), \ldots, \mathrm{unfold}(\mathcal{X}_{mn})$ are linearly independent for each $t \in (t_s, t_e)$. Then each solution of (5.11) can be expressed as a linear combination of the above $mn$ solutions in exactly one way, that is,

\[
\mathcal{X}(t) = c_1\mathcal{X}_1(t) + \cdots + c_{mn}\mathcal{X}_{mn}(t). \tag{5.12}
\]

Proof. We first suppose that $t_0 \in (t_s, t_e)$ and let $\mathcal{X}_0 = \mathcal{X}(t_0)$. If we can prove that there exist scalars $c_1, \ldots, c_{mn}$ such that (5.12) holds for $t = t_0$,

\[
c_1\mathcal{X}_1(t_0) + \cdots + c_{mn}\mathcal{X}_{mn}(t_0) = \mathcal{X}_0, \tag{5.13}
\]

then by Theorem 5.1 we get the result. By unfolding both sides of (5.13), we get $mn$ equations in scalar form, whose coefficient matrix at $t = t_0$ can be denoted as

\[
M[\mathcal{X}_1, \ldots, \mathcal{X}_{mn}] =
\begin{bmatrix}
(X_1)_1(t) & \cdots & (X_{mn})_1(t) \\
\vdots & \ddots & \vdots \\
(X_1)_n(t) & \cdots & (X_{mn})_n(t)
\end{bmatrix}
= [\mathrm{unfold}(\mathcal{X}_1), \ldots, \mathrm{unfold}(\mathcal{X}_{mn})], \tag{5.14}
\]

which is of size $mn \times mn$. By the assumption of linear independence, the existence of $c_1, \ldots, c_{mn}$ is guaranteed, and furthermore in only one way.

From the above analysis, the columns of the matrix $M$ are linearly independent at a given $t$ if and only if $\det(M) \neq 0$. We call this determinant, denoted by $W[\mathcal{X}_1, \ldots, \mathcal{X}_{mn}]$, the Wronskian of the given $mn$ solutions. Therefore we conclude that if the Wronskian of $\mathcal{X}_1, \ldots, \mathcal{X}_{mn}$ is nonzero, then (5.12) holds. For equation (5.11), a set of solutions $\{\mathcal{X}_1, \ldots, \mathcal{X}_{mn}\}$ with nonzero Wronskian $W[\mathcal{X}_1, \ldots, \mathcal{X}_{mn}]$ at each time $t \in [t_s, t_e]$ is said to be a fundamental set of solutions on $[t_s, t_e]$. If the scalars $c_1, \ldots, c_{mn}$ can be chosen arbitrarily in (5.12), then we call it a general solution. By the uniqueness property of Theorem 5.1, the Wronskian has only two possible behaviors on the interval $[t_s, t_e]$: it is either identically zero or always nonzero.

Theorem 5.4. Let $\{\mathcal{X}_1, \ldots, \mathcal{X}_{mn}\}$ be a set of solutions of (5.11) on the interval $[t_s, t_e]$. If the Wronskian $W[\mathcal{X}_1, \ldots, \mathcal{X}_{mn}]$ is nonzero (zero) at some $t_0 \in [t_s, t_e]$, then it is nonzero (zero) on the whole interval.

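Proof. Since $\mathrm{unfold}(\mathcal{X}_k)$ is a vector for each $k = 1, \ldots, mn$, it is convenient to rewrite the Wronskian as

\[
W[\mathcal{X}_1, \ldots, \mathcal{X}_{mn}] = \left|\mathrm{unfold}(\mathcal{X}_1), \ldots, \mathrm{unfold}(\mathcal{X}_{mn})\right| =
\begin{vmatrix}
(X_1)_1 & \cdots & (X_{mn})_1 \\
\vdots & \ddots & \vdots \\
(X_1)_{mn} & \cdots & (X_{mn})_{mn}
\end{vmatrix}. \tag{5.15}
\]

Let $a_{ii}(t)$ denote the $i$-th diagonal entry of $\mathrm{bcirc}(\mathcal{A}(t))$. Since each column satisfies $\frac{d}{dt}\mathrm{unfold}(\mathcal{X}_k) = \mathrm{bcirc}(\mathcal{A}(t))\,\mathrm{unfold}(\mathcal{X}_k)$, the properties of determinants give

\[
\begin{aligned}
\frac{dW}{dt}
&= \begin{vmatrix}
\frac{d(X_1)_1}{dt} & \cdots & \frac{d(X_{mn})_1}{dt} \\
\vdots & \ddots & \vdots \\
(X_1)_{mn} & \cdots & (X_{mn})_{mn}
\end{vmatrix}
+ \cdots +
\begin{vmatrix}
(X_1)_1 & \cdots & (X_{mn})_1 \\
\vdots & \ddots & \vdots \\
\frac{d(X_1)_{mn}}{dt} & \cdots & \frac{d(X_{mn})_{mn}}{dt}
\end{vmatrix} \\
&= a_{11}W + a_{22}W + \cdots + a_{mn,mn}W \\
&= (a_{11} + a_{22} + \cdots + a_{mn,mn})\,W.
\end{aligned} \tag{5.16}
\]

Hence our conclusion follows since

\[
W(t) = c \exp\left(\int \left[a_{11}(t) + a_{22}(t) + \cdots + a_{mn,mn}(t)\right] dt\right), \tag{5.17}
\]

in which the constant $c$ is arbitrary.

The above theorem provides an easy criterion for distinguishing a fundamental set of solutions among all sets that contain $mn$ solutions of (5.11): it suffices to compute the value of their Wronskian at any single point in $[t_s, t_e]$. Moreover, by this result, we can prove that the system we considered always has a fundamental set of solutions.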
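The dichotomy of Theorem 5.4 can be observed numerically through the standard Liouville (Abel) formula for linear systems, which for a constant coefficient tensor reads $W(t) = W(0)\exp(t\,\mathrm{tr}(\mathrm{bcirc}(\mathcal{A})))$. The sketch below assumes NumPy, a constant $\mathcal{A}$, and a diagonalizable $\mathrm{bcirc}(\mathcal{A})$ so that the exponential can be formed from an eigendecomposition; the helper names and random data are illustrative assumptions.

```python
import numpy as np

def bcirc(A):
    """Block-circulant matrix whose first block column stacks the frontal slices of A."""
    m, p, n = A.shape
    M = np.zeros((m * n, p * n), dtype=A.dtype)
    for i in range(n):
        for j in range(n):
            M[i * m:(i + 1) * m, j * p:(j + 1) * p] = A[:, :, (i - j) % n]
    return M

def expm_eig(M):
    """Matrix exponential via eigendecomposition (assumes M is diagonalizable)."""
    w, V = np.linalg.eig(M)
    return (V * np.exp(w)) @ np.linalg.inv(V)

rng = np.random.default_rng(4)
m, n = 2, 3
A = 0.5 * rng.standard_normal((m, m, n))
M = bcirc(A)

# Columns of Phi0 play the role of unfold(X_k(0)) for mn solutions X_1, ..., X_mn.
Phi0 = rng.standard_normal((m * n, m * n))

def wronskian(t):
    """Determinant of [unfold(X_1(t)), ..., unfold(X_mn(t))] with X_k(t) = exp(tM) X_k(0)."""
    return np.linalg.det(expm_eig(t * M) @ Phi0)

t = 0.9
W0, Wt = wronskian(0.0), wronskian(t)
liouville = W0 * np.exp(t * np.trace(M))   # Liouville's formula for constant coefficients
print(W0, Wt, liouville)
```

Since the exponential factor never vanishes, `Wt` is nonzero exactly when `W0` is, which is the statement of the theorem in this special case. The trace of $\mathrm{bcirc}(\mathcal{A})$ equals $n$ times the trace of the first frontal slice.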

Theorem 5.5. Let $\{\mathcal{X}_1, \ldots, \mathcal{X}_{mn}\}$ be a set of solutions of (5.11) on the interval $[t_s, t_e]$ that satisfies the following initial condition,

\[
\mathcal{X}_i(t_0) = e_i = (0, \ldots, 1, \ldots, 0)^{\top}, \quad i = 1, \ldots, mn, \quad t_0 \in [t_s, t_e].
\]

Then they form a fundamental set of solutions.

Now we give a result involving complex-valued solutions of equation (5.11) when its coefficients are all real-valued.

Theorem 5.6. Let the coefficient $\mathcal{A}(t)$ in (5.11) be a real-valued continuous function. If $\mathcal{X}_1(t) + i\mathcal{X}_2(t)$ is a complex-valued solution, then both $\mathcal{X}_1(t)$ and $\mathcal{X}_2(t)$ are real-valued solutions of (5.11).

Proof. Substituting $\mathcal{X}_1(t) + i\mathcal{X}_2(t)$ for $\mathcal{X}(t)$ in (5.11), noticing that $\mathcal{A}(t)$ is real-valued, and comparing real and imaginary parts gives the result.
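Theorem 5.6 can be illustrated in the constant-coefficient case on the unfolded system. The sketch below assumes NumPy and a constant real tensor; the helper `expm_series` is a simple scaled Taylor series chosen only so that the exponential of a real matrix stays real, and it is not meant as a robust general-purpose routine. The real and imaginary parts of a complex solution are checked against the equation by central finite differences.

```python
import numpy as np

def bcirc(A):
    """Block-circulant matrix whose first block column stacks the frontal slices of A."""
    m, p, n = A.shape
    M = np.zeros((m * n, p * n), dtype=A.dtype)
    for i in range(n):
        for j in range(n):
            M[i * m:(i + 1) * m, j * p:(j + 1) * p] = A[:, :, (i - j) % n]
    return M

def expm_series(M, squarings=10, terms=20):
    """Matrix exponential by a scaled Taylor series followed by repeated squaring."""
    X = M / 2.0 ** squarings
    E = np.eye(M.shape[0])
    term = np.eye(M.shape[0])
    for j in range(1, terms + 1):
        term = term @ X / j
        E = E + term
    for _ in range(squarings):
        E = E @ E
    return E

rng = np.random.default_rng(5)
A = 0.5 * rng.standard_normal((2, 2, 3))        # real-valued coefficient tensor
M = bcirc(A)
z0 = rng.standard_normal(6) + 1j * rng.standard_normal(6)   # complex initial value

def z(t):
    """Complex-valued solution of the unfolded system z' = M z."""
    return expm_series(t * M) @ z0

t, h = 0.6, 1e-5
dz = (z(t + h) - z(t - h)) / (2 * h)
# Real and imaginary parts each satisfy the same real system.
res_real = np.max(np.abs(dz.real - M @ z(t).real))
res_imag = np.max(np.abs(dz.imag - M @ z(t).imag))
print(res_real, res_imag)
```

Both residuals are at the level of the finite-difference error, consistent with the real and imaginary parts being solutions in their own right.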

6 Conclusions and Remarks

Based on the well-known tensor-tensor multiplication, the generalized T-eigenvalue and T-eigenvector are first defined in this paper, and then we focus our attention on the perturbation theory of T-eigenvalues. On the one hand, we extend many classical matrix perturbation results to the third-order tensor case; conclusions such as the Bauer-Fike theorem, the Kahan theorem and the Gershgorin circle theorem are presented. On the other hand, the ε-pseudospectra of tensors are studied. Many equivalent definitions are given, and then their properties are considered. Some numerical experiments are also presented. In the last section, we investigate the close relationship between T-eigenvalues and multidimensional ordinary differential equations. Some properties of the differential equations are also presented.

However, there are still many perturbation results not considered. Giving an estimate of the T-eigenvalues of a tensor $\mathcal{A}$ is also meaningful. Future research directions include the theory of multi-linear time invariant systems specified for third-order tensors under tensor-tensor multiplication, the important role that T-eigenvalues play in such systems, and algorithms for computation. This system is given by

\[
\begin{cases}
\dfrac{d\mathcal{X}}{dt}(t) = \mathcal{A} * \mathcal{X}(t) + \mathcal{B} * \mathcal{U}(t), \\[4pt]
\mathcal{Y}(t) = \mathcal{C} * \mathcal{X}(t),
\end{cases} \tag{6.1}
\]

where $\mathcal{X}(t) \in \mathbb{R}^{m \times 1 \times n}$ is the latent state space tensor, $\mathcal{U}(t) \in \mathbb{R}^{\ell \times 1 \times n}$ is the control tensor and $\mathcal{Y}(t) \in \mathbb{R}^{s \times 1 \times n}$ is the output tensor; $\mathcal{A} \in \mathbb{R}^{m \times m \times n}$, $\mathcal{B} \in \mathbb{R}^{m \times \ell \times n}$ and $\mathcal{C} \in \mathbb{R}^{s \times m \times n}$ are real-valued tensors. Notions such as stability, reachability and observability of (6.1) can also be generalized from classical control theory. Moreover, computational frameworks and application-driven studies will be given high priority in the future. Besides, the of third-order tensors and algorithms for computation are also under consideration.

Acknowledgments

The authors would like to express their appreciation to Prof. Yimin Wei for many useful discussions and valuable comments.

References

[1] F. L. Bauer and C. T. Fike, Norms and exclusion theorems, Numerische Mathematik, 2 (1960), pp. 137–141.

[2] Z. Cao and P. Xie, On some tensor inequalities based on the t-product, 2021. arXiv.

[3] K.-W. E. Chu, Generalization of the Bauer-Fike theorem, Numerische Mathematik, 49 (1986), pp. 685–691.

[4] G. H. Golub and C. F. Van Loan, Matrix computations, The Johns Hopkins University Press, Baltimore, 2013.

[5] N. J. Higham, Functions of matrices: Theory and computation, SIAM, Philadelphia, 2008.

[6] W.-H. Liu and X.-Q. Jin, A study on T-eigenvalues of third-order tensors, Linear Algebra Appl., 612 (2021), pp. 357–374.

[7] T. Kato, Perturbation theory for linear operators, vol. 132, Springer Science & Business Media, 2013.

[8] M. E. Kilmer, K. Braman, N. Hao, and R. C. Hoover, Third-order tensors as operators on matrices: A theoretical and computational framework with applications in imaging, SIAM Journal on Matrix Analysis and Applications, 34 (2013), pp. 148–172.

[9] M. E. Kilmer and C. D. Martin, Factorization strategies for third-order tensors, Linear Algebra Appl., 435 (2011), pp. 641–658.

[10] M. E. Kilmer, C. D. Martin, and L. Perrone, A third-order generalization of the matrix svd as a product of third-order tensors, Tufts University, Department of Computer Science, Tech. Rep. TR-2008-4, (2008).

[11] Y. Liu, L. Chen, and C. Zhu, Improved robust tensor principal component analysis via low-rank core matrix, IEEE Journal of Selected Topics in Signal Processing, 12 (2018), pp. 1378–1389.

[12] K. Lund, The tensor t-function: a definition for functions of third-order tensors, Numer. Linear Algebra Appl., 27 (2020), pp. e2288, 17.

[13] Y. Miao, L. Qi, and Y. Wei, Generalized tensor function via the tensor singular value decomposition based on the t-product, Linear Algebra and its Applications, 590 (2020), pp. 258–303.

[14] Y. Miao, L. Qi, and Y. Wei, T-Jordan canonical form and T-Drazin inverse based on the T-product, Communications on Applied Mathematics and Computation, (2021), pp. 201–220.

[15] E. Newman and M. E. Kilmer, Nonnegative tensor patch dictionary approaches for image compression and deblurring applications, SIAM Journal on Imaging Sciences, 13 (2020), pp. 1084–1112.

[16] L. Rayleigh, The theory of sound. Vol. I, London, 1927.

[17] F. Rellich and J. Berkowitz, Perturbation theory of eigenvalue problems, CRC Press, 1969.

[18] E. Schrödinger, Quantisierung als Eigenwertproblem, Annalen Phys., 386 (1926), pp. 109–139.

[19] X. Shi and Y. Wei, A sharp version of Bauer–Fike's theorem, Journal of Computational and Applied Mathematics, 236 (2012), pp. 3218–3227.

[20] J. Sun, Perturbation Analysis of Matrix, Science Press, Beijing, 1987.

[21] L. N. Trefethen and M. Embree, Spectra and pseudospectra: the behavior of nonnormal matrices and operators, Princeton University Press, 2005.

[22] T. Wu, Graph regularized low-rank representation for submodule clustering, Pattern Recognition, 100 (2020), p. 107145.

[23] X.-L. Zhao, W.-H. Xu, T.-X. Jiang, Y. Wang, and M. K. Ng, Deep plug-and-play prior for low-rank tensor completion, Neurocomputing, 400 (2020), pp. 137–149.
