Filomat 33:13 (2019), 4261–4280, https://doi.org/10.2298/FIL1913261D. Published by Faculty of Sciences and Mathematics, University of Niš, Serbia. Available at: http://www.pmf.ni.ac.rs/filomat

Classification and Approximation of Solutions to Sylvester Equation

Bogdan D. Djordjević^a, Nebojša Č. Dinčić^b

^a Department of Mathematics, Mathematical Institute of the Serbian Academy of Sciences and Arts, Belgrade, Republic of Serbia
^b Department of Mathematics, Faculty of Sciences and Mathematics, University of Niš, P.O. Box 224, 18000 Niš, Republic of Serbia

Abstract. In this paper we solve the Sylvester matrix equation in the case when it has infinitely many solutions, and we classify those solutions. If the conditions for their existence are not met, we provide a way to approximate particular solutions by the least-squares minimal-norm method.

1. Introduction and preliminaries

For given vector spaces $V_1$ and $V_2$, let $A \in L(V_2)$, $B \in L(V_1)$ and $C \in L(V_1, V_2)$ be linear operators. Equations of the form

$$AX - XB = C \qquad (1)$$

with solution $X \in L(V_1, V_2)$ are called Sylvester equations, Sylvester–Rosenblum equations or algebraic Riccati equations. Such equations have various applications in vast fields of mathematics, physics, computer science and engineering (see e.g. [5], [19] and references therein). The fundamental results, established by Sylvester and Rosenblum themselves, are nowadays the starting point for solving contemporary problems where these equations occur. These results are:

Theorem 1.1. [23] (Sylvester matrix equation) Let $A$, $B$ and $C$ be matrices. The equation $AX - XB = C$ has a unique solution $X$ iff $\sigma(A) \cap \sigma(B) = \emptyset$.

Theorem 1.2. [22] (Rosenblum operator equation) Let $A$, $B$ and $C$ be bounded linear operators. The equation $AX - XB = C$ has a unique solution $X$ if $\sigma(A) \cap \sigma(B) = \emptyset$.

Equations with unique solutions have been extensively studied so far. There are numerous results regarding this case, some of them theoretical (e.g. Lyapunov stability criteria and spectral operators), which can be found in [2], [5] or [9], and some of them computational (the matrix sign function, factorizations of matrices and operators, various iterative methods etc.). It should be mentioned that matrix eq. (1) with a unique solution $X$ has been solved numerically (among others) in [4], [6], [13], [14], [18] and [21].
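For orientation only (this snippet is not part of the original paper): in the disjoint-spectra case of Theorems 1.1 and 1.2, standard solvers apply directly. A minimal sketch using SciPy's Bartels–Stewart-based routine:

```python
import numpy as np
from scipy.linalg import solve_sylvester

# A and B with disjoint spectra: sigma(A) = {1, 2}, sigma(B) = {-1, -3}.
A = np.array([[1.0, 1.0],
              [0.0, 2.0]])
B = np.array([[-1.0, 0.0],
              [ 2.0, -3.0]])
C = np.array([[1.0, 2.0],
              [3.0, 4.0]])

# solve_sylvester solves A X + X B = Q, so pass -B to get A X - X B = C.
X = solve_sylvester(A, -B, C)
print(np.allclose(A @ X - X @ B, C))  # True: the unique solution
```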

2010 Mathematics Subject Classification. Primary 47A62; 15A18; Secondary 65F15
Keywords. Sylvester equation, matrix spectrum, matrix approximations, eigenvalues and eigenvectors, Frobenius norm, least-squares solution, minimal-norm solution
Received: 13 February 2019; Accepted: 13 May 2019
Communicated by Dijana Mosić
Research supported by Ministry of Science, Republic of Serbia, Grant No. 174007
Email addresses: [email protected] (Bogdan D. Djordjević), [email protected] (Nebojša Č. Dinčić)

The case where $A$, $B$ and $C$ are unbounded operators but the solution $X$ is unique and bounded has been studied in [16] and [20].

Solvability of eq. (1) for matrices, without the uniqueness requirement, has been studied in [7] and partially in [18]. The main results in [7] are based on the idea that solutions $X$ can be written as parametric matrices, where the number of parameters at hand depends on the dimensions of the corresponding eigenspaces of $A$ and $B$.

The case where $A$, $B$ and $C$ are unbounded, with infinitely many unbounded solutions $X$, has been studied in [8]. That paper introduces new solutions (called weak solutions), which are defined only on the corresponding eigenspaces of $A$ and $B$.

This paper concerns the case when $A$ and $B$ are matrices whose spectra intersect, while $C$ is a rectangular matrix of appropriate dimensions. We obtain sufficient conditions for the existence of infinitely many solutions and provide a way to classify them. If the conditions for their existence are not met, we give a way of approximating particular solutions. This study relies on the eigenspace analysis conducted in [7] and [8].

We assume $V_1$ and $V_2$ to be finite-dimensional Hilbert spaces over the same scalar field ($\mathbb{C}$ or $\mathbb{R}$), while $A \in \mathcal{B}(V_2)$, $B \in \mathcal{B}(V_1)$ and $C \in \mathcal{B}(V_1, V_2)$ are the operators which correspond to the afore-mentioned matrices. Further, $\mathcal{N}(L)$ and $\mathcal{R}(L)$ denote the null-space and the range of a given operator $L$. Recall that every finite-dimensional subspace $W$ of a Hilbert space $V$ is closed. Consequently, there exists an orthogonal projector from $V$ onto $W$, which will be denoted by $P_W$.

2. Existence and classification of solutions

Throughout this paper, we assume that $A$ and $B$ share $s$ common eigenvalues and denote that set by $\sigma$:

$$\{\lambda_1, \dots, \lambda_s\} =: \sigma = \sigma(A) \cap \sigma(B).$$

For more elegant notation, we introduce $E_B^k = \mathcal{N}(B - \lambda_k I)$ and $E_A^k = \mathcal{N}(A - \lambda_k I)$ whenever $\lambda_k \in \sigma$. Different eigenvalues generate mutually orthogonal eigenvectors, so the spaces $E_B^k$ form an orthogonal sum. Put $E_B := \sum_{k=1}^{s} E_B^k$. It is a closed subspace of $V_1$ and there exists $E_B^\perp$ such that $V_1 = E_B \oplus E_B^\perp$. Take $B = B_E \oplus B_1$ with respect to that decomposition and denote $C_1 = C P_{E_B^\perp}$.
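Numerically, this setup can be sketched as follows (illustrative only; matching eigenvalues by a floating-point tolerance and the rounding used for deduplication are heuristic assumptions, not part of the paper):

```python
import numpy as np
from scipy.linalg import null_space

def common_eigendata(A, B, tol=1e-8):
    """Shared eigenvalues sigma = sigma(A) ∩ sigma(B) (matched up to tol),
    together with orthonormal bases of the eigenspaces E_A^k and E_B^k."""
    eigA, eigB = np.linalg.eigvals(A), np.linalg.eigvals(B)
    shared = [lam for lam in np.unique(eigA.round(8))
              if np.min(np.abs(eigB - lam)) < tol]
    # null_space returns an orthonormal basis of N(M - lambda I)
    EA = [null_space(A - lam * np.eye(A.shape[0]), rcond=tol) for lam in shared]
    EB = [null_space(B - lam * np.eye(B.shape[0]), rcond=tol) for lam in shared]
    return shared, EA, EB
```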

Proposition 2.1. Let $V$ be a Hilbert space and $L \in \mathcal{B}(V)$. If $W$ is an $L$-invariant subspace of $V$, then $W^\perp$ is an $L^*$-invariant subspace of $V$.

Theorem 2.1. (Existence of solutions) For every $k \in \{1, \dots, s\}$, let $\lambda_k$, $E_A^k$ and $E_B^k$ be provided as in the previous paragraph. If

$$\mathcal{N}(C_1)^\perp = \mathcal{R}(B_1) \quad \text{and} \quad C E_B^k \subset \mathcal{R}(A - \lambda_k I), \qquad (2)$$

then there exist infinitely many solutions $X$ to the equation (1).

Proof. For every $1 \le k \le s$, let $E_B^k$, $E_B$, $E_B^\perp$, $B_E$ and $B_1$ be provided as in the previous paragraph. Note that $\mathcal{N}(C_1)^\perp = \mathcal{R}(C_1^*)$, where $C_1^* \in \mathcal{B}(V_2, E_B^\perp)$.

Step 1: solutions on $E_B^\perp$.

We first conduct the analysis on $E_B^\perp$. The space $E_B$ is a $BP_{E_B}$-invariant subspace of $V_1$ and Proposition 2.1 yields that $E_B^\perp$ is a $(BP_{E_B})^*$-invariant subspace of $V_1$, so without loss of generality we can view $B_1^*$ as $B_1^* : E_B^\perp \to E_B^\perp$. Since $\sigma(B_E) = \{\lambda_1, \dots, \lambda_s\}$, it follows that

$$\sigma(B_1^*) \subseteq \{0\} \cup \sigma(B^*) \setminus \{\bar{\lambda}_1, \dots, \bar{\lambda}_s\}.$$

Case 1. Assume that $\sigma(B_1^*) \cap \sigma(A^*) = \emptyset$. Then there exists a unique $X_1^* \in \mathcal{B}(V_2, E_B^\perp)$ such that

$$X_1^* A^* - B_1^* X_1^* = C_1^*,$$

that is, there exists a unique $X_1 \in \mathcal{B}(E_B^\perp, V_2)$ such that

$$A X_1 - X_1 B_1 = C_1$$

holds.

Case 2. Assume that $\sigma(A^*) \cap \sigma(B_1^*) \ne \emptyset$. It follows that $\sigma(A^*) \cap \sigma(B_1^*) = \{0\}$. But then $A^*$ cannot be nilpotent. Truly, if $\sigma(A^*) = \{0\} = \sigma(A)$, then by assumption $\sigma(B) \cap \sigma(A) \ne \emptyset$, therefore $0 \in \sigma(B)$, that is, $0 \in \sigma$. If $u \in \mathcal{N}(B_1)$, it follows that $B_1 u = 0$ and $u \in E_B^\perp$, but then $Bu = B_1 u = 0$, so $u \in \mathcal{N}(B) \subset E_B$, therefore $u \in E_B \cap E_B^\perp = \{0\}$. Hence a contradiction, implying that $A^*$ is not nilpotent, but rather has finite ascent, $\operatorname{asc}(A^*) = m \ge 1$, where $\mathcal{N}((A^*)^m)$ is a proper subspace of $V_2$.

Now observe $B_1^* : E_B^\perp \to E_B^\perp$, which is not invertible by assumption. Take an arbitrary operator $Z_0^* \in \mathcal{B}(\mathcal{N}(A^*), \mathcal{N}(B_1^*))$. Then for every $d \in \mathcal{N}(A^*)$ there exists (by (2)) a unique $u \in \mathcal{N}(B_1^*)^\perp$ such that

$$B_1^* u = C_1^* d.$$

Define $X_1^*(Z_0^*)$ on $\mathcal{N}(A^*)$ as $X_1^*(Z_0^*) d := Z_0^* d + u$. Since $\operatorname{asc}(A^*) = m$, the following recursive construction applies. Assume that $m = 1$. Precisely, decompose $V_2 = \mathcal{N}(A^*) \oplus \mathcal{N}(A^*)^\perp$ and $A^* = 0 \oplus A_1^*$. Then $A_1^*$ is injective from $\mathcal{N}(A^*)^\perp$ to $\mathcal{N}(A^*)^\perp$ and $X_1^*$ can be defined on $\mathcal{N}(A^*)^\perp$ as the restriction of $X_1^*$ from Case 1.

Assume that $m > 1$. Then proceed to decompose $\mathcal{N}(A^*)^\perp = \mathcal{N}(A_1^*) \oplus \mathcal{N}(A_1^*)^\perp$ and define $X_1^*$ on $\mathcal{N}(A_1^*)$ as $X_1^*(Z_1^*) d := Z_1^* d + u$, where $Z_1^* \in \mathcal{B}(\mathcal{N}(A_1^*), \mathcal{N}(B_1^*))$ is an arbitrary operator and

$$B_1^* u = C_1^* d.$$

If $A_1^*$ is injective on $\mathcal{N}(A_1^*)^\perp$, i.e. if $m = 2$, then $X_1^*$ can be defined on $\mathcal{N}(A_1^*)^\perp$ as the restriction of $X_1^*$ from Case 1. If not, then proceed to decompose $\mathcal{N}(A_1^*)^\perp = \mathcal{N}(A_2^*) \oplus \mathcal{N}(A_2^*)^\perp$ and so on. Eventually, one gets to iteration no. $m$, in a manner that

$$V_2 = \mathcal{N}(A^*) \oplus \mathcal{N}(A_1^*) \oplus \mathcal{N}(A_2^*) \oplus \dots \oplus \mathcal{N}(A_m^*) \oplus \mathcal{N}(A_m^*)^\perp$$

and $A_m^* : \mathcal{N}(A_m^*)^\perp \to \mathcal{N}(A_m^*)^\perp$ is injective. Then $\sigma(B_1^*) \cap \sigma(A_m^*) = \emptyset$, ergo define $X_1^*$ on $\mathcal{N}(A_m^*)^\perp$ as the restriction of $X_1^*$ from Case 1 to $\mathcal{N}(A_m^*)^\perp$. Further, for $0 \le n \le m$, let $Z_n^* \in \mathcal{B}(\mathcal{N}(A_n^*), \mathcal{N}(B_1^*))$ be arbitrary operators. Then define $X_1^*$ on $\mathcal{N}(A_n^*)$ as

$$X_1^*(Z_n^*) d := Z_n^* d + u,$$

where once again $u \in \mathcal{N}(B_1^*)^\perp$ is the unique element such that $B_1^* u = C_1^* d$. Equivalently, there exists $X_1 \in \mathcal{B}(E_B^\perp, V_2)$ such that

$$A X_1 - X_1 B_1 = C_1,$$

where

$$X_1 = X_1(Z_0^*, Z_1^*, \dots, Z_m^*).$$

The condition $\mathcal{R}(C_1^*) = \mathcal{N}(B_1^*)^\perp = \mathcal{R}(B_1)$ yields that $X_1$ is well defined on the entire $E_B^\perp$.

Step 2: solutions on $E_B$.

We now conduct our analysis on $E_B$. Define $E_A = \sum_{k=1}^{s} E_A^k$ and split $V_2$ into the orthogonal sum $V_2 = E_A \oplus E_A^\perp$. Decompose $A = A_E \oplus A_1$ with respect to that sum. Then $A_1$ is injective on $E_A^\perp$ and $A_1 v = A v$ for every $v \in E_A^\perp$. For every $k \in \{1, \dots, s\}$ let $N_k \in \mathcal{B}(E_B^k, E_A^k)$ be arbitrary. For every $u \in E_B^k$, by assumption (2), there exists a unique $d(u) \in (E_A^k)^\perp$ such that

$$(A - \lambda_k I) d(u) = C u.$$

Define

$$X_E^k : u \mapsto N_k u + d(u), \quad u \in E_B^k.$$

Then $X_E^k : E_B^k \to E_A^k \oplus P_{(E_A^k)^\perp}(A_1 - \lambda_k I)^{-1} C E_B^k$ defines a linear map. What is left is to check whether $X_E := \sum_{k=1}^{s} X_E^k$ is a solution to the equation

$$A X_E - X_E B_E = C P_{E_B}$$

restricted to $E_B$. However, this is directly verifiable. For any $u \in E_B$ there exist unique $\alpha_1, \dots, \alpha_s \in \mathbb{C}$ ($\mathbb{R}$) and unique $u_k \in E_B^k$, $1 \le k \le s$, such that $u = \sum_k \alpha_k u_k$. Then

$$(A X_E - X_E B) u = A \sum_{k=1}^{s} \alpha_k X_E^k u_k - \sum_{k=1}^{s} \lambda_k \alpha_k X_E^k u_k = \sum_{k=1}^{s} (\alpha_k (A - \lambda_k I))(N_k u_k + d(u_k)) = \sum_{k=1}^{s} \alpha_k C u_k = C u.$$

It follows that

$$X = \begin{pmatrix} X_E & 0 \\ 0 & X_1 \end{pmatrix} \qquad (3)$$

is a solution to eq. (1).

Remark. Notice that in the proof of Theorem 2.1, Case 2 only emerges if $\sigma(A) \cap \sigma(B_1) = \{0\}$ while $0 \notin \sigma(B)$. Because of the special circumstances under which this problem takes place, this situation will be analyzed separately from the standard problem (see Corollary 2.3). Theorem 2.1 naturally inquires answers to the following questions:

Question 1. Is every solution to the equation (1) of the form (3)?

Question 2. Under which conditions is the solution to (1) unique?

Both of these questions have affirmative answers, which is justified by the analysis of the following eigen-problem associated with the given Sylvester equation:

Assume that $0 \in \sigma = \sigma(A) \cap \sigma(B)$ and let $N_\lambda \in \mathcal{B}(E_B^\lambda, E_A^\lambda)$ be arbitrary for every $\lambda \in \sigma$. Define $N_\sigma := \oplus_{\lambda \in \sigma} N_\lambda$. Find a solution $X$ to the Sylvester equation such that the following eigen-problem is uniquely solved:

$$\begin{cases} A X - X B = C, \\ X u_\lambda := P_{(E_A^\lambda)^\perp}(A - \lambda I)^{-1} C u_\lambda + N_\lambda u_\lambda, \quad u_\lambda \in E_B^\lambda, \ \lambda \in \sigma \cup \{0\}. \end{cases} \qquad (4)$$

Theorem 2.2. (Uniqueness of the solution to the eigen-problem) With respect to the previous notation, assume that $0 \in \sigma$.
1) If condition (2) holds for every shared eigenvalue $\lambda \in \sigma$, then the solution $X$ depends only on the choice of the operator $N_\sigma$; that is, for fixed $N_\sigma$ there exists a unique solution $X$ such that (4) holds.
2) Conversely, for every solution $X$ to (1) and for every shared eigenvalue $\lambda$ of the matrices $A$ and $B$, there

exists a unique quotient class $(A - \lambda I)^{-1} C(\mathcal{N}(B - \lambda I)) \oplus \mathcal{N}(A - \lambda I)$ such that $X$ is the unique solution to the quotient eigen-problem

$$\begin{cases} A X - X B = C, \\ X : \mathcal{N}(B - \lambda I) \to (A - \lambda I)^{-1} C(\mathcal{N}(B - \lambda I)) \oplus \mathcal{N}(A - \lambda I). \end{cases} \qquad (5)$$

Proof. Recall the notation from the proof of Theorem 2.1.
1) The first statement of the theorem is proved directly. Namely, take $V_1 = E_B \oplus E_B^\perp$, $B = B_E \oplus B_1$, $V_2 = E_A \oplus E_A^\perp$, $A = A_E \oplus A_1$ as in Theorem 2.1. Then there exists $X = X_E \oplus X_1$, which is a solution to (1). By construction, since $\sigma(B_1) \cap \sigma(A) = \emptyset$, Case 1 applies and $X_1$ is uniquely determined in $\mathcal{B}(E_B^\perp, V_2)$, while $X_E$ is uniquely determined in the class $\mathcal{B}(E_B/E_B^\lambda, V_2/E_A^\lambda)$ for every $\lambda \in \sigma$. Varying $\lambda$ in $\sigma$ completes the proof.
2) Conversely, let $X$ be a solution to eq. (1). Let $\lambda$ be one of the shared eigenvalues of $A$ and $B$ and fix $u$ as a corresponding eigenvector of $B$. Then $X B u = \lambda X u$. Hence $A X u - X B u = (A - \lambda I) X u = C u$. Split $X u$ into the orthogonal sum $X u = v_1 + v_2$, where $v_1 \in \mathcal{N}(A - \lambda I)$ and $v_2 \in (\mathcal{N}(A - \lambda I))^\perp$. Then $v_2$ is the sought expression $P_{\mathcal{N}(A - \lambda I)^\perp}(A - \lambda I)^{-1} C u$ and $X u \in v_2 + \mathcal{N}(A - \lambda I)$. Condition (2) follows immediately. Repeating the same procedure for every shared eigenvalue of $A$ and $B$ completes the proof.

Corollary 2.1. (Number of solutions) Let $\Sigma$ be the set of all $N_\sigma$ introduced in the eigen-problem associated with the given Sylvester equation (1), that is,

$$\Sigma = \left\{ N_\sigma : N_\sigma = \oplus_{\lambda \in \sigma} N_\lambda, \ N_\lambda \in \mathcal{B}(E_B^\lambda, E_A^\lambda), \ \lambda \in \sigma(A) \cap \sigma(B) = \sigma \ni 0 \right\}.$$

Let $S$ be the set of all solutions to (1) which satisfy condition (2). Then $|\Sigma| = |S|$.

Proof. For an arbitrary $N_\sigma \in \Sigma$, there exists a unique $X \in S$ such that (4) holds. Further, for arbitrary $X \in S$ and arbitrary $\lambda \in \sigma$ there exist quotient classes $E_A^\lambda$ and $E_B^\lambda$ such that (5) holds. Define $N_\lambda : E_B^\lambda \to E_A^\lambda$ to be bounded. Then $N_\sigma = \oplus_{\lambda \in \sigma} N_\lambda$. It follows that $N_\sigma \in \Sigma$. There is a one-to-one surjective correspondence $S \leftrightarrow \Sigma$.

Remark. Due to Corollary 2.1, the solution $X(N_\sigma) \in S$ ($N_\sigma \in \Sigma$) can be referred to as a particular solution.

Corollary 2.2. (Size of particular solution) With the assumptions and notation from Theorem 2.1, Theorem 2.2 and Corollary 2.1, the norm of $X(N_\sigma)$ is given as

$$\|X(N_\sigma)\|^2 = \|X_E\|^2 + \|X_1\|^2 = \|N_\sigma\|^2 + \sum_{k=1}^{s} \left\| P_{(E_A^k)^\perp}(A - \lambda_k I)^{-1} C P_{E_B^k} \right\|^2 + \|X_1\|^2. \qquad (6)$$

Proof. Taking the same decomposition as in Theorem 2.1, let $X = X_E + X_1$. Since $X_E$ annihilates $E_B^\perp$ and $X_1$ annihilates $E_B$, it follows that

$$\|X\|^2 = \|X_E + X_1\|^2 = \|X_E\|^2 + \|X_1\|^2.$$

By the same argument, taking

$$\|X_E\|^2 = \|N_\sigma\|^2 + \sum_{k=1}^{s} \left\| P_{(E_A^k)^\perp}(A - \lambda_k I)^{-1} C P_{E_B^k} \right\|^2$$

completes the proof.

Corollary 2.3. (Singularities on $E_B^\perp$) Assume that $0 \notin \sigma$ but $0 \in \sigma(A) \cap \sigma(B_1)$ and let $\operatorname{dsc}(A) = m \ge 1$. For every $0 \le n \le m$, define $Z_n \in \mathcal{B}(\mathcal{R}(B_1)^\perp, \mathcal{R}(A^{n+1})^\perp \cap \mathcal{R}(A^n))$ and let $Z = \sum_{n=0}^{m} Z_n$. If $\mathcal{N}(C_1)^\perp = \mathcal{R}(B_1)$, then there are infinitely many solutions to (1) on $E_B^\perp$. Those solutions depend only on the choice of $Z$; that is, if $Z$ is fixed then there exists a unique solution $X_1(Z)$ on $E_B^\perp$.

Proof. The proof is the same as part 1) of Theorem 2.2. Note that $\operatorname{dsc}(A) = \operatorname{asc}(A^*) = m$ and $\mathcal{R}(A^{n+1})^\perp \cap \mathcal{R}(A^n) = \mathcal{N}((A^*)^{n+1}) \cap \mathcal{N}((A^*)^n)^\perp$. Then proceed to Case 2 of the proof of Theorem 2.1.

3. Fourier approximation of the minimal-norm solution

As illustrated in Theorem 2.2, the system (4) has a unique solution provided that all the input parameters are known and satisfy conditions (2). However, if the only input information is $\sigma(A) \cap \sigma(B) = \sigma \ne \emptyset$, condition (2) is in general not easy or even possible to verify. Thus the approximation analysis requires a more detailed approach.

The easiest assumption is that there exist eigenvalues for A and B

$$\lambda_{k_1}, \dots, \lambda_{k_w} \in \sigma$$

such that

$$C(E_B^\ell) \cap \mathcal{R}(A - \lambda_\ell I) = \emptyset, \quad \ell \in \{k_1, \dots, k_w\}.$$

Ergo any eigenvector $u_\ell$ of $B$ that corresponds to $\lambda_{k_\ell}$ does not obey condition (2) ($\ell = k_1, \dots, k_w$), that is, $C u_\ell \notin \mathcal{R}(A - \lambda_\ell I)$. There exists an orthonormal basis $(e_k)_k$ of $\mathcal{R}(A - \lambda_\ell I)$ such that $C u_\ell$ can be approximated by $\widetilde{C} u_\ell \in \mathcal{R}(A - \lambda_\ell I)$, and this approximation is the best possible, where

$$\widetilde{C} u_\ell = \sum_k \langle C u_\ell, e_k \rangle e_k.$$

In this way, the operator $\widetilde{C}$ is directly defined on $\sum_{\ell=k_1}^{k_w} E_B^\ell \equiv E_B^w$. The space $E_B^w$ is finite-dimensional and therefore has an orthogonal complement in $V_1$, denoted by $W = (E_B^w)^\perp$. Thus the extension of $\widetilde{C}$ to $V_1$ is admissible and we define $\widetilde{C} := \widetilde{C} \oplus C P_W$. Now we solve the approximate Sylvester equation $A X - X B = \widetilde{C}$, and the solutions $X$ (which exist by Theorem 2.1) are approximate solutions to the initial eq. $A X - X B = C$. Combining with Corollary 2.2, we see that the error of approximation is derived from

$$\sup_{\|u\|=1} \|(A X - X B - C) u\| = \sup_{\|u\|=1} \|(\widetilde{C} - C) u\|,$$

and this approximation is the best possible for the given $u_\ell$. However, note that $\widetilde{C}$ is not uniquely determined, but still depends on the input parameters: the corresponding eigenvectors of $B$ and the choice of bases in the spaces $\mathcal{R}(A - \lambda_\ell I)$. Hence we try to extract one particular $\widetilde{C}$ which is best suited for our approximation problem:

Problem 0. Find those (or that one) approximations $\widetilde{C}$ of $C$ such that the corresponding solutions have the smallest possible norm:

$$\widetilde{C}_e = \left\{ \widetilde{C} : \ A X - X B = \widetilde{C} \ \Rightarrow \ \|X\| \text{ is the smallest possible} \right\}.$$

This turns our problem into a minimization problem, which is solvable by means of numerical analysis.
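The projection step behind $\widetilde{C}$ can be sketched numerically as follows (an illustrative fragment, not the authors' procedure; `project_onto_range` is a hypothetical helper and the tolerance is an assumption):

```python
import numpy as np

def project_onto_range(M, v, tol=1e-10):
    """Best 2-norm approximation of v by an element of R(M):
    the orthogonal projection of v onto the column space of M."""
    U, s, _ = np.linalg.svd(M, full_matrices=False)
    E = U[:, s > tol * s[0]]          # orthonormal basis (e_k) of R(M)
    return E @ (E.conj().T @ v)       # sum_k <v, e_k> e_k

# E.g., replace C @ u_l by project_onto_range(A - lam * np.eye(A.shape[0]), C @ u_l)
# for an eigenpair (lam, u_l) of B that violates condition (2).
```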

4. Least-squares Minimum-norm solutions

When it comes to applications of the matrix Sylvester equation, the Frobenius norm seems to play a more important role than the operator sup-norm. Hence we continue our approximation analysis with that norm. The Frobenius norm, sometimes also called the Euclidean norm, of a matrix $A \in \mathbb{C}^{m \times n}$ is defined as

$$\|A\|_F = \left( \sum_{i=1}^{m} \sum_{j=1}^{n} |a_{ij}|^2 \right)^{1/2} = \left( \sum_{i=1}^{\min\{m,n\}} \sigma_i^2(A) \right)^{1/2},$$

where $\sigma_i(A)$ are the singular values of $A$. Also, $\|A\|_F = \operatorname{tr}(AA^*)^{1/2}$ (recall that $A^* = \bar{A}^T$ is the conjugate transpose). Recall that the Frobenius norm is sub-multiplicative, i.e. $\|AB\|_F \le \|A\|_F \|B\|_F$, and unitarily invariant, i.e. $\|U_1 A U_2\|_F = \|A\|_F$ for any unitary $U_1$, $U_2$. From the very definition it also follows that for a matrix $A$ partitioned into blocks, $A = [A_{ij}]_{p \times q}$, we have $\|A\|_F^2 = \sum_{i=1}^{p} \sum_{j=1}^{q} \|A_{ij}\|_F^2$. We mention that on any finite-dimensional space any two norms are equivalent.
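A small NumPy check of these properties (illustrative only; the random matrices are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 3)) + 1j * rng.standard_normal((4, 3))

# Three equivalent ways to compute ||A||_F.
f1 = np.linalg.norm(A, 'fro')
f2 = np.sqrt((np.abs(A) ** 2).sum())
f3 = np.sqrt((np.linalg.svd(A, compute_uv=False) ** 2).sum())
print(np.allclose([f1, f2], f3))  # True

# Unitary invariance: ||U1 A U2||_F = ||A||_F for unitary U1, U2.
U1, _ = np.linalg.qr(rng.standard_normal((4, 4)))
U2, _ = np.linalg.qr(rng.standard_normal((3, 3)))
print(np.isclose(np.linalg.norm(U1 @ A @ U2, 'fro'), f1))  # True
```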

Now we state the two problems about the Sylvester equation $AX - XB = C$ that we are dealing with. Recall that $A \in \mathbb{C}^{m \times m}$, $B \in \mathbb{C}^{n \times n}$ and $C \in \mathbb{C}^{m \times n}$ are given, and $X \in \mathbb{C}^{m \times n}$ is an unknown matrix. Problem 0 can now be broken down into two separate problems:

Problem 1. Find the set $\mathcal{S}$ of all $\widehat{X}$ such that $\|A\widehat{X} - \widehat{X}B - C\|_F$ is the smallest possible, i.e.

$$\min_{X \in \mathbb{C}^{m \times n}} \|AX - XB - C\|_F = \|A\widehat{X} - \widehat{X}B - C\|_F.$$

Problem 2. Among all $\widehat{X}$ find the one with the smallest Frobenius norm, i.e.

$$\min_{\widehat{X} \in \mathcal{S}} \|\widehat{X}\|_F = \|\widehat{X}_0\|_F.$$

Definition 4.1. Matrices $\widehat{X}$ which are solutions of Problem 1 are least-squares solutions. Matrices $\widehat{X}_0$ which are solutions of Problem 2 are minimal-norm least-squares solutions.

Before we continue our analysis, we remark the following facts:

• if the Sylvester equation is consistent for the given $C$, then the set $\mathcal{S}$ from Problem 1 consists of all solutions of the Sylvester equation, and the norm of the approximation error is zero. If there is a unique solution, then it solves Problem 2 as well. In the case when there are infinitely many solutions, Problem 2 gives those solutions with the smallest norm;

• for the homogeneous equation (i.e. $C = 0$), the set $\mathcal{S}$ consists of all homogeneous solutions, and the solution of Problem 2 is unique, namely $X = 0$.

It is a well-known fact that eq. (1) can, by the Kronecker product and the vectorization operation, be transformed into

$$(I_n \otimes A - B^T \otimes I_m)\,\mathrm{vec}(X) = \mathrm{vec}(C).$$

The matrix $I_n \otimes A - B^T \otimes I_m$ is often called the "nivellateur" in the literature, and the least-squares minimal-norm solution is unique and is given by

$$\mathrm{vec}(\widehat{X}_0) = (I_n \otimes A - B^T \otimes I_m)^\dagger \,\mathrm{vec}(C),$$

where $T^\dagger$ denotes the unique Moore–Penrose inverse of an in general rectangular complex matrix $T$. For more on the topic of generalized inverses the reader is referred to [3]. Remark that effective calculation of the Moore–Penrose inverse of the nivellateur appears to be very difficult. The authors are unaware of such a method, but for the group inverse there is a recent paper of Hartwig and Patrício [11].
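A small NumPy sketch of this pseudoinverse route (illustrative only, not the authors' algorithm; the helper name `sylvester_lsmn` is ours, and forming the nivellateur explicitly is feasible only for modest m and n):

```python
import numpy as np

def sylvester_lsmn(A, B, C):
    """Least-squares minimal-norm solution of A X - X B = C via
    vec(X) = (I_n ⊗ A - B^T ⊗ I_m)^† vec(C), with column-major vec."""
    m, n = A.shape[0], B.shape[0]
    K = np.kron(np.eye(n), A) - np.kron(B.T, np.eye(m))  # the nivellateur
    x = np.linalg.pinv(K) @ C.flatten(order='F')         # Moore-Penrose step
    return x.reshape((m, n), order='F')
```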

Our aim is to reduce the problem for the original Sylvester equation to the simplest Sylvester-equation case, similar to the approach used in [7]. Suppose that the matrices $A \in \mathbb{C}^{m \times m}$ and $B \in \mathbb{C}^{n \times n}$ have the following Jordan canonical forms (for some invertible matrices $S$ and $T$):

$$A = S J_A S^{-1}, \quad B = T J_B T^{-1}.$$

Without loss of generality, we may assume that $\emptyset \ne \sigma(A) \cap \sigma(B) = \{\lambda_1, \dots, \lambda_s\}$; hence we have excluded the unique-solution case, so

$$J_A = \operatorname{diag}\{J(\lambda_1; p_{11}, p_{12}, \dots, p_{1,k_1}), \dots, J(\lambda_s; p_{s1}, p_{s2}, \dots, p_{s,k_s})\} \in \mathbb{C}^{m \times m},$$
$$J_B = \operatorname{diag}\{J(\lambda_1; q_{11}, q_{12}, \dots, q_{1,\ell_1}), \dots, J(\lambda_s; q_{s1}, q_{s2}, \dots, q_{s,\ell_s})\} \in \mathbb{C}^{n \times n},$$

where $p_{ij}$, $j = \overline{1,k_i}$, $i = \overline{1,s}$, and $q_{ij}$, $j = \overline{1,\ell_i}$, $i = \overline{1,s}$, are natural numbers, and $k_i$ and $\ell_i$ are the geometric multiplicities of the eigenvalue $\lambda_i$, $i = \overline{1,s}$, of $A$ and $B$, respectively. The notation $J(\lambda; t_1, \dots, t_k)$ stands for

$$J(\lambda; t_1, \dots, t_k) = \operatorname{diag}\{J_{t_1}(\lambda), \dots, J_{t_k}(\lambda)\} = J_{t_1}(\lambda) \oplus \dots \oplus J_{t_k}(\lambda),$$

where $J_{t_1}(\lambda)$ is the Jordan block of dimension $t_1 \times t_1$ with $\lambda$ on its main diagonal. If we put those Jordan forms into the equation, we have (we denote $Y = S^{-1}XT$ and $D = S^{-1}CT$):

$$\|AX - XB - C\|_F^2 = \|S J_A S^{-1} X - X T J_B T^{-1} - C\|_F^2 = \left\| S \left( J_A (S^{-1}XT) - (S^{-1}XT) J_B - (S^{-1}CT) \right) T^{-1} \right\|_F^2 = \|S (J_A Y - Y J_B - D) T^{-1}\|_F^2 \le \|S\|_F^2 \|T^{-1}\|_F^2 \|J_A Y - Y J_B - D\|_F^2 = \alpha^2(S,T) \|J_A Y - Y J_B - D\|_F^2.$$

We used the notation $\alpha(S,T) = \|S\|_F \|T^{-1}\|_F$. Remark that if $S$ and $T$ are unitary matrices, then equality is attained in the previous formula. Because of:

$$J_A Y - Y J_B - D = [J(\lambda_i; p_{i1}, \dots, p_{i,k_i})][Y_{ij}] - [Y_{ij}][J(\lambda_j; q_{j1}, \dots, q_{j,\ell_j})] - [D_{ij}] =$$

$$= [J(\lambda_i; p_{i1}, \dots, p_{i,k_i}) Y_{ij} - Y_{ij} J(\lambda_j; q_{j1}, \dots, q_{j,\ell_j}) - D_{ij}]_{s \times s} = \left[ J_{p_{i,u}}(\lambda_i) Y_{uv}^{(ij)} - Y_{uv}^{(ij)} J_{q_{j,v}}(\lambda_j) - D_{uv}^{(ij)} \right]_{u=\overline{1,k_i},\ v=\overline{1,\ell_j},\ i,j=\overline{1,s}},$$

we have

$$\|J_A Y - Y J_B - D\|_F^2 = \sum_{i,j=1}^{s} \|J(\lambda_i; p_{i1}, \dots, p_{i,k_i}) Y_{ij} - Y_{ij} J(\lambda_j; q_{j1}, \dots, q_{j,\ell_j}) - D_{ij}\|_F^2 = \sum_{i,j=1}^{s} \sum_{u=1}^{k_i} \sum_{v=1}^{\ell_j} \left\| J_{p_{i,u}}(\lambda_i) Y_{uv}^{(ij)} - Y_{uv}^{(ij)} J_{q_{j,v}}(\lambda_j) - D_{uv}^{(ij)} \right\|_F^2.$$

Now, we distinguish two cases:

• if $\lambda_i \ne \lambda_j$, then by Theorem 1.1 the equation $J_{p_{i,u}}(\lambda_i) Y_{uv}^{(ij)} - Y_{uv}^{(ij)} J_{q_{j,v}}(\lambda_j) = D_{uv}^{(ij)}$ has a unique solution $\widehat{Y}_{uv}^{(ij)}$, so $\| J_{p_{i,u}}(\lambda_i) \widehat{Y}_{uv}^{(ij)} - \widehat{Y}_{uv}^{(ij)} J_{q_{j,v}}(\lambda_j) - D_{uv}^{(ij)} \|_F^2 = 0$;

• if $\lambda_i = \lambda_j$, then the equation $J_{p_{i,u}}(\lambda_i) Y_{uv}^{(ij)} - Y_{uv}^{(ij)} J_{q_{j,v}}(\lambda_j) = D_{uv}^{(ij)}$, after translation by $\lambda_i$, reduces to $J_{p_{i,u}}(0) Y_{uv}^{(ii)} - Y_{uv}^{(ii)} J_{q_{i,v}}(0) = D_{uv}^{(ii)}$. From Theorem 1.1 we already know that there may be either infinitely many solutions, or no solutions at all.

Therefore,

$$\|AX - XB - C\|_F^2 \le \alpha^2(S,T) \|J_A Y - Y J_B - D\|_F^2 = \alpha^2(S,T) \sum_{i=1}^{s} \sum_{u=1}^{k_i} \sum_{v=1}^{\ell_i} \left\| J_{p_{i,u}}(\lambda_i) Y_{uv}^{(ii)} - Y_{uv}^{(ii)} J_{q_{i,v}}(\lambda_i) - D_{uv}^{(ii)} \right\|_F^2 = \alpha^2(S,T) \sum_{i=1}^{s} \sum_{u=1}^{k_i} \sum_{v=1}^{\ell_i} \left\| J_{p_{i,u}}(0) Y_{uv}^{(ii)} - Y_{uv}^{(ii)} J_{q_{i,v}}(0) - D_{uv}^{(ii)} \right\|_F^2.$$

Now we can take the minimum over all $X$ (equivalently, over all $Y$, since $Y = S^{-1}XT$ with $S$ and $T$ invertible):

$$\min_X \|AX - XB - C\|_F^2 \le \alpha^2(S,T) \min_Y \|J_A Y - Y J_B - D\|_F^2 = \alpha^2(S,T) \sum_{i=1}^{s} \sum_{u=1}^{k_i} \sum_{v=1}^{\ell_i} \min_{Y_{uv}^{(ii)}} \left\| J_{p_{i,u}}(0) Y_{uv}^{(ii)} - Y_{uv}^{(ii)} J_{q_{i,v}}(0) - D_{uv}^{(ii)} \right\|_F^2.$$

Therefore, in order to solve Problem 1, we need to investigate the following simpler versions of the original problems:

Problem 1'. Find the set $\mathcal{S}$ of all least-squares solutions $\widehat{X}$, i.e.

$$\min_{X \in \mathbb{C}^{m \times n}} \|J_m(0)X - XJ_n(0) - C\|_F = \|J_m(0)\widehat{X} - \widehat{X}J_n(0) - C\|_F.$$

Problem 2'. Among all $\widehat{X} \in \mathcal{S}$ find the one, $\widehat{X}_0$, with the smallest Frobenius norm, i.e.

$$\min_{\widehat{X} \in \mathcal{S}} \|\widehat{X}\|_F = \|\widehat{X}_0\|_F.$$

We will prove that such $\widehat{X}_0$ is unique, and give a method for finding it explicitly.

5. Least-squares Solutions for the Simplest Case

Let us denote $p = \min\{m, n\}$ for $m, n \in \mathbb{N}$. For a given matrix $A \in \mathbb{C}^{m \times n}$, the set

$$d_k(A) := \{a_{ij} : j - i = k\}, \quad k = \overline{-m+1, n-1},$$

will be called the $k$-th small diagonal. For $k = 0$ we have "the" diagonal, i.e. the set $\{a_{ii} : i = \overline{1,p}\}$. When we refer to some small diagonal, we assume that its elements are ordered according to increasing index $i$. For example, if we are dealing with the $0$-th small diagonal, we assume that the set $d_0(A) = \{a_{11}, a_{22}, \dots, a_{pp}\}$ is ordered. We will denote by $\sigma_{m+k}$ the sum of all elements along the $k$-th small diagonal $d_k$:

$$\sigma_{m+k}(A) = \sum_{a_{ij} \in d_k} a_{ij}, \quad k = \overline{-m+1, n-1}.$$
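In code, the small diagonals and their sums can be read off with NumPy's offset diagonals (a small sketch; keying the dictionary by $m + k$ mirrors the $\sigma_{m+k}$ notation):

```python
import numpy as np

def small_diagonal_sums(C):
    """sigma_{m+k}(C) = sum of the k-th small diagonal {c_ij : j - i = k},
    for k = -m+1, ..., n-1 (np.diagonal uses the same offset convention)."""
    m, n = C.shape
    return {m + k: C.diagonal(offset=k).sum() for k in range(-m + 1, n)}
```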

Theorem 5.1. The Sylvester equation $J_m(0)X - XJ_n(0) = C$ has a least-squares solution $\widehat{X}$ given by

$$\widehat{X} = X_h + X_p + X_c, \qquad (7)$$

where $X_h$ denotes the solution of the appropriate homogeneous equation:

$$X_h = \begin{cases} \begin{bmatrix} p_{n-1}(J_n(0)) \\ 0_{(m-n) \times n} \end{bmatrix}, & m \ge n, \\[2mm] \begin{bmatrix} 0_{m \times (n-m)} & q_{m-1}(J_m(0)) \end{bmatrix}, & m \le n, \end{cases}$$

Xp is an expression given by

$$X_p = \begin{cases} \displaystyle\sum_{k=0}^{n-1} \left( J_m(0)^T \right)^{k+1} C \, J_n(0)^k, & m \ge n, \\[3mm] \displaystyle -\sum_{k=0}^{m-1} J_m(0)^k \, C \left( J_n(0)^T \right)^{k+1}, & m \le n, \end{cases}$$

and

$$X_c = \begin{cases} \begin{bmatrix} 0_{(m-n) \times n} \\ W \end{bmatrix}, & m \ge n, \\[2mm] \begin{bmatrix} W^T & 0_{m \times (n-m)} \end{bmatrix}, & m \le n, \end{cases}$$

where we denoted

$$W = -\begin{bmatrix} 0 & 0 & \dots & 0 & 0 \\ \sigma_p/p & 0 & \dots & 0 & 0 \\ \vdots & \vdots & \ddots & \vdots & \vdots \\ \sigma_3/3 & 2\sigma_4/4 & \dots & 0 & 0 \\ \sigma_2/2 & 2\sigma_3/3 & \dots & (p-1)\sigma_p/p & 0 \end{bmatrix}_{p \times p}.$$

The magnitude of the deviation is:

$$\Delta(\widehat{X}; C) = \min_X \Delta(X; C) = \sum_{k=1}^{p} \frac{\sigma_k^2}{k},$$

where $\sigma_k$ is the sum of the elements over the $(-m + k)$-th small diagonal of the matrix $C$.

Proof. In expanded form, the matrix expression $R \equiv R(X) = J_m(0)X - XJ_n(0) - C = [r_{ij}]$ is:

$$\begin{bmatrix} x_{21} - c_{11} & x_{22} - x_{11} - c_{12} & \dots & x_{2n} - x_{1,n-1} - c_{1n} \\ x_{31} - c_{21} & x_{32} - x_{21} - c_{22} & \dots & x_{3n} - x_{2,n-1} - c_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ x_{m1} - c_{m-1,1} & x_{m2} - x_{m-1,1} - c_{m-1,2} & \dots & x_{mn} - x_{m-1,n-1} - c_{m-1,n} \\ -c_{m1} & -x_{m1} - c_{m2} & \dots & -x_{m,n-1} - c_{mn} \end{bmatrix}.$$

As we can see, any fixed small diagonal contains some unknowns and parameters, and those unknowns and parameters cannot be found in any other small diagonal. Therefore, we will sum over all small diagonals, and then over all elements on each of the small diagonals. Since all those sums of elements can have one of three possible forms, described by the functions $M_1$, $M_2$ and $M_4$ for $m \ge n$ (or $M_1$, $M_3$ and $M_4$ for $m \le n$) from Proposition 8.1 (see Appendix), it is customary to separate them into three sums. So we have:

$$\|R(X)\|_F^2 = \sum_{i=1}^{m}\sum_{j=1}^{n} r_{ij}^2 = \sum_{k=-m+1}^{n-1} \sum_{r_{ij} \in d_k} r_{ij}^2 = r_{m1}^2 + \sum_{k=-m+2}^{-m+p} \sum_{r_{ij} \in d_k} r_{ij}^2 + \sum_{k=-m+p+1}^{n-p} \sum_{r_{ij} \in d_k} r_{ij}^2 + \sum_{k=n-p+1}^{n-1} \sum_{r_{ij} \in d_k} r_{ij}^2 =$$
$$= c_{m1}^2 + \sum_{k=-m+2}^{-m+p} M_1(x_{ij} \in d_k; a_{ij} \in d_k) + \sum_{k=-m+p+1}^{n-p} M_{2 \vee 3}(x_{ij} \in d_k; a_{ij} \in d_k) + \sum_{k=n-p+1}^{n-1} M_4(x_{ij} \in d_k; a_{ij} \in d_k).$$

Since all summands are nonnegative, the minimization works independently for each small diagonal in each of the three sums, by using Proposition 8.1. We denote by a "hat" the elements at which the minimum is attained. Therefore:

$$\min_X \|R(X)\|_F^2 = c_{m1}^2 + \sum_{k=-m+2}^{-m+p} \min M_1(x_{ij} \in d_k; a_{ij} \in d_k) + \sum_{k=-m+p+1}^{n-p} \min M_{2\vee3}(x_{ij} \in d_k; a_{ij} \in d_k) + \sum_{k=n-p+1}^{n-1} \min M_4(x_{ij} \in d_k; a_{ij} \in d_k) =$$
$$= c_{m1}^2 + \sum_{k=-m+2}^{-m+p} M_1(\widehat{x}_{ij} \in d_k; a_{ij} \in d_k) + 0 + 0 = \sum_{k=1}^{p} \frac{\sigma_k^2}{k}.$$

During the minimization process of the functions M1, M2 (or M3) and M4, we have obtained the following:

• the unique $\widehat{x}_{ij}$ lying on the diagonals $d_k$, $k = \overline{-m+1, -m+p-1}$;

• the unique $\widehat{x}_{ij}$ lying on the diagonals $d_k$, $k = \overline{-m+p, n-p-1}$;

• the elements on each of the diagonals $d_k$, $k = \overline{n-p, n-1}$, depend on one real parameter, and we assume this element is from the first row in the case $m \ge n$, and in the last column if $m \le n$.

If we rearrange the matrix $\widehat{X}$ whose elements are now known, we obtain precisely (7). Such a rearrangement looks rather cumbersome in the general case, and some insight can be gained by looking at Examples 6.1 and 6.2.

Let us prove that any $\widehat{X} = X_h + X_p + X_c$ given by (7) is indeed a least-squares solution:

$$J_m(0)\widehat{X} - \widehat{X}J_n(0) - C = (J_m(0)X_h - X_hJ_n(0)) + \left( J_m(0)X_p - X_pJ_n(0) - C \right) + \left( J_m(0)X_c - X_cJ_n(0) \right) = \left( J_m(0)X_p - X_pJ_n(0) - C \right) + \left( J_m(0)X_c - X_cJ_n(0) \right).$$

It is not hard to check that $J_m(0)X_p - X_pJ_n(0) - C$ is a matrix whose entries are all zero, except the $m$-th row (case $m \ge n$) or the first column (case $m \le n$):

$$J_m(0)X_p - X_pJ_n(0) - C = \begin{bmatrix} 0 & 0 & \dots & 0 \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \dots & 0 \\ -\sigma_1 & -\sigma_2 & \dots & -\sigma_n \end{bmatrix} \quad \text{or} \quad \begin{bmatrix} -\sigma_m & 0 & \dots & 0 \\ \vdots & \vdots & & \vdots \\ -\sigma_2 & 0 & \dots & 0 \\ -\sigma_1 & 0 & \dots & 0 \end{bmatrix}.$$

In a similar way it can be shown that for $m \ge n$

$$J_m(0)X_c - X_cJ_n(0) = \begin{bmatrix} 0 & 0 & \dots & 0 & 0 \\ \vdots & \vdots & & \vdots & \vdots \\ -\sigma_n/n & 0 & \dots & 0 & 0 \\ -\sigma_{n-1}/(n-1) & -\sigma_n/n & \dots & 0 & 0 \\ \vdots & \vdots & \ddots & \vdots & \vdots \\ -\sigma_2/2 & -\sigma_3/3 & \dots & -\sigma_n/n & 0 \\ 0 & \sigma_2/2 & \dots & (n-2)\sigma_{n-1}/(n-1) & (n-1)\sigma_n/n \end{bmatrix},$$

(and just the transposed matrix if $m \le n$), so we conclude that

$$J_m(0)\widehat{X} - \widehat{X}J_n(0) - C = -\begin{bmatrix} 0_{(m-n) \times n} \\ \begin{matrix} \sigma_n/n & 0 & \dots & 0 & 0 \\ \sigma_{n-1}/(n-1) & \sigma_n/n & \dots & 0 & 0 \\ \vdots & \vdots & \ddots & \vdots & \vdots \\ \sigma_2/2 & \sigma_3/3 & \dots & \sigma_n/n & 0 \\ \sigma_1 & \sigma_2/2 & \dots & \sigma_{n-1}/(n-1) & \sigma_n/n \end{matrix} \end{bmatrix} = \begin{bmatrix} 0_{(m-n) \times n} \\ -q_{p-1}(J_p(0))^T \end{bmatrix}$$

if $m \ge n$ (here $q_{p-1}(t) = \sum_{k=1}^{p} (\sigma_k/k)\, t^{p-k}$), and the transpose of this matrix if $m \le n$. The Frobenius norm of this matrix is

$$\|J_m(0)\widehat{X} - \widehat{X}J_n(0) - C\|_F^2 = \sum_{k=1}^{n} k \left( \frac{\sigma_k}{k} \right)^2 = \sum_{k=1}^{n} \frac{\sigma_k^2}{k},$$

so we have the proof.
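A quick numerical cross-check of the minimal deviation in Theorem 5.1 (an illustrative sketch; the brute-force comparison solves the vectorized least-squares problem directly):

```python
import numpy as np

def J(n):
    """Nilpotent Jordan block J_n(0): ones on the first superdiagonal."""
    return np.eye(n, k=1)

m, n = 4, 3
rng = np.random.default_rng(1)
C = rng.standard_normal((m, n))

# Deviation predicted by Theorem 5.1: sum_{k=1}^{p} sigma_k^2 / k,
# with sigma_k the sum of the (-m+k)-th small diagonal of C.
pred = sum(C.diagonal(-m + k).sum() ** 2 / k for k in range(1, min(m, n) + 1))

# Brute-force least squares on the vectorized system, for comparison.
K = np.kron(np.eye(n), J(m)) - np.kron(J(n).T, np.eye(m))
x = np.linalg.lstsq(K, C.flatten(order='F'), rcond=None)[0]
X = x.reshape((m, n), order='F')
print(np.isclose(np.linalg.norm(J(m) @ X - X @ J(n) - C, 'fro')**2, pred))  # True
```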

Corollary 5.1. The Sylvester equation $J_m(0)X - XJ_n(0) = C$ is consistent if and only if $\sigma_k = 0$, $k = \overline{1,p}$, i.e. iff $X_c = 0$, and its general solution is then given by (7).

Remark that this result agrees with Theorems 2.3 and 2.5 from [7].

Corollary 5.2. The least-squares solution for the equation $J_n(0)X - XJ_n(0) = I_n$ is

$$\widehat{X} = p_{n-1}(J_n(0)),$$

for any complex polynomial $p_{n-1}$, and $\|J_n(0)\widehat{X} - \widehat{X}J_n(0) - I_n\|_F^2 = n$.

6. Minimal-norm Least-squares Solution for the Simplest Case

In the previous section, we observed that there is a whole class of least-squares solutions, depending on some parameters, whether or not the equation is consistent. Now we prove that those free parameters can be chosen so that the solution has minimal Frobenius norm. If we look at the structure of the matrices $X_h$ and $X_c$ from Theorem 5.1, we see that they do not have any non-zero entries in the same positions (for $m \ge n$ the matrix $X_h$ is upper triangular, while $X_c$ is strictly lower triangular; the case $m \le n$ is just the transposed situation); therefore only the matrix $X_p$ may have an influence on the minimization. So far, we have concluded that

$$\min_{\widehat{X} \in \mathcal{S}} \|\widehat{X}\|_F^2 = \min_{X_h} \|X_h + X_p\|_F^2 + \|X_c\|_F^2.$$

Remark that $\|X_c\|_F$ is a constant, which is zero iff the equation is consistent and strictly positive otherwise.

Theorem 6.1. There is a unique least-squares minimal-norm solution for the Sylvester equation $J_m(0)X - XJ_n(0) = C$, given by

$$\widehat{X}_0 = \widetilde{X}_h + X_p + X_c, \qquad (8)$$

where

$$\widetilde{X}_h = \begin{cases} \begin{bmatrix} p_{n-1}^*(J_n(0)) \\ 0_{(m-n) \times n} \end{bmatrix}, & m \ge n, \\[2mm] \begin{bmatrix} 0_{m \times (n-m)} & q_{m-1}^*(J_m(0)) \end{bmatrix}, & m \le n, \end{cases}$$

and $p_{n-1}^*$ and $q_{m-1}^*$ are uniquely determined polynomials.

Proof. According to Theorem 5.1, such an $\widehat{X}_0$ is a least-squares solution. We must show that it has minimal norm among all least-squares solutions. Let us consider the matrix $T = X_h + X_p = [t_{ij}]_{m \times n}$. The only entries that can be minimized include $x_{11}, \dots, x_{1n}$ (when $m \ge n$) or $x_{1n}, \dots, x_{mn}$ (when $m \le n$), and they lie precisely along the small diagonals from $n - p$ to $n - 1$:

$$\|T\|_F^2 = \sum_{k=-m+1}^{n-1} \sum_{t_{ij} \in d_k} t_{ij}^2 = \sum_{k=-m+1}^{n-p-1} \sum_{t_{ij} \in d_k} t_{ij}^2 + \sum_{k=n-p}^{n-1} \sum_{t_{ij} \in d_k} t_{ij}^2 = \sum_{k=-m+1}^{n-p-1} \sum_{t_{ij} \in d_k} t_{ij}^2 + \sum_{k=n-p}^{n-1} G(x_{ij} \in d_k; c_{ij} \in d_k),$$

$$\|\widehat{X}_0\|_F^2 = \min_{\widehat{X} \in \mathcal{S}} \|\widehat{X}\|_F^2 = \min_{X_h} \|X_h + X_p\|_F^2 + \|X_c\|_F^2 = \sum_{k=-m+1}^{n-p-1} \sum_{t_{ij} \in d_k} t_{ij}^2 + \sum_{k=n-p}^{n-1} \min G(x_{ij} \in d_k; c_{ij} \in d_k) + \|X_c\|_F^2 = \sum_{k=-m+1}^{n-p-1} \sum_{t_{ij} \in d_k} t_{ij}^2 + \sum_{k=n-p}^{n-1} G(x_{ij}^* \in d_k; c_{ij} \in d_k) + \|X_c\|_F^2.$$

Therefore, for such $x_{1i}^*$, $i = \overline{1,n}$ (when $m \ge n$), or $x_{in}^*$, $i = \overline{1,m}$ (when $m \le n$), which are given by Proposition 8.2, we obtain the unique least-squares minimum-norm solution. Since it is already known that the homogeneous solution is block-Toeplitz, we have the result.

Corollary 6.1. There is a unique minimal-norm solution for a consistent Sylvester equation, and it is given by

$$\widehat{X}_0 = \widetilde{X}_h + X_p,$$

with notation as in the previous theorem. In order to clarify and illustrate the constructions given in the previous two theorems about least-squares and minimal-norm solutions, we present the following two detailed examples. The first deals with the case $m \ge n$, and the second one with the case $m \le n$.

Example 6.1. Let us find the minimum-norm least-squares solution of the equation $J_4(0)X - XJ_3(0) = C$, $C \in \mathbb{R}^{4 \times 3}$; i.e. we want to find all $X \in \mathbb{R}^{4 \times 3}$ such that $\|\Delta(X)\|_F^2 = \|J_4(0)X - XJ_3(0) - C\|_F^2$ is minimal, and among them such $X$ which is the minimal-Frobenius-norm matrix. We can arrange the small diagonals of the matrix $C$ as follows (below the array are the indices of the small diagonals; each column is one small diagonal):

$$\begin{array}{cccccc} & & c_{21} & c_{11} & c_{12} & c_{13} \\ & c_{31} & c_{32} & c_{22} & c_{23} & \\ c_{41} & c_{42} & c_{43} & c_{33} & & \\ -3 & -2 & -1 & 0 & 1 & 2 \end{array}$$

and let us denote by $\sigma_k$, $k = \overline{1,3}$, the sum over the $(-4+k)$-th small diagonal of the matrix $C$, i.e.

$$\sigma_1 = c_{41}, \quad \sigma_2 = c_{31} + c_{42}, \quad \sigma_3 = c_{21} + c_{32} + c_{43}.$$

We can arrange the small diagonals of the matrix $J_4(0)X - XJ_3(0) - C$ as well:

$$\begin{array}{ccc|c|cc} & & x_{31}-c_{21} & x_{21}-c_{11} & x_{22}-x_{11}-c_{12} & x_{23}-x_{12}-c_{13} \\ & x_{41}-c_{31} & x_{42}-x_{31}-c_{32} & x_{32}-x_{21}-c_{22} & x_{33}-x_{22}-c_{23} & \\ -c_{41} & -x_{41}-c_{42} & -x_{42}-c_{43} & x_{43}-x_{32}-c_{33} & & \end{array}$$

Vertical bars partition the small diagonals according to the entries they contain. By using the properties of the Frobenius norm, it is clear that $\|J_4(0)X - XJ_3(0) - C\|_F^2$ can be obtained as the sum of squared entries over each of the columns. For the left submatrix, those sums are precisely the function $M_1$ applied to each of its columns except the first one (we list the unknown variables $x_{ij}$ and the parameters $c_{ij}$ from top to bottom); for the central submatrix we have the function $M_2$, while for the right submatrix the function $M_4$ is applied to each of its columns. Hence, we have

$$H(x_{11}, \dots, x_{43}) = (-c_{41})^2 + \left[ (x_{41} - c_{31})^2 + (-x_{41} - c_{42})^2 \right] + \left[ (x_{31} - c_{21})^2 + (x_{42} - x_{31} - c_{32})^2 + (-x_{42} - c_{43})^2 \right] + \left[ (x_{21} - c_{11})^2 + (x_{32} - x_{21} - c_{22})^2 + (x_{43} - x_{32} - c_{33})^2 \right] + \left[ (x_{22} - x_{11} - c_{12})^2 + (x_{33} - x_{22} - c_{23})^2 \right] + (x_{23} - x_{12} - c_{13})^2 =$$
$$= c_{41}^2 + M_1(x_{41}; c_{31}, c_{42}) + M_1(x_{31}, x_{42}; c_{21}, c_{32}, c_{43}) + M_2(x_{21}, x_{32}, x_{43}; c_{11}, c_{22}, c_{33}) + M_4(x_{11}, x_{22}, x_{33}; c_{12}, c_{23}) + M_4(x_{12}, x_{23}; c_{13}).$$

By Proposition 8.1, we obtain the unique $\widehat{x}_{21}, \widehat{x}_{31}, \widehat{x}_{32}, \widehat{x}_{41}, \widehat{x}_{42}, \widehat{x}_{43}$, given by:

$$\widehat{x}_{41} = \frac{1}{2}(c_{31} - c_{42}),$$
$$\widehat{x}_{31} = \frac{1}{3}(2c_{21} - c_{32} - c_{43}), \quad \widehat{x}_{42} = \frac{1}{3}(c_{21} + c_{32} - 2c_{43}),$$
$$\widehat{x}_{21} = c_{11}, \quad \widehat{x}_{32} = c_{11} + c_{22}, \quad \widehat{x}_{43} = c_{11} + c_{22} + c_{33},$$

such that $M_1(\widehat{x}_{41}; c_{31}, c_{42}) = \sigma_2^2/2$ and $M_1(\widehat{x}_{31}, \widehat{x}_{42}; c_{21}, c_{32}, c_{43}) = \sigma_3^2/3$, while $\widehat{x}_{11}, \widehat{x}_{12}, \widehat{x}_{13}, \widehat{x}_{22}, \widehat{x}_{23}, \widehat{x}_{33}$ depend on the parameters denoted by $x_{11}, x_{12}, x_{13}$:

$$\widehat{x}_{11} = x_{11}, \quad \widehat{x}_{22} = x_{11} + c_{12}, \quad \widehat{x}_{33} = x_{11} + c_{12} + c_{23}, \quad \widehat{x}_{12} = x_{12}, \quad \widehat{x}_{23} = x_{12} + c_{13}, \quad \widehat{x}_{13} = x_{13};$$

for these, the minima of all $M_2$ and $M_4$ are zero. Therefore, the minimum of the function $H$ is:

$$H(\widehat{x}_{11}, \dots, \widehat{x}_{43}) = \sigma_1^2 + \frac{\sigma_2^2}{2} + \frac{\sigma_3^2}{3} = \sum_{k=1}^{3} \frac{\sigma_k^2}{k}.$$

It is clear that $H(\widehat{x}_{11}, \dots, \widehat{x}_{43}) = 0 \Leftrightarrow \sigma_k = 0$, $k = \overline{1,3}$, which is precisely the consistency condition (Theorem 2.3 from [7]). We can decompose such $\widehat{X}$ as:

$$\widehat{X} = \begin{bmatrix} x_{11} & x_{12} & x_{13} \\ c_{11} & x_{11} + c_{12} & x_{12} + c_{13} \\ \frac{1}{3}(2c_{21} - c_{32} - c_{43}) & c_{11} + c_{22} & x_{11} + c_{12} + c_{23} \\ \frac{1}{2}(c_{31} - c_{42}) & \frac{1}{3}(c_{21} + c_{32} - 2c_{43}) & c_{11} + c_{22} + c_{33} \end{bmatrix} =$$

$$= \begin{bmatrix} x_{11} & x_{12} & x_{13} \\ 0 & x_{11} & x_{12} \\ 0 & 0 & x_{11} \\ 0 & 0 & 0 \end{bmatrix} + \begin{bmatrix} 0 & 0 & 0 \\ c_{11} & c_{12} & c_{13} \\ c_{21} & c_{11} + c_{22} & c_{12} + c_{23} \\ c_{31} & c_{21} + c_{32} & c_{11} + c_{22} + c_{33} \end{bmatrix} + \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ -\frac{\sigma_3}{3} & 0 & 0 \\ -\frac{\sigma_2}{2} & -\frac{2\sigma_3}{3} & 0 \end{bmatrix} = X_h + X_p + X_c,$$

where $X_h$, $X_p$ and $X_c$ are as in Theorem 5.1. Also, remark that:

$$J_4(0)\widehat{X} - \widehat{X}J_3(0) - C = \begin{bmatrix} 0 & 0 & 0 \\ -\frac{\sigma_3}{3} & 0 & 0 \\ -\frac{\sigma_2}{2} & -\frac{\sigma_3}{3} & 0 \\ -\sigma_1 & -\frac{\sigma_2}{2} & -\frac{\sigma_3}{3} \end{bmatrix} = \begin{bmatrix} 0_{1 \times 3} \\ -p_2(J_3(0))^T \end{bmatrix},$$

where $p_2(t) = \sigma_3/3 + (\sigma_2/2) \cdot t + \sigma_1 \cdot t^2$. In order to obtain the minimum-norm solution among all least-squares solutions, let us arrange the small diagonals of $\widehat{X}$ as follows (vertical bars separate those small diagonals which take part in the minimization from the others):

$$\begin{array}{ccc|ccc} & & c_{11} & x_{11} & x_{12} & x_{13} \\ & c_{21} - \frac{\sigma_3}{3} & c_{11} + c_{22} & x_{11} + c_{12} & x_{12} + c_{13} & \\ c_{31} - \frac{\sigma_2}{2} & c_{21} + c_{32} - \frac{2\sigma_3}{3} & c_{11} + c_{22} + c_{33} & x_{11} + c_{12} + c_{23} & & \end{array}$$

There are three parameters, $x_{11}$, $x_{12}$ and $x_{13}$, which can be used for the minimization of the sums over the columns of the right submatrix. In order to obtain such a solution, we should minimize the following function $h : \mathbb{R}^3 \to \mathbb{R}$ (by using Proposition 8.2):

$$h(x_{11}, x_{12}, x_{13}) = x_{11}^2 + (x_{11} + c_{12})^2 + (x_{11} + c_{12} + c_{23})^2 + x_{12}^2 + (x_{12} + c_{13})^2 + x_{13}^2 = F(x_{11}; c_{12}, c_{12} + c_{23}) + F(x_{12}; c_{13}) + x_{13}^2.$$

According to Proposition 8.2, the minimum is attained at

$$x_{11}^* = -\frac{2c_{12} + c_{23}}{3}, \quad x_{12}^* = -\frac{c_{13}}{2}, \quad x_{13}^* = 0,$$

and it is $h(x_{11}^*, x_{12}^*, x_{13}^*) = \left( 3c_{13}^2 + 4(c_{12}^2 + c_{12}c_{23} + c_{23}^2) \right)/6$. Therefore, the least-squares minimal-norm solution is:

$$\widehat{X}_0 = \begin{bmatrix} -(2c_{12} + c_{23})/3 & -c_{13}/2 & 0 \\ c_{11} & (c_{12} - c_{23})/3 & c_{13}/2 \\ (2c_{21} - c_{32} - c_{43})/3 & c_{11} + c_{22} & (c_{12} + 2c_{23})/3 \\ (c_{31} - c_{42})/2 & (c_{21} + c_{32} - 2c_{43})/3 & c_{11} + c_{22} + c_{33} \end{bmatrix}.$$

Note that the solution depends on all entries of the matrix C except c41.
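This closed form can be checked against the nivellateur pseudoinverse route from Section 4; by Theorem 6.1 the two should agree (an illustrative sketch with a random C):

```python
import numpy as np

rng = np.random.default_rng(2)
C = rng.standard_normal((4, 3))
c = lambda i, j: C[i - 1, j - 1]   # 1-based access matching the paper

# Closed-form minimal-norm least-squares solution from Example 6.1.
X0 = np.array([
    [-(2*c(1,2) + c(2,3))/3,         -c(1,3)/2,                       0],
    [c(1,1),                         (c(1,2) - c(2,3))/3,             c(1,3)/2],
    [(2*c(2,1) - c(3,2) - c(4,3))/3, c(1,1) + c(2,2),                 (c(1,2) + 2*c(2,3))/3],
    [(c(3,1) - c(4,2))/2,            (c(2,1) + c(3,2) - 2*c(4,3))/3,  c(1,1) + c(2,2) + c(3,3)],
])

# Minimal-norm least-squares solution via the nivellateur pseudoinverse.
J4, J3 = np.eye(4, k=1), np.eye(3, k=1)
K = np.kron(np.eye(3), J4) - np.kron(J3.T, np.eye(4))
Xp = (np.linalg.pinv(K) @ C.flatten(order='F')).reshape((4, 3), order='F')
print(np.allclose(X0, Xp))  # True
```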

Example 6.2. Let us find the minimum-norm least-squares solution of the equation $J_3(0)X - XJ_4(0) = C$, $C \in \mathbb{R}^{3 \times 4}$; i.e. we want to find all $X \in \mathbb{R}^{3 \times 4}$ such that $\|\Delta(X)\|_F^2 = \|J_3(0)X - XJ_4(0) - C\|_F^2$ is minimal, and among them such $X$ which is the minimal-Frobenius-norm matrix. We can arrange the small diagonals of the matrix $C$ as follows (below the array are the indices of the small diagonals; each column is one small diagonal):

$$\begin{array}{cccccc} & & c_{11} & c_{12} & c_{13} & c_{14} \\ & c_{21} & c_{22} & c_{23} & c_{24} & \\ c_{31} & c_{32} & c_{33} & c_{34} & & \\ -2 & -1 & 0 & 1 & 2 & 3 \end{array}$$

and let us denote by $\sigma_k$, $k = \overline{1,3}$, the sum over the $(-3+k)$-th small diagonal of the matrix $C$, i.e.

$$\sigma_1 = c_{31}, \quad \sigma_2 = c_{21} + c_{32}, \quad \sigma_3 = c_{11} + c_{22} + c_{33}.$$

We can arrange the small diagonals of the matrix $J_3(0)X - XJ_4(0) - C$ as well:

$$\begin{array}{ccc|c|cc} & & x_{21}-c_{11} & x_{22}-x_{11}-c_{12} & x_{23}-x_{12}-c_{13} & x_{24}-x_{13}-c_{14} \\ & x_{31}-c_{21} & x_{32}-x_{21}-c_{22} & x_{33}-x_{22}-c_{23} & x_{34}-x_{23}-c_{24} & \\ -c_{31} & -x_{31}-c_{32} & -x_{32}-c_{33} & -x_{33}-c_{34} & & \end{array}$$

Vertical bars partition the small diagonals according to the entries they contain. By using the properties of the Frobenius norm, it is clear that $\|J_3(0)X - XJ_4(0) - C\|_F^2$ can be obtained as the sum of squared entries over each of the columns. For the left submatrix, those sums are precisely the function $M_1$ applied to each of its columns except the first one (we list the unknown variables $x_{ij}$ and the parameters $c_{ij}$ from top to bottom); for the central submatrix we have the function $M_3$, while for the right submatrix the function $M_4$ is applied to each of its columns. Hence, we have

$$H(x_{11}, \dots, x_{34}) = (-c_{31})^2 + \left[ (x_{31} - c_{21})^2 + (-x_{31} - c_{32})^2 \right] + \left[ (x_{21} - c_{11})^2 + (x_{32} - x_{21} - c_{22})^2 + (-x_{32} - c_{33})^2 \right] + \left[ (x_{22} - x_{11} - c_{12})^2 + (x_{33} - x_{22} - c_{23})^2 + (-x_{33} - c_{34})^2 \right] + \left[ (x_{23} - x_{12} - c_{13})^2 + (x_{34} - x_{23} - c_{24})^2 \right] + (x_{24} - x_{13} - c_{14})^2 =$$
$$= c_{31}^2 + M_1(x_{31}; c_{21}, c_{32}) + M_1(x_{21}, x_{32}; c_{11}, c_{22}, c_{33}) + M_3(x_{11}, x_{22}, x_{33}; c_{12}, c_{23}, c_{34}) +$$

$$+ M_4(x_{12}, x_{23}, x_{34}; c_{13}, c_{24}) + M_4(x_{13}, x_{24}; c_{14}).$$

By Proposition 8.1, we obtain the unique $\widehat{x}_{11}, \widehat{x}_{21}, \widehat{x}_{22}, \widehat{x}_{31}, \widehat{x}_{32}, \widehat{x}_{33}$, given by:

$$\widehat{x}_{31} = \frac{1}{2}(c_{21} - c_{32}),$$
$$\widehat{x}_{21} = \frac{1}{3}(2c_{11} - c_{22} - c_{33}), \quad \widehat{x}_{32} = \frac{1}{3}(c_{11} + c_{22} - 2c_{33}),$$
$$\widehat{x}_{11} = -(c_{12} + c_{23} + c_{34}), \quad \widehat{x}_{22} = -(c_{23} + c_{34}), \quad \widehat{x}_{33} = -c_{34},$$

such that $M_1(\widehat{x}_{31}; c_{21}, c_{32}) = \sigma_2^2/2$ and $M_1(\widehat{x}_{21}, \widehat{x}_{32}; c_{11}, c_{22}, c_{33}) = \sigma_3^2/3$, while $\widehat{x}_{12}, \widehat{x}_{13}, \widehat{x}_{14}, \widehat{x}_{23}, \widehat{x}_{24}, \widehat{x}_{34}$ depend on the parameters denoted by $x_{14}, x_{24}, x_{34}$ (this is an important difference from the previous example!):

$$\widehat{x}_{34} = x_{34}, \quad \widehat{x}_{23} = x_{34} - c_{24}, \quad \widehat{x}_{12} = x_{34} - c_{13} - c_{24},$$
$$\widehat{x}_{24} = x_{24}, \quad \widehat{x}_{13} = x_{24} - c_{14},$$
$$\widehat{x}_{14} = x_{14};$$

for these, the minima of all $M_3$ and $M_4$ are zero. Therefore, the minimum of the function $H$ is:

$$H(\widehat{x}_{11}, \dots, \widehat{x}_{34}) = \sigma_1^2 + \frac{\sigma_2^2}{2} + \frac{\sigma_3^2}{3} = \sum_{k=1}^{3} \frac{\sigma_k^2}{k}.$$

It is clear that $H(\widehat{x}_{11}, \dots, \widehat{x}_{34}) = 0 \Leftrightarrow \sigma_k = 0$, $k = \overline{1,3}$, which is precisely the consistency condition (Theorem 2.5 from [7]). We can decompose such $\widehat{X}$ as:

$$\widehat{X} = \begin{bmatrix} -(c_{12} + c_{23} + c_{34}) & x_{34} - c_{13} - c_{24} & x_{24} - c_{14} & x_{14} \\ \frac{1}{3}(2c_{11} - c_{22} - c_{33}) & -(c_{23} + c_{34}) & x_{34} - c_{24} & x_{24} \\ \frac{1}{2}(c_{21} - c_{32}) & \frac{1}{3}(c_{11} + c_{22} - 2c_{33}) & -c_{34} & x_{34} \end{bmatrix} =$$

$$= \begin{bmatrix} 0 & x_{34} & x_{24} & x_{14} \\ 0 & 0 & x_{34} & x_{24} \\ 0 & 0 & 0 & x_{34} \end{bmatrix} + \begin{bmatrix} -(c_{12} + c_{23} + c_{34}) & -(c_{13} + c_{24}) & -c_{14} & 0 \\ -(c_{22} + c_{33}) & -(c_{23} + c_{34}) & -c_{24} & 0 \\ -c_{32} & -c_{33} & -c_{34} & 0 \end{bmatrix} + \begin{bmatrix} 0 & 0 & 0 & 0 \\ \frac{2\sigma_3}{3} & 0 & 0 & 0 \\ \frac{\sigma_2}{2} & \frac{\sigma_3}{3} & 0 & 0 \end{bmatrix} = X_h + X_p + X_c,$$

where $X_h$, $X_p$ and $X_c$ are as in Theorem 5.1. Also, remark that:

$$J_3(0)\widehat{X} - \widehat{X}J_4(0) - C = \begin{bmatrix} -\frac{\sigma_3}{3} & 0 & 0 & 0 \\ -\frac{\sigma_2}{2} & -\frac{\sigma_3}{3} & 0 & 0 \\ -\sigma_1 & -\frac{\sigma_2}{2} & -\frac{\sigma_3}{3} & 0 \end{bmatrix} = \begin{bmatrix} -p_2(J_3(0))^T & 0_{3 \times 1} \end{bmatrix},$$

where $p_2(t) = \sigma_3/3 + (\sigma_2/2) \cdot t + \sigma_1 \cdot t^2$. In order to obtain the minimum-norm solution among all least-squares solutions, let us arrange the small diagonals of $\widehat{X}$ as follows (vertical bars separate those small diagonals which take part in the minimization from the others):

$$\begin{array}{ccc|ccc} & & -(c_{12} + c_{23} + c_{34}) & x_{34} - c_{13} - c_{24} & x_{24} - c_{14} & x_{14} \\ & \frac{2\sigma_3}{3} - c_{22} - c_{33} & -(c_{23} + c_{34}) & x_{34} - c_{24} & x_{24} & \\ \frac{\sigma_2}{2} - c_{32} & \frac{\sigma_3}{3} - c_{33} & -c_{34} & x_{34} & & \end{array}$$

There are three parameters, $x_{14}$, $x_{24}$ and $x_{34}$, which can be used for the minimization of the sums over the columns of the right submatrix. In order to obtain such a solution, we should minimize the following function $h : \mathbb{R}^3 \to \mathbb{R}$ (by using Proposition 8.2):

$$h(x_{14}, x_{24}, x_{34}) = x_{34}^2 + (x_{34} - c_{24})^2 + (x_{34} - c_{24} - c_{13})^2 + x_{24}^2 + (x_{24} - c_{14})^2 + x_{14}^2 = G(x_{34}; -c_{24}, -c_{13}) + G(x_{24}; -c_{14}) + x_{14}^2.$$

According to Proposition 8.2, the minimum is attained at

$$x_{34}^* = \frac{2c_{24} + c_{13}}{3}, \quad x_{24}^* = \frac{c_{14}}{2}, \quad x_{14}^* = 0,$$

and it is $h(x_{14}^*, x_{24}^*, x_{34}^*) = \left( 3c_{14}^2 + 4(c_{24}^2 + c_{24}c_{13} + c_{13}^2) \right)/6$. Therefore, the least-squares minimal-norm solution is:

$$\widehat{X}_0 = \begin{bmatrix} -(c_{12} + c_{23} + c_{34}) & -(2c_{13} + c_{24})/3 & -c_{14}/2 & 0 \\ (2c_{11} - c_{22} - c_{33})/3 & -(c_{23} + c_{34}) & (c_{13} - c_{24})/3 & c_{14}/2 \\ (c_{21} - c_{32})/2 & (c_{11} + c_{22} - 2c_{33})/3 & -c_{34} & (c_{13} + 2c_{24})/3 \end{bmatrix}.$$

Note that the solution depends on all entries of the matrix C except c31.

7. Return to the Main Case

We have thoroughly examined the simplest case, so we can return to our original problem.

$$\min_X \|AX - XB - C\|_F^2 \le \alpha^2(S,T) \min_Y \|J_A Y - Y J_B - D\|_F^2 = \alpha^2(S,T) \sum_{i=1}^{s} \sum_{u=1}^{k_i} \sum_{v=1}^{\ell_i} \min_{Y_{uv}^{(ii)}} \left\| J_{p_{i,u}}(0) Y_{uv}^{(ii)} - Y_{uv}^{(ii)} J_{q_{i,v}}(0) - D_{uv}^{(ii)} \right\|_F^2 =$$

$$= \alpha^2(S,T) \sum_{i=1}^{s} \sum_{u=1}^{k_i} \sum_{v=1}^{\ell_i} \sum_{k=1}^{\min\{p_{i,u},\, q_{i,v}\}} \frac{\sigma_k^2(D_{uv}^{(ii)})}{k}.$$

If the right-hand side is zero, i.e.

$$\sigma_k(D_{uv}^{(ii)}) = 0, \quad k = \overline{1, \min\{p_{i,u},\, q_{i,v}\}}, \quad i = \overline{1,s},$$

then the equation is consistent. This result is independent of the choice of the matrices $S$ and $T$ (except for the requirement that the proper Jordan blocks should be in appropriate positions). If the right-hand side is not zero, we at least have an upper bound (which need not be the best possible!) for the error. Remark that if $A = SJ_AS^{-1}$, then $A = (\gamma S)J_A(\gamma S)^{-1}$ for any $\gamma \ne 0$, so one can further analyze the quantity $\alpha(S,T)$.
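Without computing Jordan forms (which is numerically delicate), consistency can also be tested directly on the vectorized system (an illustrative sketch; the tolerance is an assumption):

```python
import numpy as np

def sylvester_consistent(A, B, C, tol=1e-10):
    """Numerical consistency test for A X - X B = C: the equation is
    consistent iff vec(C) lies in the range of the nivellateur."""
    m, n = A.shape[0], B.shape[0]
    K = np.kron(np.eye(n), A) - np.kron(B.T, np.eye(m))
    c = C.flatten(order='F')
    residual = K @ (np.linalg.pinv(K) @ c) - c   # zero iff c is in R(K)
    return np.linalg.norm(residual) <= tol * max(1.0, np.linalg.norm(c))
```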

8. Appendix: Some Auxiliary Results on the Minimum of the Functions

Suppose that $C \in \mathbb{C}^{m \times n}$ is a complex matrix. It can be written in a unique way as $C = C' + iC''$, where $C', C'' \in \mathbb{R}^{m \times n}$. Note that $\|C\|_F^2 = \|C'\|_F^2 + \|C''\|_F^2$. Because of:

$$\Delta(X; C) := \|J_m(0)X - XJ_n(0) - C\|_F^2 = \|J_m(0)(X' + iX'') - (X' + iX'')J_n(0) - (C' + iC'')\|_F^2 = \|J_m(0)X' - X'J_n(0) - C' + i(J_m(0)X'' - X''J_n(0) - C'')\|_F^2 = \|J_m(0)X' - X'J_n(0) - C'\|_F^2 + \|J_m(0)X'' - X''J_n(0) - C''\|_F^2 = \Delta(X'; C') + \Delta(X''; C''),$$

it is enough to consider the minimization of the appropriate real function. The following two results can be easily proven by elementary calculations, but since they are the backbone of Theorems 5.1–6.1, we formulate them as propositions.

Proposition 8.1. 1) The function $M_1 : \mathbb{R}^n \to \mathbb{R}$, depending on $n+1$ real parameters $a_1, \dots, a_{n+1}$,

$$M_1(x_1, \dots, x_n; a_1, \dots, a_n, a_{n+1}) = (x_1 - a_1)^2 + \sum_{k=1}^{n-1}(x_{k+1} - x_k - a_{k+1})^2 + (x_n + a_{n+1})^2,$$

attains its minimum at the uniquely determined arguments

$$\widehat{x}_k = \frac{1}{n+1}\left( (n+1-k)\sum_{i=1}^{k} a_i - k\sum_{i=k+1}^{n+1} a_i \right) = \sum_{i=1}^{k} a_i - \frac{k}{n+1}\sum_{i=1}^{n+1} a_i, \quad k = \overline{1,n},$$

and this minimal value is

$$M_1(\widehat{x}_1, \dots, \widehat{x}_n; a_1, \dots, a_{n+1}) = \frac{1}{n+1}\left( \sum_{i=1}^{n+1} a_i \right)^2.$$

2) The function $M_2 : \mathbb{R}^n \to \mathbb{R}$, depending on $n$ real parameters $a_1, \dots, a_n$,

$$M_2(x_1, \dots, x_n; a_1, \dots, a_n) = (x_1 - a_1)^2 + \sum_{k=1}^{n-1}(x_{k+1} - x_k - a_{k+1})^2,$$

attains its minimum 0 at the uniquely determined arguments

$$\widehat{x}_k = \sum_{i=1}^{k} a_i, \quad k = \overline{1,n}.$$

3) The function $M_3 : \mathbb{R}^n \to \mathbb{R}$, depending on $n$ real parameters $a_1, \dots, a_n$,

$$M_3(x_1, \dots, x_n; a_1, \dots, a_n) = \sum_{k=1}^{n-1}(x_{k+1} - x_k - a_k)^2 + (x_n + a_n)^2,$$

attains its minimum 0 at the uniquely determined arguments

$$\widehat{x}_k = -\sum_{i=k}^{n} a_i, \quad k = \overline{1,n}.$$

4) The function $M_4 : \mathbb{R}^n \to \mathbb{R}$, depending on $n-1$ real parameters $a_1, \dots, a_{n-1}$,

$$M_4(x_1, \dots, x_n; a_1, \dots, a_{n-1}) = \sum_{k=1}^{n-1}(x_{k+1} - x_k - a_k)^2,$$

attains its minimum 0 on the one-parameter set of arguments

$$\widehat{x}_k = x_1 + \sum_{i=2}^{k} a_{i-1}, \quad k = \overline{2,n},$$

or

$$\widehat{x}_k = x_n - \sum_{i=k}^{n-1} a_i, \quad k = \overline{1,n-1}.$$
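A quick numerical spot-check of part 1) (an illustrative sketch; scipy.optimize.minimize is used only to confirm the closed-form minimizer for random parameters):

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(3)
n = 4
a = rng.standard_normal(n + 1)

def M1(x):
    return ((x[0] - a[0]) ** 2
            + sum((x[k + 1] - x[k] - a[k + 1]) ** 2 for k in range(n - 1))
            + (x[-1] + a[n]) ** 2)

# Closed form: x_k = sum_{i<=k} a_i - (k/(n+1)) * sum_i a_i.
x_hat = np.cumsum(a[:n]) - (np.arange(1, n + 1) / (n + 1)) * a.sum()
print(np.isclose(M1(x_hat), a.sum() ** 2 / (n + 1)))                # minimal value
print(np.allclose(minimize(M1, np.zeros(n)).x, x_hat, atol=1e-5))   # argmin matches
```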

Proposition 8.2. 1) The function $F : \mathbb{R} \to \mathbb{R}$, depending on $n$ real parameters $a_1, \dots, a_n$:

$$F(x; a_1, \dots, a_n) = x^2 + (x + a_1)^2 + \dots + (x + a_n)^2,$$

attains its minimum at

$$x^* = -\frac{a_1 + \dots + a_n}{n+1},$$

and this minimal value is

 2   Xn Xn  Xn X  2 1   1  2  F(x∗; a1, ..., an) = a  ak = n a 2 ajak . k − n + 1   n + 1  j −  k=1 k=1 j=1 j

2) The function $G(x; a_1, \dots, a_n) = F(x; a_1, a_1 + a_2, \dots, a_1 + a_2 + \dots + a_n)$, depending on $n$ real parameters $a_1, \dots, a_n$, attains its minimum at

$$x^* = -\sum_{j=1}^{n} \frac{n+1-j}{n+1}\, a_j,$$

and

$$G(x^*; a_1, \dots, a_n) = \sum_{k=1}^{n} \left( \sum_{j=1}^{k} a_j \right)^2 - \frac{1}{n+1}\left( \sum_{k=1}^{n} \sum_{j=1}^{k} a_j \right)^2.$$

References

[1] S. Albeverio, A. K. Motovilov and A. A. Shkalikov, Bounds on variation of spectral subspaces under J-self-adjoint perturbations, Int. Equ. Oper. Theory 64 (2009), 455–486.
[2] W. Arendt, F. Räbiger and A. Sourour, Spectral properties of the operator equation AX + XB = Y, Quart. J. Math. Oxford (2) 45 (1994), 133–149.
[3] A. Ben-Israel and T. N. E. Greville, Generalized inverses, theory and applications, 2nd edition, Springer, 2003.
[4] P. Benner, Factorized solutions of Sylvester equations with applications in control, in Proc. of the 16th International Symposium on Mathematical Theory of Networks and Systems (MTNS 2004), 2004.
[5] R. Bhatia and P. Rosenthal, How and why to solve the operator equation AX − XB = Y, Bull. London Math. Soc. 29 (1997), 1–21.
[6] R. Byers, Solving the algebraic Riccati equation with the matrix sign function, Linear Algebra Appl. 85 (1987), 267–279.
[7] N. Č. Dinčić, Solving the Sylvester equation AX − XB = C when σ(A) ∩ σ(B) ≠ ∅, Electron. J. Linear Algebra 35 (2019), 1–23.

[8] B. D. Djordjević and N. Č. Dinčić, Solving the operator equation AX − XB = C with closed A and B, Integral Equations and Operator Theory (2018) 90:51.
[9] M. P. Drazin, On a result of J. J. Sylvester, Linear Algebra Appl. 505 (2016), 361–366.
[10] F. R. Gantmacher, The theory of matrices, vol. 1, Chelsea Pub. Co., 1959.
[11] R. E. Hartwig and P. Patrício, Group inverse of the nivellateur, Linear and Multilinear Algebra 66 (12) (2018), 2409–2420.
[12] R. A. Horn and C. R. Johnson, Matrix Analysis, Cambridge University Press, 1985.
[13] Q. Hu, D. Cheng, The polynomial solution to the Sylvester matrix equation, Applied Mathematics Letters 19 (9) (2006), 859–864.
[14] D. Y. Hu, L. Reichel, Krylov-subspace methods for the Sylvester equation, Linear Algebra Appl. 172 (1992), 283–313.
[15] X. Jin, Y. Wei, Numerical linear algebra and its applications, Science Press USA Inc., Beijing, 2005.
[16] N. T. Lan, On the operator equation AX − XB = C with unbounded operators A, B and C, Abstract and Applied Analysis 6 (6) (2001), 317–328.
[17] G. Lumer, M. Rosenblum, Linear operator equations, Proc. Amer. Math. Soc. 10 (1959), 32–41.
[18] E.-C. Ma, A finite series solution of the matrix equation AX − XB = C, SIAM J. Appl. Math. 14 (3) (1966), 490–495.
[19] S. Mecheri, Why we solve the operator equation AX − XB = C, preprint (www.researchgate.net/publication/240626459_Why_we_solve_the_operator_equation_AX_XB_C).
[20] V. Q. Phóng, The operator equation AX − XB = C with unbounded operators A and B and related abstract Cauchy problems, Math. Z. 208 (1991), 567–588.
[21] J. D. Roberts, Linear model reduction and solution of the algebraic Riccati equation by use of the sign function, Internat. J. Control 32 (1980), 677–687.
[22] M. Rosenblum, On the operator equation BX − XA = Q, Duke Math. J. 23 (1956), 263–270.
[23] J. J. Sylvester, Sur l'équation en matrices px = xq, C. R. Acad. Sci. Paris 99 (1884), 67–71 and 115–116.