Communications and Control Engineering

Series editors: Alberto Isidori, Roma, Italy; Jan H. van Schuppen, Amsterdam, The Netherlands; Eduardo D. Sontag, Piscataway, USA; Miroslav Krstic, La Jolla, USA

More information about this series at http://www.springer.com/series/61

Ai-Guo Wu · Ying Zhang

Complex Conjugate Matrix Equations for Systems and Control

Ai-Guo Wu, Harbin Institute of Technology, Shenzhen, University Town of Shenzhen, Shenzhen, China
Ying Zhang, Harbin Institute of Technology, Shenzhen, University Town of Shenzhen, Shenzhen, China

ISSN 0178-5354 ISSN 2197-7119 (electronic) Communications and Control Engineering ISBN 978-981-10-0635-7 ISBN 978-981-10-0637-1 (eBook) DOI 10.1007/978-981-10-0637-1

Library of Congress Control Number: 2016942040

Mathematics Subject Classification (2010): 15A06, 11Cxx

© Springer Science+Business Media Singapore 2017 This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed. The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use. The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, express or implied, with respect to the material contained herein or for any errors or omissions that may have been made.

Printed on acid-free paper

This Springer imprint is published by Springer Nature. The registered company is Springer Science+Business Media Singapore Pte Ltd.

To our supervisor, Prof. Guang-Ren Duan

To Hong-Mei, and Yi-Tian (Ai-Guo Wu)

To Rui, and Qi-Yu (Ying Zhang)

Preface

Theory of matrix equations is an important branch of mathematics, and has broad applications in many engineering fields, such as control theory, information theory, and signal processing. Specifically, algebraic Lyapunov matrix equations play vital roles in stability analysis for linear systems, and coupled Lyapunov matrix equations appear in the analysis of Markovian jump linear systems; algebraic Riccati equations are encountered in optimal control. Due to these reasons, matrix equations have been extensively investigated by many scholars from various fields, and the literature on matrix equations is now very rich. Matrix equations are often covered in books on linear algebra, matrix analysis, and numerical analysis. We list several books here, for example, Topics in Matrix Analysis by R.A. Horn and C.R. Johnson [143], The Theory of Matrices by P. Lancaster and M. Tismenetsky [172], and Matrix Analysis and Applied Linear Algebra by C.D. Meyer [187]. In addition, there are some books on special matrix equations, for example, Lyapunov Matrix Equations in System Stability and Control by Z. Gajic [128], Matrix Riccati Equations in Control and Systems Theory by H. Abou-Kandil [2], and Generalized Sylvester Equations: Unified Parametric Solutions by Guang-Ren Duan [90]. It should be pointed out that all the matrix equations investigated in the aforementioned books are in the real domain. By now, it seems that there is no book on complex matrix equations with the conjugate of unknown matrices. For convenience, this class of equations is called complex conjugate matrix equations. The first author of this book and his collaborators began to consider complex matrix equations with the conjugate of unknown matrices in 2005, inspired by the work [155] of Jiang published in Linear Algebra and Its Applications. Since then, he and his collaborators have published many papers on complex conjugate matrix equations. Recently, the second author of this book joined this field, and has obtained some interesting results. In addition, some complex conjugate matrix equations have found applications in the analysis and design of antilinear systems. This book aims to provide a relatively systematic introduction to complex conjugate matrix equations and their applications in discrete-time antilinear systems.


The book has 12 chapters. In Chap. 1, a survey is first given on linear matrix equations, and then recent developments on complex conjugate matrix equations are summarized. Some mathematical preliminaries to be used in this book are collected in Chap. 2. Besides these two chapters, the rest of this book is partitioned into three parts. The first part contains Chaps. 3–5, and focuses on iterative solutions for several types of complex conjugate matrix equations. The second part consists of Chaps. 6–10, and focuses on explicit closed-form solutions for some complex conjugate matrix equations. In the third part, including Chaps. 11 and 12, several applications of complex conjugate matrix equations are considered. In Chap. 11, stability analysis of discrete-time antilinear systems is investigated, and some stability criteria are given in terms of anti-Lyapunov matrix equations, which are special complex conjugate matrix equations. In Chap. 12, some feedback design problems are solved for discrete-time antilinear systems by using several types of complex conjugate matrix equations. Except for part of Chap. 2 and Subsection 6.1.1, the material of this book is based on our own research work, including some unpublished results. The intended audience of this monograph includes students and researchers in the areas of control theory, linear algebra, communication, numerical analysis, and so on. An appropriate background for this monograph would be a first course on linear algebra and linear systems theory. Since the 1980s, many researchers have devoted much effort to complex conjugate matrix equations, and much contribution has been made to this area. Owing to space limitations and the organization of the book, many of their published results are not included or even not cited. We extend our apologies to these researchers. It is under the supervision of our Ph.D. advisor, Prof. Guang-Ren Duan at Harbin Institute of Technology (HIT), that we entered the field of matrix equations and their applications in control systems design. Moreover, Prof. Duan has also made much contribution to the investigation of complex conjugate matrix equations, and has coauthored many papers with the first author. Some results in these papers have been included in this book. Therefore, at the beginning of preparing the manuscript, we intended to list Prof. Duan as the first author of this book due to his contribution to complex conjugate matrix equations. However, he thought that he had not contributed to the writing of this book, and thus should not be an author. Here, we wish to express our sincere gratitude and appreciation to Prof. Duan for his magnanimity and selflessness. We also would like to express our profound gratitude to Prof. Duan for his careful guidance, wholehearted support, insightful comments, and great contribution. We also would like to give appreciation to our colleague, Prof. Bin Zhou of HIT, for his help. The first author coauthored some papers included in this book with Prof. Gang Feng when he visited City University of Hong Kong as a Research Fellow. The first author would like to express his sincere gratitude to Prof. Feng for his help and contribution. Dr. Yan-Ming Fu, Dr. Ming-Zhe Hou, Mr. Yang-Yang Qian, and Dr. Ling-Ling Lv have also coauthored with the first author a few papers included in this book. The first author extends his great thanks to all of them for their contribution.

Great thanks also go to Mr. Yang-Yang Qian and Mr. Ming-Fang Chang, Ph.D. students of the first author, who have helped us in typing a few sections of the manuscript. In addition, Mr. Fang-Zhou Fu, Miss Dan Guo, Miss Xiao-Yan He, Mr. Zhen-Peng Zeng, and Mr. Tian-Long Qin, Master students of the first author, together with Mr. Yang-Yang Qian and Mr. Ming-Fang Chang, have provided tremendous help in finding errors and typos in the manuscript. Their help has significantly improved the quality of the manuscript, and is much appreciated. The first author would like to thank his wife, Ms. Hong-Mei Wang, and the second author her husband, Dr. Rui Zhang, for their constant support in every aspect. Part of the book was written when the first author visited the University of Western Australia (UWA) from July 2013 to July 2014. The first author would like to thank Prof. Victor Sreeram at UWA for his help and invaluable suggestions. We would like to gratefully acknowledge the financial support kindly provided by the National Natural Science Foundation of China under Grant Nos. 60974044 and 61273094, by the Program for New Century Excellent Talents in University under Grant No. NCET-11-0808, by the Foundation for the Author of National Excellent Doctoral Dissertation of China under Grant No. 201342, by the Specialized Research Fund for the Doctoral Program of Higher Education under Grant Nos. 20132302110053 and 20122302120069, by the Foundation for Creative Research Groups of the National Natural Science Foundation of China under Grant Nos. 61021002 and 61333003, by the National Program on Key Basic Research Project (973 Program) under Grant No. 2012CB821205, by the Project for Distinguished Young Scholars of the Basic Research Plan in Shenzhen City under Contract No. JCJ201110001, and by the Key Laboratory of Electronics Engineering, College of Heilongjiang Province (Heilongjiang University). Lastly, we thank in advance all the readers for choosing to read this book. We would much appreciate it if readers could provide feedback, via email to [email protected], about any problems found.

July 2015

Ai-Guo Wu
Ying Zhang

Contents

1 Introduction
  1.1 Linear Equations
  1.2 Univariate Linear Matrix Equations
    1.2.1 Lyapunov Matrix Equations
    1.2.2 Kalman-Yakubovich and Normal Sylvester Matrix Equations
    1.2.3 Other Matrix Equations
  1.3 Multivariate Linear Matrix Equations
    1.3.1 Roth Matrix Equations
    1.3.2 First-Order Generalized Sylvester Matrix Equations
    1.3.3 Second-Order Generalized Sylvester Matrix Equations
    1.3.4 High-Order Generalized Sylvester Matrix Equations
    1.3.5 Linear Matrix Equations with More Than Two Unknowns
  1.4 Coupled Linear Matrix Equations
  1.5 Complex Conjugate Matrix Equations
  1.6 Overview of This Monograph
2 Mathematical Preliminaries
  2.1 Kronecker Products
  2.2 Leverrier Algorithms
  2.3 Generalized Leverrier Algorithms
  2.4 Singular Value Decompositions
  2.5 Vector Norms and Operator Norms
    2.5.1 Vector Norms
    2.5.2 Operator Norms
  2.6 A Real Representation of a Complex Matrix
    2.6.1 Basic Properties
    2.6.2 Proof of Theorem 2.7


  2.7 Consimilarity
  2.8 Real Linear Spaces and Real Linear Mappings
    2.8.1 Real Linear Spaces
    2.8.2 Real Linear Mappings
  2.9 Real Inner Product Spaces
  2.10 Optimization in Complex Domain
  2.11 Notes and References

Part I  Iterative Solutions

3 Smith-Type Iterative Approaches
  3.1 Infinite Series Form of the Unique Solution
  3.2 Smith Iterations
  3.3 Smith (l) Iterations
  3.4 Smith Accelerative Iterations
  3.5 An Illustrative Example
  3.6 Notes and References
4 Hierarchical-Update-Based Iterative Approaches
  4.1 Extended Con-Sylvester Matrix Equations
    4.1.1 The Matrix Equation AXB + CXD = F
    4.1.2 A General Case
    4.1.3 Numerical Examples
  4.2 Coupled Con-Sylvester Matrix Equations
    4.2.1 Iterative Algorithms
    4.2.2 Convergence Analysis
    4.2.3 A More General Case
    4.2.4 A Numerical Example
  4.3 Complex Conjugate Matrix Equations with Transpose of Unknowns
    4.3.1 Convergence Analysis
    4.3.2 A Numerical Example
  4.4 Notes and References
5 Finite Iterative Approaches
  5.1 Generalized Con-Sylvester Matrix Equations
    5.1.1 Main Results
    5.1.2 Some Special Cases
    5.1.3 Numerical Examples
  5.2 Extended Con-Sylvester Matrix Equations
    5.2.1 The Matrix Equation AXB + CXD = F
    5.2.2 A General Case
    5.2.3 Numerical Examples

  5.3 Coupled Con-Sylvester Matrix Equations
    5.3.1 Iterative Algorithms
    5.3.2 Convergence Analysis
    5.3.3 A More General Case
    5.3.4 Numerical Examples
    5.3.5 Proofs of Lemmas 5.15 and 5.16
  5.4 Notes and References

Part II  Explicit Solutions

6 Real-Representation-Based Approaches
  6.1 Normal Con-Sylvester Matrix Equations
    6.1.1 Solvability Conditions
    6.1.2 Uniqueness Conditions
    6.1.3 Solutions
  6.2 Con-Kalman-Yakubovich Matrix Equations
    6.2.1 Solvability Conditions
    6.2.2 Solutions
  6.3 Con-Sylvester Matrix Equations
  6.4 Con-Yakubovich Matrix Equations
  6.5 Extended Con-Sylvester Matrix Equations
  6.6 Generalized Con-Sylvester Matrix Equations
  6.7 Notes and References
7 Polynomial-Matrix-Based Approaches
  7.1 Homogeneous Con-Sylvester Matrix Equations
  7.2 Nonhomogeneous Con-Sylvester Matrix Equations
    7.2.1 The First Approach
    7.2.2 The Second Approach
  7.3 Con-Yakubovich Matrix Equations
    7.3.1 The First Approach
    7.3.2 The Second Approach
  7.4 Extended Con-Sylvester Matrix Equations
    7.4.1 Basic Solutions
    7.4.2 Equivalent Forms
    7.4.3 Further Discussion
    7.4.4 Illustrative Examples
  7.5 Generalized Con-Sylvester Matrix Equations
    7.5.1 Basic Solutions
    7.5.2 Equivalent Forms
    7.5.3 Special Solutions
    7.5.4 An Illustrative Example
  7.6 Notes and References

8 Unilateral-Equation-Based Approaches
  8.1 Con-Sylvester Matrix Equations
  8.2 Con-Yakubovich Matrix Equations
  8.3 Nonhomogeneous Con-Sylvester Matrix Equations
  8.4 Notes and References
9 Conjugate Products
  9.1 Complex Polynomial Ring (C[s], +, ~)
  9.2 Division with Remainder in (C[s], +, ~)
  9.3 Greatest Common Divisors in (C[s], +, ~)
  9.4 Coprimeness in (C[s], +, ~)
  9.5 Conjugate Products of Polynomial Matrices
  9.6 Unimodular Matrices and Smith Normal Form
  9.7 Greatest Common Divisors
  9.8 Coprimeness of Polynomial Matrices
  9.9 Conequivalence and Consimilarity
  9.10 An Example
  9.11 Notes and References
10 Con-Sylvester-Sum-Based Approaches
  10.1 Con-Sylvester Sum
  10.2 Con-Sylvester- Equations
    10.2.1 Homogeneous Case
    10.2.2 Nonhomogeneous Case
  10.3 An Illustrative Example
  10.4 Notes and References

Part III  Applications in Systems and Control

11 Stability for Antilinear Systems
  11.1 Stability for Discrete-Time Antilinear Systems
  11.2 Stochastic Stability for Markovian Antilinear Systems
  11.3 Solutions to Coupled Anti-Lyapunov Equations
    11.3.1 Explicit Iterative Algorithms
    11.3.2 Implicit Iterative Algorithms
    11.3.3 An Illustrative Example
  11.4 Notes and References
    11.4.1 Summary
    11.4.2 A Brief Overview
12 Feedback Design for Antilinear Systems
  12.1 Generalized Eigenstructure Assignment
  12.2 Model Reference Tracking Control
    12.2.1 Tracking Conditions
    12.2.2 Solution to the Feedback Stabilizing Gain
    12.2.3 Solution to the Feedforward Compensation Gain
    12.2.4 An Example

  12.3 Finite Horizon Quadratic Regulation
  12.4 Infinite Horizon Quadratic Regulation
  12.5 Notes and References
    12.5.1 Summary
    12.5.2 A Brief Overview

References

Index

Notation

Notation Related to Subspaces

Z             Set of all integer numbers
R             Set of all real numbers
C             Set of all complex numbers
R^n           Set of all real vectors of dimension n
C^n           Set of all complex vectors of dimension n
R^{m×n}       Set of all real matrices of dimension m × n
C^{m×n}       Set of all complex matrices of dimension m × n
R^{m×n}[s]    Set of all polynomial matrices of dimension m × n with real coefficients
C^{m×n}[s]    Set of all polynomial matrices of dimension m × n with complex coefficients
Ker           The kernel of a mapping
Image         The image of a mapping
rdim          The real dimension of a real linear space

Notation Related to Vectors and Matrices

0_n                Zero vector in R^n
0_{m×n}            Zero matrix in R^{m×n}
I_n                Identity matrix of order n
[a_{ij}]_{m×n}     Matrix of dimension m × n with the i-th row and j-th column element being a_{ij}
A^{-1}             Inverse matrix of matrix A
A^T                Transpose of matrix A
Ā                  Complex conjugate of matrix A
A^H                Transposed complex conjugate of matrix A
diag_{i=1}^{n} A_i The matrix whose i-th block diagonal is A_i


Re(A)          Real part of matrix A
Im(A)          Imaginary part of matrix A
det(A)         Determinant of matrix A
adj(A)         Adjoint of matrix A
tr(A)          Trace of matrix A
rank(A)        Rank of matrix A
vec(A)         Vectorization of matrix A
⊗              Kronecker product of two matrices
ρ(A)           Spectral radius of matrix A
λ(A)           Set of the eigenvalues of matrix A
λ_min(A)       The minimal eigenvalue of matrix A
λ_max(A)       The maximal eigenvalue of matrix A
σ_max(A)       The maximal singular value of matrix A
‖A‖_2          2-norm of matrix A
‖A‖            Frobenius norm of matrix A
A^{→k}         The k-th right alternating power of matrix A

A^{←k}         The k-th left alternating power of matrix A
λ(E, A)        Set of the finite eigenvalues of matrix pair (E, A)

Other Notation

E              Mathematical expectation
i              The imaginary unit
I[m, n]        The set of integers from m to n
min            The minimum value in a set
max            The maximum value in a set
~              Conjugate product of two polynomial matrices
F ·            Con-Sylvester sum
⟺             If and only if
∅              Empty set

Chapter 1
Introduction

The theory of matrix equations is an active research topic in matrix algebra, and has been extensively investigated by many researchers. Different matrix equations have wide applications in various areas, such as communication, signal processing, and control theory. Specifically, Lyapunov matrix equations are often encountered in stability analysis of linear systems [160]; the homogeneous continuous-time Lyapunov equation in block companion matrices plays a vital role in the investigation of factorizations of Hermitian block Hankel matrices [228]; generalized Sylvester matrix equations are often encountered in eigenstructure assignment of linear systems [90]. For a matrix equation, three basic problems need to be considered: the solvability conditions, solving approaches, and expressions of the solutions. For real matrix equations, a considerable number of results have been obtained for these problems. In addition, some other problems have also been considered for some special matrix equations. For example, geometric properties of continuous-time Lyapunov matrix equations were investigated in [286]; bounds of the solution were studied for discrete-time algebraic Lyapunov equations in [173, 227] and for continuous-time Lyapunov equations in [173]. However, there are only a few results on complex matrix equations with the conjugate of unknown matrices reported in the literature. For convenience, this type of matrix equation is called the complex conjugate matrix equation. Recently, complex conjugate matrix equations have found some applications in discrete-time antilinear systems. In this book, some recent results are summarized for several kinds of complex conjugate matrix equations and their applications in analysis and feedback design of antilinear systems. In this chapter, the main aim is to first provide a survey on real linear matrix equations, and then give recent progress on complex conjugate matrix equations. The recent progress on antilinear systems and related problems will be given in the "Notes and References" parts of Chaps. 11 and 12. At the end of this chapter, an overview of this monograph is presented. Symbols used in this chapter are now introduced. It should be pointed out that these symbols are also adopted throughout this book. For two integers m ≤ n, the notation


I[m, n] denotes the set {m, m + 1, ..., n}. For a square matrix A, we use det A, ρ(A), λ(A), λ_min(A), and λ_max(A) to denote the determinant, the spectral radius, the set of eigenvalues, and the minimal and maximal eigenvalues of A, respectively. The notations Ā, A^T, and A^H denote the conjugate, transpose, and conjugate transpose of the matrix A, respectively. Re(A) and Im(A) denote the real part and imaginary part of the matrix A, respectively. In addition, diag_{i=1}^{n} A_i is used to denote the block diagonal matrix whose elements in the main block diagonal are A_i, i ∈ I[1, n]. The symbol "⊗" is used to denote the Kronecker product of two matrices.

1.1 Linear Equations

The most common linear equation may be the following real equation

Ax = b,   (1.1)

where A ∈ R^{m×n} and b ∈ R^m are known, and x ∈ R^n is the vector to be determined. If A is a square matrix, it is well-known that the linear equation (1.1) has a unique solution if and only if the matrix A is invertible, and in this case, the unique solution can be given by x = A^{-1}b. In addition, this unique solution can also be given by

$$x_i = \frac{\det A_i}{\det A}, \quad i \in I[1, n],$$

where x_i is the i-th element of the vector x, and A_i is the matrix formed by replacing the i-th column of A with the column vector b. This is the celebrated Cramer's rule. For the general case, it is well-known that the matrix equation (1.1) has a solution if and only if

$$\operatorname{rank} \begin{bmatrix} A & b \end{bmatrix} = \operatorname{rank} A.$$

In addition, the solvability of the general equation (1.1) can be characterized in terms of generalized inverses, and the general expression of all the solutions to the equation (1.1) can also be given in terms of generalized inverses. Definition 1.1 ([206, 208]) Given a matrix A ∈ Rm×n, if a matrix X ∈ Rn×m satisfies

AXA = A,

then X is called a generalized inverse of the matrix A. The generalized inverse may not be unique. An arbitrary generalized inverse of the matrix A is denoted by A^-.

Theorem 1.1 ([208, 297]) Given a matrix A ∈ R^{m×n}, let A^- be an arbitrary generalized inverse of A. Then, the vector equation (1.1) has a solution if and only if

AA^- b = b.   (1.2)

Moreover, if the condition (1.2) holds, then all the solutions of the vector equation (1.1) can be given by

x = A^- b + (I − A^- A) z,

where z is an arbitrary n-dimensional vector.
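To make Theorem 1.1 concrete, the following sketch (an illustration added here, not part of the original text) checks the solvability condition (1.2) and builds solutions of the form x = A^- b + (I − A^- A)z numerically, using the Moore-Penrose pseudoinverse as one particular choice of generalized inverse; the matrix A and vector b are made-up example data.

```python
import numpy as np

# Moore-Penrose pseudoinverse used as one particular generalized inverse A^-.
A = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0]])          # rank-deficient 2x3 matrix
b = np.array([6.0, 12.0])                # consistent right-hand side

A_pinv = np.linalg.pinv(A)               # plays the role of A^-

# Solvability test (1.2): A A^- b = b
assert np.allclose(A @ A_pinv @ b, b)

# General solution x = A^- b + (I - A^- A) z for an arbitrary z
z = np.random.randn(3)
x = A_pinv @ b + (np.eye(3) - A_pinv @ A) @ z
print(np.allclose(A @ x, b))             # True: every such x solves Ax = b
```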

The analytical solutions of the equation (1.1) given by inverses or generalized inverses have neat expressions, and play important roles in theoretical analysis. However, it has been recognized that the operation of matrix inverses is not numerically reliable. Therefore, many numerical methods are applied in practice to solve linear vector equations. These methods can be classified into two types. One is the transformation approach, in which the matrix A needs to be transformed into some special canonical forms, and the other is the iterative approach, which generates a sequence of vectors that approach the exact solution. An iterative process may be stopped as soon as an approximate solution is sufficiently accurate in practice. For the equation (1.1) with m = n, the celebrated iterative methods include Jacobi iteration and Gauss-Seidel iteration. Let

$$A = \left[ a_{ij} \right]_{n \times n}, \quad b = \begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_n \end{bmatrix}, \quad x = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}.$$

Then, the vector equation (1.1) can be explicitly written as

$$\begin{cases} a_{11} x_1 + a_{12} x_2 + \cdots + a_{1n} x_n = b_1 \\ a_{21} x_1 + a_{22} x_2 + \cdots + a_{2n} x_n = b_2 \\ \quad \vdots \\ a_{n1} x_1 + a_{n2} x_2 + \cdots + a_{nn} x_n = b_n \end{cases}$$

The Gauss-Seidel and Jacobi iterative methods require that the vector equation (1.1) has a unique solution, and that all the entries in the main diagonal of A are nonzero, that is, a_{ii} ≠ 0, i ∈ I[1, n]. It is assumed that the initial values x_i(0) of x_i, i ∈ I[1, n], are given. Then, the Jacobi iterative method obtains the unique solution of (1.1) by the following iteration [132]:

$$x_i(k+1) = \frac{1}{a_{ii}} \left( b_i - \sum_{j=1}^{i-1} a_{ij} x_j(k) - \sum_{j=i+1}^{n} a_{ij} x_j(k) \right), \quad i \in I[1, n],$$

and the Gauss-Seidel iterative method obtains the unique solution of (1.1) by the following forward substitution [132]:

$$x_i(k+1) = \frac{1}{a_{ii}} \left( b_i - \sum_{j=1}^{i-1} a_{ij} x_j(k+1) - \sum_{j=i+1}^{n} a_{ij} x_j(k) \right), \quad i \in I[1, n].$$

Now, let

$$D = \operatorname{diag}(a_{11}, a_{22}, \ldots, a_{nn}),$$

$$L = \begin{bmatrix} 0 & 0 & 0 & \cdots & 0 \\ a_{21} & 0 & 0 & \cdots & 0 \\ a_{31} & a_{32} & 0 & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{n,n-1} & 0 \end{bmatrix}, \quad U = \begin{bmatrix} 0 & a_{12} & a_{13} & \cdots & a_{1n} \\ 0 & 0 & a_{23} & \cdots & a_{2n} \\ \vdots & & \ddots & \ddots & \vdots \\ & & & & a_{n-1,n} \\ 0 & 0 & 0 & \cdots & 0 \end{bmatrix}.$$

Then, the Gauss-Seidel iterative algorithm can be written in the following compact form:

$$x(k+1) = -(D+L)^{-1} U x(k) + (D+L)^{-1} b,$$

and the Jacobi iterative algorithm can be written as

$$x(k+1) = -D^{-1}(L+U)\, x(k) + D^{-1} b.$$

It is easily known that the Gauss-Seidel iteration is convergent if and only if

$$\rho\left( (D+L)^{-1} U \right) < 1,$$

and the Jacobi iteration is convergent if and only if

$$\rho\left( D^{-1}(L+U) \right) < 1.$$
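As an illustration of the compact forms above, the following Python sketch (added for illustration, assuming NumPy is available) runs both iterations on a small diagonally dominant system and checks the spectral-radius convergence conditions.

```python
import numpy as np

# Splitting A = D + L + U as in the text, then iterating the compact forms.
def split(A):
    D = np.diag(np.diag(A))
    L = np.tril(A, -1)
    U = np.triu(A, 1)
    return D, L, U

def jacobi(A, b, x0, iters=100):
    D, L, U = split(A)
    M = -np.linalg.inv(D) @ (L + U)      # Jacobi iteration matrix
    c = np.linalg.inv(D) @ b
    x = x0
    for _ in range(iters):
        x = M @ x + c
    return x

def gauss_seidel(A, b, x0, iters=100):
    D, L, U = split(A)
    M = -np.linalg.inv(D + L) @ U        # Gauss-Seidel iteration matrix
    c = np.linalg.inv(D + L) @ b
    x = x0
    for _ in range(iters):
        x = M @ x + c
    return x

# Strictly diagonally dominant example, for which both iterations converge.
A = np.array([[4.0, 1.0, 0.0],
              [1.0, 5.0, 2.0],
              [0.0, 2.0, 6.0]])
b = np.array([1.0, 2.0, 3.0])
x0 = np.zeros(3)
print(np.allclose(jacobi(A, b, x0), np.linalg.solve(A, b)))        # True
print(np.allclose(gauss_seidel(A, b, x0), np.linalg.solve(A, b)))  # True

# Spectral radii of the iteration matrices are below 1, so both converge.
D, L, U = split(A)
print(max(abs(np.linalg.eigvals(-np.linalg.inv(D + L) @ U))) < 1)      # True
print(max(abs(np.linalg.eigvals(-np.linalg.inv(D) @ (L + U)))) < 1)    # True
```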

In general, the Gauss-Seidel iteration converges faster than the Jacobi iteration, since the most recent estimates are used in the Gauss-Seidel iteration. However, there exist examples where the Jacobi method is faster than the Gauss-Seidel method. In 1950, David M. Young, Jr. and H. Frankel proposed a variant of the Gauss-Seidel iterative method for solving the equation (1.1) with m = n [156]. This is the so-called successive over-relaxation (SOR) method, by which the elements x_i, i ∈ I[1, n], of x can be computed sequentially by forward substitution:

$$x_i(k+1) = (1-\omega)\, x_i(k) + \frac{\omega}{a_{ii}} \left( b_i - \sum_{j=1}^{i-1} a_{ij} x_j(k+1) - \sum_{j=i+1}^{n} a_{ij} x_j(k) \right), \quad i \in I[1, n],$$

where the constant ω > 1 is called the relaxation factor. Analytically, this algorithm can be written as

$$x(k+1) = (D + \omega L)^{-1} \left[ \omega b - \left( \omega U + (\omega - 1) D \right) x(k) \right].$$
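A corresponding sketch of the SOR update in this compact form is given below; the relaxation factor ω = 1.2 is an arbitrary illustrative choice, not a recommendation.

```python
import numpy as np

# SOR iteration x(k+1) = (D + wL)^{-1} [w b - (w U + (w-1) D) x(k)].
def sor(A, b, x0, omega=1.2, iters=200):
    D = np.diag(np.diag(A))
    L = np.tril(A, -1)
    U = np.triu(A, 1)
    M = np.linalg.inv(D + omega * L)
    x = x0
    for _ in range(iters):
        x = M @ (omega * b - (omega * U + (omega - 1.0) * D) @ x)
    return x

A = np.array([[4.0, 1.0, 0.0],
              [1.0, 5.0, 2.0],
              [0.0, 2.0, 6.0]])           # symmetric positive definite
b = np.array([1.0, 2.0, 3.0])
print(np.allclose(sor(A, b, np.zeros(3)), np.linalg.solve(A, b)))   # True
```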

The choice of relaxation factor is not necessarily easy, and depends on the properties of A. It has been proven that if A is symmetric and positive definite, the SOR method is convergent with 0 < ω < 2. If A is symmetric and positive definite, the equation (1.1) can be solved by the conjugate gradient method proposed by Hestenes and Stiefel. This method is given in the following theorem.

Theorem 1.2 Given a symmetric and positive definite matrix A ∈ R^{n×n}, the solution of the equation (1.1) can be obtained by the following iteration:

$$\begin{cases} \alpha(k) = \dfrac{r^{\mathrm T}(k)\, p(k)}{p^{\mathrm T}(k)\, A\, p(k)} \\[4pt] x(k+1) = x(k) + \alpha(k)\, p(k) \\[2pt] r(k+1) = b - A\, x(k+1) \\[2pt] s(k) = -\dfrac{p^{\mathrm T}(k)\, A\, r(k+1)}{p^{\mathrm T}(k)\, A\, p(k)} \\[4pt] p(k+1) = r(k+1) + s(k)\, p(k) \end{cases}$$

where the initial conditions x(0), r(0), and p(0) are given as x(0) = x_0, and p(0) = r(0) = b − Ax(0).
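The recursion of Theorem 1.2 can be transcribed almost verbatim; the following sketch (illustrative, with made-up data) runs it on a small symmetric positive definite system, where it terminates exactly after n steps in exact arithmetic.

```python
import numpy as np

# Direct transcription of the conjugate gradient recursion in Theorem 1.2,
# assuming A is symmetric positive definite.
def conjugate_gradient(A, b, x0, iters=None):
    n = b.size
    iters = n if iters is None else iters
    x = x0.copy()
    r = b - A @ x
    p = r.copy()
    for _ in range(iters):
        Ap = A @ p
        alpha = (r @ p) / (p @ Ap)       # alpha(k)
        x = x + alpha * p                # x(k+1)
        r = b - A @ x                    # r(k+1)
        s = -(p @ (A @ r)) / (p @ Ap)    # s(k)
        p = r + s * p                    # p(k+1)
    return x

A = np.array([[4.0, 1.0], [1.0, 3.0]])   # symmetric positive definite
b = np.array([1.0, 2.0])
x = conjugate_gradient(A, b, np.zeros(2))
print(np.allclose(A @ x, b))             # True (exact after n = 2 steps)
```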

1.2 Univariate Linear Matrix Equations

In this section, a simple survey is provided for linear matrix equations with only one unknown matrix variable. Let us start with the Lyapunov matrix equations.

1.2.1 Lyapunov Matrix Equations

The most celebrated univariate matrix equations may be the continuous-time and discrete-time Lyapunov matrix equations, which play vital roles in stability analysis [75, 160] and in controllability and observability analysis of linear systems [3]. The continuous-time and discrete-time Lyapunov matrix equations are, respectively, of the forms

A^T X + XA = −Q,   (1.3)

X − A^T X A = Q,   (1.4)

where A ∈ R^{n×n} and the positive semidefinite matrix Q ∈ R^{n×n} are known, and X is the matrix to be determined. In [103, 104], the robust stability analysis was investigated for linear continuous-time and discrete-time systems, respectively, and the admissible perturbation bounds of the system matrices were given in terms of the solutions of the corresponding Lyapunov matrix equations. In [102], the robust stability was considered for linear continuous-time systems subject to unmodeled dynamics, and an admissible bound was given for the nonlinear perturbation function based on the solution to the Lyapunov matrix equation of the nominal linear system. In linear systems theory, it is well-known that the controllability and observability of a linear system can be checked by the existence of a positive definite solution to the corresponding Lyapunov matrix equation [117].

In [145], the continuous-time Lyapunov matrix equation was used to analyze the weighted logarithmic norm of matrices, while in [106] this equation was employed to investigate the so-called generalized positive definite matrix. In [222], the inverse solution of the discrete-time Lyapunov equation was applied to generate q-Markov covers for single-input-single-output discrete-time systems. In [317], a relationship between the weighted norm of a matrix and the corresponding discrete-time Lyapunov matrix equation was first established, and then an iterative algorithm was presented to obtain the spectral radius of a matrix by the solutions of a sequence of discrete-time Lyapunov matrix equations. For the solutions of Lyapunov matrix equations with special forms, many results have been reported in the literature. When A is in the Schwarz form, and Q is in a special diagonal form, the solution of the continuous-time Lyapunov matrix equation (1.3) was explicitly given in [12]. When A^T is in the following companion form:

$$A^{\mathrm T} = \begin{bmatrix} 0 & 1 & & \\ \vdots & & \ddots & \\ 0 & 0 & \cdots & 1 \\ -a_0 & -a_1 & \cdots & -a_{n-1} \end{bmatrix},$$

and Q = bb^T with b = [0  0  ···  1]^T, it was shown in [221] that the solution of the Lyapunov matrix equation (1.3) with A Hurwitz stable can be given by using the entries of a Routh table. In [165], a simple algorithm was proposed for a closed-form solution to the continuous-time Lyapunov matrix equation (1.3) by using the Routh array when A is in a companion form. In [19], the solutions for the above two Lyapunov matrix equations, which are particularly suitable for symbolic implementation, were proposed for the case where the matrix A is in a companion form. In [24], the following special discrete-time Lyapunov matrix equation was considered:

X − F X F^T = G Q G^T,

where the matrix pair (F, G) is in a controllable canonical form. It was shown in [24] that the solution to this equation is the inverse of a Schur-Cohn matrix associated with the characteristic polynomial of F. When A is Hurwitz stable, the unique solution to the continuous-time Lyapunov matrix equation (1.3) can be given by the following integration form [28]:

$$X = \int_0^{\infty} e^{A^{\mathrm T} t}\, Q\, e^{A t}\, \mathrm{d}t.   \qquad (1.5)$$

Further, let Q = BB^T with B ∈ R^{n×r}, and let the function e^{At} be expressed as a finite sum of powers of A:

$$e^{At} = a_1(t)\, I + a_2(t)\, A + \cdots + a_n(t)\, A^{n-1}.$$

Then, it was shown in [193] that the unique solution of (1.3) can also be expressed by

$$X = \operatorname{Ctr}(A, B)\, H\, \operatorname{Ctr}^{\mathrm T}(A, B),   \qquad (1.6)$$

where

$$\operatorname{Ctr}(A, B) = \begin{bmatrix} B & AB & \cdots & A^{n-1} B \end{bmatrix}$$

is the controllability matrix of the matrix pair (A, B), and H = G ⊗ I_r with G = G^T = [g_{ij}]_{n×n},

$$g_{ij} = \int_0^{\infty} a_i(t)\, a_j(t)\, \mathrm{d}t.$$

The expression in (1.6) may bring much convenience for the analysis of linear systems due to the appearance of the controllability matrix. In addition, with the help of the expression (1.6), some eigenvalue bounds of the solution to the equation (1.3) were given in [193]. In [217], an infinite series representation of the unique solution to the continuous-time Lyapunov matrix equation (1.3) was also given by converting it into a discrete-time Lyapunov matrix equation. When A is Schur stable, the following theorem summarizes some important properties of the discrete-time Lyapunov matrix equation (1.4).

Theorem 1.3 ([212]) If A is Schur stable, then the solution of the discrete-time Lyapunov matrix equation (1.4) exists for any matrix Q, and is given as

$$X = \frac{1}{2\pi}\int_0^{2\pi} \left( A^{\mathrm T} - e^{\mathrm{i}\theta} I \right)^{-1} Q \left( A - e^{-\mathrm{i}\theta} I \right)^{-1} \mathrm{d}\theta,$$

or equivalently by

$$X = \sum_{i=0}^{\infty} \left( A^{\mathrm T} \right)^i Q\, A^i.$$
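The series form in Theorem 1.3 suggests a simple numerical check: truncate the sum and verify that the partial sum satisfies (1.4). The sketch below does this in Python; the comparison against SciPy's solve_discrete_lyapunov is only a cross-check and accounts for that routine's convention X = A X A^H + Q by passing A^T.

```python
import numpy as np
from scipy.linalg import solve_discrete_lyapunov

# Partial sums of X = sum_i (A^T)^i Q A^i converge when A is Schur stable.
A = np.array([[0.5, 0.2], [0.0, 0.3]])   # spectral radius < 1
Q = np.array([[1.0, 0.0], [0.0, 2.0]])

X = np.zeros_like(Q)
term = Q.copy()
for _ in range(200):                      # truncated series
    X = X + term
    term = A.T @ term @ A

print(np.allclose(X - A.T @ X @ A, Q))    # True: X satisfies (1.4)
# SciPy solves A X A^H - X + Q = 0, so pass A^T to match X - A^T X A = Q.
print(np.allclose(X, solve_discrete_lyapunov(A.T, Q)))   # True
```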

Many numerical algorithms have been proposed to solve the Lyapunov matrix equations. In view that the solution of the Lyapunov matrix equation (1.3) is at least positive semidefinite, Hammarling [136] found an ingenious way to compute the Cholesky factor of X directly. The basic idea is to exploit the triangular structure to solve the equation iteratively. By constructing a new rank-1 updating scheme, an improved Hammarling method was proposed in [220] to accommodate a more general case of Lyapunov matrix equations. In [284], an algorithm based on a dimension-reduction method was proposed to solve the continuous-time Lyapunov matrix equation (1.3) in controllable canonical forms. In [18], the presented Smith iteration for the discrete-time Lyapunov matrix equation (1.4) was in the form of

$$X(k+1) = A^{\mathrm T} X(k)\, A + Q$$

with X(0) = Q. Besides the solutions to Lyapunov matrix equations, the bounds of the solutions have also been extensively investigated. In [191], the following result was given on the eigenvalue bounds of the discrete-time Lyapunov matrix equation

X − A X A^T = B B^T,   (1.7)

where A ∈ R^{n×n} and B ∈ R^{n×r} are known matrices, and X is the matrix to be determined.

Theorem 1.4 Given matrices A ∈ R^{n×n} and B ∈ R^{n×r}, for the solution X to the discrete-time Lyapunov matrix equation (1.7) there holds

$$\lambda_{\min}\!\left( \operatorname{Ctr}(A, B)\, \operatorname{Ctr}^{\mathrm T}(A, B) \right) P \;\le\; X \;\le\; \lambda_{\max}\!\left( \operatorname{Ctr}(A, B)\, \operatorname{Ctr}^{\mathrm T}(A, B) \right) P,$$

where P is the solution to the Lyapunov matrix equation

$$P - A^n P \left( A^n \right)^{\mathrm T} = I.$$

In [227], upper bounds for the norms and trace of the solution to the discrete-time Lyapunov matrix equation (1.7) were presented in terms of the resolvent of A. In [124], lower and upper bounds for the trace of the solution to the continuous-time Lyapunov matrix equation (1.3) were given in terms of the logarithmic norm of A. In [116], lower bounds were established for the minimal and maximal eigenvalues of the solution to the discrete-time Lyapunov equation (1.7). Recently, parametric Lyapunov matrix equations were extensively investigated. In [307, 315], some properties of the continuous-time parametric Lyapunov matrix equations were given. In [307], the solution of the parametric Lyapunov equation was applied to semiglobal stabilization for continuous-time linear systems subject to actuator saturation; while in [315] the solution was used to design a state feedback stabilizing law for linear systems with input delay. The discrete-time parametric Lyapunov matrix equations were investigated in [313, 314], and some elegant properties were established.

1.2.2 Kalman-Yakubovich and Normal Sylvester Matrix Equations

A general form of the continuous-time Lyapunov matrix equation is the so-called normal Sylvester matrix equation

AX − XB = C. (1.8)

A general form of the discrete-time Lyapunov matrix equation is the so-called Kalman-Yakubovich matrix equation

X − AXB = C. (1.9)

In the matrix equations (1.8) and (1.9), A ∈ R^{n×n}, B ∈ R^{p×p}, and C ∈ R^{n×p} are the known matrices, and X ∈ R^{n×p} is the matrix to be determined. On the solvability of the normal Sylvester matrix equation (1.8), there exists the following result, which has been well-known as Roth's removal rule.

Theorem 1.5 ([210]) Given matrices A ∈ R^{n×n}, B ∈ R^{p×p}, and C ∈ R^{n×p}, the normal Sylvester matrix equation (1.8) has a solution if and only if the following two partitioned matrices are similar:

$$\begin{bmatrix} A & C \\ 0 & B \end{bmatrix}, \quad \begin{bmatrix} A & 0 \\ 0 & B \end{bmatrix};$$

or equivalently, the following two polynomial matrices are equivalent:

$$\begin{bmatrix} sI - A & -C \\ 0 & sI - B \end{bmatrix}, \quad \begin{bmatrix} sI - A & 0 \\ 0 & sI - B \end{bmatrix}.$$

This matrix equation has a unique solution if and only if λ(A) ∩ λ(B) = ∅. The result in the preceding theorem was generalized to the Kalman-Yakubovich matrix equation (1.9) in [238]. This is the following theorem.

Theorem 1.6 Given matrices A ∈ R^{n×n}, B ∈ R^{p×p}, and C ∈ R^{n×p}, the Kalman-Yakubovich matrix equation (1.9) has a solution if and only if there exist nonsingular real matrices S and R such that

$$S \begin{bmatrix} sI + A & C \\ 0 & sB + I \end{bmatrix} R = \begin{bmatrix} sI + A & 0 \\ 0 & sB + I \end{bmatrix}.$$
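Roth's removal rule (Theorem 1.5) can also be illustrated numerically: if X solves AX − XB = C, then T = [[I, −X], [0, I]] realizes the similarity between the two partitioned matrices. The following sketch (illustrative only, with randomly generated data) uses SciPy's solve_sylvester, which handles AX + XB = Q, so −B is passed to obtain AX − XB = C.

```python
import numpy as np
from scipy.linalg import solve_sylvester

rng = np.random.default_rng(0)
n, p = 3, 2
A = rng.standard_normal((n, n))
B = rng.standard_normal((p, p)) + 10 * np.eye(p)   # push spectra apart so the
                                                   # solution is unique
C = rng.standard_normal((n, p))

X = solve_sylvester(A, -B, C)                      # solves AX - XB = C
T = np.block([[np.eye(n), -X], [np.zeros((p, n)), np.eye(p)]])
M = np.block([[A, C], [np.zeros((p, n)), B]])
# T^{-1} M T should equal blkdiag(A, B), confirming the similarity.
print(np.allclose(np.linalg.inv(T) @ M @ T,
                  np.block([[A, np.zeros((n, p))],
                            [np.zeros((p, n)), B]])))               # True
```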

On the numerical solutions to the normal Sylvester matrix equations, there have been a number of results reported in the literature over the past 30 years. The Bartels-Stewart method [13] may be the first numerically stable approach to systematically solving small-to-medium scale Lyapunov and normal Sylvester matrix equations. The basic idea of this method is to apply the Schur decomposition to transform the equation into a triangular system which can be solved efficiently by forward or backward substitutions. To save computation time, the Bartels-Stewart method was extended in [167] to treat the adjoint equations. In [131, 140], the backward stability analysis and backward error analysis of the Bartels-Stewart algorithm were given. In [220], three columnwise direct solver schemes were proposed by modifying the Bartels-Stewart algorithm. In [133], the so-called Hessenberg-Schur algorithm was proposed for the normal Sylvester matrix equation (1.8). This algorithm requires the transformation of the larger of the two matrices A and B, say A, into the upper Hessenberg form and the other, B, to the real Schur form. Like the Bartels-Stewart and Hammarling algorithms, the Hessenberg-Schur method is also an example of the transformation method. Different from the Bartels-Stewart algorithm, the Hessenberg-Schur algorithm only requires the matrix A to be reduced to Hessenberg form. In [205], a numerical algorithm was proposed to solve the normal Sylvester matrix equation (1.8) by orthogonal reduction of the matrix B to a block upper Hessenberg form. In [17], a factored ADI (alternating direction implicit) method was presented. In [139], projection methods that use the extended block Arnoldi process were proposed to solve the low-rank normal Sylvester matrix equations. For explicit solutions to the normal Sylvester matrix equation (1.8), perhaps the earliest result may be the one in the form of a finite double matrix series for the case of both A and B being of Jordan forms in [183]. In [157], two explicit solutions were established in terms of principal idempotents and nilpotence of the coefficient matrices. In [37], explicit general solutions were given by using eigenprojections. In [171], an infinite series representation of the unique solution was established by converting the normal Sylvester matrix equation (1.8) into a Kalman-Yakubovich matrix equation in the form of (1.9). When B is in Jordan form, a finite iterative approach was proposed in [92] to solve the equation (1.8). When A is a normalized lower triangular matrix, the normal Sylvester matrix equation (1.8) was investigated in [46]. It was pointed out in [46] that the solution is uniquely determined by its first row. When the right-hand side of the normal Sylvester equation (1.8) is a matrix with rank 1, a simple iterative method was proposed in [233] based on an extension of the Astrom-Jury-Agniel algorithm. When the coefficient matrices are not in any canonical forms, explicit solutions in the form of X = MN^{-1} were established for the normal Sylvester matrix equation (1.8) in the literature, for example, [152, 182, 194, 258]. In [152, 182, 258], the matrix N is the value of the eigenpolynomial of A at B, while in [194] it is the value of an annihilating polynomial of A at B. In [152], the solution was obtained by applying the Cayley-Hamilton theorem, and M was expressed as the sum of a group of matrices which can be iteratively derived in terms of the coefficient matrices.
In [182], the solution was derived based on a Bezout identity related to the mutual primeness of two polynomials, and M was given in terms of the coefficient matrices of adj(sI − A). In [258], the solution was established with the help of Kronecker maps, and M was represented by the controllability matrix and observability matrix. In [194], the solution was constructed based on the similarity of two partitioned matrices, and M was provided by a finite double series form associated with the coefficient matrices. In addition, by applying spectral theory, in [171] the unique solution was expressed by a contour integration on resolvents of A and B. By applying the Faddeev iterative sequence, a finite double series solution was also derived in [137]. In [47], a closed-form finite series representation of the unique solution was developed. In this solution, some coefficients are closely related to the companion matrices of the characteristic polynomials of matrices A and B. The result in [47] is very elegant, and thus it is provided in the following theorem. Before proceeding, for a polynomial

$$\alpha(s) = \alpha_0 + \alpha_1 s + \cdots + \alpha_{t-1} s^{t-1} + s^t,$$

the corresponding companion matrix is defined as

$$K_\alpha = \begin{bmatrix} 0 & 1 & & \\ \vdots & & \ddots & \\ 0 & & & 1 \\ -\alpha_0 & -\alpha_1 & \cdots & -\alpha_{t-1} \end{bmatrix},$$

and the corresponding upper Hankel matrix as

$$H_\alpha = \begin{bmatrix} \alpha_1 & \alpha_2 & \cdots & \alpha_{t-1} & 1 \\ \alpha_2 & \alpha_3 & \cdots & 1 & \\ \vdots & \vdots & & & \\ \alpha_{t-1} & 1 & & & \\ 1 & & & & \end{bmatrix}.$$

Theorem 1.7 ([47]) Given matrices A ∈ R^{n×n}, B ∈ R^{p×p}, and C = C_A C_B^T ∈ R^{n×p} with C_A ∈ R^{n×q}, let the normal Sylvester matrix equation (1.8) have a unique solution, and let α(s) and β(s) with respective degrees μ and ν be coprime monic polynomials such that

α (A) C = 0, Cβ (B) = 0.

Then the unique solution of the normal Sylvester matrix equation (1.8) has the representation

$$X = \sum_{j=1}^{\nu}\sum_{i=1}^{\mu} \gamma_{ij}\, A^{i-1} C B^{j-1} = \begin{bmatrix} C_A & A C_A & \cdots & A^{\mu-1} C_A \end{bmatrix} \left( \Gamma \otimes I_q \right) \begin{bmatrix} C_B^{\mathrm T} \\ C_B^{\mathrm T} B \\ \vdots \\ C_B^{\mathrm T} B^{\nu-1} \end{bmatrix},$$

where the matrix $\Gamma = \left[ \gamma_{ij} \right]_{\mu\times\nu}$ is the unique solution of

$$K_\alpha^{\mathrm T}\, \Gamma - \Gamma\, K_\beta = \begin{bmatrix} 1 & 0 & \cdots & 0 \end{bmatrix}^{\mathrm T} \begin{bmatrix} 1 & 0 & \cdots & 0 \end{bmatrix}.   \qquad (1.10)$$

In the preceding theorem, the symbol “⊗” represents the Kronecker product of two matrices. The detailed definition can be found in Sect. 2.1 of this monograph. On the solution to the matrix equation (1.10), the following result has been given in [47].

Lemma 1.1 ([47]) The unique solution of the matrix equation (1.10), when it exists, is given by

(1) when ν ≥ μ,

$$\Gamma = -\begin{bmatrix} H_\alpha & 0_{\mu\times(\nu-\mu)} \end{bmatrix} \left[ \alpha\!\left( K_\beta \right) \right]^{-1};$$

(2) when ν ≤ μ,

$$\Gamma = \left[ \beta\!\left( K_\alpha^{\mathrm T} \right) \right]^{-1} \begin{bmatrix} H_\beta \\ 0_{(\mu-\nu)\times\nu} \end{bmatrix}.$$

On the solution of the Kalman-Yakubovich matrix equation (1.9), many results have been reported in the literature. In [23], the unique solution was given in terms of a Hankel matrix when A and B were in companion forms. In [137], a finite iterative method for solving the Kalman-Yakubovich matrix equation was given based on the Faddeev sequence. The solution can be quickly obtained if a solution is known for a Kalman-Yakubovich matrix equation with a right-hand side of rank 1. In [305], the unique solutions to the Kalman-Yakubovich and Stein equations were given in terms of controllability matrices, observability matrices and Hankel matrices. In addition, explicit solutions in the form of X = MN^{-1} to the Kalman-Yakubovich matrix equation were established in the literature, for example, [155, 280]. In [155], the solution was obtained by applying a linear operator approach, and M was expressed as a finite double sum in terms of the coefficients of the characteristic polynomial of the matrix A. In [280], the solution was established with the aid of Kronecker maps, and M was expressed in terms of the controllability and observability matrices. On the numerical approaches for solving the Kalman-Yakubovich matrix equation, the typical methods are the Smith-type iterations. In [18], the presented Smith iteration for the Kalman-Yakubovich matrix equation (1.9) was in the form of X(k+1) = A X(k) B + C with X(0) = C. A quadratically convergent version of this iteration was given in [218]. This iteration can be written as

$$X(k+1) = X(k) + A(k)\, X(k)\, B(k), \quad A(k+1) = A^2(k), \quad B(k+1) = B^2(k),$$

with initial values X(0) = C, A(0) = A, and B(0) = B. Obviously, the preceding iteration only works for square A and B. Moreover, the Smith (l) iteration was proposed in [218]. It was shown in [200] that a moderate increase in the number of shifts l can accelerate the convergence significantly. However, it was also observed that the speed of convergence was hardly improved by a further increase of l [134, 200]. To improve the speed of convergence, one can adopt the so-called Smith accelerative iteration [217]. In addition, a new Smith-type iteration named the r-Smith iteration was proposed in [310]. In fact, the normal Sylvester matrix equation (1.8) can be transformed into a Kalman-Yakubovich matrix equation [171]. For a nonzero real constant number a, it follows from (1.8) that

(aI + A) X (aI − B) − (aI − A) X (aI + B) = 2aC.

If a is chosen so that (aI − A)−1 and (aI + B)−1 exist, then pre- and post-multiplying by these matrices, respectively, gives

(aI − A)−1 (aI + A) X (aI − B)(aI + B)−1 − X = 2a (aI − A)−1 C (aI + B)−1 .

Denote U = (aI − A)−1 (aI + A) , V = (aI − B)(aI + B)−1 .

It is easily known that

$$(aI - A)^{-1} = \frac{1}{2a}(U + I), \quad (aI + B)^{-1} = \frac{1}{2a}(V + I).$$

With the previous expressions, the normal Sylvester matrix equation (1.8) has now been transformed into the following Kalman-Yakubovich matrix equation:

$$UXV - X = \frac{1}{2a}\,(U + I)\, C\, (V + I).$$

Thus, some numerical approaches for Kalman-Yakubovich matrix equations can be applied to normal Sylvester matrix equations. A more general transformation approach from normal Sylvester matrix equations to Kalman-Yakubovich matrix equations was presented in [188].
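The transformation above, combined with a plain Smith-type fixed-point iteration X(k+1) = U X(k) V − W for the resulting Kalman-Yakubovich equation, gives a simple (if not especially efficient) solver for (1.8) whenever ρ(U)ρ(V) < 1. The sketch below is an illustration added here, not an algorithm from the text; in the example A is Hurwitz and B has eigenvalues with positive real parts, which makes the iteration convergent.

```python
import numpy as np

# Cayley-type transformation of AX - XB = C into UXV - X = W, then iterate.
def sylvester_via_cayley_smith(A, B, C, a=1.0, iters=100):
    n, p = A.shape[0], B.shape[0]
    In, Ip = np.eye(n), np.eye(p)
    U = np.linalg.solve(a * In - A, a * In + A)        # (aI - A)^{-1}(aI + A)
    V = (a * Ip - B) @ np.linalg.inv(a * Ip + B)       # (aI - B)(aI + B)^{-1}
    W = (U + In) @ C @ (V + Ip) / (2.0 * a)            # right-hand side
    X = np.zeros_like(C)
    for _ in range(iters):                             # fixed point of X = UXV - W
        X = U @ X @ V - W
    return X

A = np.array([[-2.0, 1.0], [0.0, -3.0]])               # Hurwitz
B = np.array([[1.0, 0.5], [0.0, 2.0]])                 # eigenvalues 1 and 2
C = np.array([[1.0, 0.0], [2.0, -1.0]])
X = sylvester_via_cayley_smith(A, B, C)
print(np.allclose(A @ X - X @ B, C))                   # True
```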

1.2.3 Other Matrix Equations

For linear matrix equations with only one unknown matrix, there are some other types besides those mentioned in the previous subsections. The simplest one may be the following bilateral linear matrix equation

AXB = C,   (1.11)

where A ∈ R^{m×k}, B ∈ R^{l×n}, and C ∈ R^{m×n} are known matrices, and X ∈ R^{k×l} is the unknown matrix. It is well-known that the bilateral matrix equation (1.11) has a solution if and only if

$$\operatorname{rank}\begin{bmatrix} A & C \end{bmatrix} = \operatorname{rank} A, \quad \operatorname{rank}\begin{bmatrix} B \\ C \end{bmatrix} = \operatorname{rank} B,$$

or equivalently

$$A A^- C B^- B = C.$$

In this case, the general solution of (1.11) can be expressed by [15]

$$X = A^- C B^- + \left( I - A^- A \right) V + W \left( I - B B^- \right),$$

where V and W are the free parametric matrices. Many researchers have given some results on properties of the solution to the matrix equation (1.11). It was shown in [226] that

$$\max_{AXB=C} \operatorname{rank} X = \min \left\{ k,\; l,\; k + l + \operatorname{rank} C - \operatorname{rank} A - \operatorname{rank} B \right\},$$

$$\min_{AXB=C} \operatorname{rank} X = \operatorname{rank} C.$$
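The solvability test and the parametrized general solution for (1.11) can be checked directly with the Moore-Penrose pseudoinverse playing the role of the generalized inverses A^- and B^-; the sketch below (illustrative data only) uses rank-deficient A and B so that the free terms in V and W are nontrivial.

```python
import numpy as np

# Rank-deficient data so that (I - A^- A) and (I - B B^-) are nonzero.
A = np.array([[1.0, 2.0], [2.0, 4.0], [0.0, 0.0]])        # 3x2, rank 1
B = np.array([[1.0, 0.0, 1.0, 0.0], [2.0, 0.0, 2.0, 0.0]]) # 2x4, rank 1
rng = np.random.default_rng(1)
X_true = rng.standard_normal((2, 2))
C = A @ X_true @ B                                         # solvable by construction

Ai, Bi = np.linalg.pinv(A), np.linalg.pinv(B)
assert np.allclose(A @ Ai @ C @ Bi @ B, C)                 # solvability check

V = rng.standard_normal((2, 2))
W = rng.standard_normal((2, 2))
X = Ai @ C @ Bi + (np.eye(2) - Ai @ A) @ V + W @ (np.eye(2) - B @ Bi)
print(np.allclose(A @ X @ B, C))                           # True for any V, W
```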

In addition, in [59, 171, 239] the following matrix equation was considered:

$$\sum_{i=0}^{n} \sum_{j=0}^{m} \alpha_{ij}\, A^i X B^j = C,   \qquad (1.12)$$

where X is the matrix to be determined. This equation is a general form of the equations (1.3), (1.4), (1.8), (1.9), and (1.11). In [171], the solution was explicitly given by a double integration. In [59], the uniqueness condition of the solution to (1.12) was established in terms of a bivariate polynomial. In [239], the unique solution was expressed by a double sum, and the rank of the solution was also estimated. In [146, 147, 239], the following matrix equation was investigated:

$$\sum_{i=0}^{\omega} A^i X B_i = C,   \qquad (1.13)$$

where A ∈ R^{n×n}, B_i ∈ R^{m×q}, i ∈ I[0, ω], and C ∈ R^{n×q} are known matrices. This equation includes the continuous-time and discrete-time Lyapunov matrix equations (1.3) and (1.4), the normal Sylvester matrix equation (1.8), the Kalman-Yakubovich matrix equation (1.9), and the matrix equations (1.11) and (1.12) as special cases.

Denote

$$B(s) = \sum_{i=0}^{\omega} B_i s^i \in \mathbb{R}^{m \times q}[s].   \qquad (1.14)$$

In [239], it was shown that the equation (1.13), with B_i being square for i ∈ I[0, ω], is solvable for all C if and only if det B(γ) ≠ 0 for any γ ∈ λ(A). The unique solution can be expressed in the form of

$$X = \sum_{i=0}^{\alpha} A^i C G_i,$$

where the matrices G_i, i ∈ I[0, α], can be determined from an auxiliary equation which involves the companion matrix of the characteristic polynomial of A. Moreover, the conclusion in [239] pointed out that the solution of the matrix equation (1.13) can be expressed by the solutions of a normal Sylvester matrix equation and a Kalman-Yakubovich matrix equation. In [147], the solvability of the equation (1.13) was characterized by using the Smith normal form of polynomial matrices, and the general solution was given by generalized inverses. By generalizing Roth's theorem in [210] on the normal Sylvester matrix equation (1.8), the following interesting conclusion was obtained in [146].

Theorem 1.8 ([146]) Given matrices A ∈ R^{n×n}, B_i ∈ R^{m×q}, i ∈ I[0, ω], and C ∈ R^{n×q}, define the polynomial matrix B(s) as in (1.14). Then, the matrix equation (1.13) has a solution if and only if the following two polynomial matrices are equivalent:

$$\begin{bmatrix} sI - A & -C \\ 0 & B(s) \end{bmatrix}, \quad \begin{bmatrix} sI - A & 0 \\ 0 & B(s) \end{bmatrix}.$$

Obviously, by specializing the result in the preceding theorem, one can easily obtain an alternative condition on the existence of solutions to the matrix equation (1.11) with m = k. In the mathematical literature, the following matrix equation has also been extensively investigated:

AXB − CXD = E,   (1.15)

where X is the matrix to be determined, and A, B, C, and D are square matrices with appropriate dimensions. This equation is a third general form of the equations (1.3), (1.4), (1.8), and (1.9). In [129, 190], a matrix transformation approach was established to solve the matrix equation (1.15). In this method, the QZ algorithm was employed to structure the equation in such a way that it can be solved columnwise by a block substitution technique. In [41], the existence condition of the unique solution of the matrix equation (1.15) was given, and a numerical algorithm was proposed to solve it. In [138], the unique solution of the matrix equation (1.15) was given in an explicit form by using the coefficients of the Laurent expansions of (sC − A)^{-1} and (sB − D)^{-1}. In [58], a gradient-based and a least-squares-based iterative algorithm were established for the solution of the matrix equation (1.15) by applying the hierarchical identification principle in [53, 54]. In [52, 58], the hierarchical identification principle was also used to solve the general matrix equation

$$\sum_{i=1}^{N} A_i X B_i = C,$$

where X is the unknown matrix. Besides, some other matrix equations have also been investigated. In [298], necessary and sufficient conditions were given for the existence of at least a full-column-rank solution to the matrix equation AX = EXJ. Recently, some mixed-type Lyapunov matrix equations have been intensively investigated. In [289], the following mixed-type Lyapunov matrix equation with respect to X was investigated:

X = A X B^T + B X A^T + Q,   (1.16)

and some sufficient solvability conditions were derived for this matrix equation in terms of inequalities. In [114], a new solvability condition was proposed for the equation (1.16) by using Bhaskar and Lakshmikantham's fixed point theorem, and also an iterative algorithm was constructed to solve this equation. In the analysis of Itô stochastic systems, the following mixed-type Lyapunov matrix equation appears [16, 36]:

A X + X A^T + B X B^T + Q = 0.   (1.17)

In [16], a solvability condition for this equation was given by the spectral radius of a linear operator related to the two operators L_1(X) = AX + XA^T and L_2(X) = BXB^T. In [114], a new solvability condition and an iterative algorithm were proposed for the matrix equation (1.17).
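For the general equation Σ_i A_i X B_i = C mentioned above, a plain gradient-descent sketch in the spirit of the gradient-based iterative algorithms of [52, 58] is given below; the update rule and the crude step-size choice are simplifications for illustration and need not coincide with the algorithms in those references.

```python
import numpy as np

# Gradient descent on the least-squares cost ||sum_i A_i X B_i - C||_F^2.
def gradient_solve(A_list, B_list, C, mu=None, iters=20000):
    X = np.zeros((A_list[0].shape[1], B_list[0].shape[0]))
    if mu is None:   # crude step-size guess based on coefficient norms
        mu = 1.0 / sum(np.linalg.norm(Ai, 2) ** 2 * np.linalg.norm(Bi, 2) ** 2
                       for Ai, Bi in zip(A_list, B_list))
    for _ in range(iters):
        R = sum(Ai @ X @ Bi for Ai, Bi in zip(A_list, B_list)) - C
        X = X - mu * sum(Ai.T @ R @ Bi.T for Ai, Bi in zip(A_list, B_list))
    return X

rng = np.random.default_rng(2)
A_list = [rng.standard_normal((3, 3)) for _ in range(2)]
B_list = [rng.standard_normal((3, 3)) for _ in range(2)]
X_true = rng.standard_normal((3, 3))
C = sum(Ai @ X_true @ Bi for Ai, Bi in zip(A_list, B_list))

X = gradient_solve(A_list, B_list, C)
# The residual should be small; convergence is slow for ill-conditioned data.
print(np.linalg.norm(sum(Ai @ X @ Bi for Ai, Bi in zip(A_list, B_list)) - C))
```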

1.3 Multivariate Linear Matrix Equations

We first provide a survey of linear matrix equations with two unknown matrices in the first two subsections. In the third subsection of this section, a survey will be given on the matrix equations with more than two unknown matrices.

1.3.1 Roth Matrix Equations

The following matrix equation has been considered in [210] by Roth

AX − YB = C,   (1.18)

where A ∈ R^{m×p}, B ∈ R^{k×n}, and C ∈ R^{m×n} are known matrices, and X ∈ R^{p×n} and Y ∈ R^{m×k} are the matrices to be determined. For convenience, the matrix equation (1.18) will be called the Roth matrix equation in this monograph. It was shown in [210] that the Roth matrix equation (1.18) has a solution if and only if the following two partitioned matrices are equivalent:

$$\begin{bmatrix} A & C \\ 0 & B \end{bmatrix}, \quad \begin{bmatrix} A & 0 \\ 0 & B \end{bmatrix}.$$

An alternative proof of this result was given in [126]. In [40], necessary and sufficient conditions for the solvability of this equation were given by using singular value decomposition (SVD). The Roth matrix equation was further investigated in [8], and the existence conditions and general solutions were expressed in terms of generalized inverses. In addition, the extremal ranks of the solutions to the Roth matrix equation (1.18) were given in [180]. In [4], the Roth matrix equation (1.18) was revisited based on the rank condition of a partitioned matrix. In the following theorem, some typical results on the Roth matrix equation (1.18) are summarised.

Theorem 1.9 ([8, 180]) Given matrices A ∈ R^{m×p}, B ∈ R^{k×n}, and C ∈ R^{m×n}, the Roth matrix equation (1.18) has a solution if and only if

$$\left( I - A A^- \right) C \left( I - B^- B \right) = 0.$$

If this is true, the general solution of (1.18) has the form

$$\begin{cases} X = A^- C + A^- Z B + \left( I - A^- A \right) W \\ Y = -\left( I - A A^- \right) C B^- + Z - \left( I - A A^- \right) Z B B^- \end{cases}$$

where W ∈ R^{p×n} and Z ∈ R^{m×k} are two arbitrary matrices. Further, the maximal and minimal ranks of a pair of solutions to (1.18) are given by

$$\max_{AX-YB=C} \operatorname{rank} X = \min\left\{ n,\; p,\; p - \operatorname{rank} A + \operatorname{rank}\begin{bmatrix} B \\ C \end{bmatrix} \right\},$$

$$\max_{AX-YB=C} \operatorname{rank} Y = \min\left\{ m,\; k,\; k - \operatorname{rank} B + \operatorname{rank}\begin{bmatrix} A & C \end{bmatrix} \right\},$$

$$\min_{AX-YB=C} \operatorname{rank} X = \operatorname{rank}\begin{bmatrix} B \\ C \end{bmatrix} - \operatorname{rank} B,$$

$$\min_{AX-YB=C} \operatorname{rank} Y = \operatorname{rank}\begin{bmatrix} A & C \end{bmatrix} - \operatorname{rank} A.$$
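Theorem 1.9 can likewise be checked numerically with the Moore-Penrose pseudoinverse substituted for A^- and B^-; the data below are randomly generated so that (1.18) is solvable by construction, and the sketch verifies that the parametrized pair (X, Y) solves the equation for arbitrary Z and W.

```python
import numpy as np

rng = np.random.default_rng(3)
m, p, k, n = 4, 3, 3, 5
A = rng.standard_normal((m, p))
B = rng.standard_normal((k, n))
X0 = rng.standard_normal((p, n))
Y0 = rng.standard_normal((m, k))
C = A @ X0 - Y0 @ B                          # guarantees solvability

Ai, Bi = np.linalg.pinv(A), np.linalg.pinv(B)
# Solvability condition of Theorem 1.9
assert np.allclose((np.eye(m) - A @ Ai) @ C @ (np.eye(n) - Bi @ B), 0)

Z = rng.standard_normal((m, k))
W = rng.standard_normal((p, n))
X = Ai @ C + Ai @ Z @ B + (np.eye(p) - Ai @ A) @ W
Y = -(np.eye(m) - A @ Ai) @ C @ Bi + Z - (np.eye(m) - A @ Ai) @ Z @ B @ Bi
print(np.allclose(A @ X - Y @ B, C))         # True
```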

In [9, 149, 196, 236], a more general equation

AXB + CYD = E   (1.19)

was investigated. For convenience, this equation is called the generalized Roth matrix equation. A necessary and sufficient condition for its solvability and a representation of its general solution were established in [9, 149] in terms of generalized inverses. It was shown in [196, 236] that the matrix equation (1.19) has a solution if and only if the following rank conditions hold:

$$\begin{cases} \operatorname{rank}\begin{bmatrix} A & C & E \end{bmatrix} = \operatorname{rank}\begin{bmatrix} A & C \end{bmatrix} \\[4pt] \operatorname{rank}\begin{bmatrix} B \\ D \\ E \end{bmatrix} = \operatorname{rank}\begin{bmatrix} B \\ D \end{bmatrix} \\[4pt] \operatorname{rank}\begin{bmatrix} E & A \\ D & 0 \end{bmatrix} = \operatorname{rank} A + \operatorname{rank} D \\[4pt] \operatorname{rank}\begin{bmatrix} E & C \\ B & 0 \end{bmatrix} = \operatorname{rank} B + \operatorname{rank} C \end{cases}$$

In addition, some generalized inverse conditions were also established in [236] for the existence of solutions to the matrix equation (1.19), and the general solution was also given with the help of generalized inverses. In [288], necessary and sufficient conditions for the existence and uniqueness of the solutions of (1.19) were given by using the canonical correlation decomposition of matrix pairs. Besides, necessary and sufficient conditions for the solvability of (1.19) were given in [40] by using generalized singular value decompositions. In [199], a finite iterative method was proposed for solving the generalized Roth matrix equation (1.19). When the matrix equation is solvable, a solution pair can be obtained for any initial matrix pair within finitely many iteration steps in the absence of round-off errors.

1.3.2 First-Order Generalized Sylvester Matrix Equations

In controller design of linear systems, the following Sylvester matrix equation is often encountered:

AX + BY = XF,   (1.20)

where A ∈ R^{n×n}, B ∈ R^{n×r}, and F ∈ R^{p×p} are known matrices, and X ∈ R^{n×p} and Y ∈ R^{r×p} are the matrices to be determined. This matrix equation plays a very vital role in eigenstructure assignment [130, 169], pole assignment [214], and so on. Its dual form is the following so-called Sylvester-observer matrix equation:

XA + YC = FX,   (1.21)

where A ∈ R^{n×n}, C ∈ R^{m×n}, and F ∈ R^{p×p} are known matrices, and X ∈ R^{p×n} and Y ∈ R^{p×m} are the matrices to be determined. It is well-known that the existence of a Luenberger observer for linear systems can be characterized based on this equation [60]. A more general form of (1.20) is

AX + BY = EXF. (1.22)

This generalized Sylvester matrix equation appears in the field of descriptor linear systems [81], and can be used to solve the problems of eigenstructure assignment [64] and output regulation [211] for descriptor linear systems. An important variation of (1.22), called generalized Sylvester-observer equation

XA + YC = FXE,   (1.23)

where X and Y need to be determined, arises in observer design [32, 254] and fault detection [127] of descriptor linear systems. A more general form of the matrix equation (1.22) is the following equation:

$$AX - EXF = B_0Y + B_1YF, \qquad (1.24)$$
which was studied in [82] for solving eigenstructure assignment in a type of generalized descriptor linear systems. Besides the Sylvester matrix equations (1.20)–(1.23), in [305] the following matrix equation was investigated:

$$MXF - X = TY, \qquad (1.25)$$
where $M \in \mathbb{R}^{n\times n}$, $F \in \mathbb{R}^{p\times p}$, and $T \in \mathbb{R}^{n\times r}$ are known matrices. Obviously, the matrix equations (1.20)–(1.25) are homogeneous. In fact, their nonhomogeneous counterparts have also been investigated. For example, the so-called regulator equation
$$AX + BY = XF + R$$
was considered in [300]. The nonhomogeneous generalized Sylvester matrix equation
$$AX + BY = EXF + R$$
was studied in [78, 278]. The nonhomogeneous equation

$$AX - EXF = B_0Y + B_1YF + R$$
was studied in [86]. In all the aforementioned equations, the highest power of the square matrix $F$ is 1. For this reason, these equations are called first-order generalized Sylvester matrix equations.
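Although the explicit solution theory surveyed below is the main theme of this section, for modest dimensions the equations of this family can also be solved numerically by vectorization. The following sketch, given only as an illustration (the function name solve_regulator_equation is ours, and a least-squares call is used so that the stacked system need not be square), rewrites the regulator equation $AX + BY = XF + R$ as one linear system in the stacked unknown $(\operatorname{vec} X, \operatorname{vec} Y)$ using NumPy.

```python
import numpy as np

def solve_regulator_equation(A, B, F, R):
    """Solve AX + BY = XF + R for (X, Y) by vectorization.

    vec(AX) = (I_p kron A) vec(X), vec(XF) = (F^T kron I_n) vec(X),
    vec(BY) = (I_p kron B) vec(Y), so the equation becomes a single
    linear system in [vec(X); vec(Y)].
    """
    n, r = B.shape
    p = F.shape[0]
    G = np.hstack([np.kron(np.eye(p), A) - np.kron(F.T, np.eye(n)),
                   np.kron(np.eye(p), B)])
    sol, *_ = np.linalg.lstsq(G, R.reshape(-1, order="F"), rcond=None)
    X = sol[:n * p].reshape(n, p, order="F")
    Y = sol[n * p:].reshape(r, p, order="F")
    return X, Y

# quick consistency check on random data
rng = np.random.default_rng(0)
A, B, F = rng.normal(size=(4, 4)), rng.normal(size=(4, 2)), rng.normal(size=(3, 3))
X0, Y0 = rng.normal(size=(4, 3)), rng.normal(size=(2, 3))
R = A @ X0 + B @ Y0 - X0 @ F
X, Y = solve_regulator_equation(A, B, F, R)
print(np.linalg.norm(A @ X + B @ Y - X @ F - R))  # ~1e-14
```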

1.3.2.1 Solution Approaches

There have been many numerical algorithms for solving these matrix equations. In [60], an orthogonal-transformation-based algorithm was proposed to solve the Sylvester-observer matrix equation (1.21). In this algorithm, the matrix pair (A, C) is first transformed via a unitary state-space transformation into staircase form. With such a transformation, the solution of the Sylvester-observer matrix equation can be obtained by a reduced dimensional matrix equation with Schur form. The advantage of this approach is that one can use more degrees of freedom in the equation to find a solution matrix with some desired robustness properties such as the minimum norm. In [45], a computational method for solving the matrix equation (1.21) was proposed when A is large and sparse. This method uses Arnoldi's reduction in the initial process, and allows an arbitrary choice of distinct eigenvalues of the matrix F. The numerical aspects of the method in [45] were discussed in [30], and a strategy was presented for choosing the eigenvalues of F. In [22], in view of the design requirement, the Sylvester matrix equation (1.20) was first changed into a normal Sylvester matrix equation by choosing F in block upper Hessenberg form and fixing Y to a special matrix, and then a parallel algorithm was given by reducing the matrix A to lower Hessenberg form. In [44], an algorithm was proposed to construct an orthogonal solution of the Sylvester-observer matrix equation (1.21) by generalizing the classical Arnoldi method. In [31], a block algorithm was proposed to compute a full rank solution of (1.21). This algorithm does not require the reduction of the matrix A. On the numerical solutions of (1.22) and (1.23), only a few results were reported in the literature. In [32], a singular value decomposition (SVD) based block algorithm was proposed for solving the generalized Sylvester-observer matrix equation (1.23). In this algorithm, the matrix F needs to be chosen in a special block form, and the matrices E, A, and C are not necessarily reduced to any canonical forms. In [33], a new algorithm was proposed to numerically solve the generalized Sylvester-observer matrix equation (1.23). This algorithm can be viewed as a natural generalization of the well-known observer-Hessenberg algorithm in [22]. In this algorithm, the matrices E and A should be respectively transformed into an upper and a block upper Hessenberg matrix by orthogonal transformation. The algorithm in [33] was improved in [34], and was applied to state and velocity estimation in vibrating systems. The aforementioned numerical approaches for solving the four first-order generalized Sylvester matrix equations (1.20), (1.21), (1.22), and (1.23) can only give one solution each time. However, for several applications it is important to obtain general solutions of these equations. For example, in the robust pole assignment problem one encounters optimization problems in which the criterion function can be expressed in terms of the solutions to a Sylvester matrix equation [164]. When F is in a Jordan form, an attractive analytical and restriction-free solution was presented in [231] for the matrix equation (1.21). Reference [66] proposes two solutions to the Sylvester matrix equation (1.20) for the case where the matrix F is in a Jordan form. One is in a finite iterative form, and the other is in an explicit form.
To obtain the explicit solution given in [66], one needs to carry out a right coprime factorization of $(sI - A)^{-1}B$ (when the eigenvalues of the Jordan matrix F are undetermined) or a series of singular value decompositions (when the eigenvalues of F are known). When the matrix F is in a companion form, an explicit solution expressed by a symmetric operator matrix and a controllability matrix was established in [301].

In many applications, for example, model reference control [96, 100] and Luenberger observer design [93], the Sylvester matrix equation in the form of (1.20) with F being an arbitrary matrix is often encountered. Therefore, it is useful and interesting to give complete and explicit solutions by using the general matrix F itself directly. For such a case, a finite series solution to the Sylvester matrix equation (1.20) was proposed in [300]. Some equivalent forms of such a solution were also provided in that paper. On the generalized Sylvester matrix equation (1.22), an explicit solution was provided in [64] when F is in a Jordan form. This solution is given by a finite iteration. In addition, a direct explicit solution for the matrix equation (1.22) was established in [69] with the help of the right coprime factorization of $(sE - A)^{-1}B$. The results in [64, 69] can be viewed as a generalization of those in [66]. The case where F is in a general form was firstly investigated in [303], and a complete parametric solution was presented by using the coefficient matrices of a right coprime factorization of $(sE - A)^{-1}B$. In [247], an explicit solution expressed by a generalized R-controllability matrix and a generalized symmetric operator matrix was also given. In order to obtain this solution, one needs to solve a standard unilateral matrix equation. In [262], an explicit solution was also given for the generalized Sylvester matrix equation (1.22) in terms of an R-controllability matrix, a generalized symmetric operator matrix and an observability matrix by using the Leverrier algorithm for descriptor linear systems. In [202], the generalized Sylvester matrix equation (1.22) was solved by transforming it into a linear vector equation with the help of Kronecker products. Now, we give the results in [69, 303] on the generalized Sylvester matrix equation (1.22). Due to the block diagonal structure of Jordan forms, when the result is stated for the case of Jordan forms, the matrix F is chosen to be the following matrix:
$$F = \begin{bmatrix} \theta & 1 & & \\ & \theta & \ddots & \\ & & \ddots & 1 \\ & & & \theta \end{bmatrix} \in \mathbb{C}^{p\times p}. \qquad (1.26)$$

Theorem 1.10 ([69]) Given matrices $E, A \in \mathbb{R}^{n\times n}$ and $B \in \mathbb{R}^{n\times r}$ satisfying
$$\operatorname{rank}\begin{bmatrix} sE - A & B \end{bmatrix} = n \qquad (1.27)$$
for all $s \in \mathbb{C}$, let $F$ be in the form of (1.26). Further, let $N(s) \in \mathbb{R}^{n\times r}[s]$ and $D(s) \in \mathbb{R}^{r\times r}[s]$ be a pair of right coprime polynomial matrices satisfying

$$(A - sE)N(s) + BD(s) = 0. \qquad (1.28)$$

Then, all the solutions to the generalized Sylvester matrix equation (1.22) are given by
$$\begin{bmatrix} x_k \\ y_k \end{bmatrix} = \begin{bmatrix} N(\theta) \\ D(\theta) \end{bmatrix} f_k + \left.\frac{\mathrm{d}}{\mathrm{d}s}\begin{bmatrix} N(s) \\ D(s) \end{bmatrix}\right|_{s=\theta} f_{k-1} + \cdots + \frac{1}{(k-1)!}\left.\frac{\mathrm{d}^{k-1}}{\mathrm{d}s^{k-1}}\begin{bmatrix} N(s) \\ D(s) \end{bmatrix}\right|_{s=\theta} f_1, \quad k \in I[1, p],$$

where $f_i \in \mathbb{C}^r$, $i \in I[1, p]$, are a group of arbitrary vectors. In the preceding expression, $x_k$ and $y_k$ are the $k$-th columns of the matrices $X$ and $Y$, respectively.

Theorem 1.11 ([303]) Given matrices $E, A \in \mathbb{R}^{n\times n}$, $B \in \mathbb{R}^{n\times r}$, and $F \in \mathbb{R}^{p\times p}$ satisfying (1.27) for all $s \in \mathbb{C}$, let

$$N(s) = \sum_{i=0}^{\omega-1} N_i s^i \in \mathbb{R}^{n\times r}[s], \quad \text{and} \quad D(s) = \sum_{i=0}^{\omega} D_i s^i \in \mathbb{R}^{r\times r}[s]$$
be a pair of right coprime polynomial matrices satisfying (1.28). Then, all the solutions to the generalized Sylvester matrix equation (1.22) are given by
$$\begin{cases} X = N_0 Z + N_1 Z F + \cdots + N_{\omega-1} Z F^{\omega-1}, \\ Y = D_0 Z + D_1 Z F + \cdots + D_{\omega} Z F^{\omega}, \end{cases}$$
where $Z \in \mathbb{R}^{r\times p}$ is an arbitrary matrix.
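For numerical experimentation, the finite-series solution of Theorem 1.11 is simple to evaluate once the coefficient matrices of a right coprime factorization are available. The sketch below (NumPy; the function name build_solution_from_factorization is ours) merely accumulates the two sums for a given parameter matrix Z; it assumes the supplied coefficients N_i, D_i do come from a factorization satisfying (1.28).

```python
import numpy as np

def build_solution_from_factorization(N_coeffs, D_coeffs, Z, F):
    """Evaluate X = sum_i N_i Z F^i and Y = sum_i D_i Z F^i (Theorem 1.11).

    N_coeffs, D_coeffs: lists of coefficient matrices N_i, D_i of a right
    coprime factorization satisfying (A - sE) N(s) + B D(s) = 0.
    Z: arbitrary r-by-p parameter matrix; F: p-by-p matrix.
    """
    p = F.shape[0]
    Fi = np.eye(p)
    X = np.zeros((N_coeffs[0].shape[0], p))
    Y = np.zeros((D_coeffs[0].shape[0], p))
    for i in range(max(len(N_coeffs), len(D_coeffs))):
        if i < len(N_coeffs):
            X += N_coeffs[i] @ Z @ Fi
        if i < len(D_coeffs):
            Y += D_coeffs[i] @ Z @ Fi
        Fi = Fi @ F   # F^{i+1} for the next term
    return X, Y
```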

For the matrix equation (1.24), based on the concept of F -coprimeness, degrees of freedom existing in the general solution to this type of equations were first given in [82], and then a general complete parametric solution in explicit closed form was established based on generalized right factorization. On the matrix equation (1.25), a neat explicit parametric solution was given in [305] by using the coefficients of the characteristic polynomial and adjoint polynomial matrices. For the matrix equation AX + BY = EXF + R, an explicit parametric solution was established in [78] by elementary transformation of polynomial matrices when F is in a Jordan form; while in [278] the solution of this equation was given based on the solution of a standard matrix equation without any structural restriction on the coefficient matrices.

1.3.2.2 Applications in Control Systems Design

The preceding several kinds of Sylvester matrix equations have been extensively applied to control systems design. In eigenstructure assignment problems of linear systems, the Sylvester matrix equation (1.20) plays vital roles. By using a parametric solution to time-varying Sylvester matrix equations, eigenstructure assignment problems were considered in [105] for time-varying linear systems. In [66], a parametric approach was proposed for state feedback eigenstructure assignment in linear systems based on explicit solutions to the Sylvester matrix equation (1.20). In [63], the robust pole assignment problem was solved via output feedback for linear systems by combining parametric solutions of two Sylvester matrix equations in the form of

(1.20) with eigenvalue sensitivity theory. In [73, 224], the output feedback eigenstructure assignment was investigated for linear systems. In [224], the problem was solved by using two coupled Sylvester matrix equations and the concept of (C, A, B)-invariance; while in [73] the problem was handled by using an explicit parametric solution to the Sylvester matrix equation based on singular value decompositions. In [67], the problem of eigenstructure assignment via decentralized output feedback was solved by the parametric solution proposed in [66] for the Sylvester matrix equation (1.20). In [65], a complete parametric approach for eigenstructure assignment via dynamical compensators was proposed based on the explicit solutions of the Sylvester matrix equation (1.20). In [91], the parametric approach in [65] was utilized to deal with the robust control of a basic current-controlled magnetic bearing by an output dynamical compensator. In [39], disturbance suppressible controllers were designed by using Sylvester equations based on a left eigenstructure assignment scheme. In addition, some observer design problems can also be solved in the framework of explicit parametric solutions of the Sylvester matrix equation (1.20). For example, in [99] the design of Luenberger observers with loop transfer recovery was considered; an eigenstructure assignment approach was proposed in [95] for the design of proportional integral observers for continuous-time linear systems. A further application of parametric solutions to the Sylvester matrix equation (1.20) is in fault detection. In [98], the problem of fault detection in linear systems was investigated based on Luenberger observers. In [101], the problem of fault detection based on proportional integral observers was solved by using the parametric solutions given in [66]. In some design problems of descriptor linear systems, the generalized Sylvester matrix equations (1.22) and (1.23) are very important. In [64], the eigenstructure assignment via state feedback was investigated for descriptor linear systems based on the proposed explicit solution to the generalized Sylvester matrix equation (1.22). The parametric solution of (1.22) proposed in [69] was applied to state feedback eigenstructure assignment and response analysis in [70], output feedback eigenstructure assignment in [68] and eigenstructure assignment via static proportional plus derivative state feedback in [97]. In [291], the obtained iterative solution to the generalized Sylvester matrix equation (1.22) was used to solve the eigenstructure assignment problem for descriptor linear systems. In [94], disturbance decoupling via output feedback in descriptor linear systems was investigated. This problem was tackled by output feedback eigenstructure assignment with the help of parametric solutions to the generalized Sylvester matrix equation (1.22). Also, the parametric solution of (1.22) in [69] was used to design some proportional integral observers for descriptor linear systems in [244–246, 249]. In [251], the parametric solution given in [69] was applied to the design of generalized proportional integral derivative observers for descriptor linear systems. In addition, the result on a parametric solution in [247] to the matrix equation (1.22) has been used to design proportional multi-integral observers for descriptor linear systems in [254] for the discrete-time case and in [263] for the continuous-time case.

1.3.3 Second-Order Generalized Sylvester Matrix Equations

In analysis and design of second-order linear systems [74, 77, 166], the following matrix equation is often encountered

$$MXF^2 + DXF + KX = BY, \qquad (1.29)$$
where $X$ and $Y$ are unknown matrices. A more general form of (1.29) is as follows:

$$MXF^2 + DXF + KX = B_2YF^2 + B_1YF + B_0Y, \qquad (1.30)$$
which was proposed in [84] to investigate the problem of generalized eigenstructure assignment in a type of second-order linear systems. The following nonhomogeneous form of (1.30) was also studied in [87]:

$$MXF^2 + DXF + KX = B_2YF^2 + B_1YF + B_0Y + R, \qquad (1.31)$$
where $R$ is an additional parameter matrix. In the previous equations, the largest degree of $F$ is 2, and thus they are called second-order generalized Sylvester matrix equations. When the matrix $M$ is nonsingular and the matrix $F$ is in a Jordan form, two general parametric solutions were established in [74] for the matrix equation (1.29). The first one mainly depends on a series of singular value decompositions, and is thus numerically simple and reliable. The second one utilizes the right factorization, and allows the eigenvalues of $F$ to be set undetermined. The approaches in [74] were generalized in [77] to the case where the matrix $M$ in (1.29) is not required to be nonsingular. The second-order generalized Sylvester matrix equation (1.29) was revisited in [113]. Differently from [74, 77], the matrix $F$ can be an arbitrary square matrix in [113]. By using the coefficient matrices of the right coprime factorization of the system, a complete general explicit solution to the equation (1.29) was given in a finite series form. In [50], a finite iterative algorithm was proposed for the equation (1.29). In [1], the matrix equation (1.30) with $B_2 = 0$ was considered, and an explicit solution was given by a finite iteration. It can be found that the approach in [1] is a generalization of that in [64]. This case was also handled in [283], where the matrix $F$ is required to be diagonal, and an explicit solution was given by using the right coprime factorization. The homogeneous equation (1.30) was investigated in [84]. It was first shown that the degrees of freedom in the general solution to this equation are determined by a so-called F-left coprime condition, and then, based on a generalized version of matrix fraction right factorizations, a general complete parametric solution to this equation was established for the case where the matrix $F$ is an arbitrary square matrix. In [87], the nonhomogeneous equation (1.31) was studied. Based on the general complete parametric solution to the homogeneous equation (1.30) and Smith form reduction, a general complete parametric solution to this equation was obtained.

This type of second-order Sylvester matrix equations has found many applications. In [1, 74], the problem of eigenstructure assignment in second-order systems was considered via proportional plus derivative feedback. In [74], the problem was solved based on the parametric solution of (1.29), while in [1] it was solved on the basis of the solution of (1.30) with $B_2 = 0$. The same problem was considered in [77] for descriptor second-order linear systems based on the proposed explicit solutions of the matrix equation (1.29) with singular $M$. In [283], a type of dynamical compensator for second-order systems was designed based on explicit solutions to second-order Sylvester equations in the form of (1.30) with $B_2 = 0$. In [304], the design of Luenberger function observers was investigated for second-order systems. The gain matrices were given in terms of parametric solutions to the so-called second-order generalized Sylvester-observer equation, which is the dual form of the matrix equation (1.30) with $B_2 = 0$.

1.3.4 High-Order Generalized Sylvester Matrix Equations

In [79, 108], the following so-called high-order generalized Sylvester matrix equation was investigated:
$$\sum_{i=0}^{m} A_i X F^i = BY, \qquad (1.32)$$

where $A_i \in \mathbb{R}^{n\times n}$, $i \in I[0, m]$, $B \in \mathbb{R}^{n\times r}$, and $F \in \mathbb{R}^{p\times p}$ are known matrices, and $X$ and $Y$ are the matrices to be determined. When $A_m$ is nonsingular and $F$ is diagonal, two general parametric solutions to this type of matrix equations were established in [79]. Moreover, these two results were applied in [79] to eigenstructure assignment of high-order linear systems via proportional plus derivative feedback. The result on eigenstructure assignment in [79] was used in [109] to deal with the problem of robust pole assignment. The case of singular $A_m$ was considered for the matrix equation (1.32) in [108]. In [294], the problem of model reference control was considered for high-order descriptor linear systems. It can be found that this problem is also related to the matrix equation (1.32). In addition, the so-called rectangular high-order Sylvester equations in the form of

$$\sum_{i=0}^{m} A_i X F^i = R$$
were investigated in [88, 89] for $F$ arbitrary and Jordan, respectively. In [107], the problem of observer design was investigated in high-order descriptor linear systems. In order to obtain all the observer gains, the following equation is encountered:
$$\sum_{i=0}^{m} F^i X A_i = \sum_{i=0}^{m-1} F^i Y C_i. \qquad (1.33)$$

When $F$ is a Jordan matrix, an explicit solution was given in [107] for this equation. In [292, 293], the problems of output feedback eigenstructure assignment were considered for high-order normal and descriptor linear systems, respectively. In order to solve these problems, the matrix equations (1.32) and (1.33) are involved. In [292, 293], simple solutions were presented for the matrix equation (1.33) with $F$ diagonal. It is easily found that the aforementioned approaches for solving high-order generalized Sylvester equations require the matrix $F$ to be in a Jordan form. In [80], a complete general parametric solution to the matrix equation (1.32) without any restriction on $F$ was given. By using the result in [80], complete parametric control approaches for high-order descriptor linear systems were proposed in [110]. Based on the proposed parametric design approaches, a parametric method for the gain-scheduling controller design of a linear time-varying system was proposed, and the design of a BTT missile autopilot was carried out. The following new type of high-order Sylvester equation was proposed in [83]:

$$\sum_{i=0}^{m} A_i X F^i = \sum_{i=0}^{m} B_i Y F^i. \qquad (1.34)$$

As in [80], the matrix $F$ is also assumed in [83] to be arbitrary. A complete explicit solution was given in [83] for the matrix equation (1.34) by using a generalized matrix fraction right factorization. The nonhomogeneous version of (1.34) in the following form
$$\sum_{i=0}^{m} A_i X F^i = \sum_{i=0}^{m} B_i Y F^i + R$$
was investigated in [85]. A complete general parametric solution in a neat explicit closed form was established in [85] for the preceding equation using the feature of F-coprimeness. In addition, the so-called Sylvester-polynomial matrix equation was investigated in [257], and a complete parametric solution was given with the aid of the so-called Kronecker maps.
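The vectorization idea sketched earlier for the first-order family extends directly to high-order equations such as (1.32): each term $A_iXF^i$ contributes a block $(F^i)^{\mathrm{T}} \otimes A_i$. A minimal NumPy sketch (the function name is ours) assembling the corresponding coefficient matrix is given below; for small dimensions, a null-space basis of this matrix then parametrizes all solution pairs (X, Y).

```python
import numpy as np

def high_order_coefficient_matrix(A_list, B, F):
    """Build G such that G @ [vec(X); vec(Y)] = 0 encodes sum_i A_i X F^i = B Y.

    Uses vec(A_i X F^i) = ((F^i)^T kron A_i) vec(X) and vec(BY) = (I kron B) vec(Y).
    """
    n = A_list[0].shape[0]
    p = F.shape[0]
    Fi = np.eye(p)
    GX = np.zeros((n * p, n * p))
    for Ai in A_list:
        GX += np.kron(Fi.T, Ai)
        Fi = Fi @ F
    GY = -np.kron(np.eye(p), B)
    return np.hstack([GX, GY])
```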

1.3.5 Linear Matrix Equations with More Than Two Unknowns

There are some results on linear matrix equations with more than two unknown matrices. A method for investigating the solvability of matrix equations is based on the minimal rank of a matrix expression. In [287], the minimal rank was given for the matrix expression
$$A - \sum_{i=1}^{4} B_i X_i C_i$$
with $X_1$, $X_2$, $X_3$, and $X_4$ being variant matrices. By setting this obtained minimal rank to be zero, the solvability condition can be immediately obtained for the matrix equation
$$\sum_{i=1}^{4} B_i X_i C_i = A$$
with respect to the unknown matrices $X_1$, $X_2$, $X_3$, and $X_4$. In [312], the following matrix equation was considered:
$$\sum_{k=1}^{h} \sum_{i=0}^{\omega_k} A_{ki} X_k F^i = 0,$$
where $X_k$, $k \in I[1, h]$, are unknown matrices. By using the so-called Kronecker matrix polynomials as tools, the explicit general solution to this matrix equation was given. In [253], a class of generalized Sylvester matrix equations with three unknown matrices was considered. An explicit solution was given based on elementary transformation of polynomial matrices. This solution can provide all the degrees of freedom, and was applied in [253] to obtain a parametric design for proportional multiple-integral derivative observers of descriptor linear systems.

1.4 Coupled Linear Matrix Equations

Coupled matrix equations have wide applications in several areas, such as stability theory, control theory, perturbation analysis, and some other fields of pure and applied mathematics. When computing an additive decomposition of a generalized transformation matrix [158], one naturally encounters the following generalized Sylvester matrix equation:
$$\begin{cases} XA - DY = E, \\ XC - BY = F, \end{cases} \qquad (1.35)$$
where $X$ and $Y$ are the unknown matrices. In addition, the generalized Sylvester matrix equation (1.35) arises in computing the deflating subspace of descriptor linear systems [159]. In stability analysis of linear jump systems with Markovian transitions, the following two types of coupled Lyapunov matrix equations are often encountered [25, 185]:

$$A_i^{\mathrm{T}} P_i + P_i A_i + Q_i + \sum_{j=1}^{N} p_{ij} P_j = 0, \quad Q_i > 0, \quad i \in I[1, N], \qquad (1.36)$$
and
$$P_i = A_i^{\mathrm{T}}\left(\sum_{j=1}^{N} p_{ij} P_j\right) A_i + Q_i, \quad Q_i > 0, \quad i \in I[1, N], \qquad (1.37)$$
where $P_j$, $j \in I[1, N]$, are the matrices to be determined. A more general coupled matrix equation has the following form:

$$A_{i1}X_1B_{i1} + A_{i2}X_2B_{i2} + \cdots + A_{ip}X_pB_{ip} = E_i, \quad i \in I[1, N], \qquad (1.38)$$
where $X_j$, $j \in I[1, p]$, are the matrices to be determined. Due to their broad applications, coupled matrix equations have attracted considerable attention from many researchers. In [236], the following coupled matrix equation was investigated:
$$\begin{cases} A_1XB_1 = C_1, \\ A_2XB_2 = C_2. \end{cases}$$

It was shown in [236] that this coupled equation is solvable if and only if the following conditions hold:
$$\begin{cases} \operatorname{rank}\begin{bmatrix} A_1 & C_1 & 0 \\ A_2 & 0 & -C_2 \\ 0 & B_1 & B_2 \end{bmatrix} = \operatorname{rank}\begin{bmatrix} A_1 \\ A_2 \end{bmatrix} + \operatorname{rank}\begin{bmatrix} B_1 & B_2 \end{bmatrix}, \\[4pt] \operatorname{rank}\begin{bmatrix} A_1 & C_1 \end{bmatrix} = \operatorname{rank} A_1, \\[2pt] \operatorname{rank}\begin{bmatrix} A_2 & C_2 \end{bmatrix} = \operatorname{rank} A_2, \\[2pt] \operatorname{rank}\begin{bmatrix} C_1 \\ B_1 \end{bmatrix} = \operatorname{rank} B_1, \\[2pt] \operatorname{rank}\begin{bmatrix} C_2 \\ B_2 \end{bmatrix} = \operatorname{rank} B_2. \end{cases}$$

It was pointed out in [43] that the existence of a positive definite solution to the discrete-time Markovian jump Lyapunov (DMJL) matrix equation (1.37) is related to the requirement that the spectral radius of an associated matrix be less than one. When A, B, C, and D are square matrices, it was shown in [41] that the matrix equation (1.35) has a unique solution if and only if the spectra of the pencils (A − sC) and (D − sB) have an empty intersection. This conclusion implies a dual relation between the matrix equations (1.35) and (1.15). The existence condition of a solution to the equation (1.35) with square A, B, C, and D was given in [240]. A direct method for solving these kinds of coupled matrix equations is to convert them into matrix-vector equations by using the Kronecker product. With this idea, a simple and elegant explicit solution to the DMJL matrix equation (1.37) was given in [43] in terms of matrix inverses. However, computational difficulties arise when the dimensions of the coefficient matrices are high. To eliminate the high dimensionality problem, one can first transform the coefficient matrices into some special forms, and then solve some matrix equations that are more easily dealt with. Such a kind of method belongs to the so-called transformation approaches, which have been widely adopted. For example, the generalized Schur method by applying the QZ algorithm was developed to solve the generalized Sylvester matrix equation (1.35) in [159]. In

[177], based on the projection theorem in Hilbert space, by making use of the generalized singular value decompositions and the canonical correlation decompositions, an analytical expression of the least-squares solution to the simple coupled matrix equation (AXB, GXH) = (C, D) was established. Another kind of efficient approach for solving coupled matrix equations is the class of iterative algorithms. A parallel iterative scheme for solving the DMJL equation (1.37) was proposed in [26]. Under the condition that the corresponding Markovian jump system is stochastically stable, this method was proven to converge to the exact solution if the initial condition chosen is zero. In [237], the restriction of zero initial conditions in [26] was removed, and a new iterative method was provided by using the solutions of standard discrete-time Lyapunov equations as its intermediate steps. In [248], two new iterative algorithms were developed for solving the coupled Lyapunov matrix equation (1.37) by using the latest estimation. The idea in [26] was also extended to solve the continuous-time Markovian jump Lyapunov (CMJL) matrix equation (1.36) in [25]. By using the concept of latest estimation proposed in [248], an implicit sequential algorithm was proposed in [201] to solve the CMJL matrix equation (1.36) based on the algorithm in [26]. Besides, implicit iterative algorithms with some tunable parameters were developed in [255] to solve the coupled Lyapunov matrix equation (1.36). A significant feature of the algorithms in [255] is that the iterative sequences are updated by using not only the information in the last step, but also the information in the current step and the previous steps. The aforementioned so-called parallel scheme is only valid for some coupled matrix equations with special structure, and is not applicable to general coupled matrix equations. For some general coupled matrix equations, the hierarchical-identification-principle-based technique was developed by Ding and his co-authors. The basic idea of such an algorithm is to regard the unknown matrices to be found as the system parameters to be identified. In [56], a gradient-based iterative algorithm was proposed for (1.38) with unique solutions by the hierarchical identification principle. In [55], a least-squares iterative algorithm was proposed to solve the generalized Sylvester matrix equation (1.35) based on the hierarchical identification principle [53, 54]. This idea was also used in [55] to solve the general coupled matrix equation (1.38) with p = N. In order to obtain the solution, the concept of the block-matrix inner product was introduced in [55]. Another approach for constructing iterative algorithms is to adopt the concept of optimization. With this idea, a gradient-based algorithm was developed in [309] to solve the coupled Lyapunov matrix equation (1.37). This approach has been generalized to some more complicated cases. Gradient-based iterations were constructed in [306] to solve the general coupled matrix equation (1.38). A significant characteristic of the method in [306] is that a necessary and sufficient condition guaranteeing the convergence of the algorithm can be explicitly obtained. Moreover, the algorithm in [306] removed the restriction of p = N. In [311], gradient-based iterative algorithms were also proposed to obtain the weighted least-squares solution to the general coupled Sylvester matrix equation (1.38).
Necessary and sufficient conditions guaranteeing the convergence of the proposed algorithms were also derived.
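To make the flavor of these iterations concrete, the sketch below implements the simplest fixed-point scheme suggested by the structure of the DMJL equation (1.37): each P_i is repeatedly replaced by $A_i^{\mathrm{T}}(\sum_j p_{ij}P_j)A_i + Q_i$. This is only a minimal Jacobi-type illustration of the parallel iterations discussed above (it assumes the underlying jump system is stochastically stable, so that the iteration converges), not a reproduction of any specific algorithm from [26, 237, 248].

```python
import numpy as np

def dmjl_fixed_point(A_list, Q_list, P_trans, iters=200):
    """Fixed-point iteration for P_i = A_i^T (sum_j p_ij P_j) A_i + Q_i.

    A_list, Q_list: mode matrices A_i and weights Q_i > 0;
    P_trans: N-by-N transition probability matrix with entries p_ij.
    Returns the iterates P_i after `iters` sweeps, starting from P_i = Q_i.
    """
    N = len(A_list)
    P = [Qi.copy() for Qi in Q_list]          # simple positive definite start
    for _ in range(iters):
        mixed = [sum(P_trans[i, j] * P[j] for j in range(N)) for i in range(N)]
        P = [A_list[i].T @ mixed[i] @ A_list[i] + Q_list[i] for i in range(N)]
    return P
```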

The aforementioned iterative algorithms are only applicable to the case where the involved matrix equation has a unique solution. For a coupled matrix equation with infinitely many solutions, it is suitable to adopt finite iterative methods. This kind of approach is very similar to the conjugate gradient approach for solving ordinary linear matrix-vector equations. The basic idea is to construct a sequence of orthogonal residuals in an appropriate inner product space. With this idea, iterative algorithms were proposed for solving the DMJL matrix equation (1.37) in [229] and the CMJL matrix equation (1.36) in [279]. By using the inner product as a tool, it was proven that the proposed algorithm can obtain the exact solution within finitely many iteration steps. Moreover, the algorithm can be implemented without transformation of the original coefficient matrices into any canonical forms. In [49], a finite iterative algorithm was proposed to obtain the reflexive solution to the generalized Sylvester matrix equation (1.35). The method in [49] was generalized in [51] to obtain the bisymmetric solution of the matrix equation (1.38) with p = N = 2. In [135], the matrix form of the bi-conjugate gradient stabilized (Bi-CGSTAB) method was established for solving a kind of coupled matrix equations with two unknown matrices by employing Kronecker products and vectorization operators.

1.5 Complex Conjugate Matrix Equations

By a complex conjugate matrix equation we mean a complex matrix equation involving the conjugate of the unknown matrices. The earliest investigated complex conjugate matrix equation may be the following matrix equation with the unknown matrix $X$:

$$A\bar{X} - XB = C, \qquad (1.39)$$
which was studied by Bevis, Hall and Hartwig in [21]. This equation can be viewed as a conjugate version of the normal Sylvester matrix equation (1.8). Due to this reason, in this book the matrix equation (1.39) will be referred to as the normal con-Sylvester matrix equation. In [21, 142], a necessary and sufficient condition for the existence of a solution to the normal con-Sylvester equation was established by using the concept of consimilarity [141, 148] associated with two partitioned matrices related to A, B, and C. The general solution to the corresponding homogeneous equation $A\bar{X} - XB = 0$ was given in [21], where it was shown that any solution to the normal con-Sylvester equation can be written as a particular solution plus a complementary solution to the homogeneous equation. In [20], a solution to the normal con-Sylvester matrix equation (1.39) was obtained in the case where the matrices A and B are both in consimilarity Jordan forms. In [258], explicit solutions expressed by the original coefficient matrices were established with the help of a kind of real representation [155] of complex matrices, based on a proposed solution to the normal Sylvester equation. Different from [20], the solution can be obtained in terms of the original coefficient matrices instead of real representation matrices by using a developed result on the characteristic polynomial of real representation matrices. It was also shown in [258] that the solution to the normal con-Sylvester matrix equation (1.39) is unique if and only if $A\bar{A}$ and $B\bar{B}$ have no common eigenvalues. In addition, the homogeneous equation $A\bar{X} - XB = 0$ was also investigated in [225], and it was shown that the solution can be obtained in terms of the Kronecker canonical form of the matrix pencil B + sA and the two non-singular matrices that transform this pencil into its Kronecker canonical form. The following conjugate version of the Kalman-Yakubovich matrix equation (1.9) was firstly investigated in [155]:

$$X - A\bar{X}B = C. \qquad (1.40)$$

Similarly, the matrix equation (1.40) will be referred to as the con-Kalman-Yakubovich matrix equation in this book. In [155], with the aid of a real representation of complex matrices, the solvability conditions and solutions of this equation were established in terms of its real representation matrix equation. Based on a result on characteristic polynomials of the real representation matrix of a complex matrix, an explicit solution of the con-Kalman-Yakubovich matrix equation (1.40) was given in terms of controllability and observability matrices in [280], while in [308] the con-Kalman-Yakubovich equation was transformed into the Kalman-Yakubovich equation. In [264], some Smith-type iterative algorithms were developed, and the corresponding convergence analysis was also given. A general form of the matrix equations (1.39) and (1.40) is the following equation:

$$E\bar{X}F - AX = C,$$
where $X$ is the matrix to be determined, and E, A, F, and C are known complex matrices with E, A, and F being square. This equation was investigated in [242], and three approaches were provided to obtain its solution. The first approach is to transform it into a real matrix equation with the help of real representations of complex matrices. In the second approach, the solution is given in terms of the characteristic polynomial of a constructed matrix pair. In the third approach, the solution can be neatly expressed in terms of controllability matrices and observability matrices. Some conjugate versions of the matrix equations (1.20), (1.22), and (1.25) have also been investigated in the literature. In [269], the following con-Yakubovich matrix equation was considered:
$$X - A\bar{X}F = BY,$$
where X and Y are the unknown matrices. With the aid of a relation on the characteristic polynomial of the real representation of a complex matrix, some explicit parametric expressions of the solutions to this equation were derived. The con-Yakubovich matrix equation was also investigated in [266]. Different from the approach in [269], the solutions of this equation were derived in [266] beyond the framework of real representations of complex matrices. With Smith normal form reduction as a tool, a general complete parametric solution was proposed in terms of the original matrices. This solution is expressed in a finite series form, and can offer all the degrees of freedom, which are represented by a free parameter matrix Z. In [268, 273], the following con-Sylvester matrix equations were investigated:

$$AX + BY = \bar{X}F, \qquad AX + BY = \bar{X}F + R,$$
and the solutions to these two equations were explicitly given. In [268], the solution was obtained based on the Leverrier algorithm, while in [273] the general solution was given by solving a standard linear equation. In [282], the following generalized con-Sylvester matrix equation was considered:

$$AX + BY = E\bar{X}F,$$
and two approaches were established to solve this matrix equation. The first approach is based on the real representation technique. The basic idea is to transform it into the generalized Sylvester matrix equation. In the second approach, the solution to this matrix equation can be explicitly provided. One of the obtained solutions in [282] is expressed in terms of the controllability matrix and the observability matrix. Such a feature may bring much convenience in some analysis and design problems related to the generalized con-Sylvester matrix equation. In [252], an iterative algorithm was presented for solving the nonhomogeneous generalized con-Sylvester matrix equation
$$AX + BY = E\bar{X}F + R.$$

This iterative method in [252] can give an exact solution within finite iteration steps for any initial values in the absence of round-off errors. Another feature of the proposed algorithm is that it is implemented by original coefficient matrices. In [267], the so-called con-Sylvester-polynomial matrix equation, which is a general form of the con-Sylvester and con-Yakubovich matrix equations, was investigated. By using the conjugate product of complex polynomial matrices proposed in [271] as a tool, the complete parametric solution was given for the con-Sylvester-polynomial matrix equation. In [241], the following matrix equation was investigated

$$AX - X^{\mathrm{H}}B = C, \qquad (1.41)$$
where $A \in \mathbb{C}^{m\times n}$, $B \in \mathbb{C}^{n\times m}$, and $C \in \mathbb{C}^{m\times m}$ are known matrices, and $X \in \mathbb{C}^{n\times m}$ is the matrix to be determined. It was pointed out in [241] that this matrix equation has a solution if and only if there exists a nonsingular matrix $S \in \mathbb{C}^{(n+m)\times(n+m)}$ such that
$$\begin{bmatrix} 0 & -A \\ B & 0 \end{bmatrix} = S \begin{bmatrix} C & -A \\ B & 0 \end{bmatrix} S^{\mathrm{H}}.$$

In [48], an alternative proof was provided for the result on the matrix equation (1.41) in [241]. For the matrix equation (1.41) with m = n, a necessary and sufficient condition was given in [168] for the existence of its unique solution. In addition, an efficient numerical algorithm was also given in [48] to solve the matrix equation (1.41) by using the generalized Schur decomposition. For some more general complex conjugate matrix equations, there are results available in the literature. In [281], an iterative algorithm was constructed to solve the so-called extended con-Sylvester matrix equations by using the hierarchical identification principle. Such an idea was generalized to solve some other equations in [219, 265, 272]. Another numerical approach was the finite iterative algorithm. In [275], a finite iterative algorithm was proposed to solve the so-called extended con-Sylvester matrix equations by using a newly defined real inner product as a tool. The real inner product in [275] was extended in [270] to the case of matrix groups. With the help of this inner product, a finite iterative algorithm was established in [270] to solve a class of coupled con-Sylvester matrix equations. The real inner products in [270, 275] were also used to solve some other complex matrix equations in [204, 274].
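As a concrete illustration of the kind of iteration mentioned above for the con-Kalman-Yakubovich equation (1.40), the sketch below runs the most basic Smith-type recursion $X_{k+1} = A\bar{X}_kB + C$, which converges to the solution of $X - A\bar{X}B = C$ when the fixed-point map is a contraction (for instance, when $\|A\|_2\|B\|_2 < 1$). It is only a minimal NumPy sketch under that assumption, not one of the accelerated algorithms of [264].

```python
import numpy as np

def con_kalman_yakubovich_smith(A, B, C, iters=100):
    """Basic Smith-type iteration X_{k+1} = A * conj(X_k) * B + C for (1.40)."""
    X = C.copy()
    for _ in range(iters):
        X = A @ np.conj(X) @ B + C
    return X

# small sanity check: the residual X - A conj(X) B - C should be tiny
rng = np.random.default_rng(1)
A = 0.2 * (rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3)))
B = 0.2 * (rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3)))
C = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
X = con_kalman_yakubovich_smith(A, B, C)
print(np.linalg.norm(X - A @ np.conj(X) @ B - C))
```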

1.6 Overview of This Monograph

Besides the current chapter of introduction, this monograph contains the following four parts:
• Mathematical Preliminaries
• Part I. Iterative Solutions to Complex Conjugate Matrix Equations
• Part II. Explicit Solutions to Complex Conjugate Matrix Equations
• Part III. Applications of Complex Conjugate Matrix Equations
Some basic mathematical tools are given in the chapter of Mathematical Preliminaries. These materials include Kronecker products, Leverrier algorithms and generalized Leverrier algorithms, singular value decompositions, vector norms and operator norms, real representations of complex matrices, consimilarity, real linear spaces, real inner product spaces, and optimization in the complex domain. The results in Sects. 2.1–2.9 will be used to obtain the solutions of complex conjugate matrix equations, or to derive convergence conditions of the proposed iterative algorithms, and the results in Sect. 2.10 will be used in Chap. 12 to solve the problem of quadratic regulation for antilinear systems. Most contents of this chapter can be found in the literature and some textbooks.
The general structure of Part I is as follows:
• Smith-type iterative approaches (Chap. 3)
• Hierarchical-update-based iterative approaches (Chap. 4)
• Finite iterative approaches (Chap. 5)
Some Smith-type algorithms are proposed in Chap. 3 for con-Kalman-Yakubovich matrix equations. In Chap. 4, by using the principle of hierarchical identification, some iterative approaches are given for some complex conjugate matrix equations, which include the extended con-Sylvester matrix equations and coupled con-Sylvester matrix equations. The convergence conditions are also given for the proposed algorithms. In Chap. 5, finite iterative approaches are established for some complex conjugate matrix equations, and the convergence of the proposed algorithms is analyzed by using real inner products as tools. An important feature of the algorithms given in this part is that they are implemented by using the original coefficient matrices of the considered complex conjugate matrix equations.
The general structure of Part II is as follows:
• Real-representation-based approaches (Chap. 6)
• Polynomial-matrix-based approaches (Chap. 7)
• Unilateral-equation-based approaches (Chap. 8)
• Conjugate products (Chap. 9)
• Con-Sylvester-sum-based approaches (Chap. 10)
In Chap. 6 through Chap. 10, some approaches are provided to obtain explicit solutions of some complex conjugate matrix equations. In Chap. 6, the core idea is to first transform the original equations into real matrix equations by using real representations of complex matrices, and then obtain the explicit solutions by using a result on the characteristic polynomial of the real representation of a complex matrix. In Chap. 7, polynomial matrix techniques are used to derive explicit solutions of the con-Sylvester matrix equations, con-Yakubovich matrix equations and generalized con-Sylvester matrix equations. It can be seen that the standard and generalized Leverrier algorithms play vital roles in these methods. In Chap. 8, the solutions to the con-Sylvester and con-Yakubovich matrix equations can be obtained with the solutions to standard unilateral linear equations. Chapters 9 and 10 aim to give a unified framework for solving the so-called con-Sylvester-polynomial matrix equations. In Chap. 9, the concept of conjugate products for polynomial matrices is proposed, and some properties are given.
In Chap. 10, the con-Sylvester-sum-based approaches are established to solve the general con-Sylvester-polynomial matrix equations with conjugate products of polynomial matrices as tools.
The general structure of Part III is as follows:
• Stability for antilinear systems (Chap. 11)
• Feedback design for antilinear systems (Chap. 12)
This part aims to give some applications of complex conjugate matrix equations. In Chap. 11, stability analysis is investigated for ordinary antilinear systems and stochastic antilinear systems. Some stability criteria are given for these systems in terms of special complex conjugate matrix equations, the so-called anti-Lyapunov matrix equation and coupled anti-Lyapunov matrix equation. In addition, new iterative algorithms are also provided for the coupled anti-Lyapunov matrix equations. In Chap. 12, some feedback design problems are considered for ordinary antilinear systems. The involved problems include generalized eigenstructure assignment problems, model reference tracking problems and quadratic regulation problems. They are all related to some complex conjugate matrix equations.

Chapter 2 Mathematical Preliminaries

In this chapter, some basic mathematical tools, which will be used in the other chapters of this book, are introduced. Throughout this chapter, $I_r$ denotes the $r$-dimensional identity matrix, and $I$ denotes an identity matrix with appropriate dimensions. The field of scalars, denoted by $\mathbb{F}$, is either the field of real numbers $\mathbb{R}$ or the field of complex numbers $\mathbb{C}$. The notation $|\cdot|$ denotes the absolute value of a real number, or the modulus of a complex number. In addition, the symbol $\mathrm{i}$ is used to denote the imaginary unit, that is, $\mathrm{i} = \sqrt{-1}$. For two integers $m \leq n$, $I[m, n]$ is used to denote the set $\{m, m+1, \ldots, n\}$. For matrices, the following symbols are used. $A = [a_{ij}]_{m\times n}$ represents an $m \times n$ matrix $A$ whose entry in the $i$-th row, $j$-th column position is $a_{ij}$. For a square matrix $A$, $\operatorname{tr}(A)$ is used to denote its trace; in addition, it is denoted that

$$f_A(s) = \det(sI - A), \quad \text{and} \quad g_A(s) = \det(I - sA),$$
where $\det(\cdot)$ represents the determinant of a matrix. For a general matrix $A$, $\operatorname{Re}(A)$ and $\operatorname{Im}(A)$ denote the real and imaginary parts of $A$, respectively. For partitioned matrices, a similar symbol is also used. That is, $A = [A_{ij}]_{m\times n}$ represents a matrix $A$ with $m$ row partitions and $n$ column partitions, whose block in the $i$-th row, $j$-th column position is $A_{ij}$. It should be pointed out that all the aforementioned symbols are also valid throughout this monograph.

2.1 Kronecker Products

In this section, we introduce the matrix operation of Kronecker products, a useful notation that has several important applications. A direct application of Kronecker products is to analyze/solve linear matrix equations, which will be seen at the end of this section.

Definition 2.1 Given $A = [a_{ij}] \in \mathbb{F}^{m_1\times n_1}$ and $B = [b_{ij}] \in \mathbb{F}^{m_2\times n_2}$, the Kronecker product of $A$ and $B$ is defined to be the partitioned matrix
$$A \otimes B = \begin{bmatrix} a_{11}B & a_{12}B & \cdots & a_{1n_1}B \\ a_{21}B & a_{22}B & \cdots & a_{2n_1}B \\ \vdots & \vdots & & \vdots \\ a_{m_11}B & a_{m_12}B & \cdots & a_{m_1n_1}B \end{bmatrix} = \left[a_{ij}B\right]_{m_1\times n_1} \in \mathbb{F}^{m_1m_2\times n_1n_2}. \qquad (2.1)$$

The Kronecker product is also called the direct product, or tensor product. Similarly, the left Kronecker product of $A = [a_{ij}]_{m_1\times n_1}$ and $B = [b_{ij}]_{m_2\times n_2}$ can be defined as
$$C = \begin{bmatrix} Ab_{11} & Ab_{12} & \cdots & Ab_{1n_2} \\ Ab_{21} & Ab_{22} & \cdots & Ab_{2n_2} \\ \vdots & \vdots & & \vdots \\ Ab_{m_21} & Ab_{m_22} & \cdots & Ab_{m_2n_2} \end{bmatrix}.$$

Obviously, the above defined matrix C satisfies C = B ⊗ A. This fact implies that it is sufficient to only define the Kronecker product expressed in (2.1). According to Definition 2.1, the following result can be readily obtained.

Lemma 2.1 For two matrices $A$ and $B$ with $A$ in the partitioned form $A = [A_{ij}]_{m\times n}$, there holds
$$A \otimes B = \left[A_{ij} \otimes B\right]_{m\times n}.$$

The next proposition provides some properties of the Kronecker product. The results can follow immediately from the definition, and thus the proof is omitted.

Proposition 2.1 If the dimensions of the involved matrices $A$, $B$, and $C$ are compatible for the defined operations, then the following conclusions hold.
(1) For $\mu \in \mathbb{F}$, $\mu \otimes A = A \otimes \mu = \mu A$;
(2) For $a, b \in \mathbb{F}$, $(aA) \otimes (bB) = (ab)(A \otimes B)$;
(3) $(A + B) \otimes C = A \otimes C + B \otimes C$, $A \otimes (B + C) = A \otimes B + A \otimes C$;
(4) $(A \otimes B)^{\mathrm{T}} = A^{\mathrm{T}} \otimes B^{\mathrm{T}}$, $(A \otimes B)^{\mathrm{H}} = A^{\mathrm{H}} \otimes B^{\mathrm{H}}$;
(5) $\operatorname{tr}(A \otimes B) = (\operatorname{tr} A)(\operatorname{tr} B)$;
(6) For two column vectors $u$ and $v$, $u^{\mathrm{T}} \otimes v = v u^{\mathrm{T}} = v \otimes u^{\mathrm{T}}$.

The following proposition provides the associative property and the mixed-product property for the Kronecker product.

Proposition 2.2 For matrices with appropriate dimensions, the following results hold.
(1) $(A \otimes B) \otimes C = A \otimes (B \otimes C)$;
(2) $(A \otimes B)(C \otimes D) = AC \otimes BD$.

Proof (1) Let $A = [a_{ij}]_{m\times n} \in \mathbb{F}^{m\times n}$, $B \in \mathbb{F}^{p\times q}$, and $C \in \mathbb{F}^{s\times t}$. By using Lemma 2.1 and the second item of Proposition 2.1, one has

$$\begin{aligned}
(A \otimes B) \otimes C
&= \begin{bmatrix} a_{11}B & a_{12}B & \cdots & a_{1n}B \\ a_{21}B & a_{22}B & \cdots & a_{2n}B \\ \vdots & \vdots & & \vdots \\ a_{m1}B & a_{m2}B & \cdots & a_{mn}B \end{bmatrix} \otimes C
= \begin{bmatrix} (a_{11}B) \otimes C & (a_{12}B) \otimes C & \cdots & (a_{1n}B) \otimes C \\ (a_{21}B) \otimes C & (a_{22}B) \otimes C & \cdots & (a_{2n}B) \otimes C \\ \vdots & \vdots & & \vdots \\ (a_{m1}B) \otimes C & (a_{m2}B) \otimes C & \cdots & (a_{mn}B) \otimes C \end{bmatrix} \\
&= \begin{bmatrix} a_{11}(B \otimes C) & a_{12}(B \otimes C) & \cdots & a_{1n}(B \otimes C) \\ a_{21}(B \otimes C) & a_{22}(B \otimes C) & \cdots & a_{2n}(B \otimes C) \\ \vdots & \vdots & & \vdots \\ a_{m1}(B \otimes C) & a_{m2}(B \otimes C) & \cdots & a_{mn}(B \otimes C) \end{bmatrix}
= A \otimes (B \otimes C).
\end{aligned}$$

This is the first conclusion.
(2) Let $A = [a_{ij}]_{m\times n} \in \mathbb{F}^{m\times n}$ and $C = [c_{ij}]_{n\times p} \in \mathbb{F}^{n\times p}$. Then, one has
$$\begin{aligned}
(A \otimes B)(C \otimes D)
&= \begin{bmatrix} a_{11}B & \cdots & a_{1n}B \\ \vdots & & \vdots \\ a_{m1}B & \cdots & a_{mn}B \end{bmatrix}\begin{bmatrix} c_{11}D & \cdots & c_{1p}D \\ \vdots & & \vdots \\ c_{n1}D & \cdots & c_{np}D \end{bmatrix}
= \left[\sum_{j=1}^{n} \left(a_{ij}B\right)\left(c_{jk}D\right)\right]_{m\times p} \\
&= \left[\left(\sum_{j=1}^{n} a_{ij}c_{jk}\right)BD\right]_{m\times p}
= (AC) \otimes (BD).
\end{aligned}$$

The proof is thus completed. 

By using Item (2) of Proposition 2.2, the following two conclusions can be obtained.

Proposition 2.3 If $U$ and $V$ are two square unitary matrices, then $U \otimes V$ is also a unitary matrix.

Proof Since $U$ and $V$ are unitary, we have $UU^{\mathrm{H}} = I$ and $VV^{\mathrm{H}} = I$. By using Item (4) of Proposition 2.1 and Item (2) of Proposition 2.2, one has

$$(U \otimes V)(U \otimes V)^{\mathrm{H}} = (U \otimes V)\left(U^{\mathrm{H}} \otimes V^{\mathrm{H}}\right) = \left(UU^{\mathrm{H}}\right) \otimes \left(VV^{\mathrm{H}}\right) = I \otimes I = I.$$

This implies that U ⊗ V is a unitary matrix. The proof is thus completed. 

Proposition 2.4 For two nonsingular matrices $A_1$ and $A_2$, there holds
$$\left(A_1 \otimes A_2\right)^{-1} = A_1^{-1} \otimes A_2^{-1}.$$

Proof Since $A_1$ and $A_2$ are nonsingular, we have $A_1A_1^{-1} = I$ and $A_2A_2^{-1} = I$. By using Item (2) of Proposition 2.2, one has
$$\left(A_1 \otimes A_2\right)\left(A_1^{-1} \otimes A_2^{-1}\right) = \left(A_1A_1^{-1}\right) \otimes \left(A_2A_2^{-1}\right) = I \otimes I = I.$$

This implies that the conclusion is true. 

According to Proposition 2.4, the Kronecker product of two nonsingular matrices is also nonsingular. With this fact, by further using Item (2) of Proposition 2.2, the following result can be readily obtained.

Proposition 2.5 Given two matrices $A_1 \in \mathbb{F}^{m\times n}$ and $A_2 \in \mathbb{F}^{s\times t}$, there holds
$$\operatorname{rank}(A_1 \otimes A_2) = (\operatorname{rank} A_1)(\operatorname{rank} A_2).$$

Proof Suppose that $\operatorname{rank} A_1 = r_1$ and $\operatorname{rank} A_2 = r_2$. Denote
$$\Lambda_1 = \operatorname{diag}\left(I_{r_1}, 0_{(m-r_1)\times(n-r_1)}\right), \quad \Lambda_2 = \operatorname{diag}\left(I_{r_2}, 0_{(s-r_2)\times(t-r_2)}\right).$$
According to basic matrix theory, there exist nonsingular matrices $P_i$, $Q_i$, $i = 1, 2$, with appropriate dimensions such that $A_i = P_i\Lambda_iQ_i$, $i = 1, 2$. By using Item (2) of Proposition 2.2, one has
$$A_1 \otimes A_2 = (P_1\Lambda_1Q_1) \otimes (P_2\Lambda_2Q_2) = (P_1 \otimes P_2)(\Lambda_1 \otimes \Lambda_2)(Q_1 \otimes Q_2). \qquad (2.2)$$
Since $P_i$, $Q_i$, $i = 1, 2$, are all nonsingular, both $P_1 \otimes P_2$ and $Q_1 \otimes Q_2$ are nonsingular according to Proposition 2.4. In addition, it is obvious that $\operatorname{rank}(\Lambda_1 \otimes \Lambda_2) = r_1r_2$ by using the definition of Kronecker products. Combining these facts with (2.2) gives
$$\operatorname{rank}(A_1 \otimes A_2) = \operatorname{rank}(\Lambda_1 \otimes \Lambda_2) = r_1r_2 = (\operatorname{rank} A_1)(\operatorname{rank} A_2).$$

The proof is thus completed. 

At the end of this section, the operation of vectorization is considered for a matrix. It will be seen that this operation is very closely related to the Kronecker product. For a matrix $A \in \mathbb{F}^{m\times n}$, if it is written as $A = \begin{bmatrix} a_1 & a_2 & \cdots & a_n \end{bmatrix}$ with $a_j \in \mathbb{F}^m$, $j \in I[1, n]$, then its vectorization is defined as
$$\operatorname{vec}(A) = \begin{bmatrix} a_1 \\ a_2 \\ \vdots \\ a_n \end{bmatrix} \in \mathbb{F}^{mn}.$$

That is to say, vec (A) is the vector formed by “stacking” the columns of A into a long vector. It is very obvious that the operation “vec” is linear:

$$\operatorname{vec}(\alpha A + \beta B) = \alpha\operatorname{vec}(A) + \beta\operatorname{vec}(B)$$
for any $A, B \in \mathbb{F}^{m\times n}$ and $\alpha, \beta \in \mathbb{F}$. The next result indicates a close relationship between the operation “vec” and the Kronecker product.

Proposition 2.6 If $A \in \mathbb{F}^{m\times n}$, $B \in \mathbb{F}^{s\times t}$, and $X \in \mathbb{F}^{n\times s}$, then
$$\operatorname{vec}(AXB) = \left(B^{\mathrm{T}} \otimes A\right)\operatorname{vec}(X).$$

Proof For any matrix $D \in \mathbb{F}^{s\times t}$, it is written as
$$D = \begin{bmatrix} d_1 & d_2 & \cdots & d_t \end{bmatrix},$$
with $d_j \in \mathbb{F}^s$, $j \in I[1, t]$. In addition, it is denoted that
$$d_j = \begin{bmatrix} d_{1j} & d_{2j} & \cdots & d_{sj} \end{bmatrix}^{\mathrm{T}}.$$
With these notations, one has
$$\begin{aligned}
\operatorname{vec}(AXB) &= \operatorname{vec}\begin{bmatrix} AXb_1 & AXb_2 & \cdots & AXb_t \end{bmatrix} = \begin{bmatrix} AXb_1 \\ AXb_2 \\ \vdots \\ AXb_t \end{bmatrix}
= \begin{bmatrix} b_{11}Ax_1 + b_{21}Ax_2 + \cdots + b_{s1}Ax_s \\ b_{12}Ax_1 + b_{22}Ax_2 + \cdots + b_{s2}Ax_s \\ \vdots \\ b_{1t}Ax_1 + b_{2t}Ax_2 + \cdots + b_{st}Ax_s \end{bmatrix} \\
&= \begin{bmatrix} b_{11}A & b_{21}A & \cdots & b_{s1}A \\ b_{12}A & b_{22}A & \cdots & b_{s2}A \\ \vdots & \vdots & & \vdots \\ b_{1t}A & b_{2t}A & \cdots & b_{st}A \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_s \end{bmatrix}
= \left(B^{\mathrm{T}} \otimes A\right)\operatorname{vec}(X).
\end{aligned}$$

The conclusion is thus proven. 
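A quick numerical check of Proposition 2.6 (and of the column-stacking convention used for vec) can be done with NumPy, where reshape with Fortran ordering realizes the vec operation; this is only an illustrative snippet, not part of the original development.

```python
import numpy as np

rng = np.random.default_rng(0)
A, X, B = rng.normal(size=(2, 3)), rng.normal(size=(3, 4)), rng.normal(size=(4, 5))

vec = lambda M: M.reshape(-1, order="F")          # stack the columns
lhs = vec(A @ X @ B)
rhs = np.kron(B.T, A) @ vec(X)
print(np.allclose(lhs, rhs))                      # True
```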

The following result establishes a relation between $\operatorname{vec}\left(X^{\mathrm{T}}\right)$ and $\operatorname{vec}(X)$ for an arbitrarily given matrix $X$.

Proposition 2.7 Let $m$, $n$ be given positive integers. There is a unique matrix $P(m, n) \in \mathbb{F}^{mn\times mn}$ such that
$$\operatorname{vec}\left(X^{\mathrm{T}}\right) = P(m, n)\operatorname{vec}(X) \quad \text{for all } X \in \mathbb{F}^{m\times n}. \qquad (2.3)$$
This matrix $P(m, n)$ depends only on the dimensions $m$ and $n$, and is given by
$$P(m, n) = \sum_{i=1}^{m}\sum_{j=1}^{n} E_{ij} \otimes E_{ij}^{\mathrm{T}} = \left[E_{ij}^{\mathrm{T}}\right]_{m\times n}, \qquad (2.4)$$
where each $E_{ij} \in \mathbb{F}^{m\times n}$ has entry 1 in the $i$-th row, $j$-th column position and all other entries are zero. Moreover,
$$P(m, n) = [P(n, m)]^{\mathrm{T}} = [P(n, m)]^{-1}.$$

Proof Let $X = [x_{ij}] \in \mathbb{F}^{m\times n}$, and let $E_{ij} \in \mathbb{F}^{m\times n}$ be the unit matrix described in the statement of the proposition. Notice that $E_{ij}^{\mathrm{T}}XE_{ij}^{\mathrm{T}} = x_{ij}E_{ij}^{\mathrm{T}}$ for all $i \in I[1, m]$ and $j \in I[1, n]$. Thus,
$$X^{\mathrm{T}} = \sum_{i=1}^{m}\sum_{j=1}^{n} x_{ij}E_{ij}^{\mathrm{T}} = \sum_{i=1}^{m}\sum_{j=1}^{n} E_{ij}^{\mathrm{T}}XE_{ij}^{\mathrm{T}}.$$
Now use the identity for the vec of a threefold matrix product given in Proposition 2.6 to write
$$\operatorname{vec}\left(X^{\mathrm{T}}\right) = \sum_{i=1}^{m}\sum_{j=1}^{n}\operatorname{vec}\left(E_{ij}^{\mathrm{T}}XE_{ij}^{\mathrm{T}}\right) = \left(\sum_{i=1}^{m}\sum_{j=1}^{n} E_{ij} \otimes E_{ij}^{\mathrm{T}}\right)\operatorname{vec}(X),$$
which verifies (2.3). Since $\left(X^{\mathrm{T}}\right)^{\mathrm{T}} = X$ and $X^{\mathrm{T}} \in \mathbb{F}^{n\times m}$, one has
$$\operatorname{vec}(X) = P(n, m)\operatorname{vec}\left(X^{\mathrm{T}}\right) = P(n, m)P(m, n)\operatorname{vec}(X),$$
so $P(n, m) = [P(m, n)]^{-1}$. Finally, let $\varepsilon_{ij}$ denote the unit matrices in $\mathbb{F}^{n\times m}$, notice that $\varepsilon_{ij} = E_{ji}^{\mathrm{T}}$, and compute
$$P(n, m) = \sum_{i=1}^{n}\sum_{j=1}^{m} \varepsilon_{ij} \otimes \varepsilon_{ij}^{\mathrm{T}} = \sum_{i=1}^{n}\sum_{j=1}^{m} E_{ji}^{\mathrm{T}} \otimes E_{ji} = \sum_{i=1}^{m}\sum_{j=1}^{n} E_{ij}^{\mathrm{T}} \otimes E_{ij} = [P(m, n)]^{\mathrm{T}}.$$

The proof is thus completed. 
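A small NumPy illustration of Proposition 2.7: the matrix P(m, n) is built from the unit matrices E_ij exactly as in (2.4) and then identity (2.3) is checked on a sample matrix (an illustrative snippet only; the helper name commutation_matrix is ours).

```python
import numpy as np

def commutation_matrix(m, n):
    """P(m, n) = sum_{i,j} E_ij kron E_ij^T, as in (2.4)."""
    P = np.zeros((m * n, m * n))
    for i in range(m):
        for j in range(n):
            E = np.zeros((m, n))
            E[i, j] = 1.0
            P += np.kron(E, E.T)
    return P

vec = lambda M: M.reshape(-1, order="F")
X = np.arange(6.0).reshape(2, 3)
P = commutation_matrix(2, 3)
print(np.allclose(vec(X.T), P @ vec(X)))   # True, verifying (2.3)
```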

The matrix $P(m, n)$ in the above proposition is known as a commutation matrix, which is very useful in the field of algebra. A direct application of Kronecker products is to investigate the following general linear matrix equation:
$$\sum_{i=1}^{p} A_iXB_i = C, \qquad (2.5)$$
where $A_i \in \mathbb{C}^{m\times s}$, $B_i \in \mathbb{C}^{t\times n}$, $i \in I[1, p]$, and $C \in \mathbb{C}^{m\times n}$ are known matrices, and $X \in \mathbb{C}^{s\times t}$ is the unknown matrix. By using Proposition 2.6, the matrix equation (2.5) can be reduced to a matrix-vector equation of the form $Gx = c$, where $G \in \mathbb{C}^{mn\times st}$, $x \in \mathbb{C}^{st}$, and $c \in \mathbb{C}^{mn}$.

Theorem 2.1 Given $A_i \in \mathbb{C}^{m\times s}$, $B_i \in \mathbb{C}^{t\times n}$, $i \in I[1, p]$, and $C \in \mathbb{C}^{m\times n}$, a matrix $X \in \mathbb{C}^{s\times t}$ is a solution of the matrix equation (2.5) if and only if the vector $x = \operatorname{vec}(X)$ is a solution of the equation $Gx = c$, with
$$G = \sum_{i=1}^{p} B_i^{\mathrm{T}} \otimes A_i, \quad c = \operatorname{vec}(C).$$

The conclusion of this theorem can be easily proven by using Proposition 2.6 and the linearity of the operation of vec.
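For modest dimensions, Theorem 2.1 translates directly into a numerical solver: assemble G and c and solve the resulting linear system (a least-squares call is used below so that rank-deficient or inconsistent data are also handled). This is only a minimal NumPy sketch with a function name of our own choosing.

```python
import numpy as np

def solve_general_linear_matrix_equation(A_list, B_list, C):
    """Solve sum_i A_i X B_i = C via Theorem 2.1 (vectorization)."""
    s, t = A_list[0].shape[1], B_list[0].shape[0]
    G = sum(np.kron(B.T, A) for A, B in zip(A_list, B_list))
    x, *_ = np.linalg.lstsq(G, C.reshape(-1, order="F"), rcond=None)
    return x.reshape(s, t, order="F")

# example: the two-term equation A1 X B1 + A2 X B2 = C
rng = np.random.default_rng(2)
A1, A2 = rng.normal(size=(3, 3)), rng.normal(size=(3, 3))
B1, B2 = rng.normal(size=(4, 4)), rng.normal(size=(4, 4))
X0 = rng.normal(size=(3, 4))
C = A1 @ X0 @ B1 + A2 @ X0 @ B2
X = solve_general_linear_matrix_equation([A1, A2], [B1, B2], C)
print(np.linalg.norm(A1 @ X @ B1 + A2 @ X @ B2 - C))  # ~1e-13
```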

2.2 Leverrier Algorithms

For a matrix $A \in \mathbb{C}^{n\times n}$, the coefficients of its characteristic polynomial play an important role in many applications, such as control theory and matrix algebra. It is well known that the so-called Leverrier algorithm can be used to obtain these coefficients in a successive manner. This is stated in the following theorem.

Theorem 2.2 Given $A \in \mathbb{C}^{n\times n}$, denote
$$f_A(s) = \det(sI - A) = a_0 + a_1s + \cdots + a_ns^n, \quad a_n = 1, \qquad (2.6)$$
$$\operatorname{adj}(sI - A) = R_0 + R_1s + \cdots + R_{n-1}s^{n-1}.$$
Then, the coefficients $a_i$, $i \in I[0, n-1]$, and the coefficient matrices $R_i$, $i \in I[0, n-1]$, can be obtained by the following iteration:
$$\begin{cases} a_{n-i} = -\dfrac{\operatorname{tr}(R_{n-i}A)}{i}, \\[4pt] R_{n-i} = R_{n-i+1}A + a_{n-i+1}I, \end{cases} \qquad (2.7)$$
with the initial value $R_{n-1} = I$.
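The iteration (2.7) is easy to implement. The following NumPy sketch (the function name leverrier is ours) returns the coefficients a_0, ..., a_n and the matrices R_0, ..., R_{n-1}; the coefficients are checked against numpy.poly on a random matrix.

```python
import numpy as np

def leverrier(A):
    """Leverrier algorithm (2.7): coefficients of det(sI - A) and adj(sI - A)."""
    n = A.shape[0]
    a = np.zeros(n + 1)
    a[n] = 1.0                      # a_n = 1
    R = [None] * n
    R[n - 1] = np.eye(n)            # R_{n-1} = I
    for i in range(1, n + 1):
        a[n - i] = -np.trace(R[n - i] @ A) / i
        if i < n:
            R[n - i - 1] = R[n - i] @ A + a[n - i] * np.eye(n)
    return a, R                     # f_A(s) = sum_k a[k] s^k

A = np.random.default_rng(3).normal(size=(4, 4))
a, R = leverrier(A)
print(np.allclose(a[::-1], np.poly(A)))   # compare with numpy's characteristic polynomial
```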

In the rest of this section, the aim is to give a proof of Theorem 2.2. To this end, some preliminaries are needed. For the polynomial $f_A(s)$ in (2.6), the companion matrix is defined as
$$C_A = \begin{bmatrix} 0 & 0 & \cdots & 0 & -a_0 \\ 1 & 0 & \cdots & 0 & -a_1 \\ & 1 & \ddots & \vdots & -a_2 \\ & & \ddots & 0 & \vdots \\ & & & 1 & -a_{n-1} \end{bmatrix}, \qquad (2.8)$$
whose characteristic polynomial is $f_A(s)$ in (2.6). For the companion matrix $C_A$ in (2.8), one has the following two lemmas.

Lemma 2.2 For the companion matrix $C_A$ in (2.8) and an integer $m$ less than $n$, denote
$$\begin{bmatrix} x_1 & x_2 & \cdots & x_n \end{bmatrix} = a_mC_A^m + a_{m-1}C_A^{m-1} + \cdots + a_0C_A^0.$$
Then,
$$x_1 = \begin{bmatrix} a_0 & a_1 & \cdots & a_m & 0_{1\times(n-m-1)} \end{bmatrix}^{\mathrm{T}},$$
$$x_{i+1} = C_Ax_i, \quad i \in I[1, n-1].$$

Proof Let $e_i$ be the $i$-th column of the identity matrix $I_n$. By simple calculations, it is easily found that
$$C_Ae_i = e_{i+1}, \quad i \in I[1, n-1], \qquad C_Ae_n = \begin{bmatrix} -a_0 & -a_1 & \cdots & -a_{n-1} \end{bmatrix}^{\mathrm{T}}.$$

With the above relations, one has

$$C_A^ke_i = e_{i+k}, \quad k \in I[1, n-i], \qquad (2.9)$$
$$C_A^{n-i+1}e_i = \begin{bmatrix} -a_0 & -a_1 & \cdots & -a_{n-1} \end{bmatrix}^{\mathrm{T}}.$$

Thus, it can be derived from (2.9) that

$$x_1 = \left(a_mC_A^m + a_{m-1}C_A^{m-1} + \cdots + a_0C_A^0\right)e_1 = a_me_{m+1} + a_{m-1}e_m + \cdots + a_0e_1 = \begin{bmatrix} a_0 & a_1 & \cdots & a_m & 0_{1\times(n-m-1)} \end{bmatrix}^{\mathrm{T}},$$
and for $i \in I[1, n-1]$,
$$\begin{aligned} x_{i+1} &= \left(a_mC_A^m + a_{m-1}C_A^{m-1} + \cdots + a_0C_A^0\right)e_{i+1} = a_mC_A^me_{i+1} + a_{m-1}C_A^{m-1}e_{i+1} + \cdots + a_0C_A^0e_{i+1} \\ &= a_mC_A^{m+1}e_i + a_{m-1}C_A^me_i + \cdots + a_0C_Ae_i = C_A\left(a_mC_A^m + a_{m-1}C_A^{m-1} + \cdots + a_0C_A^0\right)e_i = C_Ax_i. \end{aligned}$$
The proof is thus completed.

Lemma 2.3 For the companion matrix $C_A$ in (2.8), the first $(n-k)$ elements on the principal diagonal of
$$E_k = C_A^k + a_{n-1}C_A^{k-1} + \cdots + a_{n-k+1}C_A, \quad k \in I[1, n-1], \qquad (2.10)$$
are 0, and the other $k$ elements are $-a_{n-k}$.

Proof Let $e_i$, $i \in I[1, n]$, be the $i$-th column of the identity matrix $I_n$. For $i \in I[1, n-k]$, the $i$-th element on the principal diagonal of $E_k$ is $e_i^{\mathrm{T}}E_ke_i$. Then, by using (2.9) one has
$$e_i^{\mathrm{T}}E_ke_i = e_i^{\mathrm{T}}\left(C_A^ke_i + a_{n-1}C_A^{k-1}e_i + \cdots + a_{n-k+1}C_Ae_i\right) = e_i^{\mathrm{T}}\left(e_{i+k} + a_{n-1}e_{i+k-1} + \cdots + a_{n-k+1}e_{i+1}\right) = 0. \qquad (2.11)$$

To determine the remaining diagonal elements $e_i^{\mathrm{T}}E_ke_i$ of $E_k$, $i \in I[n-k+1, n]$, the following Cayley–Hamilton identity for the companion matrix $C_A$ in (2.8) is considered:
$$C_A^n + a_{n-1}C_A^{n-1} + \cdots + a_0I_n = 0,$$
which can be written as
$$E_kC_A^{n-k} = -\left(a_{n-k}C_A^{n-k} + \cdots + a_1C_A + a_0I_n\right), \qquad (2.12)$$
where $E_k$ is defined in (2.10). Let the matrix within parentheses on the right-hand side of (2.12) have columns $y_1, y_2, \ldots, y_n$. Then by Lemma 2.2 one has
$$y_1 = \begin{bmatrix} a_0 & a_1 & \cdots & a_{n-k} & 0_{1\times(k-1)} \end{bmatrix}^{\mathrm{T}}, \qquad (2.13)$$
$$y_{i+1} = C_Ay_i, \quad \text{for } i \in I[1, n-1]. \qquad (2.14)$$

In addition, by using (2.9) it follows from (2.13) that for i ∈ I[1, n],

$$y_i = C_A^{i-1}y_1 = C_A^{i-1}\left(a_0e_1 + a_1e_2 + \cdots + a_{n-k}e_{n-k+1}\right) = a_0e_i + a_1e_{i+1} + \cdots + a_{n-k}e_{n-k+i}. \qquad (2.15)$$

Now denote the columns of $E_k$ in (2.10) by $z_1, z_2, \ldots, z_n$. By a further application of (2.9), it follows that the $i$-th column on the left-hand side of (2.12) is

$$E_kC_A^{n-k}e_i = E_ke_{n-k+i} = z_{n-k+i}, \quad i \in I[1, k].$$

Equating the first k columns on either side of (2.12) therefore produces

$$z_{n-k+i} = -y_i, \quad i \in I[1, k]. \qquad (2.16)$$

It follows from (2.16) and (2.15) that for i ∈ I[1, k],

T en−k+iEken−k+i = T =− T en−k+izn−k+i en−k+iyi (2.17) =− T ( + +···+ ) en−k+i a0ei a1ei+1 an−ken−k+i =−an−k.

With (2.11) and (2.17), the proof is thus completed. 

With the previous lemmas as preliminary, now the proof of Theorem 2.2 can be given. The proof of Theorem 2.2: Obviously, the following identity holds

(sI − A) adj(sI − A) = Idet(sI − A).

With the notation in Theorem 2.2, the preceding expression gives

n−1 (sI − A) R0 + R1s +···+Rn−1s n = a0I + a1Is +···+anIs .

Equating coefficients of sn−1, sn−2,...,s0, in this relation, one immediately obtains the second expression in (2.7). In addition, by the iteration in the second expression of (2.7) it can be derived that for k ∈ I[1, n],

k−1 k−2 Rn−k = A + an−1A +···+an−k+1I. (2.18)

( ) = ( ) On the other hand, let CA be in (2.8). Then, it is well-known that fA s fCA s . Thus, if fA (s) = (s − λ1)(s − λ2) ···(s − λn) , then for an integer k ≥ 0 there holds

n k = k = λk trA trC i . i=1

With this relation, it follows from (2.18) and Lemma 2.3 that

k k−1 −ka − = trE = tr C + a − C +···+a − + C n k k A n 1 A n k 1 A k k−1 = tr A + an−1A +···+an−k+1A

= tr(Rn−kA).

This gives the first expression in (2.7). The proof is thus completed. 46 2 Mathematical Preliminaries

Remark 2.1 It is easily known from the proof that the Leverrier algorithm in (2.7) can also be written as  tr(AR − ) a − =− n i n i i , Rn−i = ARn−i+1 + an−i+1I with the initial value Rn−1 = I.

2.3 Generalized Leverrier Algorithms

In Sect. 2.2, the celebrated Leverrier algorithm is given. By using this algorithm, one can obtain the determinant and the adjoint matrix of the polynomial matrix (sI − A), whose leading coefficient matrix is the identity matrix. In this section, the aim is to give an algorithm for computing the determinant and the adjoint matrix of the general polynomial matrix (sM − A) for M, A ∈ Cn×n. The polynomial matrix (sM − A) can be written as

(sM − A) = zI − (zI − sM + A) = zI − Aˆ , where Aˆ = zI − sM + A (2.19) and z is a new pseudo variable, which does not affect (sM − A), since it can be eliminated. Now it is clearly seen that the Leverrier algorithm in Sect. 2.2 can be applied to compute the inverse of the matrix zI − Aˆ . Denote   ˆ n−1 n−2 adj zI − A = R (s) = z R0(s) + z R1(s) +···+ zRn−2(s) + Rn−1(s), (2.20) and   ˆ n n−1 n−2 det zI − A = q(s) = z + q1(s)z + q2(s)z +···+qn(s). (2.21)

By using the Leverrier algorithm, one has   ( ) = , ( ) =− ˆ R0 s In q1 s tr A ,  ˆ 1 ˆ R1(s) = AR0(s) + q1In, q2(s) =− tr AR1(s) , 2   ( ) = ˆ ( ) + , ( ) =−1 ˆ ( ) (2.22) R2 s AR1 s q2In q3 s 3 tr AR2 s , ··· ···   ( ) = ˆ ( ) + , ( ) =−1 ˆ ( ) Rn−1 s ARn−2 s qn−1In qn s n tr ARn−1 s . 2.3 Generalized Leverrier Algorithms 47

The matrices Ri(s), i ∈ I[1, n−1], can also be computed by the following expression:

ˆ i ˆ i−1 ˆ i−2 Ri(s) = A + q1(s)A + q2(s)A +···+qi(s)I. (2.23)

The matrices Ri(s), i ∈ I[1, n − 1], are no longer the coefficient matrices of the powers of s, but depend on the variable s itself. This can be seen from (2.22), since the matrix Aˆ depends on s.As(2.20) is independent of z, in the following, for the sake of simplicity, one can take z = 0. Therefore, the relations (2.19)–(2.21) can be written as Aˆ =−Ms + A, (2.24)   ˆ adj −A = R (s) = Rn−1(s), (2.25)

q(s) = qn(s). (2.26)

Note that there are an infinite number of forms of Aˆ , and q(s), depending on the specific value of the pseudo variable z. It is seen from (2.22) that the degree of the polynomial matrix Ri(s), i ∈ I[0, n − 1], and of the polynomial qi(s), i ∈ I[1, n],is at most equal to i. Hence, Ri(s) and qi(s) can be written as

i k Ri(s) = Ri,ks , (2.27) k=0

i k qi(s) = qi,ks , (2.28) k=0 where Ri,k and qi,k are the constant coefficient matrices and scalars of the power s, respectively. It follows from (2.27) that, in order to obtain R(s) and q(s),itis sufficient to iteratively compute the coefficient matrices Rn−1,k, and the coefficients qn,k given by

  n−1 ˆ k R(s) =−adj A = Rn−1(s) = Rn−1,ks , (2.29) k=0 n k q (s) = qn(s) = qn,ks , (2.30) k=0 in terms of Ri,k, qi,k. Substituting (2.24) and (2.27) into the iterative relations Ri(s), i ∈ I[0, n − 1], in (2.22), one can obtain the following general iterative relations by equating the coefficients of the powers of s in the two sides of each equation: 48 2 Mathematical Preliminaries ⎧ ⎨ −MRi,k−1 + qi+1,kIn, if k = i + 1, = − + ∈ I[ , ] Ri+1,k ⎩ ARi,k MRi,k−1 qi+1,kIn, if k 1 i , (2.31) ARi,0 + qi+1,0In, if k = 0, and ⎧   1 ⎨ , − , = + i+1 tr MRi k 1 if k i 1, 1 q + , = − tr[AR , − MR , − ], if k ∈ I[1, i], i 1 k ⎩ i+1 i k i k 1 (2.32) 1 − [ , ], if k = 0. i+1 tr ARi 0

The previous result can be summarized as the following theorem. Theorem 2.3 Given matrices M, A ∈ Cn×n, denote

n det(sM − A) = q (s) = qn,0 + qn,1s +···+qn,ns , n−1 adj(sM − A) = R (s) = Rn−1,0 + Rn−1,1s +···+Rn−1,n−1s .

Then, the coefficients qn,i, i ∈ I[0, n], and the coefficient matrices Rn−1,i, i ∈ I[0, n− 1], can be obtained by the iteration (2.31)–(2.32). The formulas (2.31) and (2.32) are readily reduced to the Leverrier algorithm in the previous section if it is assumed that M = I. In this case Ri,k and qi,k for k = 0 are reduced to zero. In some applications, one may need the determinant and the adjoint matrix of (I − sM) for a given matrix M. Obviously, they can be obtained by using the algo- rithm in Theorem 2.3. Such a method is a bit complicated since the iteration is 2-dimensional. In fact, they can be obtained by the following result. Theorem 2.4 Given a matrix M ∈ Cn×n, denote

n n−1 det(I − sM) = qns + qn−1s +···+q0,q0 = 1, (2.33) n−1 n−2 adj(I − sM) = Rn−1s + Rn−2s +···+R0. (2.34)

Then, the coefficients qi, and the coefficient matrices Ri, i ∈ I[1, n − 1], can be obtained by the iteration  Ri = Ri−1M + qiI ( ) , (2.35) =−tr Ri−1M qi i with the initial value R0 = I. Proof If (2.33) and (2.34) hold, it is easily known that

n n−1 det(sI − M) = q0s + q1s +···+qn, n−1 n−2 adj(sI − M) = R0s + R1s +···+Rn−1.

By using the Leverrier algorithm in Theorem 2.2, the result is immediately obtained.  2.3 Generalized Leverrier Algorithms 49

The algorithms in Theorems 2.3 and 2.4 are called the generalized Leverrier algorithms in this book.

2.4 Singular Value Decompositions

The singular value decomposition (SVD) is a very useful tool in matrix analysis and numerical computation. Since it involves only unitary or orthogonal matrices, its solution is regarded to be numerically very simple and reliable. Because of this nice feature, the SVD has found many useful applications in signal processing and statistics. In this book, the SVD will also be repeatedly used. This section is started with the concept of singular values of matrices.

Definition 2.2 For a matrix M ∈ Cm×n, a positive real number σ is called a singular value of M if there exist unit-length vectors u ∈ Cm and v ∈ Cn such that

Mv = σu, MHu = σ v. (2.36)

The vectors u and v are called left-singular and right-singular vectors for σ , respec- tively.

In this book, σmax (·) is used to denote the maximal singular value of a matrix. = In Definition 2.2, the length for a vector is its Euclidean length. For a vector a ··· T ∈ Cn   a1 a2 an , its Euclidean length is denoted by a 2. That is,   n   =  | |2. a 2 ai i=1

For a matrix M ∈ Cm×n of rank r,if(2.36) holds, then one has

MHMv = MHσu = σMHu = σσv = σ 2v.

This implies that the square of a singular value of M is an eigenvalue of MHM.On the other hand, it is obvious that MHM is Hermitian. Thus, MHM has r positive eigenvalues, and admits an orthonormal basis of eigenvectors. If λ is a positive eigenvalue of MHM, and η is the corresponding unit-length eigenvector, then one has MHMη = λη, ηHη = 1. (2.37)

Pre-multiplying by ηH both sides of the first expression in (2.37), gives

ηHMHMη = ληHη = λ, 50 2 Mathematical Preliminaries which implies that √  η = λ M 2 ,

M√η and thus λ is of unit-length. It follows from (2.37) that

Mη √ MH √ = λη. (2.38) λ

In addition, it is obvious that   √ Mη Mη = λ √ . (2.39) λ √ The expressions in (2.38) and (2.39) reveal that λ is a singular value√ of the matrix M√η η λ M, and λ and are left-singular and right-singular vectors for , respectively. The preceding result can be summarized as the following lemma.

Lemma 2.4 For a matrix M ∈ Cm×n, a positive real number σ is a singular value of M if and only if σ 2 is an eigenvalue of MHM. Moreover, if η is a unit-length H σ 2 Mη η eigenvector of M Mfor , then σ and are left-singular and right-singular vectors of M for σ , respectively.

Similarly, the following result can also be easily obtained.

Lemma 2.5 For a matrix M ∈ Cm×n, a positive real number σ is a singular value of M if and only if σ 2 is an eigenvalue of MMH. Moreover, if π is a unit-length H σ 2 π MHπ eigenvector of MM for , then and σ are left-singular and right-singular vectors of M for σ , respectively.

Before proceeding, the following well-known result on Hermitian matrices is needed.

Lemma 2.6 Given Q = QH ∈ Cn×n, there is a unitary matrix W ∈ Cn×n such that WQWH is a .

With the preceding preliminary, one can obtain the following theorem on singular value decompositions.

Theorem 2.5 For a matrix M ∈ Cm×n with rank r, there exist two unitary matrices U ∈ Cm×m and V ∈ Cn×n such that

H M = U diag σ1, σ2, ···, σr, 0(m−r)×(n−r) V (2.40) with σ1 ≥ σ2 ≥···≥σr > 0. 2.4 Singular Value Decompositions 51

Proof It is obvious that MHM is nonnegative definite. In addition, rankMHM = r since rankM = r. With these two facts, it can be known by Lemma 2.6 that there exists a unitary matrix V ∈ Cn×n satisfying

H H V M MV = diag λ1, λ2, ..., λr,0(n−r)×(n−r) (2.41) with λ1 ≥ λ2 ≥···≥λr > 0. Denote   V = η1 η2 ··· ηn . (2.42)

λ ∈ I[ , ] H η ∈ I[ , ] It is obvious that i, i 1 r , are the eigenvalues of M M√, and i, i 1 r , are the corresponding unit-length eigenvectors. Denote σi = λi, i ∈ I[1, r]. Then, Mηi by using Lemma 2.4 it can be obtained that ui = and ηi are left-singular and σi right-singular vectors of M for σi, i ∈ I[1, r], respectively. Thus, one has

Mηi = σiui, i ∈ I[1, r], (2.43)

Mηi with ui = , i ∈ I[1, r]. The expression (2.43) can be compactly written as the σi following matrix form:     M η1 η2 ··· ηr = u1 u2 ··· ur diag (σ1, σ2, ..., σr) . (2.44)

In addition, it is easily derived from (2.41) and (2.42) that

ηH H η = ∈ I[ + , ] i M M i 0, for i r 1 n , which implies that Mηi = 0, for i ∈ I[r + 1, n]. (2.45)

ηHη = = , ∈ I[ , ] Moreover, since V in (2.42) is unitary, then i j 0, for i j, i j 1 r . With this, it follows from (2.41) that for i = j, i, j ∈ I[1, r],   Mη H Mη ηHMHMη H = i j = i j ui uj σi σj σiσj ηHλ η = i j j = 0. σiσj

Mηi That is, ui = , i ∈ I[1, r], are a group of orthogonal vectors. Thus, one can find σi a group of vectors ui, i ∈ I[1, m], such that   U = u1 u2 ··· um is a unitary matrix. With this matrix U, it follows from (2.44) and (2.45) that 52 2 Mathematical Preliminaries

U diag σ , σ , ..., σ ,0( − )×( − )   1 2  r m r n r = M η η ··· η 0 ×( − )   1 2 r  m n r   = M η η ··· η M η + η + ··· η  1 2 r r 1 r 2 n = M η1 η2 ··· ηn = MV.

This can be equivalently written as (2.40). 

The expression in (2.40) is called the singular value decomposition of the matrix M.

Remark 2.2 The proof of Theorem 2.5 on the SVD can also be done by treating the matrix MMH. The proof procedure is very similar.

At the end of this section, a property of left-singular and right-singular vectors is given in the following lemma. The result can be easily obtained by using Lemma 2.4, and thus the proof is omitted.

Lemma 2.7 For a matrix M ∈ Cm×n, if u and v are left-singular and right-singular vectors of M for the singular value σ, respectively, then

uHMv = σ.

2.5 Vector Norms and Operator Norms

In this section, some basic concepts and properties on norms are introduced.

2.5.1 Vector Norms

A vector norm is a measure of the size of a vector in a vector space. Norms may be thought of as generalizations of the Euclidean length.

Definition 2.3 Given a vector space V over a field F, a norm on V is a function · : V → R with the following properties: (1) Absolute homogeneity: ax = |a|x, for all a ∈ F and all x ∈ V ; (2) Triangle inequality: x + y ≤ x + y, for all x, y ∈ V ; (3) If x = 0, then x = 0.

A vector space V with a norm · is called a normed space, and denoted by (V, ·), or briefly V . 2.5 Vector Norms and Operator Norms 53

By the first axiom, one has 0 = 0 and x = −x for any x ∈ V .Thus,by triangle inequality one has

0 = x − x ≤ x + −x = 2 x .

This implies that x ≥ 0 for any x ∈ V . n For an n-dimensional Euclidean space C , the intuitive notation of length of the T vector x = x1 x2 ··· xn is captured by   n   =  | |2 x 2 xi , i=1 which is the Euclidean norm on Cn. This is perhaps the best known vector norm since  −  ∈ Cn x y 2 measures the standard Euclidean distance between two points x, y . Cn = Lemma 2.8 For the n-dimensional complex vector space ,anormofx T n x1 x2 ··· xn ∈ C can be defined as

 1 n p   = | |p x p xi , (2.46) i=1 where p ≥ 1. The norm defined in (2.46) is called p-norm.

Proof For the case of p = 1, the conclusion can be easily proven. Next, let us show > that the properties of norms are valid for all p-norms with p 1. T n (1) For a ∈ C, x = x1 x2 ··· xn ∈ C , one has

 1  1 n p n p   = | |p = | |p | |p ax p axi a xi i=1 i=1  1  1 n p n p p p p = |a| |xi| = |a| |xi| i=1 i=1 = | |  a x p .

· This implies that the function p in (2.46) satisfies the absolute homogeneity. (2) It is easily checked that for fixed 0 <α<1, there holds

ξ α − αξ + α − 1 ≤ 0 (2.47) for all ξ ≥ 0. Let |a| 1 1 ξ = ,α= , 1 − α = . |b| p q 54 2 Mathematical Preliminaries

Thus, by using (2.47) one has

1 |a| p 1 |a| 1 − ≤ , 1 | | |b| p p b q which can be equivalently written as

1 1 |a| |b| |a| p |b| q ≤ + . (2.48) p q

The expression in (2.48) is the celebrated Young’s Inequality. T T n For two vectors x = x1 x2 ··· xn , y = y1 y2 ··· yn ∈ C , and two scalars > > 1 + 1 = p 1 and q 1 with p q 1, by using Young’s Inequality one has

1 1 (|x |p) p (|y |q) q 1 |x |p 1 |y |q i i ≤ i + i , ⎛ ⎞ 1 ⎛ ⎞ 1 n n p q p q n n x p y q ⎝ x p⎠ ⎝ y q⎠ j j j j j=1 j=1 j=1 j=1 for i ∈ I[1, n]. Summing both sides of the preceding relations, it can be obtained that

n p 1 q 1 (|xi| ) p (|yi| ) q i=1 ≤ 1 + 1 = ⎛ ⎞ 1 ⎛ ⎞ 1 1, p q p q n n ⎝ p⎠ ⎝ q⎠ xj yj j=1 j=1 which is equivalent to

⎛ ⎞ 1 ⎛ ⎞ 1 n n p n q ⎝ p⎠ ⎝ q⎠ |xi||yi| ≤ xj yj . (2.49) i=1 j=1 j=1

The expression in (2.49) is the celebrated Hölder’s Inequality. By using Hölder’s Inequality, for p > 1 one has

n p |xi + yi| i=1 n p−1 = |xi + yi||xi + yi| i=1 2.5 Vector Norms and Operator Norms 55

n n p−1 p−1 ≤ |xi||xi + yi| + |yi||xi + yi| i=1 i=1  1  1  1  1 n p n q n p n q p q(p−1) p q(p−1) ≤ |xi| |xi + yi| + |yi| |xi + yi| ⎡ i=1 i=1 ⎤ i=1 i=1  1  1  1 n p n p n q ⎣ p p ⎦ q(p−1) = |xi| + |yi| |xi + yi| . i=1 i=1 i=1

1 + 1 = ( − ) = Since p q 1, then q p 1 p. Thus, from the preceding relation one has

n p |xi + yi| (2.50) ⎡i=1 ⎤  1  1  1 n p n p n q ⎣ p p ⎦ p ≤ |xi| + |yi| |xi + yi| . i=1 i=1 i=1

− 1 = 1 In view that 1 q p , it is easily obtained from (2.50) that

 1  1  1 n p n p n p p p p |xi + yi| ≤ |xi| + |yi| . (2.51) i=1 i=1 i=1 · This implies that the function p in (2.46) satisfies the triangle inequality.   = (3) It is obvious that 0 p 0. · With the preceding facts, it can be concluded that the function p in (2.46)isa norm of Cn. 

Remark 2.3 The expression in (2.51) is called Minkowski’s Inequality.

In the preceding lemma, if p tends to infinity, one can obtain the following infinity T n norm of x = x1 x2 ··· xn ∈ C

x∞ = max {|xi| , i ∈ I[1, n]} .

For the n-dimensional complex vector space Cn, the following norm of x ∈ Cn is often used:   = H x M x Mx, where M ∈ Cn×n is a given positive definite matrix. × In addition, for the vector space Cm n in which an element is an m×n dimensional = ∈ Cm×n matrix, the often used vector norm of X xij m×n is the so-called Frobenius norm: 56 2 Mathematical Preliminaries   m n   =  2 = H X F xij tr X X . i=1 j=1

2.5.2 Operator Norms

For two normed vector spaces (V , ·α) and W, ·β ,letA be a linear operator from V to W. The norm of the operator A is defined as  " A xβ A α→β = max . x=0 xα

If V = Cn and W = Cm, then the operator A can be represented by a matrix A ∈ Cm×n. In this case, the preceding defined norm is called an operator norm of the matrix A. That is, an operator norm of A is defined as  " Axβ Aα→β = max . (2.52) x=0 xα

The norm ·α→β in (2.52) is called subordinate to the vector norms ·α and ·β . In addition, the operator norm defined in (2.52) is referred to as α → β induced norm; if α = β, it is referred to as α-induced norm, and is denoted by ·α.In(2.52), if α, β are chosen to be real numbers greater than or equal to 1, the vector norms ·α and ·β in (2.52) represent the p-norms defined in (2.46).

n m Theorem 2.6 Let ·α be a norm in C , and ·β be a norm in C . In addition, let m×n ·α→β be the operator norm defined in (2.52) for all matrices in C . Then, for m×n n the matrices A, B ∈ C , and a vector x0 ∈ C , the following properties hold.

(1) If Aα→β = 0, then A = 0; (2) cAα→β = |c|Aα→β , for any c ∈ C; (3) A + Bα→β ≤ Aα→β + Bα→β ; (4) Ax0β ≤ Aα→β x0α.

Proof (1) It is assumed that Aα→β = 0, but A = 0. Since A = 0, then there exists a nonzero vector x0 such that Ax0 = 0. Thus, x0α > 0 and Ax0β > 0. In this case, it can be obtained that  " Axβ Ax0β Aα→β = max ≥ > 0. = x 0 xα x0α

This contradicts Aα→β = 0. Therefore, A = 0. 2.5 Vector Norms and Operator Norms 57

(2) By using definitions and properties of vector norms, one has  "  " cAxβ |c|Axβ cAα→β = max = max =   =   x 0 x α " x 0 x α Axβ = |c| max = |c|Aα→β . x=0 xα

(3) By using definitions and properties of vector norms, one has  "  " (A + B) xβ Axβ + Bxβ A + Bα→β = max ≤ max =   =   x 0  x"α  x 0 " x α Axβ Bxβ ≤ max + max x=0 xα x=0 xα = Aα→β + Bα→β .

(4) This conclusion can be easily derived from the definition of operator norms. 

n m l Let (C , ·α), C , ·β , and C , ·γ be three normed spaces. Then for any A ∈ Cm×n and B ∈ Cl×m, there holds

BAα→γ ≤ Bβ→γ Aα→β .

This conclusion can be easily obtained from the definition in (2.52). In fact, according to (2.52) one has  "  " BAxγ BAxγ Axβ BAα→γ = max = max x=0 x x=0 Ax x  α "  " β α BAxγ Axβ ≤ max max x=0 Ax x=0 x # β$ α Byγ = max Aα→β y=0 yβ

= Bβ→γ Aα→β .

Next, several often used operator norms are introduced. These operator norms are induced by p-norms for vectors, and can be calculated independent of the definition = ∈ Cm×n (2.52). In each case, the matrix A is denoted by A aij m×n . During the · interpretation, for a vector the notation p represents the p-norm defined in (2.46). • 1-norm, or column sum norm: # $ m   = A 1 max aij . j∈I[1,n] i=1 58 2 Mathematical Preliminaries   Write A in terms of its column as A = a1 a2 ··· an , then % %   = % % . A 1 max aj j∈I[1,n] 1   T Denote x = x1 x2 ··· xn . Then, one has

  =  + +···+  Ax 1 x1a1 x2a2 xnan 1 n % % n % % ≤ % % = % % xjaj 1 xj aj 1 = = ⎛j 1 ⎞ j 1 n ≤ ⎝ ⎠ {  } =     xj max ak 1 x 1 A 1 . k∈I[1,n] j=1

It follows from this expression that

Ax max 1 ≤ A . (2.53) =   1 x 0 x 1

If the vector x is chosen to be x = ek (the k-th unit basic vector), then for any k ∈ I[1, n] one has  "   Ax 1 max ≥ 1ak = ak , =   1 1 x 0 x 1

and hence  "   Ax 1 max ≥ max ak = A . (2.54) =   ∈I[ , ] 1 1 x 0 x 1 k 1 n

The relations (2.53) and (2.54) imply that the column norm is the 1-induced norm · 1. • 2-norm, or maximal singular value norm:

  = σ ( ) A 2 max A . (2.55)

λ ≥ λ ≥···≥λ H Let 1 2 √ n be n eigenvalues of the matrix A A. According to Lemma H 2.4, σmax (A) = λ1.Letξi be the unit-length eigenvector of A A corresponding H to eigenvalue λi, i ∈ I[1, n]. Since A A is Hermitian, then ξi, i ∈ I[1, n],forman orthonormal basis of Cn.Let n x = αiξi. i=1 2.5 Vector Norms and Operator Norms 59

Then, one has   n   =  |α |2 x 2 i , i=1

and   √ n   = H H =  λ |α |2 Ax 2 x A Ax i i i=  1    n & n ≤  λ |α |2 = λ  |α |2 = σ ( )   1 i 1 i max A x 2 . i=1 i=1

This relation implies that

Ax max 2 ≤ σ (A) . (2.56) =   max x 0 x 2

On the other hand, if one chooses x = ξ1, then ' ' Ax Aξ  & max 2 ≥ 1 2 = ξ HAHAξ = λ ξ Hξ = λ . (2.57) =   ξ  1 1 1 1 1 1 x 0 x 2 1 2

The relations (2.56) and (2.57) imply that the maximal singular value norm is the · 2-induced norm 2. •∞-norm, or row sum norm: ⎧ ⎫ ⎨n ⎬ A∞ = max aij . i∈I[1,m] ⎩ ⎭ j=1   T Denote x = x1 x2 ··· xn . One has # $ n Ax∞ = max aikxk i∈I[1,m] # k=1 $ n ≤ max |aik||xk| i∈I[1,m] # k=1  $ n ≤ max |aik| max {|xk|} i∈I[1,m] k∈I[1,n] k=1 + # $, - . n = max |aik| · max {|xk|} i∈I[1,m] k∈I[1,n] k=1 60 2 Mathematical Preliminaries

= A∞ x∞ .

It follows from this expression that

Ax∞ max ≤ A∞ . (2.58) x=0 x∞

In addition, it is assumed that ⎧ ⎫ ⎨n ⎬ n max aij = asj i∈I[1,m] ⎩ ⎭ j=1 j=1 i θj for some s ∈ I[1, m], and denote asj = asj e , j ∈ I[1, n]. Now, choose   − θ − θ − θ T z = e i 1 e i 2 ··· e i n ,

then z∞ = 1. With such a choice, one has

Ax∞ max x=0 x∞     ≥ Az ∞ = Az ∞   (2.59) z ∞ 1 ⎧ ⎫ n ⎨n ⎬ = asj = max aij = A∞ . i∈I[1,m] ⎩ ⎭ j=1 j=1

The relations (2.58) and (2.59) imply that the row norm is the operator norm ·∞. · Besides the preceding three operator norms, the norm 1→2 is provided in the next lemma.   m×n Lemma 2.9 Given a matrix A = a1 a2 ··· an ∈ C , there holds  "   /% % 0 Ax 2 % % A → = max = max aj , 1 2 =   ∈I[ , ] 2 x 0 x 1 j 1 n · = , where p is the p-norm of vectors defined in (2.46), p 1 2.   T n Proof Denote x = x1 x2 ··· xn ∈ C . Then, by using properties of vector norms one has

  =  + +···+  Ax 2 x1a1 x2a2 xnan 2 n % % n % % ≤ % % = % % xjaj 2 xj aj 2 j=1 j=1 2.5 Vector Norms and Operator Norms 61 ⎛ ⎞ n ≤ ⎝ ⎠ {  } xj max ak 2 k∈I[1,n] j=1 =   {  } x 1 max ak 2 . k∈I[1,n]

It follows from this expression that   Ax 2 max ≤ max {ak } . (2.60) =   ∈I[ , ] 2 x 0 x 1 k 1 n

If the vector x is chosen to be x = ek (the k-th unit basic vector), then for any k ∈ I[1, n] one has  "     Ax 2 1ak 2 max ≥ = ak , =     2 x 0 x 1 ek 1 and hence  "   Ax 2 max ≥ max ak . (2.61) =   ∈I[ , ] 2 x 0 x 1 k 1 n

The relations (2.60) and (2.61) imply the conclusion of this lemma. 

The following lemma is a result on the p → q induced norm of partitioned matrices.   = Lemma 2.10 Let A be a block partitioned matrix with A Aij m×n where the dimensionality of Aij,i ∈ I [1, m] , j ∈ I [1, n], are compatible. Then for any p → q induced norm with p, q ≥ 1 there holds: % % %  %   ≤   A p→q % Aij p→q × % . m n p→q

Proof Let a vector x be partitioned consistently with A as   = T T ··· T T , x x1 x2 xn

% % and note that %  %   =     ···   T x p % x1 p x2 p xn p % . p

Then, by using the preceding relation and Theorem 2.6 it can be obtained that 62 2 Mathematical Preliminaries % % %  % % Aij × % m n p→q ⎧ %⎡ ⎤% ⎫ ⎧% % ⎫ ⎪ % n % ⎪ %  % ⎪ % j=1 A1jxj % ⎪ ⎪ ⎪ ⎪ n ⎪ ⎨% Aij x% ⎬ ⎨ %⎢ ⎥% ⎬ m×n 1 %⎢ j=1 A2jxj ⎥% = max q = max %⎢ . ⎥% = ⎪   ⎪ = ⎪  %⎣ . ⎦% ⎪ x 0 ⎩ x p ⎭ x 0 ⎪ x p % . % ⎪ ⎩⎪ % n % ⎭⎪ A x j=1 mj j q ⎧ %⎡ ⎤% ⎫ ⎪ %  n A x  % ⎪ ⎨⎪ % j=1 1j j q % ⎬⎪ 1 %⎢  n A x  ⎥% = max %⎢ j=1 2j j q ⎥% = ⎪  %⎣ ··· ⎦% ⎪ x 0 ⎪ x p % % ⎪ ⎩ %  n A x  % ⎭ j=1 mj j q q ⎧ %⎡ ⎤% ⎫ ⎪ % n     % ⎪ ⎪ % j=1 A1j p→q xj p % ⎪ ⎨ %⎢ n ⎥% ⎬ 1 A  → x  ≤ max %⎢ j=1 2j p q j p ⎥% = ⎪  %⎣ ··· ⎦% ⎪ x 0 ⎪ x p % % ⎪ ⎩ % n     % ⎭ j=1 Amj p→q xj p ⎧ %⎡ q ⎤ ⎡ ⎤% ⎫ ⎪ %     ···     % ⎪ ⎪ % A11 p→q A12 p→q A1n p→q x1 p % ⎪ ⎨ %⎢     ···   ⎥ ⎢   ⎥% ⎬ 1 %⎢ A21 p→q A22 p→q A2n p→q ⎥ ⎢ x2 p ⎥% = max %⎢ . . . ⎥ ⎢ . ⎥% = ⎪  %⎣ . . . ⎦ ⎣ . ⎦% ⎪ x 0 ⎪ x p % . . . . % ⎪ ⎩ % % ⎭ A  → A  → ··· A  → x  % % m1 p q m2 p q mn p q n p q %  % ≤   . % Aij p→q × % m n p→q

The proof is thus completed.  In the preceding lemma, if q = p, the following corollary is readily obtained.   = Corollary 2.1 [316] Let A be a block partitioned matrix with A Aij m×n where the dimensionality of Aij,i ∈ I [1, m] , j ∈ I [1, n], are compatible. Then for any p-induced norm with p ≥ 1 there holds: % % %  %   ≤   A p % Aij p × % . m n p

The following lemma is on the 2-norm of the Kronecker product of two matrices. Lemma 2.11 Given two matrices A and B, there holds

 ⊗  =     . A B 2 A 2 B 2

m×n p×q Proof Let A ∈ C and B ∈ C with rank A = r1 and rank B = r2. By carrying out singular value decompositions for A and B, there exist unitary matrices Ui, Vi, i = 1, 2, with appropriate dimensions such that

A = U1 1V1, B = U2 2V2, 2.5 Vector Norms and Operator Norms 63 with

= diag δ , δ , ...,δ ,0( − )×( − ) , δ ≥ δ ≥···≥δ > 0, 1 1 2 r1 m r1 n r1 1 2 r1 = σ σ ...,σ σ ≥ σ ≥···≥σ > 2 diag 1, 2, r2 ,0(p−r2)×(q−r2) , 1 2 r2 0.

By Item (2) of Proposition 2.2, one has

A ⊗ B = (U1 ⊗ U2)( 1 ⊗ 2)(V1 ⊗ V2) . (2.62)

By Proposition 2.3, it is known that both U1 ⊗U2 and V1 ⊗V2 are unitary. Therefore, it follows from (2.62) that the maximal singular value of A ⊗ B is δ1σ1. The conclusion is immediately obtained from this fact and the expression (2.55). 

At the end of this subsection, a relation is given for 2-norms and Frobenius norms.

Lemma 2.12 For a matrix A, there holds

  ≤   . A 2 A F

Proof Let A ∈ Cm×n with rank A = r. By carrying out singular value decompositions for A, there exist unitary matrices U and V with appropriate dimensions such that

A = U V , with = diag σ1, σ2, ...,σr,0(m−r)×(n−r) , σ1 ≥ σ2 ≥···≥σr > 0.

Thus, one has ' '   = H = H H H A F tr AA tr U VV U ' ' = tr U HUH = tr UHU H   ' r = H =  σ 2 ≥ σ =   tr i 1 A 2 . i=1

The proof is thus completed. 

2.6 A Real Representation of a Complex Matrix

m×n m×n Let A ∈ C , then A can be uniquely written as A = A1 + A2 i with A1, A2 ∈ R . The real representation Aσ of the matrix A is defined as 64 2 Mathematical Preliminaries - . A1 A2 2m×2n Aσ = ∈ R . (2.63) A2 −A1

In general, one often uses the following real representation for a complex matrix m×n A = A1 + A2 i with A1, A2 ∈ R - . A −A 1 2 , A2 A1 which is different from the real representation defined in (2.63). In the first subsection, some basic properties of the real representation defined in (2.63) are given, and the proof of the result on the characteristic polynomial of the real representation is provided in the second subsection.

2.6.1 Basic Properties

i i For an n × n complex matrix A, define Aσ = (Aσ ) , and - . - . Ij 0 0 Ij Pj = , Qj = , (2.64) 0 −Ij −Ij 0 where Ij is the j×j identity matrix. The following lemma gives some basic properties of the real representation defined in (2.63).

Lemma 2.13 (The properties of the real representation) In the following statements, if the power indices k and l are negative, it is required that the involved matrices A and B are nonsingular. (1) If A, B ∈ Cm×n,a∈ R, then ⎧ ⎨ (A + B)σ = Aσ + Bσ (aA)σ = aAσ ⎩ . = PmAσ Pn A σ

(2) If A ∈ Cm×n and B ∈ Cn×r, then

(AB)σ = Aσ PnBσ = Aσ (B)σ Pr.

n×n (3) If A ∈ C , then A is nonsingular if and only if Aσ is nonsingular. n×n 2k k (4) If A ∈ C , and k is an integer, then Aσ = ((AA) )σ Pn. ∈ Cm×n (5) If A , then = T T = . QmAσ Qn Aσ , A σ Aσ 2.6 A Real Representation of a Complex Matrix 65

(6) If A ∈ Cm×m,B∈ Cn×n,C∈ Cm×n, k and l are integers, then ⎧   s t ⎨ AA ACB BB ,k= 2s + 1, l = 2t + 1 k l =   σ Aσ Cσ Bσ s t . ⎩ AA C BB ,k= 2s, l = 2t σ

(7) Given three matrices A, B, and C with appropriate dimensions, there holds

( ) = ABC σ Aσ B σ Cσ .

Proof By direct calculation, it is easily known that Items (1), (2) and (5) hold. Also Item (3) follows from (2) directly. Now, let us show Item (4) by induction. The conclusion is obvious for k = 0. When k = 1, by using Item (2) it is easily known that the conclusion holds. 2i i It is assumed that the conclusion holds for k = i. That is, Aσ = ((AA) )σ Pn. With this assumption, by Item (2) one has

2(i+1) 2(i+1) 2i 2 2i A = (Aσ ) = (Aσ ) (Aσ ) = (Aσ ) AA P σ σ n = (( )i) = ( )i AA σ Pn AA σ Pn AA AA σ Pn i+1 = ((AA) )σ Pn.

This implies that the conclusion holds for k = i + 1. By induction, it is known that the conclusion holds for integer k ≥ 0. Next, the case with k < 0 is considered for Item (4). First, by using Item (2) it is obtained that   A = A −1 −1 σ Pn σ Pn for A ∈ Cn×n. With this relation and Item (4) with k ≥ 0, for integer k < 0 one has

−  − 2k −2k 1 −k 1 Aσ = Aσ = ((AA) )σ Pn  − −1 −k 1 = (Pn) ((AA) )σ     − −1 k 1 = Pn (AA) PnPn   σ = ( )k AA σ Pn.

This is the conclusion. The preceding facts imply that Item (4) holds for all integer k. Item (6) can be easily checked by using Item (4). Finally, let us show Item (7). Let B ∈ Cn×m. By Item (2), one has

( ) = ( ) = ( ) = ( ) ABC σ AB σ PmCσ A σ B σ PmPmCσ A σ B σ Cσ .

The lemma is thus proven.  66 2 Mathematical Preliminaries

n×n Lemma 2.14 Given a matrix A ∈ C ,ifγ ∈ λ(Aσ ), then {±γ,±γ } ⊂ λ(Aσ ).   αT αT T α ∈ Cn = , , Proof Let 1 2 with i , i 1 2 be an eigenvector of Aσ corresponding to the eigenvalue γ . That is, - . - . α1 α1 Aσ = γ . (2.65) α2 α2

By the definition of Aσ , the following relations can be easily obtained from (2.65) - . - . α2 α2 Aσ = (−γ ) , −α1 −α1 - . - . α1 α1 Aσ = γ , α2 α2 - . - . α2 α2 Aσ = (−γ ) . −α1 −α1

Thus, the proof is completed. 

The following lemma gives some results on the 2-norm and Frobenius norm of the · · real representation defined in (2.63). From now on, and 2 are used to represent the Frobenius norm and 2-norm for a given matrix, respectively.

Lemma 2.15 Given a complex matrix A, the following relations hold.

2 2 (1) Aσ  = 2 A ;   =   (2) Aσ 2 A 2. Proof The conclusion of Item (1) can be easily obtained by the definitions of the real representations and Frobenius norms. Now, let us show the conclusion of Item (2). Given a matrix A ∈ Cm×n with rank A = r, r ≤ min (m, n). Performing the singular value decomposition for A ∈ Cm×n,gives

A = U V , (2.66) where U ∈ Cm×m and V ∈ Cn×n are two unitary matrices, and - .  = 0 ∈ Rm×n 00(m−r)×(n−r) 2.6 A Real Representation of a Complex Matrix 67 is a real matrix with  = diag(σ1, σ2, ···, σr), σi > 0, i ∈ I[1, r].Let ⎡ ⎤ ⎡ ⎤ Ir 00 0 Ir 000 ⎢ ⎥ ⎢ ⎥ ⎢ 00Im−r 0 ⎥ ⎢ 00−Ir 0 ⎥ ϒ1 = ⎣ ⎦ , ϒ2 = ⎣ ⎦ , 0 Ir 00 0 In−r 00 00 0 Im−r 00 0In−r ⎡ ⎤  ⎣ ⎦ 1 =  . 02(m−r)×2(n−r)

Then, by Lemma 2.13 one has from (2.66)

Aσ = (Uσ Pm) σ (PnVσ ) (2.67)

= (Uσ Pm) ϒ1 1ϒ2 (PnVσ ) .

In addition, it is easily derived that

H (Uσ Pmϒ1)(Uσ Pmϒ1) = ( ϒ ) ϒH H ( )H Uσ Pm 1 1 Pm Uσ (2.68) H = Uσ (Uσ ) .

m×m Now let U = U1 + iU2 with U1, U2 ∈ R . Since U is a unitary matrix, then

UUH = (U + iU ) UT − iUT 1 2 1 2 = T + T + T − T U1U1 U2U2 i U2U1 U1U2 = I, which implies that

T + T = , T − T = U1U1 U2U2 I U2U1 U1U2 0.

In view of the definition of the real representation, it follows from the preceding relations that

( )H -Uσ Uσ .- . T T = U1 U2 U1 U2 U −U UT −UT - 2 1 2 1 . T + T T − T = U1U1 U2U2 U1U2 U2U1 T − T T + T U2U1 U1U2 U2U2 U1U1 = I. 68 2 Mathematical Preliminaries

It is known from this relation and (2.68) that Uσ Pmϒ1 is a unitary matrix. Similarly, it is easily shown that ϒ2PnVσ is also a unitary matrix. Therefore, the expression in (2.67) gives a singular value decomposition for the real representation matrix Aσ . In view of the definitions of and 2-norm, the conclusion is immediately obtained.  The following theorem is concerned with the characteristic polynomial of the real representation of a complex matrix. Theorem 2.7 Let A ∈ Cn×n, then

( ) = ( 2) = ( 2) ∈ R . fAσ s fAA s fAA s [s]

Proof The proof is provided in Subsection 2.6.2.  For a matrix A ∈ Cn×n,let

n i fA (s) = det(sI − A) = ais . i=0

It is easily known that n i gA (s) = det(I − sA) = an−is . i=0

With this simple fact, the following result can be immediately obtained from Theorem 2.7. Theorem 2.8 Let A ∈ Cn×n, then

( ) = ( 2) = ( 2) ∈ R . gAσ s gAA s gAA s [s]

With the above two theorems, the following results can be readily derived by using Item (4) of Lemma 2.13. Lemma 2.16 Let A ∈ Cn×n and F ∈ Cp×p. Then

( ) = ( ( )) . fAσ Fσ fAA FF σ Pp

Lemma 2.17 Let A ∈ Cn×n and F ∈ Cp×p. Then

( ) = ( ( )) . gAσ Fσ gAA FF σ Pp

2.6.2 Proof of Theorem 2.7

This subsection is devoted to give a proof of Theorem 2.7. 2.6 A Real Representation of a Complex Matrix 69

Lemma 2.18 [143] Let A ∈ Cm×n and B ∈ Cn×m with m ≤ n. Then, BA has the same eigenvalues as AB, counting multiplicity, together with an additional n − m eigenvalues equal to 0. That is,

n−m fBA (s) = s fAB(s).

Proof Consider the following two identities involving block matrices in C(m+n)×(m+n):

- .- . - . AB 0 IA AB ABA = , B 0 0 I BBA - .- . - . IA 00 AB ABA = . 0 I BBA BBA

Since the - . IA ∈ C(m+n)×(m+n) 0 I is nonsingular (all its eigenvalues are 1), then one has

- .− - .- . - . IA 1 AB 0 IA 00 = . (2.69) 0 I B 0 0 I BBA

That is, the two (m + n) × (m + n) matrices - . - . AB 0 00 C = and C = 1 B 0 2 BBA are similar. The eigenvalues of C1 are the eigenvalues of AB together with n zeros. The eigenvalues of C2 are the eigenvalues of BA together with m zeros. Since the eigenvalues of C1 and C2 are the same by (2.69), including multiplicities, the assertion of the theorem follows. 

The result in the preceding lemma is standard. According to this lemma, the following result is readily obtained.

∈ Cn×n, ( ) = ( ) ∈ R Lemma 2.19 Let A then fAA s fAA s [s]. ( ) = ( ) Proof It is obvious from the preceding lemma that fAA s fAA s . Thus one has λ(AA) =/λ(AA). On the other hand, if0 γ ∈ λ(AA), and the corresponding eigenvector chain is vij, j ∈ I[1, pi], i ∈ I[1, q] , then

AAvij = γ vij + vi,j−1, vi0 = 0.

By this relation, one has AAvij = γ vij + vi,j−1, 70 2 Mathematical Preliminaries / 0 which implies that γ ∈ λ(AA), and vij, j ∈ I[1, pi], i ∈ I[1, q] is the eigenvector chain corresponding to γ . Therefore, γ and γ has the same algebraic multiplicity. ( ) = ( ) ∈ R .  With the preceding two aspects, it follows that fAA s fAA s [s] Now let us give another real representation of a complex matrix. In order to distinguish it from the real representation given in (2.63), it is called the first real representation. The definition is given as follows.

m×n Definition 2.4 For a complex matrix A = A1 + iA2, with Ai ∈ R , i = 1, 2, the first real representation σ¯ is defined as - . A1 −A2 Aσ¯ = . A2 A1

It should be pointed out that the so-called first real representation is the often used one in the area of complex matrices. For A ∈ Cn×n,let - . −iI iI P = , −I −I then one has −1 P diag A, A P = Aσ¯ .

Based on this relation, the following conclusion on the first real representation can be obtained from basic matrix theory.

Lemma 2.20 Let A ∈ Cn×n. Then

( ) = ( ) = ( ) ( ) fAσ¯ s fAσ¯ s fA s fA s .

Lemma 2.21 Let A ∈ Cn×n. Then, there holds

= 2 . AA σ¯ Aσ

Proof Let A = A1 + i A2, then - . 2 + 2 − = A1 A2 A1A2 A2A1 = ( )2 . AA σ¯ − 2 + 2 Aσ (2.70) A2A1 A1A2 A1 A2

This implies that the conclusion is true.  2.6 A Real Representation of a Complex Matrix 71

Before giving the next lemmas, the following symbols are defined: ⎡ ⎤ λ 1 ⎢ . ⎥ ⎢ λ .. ⎥ ⎢ ⎥ n×n J1(λ, n) = ⎢ ∈ C , .. ⎥ ⎣ . 1 ⎦ λ ⎡ ⎤ λ −1 ⎢ . ⎥ ⎢ .. ⎥ ⎢ λ ⎥ n×n J2(λ, n) = ⎢ ∈ C , .. ⎥ ⎣ . −1 ⎦ λ (n) = diag((−1)0 , (−1)1 ,...,(−1)n−1).

For these three matrices, the following two lemmas can be obtained by simple computations.

Lemma 2.22 Given λ ∈ C, the following relation holds:

(n)J1(λ, n) = J2(λ, n)(n).

= λ ∈ C 2(λ, ) (λ2, ) Lemma 2.23 (1) For 0 ,J1 n is similar to J1 n . 2( , ) ( , n+1 ) ( , n−1 ) (2) When n is odd, J1 0 n is similar to diag J1 0 2 ,J1 0 2 . When n is 2( , ) ( , n ) ( , n ) even, J1 0 n is similar to diag J1 0 2 ,J1 0 2 . Based on the above lemmas, the following conclusion on the characteristic poly- nomial of the square of a matrix can be obtained.

Lemma 2.24 For a given matrix A ∈ Cn×n,if

2n fA(s) = (s − λi), i=1 then 2n ( ) = ( − λ2). fA2 s s i i=1

n×n Lemma 2.25 Given a matrix A ∈ C ,if0 = γ ∈ λ(Aσ ), then −γ ∈ λ(Aσ ), and γ and −γ have the same Jordan structure. In details, it is assumed that the eigenvalue γ has p Jordan blocks with the orders qi,i∈ I[1, p], the eigenvalue −γ has also p Jordan blocks with the orders of qi,i∈ I[1, p]. 72 2 Mathematical Preliminaries   αT αT T ∈ I[ , ] ∈ I[ , ] Proof Let 1ij 2ij , j 1 qi , i 1 p , be a group of eigenvector chains of Aσ corresponding to the eigenvalue γ . Thus, one has - .- . - . - . - . A A α α α , − α , 1 2 1ij = γ 1ij + 1i j 1 , 1i 0 = 0, (2.71) A2 −A1 α2ij α2ij α2i,j−1 α2i,0

j ∈ I[1, qi], i ∈ I[1, p], that is,  α + α = γα + α A1 1ij A2 2ij 1ij 1i,j−1 . A2α1ij − A1α2ij = γα2ij + α2i,j−1

Rearranging the above relations, one can obtain  (−α ) + α = (−γ)(−α ) + α A1 2ij A2 1ij 2ij 2i,j−1 , A2(−α2ij) − A1α1ij = (−γ)α1ij − α1i,j−1 that is, - .- . - . - . A A −α −α −α , − 1 2 2ij = (−γ) 2ij − 2i j 1 . (2.72) A2 −A1 α1ij α1ij α1i,j−1

Let - . −α −α ··· −α = 2i1 2i2 2iqi ∈ I[ , ] Vi α α ··· α , i 1 p . 1i1 1i2 1iqi

Then it follows from (2.72) that

Aσ Vi = ViJ2(−γ,qi), i ∈ I[1, p].

In view of Lemma 2.22, one has

Aσ (Vi(qi)) = ViJ2(−γ,qi)(qi) = (Vi(qi)) J1(−γ,qi), i ∈ I[1, p].

This implies the conclusion. 

With the above preliminaries, now the proof of Theorem 2.7 can be presented. Proof of Theorem 2.7: According to Lemma 2.25, one can assume that

2n 2n ( ) = − (−λ ) ( − λ ) = ( 2 − λ2). fAσ s [s i ] s i s i (2.73) i=1 i=1

By using Lemma 2.24, it is obtained that

2n   2n 2 2 2 2 2 ( ) = − (−λ ) − λ = ( − λ ) . fAσ s s i s i s i (2.74) i=1 i=1 2.6 A Real Representation of a Complex Matrix 73

It follows from Lemmas 2.19, 2.20, and 2.21 that

2 2 f 2 (s) = f (s) = f (s)f (s) = f (s) = f (s). (2.75) Aσ (AA)σ¯ AA AA AA AA

Combining (2.75) with (2.74), in view of Lemma 2.19 one has

2n ( ) = ( ) = ( − λ2) ∈ R . fAA s fAA s s i [s] (2.76) i=1

The conclusion of Theorem 2.7 can be immediately obtained from (2.73) and (2.76).

2.7 Consimilarity

For complex matrices, besides similarity there is another equivalence relation, con- similarity.

Definition 2.5 Two matrices A, B ∈ Cn×n are said to be consimilar if there exists a −1 nonsingular matrix S ∈ Cn×n such that A = SBS . If the matrix S can be taken to be unitary, A and B are said to be unitarily consimilar.

−1 −1 If A = SBS and S = U is unitary, then A = SBS = UBUT;ifS = R is a real −1 nonsingular matrix, then A = SBS = RBR−1. Thus, special cases of consimilarity include congruence and ordinary similarity. Analogously to the case of similarity, one may be concerned with the equivalence classes containing triangular or diagonal representatives under consimilarity.

Definition 2.6 AmatrixA ∈ Cn×n is said to be contriangularizable if there exists a nonsingular S ∈ Cn×n such that S−1AS is upper triangular; it is said to be condi- agonalizable if S can be chosen so that S−1AS is diagonal. It is said to be unitarily contriangularizable or unitarily condiagonalizable if it can be reduced by consimi- larity to the required form via a unitary matrix.

If A ∈ Cn×n is condiagonalizable and

−1 S AS =  = diag(λ1,λ2,..., λn),   n then AS = S.IfS = s1 s2 ··· sn with each si ∈ C , this identity says that Asi = λisi for i ∈ I[1, n]. This equation is similar to, but crucially different from, the usual eigenvector-eigenvalue equation.

Definition 2.7 Let A ∈ Cn×n be given. A nonzero vector x ∈ Cn such that Ax = λx for some λ ∈ C is said to be a coneigenvector of A; the scalar λ is a coneigenvalue of A. 74 2 Mathematical Preliminaries

The identity AS = S with  diagonal says that every nonzero column of S is a coneigenvector of A. Since the columns of S are independent if and only if S is nonsingular, one can see that a matrix A ∈ Cn×n is condiagonalizable if and only if it has n independent coneigenvectors. To this extent, the theory of condiagonalization is entirely analogous to the theory of ordinary diagonalization. But every matrix has at least one eigenvalue, and it has only finitely many distinct eigenvalues; in this regard, the theory of coneigenvalues is rather different. If Ax = λx, then for all θ ∈ R there holds   e−iθ Ax = A eiθ x = e−iθ λx = e−2iθ λ eiθ x , which implies that e−2iθ λ is a coneigenvalue of A for any θ ∈ R if λ is a coneigenvalue of A. This shows that a matrix may have infinitely many coneigenvalues. On the other hand, if Ax = λx, then

AAx = A(Ax) = A λx = λAx = λλx = |λ|2 x, so a scalar λ is a coneigenvalue of A only if |λ|2 is an eigenvalue of AA. The example - . 0 −1 A = , 10 for which AA =−I has no nonnegative eigenvalues, shows that there are matrices that have no coneigenvalues at all. It is known, however, that if A ∈ Cn×n and n is odd, then A must have at least one coneigenvalue, a result analogous to the fact that every real matrix of odd order has at least one real eigenvalue. Thus, in contrast to the theory of ordinary eigenvalues, a matrix may have infinitely many distinct coneigenvalues or it may have no coneigenvalues at all. If a matrix has a coneigenvalue, it is sometimes convenient to select from among the coneigenvalues of equal modulus the unique nonnegative one as a representative. The necessary condition just observed for the existence of a coneigenvalue is also sufficient as stated below.

× Proposition 2.8 √Let A ∈ Cn n, and let λ ≥ 0 be given. Then λ is an eigenvalue of AA if and only if λ is a coneigenvalue of A. √ √ Proof If λ ≥ 0, λ ≥ 0, and Ax = λx for some x = 0, then √  √ √ √ AAx = A(Ax) = A λx = λAx = λ λx = λx.

Conversely, if AAx = λx for some x = 0, there are two possibilities: (a) Ax and x are dependent; or (b) Ax and x are independent. 2.7 Consimilarity 75

In the former case, there is some μ ∈ C such that Ax = μx, which says that μ is a coneigenvalue of A. But then

λx = AAx = A(Ax) = A (μx) = μAx = μμx = |μ|2 x, √ − θ so |μ| = λ. Since e 2i μ is a coneigenvalue√ associated with the coneigenvector eiθ x for any θ ∈ R, one can conclude that + λ is a coneigenvalue of A. Notice that   AA(Ax) = A AAx = A λx = λ (Ax) and AAx = λx,soifλ is a simple eigenvalue of AA, (a) must always be the case. In the latter case (b) (which could occur if λ is a multiple eigenvalue of AA),the vector √ y = Ax + λx √ is nonzero and is a coneigenvector corresponding to the coneigenvalue λ since √ √ √ √ √ Ay = AAx + λAx = λx + λAx = λ(Ax + λx) = λy. 

It has been seen that for each distinct nonnegative eigenvalue of AA there corresponds a coneigenvector of A, a result analogous to the ordinary theory of eigenvectors. The following result extends this analogy a bit further.

n×n Proposition 2.9 Let A ∈ C be given, and let x1, x2,...,xk be coneigenvectors of A with corresponding coneigenvalues λ1,λ2,...,λk.If|λi| = λj whenever 1 ≤ i, j ≤ k and i = j, then {x1,...,xk} is an independent set.

2 Proof Each xi is an eigenvector of AA with associated eigenvalue |λi| . The vectors xi, i ∈ I[1, k], are independent because they are eigenvectors of the matrix AA and 2 their associated eigenvalues |λi| , i ∈ I[1, k] , are distinct by assumption. 

2.8 Real Linear Spaces and Real Linear Mappings

Let us first recall the concept of mapping. Let U and V be two nonempty sets of elements. A rule T that assigns some unique element y ∈ V for each x ∈ U is called a mapping of U into V , and it is denoted by T : U → V .IfT assigns y ∈ V for x ∈ U, then y is said to be the image of x under T, and is denoted by y = T  x. Most readers are very familiar with the concept of linear spaces and linear mappings. It is well-known that for a given A ∈ Cm×n, the mapping T : x → Ax is a linear mapping. However, the simple mapping M : x → Ax is not a linear mapping since M  (cx) = cAx = c (M  x) = c (M  x) 76 2 Mathematical Preliminaries for an arbitrarily given complex scalar c ∈ C. Thus, some conclusions on linear spaces are not easy to be applied to the simple mapping M : x → Ax. Nevertheless, the mapping M : x → Ax has some nice properties. For example, M  (cx) = c (M  x) for any real scalar c ∈ R. This is the motivation to investigate real linear spaces.

2.8.1 Real Linear Spaces

The concept of real linear spaces is first given as follows.

Definition 2.8 Let V be a set on which a closed binary operation (+) is defined. In addition, let a binary operation (scalar multiplication) be defined from V × R to V . The set V is said to be a real linear space if for any x, y, z ∈ V , and any α, β ∈ R, the following axioms are satisfied: A1. x + y is a unique vector of the set V ; A2. x + y = y + x; A3. (x + y) + z = x + (y + z); A4. There is a vector 0 such that x + 0 = x; A5. For every x ∈ V there exists a vector −x such that x + (−x) = 0. S1. αx is a unique vector of the set V ; S2. α (βx) = (αβ) x; S3. α (x + y) = αx + αy; S4. (α + β) x = αx + βx; S5. 1x = x.

An element in a real linear space V is called a vector. It is easily checked that the set Cn equipped with the ordinary vector addition is both a linear space and a real linear space. Now, consider the following set - . " a V = a ∈ C, b ∈ R . (2.77) 0 b

For this set, the addition is defined as follows - . - . - . a a a + a 1 + 2 = 1 2 . b1 b2 b1 + b2

In addition, for α ∈ C the scalar multiplication is defined as follows - . - . a αa α = . b αb 2.8 Real Linear Spaces and Real Linear Mappings 77

It is obvious that the set V in (2.77) is not a linear space over C, since for x ∈ V , i x may not belong to V . However, it is easily checked that the set V in (2.77)isareal linear space by the axioms in Definition 2.8. Let V be a real linear space, and consider a subset V0 of elements from V .The operations of addition and scalar multiplication are defined for all elements of V and, in particular, for those belonging to V0.IfV0 is a real linear space with the same addition and scalar multiplication, then V0 is said to be a real linear subspace of V . However, it is not necessary to check whether a subset is a real linear subspace by the axioms in Definition 2.8.

Theorem 2.9 A nonempty subset V0 of a real linear space V is a real linear subspace of V if for every x, y ∈ V0, and any α ∈ R, there hold

(1) x + y ∈ V0; (2) αx ∈ V0. 2 Example 2.1 The set V0 in (2.77) is a subset of C . Previously, it has been checked that V0 is a real linear space according to the definition of real linear spaces. However, 2 2 since V0 is a subset of C , and C is a real linear space, one can use the preceding theorem to determine whether V0 is a real linear space. In fact, for the subset V0 in (2.77) it is easily checked that x + y ∈ V0 and αx ∈ V0 for every x, y ∈ V0 and any 2 α ∈ R. Therefore, V0 is a real linear subspace of C , and thus is a real linear space. For real linear subspaces, the following conclusion can be proven easily.

Lemma 2.26 If Vi,i= 1, 2, are real linear subspaces of a real linear space V , then the set V1 ∩ V2 is also a real linear subspace of V .

In a real linear space V , given a set of vectors {x1, x2,...,xk}, if a vector x can be written as x = α1x1 + α2x2 +···+αkxk for some real scalars αi, i ∈ I[1, k], then x is called a real linear combination of the vectors x1, x2,...,xk. A set of vectors {x1, x2,...,xk} is said to be real linearly dependent if there exit real coefficients αi, i ∈ I[1, k], not all 0, such that

α1x1 + α2x2 +···+αkxk = 0.

A set of vectors that is not real linearly dependent is said to be real linearly indepen- dent. 3    4 Example 2.2 In C2,thesetof 1 + i1 T , −2 −1 + i T is linearly dependent over the field C of complex numbers since - . - . 1 + i −2 − (1 − i) + = 0. 1 −1 + i 3    4 However, the set of 1 + i1 T , −2 −1 + i T is real linearly independent. 78 2 Mathematical Preliminaries

n Example 2.3 In C , ei represents the vector whose i-th element is 1, and all the other elements are zero. Obviously, ei and iei are linearly dependent since ei + i (iei) = 0. However, if Cn is viewed as a real linear space with ordinary vector addition and scalar multiplication, then the vectors ei and iei are real linearly independent.

Example 2.3 implies that the concept of linear dependence is closely relevant to the scalar field. For the same vector sets, even if the definitions of addition and scalar multiplication are identical, a set of vectors to be linearly dependent over some field may be linearly independent over another field. It can be easily verified that the set of all real linear combinations of the vectors a1, a2, ..., an belonging to a real linear space V generates a real linear subspace. In order to distinguish from the notation of conventionally generated subspace, this real linear subspace is denoted by RLspan {a1, a2,...,an}, that is, # $ n { , ,..., } = α α ∈ R ∈ I[ , ] RLspan a1 a2 an iai i , i 1 n . i=1

Example 2.4 For a set S of complex vectors ai, i ∈ I[1, n], the linear subspace generated by S (over complex field C) is denoted by # $ n { , ,..., } = α α ∈ C ∈ I[ , ] span a1 a2 an iai i , i 1 n . i=1

Let ei represent the n-dimensional vector whose i-th element is 1, and all the other elements are 0, then

n RLspan {e1, e2,...,en} = R , n span {e1, e2,...,en} = C , n n RLspan {ie1, ie2,...,ien} = iR = {iη | η ∈ R } , n span {ie1, ie2,...,ien} = C .

Let V be a real linear space, a finite set of vectors

{a1, a2,...,an} (2.78) is said to be a real basis of V if they are real linearly independent, and every vector x ∈ V is a real linear combination of the vectors in (2.78):

n x = αiai, αi ∈ R, i ∈ I[1, n]. i=1 2.8 Real Linear Spaces and Real Linear Mappings 79

In other words, the vectors in (2.78) form a real basis of V if

V = RLspan {a1, a2,...,an} and no ai (i ∈ I[1, n]) can be discarded. In this case, the vectors in (2.78) are referred to as real basis vectors of V .

n Example 2.5 For C , it is well-known that the set {e1, e2, ..., en} with ei defined as in Example 2.4 is a basis of V if Cn is regarded as a linear space over field C. n However, {e1, e2, ..., en} is not a real basis of C . One can choose the following set of vectors as a real basis of Cn:

{e1, e2, ..., en,ie1,ie2, ...,ien} .

Example 2.6 Consider the set Pn of complex polynomials of s with degree less than or equal to n/.IfthesetPn is0 viewed as a linear space over field C, it is well-known 2 n that the set 1, s, s ,...,s is a basis of Pn. However, a real basis of Pn is / 0 1, s, s2,...,sn,i, is, is2,...,isn .

Example 2.7 Consider the set - . " x V = x ∈ Cn, y ∈ Rn . y

It is easily verified that the set V with ordinary vector addition and scalar multipli- cation is not a linear space over field C. However, the set V with ordinary vector addition and scalar multiplication is a real linear space. The following set forms a real basis of V : - . - . - . - . - . - . e e e 0 0 0 1 , 2 , ..., n , n , n , ..., n , 0 0 0 e e e - n. - n . - n ." 1 2 n ie ie ie 1 , 2 , ..., n . 0n 0n 0n

Similarly to normal linear spaces, a real basis necessarily consists of only finitely many vectors. Accordingly, the real linear spaces they generate are said to be finite dimensional. The following property holds for finite dimensional real linear spaces. Proposition 2.10 Any vector of a finite dimensional real linear space can be expressed uniquely as a real linear combination of the vectors of a fixed real basis. Similarly to the case of normal linear spaces, there are concepts of coordi- nates in real linear spaces. Let x belong to a real linear space V with a real basis T n {a1, a2, ..., an}. The column vector α = α1 α2 ··· αn ∈ R such that the decomposition 80 2 Mathematical Preliminaries

n x = αiai i=1 holds is said to be the representation of x with respect to the real basis {a1, a2, ..., an}. The real scalars α1, α2, ..., αn, are referred to as the coordinates of x with respect to the real basis {a1, a2, ..., an}.

Example 2.8 Consider the real linear space - . " x V = x ∈ R, y ∈ C , y with the ordinary vector addition and scalar multiplication. It is easily known that - . - . - ." 1 0 0 , , (2.79) 0 1 i   forms a real linear basis. For any xy T ∈ V , one has - . - . - . - . x 1 0 0 = x + (Re y) + (Im y) . y 0 1 i     This relation implies that α = x Re y Im y T is the representation of xy T with respect to the real basis in (2.79).

Theorem 2.10 All real bases for a finite dimensional real linear space have the same number of vectors.

The proof of this theorem is very analogous to the case of normal linear spaces, and thus is omitted. For a finite dimensional real linear space V , the number of vectors in a real basis is a characteristic of the real linear space that is invariant under different choice of real bases. In this book, this number is formally defined as the real dimension of the real linear space V , and is denoted by rdimV . With this notation, one has

rdim Cn = 2n.

Remark 2.4 In this subsection, some concepts have been given for real linear spaces. In fact, the so-called real linear space is only a special case of normal linear spaces. Remember that the involved field is the field of real numbers. Nevertheless, it should be pointed out that, in most textbooks on matrices or linear algebra, the set Cn is only viewed as a linear space over the field of C.

In the next subsection, it will be seen that the theory on real linear spaces can be conveniently applied to some complex mappings. 2.8 Real Linear Spaces and Real Linear Mappings 81

2.8.2 Real Linear Mappings

Let us begin this subsection with the concept of image of a mapping. Let U and V be two nonempty sets of elements, and T be a mapping of U into V . The image of the mapping T is defined as

Image T = {y ∈ V | y = T  x, x ∈ U} .

In addition, if V has a zero element, the kernel of T is defined as follows:

Ker T = {x ∈ U | T  x = 0} .

Now, it is required that U and V are two real linear spaces. A mapping T : U → V is real linear if for any two vectors x1, x2 ∈ U, and any real scalar α ∈ R, there hold

T  (x1 + x2) = T  x1 + T  x2,

T  (αx1) = α (T  x1) .

Example 2.9 Let U = Cm and V = Cn. Obviously, both U and V are linear spaces over the field of C with the ordinary vector addition and scalar multiplication. More- over, both U and V are also real linear spaces. Let A ∈ Cn×m. It is well-known that the mapping T : U → V defined by T  x = Ax is linear. Moreover, this mapping is also real linear.

Example 2.10 Let U = Cm and V = Cn. For a given matrix A ∈ Cn×m, the mapping T : U → V defined by T  x = Ax, which is an antilinear mapping, is often encountered in the context of consimilarity. For this mapping, there holds T (αx) = α (T  x). Generally, for an arbitrary complex number α, Aαx = αAx. Therefore, this mapping is not linear over the field of C. However, this mapping is real linear.

m n n×m Example 2.11 Let U = C and V = C . For two given matrices M, M# ∈ C , consider the mapping T : U → V defined by T  x = Mx + M#x. This mapping is neither linear, nor antilinear over the field of C. However, this mapping is real linear.

Example 2.12 Given two matrices A ∈ Cn×n and F ∈ Cp×p, consider the mapping T : Cn×p → Cn×p defined by

T  X = AX − XF.

In this book, this mapping will be referred to as a con-Sylvester mapping. It is easily checked that this mapping is real linear.

Next, the image and kernel of real linear mappings are investigated.

Theorem 2.11 Let U and V be two real linear spaces, and T be a real linear mapping of U into V . Then, Image T is a real linear subspace of V . 82 2 Mathematical Preliminaries

Proof Let y1, y2 ∈ Image T. Then, there are x1, x2 ∈ U such that yi = T  xi, i = 1, 2. Since T is a real linear mapping, then for any α ∈ R

y1 + y2 = T  x1 + T  x2 = T  (x1 + x2) ;

αy1 = α (T  x1) = T  (αx1) .

Since U is a real linear space, then x1 + x2 ∈ U and αx1 ∈ U. Therefore, it follows from the preceding two relations that y1 + y2 ∈ Image T, and αy1 ∈ Image T.By Theorem 2.9, Image T is a real linear subspace of V . 

Theorem 2.12 Let U and V be two real linear spaces, and T be a real linear mapping of U into V . Then, Ker T is a real linear subspace of U.

The proof of this theorem is very simple, and thus is omitted.

Theorem 2.13 Let U and V be two real linear spaces, and T be a real linear mapping of U into V . Then,

rdim (Image T) + rdim (Ker T) = rdim U.

Proof Let r1 = rdim (Ker T), and r = rdim U. Then, there exist a set of real linearly independent vector xi, i ∈ I[1, r1], such that / 0 = , ,..., Ker T RLspan x1 x2 xr1 .

... { , ,..., } In this case, there exist additional vectors xr1+1, xr1+2, , xr such that x1 x2 xr is a real basis of U. Thus, one has #  $ r =  α α ∈ R ∈ I[ , ] Image T T ixi i , i 1 r # i=1   $ r1 r =  α +  α α ∈ R ∈ I[ , ] T ixi T ixi i , i 1 r = = + # i 1  i r1 1  $ r1 r = α (  ) + α (  ) α ∈ R ∈ I[ , ] i T xi i T xi i , i 1 r = = + # i 1  i r1 1 $ r = α (  ) α ∈ R ∈ I[ + , ] i T xi i , i r1 1 r . i=r1+1

Now, let us show that T  xi, i ∈ I[r1 + 1, r], are real linearly independent. Let βi, i ∈ I[r1 + 1, r], be real scalars such that

r βi (T  xi) = 0.

i=r1+1 2.8 Real Linear Spaces and Real Linear Mappings 83

Since T is real linear, one has  r T  βixi = 0,

i=r1+1 which implies that r βixi ∈ Ker T.

i=r1+1

Therefore, βi = 0, i ∈ I[r1 + 1, r]. With the preceding relations, one has

rdim (Image T) = r − r1.

The proof is thus completed. 

2.9 Real Inner Product Spaces

An inner product space is a vector space with an additional structure called an inner product. This additional structure associates each pair of vectors in the space with a scalar quantity known as the inner product of the vectors. For a general complex vector space, the complex inner product is defined as follows. Definition 2.9 A complex inner product space is a vector space V over the complex field C together with an inner product, i.e., with a map

·, · : V × V → C satisfying the following three axioms for all vectors x, y, z ∈ V , and all scalars a ∈ C. (1) Conjugate symmetry: x, y = y, x; (2) Linearity in the first argument:

ax, y = a x, y ; x + y, z = x, z + y, z ;

(3) Positive-definiteness: x, x > 0 for all x = 0.

Two vectors u, v ∈ V are said to be orthogonal if u, v = 0. In Definition 2.9, conjugate symmetry gives

x, ay = ay, x = a y, x = ay, x = a x, y ; x, y + z = y + z, x = y, x + z, x = x, y + x, z . 84 2 Mathematical Preliminaries

An inner product space V with the inner product ·, · has a naturally defined norm: & x = x, x, for any x ∈ V .

For an n-dimensional inner product space V with the inner product ·, ·, there exists a special basis in which all the vectors are orthogonal and have unit norm. Such { } 5a basis6 is called orthonormal. In details, a basis ε1, ε2, ..., εn is orthonormal if εi,εj = 0, for i = j and εi,εi = εi = 1 for each i.Inann-dimensional complex vector space Cn, the often encountered inner product is given by

x, y = yHx for any x, y ∈ Cn. For the matrix space Cm×n, one can equip it with the following inner product: X, Y = tr Y HX for any X, Y ∈ Cm×n. For a real linear space, one can equip it with a real inner product.

Definition 2.10 [297] A real inner product space is a real linear space V equipped with a real inner product, i.e., with a map

·, · : V × V → R satisfying the following three axioms for all vectors x, y, z ∈ V , and all scalars a ∈ R. (1) Symmetry: x, y = y, x; (2) Linearity in the first argument:

ax, y = a x, y ; x + y, z = x, z + y, z ;

(3) Positive-definiteness: x, x > 0 for all x = 0. Two vectors u, v ∈ V are said to be orthogonal if u, v = 0.

The following theorem defines a real inner product on space Cn.

Theorem 2.14 In the real linear space Cn, a real inner product can be defined as

x, y = Re yHx (2.80) for x, y ∈ Cn. This real inner product space is denoted as (Cn, R, ·, ·). 2.9 Real Inner Product Spaces 85

Proof (1) For x, y ∈ Cn, one has   T x, y = Re yHx = Re yHx = Re xTy   = Re xTy = Re xHy = y, x .

(2) For a real number a, and x, y, z ∈ Cn, it is easily obtained that

ax, y = Re yH (ax) = Re ayHx

= a Re yHx = a x, y ,

  x + y, z = Re zH (x + y)

= Re zHx + Re zHy = x, z + y, z .

2 (3) It is  well-known that tr xHx = x > 0 for all x = 0. Thus, x, x = Re tr xHx > 0 for all x = 0. According to Definition 2.10, all the above arguments reveal that the space Cn with the real inner product defined in (2.80) is a real inner product space. 

When the matrix space Cm×n is viewed as a real linear space, the following theorem provides a real inner product for this space.

Theorem 2.15 In the real linear space Cm×n, a real inner product can be defined as   A, B = Re tr AHB (2.81) for A, B ∈ Cm×n. This real inner product space is denoted as Cm×n, R, ·, · .

Proof (1) For A, B ∈ Cm×n, according to the properties of the trace of a matrix one has       A, B = Re tr AHB = Re tr BTA = Re tr BTA      = Re tr BTA = Re tr BHA = B, A . 86 2 Mathematical Preliminaries

(2) For a real number a, and A, B, C ∈ Cm×n, it can be derived that     aA, B = Re tr (aA)H B = Re tr aAHB     = Re a tr AHB = a Re tr AHB = a A, B ,   A + B, C = Re tr (A + B)H C   = Re tr AH + BH C     = Re tr AHC + Re tr BHC = A, C + B, C .

(3) It is  well-known that tr AHA > 0 for all A = 0. Thus, A, A = Re tr AHA > 0 for all A = 0. According to Definition 2.10, all the above arguments reveal that the space Cm×n with the real inner product defined in (2.81) is a real inner product space. 

The real inner product space defined in Theorem 2.14 is different from the complex Hilbert space, and is a very interesting inner product space. In V = C2,let - . - . 1 + i 1 v = , v = . 1 i 2 −i   H =− H = By simple computations, one has v 1 v2 i, so Re tr v1 v2 0. Thus, in the real 2 inner product space C , R, ·, · , the vectors v1 and v2 are orthogonal. However, n v1 and v2 are not orthogonal in Hilbert space. In space C , the symbol ei is used to denote the vector whose i-th element is 1, and the other elements are zero. In the real inner product space (Cn, R, ·, ·) with ·, · defined in (2.80), it is easily verified that ei,iei, i ∈ I[1, n], are orthogonal. In addition, it is easily known that for any vector v ∈ Cn there must exist 2n real numbers ai, i ∈ I[1, 2n], such that

n n v = aiei + an+iiei. i=1 i=1

These facts reveal that ei,iei, i ∈ I[1, n], form a group of orthogonal real basis of the real inner product space (Cn, R, ·, ·). So the real dimension of the real inner product space (Cn, R, ·, ·) is 2n. Similarly, it is easily verified that the real inner product space Cm×n, R, ·, · with ·, · defined in (2.81)is2mn-dimensional. In general, the real linear space × × × Cm1 n1 × Cm2 n2 ×···×CmN nN with the real inner product defined as 2.9 Real Inner Product Spaces 87 + , N ( , ,..., ), ( , ,..., ) = H R1 R2 RN S1 S2 SN Re tr Ri Si i=1

m1×n1 m2×n2 mN ×nN for (R1, R2,...,RN ), (S1, S2,...,SN ) ∈ C × C ×···×C , is of real N dimension 2 mini. i=1

2.10 Optimization in Complex Domain

In this section, some results are introduced on the optimization of real-valued func- tions with respect to complex variables. First, the partial derivative of a function with respect to a complex variable is introduced. Let z be a complex variable, and denote it as

z = x + iy, x, y ∈ R.

Then, the formal derivatives for z and its conjugate z are defined as, respectively   ∂ ∂ ∂ = 1 − ∂ ∂ i∂ , (2.82) z 2  x y  ∂ 1 ∂ ∂ = + i . (2.83) ∂z 2 ∂x ∂y

With the definitions in (2.82) and (2.83), it is easily checked that

∂z ∂z = = 1, (2.84) ∂z ∂z ∂z ∂z = = 0. (2.85) ∂z ∂z

The expressions (2.84) and (2.85) describe a basic result in complex analysis: the complex variable z and its conjugate z are independent when partial derivatives need to be computed. Let f (z, z) be a scalar function with complex variable z. In addition, the partial derivatives of f (z, z) with respect to the real part x and the imaginary part y of z is continuous. Thus, there hold   ∂ ( , ) ∂ ∂ f z z = 1 f − f ∂ ∂ i∂ , z 2  x y  ∂f (z, z) 1 ∂f ∂f = + i . ∂z 2 ∂x ∂y 88 2 Mathematical Preliminaries

With these basic definitions, the gradient and conjugate gradient of a scalar func- tion with respect to a vector are given.   T n Definition 2.11 Let ω = ω1 ω2 ··· ωn ∈ C , and f (ω) be a complex-valued scalar function of ω. Then the gradient of f (ω) with respect to ω is defined as   ∂ (ω) T f ∂f (ω) ∂f (ω) ··· ∂f (ω) =∇ωf (ω) = ∂ω ∂ω ∂ω . ∂ω 1 2 n

In addition, the gradient of f (ω) with respect to the conjugate ω of the vector variable ω, which is also called the conjugate gradient of f (ω) with respect to ω, is defined as   ∂ (ω) T f ∂f (ω) ∂f (ω) ··· ∂f (ω) =∇ωf (ω) = ∂ω ∂ω ∂ω . ∂ω 1 2 n

In the following, the definitions are given on the gradient and conjugate gradient of a vector function with respect to a scalar.   T m Definition 2.12 If ω ∈ C, and f (ω) = f1(ω) f2(ω) ··· fm(ω) ∈ C is a complex- valued vector function. Then, the gradient of f (ω) with respect to ω is defined as ∂ (ω)   f ∂f (ω) ∂f (ω) ∂f (ω) =∇ωf (ω) = 1 2 ··· m , ∂ω ∂ω ∂ω ∂ω and the conjugate gradient of f (ω) with respect to ω is defined as ∂ (ω)   f ∂f (ω) ∂f (ω) ∂f (ω) =∇ωf (ω) = 1 2 ··· m . ∂ω ∂ω ∂ω ∂ω

Similarly to the preceding definitions, the gradient of a vector function with (ω) = (ω) (ω) ··· (ω) T respect to a vector can be defined. Let f f1  f2 fm is an T n m-dimensional vector function of ω = ω1 ω2 ··· ωn ∈ C , then the gradient of f (ω) with respect to ω is defined as ∂ (ω)   f ∂f (ω) ∂f (ω) ∂f (ω) =∇ωf (ω) = 1 2 ··· m , ∂ω ∂ω ∂ω ∂ω which can be written as ⎡ ⎤ ∂ (ω) ∂ (ω) ∂ (ω) f1 f2 ··· fm ∂ω ∂ω ∂ω ⎢ ∂ (ω)1 ∂ (ω)1 ∂ (ω)1 ⎥ ∂ (ω) f1 f2 ··· fm f ⎢ ∂ω ∂ω ∂ω ⎥ = ⎢ 2 2 2 ⎥ . ∂ω ⎣ ··· ··· ··· ⎦ ∂ (ω) ∂ (ω) ∂ (ω) f1 f2 ··· fm ∂ωn ∂ωn ∂ωn 2.10 Optimization in Complex Domain 89

Similarly, the conjugate gradient of f (ω) with respect to ω is defined as ∂ (ω)   f ∂f (ω) ∂f (ω) ∂f (ω) =∇ωf (ω) = 1 2 ··· m , ∂ω ∂ω ∂ω ∂ω which can be expressed as ⎡ ⎤ ∂ (ω) ∂ (ω) ∂ (ω) f1 f2 ··· fm ∂ω ∂ω ∂ω ⎢ ∂ (ω)1 ∂ (ω)1 ∂ (ω)1 ⎥ ∂ (ω) f1 f2 ··· fm f ⎢ ∂ω ∂ω ∂ω ⎥ = ⎢ 2 2 2 ⎥ . ∂ω ⎣ ··· ··· ··· ⎦ ∂ (ω) ∂ (ω) ∂ (ω) f1 f2 ··· fm ∂ωn ∂ωn ∂ωn

By the preceding definitions, some often-encountered formulae are given on gra- dient and conjugate gradient. (1) Given A ∈ Cn×n, for the quadratic function f (x) = xHAx with x ∈ Cn there holds ∂f (x) ∂f (x) = ATx, = Ax. ∂x ∂x

(2) (The chain rule) If y(x) ∈ Cn is a complex vector-valued function of x ∈ Cn, then ∂ ( ( )) ∂ ( ) ∂ ( ) f y x = y x f y . ∂x ∂x ∂y

(3) Given A ∈ Cn×m, for the function f (x) = Ax with respect to x ∈ Cn there holds

∂f (x) ∂f (x) = AT, = 0. ∂x ∂x   T Let f (ω) be a complex-valued function with respect to ω = ω1 ω2 ··· ωn ∈ Cn. The H(f (ω)) of the function f (ω) is a square matrix of second- order partial derivatives, and has the following form ⎡ ⎤ ∂2f (ω) ∂2f (ω) ··· ∂2f (ω) ∂ω1∂ω1 ∂ω1∂ω2 ∂ω1∂ωn ⎢ 2 2 2 ⎥ ∂ f (ω) ∂ f (ω) ··· ∂ f (ω) ⎢ ∂ω ∂ω ∂ω ∂ω ∂ω ∂ω ⎥ H(f (ω)) = ⎢ 2 1 2 2 2 n ⎥ . ⎣ ··· ··· ··· ⎦ ∂2f (ω) ∂2f (ω) ··· ∂2f (ω) ∂ωn∂ω1 ∂ωn∂ω2 ∂ωn∂ωn

It is easily seen that the transpose of the Hessian matrix of f (ω) is the gradient of the conjugate gradient of the function f (ω). That is, - . ∂f (ω) ∂ ∂f (ω) HT(f (ω)) = = . ∂ω∂ω ∂ω ∂ω 90 2 Mathematical Preliminaries

2 Due to this relation, the Hessian matrix will be denoted by ∇ωf (ω). Thus, one has - . - - .. - . ∂2f (ω) T ∂ ∂f (ω) T ∂f (ω) ∇2 f (ω) = = = ω ∂ω∂ω ∂ω ∂ω ∂ω ∂ω . i j n×n

The Hessian matrix plays a vital role in the solution to minimization problems. At the end of this section, a sufficient condition and a necessary condition are provided for the local minimum point of a real-valued function f (ω).

Lemma 2.27 Let f (ω) be a real-valued function with respect to complex vector n variable ω ∈ C . Then, ω = ω0 is a strictly local minimum point of f (ω) if ∂f (ω) ∂2f (ω) = , > . ∂ω 0 ∂ω∂ω 0 ω=ω0 ω=ω0

Lemma 2.28 Let f (ω) be a real-valued function with respect to complex vector n variable ω ∈ C .Ifω = ω0 is a local minimum point of f (ω), then ∂f (ω) ∂2f (ω) = , ≥ . ∂ω 0 ∂ω∂ω 0 ω=ω0 ω=ω0

For a convex function f (ω) with respect to the complex vector ω, its arbitrary local minimum point ω = ω0 is the global minimum point. If the convex function (ω) ω = ω ∂f (ω) = , f is differentiable, then the stationary point 0 satisfying ∂ω¯ 0 ω=ω0 is the global minimum point.

2.11 Notes and References

In this chapter, some mathematical tools which will be used in the sequel chapters of this book are introduced. The main materials in this chapter are well-known, and can be found in some textbooks. Of course, there are some of the authors’ own research results. These include the result on the characteristic polynomial of the real representation of a complex matrix in Sect. 2.6, and the real inner product given in Theorem 2.14. In Sect.2.1, the operation of Kronecker products for matrices is introduced. The materials in Sect.2.1 are mainly taken from [143, 172]. Some further properties on Kronecker products can be found in literature and some textbooks. For example, an identity on the determinant of the Kronecker product of two matrices was given in [172]; eigenvalues of Kronecker products were also investigated in [172]. The Kronecker product plays an important role in a variety of areas, such as, signal processing, image processing and semidefinite programming. A new method for estimating high dimensional covariance matrices was presented in [230] based on 2.11 Notes and References 91 a Kronecker product series expansion of the true . In [207], the application of Kronecker products in discrete unitary transformations was introduced, and some characteristics of Hadamard transformations were derived in the framework of Kronecker products. A brief survey on Kronecker products can be referred to [232]. In Sect. 2.2, the celebrated Leverrier algorithm is introduced. The proof of this algorithm is taken from [10]. In this proof, an appropriate companion matrix is used. Another proof was given in [144] by establishing a simple formula related to the trace of the resolvent and the characteristic polynomial of a matrix. The Leverrier algorithm is also called Leverrier-Faddeev algorithm, and has been widely applied. In [137], it was used to solve Lyapunov and Sylvester matrix equations. In Sect.2.3, a generalization of the Leverrier algorithm is introduced. The gen- eralized Leverrier algorithm for (sM − A) was given in [186] in order to calculate the transfer function for a descriptor linear system. The further extension for this algorithm was given in [174]. In fact, the result in Theorem 2.4 is a direct corollary of the main result in [174]. In [305], this result was also obtained by directly using the Leverrier algorithm. The content of this section is mainly taken from [186, 305]. It should be pointed out that the idea in [186] has been extended to obtain algorithms for solving determinant of some general polynomial matrices [234]. In Sect.2.4, the concepts of singular values and singular vectors are first intro- duced, and then the well-known singular value decomposition (SVD) of a given complex matrix is given. Up till now, it has been recognized that the process of obtaining a singular value decomposition for a given matrix is numerically reliable. In addition, the SVD is one of the most important tools in numerical linear algebra, because it contains a lot of information about a matrix, including rank, distance to singularity, column spaces, row spaces, and null spaces. Therefore, the SVD has become a popular technique, and can be found in many textbooks [81, 143, 172]. Another feature of the SVD is that it involves only two unitary or orthogonal matri- ces. This makes it a very favorable tool in many applications. 
For example, in [261] the SVD was used to derive the I-controllablizability condition, and give a general expression of I-controllablizing controller for square descriptor linear systems. Such a method was generalized in [259, 260] to investigate the impulsive-mode control- lablizability of nonsquare descriptor linear systems. In [112], the SVD was also used to solve the problem of dynamical order assignment of descriptor linear systems. In [63], the robust pole assignment problem via output feedback was solved for linear systems with the SVD as a tool. Besides, SVD techniques also play important roles in image processing [7, 151]. In Sect. 2.5, the vector norms and operator norms for matrices are introduced. Vec- tor norms are generalizations of ordinary length, and are used to measure the size of a vector in vector spaces. In the first subsection of this section, an axiomatic definition is first given for vector norms, and then the celebrated p-norms are introduced. These materials can be found in most textbooks on matrix theory or numerical analysis [142, 172]. The proof on p-norms is also presented in this subsection. During the procedure of the proof, the celebrated Young’s Inequality and Hölder’s Inequality are used. In fact, the triangle inequality on p-norms is the celebrated Minkowski’s Inequality. For more information on Hölder’s Inequality and Minkowski’s Inequality, one can refer 92 2 Mathematical Preliminaries to [184, 187]. In the second subsection of Sect. 2.5, operator norms of matrices are introduced. The definition of operator norms is first given, and then some properties of operator norms are established. Moreover, some often encountered operator norms are introduced. The operator norm in this section can be found in [132]. It is worth pointing out that the operator norm given in this section is different from that in most textbooks on matrix theory, where the operator norm for a square matrix is defined as [142, 172, 187]  " Ax A = max . x=0 x

In this definition, the two vector norms are required to be identical. However, the definition of operator norms in Sect. 2.5 does not have this requirement. In addition, in Sect. 2.5 it is not required that the matrix A is square. On general operator norms introduced in Sect. 2.5, one can refer to [62, 132, 209]. Table2.1 lists some p → q induced norms for a matrix ⎡ ⎤ T a1·   ⎢ aT ⎥   A = a = ⎢ 2· ⎥ = a a ··· a ij m×n ⎣ ···⎦ 1 2 n . T am·

These results can be found in [62, 209]. In Table2.1, {−1, 1}n represents the Cartesian product of the set {−1, 1}. That is,

{−1, 1}n = {−1, 1} × {−1, 1} ×···×{−1, 1} 3  4 T = a1 a2 ··· an ai ∈ {−1, 1} , i ∈ I [1, n] .

In Sect. 2.6, the concept of a real representation of complex matrices is introduced, and some properties of this real representation are given. The real representation in Sect. 2.6 was firstly proposed by Jiang and Wei in [155] to solve the con-Kalman- Yakubovich matrix equation X − AXB = C. In [258], it was used to give an explicit solution of normal con-Sylvester matrix equations. This real representation was also applied to investigate some other complex conjugate matrix equations [265, 281]. In addition, it was also used in [154] to investigate the consimilarity of complex matrices. In Sect. 2.6, the results in Lemmas 2.13 and 2.14 are taken from [155]. The result of Lemma 2.15 is taken from [265]. The results of Theorems 2.7 and 2.8 were given in [258] and [280]. However, in Subsection 2.6.2 an alternative proof for Theorem 2.7 was provided. In Sect.2.7, the consimilarity of complex matrices is simply introduced. The main content of this section is taken from [142]. Besides similarity, consimilarity is another equivalence relation between complex matrices. Consimilarity firstly appeared as a change of basis for the matrix representation of semilinear transformations [20, 150]. In Sect. 2.7, the concepts of consimilarity, coneigenvalues and coneigenvectors are given, and a relation between coneigenvalues and eigenvalues is established. 2.11 Notes and References 93

Table 2.1 Some p → q induced norms Norm Expression# $ Norm Expression m max aij j∈I[1,n] A → = A∞→ max {As → } 1 1 i/%1 % 0 1 n 1 1 = % % s∈{−1,1} max aj 1 j∈I[1,#n] $ ⎧ ⎫ m ⎨n ⎬ 2 2 max aij max aij   j∈I[1,n]   i∈I[1,m] ⎩ ⎭ A 1→2 i=/%1 % 0 A ∞→2 j=1 % % = max aj = {  } ∈I[ , ] 2 max ai· 2 j 1 n i∈I[1,⎧m] ⎫ ⎨n ⎬ / 0 max aij     i∈I[1,m] ⎩ ⎭ A 1→∞ max aij A ∞→∞ j=1 i∈I[1,m], j∈I[1,n] = {  } max ai· 1 / 0 i∈I[1,m]   σ T   σ A 2→1 max max s A A 2→2 max (A) s∈{−1,1}m

It can be seen that the theory of coneigenvalues is much more complicated than that of eigenvalues. A matrix may have infinitely many distinct coneigenvalues or it may have no coneigenvalues at all. There have been many results on consimilarity available in literature. In [141], a criterion of consimilarity between A and B was established by the similarity between AA and BB combined with the alternating- product rank condition. Moreover, some canonical forms were given in [141] under consimilarity. Section2.8 first introduces the concept of real linear spaces, and then gives some properties of real linear mappings. In fact, the concept of real linear spaces is only a special case of normal linear spaces. However, in most books the entries of the involved vector in a linear space over the field R are all real. In this section, the real linear space is emphasized in order to deal with some special spaces. Some concepts in normal linear spaces are specialized to the context of real linear spaces. These concepts include real bases, real dimensions, and real linear dependence. In the second subsection of this section, a result on the real dimensionality of image and kernel of a real linear mapping is given. This result will be used in Sect.6.1 to investigate the solvability of the so-called normal con-Sylvester matrix equation. In Sect. 2.9, the real inner product space is introduced. A real inner product for the matrix space Cm×n is given, which will be used in Chap.5 to investigate finite iterative algorithms for some complex conjugate matrix equations. In Sect. 2.10, some basic results are introduced for complex optimization. The definition of the derivative with respect to a complex variable is first given, and then the concepts of gradient and conjugate gradient are presented. Based on these preliminaries, some conditions for minimum point are given for a real-valued function with respect to complex vector variables. The result in this section will be used in Chap. 12. The main content of this section is taken from [297]. Part I Iterative Solutions Chapter 3 Smith-Type Iterative Approaches

As stated in Sect. 1.5, a complex conjugate matrix equation means a complex matrix equation with the conjugate of the unknown matrices. In this chapter, the following con-Kalman-Yakubovich matrix equation is investigated

X − AXB = C, (3.1) where A ∈ Cm×m , B ∈ Cp×p, and C ∈ Cm×p are known matrices, and X ∈ Cm×p is the matrix to be determined. This equation is a simple complex conjugate matrix equa- tion. As mentioned in Introductory Sect.1.5, the con-Kalman-Yakubovich matrix equation is a conjugate version of the following matrix equation

X − AXB = C.

In the literature, this equation is called Kalman-Yakubovich matrix equation. Thus, the matrix equation in (3.1) is called con-Kalman-Yakubovich matrix equation in this monograph. The main aim of this chapter is to establish some Smith-type iterative methods for the con-Kalman-Yakubovich matrix equation (3.1) by generalizing the Smith-type iterations for the Kalman-Yakubovich matrix equation. As a preliminary, an infinite series solution to the matrix equation (3.1) is first given in Sect.3.1.In Sects.3.2 and 3.3, the Smith iterative algorithm and Smith (l) iterative algorithm are provided to solve the con-Kalman-Yakubovich matrix equation (3.1), respectively. The algorithms in Sects.3.2 and 3.3 are linearly convergent. To improve the con- vergence rate, accelerative iterative algorithms are proposed in Sect. 3.4 to solve the con-Kalman-Yakubovhich matrix equation (3.1).  T From now on, for a column vector x = x1 x2 ··· xn the notation x denotes its Euclidean norm, that is,

© Springer Science+Business Media Singapore 2017 97 A.-G. Wu and Y. Zhang, Complex Conjugate Matrix Equations for Systems and Control, Communications and Control Engineering, DOI 10.1007/978-981-10-0637-1_3 98 3 Smith-Type Iterative Approaches   n  2 x = |xi | . i=1

For a matrix A, the notations ρ (A) and A are used to denote its spectral radius and Frobenius norm, respectively. For an arbitrary real a, the operation a denotes the greatest integer number not to exceed a, that is, a = a + p with 0 ≤ p < 1 and a ∈ Z.

3.1 Infinite Series Form of the Unique Solution

For the discussion on the con-Kalman-Yakubovich matrix equation (3.1), it is nec- essary to introduce a notation. For a complex matrix C and an integer number k,it is defined that C∗k = C∗(k−1) with C∗0 = C. With this definition, it is obvious that  C, for even k C∗k = . C, for odd k

For such an operation, one can obtain for integers k and l

∗l C∗k = C∗(k+l).(3.2)

In this section, it is assumed that ρ(AA)<1 and ρ(BB)<1 for the con-Kalman- Yakubovich matrix equation (3.1). Now let us proceed to investigate the con-Kalman-Yakubovich matrix equation (3.1). Let X1 = X − C. Then the matrix equation (3.1) can be equivalently rewritten as X1 − AX1 B = ACB. (3.3)

Let X2 = X1 − ACB. Then, the matrix equation (3.3) is equivalent to

X2 − AX2 B = AAC BB.(3.4)

Let X3 = X2 − AAC BB, one can equivalently rewrite (3.4)as

X3 − AX3 B = AAACBBB.

Denote

 k  k k  k  2 k−2  ∗k k−2  2 Xk+1 = Xk − AA A 2 C B 2 BB , X0 = X. (3.5) 3.1 Infinite Series Form of the Unique Solution 99

Then, by repeating the previous process one has the following equation with respect to Xk+1

Xk+1 − AXk+1 B + +  k 1  k+1 k+1  k 1  2 k+1−2  ∗(k+1) k+1−2  2 = AA A 2 C B 2 BB .

In view that ρ(AA)<1 and ρ(BB)<1, one can conjecture from the iteration (3.5) that the unique solution to the con-Kalman-Yakubovich matrix equation (3.1) can be given by ∞  k  k k  k  2 k−2  ∗k k−2  2 Xexact = AA A 2 C B 2 BB . (3.6) k=0

Before giving the formal result, one needs two operations on complex matrices. −→ ←− Definition 3.1 For A ∈ Cn×n, and k ∈ Z, the operations A k and A k are respec- tively defined as

−→  k  k k 2 k−2  A = AA A 2 , ←− k  k  k k−2  2 A = A 2 AA .

In Definition 3.1, the matrix A is required to be nonsingular if k is a negative −→ integer. For the sake of convenience, A k is referred to as the k-th right alternating ←− power of A, and A k as the k-th left alternating power of A. According to Definition 3.1, it is obvious that

−→ ←− A 1 = A 1 = A; −→ ←− A 2 = AA, A 2 = AA; −→ ←− A 3 = A 3 = AAA; −→ ←− A−1 = A−1 = A−1; −→ ←− −→ 2 ←− 2 A−2 = A−1 A−1 = A−1 , A−2 = A−1 A−1 = A−1 ; −→ ←− −→ 3 3 ←− A−3 = A−1 A−1 A−1 = A−1 = A−1 = A−3.

From these expressions, one may conjecture that for nonsingular A there hold

−→ ←− −→ k ←− k A−k = A−1 , A−k = A−1 (3.7) for k ∈ Z. It is easily proven that this conjecture is true. This is the following lemma. 100 3 Smith-Type Iterative Approaches

Lemma 3.1 For a nonsingular A, and k ∈ Z, the expressions in (3.7) hold. Proof When k is even, there hold

−k k k k =− , = . 2 2 2 2 Thus, one has −→ − −  k  −k k −k 2 −k−2  2 A = AA A 2 = AA

 k  k −1 2 2 = A A−1 = A−1 A−1 −→  k  k 2 k−2  k = A−1 A−1 A−1 2 = A−1 .

When k is odd, there hold

−k −k − 1 k k − 1 = , = . 2 2 2 2

Thus, one has

−→ − − −  k  −k k 1 −k 2 −k−2  2 A = AA A 2 = AA A  k+1  k−1  −1 2 −1 2 −1 = A A−1 A = A A−1 A A−1 A

 k−1 = A−1 A−1 2 A−1 −→  k  k 2 k−2  k = A−1 A−1 A−1 2 = A−1 .

The preceding two facts imply that the first relation in (3.7) holds. The second expression can also be proven by the similar method.  Besides the properties given in Lemma 3.1, the two operations in Definition 3.1 have also many other nice properties as stated below. Lemma 3.2 For A ∈ Cm×m ,k,l∈ Z, the following relations hold: −→ ←− k −→ k ←− (1) A = A k ; A = A k ; −→ −−→ ←−− l −→ l ←− l (2) A 0 = I; A2l+1 = A2l+1 = A AA ;A2l = AA ;A2l = AA ; −→ ←− −→ ←− (3) For odd k, A k = A k ; for even k, A k = A k ; −→ −−→ ←− ←−− k = k+1 k = k+1 (4) AA A ; A A A  ; −→ −→ ∗l −→ ←− ∗l ←− ←− (5) A l A k = Ak+l ; A k A l = Al+k ; −→ ←− −−→ k −−−−−→ ←−− k ←−−−−− (6) A2l+1 = Ak(2l+1); A2l+1 = Ak(2l+1). 3.1 Infinite Series Form of the Unique Solution 101

In the above items, when k or l is negative, the matrix A is required to be invertible.

Proof Only the proofs of Items (4) and (5) are given. Proof of Item (4): When k is odd, one has

− + + k = k 1, k 1 = k 1. 2 2 2 2

Thus one has   k k−1 k+1 −→   k 2 k−2  2 2 AA k = A AA A 2 = A AA A = AA

+ −−→  k 1  k+1 2 k+1−2  k+1 = AA A 2 = A .

When k is even, there holds

+ k = k 1 = k . 2 2 2

Thus, one also has   k k k −→   k 2 k−2  2 2 AA k = A AA A 2 = A AA = AA A

+ −−→  k 1  k+1 2 k+1−2  k+1 = AA A 2 = A .

−→ −−→ The above two relations imply AA k = Ak+1. By applying the same method, the second expression can be easily proven. Proof of Item (5): First, let us investigate the case where l ≥ 0. The following expression needs to be proven  ←− ∗l ←− ←− A k A l = Al+k (3.8) by induction principle for l. It is obvious that this relation holds for l = 0. Now it is assumed that the relation (3.8) holds for l = t, that is  ←− ∗t ←− ←− A k A t = At+k .

With this assumption, by applying Item (4) one has  ←− ∗(t+1) ←− A k At+1 102 3 Smith-Type Iterative Approaches     ←− ∗t ←− ←− ∗t ←− = A k A t A = A k A t A

←− ←−−−− = At+k A = At+k+1.

This implies that the relation (3.8) holds for l = t + 1. The above argument reveals that (3.8) holds for any integer l ≥ 0. Now, let us consider the case where l < 0. In this case, one can set l =−n, n > 0. For this case, one needs to show  ←− ∗n ←− ←−−− A k A−n = A−n+k . (3.9)

This assertion will be proven by induction principle for n. When n = 1, by using Item (4) one has   ←− ∗n ←− ←− ←− ←−− ←−− A k A−n = A k A−1 = Ak−1 A A−1 = Ak−1.

Thus, (3.9) holds for n = 1. Now, it is assumed that (3.9) holds for n = t ≥ 1, that is,  ←− ∗t ←− ←−−− A k A−t = A−t+k .

With this assumption, by using Item (4) one has  ←− ∗(t+1) ←−−−− A k A−(t+1)      ←− ∗t ←− ←− ∗t ←− = A k A−t A−1 = A k A−t A−1

←−−− ←−−−−− ←−−−−−−− = A−t+k A−1 = A−t−1+k = A−(t+1)+k .

This implies that (3.9) holds for n = t + 1. By mathematical induction principle, the relation (3.9) holds for any integer n > 0. With the above two aspects, it is known that  ←− ∗l ←− ←− A k A l = Al+k holds for any integers k and l. The first relation in Item (5) can be proven along the same line.  By using Lemma 3.2, for an invertible square complex matrix A and an integer l, one can obtain the following interesting results:   ←− ∗l ←− −→ −→ ∗l A l A−l = I ; A−l A l = I . 3.1 Infinite Series Form of the Unique Solution 103

The properties in Lemma 3.2 will play a vital role in the development of the iterative algorithms for solving the con-Kalman-Yakubovich matrix equation (3.1) in the next sections. By applying the notation in Definition 3.1, the main result of this section can be stated as the following theorem.

Theorem 3.1 Given matrices A ∈ Cm×m ,B∈ Cp×p, and C ∈ Cm×p with ρ(AA)< 1 and ρ(BB)<1, the unique solution X = Xexact to the con-Kalman-Yakubovich matrix equation (3.1) can be expressed as (3.6), or equivalently,

∞  −→ ←− k ∗k k Xexact = A C B . (3.10) k=0

Proof By applying the expression (3.10) and Lemma 3.2, one has

AXexact B  ∞ ∞  −→ ←−  −→ ←− = A A k C∗k B k B = AA k C∗(k+1) B k B k=0 k=0 ∞ ∞  −−→ ←−−  −→ ←− = Ak+1C∗(k+1) Bk+1 = A k C∗k B k . k=0 k=1

In addition, it is easily known that the expression Xexact in (3.10) can be rewritten as

∞ ∞  −→ ←−  −→ ←− k ∗k k k ∗k k Xexact = A C B = C + A C B . k=0 k=1

With the above two expressions, one easily knows that X = Xexact satisfies the con-Kalman-Yakubovich matrix equation (3.1). The proof is thus completed. 

3.2 Smith Iterations

From this section, some iterative algorithms will be developed to solve the con- Kalman-Yakubovich matrix equation (3.1). The first iterative algorithm is the so- called Smith iteration.

Algorithm 3.1 For the matrix equation (3.1), the Smith iteration is in the form of

X(k + 1) = AX(k)B + C. (3.11)

This algorithm is a generalization of the Smith iteration in [18] for the Kalman- Yakubovich matrix equation X − AXB = C. It possesses the following nice prop- erties. 104 3 Smith-Type Iterative Approaches

Theorem 3.2 Given A ∈ Cm×m ,B∈ Cp×p, and C ∈ Cm×p with ρ(AA)<1 and ρ(BB)<1, for an arbitrary initial value X(0), the Smith iteration (3.11) converges to the exact solution (3.10) of the con-Kalman-Yakubovich matrix equation (3.1). Further, denote the iteration error E(k) = Xexact − X (k) where Xexact is given by (3.10). Then E(k) satisfies

−→ ←− k ∗k k E(k) = A (Xexact − X(0)) B .

Proof For the iteration (3.11), it is easy to obtain for the initial value X (0)

− −→ ←− k 1 −→ ←− X(k) = A k X ∗k (0)B k + A i C∗i B i , k ≥ 1. (3.12) i=0

In view that ρ(AA)<1 and ρ(BB)<1, one has

−→ ←− lim A k X ∗k (0)B k = 0. k→∞

Combining this relation with Theorem 3.1 gives the first conclusion. In addition, from (3.10) and (3.12), by applying some properties in Lemma 3.2 one has

∞  −→ ←− −→ ←− E(k) = A i C∗i B i − A k X ∗k (0)B k i=k ∞  −→ ←− −→ ←− = Ai+kC∗(i+k) Bi+k − A k X ∗k (0)B k i=0 ∞  −→ −→∗ ←−∗ ←− −→ ←− k ∗k k = A k A i C∗i B i B k − A k X ∗k (0)B k = i 0   ∞ ∗k −→  −→ ←− ←− −→ ←− = A k A i C∗i B i B k − A k X ∗k (0)B k i=0 −→ ←− −→ ←− k ∗k k k ∗k k = A (Xexact) B − A X (0)B .

This is the second conclusion. 

From the expression Ek in the above theorem, it is known that

lim E(k) = 0 k→∞ under the conditions of ρ(AA)<1 and ρ(BB)<1. Thus, the sequence generated by the Smith iterative algorithm (3.11) converges to the exact solution of the con- Kalman-Yakubovich matrix equation (3.1) with ρ(AA)<1 and ρ(BB)<1. 3.3 Smith (l) Iterations 105

3.3 Smith (l) Iterations

The second iterative method proposed in this chapter is the so-called Smith (l) iteration. It is a generalization of the Smith (l) iteration in [200, 218] for the Kalman- Yakubovich matrix equation. In this method, l shifts are used in a cyclic manner.

Algorithm 3.2 (Smith (l) iteration) For the matrix equation (3.1), this iteration is in the form of − −→ ←− l 1 −→ ←− X(k + 1) = A l X ∗l (k)B l + A i C∗i B i , (3.13) i=0 where l ≥ 1 is a prescribed positive integer.

It is obvious that the Smith iteration in (3.11) is a special case of the Smith (l) iteration with l = 1. The next theorem gives some properties of the Smith (l) iteration.

Theorem 3.3 Given A ∈ Cm×m ,B∈ Cp×p, and C ∈ Cm×p with ρ(AA)<1 and ρ(BB)<1, for an arbitrary initial value X (0), the Smith (l) iteration (3.13) converges to the exact solution Xexact given in (3.10) of the con-Kalman-Yakubovich matrix equation (3.1). Further, denote E(k) = Xexact − X (k) where Xexact is given by (3.10). Then E(k) satisfies

−→ ←− kl ∗(kl) kl E(k) = A (Xexact − X(0)) B . (3.14)

Proof For the given initial value X(0), by applying (3.2) and Item (5) of Lemma 3.2 one easily has for the iteration (3.13)

− −→ ←− l 1 −→ ←− X(1) = A l X ∗l (0)B l + A i C∗i B i , i=0 and

− −→ ←− l 1 −→ ←− X(2) = A l X ∗l (1)B l + A i C∗i B i i=0   − ∗l − −→ −→ ←− l 1 −→ ←− ←− l 1 −→ ←− = A l A l X ∗l (0)B l + A i C∗i B i B l + A i C∗i B i i=0 i=0 − − −→ ←− l 1 −→ ←− l 1 −→ ←− = A 2l X ∗(2l)(0)B 2l + Ai+l C∗(i+l) Bi+l + A i C∗i B i i=0 i=0 − −→ ←− 2l 1 −→ ←− = A 2l X ∗(2l)(0)B 2l + A i C∗i B i . i=0 106 3 Smith-Type Iterative Approaches

With the above two relations, one can conjecture that

− −→ ←− kl1 −→ ←− X(k) = A kl X ∗(kl)(0)B kl + A i C∗i B i . (3.15) i=0

In fact, this can be proven by induction principle. It is obvious that the relation (3.15) holds for k = 1. Now it is assumed that the relation (3.15) holds for k = n ≥ 1, that is, − −→ ←− nl1 −→ ←− X(n) = A nl X ∗(nl)(0)B nl + A i C∗i B i . i=0

From this relation and the iteration (3.13), one has   − ∗l − −→ −→ ←− nl1 −→ ←− ←− l 1 −→ ←− X(n + 1) = A l A nl X ∗(nl)(0)B nl + A i C∗i B i B l + A i C∗i B i i=0 i=0 − − −−−−→ ←−−−− nl1 −→ ←− l 1 −→ ←− = A(n+1)l X ∗(nl+l)(0)B(n+1)l + Ai+l C∗(i+l) Bi+l + A i C∗i B i i=0 i=0 ( + ) − −−−−→ ←−−−− n 1 l 1 −→ ←− = A(n+1)l X ∗(nl+l)(0)B(n+1)l + A i C∗i B i . i=0

This implies that (3.15) holds for k = n + 1. The above argument reveals that (3.15) holds for all k ≥ 1. In view that ρ(AA)<1 and ρ(BB)<1, it follows from (3.15) that the first conclusion of the theorem holds. In addition, from (3.10) and (3.15), by applying some properties in Lemma 3.2 one has

∞  −→ ←− −→ ←− E(k) = A i C∗i B i − A kl X ∗(kl)(0)B kl i=kl ∞  −−→ ←−− −→ ←− = Ai+klC∗(i+kl) Bi+kl − A kl X ∗(kl)(0)B kl i=0 ∞  −→ −→∗( ) ←−∗( ) ←− −→ ←− kl ∗(kl) kl = A kl A i C∗i B i B kl − A kl X ∗(kl)(0)B kl = i 0   ∞ ∗(kl) −→  −→ ←− ←− −→ ←− = A kl A i C∗i B i B kl − A kl X ∗(kl)(0)B kl i=0 −→ ←− −→ ←− kl ∗(kl) kl kl ∗(kl) kl = A (Xexact) B − A X (0)B .

The proof is thus completed.  3.3 Smith (l) Iterations 107

From the expression Ek in the above theorem, it is easily known that the sequence {Xk } generated by the Smith (l) algorithm (3.13) converges to the exact solution of the con-Kalman-Yakubovich matrix equation (3.1)ifρ(AA)<1 and ρ(BB)<1. When l takes odd or even numbers, one can obtain the following two special iterations of the Smith (l) iteration for the con-Kalman-Yakubovich matrix equation (3.1). Moreover, they both converge to the exact solution of the equation (3.1) for any initial value X (0) under the condition of ρ(AA)<1 and ρ(BB)<1. Algorithm 3.3 (Odd Smith (l) iteration) For the con-Kalman-Yakubovich matrix equation (3.1), this iterative algorithm is in the form of

2l −→ ←− l l i ∗i i Xk+1 = AA AX(k)B BB + A C B . (3.16) i=0

Obviously, the Smith iteration (3.11) is a special case of the odd Smith (l) iteration (3.16) with l = 0. Algorithm 3.4 (Even Smith (l) iteration) For the con-Kalman-Yakubovich matrix equation (3.1), this iterative method has the form of

− 2l 1 −→ ←− l l X(k + 1) = AA X(k) BB + A i C∗i B i . (3.17) i=0

For further investigation on the convergence of the proposed iterative algorithms, let us consider their convergence orders, which describe the rate at which successive iterations become close to each other. The following definition on convergence order is a generalization of the case of scalars. Definition 3.2 Suppose that the matrix iteration X (k + 1) = f (X (k)) converges to the exact solution X = Xexact of the matrix equation X = f (X), and let E(k) = Xexact − X(k) be the iteration error at step k. If there exist positive constants c = 0 and p such that  ( + ) E k 1 ≤ lim p c, k→∞ E(k) then the iteration is said to converge to Xexact with convergence order p. The number c is called the asymptotic error constant. If p = 1, the iteration is said to be linearly convergent; if p > 1, the iteration is said to be superlinearly convergent; if p = 2, the iteration is said to be quadratically convergent. According to (3.14), one has

−−−−→ ←−−−− E(k + 1) = A(k+1)l (X − X(0))∗(kl+l) B(k+1)l exact  −→ −→ ∗l ∗ ←− ∗l ←− l kl ∗(kl) l kl l = A A (Xexact − X(0)) B B −→ ←− = A l (E(k))∗l B l . 108 3 Smith-Type Iterative Approaches

It follows from this relation that      −→  ←− E(k + 1) ≤ A l  B l  E(k) ,fork ≥ 0.

This implies that the Smith (l) iteration is linearly convergent.

Remark 3.1 It follows from the iteration error formula (3.14) that the Smith (l) algorithm has faster convergence speed with greater l. However, as seen in Sect. 3.5, the speed of convergence is hardly improved by a further increase of l.Inview that the computational load at each step will be heavier with the increase of l,itis suggested that one should adopt a Smith (l) algorithm with moderate l to solve the con-Kalman-Yakubovich matrix equation.

3.4 Smith Accelerative Iterations

The iteration algorithms proposed in the previous sections are linearly convergent. To improve the speed of convergence, some accelerative iterations are given in this section.

Algorithm 3.5 (Squared Smith iteration) For the con-Kalman-Yakubovich matrix equation (3.1), this iteration has the form ⎧ ⎨ X(k + 1) = A(k)X(k)B(k) + X (k), ( ) = 2( − ) ( ) = 2( − ) ⎩ A k A k 1 , B k B k 1 . (3.18) X(0) = C + ACB, A(0) = AA, B(0) = BB

This algorithm is a generalization of the squared Smith iteration in [217, 310] for the Kalman-Yakubovich matrix equation X − AXB = C. Different from the iterations in [217, 310], in the iteration (3.18) the initial values of X (0), A(0), and B(0) are, respectively, C + ACB, AA and BB, instead of C, A, and B.

Theorem 3.4 On the iteration algorithm in (3.18) to solve the con-Kalman- Yakubovich matrix equation (3.1) with ρ(AA)<1 and ρ(BB)<1, there holds

k+1− 21 −→ ←− X(k) = A i C∗i B i . (3.19) i=0

Further, denote E(k) = Xexact − X(k) where Xexact is given by (3.10). Then E(k) satisfies −−→ ←−− 2k+1 2k+1 E(k) = A Xexact B . (3.20) 3.4 Smith Accelerative Iterations 109

Proof The first conclusion will be proven by induction principle. Obviously, the iteration (3.19) holds for k = 0. Now, it is assumed that (3.19) holds for k = n, that is, n+1− 21 −→ ←− X(n) = A i C∗i B i . (3.21) i=0

From (3.18), one has

2k 2k A(k) = AA , B(k) = BB , k ≥ 0. (3.22)

Then, by using (3.18), (3.21), (3.22), and Lemma 3.2 it can be derived that ⎛ ⎞ n+1− n+1− 21 −→ ←− 21 −→ ←− 2n 2n X(n + 1) = AA ⎝ A i C∗i B i ⎠ BB + A i C∗i B i i=0 i=0 ⎛ ⎞ + ∗2n 1 n+1− n+1− −−→ 21 −→ ←− ←−− 21 −→ ←− n+1 n+1 = A2 ⎝ A i C∗i B i ⎠ B2 + A i C∗i B i i=0 i=0

n+1− n+1− 21 −−−−→ ←−−−− 21 −→ ←− n+1 n+1 n+1 = A2 +i C∗(2 +i) B2 +i + A i C∗i B i i=0 i=0 n+2− n+1− 21 −→ ←− 21 −→ ←− = A i C∗i B i + A i C∗i B i i=2n+1 i=0 n+2− 21 −→ ←− = A i C∗i B i . i=0

This relation implies that the expression (3.19) holds for k = n + 1. By induction principle, the preceding facts imply the first conclusion of this theorem. In addition, by using some properties in Lemma 3.2, one can obtain the following relation from (3.10) and (3.19)

∞  −→ ←− E(k) = A i C∗i B i i=2k+1 ∞ −−−−→ ←−−−− + k+1 ∗( + k+1) + k+1 = Ai 2 C i 2 Bi 2 i=0 ∞  +  +  −−→ −→ ∗( k 1) + ←− ∗( k 1) ←−− k+1 2 ∗2k 1 2 k+1 = A2 A i C∗i B i B2 i=0 110 3 Smith-Type Iterative Approaches

∞  −−→ −→ ←− ←−− k+1 k+1 = A2 A i C∗i B i B2 i=0   ∞ −−→  −→ ←− ←−− k+1 k+1 = A2 A i C∗i B i B2 . i=0

This is the second conclusion. The proof is thus completed. 

From the expression (3.20) in the above theorem, it is known that

lim E(k) = 0 k→∞ under the conditions of ρ(AA)<1 and ρ(BB)<1. Thus, the sequence generated by the squared Smith iteration (3.18) converges to the exact solution of the con- Kalman-Yakubovich matrix equation (3.1) with ρ(AA)<1 and ρ(BB)<1. The following two iterations for the con-Kalman-Yakubovich matrix equation (3.1) are the generalization of the r-Smith iteration in [310] for the Kalman- Yakubovich matrix equation X − AXB = C. Due to the limitation of operations on complex matrices, the cases for odd r and even r cannot be written in a unified framework.

Algorithm 3.6 (r-even Smith iteration)Letr ≥ 1, for the matrix equation (3.1)this iteration is in the form ⎧ − ⎪ r 1 ⎨⎪ X(k + 1) = Ai (k)X(k)Bi (k), k ≥ 0, i=0 (3.23) ⎪ ( + ) = r ( ) ( + ) = r ( ) ≥ ⎩⎪ A k 1 A k , B k 1 B k , k 0, X(0) = C + ACB, A(0) = AA, B(0) = BB.

It is obvious that the squared Smith iteration in (3.18) is a special case of the r-even Smith iteration with r = 2.

Theorem 3.5 For the r-even Smith iteration (3.23) of the matrix equation (3.1) with ρ(AA)<1 and ρ(BB)<1, there holds

k − 2r 1 −→ ←− X(k) = A i C∗i B i . (3.24) i=0

Further, denote E(k) = Xexact − X(k) where Xexact and X(k) are given by (3.10) and (3.24), respectively. Then E(k) satisfies

−→ ←− 2r k 2r k E(k) = A Xexact B . (3.25) 3.4 Smith Accelerative Iterations 111

Proof The first conclusion is proven by induction principle. It is obvious that (3.24) is true for k = 0. From (3.23), one has

r k r k A(k) = AA , B(k) = BB . (3.26)

Now it is assumed that (3.24) holds for k = n, that is,

n − 2r 1 −→ ←− X(n) = A i C∗i B i . i=0

With this assumption, by using (3.23), (3.26), and Lemma 3.2 one has   − n − r 1  2r 1 −→ ←−  r n j r n j X(n + 1) = AA A i C∗i B i BB j=0 i=0   n − n − 2r 1 −→ ←− 2r 1 −→ ←− r n r n = A i C∗i B i + AA A i C∗i B i BB +··· = = i 0  i 0  n − 2r 1 −→ ←− (r−2)r n (r−2)r n + AA A i C∗i B i BB =  i 0  n − 2r 1 −→ ←− (r−1)r n (r−1)r n + AA A i C∗i B i BB i=0   n n − n − ∗2r 2r 1 −→ ←− −→ 2r 1 −→ ←− ←− n n = A i C∗i B i + A2r A i C∗i B i B2r +··· i=0 i=0   n n − ∗(2(r−2)r ) −−−−−−→ 2r 1 −→ ←− ←−−−−−− n n + A2(r−2)r A i C∗i B i B2(r−2)r i=0   n n − ∗(2(r−1)r ) −−−−−−→ 2r 1 −→ ←− ←−−−−−− n n + A2(r−1)r A i C∗i B i B2(r−1)r i=0 n − n − 2r 1 −→ ←− 2r 1 −−−→ ←−−− n n = A i C∗i B i + Ai+2r C∗i Bi+2r +··· i=0 i=0 n − n − 2r 1 −−−−−−−−→ ←−−−−−−−− 2r 1 −−−−−−−−→ ←−−−−−−−− n n n n + Ai+2(r−2)r C∗i Bi+2(r−2)r + Ai+2(r−1)r C∗i Bi+2(r−1)r i=0 i=0 n − × n − 2r 1 −→ ←− 2 2r 1 −→ ←− = A i C∗i B i + A i C∗i B i +··· i=0 i=2r n 112 3 Smith-Type Iterative Approaches

( − ) n − × n − 2 r 1 r 1 −→ ←− 2rr 1 −→ ←− + A i C∗i B i + A i C∗i B i i=2(r−2)r n i=2(r−1)r n n+1− 2r 1 −→ ←− = A i C∗i B i . i=0

This relation implies that the expression (3.24) holds for k = n + 1. The proof of the first conclusion is thus completed by the induction principle. The second conclusion can be proven by adopting the same method as in the previous theorem. 

It can be easily derived from (3.25) that the sequence generated by the algorithm (3.23) converges to the exact solution of the con-Kalman-Yakubovich matrix equation (3.1).

Algorithm 3.7 (r-odd Smith iteration)Letr ≥ 1, for the matrix equation (3.1)this iterative algorithm is in the following form ⎧ 2r ⎪ −→ ←− ⎨⎪ X(k + 1) = A i (k)X ∗i (k)B i (k), k ≥ 0, = i −−−0 → ←−−− (3.27) ⎪ 2r+1 2r+1 ⎩⎪ A(k + 1) = A (k), B(k + 1) = B (k), k ≥ 0, X(0) = C, A(0) = A, B(0) = B.

For this iterative algorithm, one has the following theorem on the expressions of general terms and the iteration error.

Theorem 3.6 For the r-odd Smith iteration (3.23) of the matrix equation (3.1) with ρ(AA)<1 and ρ(BB)<1, there holds

( + )k − 2r1 1 −→ ←− X(k) = A i C∗i B i ,k≥ 0. (3.28) i=0

Further, denote E(k) = Xexact − X(k) where Xexact and X(k) are given by (3.10) and (3.28), respectively. Then E(k) satisfies

−−−−−→ ←−−−−− (2r+1)k (2r+1)k E(k) = A Xexact B . (3.29)

Proof The first conclusion is proven by induction principle. It is obvious that (3.28) holds for k = 0. By simple computations it is known from (3.27) that

−−−−−→ ←−−−−− k k A(k) = A(2r+1) , B(k) = A(2r+1) . (3.30) 3.4 Smith Accelerative Iterations 113

Now it is assumed that the relation (3.28) holds for k = n, that is,

( + )n − 2r1 1 −→ ←− X(n) = A i C∗i B i , k ≥ 0. i=0

With this assumption, by (3.27), (3.30), and Lemma 3.2 one has   −→ ( + )n − ∗ j ←− 2r −−−−−→ 2r1 1 −→ ←− ←−−−−− n j n j X (n + 1) = A(2r+1) A i C∗i B i B(2r+1) j=0 i=0   n ( + )n − ∗( j(2r+1) ) 2r −−−−−−→ 2r1 1 −→ ←− ←−−−−−− n n = A j(2r+1) A i C∗i B i B j(2r+1) j=0 i=0 ( + )n − 2r 2r1 1 −−−−−−−−→ ←−−−−−−−− n n n = Ai+ j(2r+1) C∗(i+ j(2r+1) ) Bi+ j(2r+1) j=0 i=0 ( + )n − ( + )n − 2r1 1 −→ ←− 2r1 1 −−−−−−−→ ←−−−−−−− n n n = A i C∗i B i + Ai+(2r+1) C∗(i+(2r+1) ) Bi+(2r+1) +··· i=0 i=0 ( + )n − 2r1 1 −−−−−−−−−−−−→ ←−−−−−−−−−−−− n n n + Ai+(2r−1)(2r+1) C∗(i+(2r−1)(2r+1) ) Bi+(2r−1)(2r+1) i=0 ( + )n − 2r1 1 −−−−−−−−−→ ←−−−−−−−−− n n n + Ai+2r(2r+1) C∗(i+2r(2r+1) ) Bi+2r(2r+1) i=0 ( + )n − ( + )n − 2r1 1 −→ ←− 2 2r1 1 −→ ←− = A i C∗i B i + A i C∗i B i +··· i=0 i=(2r+1)n ( + )n − ( + )( + )n − 2r 2r1 1 −→ ←− 2r 1 2r 1 1 −→ ←− + A i C∗i B i + A i C∗i B i i=(2r−1)(2r+1)n i=2r(2r+1)n ( + )n+1− 2r 1 1 −→ ←− = A i C∗i B i . i=0

This relation implies that the expression (3.28) holds for k = n + 1. The proof of the first conclusion is thus completed by the induction principle. The second conclusion can be easily obtained by the same method in the previous theorem. 

Similarly, it is easy to know from (3.29) that the iteration (3.27) converges to the exact solution of the con-Kalman-Yakubovich matrix equation (3.1). From the expressions (3.25) and (3.29) of the iteration errors, it is easily known that the con- vergence orders of r-even and r-odd Smith iterations are r and 2r + 1, respectively. 114 3 Smith-Type Iterative Approaches

Remark 3.2 The convergence speed of the r-even and r-odd Smith algorithms is faster than that of the Smith (l) iteration. In addition, the convergence speed of r- even and r-odd Smith algorithms is hardly improved by a further increase of r.In view of the computational loads, it suffices to adopt 2-even or 1-odd Smith algorithm to solve the con-Kalman-Yakubovich matrix equations.

At the end of this section, let us extend the idea in Algorithm 3.7 to the Kalman- Yakubovich matrix equation X − AXB = C (3.31) where A and B are two known square complex matrices with ρ(A)<1 and ρ(B)< 1, C is a matrix with appropriate dimensions, and X is the matrix to be determined. For the Kalman-Yakubovich matrix equation (3.31), the following so-called (m, r)- Smith iteration is proposed.

Algorithm 3.8 ((m,r)-Smith iteration) For the Kalman-Yakubovich matrix equation with ρ(A)<1 and ρ(B)<1, this iteration has the following form ⎧ − ⎪ r 1 ⎪ X(k + 1) = Ai (k)X(k)Bi (k), k ≥ 0, ⎪ ⎨ i=0 ( + ) = r ( ) ( + ) = r ( ) ≥ ⎪ A k 1 A k , B k 1 B k , k 0, (3.32) ⎪ m−1 ⎪ ⎩⎪ X(0) = Ai CBi , A(0) = Am , B(0) = Bm. i=0

The r-Smith iteration in [310] is a special case of this (m, r) -Smith iteration with m = 1. The following theorem provides the expression of the general terms in the iterative algorithm (3.32).

Theorem 3.7 Given two square complex matrices A and B, and a matrix C with appropriate dimensions, the iteration in (3.32) satisfies

mrk −1 X(k) = Ai CBi ,k≥ 0. (3.33) i=0

Proof The conclusion is proven by induction. It is obvious that (3.33) holds for k = 0. Now, it is assumed that it holds for k = n, that is,

mrn −1 X(n) = Ai CBi . i=0

With this assumption, by using (3.32) one has 3.4 Smith Accelerative Iterations 115   r−1 mrn −1 n i n i X(n + 1) = Amr Ai CBi Bmr = = i 0 i 0   mrn −1 mrn −1 n n = Ai CBi + Amr Ai CBi Bmr = = i 0  i 0  mrn −1 n n +···+ Amr (r−1) Ai CBi Bmr (r−1) i=0 mrn −1 mrn −1 n n = Ai CBi + Ai+mr CBi+mr i=0 i=0 mrn −1 n n +···+ Ai+mr (r−1)CBi+mr (r−1) i=0 mrn −1 2mrn −1 mrn+1−1 = Ai CBi + Ai CBi +···+ Ai CBi i=0 i=mrn i=mrn (r−1) mrn+1−1 = Ai CBi . i=0

This implies that the relation (3.33) holds for k = n + 1. So the conclusion is true by induction principle. 

It has been known in literature that the exact solution X = Xexact of the Kalman- Yakubovich matrix equation (3.31) with ρ(A)<1 and ρ(B)<1 can be expressed as ∞ k k Xexact = A CB . k=0

Since ρ(A)<1 and ρ(B)<1, it is known from (3.33) that the sequence generated by the (m, r)-Smith iteration (3.32) converges to the exact solution to the Kalman- Yakubovich matrix equation (3.31).

3.5 An Illustrative Example

Consider a con-Kalman-Yakubovich matrix equation in the form of (3.1) with ⎡ ⎤ 0.5722 −0.4511 0.1378 0.0444 ⎢ −0.0722 0.5756 0.2200 −0.1556 ⎥ A = ⎢ ⎥ + ⎣ −0.0556 0.2356 0.4967 0.1444 ⎦ −0.1333 −0.0622 0.2378 0.5556 116 3 Smith-Type Iterative Approaches ⎡ ⎤ 0.6778 −0.3867 −0.0089 −0.1000 ⎢ 0.0833 −0.0344 0.0156 0.0111 ⎥ i ⎢ ⎥ , ⎣ −0.2000 0.2622 0.8289 −0.2556 ⎦ −0.0889 1.1022 0.1800 0.5278 ⎡ ⎤ 0.3000 + 0.4000i 0.5400 + 0.0200i 0.2400 + 0.3200i B = ⎣ −0.1000 0.6800 + 0.6400i 0.0800 + 0.0400i ⎦ , −0.3000 − 0.3000i0.5200 − 0.0400i 0.8200 + 0.9600i and ⎡ ⎤ −0.5077 + 0.0355i 2.7424 − 2.2192i 3.1102 − 0.0326i ⎢ 0.5684 − 0.1738i −0.2582 + 0.0601i 1.1381 + 0.1410i ⎥ C = ⎢ ⎥ . ⎣ 1.4868 − 0.1272i −0.1854 + 0.6221i −1.4143 + 1.2116i ⎦ −0.6345 + 4.3033i −0.3772 + 0.0595i −4.1568 + 1.1290i

The exact solution to this matrix equation is ⎡ ⎤ 1 + i1− i −2i ⎢ 002+ i ⎥ X = ⎢ ⎥ . exact ⎣ 10 −1 ⎦ 4i 2 −1 + 3i

For comparison, when the Smith (l) algorithm is applied to this matrix equation, two cases of X(0) = C + ACBand X(0) = C are considered. Define the iteration error as δ(k) = X(k) − Xexact. Shown in Figs.3.1 and 3.2 are the iteration errors for different l when X(0) = C + ACBand X(0) = C, respectively. It can be seen from these two figures that the convergence speed is more rapid with greater l. However, the convergence speed is hardly improved by increasing l when l ≥ 4. In addition, the iteration errors of the r-even Smith algorithm are shown in Fig.3.3. For comparison, the iteration errors of Smith (l) algorithm with l = 6 are also shown in Fig. 3.3.Itis easily seen that the convergence speed of r-even Smith algorithms is faster than that of Smith (l) algorithm. Similarly to the Smith (l) algorithm, the convergence speed of r-even Smith algorithms is hardly improved by further increase of r. The iteration errors of r-odd Smith algorithm are shown in Fig. 3.4.

3.6 Notes and References

This chapter gives some Smith-type iterative algorithms for solving the con-Kalman- Yakubovich matrix equation X − AXB = C. The proposed algorithms can be viewed as generalizations of the Smith-type algorithms for the Kalman-Yakubovich matrix equation. Regarding the Smith-type iterative algorithms, one can refer to relevant literature. In [18], the presented Smith iteration for the Kalman-Yakubovich matrix 3.6 Notes and References 117

10

9

8 l=1 7 l=2 6 l=3

δ 5

4 l=4 l=5 3 l=6 2

1

0 0 10 20 30 40 50 60 70 80 Iteration Steps

Fig. 3.1 Iteration errors with Smith (l) algorithms when X (0) = C + ACB

10

9

8 l=1

7 l=2 6 l=3

δ 5 l=4 l=5 4

3 l=6

2

1

0 0 10 20 30 40 50 60 70 80 Iteration Steps

Fig. 3.2 Iteration errors with Smith (l) algorithms when X (0) = C equation is in the form of X(k + 1) = AX(k)B + C with X (0) = C. In [218], the Smith (l) iteration was proposed. It was shown in [200] that a moderate increase in the number of shifts l can accelerate the convergence nicely. However, it was also observed in [134, 200] that the speed of convergence was hardly improved by a further increase of l. To improve the speed of convergence, one can adopt the so- called Smith accelerative iteration [217]. Very recently, a new Smith-type iteration named the r-Smith iteration was proposed in [310]. In addition, it was shown in [134, 217] that the normal Sylvester matrix equation can be transformed into the Kalman- Yakubovich matrix equation, and then it can be efficiently solved by some existing 118 3 Smith-Type Iterative Approaches

10 9 8 7 6

δ 5 Smith (l) Algorithm with l=6 4 r−even Smith Algorithm with r=2 3 r−even Smith Algorithm with r=3 r−even Smith Algorithm with r=4 2 1 0 0 10 20 30 40 50 60 70 80 Iteration Steps

Fig. 3.3 Iteration errors with r-even Smith Algorithms

10

9

8

7

6

δ 5

4

3 Smith (l) algorithm with l=6 r−odd Smith algorithm with r=1 2 r−odd Smith algorithm with r=2 r−odd Smith algorithm with r=3 1

0 0 10 20 30 40 50 60 70 80 Iteration Steps

Fig. 3.4 Iteration errors with r-odd Smith Algorithms approaches for the Kalman-Yakubovich matrix equation. It should be pointed out that the concept of alternating power of square matrices is proposed in the first section to investigate the Smith-type algorithms of con-Kalman-Yakubovich matrix equations. Some basic properties of this operation are also provided. Although this concept is simple, it is very useful. Chapter 4 Hierarchical-Update-Based Iterative Approaches

In Chap. 3, the con-Kalman-Yakubovich matrix equation has been investigated. Due to its special structure, a natural iterative algorithm can be easily designed. In this chapter, several more complicated con-Sylvester type matrix equations are investi- gated, and some iterative approaches are given to solve these matrix equations. The involved matrix equations include the extended con-Sylvester matrix equation, the coupled con-Sylvester matrix equation and the complex matrix equations with con- jugate and transpose of unknown matrices. In this chapter, the hierarchical principle [53, 54] is used during the derivation of the obtained iterative algorithms. By this prin- ciple, the involved matrix equation is first decomposed into some sub-equations, and then some intermediate iterative algorithms are constructed for these sub-equations. Finally, the unknown matrices in intermediate algorithms are replaced with their esti- mates. An important feature of the approaches in this chapter is that the estimation sequences of the unknown matrices are updated based on a basic iterative algorithm with the aid of hierarchical principle. Thus, the approaches in this chapter are called hierarchical-update-based iterative approaches. Besides some symbols in Chap. 3, in the sequel some more symbols are also used.   T H { } For a matrix A, the symbols A 2, A , A , and Re A are used to denote its 2-norm, transpose, conjugate transpose, and real part, respectively. For a square matrix A,the symbol tr (A) denotes its trace. The symbol “⊗” denotes the Kronecker product of two matrices. Before proceeding, the following result on a simple complex matrix equation is needed. This result is a slight modification of the Theorem 2 in [58]. For the sake of completeness, a detailed proof of this result is given.

Lemma 4.1 Given matrices A ∈ Cm×r,B∈ Cs×n, and F ∈ Cm×n, if the matrix equation AXB = F

© Springer Science+Business Media Singapore 2017 119 A.-G. Wu and Y. Zhang, Complex Conjugate Matrix Equations for Systems and Control, Communications and Control Engineering, DOI 10.1007/978-981-10-0637-1_4 120 4 Hierarchical-Update-Based Iterative Approaches has a unique solution X = X∗, then the sequence {X (k)} generated by the following iteration converges to X∗:

X (k + 1) = X(k) + μAH (F − AX (k) B) BH (4.1)

<μ< 2 for 0  2 2 . A 2 B 2 ˜ Proof Define the error matrix X (k) = X (k) − X∗. Since X = X∗ is the unique solution of the considered equation, then AX∗B = F. With this relation, it follows from (4.1) that

˜ ˜ H H X (k + 1) = X (k) + μA (AX∗B − AX (k) B) B = X˜ (k) − μAHAX˜ (k) BBH.

Let Z (k) = AX˜ (k) B. By using properties of Frobenius norm and 2-norm, Proposi- tion 2.6 and Lemma 2.11, from the preceding relation one has    2 X˜ (k + 1)   = tr X˜ H (k + 1) X˜ (k + 1)        2 = X˜ (k) − μ tr X˜ H (k) AHZ (k) BH − μ tr BZH (k) AX˜ (k)   2 +μ2 AHZ (k) BH        2 = X˜ (k) − μ tr BHX˜ H (k) AHZ (k) − μ tr ZH (k) AX˜ (k) B     2 +μ2  B ⊗ AH vec (Z (k))    2     ≤ X˜ (k) − μ tr ZH (k) Z (k) − μ tr ZH (k) Z (k)   2 +μ2 B ⊗ AH vec (Z (k))2   2   2   =  ˜ ( ) − μ − μ  ⊗ H2  ( )2 X k 2 B A 2 Z k    2   =  ˜ ( ) − μ − μ  2  2  ( )2 X k 2 A 2 B 2 Z k .

Repeating the preceding relation, it can be obtained that    2 X˜ (k + 1)    2    ≤  ˜ ( − ) − μ − μ  2  2  ( )2 +  ( − )2 X k 1 2 A 2 B 2 Z k Z k 1   k  2    ≤  ˜ ( ) − μ − μ  2  2  ( )2 . X 0 2 A 2 B 2 Z i i=0 4 Hierarchical-Update-Based Iterative Approaches 121

<μ< 2 From this relation, it can be seen that under the condition 0  2 2 there A 2 B 2 holds k       2 <μ − μ  2  2  ( )2 ≤  ˜ ( ) 0 2 A 2 B 2 Z i X 0 . i=0

Therefore, one has

∞       2 <μ − μ  2  2  ( )2 ≤  ˜ ( ) 0 2 A 2 B 2 Z i X 0 . i=0

This relation implies that limi→∞ Z (i) = 0. Since the matrix equation in ˜ this lemma has a unique solution, then it follows from Z(k) = AX (k) B that  ˜  limi→∞ X (i) = 0. The conclusion is thus proven. 

The result in the above lemma will be often used in this chapter.

4.1 Extended Con-Sylvester Matrix Equations

In this section, the class of extended con-Sylvester matrix equations is investigated.

4.1.1 The Matrix Equation AXB + CXD = F

In this subsection, the following extended con-Sylvester matrix equation is investi- gated AXB + CXD = F (4.2) where A, C ∈ Cm×r, B, D ∈ Cs×n, and F ∈ Cm×n are given known matrices, and X ∈ Cr×s is the matrix to be determined. It is required that this matrix equation has a unique solution. Obviously, the condition mn = rs is necessary. By the hierarchical identification principle, one needs to define two matrices

F1 = F − CXD, (4.3)

F2 = F − AXB.(4.4)

Then, the matrix equation (4.2) can be decomposed into the following two matrix equations

AXB = F1, (4.5)

CXD = F2. (4.6) 122 4 Hierarchical-Update-Based Iterative Approaches

According to Lemma 4.1, for the matrix equations (4.5) and (4.6) one can construct the following respective iterative forms

H H X1(k + 1) = X1(k) + μA (F1 − AX1 (k) B) B , (4.7) H   H X2(k + 1) = X2(k) + μC F2 − CX2(k)D D . (4.8)

Substituting (4.3) and (4.4)into(4.7) and (4.8), respectively, gives   H H X1(k + 1) = X1(k) + μA F − CXD − AX1 (k) B B , (4.9) H   H X2(k + 1) = X2(k) + μC F − AXB − CX2(k)D D . (4.10)

The right-hand sides of these two expressions contain the unknown matrix X,so it is impossible to implement the algorithms in (4.9) and (4.10). According to the hierarchical principle, the unknown variable X in these two expressions is replaced with its estimates Xi(k), i = 1, 2, respectively. Hence, one has   H H X1(k + 1) = X1(k) + μA F − CX1(k)D − AX1 (k) B B ,   H H X2(k + 1) = X2(k) + μC F − AX2(k)B − CX2(k)D D .

Taking the average of X1(k) and X2 (k), one can obtain an iterative algorithm   H H X1(k + 1) = X(k) + μA F − AX (k) B − CX(k)D B , (4.11)   H H X2(k + 1) = X(k) + μC F − AX(k)B − CX(k)D D , (4.12) X (k) + X (k) X(k) = 1 2 . (4.13) 2 This algorithm can be equivalently rewritten as μ   X(k + 1) = X(k) + AH F − AX (k) B − CX(k)D BH (4.14) 2  μ H H + C F − AX (k) B − CX(k)D D . 2 Theorem 4.1 If the extended con-Sylvester matrix equation (4.2) has a unique solu- tion X = X∗, then the iterative solution X(k) given by the algorithm (4.11)–(4.13), or equivalently, the algorithm (4.14), converges to X∗ if

4 0 <μ<       , (4.15)  2 B ⊗ AH + (Dσ P ) ⊗ CT P σ σ n σ m 2 4.1 Extended Con-Sylvester Matrix Equations 123 where (·)σ is the real representation of complex matrices in Sect.2.6, and Pm and Pn are defined as in (2.64).

Proof First, the following error matrices are defined ˜ Xi(k) = Xi(k) − X∗, i = 1, 2, where Xi (k), i = 1, 2, are expressed in (4.11) and (4.12), respectively. Thus, it is easily derived that ˜ ( ) + ˜ ( ) ˜ X1 k X2 k X(k) = X(k) − X∗ = . 2 Using (4.11) and (4.12), one has   ˜ ˜ H H X1(k + 1) = X(k) + μA F − AX (k) B − CX(k)D B   ˜ H H = X(k) + μA AX∗B + CX∗D − CX(k)D − AX (k) B B   = X˜ (k) − μAH AX˜ (k)B + CX˜ (k)D BH, and   ˜ ˜ H H X2(k + 1) = X(k) + μC F − AX (k) B − CX(k)D D  H H = X˜ (k) − μC AX˜ (k)B + CX˜ (k)D D .

Denote Z(k) = AX˜ (k)B + CX˜ (k)D ∈ Cm×n . (4.16)

Then, from the preceding two expressions one can obtain

1 1 H H X˜ (k + 1) = X˜ (k) − μAHZ(k)BH − μC Z(k)D . 2 2 Recall the well-known fact that tr (AB) = tr (BA). Then, by simple calculations it follows that    2 X˜ (k + 1)

= tr X˜ H(k + 1)X˜ (k + 1)      2 1 H H = X˜ (k) − μ tr X˜ H(k) AHZ(k)BH + C Z(k)D  2  1 H − μ tr BZH(k)A + DZ(k) C X˜ (k) 2 124 4 Hierarchical-Update-Based Iterative Approaches   1  H H2 + μ2 AHZ(k)BH + C Z(k)D   4   2 1 1 = X˜ (k) − μ tr BHX˜ H(k)AHZ(k) − μ tr ZH(k)AX˜ (k)B (4.17) 2 2 1 H H 1 H − μ tr D X˜ H(k)C Z(k) − μ tr Z(k) CX˜ (k)D 2  2  1  H H2 + μ2 AHZ(k)BH + C Z(k)D  . 4

H H H It is obvious that tr D X˜ H(k)C Z(k) + tr Z(k) CX˜ (k)D is real. Then, one has

H H H tr D X˜ H(k)C Z(k) + tr Z(k) CX˜ (k)D

H = tr DHX˜ (k) CHZ(k) + tr ZH(k)CX˜ (k)D .

Combining this expression with (4.16), one can obtain that

tr BHX˜ H(k)AHZ(k) + tr ZH(k)AX˜ (k)B

H H H + tr D X˜ H(k)C Z(k) + tr Z(k) CX˜ (k)D  H = tr BHX˜ H(k)AH + DHX˜ (k) CH Z(k) (4.18)   + tr ZH(k) AX˜ (k)B + CX˜ (k)D = 2 Z(k)2 .

In addition, by using properties of 2-norm, Lemmas 2.15 and 2.13, and Proposition 2.6 one has    H H2 AHZ(k)BH + C Z(k)D     1  H H 2 =  AHZ(k)BH + C Z(k)D  2 σ              2 1  H H H H  =  A Z(k) B + C Pm Z(k) Pn D  2 σ σ σ σ σ σ                 T T 2 1  H H H H  =  B ⊗ A + Pn D ⊗ C Pm vec Z(k)  (4.19) 2 σ σ σ σ σ           2 1  H T  =  B ⊗ A + (Dσ Pn) ⊗ C Pm vec Z(k)  2 σ σ σ σ             2  2 ≤ 1 ⊗ H + ( ) ⊗ T ( )  B σ A Dσ Pn C Pm vec Z k  2 σ σ 2 σ         2 = ⊗ H + ( ) ⊗ T  ( )2  B σ A Dσ Pn C Pm Z k . σ σ 2 4.1 Extended Con-Sylvester Matrix Equations 125

Denote         H T 2 π = B ⊗ A + (Dσ P ) ⊗ C P . (4.20) σ σ n σ m 2

Then, combining (4.18)–(4.20) with (4.17), yields       2  2 1 X˜ (k + 1) ≤ X˜ (k) − μ 1 − μπ Z(k)2 4     2 1   ≤ X˜ (k − 1) − μ 1 − μπ Z(k − 1)2 + Z(k)2 4    k  2 1  ≤ X˜ (0) − μ 1 − μπ Z(i)2 . 4 i=0

If the convergence factor μ is chosen to satisfy (4.15), then one has

 k   1   2 0 <μ 1 − μπ Z(i)2 ≤ X˜ (0) . 4 i=0

Thus,  ∞   1   2 0 <μ 1 − μπ Z(i)2 ≤ X˜ (0) . 4 i=0

It follows from the convergence theorem of series that limi→0 Z(i) = 0. Since the matrix equation (4.2) has a unique solution, then it follows from the expression ˜ (4.16)ofZ(k) that limi→0 X(i) = 0. The proof is thus completed. 

Theorem 4.1 provides a sufficient condition to guarantee the convergence of the algorithm (4.14). However, the condition (4.15) involves Kronecker products and the real representations of the coefficient matrices. This may bring a high dimensionality problem. To overcome this drawback, the following result is given.

Corollary 4.1 If the extended con-Sylvester matrix equation (4.2) has a unique solution X = X∗, then the iterative solution X(k) given by the algorithm (4.11)– (4.13), or equivalently, the algorithm (4.14), converges to X∗ if

2 0 <μ< . (4.21)  2  2 +  2  2 A 2 B 2 C 2 D 2

Proof It follows from Lemma 2.11 that

 ⊗  =     A B 2 A 2 B 2 . (4.22)

Applying this fact and Lemma 2.15, one has 126 4 Hierarchical-Update-Based Iterative Approaches         ⊗ H + ( ) ⊗ T 2 B σ A σ Dσ Pn C σ Pm  2           2  H   T  ≤ B ⊗ A + (Dσ P ) ⊗ C P σ σ 2 n σ m 2           2    H   T  = B A + Dσ P  C P σ 2 σ 2 n 2 σ m 2 = (    +     )2 B 2 A 2 D 2 C 2  ≤  2  2 +  2  2 2 B 2 A 2 D 2 C 2 .

Substituting this relation into (4.15)gives(4.21). 

Compared with Theorem 4.1, the result of Corollary 4.1 has two advantages. One is that the convergence factor guaranteeing the convergence of the algorithm is given in terms of original coefficient matrices instead of their real representations. The other advantage is that the right-hand side of (4.21) is easier to compute. Nevertheless, it is obvious that the result of Corollary 4.1 is more conservative than that of Theorem 4.1.

4.1.2 A General Case

In this subsection, a more general extended con-Sylvester matrix equation in the following form is considered

p q AiXBi + CjXDj = F, (4.23) i=1 j=1

m×r s×n m×n where Ai, Cj ∈ C , Bi, Dj ∈ C , i ∈ I[1, p], j ∈ I[1, q], and F ∈ C are given known matrices, and X ∈ Cr×s is the matrix to be determined. Similarly to the previous subsection, it is assumed that the matrix equation (4.23) has a unique solution. Therefore, it is required that mn = rs. By defining the following matrices

p q Fi = F − AlXBl − CjXDj + AiXBi, i ∈ I[1, p], (4.24) l=1 j=1 p q Fp+j = F − AiXBi − ClXDl + CjXDj, j ∈ I[1, q], (4.25) i=1 l=1 the matrix equation (4.23) can be decomposed into the following matrix equations

AiXBi = Fi, i ∈ I[1, p], (4.26)

CjXDj = Fp+j, j ∈ I[1, q]. (4.27) 4.1 Extended Con-Sylvester Matrix Equations 127

According to Lemma 4.1, for the matrix equations (4.26) and (4.27) one can construct the following respective iterative forms

( + ) = ( ) + μ H ( − ( ) ) H, ∈ I[ ] Xi k 1 Xi k Ai Fi AiXi k Bi Bi i 1, p , (4.28) H   H Xp+j(k + 1) = Xp+j(k) + μCj Fp+j − CjXp+j(k)Dj Dj , j ∈ I[1, q]. (4.29)

Substituting (4.24) and (4.25)into(4.28) and (4.29), respectively, gives

X (k + 1) i   p q = ( ) + μ H − − + − ( ) H, Xi k Ai F AlXBl ClXDl AiXBi AiXi k Bi B (4.30) l=1 l=1 i ∈ I[1, p], X + (k + 1) p j ⎛ ⎞ p q H ⎝ ⎠ H = Xp+j(k) + μCj F − AlXBi − ClXDl + CjXDj − CjXp+j(k)Dj D , (4.31) l=1 l=1 j ∈ I[1, q].

The right-hand sides of the preceding expressions contain the unknown matrix X, so it is impossible to implement the algorithm in (4.30) and (4.31). Analogously to Subsection 4.1.1, by applying the hierarchical principle, the unknown variable X in those expressions is replaced with its corresponding estimates Xi(k), i ∈ I[1, p + q], at k-th step. Hence, one has

( + ) Xi k 1 ⎛ ⎞ p q = ( ) + μ H ⎝ − ( ) − ( ) ⎠ H, ∈ I[ ] Xi k Ai F AlXi k Bl ClXi k Dl Bi i 1, p , l=1 l=1 X + (k + 1) p j ⎛ ⎞ p q H ⎝ ⎠ H = Xp+j(k) + μCj F − AlXp+j(k)Bl − ClXp+j(k)Dl Dj , j ∈ I[1, q]. l=1 l=1

Taking the average of Xi(k), i ∈ I[1, p + q], one can obtain the following iterative algorithm

( + ) Xi k 1   (4.32) p q = ( ) + μ H − ( ) − ( ) H ∈ I[ ] X k Ai F AlX k Bl ClX k Dl Bi , i 1, p , l=1 l=1 Xp+j(k + 1) (4.33) 128 4 Hierarchical-Update-Based Iterative Approaches   p q H H = X(k) + μCj F − AlX(k)Bl − ClX(k)Dl Dj , j ∈ I[1, q], i=1 l=1 p+q 1  X(k) = X (k). (4.34) p + q i i=1

This algorithm can be equivalently rewritten as

( + ) X k 1   μ p p q = X(k) + AH F − A X(k)B − C X(k)D BH (4.35) p + q i l l l l i i=⎛1 l=1 l=1 ⎞ q p q μ H H + C ⎝F − A X(k)B − C X(k)D ⎠ D . p + q j l l l l j j=1 l=1 l=1

On the convergence condition of this algorithm, one has the following result.

Theorem 4.2 If the general extended con-Sylvester matrix equation (4.23) has a unique solution X = X∗, then the iterative solution X(k) given by the algorithm (4.32)–(4.34), or equivalently, the algorithm (4.35), converges to X∗ if

2(p + q) 0 <μ<  , (4.36)  2 p     q        H T   Bi ⊗ A + Dj Pn ⊗ C Pm  σ i σ σ j σ  i=1 j=1 2 where (·)σ is the real representation of complex matrices in Sect.2.6, and Pm and Pn are defined as in (2.64).

Proof First, for the Xi, i ∈ I[1, p + q],givenin(4.32) and (4.33), the following error matrices are defined: ˜ Xi(k) = Xi(k) − X∗, i ∈ I[1, p + q].

Thus, one has  p+q ˜ ( ) ˜ i=1 Xi k X(k) = X(k) − X∗ = . p + q

Using (4.32) and (4.33), one has for i ∈ I[1, p]

˜ ( + ) Xi k 1 ⎛ ⎞ p q = ˜ ( ) + μ H ⎝ − ( ) − ( ) ⎠ H X k Ai F AlX k Bl ClX k Dl Bi l=1 l=1 4.1 Extended Con-Sylvester Matrix Equations 129 ⎛ ⎞ p q p q = ˜ ( ) + μ H ⎝ + − ( ) − ( ) ⎠ H X k Ai AlX∗Bl ClX∗Dl AlX k Bl CjX k Dj Bi l= l= l= j=1 ⎛ 1 1 ⎞1 p q = ˜ ( ) − μ H ⎝ ˜ ( ) + ˜ ( ) ⎠ H X k Ai AlX k Bl ClX k Dl Bi l=1 l=1 and for j ∈ I[1, q] ⎛ ⎞ p q ˜ ˜ H ⎝ ⎠ H Xp+j(k + 1) = X(k) + μCj F − AlX(k)Bl − ClX(k)Dl Dj = = ⎛ l 1 l 1 ⎞ p q ˜ H ⎝ ˜ ˜ ⎠ H = X(k) − μCj AlX(k)Bl + ClX(k)Dl Dj . l=1 l=1

Denoting p q ˜ ˜ m×n Z(k) = AlX(k)Bl + ClX(k)Dl ∈ C , (4.37) l=1 l=1 from the preceding two expressions one can obtain

p q μ μ H H X˜ (k + 1) = X˜ (k) − AHZ(k)BH − C Z(k)D . p + q i i p + q j j i=1 j=1

By simple calculations, one has    2 X˜ (k + 1)

= tr X˜ H(k + 1)X˜ (k + 1) ⎡ ⎛ ⎞⎤   p q  2 μ H H = X˜ (k) − tr ⎣X˜ H(k) ⎝ AHZ(k)BH + C Z(k)D ⎠⎦ p + q i i j j i=1 j=1 ⎡⎛ ⎞ ⎤ p q μ H − tr ⎣⎝ B ZH(k)A + D Z(k) C ⎠ X˜ (k)⎦ p + q i i j j i=1 j=1    2 μ2 p q   H H H H +  A Z(k)B + Cj Z(k)Dj  (p + q)2  i i  i=1 j=1 130 4 Hierarchical-Update-Based Iterative Approaches     p  2 μ  = X˜ (k) − tr BHX˜ H(k)AHZ(k) p + q i i i=1   ⎡ ⎤ p q μ μ H H − tr ZH(k)A X˜ (k)B − tr ⎣ D X˜ H(k)C Z(k)⎦ p + q i i p + q j j i=1 j=1 ⎡ ⎤ q μ H − tr ⎣ Z(k) C X˜ (k)D ⎦ (4.38) p + q j j j=1    2 μ2 p q   H H H H +  A Z(k)B + Cj Z(k)Dj  . (p + q)2  i i  i=1 j=1

It is easily shown that ⎡ ⎤ ⎡ ⎤ q q ⎣ H ˜ H H ⎦ ⎣ H ˜ ⎦ tr Dj X (k)Cj Z(k) + tr Z(k) CjX(k)Dj j=1 j=1 is real. Then, one has ⎡ ⎤ ⎡ ⎤ q q ⎣ H ˜ H H ⎦ ⎣ H ˜ ⎦ tr Dj X (k)Cj Z(k) + tr Z(k) CjX(k)Dj j=1 j=1 ⎡ ⎤ ⎡ ⎤ q q  H  = ⎣ H ˜ ( ) H ( )⎦ + ⎣ ( )H ˜ ( ) ⎦ tr Dj X k Cj Z k tr Z k CjX k Dj . j=1 j=1

Combining this expression with (4.37), it is easily obtained that     p p H ˜ H( ) H ( ) + H( ) ˜ ( ) tr Bi X k Ai Z k tr Z k AiX k Bi i⎡=1 ⎤ i=1⎡ ⎤ q q ⎣ H ˜ H H ⎦ ⎣ H ˜ ⎦ + tr Dj X (k)Cj Z(k) + tr Z(k) CjX(k)Dj j=1 j=1 ⎡⎛ ⎞ ⎤ p q   H = ⎣⎝ H ˜ H( ) H + H ˜ ( ) H⎠ ( )⎦ tr Bi X k Ai Dj X k Cj Z k i=1 j=1 ⎡ ⎛ ⎞⎤ p q ⎣ H ⎝ ˜ ˜ ⎠⎦ + tr Z (k) AiX(k)Bi + CjX(k)Dj i=1 j=1 = 2 Z(k)2 . 4.1 Extended Con-Sylvester Matrix Equations 131

In addition, by using Lemmas 2.13 and 2.15 one has    2 p q   H H H H  A Z(k)B + Cj Z(k)Dj   i i  i=1 j=1 ⎛ ⎞   2  p q  1 ⎝ H H H H⎠  =  A Z(k)B + Cj Z(k)Dj  2  i i  i=1 j=1  σ   2 p       q        1  H H H H  =  A Z(k) B + Cj Pm Z(k) Pn Dj  . 2  i σ σ i σ σ σ σ  i=1 j=1    Let z (k) = vec Z(k) . Then, it follows from the above relation that σ    2 p q   H H H H  A Z(k)B + Cj Z(k)Dj   i i  i=1 j=1 ⎡ ⎤  2  p q                T  1 ⎣ H T H H H ⎦  =  B ⊗ A + Pn Dj ⊗ Cj Pm z (k) 2  i σ i σ σ σ  i=1 j=1 ⎡ ⎤   2  p     q       1 ⎣ H T ⎦  =  Bi ⊗ A + Dj Pn ⊗ C Pm z (k) 2  σ i σ σ j σ  i=1 j=1    2 p     q       1  H T  2 ≤  Bi ⊗ A + Dj Pn ⊗ C Pm z (k) 2  σ i σ σ j σ  i=1 j=1   2  2 p     q        H T  2 =  Bi ⊗ A + Dj Pn ⊗ C Pm Z(k) .  σ i σ σ j σ  i=1 j=1 2

Denote    2 p     q        H T  π =  Bi ⊗ A + Dj Pn ⊗ C Pm .  σ i σ σ j σ  i=1 j=1 2 132 4 Hierarchical-Update-Based Iterative Approaches

Then, combining the preceding three relations, yields    2 X˜ (k + 1)     2 μ μ ≤ X˜ (k) − 2 − π Z(k)2 p + q p + q     2 μ μ   ≤ X˜ (k − 1) − 2 − π Z(k − 1)2 + Z(k)2 p + q p + q    k  2 μ μ  ≤ X˜ (0) − 2 − π Z(i)2 . p + q p + q i=0

If the convergence factor μ is chosen to satisfy (4.15), then one has

 k   μ μ   2 0 < 2 − π Z(i)2 ≤ X˜ (0) . p + q p + q i=0

Thus  ∞   μ μ   2 0 < 2 − π Z(i)2 ≤ X˜ (0) . p + q p + q i=0

2 It follows from the convergence theorem of series that limi→∞ Z(i) = 0. Since the matrix equation (4.23) has a unique solution, it follows from (4.37) that ˜ limi→∞ X(i) = 0. The proof is thus completed. 

Analogously to Subsection4.1.1, at the end of this subsection for the algorithm (4.35) a sufficient convergence condition that is easy to compute is provided. This condition does not involve Kronecker products and the real representations of the coefficient matrices.

Corollary 4.2 If the general extended con-Sylvester matrix equation (4.23) has a unique solution X = X∗, then the iterative solution X(k) given by the algorithm (4.32)–(4.34), or equivalently, the algorithm (4.35), converges to X∗ if

<μ< 2 0 p q     . (4.39) 2 2  2  2 Ai Bi + Cj Dj i=1 2 2 j=1 2 2

Proof Applying the fact (4.22) and Lemma 2.15, one has    2 p     q        H T   Bi ⊗ A + Dj Pn ⊗ C Pm  σ i σ σ j σ  i=1 j=1 2 4.1 Extended Con-Sylvester Matrix Equations 133 ⎛ ⎞ 2 p q                ≤ ⎝  ⊗ H  +  ⊗ T  ⎠ Bi σ Ai σ Dj σ Pn Cj σ Pm 2 2 i=1 j=1 ⎛ ⎞ 2 p q                 = ⎝    H  +    T  ⎠ Bi σ Ai σ Dj σ Pn Cj σ Pm 2 2 2 2 i=1 j=1 ⎛ ⎞ 2 p q     = ⎝     +     ⎠ Bi 2 Ai 2 Dj 2 Cj 2 i=1 j=1 ⎛ ⎞ p q     ≤ ( + ) ⎝  2  2 +  2  2⎠ p q Ai 2 Bi 2 Cj 2 Dj 2 . i=1 j=1

Substituting this relation into (4.36)gives(4.39). 

4.1.3 Numerical Examples

In this subsection, two examples are employed to illustrate the effectiveness of the proposed methods.

Example 4.1 Consider an extended con-Sylvester matrix equation in the form of (4.2) with 1 + 2i 2 − i 2 − 4i i A = , B = , 1 − i2+ 3i −1 + 3i 2

−1 − i −3i −21− i 21 + 11i −9 + 7i C = , D = , F = . 01+ 2i 1 + i −1 − i 52 − 22i −18 + i

This matrix equation has a unique solution

1 + 2i −i X∗ = . 2 + i −1 + i

According to Theorem 4.1, the algorithm (4.14) for this example is convergent if 0 <μ<0.0038. According to Corollary 4.1, the algorithm (4.14) for this example is convergent if 0 <μ<0.0028. Define the relative iteration error as

X∗ − X(k) δ(k) = . (4.40) X∗ 134 4 Hierarchical-Update-Based Iterative Approaches

1

0.9

0.8 μ=0.0005 0.7

0.6 μ=0.0012

μ δ 0.5 =0.0022

μ 0.4 =0.0032

0.3 μ=0.0038

0.2

0.1

0 0 50 100 150 200 Iteration steps

Fig. 4.1 The convergence performance of the algorithm (4.14) for Example 4.1

Figure4.1 shows the relative iteration errors for different convergence factors μ when −6 the initial value takes X(0) = 12 × 10 , where 1n denotes the n × n matrix with all the entries being 1.

Example 4.2 Consider the following matrix equation

A1XB1 + C1XD1 + C2XD2 = F with −1 + i2− i 1 − 2i i A = , B = , 1 1 − i −1 + 2i 1 −1 + 3i −1 + 3i

2 + 3i −1 − i 23− i C = , D = , 1 12+ 2i 1 1 + 3i 1 + 2i

−13+ i 2 −2 + 3i C = , D = , 2 −2 + i2i 2 −1 + 10i 4i

−22 − 8i −21 + 31i F = . −22 − 20i −40 − 10i 4.1 Extended Con-Sylvester Matrix Equations 135

1.4

1.2

μ=0.5 × 10−3 1

μ=0.8× 10−3 0.8 μ × −3

δ =1.2 10

0.6 μ=1.53 × 10−3

0.4

0.2

0 0 50 100 150 200 Iteration steps

Fig. 4.2 The convergence performance of the algorithm (4.35) for Example 4.2

The unique solution X∗ of this equation is

x11 x12 −1 + i1 X∗ = = . x21 x22 1 + 2i 1 − i

According to Theorem 4.2, the algorithm (4.14) for this example is convergent if 0 <μ<1.538 × 10−3. According to Corollary 4.1, the algorithm (4.14) for this −4 −6 example is convergent if 0 <μ<7.341 × 10 .TakeX(0) = 12 × 10 .The algorithm (4.35) is applied to compute X(k). The relative iteration errors are shown in Fig. 4.2, where δ(k) is defined in (4.40). The iterative solutions X(k) are given in Tables 4.1 and 4.2 for μ = 1.53 × 10−3.

4.2 Coupled Con-Sylvester Matrix Equations

In this section, the following coupled con-Sylvester matrix equation is investigated

Ai1X1Bi1 + Ci1X1Di1 + Ai2X2Bi2 + Ci2X2Di2 +···+AipXpBip + CipXpDip = Fi, i ∈ I[1, N], (4.41)

m ×rη sη×n m ×n where Aiη, Ciη ∈ C i , Biη, Diη ∈ C i , Fi ∈ C i i , i ∈ I[1, N], η ∈ I[1, p], rη×sη are the given known matrices, and Xη ∈ C , η ∈ I[1, p], are the matrices to be determined. This coupled matrix equation can be compactly rewritten as 136 4 Hierarchical-Update-Based Iterative Approaches

−3 Table 4.1 The iterative solutions x11 and x12 of Example 4.2 for μ = 1.53 × 10 k x11 x12 10 −0.628424 + 0.146954i 0.489450 + 0.026055i 20 −0.685420 + 0.490153i 0.454772 − 0.007398i 30 −0.731573 + 0.660983i 0.462126 − 0.040725i 40 −0.778816 + 0.746009i 0.487601 − 0.058191i 50 −0.821384 + 0.792338i 0.519193 − 0.065232i 60 −0.857449 + 0.821000i 0.551972 − 0.066903i 70 −0.887156 + 0.841205i 0.583941 − 0.065958i 80 −0.911288 + 0.857005i 0.614327 − 0.063750i 90 −0.930738 + 0.870223i 0.642868 − 0.060932i 100 −0.946338 + 0.881716i 0.669520 − 0.057830i 110 −0.958804 + 0.891921i 0.694330 − 0.054620i 120 −0.968732 + 0.901086i 0.717381 − 0.051398i 130 −0.976614 + 0.909368i 0.738773 − 0.048227i 140 −0.982849 + 0.916881i 0.758606 − 0.045143i 150 −0.987760 + 0.923715i 0.776981 − 0.042173i Solution −1 + i 1

Table 4.2 The iterative solutions x21 and x22 and iteration errors of Example 4.2 for μ = 1.53 × 10−3

k x21 x22 δ 10 1.005831 + 1.125477i 0.803896 − 1.257518i 0.446844 20 1.151435 + 1.494408i 0.888572 − 1.344139i 0.326441 30 1.165009 + 1.657356i 0.934110 − 1.347993i 0.273501 40 1.147708 + 1.743858i 0.959012 − 1.327500i 0.239945 50 1.122956 + 1.797195i 0.973075 − 1.300483i 0.213944 60 1.098634 + 1.834102i 0.981597 − 1.273121i 0.192289 70 1.077314 + 1.861709i 0.987182 − 1.247523i 0.173764 80 1.059529 + 1.883374i 0.991089 − 1.224280i 0.157687 90 1.045061 + 1.900861i 0.993943 − 1.203413i 0.143585 100 1.033460 + 1.915210i 0.996083 − 1.184742i 0.131105 110 1.024245 + 1.927106i 0.997703 − 1.168036i 0.119974 120 1.016980 + 1.937036i 0.998933 − 1.153063i 0.109981 130 1.011293 + 1.945369i 0.999861 − 1.139616i 0.100961 140 1.006874 + 1.952395i 1.000554 − 1.127509i 0.092782 150 1.003469 + 1.958342i 1.001062 − 1.116584i 0.085338 Solution 1 + 2i 1 − i 4.2 Coupled Con-Sylvester Matrix Equations 137

p   AiηXηBiη + CiηXηDiη = Fi, i ∈ I[1, N]. η=1

Currently, the emphasis is placed on the case where the matrix equation (4.41) has a unique solution. A necessary condition for the coupled con-Sylvester matrix equation (4.41) to have a unique solution is that the number of the scalar equations in (4.41) is equal to the number of the unknown scalars, that is,

N p mini = rηsη. i=1 η=1

4.2.1 Iterative Algorithms

In order to derive an iterative algorithm for (4.41), one needs to introduce the fol- lowing intermediate matrices

p   il = Fi − AiηXηBiη + CiηXηDiη + AilXlBil, (4.42) η=1 i ∈ I[1, N], l ∈ I[1, p], p   il = Fi − AiηXηBiη + CiηXηDiη + CilXlDil, (4.43) η=1 i ∈ I[1, N], l ∈ I[1, p].

With the preceding definitions, the matrix equation (4.41) can be decomposed into the following matrix equations

AilXlBil = il, i ∈ I[1, N], l ∈ I[1, p],

CilXlDil = il, i ∈ I[1, N], l ∈ I[1, p].

According to Lemma 4.1, for these matrix equations one can construct the following respective iterative forms   1,i( + ) = 1,i( ) + μ H  − 1,i( ) H ∈ I[ , ] ∈ I[ , ] Xl k 1 Xl k Ail il AilXl k Bil Bil , i 1 N , l 1 p , (4.44)   2,i( + ) = 2,i( ) + μ H  − 2,i( ) H ∈ I[ , ] ∈ I[ , ] Xl k 1 Xl k Cil il CilXl k Dil Dil , i 1 N , l 1 p . (4.45)

By substituting (4.42) and (4.43)into(4.44) and (4.45), respectively, for i ∈ I[1, N], l ∈ I[1, p], one has 138 4 Hierarchical-Update-Based Iterative Approaches ⎡ p   1,i( + ) = 1,i( ) + μ H ⎣ − + + Xl k 1 Xl k Ail Fi AiηXηBiη CiηXηDiη (4.46) η=1

− 1,i( ) H AilXlBil AilXl k Bil Bil , ⎡ p   2,i( + ) = 2,i( ) + μ H ⎣ − + + Xl k 1 Xl k Cil Fi AiηXηBiη CiηXηDiη (4.47) η=1

− 2,i( ) H CilXlDil CilXl k Dil Dil .

The right-hand sides of these equations contain the unknown matrices Xη, η ∈ I[1, p], so it is impossible to implement these algorithms. In order to make the algorithms in (4.46) and (4.47) work, the unknown variable matrices Xη, η ∈ I[1, p],in(4.46) and 1,i( ) 2,i( ) (4.47) are respectively replaced with their estimates Xl k , Xl k by applying the hierarchical principle. Hence, one can obtain the following iterative forms:

, X1 i(k + 1) l ⎡ ⎤ p   = 1,i( ) + μ H ⎣ − 1,i( ) + 1,i( ) ⎦ H Xl k Ail Fi AiηXη k Biη CiηXη k Diη Bil , η=1 i ∈ I[1, N], η ∈ I[1, p], , X2 i(k + 1) l ⎡ ⎤ p   = 2,i( ) + μ H ⎣ − 2,i( ) + 2,i( ) ⎦ H Xl k Cil Fi AiηXη k Biη CiηXη k Diη Dil , η=1 i ∈ I[1, N], η ∈ I[1, p].

∈ I[ , ] 1,i( ) 2,i( ) For every matrix variable Xl, l 1 p , taking the average of Xl k , Xl k , i ∈ I[1, N], gives the iterative algorithm

( + ) Xl k 1 ⎡ ⎤ (4.48) μ N p   H ⎣ ⎦ H = X (k) + A F − A ηXη(k)B η + C ηXη(k)D η B l 2N il i i i i i il i=1 η=1 ⎡ ⎤ μ N p   H ⎣ ⎦ H + C F − A ηXη(k)B η + C ηXη(k)D η D . 2N il i i i i i il i=1 η=1 4.2 Coupled Con-Sylvester Matrix Equations 139

4.2.2 Convergence Analysis

In this subsection, convergence properties of the proposed algorithm (4.48) are inves- tigated. The main result of this subsection is given in the following theorem.

Theorem 4.3 If the coupled  con-Sylvester matrix equation (4.41) has a unique solution X1,X2, ...,Xp = X1∗,X2∗, ...,Xp∗ , then the iterative solutions Xl(k), l ∈ I[1,p], given by the algorithm in (4.48) converge to Xl∗,l∈ I[1,p], for arbitrary initial matrices Xl(0),l∈ I[1,p],if

4N 0 <μ< , (4.49)  2 T 2 with T = [Tli]p×N , (4.50) where       = ⊗ H + ( ) ⊗ T ∈ I[ , ] ∈ I[ , ] Tli Bil σ Ail σ Dil σ Pni Cil σ Pmi ,l 1 p ,i 1 N , (4.51) (·) σ is the real representation of complex matrices in Sect.2.6, and Pmi and Pni are defined as in (2.64).

Proof Define error matrices ˜ Xl(k) = Xl(k) − Xl∗, l ∈ I[1, p].

By using the algorithm (4.48), one has

˜ Xl(k + 1) = Xl(k + 1) − Xl∗ = X (k) − X + l l∗⎡ ⎤ μ N p   p   H H A ⎣ A ηXη∗B η + C ηXη∗D η − A ηXη(k)B η + C ηXη(k)D η ⎦ B + 2N il i i i i i i i i il i=1 η=1 η=1 ⎡ ⎤ N p p   μ H   H C ⎣ A ηXη∗B η + C ηXη∗D η − A ηXη(k)B η + C ηXη(k)D η ⎦ D 2N il i i i i i i i i il i=1 η=1 η=1 ⎡ ⎤ μ N p   H H = X˜ (k) − A ⎣ A ηX˜η(k)B η + C ηX˜η(k)D η ⎦ B l 2N il i i i i il i=1 η=1 ⎡ ⎤ N p   μ H H − C ⎣ A ηX˜η(k)B η + C ηX˜η(k)D η ⎦ D . 2N il i i i i il i=1 η=1 140 4 Hierarchical-Update-Based Iterative Approaches

Denote p   ˜ ˜ Zi(k) = AiηXη(k)Biη + CiηXη(k)Diη , i ∈ I[1, N]. (4.52) η=1

Then, combining this relation with the preceding expression, one has

   ˜ 2 Xl(k + 1)

= ˜ H( + ) ˜ ( + ) tr Xl k 1 Xl k 1     N    2 μ  =  ˜ ( ) − ˜ H( ) H ( ) H + H ( ) H Xl k tr Xl k Ail Zi k Bil Cil Zi k Dil 2N =  i 1  N   μ H − tr B ZH(k)A + D Z (k) C X˜ (k) 2N il i il il i il l i=1  2 2 N   μ  H H  +  AHZ (k)BH + C Z (k)D  (4.53) 2  il i il il i il  4N = i 1       N N  2 μ  μ  =  ˜ ( ) − H ˜ H( ) H ( ) − H( ) ˜ ( ) Xl k tr Bil Xl k Ail Zi k tr Zi k AilXl k Bil 2N = 2N =  i 1   i 1  N N μ H H μ H − tr D X˜ H(k)C Z (k) − tr Z (k) C X˜ (k)D 2N il l il i 2N i il l il i=1 i=1  2 2 N   μ  H H  +  AHZ (k)BH + C Z (k)D  . 4N2  il i il il i il  i=1

It is easily known that the following expression is real     N N H ˜ H( ) H ( ) + ( )H ˜ ( ) tr Dil Xl k Cil Zi k tr Zi k CilXl k Dil . i=1 i=1

Then, one has     N N H ˜ H( ) H ( ) + ( )H ˜ ( ) tr Dil Xl k Cil Zi k tr Zi k CilXl k Dil = =  i 1   i 1  N N  H  = H ˜ ( ) H ( ) + H( ) ˜ ( ) tr Dil Xl k Cil Zi k tr Zi k CilXl k Dil . i=1 i=1 4.2 Coupled Con-Sylvester Matrix Equations 141

Combining this relation with (4.52), gives     N N H ˜ H( ) H ( ) + H( ) ˜ ( ) tr Bil Xl k Ail Zi k tr Zi k AilXl k Bil = = i1  i 1   N N + H ˜ H( ) H ( ) + ( )H ˜ ( ) tr Dil Xl k Cil Zi k tr Zi k CilXl k Dil = =  i 1 i 1  N   H = H ˜ H( ) H + H ˜ ( ) H ( ) tr Bil Xl k Ail Dil Xl k Cil Zi k = i1  N   + H( ) ˜ ( ) + ˜ ( ) tr Zi k AilXl k Bil CilXl k Dil . i=1

With this relation, it follows from (4.54) that    ˜ 2 Xl(k + 1)     N   2 μ  H =  ˜ ( ) − H ˜ H( ) H + H ˜ ( ) H ( ) Xl k tr Bil Xl k Ail Dil Xl k Cil Zi k 2N =  i 1  μ N   − tr ZH(k) A X˜ (k)B + C X˜ (k)D (4.54) 2N i il l il il l il i=1  2 2 N   μ  H H  +  AHZ (k)BH + C Z (k)D  . 4N2  il i il il i il  i=1

Denote    zi (k) = vec Zi(k) , i ∈ I [1, N] . σ

By applying Proposition 2.6 and Lemmas 2.13 and 2.15, one has   p  N  2     H ( ) H + H ( ) H   Ail Zi k Bil Cil Zi k Dil  l= i=1 1   p N   2 1  H H H H  =  A Zi(k)B + Cil Zi(k)Dil  2  il il σ  l= i=1 1   p N             2 1  H H H H  =  A Zi(k) B + Cil Pm Zi(k) Pn Dil  2  il σ σ il σ σ i σ i σ  l=1 i=1 142 4 Hierarchical-Update-Based Iterative Approaches   p  N       2        T 1  H T H H H  =  B ⊗ A + Pn Dil ⊗ Cil Pm zi (k) 2  il σ il σ i σ σ i  l= i=1 1   p N        2 1  H T H T  =  B ⊗ A + (D )σ P ⊗ C P z (k) 2  il σ il σ il ni il σ mi i  l=1 i=1 ⎡  ⎤  N        2  H T ⊗ H + ( ) ⊗ T ( )  B σ A σ Di1 σ Pni C σ Pmi zi k ⎢ i=1 i1 i1 i1 ⎥ ⎢ N        ⎥ ⎢ H T ⊗ H + ( ) ⊗ T ( ) ⎥ 1 B σ A σ Di2 σ Pni C σ Pmi zi k = ⎢ i=1 i2 i2 i2 ⎥ ⎢ ··· ⎥ 2 ⎢        ⎥ ⎣ N T   ⎦  H ⊗ H + ⊗ T ( )   B A Dip σ Pni C Pmi zi k  i=1 ip σ ip σ ip σ 1 = Tz (k)2 , 2 where T is defined in (4.50), and

         T T T T z (k) = vec Z (k) vec Z (k) ··· vec Z (k) . (4.55) 1 σ 2 σ N σ

By properties of 2-norm of matrices, it is obtained that   p N  2  H H  1  AHZ (k)BH + C Z (k)D  ≤ T2 z (k)2 . (4.56)  il i il il i il  2 2 l=1 i=1

In addition, by Item (1) of Lemma 2.15, it is easily derived that

N    N  2  2   2 z (k) =  Zi(k)  = 2 Zi(k) . σ i=1 i=1

With this, one has from (4.56) that   p  N  2 N      H ( ) H + H ( ) H  ≤  2  ( )2  Ail Zi k Bil Cil Zi k Dil  T 2 Zi k . l=1 i=1 i=1

By combining this relation with (4.54), and changing the order of summation, one has

p    ˜ 2 Xl(k + 1) = l 1   p   p N    2 μ   H ≤ X˜ (k) − tr BHX˜ H(k)AH + DHX˜ (k) CH Z (k) l 2N il l il il l il i l=1 l=1 i=1 4.2 Coupled Con-Sylvester Matrix Equations 143   μ p N   − tr ZH(k) A X˜ (k)B + C X˜ (k)D + 2N i il l il il l il l=1 i=1 μ2 N T2 Z (k)2 2 2 i 4N = i 1   p   N p    2 μ   H =  ˜ ( ) − H ˜ H( ) H + H ˜ ( ) H ( ) Xl k tr Bil Xl k Ail Dil Xl k Cil Zi k = 2N = = l 1  i 1 l 1  μ N p   − tr ZH(k) A X˜ (k)B + C X˜ (k)D + 2N i il l il il l il i=1 l=1 μ2 N T2 Z (k)2 4N2 2 i i=1 p   N   2 μ    = X˜ (k) − tr ZH(k)Z (k) − l 2N i i l=1 i=1 μ N   μ2 N tr ZH(k)Z (k) + T2 Z (k)2 2N i i 4N2 2 i i=1 i=1 p   N N   2 μ  μ2  = X˜ (k) − Z (k)2 + T2 Z (k)2 l N i 4N2 2 i l=1 i=1 i=1

p     N   2 μ μ  =  ˜ ( ) − −  2  ( )2 Xl k 1 T 2 Zi k = N 4N = l 1 i 1   p     N N   2 μ μ   ≤ X˜ (k − 1) − 1 − T2 Z (k)2 + Z (k − 1)2 l N 4N 2 i i l=1 i=1 i=1 p     k N   2 μ μ   ≤ X˜ (0) − 1 − T2 Z (ω)2 . l N 4N 2 i l=1 ω=0 i=1

If the convergence factor μ is chosen as (4.49), then one has

  k N p   μ μπ     2 1 − Z (ω)2 ≤ X˜ (0) . N 4N i l ω=0 i=1 l=1

Thus   ∞ N p   μ μπ     2 1 − Z (ω)2 ≤ X˜ (0) . N 4N i l ω=0 i=1 l=1 144 4 Hierarchical-Update-Based Iterative Approaches

It follows from the convergence theorem of series that

N lim Z (ω)2 = 0. ω→∞ i i=1

This implies that lim Z (ω) = 0, i ∈ I[1, N]. ω→∞ i

Since the matrix equation (4.41) has a unique solution, it follows from the definition (4.52)ofZi (k) that lim X (ω) = 0, l ∈ I[1, p]. ω→∞ l

Thus the proof is completed. 

Remark 4.1 According to the definition of 2-norm, the relation (4.56) holds for any real vector z. However, in the problem under investigation the vector z has a special structure (4.55). This implies that the estimation (4.56) is a bit conservative. Therefore, the range of the convergence factor μ given in Theorem 4.3 maybeabit conservative. In other words, the algorithm (4.48) might be still convergent even if the parameter does not satisfy (4.49). It is our future work to reduce this conservatism.

In view of the expressions of the algorithm (4.48) and the condition (4.49), one can give the following corollary.

Corollary 4.3 If the coupled  con-Sylvester matrix equation (4.41) has a unique solution X1,X2, ...,Xp = X1∗,X2∗, ...,Xp∗ , then the iterative solutions Xl(k), l ∈ I[1,p], given by

( + ) Xl k 1 ⎡ ⎤ (4.57) N p   = ( ) + μ H ⎣ − ( ) + ( ) ⎦ H Xl k Ail Fi AiηXη k Biη CiηXη k Diη Bil i= η=1 ⎡1 ⎤ N p   H ⎣ ⎦ H +μ Cil Fi − AiηXη(k)Biη + CiηXη(k)Diη Dil , i=1 η=1 converge to Xl∗,l∈ I[1,p], for arbitrary initial matrices Xl(0),l∈ I[1,p],if

2 0 <μ< , (4.58)  2 T 2 where T is defined in (4.50)–(4.51). 4.2 Coupled Con-Sylvester Matrix Equations 145

Theorem 4.3 (respectively, Corollary 4.3) provides a sufficient condition to guar- antee convergence of the iterative algorithm (4.48) (respectively, algorithm (4.57)). However, the condition of (4.49)or(4.58) involves Kronecker products and the real representations of the coefficient matrices. This may bring high dimensionality prob- lem. To overcome this drawback, the following corollary gives a sufficient condition that is easier to compute.

Corollary 4.4 If the coupled  con-Sylvester matrix equation (4.41) has a unique solution X1,X2, ...,Xp = X1∗,X2∗, ...,Xp∗ , then the iterative solutions Xl(k), l ∈ I[1,p], given by the algorithm (4.57) converge to Xl∗,l∈ I[1,p], for arbitrary initial values Xl(0),l∈ I[1,p],if

<μ< 1 0 N p  . (4.59) 2 2 2 2 Ail Bil + Cil Dil i=1 l=1 2 2 2 2

Proof The same symbols as in Theorem 4.3 are adopted. By applying Corollary 2.1 and Lemma 2.12, one has

T2  2    2 ≤     Tli 2 p×N  2   2 ≤     Tli 2 p×N N p         H T 2 = B ⊗ A + (D )σ P ⊗ C P il σ il σ il ni il σ mi 2 i=1 l=1 N p              2  H   T  ≤ B ⊗ A + (D )σ P ⊗ C P il σ il σ 2 il ni il σ mi 2 i=1 l=1 N p            H 2  T 2 ≤ 2 B ⊗ A + (D )σ P ⊗ C P . il σ il σ 2 il ni il σ mi 2 i=1 l=1  ⊗  =     Recall the fact that A B 2 A 2 B 2. Then, by applying Lemma 2.15 one has

 2 T 2 N p              2  H 2  2  T 2 ≤ 2 B A + (D )σ P C P il σ 2 il σ 2 il ni 2 il σ mi 2 i=1 l=1 N p   =  2  2 +  2  2 . 2 Ail 2 Bil 2 Cil 2 Dil 2 i=1 l=1

Combining this relation with (4.58), gives the conclusion.  146 4 Hierarchical-Update-Based Iterative Approaches

Remark 4.2 Compared with Corollary 4.3, the result in Corollary 4.4 has two advan- tages. One is that the range of the convergence factor guaranteeing the convergence of the algorithm (4.57) is given in terms of original coefficient matrices instead of their real representations. The other advantage is that the condition (4.59) is easier to compute than the condition (4.49). Nevertheless, it is obvious that the result of Corollary 4.4 is more conservative than that of Corollary 4.3.

4.2.3 A More General Case

In this subsection, the aim is to consider a class of more general coupled con-Sylvester matrix equations which include the coupled con-Sylvester matrix equation (4.41)as a special case. Such a class of matrix equations is in the form of

ω si1 i1 si2 Ai1jX1Bi1j + Ci1jX1Di1j + Ai2jX2Bi2j+ j=1 j=1 j=1

ω ω i2 sip ip Ci2jX2Di2j +···+ AipjXpBipj + CipjXpDipj = Fi, (4.60) j=1 j=1 j=1 i ∈ I[1, N]

m ×rη sη×n m ×n where Aiηj ∈ C i , Biηj ∈ C i , Fi ∈ C i i , j ∈ I[1, siη], i ∈ I[1, N], m ×rη sη×n η ∈ I[1, p], Ciηj ∈ C i , Diηj ∈ C i , j ∈ I[1,ωiη], i ∈ I[1, N], η ∈ I[1, p], rη×sη are known matrices, and Xη ∈ C , η ∈ I[1, p], are the matrices to be determined. Numerical solutions to such a class of matrix equations have not yet been investigated in literature. It is assumed that the matrix equation (4.60) has a unique solution. By similar derivation, one can construct the following iterative algorithm to solve the matrix equation (4.60):

⎧    ω ⎨ p siu p iu Ri(k) = Fi − AiutXu (k) Biut − CiutXu (k)Diut, u=1 t=1 u=1 t=1 N s N ω (4.61) ⎩ il H H il H H Xl(k + 1) = Xl(k) + μ A Ri(k)B + μ Cilj Ri(k)Dilj . i=1 j=1 ilj ilj i=1 j=1

Analogously to the preceding subsection, one can obtain the following results on the iterative algorithm (4.61) for the coupled con-Sylvester matrix equation (4.60).

Theorem 4.4 If the coupled  con-Sylvester matrix equation (4.60) has a unique solution X1,X2, ...,Xp = X1∗,X2∗, ...,Xp∗ , then the sequence (X1(k),X2(k), ...,Xp(k)) generated by the iterative algorithm (4.61) converges to (X1∗,X2∗, ...,Xp∗), for arbitrary initial matrices Xl(0),l∈ I[1,p],if 4.2 Coupled Con-Sylvester Matrix Equations 147

2 0 <μ<  2 T 2 where T = [Tli]p×N with

ω sil     il     = ⊗ H + ⊗ T Tli Bilj σ Ail σ Dilj σ Pni Cilj σ Pmi . j=1 j=1

Corollary 4.5 If the coupled  con-Sylvester matrix equation (4.60) has a unique solution X1,X2, ...,Xp = X1∗,X2∗, ...,Xp∗ , then the iterative solutions Xl(k), l ∈ I[1,p], given by the algorithm (4.61) converge to Xl∗,l∈ I[1,p], for arbitrary initial values Xl(0),l∈ I[1,p],if

<μ< 1 0        ω     . N p sil  2  2 il  2  2 sil Ailj Bilj + ωil Cilj Dilj i=1 l=1 j=1 2 2 j=1 2 2

4.2.4 A Numerical Example

Example 4.3 Consider the following coupled con-Sylvester matrix equation " A XB + C XD + A YB + C YD = F 11 11 11 11 12 12 12 12 1 (4.62) A21XB21 + C21XD21 + A22YB22 = F2 with the following coefficient matrices

1 + 2i 2 − i 0.5 − i −1 + 3i A = , B = , 11 8 − i2+ 3i 11 −1.5 + 2i 1 − 2i

4i 2 + 2i −13+ i C = , D = , 11 3 + 2.5i i 11 −2 + i4i

1 − 1.5i 3i 1 − 2i i A = , B = , 12 −2 + 3i 4 12 −1 + 3i −1 + 4i

2 − 4i 3 − i −1 − 0.5i −2 + 2i C = , D = , 12 1 + 3i 1 + 2i 12 1 + 2.5i −4 + 2i

0.5 −1 + 0.5i −1 − i −3i A = , B = , 21 1 − 2i −2.5 + 1.5i 21 51+ 2i

−1 + i2− i 4 − i2.5 − i C = , D = , 21 −2 + 3i −1 + 2i 21 i −2 + 2i 148 4 Hierarchical-Update-Based Iterative Approaches

1 − 4i i i2+ 3i A = , B = , 22 −1 + 3i 2 22 41− 10i

−31.5 + 53.5i −20 + 64.5i F = , 1 −13.5 − 14i 39.5 + 77.5i

−14 − 18.5i 10 − 15i F = . 2 26.5 − 44.5i −49.5 − 4.5i

This matrix equation has a unique solution

11+ i i0 X∗ = , Y∗ = . −1 + 2i −2 + 3i 3 + i1− 2i

The algorithm (4.57) is applied to solve the coupled con-Sylvester matrix equa- −6 tion (4.62). The initial matrices are chosen as X(0) = Y(0) = 10 I2. According to Corollary4.3, the algorithm (4.57) is convergent for 0 <μ<3.060×10−4. Accord- ing to Corollary 4.4, the iterative solution given by the algorithm (4.57) converges −4 to (X∗, Y∗) for 0 <μ<0.994 × 10 . However, by “trial and test” procedure for this example the algorithm (4.57) is convergent for 0 <μ<3.207 × 10−4. This fact also confirms the statement in Remark 4.2. Define the relative iteration error as #  ( ) − 2 +  ( ) − 2 δ( ) = X k X∗ Y k Y∗ k 2 2 . X∗ + Y∗

Shown in Fig. 4.3 are the relative iteration errors for different convergence factors μ −6 when the initial values take X(0) = Y(0) = 10 I2.

1.4

1.2

1

μ=0.5 × 10−4 0.8 μ=1.0 × 10−4 δ

−4 0.6 μ=1.5 × 10

μ=3.0 × 10−4 0.4

0.2

0 0 200 400 600 800 1000 1200 1400 1600 1800 2000 Iteration Steps

Fig. 4.3 The convergence performance of the algorithm (4.57) 4.3 Complex Conjugate Matrix Equations with Transpose of Unknowns 149

4.3 Complex Conjugate Matrix Equations with Transpose of Unknowns

In this section, the following complex conjugate matrix equation is considered

s1 s2 s3 s4 T H AlXBl + ClXDl + GlX Hl + MlX Nl = F, (4.63) l=1 l=1 l=1 l=1

m×r s×n m×r s×n where Al ∈ C , Bl ∈ C , l ∈ I[1, s1], Cl ∈ C , Dl ∈ C , l ∈ I[1, s2], m×s r×n m×s r×n Gl ∈ C , Hl ∈ C , l ∈ I[1, s3], Ml ∈ C , Nl ∈ C , l ∈ I[1, s4], and F ∈ Cm×n are known matrices, and X ∈ Cr×s is the matrix to be determined. In this equation, there is the transpose of the unknown matrix besides its conjugate. Obviously, this kind of matrix equations includes the matrix equation AX + XD = F in [20, 21, 258], and X + CXD = F in [155] as special cases. In [20, 21, 155, 258], the matrices A, C, and D are required to be square. In this section, such a restriction will be removed. In addition, the matrix equations in the form of (4.63) also include the matrix equation AXB + CXTD = E in [285] as a special case. When the matrix equation in (4.63) has a unique solution, an iterative algorithm will be established in this section to solve it. For this end, it is denoted that

s1 s2 s3 s4 T H (X) = AlXBl + ClXDl + GlX Hl + MlX Nl. (4.64) l=1 l=1 l=1 l=1

The hierarchical principle implies that by defining matrices

i = F − (X) + AiXBi, i ∈ I[1, s1], (4.65)

ϒi = F − (X) + CiXDi, i ∈ I[1, s2], (4.66) = T − T( ) + T T ∈ I[ , ] i F X Hi XGi , i 1 s3 , (4.67)  = H − H( ) + H H ∈ I[ , ] i F X Ni XMi , i 1 s4 , (4.68) the matrix equation in (4.63) can be decomposed into the following matrix equations

AiXBi = i, i ∈ I[1, s1], (4.69)

CiXDi = ϒi, i ∈ I[1, s2], (4.70) T T = ∈ I[ , ] Hi XGi i, i 1 s3 , (4.71) H H =  ∈ I[ , ] Ni XMi i, i 1 s4 . (4.72)

According to Lemma 4.1, for the matrix equations in (4.69)–(4.72) one can construct the following respective iterative forms 150 4 Hierarchical-Update-Based Iterative Approaches

( + ) = ( ) + μ H ( − ( ) ) H ∈ I[ , ] Xi k 1 Xi k Ai i AiXi k Bi Bi , i 1 s1 , (4.73) H   H X + (k + 1) = X + (k) + μC ϒ − C X + (k)D D , i ∈ I[1, s ], (4.74) s1 i s1 i i i i s1 i i i  2 ( + ) = ( ) + μ − T T ∈ I[ , ] Xs1+s2+i k 1 Xs1+s2+i k Hi i Hi Xs1+s2+ikGi Gi, i 1 s3 , (4.75)   ( + ) = ( ) + μ  − H ( ) H Xs1+s2+s3+i k 1 Xs1+s2+s3+i k Ni i Ni Xs1+s2+s3+i k Mi Mi, (4.76) i ∈ I[1, s4].

Substituting (4.65)–(4.68)into(4.73)–(4.76), respectively, gives

( + ) = ( ) + μ H ( − ( ) + − ( ) ) H Xi k 1 Xi k Ai F X AiXBi AiXi k Bi Bi , (4.77) i ∈ I[1, s ], 1   ( + ) = ( ) + μ H − ( ) + − ( ) H Xs1+i k 1 Xs1+i k Ci F X CiXDi CiXs1+i k Di Di , (4.78) i ∈ I[1, s ], 2  ( + ) = ( ) + μ T − T( ) + T T Xs1+s2+i k 1 Xs1+s2+i k Hi F X Hi XGi  − T ( ) T Hi Xs1+s2+i k Gi Gi, (4.79) i ∈ I[1, s ], 3  ( + ) = ( ) + μ H − H( ) + H H Xs1+s2+s3+i k 1 Xs1+s2+s3+i k Ni F X Ni XMi  − H ( ) H Ni Xs1+s2+s3+i k Mi Mi, (4.80) i ∈ I[1, s4].

The right-hand sides of these expressions contain the unknown matrix X,soitis impossible to implement the algorithms in (4.77)–(4.80). By applying the hierarchical principle, the unknown variable X in these expressions needs to be replaced with its estimates Xi(k), i ∈ I[1, s1 + s2 + s3 + s4]. Hence, one has

X (k + 1) = X (k) + μAH (F − (X (k))) BH, i ∈ I[1, s ], i i i  i i  1 ( + ) = ( ) + μ H − ( ( )) H ∈ I[ , ] Xs1+i k 1 Xs1+i k Ci F Xs1+i k Di , i 1 s2 ,   ( + ) = ( ) + μ T − T( ( )) Xs1+s2+i k 1 Xs1+s2+i k Hi F Xs1+s2+i k Gi, ∈ I[ , ] i 1 s3 ,   ( + ) = ( ) + μ H − H( ( )) Xs1+s2+s3+i k 1 Xs1+s2+s3+i k Ni F Xs1+s2+s3+i k Mi,

i ∈ I[1, s4].

Taking the average of Xi(k), i ∈ I[1, s1 + s2 + s3 + s4], one can obtain the following iterative algorithm

X (k + 1) = X(k) + μAH (F − (X(k))) BH, i ∈ I[1, s ], (4.81) i i   i 1 ( + ) = ( ) + μ H − ( ( )) H ∈ I[ , ] Xs1+i k 1 X k Ci F X k Di , i 1 s2 , (4.82)   ( + ) = ( ) + μ T − T( ( )) ∈ I[ , ] Xs1+s2+i k 1 X k Hi F X k Gi, i 1 s3 , (4.83) 4.3 Complex Conjugate Matrix Equations with Transpose of Unknowns 151   H H Xs +s +s +i(k + 1) = X(k) + μNi F − (X(k)) Mi, i ∈ I[1, s4], (4.84) 1 2 3  s1+s2+s3+s4 Xi(k) X(k) = i=1 . (4.85) s1 + s2 + s3 + s4

This algorithm can be equivalently rewritten as

X(k + 1)   s1 s2   μ H H = X(k) + AH (F − (X(k))) BH + C F − (X(k)) D 4 i i i i (4.86) τ = = τ= s i 1 i 1 1  μ s3   s4   + H FT − T(X(k)) G + N FH − H(X(k)) M , 4 i i i i sτ i=1 i=1 τ=1 where (X) is defined as (4.64). Denote μ θ = . s1 + s2 + s3 + s4

Then one can give the following algorithm for the matrix equation (4.63) ⎧   s1 s2 ⎪ R(k) = F − A X(k)B − C X(k)D ⎪ = l l = l l ⎪ ls 1 l 1s ⎨⎪ 3 T 4 H − GlX (k)Hl − MlX (k)Nl l=1  l=1  s1 s2 . (4.87) ⎪ H H H H ⎪ X(k + 1) = X(k) + θ A R(k)B + Ci R(k)Di + ⎪ i=1 i i i=1 ⎪ s s ⎩ 3 T 4 H HiR (k)Gi + NiR (k)Mi i=1 i=1

4.3.1 Convergence Analysis

Theorem 4.5 If the matrix equation in (4.63) has a unique solution X = X∗, then the iterative solution X(k) given by the algorithm (4.87), converges to X∗ if

2 0 <θ< (4.88)  2 T 2 with

s1     s2   = ⊗ H + ( ) ⊗ T T Bi σ Ai σ Di σ Pn Ci σ Pm i=1 i=1  s3     s4   + H ⊗ + T ⊗ ( ) ( , ) Gi σ Hi σ Mi σ Pm Ni σ Pn P 2n 2m , i=1 i=1 152 4 Hierarchical-Update-Based Iterative Approaches where (·)σ is the real representation of complex matrices in Sect.2.6,Pm is defined as in (2.64), and P (2n, 2m) is defined as in (2.4). Proof The expression in (4.64) is used. Define the error matrix ˜ X(k) = X(k) − X∗.

Since X = X∗ is the unique solution of the equation, then (X∗) = F. Using this relation, (4.87) and the expression (4.64), one gives

X˜ (k + 1) = X(k + 1) − X∗ = ( ) − + Xk X∗  s1 s2   θ H ( ( ) − ( ( ))) H + H ( ) − ( ( )) H + Ai X∗ X k Bi Ci X∗ X k Di  i=1 i=1  s3   s4   T T H H θ Hi (X∗) − (X(k)) Gi + Ni (X∗) − (X(k)) Mi i=1  i=1  s1 s2 = ˜ ( ) − θ H ( ˜ ( )) H + H ( ˜ ( )) H X k Ai X k Bi Ci X k Di  i=1 i=1  s3 s4 T ˜ H ˜ −θ Hi (X(k))Gi + Ni (X(k))Mi . i=1 i=1

Denote

s1 s2 ˜ ˜ ˜ Z(k) = (X(k)) = AlX(k)Bl + ClX(k)Dl + (4.89) l=1 l=1 s3 s4 ˜ T ˜ H GlX (k)Hl + MlX (k)Nl. l=1 l=1

From the preceding two expressions one can obtain

˜ ( + ) X k 1  s1 s2 = ˜ ( ) − θ H ( ) H + H ( ) H+ X k Ai Z k Bi Ci Z k Di i=1 i=1  s3 s4 T H HiZ (k)Gi + NiZ (k)Mi . i=1 i=1

Recall the well-known fact that tr (AB) = tr (BA). Then by simple calculations it follows that 4.3 Complex Conjugate Matrix Equations with Transpose of Unknowns 153    2 X˜ (k + 1)

= tr X˜ H(k + 1)X˜ (k + 1) %    s s  2 1 2 =  ˜ ( ) − θ H ( ) H + H ( ) H+ X k 2 Re tr Ai Z k Bi Ci Z k Di i=1 i=1  ⎤⎫ H s3 s4 ⎬ T( ) + H( ) ˜ ( )⎦ HiZ k Gi NiZ k Mi X k ⎭ (4.90) = = i 1  i 1  s1 s2 + θ 2  H ( ) H + H ( ) H+  Ai Z k Bi Ci Z k Di = = i 1 i 1  2 s3 s4  T( ) + H( )  HiZ k Gi NiZ k Mi . i=1 i=1

By applying properties of the matrix trace, one has ⎧ ⎡  ⎤⎫ H ⎨ s1 s2 ⎬ ⎣ H ( ) H + H ( ) H ˜ ( )⎦ Re ⎩tr Ai Z k Bi Ci Z k Di X k ⎭ i=1 i=1 %  ) s1 s2 H ˜ H ˜ = Re tr Z (k)AiX(k)Bi + Z(k) CiX(k)Di ⎧ ⎡⎛i=1 i=1 ⎞⎤⎫ ⎨ s1 s2 ⎬ = ⎣⎝ H( ) ˜ ( ) + ( )H ˜ ( ) ⎠⎦ Re ⎩tr Z k AiX k Bi Z k CiX k Di ⎭ i=1 i=1 %   ) s1 s2 H ˜ ˜ = Re tr Z (k) AiX(k)Bi + CiX(k)Di , i=1 i=1 and ⎧ ⎡  ⎤⎫ H ⎨ s3 s4 ⎬ ⎣ T( ) + H( ) ˜ ( )⎦ Re ⎩tr HiZ k Gi NiZ k Mi X k ⎭ i=1 i=1 %   ) s3 s4 ˜ H T H = Re tr X (k) HiZ (k)Gi + NiZ (k)Mi %  i=1 i=1 ) s3 s4 T ˜ H H ˜ H = Re tr Z (k)GiX (k)Hi + Z (k)MiX (k)Ni i=1 i=1 154 4 Hierarchical-Update-Based Iterative Approaches ⎧ ⎡⎛ ⎞⎤⎫ ⎨ s3 s4 ⎬ = ⎣⎝ T( ) ˜ H( ) + H( ) ˜ H( ) ⎠⎦ Re ⎩tr Z k GiX k Hi Z k MiX k Ni ⎭ i=1 i=1 %   ) s3 s4 H ˜ T ˜ H = Re tr Z (k) GiX (k)Hi + MiX (k)Ni . i=1 i=1

Substituting the preceding two relations into (4.90), gives    2 X˜ (k + 1) %   s1 s2 H ˜ ˜ =−2θ Re tr Z (k) AiX(k)Bi + CiX(k)Di+ i=1 )i=1 s3 s4 ˜ T ˜ H GiX (k)Hi + MiX (k)Ni i=1  i=1    s s  2 1 2 +  ˜ ( ) + θ 2  H ( ) H + H ( ) H+ X k  Ai Z k Bi Ci Z k Di (4.91) = = i 1  i 1 2 s3 s4  T( ) + H( )  HiZ k Gi NiZ k Mi i=1 i=1     s s  2 1 2 =  ˜ ( ) − θ  ( )2 + θ 2  H ( ) H + H ( ) H+ X k 2 Z k  Ai Z k Bi Ci Z k Di = = i 1  i 1 2 s3 s4  T( ) + H( )  HiZ k Gi NiZ k Mi . i=1 i=1

In addition, by using Lemmas 2.13 and 2.15, and Propositions 2.6 and 2.7, one has    2 s1 s2 s3 s4   H ( ) H + H ( ) H + T( ) + H( )   Ai Z k Bi Ci Z k Di HiZ k Gi NiZ k Mi i=1 i=1 i=1 i=1     2  s1 s2 s3 s4  = 1  H ( ) H + H ( ) H + T( ) + H( )   Ai Z k Bi Ci Z k Di HiZ k Gi NiZ k Mi  2  = = = =   i 1 i 1 i 1 i 1 σ s1       s2       1  H H H H =  A Z(k) B + C Pm Z(k) Pn D + (4.92) 2  i σ σ i σ i σ σ i σ i=1 i=1  s3   s4   2   T   T  H Z(k) G + (N )σ P Z(k) P (M )σ  i σ σ i σ i n σ m i  i=1 i=1 4.3 Complex Conjugate Matrix Equations with Transpose of Unknowns 155     1  2 = T vec Z(k)  2 σ     1  2 ≤ T2 vec Z(k)  2 2 σ =  2  ( )2 T 2 Z k .

Then, combining (4.91) with (4.92), yields      2  2    ˜ ( + ) ≤  ˜ ( ) − θ − θ  2  ( )2 X k 1 X k 2 T 2 Z k    2    ≤  ˜ ( − ) − θ − θ  2  ( − )2 +  ( )2 X k 1 2 T 2 Z k 1 Z k   k  2    ≤  ˜ ( ) − θ − θ  2  ( )2 X 0 2 T 2 Z i . i=0

If the parameter θ is chosen to satisfy (4.88), then one has

k       2 <θ − θ  2  ( )2 ≤  ˜ ( ) 0 2 T 2 Z i X 0 . i=0

Thus ∞       2 <θ − θ  2  ( )2 ≤  ˜ ( ) 0 2 T 2 Z i X 0 . i=0

It follows from the convergence theorem of series that limi→0 Z(i) = 0. Since the matrix equation (4.63) has a unique solution, then it follows from the expression ˜ (4.89)ofZ(k) that limi→0 X(i) = 0. The proof is thus completed.  Theorem 4.5 provides a sufficient condition to guarantee the convergence of the algorithm (4.87). However, the convergence condition (4.88) involves Kronecker products and the real representations of the coefficient matrices. This may bring high dimensionality problem. To overcome this drawback, the following result is given.

Corollary 4.6 If the matrix equation in (4.63) has a unique solution X = X∗, then the iterative solution X(k) given by algorithm (4.87), converges to X∗ if

1 0 <θ< , (4.93) 2h with s1 s2 =  2  2 +  2  2 + h s1 Bi 2 Ai 2 s2 Di 2 Ci 2 i=1 i=1 s3 s4  2  2 +  2  2 s3 Gi 2 Hi 2 s4 Mi 2 Ni 2 . i=1 i=1 156 4 Hierarchical-Update-Based Iterative Approaches

Proof It follows from Lemma 2.11 that for two arbitrary matrices A and B

 ⊗  =     A B 2 A 2 B 2 . (4.94)

Applying this fact and Lemma 2.15, one has     s1     s2     2 ≤  ⊗ H + ( ) ⊗ T  + T 2  Bi σ Ai σ Di σ Pn Ci σ Pm i= i=  1 1 2     2  s3     s4     H ⊗ + T ⊗ ( ) ( , )  Gi σ Hi σ Mi σ Pm Ni σ Pn P 2n 2m  = =  i 1 i 1  2  2 s1     s2    ≤  ⊗ H + ( ) ⊗ T  2  Bi σ Ai σ Di σ Pn Ci σ Pm = = i 1 i 1 2   2 s3     s4    +  H ⊗ + T ⊗ ( )  2  Gi σ Hi σ Mi σ Pm Ni σ Pn = =  i 1  i 1 2  2  2 s1      s2    ≤  ⊗ H  +  ( ) ⊗ T  4  Bi σ Ai σ  4  Di σ Pn Ci σ Pm = = i 1 2  i 1 2   2  2 s3      s4    +  H ⊗  +  T ⊗ ( )  4  Gi σ Hi σ  4  Mi σ Pm Ni σ Pn i=1 2 i=1 2 s1      s2      H 2  T 2 ≤ 4s B ⊗ A + 4s (D )σ P ⊗ C P 1 i σ i σ 2 2 i n i σ m 2 i=1 i=1 s3      s4     H 2  T 2 +4s G ⊗ H + 4s M P ⊗ (N )σ P 3 i σ i σ 2 4 i σ m i n 2 i=1 i=1 s1       s2     2  H 2 2  T 2 = 4s B A + 4s (D )σ P  C P 1 i σ 2 i σ 2 2 i n 2 i σ m 2 i=1 i=1 s3       s4     H 2  2  T 2 2 +4s G H + 4s M P (N )σ P  3 i σ 2 i σ 2 4 i σ m 2 i n 2 i=1 i=1 s1 s2 =  2  2 +  2  2 4s1 Bi 2 Ai 2 4s2 Di 2 Ci 2 i=1 i=1 s3 s4 +  2  2 +  2  2 4s3 Gi 2 Hi 2 4s4 Mi 2 Ni 2 . i=1 i=1

Substituting this relation into (4.88), gives (4.93).  4.3 Complex Conjugate Matrix Equations with Transpose of Unknowns 157

Remark 4.3 Compared with Theorem 4.5, the result of Corollary 4.6 has two advan- tages. One is that the convergence factor guaranteeing the convergence of the algo- rithm is given in terms of original coefficient matrices instead of their real represen- tations. The other advantage is that the condition (4.93) is easier to compute than the condition (4.88). Nevertheless, it is obvious that the result of Corollary 4.6 is more conservative than that of Theorem 4.5.

Remark 4.4 By some numerical examples, it is easily found that the condition (4.88) is also necessary to guarantee the convergence of the algorithm (4.87). It is a pity that such a fact can not be proven in this book.

4.3.2 A Numerical Example

Example 4.4 Consider the following matrix equation

T H AXB + C1XD1 + C2XD2 + GX H + MX N = F with ⎡ ⎤ 0.1901 + 0.0957i 0.1115 + 0.0871i A = ⎣ 0.1748 + 0.2037i 0.2021 + 0.2052i ⎦ , 0.0696 + 0.0102i 0.1855 + 0.0183i ⎡ ⎤ 0.0289 + 0.0122i −0.0049 − 0.0461i B = ⎣ −0.0590 + 0.0585i −0.1146 − 0.0670i ⎦ , 0.1446 − 0.0094i 0.0656 − 0.0692i ⎡ ⎤ 0.2088 + 0.2350i −0.0845 − 0.1455i ⎣ ⎦ C1 = −0.2714 + 0.0306i 0.4106 + 0.0597i , 0.1717 − 0.1197i −0.0567 + 0.2682i ⎡ ⎤ 0.2777 + 0.4535i −0.2948 − 0.7904i ⎣ ⎦ D1 = −0.0416 + 0.3890i −0.4324 − 0.6142i , 0.4598 + 0.4548i −0.2362 − 0.4683i ⎡ ⎤ 0.1926 + 0.7194i 0.1141 + 0.0655i ⎣ ⎦ C2 = −0.3338 − 0.0587i 0.6581 + 0.5978i , 0.6749 + 0.1052i −0.3770 + 0.2826i ⎡ ⎤ 0.0984 + 0.1469i −0.0122 + 0.0458i ⎣ ⎦ D2 = 0.0329 + 0.4375i −0.4600 + 0.3184i , 0.5441 + 0.0915i 0.2819 − 0.3533i 158 4 Hierarchical-Update-Based Iterative Approaches ⎡ ⎤ 0.1237 + 0.2128i −0.4594 − 0.0366i 0.1926 + 0.2333i G = ⎣ 0.1857 + 0.0786i 0.0834 − 0.1151i 0.2473 + 0.6756i ⎦ , 0.2904 + 0.6065i 0.1351 + 0.0427i 0.2332 + 0.1616i

−0.3472 + 0.3338i −0.5500 + 0.0757i H = , 0.8478 − 0.1679i 0.2089 − 0.2269i ⎡ ⎤ 0.0406 + 0.4017i 0.1661 + 0.0799i 0.2328 − 0.0985i M = ⎣ 0.1622 − 0.3010i 0.2929 + 0.2389i 0.0493 + 0.4495i ⎦ , 0.0644 − 0.0274i −0.3631 − 0.3752i 0.0573 + 0.2070i

−0.1572 − 0.1486i 0.5025 + 0.7115i N = , 0.3511 + 0.9164i −0.0133 + 0.0711i ⎡ ⎤ 0.58304691 + 0.06289235i 1.08318841 − 1.30293501i F = ⎣ 0.94040055 − 3.2322254i 0.63675634 + 1.3735854i ⎦ . 0.02211352 − 1.3096752i −0.64130511 − 0.306664i

This matrix equation has a unique solution

−101+ 2i X∗ = . 2 + i −1 + i −2 − 2i

According to Theorem 4.5, the algorithm (4.87) for this example is convergent if 0 <θ<0.3871. According to Corollary 4.6, the algorithm (4.87) for this example is convergent if 0 <θ<0.1232. Define the iteration error as

δ(k) = X∗ − X(k) . (4.95)

Shown in Fig. 4.4 are the iteration errors for different parameters θ when the initial value takes X(0) = 03×2. The iterative solutions X(k) are given in Tables 4.3 and 4.4 for θ = 0.3500. In those two tables,

x (k) x (k) x (k) X (k) = 11 12 13 . x21 (k) x22 (k) x23 (k)

4.4 Notes and References

In this chapter, iterative algorithms have been established for solving some com- plex conjugate matrix equations by hierarchical identification principle [53, 54]. These matrix equations include the so-called extended con-Sylvester matrix equa- tions, the coupled con-Sylvester matrix equations, and the complex conjugate matrix equations with the transpose of unknown matrices. With the aid of the principle of hierarchical identification, the involved matrix equation is first decomposed into some 4.4 Notes and References 159

5

4.5

4

3.5

3 θ=0.0800

δ 2.5 θ=0.2000 2 θ=0.3860

1.5 θ=0.3500

1 θ=0.2800

0.5

0 0 100 200 300 400 500 600 700 800 900 1000 Iteration Steps

Fig. 4.4 The iteration errors versus k

Table 4.3 The iterative solutions to x11, x12, x13 in Example4.3 for θ = 0.3500 k x11 x12 x13 δ 20 −0.6650 + 0.3360i −0.1593 − 0.2022i 0.7792 + 1.9193i 1.2611 40 −0.8046 + 0.1935i −0.0987 − 0.0438i 0.8239 + 2.0945i 0.7364 60 −0.8969 + 0.0807i −0.0625 + 0.0055i 0.8575 + 2.1254i 0.5144 80 −0.9515 + 0.0221i −0.0376 + 0.0194i 0.8820 + 2.1204i 0.3896 100 −0.9847 − 0.0034i −0.0185 + 0.0223i 0.9007 + 2.1056i 0.3083 120 −1.0057 − 0.0121i −0.0033 + 0.0217i 0.9153 + 2.0894i 0.2507 140 −1.0197 − 0.0130i 0.0088 + 0.0201i 0.9269 + 2.0744i 0.2083 160 −1.0292 − 0.0107i 0.0182 + 0.0182i 0.9361 + 2.0615i 0.1768 180 −1.0358 − 0.0074i 0.0254 + 0.0165i 0.9436 + 2.0507i 0.1534 200 −1.0405 − 0.0040i 0.0310 + 0.0150i 0.9495 + 2.0419i 0.1363 220 −1.0438 − 0.0010i 0.0352 + 0.0137i 0.9544 + 2.0348i 0.1237 240 −1.0461 + 0.0015i 0.0382 + 0.0126i 0.9584 + 2.0291i 0.1145 260 −1.0476 + 0.0035i 0.0405 + 0.0117i 0.9616 + 2.0245i 0.1076 280 −1.0486 + 0.0051i 0.0420 + 0.0109i 0.9643 + 2.0208i 0.1024 300 −1.0491 + 0.0064i 0.0430 + 0.0102i 0.9665 + 2.0179i 0.0983 Solution −1 0 1 + 2i

sub-equations, and then some intermediate iterative algorithms are constructed for these sub-equations. Then, the unknown matrices in intermediate algorithms are replaced with their estimates, and thus the final algorithms are obtained by taking the average of intermediate estimates. Sufficient conditions are proposed for these iterative algorithms to be convergent in terms of the real representation matrices of 160 4 Hierarchical-Update-Based Iterative Approaches

Table 4.4 The iterative solutions to x21, x22, x23 in Example4.3 for θ = 0.3500 k x21 x22 x23 δ 20 1.2052 + 1.3347i −1.0255 + 0.7817i −1.8109 − 1.3551i 1.2611 40 1.6450 + 1.2996i −0.9503 + 0.9131i −2.0044 − 1.5650i 0.7364 60 1.7926 + 1.2276i −0.9302 + 0.9470i −2.0407 − 1.6786i 0.5144 80 1.8600 + 1.1772i −0.9279 + 0.9566i −2.0433 − 1.7575i 0.3896 100 1.8970 + 1.1424i −0.9320 + 0.9615i −2.0382 − 1.8144i 0.3083 120 1.9197 + 1.1174i −0.9379 + 0.9654i −2.0317 − 1.8561i 0.2507 140 1.9347 + 1.0987i −0.9439 + 0.9688i −2.0257 − 1.8875i 0.2083 160 1.9452 + 1.0845i −0.9494 + 0.9719i −2.0205 − 1.9114i 0.1768 180 1.9529 + 1.0733i −0.9542 + 0.9745i −2.0163 − 1.9299i 0.1534 200 1.9587 + 1.0645i −0.9583 + 0.9767i −2.0128 − 1.9442i 0.1363 220 1.9632 + 1.0575i −0.9617 + 0.9786i −2.0101 − 1.9555i 0.1237 240 1.9668 + 1.0519i −0.9645 + 0.9801i −2.0078 − 1.9644i 0.1145 260 1.9697 + 1.0473i −0.9669 + 0.9814i −2.0061 − 1.9714i 0.1076 280 1.9720 + 1.0435i −0.9689 + 0.9825i −2.0047 − 1.9769i 0.1024 300 1.9739 + 1.0405i −0.9706 + 0.9835i −2.0036 − 1.9813i 0.0983 Solution 2 + i −1 + i −2 − 2i

the coefficient matrices. In addition, convergence conditions that are easier to com- pute are also given by the original coefficient matrices. During the derivation, some properties of Kronecker products and real representations play vital roles. In fact, with the hierarchical identification principle as tools some new identifi- cation methods have been proposed. In [178, 235], residual-based interactive sto- chastic gradient algorithms for controlled moving average models and Box–Jenkins models were established, respectively. In [179], an auxiliary-model-based multi- innovation stochastic gradient algorithm was derived for multiple-input single-output systems. In [181], the multi-innovation extended stochastic gradient algorithm was proposed for controlled autoregressive moving average models. In [57], a recursive least-squares algorithm was proposed to estimate the parameters of missing data systems. In [216], the hierarchical identification principle was used to identify the crosstalk functions. In [215], a novel transmultiplexer design method was proposed. The convergence rate and upper bound of the parameter estimation error were also investigated. It is worth pointing out that all the algorithms proposed in this chapter can be implemented in term of the original coefficient matrices. It is not required that the coefficient matrices are in any canonical forms. 4.4 Notes and References 161

Finally, some further comments are given for the approaches to deriving the iter- ative algorithms in Sects.4.2 and 4.3. In order to obtain the iterative algorithm (4.48) for solving the coupled con-Sylvester matrix equation (4.41), the hierarchical identi- fication principle is used. During this procedure, the equation (4.41) is decomposed into some equations in the form of AXB = F. In Subsection 4.1.1, iterative algo- rithms for the so-called extended con-Sylvester matrix equation have been given. An alternative approach to deriving iterative algorithm for the equation (4.41)isto decompose it into sub-equations in the form of AXB + CXD = F. Let us introduce the following intermediate matrices

p     il = Fi − AiηXηBiη + CiηXηDiη + AilXlBil + CilXlDil , (4.96) η=1 i ∈ I[1, N], l ∈ I[1, p].

With these matrices, the matrix equation (4.41) can be decomposed into the following matrix equations

AilXlBil + CilXlDil = il, i ∈ I[1, N], l ∈ I[1, p].

According to Theorem 4.1, for these matrix equations one can construct the following respective iterative forms   ( + ) = ( ) + μ H  − ( ) − ( ) H + Xli k 1 Xli k Ail il AilXli k Bil CilXli k Dil Bil   H H μCil il − AilXli(k)Bil − CilXli (k)Dil Dil , (4.97)

i ∈ I[1, N], l ∈ I[1, p].

The right-hand sides of these equations contain the unknown matrices Xη, η ∈ I[1, p], so it is impossible to implement these algorithms. In order to make the algorithms in (4.97) work, the unknown variable matrices Xη, η ∈ I[1, p] in (4.97) are respectively replaced with their estimates Xli(k), i ∈ I[1, N], l ∈ I[1, p]. In view of the expressions of il, i ∈ I[1, N], l ∈ I[1, p],in(4.96), one can obtain the following iterative forms: ⎡ ⎤ p   ( + ) = ( ) + μ H ⎣ − ( ) + ( ) ⎦ H + Xli k 1 Xli k Ail Fi AiηXηi k Biη CiηXηi k Diη Bil η=1 ⎡ ⎤ p   H ⎣ ⎦ H μCil Fi − AiηXηi (k) Biη + CiηXηi (k)Diη Dil . η=1 162 4 Hierarchical-Update-Based Iterative Approaches

For each matrix variable Xl, l ∈ I[1, p], taking the average of Xli(k), i ∈ I[1, N], gives the iterative algorithm ⎡ ⎤ p   ( + ) = ( ) + μ H ⎣ − ( ) + ( ) ⎦ H + Xli k 1 Xli k Ail Fi AiηXη k Biη CiηXη k Diη Bil η=1 ⎡ ⎤ p   H ⎣ ⎦ H μCil Fi − AiηXη (k) Biη + CiηXη (k)Diη Dil , η=1 1 N X (k) = X (k) . l N li i=1

This algorithm can be equivalently rewritten as ⎡ ⎤ μ p   H ⎣ ⎦ H X (k + 1) = X (k) + A F − A ηXη (k) B η + C ηXη (k)D η B + l l N il i i i i i il η=1 ⎡ ⎤ μ p   H ⎣ ⎦ H C F − A ηXη (k) B η + C ηXη (k)D η D . N il i i i i i il η=1

This is the algorithm in Sect. 4.2. Analogously, the algorithm in Sect. 4.3 can also be derived by first decomposing the matrix equation (4.63) into some sub-equations in the form of (4.23). The algorithm can finally be obtained by using the result in Subsection 4.1.2 and the hierarchical identification principle.

Chapter 5 Finite Iterative Approaches

In Chap. 3, Smith-type iterative algorithms are presented for con-Kalman-Yakubovich matrix equations. In Chap. 4, iterative approaches have been established for some more complicated complex conjugate matrix equations. A common feature of the approaches in Chaps. 3 and 4 is that they are only applicable to equations with unique solutions. Moreover, only approximate solutions can be obtained. In this chapter, another kind of iterative approach is given for solving complex conjugate matrix equations. With this kind of approach, it is not required that the considered matrix equation have a unique solution. In Sect. 5.1, the generalized con-Sylvester matrix equation is considered. In Sect. 5.2, the extended con-Sylvester matrix equations are investigated. The coupled con-Sylvester matrix equations are treated in Sect. 5.3. It is worth pointing out that the basic tool in this chapter is the real inner product investigated in Sect. 2.9. As in the previous chapters, $\mathrm{tr}(\cdot)$, $\mathrm{Re}(\cdot)$, and $\|\cdot\|$ are used to denote the trace, the real part, and the Frobenius norm of a matrix, respectively. In this chapter, the following properties are frequently used:

$$\mathrm{tr}(AB) = \mathrm{tr}(BA), \qquad \mathrm{tr}(C) = \mathrm{tr}(C^{T}),$$
$$\mathrm{tr}(C_1 + C_2) = \mathrm{tr}(C_1) + \mathrm{tr}(C_2), \qquad \mathrm{Re}(\overline{D}) = \mathrm{Re}(D),$$
for matrices with appropriate dimensions.
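For readers who wish to check these identities numerically, the following small script (our own illustration, assuming Python with NumPy; it is not part of the original text) verifies the listed properties on random complex matrices and also evaluates the real inner product $\langle A, B\rangle = \mathrm{Re}\,\mathrm{tr}(B^{H}A)$ from Sect. 2.9 that underlies the algorithms of this chapter.

```python
import numpy as np

def real_inner(A, B):
    # Real inner product <A, B> = Re tr(B^H A) on a complex matrix space,
    # viewed as a real linear space (see Sect. 2.9).
    return np.real(np.trace(B.conj().T @ A))

rng = np.random.default_rng(0)
n, p = 4, 3
A = rng.standard_normal((n, p)) + 1j * rng.standard_normal((n, p))
B = rng.standard_normal((p, n)) + 1j * rng.standard_normal((p, n))
C = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
C1 = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
D = rng.standard_normal((n, p)) + 1j * rng.standard_normal((n, p))

# tr(AB) = tr(BA), tr(C) = tr(C^T), tr(C1 + C2) = tr(C1) + tr(C2), Re(conj(D)) = Re(D)
assert np.isclose(np.trace(A @ B), np.trace(B @ A))
assert np.isclose(np.trace(C), np.trace(C.T))
assert np.isclose(np.trace(C + C1), np.trace(C) + np.trace(C1))
assert np.allclose(np.real(D.conj()), np.real(D))

# The norm induced by the real inner product is the Frobenius norm.
assert np.isclose(real_inner(A, A), np.linalg.norm(A, 'fro')**2)
```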

5.1 Generalized Con-Sylvester Matrix Equations

In this section, generalized con-Sylvester matrix equations of the following form are considered:
$$AX + BY = E\overline{X}F + S, \qquad (5.1)$$

where $A, E \in \mathbb{C}^{n\times n}$, $B \in \mathbb{C}^{n\times r}$, $F \in \mathbb{C}^{p\times p}$, and $S \in \mathbb{C}^{n\times p}$ are known matrices, and $X \in \mathbb{C}^{n\times p}$ and $Y \in \mathbb{C}^{r\times p}$ are the matrices to be determined. When $A = I$, this matrix equation becomes the con-Yakubovich matrix equation investigated in [266]; when $B = 0$ and $E = I$, the matrix equation (5.1) becomes the normal con-Sylvester matrix equation of the form (1.39); when $B = 0$ and $A = I$, it becomes the con-Kalman-Yakubovich matrix equation investigated in Chap. 3. In addition, when all the matrices are required to be real, the matrix equation (5.1) becomes the generalized nonhomogeneous Sylvester matrix equation. This is why the equation (5.1) is called the generalized con-Sylvester matrix equation.
The generalized con-Sylvester matrix equation (5.1) has infinitely many solutions, and thus cannot be solved by the approaches in Chap. 4. In this section, a finite iterative algorithm is proposed for solving the generalized con-Sylvester matrix equation (5.1). With the real inner product defined on complex matrix spaces as a tool, it is proven that the proposed algorithm gives an exact solution for any initial values within finitely many iteration steps in the absence of round-off errors. By specializing the proposed algorithm, iterative algorithms for the normal con-Sylvester, con-Kalman-Yakubovich, and generalized Sylvester matrix equations are also established.

5.1.1 Main Results

For the generalized con-Sylvester matrix equation (5.1), the following algorithm is presented to solve it.

Algorithm 5.1 (Finite iterative algorithm for (5.1)) 1. Given initial values X(0) and Y(0), calculate

$$R(0) = S + E\overline{X(0)}F - BY(0) - AX(0) \in \mathbb{C}^{n\times p},$$
$$P(0) = A^{H}R(0) - \overline{E^{H}R(0)F^{H}} \in \mathbb{C}^{n\times p},$$
$$Q(0) = B^{H}R(0) \in \mathbb{C}^{r\times p},$$
k := 0;

2. If R(k) = 0, then stop; else, k := k + 1; 3. Calculate

$$X(k+1) = X(k) + \frac{\|R(k)\|^{2}}{\|P(k)\|^{2} + \|Q(k)\|^{2}}\,P(k) \in \mathbb{C}^{n\times p},$$
$$Y(k+1) = Y(k) + \frac{\|R(k)\|^{2}}{\|P(k)\|^{2} + \|Q(k)\|^{2}}\,Q(k) \in \mathbb{C}^{r\times p},$$
$$R(k+1) = S + E\overline{X(k+1)}F - BY(k+1) - AX(k+1),$$

$$P(k+1) = A^{H}R(k+1) - \overline{E^{H}R(k+1)F^{H}} + \frac{\|R(k+1)\|^{2}}{\|R(k)\|^{2}}\,P(k),$$
$$Q(k+1) = B^{H}R(k+1) + \frac{\|R(k+1)\|^{2}}{\|R(k)\|^{2}}\,Q(k);$$

4. Go to Step 2.
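The following sketch (our own Python/NumPy illustration of Algorithm 5.1 as reconstructed above; the function name, tolerance, and iteration cap are ours) shows one possible implementation, together with a consistency check on a randomly generated solvable instance.

```python
import numpy as np

def solve_gen_con_sylvester(A, B, E, F, S, X0, Y0, tol=1e-10, max_iter=None):
    """Sketch of Algorithm 5.1 for A X + B Y = E conj(X) F + S."""
    n, p = S.shape
    if max_iter is None:
        max_iter = 4 * n * p     # 2np suffices in exact arithmetic (Theorem 5.1)
    X, Y = X0.copy(), Y0.copy()
    R = S + E @ X.conj() @ F - B @ Y - A @ X
    P = A.conj().T @ R - (E.conj().T @ R @ F.conj().T).conj()
    Q = B.conj().T @ R
    for _ in range(max_iter):
        if np.linalg.norm(R) <= tol:
            break
        alpha = np.linalg.norm(R)**2 / (np.linalg.norm(P)**2 + np.linalg.norm(Q)**2)
        X, Y = X + alpha * P, Y + alpha * Q
        R_new = S + E @ X.conj() @ F - B @ Y - A @ X
        beta = np.linalg.norm(R_new)**2 / np.linalg.norm(R)**2
        P = A.conj().T @ R_new - (E.conj().T @ R_new @ F.conj().T).conj() + beta * P
        Q = B.conj().T @ R_new + beta * Q
        R = R_new
    return X, Y, np.linalg.norm(R)

# Consistency check on a random solvable instance: build S from a chosen (X*, Y*).
rng = np.random.default_rng(1)
n, r, p = 4, 3, 3
cplx = lambda *s: rng.standard_normal(s) + 1j * rng.standard_normal(s)
A, B, E, F = cplx(n, n), cplx(n, r), cplx(n, n), cplx(p, p)
Xs, Ys = cplx(n, p), cplx(r, p)
S = A @ Xs + B @ Ys - E @ Xs.conj() @ F
X, Y, res = solve_gen_con_sylvester(A, B, E, F, S,
                                    np.zeros((n, p), complex), np.zeros((r, p), complex))
print(res)   # residual Frobenius norm; typically very small
```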

In the sequel, the convergence properties are analyzed for the proposed Algorithm 5.1. First, three lemmas are proposed to characterize its basic properties.

Lemma 5.1 Let $(X, Y) = (X_{*}, Y_{*})$ be a solution of the generalized con-Sylvester matrix equation (5.1). Then, for any initial matrices $X(0)$ and $Y(0)$, the sequences $\{X(k)\}$, $\{Y(k)\}$, $\{P(k)\}$, $\{Q(k)\}$, and $\{R(k)\}$ generated by Algorithm 5.1 satisfy
$$\mathrm{Re}\left\{\mathrm{tr}\left[P^{H}(k)\left(X_{*} - X(k)\right)\right] + \mathrm{tr}\left[Q^{H}(k)\left(Y_{*} - Y(k)\right)\right]\right\} = \|R(k)\|^{2}, \quad k = 0, 1, \ldots. \qquad (5.2)$$

Proof This assertion is proven by induction principle. When k = 0, in view that (X, Y) = (X∗, Y∗) is a solution to the generalized con-Sylvester matrix equation (5.1) it follows from Algorithm 5.1 that      H H Re tr P (k) (X∗ − X(k)) + tr Q (k) (Y∗ − Y(k))     H H H = Re tr R (0)A − FR(0) E (X∗ − X(0)) + tr R (0)B (Y∗ − Y(0))   H H = Re tr R (0)A (X∗ − X(0)) − R(0) E (X∗ − X(0)) F +   H tr R(0) B (Y∗ − Y(0))    H = Re tr R (0) (A (X∗ − X(0)) + B (Y∗ − Y(0))) −  H tr R(0) E (X∗ − X(0)) F    H = Re tr R (0) (A (X∗ − X(0)) + B (Y∗ − Y(0))) −  H tr R(0) E (X∗ − X(0)) F    H = Re tr R (0) (A (X∗ − X(0)) + B (Y∗ − Y(0))) −   H tr R (0)E X∗ − X(0) F    = Re tr RH(0) S − AX(0) − BY(0) + EX(0)F = R(0)2 .

This implies that (5.2) holds for k = 0.

Now it is assumed that (5.2) holds for $k = \omega$, $\omega \ge 0$, that is,
$$\mathrm{Re}\left\{\mathrm{tr}\left[P^{H}(\omega)\left(X_{*} - X(\omega)\right)\right] + \mathrm{tr}\left[Q^{H}(\omega)\left(Y_{*} - Y(\omega)\right)\right]\right\} = \|R(\omega)\|^{2}. \qquad (5.3)$$

With this induction assumption and the fact that (X, Y) = (X∗, Y∗) is a solution to the matrix equation (5.1), it follows from Algorithm 5.1 that      H H Re tr P (ω + 1) (X∗ − X(ω + 1)) + tr Q (ω + 1) (Y∗ − Y(ω + 1))   H H = Re tr R (ω + 1)A − FR(ω + 1) E (X∗ − X(ω + 1))     (ω + )2 H R 1 + Re tr R (ω + 1)B (Y∗ − Y(ω + 1)) + ·  (ω)2     R  H H Re tr P (ω) (X∗ − X(ω + 1)) + tr Q (ω) (Y∗ − Y(ω + 1))    H = Re tr R (ω + 1) (A (X∗ − X(ω + 1)) + B (Y∗ − Y(ω + 1)))   H − Re tr R(ω + 1) E (X∗ − X(ω + 1)) F  (ω + )2      R 1 H H + Re tr P (ω) (X∗ − X(ω)) + tr Q (ω) (Y∗ − Y(ω)) R(ω)2 R(ω + 1)2 R(ω)2      − Re tr PH(ω)P(ω) + tr QH(ω)Q(ω)  (ω)2  (ω)2 +  (ω)2  R P Q  H = Re tr R (ω + 1) (A (X∗ − X(ω + 1)) + B (Y∗ − Y(ω + 1)))  H − Re tr R(ω + 1) E (X∗ − X(ω + 1)) F

R(ω + 1)2 + · R(ω)2 − R(ω)2 R(ω + 1)2 R(ω)2   P(ω)2 + Q(ω)2  (ω)2  (ω)2 +  (ω)2 R  P  Q  = Re tr RH(ω + 1) S − AX(ω + 1) − BY(ω + 1) + EX (ω + 1)F = R(ω + 1)2 .

This implies that (5.2) holds for k = ω+1. With the preceding argument, the relation (5.2) holds for all integer k ≥ 0 by induction principle. 

Lemma 5.2 For the sequences $\{P(k)\}$, $\{Q(k)\}$, and $\{R(k)\}$ generated by Algorithm 5.1, if there exists an integer $\varphi \ge 1$ such that $R(k) \ne 0$ for all $k \in I[0,\varphi]$, and $P(-1) = 0$, $Q(-1) = 0$, then the following relation holds
$$\begin{aligned}
\mathrm{Re}\,\mathrm{tr}\left[R^{H}(k+1)R(j)\right]
&= \mathrm{Re}\,\mathrm{tr}\left[R^{H}(k)R(j)\right]
- \frac{\|R(k)\|^{2}}{\|P(k)\|^{2} + \|Q(k)\|^{2}}\,
\mathrm{Re}\left\{\mathrm{tr}\left[P^{H}(k)P(j)\right] + \mathrm{tr}\left[Q^{H}(k)Q(j)\right]\right\}\\
&\quad + \frac{\|R(k)\|^{2}}{\|P(k)\|^{2} + \|Q(k)\|^{2}}\cdot\frac{\|R(j)\|^{2}}{\|R(j-1)\|^{2}}\,
\mathrm{Re}\left\{\mathrm{tr}\left[P^{H}(k)P(j-1)\right] + \mathrm{tr}\left[Q^{H}(k)Q(j-1)\right]\right\}
\end{aligned}$$
for $k, j \in I[0,\varphi]$.

Proof From Algorithm 5.1, one has

$$\begin{aligned}
R(k+1) &= S + E\overline{X(k+1)}F - AX(k+1) - BY(k+1)\\
&= S + E\left(\overline{X(k)} + \frac{\|R(k)\|^{2}}{\|P(k)\|^{2}+\|Q(k)\|^{2}}\,\overline{P(k)}\right)F
- A\left(X(k) + \frac{\|R(k)\|^{2}}{\|P(k)\|^{2}+\|Q(k)\|^{2}}\,P(k)\right)\\
&\quad - B\left(Y(k) + \frac{\|R(k)\|^{2}}{\|P(k)\|^{2}+\|Q(k)\|^{2}}\,Q(k)\right)\\
&= S + E\overline{X(k)}F - AX(k) - BY(k)
- \frac{\|R(k)\|^{2}}{\|P(k)\|^{2}+\|Q(k)\|^{2}}\left(AP(k) + BQ(k) - E\overline{P(k)}F\right) \qquad (5.4)\\
&= R(k) - \frac{\|R(k)\|^{2}}{\|P(k)\|^{2}+\|Q(k)\|^{2}}\left(AP(k) + BQ(k) - E\overline{P(k)}F\right).
\end{aligned}$$

It follows from this relation that    Re tr RH (k + 1) R (j) ⎧ ⎡  ⎤⎫ ⎨  H ⎬ R(k)2 = Re tr ⎣ R(k) + EP (k)F − AP(k) − BQ(k) R (j)⎦ ⎩ P(k)2 + Q(k)2 ⎭

   R(k)2 = Re tr RH(k)R (j) + ·  ( )2 +  ( )2   P k Q k H    Re tr P (k) EHR (j) FH − tr PH(k)AH + QH(k)BH R (j)    R(k)2 = Re tr RH(k)R (j) + · P(k)2 + Q(k)2  H    Re tr P (k) EHR (j) FH − tr PH(k)AH + QH(k)BH R (j) 168 5 Finite Iterative Approaches

   R(k)2 = Re tr RH(k)R (j) + ·  ( )2 +  ( )2    P k Q k H H   Re tr PH(k) E R(j)F − AHR (j) − tr QH(k)BHR (j)    R(k)2 = Re tr RH(k)R (j) − ·  ( )2 +  ( )2    P k Q k  R(j)2 Re tr PH(k) P(j) − P(j − 1) R(j − 1)2     R(k)2 R(j)2 − Re tr QH(k) Q(j) − Q(j − 1) . P(k)2 + Q(k)2 R(j − 1)2

This is the conclusion of this lemma. 

Lemma 5.3 Suppose that the sequences $\{X(k)\}$, $\{Y(k)\}$, $\{P(k)\}$, $\{Q(k)\}$, and $\{R(k)\}$ are generated by Algorithm 5.1 with arbitrary initial matrices $X(0)$ and $Y(0)$. If there exists an integer $\varphi \ge 1$ such that $R(k) \ne 0$ for all $k \in I[0,\varphi]$, then
$$\mathrm{Re}\left\{\mathrm{tr}\left[P^{H}(k)P(j)\right] + \mathrm{tr}\left[Q^{H}(k)Q(j)\right]\right\} = 0, \qquad \mathrm{Re}\,\mathrm{tr}\left[R^{H}(k)R(j)\right] = 0 \qquad (5.5)$$
for $j, k \in I[0,\varphi]$, $j \ne k$.

Proof In view of          Re tr AHB = Re tr BTA = Re tr BTA       = Re tr BTA = Re tr BHA , in order to prove the conclusion it suffices to show that (5.5) holds for 0 ≤ j < k ≤ ϕ. Next, this conclusion will be proven by induction principle. Step 1. Using the proof of Lemma 5.2, one has    Re tr RH (1) R (0)    R(0)2 = Re tr RH(0)R (0) − ·  ( )2 +  ( )2    P 0 Q 0 Re tr PH(0)P(0) + tr QH(0)Q(0) (5.6) R(0)2   = R (0)2 − P(0)2 + Q(0)2 P(0)2 + Q(0)2 = 0. 5.1 Generalized Con-Sylvester Matrix Equations 169

From Algorithm 5.1 one also has      Re tr PH (1) P (0) + tr QH (1) Q (0) !  2 H H H R(1) = Re tr AHR(1) − E R(1)F P(0) + P(0)2 R(0)2   2  H R(1) + Re tr BHR(1) Q(0) + Q(0)2  ( )2   R 0 H = Re tr RH(1) (AP(0) + BQ(0)) − R(1) EP(0)F + R(1)2   P(0)2 + Q(0)2  ( )2 R 0 ! H = Re tr RH(1) (AP(0) + BQ(0)) − R(1) EP(0)F +

R(1)2   P(0)2 + Q(0)2  ( )2 R0    = Re tr RH(1) AP(0) + BQ(0) − EP (0)F + R(1)2   P(0)2 + Q(0)2 . R(0)2

In addition, it follows from (5.4) with k = 0 and (5.6) that     Re tr RH(1) AP(0) + BQ(0) − EP (0)F P(0)2 + Q(0)2    = Re tr RH(1) (R(0) − R(1)) R(0)2 P(0)2 + Q(0)2      = Re tr RH(1)R(0) − tr RH(1)R(1) R(0)2 P(0)2 + Q(0)2   =− R(1)2 . R(0)2

With the preceding two relations, it is immediately obtained that      Re tr PH (1) P (0) + tr QH (1) Q (0) = 0.

Step 2. Now, it is assumed that         Re tr PH(v)P(j) + tr QH(v)Q(j) = 0, Re tr RH (v) R (j) = 0 (5.7) for j ∈ I[0, v − 1], v ≥ 1. Then, it will be shown that 170 5 Finite Iterative Approaches         Re tr PH(v + 1)P(j) + tr QH(v + 1)Q(j) = 0, Re tr RH (v + 1) R (j) = 0 (5.8) for j ∈ I[0, v]. For convenience, let P(−1) = 0, and Q(−1) = 0. By using Lemma 5.2, it follows from the induction assumption (5.7) that for j ∈ I[0, v − 1], there holds    Re tr RH (v + 1) R (j)    R(v)2 = Re tr RH(v)R (j) − ·  ( )2 +  ( )2    P v Q v Re tr PH(v)P(j) + tr QH(v)Q(j)  ( )2  ( )2 + R v R j · P(v)2 + Q(v)2 R(j − 1)2      Re tr PH(v)P(j − 1) + tr QH(v)Q(j − 1) = 0.

In addition, it can be obtained from Lemma 5.2 and induction assumption (5.7) that    Re tr RH (v + 1) R (v) R(v)2   = R(v)2 − P(v)2 + Q(v)2 P(v)2 + Q(v)2  ( )2  ( )2 + R v R v · P(v)2 + Q(v)2 R(v − 1)2      Re tr PH(v)P(v − 1) + tr QH(v)Q(v − 1)  ( )2  ( )2 =  R v R v · P(v)2 + Q(v)2 R(v − 1)2      Re tr PH(v)P(v − 1) + tr QH(v)Q(v − 1) = 0.

The previous relations reveal that the second relation in (5.8) holds for j ∈ I[0, v]. From Algorithm 5.1 and the expression (5.4), it can also be obtained that

     Re tr PH(v + 1)P(j) + tr QH(v + 1)Q(j)  !  2 H H H R(v + 1)   = Re tr AHR(v + 1) − E R(v + 1)F P(j) + tr PH(v)P(j) R(v)2   2  H R(v + 1)   + Re tr BHR(v + 1) Q(j) + tr QH(v)Q(j) R(v)2 5.1 Generalized Con-Sylvester Matrix Equations 171   H = Re tr RH(v + 1) (AP(j) + BQ(j)) − R(v + 1) EP(j)F R(v + 1)2      + Re tr PH(v)P(j) + tr QH(v)Q(j)  ( )2 R v ! H = Re tr RH(v + 1) (AP(j) + BQ(j)) − R(v + 1) EP(j)F +

R(v + 1)2      Re tr PH(v)P(j) + tr QH(v)Q(j)  ( )2 R v  = Re tr RH(v + 1) AP(j) + BQ(j) − EP(j)F + R(v + 1)2      Re tr PH(v)P(j) + tr QH(v)Q(j) R(v)2 P(j)2 + Q(j)2    = Re tr RH(v + 1) (R(j) − R(j + 1)) R(j)2 R(v + 1)2      + Re tr PH(v)P(j) + tr QH(v)Q(j) . R(v)2

From this relation, it follows from the induction assumption (5.7) and the proven fact that the second relation in (5.8) holds for j ∈ I[0, v], that for j ∈ I[0, v − 1], there hold      Re tr PH(v + 1)P(j) + tr QH(v + 1)Q(j) = 0 and      Re tr PH(v + 1)P(v) + tr QH(v + 1)Q(v) P(v)2 + Q(v)2    = Re tr RH(v + 1) (R(v) − R(v + 1)) + R(v)2 R(v + 1)2   P(v)2 + Q(v)2 R(v)2 P(v)2 + Q(v)2    = Re tr RH(v + 1)R(v) − R(v)2 P(v)2 + Q(v)2 R(v + 1)2   R(v + 1)2 + P(v)2 + Q(v)2 R(v)2 R(v)2 P(v)2 + Q(v)2    = Re tr RH(v + 1)R(v) R(v)2 = 0.

Thus, the first relation in (5.8) holds for j ∈ I[0, v].

From Step 1 and Step 2, the conclusion is immediately obtained by the principle of induction.
From Lemmas 5.1 and 5.3, one has the following conclusion.
Theorem 5.1 Given $A, E \in \mathbb{C}^{n\times n}$, $B \in \mathbb{C}^{n\times r}$, $F \in \mathbb{C}^{p\times p}$, and $S \in \mathbb{C}^{n\times p}$, a solution of the matrix equation (5.1) can be obtained within finite iteration steps by Algorithm 5.1 for any initial matrices $X(0)$ and $Y(0)$ in the absence of round-off errors.
Proof First, in the space $\mathbb{C}^{n\times p}$ one can define a real inner product $\langle A, B\rangle = \mathrm{Re}\,\mathrm{tr}(B^{H}A)$ for $A, B \in \mathbb{C}^{n\times p}$. From Sect. 2.9 it is known that the real inner product space $(\mathbb{C}^{n\times p}, \mathbb{R}, \langle\cdot,\cdot\rangle)$ is of real dimension $2np$. According to Lemma 5.1, if $R(j) \ne 0$, $j \in I[0, 2np-1]$, then one has $P(j) \ne 0$ or $Q(j) \ne 0$. Thus one can compute $X(2np)$, $Y(2np)$, and $R(2np)$ by Algorithm 5.1. According to Lemma 5.3, it is obtained that
$$\mathrm{Re}\,\mathrm{tr}\left[R^{H}(k)R(j)\right] = 0$$
for $j, k \in I[0, 2np-1]$, $j \ne k$. Thus $R(j)$, $j \in I[0, 2np-1]$, form an orthogonal real basis of the real inner product space $(\mathbb{C}^{n\times p}, \mathbb{R}, \langle\cdot,\cdot\rangle)$. In addition, it follows from Lemma 5.3 that
$$\mathrm{Re}\,\mathrm{tr}\left[R^{H}(2np)R(j)\right] = 0$$
for $j \in I[0, 2np-1]$. Consequently, it is derived from the properties of a real inner product space that $R(2np) = 0$. So $X(2np)$ and $Y(2np)$ are a pair of exact solutions to the generalized con-Sylvester matrix equation (5.1).
It can be seen from the proof of Theorem 5.1 that an exact solution to the generalized con-Sylvester matrix equation (5.1) can be obtained within $2np$ steps by Algorithm 5.1 in the absence of round-off errors. However, in practice round-off errors are inevitable, so a solution with desirable accuracy may only be obtained with more than $2np$ steps.

5.1.2 Some Special Cases

In this subsection, by specializing Algorithm 5.1, finite iterative algorithms are proposed to solve the normal con-Sylvester matrix equation, the con-Kalman-Yakubovich matrix equation, and the generalized Sylvester matrix equation. First, the following normal con-Sylvester matrix equation is considered:

$$\overline{X}F - AX = C, \qquad (5.9)$$
where $A \in \mathbb{C}^{n\times n}$, $F \in \mathbb{C}^{p\times p}$, and $C \in \mathbb{C}^{n\times p}$ are known matrices, and $X$ is the matrix to be determined. The equation (5.9) is a conjugate version of the normal Sylvester matrix equation $XF - AX = C$ with $X$ being the unknown matrix. By specializing Algorithm 5.1, one can obtain the following algorithm to solve the equation (5.9).

Algorithm 5.2 (Finite iterative algorithm for (5.9)) 1. Given initial value X(0), calculate

$$R(0) = C + AX(0) - \overline{X(0)}F \in \mathbb{C}^{n\times p},$$
$$P(0) = \overline{R(0)F^{H}} - A^{H}R(0) \in \mathbb{C}^{n\times p},$$
k := 0;

2. If R(k) = 0, then stop; else, k := k + 1; 3. Calculate

$$X(k+1) = X(k) + \frac{\|R(k)\|^{2}}{\|P(k)\|^{2}}\,P(k) \in \mathbb{C}^{n\times p},$$
$$R(k+1) = C + AX(k+1) - \overline{X(k+1)}F,$$
$$P(k+1) = \overline{R(k+1)F^{H}} - A^{H}R(k+1) + \frac{\|R(k+1)\|^{2}}{\|R(k)\|^{2}}\,P(k);$$

4. Go to Step 2.
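As an illustration only (a Python/NumPy sketch based on the reconstruction of Algorithm 5.2 above; the function name and the stopping rule are ours), the specialized iteration for the equation $\overline{X}F - AX = C$ can be coded as follows.

```python
import numpy as np

def solve_con_sylvester(A, F, C, X0, tol=1e-10, max_iter=200):
    """Sketch of Algorithm 5.2 for conj(X) F - A X = C."""
    X = X0.copy()
    R = C + A @ X - X.conj() @ F
    P = (R @ F.conj().T).conj() - A.conj().T @ R
    for _ in range(max_iter):
        if np.linalg.norm(R) <= tol:
            break
        X = X + (np.linalg.norm(R)**2 / np.linalg.norm(P)**2) * P
        R_new = C + A @ X - X.conj() @ F
        P = (R_new @ F.conj().T).conj() - A.conj().T @ R_new \
            + (np.linalg.norm(R_new)**2 / np.linalg.norm(R)**2) * P
        R = R_new
    return X, np.linalg.norm(R)
```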

The second special equation to be considered is the following so-called con-Kalman-Yakubovich matrix equation

$$X - A\overline{X}F = C, \qquad (5.10)$$
where $A \in \mathbb{C}^{n\times n}$, $F \in \mathbb{C}^{p\times p}$, and $C \in \mathbb{C}^{n\times p}$ are known matrices, and $X$ is the matrix to be determined. By specializing Algorithm 5.1, one can obtain the following algorithm to solve the equation (5.10).

Algorithm 5.3 (Finite iterative algorithm for (5.10)) 1. Given initial value X(0), calculate

$$R(0) = C + A\overline{X(0)}F - X(0) \in \mathbb{C}^{n\times p},$$
$$P(0) = R(0) - \overline{A^{H}R(0)F^{H}} \in \mathbb{C}^{n\times p},$$
k := 0;

2. If R(k) = 0, then stop; else, k := k + 1; 3. Calculate

$$X(k+1) = X(k) + \frac{\|R(k)\|^{2}}{\|P(k)\|^{2}}\,P(k) \in \mathbb{C}^{n\times p},$$
$$R(k+1) = C + A\overline{X(k+1)}F - X(k+1),$$
$$P(k+1) = R(k+1) - \overline{A^{H}R(k+1)F^{H}} + \frac{\|R(k+1)\|^{2}}{\|R(k)\|^{2}}\,P(k);$$

4. Go to Step 2.
Some properties of the above-mentioned algorithms can be obtained by specializing the results in Subsection 5.1.1. At the end of this subsection, the following generalized Sylvester matrix equation is studied:

$$AX + BY = EXF + S, \qquad (5.11)$$
where $A, E \in \mathbb{R}^{n\times n}$, $B \in \mathbb{R}^{n\times r}$, $F \in \mathbb{R}^{p\times p}$, and $S \in \mathbb{R}^{n\times p}$ are known matrices, and $X \in \mathbb{R}^{n\times p}$ and $Y \in \mathbb{R}^{r\times p}$ are unknown matrices.
Algorithm 5.4 (Finite iterative algorithm for (5.11)) 1. Given initial values X(0) and Y(0), calculate

$$R(0) = S + EX(0)F - BY(0) - AX(0) \in \mathbb{R}^{n\times p},$$
$$P(0) = A^{T}R(0) - E^{T}R(0)F^{T} \in \mathbb{R}^{n\times p},$$
$$Q(0) = B^{T}R(0) \in \mathbb{R}^{r\times p},$$
k := 0;

2. If R(k) = 0, then stop; else, k := k + 1; 3. Calculate

$$X(k+1) = X(k) + \frac{\|R(k)\|^{2}}{\|P(k)\|^{2}+\|Q(k)\|^{2}}\,P(k) \in \mathbb{R}^{n\times p},$$
$$Y(k+1) = Y(k) + \frac{\|R(k)\|^{2}}{\|P(k)\|^{2}+\|Q(k)\|^{2}}\,Q(k) \in \mathbb{R}^{r\times p},$$
$$R(k+1) = S + EX(k+1)F - BY(k+1) - AX(k+1),$$
$$P(k+1) = A^{T}R(k+1) - E^{T}R(k+1)F^{T} + \frac{\|R(k+1)\|^{2}}{\|R(k)\|^{2}}\,P(k),$$
$$Q(k+1) = B^{T}R(k+1) + \frac{\|R(k+1)\|^{2}}{\|R(k)\|^{2}}\,Q(k);$$

4. Go to Step 2. Similarly to Subsection 5.1.1, for this algorithm one has the following results.

Lemma 5.4 Let $(X, Y) = (X_{*}, Y_{*})$ be a solution of the generalized Sylvester matrix equation (5.11). Then, for arbitrary initial real matrices $X(0)$ and $Y(0)$, the sequences $\{X(k)\}$, $\{Y(k)\}$, $\{P(k)\}$, $\{Q(k)\}$, and $\{R(k)\}$ generated by Algorithm 5.4 satisfy
$$\mathrm{tr}\left[P^{T}(k)\left(X_{*} - X(k)\right)\right] + \mathrm{tr}\left[Q^{T}(k)\left(Y_{*} - Y(k)\right)\right] = \|R(k)\|^{2}, \quad k = 0, 1, \ldots.$$

Lemma 5.5 Suppose that the sequences $\{X(k)\}$, $\{Y(k)\}$, $\{P(k)\}$, $\{Q(k)\}$, and $\{R(k)\}$ are generated by Algorithm 5.4 with arbitrary initial real matrices $X(0)$ and $Y(0)$. If there exists an integer $\varphi \ge 1$ such that $R(k) \ne 0$ for all $k \in I[0,\varphi]$, then
$$\mathrm{tr}\left[P^{T}(k)P(j)\right] + \mathrm{tr}\left[Q^{T}(k)Q(j)\right] = 0, \qquad \mathrm{tr}\left[R^{T}(k)R(j)\right] = 0$$
for $j, k \in I[0,\varphi]$, $j \ne k$.
Theorem 5.2 Given matrices $A, E \in \mathbb{R}^{n\times n}$, $B \in \mathbb{R}^{n\times r}$, $F \in \mathbb{R}^{p\times p}$, and $S \in \mathbb{R}^{n\times p}$, a solution of the generalized Sylvester matrix equation (5.11) can be obtained within finite iteration steps by Algorithm 5.4 with arbitrary initial matrices $X(0)$ and $Y(0)$ in the absence of round-off errors.

5.1.3 Numerical Examples

In this subsection, two examples are given to illustrate the effectiveness of the pro- posed methods. Example 5.1 Consider a generalized con-Sylvester matrix equation in the form of (5.1) with the following parameters ⎡ ⎤ 1.3652 6.6144 5.8279 2.2595 2.0907 ⎢ ⎥ ⎢ 0.1176 2.8441 4.2350 5.7981 3.7982 ⎥ ⎢ ⎥ A = ⎢ 8.9390 4.6922 5.1551 7.6037 7.8333 ⎥ + ⎣ 1.9914 0.6478 3.3395 5.2982 6.8085 ⎦ 2.9872 9.8833 4.3291 6.4053 4.6110 ⎡ ⎤ 5.6783 4.1537 9.7084 2.1396 4.1195 ⎢ ⎥ ⎢ 7.9421 3.0500 9.9008 6.4349 7.4457 ⎥ ⎢ ⎥ i ⎢ 08.5918.7437 7.8886 3.2004 2 .6795 ⎥ , ⎣ 6.0287 0.1501 4.3866 9.6010 4.3992 ⎦ 0.5027 7.6795 4.9831 7.2663 9.3338

⎡ ⎤ 6.8333 2.0713 4.5142 3.8397 ⎢ ⎥ ⎢ 2.1256 6.0720 0.4390 6.8312 ⎥ ⎢ ⎥ B = ⎢ 8.3924 6.2989 0.2719 0.9284 ⎥ + ⎣ 6.2878 3.7048 3.1269 0.3534 ⎦ 1.3377 5.7515 0.1286 6.1240 ⎡ ⎤ 60.0854 .5758 0.8408 6.7564 ⎢ ⎥ ⎢ 0.1576 3.6757 4.5436 6.9921 ⎥ ⎢ ⎥ i ⎢ 0.1635 6.3145 4.4183 7.2751 ⎥ , ⎣ 1.9007 7.1763 3.5325 4.7838 ⎦ 5.8692 6.9267 1.5361 5.5484 176 5 Finite Iterative Approaches ⎡ ⎤ 1.2105 2.5477 2.3189 1.9089 4.3979 ⎢ ⎥ ⎢ 4.5075 8.6560 2.3931 8.4387 3.4005 ⎥ ⎢ ⎥ E = ⎢ 7.1588 2.3235 0.4975 1.7390 3.1422 ⎥ + ⎣ 8.9284 8.0487 0.7838 1.7079 3.6508 ⎦ 2.7310 9.0840 6.4082 9.9430 3.9324 ⎡ ⎤ 5.9153 9.3424 6.4583 1.3701 6.8732 ⎢ ⎥ ⎢ 1.1975 2.6445 9.6689 8.1876 3.4611 ⎥ ⎢ ⎥ i ⎢ 0.3813 1.6030 6.6493 4.3017 1.6603 ⎥ , ⎣ 4.5860 8.7286 8.7038 8.9032 1.5561 ⎦ 8.6987 92.3788 0.0993 7.3491 1.9112 ⎡ ⎤ 4.2245 + 0.0558i 8.1593 + 6.9318i 4.5069 + 5.5267i F = ⎣ 8.5598 + 2.9741i 4.6077 + 6.5011i 4.1222 + 4.0007i ⎦ , 4.9025 + 0.4916i 4.5735 + 9.8299i 9.0161 + 1.9879i ⎡ ⎤ 6.2520 + 6.3179i 7.5367 + 6.5553i 6.2080 + 4.1363i ⎢ ⎥ ⎢ 7.3336 + 2.3441i 7.9387 + 3.9190i 7.3128 + 6.5521i ⎥ ⎢ ⎥ S = ⎢ 3.7589 + 5.4878i 9.1996 + 6.2731i 1.9389 + 8.3759i ⎥ . ⎣ 0.0988 + 9.3158i 8.4472 + 6.9908i 9.0481 + 3.7161i ⎦ 4.1986 + 3.3520i 3.6775 + 3.9718i 5.6921 + 4.2525i

These coefficient matrices are generated by command “rand” in Matlab. This matrix equation has infinitely many solutions. Now, the initial values are randomly chosen as ⎡ ⎤ 5.9466 + 1.4004i 4.8935 + 9.6164i 7.0357 + 5.9734i ⎢ ⎥ ⎢ 5.6574 + 5.6677i 1.8590 + 0.5886i 4.8496 + 0.4928i ⎥ ⎢ ⎥ X(0) = ⎢ 7.1654 + 8.2301i 7.0064 + 3.6031i 1.1461 + 5.7106i ⎥ , (5.12) ⎣ 5.1131 + 6.7395i 9.8271 + 5.4851i 6.6486 + 7.0086i ⎦ 7.7640 + 9.9945i 8.0664 + 2.6177i 3.6537 + 9.6229i ⎡ ⎤ 7.5052 + 7.3265i 8.0303 + 5.5341i 6.0199 + 6.8020i ⎢ 7.3999 + 4.2223i 0.8388 + 2.9198i 2.5356 + 0.5344i ⎥ Y(0) = ⎢ ⎥ . (5.13) ⎣ 4.3187 + 9.6137i 9.4546 + 8.5796i 8.7345 + 3.5666i ⎦ 6.3427 + 0.7206i 9.1594 + 3.3576i 5.1340 + 4.9830i

By using Algorithm 5.1 one can obtain a solution at k = 60 as follows 5.1 Generalized Con-Sylvester Matrix Equations 177 ⎡ ⎤ 0.7309 + 0.0526i 1.3602 − 2.0702i −0.1472 + 0.1825i ⎢ ⎥ ⎢ −0.6441 + 0.2823i −0.3603 + 1.1404i −0.1274 + 0.0975i ⎥ ⎢ ⎥ X = ⎢ 0.0657 + 0.2475i 0.6426 − 1.4654i −0.1856 + 0.0186i ⎥ , ⎣ 0.7685 − 0.0925i 0.5334 − 0.1466i 0.0695 + 0.3471i ⎦ 1.5168 + 1.5723i 1.9924 + 0.0829i −0.7253 + 0.1917i ⎡ ⎤ 6.8198 + 5.4620i 9.1208 + 7.1544i 5.8696 + 6.9769i ⎢ 7.5492 + 2.6884i 1.7783 + 3.7534i 2.3474 + 0.3715i ⎥ Y = ⎢ ⎥ . ⎣ 4.5651 + 8.5043i 9.7013 + 8.4421i 8.1070 + 4.1528i ⎦ 5.7448 + 2.1586i 11.3590 + 2.4229i 5.2509 + 4.6883i

In this case, the Frobenius norm of the residual is R(60) = 3.5040 × 10−8.The Frobenius norms of the iterative residuals are shown in Fig. 5.1. When the initial matrices are chosen as X(0) = 05×3 and Y(0) = 04×3 a solution can be obtained at k = 60, and is given as ⎡ ⎤ −0.0333 − 0.0042i −0.0392 + 0.1787i −0.0262 − 0.1139i ⎢ ⎥ ⎢ 0.2163 + 0.0342i −0.1452 + 0.0094i −0.0089 − 0.0961i ⎥ ⎢ ⎥ X = ⎢ 0.0148 − 0.0113i −0.0422 + 0.1061i 0.0038 − 0.0717i ⎥ , ⎣ −0.0082 − 0.1221i 0.0433 + 0.0181i −0.0840 + 0.0581i ⎦ −0.2254 − 0.0258i 0.0134 − 0.2555i 0.1408 + 0.1675i ⎡ ⎤ 0.1087 + 0.0095i 0.0778 − 0.0479i −0.1468 + 0.0673i ⎢ 0.0341 − 0.0085i 0.0407 − 0.0988i −0.0196 + 0.1482i ⎥ Y = ⎢ ⎥ , ⎣ −0.0474 + 0.0763i −0.0351 + 0.0122i 0.0986 − 0.0766i ⎦ −0.0111 − 0.1213i −0.0480 − 0.0945i 0.0965 + 0.1813i and the Frobenius norm of the corresponding residual is R(60) = 1.3212 × 10−9. The corresponding Frobenius norms of the iterative residuals are shown in Fig.5.2.

Fig. 5.1 Frobenius norms of the residuals $\|R(k)\|$ versus the iteration step, with initial conditions (5.12) and (5.13)

Fig. 5.2 Frobenius norms of the residuals $\|R(k)\|$ versus the iteration step, with $X(0) = 0_{5\times 3}$ and $Y(0) = 0_{4\times 3}$

Example 5.2 Consider a normal con-Sylvester matrix equation in the form of (5.9) with the following parameters ⎡ ⎤ ⎡ ⎤ 1 −2 − i −1 + i ! −1 + i1 2i i A = ⎣ 0i 0⎦ , F = , C = ⎣ 0i⎦ . 1 −1 + i 0 −11− i −i1− 2i

This matrix equation has been investigated in [258], and the unique solution to this matrix equation is
$$X = \begin{bmatrix} \frac{21}{40} - \frac{33}{40}\mathrm{i} & -\frac{9}{8} + \frac{5}{8}\mathrm{i} \\[2pt] \frac{1}{2} + \frac{1}{4}\mathrm{i} & \frac{3}{4} - \frac{1}{2}\mathrm{i} \\[2pt] -\frac{6}{5} + \frac{21}{20}\mathrm{i} & -\frac{13}{20} + \frac{9}{5}\mathrm{i} \end{bmatrix}.$$

When the initial value is chosen as $X(0) = 0_{3\times 2}$, by using Algorithm 5.2 one can obtain the unique solution at $k = 12$ as follows:
$$X = \begin{bmatrix} x_{11} & x_{12}\\ x_{21} & x_{22}\\ x_{31} & x_{32} \end{bmatrix} = \begin{bmatrix} 0.525000 - 0.825000\mathrm{i} & -1.125000 + 0.625000\mathrm{i}\\ 0.500000 + 0.250000\mathrm{i} & 0.750000 - 0.500000\mathrm{i}\\ -1.200000 + 1.050000\mathrm{i} & -0.650000 + 1.800000\mathrm{i} \end{bmatrix},$$
and the Frobenius norm of the corresponding residual is $\|R(12)\| = 1.586593\times 10^{-8}$. The iterative solutions $X(k)$ are shown in Tables 5.1 and 5.2.

Table 5.1 The iterative solutions xi1(k), i = 1, 2, 3, of Example 5.2 for X(0) = 0_{3×2}

k    x11                       x21                       x31
1    0.247934 + 0.165289i      −0.165289i                −0.578512 − 0.165289i
2    −0.025541 − 0.072560i     0.187422 + 0.322456i      −0.391606 + 0.072560i
3    0.166797 + 0.125901i      0.485038 − 0.123075i      −0.294115 + 0.307061i
4    0.158819 + 0.095214i      0.692558 + 0.006136i      −0.504869 + 0.283896i
5    0.447498 + 0.339647i      0.823137 + 0.646584i      −0.910618 + 0.975617i
6    0.463148 + 0.240738i      1.003370 + 0.270563i      −0.737716 + 1.468730i
7    0.087412 − 0.063210i      1.398593 + 0.105351i      −1.009175 + 1.231796i
8    −0.288588 − 0.117224i     0.708126 + 0.033193i      −1.352214 + 1.398999i
9    −0.137385 − 0.472493i     0.719564 + 0.119621i      −1.259153 + 1.072973i
10   0.233233 − 0.390987i      0.587093 + 0.328332i      −1.102245 + 0.888977i
11   0.424573 − 0.792314i      0.510353 + 0.252614i      −1.173481 + 1.070008i
12   0.525000 − 0.825000i      0.500000 + 0.250000i      −1.200000 + 1.050000i

Table 5.2 The iterative solutions xi2(k), i = 1, 2, 3, of Example 5.2 for X(0) = 0_{3×2}

k    x12                       x22                       x32
1    −0.247934                 0.247934 + 0.165289i      −0.413223 − 0.165289i
2    −0.724148 − 0.249896i     0.224356 + 0.260635i      −0.943134 − 0.094038i
3    −1.060227 − 0.216927i     −0.094437 − 0.212102i     −1.055412 + 0.096626i
4    −1.247916 − 0.241532i     −0.189035 − 0.159511i     −0.737733 + 0.083596i
5    −0.772762 + 0.023953i     0.068808 − 0.603259i      −0.812394 + 0.265940i
6    −0.925804 + 0.069240i     0.758932 − 0.180891i      −0.574890 + 0.561609i
7    −0.599346 + 0.317945i     0.682839 − 0.152101i      −0.798132 + 1.029381i
8    −0.879323 + 0.331203i     0.649267 − 0.343720i      −0.675705 + 1.388881i
9    −1.049247 + 0.398360i     1.063737 − 0.813604i      −0.604310 + 1.402518i
10   −1.211364 + 0.750019i     0.927954 − 0.564622i      −0.628122 + 1.814071i
11   −1.183491 + 0.842736i     0.762988 − 0.505485i      −0.655856 + 1.682982i
12   −1.125000 + 0.625000i     0.750000 − 0.500000i      −0.650000 + 1.800000i

5.2 Extended Con-Sylvester Matrix Equations

5.2.1 The Matrix Equation $AXB + C\overline{X}D = F$

In this subsection, the following extended con-Sylvester matrix equation is investigated:
$$AXB + C\overline{X}D = F, \qquad (5.14)$$
where $A, C \in \mathbb{C}^{m\times r}$, $B, D \in \mathbb{C}^{s\times n}$, and $F \in \mathbb{C}^{m\times n}$ are known matrices, and $X \in \mathbb{C}^{r\times s}$ is the matrix to be determined. Obviously, this kind of matrix equation includes the matrix equations $AX + \overline{X}D = F$ in [21, 20, 258] and $X + C\overline{X}D = F$ in [155] as special cases. In [21, 20, 155, 258], the matrices $A$, $C$, and $D$ are required to be square. In addition, this type of matrix equation has been investigated in Subsection 4.1.1. However, the algorithm proposed in Subsection 4.1.1 is only applicable to the case where the matrix equation (5.14) has a unique solution. In this subsection, the aforementioned restrictions will be removed. When the matrix equation (5.14) has a solution, the following finite iterative algorithm is presented to solve it.

Algorithm 5.5 (Finite iterative algorithm for (5.14)) 1. Given initial value X(0), calculate

$$R(0) = F - AX(0)B - C\overline{X(0)}D,$$
$$P(0) = A^{H}R(0)B^{H} + \overline{C^{H}R(0)D^{H}},$$
k := 0;

2. If R(k) = 0, then stop; else, k := k + 1; 3. Calculate

$$X(k+1) = X(k) + \frac{\|R(k)\|^{2}}{\|P(k)\|^{2}}\,P(k),$$
$$R(k+1) = F - AX(k+1)B - C\overline{X(k+1)}D,$$
$$P(k+1) = A^{H}R(k+1)B^{H} + \overline{C^{H}R(k+1)D^{H}} + \frac{\|R(k+1)\|^{2}}{\|R(k)\|^{2}}\,P(k);$$

4. Go to Step 2.
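A corresponding Python/NumPy sketch of Algorithm 5.5 (our own illustration of the reconstruction above; the function name and stopping rule are ours) reads:

```python
import numpy as np

def solve_ext_con_sylvester(A, B, C, D, F, X0, tol=1e-10, max_iter=None):
    """Sketch of Algorithm 5.5 for A X B + C conj(X) D = F."""
    m, n = F.shape
    if max_iter is None:
        max_iter = 4 * m * n   # 2mn suffices in exact arithmetic (Theorem 5.3)
    X = X0.copy()
    R = F - A @ X @ B - C @ X.conj() @ D
    P = A.conj().T @ R @ B.conj().T + (C.conj().T @ R @ D.conj().T).conj()
    for _ in range(max_iter):
        if np.linalg.norm(R) <= tol:
            break
        X = X + (np.linalg.norm(R)**2 / np.linalg.norm(P)**2) * P
        R_new = F - A @ X @ B - C @ X.conj() @ D
        P = A.conj().T @ R_new @ B.conj().T + (C.conj().T @ R_new @ D.conj().T).conj() \
            + (np.linalg.norm(R_new)**2 / np.linalg.norm(R)**2) * P
        R = R_new
    return X, np.linalg.norm(R)
```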

In the sequel, the convergence properties are analyzed for the proposed Algorithm 5.5. First, two lemmas are proposed to characterize its basic properties.

Lemma 5.6 Assume that the extended con-Sylvester matrix equation (5.14) is solvable, and let $X = X_{*}$ be a solution of this equation. Then, for an arbitrary initial matrix $X(0)$, the sequences $\{X(i)\}$, $\{R(i)\}$, and $\{P(i)\}$ generated by Algorithm 5.5 satisfy
$$\mathrm{tr}\left[P^{H}(i)\left(X_{*} - X(i)\right)\right] + \mathrm{tr}\left[\overline{P^{H}(i)}\left(\overline{X_{*}} - \overline{X(i)}\right)\right] = 2\|R(i)\|^{2}, \quad i = 0, 1, \ldots, \qquad (5.15)$$
or, equivalently,
$$\mathrm{Re}\,\mathrm{tr}\left[P^{H}(i)\left(X_{*} - X(i)\right)\right] = \|R(i)\|^{2}, \quad i = 0, 1, \ldots.$$

Proof The conclusion will be proven by induction principle. When i = 0, from Algorithm 5.5 one has   H tr P (0) (X∗ − X(0))  H H = tr BR (0)A + DR(0) C (X∗ − X(0))    H H = tr R (0)A (X∗ − X(0)) B + tr R(0) C (X∗ − X(0)) D       H H H H = tr R (0)AX∗B + tr R(0) CX∗D − tr R (0)AX(0)B − tr R(0) CX(0)D .

In view that X = X∗ is a solution of the matrix equation (5.14), with this relation it is derived that     H tr P (0) (X∗ − X(0)) + tr PH(0) (X∗ − X(0))       H H H H = tr R (0)AX∗B + tr R (0)CX∗D + tr R(0) AX∗B + tr R(0) CX∗D    − tr RH(0)AX(0)B − tr RH(0)CX(0)D −   H H tr R(0) AX(0)B − tr R(0) CX(0)D       H H = tr R (0) AX∗B + CX∗D + tr R(0) AX∗B + CX∗D (5.16)     H − tr RH(0) AX(0)B + CX(0)D − tr R(0) AX(0)B + CX(0)D   = tr RH(0) F − AX(0)B − CX(0)D +   H tr R(0) F − AX(0)B − CX(0)D    H = tr RH(0)R(0) + tr R(0) R(0) = 2 R(0)2 .

This implies that (5.15) holds for i = 0. When i = 1, from Algorithm 5.5 one has   H tr P (1) (X∗ − X(1))   ( )2   H H R 1 H = tr BR (1)A + DR(1) C (X∗ − X(1)) + tr P (0) (X∗ − X(1)) R(0)2    H H = tr R (1)A (X∗ − X(1)) B + tr R(1) C (X∗ − X(1)) D $ %!  ( )2  ( )2 R 1 H R 0 + tr P (0) X∗ − X(0) − P(0) R(0)2 P(0)2 182 5 Finite Iterative Approaches    H H = tr R (1)A (X∗ − X(1)) B + tr R(1) C (X∗ − X(1)) D  ( )2    ( )2  ( )2 R 1 H R 1 R 0 2 + tr P (0) (X∗ − X(0)) − P(0) R(0)2 R(0)2 P(0)2    H H = tr R (1)A (X∗ − X(1)) B + tr R(1) C (X∗ − X(1)) D  ( )2   R 1 H 2 + tr P (0) (X∗ − X(0)) − R(1) . R(0)2

Since X = X∗ is a solution of the matrix equation (5.14), it follows from the preceding relation and (5.16) that     H tr P (1) (X∗ − X(1)) + tr PH(1) (X∗ − X(1))    H H = tr R (1)A (X∗ − X(1)) B + tr R(1) C (X∗ − X(1)) D   H H + tr R(1) A (X∗ − X(1)) B + tr R (1)C(X∗ − X(1))D  ( )2    ( )2  R 1 H R 1 H + tr P (0) (X∗ − X(0)) + tr P(0) (X∗ − X(0)) R(0)2 R(0)2 −  ( )2 2 R 1  H = tr R (1) AX∗B + CX∗D − AX(1)B − CX(1)D   +tr RH(1) AX∗B + CX∗D − AX(1)B − CX(1)D + R(1)2   2 R(0)2 − 2 R(1)2 R(0)2     = tr RH(1) F − AX(1)B − CX(1)D + tr RH(1) F − AX(1)B − CX(1)D = 2 R(1)2 .

This implies that (5.15) holds for i = 1. Now it is assumed that (5.15) holds for i = ω ≥ 1, that is,     H 2 tr P (ω) (X∗ − X(ω)) + tr PH(ω) (X∗ − X(ω)) = 2 R(ω) . (5.17)

It follows from Algorithm 5.5 that   H tr P (ω + 1) (X∗ − X(ω + 1))  H H = tr BR (ω + 1)A + DR(ω + 1) C (X∗ − X(ω + 1)) +  (ω + )2   R 1 H tr P (ω) (X∗ − X(ω + 1)) R(ω)2 5.2 Extended Con-Sylvester Matrix Equations 183    H H = tr R (ω + 1)A (X∗ − X(ω + 1)) B + tr R(ω + 1) C (X∗ − X(ω + 1)) D $ %!  (ω + )2  (ω)2 R 1 H R + tr P (ω) X∗ − X(ω) − P(ω) R(ω)2 P(ω)2    H H = tr R (ω + 1)A (X∗ − X(ω + 1)) B + tr R(ω + 1) C (X∗ − X(ω + 1)) D  (ω + )2    (ω + )2  (ω)2 R 1 H R 1 R 2 + tr P (ω) (X∗ − X(ω)) − P(ω) R(ω)2 R(ω)2 P(ω)2    H H = tr R (ω + 1)A (X∗ − X(ω + 1)) B + tr R(ω + 1) C (X∗ − X(ω + 1)) D  (ω + )2   R 1 H 2 + tr P (ω) (X∗ − X(ω)) − R(ω + 1) . R(ω)2

In view that X = X∗ is a solution of the matrix equation (5.14), with this relation and (5.17) one has     H tr P (ω + 1) (X∗ − X(ω + 1)) + tr PH(ω + 1) (X∗ − X(ω + 1))    H H = tr R (ω + 1)A (X∗ − X(ω + 1)) B + tr R(ω + 1) C (X∗ − X(ω + 1)) D   H H + tr R(ω + 1) A (X∗ − X(ω + 1)) B + tr R (ω + 1)C(X∗ − X(ω + 1))D  (ω + )2   R 1 H + tr P (ω) (X∗ − X(ω)) + R(ω)2  (ω + )2  R 1 H 2 tr P(ω) (X∗ − X(ω)) − 2 R(ω + 1)  (ω)2  R  H = tr R (ω + 1) AX∗B + CX∗D − AX(ω + 1)B − CX(ω + 1)D   +tr RH(ω + 1) AX∗B + CX∗D − AX(ω + 1)B − CX(ω + 1)D R(ω + 1)2   + 2 R(ω)2 − 2 R(ω + 1)2  (ω)2  R  = tr RH(ω + 1) F − AX(ω + 1)B − CX(ω + 1)D +   tr RH(ω + 1) F − AX(ω + 1)B − CX(ω + 1)D = 2 R(ω + 1)2 .

This implies that (5.15) holds for i = ω + 1. With the above argument, the relation (5.15) holds for all integers i ≥ 0 by induction principle.

Lemma 5.7 Suppose that the sequences $\{X(i)\}$, $\{R(i)\}$, and $\{P(i)\}$ are generated by Algorithm 5.5 with an arbitrary initial matrix $X(0)$. If there exists an integer $\varphi \ge 1$ such that $R(i) \ne 0$ for all $i \in I[0,\varphi]$, then
$$\mathrm{Re}\,\mathrm{tr}\left[R^{H}(j)R(i)\right] = 0, \qquad \mathrm{Re}\,\mathrm{tr}\left[P^{H}(j)P(i)\right] = 0 \qquad (5.18)$$
for $i, j \in I[0,\varphi]$, $i \ne j$.

Proof First, it follows from Algorithm 5.5 that

$$\begin{aligned}
R(k+1) &= F - AX(k+1)B - C\overline{X(k+1)}D\\
&= F - A\left(X(k) + \frac{\|R(k)\|^{2}}{\|P(k)\|^{2}}\,P(k)\right)B - C\left(\overline{X(k)} + \frac{\|R(k)\|^{2}}{\|P(k)\|^{2}}\,\overline{P(k)}\right)D\\
&= F - AX(k)B - C\overline{X(k)}D - \frac{\|R(k)\|^{2}}{\|P(k)\|^{2}}\left(AP(k)B + C\overline{P(k)}D\right) \qquad (5.19)\\
&= R(k) - \frac{\|R(k)\|^{2}}{\|P(k)\|^{2}}\left(AP(k)B + C\overline{P(k)}D\right).
\end{aligned}$$

In view of
$$\mathrm{Re}\,\mathrm{tr}\left(A^{H}B\right) = \mathrm{Re}\,\mathrm{tr}\left(B^{T}\overline{A}\right) = \mathrm{Re}\,\overline{\mathrm{tr}\left(B^{T}\overline{A}\right)} = \mathrm{Re}\,\mathrm{tr}\left(\overline{B}^{T}A\right) = \mathrm{Re}\,\mathrm{tr}\left(B^{H}A\right),$$
in order to prove the conclusion it suffices to show
$$\mathrm{Re}\,\mathrm{tr}\left[R^{H}(j)R(i)\right] = 0, \qquad \mathrm{Re}\,\mathrm{tr}\left[P^{H}(j)P(i)\right] = 0$$
for $0 \le i < j \le \varphi$. Next, this assertion will be proven by induction principle.
Step 1. In this step, it will be shown that for integer $i \ge 0$
$$\mathrm{Re}\,\mathrm{tr}\left[R^{H}(i+1)R(i)\right] = 0, \qquad (5.20)$$
$$\mathrm{Re}\,\mathrm{tr}\left[P^{H}(i+1)P(i)\right] = 0. \qquad (5.21)$$

The induction principle is adopted to prove (5.20) and (5.21). By using (5.19) one has   tr RH (1) R (0)   $  %H R(0)2 = tr R(0) − AP(0)B + CP(0)D R (0) P(0)2 5.2 Extended Con-Sylvester Matrix Equations 185  ! R(0)2 H = R (0)2 − tr AP(0)B + CP(0)D R (0) P(0)2 2  R(0) H = R (0)2 − tr PH(0)AHR(0)BH + P(0) CHR(0)DH . P(0)2

It follows from this relation and Algorithm 5.5 that     tr RH (1) R (0) + tr RH (1) R (0) 2  R(0) H = 2 R (0)2 − tr PH(0)AHR(0)BH + P(0) CHR(0)DH P(0)2 2  R(0) H H H H H − tr P(0) A R(0)B + PH(0)C R(0)D P(0)2 2   R(0) H H = 2 R (0)2 − tr PH(0) AHR(0)BH + C R(0)D (5.22) P(0)2 2   R(0) H H − tr PH(0) AHR(0)BH + C R(0)D P(0)2  R(0)2 = 2 R (0)2 − P(0)2 + P(0)2 P(0)2 = 0.

This implies that (5.20) holds for i = 0. From Algorithm 5.5 one has     tr PH (1) P (0) + tr PH (1) P (0) !  2 H H H 2 R(1) = tr AHR(1)BH + C R(1)D P(0) + P(0)2 R(0)2  ! H H H +tr AHR(1)BH + C R(1)D P(0)

 2 H 2 R(1) = tr RH(1)AP(0)B + R(1) CP(0)D + P(0)2  ( )2  R 0 H + tr R(1) AP(0)B + RH(1)CP(0)D     = tr RH(1) AP(0)B + CP(0)D + tr RH(1) AP(0)B + CP(0)D 2 R(1)2 + P(0)2 . R(0)2 186 5 Finite Iterative Approaches

In addition, it follows from (5.19) with k = 0 and (5.22) that     tr RH(1) AP(0)B + CP(0)D + tr RH(1) AP(0)B + CP(0)D P(0)2   P(0)2   = tr RH(1) (R(0) − R(1)) + tr RH(1) (R(0) − R(1)) R(0)2 R(0)2  P(0)2   P(0)2     =− 2 R(1)2 + tr RH(1)R(0) + tr RH(1)R(0) R(0)2 R(0)2 P(0)2   =− 2 R(1)2 . R(0)2

Combining this relation with the preceding expression, implies that (5.21) holds for i = 0. Now, it is assumed that the relations (5.20) and (5.21) hold for i = ω − 1, ω ≥ 1. By applying (5.19) and the induction assumption, from Algorithm 5.5 one has     tr RH (ω + 1) R (ω) + tr RH (ω + 1) R (ω)   $  %H R(ω)2 = tr R(ω) − AP(ω)B + CP(ω)D R (ω) P(ω)2   $  %H R(ω)2 +tr R(ω) − AP(ω)B + CP(ω)D R (ω) P(ω)2  ! R(ω)2 H = 2 R(ω)2 − tr AP(ω)B + CP(ω)D R (ω) P(ω)2  ! R(ω)2 H − tr AP(ω)B + CP(ω)D R (ω) P(ω)2 2   R(ω) H H = 2 R(ω)2 − tr PH(ω) AHR (ω) BH + C R (ω)D P(ω)2 2   R(ω) H H − tr PH(ω) AHR (ω) BH + C R (ω)D P(ω)2 $ %! R(ω)2 R(ω)2 = 2 R(ω)2 − tr PH(ω) P(ω) − P(ω − 1) P(ω)2 R(ω − 1)2 $ %! R(ω)2 R(ω)2 − tr PH(ω) P(ω) − P(ω − 1) P(ω)2 R(ω − 1)2 5.2 Extended Con-Sylvester Matrix Equations 187

2 R(ω)2 = 2 R(ω)2 − P(ω)2 P(ω)2  R(ω)2 R(ω)2     + tr PH(ω)P(ω − 1) + tr PH(ω)P(ω − 1) P(ω)2 R(ω − 1)2 = 0.

Thus, (5.20) holds for i = ω. From Algorithm 5.5, one also has     tr PH (ω + 1) P (ω) + tr PH (ω + 1) P (ω) !  2 H H H 2 R(ω + 1) = tr AHR(ω + 1)BH + C R(ω + 1)D P(ω) + P(ω)2 R(ω)2  ! H H H +tr AHR(ω + 1)BH + C R(ω + 1)D P(ω)

 2 H 2 R(ω + 1) = tr RH(ω + 1)AP(ω)B + R(ω + 1) CP(ω)D + P(ω)2  (ω)2  R H + tr R(ω + 1) AP(ω)B + RH(ω + 1)CP(ω)D     = tr RH(ω + 1) AP(ω)B + CP(ω)D + tr RH(ω + 1) AP(ω)B + CP(ω)D 2 R(ω + 1)2 + P(ω)2 . R(ω)2

In addition, from (5.19) and the following proven result     tr RH (ω + 1) R (ω) + tr RH (ω + 1) R (ω) = 0, one has     tr RH(ω + 1) AP(ω)B + CP(ω)D + tr RH(ω + 1) AP(ω)B + CP(ω)D P(ω)2   = tr RH(ω + 1) (R(ω) − R(ω + 1)) + R(ω)2 P(ω)2   tr RH(ω + 1) (R(ω) − R(ω + 1)) R(ω)2 P(ω)2   P(ω)2 =− 2 R(ω + 1)2 + · R(ω)2 R(ω)2      tr RH(ω + 1)R(ω) + tr RH(ω + 1)R(ω) P(ω)2   =− 2 R(ω + 1)2 . R(ω)2 188 5 Finite Iterative Approaches

Combining this relation with the preceding relation, gives that     tr PH (ω + 1) P (ω) + tr PH (ω + 1) P (ω) = 0.

By induction principle, the relations (5.20) and (5.21) hold for all 0 ≤ i ≤ ϕ − 1. Step 2. In this step, one needs to show that for 0 ≤ i ≤ ϕ − l    Re tr RH (i + l) R (i) = 0, (5.23)    Re tr PH (i + l) P (i) = 0, (5.24) hold for integer l ≥ 1. This assertion is proven by induction principle. The case of l = 1 has been proven in Step 1. Now, it is assumed that (5.23) and (5.24) hold for l ≤ ω,ω ≥ 1. The aim is to show    Re tr RH (i + ω + 1) R (i) = 0, (5.25)    Re tr PH (i + ω + 1) P (i) = 0. (5.26)

This will be done in two substeps. Substep 2.1. In this substep, one needs to show that    Re tr RH (ω + 1) R (0) = 0, (5.27)    Re tr PH (ω + 1) P (0) = 0. (5.28)

According to Algorithm 5.5,from(5.19) and the induction assumption one has     tr RH (ω + 1) R (0) + tr RH (ω + 1) R (0) ⎡  ⎤  H R(ω)2 = tr ⎣ R(ω) − AP(ω)B + CP(ω)D R (0)⎦ P(ω)2 ⎡  ⎤  H R(ω)2 +tr ⎣ R(ω) − AP(ω)B + CP(ω)D R (0)⎦ P(ω)2  !     R(ω)2 H = tr RH(ω)R (0) + tr RH(ω)R (0) − tr AP(ω)B + CP(ω)D R (0) P(ω)2  ! R(ω)2 H − tr AP(ω)B + CP(ω)D R (0) P(ω)2 2   R(ω) H H =− tr PH(ω) AHR(0)BH + C R(0)D P(ω)2 2   R(ω) H H − tr PH(ω) AHR(0)BH + C R(0)D P(ω)2 5.2 Extended Con-Sylvester Matrix Equations 189  R(ω)2     =− tr PH(ω)P(0) + tr PH(ω)P(0) P(ω)2 = 0, and     tr PH (ω + 1) P (0) + tr PH (ω + 1) P (0)  ! H H H = tr AHR(ω + 1)BH + C R(ω + 1)D P(0)  ! H H H +tr AHR(ω + 1)BH + C R(ω + 1)D P(0)  R(ω + 1)2     + tr PH(ω)P(0) + tr PH(ω)P(0) R(ω)2     = tr RH(ω + 1) AP(0)B + CP(0)D + tr RH(ω + 1) AP(0)B + CP(0)D P(0)2   P(0)2   = tr RH(ω + 1) (R(0) − R(1)) + tr RH(ω + 1) (R(0) − R(1)) R(0)2 R(0)2 = 0.

Thus, (5.27) and (5.28) hold. Substep 2.2. From Algorithm 5.5 and (5.19), by using the induction assumptions one has    tr PH (i + ω + 1) P (i) + tr PH (i + ω + 1) P (i)  ! H H H = tr AHR(i + ω + 1)BH + C R(i + ω + 1)D P (i)  ! H H H +tr AHR(i + ω + 1)BH + C R(i + ω + 1)D P (i)   R(i + ω + 1)2   + tr PH(i + ω)P (i) + tr PH(i + ω)P (i) (5.29) R(i + ω)2     = tr RH(i + ω + 1) AP(i)B + CP(i)D + tr RH(i + ω + 1) AP(i)B + CP(i)D   P(i)2   = tr RH(i + ω + 1) (R(i) − R(i + 1)) + tr RH(i + ω + 1) (R(i) − R(i + 1)) R(i)2   P(i)2   = tr RH(i + ω + 1)R(i) + tr RH(i + ω + 1)R(i) . R(i)2 190 5 Finite Iterative Approaches

In addition, from (5.19) it can also be derived that  tr RH (i + ω + 1) R (i) ⎡  ⎤  H R(i + ω)2 = tr ⎣ R(i + ω) − AP(i + ω)B + CP(i + ω)D R (i)⎦ P(i + ω)2

 2  R(i + ω) H = tr RH(i + ω)R (i) − tr PH(i + ω)AHR(i)BH + P(i + ω) CHR(i)DH . P(i + ω)2

With this relation, by using the induction assumption one has     tr RH (i + ω + 1) R (i) + tr RH (i + ω + 1) R (i)     = tr RH(i + ω)R (i) + tr RH(i + ω)R (i) 2   R(i + ω) H H − tr PH(i + ω) AHR(i)BH + C R(i)D P(i + ω)2 2   R(i + ω) H H − tr PH(i + ω) AHR(i)BH + C R(i)D (5.30) P(i + ω)2 $ %! R(i + ω)2 R(i)2 =− tr PH(i + ω) P(i) − P(i − 1) P(i + ω)2 R(i − 1)2 $ %! R(i + ω)2 R(i)2 − tr PH(i + ω) P(i) − P(i − 1) P(i + ω)2 R(i − 1)2  R(i + ω)2 R(i)2     = tr PH(i + ω)P(i − 1) + tr PH(i + ω)P(i − 1) . P(i + ω)2 R(i − 1)2

Repeating the processes (5.29) and (5.30), one easily obtains that for certain α and β there hold     tr PH (i + ω + 1) P (i) + tr PH (i + ω + 1) P (i)      = α tr RH (ω + 1) R (0) + tr RH (ω + 1) R (0) and     tr RH (i + ω + 1) R (i) + tr RH (i + ω + 1) R (i)      = β tr RH (ω + 1) R (0) + tr RH (ω + 1) R (0) .

Combining these two relations with (5.27) and (5.28), implies that (5.23) and (5.24) hold for l = ω + 1. From Step 1 and Step 2, the assertion of this lemma holds by the principle of induction. 

With the above two lemmas, one has the following conclusion.

Theorem 5.3 If the extended con-Sylvester matrix equation (5.14) is solvable, then a solution can be obtained within finite iteration steps by Algorithm 5.5 with an arbitrary initial matrix X(0).

Proof First, in the space $\mathbb{C}^{m\times n}$ one can define a real inner product as $\langle A, B\rangle = \mathrm{Re}\,\mathrm{tr}(B^{H}A)$ for $A, B \in \mathbb{C}^{m\times n}$. From Sect. 2.9 it is known that the real inner product space $(\mathbb{C}^{m\times n}, \mathbb{R}, \langle\cdot,\cdot\rangle)$ is of real dimension $2mn$. According to Lemma 5.6, if $R(i) \ne 0$, $i \in I[0, 2mn-1]$, then one has $P(i) \ne 0$, $i \in I[0, 2mn-1]$. Thus one can compute $X(2mn)$ and $R(2mn)$ by Algorithm 5.5. According to Lemma 5.7, one has
$$\mathrm{Re}\,\mathrm{tr}\left[R^{H}(j)R(i)\right] = 0$$
for $i, j \in I[0, 2mn-1]$, $i \ne j$. Thus $R(i)$, $i \in I[0, 2mn-1]$, form an orthogonal real basis of the real inner product space $(\mathbb{C}^{m\times n}, \mathbb{R}, \langle\cdot,\cdot\rangle)$. In addition, it follows from Lemma 5.7 that
$$\mathrm{Re}\,\mathrm{tr}\left[R^{H}(2mn)R(i)\right] = 0$$
for $i \in I[0, 2mn-1]$. Consequently, it is derived from the properties of a real inner product space that $R(2mn) = 0$. So $X(2mn)$ is an exact solution of the matrix equation (5.14).

When the extended con-Sylvester matrix equation (5.14) has a solution, it follows from Lemma 5.6 that $R(i) = 0$ if $P(i) = 0$. In addition, from the expression of $P(i)$ in Algorithm 5.5 it can be seen that $P(i) = 0$ if $R(i) = 0$. Thus, $R(i) = 0$ if and only if $P(i) = 0$. This implies that if there exists an integer $k \ge 0$ such that $P(k) = 0$ but $R(k) \ne 0$, then the extended con-Sylvester matrix equation (5.14) is not solvable. Further, it follows from Lemma 5.7 that the unsolvability can be checked automatically by Algorithm 5.5 within at most $2mn$ iteration steps.
By specializing Algorithm 5.5, one can obtain the following finite iterative algorithm for solving the real matrix equation

$$AXB + CXD = F, \qquad (5.31)$$
where $A, C \in \mathbb{R}^{m\times r}$, $B, D \in \mathbb{R}^{s\times n}$, and $F \in \mathbb{R}^{m\times n}$ are known matrices, and $X \in \mathbb{R}^{r\times s}$ is the matrix to be determined.

Algorithm 5.6 (Finite iterative algorithm for the equation (5.31)) 1. Given initial value X(0), calculate

$$R(0) = F - AX(0)B - CX(0)D,$$
$$P(0) = A^{T}R(0)B^{T} + C^{T}R(0)D^{T},$$
k := 0;

2. If R(k) = 0, then stop; else, k := k + 1;

3. Calculate

$$X(k+1) = X(k) + \frac{\|R(k)\|^{2}}{\|P(k)\|^{2}}\,P(k),$$
$$R(k+1) = F - AX(k+1)B - CX(k+1)D,$$
$$P(k+1) = A^{T}R(k+1)B^{T} + C^{T}R(k+1)D^{T} + \frac{\|R(k+1)\|^{2}}{\|R(k)\|^{2}}\,P(k);$$

4. Go to Step 2.
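On small problems, a solution produced by Algorithm 5.6 can be cross-checked against a direct solve obtained by vectorization, since $\mathrm{vec}(AXB) = (B^{T}\otimes A)\,\mathrm{vec}(X)$. The following sketch is our own suggestion (Python/NumPy assumed; a least-squares solve is used so that consistent but rank-deficient systems are also handled) and is not part of the original text.

```python
import numpy as np

def solve_real_axb_cxd_vec(A, B, C, D, F):
    """Direct (vectorization) solve of A X B + C X D = F for real matrices.

    Uses vec(A X B) = (B^T kron A) vec(X) with column-major vec;
    only practical for small sizes, but handy for validating Algorithm 5.6."""
    r, s = A.shape[1], B.shape[0]
    M = np.kron(B.T, A) + np.kron(D.T, C)            # (m n) x (r s) coefficient matrix
    x, *_ = np.linalg.lstsq(M, F.reshape(-1, order='F'), rcond=None)
    return x.reshape((r, s), order='F')
```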

By specializing Lemmas 5.6 and 5.7 and Theorem 5.3, on the real matrix equation (5.31) one has the following conclusions.

Lemma 5.8 Assume that the generalized Sylvester matrix equation (5.31) is solvable, and let $X = X_{*}$ be a solution of this equation. Then, for an arbitrary initial matrix $X(0)$, the sequences $\{X(i)\}$, $\{R(i)\}$, and $\{P(i)\}$ generated by Algorithm 5.6 satisfy
$$\mathrm{tr}\left[P^{T}(i)\left(X_{*} - X(i)\right)\right] = \|R(i)\|^{2}, \quad i = 0, 1, \ldots.$$

Lemma 5.9 Suppose that the sequences $\{X(i)\}$, $\{R(i)\}$, and $\{P(i)\}$ are generated by Algorithm 5.6 with an arbitrary initial matrix $X(0)$. If there exists an integer $\varphi \ge 1$ such that $R(i) \ne 0$ for all $i \in I[0,\varphi]$, then
$$\mathrm{tr}\left[R^{T}(j)R(i)\right] = 0, \qquad \mathrm{tr}\left[P^{T}(j)P(i)\right] = 0$$
for $i, j \in I[0,\varphi]$, $i \ne j$.

Theorem 5.4 If the generalized Sylvester matrix equation (5.31) is solvable, then a solution can be obtained within finite iteration steps by Algorithm 5.6 for an arbitrary initial matrix X(0).

5.2.2 A General Case

In this subsection, the matrix equation with the following general form is considered

$$\sum_{i_1=1}^{p}A_{i_1}XB_{i_1} + \sum_{i_2=1}^{q}C_{i_2}\overline{X}D_{i_2} = F, \qquad (5.32)$$
where $A_{i_1}, C_{i_2} \in \mathbb{C}^{m\times r}$, $B_{i_1}, D_{i_2} \in \mathbb{C}^{s\times n}$, $i_1 \in I[1,p]$, $i_2 \in I[1,q]$, and $F \in \mathbb{C}^{m\times n}$ are known matrices, and $X \in \mathbb{C}^{r\times s}$ is the matrix to be determined. When the matrix equation (5.32) is solvable, by generalizing Algorithm 5.5 for the matrix equation (5.14), the following finite iterative algorithm can be presented to solve it.

Algorithm 5.7 (Finite iterative algorithm for (5.32)) 1. Given initial value X(0), calculate

$$R(0) = F - \sum_{i_1=1}^{p}A_{i_1}X(0)B_{i_1} - \sum_{i_2=1}^{q}C_{i_2}\overline{X(0)}D_{i_2},$$
$$P(0) = \sum_{i_1=1}^{p}A_{i_1}^{H}R(0)B_{i_1}^{H} + \sum_{i_2=1}^{q}\overline{C_{i_2}^{H}R(0)D_{i_2}^{H}},$$
k := 0;

2. If R(k) = 0, then stop; else, k := k + 1; 3. Calculate

$$X(k+1) = X(k) + \frac{\|R(k)\|^{2}}{\|P(k)\|^{2}}\,P(k),$$
$$R(k+1) = F - \sum_{i_1=1}^{p}A_{i_1}X(k+1)B_{i_1} - \sum_{i_2=1}^{q}C_{i_2}\overline{X(k+1)}D_{i_2},$$
$$P(k+1) = \sum_{i_1=1}^{p}A_{i_1}^{H}R(k+1)B_{i_1}^{H} + \sum_{i_2=1}^{q}\overline{C_{i_2}^{H}R(k+1)D_{i_2}^{H}} + \frac{\|R(k+1)\|^{2}}{\|R(k)\|^{2}}\,P(k);$$

4. Go to Step 2. Analogously to Algorithm 5.5, the proposed Algorithm 5.7 has the following two basic properties, whose proofs are similar to those in the previous subsection, and thus are omitted.

Lemma 5.10 Assume that the matrix equation (5.32) is solvable, and let $X = X_{*}$ be a solution of this equation. Then, for an arbitrary initial matrix $X(0)$, the sequences $\{X(i)\}$, $\{R(i)\}$, and $\{P(i)\}$ generated by Algorithm 5.7 satisfy
$$\mathrm{tr}\left[P^{H}(i)\left(X_{*} - X(i)\right)\right] + \mathrm{tr}\left[\overline{P^{H}(i)}\left(\overline{X_{*}} - \overline{X(i)}\right)\right] = 2\|R(i)\|^{2}, \quad i = 0, 1, \ldots, \qquad (5.33)$$
or, equivalently,
$$\mathrm{Re}\,\mathrm{tr}\left[P^{H}(i)\left(X_{*} - X(i)\right)\right] = \|R(i)\|^{2}, \quad i = 0, 1, \ldots.$$

Lemma 5.11 Suppose that the sequences $\{X(i)\}$, $\{R(i)\}$, and $\{P(i)\}$ are generated by Algorithm 5.7 with an arbitrary initial matrix $X(0)$. If there exists an integer $\varphi \ge 1$ such that $R(i) \ne 0$ for all $i \in I[0,\varphi]$, then
$$\mathrm{Re}\,\mathrm{tr}\left[R^{H}(j)R(i)\right] = 0, \qquad \mathrm{Re}\,\mathrm{tr}\left[P^{H}(j)P(i)\right] = 0 \qquad (5.34)$$
for $i, j \in I[0,\varphi]$, $i \ne j$.

Analogously to Theorem 5.3 for the matrix equation (5.14), from the previous two lemmas one has the following result on the matrix equation (5.32).
Theorem 5.5 If the matrix equation (5.32) is solvable, then a solution can be obtained within finite iteration steps by Algorithm 5.7 for an arbitrary initial matrix X(0).
When the coefficient matrices and the unknown matrix of (5.32) are all real, it becomes the matrix equation of the form

$$\sum_{i=1}^{p}A_{i}XB_{i} = F, \qquad (5.35)$$
where $A_i \in \mathbb{R}^{m\times r}$, $B_i \in \mathbb{R}^{s\times n}$, $i \in I[1,p]$, and $F \in \mathbb{R}^{m\times n}$ are known matrices, and $X \in \mathbb{R}^{r\times s}$ is the matrix to be determined. This matrix equation includes the well-known continuous-time and discrete-time Lyapunov matrix equations and the Sylvester-polynomial matrix equation investigated in [257] as special cases. When the matrix equation (5.35) is solvable, by specializing Algorithm 5.7 for the matrix equation (5.32), the following finite iterative algorithm can be presented to solve it.
Algorithm 5.8 (Finite iterative algorithm for (5.35)) 1. Given initial value X(0), calculate

$$R(0) = F - \sum_{i=1}^{p}A_{i}X(0)B_{i},$$
$$P(0) = \sum_{i=1}^{p}A_{i}^{T}R(0)B_{i}^{T},$$
k := 0;

2. If R(k) = 0, then stop; else, k := k + 1; 3. Calculate

$$X(k+1) = X(k) + \frac{\|R(k)\|^{2}}{\|P(k)\|^{2}}\,P(k),$$
$$R(k+1) = F - \sum_{i=1}^{p}A_{i}X(k+1)B_{i},$$
$$P(k+1) = \sum_{i=1}^{p}A_{i}^{T}R(k+1)B_{i}^{T} + \frac{\|R(k+1)\|^{2}}{\|R(k)\|^{2}}\,P(k);$$

4. Go to Step 2. Similarly to Algorithm 5.7, the proposed Algorithm 5.8 has the following two basic properties.

Lemma 5.12 Assume that the matrix equation (5.35) is solvable, and let $X = X_{*}$ be a solution of this equation. Then, for an arbitrary initial matrix $X(0)$, the sequences $\{X(i)\}$, $\{R(i)\}$, and $\{P(i)\}$ generated by Algorithm 5.8 satisfy
$$\mathrm{tr}\left[P^{T}(i)\left(X_{*} - X(i)\right)\right] = \|R(i)\|^{2}, \quad i = 0, 1, \ldots.$$

Lemma 5.13 Suppose that the sequences $\{X(i)\}$, $\{R(i)\}$, and $\{P(i)\}$ are generated by Algorithm 5.8 with an arbitrary initial matrix $X(0)$. If there exists an integer $\varphi \ge 1$ such that $R(i) \ne 0$ for all $i \in I[0,\varphi]$, then
$$\mathrm{tr}\left[R^{T}(j)R(i)\right] = 0, \qquad \mathrm{tr}\left[P^{T}(j)P(i)\right] = 0$$
for $i, j \in I[0,\varphi]$, $i \ne j$.
From these two lemmas, the following conclusion can be obtained on the matrix equation (5.35).
Theorem 5.6 If the matrix equation (5.35) is solvable, then a solution can be obtained within finite iteration steps by Algorithm 5.8 for an arbitrary initial matrix X(0).

5.2.3 Numerical Examples

In this subsection, two examples are employed to illustrate the effectiveness of the proposed methods.
Example 5.3 Consider an extended con-Sylvester matrix equation in the form of (5.14) with
$$A = \begin{bmatrix} 1+2\mathrm{i} & 2-\mathrm{i}\\ 1-\mathrm{i} & 2+3\mathrm{i} \end{bmatrix}, \quad B = \begin{bmatrix} 2-4\mathrm{i} & \mathrm{i}\\ -1+3\mathrm{i} & 2 \end{bmatrix},$$
$$C = \begin{bmatrix} -1-\mathrm{i} & -3\mathrm{i}\\ 0 & 1+2\mathrm{i} \end{bmatrix}, \quad D = \begin{bmatrix} -2 & 1-\mathrm{i}\\ 1+\mathrm{i} & -1-\mathrm{i} \end{bmatrix},$$
$$F = \begin{bmatrix} 21+11\mathrm{i} & -9+7\mathrm{i}\\ 52-22\mathrm{i} & -18+\mathrm{i} \end{bmatrix}.$$

The solution $X$ of this equation is
$$X = \begin{bmatrix} x_{11} & x_{12}\\ x_{21} & x_{22} \end{bmatrix} = \begin{bmatrix} 1+2\mathrm{i} & -\mathrm{i}\\ 2+\mathrm{i} & -1+\mathrm{i} \end{bmatrix}.$$

Algorithm 5.5 with initial value $X(0) = 10^{-6}\mathbf{1}_{2}$ is applied to compute $X(k)$. The iterative solutions $X(k)$ are shown in Tables 5.3 and 5.4.

Table 5.3 The iterative solutions of x11 and x12 for Example 5.3

k          x11                       x12
1          0.772584 + 1.210695i      −0.367447 − 0.928042i
2          1.085611 + 1.423762i      −0.420658 − 1.101595i
3          1.016287 + 1.712308i      0.017600 − 1.239430i
4          1.261489 + 1.920903i      −0.001926 − 1.220301i
5          1.115425 + 1.962996i      0.205149 − 1.064523i
6          1.106905 + 1.913594i      0.060810 − 0.965682i
7          0.998887 + 1.984846i      0.002301 − 1.017246i
8          1.000000 + 2.000000i      −0.000000 − 1.000000i
Solution   1 + 2i                    −i

Table 5.4 The iterative solutions of x21 and x22 for Example 5.3

k          x21                       x22
1          1.955014 − 0.186080i      −1.688848 + 0.369804i
2          1.887842 + 0.779816i      −1.546118 + 0.022830i
3          1.658064 + 1.056203i      −1.477122 + 0.981476i
4          1.724836 + 0.887722i      −1.124544 + 0.987303i
5          1.945527 + 0.947329i      −1.057093 + 0.922659i
6          2.016444 + 1.004005i      −1.023175 + 1.015327i
7          2.009770 + 1.000881i      −0.990666 + 1.002365i
8          2.000000 + 1.000000i      −1.000000 + 1.000000i
Solution   2 + i                     −1 + i
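As a quick consistency check of the data of Example 5.3 (our own addition, assuming Python with NumPy), one can verify numerically that the stated solution indeed satisfies the equation $AXB + C\overline{X}D = F$:

```python
import numpy as np

# Coefficient matrices and solution of Example 5.3
A = np.array([[1 + 2j, 2 - 1j], [1 - 1j, 2 + 3j]])
B = np.array([[2 - 4j, 1j], [-1 + 3j, 2]])
C = np.array([[-1 - 1j, -3j], [0, 1 + 2j]])
D = np.array([[-2, 1 - 1j], [1 + 1j, -1 - 1j]])
F = np.array([[21 + 11j, -9 + 7j], [52 - 22j, -18 + 1j]])
X = np.array([[1 + 2j, -1j], [2 + 1j, -1 + 1j]])

# Residual of A X B + C conj(X) D = F; prints 0.0 up to rounding
print(np.linalg.norm(A @ X @ B + C @ X.conj() @ D - F))
```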

Example 5.4 Consider the following matrix equation

$$A_{1}XB_{1} + C_{1}\overline{X}D_{1} + C_{2}\overline{X}D_{2} = F$$
with
$$A_{1} = \begin{bmatrix} -1+\mathrm{i} & 2-\mathrm{i}\\ 1-\mathrm{i} & -1+2\mathrm{i} \end{bmatrix}, \quad B_{1} = \begin{bmatrix} 1-2\mathrm{i} & \mathrm{i}\\ -1+3\mathrm{i} & -1+3\mathrm{i} \end{bmatrix},$$
$$C_{1} = \begin{bmatrix} 2+3\mathrm{i} & -1-\mathrm{i}\\ 1 & 2+2\mathrm{i} \end{bmatrix}, \quad D_{1} = \begin{bmatrix} 2 & 3-\mathrm{i}\\ 1+3\mathrm{i} & 1+2\mathrm{i} \end{bmatrix},$$
$$C_{2} = \begin{bmatrix} -1 & 3+\mathrm{i}\\ -2+\mathrm{i} & 2\mathrm{i} \end{bmatrix}, \quad D_{2} = \begin{bmatrix} 2 & -2+3\mathrm{i}\\ -1+10\mathrm{i} & 4\mathrm{i} \end{bmatrix},$$
$$F = \begin{bmatrix} -22-8\mathrm{i} & -21+31\mathrm{i}\\ -22-20\mathrm{i} & -40-10\mathrm{i} \end{bmatrix}.$$

The solution $X$ of this equation is
$$X = \begin{bmatrix} x_{11} & x_{12}\\ x_{21} & x_{22} \end{bmatrix} = \begin{bmatrix} -1+\mathrm{i} & 1\\ 1+2\mathrm{i} & 1-\mathrm{i} \end{bmatrix}.$$

Taking
$$X(0) = \begin{bmatrix} 1 & \mathrm{i}\\ -1+\mathrm{i} & 2\mathrm{i} \end{bmatrix},$$
let us apply Algorithm 5.7 to compute $X(k)$. The iterative solutions $X(k)$ are shown in Tables 5.5 and 5.6.

Table 5.5 The iterative solutions of x11 and x12 for Example 5.4

k          x11                        x12
1          −0.012648 − 0.063756i      1.646899 + 0.692811i
2          −0.874695 − 0.069064i      1.603859 − 0.073439i
3          −1.110709 − 0.121111i      1.422702 + 0.263025i
4          −1.047112 + 1.061671i      1.107697 + 0.278753i
5          −0.862204 + 1.037533i      1.134280 + 0.018899i
6          −0.941108 + 1.014212i      1.085481 + 0.078447i
7          −1.013419 + 1.022822i      1.062750 + 0.009129i
8          −1.000000 + 1.000000i      1.000000 − 0.000000i
Solution   −1 + i                     1

Table 5.6 The iterative solutions of x21 and x22 for Example 5.4

k          x21                        x22
1          −1.002484 + 0.858411i      0.608583 − 1.049537i
2          0.427182 + 1.331371i       −0.306609 − 0.914991i
3          1.129347 + 1.577488i       0.892626 − 0.739930i
4          1.156229 + 1.694248i       0.897194 − 0.962011i
5          1.112303 + 1.927010i       0.989272 − 0.974772i
6          1.040452 + 2.053685i       0.967049 − 0.990693i
7          0.991292 + 2.015701i       1.001803 − 0.963415i
8          1.000000 + 2.000000i       1.000000 − 1.000000i
Solution   1 + 2i                     1 − i

5.3 Coupled Con-Sylvester Matrix Equations

5.3.1 Iterative Algorithms

In this section, the following coupled con-Sylvester matrix equation is investigated

$$A_{i1}X_{1}B_{i1} + C_{i1}\overline{X}_{1}D_{i1} + A_{i2}X_{2}B_{i2} + C_{i2}\overline{X}_{2}D_{i2} + \cdots + A_{ip}X_{p}B_{ip} + C_{ip}\overline{X}_{p}D_{ip} = F_{i}, \quad i \in I[1,N], \qquad (5.36)$$
where $A_{i\eta}, C_{i\eta} \in \mathbb{C}^{m_i\times r_\eta}$, $B_{i\eta}, D_{i\eta} \in \mathbb{C}^{s_\eta\times n_i}$, $F_i \in \mathbb{C}^{m_i\times n_i}$, $i \in I[1,N]$, $\eta \in I[1,p]$, are given known matrices, and $X_\eta \in \mathbb{C}^{r_\eta\times s_\eta}$, $\eta \in I[1,p]$, are the matrices to be determined. This coupled matrix equation can be compactly rewritten as
$$\sum_{\eta=1}^{p}\left(A_{i\eta}X_{\eta}B_{i\eta} + C_{i\eta}\overline{X}_{\eta}D_{i\eta}\right) = F_{i}, \quad i \in I[1,N].$$

When this matrix equation is solvable, the following iterative algorithm is proposed to solve it.

Algorithm 5.9 (Finite iterative algorithm for the matrix equation (5.36))

1. Given initial values Xη(0), η ∈ I[1, p], calculate

$$R_{i}(0) = F_{i} - \sum_{\eta=1}^{p}\left(A_{i\eta}X_{\eta}(0)B_{i\eta} + C_{i\eta}\overline{X_{\eta}(0)}D_{i\eta}\right) \in \mathbb{C}^{m_i\times n_i}, \quad i \in I[1,N],$$
$$P_{\eta}(0) = \sum_{i=1}^{N}\left(A_{i\eta}^{H}R_{i}(0)B_{i\eta}^{H} + \overline{C_{i\eta}^{H}R_{i}(0)D_{i\eta}^{H}}\right) \in \mathbb{C}^{r_\eta\times s_\eta}, \quad \eta \in I[1,p],$$
k := 0;

2. If Ri(k) = 0, i ∈ I[1, N], then stop; else, k = k + 1; 3. Calculate

$$X_{\eta}(k+1) = X_{\eta}(k) + \frac{\sum_{\omega=1}^{N}\|R_{\omega}(k)\|^{2}}{\sum_{\tau=1}^{p}\|P_{\tau}(k)\|^{2}}\,P_{\eta}(k) \in \mathbb{C}^{r_\eta\times s_\eta}, \quad \eta \in I[1,p],$$
$$R_{i}(k+1) = F_{i} - \sum_{\eta=1}^{p}\left(A_{i\eta}X_{\eta}(k+1)B_{i\eta} + C_{i\eta}\overline{X_{\eta}(k+1)}D_{i\eta}\right) \in \mathbb{C}^{m_i\times n_i}, \quad i \in I[1,N],$$

$$P_{\eta}(k+1) = \sum_{i=1}^{N}\left(A_{i\eta}^{H}R_{i}(k+1)B_{i\eta}^{H} + \overline{C_{i\eta}^{H}R_{i}(k+1)D_{i\eta}^{H}}\right) + \frac{\sum_{\omega=1}^{N}\|R_{\omega}(k+1)\|^{2}}{\sum_{\omega=1}^{N}\|R_{\omega}(k)\|^{2}}\,P_{\eta}(k) \in \mathbb{C}^{r_\eta\times s_\eta}, \quad \eta \in I[1,p];$$
4. Go to Step 2.
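A Python/NumPy sketch of Algorithm 5.9 as reconstructed above follows (our own illustration; the nested-list data layout and all names are ours, not part of the original text). Here A, B, C, D are N-by-p nested lists of coefficient matrices and F is a list of N right-hand sides.

```python
import numpy as np

def solve_coupled_con_sylvester(A, B, C, D, F, X0, tol=1e-10, max_iter=500):
    """Sketch of Algorithm 5.9 for
       sum_eta ( A[i][eta] X[eta] B[i][eta] + C[i][eta] conj(X[eta]) D[i][eta] ) = F[i]."""
    N, p = len(F), len(X0)
    X = [x.copy() for x in X0]

    def residuals(X):
        return [F[i] - sum(A[i][e] @ X[e] @ B[i][e] + C[i][e] @ X[e].conj() @ D[i][e]
                           for e in range(p)) for i in range(N)]

    def directions(R):
        return [sum(A[i][e].conj().T @ R[i] @ B[i][e].conj().T
                    + (C[i][e].conj().T @ R[i] @ D[i][e].conj().T).conj()
                    for i in range(N)) for e in range(p)]

    sq = lambda Ms: sum(np.linalg.norm(M)**2 for M in Ms)   # sum of squared Frobenius norms
    R = residuals(X)
    P = directions(R)
    for _ in range(max_iter):
        if sq(R) <= tol**2:
            break
        alpha = sq(R) / sq(P)
        X = [X[e] + alpha * P[e] for e in range(p)]
        R_new = residuals(X)
        beta = sq(R_new) / sq(R)
        P = [Pn + beta * P[e] for e, Pn in enumerate(directions(R_new))]
        R = R_new
    return X, np.sqrt(sq(R))
```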

5.3.2 Convergence Analysis

In this subsection, the convergence properties are studied for the proposed Algorithm 5.9. First, the following lemma is needed.

Lemma 5.14 For the sequences $\{P_{\eta}(k)\}$, $\eta \in I[1,p]$, and $\{R_{i}(k)\}$, $i \in I[1,N]$, generated by Algorithm 5.9, if there exists an integer $\varphi \ge 1$ such that $R_{i}(k) \ne 0$, $i \in I[1,N]$, for all $k \in I[0,\varphi]$, and $P_{\eta}(-1) = 0$, $\eta \in I[1,p]$, then the following relation holds
$$\begin{aligned}
\mathrm{Re}\sum_{i=1}^{N}\mathrm{tr}\left[R_{i}^{H}(k+1)R_{i}(j)\right]
&= \mathrm{Re}\sum_{i=1}^{N}\mathrm{tr}\left[R_{i}^{H}(k)R_{i}(j)\right]
- \frac{\sum_{\omega=1}^{N}\|R_{\omega}(k)\|^{2}}{\sum_{\tau=1}^{p}\|P_{\tau}(k)\|^{2}}\,
\mathrm{Re}\sum_{\eta=1}^{p}\mathrm{tr}\left[P_{\eta}^{H}(k)P_{\eta}(j)\right]\\
&\quad + \frac{\sum_{\omega=1}^{N}\|R_{\omega}(k)\|^{2}}{\sum_{\tau=1}^{p}\|P_{\tau}(k)\|^{2}}\cdot
\frac{\sum_{\omega=1}^{N}\|R_{\omega}(j)\|^{2}}{\sum_{\omega=1}^{N}\|R_{\omega}(j-1)\|^{2}}\,
\mathrm{Re}\sum_{\eta=1}^{p}\mathrm{tr}\left[P_{\eta}^{H}(k)P_{\eta}(j-1)\right]
\end{aligned}$$
for $k, j \in I[0,\varphi]$.

Proof From Algorithm 5.9, one has for i ∈ I[1, N]

Ri(k + 1) &p  = Fi − AiηXη(k + 1)Biη + CiηXη(k + 1)Diη η=1 200 5 Finite Iterative Approaches ⎡ ⎛ ⎞ &N 2 &p ⎢ ⎜ Rω(k) ⎟ = − ( ) + ω=1 ( ) + Fi ⎣Aiη ⎝Xη k &p Pη k ⎠ Biη  2 η=1 Pτ (k) τ=1 ⎛ ⎞ ⎤ &N 2 ⎜ Rω(k) ⎟ ⎥ ( ) + ω=1 ( ) Ciη ⎝Xη k &p Pη k ⎠ Diη⎦ 2 Pτ (k) τ=1 &p  = Fi − AiηXη(k)Biη + CiηXη(k)Diη − (5.37) η=1 &N 2 Rω(k) &p  ω=1 ( ) + ( ) &p AiηPη k Biη CiηPη k Diη  2 Pτ (k) η=1 τ=1 &N 2 Rω(k) &p  = ( ) − ω=1 ( ) + ( ) Ri k &p AiηPη k Biη CiηPη k Diη .  2 Pτ (k) η=1 τ=1 It follows from this relation that  &N   H ( + ) ( ) Re tr Ri k 1 Ri j i=1 ⎧ ⎡⎛ ⎞ ⎤⎫ & H ⎪ N 2 ⎪ ⎨&N Rω(k) &p  ⎬ ⎢⎜ ω=1 ⎟ ⎥ = Re tr ⎣⎝Ri(k) − & AiηPη (k) Biη + CiηPη (k)Diη ⎠ Ri (j)⎦ ⎪ p 2 ⎪ ⎩ = Pτ (k) η= ⎭ i 1 τ=1 1  & N 2 &N   Rω(k) H ω=1 = Re tr R (k)Ri (j) − & · i p  ( )2 i=1 Pτ k ⎧ ⎡ τ=1 ⎤⎫ ⎨&N &p  ⎬ ⎣ H H ( ) H + H ( )H H ( )⎦ Re ⎩ tr BiηPη k Aiη DiηPη k Ciη Ri j ⎭ . i=1 η=1

In addition, by applying some properties of the trace of a matrix one has ⎧ ⎡ ⎤⎫ ⎨&N &p  ⎬ ⎣ H H ( ) H + H ( )H H ( )⎦ Re ⎩ tr BiηPη k Aiη DiηPη k Ciη Ri j ⎭ i=1 η=1 ⎧ ⎡ ⎤⎫ ⎨&N &p  ⎬ = ⎣ H ( ) H ( ) H + ( )H H ( ) H ⎦ Re ⎩ tr Pη k AiηRi j Biη Pη k CiηRi j Diη ⎭ . i=1 η=1 5.3 Coupled Con-Sylvester Matrix Equations 201

By changing order of this double summation, one has from Algorithm 5.9 ⎧ ⎡ ⎤⎫ ⎨&N &p  ⎬ ⎣ H H ( ) H + H ( )H H ( )⎦ Re ⎩ tr BiηPη k Aiη DiηPη k Ciη Ri j ⎭ i=1 η=1 ⎧ ⎡ ⎤⎫ ⎨&p &N  ⎬ = ⎣ H ( ) H ( ) H + ( )H H ( ) H ⎦ Re ⎩ tr Pη k AiηRi j Biη Pη k CiηRi j Diη ⎭ η=1 i=1 ⎧ ⎡ ⎛ ⎞ ⎛ ⎞⎤⎫ ⎨&p &N &N ⎬ = ⎣ H ( ) ⎝ H ( ) H ⎠ + ( )H ⎝ H ( ) H ⎠⎦ Re ⎩ tr Pη k AiηRi j Biη Pη k CiηRi j Diη ⎭ η=1 i=1 i=1 ⎧ ⎡ ⎛ ⎞ ⎛ ⎞⎤⎫ ⎪ ⎪ ⎨&p &N &N ⎬ ⎢ H ⎝ H H ⎠ H ⎝ H H ⎠⎥ = Re tr ⎣Pη (k) A ηRi (j) B η + Pη (k) C Ri (j) D ⎦ ⎩⎪ i i iη iη ⎭⎪ η=1 i=1 i=1 ⎧ ⎡ ⎛ ⎞⎤⎫ ⎨&p &N ⎬ = ⎣ H ( ) ⎝ H ( ) H + H ( ) H⎠⎦ Re ⎩ tr Pη k AiηRi j Biη Ciη Ri j Diη ⎭ η=1 i=1 ⎧ ⎡ ⎛ & ⎞⎤⎫ ⎪ N ⎪ ⎨&p  ω( )2 ⎬ ⎢ ⎜ ω= R j ⎟⎥ = Re tr ⎣PH (k) ⎝Pη (j) − 1 Pη (j − 1)⎠⎦ ⎪ η &N ⎪ ⎩η=  ω( − )2 ⎭ 1 ω= R j 1 ⎧ ⎫ 1 ⎧ ⎫ &N ⎨&p  ⎬ Rω(j)2 ⎨&p  ⎬ H ω=1 H = Re tr Pη (k) Pη(j) − & Re tr Pη (k) Pη(j − 1) . ⎩ ⎭ N 2 ⎩ ⎭ η=1 Rω(j − 1) η=1 ω=1 Combining the preceding two relations, gives the conclusion of this lemma. 

With the previous lemma, one can obtain the following two lemmas on the basic properties of Algorithm 5.9. Their proofs are given in Subsection 5.3.5.

Lemma 5.15 Assume that the coupled con-Sylvester matrix equation (5.36) is solvable, and let $(X_1, X_2, \ldots, X_p) = (X_{1*}, X_{2*}, \ldots, X_{p*})$ be a solution of this equation. Then, for arbitrary initial matrices $X_{\eta}(0)$, $\eta \in I[1,p]$, the sequences $\{X_{\eta}(k)\}$, $\{P_{\eta}(k)\}$, $\eta \in I[1,p]$, and $\{R_{i}(k)\}$, $i \in I[1,N]$, generated by Algorithm 5.9 satisfy
$$\mathrm{Re}\sum_{\eta=1}^{p}\mathrm{tr}\left[P_{\eta}^{H}(k)\left(X_{\eta*} - X_{\eta}(k)\right)\right] = \sum_{i=1}^{N}\|R_{i}(k)\|^{2}, \quad k = 0, 1, \ldots. \qquad (5.38)$$

Lemma 5.16 Suppose that the sequences $\{X_{\eta}(k)\}$, $\{P_{\eta}(k)\}$, $\eta \in I[1,p]$, and $\{R_{i}(k)\}$, $i \in I[1,N]$, are generated by Algorithm 5.9 with any initial matrices $X_{\eta}(0)$, $\eta \in I[1,p]$. If there exists an integer $\varphi \ge 1$ such that $\sum_{i=1}^{N}\|R_{i}(k)\|^{2} \ne 0$ for all $k \in I[0,\varphi]$, then

m1×n1 m2×n2 mN ×nN for (A1, A2,..., AN ), (B1, B2,..., BN ) ∈ C × C ×··· ×C . Denote

&N d = mini. i=1

× × × From Sect. 2.9 it is known that the space Cm1 n1 × Cm2 n2 ×··· ×CmN nN with the real inner product defined in (5.40) is of real dimension 2d. According to Lemma 5.15,if &N 2 Ri(k) = 0, k ∈ I[0, 2d − 1], i=1 then one has &p / / / /2 Pη(k) = 0, k ∈ I[0, 2d − 1]. η=1

Thus one can compute Xη(2d), η ∈ I[1, p], and Ri(2d), i ∈ I[1, N], by Algorithm 5.9. According to Lemma 5.16, it can be obtained that  &N   H ( ) ( ) = , ∈ I[ , − ], = . Re tr Ri k Ri j 0, j k 0 2d 1 j k i=1

Thus (R1(k), R2(k),..., RN (k)), k ∈ I[0, 2d − 1], form an orthogonal real basis of the previously defined real inner product space. In addition, it follows from Lemma 5.16 that  &N   H ( ) ( ) = ∈ I[ , − ]. Re tr Ri 2d Ri k 0, for k 0 2d 1 i=1 5.3 Coupled Con-Sylvester Matrix Equations 203

Consequently, it is derived from the property of a real inner product space that

(R1(2d), R2(2d), . . . , RN (2d)) = 0.

So Xη(2d), η ∈ I[1, p], are a group of exact solutions to the matrix equation (5.36). 

When the coupled con-Sylvester matrix equation (5.36) is solvable, it follows from Lemma 5.15 that &N 2 Ri(k) = 0 i=1 if &p / / / /2 Pη(k) = 0. η=1

In addition, from the expression of Pη(k), η ∈ I[1, p], in Algorithm 5.9 it can be seen that Pη(k) = 0,η∈ I[1, p],ifRi(k) = 0, i ∈ I[1, N]. Thus,

&N 2 Ri(k) = 0 i=1 if and only if &p / / / /2 Pη(k) = 0. η=1

This implies that if there exists an integer k ≥ 0 such that Pη(k) = 0, η ∈ I[1, p],but Ri(k) = 0forsomei, then the coupled con-Sylvester matrix equation (5.36) has no solutions. Further, it follows from Lemma 5.16 that the unsolvability can be checked &N automatically by Algorithm 5.9 within at most 2 mini iteration steps. i=1 However, due to the influence of the computation errors, it is possible to obtain a satisfactory solution to the coupled con-Sylvester matrix equation (5.36) with more &N than 2 mini iteration steps. i=1 At the end of this section, another property of Algorithm 5.9 is proposed.

Lemma 5.17 Assume that the coupled con-Sylvester matrix equation (5.36) is solv- able, and let X1,X2 , ...,Xp = X1∗,X2∗, ...,Xp∗ be a solution of this ma- trix equation. Then, for arbitrary initial matrices Xη(0), η ∈ I[1, p], the sequences {Xη(k)}, {Pη(k)}, η ∈ I[1, p], and {Ri(k)},i∈ I[1, N], generated by Algorithm 5.9 satisfy 204 5 Finite Iterative Approaches ⎧ ⎫  ⎨&p   ⎬ 0,forj> k H( ) − ( ) = &N Re tr Pη k Xη∗ Xη j 2 . (5.41) ⎩ ⎭ Ri(k) ,forj≤ k η=1 i=1

Proof  By applying Lemma 5.15, it follows from (5.48) that
$$\operatorname{Re}\left\{\sum_{\eta=1}^{p}\operatorname{tr}\left[P_\eta^{\mathrm{H}}(k)\left(X_{\eta*}-X_\eta(k+1)\right)\right]\right\}=0.$$
Now it is assumed that
$$\operatorname{Re}\left\{\sum_{\eta=1}^{p}\operatorname{tr}\left[P_\eta^{\mathrm{H}}(k)\left(X_{\eta*}-X_\eta(k+n)\right)\right]\right\}=0,\quad n\geq 1.$$
It follows from Lemma 5.16 that
$$\operatorname{Re}\left\{\sum_{\eta=1}^{p}\operatorname{tr}\left[P_\eta^{\mathrm{H}}(k)P_\eta(k+n)\right]\right\}=0,\quad n\geq 1.$$
With this relation and the induction assumption, it follows from Algorithm 5.9 that
$$\begin{aligned}
&\operatorname{Re}\left\{\sum_{\eta=1}^{p}\operatorname{tr}\left[P_\eta^{\mathrm{H}}(k)\left(X_{\eta*}-X_\eta(k+n+1)\right)\right]\right\}\\
&=\operatorname{Re}\left\{\sum_{\eta=1}^{p}\operatorname{tr}\left[P_\eta^{\mathrm{H}}(k)\left(X_{\eta*}-X_\eta(k+n)-\frac{\sum_{\omega=1}^{N}\|R_\omega(k+n)\|^2}{\sum_{\tau=1}^{p}\|P_\tau(k+n)\|^2}P_\eta(k+n)\right)\right]\right\}\\
&=\operatorname{Re}\left\{\sum_{\eta=1}^{p}\operatorname{tr}\left[P_\eta^{\mathrm{H}}(k)\left(X_{\eta*}-X_\eta(k+n)\right)\right]\right\}-\frac{\sum_{\omega=1}^{N}\|R_\omega(k+n)\|^2}{\sum_{\tau=1}^{p}\|P_\tau(k+n)\|^2}\operatorname{Re}\left\{\sum_{\eta=1}^{p}\operatorname{tr}\left[P_\eta^{\mathrm{H}}(k)P_\eta(k+n)\right]\right\}\\
&=0.
\end{aligned}$$
By the principle of mathematical induction, one has
$$\operatorname{Re}\left\{\sum_{\eta=1}^{p}\operatorname{tr}\left[P_\eta^{\mathrm{H}}(k)\left(X_{\eta*}-X_\eta(k+t)\right)\right]\right\}=0,\quad\text{for } t\geq 1.$$
This is the first conclusion of this lemma.

From Lemma 5.15, it is known that
$$\operatorname{Re}\left\{\sum_{\eta=1}^{p}\operatorname{tr}\left[P_\eta^{\mathrm{H}}(j)\left(X_{\eta*}-X_\eta(j)\right)\right]\right\}=\sum_{i=1}^{N}\|R_i(j)\|^2.$$
Now, it is assumed that the following relation holds for $n\geq 0$:
$$\operatorname{Re}\left\{\sum_{\eta=1}^{p}\operatorname{tr}\left[P_\eta^{\mathrm{H}}(j+n)\left(X_{\eta*}-X_\eta(j)\right)\right]\right\}=\sum_{i=1}^{N}\|R_i(j+n)\|^2. \tag{5.42}$$
Similarly to the derivation of (5.46) and (5.47), by applying the preceding induction assumption (5.42) and Lemma 5.16 one has
$$\begin{aligned}
&\operatorname{Re}\left\{\sum_{\eta=1}^{p}\operatorname{tr}\left[P_\eta^{\mathrm{H}}(j+n+1)\left(X_{\eta*}-X_\eta(j)\right)\right]\right\}\\
&=\operatorname{Re}\left\{\sum_{i=1}^{N}\operatorname{tr}\left[R_i^{\mathrm{H}}(j+n+1)R_i(j)\right]\right\}+\frac{\sum_{\omega=1}^{N}\|R_\omega(j+n+1)\|^2}{\sum_{\omega=1}^{N}\|R_\omega(j+n)\|^2}\operatorname{Re}\left\{\sum_{\eta=1}^{p}\operatorname{tr}\left[P_\eta^{\mathrm{H}}(j+n)\left(X_{\eta*}-X_\eta(j)\right)\right]\right\}\\
&=\frac{\sum_{\omega=1}^{N}\|R_\omega(j+n+1)\|^2}{\sum_{\omega=1}^{N}\|R_\omega(j+n)\|^2}\sum_{i=1}^{N}\|R_i(j+n)\|^2\\
&=\sum_{\omega=1}^{N}\|R_\omega(j+n+1)\|^2.
\end{aligned}$$
With the above argument, one can immediately obtain by the principle of mathematical induction
$$\operatorname{Re}\left\{\sum_{\eta=1}^{p}\operatorname{tr}\left[P_\eta^{\mathrm{H}}(j+t)\left(X_{\eta*}-X_\eta(j)\right)\right]\right\}=\sum_{i=1}^{N}\|R_i(j+t)\|^2,\quad t\geq 0.$$
This is the second expression of (5.41). The proof of this lemma is thus completed. $\square$

5.3.3 A More General Case

In this subsection, the following more general matrix equation is considered:
$$\sum_{l=1}^{q_1}\left(A_{i1l}X_1B_{i1l}+C_{i1l}\overline{X}_1D_{i1l}\right)+\sum_{l=1}^{q_2}\left(A_{i2l}X_2B_{i2l}+C_{i2l}\overline{X}_2D_{i2l}\right)+\cdots+\sum_{l=1}^{q_p}\left(A_{ipl}X_pB_{ipl}+C_{ipl}\overline{X}_pD_{ipl}\right)=F_i,\quad i\in\mathbf{I}[1,N], \tag{5.43}$$
where $A_{i\eta l}, C_{i\eta l}\in\mathbb{C}^{m_i\times r_\eta}$, $B_{i\eta l}, D_{i\eta l}\in\mathbb{C}^{s_\eta\times n_i}$, $F_i\in\mathbb{C}^{m_i\times n_i}$, $i\in\mathbf{I}[1,N]$, $l\in\mathbf{I}[1,q_\eta]$, $\eta\in\mathbf{I}[1,p]$, are known matrices, and $X_\eta\in\mathbb{C}^{r_\eta\times s_\eta}$, $\eta\in\mathbf{I}[1,p]$, are the matrices to be determined. This matrix equation can also be compactly rewritten as
$$\sum_{\eta=1}^{p}\sum_{l=1}^{q_\eta}\left(A_{i\eta l}X_\eta B_{i\eta l}+C_{i\eta l}\overline{X}_\eta D_{i\eta l}\right)=F_i,\quad i\in\mathbf{I}[1,N].$$

By generalizing Algorithm 5.9, one can obtain the following algorithm to solve the matrix equation (5.43).

Algorithm 5.10 (Finite iterative algorithm for the equation (5.43))

1. Given initial values Xη(0), η ∈ I[1, p], calculate

$$R_i(0)=F_i-\sum_{\eta=1}^{p}\sum_{l=1}^{q_\eta}\left(A_{i\eta l}X_\eta(0)B_{i\eta l}+C_{i\eta l}\overline{X}_\eta(0)D_{i\eta l}\right)\in\mathbb{C}^{m_i\times n_i},\quad i\in\mathbf{I}[1,N],$$
$$P_\eta(0)=\sum_{i=1}^{N}\sum_{l=1}^{q_\eta}\left(A_{i\eta l}^{\mathrm{H}}R_i(0)B_{i\eta l}^{\mathrm{H}}+\overline{C_{i\eta l}^{\mathrm{H}}R_i(0)D_{i\eta l}^{\mathrm{H}}}\right)\in\mathbb{C}^{r_\eta\times s_\eta},\quad \eta\in\mathbf{I}[1,p],$$
$$k:=0;$$

2. If $R_i(k)=0$, $i\in\mathbf{I}[1,N]$, then stop; else, $k:=k+1$;
3. Calculate
$$X_\eta(k+1)=X_\eta(k)+\frac{\sum_{\omega=1}^{N}\|R_\omega(k)\|^2}{\sum_{\tau=1}^{p}\|P_\tau(k)\|^2}P_\eta(k)\in\mathbb{C}^{r_\eta\times s_\eta},\quad \eta\in\mathbf{I}[1,p],$$
$$R_i(k+1)=F_i-\sum_{\eta=1}^{p}\sum_{l=1}^{q_\eta}\left(A_{i\eta l}X_\eta(k+1)B_{i\eta l}+C_{i\eta l}\overline{X}_\eta(k+1)D_{i\eta l}\right)\in\mathbb{C}^{m_i\times n_i},\quad i\in\mathbf{I}[1,N],$$
$$P_\eta(k+1)=\sum_{i=1}^{N}\sum_{l=1}^{q_\eta}\left(A_{i\eta l}^{\mathrm{H}}R_i(k+1)B_{i\eta l}^{\mathrm{H}}+\overline{C_{i\eta l}^{\mathrm{H}}R_i(k+1)D_{i\eta l}^{\mathrm{H}}}\right)+\frac{\sum_{\omega=1}^{N}\|R_\omega(k+1)\|^2}{\sum_{\omega=1}^{N}\|R_\omega(k)\|^2}P_\eta(k)\in\mathbb{C}^{r_\eta\times s_\eta},\quad \eta\in\mathbf{I}[1,p];$$

4. Go to Step 2.
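A minimal NumPy sketch of the conjugate-gradient-style structure used by Algorithm 5.10 (and, as a special case, Algorithm 5.9) is given below. It is illustrative only: the function and variable names are mine, the coefficient matrices are passed as nested lists indexed by (i, eta, l), and the adjoint used for the conjugate terms, written here as C^T conj(R) D^T, is my own derivation with respect to the real inner product (5.40) and should be compared with the definition of P_eta(k) above.

```python
import numpy as np

def residuals(A, B, C, D, F, X):
    """R_i = F_i - sum_eta sum_l (A X_eta B + C conj(X_eta) D)."""
    R = []
    for i in range(len(F)):
        S = np.zeros_like(F[i], dtype=complex)
        for eta in range(len(X)):
            for l in range(len(A[i][eta])):
                S = S + A[i][eta][l] @ X[eta] @ B[i][eta][l] \
                      + C[i][eta][l] @ np.conj(X[eta]) @ D[i][eta][l]
        R.append(F[i] - S)
    return R

def gradient(A, B, C, D, R, p):
    """Adjoint of the equation map w.r.t. (5.40); for the conjugate term the
    adjoint used here is C^T conj(R) D^T (my derivation, see the lead-in)."""
    G = []
    for eta in range(p):
        S = 0
        for i in range(len(R)):
            for l in range(len(A[i][eta])):
                S = S + A[i][eta][l].conj().T @ R[i] @ B[i][eta][l].conj().T \
                      + C[i][eta][l].T @ np.conj(R[i]) @ D[i][eta][l].T
        G.append(S)
    return G

def solve_coupled(A, B, C, D, F, X0, tol=1e-10, max_iter=1000):
    """Finite-iteration (CG-style) sketch in the spirit of Algorithms 5.9 / 5.10."""
    X = [x.astype(complex).copy() for x in X0]
    R = residuals(A, B, C, D, F, X)
    P = gradient(A, B, C, D, R, len(X))
    r2 = sum(np.linalg.norm(Ri) ** 2 for Ri in R)
    for _ in range(max_iter):
        if np.sqrt(r2) < tol:
            break
        p2 = sum(np.linalg.norm(Pe) ** 2 for Pe in P)
        X = [Xe + (r2 / p2) * Pe for Xe, Pe in zip(X, P)]       # step size: sum||R||^2 / sum||P||^2
        R = residuals(A, B, C, D, F, X)
        r2_new = sum(np.linalg.norm(Ri) ** 2 for Ri in R)
        G = gradient(A, B, C, D, R, len(X))
        P = [Ge + (r2_new / r2) * Pe for Ge, Pe in zip(G, P)]   # conjugation coefficient
        r2 = r2_new
    return X, np.sqrt(r2)
```

The step size and the update of the search directions mirror the expressions in Step 3 above; for a solvable equation the residual norm returned by solve_coupled should drop to round-off level, as in the numerical examples of the next subsection.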

Similarly to Algorithm 5.9, one can obtain the following conclusions for Algorithm 5.10 without proofs.

Lemma 5.18  Assume that the matrix equation (5.43) is solvable, and let $(X_1,X_2,\ldots,X_p)=(X_{1*},X_{2*},\ldots,X_{p*})$ be a solution of this equation. Then, for arbitrary initial matrices $X_\eta(0)$, $\eta\in\mathbf{I}[1,p]$, the sequences $\{X_\eta(k)\}$, $\{P_\eta(k)\}$, $\eta\in\mathbf{I}[1,p]$, and $\{R_i(k)\}$, $i\in\mathbf{I}[1,N]$, generated by Algorithm 5.10 satisfy
$$\operatorname{Re}\left\{\sum_{\eta=1}^{p}\operatorname{tr}\left[P_\eta^{\mathrm{H}}(k)\left(X_{\eta*}-X_\eta(j)\right)\right]\right\}=\begin{cases}0, & \text{for } j>k,\\[4pt] \displaystyle\sum_{i=1}^{N}\|R_i(k)\|^2, & \text{for } j\leq k.\end{cases}$$

Lemma 5.19  Suppose that the sequences $\{X_\eta(k)\}$, $\{P_\eta(k)\}$, $\eta\in\mathbf{I}[1,p]$, and $\{R_i(k)\}$, $i\in\mathbf{I}[1,N]$, are generated by Algorithm 5.10 with arbitrary initial matrices $X_\eta(0)$, $\eta\in\mathbf{I}[1,p]$. If there exists an integer $\varphi\geq 1$ such that
$$\sum_{i=1}^{N}\|R_i(k)\|^2\neq 0,\quad\text{for } k\in\mathbf{I}[0,\varphi],$$
then
$$\operatorname{Re}\left\{\sum_{i=1}^{N}\operatorname{tr}\left[R_i^{\mathrm{H}}(k)R_i(j)\right]\right\}=0,\qquad \operatorname{Re}\left\{\sum_{\eta=1}^{p}\operatorname{tr}\left[P_\eta^{\mathrm{H}}(k)P_\eta(j)\right]\right\}=0$$
for $j,k\in\mathbf{I}[0,\varphi]$, $j\neq k$.

Theorem 5.8  If the matrix equation (5.43) is solvable, then a solution of this equation can be obtained within finite iteration steps by Algorithm 5.10 with arbitrary initial matrices $X_\eta(0)$, $\eta\in\mathbf{I}[1,p]$.

5.3.4 Numerical Examples

In this subsection, two examples are given to illustrate the effectiveness of the pro- posed methods. 208 5 Finite Iterative Approaches

Example 5.5 Consider the following coupled con-Sylvester matrix equation

$$\begin{cases}A_{11}XB_{11}+C_{11}\overline{X}D_{11}+A_{12}YB_{12}+C_{12}\overline{Y}D_{12}=F_1,\\ A_{21}XB_{21}+C_{21}\overline{X}D_{21}+A_{22}YB_{22}+C_{22}\overline{Y}D_{22}=F_2,\end{cases} \tag{5.44}$$
with the following coefficient matrices
$$A_{11}=\begin{bmatrix}1+2\mathrm{i} & 2-\mathrm{i}\\ 8-\mathrm{i} & 2+3\mathrm{i}\end{bmatrix},\quad B_{11}=\begin{bmatrix}0.5-\mathrm{i} & -1+3\mathrm{i}\\ -1.5+2\mathrm{i} & 1-2\mathrm{i}\end{bmatrix},\quad C_{11}=\begin{bmatrix}4\mathrm{i} & 2+2\mathrm{i}\\ 3+2.5\mathrm{i} & \mathrm{i}\end{bmatrix},\quad D_{11}=\begin{bmatrix}-1 & 3+\mathrm{i}\\ -2+\mathrm{i} & 4\mathrm{i}\end{bmatrix},$$
$$A_{12}=\begin{bmatrix}1-1.5\mathrm{i} & 3\mathrm{i}\\ -2+3\mathrm{i} & 4\end{bmatrix},\quad B_{12}=\begin{bmatrix}1-2\mathrm{i} & \mathrm{i}\\ -1+3\mathrm{i} & -1+4\mathrm{i}\end{bmatrix},\quad C_{12}=\begin{bmatrix}2-4\mathrm{i} & 3-\mathrm{i}\\ 1+3\mathrm{i} & 1+2\mathrm{i}\end{bmatrix},\quad D_{12}=\begin{bmatrix}-1-0.5\mathrm{i} & -2+2\mathrm{i}\\ 1+2.5\mathrm{i} & -4+2\mathrm{i}\end{bmatrix},$$
$$A_{21}=\begin{bmatrix}0.5 & -1+0.5\mathrm{i}\\ 1-2\mathrm{i} & -2.5+1.5\mathrm{i}\end{bmatrix},\quad B_{21}=\begin{bmatrix}-1-\mathrm{i} & -3\mathrm{i}\\ 5 & 1+2\mathrm{i}\end{bmatrix},\quad C_{21}=\begin{bmatrix}-1+\mathrm{i} & 2-\mathrm{i}\\ -2+3\mathrm{i} & -1+2\mathrm{i}\end{bmatrix},\quad D_{21}=\begin{bmatrix}4-\mathrm{i} & 2.5-\mathrm{i}\\ \mathrm{i} & -2+2\mathrm{i}\end{bmatrix},$$
$$A_{22}=\begin{bmatrix}1-4\mathrm{i} & \mathrm{i}\\ -1+3\mathrm{i} & 2\end{bmatrix},\quad B_{22}=\begin{bmatrix}\mathrm{i} & 2+3\mathrm{i}\\ 4 & 1-10\mathrm{i}\end{bmatrix},\quad C_{22}=\begin{bmatrix}2+3\mathrm{i} & -1-\mathrm{i}\\ 1 & 2+2\mathrm{i}\end{bmatrix},\quad D_{22}=\begin{bmatrix}1 & 3\mathrm{i}\\ -1-5\mathrm{i} & -2+\mathrm{i}\end{bmatrix},$$
$$F_1=\begin{bmatrix}-31.5+53.5\mathrm{i} & -20+64.5\mathrm{i}\\ -13.5-14\mathrm{i} & 39.5+77.5\mathrm{i}\end{bmatrix},\quad F_2=\begin{bmatrix}10-56.5\mathrm{i} & 19-8\mathrm{i}\\ 57.5-29.5\mathrm{i} & -59.5+8.5\mathrm{i}\end{bmatrix}.$$

This matrix equation has infinitely many solutions. Algorithm 5.9 is now used to solve it. When the initial matrices are chosen as $X(0)=Y(0)=0_{2\times 2}$, one can obtain a solution as follows at $k=20$:
$$X=\begin{bmatrix}3.107720+1.586466\mathrm{i} & 3.838167-0.133874\mathrm{i}\\ -1.053413+0.063526\mathrm{i} & -2.533058+1.841290\mathrm{i}\end{bmatrix},\quad Y=\begin{bmatrix}-0.176073+3.723004\mathrm{i} & 0.669254+0.685876\mathrm{i}\\ 3.610190+1.155858\mathrm{i} & -1.145951-4.808203\mathrm{i}\end{bmatrix}.$$
In this case, the Frobenius norms of the residuals are $\|R_1(20)\|=4.853477\times 10^{-8}$ and $\|R_2(20)\|=4.452655\times 10^{-8}$. When the initial matrices are chosen as $X(0)=I_2$, $Y(0)=\mathrm{i}I_2$, one can obtain the following solution at $k=20$:
$$X=\begin{bmatrix}3.107720+1.586466\mathrm{i} & 3.838167-0.133874\mathrm{i}\\ -1.053413+0.063526\mathrm{i} & -2.533058+1.841290\mathrm{i}\end{bmatrix},\quad Y=\begin{bmatrix}-0.176073+3.723004\mathrm{i} & 0.669254+0.685876\mathrm{i}\\ 3.610190+1.155858\mathrm{i} & -1.145951-4.808203\mathrm{i}\end{bmatrix}.$$
The corresponding Frobenius norms of the residuals are $\|R_1(20)\|=2.910088\times 10^{-8}$ and $\|R_2(20)\|=1.245793\times 10^{-8}$.

Example 5.6  Consider a matrix equation in the form of (5.44) with the same coefficient matrices as in Example 5.5 except for $C_{22}=D_{22}=0$, and
$$F_2=\begin{bmatrix}-14-18.5\mathrm{i} & 10-15\mathrm{i}\\ 26.5-44.5\mathrm{i} & -49.5-4.5\mathrm{i}\end{bmatrix}.$$
This matrix equation has a unique solution
$$X=\begin{bmatrix}x_{11} & x_{12}\\ x_{21} & x_{22}\end{bmatrix}=\begin{bmatrix}1 & 1+\mathrm{i}\\ -1+2\mathrm{i} & -2+3\mathrm{i}\end{bmatrix},\qquad Y=\begin{bmatrix}y_{11} & y_{12}\\ y_{21} & y_{22}\end{bmatrix}=\begin{bmatrix}\mathrm{i} & 0\\ 3+\mathrm{i} & 1-2\mathrm{i}\end{bmatrix}.$$
This pair of solutions can be obtained by Algorithm 5.9 with any initial matrices. When the initial matrices are chosen as
$$X(0)=\begin{bmatrix}1 & 4\mathrm{i}\\ 2+60\mathrm{i} & 1\end{bmatrix},\qquad Y(0)=\begin{bmatrix}8 & 4\\ 0 & \mathrm{i}\end{bmatrix},$$
the iterative solutions $X(k)$ and $Y(k)$ are shown in Tables 5.7, 5.8, 5.9, and 5.10, respectively. In addition, $\|R_1(20)\|=1.081095\times 10^{-8}$ and $\|R_2(20)\|=8.127365\times 10^{-9}$. The convergence curve for the residuals is given in Fig. 5.3, where $r(k)=\sqrt{\|R_1(k)\|^2+\|R_2(k)\|^2}$.

5.3.5 Proofs of Lemmas 5.15 and 5.16

This subsection provides the proofs of Lemmas 5.15 and 5.16.

Table 5.7  The iterative solutions of x11 and x12 for Example 5.6

k         x11                       x12
1         7.734416 − 3.497054i      0.713355 + 3.621689i
2         12.068544 − 2.268381i     1.435712 − 10.944741i
3         5.456717 − 7.371926i      5.227052 − 5.725489i
4         −1.397488 + 2.751343i     4.8405670 − 6.245320i
5         2.871242 + 4.889785i      0.054342 − 0.437181i
6         5.897083 + 4.259495i      −0.720624 + 2.292229i
7         3.519877 + 6.575540i      −3.314000 − 2.280658i
8         3.380530 + 5.989063i      3.273114 + 0.562890i
9         4.650146 + 4.293968i      2.197604 − 1.411039i
10        5.246070 + 3.762836i      5.197452 − 2.820016i
11        3.729592 + 4.180738i      5.097875 − 1.331443i
12        2.991084 + 1.564041i      5.700119 − 1.419277i
13        2.772348 − 0.489952i      2.130731 + 0.067846i
14        2.633904 + 1.475011i      1.912539 + 0.787028i
15        1.922894 + 0.469372i      1.216791 + 0.601281i
16        1.957008 + 0.418952i      1.218496 + 0.652792i
17        0.999195 + 0.000489i      1.002115 + 0.999657i
18        1.000155 − 0.000087i      0.999973 + 1.000154i
19        1.000000 + 0.000000i      1.000000 + 1.000000i
20        1.000000 + 0.000000i      1.000000 + 1.000000i
Solution  1                         1 + i

5.3.5.1 Proof of Lemma 5.15

When k = 0, it follows from Algorithm 5.9 that ⎧ ⎫ ⎨&p    ⎬ H( ) − ( ) Re ⎩ tr Pη 0 Xη∗ Xη 0 ⎭ η=1 ⎧ ⎡⎛ ⎞ ⎤⎫ ⎨&p &N    ⎬ = ⎣⎝ H( ) + ( )H ⎠ − ( ) ⎦ Re ⎩ tr BiηRi 0 Aiη DiηRi 0 Ciη Xη∗ Xη 0 ⎭ η=1 i=1 ⎧ ⎡ ⎤⎫ ⎨&p &N      ⎬ = ⎣ H( ) − ( ) + ( )H − ( ) ⎦ Re ⎩ tr Ri 0 Aiη Xη∗ Xη 0 Biη Ri 0 Ciη Xη∗ Xη 0 Diη ⎭ . η=1 i=1 5.3 Coupled Con-Sylvester Matrix Equations 211

Table 5.8  The iterative solutions of x21 and x22 for Example 5.6

k         x21                       x22
1         −0.208721 + 51.730786i    2.984380 + 6.491053i
2         2.679714 + 34.934121i     1.406039 + 18.108945i
3         −6.396711 + 25.568233i    3.042710 + 21.213052i
4         −5.589121 + 20.637736i    −2.986982 + 15.418970i
5         −0.550157 + 17.166859i    −5.724875 + 13.840465i
6         −0.455329 + 15.640638i    −2.427763 + 12.089042i
7         −0.700573 + 11.584289i    4.459660 + 7.443106i
8         0.554372 + 9.495321i      4.398854 + 5.024237i
9         0.592702 + 8.709867i      1.809186 + 3.001338i
10        2.941483 + 7.614482i      0.218074 + 2.854513i
11        3.806444 + 7.050575i      −0.271018 + 5.196322i
12        4.060034 + 5.312240i      −0.351382 + 2.798099i
13        −1.993408 + 2.843178i     −4.484741 + 1.949605i
14        −2.244922 + 2.149233i     −2.502257 + 3.028937i
15        −1.004401 + 2.000920i     −2.408271 + 3.082930i
16        −1.035677 + 1.970778i     −2.378800 + 3.108714i
17        −1.001945 + 2.000213i     −1.998735 + 2.998907i
18        −0.999969 + 1.999991i     −1.999953 + 3.000017i
19        −0.999999 + 2.000000i     −2.000000 + 3.000000i
20        −1.000000 + 2.000000i     −2.000000 + 3.000000i
Solution  −1 + 2i                   −2 + 3i

  By changing order of this double summation, in view that X1 , X2, ..., Xp = X1∗, X2∗, ..., Xp∗ is a solution to the matrix equation (5.36), one has ⎧ ⎫ ⎨&p    ⎬ H( ) − ( ) Re ⎩ tr Pη 0 Xη∗ Xη 0 ⎭ η=1 ⎧ ⎡ ⎤⎫ ⎨&N &p      ⎬ = ⎣ H( ) − ( ) + ( )H − ( ) ⎦ Re ⎩ tr Ri 0 Aiη Xη∗ Xη 0 Biη Ri 0 Ciη Xη∗ Xη 0 Diη ⎭ i=1 η=1 ⎧ ⎡ ⎛ ⎞ ⎛ ⎞⎤⎫ ⎨&N &p   &p   ⎬ = ⎣ H( ) ⎝ − ( ) ⎠ + ( )H ⎝ − ( ) ⎠⎦ Re ⎩ tr Ri 0 Aiη Xη∗ Xη 0 Biη Ri 0 Ciη Xη∗ Xη 0 Diη ⎭ i=1 η=1 η=1 ⎧ ⎡ ⎛ ⎞ ⎛ ⎞⎤⎫ ⎪ ⎪ ⎨&N &p   &p   ⎬ ⎢ H ⎝ ⎠ H ⎝ ⎠⎥ = Re tr ⎣R (0) Aiη Xη∗ − Xη(0) Biη + Ri(0) Ciη Xη∗ − Xη(0) Diη ⎦ ⎩⎪ i ⎭⎪ i=1 η=1 η=1 212 5 Finite Iterative Approaches

Table 5.9  The iterative solutions y11 and y12 of Example 5.6

k         y11                       y12
1         8.107436 + 3.841159i      −15.217141 − 0.102782i
2         5.155349 + 6.828316i      −3.229129 + 2.306452i
3         −3.519672 + 3.997849i     1.533185 + 5.745239i
4         −6.797585 + 5.783695i     −1.681521 + 0.613774i
5         −6.456900 + 4.564556i     −1.243113 + 6.464076i
6         −7.515109 + 6.194286i     0.911598 + 1.486283i
7         −3.617872 + 9.081596i     0.704250 + 4.290845i
8         −4.198589 + 10.981406i    0.798544 + 6.504013i
9         −8.168625 + 10.100155i    −0.836288 + 4.285339i
10        −6.812222 + 7.014007i     −0.064564 + 2.884083i
11        −5.200774 + 8.118906i     −0.055010 + 1.365443i
12        −1.298577 + 5.519958i     −0.895111 + 0.154461i
13        1.469931 + 5.020278i      0.746374 + 0.207962i
14        0.882582 + 1.093780i      0.235797 − 0.196327i
15        0.469514 + 1.017673i      0.337215 − 0.311892i
16        0.468201 + 1.040238i      0.121819 − 0.312513i
17        −0.002053 + 1.001850i     0.000377 − 0.008051i
18        −0.000033 + 0.999991i     0.000085 − 0.000037i
19        0.000000 + 1.000000i      −0.000000 − 0.000000i
20        −0.000000 + 1.000000i     0.000000 − 0.000000i
Solution  i                         0

⎧ ⎡ ⎤⎫ ⎨&N &p     ⎬ = ⎣ H( ) − ( ) + − ( ) ⎦ Re ⎩ tr Ri 0 Aiη Xη∗ Xη 0 Biη Ciη Xη∗ Xη 0 Diη ⎭ i=1 η=1 ⎧ ⎡ ⎛ ⎞⎤⎫ ⎨&N &p  ⎬ = ⎣ H( ) ⎝ − ( ) + ( ) ⎠⎦ Re ⎩ tr Ri 0 Fi AiηXη 0 Biη CiηXη 0 Diη ⎭ i=1 η=1  &N  &N = H( ) ( ) =  ( )2 Re tr Ri 0 Ri 0 Ri 0 . i=1 i=1

This implies that (5.38) holds for k = 0. Now it is assumed that (5.38) holds for k = n, that is, ⎧ ⎫ ⎨&p   ⎬ &N H( ) − ( ) =  ( )2 Re ⎩ tr Pη n Xη∗ Xη n ⎭ Ri n . (5.45) η=1 i=1 5.3 Coupled Con-Sylvester Matrix Equations 213

Table 5.10  The iterative solutions y21 and y22 of Example 5.6

k         y21                       y22
1         −0.400599 − 0.561507i     4.942131 − 1.090591i
2         5.532919 − 3.383400i      6.740199 + 10.552393i
3         6.037187 − 9.661109i      2.324593 − 2.473660i
4         6.491023 − 13.5560702i    6.555710 − 2.712159i
5         8.555155 − 13.737884i     5.296378 − 4.181106i
6         9.274076 − 13.616279i     6.599247 − 4.221932i
7         7.680796 − 9.631572i      3.891927 − 7.286682i
8         4.450449 − 8.042246i      3.444553 − 4.032779i
9         4.449221 − 5.110071i      −2.061956 − 3.280957i
10        5.805825 − 1.251688i      0.510361 − 7.150132i
11        5.794121 − 0.292328i      −1.258448 − 7.251201i
12        7.516521 − 3.600232i      −2.752702 − 5.712915i
13        4.289013 − 0.062427i      0.260460 − 3.940731i
14        2.854136 + 0.532395i      0.049439 − 2.554961i
15        1.747045 + 0.270872i      0.546740 − 2.584227i
16        1.732147 + 0.276401i      0.588060 − 2.646472i
17        2.999777 + 0.998829i      1.003067 − 1.998567i
18        3.000037 + 0.999996i      1.000001 − 2.000005i
19        3.000000 + 1.000000i      1.000000 − 1.999999i
20        3.000000 + 1.000000i      1.000000 − 2.000000i
Solution  3 + i                     1 − 2i

[Fig. 5.3  Convergence curve for the residual: log10 r(k) versus the iteration number k]

It follows from Algorithm 5.9 that ⎧ ⎫ ⎨&p    ⎬ H( + ) − ( + ) Re ⎩ tr Pη n 1 Xη∗ Xη n 1 ⎭ η=1 ⎧  ⎫ ⎨&p &N    ⎬ = H( + ) + ( + )H − ( + ) Re ⎩ tr BiηRi n 1 Aiη DiηRi n 1 Ciη Xη∗ Xη n 1 ⎭ η=1 i=1 ⎧ ⎡ ⎤⎫ &N Rω(n + 1)2 ⎨ &p   ⎬ ω=1 ⎣ H ⎦ + & Re tr Pη (n) Xη∗ − Xη(n + 1) (5.46) N 2 ⎩ ⎭ Rω(n) η=1 ω=1 ⎧  ⎫ ⎨&p &N      ⎬ = H( + ) − ( + ) + ( + )H − ( + ) Re ⎩ tr Ri n 1 Aiη Xη∗ Xη n 1 Biη Ri n 1 Ciη Xη∗ Xη n 1 Diη ⎭ η=1 i=1 ⎧ ⎡ ⎤⎫ &N Rω(n + 1)2 ⎨ &p   ⎬ ω=1 ⎣ H ⎦ + & Re tr Pη (n) Xη∗ − Xη(n + 1) . N 2 ⎩ ⎭ Rω(n) η=1 ω=1   By changing order of double summation and considering that X1, X2, ..., Xp = X1∗ , X2∗, ..., Xp∗ is a solution to the matrix equation (5.36), one has ⎧ ⎡ ⎨&p &N    ⎣ H( + ) − ( + ) + Re ⎩ tr Ri n 1 Aiη Xη∗ Xη n 1 Biη η=1 i=1

H   Ri(n + 1) Ciη Xη∗ − Xη(n + 1) Diη ⎧ ⎡ ⎨&N &p    = ⎣ H( + ) − ( + ) + Re ⎩ tr Ri n 1 Aiη Xη∗ Xη n 1 Biη i=1 η=1

H   Ri(n + 1) Ciη Xη∗ − Xη(n + 1) Diη ⎧ ⎡ ⎛ ⎞ ⎨&N &p   = ⎣ H( + ) ⎝ − ( + ) ⎠ + Re ⎩ tr Ri n 1 Aiη Xη∗ Xη n 1 Biη i=1 η=1 ⎛ ⎞⎤⎫ &p   ⎬ ( + )H ⎝ − ( + ) ⎠⎦ Ri n 1 Ciη Xη∗ Xη n 1 Diη ⎭ η=1 ⎧ ⎡ ⎛ ⎞ ⎨&N &p   = ⎣ H( + ) ⎝ − ( + ) ⎠ + Re ⎩ tr Ri n 1 Aiη Xη∗ Xη n 1 Biη i=1 η=1 ⎛ ⎞⎤⎫ ⎪ &p   ⎬ H ⎝ ⎠⎥ Ri(n + 1) Ciη Xη∗ − Xη(n + 1) Diη ⎦ (5.47) ⎭⎪ η=1 ⎧ ⎡ ⎤⎫ ⎨&N &p     ⎬ = ⎣ H( + ) − ( + ) + − ( + ) ⎦ Re ⎩ tr Ri n 1 Aiη Xη∗ Xη n 1 Biη Ciη Xη∗ Xη n 1 Diη ⎭ i=1 η=1 5.3 Coupled Con-Sylvester Matrix Equations 215 ⎧ ⎡ ⎛ ⎨&N &p   = ⎣ H( + ) ⎝ + − Re ⎩ tr Ri n 1 AiηXη∗Biη CiηXη∗Diη i=1 η=1 ⎞⎤⎫ &p  ⎬ ( + ) + ( + ) ⎠⎦ AiηXη n 1 Biη CiηXη n 1 Diη ⎭ η=1 ⎧ ⎡ ⎛ ⎞⎤⎫ ⎨&N &p  ⎬ = ⎣ H( + ) ⎝ − ( + ) + ( + ) ⎠⎦ Re ⎩ tr Ri n 1 Fi AiηXη n 1 Biη CiηXη n 1 Diη ⎭ i=1 η=1 &N 2 = Ri(n + 1) . i=1

In addition, it follows from Algorithm 5.9 and the induction assumption (5.45) that ⎧ ⎡ ⎤⎫ ⎨ &p   ⎬ ⎣ H( ) − ( + ) ⎦ Re ⎩tr Pη n Xη∗ Xη n 1 ⎭ η=1 ⎧ ⎡ ⎛ ⎞⎤⎫ &N ⎨⎪ p 2 ⎬⎪ ⎢& ⎜ Rω(n) ⎟⎥ = H( ) − ( ) − ω=1 ( ) Re tr ⎣ Pη n ⎝Xη∗ Xη n &p Pη n ⎠⎦ ⎪ 2 ⎪ ⎩ η= Pτ (n) ⎭ 1 τ= ⎧ ⎡ ⎤⎫ 1 ⎨ &p   ⎬ = ⎣ H( ) − ( ) ⎦ − Re ⎩tr Pη n Xη∗ Xη n ⎭ (5.48) η=1 ⎧ ⎡ ⎤⎫ &N 2 Rω(n) ⎨ &p ⎬ ω=1 ⎣ H( ) ( )⎦ &p Re tr Pη n Pη n 2 ⎩ ⎭ Pτ (n) η= τ= 1 1 ⎛ ⎞ &N 2 &N Rω(n) &p / / =  ( )2 − ω=1 ⎝ / ( )/2⎠ Ri n &p Pη n  2 i=1 Pτ (n) η=1 τ=1 = 0.

From (5.46), (5.47), and (5.48), it can be obtained that ⎧ ⎫ ⎨&p   ⎬ &N H( + ) − ( + ) =  ( + )2 Re ⎩ tr Pη n 1 Xη∗ Xη n 1 ⎭ Ri n 1 . η=1 i=1

This implies that (5.38) holds for k = n + 1. By the principle of mathematical induction, the relation (5.38) therefore holds for all integers k ≥ 0.

5.3.5.2 Proof of Lemma 5.16

In view of
$$\operatorname{Re}\left\{\operatorname{tr}\left(A^{\mathrm{H}}B\right)\right\}=\operatorname{Re}\left\{\operatorname{tr}\left(B^{\mathrm{T}}\overline{A}\right)\right\}=\operatorname{Re}\left\{\overline{\operatorname{tr}\left(B^{\mathrm{H}}A\right)}\right\}=\operatorname{Re}\left\{\operatorname{tr}\left(B^{\mathrm{H}}A\right)\right\},$$
in order to prove the conclusion it suffices to show that (5.39) holds for $0\leq j<k\leq\varphi$. Next, this conclusion will be shown by the induction principle.

Step 1. In this step, one needs to show that
$$\operatorname{Re}\left\{\sum_{i=1}^{N}\operatorname{tr}\left[R_i^{\mathrm{H}}(1)R_i(0)\right]\right\}=0, \tag{5.49}$$
and
$$\operatorname{Re}\left\{\sum_{\eta=1}^{p}\operatorname{tr}\left[P_\eta^{\mathrm{H}}(1)P_\eta(0)\right]\right\}=0. \tag{5.50}$$

Using the proof of Lemma 5.14, one has  &N  H ( ) ( ) Re tr Ri 1 Ri 0 i=1 ⎧ ⎫  ' &N  N ⎨&p  ⎬ Rω( )2 = H( ) ( ) − ω=1 0 H( ) ( ) Re tr Ri 0 Ri 0 'p Re tr Pη 0 Pη 0 (5.51) Pτ ( )2 ⎩ ⎭ i=1 τ=1 0 η=1 ⎛ ⎞ ' &N N &p Rω( )2 / / =  ( )2 − ω=1 0 ⎝ / ( )/2⎠ Ri 0 'p Pη 0 Pτ ( )2 i=1 τ=1 0 η=1 = 0.

Thus, the relation (5.49) holds. From Algorithm 5.9 one also has ⎧ ⎫ ⎨&p  ⎬ H ( ) ( ) Re ⎩ tr Pη 1 Pη 0 ⎭ η=1 ⎧ ⎡  ⎤⎫ ⎨&p &N  H ⎬ = ⎣ H ( ) H + H ( ) H ( )⎦ Re ⎩ tr AiηRi 1 Biη Ciη Ri 1 Diη Pη 0 ⎭ η=1 i=1 5.3 Coupled Con-Sylvester Matrix Equations 217 ⎛ ⎞ &N 2 Rω(1) &p   ω=1 ⎝ H ⎠ + ( ) η( ) &N tr Pη 0 P 0 2 Rω(0) η=1 ω= ⎧ 1  ⎫ ⎨&p &N  ⎬ = H( ) + ( )H ( ) + Re ⎩ tr BiηRi 1 Aiη DiηRi 1 Ciη Pη 0 ⎭ η=1 i=1 ⎛ ⎞ &N 2 Rω(1) &p / / ω=1 ⎝ / /2⎠ η( ) &N P 0 2 Rω(0) η=1 ω= ⎧ 1  ⎫ ⎨&p &N  ⎬ = H( ) ( ) + ( )H ( ) + Re ⎩ tr Ri 1 AiηPη 0 Biη Ri 1 CiηPη 0 Diη ⎭ η=1 i=1 ⎛ ⎞ &N 2 Rω(1) &p / / ω=1 ⎝ / /2⎠ η( ) &N P 0 . 2 Rω(0) η=1 ω=1

In addition, by changing order of the double summation, it follows from (5.37) with k = 0 and (5.51) that ⎧  ⎫ ⎨&p &N  ⎬ H( ) ( ) + ( )H ( ) Re ⎩ tr Ri 1 AiηPη 0 Biη Ri 1 CiηPη 0 Diη ⎭ η=1 i=1 ⎧ ⎡ ⎤⎫ ⎨&N &p  ⎬ = ⎣ H( ) ( ) + ( )H ( ) ⎦ Re ⎩ tr Ri 1 AiηPη 0 Biη Ri 1 CiηPη 0 Diη ⎭ i= η=1 ⎧ 1 ⎡⎛ ⎞ ⎛ ⎞⎤⎫ ⎨&N &p &p ⎬ = ⎣⎝ H( ) ( ) ⎠ + ⎝ ( )H ( ) ⎠⎦ Re ⎩ tr Ri 1 AiηPη 0 Biη Ri 1 CiηPη 0 Diη ⎭ i=1 η=1 η=1 ⎧ ⎡⎛ ⎞ ⎛ ⎞⎤⎫ ⎪ ⎪ ⎨&N &p &p ⎬ ⎢⎝ H ⎠ ⎝ H ⎠⎥ = Re tr ⎣ R (1) AiηPη (0) Biη + Ri(1) CiηPη (0) Diη ⎦ ⎩⎪ i ⎭⎪ i=1 η=1 η=1 ⎧ ⎡ ⎤⎫ ⎨&N &p   ⎬ = ⎣ H( ) ( ) + ( ) ⎦ Re ⎩ tr Ri 1 AiηPη 0 Biη CiηPη 0 Diη ⎭ i=1 η=1 &p  2 Pτ (0) &N   = τ=1 H( ) ( ( ) − ( )) &N Re tr Ri 1 Ri 0 Ri 1 2 Rω(0) i=1 ω=1 &p   2 Pτ (0) &N =− τ=1  ( )2 &N Ri 1 . 2 Rω(0) i=1 ω=1 218 5 Finite Iterative Approaches

With the preceding two relations, one immediately obtains the relation (5.50). Step 2. Now, it is assumed that  ⎧ ⎫ &N   ⎨&p  ⎬ H( ) ( ) = , H ( ) ( ) = Re tr Ri v Ri j 0 Re ⎩ tr Pη v Pη j ⎭ 0 (5.52) i=1 η=1 for j ∈ I[0, v − 1], v ≥ 1. Then, it will be shown that   &N   &p   H( + ) ( ) = , H ( + ) ( ) = Re tr Ri v 1 Ri j 0 Re tr Pi v 1 Pi j 0 (5.53) i=1 i=1 for j ∈ I[0, v]. For convenience, let Pi(−1) = 0. By using Lemma 5.14, it follows from the induction assumption (5.52) that for j ∈ I[0, v − 1],  &N   H( + ) ( ) Re tr Ri v 1 Ri j = i 1 ⎧ ⎫  &N 2 &N   Rω(v) ⎨&p  ⎬ = H( ) ( ) − ω=1 H( ) ( ) Re tr R v Ri j &p Re tr Pη v Pη j i 2 ⎩ ⎭ = Pτ (v) η= i 1 τ= 1   1 ⎧ ⎫ &N &N 2 2 Rω(v) Rω(j) ⎨&p  ⎬ ω=1 ω=1 H +  ( ) η( − ) &p &N Re tr Pη v P j 1 2 2 ⎩ ⎭ Pτ (v) Rω(j − 1) η=1 τ=1 ω=1 = 0.

In addition, it can be obtained from Lemma 5.14 and induction assumption (5.52) that  &N   H ( + ) ( ) Re tr Ri v 1 Ri v i=1 ⎛ ⎞ &N &N Rω(v)2 &p / / =  ( )2 − ω=1 ⎝ / ( )/2⎠ Ri v &p Pη v (5.54) Pτ (v)2 i=1 τ= η=1 1 ! & 2 ⎧ ⎫ N 2 Rω(v) p ω= ⎨&  ⎬ 1 H +  ! Re tr Pη (v)Pη(v − 1) &p &N ⎩ ⎭ Pτ (v)2 Rω(v − 1)2 η=1 τ=1 ω=1 ! & 2 ⎧ ⎫ N 2 Rω(v) p ω= ⎨&  ⎬ 1 H =  ! Re tr Pη (v)Pη(v − 1) . &p &N ⎩ ⎭ Pτ (v)2 Rω(v − 1)2 η=1 τ=1 ω=1 = 0. 5.3 Coupled Con-Sylvester Matrix Equations 219

The previous relations reveal that the first relation in (5.53) holds for j ∈ I[0, v]. From Algorithm 5.9, it can also be obtained that ⎧ ⎫ ⎨&p  ⎬ H( + ) ( ) Re ⎩ tr Pη v 1 Pη j ⎭ η=1 ⎧ ⎡  ⎤⎫ ⎨&p &N  H ⎬ = ⎣ H ( + ) H + H ( + ) H ( )⎦ Re ⎩ tr AiηRi v 1 Biη Ciη Ri v 1 Diη Pη j ⎭ η=1 i=1 ⎧ ⎫ &N 2 Rω(v + 1) ⎨&p  ⎬ ω=1 H + ( ) η( ) &N Re tr Pη v P j 2 ⎩ ⎭ Rω(v) η=1 ω= ⎧ 1 ⎫ ⎨&p &N  ⎬ = H( + ) + ( + )H ( ) Re ⎩ tr BiηRi v 1 Aiη DiηRi v 1 Ciη Pη j ⎭ η=1 i=1 ⎧ ⎫ &N 2 Rω(v + 1) ⎨&p  ⎬ ω=1 H + ( ) η( ) &N Re tr Pη v P j 2 ⎩ ⎭ Rω(v) η=1 ω= ⎧ 1 ⎫ ⎨&p &N  ⎬ = H( + ) ( ) + ( + )H ( ) Re ⎩ tr Ri v 1 AiηPη j Biη Ri v 1 CiηPη j Diη ⎭ η=1 i=1 ⎧ ⎫ &N 2 Rω(v + 1) ⎨&p  ⎬ ω=1 H + ( ) η( ) &N Re tr Pη v P j . 2 ⎩ ⎭ Rω(v) η=1 ω=1

By changing order of the double summation and applying (5.37), one has

⎧  ⎫ ⎨&p &N  ⎬ H( + ) ( ) + ( + )H ( ) Re ⎩ tr Ri v 1 AiηPη j Biη Ri v 1 CiηPη j Diη ⎭ η= = ⎧ 1 ⎡i 1 ⎤⎫ ⎨&N &p  ⎬ = ⎣ H( + ) ( ) + ( + )H ( ) ⎦ Re ⎩ tr Ri v 1 AiηPη j Biη Ri v 1 CiηPη j Diη ⎭ = η= ⎧ i 1 ⎡⎛ 1 ⎞ ⎛ ⎞⎤⎫ ⎨&N &p &p ⎬ = ⎣⎝ H( + ) ( ) ⎠ + ⎝ ( + )H ( ) ⎠⎦ Re ⎩ tr Ri v 1 AiηPη j Biη Ri v 1 CiηPη j Diη ⎭ i=1 η=1 η=1 ⎧ ⎡⎛ ⎞ ⎛ ⎞⎤⎫ ⎪ ⎪ ⎨&N &p &p ⎬ ⎢⎝ H ⎠ ⎝ H ⎠⎥ = Re tr ⎣ R (v + 1) AiηPη(j)Biη + Ri(v + 1) CiηPη(j)Diη ⎦ ⎩⎪ i ⎭⎪ i=1 η=1 η=1 220 5 Finite Iterative Approaches ⎧ ⎡ ⎤⎫ ⎨&N &p  ⎬ = ⎣ H( + ) ( ) + ( ) ⎦ Re ⎩ tr Ri v 1 AiηPη j Biη CiηPη j Diη ⎭ i=1 η=1 &p  Pτ (j)2 &N   = τ=1 H( + ) ( ( ) − ( + )) & Re tr Ri v 1 Ri j Ri j 1 . N 2 Rω(j) i=1 ω=1 Combining the previous two relations, one has ⎧ ⎫ ⎨&p  ⎬ H( + ) ( ) Re ⎩ tr Pη v 1 Pη j ⎭ η=1 &p  2 Pτ (j) &N   = τ=1 H( + ) ( ( ) − ( + )) &N Re tr Ri v 1 Ri j Ri j 1 (5.55) 2 Rω(j) i=1 ω=1 ⎧ ⎫ &N 2 Rω(v + 1) ⎨&p  ⎬ ω=1 H + ( ) η( ) &N Re tr Pη v P j . 2 ⎩ ⎭ Rω(v) η=1 ω=1

From this relation, it follows from the induction assumption (5.52) and the proven fact that the first relation in (5.53) holds for j ∈ I[0, v], that for j ∈ I[0, v − 1] ⎧ ⎫ ⎨&p  ⎬ H( + ) ( ) = Re ⎩ tr Pη v 1 Pη j ⎭ 0. η=1

In addition, it can also be obtained from (5.55) and (5.55) that ⎧ ⎫ ⎨&p  ⎬ H( + ) ( ) Re ⎩ tr Pη v 1 Pη v ⎭ η=1 &p    Pτ (v)2 &N   &N   = τ=1 H( + ) ( ) − H( + ) ( + ) &N Re tr Ri v 1 Ri v Re tr Ri v 1 Ri v 1 Rω(v)2 = = ω= i 1 i 1 1 ⎧ ⎫ &N Rω(v + 1)2 ⎨&p  ⎬ ω=1 H + & Re tr P (v)Pη(v) N ⎩ η ⎭ Rω(v)2 η= ω= 1 1 ⎛ ⎞ &p   &N Pτ (v)2 &N Rω(v + 1)2 &p / / τ=1 2 ω=1 ⎝ / /2⎠ =−& Ri (v + 1) + & Pη(v) N 2 N 2 Rω(v) i=1 Rω(v) η=1 ω=1 ω=1 = 0.

Thus, the second relation in (5.53) holds for j ∈ I[0, v]. 5.3 Coupled Con-Sylvester Matrix Equations 221

From Step 1 and Step 2, the conclusion is immediately obtained by the principle of mathematical induction.

5.4 Notes and References

In this chapter, some finite iterative algorithms have been presented to obtain a solution for some complex conjugate matrix equations. These matrix equations include the generalized con-Sylvester matrix equation, the extended con-Sylvester matrix equation and the coupled con-Sylvester matrix equation. By these algorithms, the solvability of these matrix equations can be determined automatically. With a real inner product in complex matrix spaces as a tool, it has been proven that, when the considered matrix equation is solvable, the proposed algorithms give an exact solution within finite iteration steps for any initial values in the absence of round-off errors. The main contents of this chapter are taken from the work [252, 270, 275]. In addition, the proposed algorithms do not require the coefficient matrices to be in any canonical form.

For finite iterative algorithms, there are many other results reported in the literature. In [199], a finite iterative method was proposed for solving the generalized Roth matrix equation AXB + CYD = E. In [274] an iterative algorithm was constructed to give a common solution to a group of complex conjugate matrix equations; in that paper, the real inner product was also used to show the orthogonality of residuals. In [229, 279], finite iterative algorithms were established for solving coupled algebraic Lyapunov equations appearing in discrete-time and continuous-time linear systems with Markovian transitions, respectively.

Part II
Explicit Solutions

Chapter 6
Real-Representation-Based Approaches

In the previous three chapters, some iterative approaches have been established for solving some complex conjugate matrix equations. From this chapter onwards, the aim is to give explicit solutions for several classes of complex conjugate matrix equations. In this chapter, the explicit solutions are constructed with the aid of the real representations of complex matrices introduced in Sect. 2.6. The involved equations include the normal con-Sylvester matrix equations, the con-Kalman-Yakubovich matrix equations, the con-Sylvester matrix equations and con-Yakubovich matrix equations. They are investigated respectively in Sects. 6.1–6.4.

In this chapter, some symbols need to be adopted. For $A\in\mathbb{C}^{n\times n}$, $B\in\mathbb{C}^{n\times r}$, and $C\in\mathbb{C}^{m\times n}$, it is denoted that
$$\operatorname{Ctr}(A,B)=\begin{bmatrix}B & AB & \cdots & A^{n-1}B\end{bmatrix},$$
$$\operatorname{Obs}_k(A,C)=\begin{bmatrix}C\\ CA\\ \vdots\\ CA^{k-1}\end{bmatrix},$$
$$f_A(s)=\det(sI-A)=s^n+\alpha_{n-1}s^{n-1}+\cdots+\alpha_1 s+\alpha_0,$$
$$S_r(A)=\begin{bmatrix}\alpha_1 I_r & \alpha_2 I_r & \cdots & \alpha_{n-1}I_r & I_r\\ \alpha_2 I_r & \alpha_3 I_r & \cdots & I_r & \\ \vdots & \vdots & & & \\ \alpha_{n-1}I_r & I_r & & & \\ I_r & & & & \end{bmatrix}.$$

Obviously, Ctr(A, B) is the controllability matrix of the matrix pair (A, B), Obsn(A, C) is the observability matrix of the matrix pair (A, C), and Sr(A) is a symmetric operator matrix.
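These three quantities are straightforward to form numerically. The following NumPy sketch is illustrative only (the helper names ctr, obs and s_r are mine); s_r builds the symmetric operator matrix $S_r(A)$ directly from the coefficients $\alpha_0,\ldots,\alpha_{n-1}$ of $f_A(s)$, which can in turn be obtained from any characteristic-polynomial routine.

```python
import numpy as np

def ctr(A, B):
    """Ctr(A, B) = [B, AB, ..., A^(n-1) B], with n the size of A."""
    n = A.shape[0]
    blocks, M = [], B.copy()
    for _ in range(n):
        blocks.append(M)
        M = A @ M
    return np.hstack(blocks)

def obs(A, C, k):
    """Obs_k(A, C) = [C; CA; ...; CA^(k-1)]."""
    blocks, M = [], C.copy()
    for _ in range(k):
        blocks.append(M)
        M = M @ A
    return np.vstack(blocks)

def s_r(alpha, r):
    """S_r(A) built from f_A(s) = s^n + alpha[n-1] s^(n-1) + ... + alpha[0]:
    block (i, j) equals alpha_{i+j+1} I_r (with alpha_n = 1), and 0 when i + j + 1 > n."""
    n = len(alpha)
    a = list(alpha) + [1.0]                    # a[n] = 1 for the monic polynomial
    S = np.zeros((n * r, n * r), dtype=complex)
    for i in range(n):
        for j in range(n):
            if i + j + 1 <= n:
                S[i*r:(i+1)*r, j*r:(j+1)*r] = a[i + j + 1] * np.eye(r)
    return S
```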


In addition, for the matrix A ∈ Cn×n the following symbols are also used:

$$g_A(s)=\det(I-sA)=\beta_n s^n+\beta_{n-1}s^{n-1}+\cdots+\beta_1 s+1,$$
$$T_r(A)=\begin{bmatrix}I_r & \beta_1 I_r & \beta_2 I_r & \cdots & \beta_{n-1}I_r\\ & I_r & \beta_1 I_r & \cdots & \beta_{n-2}I_r\\ & & \ddots & & \vdots\\ & & & I_r & \beta_1 I_r\\ & & & & I_r\end{bmatrix}.$$

For a square matrix pair (E, A), f(E,A) (s) and λ (E, A) are used to denote its charac- teristic polynomial and finite eigenvalue set, respectively. That is,

f(E,A) (s) = det (sE − A) , λ (E, A) = {s | det (sE − A) = 0} .

6.1 Normal Con-Sylvester Matrix Equations

In this section, the following matrix equation is investigated

$$XF-A\overline{X}=C, \tag{6.1}$$
where the matrices $A\in\mathbb{C}^{n\times n}$, $F\in\mathbb{C}^{p\times p}$, and $C\in\mathbb{C}^{n\times p}$ are known, and $X\in\mathbb{C}^{n\times p}$ is the matrix to be determined. This matrix equation is a conjugate version of the normal Sylvester matrix equation
$$XF-AX=C, \tag{6.2}$$
where the matrices $A\in\mathbb{C}^{n\times n}$, $F\in\mathbb{C}^{p\times p}$, and $C\in\mathbb{C}^{n\times p}$ are known, and $X\in\mathbb{C}^{n\times p}$ is the matrix to be determined. For this reason, the matrix equation (6.1) is referred to as the normal con-Sylvester matrix equation. In Subsection 6.1.1, the solvability condition is investigated for the normal con-Sylvester matrix equation (6.1). In Subsection 6.1.2, the uniqueness of the solution to this matrix equation is studied. In Subsection 6.1.3, an explicit solution is given for this matrix equation with the aid of real representations of complex matrices.

6.1.1 Solvability Conditions

On the existence of solutions to the normal con-Sylvester matrix equation (6.1), one has the following theorem. 6.1 Normal Con-Sylvester Matrix Equations 227

Theorem 6.1  Let $A\in\mathbb{C}^{n\times n}$, $F\in\mathbb{C}^{p\times p}$, and $C\in\mathbb{C}^{n\times p}$ be given. The normal con-Sylvester matrix equation (6.1) has a solution if and only if the following two partitioned matrices are consimilar:
$$\begin{bmatrix}A & -C\\ 0 & F\end{bmatrix},\qquad \begin{bmatrix}A & 0\\ 0 & F\end{bmatrix}. \tag{6.3}$$

Proof For any X ∈ Cn×p, the following matrix  is invertible:

IX  = . 0 I

In addition, it can be easily known that

− I −X  1 = . 0 I

Thus, by simple calculation one has

A −C −1 AXF− AX − C   = , 0 F 0 F which says that the following two matrices are consimilar:

A −C AXF− AX − C , . 0 F 0 F

Thus, if there is a solution X to the matrix equation (6.1), then the two matrices in (6.3) are consimilar. Now, let us prove the sufficiency. Assume that there is a nonsingular S ∈ C(n+p)×(n+p) such that A 0 −1 A −C S S = . 0 F 0 F

(n+p)×(n+p) (n+p)×(n+p) Define two real linear mappings Ti : C → C , i = 1, 2, by

A 0 A 0 T (X ) = X − X , 1 0 F 0 F

A −C A 0 T (X ) = X − X . 2 0 F 0 F 228 6 Real-Representation-Based Approaches

By the consimilarity hypothesis, for all X ∈ C(n+p)×(n+p) one has

A 0 −1 A 0 T (X ) = S S X − X 2 0 F 0 F

A 0 −1 − A 0 = S S X − SS 1X 0 F 0 F

A 0 − A 0 = S S−1X − S 1X 0 F 0 F   −1 = ST1 S X .   −1 This relation implies that T2 (X ) = 0 if and only if T1 S X = 0. The nonsin- gularity of S implies that rdim (KerT1) = rdim (KerT2). Now, it is denoted that X X X = 11 12 , X21 X22

n×n n×p p×n p×p with X11 ∈ C , X12 ∈ C , X21 ∈ C , and X22 ∈ C . With this partition, simple calculation gives

AX − X AAX − X F T (X ) = 11 11 12 12 , 1 FX − X AFX − X F 21 21 22 22 AX11 − X11A − CX21 AX12 − X12F − CX22 T2 (X ) = . FX21 − X21AFX22 − X22F

If KerT2 contains a matrix of the form

X X 11 12 ,(6.4) 0 −I then the proof is done, as X = X12 would satisfy XF − AX = C. In order to show that there is a matrix of this special form in KerT2, let us consider two real linear mappings: p×(n+p) Li : KerTi → C , i = 1, 2, given by the projections

  X11 X12 Li : → X21 X22 , i = 1, 2. X21 X22 6.1 Normal Con-Sylvester Matrix Equations 229

Notice that X ∈ KerL1 if and only if ⎧ ⎪ X = 0, ⎨⎪ 21 X22 = 0, ⎪ − = , ⎩⎪ AX11 X11A 0 AX12 − X12F = 0, and that exactly the same four conditions characterize the kernel of L2, so the kernel of L1 and L2 are identical. Moreover, notice that    Image L1 = X21 X22 | FX21 − X21A = 0 and FX22 − X22F = 0 and

Image L2  = X X | FX − X A = 0, FX − X F = 0 and there exist 21 22 21 21 22 22  X11, X12 such that AX11 − X11A = CX21 and AX12 − X12F = CX22 .

Thus, Image L2 ⊆ Image L1. By using Theorem 2.13, one has

rdim (KerLi) + rdim (Image Li) = rdim (KerTi) , i = 1, 2. (6.5)

Since it has been shown that

rdim (KerT1) = rdim (KerT2) ,rdim(KerL1) = rdim (KerL2) , then it follows from (6.5) that

rdim (Image L1) = rdim (Image L2) .

Combining this fact with Image L2 ⊆ Image L1, gives Image L1 = Image L2. Finally, it is easily found that

00 ∈ KerT . 0 −I 1

Thus,   0 −I ∈ Image L1 = Image L2, from which it follows that there is some X ∈ KerT2 of the form in (6.4), which is what one needed to show.  The result in the above theorem was given in [21]. In [210], the solvability of the normal Sylvester matrix equation AX − XB = C with A ∈ Cn×n, B ∈ Cp×p, and C ∈ Cn×p known was investigated. By now, it is well-known that the normal Sylvester 230 6 Real-Representation-Based Approaches matrix equation has a solution if and only if the two matrices in (6.3) are similar. Therefore, the result in Theorem 6.1 can be viewed as a conjugate version of that in [210]. In addition, an alternative proof of the result on the normal Sylvester matrix equation was given in [126, 143]. The proof of Theorem 6.1 is a slight modification of that in [126, 143].

Remark 6.1 It should be pointed out that the normal con-Sylvester mapping X → XF − AX is not linear, but real linear. Therefore, the result in Sect. 2.8 on real linear mappings is used in the proof of Theorem 6.1.

6.1.2 Uniqueness Conditions

In this subsection, the aim is to give a necessary and sufficient condition for the normal con-Sylvester matrix equation (6.1) to have a unique solution. The following two kinds of matrices are needed,
$$P_j=\begin{bmatrix}I_j & 0\\ 0 & -I_j\end{bmatrix},\qquad Q_j=\begin{bmatrix}0 & I_j\\ -I_j & 0\end{bmatrix},$$
which are defined in Sect. 2.6. Now, the real representation of the matrix equation (6.1) is defined as
$$Y\left(\overline{F}\right)_\sigma-A_\sigma Y=C_\sigma P_p. \tag{6.6}$$
By Lemma 2.13, one has
$$\left(XF-A\overline{X}\right)_\sigma=X_\sigma P_p F_\sigma-A_\sigma X_\sigma P_p.$$
Thus the matrix equation (6.1) is converted into
$$X_\sigma P_p F_\sigma-A_\sigma X_\sigma P_p=C_\sigma.$$
In view of Lemma 2.13 and $P_p^2=I_{2p}$, post-multiplying both sides of this equation by $P_p$ gives
$$X_\sigma\left(\overline{F}\right)_\sigma-A_\sigma X_\sigma=C_\sigma P_p.$$

So the following conclusion is easily obtained.

Proposition 6.1 Given matrices A ∈ Cn×n, F ∈ Cp×p, and C ∈ Cn×p, the normal con-Sylvester matrix equation (6.1) has a solution X if and only if its real represen- tation matrix equation (6.6) has a real matrix solution Y = Xσ .
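The equivalence in Proposition 6.1 is easy to verify numerically. The following sketch is illustrative only: it assumes the real representation has the form $A_\sigma=\begin{bmatrix}\operatorname{Re}A & \operatorname{Im}A\\ \operatorname{Im}A & -\operatorname{Re}A\end{bmatrix}$ (Sect. 2.6 is not reproduced here, so this form is an assumption, chosen to be consistent with the identities used above), and it checks the identity behind (6.6) on random data.

```python
import numpy as np

def sigma(M):
    """Real representation M_sigma (assumed form [[Re M, Im M], [Im M, -Re M]])."""
    return np.block([[M.real, M.imag], [M.imag, -M.real]])

def P(j):
    """P_j = diag(I_j, -I_j), as used throughout this section."""
    return np.block([[np.eye(j), np.zeros((j, j))], [np.zeros((j, j)), -np.eye(j)]])

# Check (X F - A conj(X))_sigma = X_sigma P_p F_sigma - A_sigma X_sigma P_p,
# the step that converts (6.1) into the real matrix equation (6.6).
rng = np.random.default_rng(1)
n, p = 3, 2
A = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
F = rng.standard_normal((p, p)) + 1j * rng.standard_normal((p, p))
X = rng.standard_normal((n, p)) + 1j * rng.standard_normal((n, p))
lhs = sigma(X @ F - A @ np.conj(X))
rhs = sigma(X) @ P(p) @ sigma(F) - sigma(A) @ sigma(X) @ P(p)
assert np.allclose(lhs, rhs)
```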

For the real representation matrix equation (6.6), the following result can be easily derived. 6.1 Normal Con-Sylvester Matrix Equations 231

n×n p×p n×p Lemma 6.1 Let A ∈ C , F ∈ C , and C ∈ C .IfY = Y0 is a real solution to the real representation equation (6.6), then Y = QnY0Qp is also a real solution of the matrix equation (6.6).

Proof According to Item (5) of Lemma 2.13, the matrix equation (6.6) is equivalent to

YQp(F)σ Qp − QnAσ QnY = QnCσ QpPp.

−1 =− = Note Qj Qj and QpPpQp Pp. Then, post-multiplying the two sides of the −1 above equation by Qp , one has

YQp(F)σ + QnAσ QnYQp =−QnCσ Pp.

−1 Pre-multiplying both sides of this equation by Qn ,gives

QnYQp(F)σ − Aσ QnYQp = Cσ Pp.

This shows that if Y = Y0 is a real solution of the matrix equation (6.6), then Y = QnY0Qp is also a real solution of the matrix equation (6.6). 

With the above lemma as a tool, on the uniqueness of the solution to the normal con-Sylvester matrix equation (6.1) one has the following conclusion.

Theorem 6.2 Given matrices A ∈ Cn×n, F ∈ Cp×p, and C ∈ Cn×p, the normal con-Sylvester matrix equation (6.1) has a unique solution if and only if its real representation matrix equation (6.6) has a unique solution.

Proof Necessity: It is assumed that X = X0 is the unique solution to the normal con-Sylvester matrix equation (6.1). Since X = X0 is a solution to the matrix equation (6.1), then Y = (X0)σ is a solution of the real representation equation (6.6) by Proposition 6.1. Next, it will be shown that Y = (X0)σ is the unique solution of (6.6). If not, then the matrix equation (6.6) has another solution Y = Y1 = (X0)σ . It follows from Lemma 6.1 that Y = QnY1Qp is also a solution of (6.6). Denote

Y11 Y12 n×p Y1 = , Yij ∈ R , i, j = 1, 2. Y21 Y22

Then, simple calculation gives

1   1 (Y − Y ) 1 (Y + Y ) + = 2 11 22 2 12 21 Y1 QnY1Qp 1 ( + ) − 1 ( − ) . 2 2 Y12 Y21 2 Y11 Y22

Let 1 1 X = (Y − Y ) + (Y + Y ) i , 1 2 11 22 2 12 21 232 6 Real-Representation-Based Approaches then 1   Y = Y + Q Y Q = (X )σ 2 1 n 1 p 1 is also a solution of (6.6). With this, it follows from Proposition 6.1 that X1 is a solution of (6.1). Since both Qn and Qp are nonsingular, then QnY1Qp = Qn (X0)σ Qp. In addition, it is known from Item (5) of Lemma 2.13 that Qn (X0)σ Qp = (X0)σ . Thus, QnY1Qp = (X0)σ . Therefore,

1   (X )σ = Y + Q Y Q = (X )σ . 1 2 1 n 1 p 0

This implies that X = X1 is another solution of the matrix equation (6.6) besides X = X0. This contradicts to the assumption that X = X0 is the unique solution to the matrix equation (6.1). Therefore, Y = (X0)σ is the unique solution of the matrix equation (6.6). Sufficiency: It is assumed that Y = Y0 is the unique solution of the matrix equa- tion (6.6). By Lemma 6.1, Y = QnY0Qp is a solution of (6.6). By the uniqueness, there holds Y0 = QnY0Qp. Such a relation implies that there exists a complex matrix X0 such that (X0)σ = Y0. Thus, it follows from Proposition 6.1 that X = X0 is a solution of the matrix equation (6.1). If the solution of the matrix equation (6.1) is not unique, then it has another solution X = X1 = X0. According to Proposition 6.1, the real representation equation (6.6) has a solution Y = (X1)σ . Since X1 = X0, then (X1)σ = (X0)σ . This implies that the matrix equation (6.6) has two solutions Y = (X1)σ and Y = (X0)σ at least. This contradicts to the assumption that the matrix equation (6.6) has a unique solution. Therefore, the solution of the matrix equation (6.1) is unique. The proof is thus completed.  By combining Theorem 6.2 and Lemma 2.16, one has the following interesting result. × × Theorem 6.3 Given matrices A ∈ Cn n and F ∈ Cp p, the normal con-Sylvester  ∈ Cn×p λ ∩ matrix  equation (6.1) has a unique solution for any C if and only if AA λ FF = ∅. Proof According to Theorem 6.2, the normal con-Sylvester matrix equation (6.1) has a unique solution  if and only if the matrix equation (6.6) has a unique solution. Thus, λ (Aσ ) ∩ λ Fσ = ∅. By using Item (3) of Lemma 2.16, one can obtain that λ( ) ∩ λ( ) = ∅ Aσ Fσ ⇐⇒ ( ) = det fAσ Fσ 0 ⇐⇒ ( ( )) = det fAA FF σ det Pn 0 ⇐⇒ ( ) = det fAA FF 0 ⇐⇒ λ(AA) ∩ λ(FF) = ∅. 6.1 Normal Con-Sylvester Matrix Equations 233

This completes the proof.  Remark 6.2 The result in Theorem 6.3 was given in [21]. In this subsection, an alternative proof is given for this result.

6.1.3 Solutions

In this subsection, explicit solutions are given for the normal con-Sylvester matrix equation (6.1) by means of the real representation of complex matrices in Sect.2.6. Before proceeding, the following result is needed on the normal Sylvester matrix equation (6.2). Lemma 6.2 Given matrices A ∈ Cn×n, F ∈ Cp×p, and C ∈ Cn×p,let

$$f_A(s)=\det(sI-A)=\alpha_n s^n+\cdots+\alpha_1 s+\alpha_0,\quad \alpha_n=1,$$
and
$$\operatorname{adj}(sI-A)=R_{n-1}s^{n-1}+\cdots+R_1 s+R_0.$$

(1) If $X$ is a solution of the normal Sylvester matrix equation (6.2), then
$$Xf_A(F)=\sum_{i=0}^{n-1}R_i C F^i.$$
(2) If $X$ is the unique solution of the normal Sylvester matrix equation (6.2), then
$$X=\left(\sum_{i=0}^{n-1}R_i C F^i\right)\left(f_A(F)\right)^{-1}.$$

Remark 6.3  The normal Sylvester matrix equation (6.2) has been extensively investigated. The result in Lemma 6.2 can be found in [47, 152, 182, 258]. In [258], this result is obtained by using the so-called Kronecker maps as tools.

Lemma 6.2 provides a very neat explicit solution to the normal Sylvester matrix equation (6.2). To obtain this solution, one needs the coefficients $\alpha_i$, $i\in\mathbf{I}[0,n-1]$, of the characteristic polynomial of the matrix $A$ and the coefficient matrices $R_i$, $i\in\mathbf{I}[0,n-1]$, of the adjoint matrix $\operatorname{adj}(sI-A)$. According to Sect. 2.2, they can be obtained by the following well-known Leverrier algorithm:
$$\begin{cases}R_{n-k}=R_{n-k+1}A+\alpha_{n-k+1}I_n, & R_{n-1}=I_n,\\[2pt] \alpha_{n-k}=-\dfrac{1}{k}\operatorname{tr}\left(R_{n-k}A\right), & \alpha_n=1,\end{cases}\qquad k\in\mathbf{I}[1,n]. \tag{6.7}$$
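A brief NumPy sketch of the recursion (6.7) is given next; it is illustrative only (the function name leverrier and the test matrix are mine), and the result is cross-checked against NumPy's characteristic-polynomial routine.

```python
import numpy as np

def leverrier(A):
    """Leverrier algorithm (6.7): alpha_0..alpha_{n-1} of f_A(s) = det(sI - A) and
    the matrices R_0..R_{n-1} of adj(sI - A) = R_{n-1} s^{n-1} + ... + R_0."""
    n = A.shape[0]
    alpha = [None] * (n + 1)
    R = [None] * n
    alpha[n] = 1.0
    R[n - 1] = np.eye(n, dtype=A.dtype)
    for k in range(1, n + 1):
        if k > 1:
            R[n - k] = R[n - k + 1] @ A + alpha[n - k + 1] * np.eye(n, dtype=A.dtype)
        alpha[n - k] = -np.trace(R[n - k] @ A) / k
    return alpha[:n], R

# Quick check: for A = [[1, 2], [3, 4]], f_A(s) = s^2 - 5s - 2.
A = np.array([[1.0, 2.0], [3.0, 4.0]])
alpha, R = leverrier(A)
assert np.allclose(alpha, np.poly(A)[::-1][:2])    # alpha_0 = -2, alpha_1 = -5
```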

In the following, an equivalent statement of Lemma 6.2 is provided. 234 6 Real-Representation-Based Approaches

Theorem 6.4 Given matrices A ∈ Cn×n, F ∈ Cp×p, and C ∈ Cn×p,let

n fA(s) = det(sI − A) = αns +···+α1s + α0,αn = 1.

(1) If X is a solution of the normal Sylvester matrix equation (6.2), then

n k−1 j k−j−1 XfA(F) = αkA CF . k=1 j=0

(2) If X is the unique solution of the normal Sylvester matrix equation (6.2), then ⎛ ⎞ n k−1   ⎝ j k−j−1⎠ −1 X = αkA CF fA(F) . k=1 j=0

Proof From (6.7), it is easily obtained ⎧ ⎪ = α + α +···+α n−2 + n−1 ⎨⎪ R0 1In 2A n−1A A ··· . ⎪ = α + (6.8) ⎩⎪ Rn−2 n−1In A Rn−1 = In

This relation can be compactly expressed as

n k−j−1 Rj = αkA ,αn = 1, j ∈ I[0, n − 1]. k=j+1

Thus, one has ⎛ ⎞ n−1 n−1 n j ⎝ k−j−1⎠ j RjCF = αkA CF j=0 j=0 k=j+1 n k−1 k−j−1 j = αkA CF k=1 j=0 n k−1 j k−j−1 = αkA CF . k=1 j=0

Combining this with Lemma 6.2, gives the conclusions. 

Remark 6.4 The result given in Theorem 6.4 is the same as in [152, 155]. Note, there exists a small flaw in the Lemma 1.1 of [155]. The index j in the two expressions should start from j = 0. 6.1 Normal Con-Sylvester Matrix Equations 235

Remark 6.5 Different from Lemma 6.2, only the coefficients of the characteristic polynomial of the matrix A are required when Theorem 6.4 is used to obtain the solution of the normal Sylvester matrix equation (6.2).

With the previous preliminary results, the solution of the normal con-Sylvester matrix equation (6.1) now can be investigated. The following result provides a method for solving this matrix equation based on its corresponding real representation equa- tion.

Theorem 6.5  Given matrices $A\in\mathbb{C}^{n\times n}$, $F\in\mathbb{C}^{p\times p}$, and $C\in\mathbb{C}^{n\times p}$, the normal con-Sylvester matrix equation (6.1) has a solution $X$ if and only if the matrix equation (6.6) has a solution $Y\in\mathbb{R}^{2n\times 2p}$. In this case, if $Y$ is a real solution of the matrix equation (6.6), then the following matrix
$$X=\frac{1}{4}\begin{bmatrix}I_n & \mathrm{i}I_n\end{bmatrix}\left(Y+Q_n Y Q_p\right)\begin{bmatrix}I_p\\ \mathrm{i}I_p\end{bmatrix} \tag{6.9}$$
is a solution of the normal con-Sylvester matrix equation (6.1).

Proof Only the following fact is shown: if

Y11 Y12 n×p Y = , Yij ∈ R , i, j = 1, 2, (6.10) Y21 Y22 is a solution to the matrix equation (6.6), then the matrix given in (6.9)isalsoa solution to the normal con-Sylvester matrix equation (6.1). Since Y is a real solution of the matrix equation (6.6), then QnYQp is also a real solution of the matrix equation (6.6) by Lemma 6.1. Thus the following matrix is also a solution to the matrix equation (6.6):

1   Yˆ = Y + Q YQ . 2 n p Substituting (6.10) into the preceding expression, gives

Z Z Yˆ = 0 1 , (6.11) Z1 −Z0 with 1 1 Z = (Y − Y ) , Z = (Y + Y ) . 0 2 11 22 1 2 12 21 236 6 Real-Representation-Based Approaches

From (6.11), one can construct a complex matrix as follows

  1 ˆ Ip X = Z0 + Z1 i = In iIn Y . 2 iIp

Clearly, the real representation of the complex matrix X is Yˆ . By Proposition 6.1, the matrix X given in (6.9) is a solution to the normal con-Sylvester matrix equation (6.1). 
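The reconstruction (6.9) is a one-line computation once a real solution $Y$ of (6.6) is available. The sketch below is illustrative (the names Q, sigma and extract_solution are mine, and sigma uses the assumed real-representation form from the earlier sketch); the final assertion checks the round trip: if $Y$ happens to be $X_\sigma$ for some $X$, then (6.9) returns $X$ itself.

```python
import numpy as np

def Q(j):
    """Q_j = [[0, I_j], [-I_j, 0]], as used in (6.9)."""
    return np.block([[np.zeros((j, j)), np.eye(j)], [-np.eye(j), np.zeros((j, j))]])

def sigma(M):
    """Assumed real-representation form, as in the earlier sketch."""
    return np.block([[M.real, M.imag], [M.imag, -M.real]])

def extract_solution(Y, n, p):
    """Recover X from a real solution Y of (6.6) via (6.9)."""
    Yhat = Y + Q(n) @ Y @ Q(p)
    left = np.hstack([np.eye(n), 1j * np.eye(n)])
    right = np.vstack([np.eye(p), 1j * np.eye(p)])
    return 0.25 * left @ Yhat @ right

rng = np.random.default_rng(4)
X = rng.standard_normal((3, 2)) + 1j * rng.standard_normal((3, 2))
assert np.allclose(extract_solution(sigma(X), 3, 2), X)
```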

Remark 6.6 The latter part of the proof of Theorem 6.5 is taken from the proof of the Theorem 4.2 in [155].

Based on the above theorem, one can obtain the solution of the matrix equation (6.1) by solving its corresponding real representation matrix equation (6.6) according to the method proposed in Lemma 6.2. However, the dimensions of the matrix equa- tion (6.6) are greater than those of the normal con-Sylvester matrix equation (6.1). This may result in poor numerical stability. In the following theorem, the solution of the normal con-Sylvester matrix equation (6.1) is provided by the original matrices of the equation, instead of its real representation matrices.

Theorem 6.6 Let A ∈ Cn×n, F ∈ Cp×p, and C ∈ Cn×p. (1) If X is a solution to the normal con-Sylvester matrix equation (6.1), then

Xf (FF) AA⎛ ⎞ n k−1 k−1      − −     − − ⎝ j k j 1 j k j 1⎠ = αk(AA) CF FF + αk(AA) AC FF , k=1 j=0 j=0

where ( ) = ( ) = n + α n−1 +···+α + α . fAA s fAA s s n−1s 1s 0

(2) If λ(AA)∩λ(FF) = ∅, then the normal con-Sylvester matrix equation (6.1) has a unique solution ⎛ ⎞ n k−1    k−1      = α ⎝ ( )j k−j−1 + ( )j k−j−1⎠ ( ) −1 . X k AA CF FF AA AC FF fAA FF k=1 j=0 j=0

Proof If the normal con-Sylvester matrix equation (6.1) has a solution X, then the matrix equation (6.6) has a solution Y = Xσ . By Theorem 6.4, one has

n 2k−1   − − ( ) = α j 2k j 1. Xσ fAσ Fσ kAσ Cσ PpFσ k=1 j=0 6.1 Normal Con-Sylvester Matrix Equations 237

From Theorem 2.7, and Lemmas 2.13 and 2.16, it can be derived that   ( ) XfAA FF σ    = ( ) = ( ( )) = ( ) Xσ f FF Pp Xσ f FF σ Pp Xσ fAσ F σ AA σ AA n 2k−1 s 2k−s−1 = αkAσ Cσ Pp(F)σ k=1 s=0⎛ ⎞ n k−1 k ⎝ 2j 2k−2j−1 2j−1 2k−2j⎠ = αk Aσ Cσ Pp(F)σ + Aσ Cσ Pp(F)σ = j= j= k 1 ⎛ 0 1 ⎞ n k−1 k ⎝ 2j 2(k−j−1)+1 2(j−1)+1 2k−2j⎠ = αk Aσ Cσ Pp(F)σ + Aσ Cσ Pp(F)σ = j=0 j=1 k 1 ⎛ n k−1        − − ⎝ j k j 1 = αk (AA) PnCσ Pp(F)σ FF Pp+ σ σ = j=0 k 1 ⎞ k       − j−1 k j ⎠ (AA) PnAσ Cσ Pp FF Pp σ σ j=1 ⎛ ⎞ n k−1  k       − −     − ⎝ j k j 1 j−1 k j ⎠ = αk (AA) CF FF + (AA) AC FF σ σ k=1 j=0 j=1 n k−1       − − j k j 1 = αk (AA) CF FF + σ k=1 j=0 n k−1       − − j k j 1 αk (AA) AC FF . σ k=1 j=0

So the first conclusion holds. With this the second conclusion is obviously true. 

In the following theorem, some equivalent forms are given for the solution in Theorem 6.6.

Theorem 6.7  Given matrices $A\in\mathbb{C}^{n\times n}$, $F\in\mathbb{C}^{p\times p}$, and $C\in\mathbb{C}^{n\times p}$, let
$$f_{A\overline{A}}(s)=\det\left(sI-A\overline{A}\right)=s^n+\alpha_{n-1}s^{n-1}+\cdots+\alpha_1 s+\alpha_0,$$
and
$$\operatorname{adj}\left(sI-A\overline{A}\right)=R_{n-1}s^{n-1}+\cdots+R_1 s+R_0.$$

(1) If $X$ is a solution of the normal con-Sylvester matrix equation (6.1), then
$$Xf_{A\overline{A}}\left(F\overline{F}\right)=\sum_{i=0}^{n-1}R_i C\overline{F}\left(F\overline{F}\right)^i+\sum_{i=0}^{n-1}R_i A\overline{C}\left(F\overline{F}\right)^i.$$
(2) If $\lambda(A\overline{A})\cap\lambda(F\overline{F})=\varnothing$, then the normal con-Sylvester matrix equation (6.1) has a unique solution
$$X=\left(\sum_{i=0}^{n-1}R_i C\overline{F}\left(F\overline{F}\right)^i+\sum_{i=0}^{n-1}R_i A\overline{C}\left(F\overline{F}\right)^i\right)\left(f_{A\overline{A}}\left(F\overline{F}\right)\right)^{-1}.$$

Proof By applying the Leverrier algorithm (6.7) to the matrix AA, it is easily obtained that the coefficients αi, Ri, i ∈ I[0, n − 1], satisfy ⎧     ⎪ = α + α +···+α n−2 + n−1 ⎨⎪ R0 1In 2AA n−1 AA AA ··· . (6.12) ⎪ = α + ⎩⎪ Rn−2 n−1In AA Rn−1 = In

This relation can be compactly expressed as

n  k−i−1 Ri = αk AA ,αn = 1, i ∈ I[1, n − 1]. k=i+1

Thus, one has

n−1   i Ri CF FF = i 0   n−1 n  k−i−1   i = αk AA CF FF i=0 k=i+1 n k−1  k−i−1   i = αk AA CF FF k=1 i=0 n k−1  i   k−i−1 = αk AA CF FF , k=1 i=0 6.1 Normal Con-Sylvester Matrix Equations 239 and

n−1   i Ri AC FF = i 0   n−1 n  k−i−1   i = αk AA AC FF i=0 k=i+1 n k−1  k−i−1   i = αk AA AC FF k=1 i=0 n k−1  i   k−i−1 = αk AA AC FF . k=1 i=0

Combining this with Theorem 6.6, gives the assertion of this theorem. 

Theorem 6.8  Let $A\in\mathbb{C}^{n\times n}$, $F\in\mathbb{C}^{p\times p}$, and $C\in\mathbb{C}^{n\times p}$.

(1) If $X$ is a solution of the normal con-Sylvester matrix equation (6.1), then
$$Xf_{A\overline{A}}\left(F\overline{F}\right)=\operatorname{Ctr}(A\overline{A},C)\,S_p(A\overline{A})\operatorname{Obs}_n\left(F\overline{F},\overline{F}\right)+\operatorname{Ctr}(A\overline{A},A)\,S_n(A\overline{A})\operatorname{Obs}_n\left(F\overline{F},\overline{C}\right).$$
(2) If $\lambda(A\overline{A})\cap\lambda(F\overline{F})=\varnothing$, then the normal con-Sylvester matrix equation (6.1) has a unique solution
$$X=\left[\operatorname{Ctr}(A\overline{A},C)\,S_p(A\overline{A})\operatorname{Obs}_n\left(F\overline{F},\overline{F}\right)+\operatorname{Ctr}(A\overline{A},A)\,S_n(A\overline{A})\operatorname{Obs}_n\left(F\overline{F},\overline{C}\right)\right]\left(f_{A\overline{A}}\left(F\overline{F}\right)\right)^{-1}.$$

Proof By applying the Leverrier algorithm (6.7) to the matrix AA, for the coefficients αi, and coefficient matrices Ri, i ∈ I[0, n − 1], in Theorem 6.7 the relation (6.12) holds. With this relation, one has

n−1   i Ri CF FF i=0  = R CRC ··· R − C Obs (FF, F) 0 1 n 1 ! n  n−1 = CAAC ··· AA C Sp(AA) Obsn(FF, F)

= Ctr(AA, C)Sp(AA) Obsn(FF, F); 240 6 Real-Representation-Based Approaches and

n−1   i Ri AC FF i=0  = R ARA ··· R − A Obs (FF, C) 0 1 n 1 ! n  n−1 = AAAA ··· AA A Sn(AA) Obsn(FF, C)

= Ctr(AA, A)Sn(AA) Obsn(FF, C).

With the preceding two relations, the conclusion of this theorem can be readily obtained from Theorem 6.7. 

Example 6.1  Consider a normal con-Sylvester matrix equation in the form of (6.1) with the following parameters:
$$A=\begin{bmatrix}1 & -2-\mathrm{i} & -1+\mathrm{i}\\ 0 & \mathrm{i} & 0\\ 1 & -1+\mathrm{i} & 0\end{bmatrix},\quad F=\begin{bmatrix}-1+\mathrm{i} & 1\\ -1 & 1-\mathrm{i}\end{bmatrix},\quad C=\begin{bmatrix}2\mathrm{i} & \mathrm{i}\\ 0 & \mathrm{i}\\ -\mathrm{i} & 1-2\mathrm{i}\end{bmatrix}.$$
For this matrix equation, $n=3$, $p=2$. It is easily checked that $\lambda(A\overline{A})\cap\lambda(F\overline{F})=\varnothing$, so this equation has a unique solution. By calculation, it can be obtained that
$$f_{A\overline{A}}(s)=s^3-4s^2+5s-2,\qquad S_1(A\overline{A})=\begin{bmatrix}5 & -4 & 1\\ -4 & 1 & 0\\ 1 & 0 & 0\end{bmatrix}.$$
With them, one has
$$\operatorname{Ctr}(A\overline{A},C)\,S_p(A\overline{A})\operatorname{Obs}_n\left(F\overline{F},\overline{F}\right)=\begin{bmatrix}-12+14\mathrm{i} & 53+3\mathrm{i}\\ -9+\mathrm{i} & -5+11\mathrm{i}\\ 21-9\mathrm{i} & -3-7\mathrm{i}\end{bmatrix},$$
$$\operatorname{Ctr}(A\overline{A},A)\,S_n(A\overline{A})\operatorname{Obs}_n\left(F\overline{F},\overline{C}\right)=\begin{bmatrix}16 & -25-23\mathrm{i}\\ -3-3\mathrm{i} & -5-3\mathrm{i}\\ -1-13\mathrm{i} & 13-9\mathrm{i}\end{bmatrix},$$
and
$$f_{A\overline{A}}\left(F\overline{F}\right)=\begin{bmatrix}-12+4\mathrm{i} & 12-4\mathrm{i}\\ -4-4\mathrm{i} & -20-4\mathrm{i}\end{bmatrix}.$$
Thus it follows from Theorem 6.8 that the unique solution of the equation is
$$X=\begin{bmatrix}\frac{21}{40}-\frac{33}{40}\mathrm{i} & -\frac{9}{8}+\frac{5}{8}\mathrm{i}\\[3pt] \frac{1}{2}+\frac{1}{4}\mathrm{i} & \frac{3}{4}-\frac{1}{2}\mathrm{i}\\[3pt] -\frac{6}{5}+\frac{21}{20}\mathrm{i} & -\frac{13}{20}+\frac{9}{5}\mathrm{i}\end{bmatrix}.$$
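The formula of Theorem 6.7 (equivalent to the expression used in this example) is also easy to evaluate numerically. The sketch below is illustrative only: the function name solve_con_sylvester is mine, the coefficients of $f_{A\overline{A}}$ are obtained here with NumPy's characteristic-polynomial routine rather than the Leverrier algorithm, and the closing assertion simply verifies the residual of (6.1) on random data; the same routine can be applied to the data of this example.

```python
import numpy as np

def solve_con_sylvester(A, F, C):
    """Unique solution of X F - A conj(X) = C via the formula of Theorem 6.7,
    assuming lambda(A conj(A)) and lambda(F conj(F)) are disjoint."""
    n = A.shape[0]
    M = A @ np.conj(A)                             # A Abar
    G = F @ np.conj(F)                             # F Fbar
    alpha = np.poly(M)[::-1]                       # alpha[k] multiplies s^k, alpha[n] = 1
    Mpow = [np.linalg.matrix_power(M, j) for j in range(n)]
    Gpow = [np.linalg.matrix_power(G, j) for j in range(n + 1)]
    # R_i = sum_{k=i+1}^{n} alpha_k (A Abar)^(k-i-1): coefficients of adj(sI - A Abar)
    R = [sum(alpha[k] * Mpow[k - i - 1] for k in range(i + 1, n + 1)) for i in range(n)]
    fG = sum(alpha[k] * Gpow[k] for k in range(n + 1))          # f_{A Abar}(F Fbar)
    S = sum(R[i] @ (C @ np.conj(F) + A @ np.conj(C)) @ Gpow[i] for i in range(n))
    return S @ np.linalg.inv(fG)

# Residual check of (6.1) on random data:
rng = np.random.default_rng(2)
A = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
F = rng.standard_normal((2, 2)) + 1j * rng.standard_normal((2, 2))
C = rng.standard_normal((3, 2)) + 1j * rng.standard_normal((3, 2))
X = solve_con_sylvester(A, F, C)
assert np.allclose(X @ F - A @ np.conj(X), C)
```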

6.2 Con-Kalman-Yakubovich Matrix Equations

In this section, the following con-Kalman-Yakubovich matrix equation is considered by real representation methods

$$X-A\overline{X}F=C, \tag{6.13}$$
where $A\in\mathbb{C}^{n\times n}$, $F\in\mathbb{C}^{p\times p}$, and $C\in\mathbb{C}^{n\times p}$ are known matrices, and $X$ is the matrix to be determined. This equation is a conjugate version of the following Kalman-Yakubovich matrix equation:
$$X-AXF=C, \tag{6.14}$$
where $A\in\mathbb{C}^{n\times n}$, $F\in\mathbb{C}^{p\times p}$, and $C\in\mathbb{C}^{n\times p}$ are known matrices, and $X$ is the unknown matrix. Regarding the uniqueness of the solution to the Kalman-Yakubovich matrix equation, the following result is well-known.

Lemma 6.3  Given matrices $A\in\mathbb{C}^{n\times n}$, $F\in\mathbb{C}^{p\times p}$, and $C\in\mathbb{C}^{n\times p}$, the Kalman-Yakubovich matrix equation (6.14) has a unique solution if and only if $\gamma\mu\neq 1$ for any $\gamma\in\lambda(A)$, $\mu\in\lambda(F)$, or equivalently
$$\det g_A(F)\neq 0,\quad\text{where } g_A(s)=\det(I-sA).$$

6.2.1 Solvability Conditions

This subsection is devoted to the solvability conditions of the con-Kalman- Yakubovich matrix equation (6.13). Similarly to the previous section, the follow- ing two types of matrices are used

Ij 0 0 Ij Pj = , Qj = , 0 −Ij −Ij 0 which are defined in Sect.2.6. Now, let us first define the following real representation matrix equation for (6.13) Y − Aσ YFσ = Cσ , (6.15) 242 6 Real-Representation-Based Approaches where Y is the unknown matrix. By Lemma 2.13, one has   − = − X AXF σ Xσ Aσ Xσ Fσ .

Thus the matrix equation (6.13) is converted into

Xσ − Aσ Xσ Fσ = Cσ .

With the above simple derivation, the following conclusion is easily obtained. Proposition 6.2 Given matrices A ∈ Cn×n, F ∈ Cp×p, and C ∈ Cn×p, the con- Kalman-Yakubovich matrix equation (6.13) has a solution X if and only if its real representation matrix equation (6.15) has a real matrix solution Y = Xσ . For the real representation equation (6.15), the following result can be easily derived.

n×n p×p n×p Lemma 6.4 Given matrices A ∈ C , F ∈ C , and C ∈ C ,ifY = Y0 is a real solution to the real representation matrix equation(6.15), then QnY0Qp is also a real solution of the matrix equation (6.15). Proof According to Item (5) of Lemma 2.13, the matrix equation (6.15)is equivalent to Y − QnAσ QnYQpFσ Qp = QnCσ Qp.

−1 =− Note Qj Qj. Then, pre- and post-multiplying the two sides of the preceding −1 −1 equation by Qn and Qp , respectively, we have

QnYQp − Aσ QnYQpFσ = Cσ .

This shows that if Y is a real solution of the matrix equation (6.15), then QnYQp is also a real solution of the matrix equation (6.15).  With the above lemma as a tool, on the uniqueness of the solution to the normal con-Kalman-Yakubovich matrix equation (6.13) one has the following conclusion. Theorem 6.9 Given matrices A ∈ Cn×n, F ∈ Cp×p, and C ∈ Cn×p, the normal con-Kalman-Yakubovich matrix equation (6.13) has a unique solution if and only if its real representation matrix equation (6.15) has a unique solution. The proof of Theorem 6.9 can be given by using Lemma 6.4 along the same line of the proof of Theorem 6.2, and thus is omitted. By combining Theorem 6.9, Lemmas 6.3 and 2.16, the following interesting result can be derived. Theorem 6.10 Given matrices A ∈ Cn×n and F ∈ Cp×p, the con-Kalman- × Yakubovich matrix equation (6.13 ) has a unique  solution for any C ∈ Cn p if and only if γμ= 1, for any γ ∈ λ AA , μ ∈ λ FF . 6.2 Con-Kalman-Yakubovich Matrix Equations 243

Proof According to Theorem 6.9, the con-Kalman-Yakubovich matrix equation (6.13) has a unique solution if and only if the matrix equation (6.15) has a unique solution. Thus, γμ = 1, for any γ ∈ λ (Aσ ), μ ∈ λ (Fσ ). By using Item (3) of Lemma 2.16, it can be obtained that

γμ= 1, for any γ ∈ λ (Aσ ) ,μ∈ λ (Fσ )   ⇐⇒ ( ) = det gAσ Fσ 0 ⇐⇒ ( ( )) = det gAA FF σ det Pn 0 ⇐⇒ ( ) = . det gAA FF 0

With this, the conclusion is immediately obtained. 

Remark 6.7 The conclusions in Proposition 6.2 have been given in [155]. However, the results in Theorems 6.9 and 6.10 are new.

6.2.2 Solutions

In this subsection, the aim is to give explicit solutions for the con-Kalman-Yakubovich matrix equation (6.13). First, a result on Kalman-Yakubovich matrix equation is needed.

Lemma 6.5 Given matrices A ∈ Cn×n, F ∈ Cp×p, and C ∈ Cn×p,let

n gA(s) = det(I − sA) = αns +···+α1s + α0,α0 = 1, (6.16) and n−1 adj(I − sA) = Rn−1s +···+R1s + R0. (6.17)

Then (1) If X is a solution of the Kalman-Yakubovich matrix equation (6.14), then

n−1 i XgA(F) = RiCF . i=0

(2) If X is the unique solution of the Kalman-Yakubovich matrix equation (6.14), then   n−1   i −1 X = RiCF gA(F) . i=0 244 6 Real-Representation-Based Approaches

Remark 6.8 The Kalman-Yakubovich matrix equation (6.14) was investigated in [155, 280, 305], and the result in Lemma 6.5 was also given in [208, 305]. In [208], the result was obtained by using the so-called Kronecker maps as tools.

The result in Lemma 6.5 provides a very neat explicit solution to the matrix equation (6.14). To obtain this solution, one needs the coefficients $\alpha_i$, $i\in\mathbf{I}[1,n]$, of the characteristic polynomial of the matrix pair $(-A,-I)$, and the coefficient matrices $R_i$, $i\in\mathbf{I}[0,n-1]$, of the adjoint matrix $\operatorname{adj}(I-sA)$. They can be obtained by the following so-called generalized Leverrier algorithm given in Sect. 2.3:
$$\begin{cases}R_i=R_{i-1}A+\alpha_i I, & R_0=I,\\[2pt] \alpha_i=-\dfrac{1}{i}\operatorname{tr}\left(R_{i-1}A\right), & \alpha_0=1,\end{cases}\qquad i\in\mathbf{I}[1,n]. \tag{6.18}$$
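As with the Leverrier recursion (6.7), the generalized recursion (6.18) is a few lines of code. The following sketch is illustrative only (the function name and the test matrix are mine); the closing check uses the fact that, for a 2-by-2 matrix, $g_A(s)=1-\operatorname{tr}(A)\,s+\det(A)\,s^2$.

```python
import numpy as np

def generalized_leverrier(A):
    """Generalized Leverrier algorithm (6.18): returns alpha_0..alpha_n with
    g_A(s) = det(I - sA) = alpha_n s^n + ... + alpha_1 s + 1 (alpha_0 = 1),
    and R_0..R_{n-1} with adj(I - sA) = R_{n-1} s^{n-1} + ... + R_0."""
    n = A.shape[0]
    R = [np.eye(n, dtype=A.dtype)]
    alpha = [1.0]
    for i in range(1, n + 1):
        alpha.append(-np.trace(R[i - 1] @ A) / i)
        if i < n:
            R.append(R[i - 1] @ A + alpha[i] * np.eye(n, dtype=A.dtype))
    return alpha, R

A = np.array([[0.0, 1.0], [-2.0, 3.0]])            # tr(A) = 3, det(A) = 2
alpha, R = generalized_leverrier(A)
assert np.allclose(alpha, [1.0, -3.0, 2.0])        # g_A(s) = 1 - 3s + 2s^2
```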

In the following, an equivalent statement of the above theorem is given.

Lemma 6.6 Given matrices A ∈ Cn×n, F ∈ Cp×p, and C ∈ Cn×p,let

n gA(s) = det(I − sA) = αns +···+α1s + α0,α0 = 1.

Then (1) If X is a solution of the matrix equation (6.14), then

n−1 n−1 j−k j XgA(F) = αkA CF . (6.19) k=0 j=k

(2) If X is the unique solution of the matrix equation (6.14), then ⎛ ⎞ n−1 n−1   ⎝ j−k j⎠ −1 X = αkA CF gA(F) . k=0 j=k

Proof From the generalized Leverrier algorithm (6.18), it is easily obtained that ⎧ ⎪ R0 = I ⎪ ⎨ R1 = α1I + A 2 R2 = α2I + α1A + A . ⎪ ⎪ ··· ⎩ n−2 n−1 Rn−1 = αn−1I + αn−2A +···+α1A + A

This relation can be compactly expressed by

j j−k Rj = αkA ,α0 = 1. k=0 6.2 Con-Kalman-Yakubovich Matrix Equations 245

Thus one has   n−1 n−1 j j j−k j RjCF = αkA CF j=0 j=0 k=0 n−1 n−1 j−k j = αkA CF . k=0 j=k

Combining this with Lemma 6.5, gives the conclusion of this lemma.  Remark 6.9 For a matrix A ∈ Cn×n, if

n i det(sI − A) = αis , i=0 then n n−i det(I − sA) = αis . i=0

Based on this fact, it is easily found that the expression (6.19) is exactly the first expression in Theorem 2.5 of [155]. With the previous preliminary results, now one can give results on the solutions of the con-Kalman-Yakubovich matrix equations. Theorem 6.11 Given matrices A ∈ Cn×n, F ∈ Cp×p, and C ∈ Cn×p, the matrix equation (6.13) has a solution X if and only if the matrix equation (6.15) has a solution Y ∈ R2n×2p. In this case, if Y is a solution to the matrix equation (6.15), then the following matrix

  1 Ip X = In iIn (Y + QnYQp) (6.20) 4 iIp is a solution to the matrix equation (6.13). Proof Let Y be a solution to the real representation matrix equation (6.15). Then, QnYQp is also a real solution of the matrix equation (6.15) by Lemma 6.4. Thus the conclusion can be obtained by following the proof of Theorem 6.5 in the previous section.  Remark 6.10 The conclusions in the above theorem have been given in [155]. Based on the above results, the solution of the matrix equation (6.13) can be obtained by solving its corresponding real representation matrix equation (6.15) according to the method proposed in Lemma 6.5. However, the dimensions of the matrix equation (6.15) are greater than those of the con-Kalman-Yakubovich matrix 246 6 Real-Representation-Based Approaches equation (6.13). This may result in poor numerical stability. In the following theorem, a solution is established by the original coefficient matrices of the matrix equation (6.13), instead of the real representation matrices.

Theorem 6.12 Let A ∈ Cn×n, F ∈ Cp×p, and C ∈ Cn×p. Then, (1) If X is a solution to the con-Kalman-Yakubovich matrix equation (6.13), then

Xg (FF) AA ⎛ ⎞ n−1 n−1   n−1    ⎝ j−k j j−k j⎠ = αk (AA) C FF + (AA) ACF FF , k=0 j=k j=k

where

( ) = ( − ) = α n + α n−1 +···+α + α . gAA s det I sA ns n−1s 1s 0

(2) If the con-Kalman-Yakubovich matrix equation (6.13) has a unique solution X, then ⎛ ⎞ n−1 n−1 n−1          − = α ⎝ ( )j−k j + ( )j−k j⎠ ( ) 1 X k AA C FF AA ACF FF gAA FF . k=0 j=k j=k

Proof If the con-Kalman-Yakubovich matrix equation (6.13) has a solution X, then the equation (6.15) has a solution Y = Xσ . By Lemma 6.6, it can be obtained that

n−1 2n−1 ( ) = α s−2k s . Xσ gAσ Fσ kAσ Cσ Fσ k=0 s=2k

From Theorem 2.7, and Lemmas 2.13 and 2.17, one has   Xg (FF) AA σ = ( ) = ( ( )) = (( ) ) Xσ g FF Pp Xσ g FF σ Pp Xσ gAσ F σ AA σ AA n−1 2n−1 s−2k s = αkAσ Cσ Fσ k=0 s=2⎛k ⎞ n−1 n−1 n−1 ⎝ 2j−2k 2j 2j+1−2k 2j+1⎠ = αk Aσ Cσ (F)σ + Aσ Cσ Fσ = = = k 0 ⎛ j k j k n−1 n−1     ⎝ j−k j = αk (AA) PnCσ FF Pp+ σ σ k=0 j=k 6.2 Con-Kalman-Yakubovich Matrix Equations 247 ⎞ n−1     j−k j ⎠ (AA) PnAσ Cσ Fσ FF Pp σ σ = j k ⎛

n−1 n−1     = α ⎝ ( )j−k j + k AA σ Pn C FF = = σ k 0 j k ⎞ n−1       j−k j ⎠ (AA) Pn ACF FF Pp σ σ σ = j k ⎛ ⎞ n−1 n−1    n−1     ⎝ j−k j j−k j ⎠ = αk (AA) C FF + (AA) ACF FF . σ σ k=0 j=k j=k

So the first conclusion holds. With this the second conclusion is obviously true.  Remark 6.11 Different from the result in [155], the solution given in Theorem 6.12 can be obtained from the original matrices, instead of the real representation of matrices. In the following, some equivalent statements are presented for the solutions to the con-Kalman-Yakubovich matrix equation (6.13) given in Theorem 6.12. Theorem 6.13 Given matrices A ∈ Cn×n, F ∈ Cp×p, and C ∈ Cn×p,let

( ) = ( − ) = α n + α n−1 +···+α + α α = gAA s det I sAA ns n−1s 1s 0, 0 1 and n−1 adj(I − sAA) = Rn−1s +···+R1s + R0.

Then (1) If X is a solution to the matrix equation (6.13), then

n−1   n−1    ( ) = i + i . XgAA FF RiC FF Ri ACF FF i=0 i=0     (2) If γμ = 1, for any γ ∈ λ AA , μ ∈ λ FF , then the matrix equation (6.13) has a unique solution   n−1   n−1      = i + i ( ) −1 . X RiC FF Ri ACF FF gAA FF i=0 i=0

Proof This conclusion can be proven from Theorem 6.12 by applying the generalized Leverrier algorithm (6.18) to the matrix AA. The proof procedure is very similar to that of Lemma 6.6.  248 6 Real-Representation-Based Approaches

Theorem 6.14 Let A ∈ Cn×n, F ∈ Cp×p, and C ∈ Cn×p. Then

(1) If X is a solution to the matrix equation (6.13), then

( ) XgAA FF = Ctr(AA, C)Tp(AA) Obsn(FF, Ip) +

Ctr(AA, AC)Tp(AA) Obsn(FF, F).     (2) If γμ = 1, for any γ ∈ λ AA , μ ∈ λ FF , then the matrix equation (6.13) has a unique solution  X = Ctr(AA, C)Tp(AA) Obsn(FF, Ip)+   ( , ) ( ) ( , ) ( ) −1 . Ctr AA AC Tp AA Obsn FF F gAA FF

Proof By applying the generalized Leverrier algorithm (6.18) to the matrix AA,for the coefficients αi, and coefficient matrices Ri, i ∈ I[0, n − 1], in Theorem 6.13 one has ⎧ ⎪ R0 = I ⎪ ⎨⎪ R1 = α1I + AA  2 R = α I + α AA + AA . ⎪ 2 2 1 ⎪ ··· ⎩⎪  n−2  n−1 Rn−1 = αn−1I + αn−2AA +···+α1 AA + AA

With this relation, one has

n−1  i RiC FF i=0  = R CRC ··· R − C Obs (FF, I ) 0 1 n 1 ! n p  n−1 = CAAC ··· AA C Tp(AA) Obsn(FF, Ip)

= Ctr(AA, C)Tp(AA) Obsn(FF, Ip); and

n−1   i Ri ACF FF i=0  = R ACRAC ··· R − AC Obs (FF, F) 0 1 n 1 ! n  n−1 = AC AAAC ··· AA AC Tp(AA) Obsn(FF, F)

= Ctr(AA, AC)Tp(AA) Obsn(FF, F). 6.2 Con-Kalman-Yakubovich Matrix Equations 249

With the preceding two relations, the conclusion of this theorem can be readily obtained from Theorem 6.13. 

Example 6.2 Consider a con-Kalman-Yakubovich matrix equation in the form of (6.13) with the following parameters: ⎡ ⎤ ⎡ ⎤ 1 −2 − i −1 + i −1 + i1 2i i A = ⎣ 0i 0⎦ , F = , C = ⎣ 0i⎦ . 1 −1 + i 0 −11− i −i1− 2i   = , = . γμ= γ ∈ λ For this equation, n 3 p 2 It is easily checked that 1,forany AA , μ ∈ λ FF , so this matrix equation has a unique solution. By calculation, one has

$$g_{A\overline{A}}(s)=-2s^{3}+5s^{2}-4s+1.$$

With this, it can be obtained that
$$T_1(A\overline{A})=\begin{bmatrix}1&-4&5\\0&1&-4\\0&0&1\end{bmatrix},$$

$$\mathrm{Ctr}(A\overline{A},C)\,T_p(A\overline{A})\,\mathrm{Obs}_n(\overline{F}F,I_p)=\begin{bmatrix}-123+53i&-68+66i\\-9-9i&-9-7i\\-15+13i&-3+7i\end{bmatrix},$$
and

$$\mathrm{Ctr}(A\overline{A},A\overline{C})\,T_p(A\overline{A})\,\mathrm{Obs}_n(\overline{F}F,F)=\begin{bmatrix}95-203i&164-34i\\11-27i&25-7i\\13-49i&19-9i\end{bmatrix}.$$

Thus it follows from Theorem 6.14 that the unique solution of this matrix equation is
$$X=\begin{bmatrix}-\tfrac{877}{328}-\tfrac{745}{328}i&\tfrac{229}{328}-\tfrac{907}{328}i\\[1mm]-\tfrac14-\tfrac12 i&\tfrac12-\tfrac34 i\\[1mm]-\tfrac{69}{164}-\tfrac{23}{41}i&\tfrac{13}{41}-\tfrac{119}{164}i\end{bmatrix}.$$
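The result above can be cross-checked numerically. The following is a minimal sketch (an assumption on our part: it takes the con-Kalman-Yakubovich equation (6.13) to read $X-A\overline{X}F=C$, consistently with the derivation used in this section); it exploits the fact that $X\mapsto X-A\overline{X}F$ is real-linear, builds the corresponding real linear system, and solves it, so that the computed matrix can be compared with the closed-form X above.

```python
import numpy as np

# data of Example 6.2
A = np.array([[1, -2-1j, -1+1j], [0, 1j, 0], [0, -1, 1-1j]])
F = np.array([[2j, 1j], [1, -1+1j]])
C = np.array([[-1+1j, 1], [0, 1j], [-1j, 1-2j]])

n, p = C.shape
N = n * p

def op(X):
    # the real-linear map X -> X - A * conj(X) * F
    return X - A @ X.conj() @ F

# build the 2N x 2N real matrix of the operator, column by column
M = np.zeros((2 * N, 2 * N))
for k in range(2 * N):
    e = np.zeros(2 * N); e[k] = 1.0
    Xk = e[:N].reshape(n, p) + 1j * e[N:].reshape(n, p)
    R = op(Xk)
    M[:, k] = np.concatenate([R.real.ravel(), R.imag.ravel()])

rhs = np.concatenate([C.real.ravel(), C.imag.ravel()])
x = np.linalg.solve(M, rhs)
X = x[:N].reshape(n, p) + 1j * x[N:].reshape(n, p)

print(X)                                              # compare with the closed-form X above
print(np.linalg.norm(X - A @ X.conj() @ F - C))       # residual, ~0 by construction
```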

6.3 Con-Sylvester Matrix Equations

In this section, the following con-Sylvester matrix equation is investigated

$$A\overline{X}+B\overline{Y}=XF,\qquad(6.21)$$
where A ∈ C^{n×n}, B ∈ C^{n×r}, and F ∈ C^{p×p} are given matrices, and X ∈ C^{n×p} and Y ∈ C^{r×p} are the matrices to be determined. Similarly to the cases in the preceding two sections, the real representation of complex matrices in Sect. 2.6 is also used as a tool in this section. Before proceeding, some results on the following Sylvester matrix equation are first given:

AX + BY = XF, (6.22) where A ∈ Cn×n, B ∈ Cn×r, and F ∈ Cp×p are given matrices, and X ∈ Cn×p and Y ∈ Cr×p are the matrices to be determined.

Lemma 6.7 Given matrices A ∈ Cn×n,B∈ Cn×r, and F ∈ Cp×p,let

n n−1 fA(s) = det(sI − A) = s + αn−1s +···+α1s + α0, (6.23) and n−1 adj(sI − A) = Rn−1s +···+R1s + R0. (6.24)

Then

(1) The matrices X ∈ Cn×p and Y ∈ Cr×p given by ⎧ − ⎨⎪ n 1 X = R BZFi i (6.25) ⎩⎪ i=0 Y = ZfA(F)

satisfy the matrix equation (6.22) for an arbitrary matrix Z ∈ Cr×p. (2) When the matrices A and F do not have common eigenvalues, all the matrices X ∈ Cn×p and Y ∈ Cr×p satisfying (6.22) can be parameterized as (6.25) with Z being an arbitrary matrix which represents the degrees of the freedom in the solution.

Remark 6.12 The result in Lemma 6.7 is taken from [300].

Remark 6.13 Lemma 6.7 provides a very neat closed-form solution to the matrix equation (6.22). To obtain this solution, one needs the coefficients $\alpha_i$, $i\in I[0,n-1]$, of the characteristic polynomial of the matrix A and the coefficient matrices $R_i$, $i\in I[0,n-1]$, of the adjoint matrix $\operatorname{adj}(sI-A)$. They can be obtained using the well-known Leverrier algorithm (6.7), which is given in Sect. 2.2.
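For concreteness, the sketch below implements the classical Faddeev-LeVerrier recursion for $f_A(s)=\det(sI-A)$ and $\operatorname{adj}(sI-A)$; the indexing differs slightly from the statement of (6.7) but produces the same coefficients (the self-check at the end compares against NumPy's characteristic polynomial and the identity $\operatorname{adj}(sI-A)(sI-A)=f_A(s)I$). It is offered only as an illustration, not as the book's own listing.

```python
import numpy as np

def leverrier(A):
    """Return (c, M) with det(sI - A) = s^n + c[n-1] s^(n-1) + ... + c[0]
    and adj(sI - A) = M[n-1] s^(n-1) + ... + M[1] s + M[0]."""
    n = A.shape[0]
    c = np.zeros(n + 1, dtype=complex); c[n] = 1.0
    M = [None] * n
    Mk = np.eye(n, dtype=complex)              # leading coefficient of the adjoint
    for k in range(1, n + 1):
        M[n - k] = Mk                           # coefficient of s^(n-k)
        c[n - k] = -np.trace(A @ Mk) / k
        Mk = A @ Mk + c[n - k] * np.eye(n)
    return c, M

# quick self-check on a random matrix
rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))
c, M = leverrier(A)
print(np.allclose(c[:-1], np.poly(A)[1:][::-1]))        # characteristic polynomial
s = 0.7
adj = sum(M[i] * s**i for i in range(4))
det = np.polyval(c[::-1], s)
print(np.allclose(adj @ (s * np.eye(4) - A), det * np.eye(4)))
```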

By using the Leverrier algorithm (6.7), the following lemma provides an equiva- lent form of the solution in Lemma 6.7. Lemma 6.8 Given matrices A ∈ Cn×n,B∈ Cn×r, and F ∈ Cp×p,let

n n−1 fA(s) = det(sI − A) = αns + αn−1s +···+α1s + α0,αn = 1.

Then the matrices X and Y given by (6.25) have the following equivalent form: ⎧ − ⎨⎪ n j 1 = α j−i−1 i X jA BZF , ⎩⎪ j=1 i=0 Y = ZfA(F) where Z ∈ Cr×p is a parameter matrix which can be arbitrarily chosen.

Proof By the Leverrier algorithm (6.7), for the coefficients αi, i ∈, I[0, n − 1],in (6.23), and the the coefficient matrices Ri, i ∈ I[0, n − 1],in(6.24), the following relations can be easily obtained ⎧ ⎪ = α + α +···+α n−2 + n−1 ⎨⎪ R0 1In 2A n−1A A ··· . ⎪ = α + (6.26) ⎩⎪ Rn−2 n−1In A Rn−1 = In

The relations in (6.26) can be compactly expressed as

n j−i−1 Ri = αjA ,αn = 1. j=i+1

Substituting this into the expression of X in (6.25) and changing the order of sum- mation, it can be derived that

n−1 i X = RiBZF i=0 ⎛ ⎞ n−1 n ⎝ j−i−1⎠ i = αjA BZF i=0 j=i+1 n j−1 j−i−1 i = αjA BZF . j=1 i=0

The proof is completed. □

When the results in Lemmas 6.7 and 6.8 are applied to solve the Sylvester matrix equation (6.22), one only needs the coefficients $\alpha_i$, $i\in I[0,n-1]$, of the characteristic polynomial of the matrix A. Besides the Leverrier algorithm, they can be obtained using proper numerically reliable algorithms (e.g., [189]).

Now, the con-Sylvester matrix equation (6.21) is investigated. Similarly to the cases in the previous sections, the matrices $P_j$ and $Q_j$ defined in Sect. 2.6 are used. That is,
$$P_j=\begin{bmatrix}I_j&0\\0&-I_j\end{bmatrix},\qquad Q_j=\begin{bmatrix}0&I_j\\-I_j&0\end{bmatrix}.$$

The following real representation matrix equation is first defined for the con-Sylvester matrix equation (6.21) ˜ ˜ ˜ Aσ X + Bσ Y = XFσ , (6.27) where X˜ and Y˜ are unknown matrices. According to Item (2) of Lemma 2.13, it can be derived that   + = + , AX BY σ Aσ Xσ Pp Bσ Yσ Pp and   ( ) = . XF σ Xσ F σ Pp

2 = , Note that Pp I so the matrix equation (6.21) is equivalent to

Aσ Xσ + Bσ Yσ = Xσ Fσ .

Based on the preceding statement, the following proposition can be obtained.

Proposition 6.3 Given matrices A ∈ C^{n×n}, F ∈ C^{p×p}, and B ∈ C^{n×r}, the con-Sylvester matrix equation (6.21) has a solution (X, Y) if and only if its real representation matrix equation (6.27) has a real matrix solution $(\widetilde X,\widetilde Y)=(X_\sigma,Y_\sigma)$.

Theorem 6.15 Given matrices A ∈ C^{n×n}, F ∈ C^{p×p}, and B ∈ C^{n×r}, the con-Sylvester matrix equation (6.21) has a solution (X, Y) if and only if the matrix equation (6.27) has a solution $(\widetilde X,\widetilde Y)\in\mathbb{R}^{2n\times2p}\times\mathbb{R}^{2r\times2p}$. In this case, if $(\widetilde X,\widetilde Y)$ is a solution to the matrix equation (6.27), then the matrix pair (X, Y) given by
$$\begin{cases}X=\dfrac12\begin{bmatrix}I_n&iI_n\end{bmatrix}\bigl(\widetilde X+Q_n\widetilde XQ_p\bigr)\begin{bmatrix}I_p\\ iI_p\end{bmatrix}\\[2mm]
Y=\dfrac12\begin{bmatrix}I_r&iI_r\end{bmatrix}\bigl(\widetilde Y+Q_r\widetilde YQ_p\bigr)\begin{bmatrix}I_p\\ iI_p\end{bmatrix}\end{cases}\qquad(6.28)$$
is a solution to the con-Sylvester matrix equation (6.21).

Proof By Proposition 6.3, if the con-Sylvester matrix equation (6.21) has a solution ˜ ˜ (X, Y), then the real representation equation (6.27) has a solution (X, Y) = (Xσ , Yσ ). On the other hand, according to Item (5) of Lemma 2.13, the equation (6.27)is equivalent to ˜ ˜ ˜ QnAσ QnX + QnBσ QrY = XQpFσ Qp. 6.3 Con-Sylvester Matrix Equations 253

−1 =− Note Qj Qj. Then, one has

˜ ˜ ˜ Aσ QnXQp + Bσ QrYQp = QnXQpFσ .

˜ ˜ ˜ This shows that if (X, Y) is a solution to the matrix equation (6.27), then (QnXQp, ˜ QrYQp) is also a solution to the matrix equation (6.27). Due to the homogeneity ˜ ˜ ˜ ˜ of the matrix equation (6.27), (X + QnXQp, Y + QrYQp) is also a solution to the equation (6.27). Now, the solution matrices X˜ and Y˜ are denoted as

X X × X˜ = 11 12 , X ∈ Rn p, i, j = 1, 2, X X ij 21 22 ˜ Y11 Y12 r×p Y = , Yij ∈ R , i, j = 1, 2. Y21 Y22

Then, one has

X − X X + X X˜ + Q XQ˜ = 11 22 12 21 , n p X + X − (X − X ) 12 21 11 22 − + ˜ ˜ Y11 Y22 Y12 Y21 Y + QrYQp = . Y12 + Y21 − (Y11 − Y22)

˜ ˜ ˜ ˜ Clearly, the matrices X + QnXQp and Y + QrYQp have the structure of real repre- sentations. In fact, there hold ˜ ˜ X + QnXQp = ((X11 − X22) + i (X12 + X21))σ , ˜ ˜ Y + QrYQp = ((Y11 − Y22) + i (Y12 + Y21))σ .

With these two relations, it follows from Proposition 6.3 that the following matrices X and Y satisfy the con-Sylvester matrix equation (6.21):  X = (X − X ) + i (X + X ) 11 22 12 21 . (6.29) Y = (Y11 − Y22) + i (Y12 + Y21)

With the previous two aspects, it is known that the first part of this theorem is true. In addition, by simple calculations it is easily found that the matrices X and Y in (6.29) can be expressed by (6.28). Thus, the second part of this theorem is true. The proof is thus completed. 
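As a small numerical illustration of the extraction formula (6.28), the sketch below checks the two identities that underlie it, assuming (this is our assumption, suggested by the map ψ introduced later in this section) that the real representation has the block form $M_\sigma=\left[\begin{smallmatrix}\operatorname{Re}M&\operatorname{Im}M\\\operatorname{Im}M&-\operatorname{Re}M\end{smallmatrix}\right]$: namely $Q_nM_\sigma Q_p=M_\sigma$ and $\tfrac12[I_n\;\,iI_n]\,M_\sigma\,[I_p;\,iI_p]=M$. Applied to a structured solution $\widetilde X=X_\sigma$, formula (6.28) therefore returns 2X, which still solves the homogeneous equation (6.21).

```python
import numpy as np

def Qm(j):
    return np.block([[np.zeros((j, j)), np.eye(j)], [-np.eye(j), np.zeros((j, j))]])

def sigma(M):
    # assumed block form of the real representation of Sect. 2.6
    return np.block([[M.real, M.imag], [M.imag, -M.real]])

rng = np.random.default_rng(1)
n, p = 3, 2
X = rng.standard_normal((n, p)) + 1j * rng.standard_normal((n, p))
Xs = sigma(X)

L = np.hstack([np.eye(n), 1j * np.eye(n)])   # [I_n  iI_n]
R = np.vstack([np.eye(p), 1j * np.eye(p)])   # [I_p; iI_p]

print(np.allclose(Qm(n) @ Xs @ Qm(p), Xs))   # Q_n X_sigma Q_p = X_sigma
print(np.allclose(0.5 * L @ Xs @ R, X))      # 1/2 [I iI] X_sigma [I; iI] = X
```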

From the above theorem, it is possible that two different solutions of real repre- sentation equation (6.27) may result in a common solution to the matrix equation (6.21). Due to this reason, the relationship “ ” is introduced. 254 6 Real-Representation-Based Approaches

Definition 6.1 Given two real matrices $X_1,X_2\in\mathbb{R}^{2r\times2p}$, the matrices $X_1$ and $X_2$ are said to possess the relation " " if

$$X_1+Q_rX_1Q_p=X_2+Q_rX_2Q_p.$$

This relation is denoted by $X_1\;X_2$.

Obviously, the relation " " defined above is reflexive, symmetric, and transitive, and thus is an equivalence relation. According to this equivalence relation, one can define equivalence classes in $\mathbb{R}^{2r\times2p}$. The equivalence class containing $A\in\mathbb{R}^{2r\times2p}$ is denoted by

$$\operatorname{com}(A)=\{X\mid A\;X\}.$$

Now the following map is established:

× × ψ : Cr p → com (R2r 2p) Re(A) Im(A) . A → com Im(A) − Re(A)

Obviously, the defined map ψ is a 1-1 correspondence according to the definition of the relation “ ”. With this understanding, one has the following conclusion.

Proposition 6.4 Given matrices A ∈ C^{n×n}, F ∈ C^{p×p}, and B ∈ C^{n×r}, the real representation equation (6.27) has 4rp degrees of freedom in $\mathbb{R}^{2r\times2p}$ if and only if the matrix equation (6.21) has rp degrees of freedom in $\mathbb{C}^{r\times p}$.

Regarding the degrees of freedom existing in the matrix equation (6.21), one has the following results.

Lemma 6.9 Given matrices A ∈ Cn×n, F ∈ Cp×p, and B ∈ Cn×r, the con-Sylvester matrix equation (6.21) has rp degrees of freedom if λ(AA) ∩ λ(FF) = ∅.

Proof According to the proof of Theorem 6.3, one has

λ(Aσ ) ∩ λ(Fσ ) = ∅ ⇐⇒ λ(AA) ∩ λ(FF) = ∅.

Thus, if λ(AA)∩λ(FF) = ∅, then the real representation matrix equation (6.27) has 4rp degrees of freedom by Lemma 6.7. Combining this with Proposition 6.4,gives the conclusion. 

Theorem 6.16 Given matrices A ∈ Cn×n, F ∈ Cp×p, and B ∈ Cn×r, the following hold. (1) The following matrices X ∈ Cn×p and Y ∈ Cr×p satisfy the con-Sylvester matrix equation (6.21) for any Z ∈ Cr×p: 6.3 Con-Sylvester Matrix Equations 255 ⎧ ⎪ n j−1 n j−1   ⎨ = α ( )j−i−1 ( )i + α ( )j−i−1 i X j AA ABZ FF j AA BZF FF , ⎪ = = = = (6.30) ⎩⎪ j 1 i 0 j 1 i 0 = ( ) Y ZfAA FF

where   ( ) = − = α n + α n−1 +···+α + α fAA s det sI AA ns n−1s 1s 0.

(2) When the matrices AA and FF do not have common eigenvalues, all the matrices X and Y satisfying the con-Sylvester matrix equation (6.21) can be parameterized by (6.30).

Proof (1) If the con-Sylvester matrix equation (6.21) has a solution (X, Y), then ˜ ˜ the real representation matrix equation (6.27) has a solution (X, Y) = (Xσ , Yσ ). Moreover, for the real solution, one can choose the parameter matrix to be a real representation Zσ of a complex parameter matrix Z. In this case, by Lemma 6.8, Theorem 2.7 and Lemmas 2.13 and 2.16, it can be derived that

Xσ n 2j−1 2j−s−1 s = αjAσ Bσ Zσ (F)σ j=1 s=0   n j−1 j 2j−2i−1 2i 2j−2i 2i−1 = αj Aσ Bσ Zσ (F)σ + Aσ Bσ Zσ (F)σ j=1 i=0 i=1   n j−1 j 2(j−i−1)+1 2i 2(j−i) 2(i−1)+1 = αj Aσ Bσ Zσ (F)σ + Aσ Bσ Zσ (F)σ j=1 i=0 i=1  n j−1     j−i−1 i = αj (AA) PnAσ Bσ Zσ FF Pp+ σ σ j=1 i=0  j       − j−i i 1 (AA) PnBσ Zσ (F)σ FF Pp σ σ = i 1  n j−1       j−i−1 i = αj (AA) Pn ABZ FF Pp+ σ σ σ j=1 i=0  j         − j−i i 1 (AA) Pn BZF FF Pp σ σ σ = i 1   n j−1 j         − j−i−1 i j−i i 1 = αj (AA) ABZ(FF) + (AA) BZF FF σ σ j=1 i=0 i=1 256 6 Real-Representation-Based Approaches   n j−1   j−1    j−i−1 i j−i−1 i = αj (AA) ABZ(FF) + (AA) BZF FF , σ σ j=1 i=0 i=0 and

= ( ) = ( ( )) Yσ Zσ fAσ Fσ Zσ fAA FF σ Pn = ( ( )) = ( ( )) . ZfAA FF σ ZfAA FF σ

Thus, the expression (6.30) is obtained. (2) This can be obtained immediately from Lemma 6.9. 

In the next two theorems, equivalent forms are provided for the solutions given in Theorem 6.16 to the con-Sylvester matrix equation (6.21).

Theorem 6.17 Given matrices A ∈ Cn×n, F ∈ Cp×p, and B ∈ Cn×r,let

( ) = ( − ) = n + α n−1 +···+α + α fAA s det sI AA s n−1s 1s 0 and n−1 adj(sI − AA) = Rn−1s +···+R1s + R0.

If the matrices AA and FF do not have common eigenvalues, then all the matrices X and Y satisfying the con-Sylvester matrix equation (6.21) can be parameterized by ⎧ − − ⎨⎪ n 1   n 1   = i + i X RiBZF FF RiABZ FF , ⎩⎪ i=0 i=0 = ( ) Y ZfAA FF where Z ∈ Cr×p is a parameter matrix which can be arbitrarily chosen.

Proof By applying the Leverrier algorithm (6.7) to the matrix AA, one can obtain that the coefficients αi, Ri, i ∈ I[0, n − 1], satisfy ⎧     ⎪ = α + α +···+α n−2 + n−1 ⎨⎪ R0 1In 2AA n−1 AA AA ··· . (6.31) ⎪ = α + ⎩⎪ Rn−2 n−1In AA Rn−1 = In

This relation can be compactly expressed as

n  k−i−1 Ri = αk AA ,αn = 1, i ∈ I[1, n − 1]. k=i+1 6.3 Con-Sylvester Matrix Equations 257

With this expression, one has

n−1  i RiBZF FF i=0 ⎛ ⎞ n−1 n      ⎝ j−i−1⎠ i = αj AA BZF FF i=0 j=i+1 n j−1  j−i−1   i = αj AA BZF FF , j=1 i=0 and

n−1  i RiABZ FF i=0 ⎛ ⎞ n−1 n      ⎝ j−i−1⎠ i = αj AA ABZ FF i=0 j=i+1 n j−1  j−i−1   i = αj AA ABZ FF . j=1 i=0

Combining these two relations with Theorem 6.16 gives the conclusions. □

Theorem 6.18 Given matrices A ∈ C^{n×n}, F ∈ C^{p×p}, and B ∈ C^{n×r}, if the matrices $A\overline{A}$ and $F\overline{F}$ do not have common eigenvalues, all the matrices X and Y satisfying the con-Sylvester matrix equation (6.21) can be explicitly expressed as
$$\begin{cases}X=\mathrm{Ctr}(A\overline{A},B)\,S_r(A\overline{A})\,\mathrm{Obs}_n(F\overline{F},\overline{Z}\,\overline{F})+\mathrm{Ctr}(A\overline{A},A\overline{B})\,S_r(A\overline{A})\,\mathrm{Obs}_n(F\overline{F},Z),\\ Y=Zf_{A\overline{A}}(F\overline{F}),\end{cases}\qquad(6.32)$$
where Z ∈ C^{r×p} is an arbitrarily chosen parameter matrix.

Proof By applying the Leverrier algorithm (6.7) to the matrix AA, for the coefficients αi, and coefficient matrices Ri, i ∈ I[0, n − 1], in Theorem 6.17 the relation (6.31) holds. With this relation, one has

n−1  i RiBZF FF i=0  = R BRB ··· R − B Obs (FF, ZF) 0 1 n 1 ! n  n−1 = BAAB ··· AA B Sr(AA) Obsn(FF, ZF)

= Ctr(AA, B)Sr(AA) Obsn(FF, ZF), 258 6 Real-Representation-Based Approaches and

n−1  i RiABZ FF i=0  = R ABRAB ··· R − AB Obs (FF, Z) 0 1 n 1 ! n  n−1 = AB AAAB ··· AA AB Sr(AA) Obsn(FF, Z)

= Ctr(AA, AB)Sr(AA) Obsn(FF, Z).

With the preceding two relations, the conclusion of this theorem can be readily obtained from Theorem 6.17. 

Example 6.3 Consider a con-Sylvester matrix equation in the form of (6.21) with the following parameters:
$$A=\begin{bmatrix}1&-2-i&-1+i\\0&i&0\\0&-1&1-i\end{bmatrix},\quad
F=\begin{bmatrix}2i&i\\1&-1+i\end{bmatrix},\quad
B=\begin{bmatrix}-1+i&1\\0&i\\-i&1-2i\end{bmatrix}.$$

By some calculations, it can be obtained that

$$f_{A\overline{A}}(s)=f_{\overline{A}A}(s)=s^{3}-4s^{2}+5s-2.$$

Thus, one has
$$S_1(A\overline{A})=\begin{bmatrix}5&-4&1\\-4&1&0\\1&0&0\end{bmatrix}.$$

In addition, it can be easily obtained that

( , ) ⎡Ctr AA B ⎤ −1 + i1−2 + 4i −6 + 3i −4 + 10i −13 + 16i = ⎣ 0i 0 i 0 i⎦ , −i1− 2i −2i −5i −4i −2 − 11i and

( , ) ⎡Ctr AA AB ⎤ −2 − 2i −3 + i −4 − 6i −12 − 6i −8 − 14i −23 − 27i = ⎣ 010 1 0 1⎦ . 1 + i3+ 2i 2 + 2i 5 + 6i 4 + 4i 9 + 14i 6.3 Con-Sylvester Matrix Equations 259

If the free parameter Z is chosen as

If the free parameter Z is chosen as
$$Z=\begin{bmatrix}2+i&-1+2i\\-1+3i&1-i\end{bmatrix},$$
by applying the result in Theorem 6.18, a special solution of this con-Sylvester matrix equation can be obtained as
$$X=\begin{bmatrix}254-130i&130-442i\\-16-8i&-64+8i\\-88+8i&16+320i\end{bmatrix},\qquad
Y=\begin{bmatrix}-16-8i&56-32i\\-8-40i&-24+56i\end{bmatrix}.$$
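For readers who wish to experiment with this example, the sketch below solves the equation for X once Y is fixed, again by treating the map as a real linear system (it assumes, as above, that (6.21) reads $A\overline{X}+B\overline{Y}=XF$; this is our reading of the garble-prone conjugation pattern, not a quotation of the book). With the Y reported above, the computed X can be compared against the X reported above.

```python
import numpy as np

def solve_for_X(A, B, F, Y):
    """Solve X*F - A*conj(X) = B*conj(Y) for X by real vectorization."""
    n, p = A.shape[0], F.shape[0]
    N = n * p
    M = np.zeros((2 * N, 2 * N))
    for k in range(2 * N):
        e = np.zeros(2 * N); e[k] = 1.0
        Xk = e[:N].reshape(n, p) + 1j * e[N:].reshape(n, p)
        R = Xk @ F - A @ Xk.conj()
        M[:, k] = np.concatenate([R.real.ravel(), R.imag.ravel()])
    C = B @ Y.conj()
    rhs = np.concatenate([C.real.ravel(), C.imag.ravel()])
    x = np.linalg.solve(M, rhs)
    return x[:N].reshape(n, p) + 1j * x[N:].reshape(n, p)

A = np.array([[1, -2-1j, -1+1j], [0, 1j, 0], [0, -1, 1-1j]])
B = np.array([[-1+1j, 1], [0, 1j], [-1j, 1-2j]])
F = np.array([[2j, 1j], [1, -1+1j]])
Y = np.array([[-16-8j, 56-32j], [-8-40j, -24+56j]])

X = solve_for_X(A, B, F, Y)
print(X)                                                     # compare with the X above
print(np.linalg.norm(A @ X.conj() + B @ Y.conj() - X @ F))   # residual ~ 0
```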

6.4 Con-Yakubovich Matrix Equations

In this section, by means of the real representation in Sect. 2.6 for a complex matrix the following con-Yakubovich matrix equation is investigated

$$X-A\overline{X}F=BY,\qquad(6.33)$$
where A ∈ C^{n×n}, F ∈ C^{p×p}, and B ∈ C^{n×r} are known matrices, and the matrices X ∈ C^{n×p} and Y ∈ C^{r×p} are to be determined. Similarly to the cases in the previous sections, the following matrices $P_j$ and $Q_j$ defined in Sect. 2.6 are needed:

$$P_j=\begin{bmatrix}I_j&0\\0&-I_j\end{bmatrix},\qquad Q_j=\begin{bmatrix}0&I_j\\-I_j&0\end{bmatrix}.$$

The following real representation matrix equation is first defined for the con-Yakubovich matrix equation (6.33):
$$\widetilde X-A_\sigma\widetilde XF_\sigma=B_\sigma P_r\widetilde Y,\qquad(6.34)$$
where $\widetilde X$ and $\widetilde Y$ are the matrices to be determined. According to Items (2) and (7) of Lemma 2.13, one has
$$\bigl(X-A\overline{X}F\bigr)_\sigma=X_\sigma-A_\sigma X_\sigma F_\sigma,$$
and
$$(BY)_\sigma=B_\sigma P_rY_\sigma.$$

So the con-Yakubovich matrix equation (6.33) is equivalent to

$$X_\sigma-A_\sigma X_\sigma F_\sigma=B_\sigma P_rY_\sigma.$$

Based on the above, one has the following proposition.

Proposition 6.5 Given matrices A ∈ C^{n×n}, F ∈ C^{p×p}, and B ∈ C^{n×r}, the con-Yakubovich matrix equation (6.33) has a solution (X, Y) if and only if its real representation matrix equation (6.34) has a real matrix solution $(\widetilde X,\widetilde Y)=(X_\sigma,Y_\sigma)$.

Theorem 6.19 Given matrices A ∈ C^{n×n}, F ∈ C^{p×p}, and B ∈ C^{n×r}, the con-Yakubovich matrix equation (6.33) has a solution (X, Y) if and only if the matrix equation (6.34) has a solution $(\widetilde X,\widetilde Y)\in\mathbb{R}^{2n\times2p}\times\mathbb{R}^{2r\times2p}$. In this case, if $(\widetilde X,\widetilde Y)$ is a solution to the matrix equation (6.34), then (X, Y) with
$$\begin{cases}X=\dfrac12\begin{bmatrix}I_n&iI_n\end{bmatrix}\bigl(\widetilde X+Q_n\widetilde XQ_p\bigr)\begin{bmatrix}I_p\\ iI_p\end{bmatrix}\\[2mm]
Y=\dfrac12\begin{bmatrix}I_r&iI_r\end{bmatrix}\bigl(\widetilde Y+Q_r\widetilde YQ_p\bigr)\begin{bmatrix}I_p\\ iI_p\end{bmatrix}\end{cases}$$
is a solution to the matrix equation (6.33).

Proof According to Item (5) of Lemma 2.13, the matrix equation (6.34) is equivalent to ˜ ˜ ˜ X − QnAσ QnXQpFσ Qp = QnBσ QrPrY.

−1 =− =− , Since Qj Qj and QrPr PrQr one has

˜ ˜ ˜ QnXQp − Aσ QnXQpFσ = Bσ PrQrYQp.

˜ ˜ ˜ This shows that if (X, Y) is a solution to the matrix equation (6.34), then (QnXQp, ˜ QrYQp) is also a solution to the matrix equation (6.34). Thus the conclusion is obtained by following the steps of proof of Theorem 6.15. 

Theorem 6.19 shows that the solutions of the con-Yakubovich matrix equation (6.33) can be obtained by a real Yakubovich matrix equation. Due to this reason, now the following Yakubovich matrix equation is considered:

X − AXF = BY, (6.35) where A ∈ Cn×n, F ∈ Cp×p, and B ∈ Cn×r are known matrices, and the matrices X ∈ Cn×p and Y ∈ Cr×p are to be determined. Regarding the solution to the Yakubovich matrix equation (6.35), one has the following result. 6.4 Con-Yakubovich Matrix Equations 261

Lemma 6.10 Given matrices A ∈ Cn×n,B∈ Cn×r, and F ∈ Cp×p,let

$$g_A(s)=\det(I-sA)=\alpha_ns^{n}+\alpha_{n-1}s^{n-1}+\cdots+\alpha_1s+1,\qquad(6.36)$$
and
$$\operatorname{adj}(I-sA)=R_{n-1}s^{n-1}+\cdots+R_1s+R_0.\qquad(6.37)$$

Then

(1) The matrices X ∈ C^{n×p} and Y ∈ C^{r×p} given by
$$\begin{cases}X=\displaystyle\sum_{i=0}^{n-1}R_iBZF^{i},\\ Y=Zg_A(F)\end{cases}\qquad(6.38)$$

satisfy the Yakubovich matrix equation (6.35) for any matrix Z ∈ C^{r×p}.
(2) If $\gamma\mu\neq1$ for any $\gamma\in\lambda(A)$, $\mu\in\lambda(F)$, then all the matrices X ∈ C^{n×p} and Y ∈ C^{r×p} satisfying the Yakubovich matrix equation (6.35) can be parameterized as (6.38) with Z being an arbitrary matrix.

The result in Lemma 6.10 was given in [269, 305] by different approaches. This lemma provides a very neat, complete, and explicit parametric solution to the matrix equation (6.35). To obtain this solution, one needs the coefficients $\alpha_i$, $i\in I[1,n-1]$, of the characteristic polynomial of the matrix pair (−A, −I) and the coefficient matrices $R_i$, $i\in I[0,n-1]$, of the adjoint matrix $\operatorname{adj}(I-sA)$. They can be obtained by the following generalized Leverrier algorithm
$$\begin{cases}R_i=R_{i-1}A+\alpha_iI,&R_0=I,\\[1mm]\alpha_i=-\dfrac{\operatorname{tr}(R_{i-1}A)}{i},&\alpha_0=1,\end{cases}\qquad i\in I[1,n],\qquad(6.39)$$
which is given in Sect. 2.3. The following lemma provides an equivalent form of the solution given in Lemma 6.10.
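A minimal sketch of the generalized Leverrier recursion (6.39), as reconstructed above, is given below; the final lines check the defining identity $\operatorname{adj}(I-sA)(I-sA)=g_A(s)I$ at a test point. Names and indexing are our own choices for illustration.

```python
import numpy as np

def gen_leverrier(A):
    """Return (alpha, R) with g_A(s) = det(I - sA) = alpha[n] s^n + ... + alpha[1] s + 1
    and adj(I - sA) = R[n-1] s^(n-1) + ... + R[1] s + R[0]."""
    n = A.shape[0]
    alpha = np.zeros(n + 1, dtype=complex); alpha[0] = 1.0
    R = [np.eye(n, dtype=complex)]                 # R_0 = I
    for i in range(1, n + 1):
        alpha[i] = -np.trace(R[i - 1] @ A) / i
        R.append(R[i - 1] @ A + alpha[i] * np.eye(n))
    return alpha, R[:n]

rng = np.random.default_rng(2)
A = rng.standard_normal((4, 4))
alpha, R = gen_leverrier(A)
s = 0.3
adj = sum(R[i] * s**i for i in range(4))
det = sum(alpha[i] * s**i for i in range(5))
print(np.allclose(adj @ (np.eye(4) - s * A), det * np.eye(4)))
```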

Lemma 6.11 Given matrices A ∈ Cn×n,B∈ Cn×r, and F ∈ Cp×p,let

$$g_A(s)=\det(I-sA)=\alpha_ns^{n}+\alpha_{n-1}s^{n-1}+\cdots+\alpha_1s+\alpha_0,\quad\alpha_0=1.$$

Then, the matrices X and Y given by (6.38) have the following equivalent form:
$$\begin{cases}X=\displaystyle\sum_{k=0}^{n-1}\sum_{j=k}^{n-1}\alpha_kA^{j-k}BZF^{j},\\ Y=Zg_A(F),\end{cases}\qquad(6.40)$$
where Z ∈ C^{r×p} is a parameter matrix which can be arbitrarily chosen.

Proof According to the generalized Leverrier algorithm (6.39), the coefficient matri- ces Ri, i ∈ I[0, n − 1], in Lemma 6.10 can be compactly expressed as

j j−k Rj = αkA ,α0 = 1. k=0

Substituting this into the expression of X in (6.38) and reordering the summation, one has

n−1 j X = RjBZF j=0   n−1 j j−k j = αkA BZF j=0 k=0 n−1 n−1 j−k j = αkA BZF . k=0 j=k

The proof is completed. 

By combining the result of Lemma 6.10 or 6.11 with Theorem 6.19, one can solve the con-Yakubovich matrix equation (6.33). By Theorem 6.19, it is possible that two different solutions of the real repre- sentation equation (6.34) may result in a common solution to the matrix equation (6.33). By using the equivalence relation “ ” and the equivalence class com (·) in the previous section, the following conclusion can be obtained.

Proposition 6.6 Given matrices A ∈ Cn×n, F ∈ Cp×p, and B ∈ Cn×r, the real representation equation (6.34) has 4rp degrees of freedom in R2r×2p if and only if the con-Yakubovich matrix equation (6.33) has rp degrees of freedom in Cr×p.

Regarding the degrees of freedom existing in the con-Yakubovich matrix equation (6.33), one has the following results.

Theorem 6.20 Given matrices A ∈ C^{n×n}, F ∈ C^{p×p}, and B ∈ C^{n×r}, the con-Yakubovich matrix equation (6.33) has rp degrees of freedom if $\gamma\mu\neq1$ for any $\gamma\in\lambda(A\overline{A})$, $\mu\in\lambda(\overline{F}F)$.

Proof According to Theorem 6.10, $\gamma\mu\neq1$ for any $\gamma\in\lambda(A_\sigma)$, $\mu\in\lambda(F_\sigma)$ if and only if $\gamma\mu\neq1$ for any $\gamma\in\lambda(A\overline{A})$, $\mu\in\lambda(\overline{F}F)$. Combining this with Proposition 6.6 implies the assertion of this theorem. □

Theorem 6.21 Given matrices A ∈ Cn×n, F ∈ Cp×p, and B ∈ Cn×r, the following statements hold. 6.4 Con-Yakubovich Matrix Equations 263

(1) The matrices X ∈ C^{n×p} and Y ∈ C^{r×p} given by
$$\begin{cases}X=\displaystyle\sum_{k=0}^{n-1}\sum_{j=k}^{n-1}\alpha_k(A\overline{A})^{j-k}BZ(\overline{F}F)^{j}+\sum_{k=0}^{n-1}\sum_{j=k}^{n-1}\alpha_k(A\overline{A})^{j-k}A\overline{B}\,\overline{Z}F(\overline{F}F)^{j},\\[2mm]
Y=Zg_{A\overline{A}}(\overline{F}F),\end{cases}\qquad(6.41)$$

satisfy the con-Yakubovich matrix equation (6.33) for an arbitrarily chosen matrix Z ∈ C^{r×p}. In (6.41),
$$g_{A\overline{A}}(s)=\det(I-sA\overline{A})=\alpha_ns^{n}+\alpha_{n-1}s^{n-1}+\cdots+\alpha_1s+\alpha_0,\quad\alpha_0=1.$$
(2) If $\gamma\mu\neq1$ for any $\gamma\in\lambda(A\overline{A})$, $\mu\in\lambda(\overline{F}F)$, then all the matrices X and Y satisfying the matrix equation (6.33) can be parameterized by (6.41).

Proof If the con-Yakubovich matrix equation (6.33) has a solution (X, Y), then the ˜ ˜ real representation matrix equation (6.34) has a solution (X, Y) = (Xσ , Yσ ) with the free parameter Zσ . By Theorem 2.7 and Lemmas 6.11, 2.13, and 2.17, one has

Xσ n−1 2n−1 s−2k s = αkAσ Bσ PrZσ Fσ k=0 s=2⎛k ⎞ n−1 n−1 n−1 ⎝ 2j−2k 2j 2j+1−2k 2j+1⎠ = αk Aσ Bσ PrZσ Fσ + Aσ Bσ PrZσ Fσ = = = k 0 ⎛ j k j k n−1 n−1     ⎝ (j−k) j = αk (AA) PnBσ PrZσ FF Pp+ σ σ = = k 0 j k ⎞ n−1     j−k j ⎠ (AA) PnAσ Bσ PrZσ Fσ FF Pp σ σ = j k ⎛ n−1 n−1     ⎝ (j−k) j = αk (AA) BZ FF Pp+ σ σ = = k 0 j k ⎞ n−1       j−k j ⎠ (AA) Pn ABZF FF Pp σ σ σ = j k ⎛ n−1 n−1     ⎝ (j−k) j = αk (AA) Pn (BZ)σ FF Pp+ σ σ k=0 j=k 264 6 Real-Representation-Based Approaches ⎞ n−1     j−k j ⎠ (AA) PnAσ (BZ)σ Fσ FF Pp σ σ = j k ⎛ ⎞ n−1 n−1   n−1    ⎝ (j−k) j j−k j ⎠ = αk (AA) BZ(FF) + (AA) ABZF FF , σ σ k=0 j=k j=k and

= ( ) = ( ( )) Yσ Zσ gAσ Fσ Zσ gAA FF σ Pn = ( ( )) = ( ( )) . ZgAA FF σ ZgAA FF σ

Thus, the expression (6.41) is obtained. In view of Proposition 6.5, the second conclusion is true. □

In the following, some equivalent forms are presented for the solution given in Theorem 6.21.

Theorem 6.22 Given matrices A ∈ C^{n×n}, F ∈ C^{p×p}, and B ∈ C^{n×r}, let
$$\operatorname{adj}(I-sA\overline{A})=R_{n-1}s^{n-1}+\cdots+R_1s+R_0.$$

Then the matrices X and Y given by (6.41) have the following equivalent form:
$$\begin{cases}X=\displaystyle\sum_{i=0}^{n-1}R_iBZ(\overline{F}F)^{i}+\sum_{i=0}^{n-1}R_iA\overline{B}\,\overline{Z}F(\overline{F}F)^{i},\\[2mm] Y=Zg_{A\overline{A}}(\overline{F}F),\end{cases}$$
where Z ∈ C^{r×p} is a free parameter matrix.

Proof By applying the generalized Leverrier algorithm (6.39) to the matrix $A\overline{A}$, it is easily obtained that the coefficients $\alpha_i$ and the matrices $R_i$, $i\in I[0,n-1]$, in Theorem 6.21 satisfy
$$\begin{cases}R_0=I\\ R_1=\alpha_1I+A\overline{A}\\ R_2=\alpha_2I+\alpha_1A\overline{A}+(A\overline{A})^2\\ \quad\cdots\\ R_{n-1}=\alpha_{n-1}I+\alpha_{n-2}A\overline{A}+\cdots+\alpha_1(A\overline{A})^{n-2}+(A\overline{A})^{n-1}.\end{cases}\qquad(6.42)$$

This relation can be compactly expressed by

$$R_i=\sum_{j=0}^{i}\alpha_j(A\overline{A})^{i-j},\quad\alpha_0=1.$$

Thus, one has

n−1 i RiBZ(FF) i=0 ⎛ ⎞ n−1 i     − ⎝ i j⎠ i = αj AA (BZ) (FF) i=0 j=0 n−1 n−1     − i j i = αj AA (BZ) (FF) , j=0 i=j and

n−1  i RiABZF FF i=0 ⎛ ⎞ n−1 i     −   ⎝ i j⎠ i = αj AA ABZF (FF) i=0 j=0 n−1 n−1     −   i j i = αj AA ABZF (FF) . j=0 i=j

Combining these two expressions with Theorem 6.21, gives the conclusion. 

Theorem 6.23 Given matrices A ∈ C^{n×n}, F ∈ C^{p×p}, and B ∈ C^{n×r}, if $\gamma\mu\neq1$ for any $\gamma\in\lambda(A\overline{A})$, $\mu\in\lambda(\overline{F}F)$, then all the matrices X and Y satisfying the con-Yakubovich matrix equation (6.33) can be explicitly given by
$$\begin{cases}X=\mathrm{Ctr}(A\overline{A},B)\,T_r(A\overline{A})\,\mathrm{Obs}_n(\overline{F}F,Z)+\mathrm{Ctr}(A\overline{A},A\overline{B})\,T_r(A\overline{A})\,\mathrm{Obs}_n(\overline{F}F,\overline{Z}F),\\ Y=Zg_{A\overline{A}}(\overline{F}F),\end{cases}\qquad(6.43)$$
where Z ∈ C^{r×p} is a free parameter matrix.

Proof By applying the generalized Leverrier algorithm (6.39) to the matrix AA,for the coefficients αi, and coefficient matrices Ri, i ∈ I[0, n − 1], in Theorem 6.22 the relation (6.42) holds. With this relation, one has

n−1 i RiBZ(FF) i=0  = R BRB ··· R − B Obs (FF, Z) 0 1 n 1 ! n  n−1 = BAAB ··· AA B Tr(AA) Obsn(FF, Z)

= Ctr(AA, B)Tr(AA) Obsn(FF, Z), 266 6 Real-Representation-Based Approaches and

n−1  i RiABZF FF i=0  = R ABRAB ··· R − AB Obs (FF, ZF) 0 1 n 1 ! n  n−1 = AB AAAB ··· AA AB Tr(AA) Obsn(FF, ZF)

= Ctr(AA, AB)Tr(AA) Obsn(FF, ZF).

With the preceding two relations, the conclusion of this theorem can be readily obtained from Theorem 6.22. 

Example 6.4 Consider a con-Yakubovich matrix equation in the form of (6.33) with the following parameters:
$$A=\begin{bmatrix}1&-2-i&-1+i\\0&i&0\\0&-1&1-i\end{bmatrix},\quad
F=\begin{bmatrix}2i&i\\1&-1+i\end{bmatrix},\quad
B=\begin{bmatrix}-1+i&1\\0&i\\-i&1-2i\end{bmatrix}.$$

By straightforward calculation, it can be obtained that

$$g_{A\overline{A}}(s)=g_{\overline{A}A}(s)=-2s^{3}+5s^{2}-4s+1.$$

Thus, one has ⎡ ⎤ 1 −45 ⎣ ⎦ T1(AA) = 01 −4 . 00 1

In addition, it can be easily calculated that

( , ) ⎡Ctr AA B ⎤ −1 + i1−2 + 4i −6 + 3i −4 + 10i −13 + 16i = ⎣ 0i 0 i 0 i⎦ , −i1− 2i −2i −5i −4i −2 − 11i and

( , ) ⎡Ctr AA AB ⎤ −2 − 2i −3 + i −4 − 6i −12 − 6i −8 − 14i −23 − 27i = ⎣ 010 1 0 1⎦ . 1 + i3+ 2i 2 + 2i 5 + 6i 4 + 4i 9 + 14i

If the free parameter Z is chosen as 6.4 Con-Yakubovich Matrix Equations 267

If the free parameter Z is chosen as
$$Z=\begin{bmatrix}2+i&-1+2i\\-1+3i&1-i\end{bmatrix},$$
by Theorem 6.23 a special solution of this matrix equation can be obtained as
$$X=\begin{bmatrix}546+142i&654+926i\\8-40i&112\\96-72i&288+200i\end{bmatrix},\qquad
Y=\begin{bmatrix}24+112i&-112+24i\\-72-24i&152-232i\end{bmatrix}.$$
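A direct numerical check of this pair is straightforward; the short sketch below computes the residual of $X-A\overline{X}F=BY$ for the (X, Y) reported above, which should essentially vanish provided the data and the equation form are transcribed correctly (both are our reconstructions of the printed example).

```python
import numpy as np

A = np.array([[1, -2-1j, -1+1j], [0, 1j, 0], [0, -1, 1-1j]])
F = np.array([[2j, 1j], [1, -1+1j]])
B = np.array([[-1+1j, 1], [0, 1j], [-1j, 1-2j]])

X = np.array([[546+142j, 654+926j], [8-40j, 112], [96-72j, 288+200j]])
Y = np.array([[24+112j, -112+24j], [-72-24j, 152-232j]])

# residual of X - A*conj(X)*F = B*Y
print(np.linalg.norm(X - A @ X.conj() @ F - B @ Y))
```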

6.5 Extended Con-Sylvester Matrix Equations

This section is devoted to the extended con-Sylvester matrix equation

$$E\overline{X}F-AX=C,\qquad(6.44)$$
where E, A ∈ C^{n×n}, F ∈ C^{p×p}, and C ∈ C^{n×p} are known matrices, and X ∈ C^{n×p} is the matrix to be determined. As in previous sections, the matrices $P_j$ and $Q_j$ defined in Sect. 2.6 are used. That is,

$$P_j=\begin{bmatrix}I_j&0\\0&-I_j\end{bmatrix},\qquad Q_j=\begin{bmatrix}0&I_j\\-I_j&0\end{bmatrix}.$$

The real representation equation for (6.44) is defined as

$$E_\sigma YF_\sigma-A_\sigma P_nY=C_\sigma.\qquad(6.45)$$

The following theorem presents a result on transformation between complex and real matrix equations.

Theorem 6.24 Given matrices E, A ∈ Cn×n,F∈ Cp×p, and C ∈ Cn×p, the matrix equation (6.44) has a solution X ∈ Cn×p if and only if its real representation equation (6.45) has a real solution Y. In this case, the solution of the matrix equation (6.44) can be expressed by

$$X=\frac14\begin{bmatrix}I_n&iI_n\end{bmatrix}\bigl(Y+Q_nYQ_p\bigr)\begin{bmatrix}I_p\\ iI_p\end{bmatrix}.\qquad(6.46)$$

Proof By applying Item (2) of Lemma 2.13, one has

$$\bigl(E\overline{X}F-AX\bigr)_\sigma=E_\sigma X_\sigma F_\sigma-A_\sigma P_nX_\sigma.$$

Thus the matrix equation (6.44) can be equivalently written as 268 6 Real-Representation-Based Approaches

Eσ Xσ Fσ − Aσ PnXσ = Cσ , which implies that Xσ is a solution to the real representation equation (6.45). −1 =− =− On the other hand, it is easily known that Qi Qi and PiQi QiPi for any i. In addition, from Item (5) of Lemma 2.13 one has Aσ = QmAσ Qn for any A ∈ Cm×n. So the matrix equation (6.45) can be equivalently written as

QnEσ QnYQpFσ Qp − QnAσ QnPnY = QnCσ Qp.

−1 −1 Pre- and post-multiplying both sides of the above equation by Qn and Qp , respec- tively, gives Eσ QnYQpFσ − Aσ QnPnY(−Qp) = Cσ , which is equivalent to

Eσ QnYQpFσ − Aσ PnQnYQp = Cσ .

This equation shows that, if Y is a solution to the equation (6.45), then QnYQp is also a solution to the equation (6.45). Thus the following real matrix

1 Yˆ = (Y + Q YQ ) 2 n p is also a solution to the equation (6.45). Now let

Y11 Y12 n×p Y = , Yij ∈ R , i, j = 1, 2, Y21 Y22 then one has Z Z Yˆ = 1 2 , (6.47) Z2 −Z1 where 1 1 Z = (Y − Y ), Z = (Y + Y ). 1 2 11 22 2 2 12 21 From (6.47) one can construct a complex matrix as follows

  1 ˆ Ip X = Z1 + i Z2 = In iIn Y . 2 iIp

Clearly, the real representation of the complex matrix $\hat X$ is $\hat Y$, i.e., $\hat X_\sigma=\hat Y$. Thus, the X in (6.46) is a solution to the matrix equation (6.44). □
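The real-representation identities invoked in the proof can be spot-checked numerically. The sketch below assumes (our assumption) the block form $M_\sigma=\left[\begin{smallmatrix}\operatorname{Re}M&\operatorname{Im}M\\\operatorname{Im}M&-\operatorname{Re}M\end{smallmatrix}\right]$ for the real representation of Sect. 2.6 and verifies the two facts used above on random data.

```python
import numpy as np

def P(j): return np.block([[np.eye(j), np.zeros((j, j))], [np.zeros((j, j)), -np.eye(j)]])
def Q(j): return np.block([[np.zeros((j, j)), np.eye(j)], [-np.eye(j), np.zeros((j, j))]])
def sigma(M):
    # assumed block form of the real representation
    return np.block([[M.real, M.imag], [M.imag, -M.real]])

rng = np.random.default_rng(3)
n, p = 3, 2
E = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
A = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
X = rng.standard_normal((n, p)) + 1j * rng.standard_normal((n, p))

# (E Xbar F - A X)_sigma = E_sigma X_sigma F_sigma - A_sigma P_n X_sigma
F = rng.standard_normal((p, p)) + 1j * rng.standard_normal((p, p))
lhs = sigma(E @ X.conj() @ F - A @ X)
rhs = sigma(E) @ sigma(X) @ sigma(F) - sigma(A) @ P(n) @ sigma(X)
print(np.allclose(lhs, rhs))

# Q_n A_sigma Q_n = A_sigma (Item (5)-type identity)
print(np.allclose(Q(n) @ sigma(A) @ Q(n), sigma(A)))
```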

Remark 6.14 Theorem 6.24 gives a practical technique to find a solution of the extended con-Sylvester matrix equation (6.44) based on a solution to the matrix equation (6.45) by means of real representation of a complex matrix. Remark 6.15 The latter part of the proof of Theorem 6.24 is taken from [155]. In order to obtain a further result on the solution of the extended con-Sylvester matrix equation (6.45), the following result is needed. n×n Lemma 6.12 Given matrices E, A ∈ C ,ifλ is a finite eigenvalue of (Eσ , Aσ Pn), then so are ±λ, ±λ. = + = + , = , Proof Assume that A A1 i A2 and E E1 iE2 with Ai Ei, i 1 2, being αT αT T ( , ) real matrices. Let 1 2 be an eigenvector of Eσ Aσ Pn corresponding to the eigenvalue λ. Then, one has

A −A α E E α 1 2 1 = λ 1 2 1 , (6.48) A2 A1 α2 E2 −E1 α2 that is,  α − α = λ α + λ α A1 1 A2 2 E1 1 E2 2 . A2α1 + A1α2 = λE2α1 − λE1α2

Rearranging the above relations, one has  (−α ) − α = (−λ) (−α ) + (−λ) α A1 2 A2 1 E1 2 E2 1 , A2(−α2) + A1α1 = (−λ)E2(−α2) − (−λ)E1α1 that is, A −A −α E E −α 1 2 2 = (−λ) 1 2 2 . (6.49) A2 A1 α1 E2 −E1 α1

This reveals that −λ is also an eigenvalue of the matrix pair (Eσ , Aσ Pn).Itiseasily known from (6.48) and (6.49) that λ and −λ are also eigenvalues of the matrix pair (Eσ , Aσ Pn) . The conclusion is thus proven.  Lemma 6.13 Let E, A ∈ Cn×n and F ∈ Cp×p. Then ( ) (1) f(Eσ , Aσ Pn) s is a real polynomial, and there holds

n ( ) = 2i; f(Eσ , Aσ Pn) s a2ks i=0

( ) = ( ( )) (2) f(Eσ , Aσ Pn) Fσ g(Eσ , Aσ Pn) FF σ Pn, in which

n ( ) = i. g(Eσ , Aσ Pn) s a2ks i=0 270 6 Real-Representation-Based Approaches

Proof The first conclusion is obtained immediately from Lemma 6.12. For any i,by 2i i Lemma 6.12 it is known that Fσ = ((FF) )σ Pn, so Item (2) is valid.  Theorem 6.25 Given matrices E, A ∈ Cn×n,F∈ Cp×p, and C ∈ Cn×p, denote

$$\operatorname{adj}(sE_\sigma-A_\sigma P_n)=R_{2n-1,0}+R_{2n-1,1}s+\cdots+R_{2n-1,2n-1}s^{2n-1}.$$

Thus, if λ(FF) ∩ λ (Eσ , Aσ Pn) = ∅, then the matrix equation (6.44) has a unique solution given by

   1 2n−1 Ip X = In iIn R2n−1,0Cσ + R2n−1,1Cσ Fσ +···+R2n−1,2n−1Fσ · 2 iIp   ( ) −1 . g(Eσ , Aσ Pn) FF

Proof From Lemma 6.13 and a result in [257], it can be obtained that   ( ) Xg(Eσ , Aσ Pn) FF σ

= Xσ (g( , )(FF))σ P  Eσ Aσ Pn  p = ( ) Xσ g(Eσ , Aσ Pn) FF σ Pp = ( ) Xσ f(Eσ , Aσ Pn) Fσ 2n−1 = R2n−1,0Cσ + R2n−1,1Cσ Fσ +···+R2n−1,2n−1Fσ .

In view of the proof of Theorem 6.24, the conclusion is obtained. □

The above theorem provides an approach for solving the extended con-Sylvester matrix equation (6.44). However, it requires the characteristic polynomial of the matrix pair $(E_\sigma,A_\sigma P_n)$ and the adjoint matrix of $(sE_\sigma-A_\sigma P_n)$, and hence involves computations with matrices of higher dimensions. To avoid this, inspired by the idea in Sect. 6.1, one would like to find a matrix pair (E′, A′) with E′, A′ ∈ C^{n×n} such that $f_{(E',A')}(s)=g_{(E_\sigma,A_\sigma P_n)}(s)$. Unfortunately, such a matrix pair has not been found in this book; interested readers are encouraged to pursue this problem.

6.6 Generalized Con-Sylvester Matrix Equations

In this section, the following generalized con-Sylvester matrix equation is studied

$$AX+BY=E\overline{X}F,\qquad(6.50)$$
where A, E ∈ C^{n×n}, B ∈ C^{n×r}, and F ∈ C^{p×p} are known matrices, and X ∈ C^{n×p} and Y ∈ C^{r×p} are the matrices to be determined. This equation is a conjugate version of the generalized Sylvester matrix equation AX + BY = EXF. In this section, an approach is given to solve the generalized con-Sylvester matrix equation (6.50) by means of the real representation of a complex matrix. As in previous sections, the following matrices $P_j$ and $Q_j$ defined in Sect. 2.6 are needed:
$$P_j=\begin{bmatrix}I_j&0\\0&-I_j\end{bmatrix},\qquad Q_j=\begin{bmatrix}0&I_j\\-I_j&0\end{bmatrix}.$$

Taking the real representation of (6.50) gives
$$\bigl(AX+BY\bigr)_\sigma=\bigl(E\overline{X}F\bigr)_\sigma.$$

By applying Item (5) of Lemma 2.13, the preceding relation is equivalent to

$$A_\sigma P_nX_\sigma+B_\sigma P_rY_\sigma=E_\sigma X_\sigma F_\sigma.\qquad(6.51)$$

Accordingly, for the generalized con-Sylvester matrix equation (6.50) one can define its real representation equation as
$$A_\sigma P_n\widetilde X+B_\sigma P_r\widetilde Y=E_\sigma\widetilde XF_\sigma,\qquad(6.52)$$
where $\widetilde X\in\mathbb{R}^{2n\times2p}$ and $\widetilde Y\in\mathbb{R}^{2r\times2p}$ are the matrices to be determined. Based on the previous derivation, the following simple proposition can be obtained.

Proposition 6.7 Given matrices E, A ∈ C^{n×n}, B ∈ C^{n×r}, and F ∈ C^{p×p}, the generalized con-Sylvester matrix equation (6.50) has a solution (X, Y) if and only if its real representation matrix equation (6.52) has a real matrix solution $(\widetilde X,\widetilde Y)=(X_\sigma,Y_\sigma)$.

The matrix equation (6.52) is the generalized Sylvester matrix equation investigated in [69, 247, 303], and can be solved by the methods given in these references. If the solutions $\widetilde X$ and $\widetilde Y$ of the equation (6.52) possess the structure of the real representation, one can easily obtain a solution of the matrix equation (6.50). An immediate question is then what happens for the matrix equation (6.50) when one obtains a pair of solutions of (6.52) which are not in the form of the real representation. The following theorem gives an answer to this question.

Theorem 6.26 Given matrices E, A ∈ C^{n×n}, B ∈ C^{n×r}, and F ∈ C^{p×p}, the generalized con-Sylvester matrix equation (6.50) has a solution (X, Y) if and only if the matrix equation (6.52) has a real matrix solution $(\widetilde X,\widetilde Y)\in\mathbb{R}^{2n\times2p}\times\mathbb{R}^{2r\times2p}$. In this case, if $(\widetilde X,\widetilde Y)$ is a real solution to the matrix equation (6.52), then the matrix pair (X, Y) given by
$$\begin{cases}X=\begin{bmatrix}I_n&iI_n\end{bmatrix}\bigl(\widetilde X+Q_n\widetilde XQ_p\bigr)\begin{bmatrix}I_p\\ iI_p\end{bmatrix},\\[2mm]
Y=\begin{bmatrix}I_r&iI_r\end{bmatrix}\bigl(\widetilde Y+Q_r\widetilde YQ_p\bigr)\begin{bmatrix}I_p\\ iI_p\end{bmatrix}\end{cases}\qquad(6.53)$$
is a solution to the matrix equation (6.50).

Proof By applying Item (2) of Lemma 2.13, the generalized con-Sylvester matrix equation (6.52) can be equivalently rewritten as ˜ ˜ ˜ (QnAσ Qn) PnX + (QnBσ Qr) PrY = (QnEσ Qn) XQpFσ Qp. (6.54)

−1 =− =− = Since Qp Qp and QjPj PjQj, j n, r,itfollowsfrom(6.54) that

˜ ˜ ˜ Aσ PnQnXQp + Bσ PrQrYQp = Eσ QnXQpFσ .

˜ ˜ This relation implies that (QnXQp, QrYQp) is also a solution to the matrix equation ˜ ˜ ˜ ˜ (6.52). Due to the homogeneity, (2(X + QnXQp), 2(Y + QrYQp)) is also a solution to the matrix equation (6.52). With this, the conclusion can be obtained by following the line of the proof of Theorem 6.15. 

It follows from Theorem 6.26 that two different solutions of the real representation equation (6.52) may result in an identical solution to the matrix equation (6.50). For this reason, similarly to Sect. 6.3, the relation " " given in Definition 6.1 can also be used in this section. It has been pointed out in Sect. 6.3 that the relation " " is an equivalence relation. Accordingly, for a given matrix $A\in\mathbb{R}^{2r\times2p}$ one can define the equivalence class

$$\operatorname{com}(A)=\{X\mid A\;X\}.$$

With this, one can now establish the following map:

$$\psi:\ \mathbb{C}^{r\times p}\to\operatorname{com}\bigl(\mathbb{R}^{2r\times2p}\bigr),\qquad
A\mapsto\operatorname{com}\!\begin{bmatrix}\operatorname{Re}(A)&\operatorname{Im}(A)\\ \operatorname{Im}(A)&-\operatorname{Re}(A)\end{bmatrix}.$$

Obviously, the defined map ψ is a 1-1 correspondence according to the definition of the relation “ ”. Combining this with the result on the generalized Sylvester matrix equation [247, 303], one has the following conclusion.

Proposition 6.8 Given matrices E, A ∈ C^{n×n}, B ∈ C^{n×r}, and F ∈ C^{p×p}, the generalized con-Sylvester matrix equation (6.50) has rp degrees of freedom in C^{r×p} if and only if
$$\operatorname{rank}\begin{bmatrix}sE_\sigma-A_\sigma P_n&B_\sigma\end{bmatrix}=2n,\quad\text{for all }s\in\lambda(F_\sigma).\qquad(6.55)$$

Theorem 6.26 has provided a method for solving the generalized con-Sylvester matrix equation (6.50). However, the dimension of the matrix equation (6.52) is higher than that of the matrix equation (6.50), which may cause computational difficulties. To avoid this, inspired by the idea in Sect. 6.1, one would like to find a matrix pair (E′, A′) with E′, A′ ∈ C^{n×n} such that $f_{(E',A')}(s)=g_{(E_\sigma,A_\sigma P_n)}(s)$, where $g_{(E_\sigma,A_\sigma P_n)}(s)$ is defined in Lemma 6.13. Unfortunately, such a matrix pair has not been found so far; interested readers are encouraged to pursue this problem.

6.7 Notes and References

In this chapter, some complex conjugate matrix equations have been investigated with the aid of the real representation in Sect. 2.6 of complex matrices. This real representation was firstly proposed in [155] to solve the con-Kalman-Yakubovich matrix equations. These equations include: normal con-Sylvester matrix equations, con-Kalman-Yakubovich matrix equations, con-Sylvester matrix equations, con- Yakubovich matrix equations, extended con-Sylvester matrix equations and general- ized con-Sylvester matrix equations. The contents in Subsection 6.1.3 and Sects. 6.2– 6.6 are mainly based on the results in [242, 258, 269, 280, 282]. The contents in Sect. 6.3 are new, and have not been published. The solutions of these six matrix equa- tions are obtained from those of their corresponding real representation equations by using results on the characteristic polynomials of real representation matrices. For normal con-Sylvester matrix equations, con-Kalman-Yakubovich matrix equations, con-Sylvester matrix equations and con-Yakubovich matrix equations, the obtained solutions can be expressed by the original coefficient matrices of the equations, instead of their real representations. Several equivalent forms of the solutions of these matrix equations are also given. Some of these equivalent forms are expressed in terms of controllability matrices. This feature may provide much convenience for further analysis and applications of these matrix equations. In addition, the necessary and sufficient condition for the normal con-Sylvester matrix equation to have a solu- tion is given in Subsection6.1.1. This result can be viewed as a conjugate version of the normal Sylvester matrix equation. For comparison, the solvability conditions of normal Sylvester and con-Sylvester matrix equations are listed in Table 6.1, where cs ∼ and ∼ are used to denote the similarity and consimilarity relations, respectively. The con-Sylvester matrix equation was earliest investigated by Bevis et al. [21], where the result of Theorem 6.1 on the solvability condition of con-Sylvester matrix equations was established. This result is a consimilarity version of Roth’s theo- rem. The uniqueness condition of solutions in Table 6.1 was also given in [21] for the normal con-Sylvester matrix equations. In addition, the general solution to the homogeneous normal con-Sylvester matrix equation was given in [21], and an important conclusion is that any solution to a normal con-Sylvester matrix equa- tion can be written as a particular solution plus a complementary solution to the

corresponding homogeneous equation. The solvability and solutions of the normal con-Sylvester matrix equations were characterized in [20] by using Jordan canonical forms under consimilarity. The explicit solutions in the current chapter for the normal con-Sylvester matrix equation were developed in [258]. In addition, in [11] a necessary and sufficient condition for a nonsingular matrix to be conjugate-Toeplitz was given in terms of a homogeneous con-Sylvester matrix equation.

Table 6.1 Solvability conditions
  Equation:               $XF-AX=C$    and    $XF-A\overline{X}=C$
  Solvability:            $\begin{bmatrix}A&-C\\0&F\end{bmatrix}\sim\begin{bmatrix}A&0\\0&F\end{bmatrix}$    and    $\begin{bmatrix}A&-C\\0&F\end{bmatrix}\overset{cs}{\sim}\begin{bmatrix}A&0\\0&F\end{bmatrix}$
  Uniqueness for any C:   $\lambda(A)\cap\lambda(F)=\emptyset$    and    $\lambda(A\overline{A})\cap\lambda(F\overline{F})=\emptyset$

The con-Kalman-Yakubovich matrix equation appears to have arisen earliest in [11], in the study of conjugate-Toeplitz matrices. In [11], the following con-Kalman-Yakubovich matrix equation was considered:

$$X-Z\overline{X}Z^{\mathrm T}=xy^{\mathrm T},$$
where Z is the "single shift" matrix

$$Z=\begin{bmatrix}0&0\\I_{n-1}&0\end{bmatrix},$$
and x and y are two column vectors. It was shown that the solution of this equation can be expressed as the product of a lower triangular conjugate-Toeplitz matrix whose first column is x and an upper triangular conjugate-Toeplitz matrix whose first row is $y^{\mathrm T}$. The general con-Kalman-Yakubovich matrix equation was investigated in [155], where it was shown that the solutions of this kind of equation can be obtained by solving the corresponding real representation matrix equations.

Chapter 7
Polynomial-Matrix-Based Approaches

In Chap. 6, a kind of real representation for complex matrices has been used to investigate some complex conjugate matrix equations, such as the normal con-Sylvester equations, the con-Yakubovich equations, and the generalized con-Sylvester equations. A basic feature of the approaches in Chap. 6 is that the solutions of these matrix equations are obtained indirectly, by using the results on their corresponding real representation matrix equations. In this chapter, polynomial matrices are used to investigate some complex conjugate matrix equations. In such a framework, it is not necessary to use relevant results on matrix equations without the conjugate of the unknown matrices.

Since polynomial matrices are used as basic tools in this chapter, some basic results on polynomial matrices are first introduced. The set of all polynomial matrices of dimension m × n over the field C with respect to the indeterminate s is denoted by C^{m×n}[s]. For two polynomials α(s) and β(s), the expression α(s) | β(s) means that α(s) divides β(s).

Definition 7.1 A square polynomial matrix U(s) is called unimodular if its determinant det U(s) is a nonzero constant.

Lemma 7.1 For a polynomial matrix A(s), there exist two unimodular matrices U(s) and V(s) with appropriate dimensions such that
$$U(s)A(s)V(s)=\begin{bmatrix}\operatorname{diag}_{i=1}^{r}d_i(s)\\0\end{bmatrix},\qquad(7.1)$$
where $d_i(s)\,|\,d_{i+1}(s)$, $i\in I[1,r-1]$, and 0 is a zero matrix with appropriate dimensions.

The expression on the right-hand side of (7.1) is called the Smith normal form of the polynomial matrix A(s). The previous definition and lemma can be found in the

book [160]. For some further results on polynomial matrices, one can also refer to the book [160].

Throughout this chapter, some symbols are adopted. For A ∈ C^{n×n}, B ∈ C^{n×r}, and C ∈ C^{m×n}, it is denoted that
$$\mathrm{Ctr}_k(A,B)=\begin{bmatrix}B&AB&\cdots&A^{k-1}B\end{bmatrix},\qquad \mathrm{Ctr}(A,B)=\mathrm{Ctr}_n(A,B),$$
$$\mathrm{Obs}_k(A,C)=\begin{bmatrix}C\\CA\\\vdots\\CA^{k-1}\end{bmatrix},$$
$$f_A(s)=\det(sI-A)=s^{n}+\alpha_{n-1}s^{n-1}+\cdots+\alpha_1s+\alpha_0,$$
$$g_A(s)=\det(I-sA)=\beta_ns^{n}+\beta_{n-1}s^{n-1}+\cdots+\beta_1s+1.$$

Obviously, Ctr(A, B) is the controllability matrix of the matrix pair (A, B), and $\mathrm{Obs}_n(A, C)$ is the observability matrix of the matrix pair (A, C). For a matrix pair E, A ∈ C^{n×n}, the notation $f_{(E,A)}(s)$ is used to denote the characteristic polynomial of the matrix pair (E, A), that is, $f_{(E,A)}(s)=\det(sE-A)$. In addition, λ(E, A) denotes the set of finite eigenvalues of the matrix pair (E, A), that is, $\lambda(E,A)=\{s\mid\det(sE-A)=0\}$.

In addition, for a polynomial matrix $M(s)=\sum_{i=0}^{\varphi}M_is^{i}$, it is denoted that
$$T_t(M(s))=\begin{bmatrix}M_0&M_1&\cdots&M_{t-1}\\&M_0&\ddots&\vdots\\&&\ddots&M_1\\&&&M_0\end{bmatrix},$$
and
$$S_t(M(s))=\begin{bmatrix}M_1&M_2&\cdots&M_{t-1}&M_t\\M_2&M_3&\cdots&M_t&\\\vdots&\vdots&&&\\M_{t-1}&M_t&&&\\M_t&&&&\end{bmatrix},$$
where $M_i=0$ for any $i>\varphi$.

7.1 Homogeneous Con-Sylvester Matrix Equations

This section is devoted to investigating the following homogeneous con-Sylvester matrix equation
$$A\overline{X}+B\overline{Y}=XF,\qquad(7.2)$$
where A ∈ C^{n×n}, B ∈ C^{n×r}, and F ∈ C^{p×p} are known matrices, and X and Y are the matrices to be determined. Before proceeding, the following result on normal con-Sylvester matrix equations is given.

Lemma 7.2 Given matrices A ∈ Cn×n and F ∈ Cp×p, the normal con-Sylvester matrix equation

$$XF-A\overline{X}=C\qquad(7.3)$$
has a unique solution for any C ∈ C^{n×p} if and only if $\lambda(A\overline{A})\cap\lambda(F\overline{F})=\emptyset$.

The result of Lemma 7.2 is in fact the Theorem 6.3 in the previous section. Based on this lemma, it is known that the con-Sylvester matrix equation (7.2) has a unique solution X with respect to an arbitrary fixed matrix Y if the matrices AA and FF have no common eigenvalues. Therefore, when λ(AA) ∩ λ(FF) = ∅,the number of degrees of freedom in the solution (X, Y) to the matrix equation (7.2)is equal to the number of elements in the matrix Y, that is, rp. This fact is expressed in the following lemma.

Lemma 7.3 Given matrices A ∈ C^{n×n}, B ∈ C^{n×r}, and F ∈ C^{p×p}, if $\lambda(A\overline{A})\cap\lambda(F\overline{F})=\emptyset$, then the number of degrees of freedom existing in the solution (X, Y) to the homogeneous con-Sylvester matrix equation (7.2) is rp.

With this lemma, a basic solution can be obtained for the homogeneous con- Sylvester matrix equation (7.2).

Theorem 7.1 Given matrices A ∈ Cn×n,B∈ Cn×r, and F ∈ Cp×p, suppose that λ(AA) ∩ λ(FF) = ∅. Let

$$f_{A\overline{A}}(s)=\det(sI-A\overline{A})=\sum_{i=0}^{n}\alpha_is^{i},\quad\alpha_n=1,$$

$$\operatorname{adj}(sI-A\overline{A})=\sum_{i=0}^{n-1}R_is^{i}.$$

Then, all the solutions to the homogeneous con-Sylvester matrix equation (7.2) can be characterized by
$$\begin{cases}X=\displaystyle\sum_{i=0}^{n-1}R_iB\overline{Z}\,\overline{F}(F\overline{F})^{i}+A\sum_{i=0}^{n-1}\overline{R_i}\,\overline{B}Z(F\overline{F})^{i},\\[2mm]
Y=Zf_{A\overline{A}}(F\overline{F}),\end{cases}\qquad(7.4)$$
where Z ∈ C^{r×p} is an arbitrarily chosen free parameter matrix.

Proof It is first shown that the matrices X and Y given in (7.4) are solutions to the con-Sylvester matrix equation (7.2). A direct calculation gives

AX − XF n−1 n−1 i i = A RiBZF(FF) + AA RiBZ(FF) i=0 i=0 n−1 n−1 i i − RiBZF(FF) F − A RiBZ(FF) F (7.5) i=0 i=0 n−1 n−1 i i+1 = AA RiBZ(FF) − RiBZ(FF) i=0 i=0 n−1 i n = AAR0BZ + (AARi − Ri−1)BZ(FF) − Rn−1BZ(FF) . i=1

By the following polynomial matrix relation

(sI − AA) adj(sI − AA) = I det(sI − AA), it is easily derived that ⎧ ⎨ −AAR0 = α0I, − + = α , ∈ I[ , − ], ⎩ AARi Ri−1 iI i 1 n 1 (7.6) Rn−1 = αnI.

By applying these relations to (7.5) one has

AX − XF n−1 i n =−α0BZ − αiBZ(FF) − αnBZ(FF) (7.7) i=1 n i =−BZ αi(FF) i=0 =− ( ) BZfAA FF .

Therefore, the matrices X and Y given in (7.4) satisfy the con-Sylvester matrix equation (7.2). Secondly, let us show the completeness of solution (7.4). Since λ(AA) ∩ λ(FF) = ∅ λ( ) = λ( ) ( ) and FF FF , the matrix fAA FF is nonsingular, and hence the mapping Z → Y given in (7.4) is injective. This implies that the mapping Z → (X, Y) given in (7.4) is also injective. So all the rp elements in Z have contribution to the solution (7.4). In addition, it follows from Lemma 7.3 that there are rp degrees of freedom in 7.1 Homogeneous Con-Sylvester Matrix Equations 279 the solution of the con-Sylvester matrix equation (7.2). These two facts imply that the solution given in (7.4) is complete. 

The above theorem provides a very neat closed-form solution to the homogeneous con-Sylvester matrix equation (7.2). In order to obtain this solution, one needs the α ∈ I[ , ] ( ) coefficients i, i 1 n , of the polynomial fAA s and the coefficient matrices Ri, i ∈ I[0, n − 1], of adj(sI − AA). They can be obtained by the following Leverrier algorithm in Sect. 2.2 for the matrix AA:  Rn−i = Rn−i+1AA + αn−i+1I, Rn = 0 ( ) , i ∈ I[1, n]. (7.8) α =−tr Rn−iAA ,α = n−i i n 1

In fact, during the proof of Theorem 7.1 the relation (7.6) is the first expression of the Leverrier algorithm (7.8). The iterative expression in (7.8) can be explicitly expanded as ⎧   −   − ⎪ = α + α +···+α n 2 + n 1 ⎪ R0 1In 2AA  n−1 AA AA ⎨⎪ n−2 R1 = α2In + α3AA +···+ AA ··· . (7.9) ⎪ ⎪ = α + ⎩⎪ Rn−2 n−1In AA Rn−1 = In

From these relations it is easily known that   ··· = ( , ) ( ( ) ) R0BR1B Rn−1B Ctr AA B Sn fAA s Ir .

Thus one has

n−1 i RiBZF(FF) i=0  = R0BR1B ··· Rn−1B Obsn(FF, ZF) = ( , ) ( ( ) ) ( , ) Ctr AA B Sn fAA s Ir Obsn FF ZF .

Similarly, it can be obtained that

n−1 i RiBZ(FF) i=0 = ( , ) ( ( ) ) ( , ) Ctr AA B Sn fAA s Ir Obsn FF Z .

With the above discussion one can obtain an equivalent form of the solution provided in Theorem 7.1. 280 7 Polynomial-Matrix-Based Approaches

Theorem 7.2 Given matrices A ∈ C^{n×n}, B ∈ C^{n×r}, and F ∈ C^{p×p}, suppose that $\lambda(A\overline{A})\cap\lambda(F\overline{F})=\emptyset$. Then all the solutions to the con-Sylvester matrix equation (7.2) can be characterized by
$$\begin{cases}X=\mathrm{Ctr}(A\overline{A},B)\,S_n(f_{A\overline{A}}(s)I_r)\,\mathrm{Obs}_n(F\overline{F},\overline{Z}\,\overline{F})+A\,\mathrm{Ctr}(\overline{A}A,\overline{B})\,S_n(f_{A\overline{A}}(s)I_r)\,\mathrm{Obs}_n(F\overline{F},Z),\\ Y=Zf_{A\overline{A}}(F\overline{F}),\end{cases}\qquad(7.10)$$
with Z being an arbitrarily chosen parameter matrix.

If Theorem 7.2 is applied to obtain the general solution of the con-Sylvester matrix equation (7.2), one needs only the coefficients αi, i ∈ I[0, n], of the characteristic polynomial of AA. These coefficients can be obtained by some proper numerically reliable algorithms (e.g. [189]) apart from the Leverrier algorithm (7.8). In the above theorem, the solution of the homogeneous con-Sylvester matrix equation (7.2)is expressed in terms of the controllability matrix associated with the matrices AA and B, and the observability matrix associated with the matrix F and the free parameter matrix Z. This property may bring convenience and advantages to the further analysis of the con-Sylvester matrix equation (7.2). By applying the results in Theorems 7.1 and 7.2, explicit solutions for the normal con-Sylvester matrix equation (7.3) can be obtained. In addition, it is easily found that the matrix equation (7.2) is reduced to the normal con-Sylvester matrix equation (7.3)ifB = C and Y = I. With such an observation, the solution X of the normal con-Sylvester matrix equation (7.3) can be obtained from the first expression in (7.4) or (7.10) by letting B = C and setting the free parameter Z to be   = ( ) −1 . Z fAA FF

This is the following corollary.

Corollary 7.1 Given matrices A ∈ Cn×n,B∈ Cn×r, and F ∈ Cp×p, suppose that λ(AA) ∩ λ(FF) = ∅. Let

n ( ) = ( − ) = α i,α = , fAA s det sI AA is n 1 i=0

n−1 i adj(sI − AA) = Ris . i=0

Then the unique solution of the normal con-Sylvester matrix equation (7.3) is given by

n−1 n−1  −  − = ( ) 1 ( )i + ( ) 1( )i X RiC fAA FF F FF A RiC fAA FF FF , (7.11) i=0 i=0 7.1 Homogeneous Con-Sylvester Matrix Equations 281 or equivalently

 −1 X = Ctr(AA, C)S (f (s)I ) Obs (FF, f (FF) F) (7.12)  n AA r n AA    + ( , ) ( ( ) ) ( , ( ) −1) A Ctr AA C Sn fAA s Ir Obsn FF fAA FF .

It is easily checked that

( )i ( ) = ( ) ( )i F FF fAA FF fAA FF F FF , which, under the condition λ(AA) ∩ λ(FF) = ∅, is equivalent to

 −  − ( ) 1 ( )i = ( )i ( ) 1 fAA FF F FF F FF fAA FF . (7.13) ( ) ∈ R[ ] From Lemma 2.19, it is known that fAA s s . Combining this fact with (7.9), gives

ARi = RiA. (7.14)

( ) ∈ R[ ] In addition, due to fAA s s it is easily derived that     −1   ( ) −1 = ( ) = ( ) −1 fAA FF fAA FF fAA FF .

With this relation, it can be obtained that

 −  − ( ) 1( )i = ( )i ( ) 1 fAA FF FF FF fAA FF . (7.15)

By combining the three relations (7.13)–(7.15) it is known that the expression (7.11) for the unique solution of the matrix equation (7.3) can be equivalently rewritten as   n−1 n−1  − = ( )i + ( )i ( ) 1 X RiCF FF RiAC FF fAA FF . i=0 i=0

This is exactly the result given in Theorem 6.7. Different from the approach in the current section, in Chap. 6 this solution was derived by combining a solution of a normal Sylvester matrix equation and some properties of the real representation of a complex matrix. At the end of this section, the solution of the following Sylvester matrix equation is discussed

AX + BY = XF, (7.16) 282 7 Polynomial-Matrix-Based Approaches where A ∈ Cn×n, B ∈ Cn×r, and F ∈ Cp×p are known matrices, by applying the pro- posed results on the con-Sylvester matrix equation. If the known matrices A, B, and F, and the unknown matrices X and Y are all real, then the matrix equation (7.2) becomes the matrix equation (7.16) with A ∈ Rn×n, B ∈ Rn×r, and F ∈ Rp×p being known matrices, and X ∈ Rn×p and Y ∈ Rr×p being the unknown matrices. By apply- ing Theorem 7.1, under the condition of λ(A) ∩ λ(F) = ∅ all the real solutions of the Sylvester matrix equation (7.16) can be parameterized as ⎧ − − ⎨⎪ n 1 n 1 X = R BZF2i+1 + A R BZF2i i i , (7.17) ⎩⎪ i=0 i=0 2 Y = ZfA2 (F ) where n−1 i = ( − 2), i=0 Ris adj sI A and Z ∈ Rr×p is an arbitrarily chosen parameter matrix representing the degrees of freedom in the solution. In view that the set of complex numbers with addition and multiplication is a field, the result in (7.17) holds for the Sylveter matrix equation (7.16) with complex known and unknown matrices. This can be summarized as the following corollary.

Corollary 7.2 Given matrices A ∈ Cn×n,B∈ Cn×r, and F ∈ Cp×p, suppose that λ(A) ∩ λ(F) = ∅. Let n−1 2 i adj(sI − A ) = Ris . (7.18) i=0

Then all the solutions (X, Y) to the Sylvester matrix equation (7.16) can be parame- terized as ⎧ − ⎨⎪ n 1 X = (R BZF + AR BZ) F2i i i , ⎩⎪ i=0 2 Y = ZfA2 (F ) or equivalently ⎧ 2 2 ⎨ X = Ctr(A , B)Sn(fA2 (s)Ir) Obsn(F , ZF) 2 2 +A Ctr(A , B)Sn(fA2 (s)Ir) Obsn(F , Z) , ⎩ 2 Y = ZfA2 (F ) where Z ∈ Cr×p is an arbitrarily chosen free parameter matrix.

Similarly to the idea of Corollary 7.1, from Corollary 7.2 one can also obtain explicit expressions of the unique solution of the normal Sylvester matrix equation

XF − AX = C (7.19) 7.1 Homogeneous Con-Sylvester Matrix Equations 283 with A ∈ Cn×n, C ∈ Cn×p, and F ∈ Cp×p being known matrices. In addition, the solution of (7.19) can also be obtained by specializing the solution to the equation (7.3). With these ideas, the following corollary is given.

Corollary 7.3 Given matrices A ∈ Cn×n,C∈ Cn×p, and F ∈ Cp×p, suppose that λ(A) ∩ λ(F) = ∅, and (7.18) holds. Then the unique solution of the normal Sylvester matrix equation (7.19) can be expressed by one of the following

n−1       2 −1 2 −1 2i X = RiC fA2 (F ) F + ARiC fA2 (F ) F , = i 0  n−1   2i 2 −1 X = Ri (CF + AC) F fA2 (F ) , =  i 0  n−1   2i 2 −1 X = (RiCF + ARiC) F fA2 (F ) , = i 0   2 2 2 −1 X = Ctr(A , C)Sn(fA2 (s)Ir) Obsn(F , fA2 (F ) F)   2 2 2 −1 +A Ctr(A , C)S (f 2 (s)I ) Obs (F , f 2 (F ) ),  n A r n A 2 2 X = Ctr(A , C)Sn(fA2 (s)Ir) Obsn(F , F)+   2 2 2 −1 A Ctr(A , C)Sn(fA2 (s)Ir) Obsn(F , Ip) fA2 (F ) ,  2 2 X = Ctr(A , C)Sn(fA2 (s)Ir) Obsn(F , F)+   2 2 2 −1 Ctr(A , A)Sn(fA2 (s)Ir) Obsn(F , C) fA2 (F ) .

In [300], for the Sylvester matrix equation (7.16) the following two explicit expres- sions of the general solution were given under the condition λ(A) ∩ λ(F) = ∅ ⎧ − ⎨⎪ n 1 = ˜ X RiBZF , ⎩⎪ i=0 Y = ZfA(F) and  = ( , ) ( ( ) ) ( , ) X Ctr A B Sn fA s Ir Obsn F Z , Y = ZfA(F)

$$X=\left(\sum_{i=0}^{n-1}\widetilde R_iCF^{i}\right)\left(f_A(F)\right)^{-1}\qquad(7.20)$$

n−1 ˜ i Ris = adj(sI − A). i=0

Clearly, the expressions proposed in Corollaries 7.2 and 7.3 are more complicated than those in [258, 300], respectively. Moreover, the solutions in Corollaries 7.2 and 7.3 are dependent on the information of matrix A2 instead of A. Nevertheless, the approach in the current section may provide a new idea to investigate the matrix equations (7.2) and (7.19). It is desired that the proposed results in this section can provide some directions to further exploit some properties of these matrix equa- tions. In addition, it is very interesting to show the equivalence of the expressions in Corollary 7.3 and that in (7.20) by a direct computation.
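Since (7.20) concerns the ordinary normal Sylvester equation, it is easy to exercise numerically. The sketch below builds the coefficients via the Faddeev-LeVerrier recursion sketched earlier (names and helper structure are ours), evaluates (7.20), and checks the residual of XF − AX = C; with randomly generated complex data the spectra of A and F are almost surely disjoint, so the residual should be near zero.

```python
import numpy as np

def normal_sylvester_via_7_20(A, C, F):
    """Solve X F - A X = C by the closed form (7.20), assuming lambda(A) and lambda(F) are disjoint."""
    n = A.shape[0]
    # f_A(s) = det(sI - A) and adj(sI - A) = sum_i Rtilde_i s^i
    c = np.zeros(n + 1, dtype=complex); c[n] = 1.0
    R = [None] * n
    Mk = np.eye(n, dtype=complex)
    for k in range(1, n + 1):
        R[n - k] = Mk
        c[n - k] = -np.trace(A @ Mk) / k
        Mk = A @ Mk + c[n - k] * np.eye(n)
    fA_F = sum(c[i] * np.linalg.matrix_power(F, i) for i in range(n + 1))
    num = sum(R[i] @ C @ np.linalg.matrix_power(F, i) for i in range(n))
    return num @ np.linalg.inv(fA_F)

rng = np.random.default_rng(4)
A = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
F = rng.standard_normal((2, 2)) + 1j * rng.standard_normal((2, 2))
C = rng.standard_normal((3, 2)) + 1j * rng.standard_normal((3, 2))
X = normal_sylvester_via_7_20(A, C, F)
print(np.linalg.norm(X @ F - A @ X - C))    # residual ~ 0
```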

7.2 Nonhomogeneous Con-Sylvester Matrix Equations

In this section, the emphasis is on the solution of the nonhomogeneous con-Sylvester matrix equation

$$A\overline{X}+B\overline{Y}=XF+R\qquad(7.21)$$
with A ∈ C^{n×n}, B ∈ C^{n×r}, F ∈ C^{p×p}, and R ∈ C^{n×p} being known matrices. It is easily checked that the mapping $(X,Y)\mapsto A\overline{X}+B\overline{Y}-XF$ is not complex-linear; nevertheless, it is real-linear. Thus, the following simple fact can be obtained.

Proposition 7.1 If $(X_0,Y_0)$ is a special solution of the nonhomogeneous con-Sylvester matrix equation (7.21), and $(\widetilde X,\widetilde Y)$ is the general solution of the homogeneous con-Sylvester matrix equation (7.2), then the general solution of (7.21) can be given by
$$\begin{cases}X=\widetilde X+X_0,\\ Y=\widetilde Y+Y_0.\end{cases}$$

Combining this lemma with the results in the previous section, two approaches will be given for solving the nonhomogeneous con-Sylvester matrix equation (7.21) in this section. 7.2 Nonhomogeneous Con-Sylvester Matrix Equations 285

7.2.1 The First Approach

Due to Proposition 7.1, the emphasis is first placed on giving some expressions of a special solution to the equation (7.21). In order to obtain the first expression of a special solution, the following lemma is needed.

Lemma 7.4 Given matrices A ∈ Cn×n,B∈ Cn×r, and F ∈ Cp×p, suppose that λ(AA) ∩ λ(FF) = ∅. Then there exist polynomial matrices L(s) ∈ Cn×n [s] and H(s) ∈ Cr×n [s], and polynomial (s) ∈ C [s] satisfying

(AA − sI)L(s) + BH(s) = (s)In, (7.22) with (γ ) = 0 for any γ ∈ λ(FF).

× ( + )×( + ) Proof Let P(s) ∈ Cn n [s] and Q(s) ∈ Cn r n r [s] be two unimodular polyno- mial matrices that transform AA − sI B into its Smith normal form, that is,     n P(s) AA − sI B Q(s) = diag di(s) 0 (7.23) i=1 with di(s) ∈ C [s], i ∈ I[1, n], and di(s)|di+1(s), i ∈ I[1, n − 1]. Since λ(AA) ∩ λ(FF) = ∅, then di(γ ) = 0, i ∈ I[1, n], for any γ ∈ λ(FF). Partition Q (s) into   Q (s) = Q1 (s) Q2 (s)

(n+r)×n with Q1 (s) ∈ C [s]. Then, it follows from (7.23) that

  n P(s) AA − sI B Q1(s) = diag di(s) i=1   n dn(s) ⇐⇒ P(s) AA − sI B Q1(s) diag = Indn(s) i=1 di(s)    n dn(s) ⇐⇒ AA − sI B Q1(s) diag P(s) = Indn(s). i=1 di(s)

Thus, one can choose (s) = dn(s) and    ( ) n ( ) L s = ( ) dn s ( ) ( ) Q1 s diag P s H s i=1 di(s) to satisfy the relation (7.22). 

With this lemma as preliminary, one has the following results on a special solution of the nonhomogeneous con-Sylvester matrix equation (7.21). 286 7 Polynomial-Matrix-Based Approaches

Theorem 7.3 Given matrices A ∈ Cn×n,B∈ Cn×r,R∈ Cn×p, and F ∈ Cp×p, sup- pose that λ(AA) ∩ λ(FF) = ∅ holds. In addition, let polynomial matrices

t−1 t i n×n i r×n L(s) = Lis ∈ C [s] , and H(s) = His ∈ C [s] , (7.24) i=0 i=0

( ) = t  i ∈ C (γ ) = γ ∈ and polynomial s i=0 is [s] satisfy (7.22) with 0 for any λ(FF). Then a special solution (X0,Y0) of the nonhomogeneous con-Sylvester matrix equation (7.21) is given by ⎧ t−1 t−1 ⎪  −  − ⎪ 1 i 1 i ⎨⎪ X0 = LiR (FF) F(FF) + A LiR (FF) (FF) i=0 i=0 . t (7.25) ⎪  − ⎪ 1 i ⎩⎪ Y0 = HiR (FF) (FF) i=0

Proof By comparing the powers of s in (7.22) one can derive that ⎧ ⎨ AAL0 + BH0 = 0I, − + =  , ∈ I[ , − ] ⎩ AALi Li−1 BHi iI i 1 t 1 , (7.26) −Lt−1 + BHt = tI.

Combining these relations with the expression of X0 in (7.25), one has

AX0 − X0F t−1 t−1  −  − 1 i 1 i = A LiR (FF) F(FF) + AA LiR (FF) (FF) i=0 i=0 t−1 t−1  −  − 1 i 1 i − LiR (FF) F(FF) F − A LiR (FF) (FF) F i=0 i=0 t−1 t−1  −  − 1 i 1 i+1 = AA LiR (FF) (FF) − LiR (FF) (FF) i=0 i=0 t−1  −  − 1 1 i = AAL0R (FF) + (AALi − Li−1)R (FF) (FF) (7.27) i=1  −1  t −Lt−1R (FF) FF t−1  −  − 1 1 i = (−BH0 + 0I) R (FF) + (−BHi + iI) R (FF) (FF) i=1  −1  t + (−BHt + tI) R (FF) FF t t  −  − 1 i 1 i =− BHiR (FF) (FF) + iR (FF) (FF) . i=0 i=0 7.2 Nonhomogeneous Con-Sylvester Matrix Equations 287

According to the definition of the polynomial (s), it is obvious that

t  − 1 i iR (FF) (FF) = R. i=1

Then, it can be derived that

AX0 − X0F t  − 1 i =− BHiR (FF) (FF) + R. i=0

Since the first term on the right-hand side is exactly $-BY_0$ with $Y_0$ given in (7.25), this implies that the matrix pair $(X_0, Y_0)$ in (7.25) is a special solution of the matrix equation (7.21). $\square$

In the next theorem, another expression of a special solution of the matrix equation (7.21) is given. Before obtaining this result, the following preliminary lemma is needed.

Lemma 7.5 Let the polynomial matrices $L(s) \in \mathbb{C}^{n\times n}[s]$ and $H(s) \in \mathbb{C}^{r\times n}[s]$ have the forms in (7.24), and denote
$$\varphi(s) = \sum_{i=0}^{t} \varphi_i s^i \in \mathbb{C}[s].$$
Then the polynomial matrices $L(s)$ and $H(s)$, and the polynomial $\varphi(s)$, satisfy (7.22) if and only if $H_i$, $\varphi_i$, $i \in$ I$[0, t]$, satisfy
$$\sum_{i=0}^{t} (A\overline{A})^i B H_i = \sum_{i=0}^{t} \varphi_i (A\overline{A})^i, \qquad (7.28)$$
and $L_i$, $i \in$ I$[0, t-1]$, satisfy
$$[L_0 \;\; L_1 \;\; \cdots \;\; L_{t-1}] = \mathrm{Ctr}_t(A\overline{A}, B)S_t(H(s)) - \mathrm{Ctr}_t(A\overline{A}, I_n)S_t(\varphi(s)I_n). \qquad (7.29)$$

Proof From the proof of Theorem 7.3 it is known that Li, i ∈ I[0, t − 1], and Hi, i, i ∈ I[0, t], satisfy (7.22) if and only if they satisfy (7.26). So it suffices to show that the relation (7.26) is equivalent to the relations (7.28) and (7.29). From the iterative relation (7.26), a direct calculation gives ⎧ ⎪ Lt−1 = BHt − tI, ⎪ ⎨⎪ Lt−2 = BHt−1 + AABHt − t−1I − tAA, ···   (7.30) ⎪ t−1 ⎪ L0 = BH1 + AABH2 +···+ AA BHt ⎩⎪  t−1 −1I − 2AA −···−t AA , 288 7 Polynomial-Matrix-Based Approaches and

 t 0 = BH0 + AABH1 +···+ AA BHt (7.31)  t −0I − 1AA −···−t AA .

The relation (7.31) is exactly (7.28). Rewriting (7.30) in a matrix form gives (7.29). So it is shown that (7.26) implies (7.28) and (7.29). On the other hand, it follows from (7.30) that

 2  t AAL0 = AABH1 + AA BH2 +···+ AA BHt  2  t −1AA − 2 AA −···−t AA .

Combining this relation with (7.28), gives the first expression of (7.26). For i ∈ I[1, t − 1], from (7.30) one has

+ −  AAL⎛i BHi iI ⎞ t      ⎝ j−(i+1) j−(i+1) ⎠ = AA AA BHj − j AA + BHi − iI j=i+1 t    j−i  j−i = AA BHj − j AA + BHi − iI j=i+1 t    j−i  j−i = AA BHj − j AA = Li−1. j=i

This is the second expression of (7.26). The last expression of (7.26) is obvious. Thus the relations (7.28) and (7.29) imply the relation (7.26). With the above two aspects, the proof is completed.  With the above lemma, it is easily known that

t−1   −1 i LiR (FF) F(FF) i=0    −1 = L0 L1 ··· Lt−1 Obst(FF, R (FF) F) (7.32)    −1 = Ctrt(AA, B)St(H(s)) − Ctrt(AA, In)St((s)In) Obst(FF, R (FF) F), and 7.2 Nonhomogeneous Con-Sylvester Matrix Equations 289

t−1  − 1 i LiR (FF) (FF) i=0 t−1  − 1 i = LiR (FF) (FF) (7.33) i=0    −1 = Ctrt(AA, B)St(H(s)) − Ctrt(AA, In)St((s)In) Obst(FF, R (FF) ).

With these two relations, one can give an equivalent form of the special solution in Theorem 7.3.

× × × × Theorem 7.4 Given matrices A ∈ Cn n,B∈ Cn r,R∈ Cn p, and F ∈ Cp p, sup- pose that λ AA ∩ λ(FF) = ∅. Let polynomial matrix

t i r×n H(s) = His ∈ C [s] i=0 and polynomial t i (s) = is ∈ C [s] i=0 satisfy (7.28), and (γ ) = 0, for any γ ∈ λ(FF).

Then a special solution (X0,Y0) of the nonhomogeneous con-Sylvester matrix equa- tion (7.21) can be given by ⎧     ⎪ −1 ⎪ X0 = Ctrt(AA, B)St(H(s)) − Ctrt(AA, In)St((s)In) Obst(FF, R (FF) F) ⎪ ⎨    −1 +A Ctrt(AA, B)St(H(s)) − Ctrt(AA, In)St((s)In) Obst(FF, R (FF) ) . ⎪ t   ⎪ −1 i ⎩⎪ Y0 = HiR (FF) (FF) i=0

With the results of Theorems 7.1–7.4, one can obtain the next two theorems on general solutions of the nonhomogeneous con-Sylvester matrix equation (7.21) by applying Proposition 7.1.

n×n n×r n×p p×p Theorem 7.5 Given matrices A ∈ C ,B∈ C ,R∈ C , and F ∈ C , sup- λ( ) ∩ λ( ) = ∅ ( ) = t−1 i ∈ pose that AA F F holds, and polynomial matrices L s i=0 Lis Cn×n ( ) = t i ∈ Cr×n ( ) = t  i ∈ [s] and H s i=0 His [s], and polynomial s i=0 is C [s] satisfy (7.22) with (γ ) = 0 for any γ ∈ λ(FF). Further, let

n ( ) = ( − ) = α i,α = , fAA s det sI AA is n 1 i=0 290 7 Polynomial-Matrix-Based Approaches

n−1 i adj(sI − AA) = Ris . i=0

Then all the solutions to the nonhomogeneous con-Sylvester matrix equation (7.21) can be characterized by ⎧ ⎪ n−1 n−1 ⎪ = ( )i + ( )i ⎪ X RiBZF FF A RiBZ FF ⎪ ⎪ i=0 i=0 ⎨ t−1 t−1  −1  −1 + L R (FF) F(FF)i + A L R (FF) (FF)i , (7.34) ⎪ i i ⎪ i=0 i=0 ⎪ t ⎪  − ⎪ = ( ) + ( ) 1 ( )i ⎩ Y ZfAA FF HiR FF FF i=0 where Z ∈ Cr×p is an arbitrarily chosen free parameter matrix.

Theorem 7.6 Given matrices A ∈ Cn×n,B∈ Cn×r,R∈ Cn×p, and F ∈ Cp×p, sup- pose that λ(AA) ∩ λ(FF) = ∅. Let polynomial matrix

t i r×n H(s) = His ∈ C [s] i=0 and polynomial t i (s) = is ∈ C [s] i=0 satisfy (7.28) and (γ ) = 0, for any γ ∈ λ(FF).

Then all the solutions to the nonhomogeneous con-Sylvester matrix equation (7.21) can be characterized by ⎧ ⎪ X = Ctr(AA, B)S (f (s)I ) Obs (FF, ZF) + ACtr(AA, B)S (f (s)I ) Obs (FF, Z) ⎪  n AA r n  n AA  r n ⎪ + ( , ) ( ( )) − ( , ) (( ) ) ( , ( ) −1 ) ⎨⎪ Ctrt AA B St H s Ctrt AA In St s In Obst FF R FF F     + ( , ) ( ( )) − ( , ) (( ) ) ( , ( ) −1) , ⎪ A Ctrt AA B St H s Ctrt AA In St s In Obst FF R FF ⎪ t ⎪  − ⎪ = ( ) + ( ) 1 ( )i ⎩ Y ZfAA FF HiR FF FF i=0 (7.35) where the arbitrarily chosen parameter matrix Z represents the degrees of freedom in the solution.

In Theorems 7.5 and 7.6, the coefficient matrices $F$, $R$ together with the free parameter matrix $Z$ explicitly appear in the solution. This allows the matrices $F$ and $R$ to be undetermined. These properties may bring convenience and advantages to the further application of the matrix equation (7.21). If Theorem 7.6 is adopted to solve the matrix equation (7.21), only the polynomial matrix $H(s)$, the polynomial $\varphi(s)$, and the coefficients $\alpha_i$, $i \in$ I$[0, n]$, of $f_{A\overline{A}}(s)$ are required. To avoid the operation on polynomial matrices, one can find $H(s)$ and $\varphi(s)$ by solving the equation (7.28). Denote
$$H[t] = \begin{bmatrix} H_0^{\mathrm{T}} & H_1^{\mathrm{T}} & \cdots & H_t^{\mathrm{T}} \end{bmatrix}^{\mathrm{T}}, \qquad \varphi[t] = \begin{bmatrix} \varphi_0 I_n & \varphi_1 I_n & \cdots & \varphi_t I_n \end{bmatrix}^{\mathrm{T}}. \qquad (7.36)$$

The equation (7.28) can then be written as
$$\begin{bmatrix} \mathrm{Ctr}_{t+1}(A\overline{A}, B) & -\mathrm{Ctr}_{t+1}(A\overline{A}, I_n) \end{bmatrix}\begin{bmatrix} H[t] \\ \varphi[t] \end{bmatrix} = 0.$$

Obviously, the unknown $\varphi[t]$ is structured. Thus, it is a bit difficult to obtain $H_i$, $\varphi_i$, $i \in$ I$[0, t]$, by solving the equation (7.28). However, if the matrix pair $(A\overline{A}, B)$ is controllable, that is,
$$\operatorname{rank}\begin{bmatrix} sI - A\overline{A} & B \end{bmatrix} = n, \quad \text{for any } s \in \mathbb{C},$$
then the polynomial $\varphi(s)$ can be chosen to be $1$, and the approach of Theorem 7.6 can be readily employed to solve the matrix equation (7.21). In this case, there holds
$$\left(\varphi(\overline{F}F)\right)^{-1} = I, \qquad S_t(\varphi(s)I_n) = 0,$$
and the matrix equation (7.28) becomes $\sum_{i=0}^{t} (A\overline{A})^i B H_i = I$, which can be rewritten as
$$\mathrm{Ctr}_{t+1}(A\overline{A}, B)H[t] = I. \qquad (7.37)$$

When the number t is sufficiently large, there holds     rank Ctrt+1(AA, B) = rank Ctrt+1(AA, B) I .

Under this condition, the matrices $H_i$, $i \in$ I$[0, t]$, can be obtained from (7.37) by using some numerically reliable approaches, such as the QR decomposition approach and the singular value decomposition approach. For clarity, this fact is summarized as the following corollary.

Corollary 7.4 Given matrices $A \in \mathbb{C}^{n\times n}$, $B \in \mathbb{C}^{n\times r}$, $R \in \mathbb{C}^{n\times p}$, and $F \in \mathbb{C}^{p\times p}$, suppose that $\lambda(A\overline{A}) \cap \lambda(\overline{F}F) = \emptyset$ and $(A\overline{A}, B)$ is controllable. Let the polynomial matrix $H(s) = \sum_{i=0}^{t} H_i s^i \in \mathbb{C}^{r\times n}[s]$ satisfy
$$\sum_{i=0}^{t} (A\overline{A})^i B H_i = I.$$

Then all the solutions to the nonhomogeneous con-Sylvester matrix equation (7.21) can be characterized by ⎧ ⎪ X = Ctr(AA, B)S (f (s)I ) Obs (FF, ZF) ⎪ n AA r n ⎪ + ( , ) ( ( ) ) ( , ) ⎪ ACtr AA B Sn fAA s Ir Obsn FF Z ⎨⎪ + Ctrt(AA, B)St(H(s)) Obst(FF, RF) , ⎪ +ACtr (AA, B)S (H(s)) Obs (FF, R) ⎪ t t t ⎪ t ⎪ = ( ) + ( )i ⎩⎪ Y ZfAA FF HiR FF i=0 where the arbitrarily chosen parameter matrix Z represents the degrees of freedom in the solution.
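As a small numerical illustration of this remark, the sketch below forms the block matrix $\mathrm{Ctr}_{t+1}(A\overline{A}, B)$ and solves (7.37) in the least-squares sense. The data are random and merely hypothetical, and it is assumed here that $\mathrm{Ctr}_{t+1}(A\overline{A}, B) = [B \;\; A\overline{A}B \;\; \cdots \;\; (A\overline{A})^t B]$ and that $H[t]$ stacks $H_0, \ldots, H_t$ as in (7.36); this is a sketch, not a definitive implementation.

```python
import numpy as np

def ctr(M, B, k):
    """Controllability-type matrix [B, M B, ..., M^{k-1} B]."""
    blocks, P = [], B.copy()
    for _ in range(k):
        blocks.append(P)
        P = M @ P
    return np.hstack(blocks)

rng = np.random.default_rng(1)
n, r, t = 4, 2, 4                      # hypothetical sizes
A = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
B = rng.standard_normal((n, r)) + 1j * rng.standard_normal((n, r))
M = A @ np.conj(A)                     # the matrix A * conj(A)

Ct = ctr(M, B, t + 1)                                            # Ctr_{t+1}(A conj(A), B)
H_stack, *_ = np.linalg.lstsq(Ct, np.eye(n, dtype=complex), rcond=None)   # least-squares solution of (7.37)
H = [H_stack[i * r:(i + 1) * r, :] for i in range(t + 1)]        # H_0, ..., H_t (each r x n)

# residual of sum_i (A conj(A))^i B H_i - I; small when the rank condition is met
res = sum(np.linalg.matrix_power(M, i) @ B @ H[i] for i in range(t + 1)) - np.eye(n)
print(np.linalg.norm(res))
```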

× × × × Corollary 7.5 Given matrices A ∈Cn n,B ∈ Cn r,R∈ Cn p, and F ∈ Cp p, sup- λ( ) ∩ λ( ) = ∅ , pose that AA FF and AA B is controllable. Let polynomial matrices ( ) = t−1 i ∈ Cn×n ( ) = t i ∈ Cr×n L s i=0 Lis [s] and H s i=0 His [s] satisfy

(AA − sI)L(s) + BH(s) = In.

Further, let ( ) = ( − ), fAA s det sI AA

n−1 i adj(sI − AA) = Ris . i=0

Then all the solutions to the nonhomogeneous con-Sylvester matrix equation (7.21) can be given by ⎧ ⎪ n−1 n−1 ⎪ = ( )i + ( )i ⎪ X RiBZF FF A RiBZ FF ⎪ ⎪ i=0 i=0 ⎨ t−1 t−1 + ( )i + ( )i , ⎪ LiRF FF A LiR FF ⎪ = = ⎪ i 0 i 0 ⎪ t ⎪ = ( ) + ( )i ⎩ Y ZfAA FF HiR FF i=0 where Z ∈ Cr×p is an arbitrarily chosen free parameter matrix. 7.2 Nonhomogeneous Con-Sylvester Matrix Equations 293

7.2.2 The Second Approach

In this subsection, another approach is presented for solving the nonhomogeneous con-Sylvester matrix equation (7.21). First, the following lemma is needed. This result is obvious, and the proof is thus omitted.

Lemma 7.6 Given matrices $A \in \mathbb{C}^{n\times n}$, $B \in \mathbb{C}^{n\times r}$, $F \in \mathbb{C}^{p\times p}$, and $R \in \mathbb{C}^{n\times p}$, if $X = X_0$ is a solution of the following normal con-Sylvester matrix equation
$$XF - A\overline{X} = -R, \qquad (7.38)$$
then $(X, Y) = (X_0, 0)$ is a solution to the con-Sylvester matrix equation (7.21).

For the solution to the normal con-Sylvester matrix equation (7.38), one has the following result.

Lemma 7.7 Given matrices $A \in \mathbb{C}^{n\times n}$, $F \in \mathbb{C}^{p\times p}$, and $R \in \mathbb{C}^{n\times p}$ with $\lambda(A\overline{A}) \cap \lambda(\overline{F}F) = \emptyset$, let $f_{A\overline{A}}(s) = \det(sI - A\overline{A})$, and

n−1 adj(sI − AA) = Rn−1s +···+R1s + R0. (7.39)

Then, the normal con-Sylvester matrix equation (7.38) has a unique solution   n−1    n−1      =− i + i ( ) −1 . X Ri RF FF Ri AR FF fAA FF i=0 i=0

The result in Lemma 7.7 has been given in Subsection 6.1.3. By Proposition 7.1, Lemmas 7.6 and 7.7, and Theorem 7.1, one has the following theorem on the general solution to the con-Sylvester matrix equation (7.21). Theorem 7.7 Given matrices A ∈ Cn×n,B∈ Cn×r,R∈ Cn×p, and F ∈ Cp×p, sup- pose that λ(AA) ∩ λ(FF) = ∅, and (7.39) hold. Then all the solutions to the nonho- mogeneous con-Sylvester matrix equation (7.21) can be characterized by ⎧ ⎪ n−1 n−1 ⎪ = ( )i + ( )i ⎪ X RiBZF FF A RiBZ FF ⎨⎪ i=0 i=0  n−1    n−1      , (7.40) ⎪ i i −1 ⎪ − R RF FF + R AR FF f (FF) ⎪ i i AA ⎩⎪ i=0 i=0 = ( ) Y ZfAA FF where Z ∈ Cr×p is an arbitrarily chosen free parameter matrix. Different from Theorems 7.4 and 7.5, one does not need to solve the polynomial matrices H(s) and L(s), and polynomial (s) when applying Theorem 7.7 to obtain general solutions to the con-Sylvester matrix equation (7.21). 294 7 Polynomial-Matrix-Based Approaches
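For readers who want an independent cross-check of the closed-form expressions in Lemma 7.7 and Theorem 7.7, the normal con-Sylvester matrix equation (7.38) can also be solved by brute force: writing $X = U + \mathrm{i}V$ turns it into a real linear system in the entries of $U$ and $V$. The sketch below is not the approach of this section; it merely assumes that (7.38) reads $XF - A\overline{X} = -R$ and that the randomly generated data satisfy the spectrum condition, and then solves the resulting real system with a dense solver.

```python
import numpy as np

def solve_normal_con_sylvester(A, F, R):
    """Solve X F - A conj(X) = -R for X by splitting into real and imaginary parts."""
    n, p = R.shape
    I_n, I_p = np.eye(n), np.eye(p)
    M1 = np.kron(F.T, I_n)          # vec(X F)       = M1 vec(X)
    M2 = np.kron(I_p, A)            # vec(A conj(X)) = M2 conj(vec(X))
    # With vec(X) = u + i v, the equation M1 x - M2 conj(x) = -vec(R) becomes a real system:
    top = np.hstack([M1.real - M2.real, -(M1.imag + M2.imag)])
    bot = np.hstack([M1.imag - M2.imag,   M1.real + M2.real])
    lhs = np.vstack([top, bot])
    rhs = np.concatenate([-R.real.flatten('F'), -R.imag.flatten('F')])
    uv = np.linalg.solve(lhs, rhs)
    x = uv[:n * p] + 1j * uv[n * p:]
    return x.reshape((n, p), order='F')

rng = np.random.default_rng(2)
n, p = 4, 3
A = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
F = rng.standard_normal((p, p)) + 1j * rng.standard_normal((p, p))
R = rng.standard_normal((n, p)) + 1j * rng.standard_normal((n, p))

X = solve_normal_con_sylvester(A, F, R)
print(np.linalg.norm(X @ F - A @ np.conj(X) + R))   # should be close to zero
```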

Similarly to the derivation of Corollary 7.1, one can obtain an equivalent form of the solution to the equation (7.21).

Theorem 7.8 Given matrices A ∈ Cn×n,B∈ Cn×r,R∈ Cn×p, and F ∈ Cp×p, sup- pose that λ(AA) ∩ λ(FF) = ∅, and (7.39) hold. Then all the solutions to the nonho- mogeneous con-Sylvester matrix equation (7.21) can be characterized by ⎧ n−1   ⎪  −1 ⎪ X = R BZ − R f (FF) F(FF)i ⎪ i AA ⎨ i=0 n−1    − , ⎪ 1 i ⎪ +A Ri BZ − R f (FF) (FF) ⎪ AA ⎩⎪ i=0 = ( ) Y ZfAA FF where Z ∈ Cr×p is an arbitrarily chosen free parameter matrix.

7.3 Con-Yakubovich Matrix Equations

In Sect. 6.4, the con-Yakubovich matrix equations have been investigated. In this section, the following nonhomogeneous version of the con-Yakubovich matrix equation is considered:
$$X - A\overline{X}F = BY + R, \qquad (7.41)$$
where $A \in \mathbb{C}^{n\times n}$, $B \in \mathbb{C}^{n\times r}$, $R \in \mathbb{C}^{n\times p}$, and $F \in \mathbb{C}^{p\times p}$ are known matrices, and $X \in \mathbb{C}^{n\times p}$ and $Y \in \mathbb{C}^{r\times p}$ are the matrices to be determined. Different from the approach in Sect. 6.4, the solutions of this equation are derived in the current section outside the framework of real representations of complex matrices. With the aid of the Smith normal form reduction, a general complete parametric solution is proposed in terms of the original matrices. This solution is represented in a finite series form, and can offer all the degrees of freedom, which are represented by a free parameter matrix $Z$. In addition, the matrices $F$ and $R$, together with $Z$, appear explicitly in this solution. This property allows the matrices $F$ and $R$ to be unknown a priori. The degrees of freedom existing in the solution of the con-Yakubovich matrix equation (7.41) are first investigated. The following result is needed.

Lemma 7.8 Given matrices $A \in \mathbb{C}^{n\times n}$ and $F \in \mathbb{C}^{p\times p}$, the con-Kalman-Yakubovich matrix equation
$$X - A\overline{X}F = C$$
has a unique solution for any $C \in \mathbb{C}^{n\times p}$ if and only if $\gamma\mu \neq 1$ for any $\gamma \in \lambda(A\overline{A})$, $\mu \in \lambda(\overline{F}F)$.

The result of Lemma 7.8 is exactly that of Theorem 6.10.

Based on this lemma, it is known that the con-Yakubovich matrix equation (7.41) has a unique solution $X$ with respect to an arbitrary fixed matrix $Y$ if $\gamma\mu \neq 1$ for any $\gamma \in \lambda(A\overline{A})$, $\mu \in \lambda(\overline{F}F)$. Therefore, under this condition, the degrees of freedom in the solution $(X, Y)$ to the matrix equation (7.41) are equal to the number of elements in the matrix $Y$, that is, $rp$. This fact is expressed in the following lemma.

Lemma 7.9 Given matrices $A \in \mathbb{C}^{n\times n}$, $B \in \mathbb{C}^{n\times r}$, $F \in \mathbb{C}^{p\times p}$, and $R \in \mathbb{C}^{n\times p}$, if $\gamma\mu \neq 1$ for any $\gamma \in \lambda(A\overline{A})$, $\mu \in \lambda(\overline{F}F)$, then the degrees of freedom existing in the solution $(X, Y)$ to the con-Yakubovich matrix equation (7.41) are $rp$.
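The condition $\gamma\mu \neq 1$ in Lemmas 7.8 and 7.9 is straightforward to test numerically. The following sketch (hypothetical random data; the function name is chosen only for this illustration) computes the spectra of $A\overline{A}$ and $\overline{F}F$ and checks all products; since $\lambda(A\overline{A}) = \lambda(\overline{A}A)$ and $\lambda(\overline{F}F) = \lambda(F\overline{F})$, the test does not depend on the order in which the factors are taken.

```python
import numpy as np

def con_yakubovich_unique(A, F, tol=1e-10):
    """Check gamma*mu != 1 for all gamma in lambda(A conj(A)), mu in lambda(conj(F) F)."""
    gammas = np.linalg.eigvals(A @ np.conj(A))
    mus = np.linalg.eigvals(np.conj(F) @ F)
    products = np.outer(gammas, mus)
    return bool(np.all(np.abs(products - 1.0) > tol))

rng = np.random.default_rng(3)
A = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
F = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
print(con_yakubovich_unique(A, F))   # True for generic data
```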

7.3.1 The First Approach

7.3.1.1 The Basic Solution

In this subsection, an explicit closed-form solution is provided for the nonhomogeneous con-Yakubovich matrix equation (7.41). To obtain the result, the following lemma, established with the aid of the Smith normal form reduction, is needed.

Lemma 7.10 Given matrices $A \in \mathbb{C}^{n\times n}$, $B \in \mathbb{C}^{n\times r}$, and $F \in \mathbb{C}^{p\times p}$, suppose that $\gamma\mu \neq 1$ for any $\gamma \in \lambda(A\overline{A})$, $\mu \in \lambda(\overline{F}F)$. Then there exist polynomial matrices $L(s) \in \mathbb{C}^{n\times n}[s]$ and $H(s) \in \mathbb{C}^{r\times n}[s]$, and a polynomial $\varphi(s) \in \mathbb{C}[s]$, such that
$$(I - sA\overline{A})L(s) - BH(s) = \varphi(s)I_n, \qquad (7.42)$$
with $\varphi(\gamma) \neq 0$ for any $\gamma \in \lambda(\overline{F}F)$.

Proof Let $P(s) \in \mathbb{C}^{n\times n}[s]$ and $Q(s) \in \mathbb{C}^{(n+r)\times(n+r)}[s]$ be two unimodular polynomial matrices that transform $[I - sA\overline{A} \;\; -B]$ into its Smith normal form. That is,
$$P(s)\,[I - sA\overline{A} \;\; -B]\,Q(s) = \left[\operatorname{diag}_{i=1}^{n} d_i(s) \;\; 0\right] \qquad (7.43)$$
with $d_{i-1}(s)\,|\,d_i(s) \in \mathbb{C}[s]$, $i \in$ I$[2, n]$. According to the condition of this lemma, it is easily known that $\{s \mid \det(I - sA\overline{A}) = 0\} \cap \lambda(\overline{F}F) = \emptyset$. Thus, $d_i(\gamma) \neq 0$, $i \in$ I$[1, n]$, for any $\gamma \in \lambda(\overline{F}F)$. Partition $Q(s)$ into
$$Q(s) = [Q_1(s) \;\; Q_2(s)]$$
with $Q_1(s) \in \mathbb{C}^{(n+r)\times n}[s]$. It follows from (7.43) that
$$P(s)\,[I - sA\overline{A} \;\; -B]\,Q_1(s) = \operatorname{diag}_{i=1}^{n} d_i(s)$$
$$\Longleftrightarrow\; P(s)\,[I - sA\overline{A} \;\; -B]\,Q_1(s)\operatorname{diag}_{i=1}^{n}\frac{d_n(s)}{d_i(s)} = I_n d_n(s)$$
$$\Longleftrightarrow\; [I - sA\overline{A} \;\; -B]\,Q_1(s)\operatorname{diag}_{i=1}^{n}\left(\frac{d_n(s)}{d_i(s)}\right)P(s) = I_n d_n(s).$$
Thus, one can choose $\varphi(s) = d_n(s)$ and
$$\begin{bmatrix} L(s) \\ H(s) \end{bmatrix} = Q_1(s)\operatorname{diag}_{i=1}^{n}\left(\frac{d_n(s)}{d_i(s)}\right)P(s)$$
to satisfy the relation (7.42). $\square$

With this lemma, the following theorem can be provided to obtain a closed-form solution for the nonhomogeneous con-Yakubovich matrix equation (7.41).

× × × × Theorem 7.9 Given matrices A ∈ Cn n,B∈ Cn r,R ∈ Cn p, and F ∈ Cp p, sup- γμ= γ ∈ λ μ ∈ λ pose that 1 holds for any AA , FF , and polynomial matri- ( ) = t−1 i ∈ Cn×n ( ) = t i ∈ Cr×n ces L s i=0 Lis [s] and H s i=0 His [s], and polynomial ( ) = t  i ∈ C (γ ) = γ ∈ λ( ) s i=0 is [s] satisfy (7.42) with 0 for any FF . Further, let ( ) = ( − ), gAA s det I sAA

n−1 i adj(I − sAA) = Ris . i=0

Then all the solutions to the nonhomogeneous con-Yakubovich matrix equation (7.41) can be characterized by ⎧ ⎪ n−1 n−1 ⎪ = ( )i + ( )i ⎪ X RiBZ FF A RiBZF FF ⎪ ⎪ i=0 i=0 ⎨ t−1 t−1  −1  −1 + L R (FF) (FF)i + A L R (FF) F(FF)i , (7.44) ⎪ i i ⎪ i=0 i=0 ⎪ t ⎪  − ⎪ = ( ) + ( ) 1 ( )i ⎩ Y ZgAA FF HiR FF FF i=0 where Z ∈ Cr×p is an arbitrarily chosen free parameter matrix.

Proof Let us first show that the matrices X and Y given in (7.44) are solutions of the n i matrix equation (7.41). Let g (s) = αis . A direct calculation gives AA i=0

X − AXF 7.3 Con-Yakubovich Matrix Equations 297

n−1 n−1 i i = RiBZ(FF) + A RiBZF(FF) i=0 i=0 t−1 t−1  −  − 1 i 1 i + LiR (FF) (FF) + A LiR (FF) F(FF) i=0 i=0 n−1 n−1 i i −A RiBZ(FF) F − AA RiBZF(FF) F i=0 i=0 t−1 t−1  −  − 1 i 1 i −A LiR (FF) (FF) F − AA LiR (FF) F(FF) F i=0 i=0 n−1 n−1 i i+1 = RiBZ(FF) − AA RiBZ(FF) i=0 i=0 t−1 t−1  −  − 1 i 1 i+1 + LiR (FF) (FF) − AA LiR (FF) (FF) i=0 i=0 n−1 i n = R0BZ + (Ri − AARi−1)BZ(FF) − AARn−1BZ(FF) (7.45) i=1 t−1  −  − 1 1 i +L0R (FF) + (Li − AALi−1)R (FF) (FF) i=1  −1  t −AALt−1R (FF) FF .

By the following relation
$$(I - sA\overline{A})\operatorname{adj}(I - sA\overline{A}) = I\det(I - sA\overline{A}),$$
it is easily derived that
$$\begin{cases} R_0 = \alpha_0 I, \\ R_i - A\overline{A}R_{i-1} = \alpha_i I, & i \in \mathrm{I}[1, n-1], \\ -A\overline{A}R_{n-1} = \alpha_n I. \end{cases} \qquad (7.46)$$

Thus one has

n−1 i n R0BZ + (Ri − AARi−1)BZ(FF) − AARn−1BZ(FF) i=1 n = α ( )i = ( ) BZ i FF BZgAA FF . (7.47) i=1

In addition, comparing the powers of s in the expression (7.42) yields 298 7 Polynomial-Matrix-Based Approaches ⎧ ⎨ L0 = BH0 + 0I, − = +  , ∈ I[ , − ], ⎩ Li AALi−1 BHi iI i 1 t 1 (7.48) −AALt−1 = BHt + tI.

With these relations, one easily has

t−1  −  − 1 1 i L0R (FF) + (Li − AALi−1)R (FF) (FF) i=1  −1  t −AALt−1R (FF) FF t−1  −  − 1 1 i = (BH0 + 0I) R (FF) + (BHi + iI) R (FF) (FF) i=1  −1  t + (BHt + tI) R (FF) FF t t  −  − 1 i 1 i = BHiR (FF) (FF) + iR (FF) (FF) (7.49) i=0 i=1 t t  −  − 1 i 1 i = BHiR (FF) (FF) + R (FF) i(FF) i=0 i=1 t  −  − 1 i 1 = BHiR (FF) (FF) + R (FF) (FF) i=0 t  − 1 i = BHiR (FF) (FF) + R. i=0

Substituting (7.47) and (7.49)into(7.45), yields

X − AXF t  − = ( ) + ( ) 1 ( )i + BZgAA FF BHiR FF FF R  i=0  t  − = ( ) + ( ) 1 ( )i + B ZgAA FF HiR FF FF R i=0 = BY + R.

Therefore, the matrices $X$ and $Y$ given in (7.44) satisfy the matrix equation (7.41). Secondly, it needs to be shown that the solution (7.44) is complete. It follows from Lemma 7.9 that there are $rp$ degrees of freedom in the solution of the con-Yakubovich matrix equation (7.41), while the solution given by (7.44) has exactly $rp$ parameters, represented by the elements of the free matrix $Z$. Therefore, in the following it suffices to show that all the parameters in the matrix $Z$ contribute to the solution. To do this, it is sufficient to show that the mapping $Z \to (X, Y)$ defined by

(7.44) is injective. This is true since $g_{A\overline{A}}(\overline{F}F)$ is nonsingular under the condition that $\gamma\mu \neq 1$ for any $\gamma \in \lambda(A\overline{A})$, $\mu \in \lambda(\overline{F}F)$. The proof is thus completed. $\square$

In order to obtain a solution of the nonhomogeneous con-Yakubovich matrix equation (7.41) by Theorem 7.9, one needs to compute the polynomial matrices $L(s)$, $H(s)$ and the polynomial $\varphi(s)$ satisfying the relation (7.42), the coefficients $\alpha_i$, $i \in$ I$[1, n]$, of the polynomial $g_{A\overline{A}}(s)$, and the coefficient matrices $R_i$, $i \in$ I$[0, n-1]$, of $\operatorname{adj}(I - sA\overline{A})$. Regarding the solution of $\alpha_i$, $i \in$ I$[0, n]$, and $R_i$, $i \in$ I$[0, n-1]$, the following so-called generalized Leverrier algorithm in Sect. 2.3 can be adopted:
$$\begin{cases} R_i = A\overline{A}R_{i-1} + \alpha_i I, \\[2pt] \alpha_i = -\operatorname{tr}\!\left(A\overline{A}R_{i-1}\right)/i, \end{cases} \quad i \in \mathrm{I}[1, n], \qquad (7.50)$$
with the initial condition $R_0 = I$. It follows from the proof of Lemma 7.10 that the polynomial matrices $L(s)$, $H(s)$, and the polynomial $\varphi(s)$ can be easily obtained based on the Smith normal form reduction. For relatively lower-order cases, the Smith normal form reduction can be implemented by using elementary polynomial matrix transformations. For high-order cases, the Maple function “smith” can be readily employed.

By setting $R$ in the con-Yakubovich matrix equation (7.41) to be zero, from Theorem 7.9 one can easily obtain the general solution of the homogeneous con-Yakubovich matrix equation

X − AXF = BY, (7.51) where A ∈ Cn×n, B ∈ Cn×r, and F ∈ Cp×p are known matrices, and X ∈ Cn×p and Y ∈ Cr×p are the matrices to be determined. × × × Corollary 7.6 Given  A ∈ Cn n,B∈ Cn r, and F ∈ Cp p,itisassumedthatγμ= 1, for any γ ∈ λ AA , μ ∈ λ FF . Then, all the solutions to the con-Yakubovich matrix equation (7.51) can be explicitly given by ⎧ − − ⎨⎪ n 1 n 1 X = R BZ(FF)i + A R BZF(FF)i i i , (7.52) ⎩⎪ i=0 i=0 = ( ) Y ZgAA FF where Z ∈ Cr×p is a free parameter matrix. From (7.46) one can obtain that   = i α i−j . Ri j=0 j AA

It follows from Theorem 2.8 that αj, j ∈ I[0, n], are all real numbers. With this fact it is easily checked that ARi = RiA. Thus the expression (7.52) can be equivalently rewritten as 300 7 Polynomial-Matrix-Based Approaches ⎧ − − ⎨⎪ n 1 n 1 = ( )i + ( )i X RiBZ FF RiABZF FF , ⎩⎪ i=0 i=0 = ( ) Y ZgAA FF which is exactly the solution given in Theorem 6.22. However, it is noted that in Sect. 6.4, in order to obtain this solution one needs to use some properties of the real representation of a complex matrix and a relation on the characteristic polynomial of the real representation. Different from the approach in Sect. 6.4, in the current section the solution of the homogeneous con-Yakubovich matrix equation (7.51)is derived in a direct way. At the end of this subsubsection, it is pointed out that from Theorem 7.9 a special solution of the con-Yakubovich matrix equation (7.41) can be obtained as ⎧ t−1 t−1 ⎪  −  − ⎪ 1 i 1 i ⎨⎪ X = LiR (FF) (FF) + A LiR (FF) F(FF) i=0 i=0 . t ⎪  − ⎪ 1 i ⎩⎪ Y = HiR (FF) (FF) i=0
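A minimal numerical sketch of the generalized Leverrier iteration (7.50) is given below (random, purely illustrative data; the helper name is an assumption made only for this illustration). It returns the coefficients $\alpha_i$ of $g_{A\overline{A}}(s) = \det(I - sA\overline{A})$ and the coefficient matrices $R_i$ of $\operatorname{adj}(I - sA\overline{A})$, and verifies the defining identity at a random point. Note that at each step the scalar $\alpha_i$ is evaluated first, since it enters the update of $R_i$.

```python
import numpy as np

def generalized_leverrier(M):
    """Iteration (7.50): g(s) = det(I - s M) = sum_i alpha_i s^i, adj(I - s M) = sum_i R_i s^i."""
    n = M.shape[0]
    R = [np.eye(n, dtype=complex)]          # R_0 = I
    alpha = [1.0 + 0j]                      # alpha_0 = 1
    for i in range(1, n + 1):
        MR = M @ R[-1]
        a_i = -np.trace(MR) / i
        alpha.append(a_i)
        R.append(MR + a_i * np.eye(n))      # the last iterate R_n should be (numerically) zero
    return alpha, R[:-1]                    # keep R_0, ..., R_{n-1}

rng = np.random.default_rng(4)
n = 5
A = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
M = A @ np.conj(A)

alpha, R = generalized_leverrier(M)

# check (I - sM) * adj(I - sM) = det(I - sM) * I at a random point s
s = 0.37 + 0.21j
adj_s = sum(R[i] * s**i for i in range(n))
g_s = sum(alpha[i] * s**i for i in range(n + 1))
print(np.linalg.norm((np.eye(n) - s * M) @ adj_s - g_s * np.eye(n)))   # ~ 0
print(abs(g_s - np.linalg.det(np.eye(n) - s * M)))                     # ~ 0
```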

7.3.1.2 Equivalent Forms

In this subsubsection, several equivalent forms of the presented solution (7.44) are provided. To obtain the result, the following lemma is needed. The idea of the proof of this lemma is motivated by [302].

Lemma 7.11 Let Li, i ∈ I[0, t − 1], and Hi, i,i∈ I[0, t], be the coefficients of L(s),H(s), and (s), respectively. Then they satisfy (7.42) if and only if Hi, i, i ∈ I[0, t], satisfy

t t  t−i  t−i AA BHi =− i AA , (7.53) i=0 i=0 and Li, i ∈ I[0, t − 1], satisfy   L0 L1 ··· Lt−1 = Ctrt(AA, B)Tt(H(s)) + Ctrt(AA, In)Tt((s)In). (7.54)

Proof From the proof of Theorem 7.9 it is known that Li, i ∈ I[0, t − 1], and Hi, i, i ∈ I[0, t], satisfy (7.42) if and only if they satisfy (7.48). So it suffices to show that the relation (7.48) is equivalent to the relations (7.53) and (7.54). From the iterative relation (7.48), a direct calculation gives 7.3 Con-Yakubovich Matrix Equations 301 ⎧ ⎪ L0 = BH0 + 0I, ⎪ ⎨⎪ L1 = BH1 + AABH0 + 1I + 0AAI, ···   (7.55) ⎪ t−1 ⎪ Lt−1 = BHt−1 + AABHt−2 +···+ AA BH0 ⎩⎪  t−1 +t−1I + t−2AA +···+0 AA , and

 t 0 = BHt + AABHt−1 +···+ AA BH0 (7.56)  t +tI + t−1AA +···+0 AA .

The relation (7.56) is exactly (7.53). Rewriting (7.55) in a matrix form gives (7.54). So it is shown that (7.48) implies (7.53) and (7.54). On the other hand, it follows from (7.54) that

 2  t AALt−1 = AABHt−1 + AA BHt−2 +···+ AA BH0  2  t +t−1AA + t−2 AA +···+0 AA .

Combining this relation with (7.53), gives the last expression of (7.48). For i ∈ I[1, t − 1], from (7.54) one has

+ +  AAL⎛i−1 BHi iI ⎞ i−1      ⎝ i−j−1 i−j−1 ⎠ = AA AA BHj + j AA + BHi + iI j=0 i−1    i−j  i−j = AA BHj + j AA + BHi + iI j=0 i    (i+1)−j−1  (i+1)−j−1 = AA BHj + j AA j=0

= Li.

This is the second expression of (7.48). The first expression of (7.48) is obvious. Thus the relations (7.53) and (7.54) imply the relation (7.48). The proof is thus completed.  The main result of this subsubsection is stated as the following theorem. × × × × Theorem 7.10 Given matrices A∈ Cn n,B∈Cn r,R∈ Cn p, and F ∈ Cp p, sup- γμ= γ ∈ λ μ ∈ λ ( ) = pose that 1, for any AA , FF . Let polynomial matrix H s t i ∈ Cr×n ( ) = t  i ∈ C i=0 His [s] and polynomial s i=0 is [s] satisfy (7.53) and (γ ) = 0, for any γ ∈ λ(FF). Then, all the solutions to the con-Yakubovich matrix equation (7.41) can be characterized by 302 7 Polynomial-Matrix-Based Approaches ⎧ ⎪ X = Ctr(AA, B)Tn(g (s)Ir) Obsn(FF, Z) ⎪ AA ⎪ +ACtr(AA, B)T (g (s)I ) Obs (FF, ZF) ⎪  n AA r n      ⎨ + ( , ) ( ( )) + ( , ) (( ) ) , ( ) −1 Ctrt AA B Tt H s Ctrt AA In Tt s In Obst FF R FF , ⎪   ⎪    −1 ⎪ +A Ctr (AA, B)T (H(s)) + Ctr (AA, I )T ((s)I ) Obs FF, R (FF) F ⎪ t t t n t n t ⎪ t  − ⎩ 1 i Y = Zg (FF) + HiR (FF) (FF) AA i=0 (7.57) where Z is an arbitrarily chosen parameter matrix.

Proof For the expressions in (7.44), by the proof of Theorem 6.23 it is easily derived that n−1 ( )i = ( , ) ( ( ) ) ( , ), RiBZ FF Ctr AA B Tn gAA s Ir Obsn FF Z (7.58) i=0 and

n−1 i RiBZF(FF) i=0 n−1 i = RiBZF(FF) (7.59) i=0 = ( , ) ( ( ) ) ( , ) Ctr AA B Tn gAA s Ir Obsn FF ZF .

It follows from Lemma 7.11 that the relation (7.42) holds if and only if the relations (7.53) and (7.54) hold. Thus, for the expressions in (7.44) one has

t−1  − 1 i LiR (FF) (FF) i=0    −1 = L L ··· L − Obs (FF, R (FF) ) (7.60) 0 1 t 1 t      −1 = Ctrt(AA, B)Tt(H(s)) + Ctrt(AA, In)Tt((s)In) Obst FF, R (FF) .

Similarly, one also has

t−1   −1 i LiR (FF) F(FF) i=0 t−1  − 1 i = LiR (FF) F(FF) (7.61) i=0      −1 = Ctrt(AA, B)Tt(H(s)) + Ctrt(AA, In)Tt((s)In) Obst FF, R (FF) F . 7.3 Con-Yakubovich Matrix Equations 303

Substituting (7.58), (7.59), (7.60), and (7.61) into the expression (7.44), gives (7.57). The proof is thus completed. 

In the above theorem, the solution to the con-Yakubovich matrix equation (7.41) is expressed in terms of the controllability matrix associated with the matrices $A\overline{A}$ and $B$, and the observability matrix associated with $F$ and the free parameter matrix $Z$. This property may bring convenience and advantages to the further analysis and application of the matrix equation (7.41). When the approach in Theorem 7.10 is adopted to solve the matrix equation (7.41), only the polynomial matrix $H(s)$, the polynomial $\varphi(s)$ and the coefficients $\alpha_i$, $i \in$ I$[0, n]$, of $g_{A\overline{A}}(s)$ are required. To obtain $\alpha_i$, $i \in$ I$[0, n]$, some proper numerically reliable algorithms (e.g. [189]) can be applied apart from the generalized Leverrier algorithm (7.50). Instead of adopting the Smith normal form reduction, $H(s)$ and $\varphi(s)$ can be obtained by solving the equation (7.53). Thus the operation on polynomial matrices can be avoided. Denote
$$H[t] = \begin{bmatrix} H_t^{\mathrm{T}} & H_{t-1}^{\mathrm{T}} & \cdots & H_0^{\mathrm{T}} \end{bmatrix}^{\mathrm{T}}, \qquad \varphi[t] = \begin{bmatrix} \varphi_t I_n & \varphi_{t-1} I_n & \cdots & \varphi_0 I_n \end{bmatrix}^{\mathrm{T}}. \qquad (7.62)$$

The equation (7.53) can be written as     H[t] Ctrt+1(AA, B) Ctrt+1(AA, In) = 0. [t]

Obviously, this is a structured linear matrix equation, since $\varphi_i$, $i \in$ I$[0, t]$, are scalars. Thus, it is a bit difficult to obtain $H_i$, $\varphi_i$, $i \in$ I$[0, t]$, by solving the equation (7.53). In our future work, a further attempt will be made to give an efficient approach for solving such equations. However, when the matrix triple $(A\overline{A}, I, B)$ is R-controllable (for the detailed definition of R-controllability, please refer to [81]), that is,
$$\operatorname{rank}\begin{bmatrix} I - sA\overline{A} & -B \end{bmatrix} = n, \quad \text{for any } s \in \mathbb{C},$$
the approach of Theorem 7.10 can be readily employed to solve the matrix equation (7.41) without using the Smith normal form reduction. This is summarized as the following corollary.

Corollary 7.7 Given matrices $A \in \mathbb{C}^{n\times n}$, $B \in \mathbb{C}^{n\times r}$, $R \in \mathbb{C}^{n\times p}$, and $F \in \mathbb{C}^{p\times p}$, suppose that $\gamma\mu \neq 1$ holds for any $\gamma \in \lambda(A\overline{A})$, $\mu \in \lambda(\overline{F}F)$, and that $(A\overline{A}, I, B)$ is R-controllable. Let the polynomial matrix $H(s) = \sum_{i=0}^{t} H_i s^i \in \mathbb{C}^{r\times n}[s]$ satisfy
$$\sum_{i=0}^{t} (A\overline{A})^{t-i} B H_i = -(A\overline{A})^t. \qquad (7.63)$$

Then all the solutions to the con-Yakubovich matrix equation (7.41) can be charac- terized by ⎧ ⎪ X = Ctr(AA, B)Tn(g (s)Ir) Obsn(FF, Z)  ⎪ AA ⎪ +A Ctr(AA, B)T (g (s)I ) Obs (FF, ZF) ⎨⎪  n AA r n  + Ctr (AA, B)T (H(s)) + Ctr (AA, I ) Obs (FF, R) , ⎪ t t t n t ⎪   ⎪ +A Ctrt(AA, B)Tt(H(s)) + Ctrt(AA, In) Obst(FF, RF) ⎪ t ⎩ i Y = Zg (FF) + HiR(FF) AA i=0 where Z is an arbitrarily chosen parameter matrix.

This result can be readily derived from Theorem 7.10 by noticing that the poly- nomial (s) can be chosen to be 1 when (AA, I, B) is R-controllable. Thus

 −1 (FF) = In, Tt((s)In) = I, and then the matrix equation (7.53) becomes the equation (7.63). Denote   = T T ··· T T , H[t] Ht Ht−1 H0 this equation can be rewritten as

 t Ctrt+1(AA, B)H[t] =− AA . (7.64)

When the number t is sufficiently large, one has       ( , ) = t . rank Ctrt+1 AA B rank Ctrt+1(AA, B) − AA

Under this condition, the matrices $H_i$, $i \in$ I$[0, t]$, can be obtained from (7.64) by using some numerically reliable approaches, such as the QR decomposition approach and the singular value decomposition approach. In addition, if the polynomial $\varphi(s)$ is chosen to be $s^t$, then $T_t(\varphi(s)I_n) = 0$. Similar to the above arguments, the following corollary can be easily obtained.

× × × × Corollary 7.8$ Given matrices A% ∈ Cn n,B∈ Cn r,C∈ Cn p, and F ∈ Cp p, sup- | ( − ) = ∩ λ( ) = ∅ ∈/ λ( ) pose that s det I sAA 0 FF , and 0 FF . Let polynomial ( ) = t i ∈ Cr×n matrix H s i=0 His [s] satisfy

t  t−i AA BHi =−I. i=0 7.3 Con-Yakubovich Matrix Equations 305

Then all the solution to the con-Yakubovich matrix equation (7.41) can be charac- terized by ⎧ ⎪ = ( , ) ( ( ) ) ( , ) ⎪ X Ctr AA B Tn gAA s Ir Obsn FF Z  ⎪ ⎪ +A Ctr(AA, B)Tn(g (s)Ir) Obsn(FF, ZF) ⎪ AA ⎨ + Ctr (AA, B)T (H(s)) Obs (FF, R(FF)−t)  t t t  , ⎪ + ( , ) ( ( )) ( , ( )−t ) ⎪ A Ctrt AA B Tt H s Obst FF R FF F ⎪ ⎪ t ⎪ = ( ) + ( )i−t ⎩ Y ZgAA FF HiR FF i=0 where Z is an arbitrarily chosen parameter matrix.

7.3.2 The Second Approach

In this subsection, an alternative approach is given for solving the con-Yakubovich matrix equation (7.41). Before proceeding, a result is needed on the solution of the con-Kalman-Yakubovich matrix equation
$$X_0 - A\overline{X}_0F = R, \qquad (7.65)$$
where $A \in \mathbb{C}^{n\times n}$, $F \in \mathbb{C}^{p\times p}$, and $R \in \mathbb{C}^{n\times p}$ are known matrices, and $X_0 \in \mathbb{C}^{n\times p}$ is the matrix to be determined. This result has been given in Sect. 6.2. For convenience, it is rewritten as the following lemma.

× × × Lemma 7.12 Given matrices  A ∈ C n n,R∈ Cn p, and F ∈ Cp p, suppose that γμ= 1, for any γ ∈ λ AA , μ ∈ λ FF , and let

n−1 i adj(I − sAA) = Ris . (7.66) i=0

Then the unique solution of the matrix equation (7.65) can be expressed as   n−1 n−1  − = ( )i + ( )i ( ) 1 X0 RiR FF RiARF FF gAA FF . i=0 i=0

Suppose that a solution to the nonhomogeneous con-Yakubovich matrix equation (7.41) can be expressed as X = T + X˜ , Y = Y˜ with (X˜ , Y˜ ) being a solution to the matrix equation (7.51). Then one has 306 7 Polynomial-Matrix-Based Approaches   T + X˜ − AT + XF˜ − BY˜ − R

= X˜ − AXF˜ + T − ATF − BY˜ − R = T − ATF − R. ˜ ˜ It follows from this relation that X = T + X, Y = Y is a solution if X0 = T is a solution to the matrix equation (7.65). Combining this fact with the solution (7.52) to the matrix equation (7.51), one can obtain the following conclusion on the solution to the con-Yakubovich matrix equation (7.41).

× × × × Theorem 7.11 Given matrices A∈ Cn n,B∈Cn r,R∈ Cn p, and F ∈ Cp p, sup- pose that γμ= 1, for any γ ∈ λ AA , μ ∈ λ FF , and (7.66) holds. Then all the solutions to the nonhomogeneous con-Yakubovich matrix equation (7.41) can be characterized by ⎧ ⎪ n−1 n−1 ⎪ = ( )i + ( )i ⎪ X RiBZ FF A RiBZF FF ⎨⎪ i=0 i=0  n−1 n−1  − , ⎪ i i 1 ⎪ + RiR(FF) + RiARF(FF) g (FF) ⎪ AA ⎩⎪ i=0 i=0 = ( ) Y ZgAA FF where Z ∈ Cr×p is an arbitrarily chosen free parameter matrix.

Different from Theorems 7.9 and 7.10, one does not need to solve the polynomial matrices H(s) and L(s), and the polynomial (s) when applying Theorem 7.11 to obtain general solutions of the con-Yakubovich matrix equation (7.41). In addition, it is easily known from Theorem 7.11 that ⎧   n−1 n−1 ⎨⎪  − = ( )i + ( )i ( ) 1 X RiR FF RiARF FF gAA FF ⎩⎪ i=0 i=0 Y = 0 is also a special solution to the con-Yakubovich matrix equation (7.41). Similar to derivation in Subsection 7.3.1, one can also obtain some equivalent forms of the solution in Theorem 7.11.

× × × × Corollary 7.9 Given matrices A ∈ Cn n,B∈ Cn r,R∈ Cn p, and F ∈ Cp p, sup- pose that γμ= 1, for any γ ∈ λ AA , μ ∈ λ FF , and (7.66) holds. Then all the solutions to the nonhomogeneous con-Yakubovich matrix equation (7.41) can be characterized by 7.3 Con-Yakubovich Matrix Equations 307 ⎧ n−1   ⎪  −1 ⎪ X = R BZ + R g (FF) (FF)i ⎪ i AA ⎨ i=0 n−1     , ⎪ −1 ⎪ +A R BZ + R g (FF) F(FF)i ⎪ i AA ⎩⎪ i=0 = ( ) Y ZgAA FF where Z ∈ Cr×p is an arbitrarily chosen free parameter matrix.

× × × × Corollary 7.10 Given matrices A ∈ Cn n,B∈ Cn r,R∈ Cn p, and F ∈ Cp p, sup- pose that γμ= 1, for any γ ∈ λ AA , μ ∈ λ FF . Then all the solutions to the nonhomogeneous con-Yakubovich matrix equation (7.41) can be characterized by ⎧ ⎡   ⎤ ⎪    ⎪ ⎣  ZF  ⎦ ⎪ X = A Ctr AA, BR Tn(g (s)Ir+p) Obsn FF, −1 ⎨⎪ AA R g (FF) F   AA     , ⎪  Z  ⎪ + Ctr AA, BR Tn(g (s)Ir+p) Obsn FF, −1 ⎪ AA R g (FF) ⎩⎪ AA = ( ) Y ZgAA FF where Z ∈ Cr×p is an arbitrarily chosen free parameter matrix.

7.4 Extended Con-Sylvester Matrix Equations

In this section, the following extended con-Sylvester matrix equation is revisited

EXF − AX = C, (7.67) where E, A ∈ Cn×n, F ∈ Cp×p, and C ∈ Cn×p are known matrices, and X ∈ Cn×p is the matrix to be determined. In Sect. 6.5, this matrix equation has been investigated by using a real representation of a complex matrix, and the solution of this matrix equation was obtained by solving the corresponding real representation matrix equa- tion. An obvious drawback of that approach is the high dimensionality. In this section, another approach is provided to solve the extended con-Sylvester matrix equation (7.67) in terms of its original coefficient matrices instead of their real representations. Before proceeding, the concept of regularity is required.

Definition 7.2 [81] Given $E, A \in \mathbb{C}^{n\times n}$, the matrix pair $(E, A)$ is said to be regular if there exists a scalar $\gamma \in \mathbb{C}$ such that $\det(\gamma E - A) \neq 0$.
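Regularity is easy to test numerically: $\det(sE - A)$ is a polynomial of degree at most $n$ in $s$, so it is identically zero if and only if it vanishes at $n + 1$ distinct points. The following sketch (hypothetical random data; the function name is chosen only for this illustration) implements this test.

```python
import numpy as np

def is_regular(E, A, tol=1e-10):
    """Matrix pencil (E, A) is regular iff det(sE - A) is not identically zero."""
    n = E.shape[0]
    # det(sE - A) has degree <= n, so it suffices to test n + 1 distinct (random) points
    rng = np.random.default_rng(5)
    pts = rng.standard_normal(n + 1) + 1j * rng.standard_normal(n + 1)
    return bool(any(abs(np.linalg.det(s * E - A)) > tol for s in pts))

rng = np.random.default_rng(6)
n = 4
E = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
A = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
print(is_regular(E, A))                                  # True for generic data
print(is_regular(np.zeros((n, n)), np.zeros((n, n))))    # False: det(sE - A) is identically zero
```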

The following lemma gives a property of a regular matrix pair, which plays a vital role in the sequel.

Lemma 7.13 Given matrices A, E ∈ Cn×n with the matrix pair (E, A) being regular, let γ be a scalar such that (γ E − A) is nonsingular, and denote

−1 −1 Eγ = (γ E − A) E, Aγ = (γ E − A) A. (7.68)

Then the following relation holds

AEγ = EAγ .

Proof With the condition in this lemma, one has

(γ E − A)−1 (γ E − A) = I.

This relation can be equivalently rewritten as

γ Eγ − Aγ = I. (7.69)

From this relation one obtains $A_\gamma = \gamma E_\gamma - I$. If $\gamma = 0$, then $E_\gamma = -A^{-1}E$, $A_\gamma = -I$. So
$$EA_\gamma = -E = A\left(-A^{-1}E\right) = AE_\gamma. \qquad (7.70)$$

If $\gamma \neq 0$, then from (7.69) one has
$$AE_\gamma = A\cdot\frac{1}{\gamma}\left(I + A_\gamma\right) = \frac{1}{\gamma}\left[(\gamma E - A)(\gamma E - A)^{-1}A + AA_\gamma\right] = \frac{1}{\gamma}\left(\gamma E - A + A\right)A_\gamma = EA_\gamma. \qquad (7.71)$$

The relations (7.70) and (7.71) imply the conclusion of this lemma. 

7.4.1 Basic Solutions

In this subsection, the basic expression is provided for the unique solution to the extended con-Sylvester matrix equation (7.67). This is the following theorem.

Theorem 7.12 Given matrices E, A ∈ Cn×n, B ∈ Cn×r, and F ∈ Cp×p with the matrix pair (E, A) regular, let γ be a scalar such that (γ E − A) is nonsingular, and define two matrices Eγ and Aγ as in (7.68). Further, denote   ( ) = − , f(EEγ , AAγ ) s det sEEγ AAγ 7.4 Extended Con-Sylvester Matrix Equations 309 and

n−1 n−2 adj(sEEγ − AAγ ) = Rn−1,n−1s + Rn−1,n−2s +···+Rn−1,1s + Rn−1,0. (7.72) If the matrix equation (7.67) has a unique solution, then the unique solution can be given by

n−1     −1 = ( )i + X Eγ Rn−1,iC f(EEγ , AAγ ) FF F FF (7.73) i=0 n−1     −1 ( )i Aγ Rn−1,iC f(EEγ , AAγ ) FF FF . i=0

Proof Denote

( ) = n + n−1 +···+ 1 + f(EEγ , AAγ ) s qn,ns qn,n−1s qn,1s qn,0. (7.74)

It is obvious that   ( − ) ( − ) = ( ) sEEγ AAγ adj sEEγ AAγ f(EEγ , AAγ ) s I.

Comparing the coefficients of the same power of s in the preceding relation, one has ⎧ ⎨ −AAγ Rn−1,0 = qn,0I − = ∈ I[ , − ] ⎩ EEγ Rn−1,i−1 AAγ Rn−1,i qn,iI, i 1 n 1 . (7.75) EEγ Rn−1,n−1 = qn,nI

By applying the expression of X in (7.73), Lemma 7.13, and the relation (7.75), one has

EXF − AX ⎡   ⎤ n−1     −1 = ⎣ γ − , ( )i⎦ E E Rn 1 iC f(EEγ , AAγ ) FF F FF F i=0 ⎡ ⎤ n−1     −1 + ⎣ γ − , ( )i⎦ E A Rn 1 iC f(EEγ , AAγ ) FF FF F i=0   n−1     −1 i − γ − , ( ) AE Rn 1 iC f(EEγ , AAγ ) FF F FF i=0 n−1     −1 i − γ − , ( ) AA Rn 1 iC f(EEγ , AAγ ) FF FF =  i 0  n−1     −1 = γ − , ( )i EE Rn 1 iC f(EEγ , AAγ ) FF F FF F i=0 310 7 Polynomial-Matrix-Based Approaches

n−1     −1 i − γ − , ( ) AA Rn 1 iC f(EEγ , AAγ ) FF FF i=0 n−1     −1 i+1 = γ − , ( ) EE Rn 1 iC f(EEγ , AAγ ) FF FF i=0 n−1     −1 i − γ − , ( ) AA Rn 1 iC f(EEγ , AAγ ) FF FF i=0       −1   −1 n =− γ − , + γ − , − ( ) AA Rn 1 0C f(EEγ , AAγ ) FF EE Rn 1 n 1C f(EEγ , AAγ ) FF FF n−1       −1 i + γ − , − − γ − , ( ) EE Rn 1 i 1 AA Rn 1 i C f(EEγ , AAγ ) FF FF i=1   n−1     −1   −1 i = , + , ( ) qn 0C f(EEγ , AAγ ) FF qn iC f(EEγ , AAγ ) FF FF i=1     −1 n + , ( ) qn nC f(EEγ , AAγ ) FF FF     n   −1 i = , ( ) C f(EEγ , AAγ ) FF qn i FF i=0     −1   = C f(EEγ , AAγ ) FF f(EEγ , AAγ ) FF = C.

This implies that the matrix X given in (7.73) satisfies the matrix equation (7.67). The conclusion is thus true. 

By simple calculation, one has    ( )i f(EEγ , AAγ ) FF F FF       −   n n 1 i = qn,n FF + qn,n−1 FF +···+qn,1 FF + qn,0I F(FF)

    − n i n 1 i = q , FF F(FF) + q , − FF F(FF) +···+ (7.76) n n   n n 1 i i q , FF F(FF) + q , F(FF) n 1  n 0      −   i n n 1 = F(FF) qn,n FF + qn,n−1 FF +···+qn,1 FF + qn,0I    = ( )i F FF f(EEγ , AAγ ) FF .     This implies that under the condition λ EEγ , AAγ ∩ λ FF = ∅ there holds       −1   −1 ( )i = ( )i f(EEγ , AAγ ) FF F FF F FF f(EEγ , AAγ ) FF . 7.4 Extended Con-Sylvester Matrix Equations 311     In addition, it is easily obtained that under the condition λ EEγ , AAγ ∩ λ FF = ∅ the following relation is satisfied       −1   −1 ( )i = ( )i f(EEγ , AAγ ) FF FF FF f(EEγ , AAγ ) FF .

With the above conclusions, one can obtain the following corollary from Theorem 7.12.

Corollary 7.11 Given matrices E, A ∈ Cn×n, B ∈ Cn×r, and F ∈ Cp×p with the matrix pair (E, A) regular, let γ be a scalar such that (γ E − A) is nonsingular, and define two matrices Eγ and Aγ as in (7.68). Further,  denote adj (sEEγ − AAγ ) ( ) = − λ ∩ λ = ∅ as in (7.72) and f(EEγ , AAγ ) s det sEEγ AAγ .If EEγ ,AAγ FF , then the unique solution of the extended con-Sylvester matrix equation (7.67) can be expressed as   n−1     −1 = ( )i X Eγ Rn−1,iCF FF f(EEγ , AAγ ) FF = i 0  n−1     −1 + ( )i Aγ Rn−1,iC FF f(EEγ , AAγ ) FF . i=0

In order to obtain the unique solution of the matrix equation (7.67) by the pre- ceding results, one needs the coefficient matrices Rn−1,i, i ∈ I[0, n − 1], and the coefficients qn,k, k ∈ I[0, n]. They can be obtained by the following generalized Leverrier algorithm given in Theorem 2.3: ⎧ ⎨ −EEγ Ri,k−1 + qi+1,kIn, if k = i + 1 = − + ∈ I[ , ] , Ri+1,k ⎩ AAγ Ri,k EEγ Ri,k−1 qi+1,kIn, if k 1 i (7.77) = AAγ Ri,0 + qi+1,0In, if k 0 ⎧ 1 ( γ , − ) = + ⎨ i+1 tr EE Ri k 1 , if k i 1 1 q + , = − tr(AAγ R , − EEγ R , − ), if k ∈ I[1, i] . (7.78) i 1 k ⎩ i+1 i k i k 1 1 − ( γ , ) if k = 0 i+1 tr AA Ri 0 ,

7.4.2 Equivalent Forms

In the previous subsection, an explicit expression has been given for the unique solution of the extended con-Sylvester matrix equation (7.67) in terms of the characteristic polynomial of the matrix pair $(EE_\gamma, AA_\gamma)$ and the adjoint matrix of $sEE_\gamma - AA_\gamma$. In order to obtain them, one approach is to carry out the 2-dimensional Leverrier algorithm (7.77) and (7.78). In this subsection, an alternative approach is provided to solve the extended con-Sylvester matrix equation (7.67). This approach can avoid the 2-dimensional Leverrier algorithm.

Theorem 7.13 Given matrices E, A ∈ Cn×n,B∈ Cn×r, and F ∈ Cp×p with the matrix pair (E, A) regular, let γ be a scalar such that (γ E − A) is nonsingular, and define two matrices Eγ and Aγ as in (7.68). Further, let θ ∈ C be a scalar such that (θEEγ − AAγ ) is nonsingular, and denote

 −1  −1 E = θEEγ − AAγ EEγ , C = θEEγ − AAγ C. (7.79)

In addition, denote gE (s) = det(I − sE ), and

n−1 n−2 adj(I − sE ) = Rn−1s + Rn−2s +···+R1s + R0. (7.80)

Then, the unique solution of the extended con-Sylvester matrix equation (7.67) can be expressed as

n−1   −1  i X = Eγ RiC gE θI − FF F θI − FF + (7.81) i=0 n−1   −1  i Aγ RiC gE θI − FF θI − FF . i=0

Proof By applying binomial expansion, one has for integer i ≥ 0    i θI − FF F ⎡ ⎤ i   = ⎣ jθ i−j − j⎦ Ci FF F j=0 i   = jθ i−j − j Ci FF F (7.82) = j ⎡0 ⎤ i   = ⎣ jθ i−j − j⎦ F Ci FF j=0  i = F θI − FF .

Similarly, by simple calculation it is obtained that for integer i ≥ 0    i  i F θI − FF F = θI − FF FF. (7.83) 7.4 Extended Con-Sylvester Matrix Equations 313

In addition, it is obvious that   (I − sE ) adj(I − sE ) = gE (s)I. (7.84)

Denote

n n−1 1 gE (s) = qns + qn−1s +···+q1s + q0.

Comparing the coefficients of the same power of s in (7.84), one has ⎧ ⎨ R0 = q0I, − E = ∈ I[ , − ] ⎩ Ri Ri−1 qiI, i 1 n 1 , (7.85) −E Rn−1 = qnI.

It follows from (7.85) and (7.79) that ⎧     θ − = θ − ⎨  EEγ AAγ  R0 q0 EEγ AAγ ,  θ − − = θ − ∈ I[ , − ] ⎩ EEγ AAγ Ri EEγ Ri−1  qi EEγ AAγ , i 1 n 1 , (7.86) −EEγ Rn−1 = qn θEEγ − AAγ .

By applying the expression of X in (7.81), it follows from Lemma 7.13, the preceding relations (7.79), (7.82), (7.83), and (7.86) that

− EXF⎡ AX ⎤ n −1      ⎣ −1 i⎦ = E Eγ RiC gE θI − FF F θI − FF F + i=0 ⎡ ⎤ n −1      ⎣ −1 i⎦ E Aγ RiC gE θI − FF θI − FF F i=0 n −1   −1  i −AEγ RiC gE θI − FF F θI − FF − i=0 n −1   −1  i AAγ RiC gE θI − FF θI − FF = ⎡ i 0 ⎤ n −1      ⎣ −1 i⎦ = E Eγ RiC gE θI − FF F θI − FF F − i=0 n −1   −1  i AAγ RiC gE θI − FF θI − FF i=0 314 7 Polynomial-Matrix-Based Approaches

n −1   −1  i = EEγ RiC gE θI − FF θI − FF FF − i=0 n −1   −1  i AAγ RiC gE θI − FF θI − FF i=0 n −1     −1  i = θEEγ − AAγ RiC gE θI − FF θI − FF i=0 n −1   −1  i+1 −EEγ RiC gE θI − FF θI − FF i=0     −1 = θEEγ − AAγ R0C gE θI − FF n −1      −1  i + θEEγ − AAγ Ri − EEγ Ri−1 C gE θI − FF θI − FF i=1   −1  n −EEγ R − C gE θI − FF θI − FF n 1       −1    n = θEEγ − AAγ C gE θI − FF q0I + q1 θI − FF ···+qn θI − FF = C.

This implies that the conclusion of this theorem is true. 

Similarly to (7.76), it is easily derived that

  −1  i gE θI − FF F θI − FF   −1  i = gE θI − FF θI − FF F  i   −1 = θI − FF gE θI − FF F  i   −1 = θI − FF F gE θI − FF  i   −1 = F θI − FF gE θI − FF , and   −1  i  i   −1 gE θI − FF θI − FF = θI − FF gE θI − FF .

With the above two relations, one can obtain the following corollary from Theorem 7.13.

Corollary 7.12 Given matrices E, A ∈ Cn×n,B∈ Cn×r, and F ∈ Cp×p with the matrix pair (E, A) regular, let γ be a scalar such that (γ E − A) is nonsingular, and define two matrices Eγ and Aγ as in (7.68). Further, let θ ∈ C be a scalar such that (θEEγ − AAγ ) is nonsingular, and denote E and C as in (7.79). In addition, denote 7.4 Extended Con-Sylvester Matrix Equations 315 adj(I − sE ) as in (7.80) and gE (s) = det(I − sE ). Then, the unique solution of the extended con-Sylvester matrix equation (7.67) can be expressed as   n−1  i   −1 X = Eγ RiC F θI − FF gE θI − FF + =  i 0  n−1  i   −1 Aγ RiC θI − FF gE θI − FF . i=0

In addition, from (7.85) one has ⎧ ⎪ R0 = q0I ⎪ ⎨ R1 = q0E +q1I 2 R2 = q0E + q1E + q2I . ⎪ ⎪ ··· ⎩ n−1 n−2 Rn−1 = q0E + q1E +···+qn−1I

With these relations, the following neat expression can be obtained   R0C R1C ··· Rn−1C = Ctr(E , C )Tn(gE (s)Ip).

So one can obtain an equivalent form of the solution given in (7.81).

Corollary 7.13 Given matrices $E, A \in \mathbb{C}^{n\times n}$, $B \in \mathbb{C}^{n\times r}$, and $F \in \mathbb{C}^{p\times p}$ with the matrix pair $(E, A)$ regular, let $\gamma$ be a scalar such that $(\gamma E - A)$ is nonsingular, and define two matrices $E_\gamma$ and $A_\gamma$ as in (7.68). Further, let $\theta \in \mathbb{C}$ be a scalar such that $(\theta EE_\gamma - AA_\gamma)$ is nonsingular, and denote the matrices $\mathcal{E}$ and $\mathcal{C}$ as in (7.79). Then, the unique solution of the extended con-Sylvester matrix equation (7.67) can be expressed as

    −1 X = Eγ Ctr(E , C )Tn(gE (s)Ip) Obsn θI − FF, F gE θI − FF    −1 +Aγ Ctr(E , C )Tn(gE (s)Ip) Obsn θI − FF, I gE θI − FF .

In order to obtain a solution of the extended con-Sylvester matrix equation (7.67) by Corollary 7.12, one needs the coefficients $q_i$, $i \in$ I$[1, n]$, of the polynomial $g_{\mathcal{E}}(s)$, and the coefficient matrices $R_i$, $i \in$ I$[0, n-1]$, of $\operatorname{adj}(I - s\mathcal{E})$. Regarding the solution of $q_i$, $i \in$ I$[0, n]$, and $R_i$, $i \in$ I$[0, n-1]$, the following generalized Leverrier algorithm given in Theorem 2.4 can be adopted:
$$\begin{cases} R_i = \mathcal{E}R_{i-1} + q_i I, \\[2pt] q_i = -\operatorname{tr}\!\left(\mathcal{E}R_{i-1}\right)/i, \end{cases} \quad i \in \mathrm{I}[1, n], \qquad (7.87)$$
with the initial condition $R_0 = I$. This finite iteration (7.87) is much easier to compute than the iteration (7.77) and (7.78). When the method in Corollary 7.13 is adopted to solve the matrix equation (7.67), one needs only the coefficients $q_i$, $i \in$ I$[1, n]$.

7.4.3 Further Discussion

In this subsection, a further discussion is given for the extended con-Sylvester matrix equation (7.67). Different from the case in the previous subsection, it is required in the current subsection that the matrix pair E, A is regular. The following lemma is needed.

Lemma 7.14 Given matrices A, E ∈ Cn×n with the matrix pair (E, A) regular, let γ be a scalar such that (γ E − A) is nonsingular, and denote

 −1  −1 Eγ = E γ E − A ,Aγ = A γ E − A . (7.88)

Then the following relation holds

Eγ A = Aγ E.

Proof With the condition of this lemma, one has

 −1 (γ E − A) γ E − A = I.

This relation can be equivalently rewritten as

γ Eγ − Aγ = I. (7.89)

From this relation one obtains $A_\gamma = \gamma E_\gamma - I$. If $\gamma = 0$, then $E_\gamma = -EA^{-1}$, $A_\gamma = -I$. So
$$A_\gamma E = -E = \left(-EA^{-1}\right)A = E_\gamma A. \qquad (7.90)$$

If $\gamma \neq 0$, then from (7.89) one has
$$E_\gamma A = \frac{1}{\gamma}\left(I + A_\gamma\right)A = \frac{1}{\gamma}\left[A\left(\gamma E - A\right)^{-1}\left(\gamma E - A\right) + A_\gamma A\right] = \frac{1}{\gamma}A_\gamma\left(\gamma E - A + A\right) = A_\gamma E. \qquad (7.91)$$

The relations (7.90) and (7.91) imply that the conclusion of this lemma is true. 

Now, let γ be a scalar such that (γ E − A) is nonsingular. Then, pre-multiplying both sides of (7.67)byAγ defined in (7.88), gives

Aγ EXF − Aγ AX = Aγ C. (7.92)

In addition, (7.67) can be equivalently written as

EXF − AX = C.

By pre- and post-multiplying both sides of this equation by Eγ defined in (7.88) and F, respectively, one can obtain

Eγ EXFF − Eγ AXF = Eγ CF. (7.93)

In view of the result of Lemma 7.14, summing both sides of (7.92) and (7.93), gives

Eγ EXFF − Aγ AX = Aγ C + Eγ CF, (7.94) which is in the form of extended Sylvester equations. With the preceding deduction, the following conclusion is easily obtained.

Proposition 7.2 Given matrices A, E ∈ Cn×n,C∈ Cn×p, and F ∈ Cp×p with the matrix pair (E, A) regular, let γ be a scalar such that (γ E − A) is nonsingular, and define Eγ and Aγ as in (7.88). If the extended con-Sylvester matrix equation (7.67)is solvable, then the extended Sylvester matrix equation (7.94) is solvable. Moreover, if X = X0 is a solution of (7.67), then X = X0 is also a solution of (7.94).

Now the problem is whether the matrix equation (7.67) is solvable if the equation (7.94) is solvable. The answer to this problem is still open, and we present it as a conjecture.

Conjecture 7.1 Given matrices $A, E \in \mathbb{C}^{n\times n}$, $C \in \mathbb{C}^{n\times p}$, and $F \in \mathbb{C}^{p\times p}$ with the matrix pair $(E, A)$ regular, let $\gamma$ be a scalar such that $(\gamma E - A)$ is nonsingular, and define $E_\gamma$ and $A_\gamma$ as in (7.88).
(1) The extended con-Sylvester matrix equation (7.67) is solvable if and only if the extended Sylvester matrix equation (7.94) is solvable.
(2) The extended con-Sylvester matrix equation (7.67) has a unique solution if and only if the extended Sylvester matrix equation (7.94) has a unique solution. Moreover, these two unique solutions are equal.

7.4.4 Illustrative Examples

Example 7.1 Consider a matrix equation in the form of (7.67) with the following coefficient matrices ⎡ ⎤ ⎡ ⎤ 1 − i2i2+ i 1 −2 − i −1 + i E = ⎣ 0 −1 − i −1 ⎦ , A = ⎣ 0i 0⎦ , 1 − i −1 + i1+ i 0 −11− i ⎡ ⎤ −1 + i1   2i i C = ⎣ 0i⎦ , F = . 1 −1 + i −i1− 2i

It is easily checked that the matrix pair (E, A) is regular. Choosing γ = 0, by calcu- lation one obtains ⎡ ⎤ ⎡ ⎤ −2 + 2i 5 − 5i −2 − 5i −10 0 ⎣ ⎦ ⎣ ⎦ E0 = 01− i −i , A0 = 0 −10 . − 1 − 3 − 122 2 i 00 1

Now, the scalar θ is chosen to be θ = 0. In this case, one has ⎡ ⎤ ⎡ ⎤ − − + − 11 + 29 − − 10 5i 14 10i 2 2 i 15 i E = ⎣ − − + − 5 − 1 ⎦ C = ⎣ ⎦ i 2 2i 2 2 i , 01. − 3 − 7 + − 9 + 1 − 1 2 2 i2 6i 2 3i 2 2 i2

By applying the generalized Leverrier algorithm (7.87), one can obtain that

= R0 ⎡I, ⎤ 13 − + − 11 + 29 2 5i 14 10i 2 2 i = ⎣ − 29 + − 5 − 1 ⎦ R1 i 2 2i 2 2 i , − 3 − 7 i2+ 6i 12 + 3i ⎡ 2 2 ⎤ 5 + i −5 − i −12 + 8i ⎣ ⎦ R2 = −1 + 5i 1 − 5i −8 − 12i , −4 − 6i 4 + 6i 20 + 4i and 2 33 gE (s) = det(I − sE ) = 26s + s + 1. 2 Then one has   232 558 1674 558   −1 − + − − θ − = 337 933 337 933 i 337 933 337 933 i gE I FF 558 − 558 884 − 558 , 337 933 337 933 i 337 933 337 933 i 7.4 Extended Con-Sylvester Matrix Equations 319   232 558 1674 558   −1 − + − − θ − = 337 933 337 933 i 337 933 337 933 i gE I FF 558 − 558 884 − 558 . 337 933 337 933 i 337 933 337 933 i

In addition, by simple calculation one also has ⎡ ⎤   211 273 n−1 − + − −   2 2 i 213 112i C θ − i = ⎣ − 139 + 305 − − ⎦ E0 Ri F I FF 2 2 i 120 37i , 9 i=0 −5 − 13i − − 42i ⎡ 2 ⎤   589 5 n−1 + +   65 52i 2 2 i C θ − i = ⎣ 49 + 201 + ⎦ A0 Ri I FF 2 117i 2 272i . 209 111 i=0 − − − − 2 2 i 294 167i

With the above preliminaries, by applying Corollary 7.12 the unique solution of this matrix equation can be obtained as ⎡ ⎤ − 111 411 − 172 909 183 925 − 435 225 337 933 337 933 i 337 933 337 933 i = ⎣ − 19 692 + 54 377 339 603 − 207 412 ⎦ X 337 933 337 933 i 337 933 337 933 i . − 219 558 + 4732 − 235 416 + 157 577 337 933 337 933 i 337 933 337 933 i

Example 7.2 Consider a matrix equation in the form of (7.67) with the following coefficient matrices ⎡ ⎤ 4.2176 6.5574 6.7874 6.5548 ⎢ 9.1574 0.3571 7.5774 1.7119 ⎥ E = ⎢ ⎥ ⎣ 7.9221 8.4913 7.4313 7.0605 ⎦ 9.5949 9.3399 3.9223 0.3183 ⎡ ⎤ −10.6887 3.2519 −1.0224 −8.6488 ⎢ −8.0950 −7.5493 −2.4145 −0.3005 ⎥ +i ⎢ ⎥ , ⎣ −29.4428 13.7030 3.1921 −1.6488 ⎦ 14.3838 −17.1152 3.1286 6.2771 ⎡ ⎤ 10.9327 −12.1412 −7.6967 −10.8906 ⎢ 11.0927 −11.1350 3.7138 0.3256 ⎥ A = ⎢ ⎥ ⎣ −8.6365 −0.0685 −2.2558 5.5253 ⎦ 5.5253 15.3263 11.1736 11.0061 ⎡ ⎤ 15.4421 −10.6158 −1.9242 −14.2238 ⎢ 0.8593 238.5046.8861 4 .8819 ⎥ +i ⎢ ⎥ , ⎣ −14.9159 −6.1560 −7.6485 −1.7738 ⎦ −7.4230 7.4808 −14.0227 −1.9605 320 7 Polynomial-Matrix-Based Approaches ⎡ ⎤ ⎡ ⎤ −0.8249 8.4038 3.0352 17.1189 13.5459 14.3670 ⎢ −19.3302 −8.8803 −6.0033 ⎥ ⎢ −1.9412 −10.7216 −19.6090 ⎥ C = ⎢ ⎥ + i ⎢ ⎥ , ⎣ −4.3897 1.0009 4.8997 ⎦ ⎣ −21.3836 9.6095 −1.9770 ⎦ −17.9468 −5.4453 7.3936 −8.3959 1.2405 −12.0785 ⎡ ⎤ ⎡ ⎤ 29.0801 −10.5818 10.9842 −20.5182 −15.7706 0.3348 F = ⎣ 8.2522 −4.6862 −2.7787 ⎦ + i ⎣ −3.5385 5.0797 −13.3368 ⎦ . 13.7897 −2.7247 7.0154 −8.2359 2.8198 11.2749

The matrix pair (E, A) is regular. By choosing γ = 1 + i and θ =−1 + 2i, one has ⎡ ⎤ −0.4550 0.3851 −0.0688 −0.0561 ⎢ − . . − . − . ⎥ E = ⎢ 0 4643 0 5659 0 0087 0 0114 ⎥ ⎣ 0.5216 0.0934 0.3751 0.2195 ⎦ −0.4951 −0.0722 −0.4384 −0.3530 ⎡ ⎤ −0.2295 0.5933 0.3483 0.1453 ⎢ 0.1950 0.6104 0.5754 0.2832 ⎥ +i ⎢ ⎥ , ⎣ 0.1660 −0.6685 −0.4901 −0.1740 ⎦ −0.5232 1.1729 0.5163 0.1596 ⎡ ⎤ ⎡ ⎤ 1.2916 −0.5834 −1.8299 2.3896 0.8989 1.5205 ⎢ 2.3980 −1.1906 −3.1351 ⎥ ⎢ 3.9474 1.8154 2.7215 ⎥ C = ⎢ ⎥ + i ⎢ ⎥ . ⎣ −0.0727 0.6661 2.7600 ⎦ ⎣ −1.7217 0.8510 0.7173 ⎦ −1.1837 −2.4510 −4.2546 2.3124 −0.5210 −0.9113

By applying the generalized Leverrier algorithm (7.87), it is obtained that

= R0 ⎡I, ⎤ −0.5880 0.3851 −0.0688 −0.0561 ⎢ −0.4643 0.4329 −0.0087 −0.0114 ⎥ R = ⎢ ⎥ 1 ⎣ 0.5216 0.0934 0.2421 0.2195 ⎦ −0.4951 −0.0722 −0.4384 0.5163 ⎡ ⎤ −0.2800 0.5933 0.3483 0.1453 ⎢ 0.1950 0.5600 0.5754 0.2832 ⎥ +i ⎢ ⎥ , ⎣ 0.1660 −0.6685 −0.5405 −0.1740 ⎦ −10.17295232 0 .5163 0.1092 ⎡ ⎤ 0.1139 −0.1445 −0.1123 −0.0566 ⎢ −0.0004 −0.1437 −0.2245 −0.1088 ⎥ R = ⎢ ⎥ 2 ⎣ −0.0960 0.4128 0.3047 0.0442 ⎦ −0.1067 −0.4259 −0.2127 0.1152 ⎡ ⎤ −0.1560 0.1164 0.1025 0.0608 ⎢ ⎥ ⎢ 0.0068 0.0608 0.1591 0.0639 ⎥ + i , ⎣ −0.1372 0.1099 −0.2476 0.0363 ⎦ 0.1898 −0.1121 −0.0785 −0.3486 7.4 Extended Con-Sylvester Matrix Equations 321 ⎡ ⎤ 0.0247 −0.0255 −0.0176 −0.0115 ⎢ 0.0004 −0.0205 −0.0275 −0.0099 ⎥ R = ⎢ ⎥ 3 ⎣ −0.0038 0.0282 0.0261 −0.0081 ⎦ −0.0257 0.0019 0.0207 0.0534 ⎡ ⎤ 0.0064 −0.0010 0.0039 −0.0054 ⎢ 0.0118 −0.0089 0.0017 −0.0078 ⎥ +i ⎢ ⎥ , ⎣ −0.0057 0.0269 −0.0133 −0.0037 ⎦ −0.0568 −0.0184 −0.0071 0.0251 and

det(I − sE ) = (0.0047 + 0.0053i) s4 + (0.0838 + 0.0094i) s3 + (0.1951 − 0.3457i) s2 + (−0.1330 − 0.0504i) s + 1.

By using Theorem 7.13, the unique solution of this equation is obtained as ⎡ ⎤ ⎡ ⎤ −0.0433 0.0409 0.1248 0.0051 −0.0306 −0.0561 ⎢ −0.0765 0.0322 0.1842 ⎥ ⎢ 0.0292 −0.0304 −0.0736 ⎥ X = ⎢ ⎥ + i ⎢ ⎥ . ⎣ 0.2311 −0.1726 −0.6098 ⎦ ⎣ 0.0996 −0.0041 −0.0542 ⎦ −0.1848 0.0411 0.2535 0.0572 −0.0559 −0.1879

7.5 Generalized Con-Sylvester Matrix Equations

In this section, the emphasis is placed on the following generalized con-Sylvester matrix equation
$$AX + BY = E\overline{X}F, \qquad (7.95)$$
where $A, E \in \mathbb{C}^{n\times n}$, $B \in \mathbb{C}^{n\times r}$, and $F \in \mathbb{C}^{p\times p}$ are known matrices, and $X \in \mathbb{C}^{n\times p}$ and $Y \in \mathbb{C}^{r\times p}$ are the matrices to be determined. This equation is a conjugate version of the generalized Sylvester matrix equation $AX + BY = EXF$. In Sect. 6.6, an approach has been provided to solve the generalized con-Sylvester matrix equation (7.95). The shortcomings of the approach in Sect. 6.6 are twofold. One is that it is an indirect approach: in order to obtain the general solution of the equation (7.95), some results on generalized Sylvester matrix equations are needed. The other is the high dimensionality: the solution in Sect. 6.6 was obtained from its real representation equation, whose dimension is higher than that of the equation (7.95). In this section, an alternative approach is given to solve the generalized con-Sylvester matrix equation (7.95). Similarly to the previous sections of this chapter, the polynomial matrix technique is used.

7.5.1 Basic Solutions

In this subsection, a basic solution is given for the generalized con-Sylvester matrix equation (7.95). In order to obtain the result, the following lemma is needed. It can be found in [257, 305].

t r×r Lemma 7.15 Let Di ∈ C ,i∈ I[0, t], be square matrices. Then the matrix   i=0 T i F ⊗ Di is nonsingular if and only if the matrices

t i Dis ,s∈ λ(F) i=0 are all nonsingular.

Regarding the solution to the generalized con-Sylvester matrix equation (7.95), one has the following result.

Theorem 7.14 Given matrices E, A ∈ Cn×n, B ∈ Cn×r, and F ∈ Cp×p with the matrix pair (E, A) regular, suppose that the condition (6.55) holds. Let γ be a (γ − ) scalar such that E A is nonsingular, and define two matrices Eγ and Aγ as ( ) = t−1 i ∈ Cn×r[ ] in (7.68 ). In addition, the polynomial matrices N s i=0 Nis s and ( ) = t i ∈ Cr×r[ ] = D s i=0 Dis s with Nt 0 are determined by

t−1 t i i (sEEγ − AAγ ) Nis = B Dis . (7.96) i=0 i=0

Then, (1) the matrices X ∈ Cn×p and Y ∈ Cr×p, given by ⎧ ⎪ t−1 t−1 ⎪ i i ⎨⎪ X = Eγ NiZF(FF) + Aγ NiZ(FF) i=0 i=0 (7.97) ⎪ t ⎪ i ⎩⎪ Y = DiZ(FF) i=0

satisfy the generalized con-Sylvester matrix equation (7.95) for any matrix Z ∈ Cr×p; (2) all the matrices X ∈ Cn×p and Y ∈ Cr×p satisfying the generalized con-Sylvester matrix equation (7.95) can be parameterized as (7.97)if

rankD(s) = r, for ∀s ∈ λ(FF). (7.98) 7.5 Generalized Con-Sylvester Matrix Equations 323

Proof Due to the relation (7.96), it is obtained that ⎧ ⎨ BD0 =−AAγ N0, = − , ∈ I[ , − ], ⎩ BDi EEγ Ni−1 AAγ Ni i 1 t 1 BDt = EEγ Nt−1.

With these relations, by applying Lemma 7.13 one has

AX − EXF t−1 t−1 i i = AEγ NiZF(FF) + AAγ NiZ(FF) i=0 i=0 t−1 t−1 i i −EEγ NiZF(FF) F − EAγ NiZ(FF) F i=0 i=0 t−1 t−1 i i+1 = AAγ NiZ(FF) − EEγ NiZ(FF) i=0 i=0 t−1 i t = AAγ N0Z + (AAγ Ni − EEγ Ni−1)Z(FF) − EEγ Nt−1Z(FF) i=1 t i =−B DiZ(FF) i=0 =−BY.

This implies that the matrices X and Y given in (7.97) satisfy the matrix equation (7.95). The first conclusion is thus proven. Secondly, it will be shown that the solution given in (7.97) is complete. It follows from Proposition 6.8 that there are rp degrees of freedom in the solution of the matrix equation (7.95) under the condition of (6.55). While the solution (7.97) has exactly rp parameters represented by the elements of the free matrix Z. Therefore, in the following it is sufficient to show that all the parameters in the matrix Z contribute to the solution. To do this, it suffices to show that the mapping Z → (X, Y) given in (7.97) is injective. By taking the column stretching operation, the second expression in (7.97) can be equivalently rewritten as   vec(Y) = vec Z , with t   i T = (FF) ⊗ Di. i=0 324 7 Polynomial-Matrix-Based Approaches

According to Lemma 7.15, under the condition of (7.98) the matrix is nonsingular. Consequently, the mapping Z → Y given in (7.97) is injective, and thus the mapping Z → (X, Y) given in (7.97) is also injective. With the above aspects, the proof is thus completed. 

In order to obtain a solution of the generalized con-Sylvester matrix equation (7.95) by Theorem 7.14, one needs the polynomial matrices N(s) ∈ Cn×r[s] and D(s) ∈ Cr×r satisfying (7.96). Such a pair of polynomial matrices can be obtained by applying the Smith normal form reduction. Let P(s) ∈ Cn×n [s] and Q(s) ∈ ( + )×( + ) C n r n r [s]  be two unimodular polynomial matrices that transform AAγ − sEEγ B into its Smith normal form, that is,     ( ) − ( ) = n ( ) P s AAγ sEEγ B Q s diagi=1 di s 0 , where di−1(s)|di(s) ∈ C[s], i ∈ I[2, n]. Partition Q(s) into   Q(s) = Q1(s) Q2(s)

(n+r)×r with Q2(s) ∈ C [s]. Then one can choose   N(s) = Q (s). D(s) 2

Remark 7.1 Proposition 6.8 gives a necessary and sufficient condition on the degrees of freedom in the solution to the generalized con-Sylvester matrix equation (7.95). However, the condition is expressed in terms of the real representation matrices instead of the original coefficient matrices. In Sect. 7.1, the so-called con-Sylvester matrix equation has been investigated, where the condition on the degrees of freedom in the solution was given in terms of the original coefficient matrices. It is our expectation to establish an original-coefficient-matrix-based condition on the existence of the solution to the matrix equation (7.95).

7.5.2 Equivalent Forms

In Subsection 7.5.1, an approach has been proposed to solve the generalized con-Sylvester matrix equation. In order to obtain a solution, one needs to carry out elementary transformations on a polynomial matrix. In this subsection, inspired by the ideas of [197, 262, 305], another approach is provided for solving this kind of matrix equation. Different from the method in Subsection 7.5.1, the approach proposed in this subsection does not involve computation with polynomial matrices.

Theorem 7.15 Given matrices E, A ∈ Cn×n,B∈ Cn×r, and F ∈ Cp×p with the matrix pair (E, A) regular, suppose that the condition (6.55) holds. Let γ be a scalar such that (γ E − A) is nonsingular, and define two matrices Eγ and Aγ as in (7.68). Further, let θ ∈ C be a scalar such that (θEEγ − AAγ ) is nonsingular, and denote

 −1  −1 E = θEEγ − AAγ EEγ , B = θEEγ − AAγ B. (7.99)

r×r If there exist a group of matrices Li ∈ C ,i∈ I[0, t], satisfying

t t−1 E BL0 + E BL1 +···+BLt = 0, (7.100) then,

(1) the matrices X ∈ Cn×p and Y ∈ Cr×p given by ⎧   = (E B) ( ( )) θ − , ⎪ X Eγ Ctrt , Tt L s Obst I FF ZF ⎨   + (E B) ( ( )) θ − , ⎪ Aγ Ctrt , Tt L s Obst I FF Z (7.101) ⎩⎪     Y = L0 L1 ··· Lt Obst+1 θI − FF, Z

( ) = t i ∈ Cr×r[ ] with L s i=0 Lis s , satisfy the generalized con-Sylvester matrix equation (7.95) for any matrix Z ∈ Cr×p; (2) all the matrices X ∈ Cn×p and Y ∈ Cr×p satisfying the generalized con-Sylvester matrix equation (7.95) can be parameterized as (7.101)if

rank L(s) = r, for ∀s ∈ λ(θI − FF).

Proof Since θ ∈ C is a scalar such that (θEEγ − AAγ ) is nonsingular, one has

sEEγ − AAγ

= sEEγ + θEEγ − AAγ − θEEγ (7.102)      −1 = θEEγ − AAγ I − (θ − s) θEEγ − AAγ EEγ .

By defining the matrices E and B as in (7.99), it follows from (7.102) that the relation (7.96) can be rewritten as

[I − (θ − s) E ] N(s) = BD(s). (7.103)

Introduce the following variable

θ − s = τ, (7.104) 326 7 Polynomial-Matrix-Based Approaches and let

t−1 i N(s) = N(θ − τ) = M(τ) = Miτ , (7.105) i=0 t i D(s) = D(θ − τ) = L(τ) = Liτ . (7.106) i=0

Then the relation (7.103) can be expressed as

(I − τE ) M(τ) = BL(τ). (7.107)

Comparing the coefficients with the same powers of τ in the preceding expression, one can obtain the following

M0 = BL0,

Mi − E Mi−1 = BLi, i ∈ I[1, t − 1],

−E Mt−1 = BLt.

From these iterations, a direct calculation gives ⎧ ⎪ M0 = BL0, ⎪ ⎨ M1 = EBL0 + BL1, 2 M2 = E BL0 + EBL1 + BL2, (7.108) ⎪ ⎪ ··· ⎩ t−1 t−2 Mt−1 = E BL0 + E BL1 +···+BLt−1, and

t t−1 0 = E BL0 + E BL1 +···+BLt. (7.109)

The relation (7.109) is exactly (7.100). The relation (7.108) can be written in the following matrix form
$$\begin{bmatrix} M_0 & M_1 & M_2 & \cdots & M_{t-1} \end{bmatrix} = \operatorname{Ctr}_t(\mathcal{E}, \mathcal{B})\,T_t(L(s)). \qquad (7.110)$$

By binomial expansion and exchanging the order of the summation, one has

t i D(s) = Li (θ − s) i=0 t i = j(− )jθ i−j LiCi s i=0 j=0 7.5 Generalized Con-Sylvester Matrix Equations 327

t t = j(− )jθ i−j LiCi s j=0 i=j t t = j (− )j jθ i−j s 1 Ci Li. j=0 i=j

This implies that

t = (− )j jθ i−j ∈ I[ , ] Dj 1 Ci Li, j 0 t . i=j

With these relations, by changing the order of the summation one has the following relation for the expression of Y in (7.97)

t t t = ( )j = (− )j jθ i−j ( )j Y DjZ FF 1 Ci LiZ FF j=0 j=0 i=j t i t i = (− )j jθ i−j ( )j = jθ i−j(− )j 1 Ci LiZ FF LiZ Ci FF i=0 j=0 i=0 j=0 t  i = LiZ θI − FF . (7.111) i=0

This is the second expression in (7.101). Similarly, one can obtain

t−1 t−1 t−1 ( ) = (θ − )i = j (− )j jθ i−j N s Mi s s 1 Ci Mi, i=0 j=0 i=j which implies

t−1 = (− )j jθ i−j ∈ I[ , − ] Nj 1 Ci Mi, j 0 t 1 . i=j

With these relations, similarly to the calculation in (7.111) it is easily obtained that

t−1 j NjZF(FF) j=0 ⎡ ⎤ t−1 t−1 = ⎣ (− )j jθ i−j ⎦ ( )j 1 Ci Mi ZF FF j=0 i=j 328 7 Polynomial-Matrix-Based Approaches ⎡ ⎤ t−1 t−1 − = ⎣ (− )j jθ i j ⎦ ( )j 1 Ci Mi ZF FF j=0 i=j t−1 i − = (− )j jθ i j ( )j 1 Ci MiZF FF i=0 j=0 t−1 i − = jθ i j(− )j MiZF Ci FF i=0 j=0 t−1  i = MiZF θI − FF , i=0 and

t−1 t−1   i i NiZ(FF) = MiZ θI − FF . i=0 i=0

By combining the preceding two expressions with (7.110), it is known that the first expression in (7.97) can be expressed as the first expression in (7.101). With the above argument, it follows from Theorem 7.14 that the matrices X and Y in (7.101) satisfy the generalized con-Sylvester matrix equation (7.95). The second conclusion can also be easily obtained from Theorem 7.14. The proof is thus completed. 

According to the proof of Theorem 7.15, the following result on another equivalent form of the solution in Theorem 7.14 is also given.

Theorem 7.16 Given matrices E, A ∈ Cn×n,B∈ Cn×r, and F ∈ Cp×p with the matrix pair (E, A) regular, suppose that the condition (6.55) holds. Let γ be a scalar such that (γ E − A) is nonsingular, and define two matrices Eγ and Aγ as in (7.68). Further,let θ ∈ C be a scalar such that (θEEγ − AAγ ) is nonsingular, and define two r×r matrices E and B as in (7.99). If the matrices Li ∈ C ,i∈ I[0, t], satisfy (7.100), n×r and the matrices Mi ∈ C ,i∈ I[0, t − 1], are given by

i i−j Mi = E BLj, j=0 then, (1) the matrices X ∈ Cn×p and Y ∈ Cr×p given by 7.5 Generalized Con-Sylvester Matrix Equations 329 ⎧ ⎪ t−1   t−1   ⎪ i i ⎨⎪ X = Eγ MiZF θI − FF + Aγ MiZ θI − FF i=0 i=0 (7.112) ⎪ t ⎪ i ⎩⎪ Y = LiZ(θI − FF) i=0

satisfy the generalized con-Sylvester matrix equation (7.95) for any matrix Z ∈ Cr×p; (2) all the matrices X ∈ Cn×p and Y ∈ Cr×p satisfying the generalized con-Sylvester matrix equation (7.95) can be parameterized as (7.112)if

rank L(s) = r, for ∀s ∈ λ(θI − FF)

where $L(s) = \sum_{i=0}^{t} L_i s^{i} \in \mathbb{C}^{r\times r}[s]$.

Remark 7.2 In order to obtain a complete parametric solution to the matrix equation (7.95) by Theorem 7.15 or 7.16, one needs a group of matrices $L_i$, $i \in \mathcal{I}[0, t]$, satisfying (7.100). Denote
$$L = \begin{bmatrix} L_t^{\mathrm T} & L_{t-1}^{\mathrm T} & \cdots & L_0^{\mathrm T} \end{bmatrix}^{\mathrm T}.$$

Then (7.100) can be written as

Ctrt+1(E , B)L = 0. (7.113)

Thus the matrices Li, i ∈ I[0, t], can be determined by solving the linear matrix equation (7.113).
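Equation (7.113) is a homogeneous linear equation in the entries of $L$, so any basis of the null space of its block coefficient matrix supplies admissible matrices $L_i$. The following minimal NumPy sketch is only an illustration, assuming the matrices $\mathcal{E}$ and $\mathcal{B}$ defined in (7.99) are available as complex arrays (the function name and tolerance are hypothetical choices):

```python
import numpy as np

def unilateral_null_space(E_cal, B_cal, t, rtol=1e-10):
    """Null-space basis of the block matrix [B, EB, ..., E^t B] appearing in (7.113).

    Each returned column, read as t+1 stacked blocks of r rows, gives one choice of
    the stacked matrix L = [L_t; L_{t-1}; ...; L_0] annihilated by the block matrix.
    """
    blocks, P = [], B_cal.copy()
    for _ in range(t + 1):          # blocks B, E B, ..., E^t B
        blocks.append(P)
        P = E_cal @ P
    C = np.hstack(blocks)           # n x r(t+1) coefficient matrix of (7.113)
    _, s, Vh = np.linalg.svd(C)
    rank = int(np.sum(s > rtol * s[0]))
    return Vh.conj().T[:, rank:]    # columns span the null space of C
```

Grouping any $r$ null-space columns into an $r(t+1)\times r$ matrix and partitioning it into $r\times r$ blocks then yields matrices $L_t,\dots,L_0$ satisfying (7.100).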

7.5.3 Special Solutions

In Subsections 7.5.1 and 7.5.2, several approaches have been established to solve the generalized con-Sylvester matrix equation (7.95). The approach in Subsection 7.5.1 involves the Smith normal form reduction of polynomial matrices, and the methods in Subsection 7.5.2 involve the solution of a linear matrix equation of higher dimension. To avoid these two complicated operations, several more readily applicable methods are provided in this subsection by specializing the solutions in Theorems 7.14, 7.15, and 7.16. It is well known that

$$\bigl(sE\bar{E}_\gamma - A\bar{A}_\gamma\bigr)\operatorname{adj}\bigl(sE\bar{E}_\gamma - A\bar{A}_\gamma\bigr) = \det\bigl(sE\bar{E}_\gamma - A\bar{A}_\gamma\bigr)I.$$

According to this relation, the polynomial matrices $N(s)$ and $D(s)$ in (7.96) can be chosen as
$$N(s) = \operatorname{adj}\bigl(sE\bar{E}_\gamma - A\bar{A}_\gamma\bigr)B, \qquad D(s) = \det\bigl(sE\bar{E}_\gamma - A\bar{A}_\gamma\bigr)I. \qquad (7.114)$$
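As a quick check (writing $M(s)$ for the pencil appearing in (7.114)), this choice indeed satisfies (7.96), since
$$M(s)\operatorname{adj}\bigl(M(s)\bigr)B = \det\bigl(M(s)\bigr)B = B\,\bigl(\det\bigl(M(s)\bigr)I\bigr),$$
which is exactly the relation (7.96) with this pair $N(s)$, $D(s)$.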

Then, by applying Theorem 7.14 one can obtain the following result. Corollary 7.14 Given matrices E, A ∈ Cn×n, B ∈ Cn×r, and F ∈ Cp×p with the matrix pair (E, A) regular, suppose that the condition (6.55) holds. Let γ be a scalar such that (γ E − A) is nonsingular, and define two matrices Eγ and Aγ as in (7.68). Denote   n n−1 1 ( ) = γ − γ = , + , − +···+ , + , f(EEγ , AAγ ) s det sEE AA qn ns qn n 1s qn 1s qn 0, (7.115) and

n−1 n−2 adj(sEEγ − AAγ ) = Rn−1,n−1s + Rn−1,n−2s +···+Rn−1,1s + Rn−1,0. (7.116)

Then,

(1) the matrices X ∈ Cn×p and Y ∈ Cr×p given by ⎧ ⎪ n−1 n−1 ⎨ i i X = Eγ R − , BZF(FF) + Aγ R − , BZ(FF) n 1 i n 1 i (7.117) ⎩⎪ i=0   i=0 = Y Zf(EEγ , AAγ ) FF

satisfy the matrix equation (7.95) for any matrix Z ∈ Cr×p; (2) all the matrices X ∈ Cn×p and Y ∈ Cr×p satisfying the generalized con-Sylvester matrix  equation (7.95) can be parameterized as (7.117)ifλ(EEγ , AAγ )∩ λ FF = ∅.

In order to obtain a solution of the generalized con-Sylvester matrix equa- tion (7.95) by the preceding corollary, one needs the coefficient matrices Rn−1,i, i ∈ I[0, n − 1], and the coefficients qn,k, k ∈ I[0, n]. They can be obtained by the following generalized Leverrier algorithm given in Theorem 2.3: ⎧ ⎨ −EEγ Ri,k−1 + qi+1,kIn, if k = i + 1 = − + ∈ I[ , ] , Ri+1,k ⎩ AAγ Rik EEγ Ri,k−1 qi+1,kIn, if k 1 i (7.118) = AAγ Ri0 + qi+1,0In, if k 0 ⎧ 1 ( γ , − ) = + ⎨ i+1 tr EE Ri k 1 , if k i 1 1 q + , = − tr(AAγ R − EEγ R , − ), if k ∈ I[1, i] . (7.119) i 1 k ⎩ i+1 ik i k 1 1 − ( γ ) if k = 0 i+1 tr AA Ri0 , 7.5 Generalized Con-Sylvester Matrix Equations 331

Similarly, the polynomial matrices L(s) and M(s) satisfying (7.107) can be cho- sen as L(s) = det (I − sE ) Ir, M(s) = adj(I − sE )B.

With this relation, by applying Theorems 7.15 and 7.16 one can obtain the following two results.

Corollary 7.15 Given matrices E, A ∈ Cn×n,B∈ Cn×r, and F ∈ Cp×p with the matrix pair (E, A) regular, suppose that the condition (6.55) holds. Let γ be a scalar such that (γ E − A) is nonsingular, and define two matrices Eγ and Aγ as in (7.68). Further, let θ ∈ C be a scalar such that (θEEγ − AAγ ) is nonsingular, and define two matrices E and B as in (7.99). Denote gE (s) = det(I − sE ), and

n−1 n−2 adj(I − sE ) = Rn−1s + Rn−2s +···+R1s + R0.

Then,

(1) the matrices X ∈ Cn×p and Y ∈ Cr×p given by ⎧ ⎪ n−1 n−1 ⎨  i  i X = Eγ RiBZF θI − FF + Aγ RiBZ θI − FF ⎩⎪ i=0  i=0 Y = ZgE θI − FF

satisfy the generalized con-Sylvester matrix equation (7.95) for any matrix Z ∈ Cr×p; (2) all the matrices X ∈ Cn×p and Y ∈ Cr×p satisfying the generalized con-Sylvester matrix equation (7.95) can be parameterized as (7.120)ifλ(−E , −I) ∩ λ(θI − FF) = ∅.

Corollary 7.16 Given matrices E, A ∈ Cn×n,B∈ Cn×r, and F ∈ Cp×p with the matrix pair (E, A) regular, suppose that the condition (6.55) holds. Let γ be a scalar such that (γ E − A) is nonsingular, and define two matrices Eγ and Aγ as in (7.68). Further, let θ ∈ C be a scalar such that (θEEγ − AAγ ) is nonsingular, and define two matrices E and B as in (7.99). Denote gE (s) = det(I − sE ). Then, (1) the matrices X ∈ Cn×p and Y ∈ Cr×p given by ⎧   = (E B) ( ( − E )) θ − , ⎨ X Eγ Ctr , Tn Ir det I s Obsn  I FF ZF + (E B) ( ( − E )) θ − , ⎩ Aγ Ctr , Tn Ir det I s Obsn I FF Z (7.120) Y = ZgE θI − FF

satisfy the generalized con-Sylvester matrix equation (7.95) for any matrix Z ∈ Cr×p; (2) all the matrices X ∈ Cn×p and Y ∈ Cr×p satisfying the matrix equation (7.95) can be parameterized as (7.120)if 332 7 Polynomial-Matrix-Based Approaches

λ(−E , −I) ∩ λ(θI − FF) = ∅.

In order to obtain a solution of the generalized con-Sylvester matrix equation (7.95) by Corollary 7.15, one needs the coefficients qi, i ∈ I[1, n], of the polynomial

$$g_{\mathcal E}(s) = q_n s^{n} + q_{n-1} s^{n-1} + \cdots + q_1 s + q_0,$$
and the coefficient matrices $R_i$, $i \in \mathcal{I}[0, n-1]$, of $\operatorname{adj}(I - s\mathcal{E})$. Regarding the computation of $q_i$, $i \in \mathcal{I}[0, n]$, and $R_i$, $i \in \mathcal{I}[0, n-1]$, the following generalized Leverrier algorithm given in Theorem 2.4 can be adopted:
$$\begin{cases} R_i = \mathcal{E} R_{i-1} + q_i I, \\[2pt] q_i = -\dfrac{\operatorname{tr}(\mathcal{E} R_{i-1})}{i}, \end{cases} \quad i \in \mathcal{I}[1, n], \qquad (7.121)$$
with the initial condition $R_0 = I$. This iteration (7.121) is simpler than the iterations (7.118) and (7.119). When the method in Corollary 7.16 is adopted to solve the matrix equation (7.95), one needs only the coefficients $q_i$, $i \in \mathcal{I}[1, n]$.
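A minimal numerical sketch of the iteration (7.121) is given below (Python/NumPy; the function name and the final check are illustrative additions, and the matrix $\mathcal{E}$ is assumed to be available as a complex array):

```python
import numpy as np

def leverrier_adj_det(E_cal):
    """Iteration (7.121): coefficients of g_E(s) = det(I - s*E) and of adj(I - s*E).

    Returns (q, R): q[i] is the coefficient of s**i in g_E(s), i = 0..n, and
    R[i] is the coefficient matrix of s**i in adj(I - s*E), i = 0..n-1.
    """
    n = E_cal.shape[0]
    I = np.eye(n, dtype=complex)
    R = [I.copy()]                 # initial condition R_0 = I
    q = [1.0 + 0.0j]               # g_E(0) = det(I) = 1
    for i in range(1, n + 1):
        qi = -np.trace(E_cal @ R[-1]) / i
        q.append(qi)
        if i < n:                  # adj(I - s*E) has degree n - 1
            R.append(E_cal @ R[-1] + qi * I)
    return q, R
```

As a consistency check, the identity $(I - s\mathcal{E})\operatorname{adj}(I - s\mathcal{E}) = g_{\mathcal E}(s)I$ implies $\mathcal{E}R_{n-1} + q_nI = 0$, which can be verified numerically on the returned data.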

7.5.4 An Illustrative Example

Example 7.3 Consider a generalized con-Sylvester matrix equation in the form of (7.95) with the following coefficient matrices ⎡ ⎤ ⎡ ⎤ ⎡ ⎤ 1 − i2i2+ i 1 −2 − i −1 + i −1 + i1 E = ⎣ 0 −1 − i −1 ⎦ , A = ⎣ 0i 0⎦ , B = ⎣ 0i⎦ . 1 − i −1 + i1+ i 0 −11− i −i1− 2i

It is easily checked that the matrix pair (E, A) is regular. Choosing γ = 0, by calcu- lation one can obtain ⎡ ⎤ ⎡ ⎤ −2 + 2i 5 − 5i −2 − 5i −10 0 ⎣ ⎦ ⎣ ⎦ E0 = 01− i −i , A0 = 0 −10 . − 1 − 3 − 122 2 i 00 1

If it is chosen that θ = 0, then one has ⎡ ⎤ ⎡ ⎤ − − + − 11 + 29 − − 10 5i 14 10i 2 2 i 15 i E = ⎣ − − + − 5 − 1 ⎦ B = ⎣ ⎦ i 2 2i 2 2 i , 01. − 3 − 7 + − 9 + 1 − 1 2 2 i2 6i 2 3i 2 2 i2 7.5 Generalized Con-Sylvester Matrix Equations 333

By applying the generalized Leverrier algorithm (7.121), one can obtain

= R0 ⎡I, ⎤ 13 − + − 11 + 29 2 5i 14 10i 2 2 i = ⎣ − 29 + − 5 − 1 ⎦ R1 i 2 2i 2 2 i , − 3 − 7 i2+ 6i 12 + 3i ⎡ 2 2 ⎤ 5 + i −5 − i −12 + 8i ⎣ ⎦ R2 = −1 + 5i 1 − 5i −8 − 12i , −4 − 6i 4 + 6i 20 + 4i and 33 det(I − sE ) = 26s2 + s + 1. 2 According to Corollary 7.15, the following matrices X and Y satisfy this matrix equation with any matrix F:    = − + − + 2 X 0ZF 1ZFFF 0Z 1ZFF 2Z FF   , (7.122) = − 33 + 2 Y Z I 2 FF 26 FF with ⎡ ⎤ ⎡ ⎤ − + 7 − 11 − − 1 5 i 2 2 i 11 7i = ⎣ − ⎦ = ⎣ 1 − 1 − ⎦ 0 0 1 , 0 2 2 i13i , − 1 + 1 i −2 2 − 1 i −2 − 4i ⎡ 2 2 ⎤ ⎡ 2 ⎤ − − 61 − 15 7 − 47 13 − 59 2 15i 2 2 i 2 2 i 2 2 i = ⎣ 3 − − 17 + ⎦ = ⎣ − 5 − 19 33 − 39 ⎦ 1 2 2i 2 4i , 1 2 2 i 2 2 i , −9 + i −15 + 4i 5 − 2i −5i ⎡ ⎤ 7 − 9i 3 − 15i ⎣ ⎦ 2 = 9 + 7i 15 + 3i . −16 + 2i −18 + 12i

If   2i i F = , 1 −1 + i then all the solutions to the matrix equation in this example can be parameterized as (7.122). When the free parameter matrix is chosen as   2 + i −1 + 2i Z = , −1 + 3i 1 − i 334 7 Polynomial-Matrix-Based Approaches a special solution to the matrix equation in this example can be obtained as ⎡ ⎤ −653 − 725i −1 − 1627i = ⎣ 121 − − 1697 ⎦ X 2 645i 1061 2 i , 945 − 44i 118 + 1655 i  2 2  − 721 2627 − = 721 2 i 2 163i . Y − 1837 − 1047 − 395 − 2627 2 2 i 2 2 i

7.6 Notes and References

In this chapter, explicit closed-form solutions have been given for the con-Sylvester matrix equations, con-Yakubovich matrix equations, extended con-Sylvester matrix equations and generalized con-Sylvester matrix equations. For the extended con- Sylvester matrix equation, the unique solution is given in an explicit closed form by original coefficient matrices. Some equivalent forms of the unique solution are also provided. One of these equivalent forms is expressed in terms of controllability matrices and observability matrices. A feature of the proposed approach is that all the coefficient matrices can be allowed to be in an arbitrary form. For the other several matrix equations, explicit closed-form expressions of their general solutions are derived, and some equivalent forms of the solutions are also provided. Some of the obtained solutions are expressed in terms of the controllability matrix and the observability matrix. Such a feature may bring much convenience in some analysis and design problems related to these matrix equations. The established solutions have the following properties. 1. These solutions are parameterized by the free parameter matrix Z which represents the degrees of freedom existing in the solutions. 2. All the coefficients are not restricted to be in any canonical form. 3. The matrices F (and R) explicitly appear in the solutions, thus can be unknown a priori. Therefore, the matrices F (and R) can be viewed as some free design parameters in some problems related to these matrix equations. A common feature of the approaches in this chapter is that polynomial matri- ces are used during the derivation of the main results. In addition, the standard and generalized Leverrier algorithms also play a core role for the derivation and calcula- tion of some results. It is with the help of polynomial matrices that the solutions of these matrix equations can be derived by only using the original coefficient matrices instead of their real representations. The content in Subsection 7.2.2 is new, and is not published elsewhere. The other materials in this chapter are mainly based on the results in [242, 266, 268, 278, 282]. Chapter 8 Unilateral-Equation-Based Approaches

In Chaps. 6 and 7, the con-Sylvester matrix equations and con-Yakubovich matrix equations have been investigated by different methods. In Chap. 6, the solutions of these two matrix equations are obtained with the aid of the real representation of complex matrices given in Sect. 2.6. In Chap. 7, the basic solutions are derived by using polynomial matrices as tools. In this chapter, the con-Sylvester matrix equations and con-Yakubovich matrix equations are revisited, and their solutions will be obtained by directly using unilateral matrix equations. In addition, such an approach is also utilized to solve nonhomogeneous con-Sylvester matrix equations. The approaches in this chapter are different from those in Chaps. 6 and 7. In this chapter, the following matrices $P_j$ and $Q_j$ defined in Sect. 2.6 are needed:
$$P_j = \begin{bmatrix} I_j & 0 \\ 0 & -I_j \end{bmatrix}, \qquad Q_j = \begin{bmatrix} 0 & I_j \\ -I_j & 0 \end{bmatrix}.$$

For a complex matrix $A$, its real representation defined in Sect. 2.6 is denoted by $A_\sigma$, and the notation $A_\sigma^{k}$ is used to denote $(A_\sigma)^{k}$. In addition, the following result will be used in the three sections of this chapter.

Lemma 8.1 Given matrices M ∈ Cm×n,Z∈ Cn×p, and F ∈ Cp×p, for an integer k ≥ 0 the following relation holds:

$$\left(MZ^{*k}\overleftarrow{F^{k}}\right)_{\sigma} = P_m^{k+1}\left(M^{*k}\right)_{\sigma} Z_{\sigma} F_{\sigma}^{k}. \qquad (8.1)$$

Proof The conclusion is proven by mathematical induction principle. For k = 0 and k = 1, it follows from Items (2) and (3) of Lemma 2.13 that the conclusion holds.


Now, it is assumed that the conclusion (8.1) holds for k = t ≥ 1, that is

 ←−   ∗t t t+1 ∗t t MZ F = P M Zσ Fσ . σ m σ

By using this relation, Lemma 2.13 and Item (4) of Lemma 3.2, one has

 ←− MZ∗(t+1)Ft+1   σ   ←− ←− = MZ∗tF t F = MZ∗tF t F   σ σ ←− ∗t t = MZ F PpFσ  σ  ←− ∗t t = PmPm MZ F PpFσ σ  ←− ∗t t = Pm MZ F Fσ   σ  ∗ = t+1 t t Pm Pm M Zσ Fσ Fσ   σ = t+2 ∗(t+1) t+1 Pm M σ Zσ Fσ .

This implies that the relation (8.1) holds for k = t + 1. By the principle of mathematical induction, it is known that the conclusion of the lemma holds for any integer k ≥ 0. $\square$

8.1 Con-Sylvester Matrix Equations

In this section, the following con-Sylvester matrix equation is revisited

AX + BY = XF (8.2) with A ∈ Cn×n, B ∈ Cn×r, and F ∈ Cp×p being known matrices. The following lemma is needed. Lemma 8.2 Given matrices A ∈ Cn×n and F ∈ Cp×p, the normal con-Sylvester matrix equation XF − AX = C (8.3) has a unique solution for any C ∈ Cn×p if and only if λ(AA) ∩ λ(FF) = ∅. This lemma has been given in Theorem 6.3. For the solution to the con-Sylvester matrix equation (8.2), one has the following theorem. Theorem 8.1 Given matrices A ∈ Cn×n,B∈ Cn×r, and F ∈ Cp×p, suppose that n×r r×r λ(AA) ∩ λ(FF) = ∅, and there are two groups of matrices Ni ∈ C and Di ∈ C , i ∈ I[0, t] with Nt = 0 satisfying 8.1 Con-Sylvester Matrix Equations 337 ⎧ ⎨ −AN0 = BD0, − = ∈ I[ − ] ⎩ Ni−1 ANi BDi,i 1,t 1 , (8.4) Nt−1 = BDt.

Then the matrices X and Y expressed as ⎧ − ⎪ t 1 ←− ⎪ = ∗i i ⎪ X NiZ F ⎪ = ⎨⎪ i 0 ←− ∗(t−1) t−1 = N Z + N ZF + N ZFF + N ZFFF +···+N − Z F 0 1 2 3 t 1 (8.5) ⎪ t ←− ⎪ = ∗i i ⎪ Y DiZ F ⎪ = ⎩⎪ i 0 ←− ∗t t = D0Z + D1ZF + D2ZFF + D3ZFFF +···+DtZ F with Z ∈ Cr×p being an arbitrarily chosen free parameter matrix, satisfy the con- Sylvester matrix equation (8.2). Further, denote

t−1   N ( ) = i+1 ∗i i s Pn Ni σ s , (8.6) i=0 t   D( ) = i+1 ∗i i s Pr Di σ s . (8.7) i=0

Then, all the solutions to the con-Sylvester matrix equation (8.2) can be parameter- ized by (8.5)if   N (s) rank = 2r, for any s ∈ λ (Fσ ) . (8.8) D(s)

Proof It is first shown that the matrices X and Y given in (8.5) are solutions to the matrix equation (8.2). With the expression (8.5), by using Lemma 3.2 one has

AX − XF t−1 t−1 ←− ←− ∗i i ∗i i = A NiZ F − NiZ F F i=0 i=0 − − t 1 ←− t 1 ←− ∗i i ∗(i+1) i+1 = ANiZ F − NiZ F i=0 i=0 − t 1   ←− ←− ∗i i ∗t t = AN0Z + ANi − Ni−1 Z F − Nt−1Z F . i=1

Combining this relation with (8.4) and the expression of Y in (8.5), one can conclude that the matrices X and Y in (8.5) satisfy the matrix equation (8.2).

Secondly, the completeness of the solution (8.5) is shown. It follows from Lemma 8.2 that the con-Sylvester matrix equation (8.2) has a unique solution X with respect to an arbitrary fixed matrix Y if λ(AA) ∩ λ(FF) = ∅. Therefore, when λ(AA) ∩ λ(FF) = ∅, the number of degrees of freedom in the solution (X, Y) to the con- Sylvester matrix equation (8.2) is equal to the number of elements in the matrix Y, that is, rp. By applying Lemma 8.1, one has     t ∗(t−1) t−1 Xσ = Pn (N0)σ Zσ + N1 Zσ Fσ +···+P N − Zσ Fσ  σ n t 1  σ . = ( ) + +···+ t+1 ∗t t Yσ Pr D0 σ Zσ D1 σ Zσ Fσ Pr Dt σ Zσ Fσ

It follows from the results in [303] that the following mapping

$$\phi : Z \to \mathcal{S}, \qquad \mathcal{S} = \sum_{i=0}^{t} P^{i+1}\begin{bmatrix} (N_i^{*i})_\sigma \\ (D_i^{*i})_\sigma \end{bmatrix} Z_\sigma F_\sigma^{i}$$
with $P = \operatorname{diag}(P_n, P_r)$ is injective under the condition (8.8). In addition, it is obvious that the real representation $\sigma$ is also injective. Thus the mapping $Z \to (X, Y)$ given in (8.5) is also injective, so all the rp elements in Z contribute to the solution (8.5). These two facts imply that the solution given in (8.5) is complete. $\square$

Remark 8.1 It follows from polynomial matrix theory that, by carrying out elemen- tary polynomial matrix transformation for N T(s) D T(s) T, there is a diagonal polynomial matrix (s) ∈ R2r×2r [s] such that     N (s) (s) rank = rank . D(s) 0

With this, the condition (8.8) becomes that det (s) = 0 for any s ∈ λ (Fσ ). Since det (s) is a nonzero polynomial, det (s) = 0 only has finite many roots. That is to say, det (s) = 0 holds for “almost all” s ∈ C. Such a fact implies that the condition (8.8) is very weak, and thus is easily satisfied.

According to Theorem 8.1, in order to solve the con-Sylvester matrix equation n×r r×r (8.2) one needs to obtain the matrices Ni ∈ C and Di ∈ C , i ∈ I[0, t]. When A and B are both real, the matrices Ni and Di, i ∈ I[0, t], can also be chosen to be real. In this case, the relation (8.4) becomes ⎧ ⎨ −AN0 = BD0 − = ∈ I[ , − ] ⎩ Ni−1 ANi BDi, i 1 t 1 . (8.9) Nt−1 = BDt 8.1 Con-Sylvester Matrix Equations 339

Denote t−1 t i i N(s) = Nis , D(s) = Dis . i=0 i=0

Then the relation (8.9) is equivalent to

(A − sI)N(s) + BD(s) = 0. (8.10)

Such a class of Diophantine polynomial matrix equations with respect to unknowns N(s) and D(s) have attracted considerable attention, and many approaches for solving (8.10) have been established. Interested readers can refer to [14, 71, 72, 118, 119, 170, 290, 311]. When the matrix dimensions are lower, the Diophantine equation (8.10) can be solved by carrying out elementary polynomial matrix transformations [66]. n×r However, in this book a polynomial approach can not be given to obtain Ni ∈ C r×r and Di ∈ C , i ∈ I[0, t], satisfying (8.4). In the following, an equivalent expression is given for the solution (X, Y) in (8.5). Before proceeding, it is necessary to introduce some new symbols. For matrices A ∈ Cn×n, B ∈ Cn×r, and C ∈ Cp×n, the following symbols are defined:   −→ −→ t−1 ∗(t−1) Ctrt(A, B) = BAB ··· A B , −→ −→ Ctr(A, B) = Ctr (A, B), ⎡ n ⎤ C ←− ⎢ CA ⎥ ( , ) = ⎢ ⎥ Obst A C ⎣ ··· ⎦ , ←− C∗(t−1)At−1 ←− ←− Obs (A, C) = Obsn (A, C) . −→ ←− Obviously, if A, B, and C are all real, then the matrices Ctr(A, B) and Obs (A, C) are the well-known controllability and observability matrices, respectively [75, 160]. −→ ←− In addition, Ctr(A, B) and Obs (A, C) can also be viewed as the conjugate version of the controllability and observability matrices, respectively. Due to such a reason, −→ ←− for the sake of convenience the matrices Ctrt(A, B) and Obst (A, C) will be referred to as con-controllability matrix of (A, B) with index t, and con-observability matrix −→ ←− of (A, C) with index t, respectively; Ctr(A, B) and Obs (A, C) will be called con- controllability matrix of (A, B) and con-observability matrix of (A, C), respectively. r×r In addition, for a matrix set D = Di ∈ C , i ∈ I[0, ϕ] , denote 340 8 Unilateral-Equation-Based Approaches ⎡ ⎤ D1 D2 D3 D4 ··· Dt ⎢ ··· ⎥ ⎢ D2 D3 D4 Dt ⎥ ⎢ ⎥ ⎢ D3 D4 ··· Dt ⎥ St(D) = ⎢ ⎥ , ⎢ ··· ··· ⎥ ⎣ ∗(t−1) ∗(t−1) ⎦ Dt−1 Dt ∗t Dt with Di = 0fori >ϕ. n×n n×r n×r Lemma 8.3 Given matrices A ∈ C and B ∈ C , the matrices Ni ∈ C and r×r r×r Di ∈ C , i ∈ I[0, t], with Nt = 0, satisfy (8.4) if and only if Di ∈ C , i ∈ I[0, t], satisfy −→ t ∗t BD0 + ABD1 + AABD2 +···+A (BDt) = 0, (8.11)

n×r and Ni ∈ C , i ∈ I[0, t − 1], satisfy   −→ N0 N1 ··· Nt−1 = Ctrt(A, B)St(D). (8.12)

Proof From the iteration relation (8.4), a direct calculation gives

−→ i−1 ∗(i−1) Nt−i = BDt−i+1 + ABDt−i+2 +···+A (BDt) , i ∈ I[1, t], (8.13) and −→ t ∗t 0 = BD0 + ABD1 + AABD2 +···+A (BDt) . (8.14)

Therelation(8.14) is exactly (8.11). The relations in (8.13) can be equivalently written as ⎧ ⎪ = ⎪ Nt−1 BDt, ⎪ ⎨ Nt−2 = BDt−1 + ABDt, Nt−3 = BDt−2 + ABDt−1 + AABDt, (8.15) ⎪ ⎪ ··· ⎩⎪ −→ t−1 ∗(t−1) N0 = BD1 + ABD2 + AABD3 +···+A (BDt) .

Rewriting (8.15) in a matrix form, gives (8.12). On the other hand, by using Lemma 3.2 it follows from (8.12) and (8.11) that

−→ t−1 ∗(t−1) AN0 = ABD1 + AABD2 + AAABD3 +···+AA (BDt) −→ t ∗t = ABD1 + AABD2 + AAABD3 +···+AA (BDt)

=−BD0. 8.1 Con-Sylvester Matrix Equations 341

This is the first expression of (8.4). For i ∈ I[1, t − 1], it follows from (8.12) that

t −−−−−→   −( + ) j−(i+1) j i 1 Ni = A BDj . j=i+1

By using this relation and Item (4) of Lemma 3.2, one has, for i ∈ I[1, t − 1] ⎡ ⎤ t −−−−−→   −( + ) ⎣ j−(i+1) j i 1 ⎦ ANi + BDi = BDi + A A BDj j=i+1 ⎡ ⎤ t −→   ⎣ j−i j−i⎦ = BDi + A BDj j=i+1

t −→   j−i j−i = A BDj = Ni−1. j=i

This is the second expression of (8.4). The last expression of (8.4) is obvious. Thus the relations (8.11) and (8.12) imply the relation (8.4). With the above two aspects, the proof is completed. $\square$

Remark 8.2 The idea of proving the above lemma is inspired by Lemma 2 of [302].

With the preceding preliminaries, the following theorem on the equivalent forms of (8.5) can be given.

Theorem 8.2 Given matrices $A \in \mathbb{C}^{n\times n}$, $B \in \mathbb{C}^{n\times r}$, and $F \in \mathbb{C}^{p\times p}$, suppose that $\lambda(A\bar{A}) \cap \lambda(F\bar{F}) = \varnothing$. If there is a matrix set $\mathcal{D} = \{D_i \in \mathbb{C}^{r\times r},\ i \in \mathcal{I}[0, t]\}$ satisfying (8.11), then the matrices X and Y given by

−→ ←− X = Ctr (A, B)S (D)Obs (F, Z)  t t  ←−t (8.16) Y = D0 D1 ··· Dt Obst+1 (F, Z) with Z ∈ Cr×p being a free parameter matrix, satisfy the con-Sylvester matrix equa- tion (8.2). Further, define D(s) as in (8.7). Then, all the solutions to the con-Sylvester matrix equation (8.2) can be parameterized by (8.5)if

det D(s) = 0, for any s ∈ λ (Fσ ) . (8.17)

Proof It follows from Lemma 8.3 that the matrix X in (8.5) can be written as

− t 1 ←− ∗i i X = NiZ F i=0 342 8 Unilateral-Equation-Based Approaches   ←− = N0 N1 ··· Nt−1 Obst (F, Z) −→ ←− = Ctrt(A, B)St(D)Obst (F, Z) .

The expression of Y in (8.5) is easily obtained as the second one in (8.16). It is clear that the condition (8.17) is stronger than the condition (8.8). It follows from Theorem 8.1 that the second conclusion is true.  In the above theorem, the solution to the matrix equation (8.2) is expressed in terms of the con-controllability matrix associated with (A, B), and the con-observability matrix associated with F and the free parameter matrix Z. This property may bring convenience and advantages to the further analysis of the matrix equation (8.5). Remark 8.3 When the approach in Theorem 8.2 is adopted to solve the con-Sylvester r×r matrix equation (8.2), only the matrices Di ∈ C , i ∈ I[0, t], are required. In fact, they can be easily obtained by solving the matrix equation (8.11) with respect to r×r unknown matrices Di ∈ C , i ∈ I[0, t]. An important feature of the matrix equa- tion (8.11) is that all the unknown matrices are premultiplied by the correspond- ing coefficient matrices. Therefore, the matrix equations in such a form are called unilateral matrix equations. Remark 8.4 For the unilateral matrix equation (8.11), denote ⎡ ⎤ D0 ⎢ ⎥ = ⎢ D1 ⎥ L ⎣ ···⎦ . ∗t Dt

Then the relation (8.11) can be equivalently written as −→ Ctrt+1(A, B)L = 0. (8.18)
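The following minimal NumPy sketch illustrates one way to treat (8.18) numerically. It is only an illustration under two assumptions: the con-controllability blocks follow the alternating pattern $B,\ A\bar{B},\ A\bar{A}B,\dots$ (each block equals $A$ times the conjugate of the previous one, which matches the block matrix displayed in Example 8.1 below), and odd-indexed $r$-row blocks of the stacked unknown carry conjugated $D_i$. The function names and the tolerance are hypothetical choices.

```python
import numpy as np

def con_ctr(A, B, t):
    """Con-controllability matrix with t+1 blocks: [B, A conj(B), A conj(A) B, ...].

    Each block is A times the complex conjugate of the previous block, reproducing
    the pattern of the terms appearing in (8.11).
    """
    blocks = [B]
    for _ in range(t):
        blocks.append(A @ np.conj(blocks[-1]))
    return np.hstack(blocks)

def null_space_D(A, B, t, rtol=1e-10):
    """One family of solutions of (8.18), returned as the blocks D_0, ..., D_t.

    A null-space basis of the con-controllability matrix is computed by an SVD;
    odd-indexed r-row blocks are conjugated back, since the stacked unknown is
    assumed to carry D_i^{*i}.  Any r columns of the basis can be used to form
    r x r matrices D_i.
    """
    n, r = B.shape
    C = con_ctr(A, B, t)
    _, s, Vh = np.linalg.svd(C)
    rank = int(np.sum(s > rtol * s[0]))
    N = Vh.conj().T[:, rank:]                 # null-space basis of C
    D = []
    for i in range(t + 1):
        block = N[i * r:(i + 1) * r, :]
        D.append(np.conj(block) if i % 2 else block)
    return D
```

For the data of Example 8.1 below, `con_ctr(A, B, 2)` assembles the same 3 × 6 block matrix that is displayed there.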

This is a very simple standard unilateral linear matrix equation, and can be easily solved by using numerically reliable singular value decompositions. It should be pointed out that the completeness condition (8.8) is given in terms of the real representations, which have higher matrix dimensions. In future work, efforts will be devoted to providing a condition with lower matrix dimensions.

Example 8.1 Consider a matrix equation in the form of (8.2) with the following parameters
$$A = \begin{bmatrix} 1 & -2-i & -1+i \\ 0 & i & 0 \\ 0 & -1 & 1-i \end{bmatrix}, \quad F = \begin{bmatrix} 2i & i \\ 1 & -1+i \end{bmatrix}, \quad B = \begin{bmatrix} -1+i & 1 \\ 0 & i \\ -i & 1-2i \end{bmatrix}.$$

For this matrix equation, n = 3. Now, the index t is chosen to be t = 2. A direct calculation gives
$$\begin{bmatrix} B & A\bar{B} & A\bar{A}B \end{bmatrix} = \begin{bmatrix} -1+i & 1 & -2-2i & -3+i & -2+4i & -6+3i \\ 0 & i & 0 & 1 & 0 & i \\ -i & 1-2i & 1+i & 3+2i & -2i & -5i \end{bmatrix}.$$

By solving the matrix equation (8.18), one can obtain
$$D_0 = \begin{bmatrix} -1+i & -12+6i \\ 0 & -2 \end{bmatrix}, \quad D_1 = \begin{bmatrix} -i & -5+i \\ 0 & 0 \end{bmatrix}, \quad D_2 = \begin{bmatrix} 1 & 0 \\ 0 & 2 \end{bmatrix}.$$

With this, by applying (8.12) one has
$$N_0 = \begin{bmatrix} -1+i & -2+4i \\ 0 & 2 \\ -i & 7-9i \end{bmatrix}, \qquad N_1 = \begin{bmatrix} -1-i & 2 \\ 0 & -2i \\ i & 2+4i \end{bmatrix}.$$

With the above preliminaries, the matrices  X = N0Z + N1ZF Y = D0Z + D1ZF + D2ZFF satisfy this matrix equation for an arbitrary matrix Z ∈ C2×2. If the free parameter matrix is chosen as   1 + 2i 2 − i Z = , 3i −1 + 3i a special solution to this matrix equation can be obtained as ⎧ ⎡ ⎤ ⎪ − − − ⎪ 8 22i 6 4i ⎨⎪ X = ⎣ −6 − 4i 2 − 8i ⎦ + + ⎪  48 40i 23 59i  . ⎪ −35 − 13i −36 − 34i ⎩⎪ Y = 2 + 10i −14 + 22i

8.2 Con-Yakubovich Matrix Equations

In this section, the following con-Yakubovich matrix equation is considered

X − AXF = BY, (8.19) where A ∈ Cn×n, B ∈ Cn×r, and F ∈ Cp×p are given known matrices, and X ∈ Cn×p and Y ∈ Cr×p are unknown matrices. Before proceeding, the following result is needed on the con-Kalman-Yakubovich matrix equation. 344 8 Unilateral-Equation-Based Approaches

Lemma 8.4 Given matrices A ∈ Cn×n,F∈ Cp×p, and C ∈ Cn×p, the con-Kalman- Yakubovich matrix equation X − AXF = C has a unique solution if γμ = 1,for∀γ ∈ λ(AA), ∀μ ∈ λ(FF).

The result in the above lemma has been given in Theorem 6.10. On the solution to the con-Yakubovich matrix equation (8.19), one has the following theorem.

Theorem 8.3 Given matrices A ∈ Cn×n, B ∈ Cn×r, and F ∈ Cp×p, suppose that n×r γμ = 1,for∀γ ∈ λ(AA), ∀μ ∈ λ(FF). If there are two groups of matrices Ni ∈ C r×r and Di ∈ C , i ∈ I[0, t], with Nt = 0 satisfying ⎧ ⎨ N0 = BD0, − = ∈ I[ , − ] ⎩ Ni ANi−1 BDi,i 1 t 1 , (8.20) −ANt−1 = BDt, then the following matrices X and Y ⎧ − ⎪ t 1 ←− ⎪ ∗i i ⎨⎪ X = NiZ F i=0 (8.21) ⎪ t ←− ⎪ ∗i i ⎩⎪ Y = DiZ F i=0 satisfy the con-Yakubovich matrix equation (8.19). In (8.21), Z ∈ Cr×p is an arbi- trarily chosen free parameter matrix. Further, denote ⎧ − ⎪ t 1 ⎪ N ( ) = i+1( ∗i) i ⎨⎪ s Pn Ni σ s i=0 , (8.22) ⎪ t ⎪ D( ) = i+1( ∗i) i ⎩ s Pr Di σ s i=0 then all the solutions to the con-Yakubovich matrix equation (8.19) can be parame- terized as (8.21)if   N (s) rank = 2r, for any s ∈ λ(Fσ ). (8.23) D(s)

Proof First, let us show that the matrices X and Y given in (8.21) satisfy the matrix equation (8.19). For the expression in (8.21), by using Lemma 3.2 and the expressions in (8.20) one has 8.2 Con-Yakubovich Matrix Equations 345

− X AXF ⎛ ⎞ t−1 t−1 ←− ←− ∗i i ⎝ ∗i i ⎠ = NiZ F − A NiZ F F i=0 i=0 − − t 1 ←− t 1 ←− ∗i i ∗(i+1) i+1 = NiZ F − ANiZ F i=0 i=0 − t 1 ←− ←− ∗i i ∗t t = N0Z + (Ni − ANi−1)Z F − ANt−1Z F i=1 − t 1 ←− ←− ∗i i ∗t t = BD0Z + BDiZ F + BDtZ F i=1 = BY.

This relation implies that matrices X and Y givenby(8.21) satisfy the con-Yakubovich matrix equation (8.19). Now, let us show the completeness of the solution given in (8.21). By Lemma 8.4, it is known that the con-Yakubovich matrix equation (8.19) has a unique solution for a fixed matrix Y if γμ = 1, for ∀γ ∈ λ(AA), ∀μ ∈ λ(FF). Therefore, under the condition of this theorem the number of degrees of freedom in the solution (X, Y) to the con-Yakubovich matrix equation (8.19) is equal to that of the elements in the matrix Y, that is, rp. On the other hand, by using Lemma 8.1, for the matrices X and Y in (8.21) one has ⎧ − ⎪ t 1 ⎪ = i+1( ∗i) i ⎨⎪ Xσ Pn Ni σ Zσ Fσ i=0 . ⎪ t ⎪ = i+1( ∗i) i ⎩ Yσ Pr Di σ Zσ Fσ i=0

By a result in [303], under the condition (8.23) the following mapping is injective:

φ : Z → S ,     t i+1 ∗i S = Pn Ni σ Z i . i+1 ∗i Fσ P D σ i=0 r i

In addition, it is obvious that the real representation $\sigma$ is injective. Thus, the mapping $Z \to (X, Y)$ given in (8.21) is also injective. Therefore, all the rp elements in Z contribute to the solution in (8.21). These two facts imply that the solution given in (8.21) is complete. The proof is thus completed. $\square$

Next, the goal is to give an equivalent expression of the solution given in −→ n×n n×r p×n Theorem 8.3. For matrices A ∈ C , B ∈ C , and C ∈ C , the matrices Ctrt(A, B), 346 8 Unilateral-Equation-Based Approaches ←− −→ ←− Obst (A, C), Ctr(A, B), andObs (A, C) are defined as in the previous section. In addi- r×r tion, for a matrix set D = Di ∈ C , i ∈ I[0,ϕ] , the following matrix is defined ⎡ ⎤ D0 D1 ... Dt−2 Dt−1 ⎢ . . ⎥ ⎢ .. .. ⎥ ⎢ D0 Dt−2 ⎥ (D) = ⎢ . ⎥ , Tt ⎢ .. ⎥ ⎢ ··· ··· ⎥ ⎣ ∗(t−2) ∗(t−2) ⎦ D0 D1 ∗(t−1) D0 with Di = 0, for i >ϕ.

n×n n×r n×r Lemma 8.5 Given matrices A ∈ C and B ∈ C , the matrices Ni ∈ C and r×r Di ∈ C , i ∈ I[0, t], with Nt = 0 satisfy the finite iterative relation (8.20) if and r×r only if D ={Di ∈ C , i ∈ I[0, t]}, satisfies −→ t ∗t BDt + ABDt−1 +···+A (BD0) = 0 (8.24)

n×r and Ni ∈ C , i ∈ I[0, t − 1], satisfy −→ [ N0 N1 ··· Nt−1 ]=Ctrt(A, B)Tt(D). (8.25)

Proof It follows from the iteration (8.20) that ⎧ ⎪ N0 = BD0 ⎪ ⎪ N = BD + ABD ⎨⎪ 1 1 0 N2 = BD2 + ABD1 + AABD0 ⎪ N = BD + ABD + AABD + AAABD , (8.26) ⎪ 3 3 2 1 0 ⎪ ··· ⎪ −→ ⎩ t−1 ∗(t−1) Nt−1 = BDt−1 + ABDt−2 +···+A (BD0) and −→ t ∗t 0 = BDt + ABDt−1 + AABDt−2 +···+A (BD0) . (8.27)

The relation (8.27) is exactly (8.24). Rewriting (8.26) gives the expression (8.25). On the other hand, the first expression of (8.20) can be easily obtained from (8.25). For i ∈ I[1, t], it follows from (8.25) that

i−1 −→   j ∗j Ni−1 = A BD(i−1)−j . (8.28) j=0

By using this relation and Item (4) of Lemma 3.2, one has, for i ∈ I[1, t − 1], 8.2 Con-Yakubovich Matrix Equations 347

AN − + BD ⎡i 1 i ⎤ i−1 −→   ⎣ j ∗j⎦ = A A BD(i−1)−j + BDi j=0 i−1 −→   j+1 ∗(j+1) = A BD(i−1)−j + BDi j=0 i −→   j ∗j = A BDi−j = Ni. j=0

This is the second expression of (8.20). In addition, by using Lemma 3.2, it follows from (8.28) and (8.24) that ⎡ ⎤ t−1 −→   ⎣ j ∗j⎦ ANt−1 = A A BD(t−1)−j j=0 t−1 −→   j+1 ∗(j+1) = A BD(t−1)−j j=0 t −→   j ∗j = A BDt−j =−BDt. j=1

Thus the relations (8.24) and (8.25) imply the relation (8.20). With the preceding two aspects, the proof is completed.  Remark 8.5 The idea of proving the above lemma is inspired by the Lemma 2 of [302]. With the previous lemma, one can obtain an equivalent expression of the general solution to the con-Yakubovich matrix equation (8.19). Theorem 8.4 Given matrices A ∈ Cn×n, B ∈ Cn×r, and F ∈ Cp×p, suppose that γμ = 1,for∀γ ∈ λ(AA), ∀μ ∈ λ(FF). If there is a collection of matrices D = r×r {Di ∈ C , i ∈ I[0, t]} satisfying (8.24), then the matrices X and Y given by −→ ←− X = Ctr (A, B)T (D)Obs (F, Z)  t t  ←−t (8.29) Y = D0 D1 ··· Dt Obst+1(F, Z) with Z ∈ Cr×p being an arbitrarily chosen free parameter matrix, satisfy the matrix equation (8.19). Further, define D(s) as in (8.22). Then all the solutions to the con- Yakubovich matrix equation (8.19) can be parameterized by (8.29)if

det D(s) = 0, ∀s ∈ λ(Fσ ). (8.30) 348 8 Unilateral-Equation-Based Approaches

Proof It follows from the expression (8.21) and Lemma 8.5 that the matrix X can be written as

− t 1 ←− ∗i i X = NiZ F i=0   ←− = N0 N1 ··· Nt−1 Obst(F, Z) −→ ←− = Ctrt(A, B)Tt(D)Obst(F, Z).

The expression of Y in (8.21) can be obtained as the second one in (8.29). In addition, the condition (8.30) is stronger than the condition (8.23). Therefore, the conclusion of this theorem is true by Theorem 8.3. The proof is thus completed. $\square$

Remark 8.6 Similarly to the case in Sect. 8.1, a unilateral-equation-based method has been given in Theorem 8.4 to solve the con-Yakubovich matrix equation (8.19). In order to obtain the general solution, one needs to solve the unilateral matrix equation (8.24).

r×r Remark 8.7 In Theorem 8.4, only the matrices Di ∈ C , i ∈ I[0, t], are required to solve the matrix equation (8.19). In fact, they can be easily obtained from (8.24). Denote ⎡ ⎤ Dt ⎢ ⎥ = ⎢ Dt−1 ⎥ , L ⎣ ··· ⎦ ∗t D0 then, the relation (8.24) can be rewritten as −→ Ctrt+1(A, B)L = 0. (8.31)

This is a standard unilateral matrix equation, and can be solved easily. One possible method to solve it is to use numerically reliable singular value decompositions.

Remark 8.8 Similarly to the case in Sect. 8.1, the general solution in Theorem 8.4 for the con-Yakubovich matrix equation (8.19) is also expressed in terms of the con-controllability matrix and the con-observability matrix. Such a feature may provide advantages and convenience for some related problems.

Example 8.2 Consider a con-Yakubovich matrix equation in the form of (8.19) with
$$A = \begin{bmatrix} 1 & -2-i & -1+i \\ 0 & i & 0 \\ 0 & -1 & 1-i \end{bmatrix}, \quad F = \begin{bmatrix} 2i & i \\ 1 & -1+i \end{bmatrix}, \quad B = \begin{bmatrix} -1+i & i \\ 0 & i \\ -i & 1-2i \end{bmatrix}.$$

It is easy to verify that γμ = 1, for ∀γ ∈ λ(AA), ∀μ ∈ λ(FF). So all the solutions can be characterized as the expression (8.21). Let the parameter t = 3. By solving the 2×2 matrix equation (8.31), one can obtain a set of Di ∈ C , i = 0, 1, 2, 3, as follows,     10 12− 4i D = , D = , 0 0i 1 01     −29− 10i −20 D = , D = . 2 0i 3 01

Thus, according to the iteration (8.20), one has ⎡ ⎤ ⎡ ⎤ ⎡ ⎤ −1 + i −1 −3 − i2+ 11i −2 −4 ⎣ ⎦ ⎣ ⎦ ⎣ ⎦ N0 = 0 −1 , N1 = 00, N2 = 0 −1 . −i2+ i 1 −1 − 7i 1 + i −2

With the above preliminaries, the matrices  X = N0Z + N1ZF + N2ZFF Y = D0Z + D1ZF + D2ZFF + D3ZFFF satisfy this matrix equation for an arbitrary matrix Z ∈ C2×2. If the free parameter matrix Z is chosen as

  1 + 2i 2 − i Z = , 3i −1 + 3i a special solution to this matrix equation can be obtained as ⎡ ⎤ 11 − 22i 17 + 6i X = ⎣ −1 − 11i 9 − 17i ⎦ , −28 − 25i 15 − 61i   32 + 15i 57 + 133i Y = . 2 − 14i 20

8.3 Nonhomogeneous Con-Sylvester Matrix Equations

In the previous two sections, two types of homogeneous complex conjugate matrix equations are investigated based on unilateral linear matrix equations. In this section, the unilateral-equation-based approach is used to solve the following nonhomoge- neous con-Sylvester matrix equation

AX + BY = XF + R, (8.32) 350 8 Unilateral-Equation-Based Approaches where A ∈ Cn×n, B ∈ Cn×r, F ∈ Cp×p, and R ∈ Cn×p are given matrices, and X ∈ Cn×p and Y ∈ Cr×p are unknown matrices to be determined. When R = 0, this equa- tion degenerates into the following form

AX + BY = XF. (8.33)

This is the homogeneous con-Sylvester matrix equation studied in Sects.6.3, 7.1, and 8.1. Before proceeding, the following proposition is needed. The result of this proposition can be easily derived, and has been given in Sect. 7.1.

Proposition 8.1 If (Xs, Y s) is a special solution to the nonhomogeneous con- Sylvester matrix equation (8.32) and (X∗, Y ∗) is the general solution to the con- Sylvester matrix equation (8.33), then the general solution to the nonhomogeneous con-Sylvester matrix equation (8.32) is given by

X = Xs + X∗, Y = Y s + Y ∗.

The above result can be viewed as a superposition principle for the con-Sylvester matrix equation. According to this principle, in order to solve the nonhomogeneous con-Sylvester equation (8.32), one needs to solve the corresponding homogeneous equation (8.33) and to obtain a special solution of the equation itself. In Sect. 8.1, the general solution to the equation (8.33) has been provided. Thus, in the sequel the emphasis is placed on obtaining a special solution to the nonhomogeneous con-Sylvester matrix equation (8.32). It should be pointed out that the same symbols as in Sect. 8.1 are used.

× × × × Theorem 8.5 Given matrices A ∈ Cn n,B∈ Cn r,F ∈ Cp p, and R ∈ Cn p,if τ D s = s ∈ Cr×q ∈ I[ ,τ] there exist two integers and q, a matrix set Di ,i 1 and two matrices V ∈ Cn×q and W ∈ Cq×p such that

−→  ∗τ s + s + s +···+ τ s = BD0 ABD1 AABD2 A BDτ V (8.34) and VW = R, (8.35) then a special solution to the nonhomogeneous con-Sylvester matrix equation (8.32) can be given by −→ ←− s s X = Ctrτ (A, B)Sτ (D )Obsτ (F, W)   ←− . (8.36) s = s s ··· s ( , ) Y D0 D1 Dτ Obsτ+1 F W

Proof By virtue of (8.34), a direct computation gives

AXs −→ ←− = ( , ) (D s) ( , ) ACtrτ A B Sτ Obsτ F W  −→ −−→ ←− τ− ∗(τ− ) −→τ ∗τ s = ABA2 B ··· A 1B 1 A B Sτ (D )Obsτ (F, W) 8.3 Nonhomogeneous Con-Sylvester Matrix Equations 351   τ −→ τ −1 −→ ←− = i ( s)∗i i ( s )∗i ··· s ( , ) A BDi A BDi+1 ABDτ Obsτ F W  i=1 i=1  τ −1 −→ ←− = − s i ( s )∗i ··· s ( , ) + BD0 A BDi+1 ABDτ Obsτ F W VW  i=1  τ −1 −→ ←− = − s i ( s )∗i ··· s ( , ) + BD0 A BDi+1 ABDτ 0 Obsτ+1 F W VW. i=1

Similarly, it can be derived that −→ ←− XsF = Ctrτ (A, B)Sτ (D s)Obsτ (F, W)F −→   ←− = Ctrτ (A, B) 0 Sτ (D s) Obsτ+ (F, W)  1 τ −1 −→ ←− = i ( s )∗i ··· s ( , ) 0 A BDi+1 BDτ Obsτ+1 F W . i=0

It follows from the previous two relations and (8.35) that   ←− s − s = − s − s ··· − s ( , ) + AX X F BD0 BD1 BDτ Obsτ+1 F W VW   ←− =− s s ··· s ( , ) + B D0 D1 Dτ Obsτ+1 F W R =−BY s + R.

This shows that the matrix pair (Xs, Y s) given in (8.36) satisfies the nonhomogeneous con-Sylvester matrix equation (8.32). 

With Proposition 8.1, one can obtain the general solution to the nonhomogeneous con-Sylvester matrix equation (8.32) according to the results of Lemma 8.2 and Theorem 8.5.

Theorem 8.6 Given matrices A ∈ Cn×n,B∈ Cn×r, F ∈ Cp×p, and R ∈ Cn×p, sup- pose that λ(AA) ∩ λ(FF) = ∅. If there exist integers t, τ and q, a matrix set D s = s ∈ Cr×q ∈ I[ ,τ] ∈ Cn×q ∈ Cq×p Di ,i 1 and two matrices V and W  satis- r×r fying (8.34) and (8.35), and a matrix set D = Di ∈ C ,i∈ I[1, t] satisfying (8.11), then the matrices X and Y given by

−→ ←− −→ ←− s X = Ctr (A, B)S (D)Obs (F, Z) + Ctrτ (A, B)Sτ (D )Obsτ (F, W)  t t  ←−t   ←− (8.37) = ··· ( , ) + s s ··· s ( , ) Y D0 D1 Dt Obst+1 F Z D0 D1 Dτ Obsτ+1 F W with Z ∈ Cr×p being an arbitrarily chosen free parameter matrix, satisfy the matrix equation (8.32). Further, define D(s) as in (8.7). Then all the solutions to the homo- geneous con-Sylvester matrix equation (8.32) can be parameterized by (8.37)if

det D(s) = 0, for any s ∈ λ (Fσ ) . 352 8 Unilateral-Equation-Based Approaches

In the above theorem, the solution to the nonhomogeneous con-Sylvester matrix equation (8.32) is expressed in terms of the con-controllability matrix associated with (A, B), and the con-observability matrices. This property may bring convenience and advantages to the further analysis of the matrix equation (8.32). When the approach in Theorem 8.6 is adopted to solve the con-Sylvester matrix ∈ Cr×r ∈ I[ , ] s ∈ Cr×q ∈ equation (8.32), two groups of matrices Di , i 1 t , and Di , i I[1,τ], are required. They need to satisfy (8.11 ) and (8.34), respectively. In Sect.8.1, r×r it has been illustrated that the matrix group Di ∈ C , i ∈ I[1, t] can be obtained by the standard unilateral matrix equation (8.18). Such an approach can also be s ∈ Cr×q, ∈ I[ ,τ] applied to derive the matrix group Di i 1 satisfying (8.34). In fact, by letting ⎡ ⎤ s D0 ⎢ s ⎥ s = ⎢ D1 ⎥ L ⎣ ··· ⎦ ,  ∗τ s Dt the expression (8.34) can be equivalently written as −→ s Ctrτ+1(A, B)L = V. (8.38)

This is a standard unilateral matrix equation, and can be solved easily. One possible method to solve it is to use numerically reliable singular value decompositions.
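In contrast to the homogeneous case (8.18), equation (8.38) has a right-hand side, so a least-squares solver can be used. The sketch below is only an illustration; it assumes that the block coefficient matrix of (8.38) and the matrix $V$ from (8.35) are already available as complex NumPy arrays (for instance, assembled with a block construction like the earlier sketch), and the function name is hypothetical. `numpy.linalg.lstsq` is SVD-based.

```python
import numpy as np

def solve_unilateral_rhs(C, V, rcond=None):
    """Least-squares solution of C @ Ls = V for the unilateral equation (8.38).

    C is the block coefficient matrix of index tau+1 and V is the right-hand side
    obtained from (8.34)-(8.35).  The residual should be numerically zero when
    (8.38) is solvable for the chosen tau.
    """
    Ls, _, _, _ = np.linalg.lstsq(C, V, rcond=rcond)
    return Ls, np.linalg.norm(C @ Ls - V)
```

A residual that is not negligible indicates that the chosen $\tau$ does not satisfy the solvability condition discussed in Remark 8.9 below.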

Example 8.3 Consider a nonhomogeneous con-Sylvester matrix equation in the form of (8.32) with the following parameters ⎡ ⎤ 1 −2 − i −1 + i   2i i A = ⎣ 0i 0⎦ , F = , 1 −1 + i 0 −11− i ⎡ ⎤ ⎡ ⎤ −1 + i1 1i B = ⎣ 0i⎦ , R = ⎣ i1⎦ . −i1− 2i 01− i

The corresponding homogeneous equation has been considered in Sect. 8.1. For this matrix equation, n = 3. The integer τ is chosen to be τ = 2. A direct calculation gives   BABAAB ⎡ ⎤ −1 + i1−2 − 2i −3 + i −2 + 4i −6 + 3i = ⎣ 0i 0 1 0 i⎦ . −i1− 2i 1 + i3+ 2i −2i −5i

In addition, the matrices W and V are respectively chosen as W = I2 and V = R. With this, by solving the matrix equation (8.38) one can obtain a group of matrices 8.3 Nonhomogeneous Con-Sylvester Matrix Equations 353 satisfying (8.34) as follows:       −2 − 2i 1 + 6i 3 − 1 i −2 + 2i 1 −2 Ds = , Ds = 2 2 , Ds = . 0 10 1 00 2 0 −i

Furthermore, it can be derived that   −−→   Ds Ds Ctr (A, B) = BAB , S (D s) = 1 2 , 2 2 Ds 0 2 ⎡ ⎤   W ←−− W ←−− Obs (F, W) = , Obs (F, W) = ⎣ WF ⎦ . 2 WF 3 WFF

According to (8.36), a special solution to the nonhomogeneous con-Sylvester matrix equation in this example can be given by −−→ ←−− s = ( , ) (D s) ( , ) X ⎡Ctr2 A B S2 Obs2⎤F W 1 + i −1 + i = ⎣ 1 −1 ⎦ , (8.39) −3.5 − 0.5i −4i

  ←−− Y s = Ds Ds Ds Obs (F, W)  0 1 2 3 30.5 + 2.5i = . (8.40) 2 + i1− 2i

By using Theorem 8.6 and the result in Sect. 8.1, the general solution to the homo- geneous con-Sylvester matrix equation can be expressed as  s X = N0Z + N1ZF + X s , Y = D0Z + D1ZF + D2ZFF + Y

s s where X and Y are respectively given in (8.39) and (8.40), and the matrices N0, N1, D0, D1, and D2 are given as follows       −1 + i −12 + 6i −i −5 + i 10 D = , D = , D = , 0 0 −2 1 00 2 02 ⎡ ⎤ ⎡ ⎤ −1 + i −2 + 4i −1 − i2 ⎣ ⎦ ⎣ ⎦ N0 = 02, N1 = 0 −2i . −i7− 9i i2+ 4i

At the end of this section, two remarks are provided for the proposed approaches. 354 8 Unilateral-Equation-Based Approaches

Remark 8.9 When Theorem 8.5 is utilized to obtain a special solution for the nonhomogeneous con-Sylvester matrix equation (8.32), a group of matrices $D_i^{s} \in \mathbb{C}^{r\times q}$, $i \in \mathcal{I}[0, \tau]$, needs to be chosen such that the condition (8.34) holds. As mentioned previously, these matrices can be obtained from the unilateral matrix equation (8.38). A necessary condition for the solvability of (8.38) is
$$\operatorname{rank}\overrightarrow{\operatorname{Ctr}}_{\tau+1}(A, B) = \operatorname{rank}\begin{bmatrix} \overrightarrow{\operatorname{Ctr}}_{\tau+1}(A, B) & V \end{bmatrix}.$$

Therefore, it is important to determine an integer τ such that this condition holds. However, it is a pity that no method is provided in this book to determine such a τ. Interested readers are encouraged to investigate this problem.

Remark 8.10 In this section, the unilateral-equation-based approach is proposed to solve the nonhomogeneous con-Sylvester matrix equation. In fact, such a method can also be generalized to solve the following nonhomogeneous con-Yakubovich matrix equation X − AXF = BY + R, where A ∈ Cn×n, B ∈ Cn×r, F ∈ Cp×p, and R ∈ Cn×p are given matrices, and X ∈ Cn×p and Y ∈ Cr×p are unknown matrices to be determined. With the idea in this section, the special and general solutions of this equation can be easily derived.

8.4 Notes and References

In this chapter, the con-Sylvester and con-Yakubovich matrix equations have been revisited, and alternative approaches have been proposed for solving these two matrix equations. In the proposed approaches, a unilateral matrix equation needs to be solved. With some matrices obtained from some unilateral linear matrix equations, the solutions of these two matrix equations can be explicitly given. Two expressions are given for the general solutions. One is in a finite series form, and the other is expressed in terms of con-controllability matrices and con-observability matrices. The given solution can provide all the degrees of freedom. In addition, the unilateral-equation-based approach is also utilized in Sect.8.3 to solve the nonhomogeneous con-Sylvester matrix equation. The obtained general solution is expressed in terms of con-controllability matrices and con-observability matrices. Similarly to the cases of two homogeneous matrix equations, the proposed approach can also provide all the degrees of freedom in the solution. The main contents of this chapter are taken from [35, 273, 296]. Chapter 9 Conjugate Products

Polynomials and polynomial matrices play an important role in solving some matrix equations. In fact, the importance of polynomial matrices has been illustrated in Chap. 7. In this chapter, a new operation, the conjugate product, is introduced for complex polynomials and polynomial matrices. It will be seen in Chap. 10 that a unified approach can be obtained for solving a class of complex conjugate matrix equations by using the conjugate product of complex polynomial matrices. This chapter starts with the conjugate product of complex polynomials in Sect. 9.1.

9.1 Complex Polynomial Ring (C[s], +, ⊛)

In this section, some basic notation and terminologies are first given for polynomials. Let s be an indeterminate, and n be an integer not less than 0. The expression of

$$f(s) = a_0 + a_1 s + \cdots + a_n s^{n} \qquad (9.1)$$
with $a_i \in \mathbb{C}$, $i \in \mathcal{I}[0, n]$, is called a polynomial over $\mathbb{C}$. The set of all the polynomials over $\mathbb{C}$ with the single indeterminate $s$ is denoted by $\mathbb{C}[s]$. By using summation notation, the polynomial $f(s)$ is expressed more concisely as

$$f(s) = \sum_{i=0}^{n} a_i s^{i}.$$

For the polynomial $f(s)$ in (9.1), each $a_i s^{i}$ is called a term, $i \in \mathcal{I}[0, n]$; $a_i$ is called the coefficient of the term $a_i s^{i}$. If $a_n \neq 0$, then $a_n$ is called the leading coefficient of the polynomial $f(s)$ in (9.1); $n$ is called its degree, and is denoted as $\deg f(s)$, i.e., $n = \deg f(s)$. A polynomial with leading coefficient being one is called monic. In

addition, the degree of polynomial zero is defined as −∞. Besides, for the polynomial in (9.1) it is defined that

$$-f(s) = -a_0 - a_1 s - \cdots - a_n s^{n}.$$

Definition 9.1 Given two polynomials

$$f(s) = \sum_{i=0}^{n} a_i s^{i}, \qquad g(s) = \sum_{i=0}^{m} b_i s^{i} \in \mathbb{C}[s], \qquad (9.2)$$
they are called equal if $m = n$ and $a_i = b_i$, $i \in \mathcal{I}[0, n]$. In this case, it is denoted as $f(s) = g(s)$.

Next, the definition of the sum is given for two polynomials.

Definition 9.2 Given two polynomials $f(s)$ and $g(s) \in \mathbb{C}[s]$ in the form of (9.2), their sum is defined in the following way:
$$f(s) + g(s) = \begin{cases} \displaystyle\sum_{i=0}^{m} (a_i + b_i) s^{i} + \sum_{i=m+1}^{n} a_i s^{i}, & \text{if } n > m, \\[4pt] \displaystyle\sum_{i=0}^{m} (a_i + b_i) s^{i}, & \text{if } n = m, \\[4pt] \displaystyle\sum_{i=0}^{n} (a_i + b_i) s^{i} + \sum_{i=n+1}^{m} b_i s^{i}, & \text{if } n < m. \end{cases}$$

For two polynomials with degrees $n$ and $m$, respectively, if $n > m$, then the sum is a polynomial of degree $n$; if $m > n$, then the sum is a polynomial of degree $m$. If $n = m$, then the sum is a polynomial of degree $n$ when $a_n + b_n \neq 0$, and a polynomial of degree less than $n$ when $a_n + b_n = 0$. Thus, one has the following lemma.

Lemma 9.1 Given two polynomials f (s),g(s) ∈ C[s], there holds

deg ( f (s) + g (s)) ≤ max {deg f (s) , deg g (s)} .

Now, the definition of the conjugate product is given for two polynomials in C[s].

Definition 9.3 For two polynomials $f(s) = \sum_{i=0}^{n} a_i s^{i} \in \mathbb{C}[s]$ and $g(s) = \sum_{j=0}^{m} b_j s^{j} \in \mathbb{C}[s]$, their conjugate product is defined as

$$f(s) \circledast g(s) = \sum_{i=0}^{n} \sum_{j=0}^{m} a_i b_j^{*i} s^{i+j}.$$

The following result on the degree of the conjugate product is easily obtained.

Lemma 9.2 Given two polynomials f (s),g(s) ∈ C[s], there holds

$$\deg\bigl(f(s) \circledast g(s)\bigr) = \deg f(s) + \deg g(s).$$

Remark 9.1 In Lemma 9.2, if both $f(s)$ and $g(s)$ are nonzero, the result is obvious. If one of them is zero, say $f(s)$, then $f(s) \circledast g(s) = 0$. Since the degree of the zero polynomial is defined as $-\infty$, then $\deg(f(s) \circledast g(s)) = -\infty$. Thus, the result of this lemma still holds.

For real polynomials, the conjugate product degenerates to the ordinary product. In addition, it is well known that the ordinary product of polynomials is commutative. However, for complex polynomials the conjugate product does not obey commutativity. For example, given

$$f(s) = (1 + 2i)s + 3i, \qquad g(s) = is + 2 - 4i,$$
by simple calculation it is easily obtained that

$$f(s) \circledast g(s) = (2 - i)s^{2} - (9 - 8i)s + 12 + 6i, \qquad g(s) \circledast f(s) = (2 + i)s^{2} + 13s + 12 + 6i.$$
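The computation can be automated directly from Definition 9.3. The short Python sketch below represents a polynomial by its list of coefficients (constant term first); the conjugation of the second factor's coefficient by the degree of the first factor's term follows the definition, the function names are illustrative, and the printed values reproduce the two products displayed above.

```python
def conj_pow(x, i):
    """x^{*i}: conjugate x when i is odd, leave it unchanged when i is even."""
    return x.conjugate() if i % 2 else x

def conj_product(f, g):
    """Conjugate product of Definition 9.3 for coefficient lists (f[i] is the s**i coefficient)."""
    h = [0j] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            h[i + j] += a * conj_pow(b, i)
    return h

# f(s) = (1+2i)s + 3i and g(s) = i s + 2 - 4i from the example above.
f = [3j, 1 + 2j]
g = [2 - 4j, 1j]
print(conj_product(f, g))   # [(12+6j), (-9+8j), (2-1j)]
print(conj_product(g, f))   # [(12+6j), (13+0j), (2+1j)]
```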

For this example, it is obvious that $f(s) \circledast g(s) \neq g(s) \circledast f(s)$. Nevertheless, the conjugate product has some nice properties.

Lemma 9.3 Given $a(s), b(s), c(s), d(s), e(s) \in \mathbb{C}[s]$, the following relations hold:
(1) Left distributivity: $(a(s) + d(s)) \circledast b(s) = a(s) \circledast b(s) + d(s) \circledast b(s)$;
(2) Right distributivity: $a(s) \circledast (b(s) + e(s)) = a(s) \circledast b(s) + a(s) \circledast e(s)$;
(3) Associativity: $(a(s) \circledast b(s)) \circledast c(s) = a(s) \circledast (b(s) \circledast c(s))$.

Proof Let

$$a(s) = \sum_{i=0}^{m} a_i s^{i}, \quad d(s) = \sum_{i=0}^{m} d_i s^{i}, \quad b(s) = \sum_{j=0}^{n} b_j s^{j}, \quad e(s) = \sum_{j=0}^{n} e_j s^{j}, \quad c(s) = \sum_{k=0}^{l} c_k s^{k}.$$

According to Definition 9.3, one has

(a(s) + d(s))  b(s)   ⎛ ⎞ m n i ⎝ j ⎠ = (ai + di ) s  b j s i=0 j=0 m n = ( + ) ∗i i+ j ai di b j s i=0 j=0 358 9 Conjugate Products

m n m n = ∗i i+ j + ∗i i+ j ai b j s di b j s i=0 j=0 i=0 j=0 = a(s)  b(s) + d(s)  b(s);

a(s)  (b(s) + e(s))   ⎛ ⎞ m n  i ⎝ j ⎠ = ai s  b j + e j s i=0 j=0 m n  ∗i i+ j = ai b j + e j s i=0 j=0 m n m n = ∗i i+ j + ∗i i+ j ai b j s ai e j s i=0 j=0 i=0 j=0 = a(s)  b(s) + a(s)  e(s);

(a(s)  b(s))  c(s) ⎛ ⎞   m n l = ⎝ ∗i i+ j ⎠  k ai b j s ck s i=0 j=0 k=0 m n l = ∗i ∗(i+ j) i+ j+k ai b j ck s i=0 j=0 k=0 m n l      ∗i = ∗ j i+( j+k) ai b j ck s i=0 j=0 k=0   ⎡ ⎤ m n l = i  ⎣ ∗ j j+k ⎦ ai s b j ck s i=0 j=0 k=0 = a(s)  (b(s)  c(s)) .

The proof is thus completed. 

By summarizing the preceding results, the following conclusion can be obtained on algebraic properties of C[s] equipped with the operations  and +.

Theorem 9.1 The set $\mathbb{C}[s]$ equipped with the operations $+$ and $\circledast$ is a ring.

Proof Let f (s), g(s), h(s) ∈ C[s]. For the operation +, it is easily known that (1) Commutativity: f (s) + g(s) = g(s) + f (s); (2) Associativity: ( f (s) + g(s)) + h(s) = g(s) + ( f (s) + h(s)); (3) 0 is the additive identity: 0 + f (s) = f (s); (4) f (s) + (− f (s)) = 0. 9.1 Complex Polynomial Ring (C[s], +, ) 359

For the operation , it follows from Lemma 9.3 that (5) f (s)  (g(s)  h(s)) = ( f (s)  g(s))  h(s); (6) f (s)  (g(s) + h(s)) = f (s)  g(s) + f (s)  h(s); (7) ( f (s) + g(s))  h(s) = f (s)  h(s) + g(s)  h(s). t i In addition, for any f (s) = ai s , by Definition 9.3, one has i=0  ( ) 1 f s    t 0 i = 1s  ai s i=0 t  = × ∗0 i+0 = ( ) 1 ai s f s ; i=0 and

( )  f s 1  t  i 0 = ai s  1s i=0 t  ∗i i+0 = ai × 1 s = f (s). i=0

The previous two relations imply that (8) 1 is an identity in the framework of the conjugate product. The above facts imply that the assertion of this theorem is true. 

9.2 Division with Remainder in (C[s], +, ⊛)

In the ordinary polynomial ring $(\mathbb{C}[s], +, \times)$, there are such notions as common divisors, greatest common divisors, and coprimeness. In this section, these concepts are generalized to $(\mathbb{C}[s], +, \circledast)$. Such a generalization is not trivial, since the conjugate product is not commutative.

Definition 9.4 Given f (s), g(s), h(s) ∈ C[s], if they satisfy

f (s) = g(s)  h(s), then h(s) (g(s), respectively) is called a right (left, respectively) divisor of f (s); f (s) is called a right (left, respectively) multiple of g(s) (h(s), respectively); f (s) is said to be right (left, respectively) divisible by h(s ) (g(s), respectively). It is denoted ( )| ( ) ( )| ( ) by g s  L f s , h s  R f s . 360 9 Conjugate Products

( ) ∈ C[ ] ( ) ( )| ( ) Definition 9.5 A polynomial f s s is said to be divisible by g s if g s  L f s ( )| ( ) ( )| ( ) and g s  R f s . It is denoted by g s  f s . By using the above definition, the following conclusion can be easily obtained.

Theorem 9.2 Let f (s),g(s),m(s),n(s), and c(s) be polynomials in C[s]. Then, ( )| ( ) ( )| ( ) =⇒ ( )| ( ( )  ( ) + ( )  ( )) (1) c s  R f s ,c s  Rg s c s  R m s f s n s g s ; ( )| ( ) ( )| ( ) =⇒ ( )| ( ( )  ( ) + ( )  ( )) (2) c s  L f s ,c s  Lg s c s  L f s m s g s n s . Theorem 9.3 (The division algorithm) For two polynomials f (s),g(s) ∈ C(s) with ( ) = ( ) ( ) ( ) ( ) g s 0, there exist polynomials hR s ,hL s ,qR s , and qL s such that

$$f(s) = g(s) \circledast h_R(s) + q_R(s), \qquad (9.3)$$
$$f(s) = h_L(s) \circledast g(s) + q_L(s). \qquad (9.4)$$

If $q_R(s) \neq 0$, then $\deg q_R(s) < \deg g(s)$, and $h_R(s)$ and $q_R(s)$ are unique; if $q_L(s) \neq 0$, then $\deg q_L(s) < \deg g(s)$, and $h_L(s)$ and $q_L(s)$ are unique.

Proof Only the case (9.3) is considered. The latter part of the conclusion is considered first. If there exist $h_{R2}(s)$ and $q_{R2}(s)$ satisfying

( ) = ( )  ( ) + ( ) f s g s hR 2 s qR 2 s

( ) = ( )< ( ) with qR 2 s 0, deg qR 2 s deg g s , then by using Lemma 9.3 one has

( )  ( ( ) − ( )) = ( ) − ( ) g s hR s hR 2 s qR 2 s qR s . (9.5)

( ) − ( ) = If qR 2 s qR s 0, it follows that

( ( ) − ( )) ≥ ( ). deg qR 2 s qR s deg g s

( ( ) − ( )) < ( ) ( )< ( ) However, deg qR 2 s qR s deg g s since deg qR 2 s deg g s and ( )< ( ) ( ) = ( ) deg qR s deg g s . This contradiction implies that qR 2 s qR s . Combining ( ) = ( ) this fact with (9.5) immediately gives that hR s hR 2 s . The proof of the latter part is thus completed. Now, let us prove the former part. If deg f (s)

n m i i f (s) = ai s , g(s) = bi s , an = 0, bm = 0, n ≥ m. i=0 i=0 9.2 Division with Remainder in (C[s], + , ) 361

A direct calculation gives   ∗m an n−m f1(s) = f (s) − g(s)  s  bm   ∗ ∗ m a m i = f (s) − b n si+n−m (9.6) i b i=0 m −  ∗ ∗ m1 a m i = f (s) − a sn − b n si+n−m n i b i=0 m − −  ∗ ∗ n 1 m1 a m i = a si − b n si+n−m . i i b i=0 i=0 m ( ) = = ( ) α ( ) If f1 s 0, denote n1 deg f1 s , and let n1 be the leading coefficient of f1 s . It is obvious that n1 < n.Ifn1 ≥ m, one takes   α ∗m n1 n1−m f2(s) = f1(s) − g(s)  s . bm ( ) = = ( ) α ( ) If f2 s 0, denote n2 deg f2 s , and let n2 be the leading coefficient of f2 s . Similarly to the derivation of (9.6), it is easily obtained that n2 < n1.Ifn2 ≥ m, one takes   α ∗m n2 n2−m f3(s) = f2(s) − g(s)  s . bm

Along this line, it is known that there is a k such that   α ∗m − nk 1 nk−1−m fk (s) = fk−1(s) − g(s)  s , bm where fk (s) = 0, or fk (s) = 0, nk = deg fk (s)

Choose       ∗m α ∗m α ∗m an − n − n − − ( ) = n m + 1 n1 m +···+ k 1 nk−1 m hR s s s s , bm bm bm ( ) = ( ) qR s fk s ,

< ∈ I[ − ] < ( ) where ni n, i 1, k 1 ; nk m. Then the relation (9.3) holds, and deg qR s < deg g(s). The proof is thus completed.  362 9 Conjugate Products

( ) ( ) The procedure for obtaining polynomials hR s and qR s is referred to as division with reminder. Proposition 9.1 For f (s),g(s), ψ(s), ϕ(s) ∈ C [s], ( )| ( ) ( )| ψ( ) ( )| ψ( ) (1) if f s  Rg s ,g s  R s , then f s  R s ; ( )| ( ) ( )| ψ( ) ( )| ψ( ) (2) if f s  Lg s ,g s  L s , then f s  L s ; | ( ) | ( ) ∈ C (3) a  L f s ,a R f s for any nonzero a .

9.3 Greatest Common Divisors in (C[s], + , )

( ) ( ) ∈ C ( ) ∈ C[ ] Definition 9.6 For f1 s , f2 s [s], if there exists a polynomial gR s s ( )| ( ) = ( ) such that gR s  R fi s , i 1, 2, then gR s is called a common right divisor of ( ) ( ) ( ) ( )| ( ) = f1 s and f2 s ; if there exists a polynomial gL s such that gL s  L fi s , i 1, ( ) ( ) ( ) 2, then gL s is called a common left divisor of f1 s and f2 s .

Definition 9.7 A greatest common right divisor of f1(s), f2(s) ∈ C [s] is any poly- ( ) ∈ C nomial gR s [s] with the following properties: ( )| ( ) = (1) gR s  R fi s , i 1, 2; ( ) ( ) ( ) ( )| ( ) (2) If gR 1 s is any other common right divisor of f1 s and f2 s , then gR 1 s  RgR s , ( ) ∈ C[ ] ( ) = ( )  ( ) i.e., there exists a w s s such that gR s w s gR 1 s .

Definition 9.8 A greatest common left divisor of f1(s), f2(s) ∈ C [s] is any poly- ( ) ∈ C nomial gL s [s] with the following properties: ( )| ( ) = (1) gL s  L fi s , i 1, 2; ( ) ( ) ( ) ( )| ( ) (2) If gL 1 s is any other common left divisor of f1 s and f2 s , then gL 1 s  LgL s , ( ) ∈ C[ ] ( ) = ( )  ( ) i.e., there exists a w s s such that gL s gL 1 s w s . On greatest common divisors, one has the following result. Theorem 9.4 For any two polynomials f (s),g(s) ∈ C[s], not both 0, a greatest common right (left, respectively) divisor exists. Proof If one of f (s) and g(s) is 0, say g(s), it is obvious that f (s) is a greatest common right divisor. Now, it is assumed that f (s) and g(s) are two nonzero polynomials with deg f (s) ≥ deg g(s). By successively applying the division algorithm in Theorem 9.3, one can obtain ⎧ ⎪ ( ) = ψ ( )  ( ) + ( ) ( )< ( ) ⎪ f s 1 s g s r1 s ,degr1 s deg g s ; ⎪ ( ) = ψ ( )  ( ) + ( ) ( )< ( ) ⎨⎪ g s 2 s r1 s r2 s ,degr2 s deg r1 s ; r1(s) = ψ3(s)  r2(s) + r3(s),degr3(s)

It can be assumed that one eventually obtains a remainder of zero, because the sequence of the degrees of the remainders monotonically decreases. ( )| ( ) It follows from the last item in (9.7) that rn s  Rrn−1 s . Combining this with ( )| ( ) the second last item in (9.7)givesrn s  Rrn−2 s . Along this line, it follows from (9.7) that

( )| ( ), ( )| ( ), . . . , ( )| ( ), ( )| ( ). rn s  Rrn−2 s rn s  Rrn−3 s rn s  Rg s rn s  R f s

Hence, rn(s) is a common right divisor of f (s) and g(s). Now it is assumed that ϕ(s) ∈ C[s] is an arbitrary common right divisor of f (s) ( ) ϕ( )| ( ) and g s . This, together with the first expression in (9.7), implies that s  Rr1 s . ϕ( )| ( ) Combining this fact with the second expression of (9.7) gives that s  Rr2 s . ϕ( )| ( ) ( ) Along this line, it follows from (9.7) that s  Rrn s . This fact reveals that rn s is a greatest common right divisor of f (s) and g(s). The case of greatest common left divisors can be similarly proven. 

Theorem 9.4 not only shows the existence of greatest common right and left divisors, but also provides a method to find them. Such a method is referred to as the Euclidean algorithm. It follows from the preceding section that if d(s) is a greatest common right (left, respectively) divisor of f (s) and g(s), then for any nonzero a ∈ C, a  d(s) (d(s)  a, respectively) is one of their greatest common right (left, respectively) divisors. Therefore, one can require the greatest common right (left, respectively) divisor to be monic, and thus it is unique. For f (s), g(s) ∈ C[s], the notations gcrd ( f (s), g(s)) and gcld ( f (s), g(s)) are used to denote the monic greatest common right and left divisors, respectively. In the following, some properties of greatest common right and left divisors are given.

Theorem 9.5 Given two polynomials f (s),g(s) ∈ C[s] with gcrd( f (s),g(s)) = d(s),leta(s) and b(s) be the polynomials such that f (s) = a(s)  d(s),g(s) = b(s)  d(s). Then, gcrd(a(s), b(s)) = 1.

( ) ( )| ( ) ( )| ( ) Proof Assume that c s is a polynomial such that c s  Ra s , c s  Rb s . Then there are polynomials k(s) and l(s) such that a(s) = k(s)  c(s) and b(s) = l(s)  c(s), and thus f (s) = (k(s)  c(s))  d(s) and g(s) = (l(s)  c(s))  d(s).

Hence, c(s)  d(s) is a common right divisor of f (s) and g(s). Since d(s) is a greatest common right divisor, c(s) must be a constant. Consequently, gcrd(a(s), b(s)) = 1.  364 9 Conjugate Products

Similarly, the following conclusion on greatest common left divisors can be obtained.

Theorem 9.6 Given two polynomials f (s),g(s) ∈ C[s] with gcld( f (s),g(s)) = d(s),leta(s) and b(s) be the polynomials such that f (s) = d(s)  a(s),g(s) = d(s)  b(s). Then, gcld(a(s), b(s)) = 1.

Theorem 9.7 Let f (s),g(s), and c(s) be polynomials in C[s]. Then (1) gcrd( f (s) + c(s)  g(s),g(s)) = gcrd( f (s),g(s)); (2) gcld( f (s) + g(s)  c(s),g(s)) = gcld( f (s),g(s)). ( ) = ( ( ) ( )) ( ) | ( ) ( )| ( ) Proof Let d s gcrd f s , g s . Then, d s  Rg s , which implies d s  Rc s  ( ) ( ) | ( ) g s . With this relation and d s  R f s , it follows from Theorem 9.2 that

( )| ( ( ) + ( )  ( )) . d s  R f s c s g s

Hence, d(s) is a common right divisor of f (s) + c(s)  g(s) and g(s). In addition, if e(s) is a polynomial in C(s) such that

( )| ( ( ) + ( )  ( )) , ( )| ( ), e s  R f s c s g s e s  Rg s then it follows from Theorem 9.2 that

( )| ( ( ) + ( )  ( )) − ( )  ( ) e s  R [ f s c s g s c s g s ] .

( )| ( ) ( ) ( ) With this, one has e s  R f s . Hence, e s is a common right divisor of f s and ( ) ( ) = ( ( ) ( )) ( )| ( ) g s . This together with d s gcrd f s , g s implies e s  Rd s . The preceding two facts imply the first conclusion. The second conclusion can be similarly proven. 

Theorem 9.8 Let f (s),g(s) ∈ C[s]. Then, for any m(s),n(s) ∈ C[s], ( ( ), ( ))| ( ( )  ( ) + ( )  ( )) (1) gcrd f s g s  R m s f s n s g s ; ( ( ), ( ))| ( ( )  ( ) + ( )  ( )) (2) gcld f s g s  L f s m s g s n s . Proof The conclusion is a simple corollary of Theorem 9.2, and the proof is omitted. 

Greatest common divisors can also be defined for more than two polynomials.

Definition 9.9 Let fi (s) ∈ C[s], i ∈ I[1, n], not all 0. ( ) ( )| ( ) ∈ I[ , ] (1) If there exists a polynomial gR s such that gR s  R fi s , i 1 n , then ( ) ( ) ∈ I[ , ] ( ) gR s is called a common right divisor of fi s , i 1 n .IfgR 1 s is any other ( ) ∈ I[ , ] ( )| ( ) ( ) common right divisor of fi s , i 1 n , and gR 1 s  RgR s , then gR s is called a greatest common right divisor of fi (s), i ∈ I[1, n]. 9.3 Greatest Common Divisors in (C[s], + , ) 365

( ) ( )| ( ) ∈ I[ , ] (2) If there exists a polynomial gL s such that gL s  L fi s , i 1 n , then ( ) ( ) ∈ I[ , ] ( ) gL s is called a common left divisor of fi s , i 1 n ;IfgL 1 s is any other ( ) ∈ I[ , ] ( )| ( ) ( ) common left divisor of fi s , i 1 n , and gL 1 s  LgL s , then gL s is a greatest common left divisor of fi (s), i ∈ I[1, n].

In this book, gcrd ( f1(s), f2(s), ..., fn(s)) and gcld ( f1(s), f2(s), ..., fn(s)) are used to denote the monic greatest common right and left divisors of fi (s), i ∈ I[1, n], respectively.

Lemma 9.4 Let fi (s) ∈ C[s],i ∈ I[1, n], not all 0. Then,  ( ( ), ( ) ..., ( )) = ( ), ( ) ..., ( ) ( ) (1) gcrd f1 s f2 s , fn s gcrd f1 s f2 s , gcrd fn−1 s ,fn s  ; (2) gcld ( f1(s), f2(s), ..., fn(s)) = gcld f1(s), f2(s), ...,gcld fn−1(s),fn(s) .

Proof Any common right divisor of the n polynomials fi (s), i ∈ I[1, n],is,in particular, a right divisor of fn−1(s) and fn(s), and therefore a right divisor of gcrd ( fn−1(s), fn(s)). Also, any common right divisor of the n − 1 polynomials f1(s), f2(s), ··· , fn−2(s) and gcrd ( fn−1(s), fn(s)) must be a common right divi- sor of n polynomials fi (s), i ∈ I[1, n], for it is a right divisor of gcrd ( fn−1(s), fn(s)), it must be a right divisor of fn−1(s) and fn(s). Since the set of n polynomials and the set of the first n − 2 polynomials together with the monic greatest common right divi- sor of the last two polynomials have exactly the same divisors, their monic greatest common right divisors are equal. The first conclusion is thus obtained. The second conclusion can be derived along the same line. 

9.4 Coprimeness in (C[s], + , )

Coprimeness is an important concept in the theory of polynomials and polynomial matrices. In this section, this concept is first defined in the framework of conjugate products.

Definition 9.10 Two polynomials f (s), g(s) ∈ C[s] are said to be right (left, respec- tively) coprime if f (s) and g(s) have greatest common right (left, respectively) divisors of degree 0.

Obviously, f (s) and g(s) is right coprime if and only if gcrd( f (s), g(s)) = 1; f (s) and g(s) is left coprime if and only if gcld( f (s), g(s)) = 1. In order to obtain some criteria of coprimeness, the next conclusions on greatest common right and left divisors are first provided.

Theorem 9.9 Given f (s),g(s) ∈ C[s],letd(s) be a greatest common right divisor of f (s)gand (s). Then, there exist α(s), β(s) ∈ C[s] such that

α(s)  f (s) + β(s)  g(s) = d(s). (9.8) 366 9 Conjugate Products

Proof It follows from the proof of Theorem 9.4 that d(s) is the rn(s) in (9.7), i.e., d(s) = rn(s). Denote α1(s) = 1, β1(s) =−ψn(s). Then, from the second last relation in (9.7) one has

d(s) = α1(s)  rn−2(s) + β1(s)  rn−1(s).

Substituting the third last relation into the preceding expression, gives that

d(s) = α2(s)  rn−3(s) + β2(s)  rn−2(s) with α2(s) = β1(s), β2(s) = α1(s) − β1(s)  ψn−1(s).

By successively applying such a procedure, one can obtain (9.8).  Similarly, on greatest common left divisors one can obtain the following theorem. Theorem 9.10 Given f (s),g(s) ∈ C[s],letd(s) be a greatest common left divisor of f (s) and g(s). Then, there exist α(s), β(s) ∈ C[s] such that

f (s)  α(s) + g(s)  β(s) = d(s).

With the preceding two results, one can obtain the following theorem on coprime- ness of two polynomials. Theorem 9.11 Let f (s),g(s) ∈ C[s]. (1) f (s) and g(s) is right coprime if and only if there exist α(s), β(s) ∈ C[s] such that α(s)  f (s) + β(s)  g(s) = 1.

(2) f (s) and g(s) is left coprime if and only if there exist α(s), β(s) ∈ C[s] such that f (s)  α(s) + g(s)  β(s) = 1.

Remark 9.2 In the framework of ordinary product, it is well known that coprimeness of two polynomials can be checked by the so-called Sylvester resultant matrix. How- ever, since the conjugate product does not obey commutativity law, such a resultant matrix does not exist for coprimeness in the framework of conjugate product.

9.5 Conjugate Products of Polynomial Matrices

In this section, the concept of conjugate products is generalized to the case of poly- nomial matrices. First, some basic terminologies are introduced. Let s be an indeter- p×q minate, and Ai ∈ C , i ∈ I[0, n]. The expression 9.5 Conjugate Products of Polynomial Matrices 367

n A (s) = A0 + A1s +···+ Ans (9.9) is called a polynomial matrix over Cp×q . The set of all the polynomial matrices over Cp×q with the indeterminate s is denoted by Cp×q [s]. By using summation notation, the polynomial matrix A (s) is expressed more concisely as

n i A (s) = Ai s . i=0

Similarly to the case of polynomials, one can define the sum of two polynomial matrices in Cp×q [s].

Definition 9.11 Given two polynomial matrices

n m i i p×q A (s) = Ai s , B (s) = Bi s ∈ C [s], i=0 i=0 their sum is defined in the following way: ⎧ ⎪ m n ⎪ ( + ) i + i , > ⎪ Ai Bi s Ai s if n m ⎪ ⎪ i=0 i=m+1 ⎨ m ( ) + ( ) = (A + B ) si n = m A s B s ⎪ i i ,if . ⎪ i=0 ⎪ n m ⎪   ⎪ + i + i < ⎩⎪ (Ai Bi ) s Bi s ,ifn m i=0 i=n+1

For the polynomial matrix A (s) in (9.9), let the j-th row k-th column element of Ai be aijk. Further, it is denoted that

n i a jk (s) = aijks . i=0

With these polynomials a jk (s), j ∈ I[1, p], k ∈ I[1, q], the polynomial matrix A (s) is equivalently written as ⎡ ⎤ a11 (s) a12 (s) ··· a1q (s) ⎢ a (s) a (s) ··· a (s) ⎥   A (s) = ⎢ 21 22 2q ⎥ = a (s) ⎣ ··· ··· ··· ⎦ jk p×q .

ap1 (s) ap2 (s) ··· apq (s)

In this way, the following result can be easily obtained. 368 9 Conjugate Products   Lemma 9.5 Given two polynomial matrices A (s) = a (s) and B (s) =   jk p×q ( ) ∈ Cp×q [ ] b jk s p×q s , there holds   ( ) + ( ) = ( ) + ( ) A s B s a jk s b jk s p×q .

Now, the definition of the conjugate product is given for two complex polynomial matrices.

Definition 9.12 For two polynomial matrices

m n i p×q j q×m A(s) = Ai s ∈ C [s] and B(s) = Bi s ∈ C [s], i=0 j=0 their conjugate product is defined as

m n ( )  ( ) = ∗i i+ j A s B s Ai B j s . i=0 j=0

It is obvious that the conjugate product will reduce to the ordinary product for two real polynomial matrices. The newly defined conjugate product of polynomial matrices has many nice properties.

Lemma 9.6 Given A(s) ∈ Cp×q [s],D(s) ∈ Cp×q [s],B(s) ∈ Cq×ϕ[s],E(s) ∈ Cq×ϕ[s], and C(s) ∈ Cϕ×ω[s], the following relations hold: (1) Left distributivity law: (A(s) + D(s))  B(s) = A(s)  B(s) + D(s)  B(s); (2) Right distributivity law: A(s)  (B(s) + E(s)) = A(s)  B(s) + A(s)  E(s); (3) Associativity law: (A(s)  B(s))  C(s) = A(s)  (B(s)  C(s)).

Proof Let

m m i i A(s) = Ai s , D(s) = Di s , i=0 i=0 n n l j j k B(s) = B j s , E(s) = E j s , C(s) = Ck s . j=0 j=0 k=0

According to Definition 9.12, one has 9.5 Conjugate Products of Polynomial Matrices 369

(A(s) + D(s))  B(s)   ⎛ ⎞ m n i ⎝ j ⎠ = (Ai + Di ) s  B j s i=0 j=0 m n = ( + ) ∗i i+ j Ai Di B j s i=0 j=0 m n m n = ∗i i+ j + ∗i i+ j Ai B j s Di B j s i=0 j=0 i=0 j=0 = A(s)  B(s) + D(s)  B(s);

A(s)  (B(s) + E(s))   ⎛ ⎞ m n  i ⎝ j ⎠ = Ai s  B j + E j s i=0 j=0 m n  ∗i i+ j = Ai B j + E j s i=0 j=0 m n m n = ∗i i+ j + ∗i i+ j Ai B j s Ai E j s i=0 j=0 i=0 j=0 = A(s)  B(s) + A(s)  E(s); and

(A(s)  B(s))  C(s) ⎛ ⎞   m n l = ⎝ ∗i i+ j ⎠  k Ai B j s Ck s i=0 j=0 k=0 m n l = ∗i ∗(i+ j) i+ j+k Ai B j Ck s i=0 j=0 k=0 m n l      ∗i = ∗ j i+( j+k) Ai B j Ck s i=0 j=0 k=0   ⎡ ⎤ m n l = i  ⎣ ∗ j j+k ⎦ Ai s B j Ck s i=0 j=0 k=0 = A(s)  (B(s)  C(s)) . 370 9 Conjugate Products

The proof is thus completed. 

× × ×ϕ Lemma 9.7 Let A(s) ∈ Cp1 q [s],B(s) ∈ Cp2 q [s], and C(s) ∈ Cq [s]. Then     A(s) A(s)  C(s)  C(s) = . B(s) B(s)  C(s)

Proof Let m m n i i j A(s) = Ai s , B(s) = Bi s , C(s) = C j s . i=0 i=0 j=0

According to Definition 9.12, one has   A(s)  C(s) B(s)     m n m n ∗i i+ j A ∗ + A C s = i C i si j = i j B j B C∗i si+ j = = i = = i j ⎡i 0 j 0 i ⎤0 j 0 m n ∗i i+ j Ai C s ⎣ i=0 j=0 j ⎦ = m n ∗i i+ j Bi C j s  i=0 j=0 A(s)  C(s) = . B(s)  C(s)

The conclusion is thus proven. 

Similarly, the following two conclusions can also be easily obtained.

× × ϕ× Lemma 9.8 Let A(s) ∈ Cp q1 [s],B(s) ∈ Cp q2 [s], and C(s) ∈ C p[s]. Then,     C(s)  A(s) B(s) = C(s)  A(s) C(s)  B(s) .

× × × Lemma 9.9 Let A(s) ∈ Cp q1 [s],B(s) ∈ Cp q2 [s],C(s) ∈ Cq1 p[s], and D(s) ∈ × Cq2 p[s]. Then,     C(s) A(s) B(s)  = A(s)  C(s) + B(s)  D(s). D(s)

By applying Lemmas 9.7–9.9, one can easily obtain the following corollary.   ( ) = ( ) ∈ Cm×r ( ) = ( ) ∈ Cr×p Corollary 9.1 Let A s [aik s ]m×r and B s bkj s r×p . Then,   r A(s)  B(s) = aik(s)  bkj(s) . k=1 m×p 9.5 Conjugate Products of Polynomial Matrices 371

Remark 9.3 Lemmas 9.6–9.9 and Corollary 9.1 imply that the conjugate product of complex polynomial matrices have some common properties to the ordinary product. However, it is worth noticing that for scalar polynomials a big difference between conjugate product and ordinary product is that the conjugate product does not obey commutativity law.

9.6 Unimodular Matrices and Smith Normal Form

This section is started with the invertibility of a complex polynomial matrix in the framework of conjugate products. Definition 9.13 A polynomial matrix U(s) ∈ Cn×n[s] is said to be invertible in the framework of conjugate products, if there exits a polynomial matrix V (s) ∈ Cn×n[s] such that U(s)  V (s) = I and V (s)  U(s) = I . In this case, the V (s) is called the inverse of U(s) in the framework of conjugate products, and is denoted by U −(s) = V (s). Lemma 9.10 Given a square polynomial matrix U(s), if there exist polynomial ( ) ( ) ( )  ( ) = ( )  ( ) = matrices VR s and VL s such that U s VR s I , and VL s U s I , then ( ) = ( ) VR s VL s . Proof By applying Lemma 9.6, one has

( ) VL s = ( )  VL s I = ( )  ( ( )  ( )) VL s U s VR s = ( ( )  ( ))  ( ) VL s U s VR s =  ( ) I VR s = ( ) VR s .

The proof is thus completed.  According to this lemma, for a square polynomial matrix U(s),ifU(s)  V (s) = I, then U −(s) = V (s) and V −(s) = U(s). In addition, by applying Item (3) of Lemma 9.6 the following result is easily obtained. Lemma 9.11 If U(s),V(s) ∈ Cn×n are both invertible in the framework of conju- gate products, then U(s)  V (s) is also invertible in the framework of conjugate products. Moreover,

(U(s)  V (s))− = V −(s)  U −(s).

For a polynomial matrix A(s) ∈ Cn×m [s], the elementary row transformations have the following three forms: 372 9 Conjugate Products

(1) Row switching transformations. This transformation switches all matrix ele- ments on row i with their counterparts on row j, and can be implemented by pre-multiplying A(s) by the following matrix ⎡ ⎤ 1 ⎢ ⎥ ⎢ .. ⎥ ⎢ . ⎥ ⎢ ⎥ ⎢ 1 ⎥ ⎢ ⎥ ⎢ 01⎥ ⎢ ⎥ ⎢ 1 ⎥ the i-th row ⎢ ⎥ ( , ) = ⎢ .. ⎥ En i j ⎢ . ⎥ . (9.10) ⎢ ⎥ ⎢ 1 ⎥ ⎢ ⎥ ⎢ 10⎥ ⎢ ⎥ the j-th row ⎢ 1 ⎥ ⎢ . ⎥ ⎣ .. ⎦ 1

(2) Row addition transformations. This transformation adds row i pre-multiplied by f (s) to row j, and can be implemented by pre-multiplying A(s) by the following matrix ⎡ ⎤ 1 ⎢ ⎥ ⎢ .. ⎥ ⎢ . ⎥ ⎢ ⎥ ⎢ 1 ⎥ ⎢ ⎥ ( ( ( )) ) = ⎢ .. ⎥ the i-th row En i f s , j ⎢ . ⎥ . (9.11) ⎢ ⎥ ⎢ f (s) ··· 1 ⎥ the j-th row ⎢ . ⎥ ⎣ .. ⎦ 1

(3) Row multiplication transformations. This transformation pre-multiplies all elements on row i with nonzero a ∈ C, and can be implemented by pre- multiplying A(s) by the following matrix ⎡ ⎤ 1 ⎢ ⎥ ⎢ .. ⎥ ⎢ . ⎥ ⎢ ⎥ ⎢ 1 ⎥ ⎢ ⎥ En(i(a)) = ⎢ a ⎥ . (9.12) ⎢ ⎥ the i-th row ⎢ 1 ⎥ ⎢ . ⎥ ⎣ .. ⎦ 1 9.6 Unimodular Matrices and Smith Normal Form 373

Similarly, there are three types of elementary column transformations for A(s) ∈ Cn×m [s]. (1) Column switching transformations. This transformation switches all matrix elements on column i with their counterparts on column j, and can be imple- mented by post-multiplying A(s) by Em (i, j). (2) Column addition transformations. This transformation adds column i post- multiplied by f (s) to column j, and can be implemented by post-multiplying A(s) by Em ( j( f (s)), i). (3) Column multiplication transformations. This transformation post-multiplies all elements on column i with nonzero a ∈ C, and can be implemented by post- multiplying A(s) by Em(i(a)).

Remark 9.4 In the preceding definitions of elementary transformations, by the mul- tiplication it means the conjugate product.

The matrices with the forms (9.10)–(9.12) are called elementary matrices. By applying Lemma 9.1, it is easily checked that all the elementary matrices are invert- ible. Moreover, the following relations hold

−( , ) = ( , ) En i j En i j , −( ( )) = ( ( )) En i a En i a , (9.13) − ( ( ( )) ) = ( (− ( )) ) En i f s , j En i f s , j .

Definition 9.14 A polynomial matrix P(s) is said to be unimodular if there exist elementary matrices Ei (s), i ∈ I[1, n], such that

P(s) = E1(s)  E2(s)  ···En(s).

Remark 9.5 From now on, if no special interpretations, when the terms “inversion” and “unimodularity” are mentioned, they refer to those in the framework of conjugate products.

The following simple properties of unimodular matrices are easily derived.

Proposition 9.2 Let P(s),Q(s) ∈ Cn×n [s] be two unimodular matrices. Then, (1) P(s)  Q(s) is unimodular; (2) P(s) is invertible, and P−(s) is also unimodular.

Definition 9.15 If A(s) ∈ Cm×n [s] can be transformed into B(s) via finite sequences of elementary transformations in the framework of conjugate product, then A(s) is said to be conequivalent to B(s), and denoted by

 A(s) ⇐⇒ B(s). 374 9 Conjugate Products

By the definition of unimodular matrices, one can easily obtain the following result. Theorem 9.12 For two polynomial matrices A(s),B(s) ∈ Cm × n [s],A(s)  ⇐⇒ B(s) if and only if there exist two unimodular matrices P(s) and Q(s) such that B(s) = P(s)  A(s)  Q(s).

By simple calculations, it is easily known that conequivalence is an equivalence relation. Proposition 9.3 For three polynomial matrices A(s),B(s),C(s) ∈ Cm×n [s],the following statements hold:  (1) Reflexivity: A(s) ⇐⇒ A(s);    (2) Transitivity: if A(s) ⇐⇒ B(s),B(s) ⇐⇒ C(s), then A(s) ⇐⇒ C(s);   (3) Invertibility: if A(s) ⇐⇒ B(s), then B(s) ⇐⇒ A(s). The following theorem provides a normal form for a polynomial matrix under elementary row and column transformations in the framework of conjugate products. To obtain such a normal form, Lemma 9.1 and the Euclidean division algorithm in the framework of conjugate product are applied. Using the above notation, one can manipulate polynomial matrices in the frame- work of conjugate products in ways that mirror the ways one manipulates polynomial matrices in the framework of ordinary products. The following result describes a diag- onal normal form of polynomial matrices in the framework of conjugate products. Theorem 9.13 A polynomial matrix A(s) ∈ Cm×n can be conequivalent to a diago- nal polynomial matrix. In details, one can find unimodular matrices U(s) and V (s) such that U(s)  A(s)  V (s) = (s), where ⎡ ⎤ λ1(s) ⎢ ⎥ ⎢ λ2(s) ⎥ ⎢ ⎥ ( ) = ⎢ .. ⎥ s ⎢ . ⎥ , (9.14) ⎣ ⎦ λr (s) 0(m−r)×(n−r) and λi (s),i ∈ I[1, r], are monic polynomials obeying a division property

λi (s)|λi+1(s), i ∈ I[1, r − 1].

Proof A constructive method is adopted to prove the conclusion. Step 1:IfA(s) = 0, one can choose U(s) = Im , V (s) = In, and then (s) = A(s). The conclusion holds. If A(s) = 0, go to Step 2. 9.6 Unimodular Matrices and Smith Normal Form 375

Step 2: By performing row and column switching operations on A(s), bring to the (1, 1) position the least degree polynomial entry in A(s). The obtained polynomial matrix is denoted by aij(s) . Step 3: By applying the Euclidean division algorithm, one can obtain

ai1(s) = qi1 (s)  a11(s) + ri1(s),degri1(s)

a1 j (s) = a11(s)  q1 j (s) + r1 j (s),degr1 j (s)

If the remainders r1 j (s), j ∈ I[2, n], and ri1(s), i ∈ I[2, m], are all zeroes, then go ( ) to Step 4, otherwise, find a remainder of least degree. If r1 j0 s is such a remainder, − ( ) add the first column post-multiplied by q1 j0 s to column j0, then go to Step 2; if ( ) − ri01 s is such a remainder, add the first row pre-multiplied by qi01 to row i0, then go to Step 2. Since the degrees of remainders r1 j (s), j ∈ I[2, n], and ri1(s), i ∈ I[2, m],are lower than the degree of a11(s), the degrees of the entry (1, 1) will decrease in each cycle of Step 2 and Step 3. After finite sequences, one can obtain a matrix ⎡ ⎤ a˜11(s) 00··· 0 ⎢ ⎥ ⎢ 0 a22 a23 ··· a2n ⎥ ⎢ ⎥ ⎢ 0 a32 a33 ··· a3n ⎥ , (9.15) ⎣ ··· ··· ··· ··· ··· ⎦

0 am2 am3 ··· amn where the degree of a˜11(s) is lower than those of the other nonzero entries. If a˜11(s)|aij(s), i ∈ I[2, m], j ∈ I[2, n], then go to Step 4. If there is an element which is not right or left divisible by a˜11(s), then add the column or row where this element is to the first column or row, and go to Step 3. Step 4: Repeat the procedure from Steps 1 through 3 to the submatrix on the bottom right of (9.15), the original matrix A(s) can be transformed to ⎡ ⎤ a˜11(s) 000··· 0 ⎢ ⎥ ⎢ 0 a˜22(s) 00··· 0 ⎥ ⎢ ⎥ ⎢ 00a32 a33 ··· a3n ⎥ ⎢ ⎥ , ⎢ 00a33 a34 ··· a3n ⎥ ⎣ ··· ··· ··· ··· ··· ··· ⎦

00am3 am4 ··· amn where a˜11(s)|a˜22(s), a˜11(s)|aij(s), a˜22(s)|aij(s), i ∈ I[3, m], j ∈ I[3, n]. Step 5: Repeat the procedure from Steps 1 through 4, after finite operation sequences one can obtain 376 9 Conjugate Products ⎡ ⎤ aˆ11(s) ⎢ ⎥ ⎢ aˆ22(s) ⎥ ⎢ ⎥ ⎢ .. ⎥ ⎢ . ⎥ ⎣ ⎦ aˆrr(s) 0(m−r)×(n−r) where aii(s)|a(i+1),(i+1)(s), i ∈ I[1, r − 1]. Step 6: Pre-multiplying aˆii(s) by some constant ci , i ∈ I[1, r − 1], one can obtain the monic λi (s) ∈ I[1, r − 1].Theform(9.14) is derived. 

The matrix (s) is called the Smith normal form of A(s). At the end of this section, the equivalence between a and an will be proven by using the Smith normal form in Theorem 9.13.

Theorem 9.14 A square complex polynomial matrix is unimodular if and only if it is invertible.

Proof Necessity can be easily proven by Definition 9.14 and the simple fact that any is invertible. Now, let us show the sufficiency. Let A(s) ∈ Cn×n [s] be an invertible matrix. By applying Theorem 9.13, there exist two unimodular matrices U(s), V (s) ∈ Cn×n [s] such that U(s)  A(s)  V (s) = D(s), (9.16) where D(s) = diag (d1(s), d2(s), ..., dn(s)), di (s) ∈ C [s], i ∈ I[1, n]. It follows from Proposition 9.2 that U(s) and V (s) are invertible. Combining this with the fact that A(s) is invertible, by Lemma 9.11 one knows that D(s) = diag (d1(s), d2(s), ..., dn(s)) is invertible. According to Definition 9.13, D(s) is invertible if and only if there exists a polynomial matrix E(s) = eij (s) such that D(s)  E(s) = I . This relation implies that

di (s)  eii(s) = 1, i ∈ I[1, n], (9.17)

eij(s) = 0, i = j.

It is easily obtained from (9.17) that di (s), i ∈ I[1, n], are all nonzero constants. Hence, D(s) is also unimodular. In addition, since U(s) and V (s) are invertible, then it follows from (9.16) that

A(s) = U −(s)  D(s)  V −(s). (9.18)

According to Item (1) of Proposition 9.2, the relation (9.18) together with the fact that U −(s), V −(s) and D(s) are unimodular, implies that A(s) is unimodular. With the above two aspects, the conclusion is thus true.  9.6 Unimodular Matrices and Smith Normal Form 377

Remark 9.6 In the framework of ordinary products, a unimodular matrix can be defined as a polynomial matrix whose determinant is a nonzero constant. However, such a definition is not suitable to the case in the framework of conjugate products since the conjugate product of two complex polynomials does not obey commuta- tivity, and thus the determinant can not be conventionally defined.

9.7 Greatest Common Divisors

In this section, greatest common divisors are investigated for two polynomial matrices in the framework of conjugate products. Definition 9.16 For three polynomial matrices A(s), B(s), and C(s) satisfying

A(s) = B(s)  C(s),

C(s) is said to be a right divisor of A(s), and B(s) is said to be a left divisor of A(s); A(s) is said to be a right (left, respectively) multiple of B(s) (C(s), respectively). ( ) ( ) Definition 9.17 For two polynomial matrices AR s and BR s with the same num- ber of columns, a square polynomial matrix R(s) is called a common right divisor ( ) ( ) ( ) ( ) of AR s and BR s , if there exist polynomial matrices R s and R s such that

( ) = ( )  ( ) ( ) = ( )  ( ) AR s R s R s , BR s R s R s .

( ) ( ) For two polynomial matrices AL s and BL s with the same number of rows, a ( ) ( ) ( ) square polynomial matrix L s is called a common left divisor of AL s and BL s , ( ) ( ) if there exist polynomial matrices L s and L s such that

( ) = ( )  ( ) ( ) = ( )  ( ) AL s L s L s , BL s L s L s .

From the previous definitions, common right and left divisors are square polyno- mial matrices. The following two definitions give the concepts of greatest common divisors. Definition 9.18 A greatest common right divisor (gcrd) of two polynomial matrices N(s) and D(s) with the same number of columns is any polynomial matrix R(s) with the following two properties: (1) R(s) is a common right divisor of N(s) and D(s); (2) If R1(s) is any other common right divisor of N(s) and D(s), then R1(s) is a right divisor of R(s). Definition 9.19 A greatest common left divisor (gcld) of two polynomial matrices A(s) and B(s) with the same number of rows is any matrix L(s) with the following two properties: 378 9 Conjugate Products

(1) L(s) is a common left divisor of A(s) and B(s); (2) If L1(s) is any other common left divisor of A(s) and B(s), then L1(s) is a left divisor of L(s).

The following lemma provides a constructive method to obtain a greatest common right divisor.

Theorem 9.15 For two polynomial matrices N(s) ∈ Cn×r [s] and D(s) ∈ Cr×r [s], find a unimodular matrix U(s) ∈ C(n+r)×(n+r) [s] such that     N(s) R(s) U(s)  = (9.19) D(s) 0 with R(s) ∈ Cr×r [s]. Then the square polynomial matrix R(s) is a gcrd of D(s) and N(s).

Proof Denote   U (s) U (s) U(s) = 11 12 U21(s) U22(s)

n×n n×r with U11(s) ∈ C [s] and U12(s) ∈ C [s]. Since U(s) is unimodular, then its inverse exists. Let

 −   U (s) U (s) V (s) V (s) 11 12 = 11 12 . U21(s) U22(s) V21(s) V22(s)

Then       N(s) V (s) V (s) R(s) = 11 12  , D(s) V21(s) V22(s) 0 so that N(s) = V11(s)  R(s), D(s) = V21(s)  R(s), and R(s) is a common right divisor of D(s) and N(s). Next, notice that one also has

R(s) = U11(s)  N(s) + U12(s)  D(s), so that if R1(s) is another common right divisor, say

N(s) = N1(s)  R1(s), D(s) = D1(s)  R1(s), then R(s) = [U11(s)  N1(s) + U12(s)  D1(s)]  R1(s).

This relation implies that R1(s) is a right divisor of R(s), and R(s) is thus a gcrd.  9.7 Greatest Common Divisors 379

Remark 9.7 For N(s) ∈ Cn×r [s] and D(s) ∈ Cr×r [s], a gcrd R(s) of N(s) and D(s) can be obtained by using the construction for getting the Smith normal form. ( ) ( ) In details, if U s and UR s are two unimodular matrices such that     N(s) (s) U(s)   U (s) = . D(s) R 0

( ) = ( )  −( ) Then a gcrd can be given as R s s UR s . However, it is not necessary to adopt Smith normal form reduction to obtain a gcrd. It suffices to use only row elementary transformations.

Similar to Theorem 9.15, one can obtain the following result on greatest common left divisors. The proof is similar to that of Theorem 9.15, and hence is omitted.

Theorem 9.16 For two polynomial matrices A(s) ∈ Cn×n [s] and B(s) ∈ Cn×r [s], find a unimodular matrix V (s) ∈ C(n+r)×(n+r) [s] such that     A(s) B(s)  V (s) = L(s) 0 with L(s) ∈ Cn×n [s]. Then the square polynomial matrix L(s) is a gcld of A(s) and B(s).

9.8 Coprimeness of Polynomial Matrices

In this section, the concept of coprimeness for polynomial matrices is given in the framework of conjugate products.

Definition 9.20 Two polynomial matrices with the same number of columns are right coprime if all their gcrds are unimodular. Two polynomial matrices with the same number of rows are left coprime if all their gclds are unimodular.

The following theorem gives criteria of right coprimeness of two polynomial matrices.

Theorem 9.17 Given N(s) ∈ Cn×r [s] and D(s) ∈ Cr×r [s], they are right coprime if and only if there exist polynomial matrices X (s) ∈ Cr×n [s] and Y (s) ∈ Cr×r [s] such that X(s)  N(s) + Y (s)  D(s) = I. (9.20)

Proof Given N(s) ∈ Cn×r [s] and D(s) ∈ Cr×r [s], it follows from the proof of The- orem 9.15 that any gcrd R(s) can be written as

R(s) = U11(s)  N(s) + U12(s)  D(s). (9.21) 380 9 Conjugate Products

If N(s) and D(s) are right coprime, then R(s) must be unimodular, so that R−(s) exists. Denote

− − X(s) = R (s)  U11(s), Y (s) = R (s)  U12(s).

Then the relation (9.20) is immediately obtained from (9.21). Conversely, suppose that there exist X(s) and Y (s) satisfying (9.20). In addition, let R(s) be any common right divisor of N(s) and D(s), that is, there exist N1(s) ∈ n×r r×r C [s] and D1(s) ∈ C [s] such that

N(s) = N1(s)  R(s), D(s) = D1(s)  R(s).

Combining this relation with (9.20), gives

[X(s)  N1(s) + Y (s)  D1(s)]  R(s) = I .

By applying Lemma 9.10, it follows from this relation that R(s) is invertible, and hence R(s) is unimodular according to Theorem 9.14. Therefore, by Definition 9.20 it is known that N(s) and D(s) are right coprime. With the above aspects, the proof is thus completed. 

Theorem 9.18 Given N(s) ∈ Cn×r [s] and D(s) ∈ Cr×r [s], they are right coprime ( ) ∈ C(n+r)×(n+r)[ ] if and only if there exists a unimodular matrix UR s s such that     N(s) I U (s)  = . (9.22) R D(s) 0

Proof Necessity. By carrying out row elementary transformations, one can find a unimodular matrix U(s) ∈ C(n+r)×(n+r)[s] such that (9.19) holds. It follows from Theorem 9.15 that R(s) is a gcrd of N(s) and D(s). Since N(s) and D(s) are right coprime, then R(s) is unimodular by the definition of right coprimeness. Thus, R(s) is invertible, and R−(s) is also unimodular. With this, one has from (9.19)       R−(s) 0 N(s) I  U(s)  = . 0 I D(s) 0

By taking   R−(s) 0 U (s) =  U(s), (9.23) R 0 I

( ) −( ) one can obtain (9.22). It is obvious that UR s in (9.23) is unimodular since R s and U(s) are both unimodular. Sufficiency. The conclusion can be obtained by Theorem 9.15. With these statements, the conclusion of this theorem is thus proven.  9.8 Coprimeness of Polynomial Matrices 381

Theorem 9.19 Given N(s) ∈ Cn×r [s] and D(s) ∈ Cr×r [s], they are right coprime n×n r×n if and only if there exist two polynomial matrices N1 ∈ C [s] and D1 ∈ C [s] such that   N(s) N (s) U(s) = 1 D(s) D1(s) is a unimodular matrix.

Proof Sufficiency. Since U(s) is unimodular, then U −(s) is also unimodular. Par- − tition U (s) as   − V (s) V (s) U (s) = 11 12 V21(s) V22(s)

r×n r×r − with V11(s) ∈ C [s] and V12(s) ∈ C [s].ItfollowsfromU (s)  U(s) = I that V11(s)  N(s) + V12(s)  D(s) = I .

By Theorem 9.17, N(s) and D(s) are right coprime. Necessity. Since N(s) and D(s) are right coprime, then there exists a unimodular ( ) ( ) −( ) matrix UR s satisfying (9.22). Since UR s is unimodular, then UR s exists, and is also unimodular. Denote   ( ) ( ) −( ) = W11 s W12 s UR s W21(s) W22(s)

n×r r×r with W11(s) ∈ C [s] and W21(s) ∈ C [s]. It is obvious that     ( ) ( )  W12 s = 0 UR s . W22(s) I

Combining this with (9.22), gives   ( ) ( ) ( )  N s W12 s = UR s I . D(s) W22(s)

( ) = ( ) ( ) = ( ) ( )  ( ) = By taking N1 s W12 s and D1 s W22 s , one has UR s U s I , which implies that U(s) is invertible, and thus is unimodular. The proof is thus completed. 

Similarly, on left coprimeness one has the following theorem.

Theorem 9.20 Given A(s) ∈ Cn×n [s] and B(s) ∈ Cn×r [s], the following state- ments are equivalent:

(1) A(s) and B(s) are left coprime; (2) there exist polynomial matrices X(s) and Y (s) such that 382 9 Conjugate Products

A(s)  X(s) + B(s)  Y (s) = I;

( ) ∈ C(n+r)×(n+r)[ ] (3) there exists a unimodular matrix UL s s such that     ( ) ( )  ( ) = A s B s UL s I 0 .

r×n r×r (4) there exist polynomial matrices A1(s) ∈ C [s] and B1(s) ∈ C [s] such that   A(s) B(s) V (s) = A1(s) B1(s)

is a unimodular matrix.

Remark 9.8 By applying Theorem 9.20, it is known that X (s) and Y (s) in Theorem 9.17 are also left coprime. Similarly, X(s) and Y (s) in Theorem 9.20 are also right coprime.

9.9 Conequivalence and Consimilarity

In the framework of ordinary products of polynomial matrices, the following result is well known.

Lemma 9.12 For two square matrices A, B ∈ Cn×n, A is similar to B if and only if (sI − A) is equivalent to (sI − B), that is, there exist two unimodular matrices (in the framework of ordinary products of polynomial matrices) such that

U(s)(sI − A)V (s) = sI − B.

In matrix algebra theory, consimilarity is also an important concept, and has attracted considerable attention from many researchers. It has been known from Sect. 6.1 that the solvability condition of a normal con-Sylvester matrix equation can be characterized in terms of consimilarity of two partitioned matrices. In this section, as an application of the conjugate product of polynomial matrices, a necessary and sufficient condition will be provided for consimilarity in terms of conequivalence. Before giving the main result, the following lemma is needed.

Lemma 9.13 Given A(s),B(s) ∈ Cn×n [s], if the leading coefficient matrix of B(s) ( ) ( ) ( ) ( ) ∈ Cn×n is invertible, then there exist QR s ,QL s ,RR s ,RL s [s] such that

( ) = ( )  ( ) + ( ) A s B s QR s RR s or ( ) = ( )  ( ) + ( ) A s QL s B s RL s 9.9 Conequivalence and Consimilarity 383

( ) = ( ) = ( )< ( ) ( )< ( ) where RR s 0,RL s 0,ordeg RR s deg B s , deg RL s deg B s . The proof of this lemma is similar to that of Theorem 9.3, and hence is omitted. By this lemma, the main result of this section can be given in the following theorem.

Theorem 9.21 For two complex matrices A, B ∈ Cn×n, A is consimilar to B if and only if (sI − A) is conequivalent to (sI − B).

Proof Necessity: Since A is consimilar to B, then there exists a nonsingular matrix C such that C−1 AC = B. With this relation, one has

C−1  (sI − A)  C ∗1 = sC−1 C − C−1 AC = sI − B.

This implies that (sI − A) is conequivalent to (sI − B). Sufficiency: Since (sI − A) is conequivalent to (sI − B), then there exist two unimodular matrices U(s) and V (s) such that

sI − A = U(s)  (sI − B)  V (s), or, equivalently U −(s)  (sI − A) = (sI − B)  V (s). (9.24)

n×n n×n It follows from Lemma 9.13 that there exist Q(s), R(s) ∈ C [s] andU0, V0 ∈ C such that

U(s) = (sI − A)  Q(s) + U0, (9.25)

V (s) = R(s)  (sI − A) + V0. (9.26)

By substituting (9.26)into(9.24), one has

− U (s)  (sI − A) = (sI − B)  R(s)  (sI − A) + (sI − B)  V0.

It follows from this relation that   − U (s) − (sI − B)  R(s)  (sI − A) = (sI − B)  V0. (9.27)

Since the matrix on the right-hand side of (9.27) and the matrix (sI − A) are both of degree one, then there exits a constant matrix C ∈ Cn×n such that

U −(s) − (sI − B)  R(s) = C. (9.28)

With this observation, the relation (9.27) can be rewritten as 384 9 Conjugate Products

C  (sI − A) = (sI − B)  V0. (9.29)

It follows from (9.28) that

U(s)  C + U(s)  (sI − B)  R(s) = I .

Combining this relation with (9.24), one has

I = U(s)  C + (sI − A)  V −(s)  R(s). (9.30)

By substituting (9.25)into(9.30), it can be derived that

− I = [(sI − A)  Q(s) + U0]  C + (sI − A)  V (s)  R(s).

This relation can be equivalently written as   − I = (sI − A)  Q(s)  C + V (s)  R(s) + U0C. (9.31)

By comparing the degree of both sides in (9.31), it is easily known that the first term in the right-hand side of (9.31) should be zero. Therefore, one has

I = U0C.

= −1 This implies that C is nonsingular, and C U0 . = −1 With C U0 , it follows from (9.29) that

−1  ( − ) = ( − )  U0 sI A sI B V0.

It is equivalent to −1 = −1 = U0 V0, U0 A BV0.

−1 = It follows from this relation that U0 AU0 B, that is, A is consimilar to B. The proof is thus completed. 

The above theorem implies that conequivalence plays the same important role in the theory of consimilarity as the conventional equivalence does with respect to similarity. Accordingly, it is conjectured that the Jordan canonical form of a matrix A under consimilarity can be obtained from the Smith normal form of (sI − A) in the framework of conjugate products. At the end of this section, a result is given on the solvability condition of normal con-Sylvester matrix equations. This conclusion can be easily obtained by combining the above theorem with Theorem 6.1.

Corollary 9.2 Let A ∈ Cn×n, F ∈ Cp×p, and C ∈ Cn×p be given. The normal con- Sylvester matrix equation 9.9 Conequivalence and Consimilarity 385

XF − AX = C with respect to X has a solution if and only if the following two polynomial matrices are conequivalent:     sI − AC sI − A 0 , . 0 sI − F 0 sI − F

9.10 An Example

Consider the following two complex polynomial matrices ⎡ ⎤ −is + 1 −s2 + 2is − (1 + i)(1 + i) s − i A(s) = ⎣ is2 s2 − (1 + i) s + i (1 − i) s2 − s ⎦ , s is − 12s2 − s + (1 − i) ⎡ ⎤ is + i (1 + i) s + 2 B(s) = ⎣ 0is + 1 ⎦ . −iss

By using elementary column transformations, one can find the unimodular matrix 8 i U(s) = Ui s , whose coefficient matrices are given in Tables 9.1 and 9.2, i=0 satisfying ⎡ ⎤   10000 A(s) B(s)  U(s) = ⎣ 01000⎦ . 00100

By applying Theorem 9.20, it is known that the polynomial matrices A(s) and B (s) are left coprime.

9.11 Notes and References

In this chapter, a new operation, conjugate product, is introduced for polynomials and polynomial matrices. Some concepts in the framework of ordinary products of polynomial matrices have been generalized to the framework of conjugate products. These concepts include elementary transformations, unimodular matrices, coprime- ness, Smith normal form. It has been shown that a complex polynomial matrix can be transformed into a Smith normal form by using some elementary row and column transformations in the framework of conjugate products. In addition, some necessary and sufficient conditions are provided for coprimeness of polynomials and polyno- mial matrices. 386 9 Conjugate Products

Table 9.1 The coefficient matrices U , i ∈ I[3, 8] of U(s) ⎡ ⎤ i 000 0 0 ⎢ ⎥ ⎢ ⎥ ⎢ 000 0 0⎥ ⎢ ⎥ = ⎢ ⎥ U8 ⎢ 000 0 0⎥ ; ⎢ 3640 − 8450 ⎥ ⎣ 000 5009 5009 i0⎦ − 8450 − 3640 ⎡ 000 5009 5009 i0 ⎤ 00 0 9620 − 24 180 i0 ⎢ 5009 5009 ⎥ ⎢ 20 540 1170 ⎥ ⎢ 00 0− − i0⎥ ⎢ 5009 5009 ⎥ = ⎢ ⎥ U7 ⎢ 00 0 00⎥ ; ⎢ − 910 + 4225 3315 + 7865 861 + 3519 ⎥ ⎣ 0 5009 10 018 i 5009 10 018 i 5009 5009 i0⎦ 4225 + 910 7865 − 3315 − 4931 + 2779 ⎡ 0 10 018 5009 i 10 018 5009 i 5009 5009 i0 ⎤ 0 − 2405 + 6045 i − 14 495 + 1235 i 50 390 + 30 796 i0 ⎢ 5009 5009 5009 5009 5009 5009 ⎥ ⎢ 5135 585 4550 21 125 14 757 61 189 ⎥ ⎢ 0 + i + i − i0⎥ ⎢ 5009 10 018 5009 10 018 5009 5009 ⎥ = ⎢ ⎥ U6 ⎢ 00 0 00⎥ ; ⎢ − 11 903 − 4197 − 7149 − 19 553 − 23 113 + 962 ⎥ ⎣ 0 10 018 10 018 i 10 018 10 018 i 5009 5009 i0⎦ 14 + 10 083 − 5844 + 13 779 − 8779 + 8244 ⎡ 0 5009 10 018 i 5009 10 018 i ⎤ 5009 5009 i0 0 2625 − 1651 159 549 0 ⎢ 10 018 10 018 5009 ⎥ ⎢ 781 30 183 137 138 ⎥ ⎢ 0 − − 0 ⎥ ⎢ 10 018 10 018 5009 ⎥ = ⎢ ⎥ U5 ⎢ 00 0 0 0⎥ ⎢ − 11 040 − 315 155 771 − 118 300 ⎥ ⎣ 0 5009 5009 5009 5009 ⎦ 8598 20 038 − 7010 274 625 0 ⎡ 5009 5009 5009 5009 ⎤ 0 1336 10 801 − 86 223 0 ⎢ 5009 5009 5009 ⎥ ⎢ 2606 3972 57 063 ⎥ ⎢ 0 − 0 ⎥ ⎢ 5009 5009 5009 ⎥ + ⎢ ⎥ i ⎢ 00 0 0 0⎥ ; ⎢ 14 763 37 269 80 719 274 625 ⎥ ⎣ 0 10 018 10 018 5009 5009 ⎦ 9021 5957 − 145 497 118 300 ⎡ 0 5009 5009 5009 5009 ⎤ 0 − 25 927 − 146 747 − 27 102 − 312 650 ⎢ 10 018 10 018 5009 5009 ⎥ ⎢ 47 511 38 929 143 767 667 550 ⎥ ⎢ 0 − ⎥ ⎢ 10 018 10 018 5009 5009 ⎥ = ⎢ ⎥ U4 ⎢ 00 0 0 0⎥ ⎢ 1820 − 37 727 8201 67 251 − 794 300 ⎥ ⎣ 5009 5009 5009 5009 5009 ⎦ − 4225 24 531 − 107 061 50 693 38 025 ⎡5009 10 018 10 018 5009 5009 ⎤ 0 60 565 − 43 925 − 42 653 785 850 ⎢ 10 018 10 018 5009 5009 ⎥ ⎢ 17 615 106 031 68 964 38 025 ⎥ ⎢ 0 − − ⎥ ⎢ 10 018 10 018 5009 5009 ⎥ + ⎢ ⎥ i ⎢ 00 0 0 0⎥ ; ⎢ − 4225 − 9115 − 88 980 91 765 − 236 600 ⎥ ⎣ 5009 5009 5009 5009 5009 ⎦ − 1820 28 594 29 254 − 111 302 676 000 ⎡ 5009 5009 5009 5009 5009⎤ 4810 − 1749 − 12 765 215 570 139 425 ⎢ 5009 10 018 10 018 5009 5009 ⎥ ⎢ 10 270 4059 48 467 80 766 42 250 ⎥ ⎢ − − ⎥ ⎢ 5009 10 018 10 018 5009 5009 ⎥ = ⎢ ⎥ U3 ⎢ 0000 0⎥ ⎢ 11 903 − 16 813 − 38 581 − 36 038 − 350 675 ⎥ ⎣ 5009 5009 5009 5009 5009 ⎦ − 28 − 2401 3503 47 751 − 8450 ⎡5009 10 018 10 018 5009 5009 ⎤ − 12 090 35 217 − 5853 232 248 287 300 ⎢ 5009 10 018 10 018 5009 5009 ⎥ ⎢ 585 25 083 34 455 58 431 333 775 ⎥ ⎢ − − ⎥ ⎢ 5009 10 018 10 018 5009 5009 ⎥ + ⎢ ⎥ i ⎢ 0000 0⎥ . ⎢ 4197 − 13 289 − 6868 348 657 − 202 800 ⎥ ⎣ 5009 5009 5009 5009 5009 ⎦ − 10 083 11 965 25 451 − 46 480 67 600 5009 10 018 10 018 5009 5009 9.11 Notes and References 387

Table 9.2 The coefficient matrices U , i = 0, 1, 2ofU(s) ⎡ i ⎤ − 2625 − 73 335 − 21 025 327 635 − 1782 950 ⎢ 5009 5009 5009 5009 5009 ⎥ ⎢ 781 − 12 913 − 12 913 58 140 − 392 925 ⎥ ⎢ 5009 5009 5009 5009 5009 ⎥ = ⎢ ⎥ U2 ⎢ 00 00 0⎥ ⎣ 4530 − 83 707 210 571 57 273 − 299 975 ⎦ 5009 10 018 10 018 5009 5009 289 9729 − 50 379 18 474 − 118 300 5009⎡ 10 018 10 018 5009 5009 ⎤ − 2672 7261 − 296 389 108 982 − 701 350 ⎢ 5009 10 018 10 018 5009 5009 ⎥ ⎢ − 5212 7615 − 22 439 18 880 156 325 ⎥ ⎢ 5009 5009 5009 5009 5009 ⎥ + ⎢ ⎥ i ⎢ 00000⎥ ; ⎣ 2722 − 58 826 − 92 887 284 223 − 1787 175 ⎦ 5009 5009 5009 5009 5009 − 492 25 537 25 537 − 49 684 274 625 ⎡ 5009 10 018 10 018 5009 5009 ⎤ 26 057 − 58 691 − 14 505 − 8491 − 540 800 ⎢ 5009 10 018 10 018 5009 5009 ⎥ ⎢ 1 −1 −110⎥ ⎢ ⎥ = ⎢ ⎥ U1 ⎢ 00 0 0 0⎥ ⎣ 5725 − 23 879 − 20 411 + 11 321 − 8376 − 325 325 ⎦ 5009 10 018 10 018 10 018 i 5009 5009 00 −110 ⎡ ⎤ 9505 − 1369 55 271 − 15 513 169 000 ⎢ 5009 10 018 10 018 5009 5009 ⎥ ⎢ 001−10⎥ ⎢ ⎥ + ⎢ ⎥ i ⎢ 0000 0⎥ ; ⎣ 25 276 − 32 865 11 321 − 56 613 − 147 875 ⎦ 5009 10 018 10 018 5009 5009 −10 0 2 0 ⎡ ⎤ 4720 − 9729 − 9729 6571 118 300 ⎢ 5009 10 018 10 018 5009 5009 ⎥ ⎢ 00 −11 0⎥ ⎢ ⎥ = ⎢ ⎥ U0 ⎢ 00 0 1 0⎥ ⎣ 4517 − 4517 − 14 535 − 4603 274 625 ⎦ 5009 10 018 10 018 5009 5009 01 0−10 ⎡ ⎤ − 4517 4517 − 15 519 19 630 − 274 625 ⎢ 5009 10 018 10 018 5009 5009 ⎥ ⎢ 00 0 −10⎥ ⎢ ⎥ + ⎢ ⎥ i ⎢ 00 0 0 0⎥ . ⎣ − 289 10 307 289 − 13 465 118 300 ⎦ 5009 10 018 10 018 5009 5009 00 1 −10

As an application, the consimilarity of complex matrices is also investigated. It has been revealed that two complex matrices A and B are consimilar if and only if (sI − A) and (sI − B) are conequivalent, that is, there exist two unimodular matrices U(s) and V (s) in the framework of conjugate products such that U(s)  (sI − A)  V (s) = sI − B. Such a result implies that the conjugate product of polynomial matrices may play a very vital role in some problems related to consimilarity. It is expected that the conjugate product can be applied to more other areas. The contents of this chapter are mainly taken from results in [250, 271]. Chapter 10 Con-Sylvester-Sum-Based Approaches

In this chapter, a general complex conjugate matrix equation, which includes the previously investigated con-Sylvester matrix equation and generalized con-Sylvester matrix equation as special cases, is investigated. By introducing the concept of con- Sylvester sum, general explicit solutions are established for this general matrix equa- tion. The concept of the con-Sylvester sum is very relevant to the conjugate product of polynomial matrices. The concept of con-Sylvester sum and its properties are given in Sect. 10.1. The solutions of the con-Sylvester-polynomial matrix equation are derived in Sect. 10.2.

10.1 Con-Sylvester Sum

This section is started with the definition of the con-Sylvester sum.

Definition 10.1 Given a polynomial matrix

t i n×r T (s) = Ti s ∈ C [s], i=0 and two matrices Z ∈ Cr×p and F ∈ Cp×p, their con-Sylvester sum is defined as

t F  ←− ∗i i T (s)  Z = Ti Z F . i=0

F With this definition, if T (s) = sI − A, T (s)  X becomes XF − AX;ifT (s) = F I − sA, T (s)  X becomes X − AXF. If all the matrices are required to be real,

© Springer Science+Business Media Singapore 2017 389 A.-G. Wu and Y. Zhang, Complex Conjugate Matrix Equations for Systems and Control, Communications and Control Engineering, DOI 10.1007/978-981-10-0637-1_10 390 10 Con-Sylvester-Sum-Based Approaches the con-Sylvester sum becomes the Sylvester sum in [257]. By this definition, the following interesting properties of the con-Sylvester sum can be obtained. n×r p×p r×p Lemma 10.1 Let T1(s),T2(s),T(s) ∈ C [s],F∈ C , and X, Y ∈ C . Then

F F F T (s)  (X + Y ) = T (s)  X + T (s)  Y; F F F (T1(s) + T2(s))  X = T1(s)  X + T2(s)  X.

Some further properties of the con-Sylvester sum are given in the next lemmas. Lemma 10.2 Let A(s) ∈ Cl×q [s],B(s) ∈ Cq×r [s],F ∈ Cp×p, and Z ∈ Cr×p. Then   F F F (A(s)  B(s))  Z = A(s)  B(s)  Z .

Proof Let m n i j A(s) = Ai s , B(s) = B j s . i=0 j=0

By applying Definition 10.1 of con-Sylvester sums, Definition 9.12 of conjugate products, and Lemma 3.2, one can obtain

F ( ( )  ( ))  ⎛A s B s Z ⎞ m n   F = ⎝ ∗i i+ j ⎠  Ai B j s Z i=0 j=0

m n ←−− = ∗i ∗(i+ j) i+ j Ai B j Z F ; i=0 j=0 and   F F A(s)  B(s)  Z ⎛ ⎞ m n  F  ←− i ⎝ ∗ j j ⎠ = Ai s  B j Z F i=0 j=0 ⎛ ⎞ ∗i m n ←− ←− ⎝ ∗ j j ⎠ i = Ai B j Z F F i=0 j=0 m n   ←− ∗i ←− ∗ j j i = Ai B j Z F F i=0 j=0 10.1 Con-Sylvester Sum 391

m n    ←− ∗i ←− = ∗i ∗ j ∗i j i Ai B j Z F F i=0 j=0 m n ←−− = ∗i ∗(i+ j) i+ j Ai B j Z F . i=0 j=0

The above two relations imply the conclusion of this lemma. 

× × × × Lemma 10.3 Let A(s) ∈ Cp1 r [s],B(s) ∈ Cp2 r [s],Z∈ Cr p, and F ∈ Cp p. Then, ⎡ ⎤   F A(s) F A(s)  Z  Z = ⎣ ⎦ . B(s) F B(s)  Z

Proof Let m m i i A(s) = Ai s , B(s) = Bi s . i=0 i=0

According to Definition 10.1, one has   A(s) F  Z B(s)    ←−  m ←− m ∗i i Ai ∗ Ai Z F = i i = ←− Z F ∗ Bi i i i=0 i=0 Bi Z F ⎡  ⎤ ⎡ ⎤ m ←− F A Z ∗i F i ( )  ⎣ = i ⎦ ⎣ A s Z ⎦ = im 0 ←− = . ∗i i F Bi Z F ( )  i=0 B s Z

The proof is thus completed. 

q×r1 q×r2 r1×p r2×p Lemma 10.4 Let A(s) ∈ C [s],B(s) ∈ C [s],Z1 ∈ C ,Z2 ∈ C , and F ∈ Cp×p. Then,     F F F Z1 A(s) B(s)  = A(s)  Z1 + B(s)  Z2. Z2

Proof Let m m i i A(s) = Ai s , B(s) = Bi s . i=0 i=0 392 10 Con-Sylvester-Sum-Based Approaches

According to Definition 10.1, one has     F Z A(s) B(s)  1 Z 2 m      F i Z1 = Ai Bi s  Z2 i=0   m   ∗i ←− Z1 i = Ai Bi F Z2 i=0 m ←− ←− = ∗i i + ∗i i Ai Z1 F Bi Z2 F i=0 m ←− m ←− = ∗i i + ∗i i Ai Z1 F Bi Z2 F i=0 i=0 F F = A(s)  Z1 + B(s)  Z2.

The proof is thus completed. 

By using Lemmas 10.3 and 10.4, one can obtain the following corollary.

q ×r r ×p p×p Corollary 10.1 Let Tij(s) ∈ C i j [s],Zj ∈ C j ,i, j = 1, 2, and F ∈ C . Then ⎡ ⎤     F F ( ) ( ) F ( )  + ( )  T11 s T12 s  Z1 = ⎣ T11 s Z1 T12 s Z2 ⎦ F F . T21(s) T22(s) Z2 T21(s)  Z1 + T22(s)  Z2

The following result gives a property of a con-Sylvester sum induced by a uni- modular matrix.

Lemma 10.5 Let U(s) ∈ Cn×n[s] be a unimodular matrix in the framework of F conjugate products. Then, the mapping π : Z → U(s)  Z is bijective for any F ∈ Cp×p.

Proof Since U(s) ∈ Cn×n[s] is a unimodular matrix in the framework of conjugate products, then it follows from the results in Chap. 9 that its inverse (in the framework of conjugate products) exists. Denote V (s) = U −(s). Then

U(s)  V (s) = V (s)  U(s) = I.

It is very obvious that the following mapping is bijective

F χ : Z → I  Z = Z. 10.1 Con-Sylvester Sum 393

By applying Lemma 10.2, one has

F F  = ( ( )  ( ))  I Z U s V s Z F F = U(s)  V (s)  Z .

F F F Let V (s)  Z = Y . Then I  Z = U(s)  Y . Since χ is surjective, then the mapping

F π : Y → U(s)  Y must be surjective. In addition,

F F  = ( ( )  ( ))  I Z V s U s Z F F = V (s)  U(s)  Z .

F This relation together with the fact that χ is injective implies that π : Z → U(s)  Z F is injective. The previous two facts imply that π : Z → U(s)  Z is bijective. The proof is thus completed. 

With the above result as tools, the following two lemmas can be easily derived.

Lemma 10.6 Given A(s) ∈ Cn×n[s] and B(s) ∈ Cn×r [s],ifA(s) and B(s) are left coprime in the framework of conjugate products, then the mapping

  F ϕ : Z → A(s) B(s)  Z is surjective for any F ∈ Cp×p.

Proof Since A(s) and B(s) are left coprime in the framework of conjugate products, then by Theorem 9.20 there exist unimodular matrices U(s) ∈ Cn×n[s] and V (s) ∈ C(n+r)×(n+r)[s] such that     A(s) B(s) = U(s)  I 0  V (s).

It is very obvious that the mapping

  F   χ : Z → I 0  Z = I 0 Z is surjective. This fact, together with Lemma 10.5, implies that ϕ is surjective. The proof is thus completed.  394 10 Con-Sylvester-Sum-Based Approaches

Similarly, one can obtain the following result.

Lemma 10.7 Given N(s) ∈ Cn×r [s] and D(s) ∈ Cr×r [s],ifN(s) and D(s) are right coprime in the framework of conjugate products, then the mapping   N(s) F ψ : Z →  Z D(s) is injective for any F ∈ Cp×p.

10.2 Con-Sylvester-Polynomial Matrix Equations

In this section, a kind of general complex matrix equations is investigated with the previous con-Sylvester sum as tools.

10.2.1 Homogeneous Case

The following homogeneous con-Sylvester-polynomial matrix equation is first con- sidered

φ1 ←− φ2 ←− ∗i i ∗i i Ai X F + Bi Y F = 0, (10.1) i=0 i=0

n×n n×r p×p where Ai ∈ C , i ∈ I[0, φ1], Bi ∈ C , i ∈ I[0, φ2], and F ∈ C are known matrices, and X ∈ Cn×p and Y ∈ Cr×p are the matrices to be determined. When φ1 = 1, A0 = I, and φ2 = 0, the matrix equation (10.1) becomes

X + A1 XF + B0Y = 0, which is the homogeneous con-Yakubovich matrix equation investigated in Sect. 6.4. When φ1 = 1, A1 = I , and φ2 = 0, the matrix equation (10.1) becomes

A0 X + XF + B0Y = 0, which is the con-Sylvester matrix equation in Sect. 6.3. In addition, when all the matrices in (10.1) are real, the matrix equation (10.1) is exactly the so-called homo- geneous Sylvester-polynomial matrix equation investigated in [257]. Thus, the matrix equation (10.1) can be viewed as a conjugate version of the Sylvester-polynomial matrix equation. Due to this reason, in this book the matrix equation (10.1) is referred to as the con-Sylvester-polynomial matrix equation. 10.2 Con-Sylvester-Polynomial Matrix Equations 395

Denote φ φ 1 2 i i A(s) = Ai s , B(s) = Bi s . (10.2) i=0 i=0

The matrix equation (10.1) can be rewritten as

F F A(s)  X + B(s)  Y = 0. (10.3)

Before giving the result on the solution to the matrix equation (10.3), the following lemma is given. This result can be easily derived from Theorem 9.20.

Lemma 10.8 Let A(s) ∈ Cn×n[s] and B(s) ∈ Cn×r [s] be a pair of left coprime matrices in the framework of conjugate products. Then, there exist a pair of right coprime polynomial matrices N(s) ∈ Cn×r [s] and D(s) ∈ Cr×r [s] such that

A(s)  N(s) + B(s)  D(s) = 0. (10.4)

Now, the main result of this subsection is given in the following theorem.

Theorem 10.1 Given A(s) ∈ Cn×n[s] and B(s) ∈ Cn×r [s],itisassumedthatA(s) and B(s) are left coprime in the framework of conjugate products. Further, let N(s) ∈ Cn×r [s] and D(s) ∈ Cr×r [s] be a pair of right coprime polynomial matri- ces satisfying (10.4). Then, all the solutions to the con-Sylvester-polynomial matrix equation (10.3) can be parameterized as ⎧ ⎨ F X = N(s)  Z , ⎩ F (10.5) Y = D(s)  Z where Z ∈ Cr×p is an arbitrarily chosen parameter matrix.

Proof Let us first show that the matrices X and Y given in (10.5) satisfy (10.3). By applying some properties of con-Sylvester sums given in Lemmas 10.1 and 10.2, one has

F F ( )  + ( )  A s X B s Y   F X = A(s) B(s)  Y ⎡ ⎤ F   F ( )  = ( ) ( )  ⎣ N s Z ⎦ A s B s F D(s)  Z      F N(s) F = A(s) B(s)   Z D(s) 396 10 Con-Sylvester-Sum-Based Approaches      N(s) F = A(s) B(s)   Z D(s) F = (A(s)  N(s) + B(s)  D(s))  Z F = 0  Z = 0.

This relation implies that X and Y given in (10.5) are solutions to (10.3). Now, let us show the completeness of the solution (10.5). Since A(s) and B(s) are left coprime in the framework of conjugate products, then the mapping     X   F X F F → A(s) B(s)  = A(s)  X + B(s)  Y Y Y is surjective. So the number of degrees of freedom in the solution of

F F A(s)  X + B(s)  Y = 0 is (n + r)p − np = rp. On the other hand, since N(s) and D(s) are right coprime in the framework of conjugate products, the mapping   N(s) F Z →  Z D(s) is injective. So all the rp elements in Z have contribution to the solution. The above two facts imply that the solution given in (10.5) is complete. 

Remark 10.1 If the polynomial matrices N(s) and D(s) in (10.4) are obtained as

ω ω i i N(s) = Ni s , D(s) = Di s , i=0 i=0 then the solution in (10.5) can be written as  ∗ω ←−ω X = N0 Z + N1 ZF + N2 Z FF +···+Nω Z F ∗ω ←−ω . Y = D0 Z + D1 ZF + D2 Z FF +···+ Dω Z F

Remark 10.2 Theorem 10.1 can be extended to a more general class of matrix equa- tions in the form of θ ω  k ←− i Aki Xk F = 0, (10.6) k=1 i=0 10.2 Con-Sylvester-Polynomial Matrix Equations 397 where Aki, i ∈ I[0, ωk ], k ∈ I[1, θ], and F are known matrices, and Xk , k ∈ I[1, θ], are the matrices to be determined. In fact, if one denotes

ω k i Ak (s) = Akis , k ∈ I[1,θ], i=0 then the matrix equation (10.6) can be rewritten as

θ  F Ak (s)  Xk = 0. k=1

Let Ni (s), i ∈ I[1, θ], be a group of right coprime polynomial matrices in the frame- work of conjugate products such that

A1(s)  N1(s) + A2(s)  N2(s) +···+ Aθ (s)  Nθ (s) = 0.

Similarly to Theorem 10.1, all the solutions to the matrix equation (10.6) can be expressed by ⎧ ⎪ F ⎪ = ( )  ⎪ X1 N1 s Z ⎨ F = ( )  X2 N2 s Z , ⎪ ··· ⎪ ⎩⎪ F Xθ = Nθ (s)  Z where Z is the free parameter matrix.

10.2.2 Nonhomogeneous Case

In this subsection, the following nonhomogeneous con-Sylvester-polynomial matrix equation is investigated:

φ φ φ 1 ←− 2 ←− 3 ←− ∗i i ∗i i ∗i i Ai X F + Bi Y F = Ci R F , (10.7) i=0 i=0 i=0

n×n n×r n×m where Ai ∈ C , i ∈ I[0, φ1], Bi ∈ C , i ∈ I[0, φ2], Ci ∈ C , i ∈ I[0, φ3], R ∈ Cm×p, and F ∈ Cp×p are known matrices, and X ∈ Cn×p and Y ∈ Cr×p are the matrices to be determined. When φ1 = 1, A1 = I , φ2 = 0, B0 = 0, φ3 = 0, and C0 = I, the matrix equation (10.7) becomes

A0 X + XF = R, 398 10 Con-Sylvester-Sum-Based Approaches which is the so-called normal con-Sylvester matrix equation investigated in Sect. 6.1. Similarly to the previous subsection, the equation (10.7) is referred to as the non- homogeneous con-Sylvester-polynomial matrix equation. In order to solve it, some notations need to be introduced. Denote φ φ φ 1 2 3 i i i A(s) = Ai s , B(s) = Bi s , C(s) = Ci s . i=0 i=0 i=0

By using the so-called con-Sylvester sum defined in Sect. 10.1, one can rewrite the matrix equation (10.7)as

F F F A(s)  X + B(s)  Y = C(s)  R. (10.8)

In order to obtain the general solution of the matrix equation (10.7), the following lemma is needed. Lemma 10.9 Let A(s) ∈ Cn×n[s] and B(s) ∈ Cn×r [s] be a pair of left coprime matrices in the framework of conjugate products. Then, there exists a unimodular matrix U(s) ∈ C(n+r)×(n+r)[s] such that     A(s) B(s)  U(s) = I 0 . (10.9)

The result in the above lemma is exactly that of Theorem 9.20. With this lemma as a tool, the main result of this subsection can be provided in the following theorem. Theorem 10.2 Given A(s) ∈ Cn×n[s],B(s) ∈ Cn×r [s], and C(s) ∈ Cn×m [s],itis assumed that A(s) and B(s) are left coprime in the framework of conjugate products. Further, let U(s) ∈ C(n+r)×(n+r)[s] be a unimodular matrix satisfying (10.9). Then all the solutions to the matrix equation (10.8) can be parameterized as        X C(s) 0 F R = U(s)   , (10.10) Y 0 I Z where Z ∈ Cr×p is an arbitrarily chosen parameter matrix. If the matrix U(s) is partitioned as   H(s) N(s) U(s) = L(s) D(s) with N(s) ∈ Cn×r [s] and D(s) ∈ Cr×r [s], then the solution in (10.10) can be equiv- alently expressed as ⎧ ⎨ F F X = (H(s)  C(s))  R + N(s)  Z . ⎩ F F (10.11) Y = (L(s)  C(s))  R + D(s)  Z 10.2 Con-Sylvester-Polynomial Matrix Equations 399

Proof Let us show that the matrices X and Y given in (10.11) satisfy (10.8). By applying some properties of con-Sylvester sums given in Lemmas 10.1 and 10.2, one can obtain

F F ( )  + ( )  A s X B s Y   F X = A(s) B(s)  Y        F C(s) 0 F R = A(s) B(s)  U(s)   0 I Z         C(s) 0 F R = A(s) B(s)  U(s)   0 I Z        C(s) 0 F R = I 0   0 I Z     F R = C(s) 0  Z F F = C(s)  R + 0  Z F = C(s)  R.

This relation implies that X and Y givenin(10.11) are solutions to the matrix equation (10.8). The completeness of solution is immediately obtained by Theorem 10.1. 

Remark 10.3 Theorem 10.2 can be extended to a more general class of matrix equa- tions in the form of θ ω  k ←− c ←− i j Aki Xk F = C j RF , (10.12) k=1 i=0 j=0 where Aki, i ∈ I[0, ωk ], k ∈ I[1, θ], C j , j ∈ I[0, c], F, and R are known matrices, and Xk , k ∈ I[1, θ], are the matrices to be determined. In fact, if one denotes

ω k c i j Ak (s) = Akis , k ∈ I[1,θ], C(s) = C j s , i=0 j=0 the matrix equation (10.12) can be rewritten as

θ  F F Ak (s)  Xk = C(s)  R. k=1

Let U(s) be a unimodular matrix such that 400 10 Con-Sylvester-Sum-Based Approaches     A1(s) A2(s) ··· Aθ (s)  U(s) = I 0 .

Similarly to Theorem 10.2, all the solutions to the matrix equation (10.12) can be expressed by ⎡ ⎤ X1      ⎢ X ⎥ C(s) 0 F R ⎢ 2 ⎥ = U(s)   , ⎣ ···⎦ 0 I Z Xθ where Z is the free parameter matrix. If the polynomial matrix U (s) is compatibly decomposed into the following form ⎡ ⎤ H1(s) N1(s) ⎢ ( ) ( ) ⎥ ( ) = ⎢ H2 s N2 s ⎥ U s ⎣ ··· ··· ⎦ , Hθ (s) Nθ (s) then the solution of this matrix equation can be parameterized as ⎧ ⎪ F F ⎪ = ( ( )  ( ))  + ( )  ⎪ X1 H1 s C s R N1 s Z ⎨ F F = ( ( )  ( ))  + ( )  X2 H2 s C s R N2 s Z , ⎪ ··· ⎪ ⎩⎪ F F Xθ = (Hθ (s)  C(s))  R + Nθ (s)  Z where Z is the free parameter matrix.

10.3 An Illustrative Example

Consider the con-Sylvester-polynomial matrix equation

A2 X FF + A1 XF + A0 X + B1YF+ B0Y = R, (10.13) where ⎡ ⎤ ⎡ ⎤ ⎡ ⎤ 0 −10 −i2i1+ i 1 −1 − i −i ⎣ ⎦ ⎣ ⎦ ⎣ ⎦ A2 = i11− i , A1 = 0 −1 − i −1 , A0 = 0i 0 , 00 2 1i−1 0 −11− i ⎡ ⎤ ⎡ ⎤ i1+ i i2 ⎣ ⎦ ⎣ ⎦ B1 = 0i , B0 = 01 . −i1 00 10.3 An Illustrative Example 401

For this matrix equation, one has

( ) = 2 + + A s ⎡s A2 sA1 A0 ⎤ −is + 1 −s2 + 2is − (1 + i)(1 + i) s − i = ⎣ is2 s2 − (1 + i) s + i (1 − i) s2 − s ⎦ , s is − 12s2 − s + (1 − i) ⎡ ⎤ is + i (1 + i) s + 2 B(s) = ⎣ 0is + 1 ⎦ , C(s) = I . −iss

It is noticed that the polynomial matrices A(s) and B(s) are exactly those in Sect. 9.10. 8 i A unimodular matrix U(s) = Ui s satisfying (10.9) has been obtained in i=0 Sect. 9.10. According to Theorem 10.2, all the solutions to the matrix equation (10.13) can be parameterized as     X F R = U(s)  , Y Z which can be equivalently expressed as     8 ∗i X R ←− = U F i Y i Z i=0        R R R R = U + U F + U FF + U F FF 0 Z 1 Z 2 Z 3 Z          R 2 R 2 R 3 + U FF + U F FF + U FF 4 Z 5 Z 6 Z       R 3 R 4 + U F FF + U FF . 7 Z 8 Z

For the matrix equation (10.13) with ⎡ ⎤   3 + 2i −2i 2i i F = , R = ⎣ 21− i ⎦ , 1 −1 + i 1 − i0 when the free parameter matrix Z is chosen as   1 + 2i 2 − i Z = , 3i −1 + 3i 402 10 Con-Sylvester-Sum-Based Approaches a special solution can be obtained as ⎧ ⎡ ⎤ − − ⎪ 9608 057 8701 796i 14 295 552 29 451 169i ⎪ = 1 ⎣ − − − + ⎦ ⎨ X 5009 193 412 1028 356i 17 003 496 10 012 407i 5009 + 10 018i 10 018 − 5009i . ⎪   ⎪ 36 506 125 + 8637893i 65 344 921 + 32 370 313i ⎩ Y = 1 10018 −10 004 499 − 30 781 107i 5890 963 − 41 168 501i

10.4 Notes and References

In this chapter, explicit expressions have been given for the general solution to homo- geneous and nonhomogeneous con-Sylvester-polynomial matrix equations. In order to solve this kind of matrix equations in a unified framework, the concept of con- Sylvester sums is introduced, and some properties of this concept are established. It has been shown that the con-Sylvester-polynomial matrix equation can be written as a very compact expression by using con-Sylvester sums. In addition, the gen- eral solutions of con-Sylvester-polynomial matrix equation can also be expressed by con-Sylvester sums. The proposed solution can provide all the degrees of freedom, which are represented by a free parameter matrix. Similarly to the case in previous chapters, the proposed approach does not require that the coefficient matrices of the involved matrix equations are in any canonical form. In addition, the proposed approach allows some coefficient matrices to be undetermined. The main content of this chapter is taken from [267]. Part III Applications in Systems and Control Chapter 11 Stability for Antilinear Systems

In previous chapters, some complex conjugate matrix equations have been system- atically investigated. From this chapter, as applications of this kind of equations, several analysis and design problems are considered for discrete-time antilinear sys- tems: the ordinary antilinear systems and Markovian jump antilinear systems. As a preliminary, an introduction is given first for discrete-time antilinear systems. It is well-known that the state-space description of a discrete-time linear system is given by x(t + 1) = Ax(t) + Bu(t) , (11.1) where x(t) ∈ Rn is the state of this system, and u (t) ∈ Rm is the input of this system. For the autonomous part x (t + 1) = Ax (t), it is obvious that the mapping x → Ax is linear. In mathematics, there is a concept of antilinear mapping. Let V and W be two complex vector spaces. A mapping f : V → W is called antilinear if for any x, y ∈ V , a, b ∈ C, there holds

f (ax + by) = af(x) + bf(y) .

It is not difficult to find that x → Ax is an antilinear mapping. It should be pointed out that antilinear mappings often occur in quantum mechanics in the study of time reversal and in spinor calculus [29]. Due to this fact, the following discrete-time antilinear system is considered in this chapter

x(t + 1) = Ax(t) + Bu(t), (11.2) where x(t) ∈ Cn is the state of this system, and u (t) ∈ Cm is the input of this system. Similarly, a solution x (t) to the antilinear system (11.2) is called its state response.

© Springer Science+Business Media Singapore 2017 405 A.-G. Wu and Y. Zhang, Complex Conjugate Matrix Equations for Systems and Control, Communications and Control Engineering, DOI 10.1007/978-981-10-0637-1_11 406 11 Stability for Antilinear Systems

Another motivation for investigation of antilinear systems is as follows. First, complex dynamic control systems are ubiquitous in practice and they are well studied in current literature [125, 198]. Second, let us consider the following complex linear system X˙ (t) = AXˆ (t) + BUˆ (t) , where     0 A B 0 Aˆ = ∈ C2n×2n, Bˆ = ∈ C2n×2m , A 0 0 B and the state and the input satisfy the following constraints     In 0n×n X(t) = 0n×n In X(t),     Im 0m×m U(t) = 0m×m Im U(t).

By simple calculation, it is easily found that this constrained system can be equiva- lently transformed into the antilinear system (11.2). This fact implies that antilinear systems are a special type of complex dynamic systems with structure constraint. As symmetric systems and large scale systems in the context of real systems, this type of systems deserve further investigation as special structured complex systems. In the context of linear systems, the algebraic equivalence is a very important concept. This equivalence relation is used to describe the relation of different state space representations for a practical system due to the different choices of the state vector. It is known from linear systems theory that the algebraic equivalence involves mainly a similarity transformation between the coefficient matrices of two systems. Now, the concept of algebraic equivalence is generalized to the context of antilinear systems. Consider the discrete-time antilinear system (11.1), and carry out the following change of variables ξ(t) = Px(t) , where P is a nonsingular matrix. By simple calculation, one has

ξ(t + 1) = A ξ (t) + Bu (t), (11.3) where A = PAP−1, B = PB. (11.4)

In order to distinguish from linear systems, the concept of algebraic anti-equivalence for antilinear systems is given below.

Definition 11.1 Given two antilinear systems (11.2) and (11.3), if there exists a nonsingular matrix P such that (11.4) holds, then the two systems (11.2) and (11.3) are called algebraic anti-equivalent. 11 Stability for Antilinear Systems 407

It can be seen from (11.4) that the algebraic anti-equivalence is closely related to consimilarity of complex matrices. This is different from algebraic equivalence, which is related to similarity of matrices. Besides the ordinary antilinear systems, the discrete-time antilinear systems with Markov jump parameters are also considered in this chapter. This type of systems is described by x(t + 1) = Ar(t)x(t) + Br(t)u(t), (11.5) where x(t) ∈ Cn is the state of this system, and u(t) ∈ Cm is the input of this system; r(t) represents a homogeneous discrete-time Markovian chain that takes values in a discrete finite set S ={1, 2, ..., N} with transition probability matrix P = pij given by pij = Pr{r(t + 1) = j | r(t) = i}.

Similarly to the case of ordinary antilinear systems, the Markovian jump antilinear systems are also a type of structured Markovian jump systems with complex-valued coefficients. Stability is a basic requirement for a practical system to work properly. In this chapter, the stability is investigated for discrete-time antilinear systems: the ordinary antilinear systems and Markovian jump antilinear systems. In Sect. 11.1, the stability of ordinary discrete-time antilinear systems is investigated. In Sect. 11.2, the concept of stochastic stability is generalized to the context of Markovian jump antilinear systems, and the stability criteria are established in terms of coupled anti-Lyapunov matrix equations. In Sect. 11.3, some iterative algorithms are proposed to solve the coupled anti-Lyapunov matrix equations.

11.1 Stability for Discrete-Time Antilinear Systems

In this section, the stability is investigated for ordinary discrete-time antilinear sys- tems. Like the case of linear systems, for this purpose only the following autonomous antilinear system needs to be considered:

x(t + 1) = Ax(t), (11.6) where x(t) ∈ Cn is the state of this system, and A ∈ Cn×n is the system matrix. For clarity, the definition of stability is given for the system (11.6).

Definition 11.2 The discrete-time antilinear system (11.6) is asymptotically stable if for any initial value x (0) = x0, there holds

lim x(t) = 0. t→∞ 408 11 Stability for Antilinear Systems

Definition 11.3 The discrete-time antilinear system (11.6) is exponentially stable if for any initial value x(0) , there exit β>0 and 0 <α<1 such that

x(t) ≤ βαt x(0) .

It can be easily shown that the asymptotic stability is equivalent to the exponen- tial stability for the antilinear system (11.6). For convenience, a matrix A is called anti-Schur stable if the corresponding antilinear system (11.6) is asymptotically stable. It can be seen from the previous definitions that the stability of an antilinear system is described by its state response. In addition, it is known that the state response of two algebraic anti-equivalent antilinear systems is linked by a constant nonsingular linear transformation. With these in mind, the following conclusion can be immediately obtained. Proposition 11.1 Two algebraic anti-equivalent antilinear systems in the form of (11.6) possess the same stability property. The following theorem gives a necessary and sufficient condition for the stability of system (11.6). Theorem 11.1 The discrete-time antilinear system (11.6) is asymptotically stable if and only if for any positive definite matrix Q ∈ Cn×n, there exists a positive definite matrix P ∈ Cn×n such that AH PA− P =−Q. (11.7)

Proof For the antilinear system (11.6), the following Lyapunov functional is con- structed V (x(t)) = xH (t) Px (t) .

Since V (x(t)) is a real function, then one has

H V (x(t)) = xH(t) Px (t) = xH (t) Px(t) = x(t) Px(t).

With this relation, the forward difference of the Lyapunov functional V (x(t)) along the solution of the antilinear system (11.6) is given as

V (x(t)) = V (x (t + 1)) − V (x(t))  H = Ax(t) PAx(t) − xH(t) Px(t) H = x (t) AH PAx(t) − xH (t) Px(t) H = x(t) AH PAx(t) − xH (t)Px(t) H   = x(t) AH PA− P x(t) H =−x(t) Qx(t) < 0, for any x(t) = 0. 11.1 Stability for Discrete-Time Antilinear Systems 409

This implies that limt→∞ V (x(t)) = 0, and thus limt→∞ x(t) = 0 in view that P is a positive definite matrix. Now, let us show the necessity. Since the system is exponentially stable, then for some positive definite matrix P˜ there holds ∞ xH(t) x(t) ≤ xH(0) Px˜ (0) . (11.8) t=0

Given an arbitrary positive definite matrix R, define the following function

L xH(t)K (L − t)x(t) = xH(k)Rx(k), k=t where L is a positive integer. Since R is positive definite, then xH(t)K (L − t)x(t) is a nondecreasing function of L. It can be obtained from (11.8) that xH(t)K (L −t)x(t) is upper bounded. Hence, amatrixP can be found such that

L xH(t)Px(t) = lim xH(t)K (L − t)x(t) = lim xH(k)Rx(k). (11.9) L→∞ L→∞ k=t

Note that R is positive definite, it can be concluded from (11.9) that P is also positive definite. Now let us consider

xH(t)K (L − t)x(t) − xH(t + 1)K (L − t − 1)x(t + 1) L L = xH(k)Rx(k) − xH(k)Rx(k). k=t k=t+1

For the right-hand side of the preceding equation, one has

L L xH(k)Rx(k) − xH(k)Rx(k) = xH(t)Rx(t). k=t k=t+1

In addition, by using the system equation (11.6), one can obtain that

xH(t)K (L − t)x(t) − xH(t + 1)K (L − t − 1)x(t + 1)

H = xH(t)K (L − t)x(t) − Ax(t) K (L − t − 1)Ax(t).

It follows from the preceding three relations that

H xH(t)K (L − t)x(t) − Ax(t) K (L − t − 1)Ax(t) = xH(t)Rx(t), (11.10) 410 11 Stability for Antilinear Systems for any x(t). Now let L − t →∞, then from (11.10) and (11.9) one has

H Ax(t) PAx(t) − xH(t)Px(t) =−xH(t)Rx(t). (11.11)

H Since Ax(t) PAx(t) is real, one has

H H Ax(t) PAx(t) = Ax(t) PAx(t) = xH(t)AT P Ax(t).

Therefore, it follows from (11.11) that

AT P A − P =−R. (11.12)

Let Q = R > 0. Then, (11.12) can be equivalently rewritten as (11.7). With the preceding two aspects, the proof is thus completed.  For convenience, the matrix equation with respect to P in the preceding theorem is called anti-Lyapunov matrix equation. The above theorem is clearly a generalization of the Lyapunov stability result for discrete-time linear systems. By this theorem, the problem of checking the stability of a discrete-time antilinear system is converted equivalently to a problem of finding a positive definite matrix satisfying an anti- Lyapunov matrix equation. It is easily seen that the anti-Lyapunov matrix equation (11.7) is a special case of the con-Kalman-Yakubovich matrix equation in Chap. 3 and the extended con- Sylvester matrix equation in Subsection4.1.1. Therefore, the iterative algorithms in Chap. 3 and in Subsection4.1.1 can be used to solve the anti-Lyapunov matrix equa- tions. In addition, the method in Sect. 6.2 for solving the con-Kalman-Yakubovich matrix equation can be utilized to obtain the explicit solution of the anti-Lyapunov matrix equation (11.7).

11.2 Stochastic Stability for Markovian Antilinear Systems

In previous section, the stability of ordinary antilinear systems has been investigated. In this section, the stability is considered for the following discrete-time Markovian jump (DTMJ) antilinear system

x(t + 1) = Ar(t)x(t), x(0) = x0, (11.13)

n where x(t) ∈ C is the state of the system with the initial condition x(0) = x0, and r(t) represents a homogeneous discrete-time Markovian chain that takes values in a 11.2 Stochastic Stability for Markovian Antilinear Systems 411   ={ ... } = discrete finite set S 1, 2, , N with transition probability matrix P pij N×N given by pij = Pr{r(t + 1) = j | r(t) = i}. (11.14)

The set S contains N modes of the system (11.13), and for r(t) = i ∈ S, the system matrix of the i-th mode is assumed to be given and denoted by Ai . Similar to the case of linear systems with Markovian jump parameters [27], the concept of stochastic stability is given below for the DTMJ antilinear system (11.13).

Definition 11.4 The DTMJ antilinear system (11.13) with transition probability n (11.14) is called stochastically stable, if for all finite x0 ∈ C and i ∈ S, there exists a positive definite matrix P satisfying

∞ H( , , ) ( , , ) | ( ) = , ( ) = ≤ H E x t x0 i x t x0 i x 0 x0 r 0 i x0 Px0, (11.15) t=0 where x(t, x0, i) represents the corresponding solution of the system (11.13) at time t when the initial conditions of x (t) and r (t) are respectively taken as x (0) = x0 and r (0) = i. E represents the mathematical expectation.

Before proceeding, some well-known properties are given for Hermitian matrices.

Lemma 11.1 Let A ∈ Cn×n be a positive definite . Then

(1) xH Ax is real for all x ∈ Cn. Further, xH Ax = xT Ax. (2) xH Ax > 0 for all non-zero x ∈ Cn.

The main aim of this section is to derive some necessary and sufficient conditions for the DTMJ antilinear system (11.13)–(11.14) to be stochastically stable. By using a stochastic version of Lyapunov’s second method, criteria are derived for the stochastic stability of a Markovian jump antilinear system.

Theorem 11.2 The DTMJ antilinear system (11.13) with transition probability (11.14) is stochastically stable if and only if, for any given positive definite Hermitian matrices Qi ,i ∈ I[1, N], there exist unique positive definite Hermitian matrices Ki , i ∈ I[1, N], such that the following N coupled equations hold ⎛ ⎞ N H ⎝ ⎠ − =− ∈ I[ , ] Ai pijK j Ai Ki Qi ,i 1 N . (11.16) j=1

Proof For the DTMJ antilinear system (11.13), the following stochastic Lyapunov function is considered

H V (x(t), r(t) = i) = V (x(t), i) = x (t)Ki x(t), (11.17) 412 11 Stability for Antilinear Systems where Ki is a positive definite Hermitian matrix. By using Lemma 11.1, one can see that V (x(t), r(t) = i) is real and V(x(t), r(t) = i)>0forx (t) = 0. Let V (x(t), i) be defined as

V (x(t), i) = E [V (x(t + 1), r(t + 1)) | x(t), r(t) = i] − V (x(t), i). (11.18)

By the properties of conditional expectation, from (11.17) and ( 11.18) one can obtain that ⎛ ⎞ N  H ⎝ ⎠ H V (x(t), i) = pij Ai x(t) K j Ai x(t) − x (t)Ki x(t). j=1

By using the first item of Lemma 11.1, one yields

N  H H V (x(t), i) = pij Ai x(t) K j Ai x(t) − x (t)Ki x(t) j=1 N   H H = pij Ai x(t) K j Ai x(t) − x (t)Ki x(t) = j 1 ⎛ ⎞ N = H( ) T ⎝ ⎠ ( ) − H( ) ( ) x t Ai pijK j Ai x t x t Ki x t j= ⎡ ⎛ 1 ⎞ ⎤ N = H( ) ⎣ T ⎝ ⎠ − ⎦ ( ) x t Ai pijK j Ai Ki x t j=1 H =−x (t)Qi x(t).

With this relation, it can be obtained that for x(t) = 0, there holds

V (x(t), i) xH(t)Q x(t) λ [Q ] =− i ≤−γ =−min min i , (11.19) H ∈ V (x(t), i) x (t)Ki x(t) i S λmax[Ki ] where λmin[Qi ] denotes the minimal eigenvalue of Qi , and λmax[Ki ] denotes the maximal eigenvalue of Ki . With the relation (11.19), one has

V (x(t), i) ≤−γ V (x(t), i). (11.20)

According to the definition (11.18)ofV (x(t), i),from(11.20), one can get

E [V (x(t + 1), r(t + 1)) | x(t), r(t) = i] ≤ (1 − γ)[V (x(t), i)]. 11.2 Stochastic Stability for Markovian Antilinear Systems 413

With this relation, by using the properties of conditional expectation one has

E [V (x(t + 1), r(t + 1))] = E {E[V (x(t + 1), r(t + 1)) | x(t), r(t) = i]} (11.21) ≤ (1 − γ)EV (x(t), i).

Here, note that γ should satisfy 0 ≤ 1 − γ<1. Suppose that γ>1, it implies that

E [V (x(t + 1), r(t + 1))] < 0, which contradicts the fact that Ki , i ∈ S, are positive definite Hermitian matrices. Since the relation (11.21) holds for any i ∈ S, one has

E [V (x(t + 1), r(t + 1))] ≤ (1 − γ)E[V (x(t), r(t))]. (11.22)

It follows from (11.22) that

t+1 E [V (x(t + 1), r(t + 1))] ≤ (1 − γ) E[V (x0, r (0))].

From this, one has

t+1 E [V (x(t + 1), r(t + 1)) | x0, r (0)] ≤ (1 − γ) V (x0, r (0)).

Since γ<1, summing both sides of the preceding expression from 0 to ∞,gives

∞ ∞ H t H 1 H x (t)K ( )x(t) | x , r( ) = i ≤ ( − γ) x K x = x K x E r t 0 0 1 0 i 0 γ 0 i 0. t=0 t=0 (11.23) Since Ki is a positive definite Hermitian matrix for each i ∈ S, then

∞ H E x (t)Kr(t)x(t) | x0, r(0) = i t=0 ∞   H ≥ E λmin Kr(t) x (t)x(t) | x0, r(0) = i t=0   ∞ H ≥ min λmin K j E x (t)x(t) | x0, r (0) = i . (11.24) j∈S t=0

Combining (11.23) with (11.24), yields

∞    1 λ H( ) ( ) | , ( ) = ≤ H min min K j E x t x t x0 r 0 i x0 Ki x0. j∈S γ t=0 414 11 Stability for Antilinear Systems

Thus, it can be concluded that for all i ∈ S

∞ H( ) ( ) | , ( ) = ≤ H E x t x t x0 r 0 i x0 Px0, (11.25) t=0 where P is a positive definite Hermitian matrix satisfying

K P = max i . , ∈ i j S γλmin K j

The sufficiency is thus proven. Now, let us show the necessity. It is assumed that the DTMJ antilinear system n (11.13) is stochastically stable, that is, for any x0 ∈ C and any i ∈ S, there exists a positive definite Hermitian matrix P satisfying

∞ H( ) ( ) | ( ) = , ( ) = ≤ H E x t x t x 0 x0 r 0 i x0 Px0. (11.26) t=0

Let {Ri : i ∈ S} be a set of positive definite matrices. Define the following function

L H H x (t)K (L − t, r(t))x(t) = E x (k)Rr(k)x(k) | x(t), r(t) , k=t where L is a positive integer. H Since Rr(k), k ∈ I[t, L], are positive definite Hermitian matrices, then x (t)K (L− t, r(t))x(t) is a non-decreasing function of L. It can be obtained from (11.26) that H x (t)K (L − t, r(t))x(t) is bounded from above. Hence, one can find a matrix Kr(t) such that

H H x (t)Kr(t)x(t) = lim x (t)K (L − t, r(t))x(t) (11.27) L→∞ L H = lim E x (k)Rr(k)x(k) | x(t), r(t) . L→∞ k=t

Since (11.27) holds for any r(t) = i, one has

Ki = lim K (L − t, r(t) = i). (11.28) L→∞

Note that Ri is a positive definite Hermitian matrix, it can be concluded from (11.27) that Ki is also positive definite. Now let us consider  E xH(t)K (L − t, r(t))x(t)− 11.2 Stochastic Stability for Markovian Antilinear Systems 415  xH(t + 1)K (L − t − 1, r(t + 1))x(t + 1) | x(t), r(t)  L H = E E x (k)Rr(k)x(k) | x(t), r(t) − = k t  L H E x (k)Rr(k)x(k) | x(t + 1), r(t + 1) | x(t), r(t) . k=t+1

For the right-hand side of the preceding equation, one has  L H E E x (k)Rr(k)x(k) | x(t), r(t) − = k t  L H E x (k)Rr(k)x(k) | x(t + 1), r(t + 1) | x(t), r(t) k=t+1 H = x (t)Rr(t)x(t).

In addition, by using the definition of conditional expectation, one can obtain that   E xH(t)K (L − t, r(t))x(t) − xH(t + 1)K (L − t − 1, r(t + 1))x(t + 1) | x(t), r(t) = i = H( ) ( − , ( ) = ) ( ) − x t K L t r⎛t i x t ⎞ N H ⎝ ⎠ Ar(t)=i x(t) pijK (L − t − 1, j) Ar(t)=i x(t) . j=1

It follows from the preceding three relations that for any x(t) there holds ⎛ ⎞ N H  H ⎝ ⎠ x (t)K (L − t, r(t) = i)x(t) − Ar(t)=i x(t) pijK (L − t − 1, j) Ar(t)=i x(t) (11.29) j=1 H = x (t)Rr(t)=i x(t).

Let L − t →∞, then from (11.29) one has ⎛ ⎞ N H  ⎝ ⎠ H H Ai x(t) pijK j Ai x(t) − x (t)Ki x(t) =−x (t)Ri x(t). (11.30) j=1

From the first item of Lemma 11.1, it can be obtained that ⎛ ⎞ ⎛ ⎞ N N H   ( ) ⎝ ⎠ ( ) = H( ) T ⎝ ⎠ ( ) Ai x t pijK j Ai x t x t Ai pijK j Ai x t . j=1 j=1 416 11 Stability for Antilinear Systems

Substituting it into (11.30), gives ⎡ ⎛ ⎞ ⎤ N H( ) ⎣ T ⎝ ⎠ − ⎦ ( ) =− H( ) ( ) x t Ai pijK j Ai Ki x t x t Ri x t , j=1 for any i ∈ S. Since this relation holds for any x(t), one can conclude that ⎛ ⎞ N T ⎝ ⎠ − =− Ai pijK j Ai Ki Ri . (11.31) j=1

Let Qi = Ri > 0, i ∈ I[1, N]. Then, (11.31) can be equivalently rewritten as (11.16). Hence, {Ki : i ∈ S} is the positive definite Hermitian solution of (11.16). By (11.27), (11.28) and the relation Qi = Ri , it can be easily obtained that Ki is uniquely defined by Qi . Since (11.16) is derived from (11.27), Ki is the unique solution of (11.16). Thus, the proof is completed. 

Obviously, the coupled matrix equation (11.16) is not linear. In order to distinguish from the coupled Lyapunov equations appearing in DTMJ linear systems, the coupled equation (11.16) is referred to as the coupled anti-Lyapunov matrix equation.

Remark 11.1 The coupled anti-Lyapunov matrix equation in (11.16) can be equiv- alently written as ⎛ ⎞ √  √ N H − =− H ⎝ ⎠ − ∈ I[ , ] pii Ai Ki pii Ai Ki Ai pijK j Ai Qi , i 1 N . j=i, j=1

By the result in Sect. 11.1, it can be obtained from this relation that a necessary condition of the DTMJ antilinear system (11.13) to be stochastically stable is that the following discrete-time antilinear systems are stable: √ x (t + 1) = pii Ai x (t), i ∈ I[1, N]. (11.32)

Thus, in view of 0 ≤ pii ≤ 1 the stability of all subsystems is not necessary for the system (11.13) to be stochastically stable.

Example 11.1 Consider a one-dimensional Markovian jump antilinear system

x(t + 1) = Ar(t)x(t), x(0) = x0, where the dynamic process is a 2-state Markov chain with the following transition probability matrix   0.30.7 P = , 0.50.5 11.2 Stochastic Stability for Markovian Antilinear Systems 417 and the subsystem matrices are

A1 =−1 + 0.5i, A2 = 0.5 − 0.5i.

Choose Qi = 1, i = 1, 2. Then, it is easily obtained that the solution of anti-Lyapunov matrix equation (11.16)is(K1, K2) = (6.5, 3.5), which are positive definite. Hence, according to Theorem 11.2, this Markovian jump antilinear system is stochasti- cally stable. On the other hand, one can take advantage of the stability condition in Sect.11.1 for antilinear systems to check the stability of these two subsystems. As a result, the subsystem with the system matrix A1 is not stable while the other subsystem is stable. Thus, it can be seen that the stability of all subsystems is not necessary for the system (11.13) to be stochastically stable.

The following result provides another necessary and sufficient condition for the stochastic stability of Markovian jump antilinear systems. In comparison with The- orem 11.2, a different Lyapunov function is chosen in this result.

Theorem 11.3 The DTMJ antilinear system (11.13) with transition probability (11.14) is stochastically stable if and only if, for any given positive definite Hermitian matrices Qi ,i ∈ I[1, N], there exist unique positive definite Hermitian matrices Ki , i ∈ I[1, N], satisfying the following coupled anti-Lyapunov matrix equation:

N H − =− ∈ I[ , ] pij A j K j A j Ki Qi ,i 1 N . (11.33) j=1

Proof For the Markovian jump antilinear system (11.13), the following stochastic Lyapunov function is considered

H V (x(t), r(t − 1) = i) = V (x(t), i) = x (t)Ki x(t), (11.34) with Ki being Hermitian and positive definite. The proof is very similar to that of Theorem 11.2.LetV (x(t), i) be defined as

V (x(t), i) = E [V (x(t + 1), r(t)) | x(t), r(t − 1) = i] − V (x(t), i). (11.35)

By the definition of conditional expectation, it follows from (11.34) and (11.35) that

N  H H V (x(t), i) = pij A j x(t) K j A j x(t) − x (t)Ki x(t) j=1 N = ( )H H ( ) − H( ) ( ) pijx t A j K j A j x t x t Ki x t . j=1 418 11 Stability for Antilinear Systems

By using the first item of Lemma 11.1, one has

N  ( ( ), ) = ( )H H ( ) − H( ) ( ) V x t i pijx t A j K j A j x t x t Ki x t = j 1 ⎛ ⎞ N = H( ) ⎝ T ⎠ ( ) − H( ) ( ) x t pij A j K j A j x t x t Ki x t j= ⎡ 1 ⎤ N = H( ) ⎣ T − ⎦ ( ) x t pij A j K j A j Ki x t j=1 H =−x (t)Qi x(t), from which, it can be obtained that for x(t) = 0, there holds

V (x(t), i) xH(t)Q x(t) λ [Q ] =− i ≤−γ =−min min i . (11.36) H ∈ V (x(t), i) x (t)Ki x(t) i S λmax[Ki ]

From (11.36), it can be derived that

V (x(t), i) ≤−γ V (x(t), i). (11.37)

With the definition (11.35)ofV (x(t), i), it follows from (11.37) that

E[V (x(t + 1), r(t)) | x(t), r(t − 1) = i] ≤ (1 − γ)V (x(t), i).

With this relation, by using the properties of conditional expectation one has

E [V (x(t + 1), r(t))] = E {E [V (x(t + 1), r(t)) | x(t), r(t − 1) = i]} (11.38) = (1 − γ)E [V (x(t), i)] .

Similarly to the proof of the preceding theorem, it can be obtained that 0 ≤ 1−γ<1. Since the relation (11.38) holds for any i ∈ S, one has

E [V (x(t + 1), r(t)) | x(t), r(t − 1)] ≤ (1 − γ)E[V (x(t), r(t − 1))]. (11.39)

Repeating this iteration, it can be obtained that

E [V(x(t + 1), r(t))] ≤ (1 − γ)t E[V (x(1), r(0))]. 11.2 Stochastic Stability for Markovian Antilinear Systems 419

From this, one has

E [V (x(t + 1), r(t)) | x(1), r(0)] ≤ (1 − γ)t [V (x(1), r (0))].

Summing both sides of the preceding expression from 1 to ∞ and using the fact that γ<1, yields

∞ H E x (t)Kr(t−1)x(t) | x(1), r (0) t=1 ∞ t−1 H ≤ (1 − γ) x (1)Kr(0)x(1) (11.40) t=1

1 H = x ( )K ( )x( ) γ 1 r 0 1 .

By the first item of Lemma 11.1, it follows from (11.40) and (11.13) that

∞ H E x (t)Kr(t−1)x(t) | x (0) = x0, r (0) = i t=1 1 ≤ (A x )H K (A x ) γ i 0 i i 0 1 = (A x )H K (A x ) γ i 0 i i 0 1 = xH AT K A x γ 0 i i i 0.

Since Ki is a positive definite Hermitian matrix for each i ∈ S, then

∞ H E x (t)Kr(t−1)x(t) | x(0) = x0, r(0) = i t=1 ∞   H ≥ E λmin Kr(t−1) x (t)x(t) | x(0) = x0, r(0) = i t=1     ∞ H ≥ min λ K j E x (t)x(t) | x(0) = x , r (0) = i . ∈ min 0 j S t=1

From the preceding two relations, one has

∞    1 λ H( ) ( ) | ( ) = ( ) = ≤ H T min min K j E x t x t x 0 x0, r 0 i x0 Ai Ki Ai x0. j∈S γ t=1 420 11 Stability for Antilinear Systems

Thus, one can obtain that for all i ∈ S

∞ H( ) ( ) | ( ) = ( ) = ≤ H E x t x t x 0 x0, r 0 i x0 Px0, t=0 where P is a positive definite matrix defined as below

AT K A P = I + max i i i . , ∈ i j S γλmin K j

This shows the sufficiency. Next, the necessity will be shown. It is assumed that the Markovian jump antilinear n system (11.13) is stochastically stable, that is, for any x0 ∈ C and any i ∈ S, there exists a positive definite Hermitian matrix P satisfying

∞ H( ) ( ) | ( ) = , ( ) = ≤ H E x t x t x 0 x0 r 0 i x0 Px0. (11.41) t=0

Let {Ri : i ∈ S} be a set of positive definite Hermitian matrices. Define the following function

L H H x (t)K (L − t, r(t − 1))x(t) = E x (k)Rr(k−1)x(k) | x(t), r(t − 1) , k=t where L is a positive integer. Obviously, xH(t)K (L −t, r(t −1))x(t) is a nondecreas- ing function of L. It can be obtained from (11.41) that xH(t)K (L − t, r(t − 1))x(t) is bounded from above. Hence, one can find a matrix Kr(t−1) such that

H x (t)Kr(t−1)x(t) = lim xH(t)K (L − t, r(t − 1))x(t) (11.42) L→∞ L H = lim E x (k)Rr(k−1)x(k) | x(t), r(t − 1) . L→∞ k=t

Since (11.42) holds for any r(t − 1) = i, it can be derived that

Ki = lim K (L − t, r(t − 1) = i). (11.43) L→∞

Since {Ri : i ∈ S} is a set of positive definite matrices, it can be obtained from (11.42) that Ki is also Hermitian and positive definite. Now the following relation is considered: 11.2 Stochastic Stability for Markovian Antilinear Systems 421  E xH(t)K (L − t, r(t − 1))x(t)−  xH(t + 1)K (L − t − 1, r(t))x(t + 1) | x(t), r(t − 1)  L H = E E x (k)Rr(k−1)x(k) | x(t), r(t − 1) − = k t  L H E x (k)Rr(k−1)x(k) | x(t + 1), r(t) | x(t), r(t − 1) . k=t+1

For the expression on the right-hand side of the preceding relation, it can be readily obtained that  L H E E x (k)Rr(k−1)x(k) | x(t), r(t − 1) − = k t  L H E x (k)Rr(k−1)x(k) | x(t + 1), r(t) | x(t), r(t − 1) k=t+1 H = x (t)Rr(t−1)x(t).

By using the properties of conditional expectation, it follows from (11.13) that   E xH(t)K (L − t, r(t − 1))x(t) − xH(t + 1)K (L − t − 1, r(t))x(t + 1) | x(t), r(t − 1) = i N  H H = x (t)K (L − t, r (t − 1) = i)x(t) − pij Ar(t)= j x(t) K (L − t − 1, r(t) = j) Ar(t)= j x(t) . j=1

Combining the preceding three relations, gives

N  H H x (t)K (L − t, r(t − 1) = i)x(t) − pij A j x(t) K (L − t − 1, r(t) = j) A j x(t) j=1 H = x (t)Rr(t−1)=i x(t) (11.44) for any x(t).For(11.44), let L →∞, then one has

N  H H H pij A j x(t) K j A j x(t) − x (t)Ki x(t) =−x (t)Ri x(t). (11.45) j=1

By the first item of Lemma 11.1, it can be obtained that ⎛ ⎞ N N  H  ( ) ( ) = H( ) ⎝ T ⎠ ( ) pij A j x t K j A j x t x t pij A j K j A j x t . j=1 j=1 422 11 Stability for Antilinear Systems

Thus, it follows from (11.45) that

N T − =− pij A j K j A j Ki Ri , (11.46) j=1 for any i ∈ S.LetQi = Ri > 0, i ∈ I[1, N]. Then, (11.46) can be equivalently rewritten as (11.33). Hence, {Ki : i ∈ S} is a set of positive definite Hermitian solutions of (11.33). By (11.42), (11.43), and the relation Qi = Ri , it can be easily obtained that Ki is uniquely determined by Qi . Since (11.16) is derived from ( 11.27), Ki is the unique solution of (11.33). Thus, the proof is completed. 

The results of the previous two theorems are based on the choice of a Lyapunov function by using a stochastic version of Lyapunov’s second method. A natural candidate for a Lyapunov function is an appropriately chosen quadratic form. In the proof of Theorem 11.2, the Lyapunov function is chosen as V (x(t), r(t)) = H x (t)Kr(t)x(t), where x(t) is available with respect to r(t) and the matrix Kr(t) depends only on r(t). On the other hand, Theorem 11.3 is proved by using the H Lyapunov function V (x(t), r(t)) = x (t)Kr(t−1)x(t). In this case, the matrix Kr(t−1) depends only on r(t − 1).

Remark 11.2 The necessary and sufficient conditions given in (11.16) and (11.33) are equivalent.

Remark 11.3 By analyzing the coupled anti-Lyapunov equation in (11.33), a similar conclusion as in Remark 11.1 can also be obtained.

Remark 11.4 For the case of DTMJ linear systems, the involved coupled Lyapunov matrix equations are linear, and thus the stochastic stability can be checked by the spectral radius of an augmented matrix being less than one [124]. Since the coupled anti-Lyapunov matrix equations are not linear, one can not resort such a method to check the stochastic stability for a DTMJ antilinear system.

Theorem 11.3 provides an alternative criterion based on coupled anti-Lyapunov equation for the stochastic stability of the DTMJ antilinear system (11.13). Since the conditions in Theorems 11.2 and 11.3 are necessary and sufficient for the system (11.13) to be stochastically stable, there may exist a link between solutions of the coupled anti-Lyapunov equations (11.16) and (11.33). But it is a pity that currently, such a relation can not be established directly. In addition, it is not obvious which of these two necessary and sufficient conditions is better for practical applications. In general, one has to solve N coupled anti-Lyapunov equations when Theorem 11.2 or Theorem 11.3 is applied. However, for some special cases, compared with Theorems 11.2 and 11.3 actually provides an easier test for checking the stochastic stability of the DTMJ antilinear system. This is shown in the following corollary by using Theorem 11.3. 11.2 Stochastic Stability for Markovian Antilinear Systems 423

Corollary 11.1 Suppose that {r(t)} is a finite-state independent and identically dis- tributed (i.i.d.) random sequence with the probability distribution {p1,p2, ...,pN }. Then the DTMJ antilinear system (11.13) is stochastically stable if and only if, for any given positive definite Hermitian matrix Q, there exists a unique positive definite Hermitian matrix K such that the following matrix equation holds:

N H − =− p j A j KAj K Q. j=1

Remark 11.5 For i.i.d. case, if the necessary and sufficient condition in Theorem 11.2 is adopted, one needs to solve coupled anti-Lyapunov equations. This is more complicated than the method in Corollary 11.1.

11.3 Solutions to Coupled Anti-Lyapunov Equations

In the preceding section, necessary and sufficient conditions have been established for the DTMJ antilinear system (11.13) to be stochastically stable in terms of coupled anti-Lyapunov matrix equations in (11.16)or(11.33). In fact, these coupled anti- Lyapunov matrix equations are special cases of the coupled con-Sylvester matrix equations investigated in Subsection4.2.3. By using the iterative algorithm in Sub- section4.2.3 to solve the coupled anti-Lyapunov matrix equation in (11.16), one can obtain the following result.

Lemma 11.2 For the coupled anti-Lyapunov matrix equation in (11.16), given the initial values K j (0),j∈ I[1, N], the sequences {Kl (m)},l ∈ I[1, N], generated by the iteration ⎡ ⎛ ⎞ ⎤ N N ( + ) = ( ) + μ ⎣ ( ) − − H ⎝ ( )⎠ ⎦ H Kl m 1 Kl m pil Ai Ki m Qi Ai pijK j m Ai Ai i=1 j=1 ⎡ ⎛ ⎞ ⎤ ⎢ N ⎥ −μ ( ) − − H ⎝ ( )⎠ , ⎣Kl m Ql Al pljK j m Al ⎦ j=1 l ∈ I[1, N], converge to the solution of the coupled anti-Lyapunov matrix equation (11.16)if

2 0 <μ< ,  2 T 2 424 11 Stability for Antilinear Systems where T =[Tli]N×N with     ⊗ ( ) − = = pil Ai σ Ai σ  I,forl i, Tli ⊗ ( ) = pil Ai σ Ai σ ,forl i.

The coupled anti-Lyapunov matrix equation (11.16) has a special structure. Obvi- ously, this structure is not used when obtaining the algorithm in Lemma 11.2.In the rest of this section, two alternative algorithms are developed to solve the coupled anti-Lyapunov matrix equation (11.16) by making full use of its special structure. The idea can also be generalized to solve the coupled anti-Lyapunov equation (11.33).

11.3.1 Explicit Iterative Algorithms

In this subsection, some explicit iterative algorithms are first proposed for the coupled anti-Lyapunov matrix equation (11.16).

Theorem 11.4 If the coupled anti-Lyapunov matrix equation in (11.16) has a unique positive definite solution {Ki , i ∈ I[1, N]}, for any given Qi > 0,i ∈ I[1,N], and the corresponding Markovian jump antilinear system (11.13)–( 11.14) is stochastically stable, then the solution of the equation(11.16) can be obtained by the following iterative algorithm: ⎛ ⎞ N ( + ) = T ⎝ ( )⎠ + ∈ I[ , ], Ki m 1 Ai pijK j m Ai Qi ,i 1 N (11.47) j=1 with zero initial conditions Ki (0) = 0,i ∈ I[1,N].

Proof The conclusion of this theorem is proven by two parts. In the first part, it is needed to show the boundedness of the iterative solutions Ki (m), i ∈ I[1, N].Inthe second part, the monotonicity of the iterative solutions is shown. Now, let us proceed to the first part. The aim is to show

Ki (m) ≤ Ki , i ∈ I[1, N], m ≥ 0, (11.48) by mathematical induction principle. Since the coupled anti-Lyapunov equation (11.16) has positive definite solutions K j , j ∈ I[1, N], it follows from the itera- tive algorithm (11.47) and the equation (11.16) that ⎛ ⎞ N ( ) = T ⎝ ( )⎠ + Ki 1 Ai pijK j 0 Ai Qi j=1

= Qi 11.3 Solutions to Coupled Anti-Lyapunov Equations 425 ⎛ ⎞ N ≤ T ⎝ ⎠ + Ai pijK j Ai Qi j=1

= Ki .

This implies that (11.48) holds for m = 1. Now, it is assumed that (11.48) holds for m = k, k ≥ 1. That is, K j (k) ≤ K j , j ∈ I[1, N]. Then, it follows from (11.47) and the induction assumption that ⎛ ⎞ N ( + ) = T ⎝ ( )⎠ + Ki k 1 Ai pijK j k Ai Qi j= ⎛ 1 ⎞ N ≤ T ⎝ ⎠ + Ai pijK j Ai Qi j=1

= Ki .

Thus, (11.48) holds for m = k + 1. By the principle of mathematical induction, it can be concluded that (11.48) holds for m ≥ 0. In the second part, it is needed to prove that

Ki (m) ≤ Ki (m + 1), i ∈ I[1, N]. (11.49)

This is also proven by the principle of mathematical induction. Obviously, (11.49) holds for m = 0. Now, it is assumed that (11.49) holds for m = k, k ≥ 0. That is

Ki (k) ≤ Ki (k + 1), i ∈ I[1, N]. (11.50)

It follows from (11.47) that

( + ) − ( + ) Ki ⎛k 2 Ki k 1 ⎞ ⎡ ⎛ ⎞ ⎤ N N = T ⎝ ( + )⎠ + − ⎣ T ⎝ ( )⎠ + ⎦ Ai pijK j k 1 Ai Qi Ai pijK j k Ai Qi j= j= ⎛ 1 ⎞ 1 N = T ⎝ ( + ) − ( ) ⎠ Ai pij K j k 1 K j k Ai j=1 ≥ 0.

Thus, (11.49) holds for m = k + 1. It follows from the principle of mathematical induction that (11.49) holds for m ≥ 0. 426 11 Stability for Antilinear Systems

From the above statement, it has been shown that the algorithm (11.47) gener- ates monotonically nondecreasing sequences {Ki (m)} bounded from above by the solution of (11.16). That is

0 = Ki (0) ≤ Ki (1) ≤ Ki (2) ≤···≤ Ki (m) ≤ Ki (m+1) ≤···≤ Ki , i ∈ I[1, N]. (11.51) Thus, for any i ∈ I [1, N] the sequence Ki (m) converges according to the principle of monotonic convergence for positive operators [163]. Since the Markovian jump antilinear system (11.13)–(11.14) is stochastically stable, it can be guaranteed that the iterative solutions Ki (m), i ∈ I[1, N], are finite. Substituting m =∞into (11.47) yields ⎛ ⎞ N (∞) = T ⎝ (∞)⎠ + ∈ I[ , ] Ki Ai pijK j Ai Qi , i 1 N . (11.52) j=1

Since equations (11.52) and (11.16) are identical, it can be obtained that the algorithm (11.47) converges monotonically to the unique positive solution of (11.16). 

Remark 11.6 In the algorithm (11.47), the estimates Ki (m + 1) for Ki , i ∈ I[1, N], are updated by the estimates Ki (m) at the m-th step. The above algorithm (11.47) is only effective for the requirement of zero ini- tial conditions. A convergence result for any initial conditions is presented in the following theorem. Theorem 11.5 If the coupled anti-Lyapunov matrix equation in (11.16) has a unique positive definite solution for any given Qi > 0,i ∈ I[1, N], then the sequences {Ki (m)},i ∈ I[1, N], generated by the iteration (11.47) converge to the unique solution of (11.16) for any initial conditions Ki (0),i ∈ I[1, N],ifρ(X)<1, where ⎡ ⎤ p AH ⊗ AT p AH ⊗ AT ··· p , AH ⊗ AT ⎢ 11 1 σ 1 σ 12 1 σ 1 σ 1 N 1 σ 1 σ ⎥ ⎢ ⎥ ⎢ p AH ⊗ AT p AH ⊗ AT ··· p , AH ⊗ AT ⎥ = ⎢ 21 2 σ 2 σ 22 2 σ 2 σ 2 N 2 σ 2 σ ⎥ X ⎢ ⎥ , ⎣ ··· ··· ··· ⎦ p , AH ⊗ AT p , AH ⊗ AT ··· p , AH ⊗ AT N 1 N σ N σ N 2 N σ N σ N N N σ N σ and (·)σ is the real representation of a complex matrix defined in Sect.2.6. Proof Denote         T η = T T T vec ((K1)σ ) vec ((K2)σ ) ··· vec ((K N )σ ) ,         T η( ) = T T T m vec ((K1(m))σ ) vec ((K2(m))σ ) ··· vec ((K N (m))σ ) ,   (11.53)             T = T T ··· T q vec Q1 σ vec Q2 σ vec Q N σ . 11.3 Solutions to Coupled Anti-Lyapunov Equations 427

With these symbols, the exact solution of the coupled anti-Lyapunov equation (11.16) can be obtained. The coupled anti-Lyapunov matrix equation (11.16) can be equiv- alently written as ⎛ ⎞ N T ⎝ ⎠ − =− ∈ I[ , ] Ai pijK j Ai Ki Qi , i 1 N . j=1

The real representation of the preceding equation is as follows

N         T − ( ) =− ∈ I[ , ] pij Ai σ K j σ Ai σ Ki σ Qi σ , i 1 N . j=1

Taking the vectorization operation, gives

N             T T p A ⊗ A vec K − vec (K )σ =−vec Q , i ∈ I[1, N]. ij i σ i σ j σ i i σ j=1

Using the symbols in (11.53), one has

Xη − η =−q.

Thus, the exact solution of (11.16) is given by

η = (I − X)−1q.

Next, the aim is to prove that the sequences {Ki (m)}, i ∈ I[1, N], generated by the iteration (11.47) converge to the unique solutions of (11.16). The real representation of (11.47)is

N         ( ( + )) = T ( ) + Ki m 1 σ pij Ai σ K j m σ Ai σ Qi σ . j=1

Taking the vectorization operation, yields

vec ((Ki (m + 1))σ ) N             = T ⊗ T ( ) + pij Ai σ Ai σ vec K j m σ vec Qi σ j=1 N            = H ⊗ T ( ) + pij Ai σ Ai σ vec K j m σ vec Qi σ , j=1 428 11 Stability for Antilinear Systems for i ∈ I[1, N]. With the symbols in (11.53), the aforementioned equation can be compactly rewritten as η(m + 1) = Xη(m) + q.

Further, it can be obtained that

m η(m + 1) = X j q + X m+1η(0). j=0

Since ρ(X)<1, η(m + 1) converges. Then, one has ⎛ ⎞ m−1 lim η(m) = lim ⎝ X j q + X m η(0)⎠ m→∞ m→∞ j=0 = (I − X)−1q = η.

Thus, the proof is completed. 

It should be pointed out that the condition ρ(X)<1 in Theorem 11.5 maybedif- ficult to verify because of the high dimensionality of the matrix X. In practical appli- cations, the algorithm in Theorem 11.4 is preferred to solve coupled anti-Lyapunov matrix equations.

11.3.2 Implicit Iterative Algorithms

In this subsection, some implicit iterative algorithms are presented for solving the coupled anti-Lyapunov matrix equation (11.16). These algorithms are called implicit since some standard anti-Lyapunov matrix equations need to be solved in each iter- ation step. Before giving the main algorithm, the following result is introduced.

Lemma 11.3 Consider the following two standard anti-Lyapunov matrix equations

T A P1 A − P1 =−Q1,Q1 > 0, T A P2 A − P2 =−Q2,Q2 > 0.

If the matrix A is anti-Schur stable and Q1 > Q2, then there holds

P1 > P2. 11.3 Solutions to Coupled Anti-Lyapunov Equations 429

Proof Since Q1 > Q2, it follows from the above anti-Lyapunov matrix equations that     AT P A − P − AT P A − P  1 1 2 2 T = A P1 − P2 A − (P1 − P2)

=−(Q1 − Q2)<0.

Since the matrix A is anti-Schur stable, then it follows from Theorem 11.1 that P1 − P2 > 0. Thus, P1 > P2. This completes the proof.  With Lemma 11.3 as a tool, the implicit iterative algorithm can be established for solving the coupled anti-Lyapunov matrix equation (11.16) with zero initial condi- tions. Theorem 11.6 If the coupled anti-Lyapunov matrix equation (11.16) has the unique positive definite solution for any given Qi > 0,i ∈ I[1, N], and the matrices Ai , i ∈ I[1, N], are anti-Schur stable, then the unique solution of (11.16 ) can be obtained by the following iterative algorithm:

N T ( + ) − ( + ) + T ( ) + = pii Ai Ki m 1 Ai Ki m 1 pij Ai K j m Ai Qi 0, (11.54) j=1, j=i for zero initial conditions Ki (0) = 0,i ∈ I[1, N]. Proof First, it will be shown that

Ki (m)

This can be proven by the principle of mathematical induction. Due to zero initial conditions, it is obvious that (11.55) holds for m = 0. Then it follows from (11.54) that N N T ( ) + = < T + pij Ai K j 0 Ai Qi Qi pij Ai K j Ai Qi . j=1, j=i j=1, j=i

Thus, using Lemma 11.3 one has

Ki (1)

Similarly, by Lemma 11.3, one can obtain

Ki (k + 1)

From the above relation, it is shown that (11.55) holds for m = k + 1. Thus, by the principle of mathematical induction, it can be obtained that (11.55) holds for any integer m ≥ 0. Next, it will be shown that

Ki (m)

Ki (k)

With this assumption, one has

N N T ( ) + < T ( + ) + pij Ai K j k Ai Qi pij Ai K j k 1 Ai Qi . (11.57) j=1, j=i j=1, j=i

In addition, it follows from (11.54) that for i ∈ I[1, N] there hold ⎧ ⎪ N ⎪ T ( + ) − ( + ) =− T ( ) − ⎨⎪ pii Ai Ki k 1 Ai Ki k 1 pij Ai K j k Ai Qi = , = j 1 j i . ⎪ N ⎪ T ( + ) − ( + ) =− T ( + ) − ⎩⎪ pii Ai Ki k 2 Ai Ki k 2 pij Ai K j k 1 Ai Qi j=1, j=i (11.58) By using Lemma 11.3, it is derived from (11.57) and (11.58) that Ki (k + 1)< Ki (k + 2). Thus, (11.56) holds for m = k + 1. Therefore, it follows from the principle of mathematical induction that (11.56) holds for any integer m ≥ 0. With the previous two aspects, the convergence of the algorithm (11.54) can be confirmed by along the line given in the end of the proof for Theorem 11.4. 

Next, the following result is given for any initial conditions parallel to Theorem 11.5.

Theorem 11.7 If the coupled anti-Lyapunov matrix equation (11.16) has a unique positive definite solution for any given Qi > 0 ,i ∈ I[1, N], then the sequences {Ki (m)},i ∈ I[1, N], generated by the iteration (11.54) converge to the unique −1 solution of (11.16) for any initial conditions Ki (0),i ∈ I[1, N],ifρ(M W)<1, where 11.3 Solutions to Coupled Anti-Lyapunov Equations 431 ⎡ ⎤ I − M11 0 ··· 0 ⎢ ⎥ ⎢ 0 I − M22 ··· 0 ⎥ M = ⎢ . ⎥ , ⎣ ··· ··· .. ··· ⎦ 00··· I − M ,      N N = H ⊗ T ∈ I[ , ], Mii pii Ai σ Ai σ ,i 1 N ⎡ ⎤ 0 W12 W13 ··· W1,N−1 W1,N ⎢ ⎥ ⎢ W21 0 W23 ··· W2,N−1 W2,N ⎥ ⎢ ⎥ ⎢ W31 W32 0 ··· W3,N−1 W3,N ⎥ = ⎢ ⎥ W ⎢ .. ⎥ , ⎢ ··· ··· ··· . ··· ··· ⎥ ⎣ ⎦ WN−1,1 WN−1,2 WN−1,3 ··· 0 WN−1,N WN,1 WN,2 WN,3 ··· WN,N−1 0      = H ⊗ T = ∈ I[ , ] Wij pij Ai σ Ai σ ,i j, i, j 1 N .

Proof The same symbols as in (11.53) are used. Similarly to the proof of Theorem 11.6, it can be easily proven that the exact solution of (11.16) can be described by

η = (M − W)−1q.

It follows from (11.54) that     ( ( + )) − T ( ( + )) Ki m 1 σ pii Ai σ Ki m 1 σ Ai σ N         = T ( ) + pij Ai σ K j m σ Ai σ Qi σ . j=1, j=i

Taking the vectorization operation, gives       − H ⊗ T (( ( + )) ) I pii Ai σ Ai σ vec Ki m 1 σ N            = H ⊗ T ( ) + pij Ai σ Ai σ vec K j m σ vec Qi σ . j=1, j=i

This equation can be compactly rewritten as

Mη(m + 1) = Wη(m) + q.

Under the condition of the theorem, one has

η(m + 1) = M−1Wη(m) + M−1q. 432 11 Stability for Antilinear Systems

Further, it can be established that

m   j  m+1 η(m + 1) = M−1W M−1q + M−1W η(0). j=0

Since ρ(M−1W)<1, η(m + 1) converges. Then, one can obtain that

 −1 lim η(m) = I − M−1W M−1q m→∞ = (M − W)−1q = η.

Thus the proof is completed. 

Remark 11.7 For the algorithm (11.54), one needs to solve N standard discrete-time anti-Lyapunov matrix equations at each step.

Remark 11.8 The difficulties of solving the coupled anti-Lyapunov matrix equations are from the high dimensionality and the structure of coupling. The main feature of the iterative algorithms given in this section is to eliminate coupling of the equations. Thus, the advantage of the algorithms is that they can operate for decoupled equations, and thus can be implemented with parallel computation.

11.3.3 An Illustrative Example

In this section, an example is considered for Markovian jump antilinear systems. The aim is threefold. One is to use the proposed algorithms to solve the corresponding coupled anti-Lyapunov matrix equation. The second aim is to compare the conver- gence performance with the existing algorithms. The third one is to check the stability of the system by the positive definiteness of the solution matrices. Consider a fourth-order system in the form of (11.13) with the transition proba- bility matrix ⎡ ⎤ 0.10.30.6 P = ⎣ 0.50.25 0.25 ⎦ . 00.30.7

The subsystem matrices are A1 = M1 + iM2, A2 = M2 + iM3, A3 = M3 + iM1, where ⎡ ⎤ 0.0667 0.0665 0.0844 −0.2257 ⎢ 0.1383 −0.1309 0.0797 0.1162 ⎥ M = ⎢ ⎥ , 1 ⎣ 0.0658 0.0298 0.0645 −0.1018 ⎦ −0.2283 0.2438 −0.1990 0.2997 11.3 Solutions to Coupled Anti-Lyapunov Equations 433 ⎡ ⎤ 0.1885 −0.3930 −0.0894 −0.1919 ⎢ −0.4230 0.3598 −0.1224 −0.1548 ⎥ M = ⎢ ⎥ , 2 ⎣ 0.0350 −0.1950 −0.1967 −0.1017 ⎦ −0.2648 −0.0240 −0.0542 0.0484 ⎡ ⎤ 0.2746 0.0634 0.3414 −0.0692 ⎢ 0.0769 0.4167 0.0283 −0.1207 ⎥ M = ⎢ ⎥ . 3 ⎣ −0.1607 0.0344 −0.2227 0.1617 ⎦ 0.1175 −0.2969 0.4149 0.3314

Now the corresponding coupled Lyapunov matrix equation in the form of (11.16)is considered. The positive definite matrices Qi , i = 1, 2, 3, are all chosen as identity matrices. Here, the iterative errors δ(m) versus the iteration step m is defined as " # % ⎛ ⎞ % # % %2 #N % N % $ % T ⎝ ⎠ % δ(m) = %A pijK j (m) Ai + Qi − Ki (m)% , % i % i=1 j=1 where the Frobenius norm is used. By using the explicit algorithm (11.47) in Theorem 11.4, the solution of (11.16) at the 80-th step is given as ⎡ ⎤ 1.7186 −0.6939 0.1731 −0.1337 ⎢ −0.6939 1.9281 −0.0588 0.1536 ⎥ K (80) = ⎢ ⎥ 1 ⎣ 0.1731 −0.0588 1.1790 −0.0380 ⎦ −0.1337 0.1536 −0.0380 1.3705 ⎡ ⎤ 0.0000 −0.0164 0.0818 −0.2250 ⎢ 0.0164 0.0000 −0.0007 0.2557 ⎥ +i ⎢ ⎥ , ⎣ −0.0818 0.0007 0.0000 0.0868 ⎦ 0.2250 −0.2557 −0.0868 0.0000 ⎡ ⎤ 1.7581 −0.6491 0.3090 0.0327 ⎢ −0.6491 2.3203 −0.1978 −0.2221 ⎥ K (80) = ⎢ ⎥ (11.59) 2 ⎣ 0.3090 −0.1978 1.5109 0.2011 ⎦ 0.0327 −0.2221 0.2011 1.3308 ⎡ ⎤ 0.0000 0.0878 0.0141 −0.0018 ⎢ −0.0878 −0.0000 0.1278 −0.0350 ⎥ +i ⎢ ⎥ , ⎣ −0.0141 −0.1278 0.0000 0.0334 ⎦ 0.0018 0.0350 −0.0334 0.0000 ⎡ ⎤ 1.2701 −0.0964 0.3068 −0.1474 ⎢ −0.0964 1.6827 −0.2763 −0.3229 ⎥ K (80) = ⎢ ⎥ 3 ⎣ 0.3068 −0.2763 1.5001 0.0269 ⎦ −0.1474 −0.3229 0.0269 1.6398 434 11 Stability for Antilinear Systems ⎡ ⎤ 0.0000 0.2126 −0.1298 −0.1768 ⎢ −0.2126 0.0000 −0.0205 0.2818 ⎥ +i ⎢ ⎥ . ⎣ 0.1298 0.0205 0.0000 −0.2483 ⎦ 0.1768 −0.2818 0.2483 0.0000

By using the implicit algorithm (11.54) in Theorem 11.6, the solution of (11.16)is found to be the same as in (11.59). Moreover, the convergence performance is also compared among the algorithms in Lemma 11.2, Theorems 11.4 and 11.6 for zero initial conditions. By trial-and-error test, it can be shown that the iterative algorithm in Lemma 11.2 has the best convergence performance when the convergence factor μ = . δ( ) is chosen as 0 701. The iterative errors log10 m versus the iteration step m are shown in Fig. 11.1, where the performance curve of Lemma 11.2 is obtained when μ = 0.701. From this figure, it can be seen that the implicit algorithm in Theorem 11.6 converges faster than the explicit algorithm in Theorem 11.4, and that the proposed algorithms with zero initial conditions in the current section are much faster than the iterative algorithm in Lemma 11.2. Next, the iterations are carried out by using the algorithm in Lemma 11.2, Theorems 11.5 and 11.7 with nonzero initial conditions K1(0) = N1 + iN2, K2(0) = N2 + iN3, and K3(0) = N3 + iN1, where ⎡ ⎤ ⎡ ⎤ 120.50.3 −10.50.70.3 ⎢ 001.22.5 ⎥ ⎢ 100.9 −0.6 ⎥ N = ⎢ ⎥ , N = ⎢ ⎥ , 1 ⎣ 30.20.8 −0.6 ⎦ 2 ⎣ 23.2 −0.8 −1 ⎦ 2 −30 0.8 02.1 −10.75

2 Lemma 11.2 0 Theorem 11.4 Theorem 11.6 −2

−4

(m) −6 δ 10

log

−10

−12

−14

−16 0 50 100 150 200 250 300 350 m

Fig. 11.1 Convergence comparison for the case of zero initial conditions 11.3 Solutions to Coupled Anti-Lyapunov Equations 435

2 Lemma 11.2 0 Theorem 11.5 Theorem 11.7 −2

−4

(m) −6 δ 10

log

−10

−12

−14

−16 0 50 100 150 200 250 300 350 m

Fig. 11.2 Convergence comparison for the case of nonzero initial conditions

⎡ ⎤ 0.8 −0.51.6 −3.1 ⎢ 0.15 2.3 −0.70.8 ⎥ N = ⎢ ⎥ . 3 ⎣ −2.20.22.81.5 ⎦ 0.3 −2.11.5 −0.5

The convergence results are shown in Fig. 11.2. It can also be seen that the implicit algorithm in Theorem 11.7 converges faster than the explicit algorithm in Theorem 11.5, and the proposed algorithms for any initial conditions converge much faster than the algorithm in Lemma 11.2. Finally, it is easily checked that the solution matrices given in (11.59) are positive definite. Thus, according to Theorem 11.2, it is concluded that the discrete-time antilinear system in this example is stochastically stable.

11.4 Notes and References

11.4.1 Summary

Antilinear maps often occur in quantum mechanics in the study of time reversal and in spinor calculus [29]. Inspired by this fact, the discrete-time antilinear systems were investigated in [256], where the concepts of reachability and controllability in linear systems have been extended to the discrete-time antilinear systems case. In addition, antilinear systems are a special type of complex dynamic systems with structure 436 11 Stability for Antilinear Systems constraint as pointed out at the beginning of this chapter. For complex dynamic systems, they can model many practical systems including energy generation and distribution, ecosystems, health delivery, safety and security systems, etc. As they are ubiquitous in practice, complex dynamic systems have been well studied in literature (see [125, 198]). In the field of dynamic systems with real coefficients, some special systems, such as, symmetric systems and positive systems have been intensively investigated. Similarly, antilinear systems and stochastic antilinear systems should deserve further investigation as special structured complex systems. In this chapter, the problem of stability analysis is investigated for ordinary discrete-time antilinear systems and Markovian jump antilinear systems. For ordi- nary discrete-time antilinear systems, a criterion is established to check the asymp- totical stability in terms of the so-called anti-Lyapunov matrix equation. In addition, the concept of stochastic stability is also extended to the context of discrete-time Markovian jump antilinear systems. Two necessary and sufficient conditions are presented for the stochastic stability of a discrete-time Markovian jump antilinear system by using different stochastic quadratic Lyapunov functions. The stochastic stability can be checked by the existence of positive definite solutions of the so-called coupled anti-Lyapunov matrix equations. Both the anti-Lyapunov equation and cou- pled anti-Lyapunov equation include the conjugate of unknown matrices, and belong to the class of complex conjugate matrix equations. Obviously, these two kinds of matrix equations are not linear. It is easily seen that the anti-Lyapunov matrix equation (11.7) is a special case of the con-Kalman-Yakubovich matrix equation in Chap. 3 and the extended con- Sylvester matrix equation in Subsection4.1.1. Therefore, the iterative algorithms in Chap. 3 and in Subsection4.1.1 can be used to solve the anti-Lyapunov matrix equations. In addition, the coupled anti-Lyapunov matrix equation is also a special case of the coupled con-Sylvester matrix equations in Subsection 4.2.3. Naturally, the algorithm in Subsection4.2.3 can be adopted to give the solutions of the coupled anti-Lyapunov matrix equations. However, in such an approach the special structure of this kind of equations are not utilized. Therefore, in Sect. 11.3 two types of alter- native iterative algorithms are proposed to solve the coupled anti-Lyapunov matrix equations. For the implicit iterative algorithms, some standard anti-Lyapunov matrix equations need to be solved in each iterative step. In addition, it can be seen from the simulation results that implicit iterative algorithms converge faster than the explicit iterative algorithms, and both of these two iterative algorithms are much faster than the iterative algorithm in Subsection 4.2.3. The main contents in this chapter are taken from [256, 276].

11.4.2 A Brief Overview

Markovian jump systems have been utilized to model practical problems where the structure of the plant is subject to abrupt changes due to component failures or repairs, sudden environment changes, or abrupt changes in the operating point of a non-linear plant, etc. [42]. Such phenomena can be found, for instance, in fault-tolerant control systems [176] and neural networks [213]. Due to this observation, the investigation of this class of systems, especially its stability analysis, has received extensive attention from many researchers in the last two decades. In comparison with deterministic linear systems, the notion of stability is much richer in the context of discrete-time Markovian jump (DTMJ) systems. Some significant concepts include mean square stability (MSS), moment stability, stochastic stability, and almost sure stability (ASS). In fact, MSS is second moment stability. In [153], a necessary and sufficient condition was obtained for the second moment stability of DTMJ systems by using Lyapunov's second method. In [43], some necessary and sufficient conditions were given for a DTMJ system to be mean square stable. It was shown that mean square stability can be checked by the spectral radius of an augmented matrix being less than one, or by the existence of a positive definite solution to a coupled Lyapunov equation. In [122], some simplified sufficient conditions were given for MSS. In addition, several sufficient conditions, which are also necessary for scalar systems, were given in [122] for δ-moment stability. In [123], several concepts of δ-moment stability were shown to be equivalent, and a relation between δ-moment stability and ASS was also established. On the stochastic stability of DTMJ systems, some of the earliest results may be found in [27]. By using stochastic Lyapunov methods, the authors of [27] derived a necessary and sufficient condition for the stochastic stability of DTMJ systems in terms of coupled Lyapunov matrix equations. The stochastic stability of DTMJ systems was revisited later in [121]. By choosing an alternative quadratic Lyapunov function, another coupled Lyapunov equation criterion was given for stochastic stability. In addition, the case of DTMJ systems with a time-inhomogeneous Markovian chain was also considered in [121]. The notion of ASS is more involved than the other stability notions. To date, no necessary and sufficient conditions for ASS have been reported in the literature. In [122], some sufficient conditions were obtained for ASS in terms of the singular values and operator norms of the system matrices of the subsystems. In [120], another sufficient condition was given for ASS by using a relation with δ-moment stability established in [123]. It was pointed out in [120] that the obtained condition is necessary and sufficient for scalar DTMJ systems. In the analysis of stability for DTMJ systems, coupled Lyapunov matrix equations have been introduced. It has been shown in [43, 153] that the MSS of a DTMJ system can be characterized by the existence of positive definite solutions of such equations. In [27], a necessary and sufficient condition for the stochastic stability of a DTMJ system was established based on coupled Lyapunov matrix equations. As to the solution of coupled Lyapunov matrix equations, a formula for the exact solutions was given in [43] in terms of matrix inversions and Kronecker products.
However, this is computationally expensive if the system is of high dimension and the number of modes is large. In [25], a parallel algorithm was proposed to solve such coupled Markovian jump Lyapunov matrix equations under the requirement of a zero initial condition. It was shown in [237] that the iteration sequence generated by the algorithm in [25] is also convergent without this requirement. An iterative algorithm was proposed in [309] to solve the coupled Lyapunov matrix equations by minimizing a weighted quadratic function.

Chapter 12
Feedback Design for Antilinear Systems

In Chap. 11, stability analysis is considered for antilinear systems. It has been shown that two kinds of complex conjugate matrix equations, the anti-Lyapunov matrix equation and the coupled anti-Lyapunov matrix equations, play vital roles. In this chapter, some feedback design problems are studied for discrete-time antilinear systems. It will be seen that many complex conjugate matrix equations investigated in the previous chapters find important applications in the context of antilinear systems. This chapter is organized as follows. In Sect. 12.1, the problem of generalized eigenstructure assignment is considered for discrete-time antilinear systems, and a con-Sylvester-matrix-equation-based approach is given to solve this problem. In Sect. 12.2, the problem of model reference tracking control is investigated for discrete-time antilinear systems. In Sects. 12.3 and 12.4, the quadratic regulation problems are studied for the cases of finite horizon and infinite horizon, respectively.

12.1 Generalized Eigenstructure Assignment

For a linear system, its eigenstructure is related to the following aspects: (1) the eigenvalues of its system matrix; (2) the multiplicities of each eigenvalue; (3) the eigenvectors and generalized eigenvectors of its system matrix. In linear systems, the problem of eigenstructure assignment is to find a control law such that the system matrix of the closed-loop system has the desired eigenvalues and multiplicities, and simultaneously to determine the corresponding eigenvectors and generalized eigenvectors. The problem of eigenstructure assignment has been intensively investigated [1, 66, 67]. It is well known that eigenvalues and eigenvectors are closely related to similarity. Also, it has been shown in [66] that the problem of eigenstructure assignment is to find a control law such that the closed-loop system matrix is similar to a prescribed Jordan matrix, and simultaneously to determine the corresponding eigenvector matrix.


In this section, an analogous problem is considered for the following discrete-time antilinear system

x(t + 1) = Ax(t) + Bu(t),    (12.1)

where x(t) ∈ C^n is the state and u(t) ∈ C^r is the input; A ∈ C^{n×n} and B ∈ C^{n×r} are the system matrix and the input matrix, respectively. If the state feedback

u(t) = Kx(t)    (12.2)

is applied to the system (12.1), the closed-loop system is obtained as

x(t + 1) = (A + BK)x(t).    (12.3)

It follows from Chap. 11 that algebraic anti-equivalence is an important concept for antilinear systems. Two algebraically anti-equivalent antilinear systems have the same stability property. Different from the concept of algebraic equivalence in linear systems, the algebraic anti-equivalence of antilinear systems is characterized by consimilarity. With these ideas, the problem of generalized eigenstructure assignment for the antilinear system (12.1) can be stated as follows by a slight modification of the problem of eigenstructure assignment in linear systems.

Problem 12.1 Consider the discrete-time antilinear system (12.1). Given a prescribed matrix F ∈ C^{n×n}, find a state feedback controller in the form of (12.2) such that the system matrix A + BK of the closed-loop system (12.3) is consimilar to the matrix F, and simultaneously determine a nonsingular matrix V ∈ C^{n×n} satisfying

V^{-1}(A + BK)V = F.    (12.4)

Remark 12.1 According to Sect. 2.7, if s is a coneigenvalue of A + BK and v is the corresponding coneigenvector, then (A + BK)v̄ = sv. With this observation, if F in Problem 12.1 is a diagonal matrix, then its diagonal elements are the coneigenvalues of the closed-loop system matrix A + BK, and every column of V is a coneigenvector of A + BK. However, different from the case of eigenvalues, a matrix may have no coneigenvalue at all, or the number of its coneigenvalues may be greater than its dimension. Thus, it is difficult to describe the structure of the coneigenvalues of a matrix. For this reason, in the statement of Problem 12.1 the closed-loop system matrix is required to be consimilar to a prescribed matrix F. This is why Problem 12.1 is called the problem of generalized eigenstructure assignment. In addition, it is obvious that the structure of the coneigenvalues of the closed-loop matrix A + BK is implicitly included in the matrix F.

Next, let us consider the solution to Problem 12.1. This problem includes two requirements. One is that the obtained state feedback should be such that the closed-loop system matrix is consimilar to the given F. The other is to give a nonsingular matrix V satisfying (12.4). Obviously, the consimilarity between the closed-loop system matrix A + BK and the prescribed matrix F can be easily derived from (12.4). Therefore, the problem of generalized eigenstructure assignment for the antilinear system (12.1) is to find a matrix K and a nonsingular matrix V satisfying (12.4). Due to the nonsingularity of V, the expression (12.4) can be equivalently written as

AV + BK V = VF. (12.5)

Let KV = W.

Then, the equation (12.5) can be transformed into the following matrix equation

AV + BW = VF,    (12.6)

which is the so-called con-Sylvester matrix equation investigated in Sects. 6.3, 7.1 and 8.1. With the preceding justification, the following lemma can be easily obtained.

Lemma 12.1 For the antilinear system (12.1), the state feedback gain K solving Problem 12.1 is given by

K = WV^{-1},    (12.7)

where V and W are solutions to the con-Sylvester matrix equation (12.6) satisfying the constraint det V ≠ 0.

By this lemma, the key to solving Problem 12.1 is to solve the con-Sylvester matrix equation (12.6). In this section, the approach in Sect. 7.1 is adopted to solve this equation.

Lemma 12.2 Given matrices A ∈ C^{n×n}, B ∈ C^{n×r}, and F ∈ C^{p×p}, suppose that λ(AĀ) ∩ λ(F F̄) = ∅. Let

f_{AĀ}(s) = det(sI − AĀ),

adj(sI − AĀ) = Σ_{i=0}^{n−1} R_i s^i.

Then, all the solutions to the homogeneous con-Sylvester matrix equation (12.6) can be characterized by

V = Σ_{i=0}^{n−1} R_i B Z F (F F̄)^i + A Σ_{i=0}^{n−1} R_i B Z (F F̄)^i,
W = Z f_{AĀ}(F F̄),    (12.8)

where Z ∈ C^{r×p} is an arbitrarily chosen free parameter matrix.

By Lemmas 12.1 and 12.2, one can obtain the following theorem on the solution of Problem 12.1.

Theorem 12.1 For the discrete-time antilinear system (12.1), given a matrix F ∈ C^{n×n} satisfying λ(AĀ) ∩ λ(F F̄) = ∅, all the solutions to Problem 12.1 are given by (12.7) and (12.8), where the free parameter matrix Z satisfies the constraint det V(Z) ≠ 0.
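The spectrum-disjointness condition in Theorem 12.1 is easy to check numerically. The following is a minimal Python/NumPy sketch (not from the book), where exact disjointness is replaced by a tolerance and the test matrices are hypothetical:

```python
import numpy as np

def spectra_disjoint(A, F, tol=1e-8):
    """Check lambda(A conj(A)) and lambda(F conj(F)) for (numerical) disjointness."""
    lam_A = np.linalg.eigvals(A @ np.conj(A))
    lam_F = np.linalg.eigvals(F @ np.conj(F))
    gap = min(abs(a - f) for a in lam_A for f in lam_F)
    return gap > tol

# Hypothetical matrices for illustration (not the book's example data):
A = np.array([[1.0, 1j], [0.0, -0.5]])
F = np.diag([0.2 + 0.1j, -0.3])
print(spectra_disjoint(A, F))  # expected: True for this choice
```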

On the proposed approach for solving Problem 12.1, some remarks are given as follows.

Remark 12.2 In order to derive parametric expressions of the matrices V and W, it is necessary to obtain the characteristic polynomial of AĀ and the coefficient matrices of adj(sI − AĀ). They can be obtained by the celebrated Leverrier algorithm; for the details, please refer to Sect. 7.1. A numerical sketch is given below.
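The following is a minimal Python/NumPy sketch of the Faddeev–LeVerrier form of that algorithm, applied to the product AĀ. It is an illustration only (the example matrix is hypothetical), returning the characteristic polynomial coefficients and the adjugate coefficient matrices R_i used in (12.8):

```python
import numpy as np

def leverrier(M):
    """Faddeev-LeVerrier algorithm for a square matrix M.

    Returns (a, R) with det(sI - M) = s^n + a[n-1] s^(n-1) + ... + a[0]
    and adj(sI - M) = R[n-1] s^(n-1) + ... + R[1] s + R[0]."""
    n = M.shape[0]
    a = np.zeros(n, dtype=complex)
    R = [None] * n
    R[n - 1] = np.eye(n, dtype=complex)
    for k in range(n - 1, 0, -1):
        a[k] = -np.trace(M @ R[k]) / (n - k)
        R[k - 1] = M @ R[k] + a[k] * np.eye(n)
    a[0] = -np.trace(M @ R[0]) / n
    return a, R

# Illustrative (hypothetical) data: characteristic polynomial of A conj(A)
A = np.array([[1.0, -1 - 1j], [0.0, 2j]])
a, R = leverrier(A @ np.conj(A))
print(a)  # coefficients from low order to high order
```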

Remark 12.3 The previous theorem gives a complete parametric solution to Prob- lem 12.1. The free parameter matrix Z represents the degrees of freedom in the eigenstructure assignment design, and can be sought to meet certain desired system performance.

Remark 12.4 In the proposed approach, the matrix F explicitly appears in the para- metric expression (12.8). Thus, the matrix F can be used as a free parameter, and can also be chosen according to design requirement. In the next section, it can be seen that the matrix F is chosen as an anti-Schur stable matrix in the problem of model reference tracking control.

12.2 Model Reference Tracking Control

Consider the following discrete-time antilinear system

x(t + 1) = Ax(t) + Bu(t),
y(t) = Cx(t) + Du(t),    (12.9)

where x(t) ∈ C^n, u(t) ∈ C^r, and y(t) ∈ C^m are, respectively, the state vector, the input vector, and the output vector; A, B, C, and D are the system matrices with proper dimensions. In this section, the aim is to design a control law for the system (12.9) such that the output y(t) of the closed-loop system asymptotically tracks a given reference signal y_m(t), that is,

lim_{t→∞} (y(t) − y_m(t)) = 0.    (12.10)

In this book, it is assumed that the reference signal y_m(t) is generated by the following reference model:

x_m(t + 1) = A_m x_m(t),
y_m(t) = C_m x_m(t),    (12.11)

where x_m(t) ∈ C^p and y_m(t) ∈ C^m are, respectively, the state vector and the output vector of the reference model, and A_m and C_m are known matrices with appropriate dimensions. Obviously, the considered control problem involves two models: the plant model (12.9) and the reference model (12.11). Therefore, the desired control law should be a combination of the states of these two models; that is, the control law takes the following form

u(t) = Kx(t) + Km xm(t). (12.12)

With the above preliminaries, the problem of model reference tracking control can be addressed as follows.

Problem MFT Given the plant system (12.9) and the reference model (12.11), find a control law in the form of (12.12) such that the output of the resulting closed-loop system satisfies the condition (12.10).

For convenience, the concept of anti-Schur stable matrices given in Sect. 11.1 is also used. A matrix A ∈ C^{n×n} is called anti-Schur stable if the corresponding discrete-time antilinear system x(t + 1) = Ax̄(t) is asymptotically stable.
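Anti-Schur stability can be checked numerically by composing two steps of the antilinear recursion: x(t + 2) = AĀx(t), so asymptotic stability holds exactly when the spectral radius of AĀ is less than one. The following small Python/NumPy sketch (not from the book; the test matrix is hypothetical) uses this observation:

```python
import numpy as np

def is_anti_schur(A):
    """Anti-Schur test: two steps of x(t+1) = A conj(x(t)) give the linear
    map x(t+2) = (A @ conj(A)) x(t), so check its spectral radius."""
    rho = max(abs(np.linalg.eigvals(A @ np.conj(A))))
    return rho < 1.0

# Hypothetical matrix for illustration:
A = np.array([[0.3 + 0.2j, 0.1], [0.0, 0.4 - 0.5j]])
print(is_anti_schur(A))  # expected: True
```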

12.2.1 Tracking Conditions

On the existence of a tracking control law for the antilinear system (12.9), the fol- lowing theorem is provided in this subsection.

Theorem 12.2 Given the plant (12.9) and the reference model system (12.11), let K be a gain matrix such that the following antilinear system is stable:

x(t + 1) = (A + BK )x(t). (12.13)

If there exist two matrices G ∈ Cn×p and H ∈ Cr×p satisfying

AG + B H = GAm, (12.14)

CG + DH = Cm, (12.15) then, when the control law in the form of (12.12) with

K_m = H − KG    (12.16)

is applied to the antilinear system (12.9), the resulting closed-loop system satisfies the zero-error asymptotic tracking property (12.10).

Proof Let

x̃(t) = x(t) − G x_m(t),
ũ(t) = u(t) − H x_m(t),    (12.17)
ỹ(t) = y(t) − y_m(t).

It follows from (12.9) and (12.11) that

x̃(t + 1) = Ax̃(t) + Bũ(t) + L x_m(t),
ỹ(t) = Cx̃(t) + Dũ(t) + M x_m(t),    (12.18)

where

L = AG + BH − GA_m,

M = CG + DH − C_m.

Obviously, when the conditions (12.14) and (12.15) hold, the system in (12.18) becomes

x̃(t + 1) = Ax̃(t) + Bũ(t),
ỹ(t) = Cx̃(t) + Dũ(t).    (12.19)

The system (12.19) and the plant (12.9) have the same structure. If the control law

ũ(t) = K x̃(t)    (12.20)

is applied to the system (12.19), the following closed-loop system is obtained:

x̃(t + 1) = (A + BK)x̃(t),
ỹ(t) = (C + DK)x̃(t).    (12.21)

Since K is a gain such that the system (12.13) is asymptotically stable, the closed-loop system (12.21) is also asymptotically stable. In this case, the tracking property (12.10) obviously holds. In addition, the control law (12.12) with (12.16) can be obtained from (12.17) and (12.20). The proof is thus completed. □

Due to the above theorem, for the tracking control law (12.12), K is called the feedback stabilizing gain and K_m is called the feedforward compensation gain. In this section, only a sufficient condition is given for the existence of tracking controllers for discrete-time antilinear systems; obtaining a necessary and sufficient condition for the output tracking problem of antilinear systems deserves further investigation. In the sequel, the aim is to provide a parametric approach to solve Problem MFT. According to Theorem 12.2, one needs to determine the feedback stabilizing gain and the feedforward compensation gain.

12.2.2 Solution to the Feedback Stabilizing Gain

From Sect. 11.1, the following result holds.

Lemma 12.3 If two matrices are consimilar, then one of these two matrices is anti- Schur stable if and only if the other is anti-Schur stable.

A feedback stabilizing gain K of the antilinear system (12.9) should make the closed-loop system matrix (A + BK ) anti-Schur stable. According to Lemma 12.3, for this goal, one can make the closed-loop matrix A + BK consimilar to a given anti-Schur matrix F ∈ Cn×n. Thus, there exists a nonsingular matrix X ∈ Cn×n such that X −1(A + BK )X = F.

By the result in Sect. 12.1, the following theorem can be obtained on the stabilizing gain K .

Theorem 12.3 Given the plant (12.9) and an anti-Schur stable matrix F with λ(AĀ) ∩ λ(F F̄) = ∅, denote

f_{AĀ}(s) = det(sI − AĀ),    (12.22)

adj(sI − AĀ) = Σ_{i=0}^{n−1} R_i s^i.    (12.23)

Then, all the stabilizing gains K such that the closed-loop system matrix (A + BK) is consimilar to F can be given by

K = YX^{-1},
X = Σ_{i=0}^{n−1} R_i B Z F (F F̄)^i + A Σ_{i=0}^{n−1} R_i B Z (F F̄)^i,
Y = Z f_{AĀ}(F F̄),

where Z ∈ C^{r×n} is a parameter matrix satisfying

det X(Z) ≠ 0.

Remark 12.5 It can be seen from the above theorem that, in order to obtain the stabilizing gain K, one should choose a parameter matrix Z satisfying det X(Z) ≠ 0. In general, such a matrix Z can be chosen almost arbitrarily if no other special requirements are imposed.

12.2.3 Solution to the Feedforward Compensation Gain

In this subsection, an approach is given to obtain the feedforward compensation gain K_m. It can be seen from Theorem 12.2 that the first step in obtaining the feedforward compensation gain is to solve the coupled matrix equations (12.14)–(12.15). Obviously, the equation (12.14) is also the con-Sylvester matrix equation investigated in Sects. 6.3, 7.1 and 8.1. By using the result in Sect. 7.1, it is known that, under the condition λ(AĀ) ∩ λ(A_m Ā_m) = ∅, all the matrices G and H satisfying (12.14) can be parameterized as

G = Σ_{i=0}^{n−1} R_i B W A_m (A_m Ā_m)^i + A Σ_{i=0}^{n−1} R_i B W (A_m Ā_m)^i,
H = W f_{AĀ}(A_m Ā_m),    (12.24)

where f_{AĀ}(s) and R_i, i ∈ I[0, n − 1], are defined in (12.22) and (12.23), respectively, and W ∈ C^{r×p} is a free parameter matrix. Substituting the parametric expressions of G and H in (12.24) into (12.15) gives

C Σ_{i=0}^{n−1} R_i B W A_m (A_m Ā_m)^i + CA Σ_{i=0}^{n−1} R_i B W (A_m Ā_m)^i + D W f_{AĀ}(A_m Ā_m) = C_m.    (12.25)

When the matrix W is derived by solving the matrix equation (12.25), the matrices G and H can be obtained from the expression (12.24). However, an explicit solution of the matrix equation (12.25) with respect to W cannot be given at present. Therefore, at the end of this subsection a special case where r = m is considered. Now, it is assumed that r = m. In this case, it is denoted that

𝒜 = [A  B; C  D],   ℰ = [I_{n×n}  0_{n×r}; 0_{r×n}  0_{r×r}],    (12.26)

𝒞_m = [0; C_m],   𝒳 = [G; H].    (12.27)

Then, the coupled matrix equations (12.14)–(12.15) can be written as the following matrix equation:

𝒜𝒳 − ℰ𝒳A_m = 𝒞_m,    (12.28)

which has been investigated in Sect. 7.4. By using a result in Sect. 7.4, the following theorem on the solutions G and H of the coupled matrix equations (12.14)–(12.15) is obtained.

Theorem 12.4 Given matrices A ∈ C^{n×n}, B ∈ C^{n×r}, C ∈ C^{r×n}, and D ∈ C^{r×r}, denote the matrices 𝒜, ℰ, 𝒞_m, and 𝒳 as in (12.26) and (12.27), and let γ be a scalar such that γℰ − 𝒜 is nonsingular. Further, define

ℰ_γ = (γℰ − 𝒜)^{-1}ℰ,   𝒜_γ = (γℰ − 𝒜)^{-1}𝒜,

and denote

f_{(ℰℰ_γ, 𝒜𝒜_γ)}(s) = det(sℰℰ_γ − 𝒜𝒜_γ),

adj(sℰℰ_γ − 𝒜𝒜_γ) = Σ_{i=0}^{n−1} R_i s^i.

Then, the matrices G and H satisfying (12.14)–(12.15) can be given by

𝒳 = −ℰ_γ Σ_{i=0}^{n−1} R_i 𝒞_m [f_{(ℰℰ_γ, 𝒜𝒜_γ)}(A_m Ā_m)]^{-1} A_m (A_m Ā_m)^i − 𝒜_γ Σ_{i=0}^{n−1} R_i 𝒞_m [f_{(ℰℰ_γ, 𝒜𝒜_γ)}(A_m Ā_m)]^{-1} (A_m Ā_m)^i.

Remark 12.6 The idea of converting the coupled matrix equations (12.14)–(12.15) to the extended con-Sylvester matrix equation (12.28) is inspired by the work in [76].
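The conversion in Remark 12.6 only requires stacking the plant data into the block matrices of (12.26)–(12.27). A small NumPy sketch of this bookkeeping is given below; the dimensions are hypothetical, and the unknown 𝒳 stacks G on top of H:

```python
import numpy as np

def build_extended_data(A, B, C, D, Cm):
    """Form the block matrices of (12.26)-(12.27) for the extended
    con-Sylvester formulation (12.28); requires r = m (D square)."""
    n, r = B.shape
    calA = np.block([[A, B], [C, D]])
    calE = np.block([[np.eye(n), np.zeros((n, r))],
                     [np.zeros((r, n)), np.zeros((r, r))]])
    calC = np.vstack([np.zeros((n, Cm.shape[1])), Cm])
    return calA, calE, calC

# Hypothetical dimensions: n = 3, r = m = 1, p = 2 (zero data as placeholders).
A = np.zeros((3, 3), complex); B = np.zeros((3, 1), complex)
C = np.zeros((1, 3), complex); D = np.zeros((1, 1), complex)
Cm = np.zeros((1, 2), complex)
calA, calE, calC = build_extended_data(A, B, C, D, Cm)
print(calA.shape, calE.shape, calC.shape)  # (4, 4) (4, 4) (4, 2)
```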

Remark 12.7 In Theorem 12.4, one can use the generalized Leverrier algorithm in Sect. 2.3 to obtain the characteristic polynomial of the matrix pair (E E , A A ) and the adjoint matrix of sE E − A A .

With the feedback stabilizing gain K obtained in the previous subsection and the matrices G and H given in this subsection, the feedforward compensation gain can be obtained from (12.16). With the gains K and K_m available, the tracking control law (12.12) can be implemented, as sketched below.
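As an illustration only, the following sketch wires the two gains into a closed-loop simulation. It assumes the antilinear state update x(t + 1) = A·conj(x(t)) + B·conj(u(t)) of (12.29) for both the plant and the reference model, and an output read-out built from the conjugated state and input; the exact placement of conjugations in (12.9), (12.11), and (12.12) should be taken from the original typeset equations, so treat the conjugation pattern below as an assumption rather than the book's definitive closed loop:

```python
import numpy as np

def simulate_tracking(A, B, C, D, Am, Cm, K, Km, x0, xm0, steps=50):
    """Schematic simulation of the tracking law u = K x + Km x_m (12.12).

    Assumed (not confirmed by the text): antilinear updates
    x(t+1) = A conj(x) + B conj(u), x_m(t+1) = Am conj(x_m), and outputs
    y = C conj(x) + D conj(u), y_m = Cm conj(x_m)."""
    x = np.asarray(x0, dtype=complex)
    xm = np.asarray(xm0, dtype=complex)
    errors = []
    for _ in range(steps):
        u = K @ x + Km @ xm
        y = C @ np.conj(x) + D @ np.conj(u)
        ym = Cm @ np.conj(xm)
        errors.append(np.linalg.norm(y - ym))
        x = A @ np.conj(x) + B @ np.conj(u)
        xm = Am @ np.conj(xm)
    return errors  # should decay to zero when the design succeeds
```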

12.2.4 An Example

Consider an antilinear system in the form of (12.9) with the following system matrices:

A = [1  −1−i  −1+i; 1  −i  0; 0  −1  1−i],   B = [1; 0; −1+i],

C = [2−i  1  1+i],   D = 2.

For this system, a control law in the form of (12.12) will be designed to make the output of the closed-loop system track the output of a reference model in the form of (12.11) with the parameters:

A_m = [−1+2i  −2; 2  1+2i],   C_m = [2  1+i].

Stabilizing gain Choose the following anti-Schur stable matrix F ⎡ ⎤ 1 + 1 + 2 2 i2 i0 = ⎣ 1 − 1 ⎦ F 0 2 2 i0 . 1 102

It is easily checked that the condition λ(AĀ) ∩ λ(F F̄) = ∅ holds. By calculation, one has

f_{AĀ}(s) = det(sI − AĀ) = s³ − 2s² − 3s − 8

and

adj(sI − AĀ) = R₀ + R₁s + R₂s²

with

R₀ = [−1+i  1+3i  −3+3i; −1+3i  −3−3i  −3+i; −1+2i  −1  1+2i],

R₁ = [−2−i  1−i  −3−i; 1−i  −2+i  −1−i; −1  −1  0],   R₂ = I.

Choose the parameter matrix Z as

Z = [i  1−i  2].

Then, by using Theorem 12.3 one can obtain the following stabilizing gain K:

K = [−1113066687/11759097092   5610728707/11759097092   2000037742/2939774273]
  + [−323528713/11759097092   5806111493/11759097092   5535473991/11759097092] i.

Feedforward compensation gain First, Theorem 12.4 is used to obtain the matrices G and H. If the scalar γ is chosen as γ = 1, then one has

f_{(ℰℰ_γ, 𝒜𝒜_γ)}(s) = −(16/41 − 20/41 i)s³ − (44/41 − 55/41 i)s² − (48/41 − 60/41 i)s + 136/41 − 170/41 i,

and

2 3 adj(sE Eγ − A Aγ ) = R0 + R1s + R2s + R3s , with ⎡ ⎤ ⎡ ⎤ − 180 477 − 72 − 147 − 308 193 − 402 − 11 ⎢ 41 41 41 41 ⎥ ⎢ 41 41 41 41 ⎥ ⎢ − 18 35 54 33 ⎥ ⎢ 2 − 95 − 6 − 31 ⎥ ⎢ 41 41 41 41 ⎥ ⎢ 41 41 41 41 ⎥ R0 = ⎢ ⎥ + i ⎢ ⎥ , 42 − 238 95 126 132 − 256 199 − 14 ⎣ 41 41 41 41 ⎦ ⎣ 41 41 41 41 ⎦ 220 − 599 88 193 258 − 225 382 − 67 41 41 41 41 41 41 41 41

⎡ ⎤ ⎡ ⎤ − 90 20 − 120 − 63 92 − 25 − 96 − 34 ⎢ 41 41 41 41 ⎥ ⎢ 41 41 41 41 ⎥ ⎢ 4 − 48 − 51 − 11 ⎥ ⎢ 36 19 33 24 ⎥ ⎢ 41 41 41 41 ⎥ ⎢ 41 41 41 41 ⎥ R1 = ⎢ ⎥ + i ⎢ ⎥ , 131 − 146 67 44 31 39 152 27 ⎣ 41 41 41 41 ⎦ ⎣ 41 41 41 41 ⎦ 172 − 88 200 111 − 10 − 13 242 56 ⎡ 41 41 41 41⎤ ⎡ 41 41 41 ⎤ 41 − 6 − 4 − 54 − 21 28 46 6 16 ⎢ 41 41 41 41 ⎥ ⎢ 41 41 41 41 ⎥ ⎢ 16 4 − 8 ⎥ ⎢ − 20 36 10 ⎥ ⎢ 41 41 0 41 ⎥ ⎢ 41 41 0 41 ⎥ R2 = ⎢ ⎥ + i ⎢ ⎥ , 34 2 40 19 − 22 18 32 7 ⎣ 41 41 41 41 ⎦ ⎣ 41 41 41 41 ⎦ − 14 15 26 − 3 − 70 29 41 41 1 41 41 41 1 41 ⎡ ⎤ ⎡ ⎤ 000 0 000 0 ⎢ 000 0 ⎥ ⎢ 000 0⎥ R = ⎢ ⎥ + i ⎢ ⎥ . 3 ⎣ 000 0 ⎦ ⎣ 000 0⎦ − 8 10 000 41 000 41

According to Theorem 12.4, the matrices G and H are obtained as

G = [100/7 + 48/7 i   74/7 − 18i; 20/7 + 10i   95/7 − 19/7 i; −68/7 − 22/7 i   −31/7 + 13i],

H = [−104/7 + 12/7 i   6/7 + 146/7 i].

With them, the feedforward compensation gain K_m is given by

K_m = H − KG
    = [−102681434675/20578419911   302083602885/82313679644]
    + [135215775357/41156839822   599132106863/82313679644] i.

With the obtained feedback stabilizing gain and the feedforward compensation gain, the tracking control law can be implemented.

12.3 Finite Horizon Quadratic Regulation

Starting from this section, linear quadratic regulation problems are investigated for discrete-time antilinear systems. The objective of this kind of problem is to design a feedback control law such that the resulting closed-loop system minimizes a prescribed quadratic performance index. This kind of problem has been intensively investigated in the context of linear systems. In this section, the problem of finite horizon quadratic regulation is considered. In this case, the discrete-time antilinear system under consideration is time-varying, and is described by

x(t + 1) = A(t)x̄(t) + B(t)ū(t),   x(0) = x_0,    (12.29)

where x(t) ∈ C^n and u(t) ∈ C^r are the state vector and the input vector, respectively; A(t) ∈ C^{n×n} and B(t) ∈ C^{n×r} are the system coefficient matrices; x_0 is the initial condition. For the system model (12.29), x̄(t) and ū(t) represent the conjugates of the system state x(t) and the system input u(t), respectively, i.e., the vectors obtained by taking the complex conjugate of each element of x(t) and u(t). Therefore, the discrete-time antilinear system (12.29) is investigated in the complex domain. The finite horizon linear quadratic regulation (LQR) problem to be addressed for the system (12.29) can be stated as follows.

Problem 12.2 (The finite horizon LQR problem) For the discrete-time antilinear system (12.29), find an optimal control law u(t) = K(t)x(t) minimizing the following quadratic performance index:

J(N, x_0, u) = x^H(N)Sx(N) + Σ_{t=0}^{N−1} [x^H(t)Q(t)x(t) + u^H(t)R(t)u(t)],    (12.30)

where the finite integer N ≥ 1, S ≥ 0, Q(t) ≥ 0, and R(t) > 0 for 0 ≤ t ≤ N − 1.
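For concreteness, the index (12.30) can be evaluated along any trajectory of (12.29) generated by a candidate input sequence. The following minimal Python/NumPy sketch (not from the book) uses constant weight matrices and hypothetical data, with the antilinear update taken directly from (12.29):

```python
import numpy as np

def simulate_antilinear(A, B, x0, controls):
    """Propagate x(t+1) = A conj(x(t)) + B conj(u(t)) as in (12.29),
    with constant A, B for simplicity."""
    xs = [np.asarray(x0, dtype=complex)]
    for u in controls:
        xs.append(A @ np.conj(xs[-1]) + B @ np.conj(u))
    return xs

def quadratic_cost(xs, us, S, Q, R):
    """Evaluate the index (12.30) along given state/input sequences
    (constant S, Q, R are assumed here for readability)."""
    J = np.real(xs[-1].conj() @ S @ xs[-1])
    for x, u in zip(xs[:-1], us):
        J += np.real(x.conj() @ Q @ x + u.conj() @ R @ u)
    return float(J)

# Hypothetical data for illustration only:
A = np.array([[0.5, 0.2j], [0.0, 0.4]])
B = np.array([[0.0], [1.0]])
S = Q = np.eye(2); R = np.eye(1)
us = [np.zeros(1, dtype=complex) for _ in range(10)]
xs = simulate_antilinear(A, B, np.array([1.0 + 1.0j, 0.0]), us)
print(quadratic_cost(xs, us, S, Q, R))
```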

Different from the quadratic regulation problems for linear systems, Problem 12.2 involves the performance index and dynamic equations in the complex domain. Due to this, the discrete minimum principle is first generalized to the complex domain. Consider the following discrete-time system in the complex domain

x(t + 1) = f(x(t), u(t), t),   x(0) = x_0,    (12.31)

where x(t) ∈ C^n is the system state and u(t) ∈ C^r is the control input. The goal is to find a control u(t) such that the following performance index is minimized

J(N, x_0, u) = Φ(x(N), N) + Σ_{t=0}^{N−1} L(x(t), u(t), t),

where both Φ and L are real-valued functions. To solve this problem, the following discrete minimum principle is provided.

Lemma 12.4 (The discrete minimum principle) For the discrete-time system (12.31), if u*(t) is a control such that

u*(t) = arg min_{u(t)} { Φ(x(N), N) + Σ_{t=0}^{N−1} L(x(t), u(t), t) },
s.t. x(t + 1) = f(x(t), u(t), t),

then the following statements hold.
(1) The corresponding optimal state vector x*(t) and co-state vector λ*(t) satisfy the regular equations

∂ H(x∗(t) , u∗(t) ,λ∗(t + 1) , t) λ∗(t) = , ∂x(t)

∂ H(x∗(t) , u∗(t) ,λ∗(t + 1) , t) x∗(t + 1) = = f (x∗(t) , u∗(t) , t), ∂λ(t + 1)

where

H(x(t) , u(t) ,λ(t + 1) , t) = L(x(t) , u(t) , t) + λH(t + 1) f (x(t) , u (t) , t)

is the discrete Hamiltonian function. (2) The optimal control u∗ (t) satisfies

∂ H(x∗(t) , u∗(t) ,λ∗(t + 1) , t) = 0 . ∂u(t)

(3) The boundary condition and the transversality condition are

x*(t_0) = x_0,   λ*(N) = ∂Φ/∂x(N).

Lemma 12.4 is a complex-domain version of the discrete minimum principle. The proof of this lemma can be easily obtained by a slight modification of that of the minimum principle in the real domain.

Before proceeding, the following result on matrix inversion is provided. Lemma 12.5 Let A, B, C, and D be matrices with appropriate dimensions. If A, C, and A + BC D are invertible, then

(A + BCD)^{-1} = A^{-1} − A^{-1}B(C^{-1} + DA^{-1}B)^{-1}DA^{-1}.
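Lemma 12.5 is the familiar matrix inversion (Woodbury-type) identity, and it can be sanity-checked numerically on a generic instance. A small NumPy sketch follows (the matrices here are random and hypothetical; for a generic draw all the required inverses exist):

```python
import numpy as np

rng = np.random.default_rng(0)
A = 2.0 * np.eye(3) + 0.1 * rng.standard_normal((3, 3))
B = rng.standard_normal((3, 2))
C = np.eye(2)
D = rng.standard_normal((2, 3))

lhs = np.linalg.inv(A + B @ C @ D)
rhs = np.linalg.inv(A) - np.linalg.inv(A) @ B @ np.linalg.inv(
    np.linalg.inv(C) + D @ np.linalg.inv(A) @ B) @ D @ np.linalg.inv(A)
print(np.allclose(lhs, rhs))  # expected: True
```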

By using the discrete minimum principle and the dynamic programming tech- nique, necessary and sufficient conditions for the existence of the unique optimal control are established in the following result. Theorem 12.5 Given the discrete-time antilinear system (12.29) with the system matrix A nonsingular, there exists a unique state regulator u(t) = K (t) x(t) to mini- mize the performance index J given in (12.30), if and only if there exists a Hermitian matrix P(t) satisfying the following matrix difference equation

P(t) = Q(t) + A^T(t)P(t + 1)[I + B(t)R^{-1}(t)B^T(t)P(t + 1)]^{-1}A(t),    (12.32)

with the boundary condition

P(N) = S. (12.33)

In this case, the optimal state feedback control u∗ (t) is given by

u∗(t) = K (t) x(t) , (12.34) with

 −1 K (t) =− R(t) + BT(t) P(t + 1)B(t) BT(t) P(t + 1)A(t). (12.35)

Proof First, the necessity is proven by the discrete minimum principle in Lemma 12.4. Construct the following Hamiltonian function

H(x(t), u(t), λ(t + 1), t) = x^H(t)Q(t)x(t) + u^H(t)R(t)u(t) + λ^H(t + 1)[A(t)x̄(t) + B(t)ū(t)].    (12.36)

By using the matrix derivative rule in Sect. 2.10, the control equation can be obtained as

∂H/∂u(t) = R(t)u(t) + B^T(t)λ(t + 1) = 0.    (12.37)

Thus, the optimal control is given by

u*(t) = −R^{-1}(t)B^T(t)λ(t + 1).    (12.38)

The co-state equation can be written as

∂ H λ(t) = = Q(t) x(t) + AT(t) λ(t + 1), (12.39) ∂x(t) with the transversality condition ∂   λ(N) = xH (N) Sx(N) = Sx(N) . (12.40) ∂x(N)

Substituting (12.38) into the state equation (12.29), yields

x(t + 1) = A(t) x(t) − B(t)R−1(t)BH(t) λ(t + 1) . (12.41)

In view that A(t) is nonsingular, it follows from (12.39) that

 −  − λ(t + 1) =− AH(t) 1 Q(t)x (t) + AH(t) 1 λ(t). (12.42)

Combining (12.42) with (12.41), gives

[x(t + 1); λ(t + 1)] = [M_{11}  M_{12}; M_{21}  M_{22}] [x(t); λ(t)],    (12.43)

where

M_{11} = A(t) + B(t)R^{-1}(t)B^H(t)(A^H(t))^{-1}Q(t),
M_{12} = −B(t)R^{-1}(t)B^H(t)(A^H(t))^{-1},
M_{21} = −(A^H(t))^{-1}Q(t),
M_{22} = (A^H(t))^{-1}.

With the transversality condition (12.40), it can be derived from (12.43) that there exists a time-varying matrix P (t) such that

λ(t) = P(t)x(t). (12.44)

Substituting (12.44)into(12.38), one can obtain the following optimal control

u∗(t) =−R−1(t)BT(t)P(t + 1) x(t + 1). (12.45)

Since the optimal control law in the form of (12.45) is not linear, then x (t + 1) in (12.45) can be replaced with x(t) to find a linear state feedback control law. Substituting (12.45) into the state equation (12.29), one has

x(t + 1) = A(t)x(t) − B(t)R^{-1}(t)B^H(t)P(t + 1)x(t + 1),

which implies that

x(t + 1) = [I + B(t)R^{-1}(t)B^H(t)P(t + 1)]^{-1}A(t)x(t).    (12.46)

It follows from (12.46) and (12.45) that the linear state feedback control law is given by (12.34) with

− − − 1 K (t) =−R 1 (t) BT (t) P(t + 1) I + B(t)R 1 (t) BT (t) P (t + 1) A(t). (12.47)

For the gain K (t) in (12.47), by Lemma 12.5 one has

K (t)     − −1 =−R 1(t) BT (t) P(t + 1) I − B(t) R(t) + BT(t) P (t + 1)B(t) BT (t) P(t + 1) A(t)     − −1 =−R 1(t) I − BT (t) P(t + 1)B(t) R(t) + BT(t) P (t + 1)B(t) BT (t) P(t + 1)A (t)     − −1 =−R 1(t) R(t) R(t) + BT (t) P (t + 1)B(t) BT (t) P(t + 1)A(t)

 −1 =− R(t) + BT(t) P(t + 1)B(t) BT (t) P(t + 1)A(t).

This is the relation (12.35). On the other hand, it follows from the co-state equation (12.39) and the relation (12.44) that P(t) x(t) = Q(t) x(t) + AT(t) P(t + 1) x(t + 1). (12.48)

Combining (12.48) with (12.46), yields

−1 P(t) = Q(t) + AT(t)P(t + 1) I + B(t)R−1(t)BT(t)P(t + 1) A(t), (12.49) which is the relation (12.32). Next, the sufficiency is shown by using the dynamic programming principle, that is, the principle of optimality: any part of an optimality trajectory must itself be optimal. For convenience, denote

ϕ (k) = xH(k) Q(k) x(k) + uH (k) R(k) u(k) .

Thus, the performance index function J (N, t0, x, u) in (12.30) can be written as

J(N, t_0, x, u) = x^H(N)Sx(N) + Σ_{i=0}^{N−1} ϕ(i).

Construct a function V : I[0, N]×Cn −→ R such that, for each s ∈ I[0, N] and each x ∈ Cn,

V(s, x) = min J(N, s, x, u) . u

From the boundary condition, one has

V(N, x) = xH(N)Sx(N).

Denote

JN−1 = V (N, x) + ϕ (N − 1) . (12.50)

By the principle of optimality, it can be seen that the partial value of the optimality trajectory at time N − 1 should satisfy the following condition

V (N − 1, x) = min {JN−1} . u(N−1)

Substituting the state equation (12.29)into(12.50), gives

H JN−1 = ϕ (N − 1) + A(N − 1)x(N − 1) + B(N − 1)u(N − 1)

·S A(N − 1)x(N − 1) + B(N − 1)u(N − 1) .

Taking conjugate gradient of JN−1 with respect to the control input u (N − 1), yields

∂ J − N 1 = R(N − 1)u(N − 1) + BT(N − 1)ST ∂ ( − ) u N 1 · A(N − 1)x(N − 1) + B(N − 1)u(N − 1) .

Then by setting the preceding conjugate gradient to be 0, one can obtain a possible optimal control u∗ (N − 1) as follows:

u∗(N − 1) = K (N − 1)x(N − 1), (12.51) with

K(N − 1) =−(R(N − 1) + BT(N − 1)ST B(N − 1))−1 BT(N − 1)ST A(N − 1).

Since the matrix S is Hermitian, the gain K (N − 1) can be rewritten as

K(N − 1) = −(R(N − 1) + B^T(N − 1)SB(N − 1))^{-1}B^T(N − 1)SA(N − 1).

In order to verify that the control (12.51) is actually the optimal control, the Hessian matrix of the function JN−1 is considered:

∂ ∂ T 2 JN−1 ∇ ( − ) JN−1 = u N 1 ∂u(N − 1) ∂u(N − 1) = R(N − 1) + BT(N − 1)ST B(N − 1) > 0.

This implies that the JN−1 reaches its unique minimum V(N − 1, x) at u(N − 1) = u∗(N − 1), where u∗(N − 1) is given in (12.51). Thus, with the optimal control (12.51), one has

V (N − 1, x) ∗ ∗ = xH(N − 1)Q(N − 1)x(N − 1) + u H(N − 1)R(N − 1)u (N − 1)

H + A(N − 1)x(N − 1) + B(N − 1)u∗(N − 1) ·

S A(N − 1)x(N − 1) + B(N − 1)u∗(N − 1)

= xH(N − 1)Q(N − 1)x(N − 1) + xH(N − 1)K H(N − 1)R(N − 1)K (N − 1)x(N − 1)

H + A(N − 1)x(N − 1) + B(N − 1)K (N − 1)x(N − 1) ·

S A(N − 1)x(N − 1) + B(N − 1)K (N − 1)x(N − 1)

= xH(N − 1)Q(N − 1)x(N − 1) + xH(N − 1)K H(N − 1)R(N − 1)K (N − 1)x(N − 1)

H + A(N − 1)x(N − 1) + B(N − 1)K (N − 1)x(N − 1) ·

S A(N − 1)x(N − 1) + B(N − 1)K (N − 1)x(N − 1)

= xH(N − 1)Q(N − 1)x(N − 1) + xH(N − 1)K H(N − 1)R(N − 1)K (N − 1)x(N − 1)

T +xH(N − 1) A(N − 1) + B(N − 1)K (N − 1) S A(N − 1) + B(N − 1)K (N − 1) x(N − 1)

= xH (N − 1) P (N − 1) x (N − 1) , with

P(N − 1) = Q(N − 1) + K H(N − 1)R(N − 1)K (N − 1)

T + A(N − 1) + B(N − 1)K (N − 1) S A(N − 1) + B(N − 1)K (N − 1) .

Notice that P(N) = S. Then the expressions of P(N − 1) and K (N − 1) can be rewritten as

P(N − 1) = Q(N − 1) + K^H(N − 1)R(N − 1)K(N − 1)
          + [A(N − 1) + B(N − 1)K(N − 1)]^T P(N)[A(N − 1) + B(N − 1)K(N − 1)],    (12.52)

and

K(N − 1) = −(R(N − 1) + B^T(N − 1)P(N)B(N − 1))^{-1}B^T(N − 1)P(N)A(N − 1).    (12.53)

From (12.52) and (12.53), it can be assumed that at the time t the optimal control is u∗(t) = K (t)x(t), where

K (t) =−(R(t) + BT(t)P(t + 1)B(t))−1 BT(t)P(t + 1)A(t), (12.54) and the corresponding partial value of the optimality trajectory is

V (t, x) = xH(t)P(t)x(t), where

T P(t) = Q(t) + K H(t)R(t)K (t) + A(t) + B(t)K (t) P(t + 1) A(t) + B(t)K (t) . (12.55)

With these two assumptions, let us consider the partial value of the optimality tra- jectory at time t − 1. Denote

Jt−1 = V (t, x) + ϕ (t − 1) .

Taking conjugate gradient of Jt−1 with respect to u(t − 1),gives

∂ J − t 1 = R(t − 1)u(t − 1) + BT(t − 1)P(t) A(t − 1)x(t − 1) + B(t − 1)u(t − 1) . ∂u(t − 1)

By setting the preceding expression to be zero, one can obtain a possible optimal control at the time t − 1 as follows

u∗(t − 1) = K (t − 1)x(t − 1), (12.56) with

−1 K(t − 1) =− R(t − 1) + BT(t − 1)P(t)B(t − 1) BT(t − 1)P(t)A(t − 1). (12.57) On the other hand, the Hessian matrix of the function Jt−1 is given by

∇²_{u(t−1)} J_{t−1} = ∂/∂u(t − 1) (∂J_{t−1}/∂u(t − 1))^T = R(t − 1) + B^T(t − 1)P(t)B(t − 1) > 0.

This implies that the control given in (12.56)–(12.57) is the optimal control. Similarly to the derivation for the case at the time N − 1, one can obtain the partial value function at the time t − 1is

V (t − 1, x) = xH(t − 1)P(t − 1)x(t − 1), with

P(t − 1) = Q(t − 1) + K H(t − 1)R(t − 1)K (t − 1) (12.58)

T + A(t − 1) + B(t − 1)K (t − 1) P(t)A(t − 1) + B(t − 1)K (t − 1).

The relations in (12.57) and (12.58) imply that the relations (12.54) and (12.55) hold for the case at the time t − 1. By mathematical induction, (12.54) and (12.55) hold for any integer t ∈ I[0, N]. Finally, we eliminate K (t) in the expression (12.55)ofP (t). Substituting (12.54) into (12.55), yields

P(t) = Q(t) + K H(t)R(t)K (t) + AT(t)P(t + 1)A(t) + AT(t)P(t + 1)B(t)K (t) + H( ) T( ) ( + ) ( ) + H( ) T( ) ( + ) ( ) ( ) K t B t P t 1 A t K t B t P t 1 B t K t = Q(t) + K H(t) R(t) + BT(t)P(t + 1)B(t) K (t) + AT(t)P(t + 1)A(t)

+AT(t)P(t + 1)B(t)K (t) + K H(t)BT(t)P(t + 1)A(t)

−1 = Q(t) + AT(t)P(t + 1)B(t) R(t) + BT(t)P(t + 1)B(t) R(t) + BT(t)P(t + 1)B(t)

−1 · R(t) + BT(t)P(t + 1)B(t) BT(t)P(t + 1)A(t) + AT(t)P(t + 1)A(t)

−1 −AT(t)P(t + 1)B(t) R(t) + BT(t)P(t + 1)B(t) BT(t)P(t + 1)A(t)

−1 −AT(t)P(t + 1)B(t) R(t) + BT(t)P(t + 1)B(t) BT(t)P(t + 1)A(t)

= Q(t) + AT(t)P(t + 1)A(t) − AT(t)P(t + 1)B(t)

−1 · R(t) + BT(t)P(t + 1)B(t) BT(t)P(t + 1)A(t)

−1 = Q(t) + AT(t)P(t + 1) I − B(t) R(t) + BT(t)P(t + 1)B(t) BT(t)P(t + 1) A(t).

Hence, one can obtain the following matrix difference equation

P(t) = Q(t) + A^T(t)P(t + 1)[I − B(t)(R(t) + B^T(t)P(t + 1)B(t))^{-1}B^T(t)P(t + 1)]A(t),

which can be equivalently rewritten as (12.32) by using Lemma 12.5. Thus, the sufficiency holds. In addition, the state feedback gain of the unique optimal control has been given in (12.54), which is the expression in (12.35). With the previous aspects, the conclusion of this theorem is thus proven. □

When all the matrices are real, the difference equation (12.32) reduces to the well-known discrete-time Riccati matrix equation. For convenience, the difference equation (12.32) is referred to as the anti-Riccati matrix equation.

Remark 12.8 It can be seen from the proof of Theorem 12.5 that the nonsingularity of the system matrix A is only needed in the necessity proof. Therefore, in general the anti-Riccati matrix equation condition is sufficient for the existence of the unique optimal state feedback regulator.

Now let us address how to obtain the unique optimal control.

Remark 12.9 From the preceding proof procedure, the state feedback regulator can be obtained by the following backward iteration:
(1) P(N) = S;
(2) K(t) = −[R(t) + B^T(t)P(t + 1)B(t)]^{-1}B^T(t)P(t + 1)A(t), t = N − 1, N − 2, ..., 0;
(3) P(t) = Q(t) + K^H(t)R(t)K(t) + [A(t) + B(t)K(t)]^T P(t + 1)[A(t) + B(t)K(t)], t = N − 1, N − 2, ..., 0.
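For the real-coefficient case mentioned above, where (12.32) reduces to the classical discrete-time Riccati difference equation, the backward iteration of Remark 12.9 can be sketched as follows. This is a minimal Python/NumPy sketch with constant real matrices (hypothetical illustration data); by Theorem 12.6 the optimal cost is then x_0^T P(0) x_0:

```python
import numpy as np

def finite_horizon_lqr(A, B, Q, R, S, N):
    """Backward iteration of Remark 12.9 for real, constant coefficient
    matrices (the real-coefficient specialization of (12.32)-(12.35))."""
    P = S
    gains = [None] * N
    for t in range(N - 1, -1, -1):
        K = -np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)   # step (2)
        Acl = A + B @ K
        P = Q + K.T @ R @ K + Acl.T @ P @ Acl                # step (3)
        gains[t] = K
    return gains, P   # P approximates P(0)

# Hypothetical data for illustration:
A = np.array([[1.0, 1.0], [0.0, 1.0]])
B = np.array([[0.0], [1.0]])
Q = np.eye(2); R = np.eye(1); S = np.eye(2)
gains, P0 = finite_horizon_lqr(A, B, Q, R, S, N=20)
```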

In addition, the following remark is given on the expressions of the anti-Riccati matrix equation and the state feedback gain in Theorem 12.5. Remark 12.10 It follows from the proof of Theorem 12.5 that the anti-Riccati matrix equation (12.32) can be equivalently written as

−1 P(t) = Q(t) + AT(t)P(t + 1) I − B(t) R(t) + BT(t)P(t + 1)B(t) BT(t)P(t + 1) A(t), (12.59) and the state feedback gain K (t) in (12.35) can be expressed as

 −1 K (t) =−R−1(t) BT (t) P(t + 1) I + B(t)R−1(t) BT(t) P(t + 1) A(t).

In Theorem 12.5, the existence condition of the unique optimal control has been established for the antilinear system (12.29), and the expression of the control law has also been given. It is natural to ask for the optimal value of the performance index; this is given in the following result.

Theorem 12.6 Given the antilinear system (12.29) with initial value x(0) = x_0, let the matrix P(t) satisfy the anti-Riccati matrix equation (12.32). Then, under the state feedback (12.34)–(12.35), the optimal value J*[x_0] of the performance index (12.30) is given by

J*[x_0] = x_0^H P(0) x_0.

Proof It follows from the state equation (12.29) that

xH(t + 1)P(t + 1)x(t + 1) − xH(t)P(t)x(t) = xH(t + 1)P(t + 1)x(t + 1) − xH(t)P(t)x(t)

H = A(t)x(t) + B(t)u(t) P(t + 1) A(t)x(t) + B(t)u(t) − xH(t)P(t)x(t)

= xH(t) AT(t)P(t + 1)A(t) − P(t) x(t) + xH(t)AT(t)P(t + 1)B(t)u(t) +uH(t)BT(t)P(t + 1)A(t)x(t) + uH(t)BT(t)P(t + 1)B(t)u(t).

By Remark 12.10, it is seen that the anti-Riccati matrix equation (12.32) can be equivalently written as (12.59). With (12.59), from the preceding relation one has

xH(t + 1)P(t + 1)x(t + 1) − xH(t)P(t)x(t)

−1 =−xH(t)Q(t)x(t) + xH(t)AT(t)P(t + 1)B(t) R(t) + BT(t)P(t + 1)B(t) BT(t)P(t + 1)A(t)x(t)

+ xH(t)AT(t)P(t + 1)B(t)u(t) + uH(t)BT(t)P(t + 1)A(t)x(t) + uH(t)BT(t)P(t + 1)B(t)u(t)

−1 H =−xH(t)Q(t)x(t) − uH(t)R(t)u(t) + u(t) + R(t) + BT(t)P(t + 1)B(t) BT(t)P(t + 1)A(t)x(t)

−1 · R(t) + BT(t)P(t + 1)B(t) u(t) + R(t) + BT(t)P(t + 1)B(t) BT(t)P(t + 1)A(t)x(t) .

Thus, under the control law (12.34)–(12.35) one has xH(t + 1)P(t + 1)x(t + 1) − xH(t)P(t)x(t) =−xH(t)Q(t)x(t) − uH(t)R(t)u(t).

Summing both sides of the preceding relation from 0 to N − 1, gives

N−1   xH(t + 1)P(t + 1)x(t + 1) − xH(t)P(t)x(t) t=0 N−1   =− xH(t)Q(t)x(t) + uH(t)R(t)u(t) t=0 = H( ) ( ) − H ( ) x N Sx N x0 P 0 x0.

With this relation, one can derive that the corresponding performance index is

N−1   ∗ H H H J [x0] = x (N)Sx(N) + x (t)Q(t)x(t) + u (t)R(t)u(t) t=0  = H( ) ( ) − H( ) ( ) − H ( ) x N Sx N x N Sx N x0 P 0 x0 = H ( ) x0 P 0 x0.

The proof is thus completed. □

Remark 12.11 Note that the matrix P(t) in Theorem 12.5, though Hermitian, is not necessarily positive definite. For instance, if S = 0 and Q = 0, then the performance index J is minimized by u = 0, and the optimal performance index is 0. Thus P(t) = 0 for 0 ≤ t ≤ N. Equivalently, the matrix P(t) = 0 is the solution to the anti-Riccati matrix equation (12.32) with S = Q = 0.

12.4 Infinite Horizon Quadratic Regulation

In the preceding section, the optimal state regulator has been obtained for discrete time-varying antilinear systems on a finite interval. In this section, the infinite horizon quadratic regulation is considered. To simplify, it is assumed that the considered antilinear systems are time-invariant, and that the matrices in the performance index are also independent of time t. Specifically, the following discrete-time antilinear system is considered:

x(t + 1) = Ax̄(t) + Bū(t),   x(0) = x_0,    (12.60)

where A ∈ C^{n×n} and B ∈ C^{n×r} are the system matrices of this antilinear system. The problem to be investigated in this section is stated as follows.

Problem 12.3 (The infinite horizon LQR problem) Given the discrete-time antilin- ear system (12.60), find an optimal control u(t) = Kx(t) such that the following performance index is minimized:

J_∞(x_0, u) = Σ_{t=0}^{∞} [x^H(t)Qx(t) + u^H(t)Ru(t)],    (12.61)

with Q ≥ 0, R > 0.

Now for the antilinear system (12.60), it is denoted that

τ−1   H H Jτ (x0, u) = x (t)Qx(t) + u (t)Ru(t) , t=0 and

τ−1   H H Vτ (s, x0) = min x (t)Qx(t) + u (t)Ru(t) . u t=s

By Theorem 12.6, it is shown that

V_τ(0, x_0) = x_0^H P(0) x_0,

where P(0) is obtained by the following anti-Riccati matrix equation:

−1 P(t) = Q + AT P(t + 1) I + BR−1 BT P(t + 1) A, P(τ) = 0.

In addition, by time invariance it is seen that Vτ (0, x0) can be obtained by

( , ) = H (−τ) Vτ 0 x0 x0 P1 x0, where P1(−τ) is given by the following anti-Riccati equation:

−1 T −1 T P1(t) = Q + A P1(t + 1) I + BR B P1(t + 1) A, P1(0) = 0.

Let Π(t) = P_1(−t). Then, by Remark 12.10, Π(t) satisfies the equation

Π(t + 1) = Q + A^T Π(t)[I − B(R + B^T Π(t)B)^{-1}B^T Π(t)]A,   Π(0) = 0,    (12.62)

for t ≥ 0. In this case, one has

( , ) = H (τ) Vτ 0 x0 x0 x0. (12.63)

Take any pair μ>τ≥ 0, and let v be optimal for the interval [0,μ] with the initial state x0. Then

H (μ) x0 x0 = Vμ(0, x) = Jμ(x, v)   μ−1  H H = Jτ x, v|[0,τ] + x (t)Qx(t) + v (t)Rv(t) (12.64) t=τ ≥ ( , ) = H (τ) Vτ 0 x x0 x0.

This holds for all x_0, so Π(μ) ≥ Π(τ) for μ > τ ≥ 0. Next, it needs to be shown that, if the antilinear system (12.60) is controllable, then Π(t) in (12.62) has a limit as t approaches infinity. Note that the concept of controllability here is a direct generalization of that for discrete time-invariant linear systems. For precision, the following definition is given.

Definition 12.1 The discrete-time antilinear system (12.60) is called controllable if, for any initial time t_0 with initial state x(t_0) = x_0 and an arbitrary final state x_1, there exist a finite time instant l, l > t_0, and a control input sequence {u(i), i ∈ I[t_0, l]} such that the corresponding state response of the system (12.60) with initial value x(t_0) = x_0 satisfies x(l) = x_1.

Lemma 12.6 Assume that the antilinear system (12.60) is controllable, and let R > 0, Q ≥ 0. Then, for the sequence {Π(t)} given in (12.62), the limit Π = lim_{t→∞} Π(t) exists. Moreover, Π satisfies the algebraic anti-Riccati equation

Π = Q + A^T Π[I − B(R + B^T Π B)^{-1}B^T Π]A.    (12.65)

n Proof Fix any x0 ∈ C . One first notes that controllability implies that there is some u such that J∞ (x0, u) is finite, where J∞ (x0, u) is given in (12.61). In fact, since the system (12.60) is controllable, then there are an integer t1 and some control u1 such that under the control u1 there holds x (t1) = 0. Now, let us consider the control v on [0, ∞) that is equal to u1 for t ∈[0, t1 − 1] and is identically zero for t ≥ t1. Then,

− t1 1   H H J∞ (x0, v) = x (t)Qx(t) + u (t)Ru(t) < ∞. t=0

It follows that for each τ > 0,

x_0^H Π(τ) x_0 = V_τ(0, x_0) ≤ J_τ(x_0, v|_{[0, t_1−1]}) ≤ J_∞(x_0, v).    (12.66)

This relation and (12.64) imply that the sequence x_0^H Π(t) x_0 is not only nondecreasing in t but also bounded from above. Thus, lim_{t→∞} x_0^H Π(t) x_0 exists for each fixed x_0. With this fact, by choosing some special x_0 it is easily concluded that lim_{t→∞} Π(t) exists. By taking limits on both sides of (12.62), the expression (12.65) is readily obtained. □

With the result of Lemma 12.6, from (12.63) the following result can be immediately obtained.

Lemma 12.7 Given the antilinear system (12.60) and the quadratic performance index J_∞(x_0, u) in (12.61) with R > 0, Q ≥ 0, and assuming that the system (12.60) is controllable, then

min_u J_∞(x_0, u) = x_0^H Π x_0,

where Π is the solution to the anti-Riccati equation (12.65).

With this lemma as a tool, the next theorem gives a result on the solution of Problem 12.3.

Theorem 12.7 Assume that the antilinear system (12.60) is controllable. Then, for R > 0 and Q ≥ 0, there exists for each x_0 ∈ C^n a unique control minimizing the performance index J_∞(x_0, u) in (12.61). Moreover, the unique optimal control u*(t) is given by

u*(t) = Kx(t),    (12.67)

K = −(R + B^T Π B)^{-1}B^T Π A,    (12.68)

where Π is the solution to the algebraic anti-Riccati matrix equation (12.65). Moreover, the optimal performance is

J_∞(x_0, u*) = x_0^H Π x_0.    (12.69)

Proof When u∗(t) given by (12.67) and (12.68) is the closed-loop control on the interval [0, ∞), the corresponding closed-loop system is   x(t + 1) = A + BK x (t), x (0) = x0. (12.70)

In addition, by substituting u (t) = u∗ (t) = Kx(t) with K given in (12.68)into (12.61) one has   ∞   ∗ H H J∞ x0, u = x (t) Q + K RK x(t). t=0

On the other hand, since satisfies (12.65), then for the closed-loop system one has

xH(t + 1) x(t + 1) − xH(t) x(t)  H   = xH(t) A + BK A + BK x(t) − xH(t) x(t)

 T = xH(t) A + BK A + BK x(t) − xH(t) x(t)

= xH(t) AT A + K H BT A + AT BK + K H BT BK − x(t)   −1 = xH(t) AT B R + BT B BT A + K H BT A + AT BK + K H BT BK − Q x(t).

By using the expression (12.68)ofK , from the preceding relation, one can obtain that

xH(t + 1) x(t + 1) − xH(t) x(t)     = xH(t) K H R + BT B K + K H BT A + AT BK + K H BT BK − Q x(t)       = xH(t) K H RK + K H BT A + BK + A + BK H BK − Q x(t). (12.71)

In addition, it follows from (12.68) that

(R + B^T Π B)K = −B^T Π A,

which implies that

B^T Π (A + BK) = −RK.

Substituting this expression into (12.71), gives

H( + ) ( + ) − H( ) ( ) x t  1 x t 1 x t x t  = H( ) H − H + (− )H − ( ) x t K RK K RK RK K Q x t =−xH(t) K H RK + Q x(t).

With this relation, one has

  ∞   ∗ H H J∞ x0, u = x (t) Q + K RK x(t) t=0 ∞   = xH(t) x(t) − xH(t + 1) x(t + 1) (12.72) t=0 H H = x x0 − lim x (t) x(t). 0 t→∞

In addition, due to = limt→∞ (t), it follows from (12.66) that   , ∗ ≤ H ≤ ( , ) J∞ x0 u x0 x0 J∞ x0 v for all v. Combining this relation with Lemma 12.7,gives   , ∗ = H J∞ x0 u x0 x0, (12.73) which implies that u∗ given in (12.67)–(12.68) is optimal, and the corresponding performance index is given by (12.69). In addition, it follows from (12.72) and (12.73) that

lim xH(t) x(t) = 0. t→∞

Next, let us show that the optimal control u∗ givenin(12.67)–(12.68) is unique. Let v be another control different from u∗, and x (t) be state response of the corresponding closed-loop system with x (0) = x0. Similarly to the derivation in the proof of Theorem 12.6,byusing(12.65) one has

x^H(t + 1)Πx(t + 1) − x^H(t)Πx(t) = −x^H(t)Qx(t) − v^H(t)Rv(t)
+ [v(t) + (R + B^T Π B)^{-1}B^T Π Ax(t)]^H (R + B^T Π B)[v(t) + (R + B^T Π B)^{-1}B^T Π Ax(t)].

Since v(t) is different from u∗,forsomeτ ≥ 1 one has

Jτ (x0, v) τ−1   = xH(t)Qx(t) + vH(t)Rv(t) t=0 τ−1   = xH(t) x(t) − xH(t + 1) x(t + 1) + t=0 τ−1      −1 H    −1 v(t) + R + BT B BT Ax(t) R + BT B v(t) + R + BT B BT Ax(t) t=0 > H − H(τ) (τ) x0 x0 x x , which is equivalently written as

H < ( , ) + H(τ) (τ) x0 x0 Jτ x0 v x x .

Let z = x(τ), and the control ω (t) = v (t + τ), t ≥ 0. Then, it follows from Lemma 12.7 that H x (τ) x(τ) ≤ J∞(z,ω) .

By time invariance, one has

Jτ (x0, v) + J∞ (z,ω) = J∞ (x0, v) .

The preceding three relations imply that

H < ( , ) = ∗ x0 x0 J∞ x0 v for all v u .

This implies the uniqueness of the optimal control u∗. The proof is thus completed. 

Remark 12.12 Similarly to the case of finite horizon regulation, the anti-Riccati matrix equation (12.65) can be equivalently written as

 −1 = Q + AT I + BR−1 BT A.

In this case, the unique optimal state feedback gain K in (12.68) can be written as

K = −R^{-1}B^T Π[I + BR^{-1}B^T Π]^{-1}A.
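In the real-coefficient case, where (12.65) reduces to the standard discrete-time algebraic Riccati equation, the limit Π can be approximated by iterating (12.62) from Π(0) = 0, and the optimal gain and closed-loop stability of Theorems 12.7 and 12.8 can then be checked directly. The sketch below is illustrative only (real hypothetical data, a stabilizable pair assumed); the cross-check uses scipy.linalg.solve_discrete_are:

```python
import numpy as np
from scipy.linalg import solve_discrete_are

def iterate_riccati_real(A, B, Q, R, iters=500):
    """Fixed-point iteration (12.62) started from zero, written for real
    coefficient matrices (the classical Riccati difference iteration)."""
    P = np.zeros_like(A)
    for _ in range(iters):
        M = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        P = Q + A.T @ P @ (A - B @ M)
    return P

# Hypothetical data for illustration:
A = np.array([[1.0, 0.2], [0.0, 0.9]])
B = np.array([[0.0], [1.0]])
Q = np.eye(2); R = np.eye(1)
P = iterate_riccati_real(A, B, Q, R)
print(np.allclose(P, solve_discrete_are(A, B, Q, R), atol=1e-6))  # True
K = -np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)                # cf. (12.68)
print(max(abs(np.linalg.eigvals(A + B @ K))) < 1)                 # Schur stable
```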

In this section, the infinite horizon regulation problem has been considered. A natural question is whether the closed-loop system under the optimal control law is stable; in other words, can the optimal control obtained by minimizing the quadratic performance index stabilize the original system? The following theorem gives an answer to this question.

However, the general solution of the anti-Riccati matrix equation (12.65) deserves further investigation in the future.

Theorem 12.8 Given the antilinear system (12.60) and two matrices R > 0 and Q > 0, under the optimal control (12.67)–(12.68) with Π satisfying the anti-Riccati matrix equation (12.65), the resulting closed-loop system is asymptotically stable.

Proof First, let us show that Π > 0 if Q > 0. It follows from Lemma 12.6 that Π is the limit of the sequence Π(t) satisfying (12.62). Fix any x_0 ≠ 0; (12.64) implies that x_0^H Π(t) x_0 is nondecreasing in t. In addition, it follows from (12.62) that

−1 (1) = Q + AT (0) I − B R + BT (0)B BT (0) A = Q > 0.

Therefore, for any x0 = 0 one has

H H x x0 = lim x (t) x0 > 0. 0 t→∞ 0

Due to the arbitrariness, this relation implies that >0. Under the optimal control (12.67)–(12.68), the closed-loop system is   x (t + 1) = A + BK x (t) (12.74)

 −1 with K =− R + BT B BT A. Since >0, for the closed-loop system, we define the Lyapunov function

V (x(t)) = xH(t) x(t).

It follows from the proof of Theorem 12.7 that

 ( ( )) = ( ( + )) − ( ( )) V x t V x t 1 V x t  =−xH(t) Q + K H RK x(t).

Since Q > 0, it follows that Q + K^H RK > 0, and hence the closed-loop system is asymptotically stable. □

12.5 Notes and References

12.5.1 Summary

In this chapter, some feedback design problems have been studied for discrete-time antilinear systems. In Sect. 12.1, the problem of generalized eigenstructure assignment via state feedback was considered. The goal of this problem is to find a state feedback such that the resulting closed-loop system matrix is consimilar to a given matrix, and simultaneously to give the corresponding transformation matrix. This problem is a generalization of the eigenstructure assignment problem for linear systems. It is worth pointing out that the key concept in the context of antilinear systems is consimilarity. In this section, a parametric approach is presented for this problem. By using this approach, general complete parametric expressions can be obtained for the state feedback gain and the corresponding transformation matrix. Obviously, the proposed approach needs to use the parametric solutions of the con-Sylvester matrix equations as tools.

In Sect. 12.2, the problem of model reference control is considered for discrete-time antilinear systems. A sufficient condition is presented for the existence of an output tracking controller. It is shown that the tracking controller is a combination of a feedback stabilizing controller and a feedforward compensator. The compensation gain satisfies two matrix equations, one of which is the con-Sylvester matrix equation investigated in the previous chapters. In addition, the feedback stabilizing gain can be obtained in the framework of generalized eigenstructure assignment. Based on explicit solutions of the con-Sylvester matrix equations, a parametric design approach is proposed to solve the model reference control problem for discrete-time antilinear systems. An illustrative example is given to show the application of the proposed approach.

In Sects. 12.3 and 12.4, the linear quadratic regulation problems are investigated for discrete-time antilinear systems. In Sect. 12.3, the finite horizon case is considered, and in Sect. 12.4 the infinite horizon case is studied. First, in Sect. 12.3 the discrete minimum principle is generalized to the complex domain. By using the discrete minimum principle and dynamic programming, necessary and sufficient conditions for the existence of the unique optimal control are obtained for the finite horizon regulation problem in terms of the so-called anti-Riccati matrix equation. Besides, the optimal value of the performance index under the optimal control is provided. Furthermore, in Sect. 12.4 the optimal regulation problem on an infinite horizon is investigated under the assumption that the considered time-invariant antilinear system is controllable. The resulting closed-loop system under the optimal control turns out to be asymptotically stable. The content in Sect. 12.2 is taken from [243], and the contents in Sects. 12.3 and 12.4 are taken from [277].

12.5.2 A Brief Overview

Eigenstructure assignment in linear systems has attracted much attention in the last three decades. In [192], a geometric approach was presented to solve eigenstructure assignment for linear systems. In [115], the class of assignable eigenvectors and generalized eigenvectors associated with the assigned eigenvalues via state feedback is explicitly described by introducing a result on the differentiation of determinants. In [66], a complete parametric approach was proposed for state feedback eigenstructure assignment in linear systems based on explicit solutions to the so-called

Sylvester matrix equations. The approach in [66] does not require the distinctness of the closed-loop eigenvalues from the open-loop ones. On the problems of eigenstructure assignment via output feedback and dynamical compensators, one may refer to [65, 67, 73]. The problem of generalized eigenstructure assignment for antilinear systems in this chapter is a generalization of the case of linear systems.

Model reference control, also called model following, is a vital control approach, and has been extensively investigated. Roughly, the problem of model reference control is to select a controller such that the resulting closed-loop system is as close as possible to a given reference model. In [223], model reference control was considered for a class of single-input single-output (SISO) systems. The designed controller consists of a conventional model matching feedback and a linear "modelling error compensator" which guarantees an arbitrarily small H∞-norm of the tracking error transfer function over a given frequency range. The approximate model reference control was considered for discrete-time SISO nonlinear systems in [195], where the distance between the closed-loop system and the reference model was measured by the ∞-norm and the 2-norm.

An important type of model reference control problem is that of asymptotic output tracking. In [38], a so-called new standard model reference control approach was proposed to solve the output tracking problem for linear time-varying systems. For this aim, the time-varying effect of the plant parameters was first transformed into some additional signals. In [203], such a problem was considered for SISO time-invariant systems with unknown parameters, nonlinear uncertainties, and additive bounded disturbances. The proposed method in [203] was an extension of model reference adaptive control. In [5], the output tracking problem was studied for a multiple-input multiple-output (MIMO) system with disturbances and parameter uncertainties. When the reference model was not strictly positive real, an augmented system was presented in [5]. In [6], a state feedback plus feedforward compensation scheme was designed for a class of nonlinear systems such that the tracking error was bounded and tended to a neighbourhood of the origin. In [96], the problem of robust output tracking was investigated for MIMO linear systems with parameter uncertainties. The designed controller in [96] is a combination of a state feedback stabilizing controller and a feedforward compensator. The compensation gain can be obtained by solving a coupled matrix equation. Based on a parametric solution of Sylvester matrix equations given in [66], the robust compensator design was turned into a minimization problem. In [111], the problem of robust model reference control was investigated for descriptor linear systems. The work of [111] can be viewed as a parallel version of [96]. The result in this chapter can be viewed as an extended version of the work [96] in the context of antilinear systems.

The standard linear quadratic regulation (LQR) problem is to find the optimal control law for a linear system such that a given quadratic performance index is minimized. This regulation problem, including the finite horizon case and the infinite horizon case, has been well investigated for linear systems in the literature [78].
For discrete-time linear systems, the earliest result on the LQR problem is perhaps that in [162], where the discrete-time LQR problem was formulated and a solution to the infinite horizon LQR problem was obtained by using the dynamic programming approach.
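As a brief illustration of the dynamic programming approach mentioned above, it is worth recalling the classical finite horizon solution for an ordinary (linear, not antilinear) discrete-time system; the notation below is generic and serves only as a sketch of the standard result. For the system $x_{k+1} = Ax_k + Bu_k$ with the quadratic performance index
\[
J = x_N^{\mathrm{T}} Q_N x_N + \sum_{k=0}^{N-1}\left(x_k^{\mathrm{T}} Q x_k + u_k^{\mathrm{T}} R u_k\right), \qquad Q \ge 0,\ Q_N \ge 0,\ R > 0,
\]
dynamic programming yields the backward Riccati recursion
\[
P_N = Q_N, \qquad P_k = Q + A^{\mathrm{T}} P_{k+1} A - A^{\mathrm{T}} P_{k+1} B \left(R + B^{\mathrm{T}} P_{k+1} B\right)^{-1} B^{\mathrm{T}} P_{k+1} A,
\]
and the optimal control law
\[
u_k = -\left(R + B^{\mathrm{T}} P_{k+1} B\right)^{-1} B^{\mathrm{T}} P_{k+1} A\, x_k .
\]
In the infinite horizon case the recursion is replaced by the corresponding algebraic Riccati equation; the quadratic regulation results of this chapter are developed instead in terms of an anti-Riccati matrix equation for antilinear systems.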

The mathematical details in [162] were further developed in [161]. In order to obtain the solution of the infinite horizon LQR problem, the important concept of "controllability" was introduced. It has been shown in [161] that the infinite horizon optimal regulation problem is solvable if and only if the system is completely controllable. In recent years, the classic LQR problem has been investigated for both continuous-time and discrete-time systems with input delays (see [295, 299]).

On LQR problems, several approaches have been developed in the literature. In [162] and [61], the method of dynamic programming was employed to solve the infinite horizon and finite horizon LQR problems for discrete-time systems, respectively. By using the dynamic programming technique and the discrete minimum principle, respectively, the linear quadratic optimal control problem was treated in detail in [175], and necessary and sufficient conditions were proposed for the existence of optimal control laws. The results on quadratic regulation in this chapter are the counterparts of these results in the context of antilinear systems.

References

1. Abdelaziz, T.H.S., Valasek, M.: Eigenstructure assignment by proportional-plus-derivative feedback for second-order linear control systems. Kybernetika 41(5), 661–676 (2005) 2. Abou-Kandil, H., Freiling, G., Lonessu, V., Jank, G.: Matrix Riccati Equations in Control and Systems Theory. Springer, New York (2003) 3. Aguirre, L.A.: Controllability and observability of linear systems: some noninvariant aspects. IEEE Trans. Edu. 38(1), 33–39 (1995) 4. Al’pin, Y.A., Il’in, S.N.: The matrix equation AX − YB = C and related problems. J. Math. Sci. 137(3), 4769–4773 (2006) 5. Ambrose, H., Qu, Z.: Model reference robust control for mimo systems. Int. J. Control 68(3), 599–623 (1997) 6. Ambrosino, G., Celentano, G., Garofalo, F.: Robust model tracking control for a class of nonlinear plants. IEEE Trans. Autom. Control, AC 30(3), 275–279 (1985) 7. Andrews, H.C., Patterson, C.L.: Singular value decompositions and digital image-processing. IEEE Trans. Acoust. Speech Sig. Process. ASSP 24(1), 26–53 (1976) 8. Baksalary, J.K., Kala, R.: The matrix equation AX −YB = C. Linear Algebra Appl. 25, 41–43 (1979) 9. Baksalary, J.K., Kala, R.: The matrix equation AXB + CYD = E. Linear Algebra Appl. 30(1), 141–147 (1980) 10. Barnett, S.: Leverrier’s algorithm: a new proof and extensions. SIAM J. Matrix Anal. Appl. 10(4), 551–556 (1989) 11. Barnett, S., Gover, M.J.C.: Some extensions of Hankel and Toeplitz matrices. Linear Multi- linear Algebra 14, 45–65 (1983) 12. Barnett, S., Storey, C.: The Liapunov matrix equation and Schwarz’s form. IEEE Trans. Autom. Control 12(1), 117–118 (1967) 13. Bartels, R.H., Stewart, G.W.: Solution of the matrix equation AX + XB = C. Commun. ACM 15(9), 820–826 (1972) 14. Basilio, J.C., Kouvaritakis, B.: An algorithm for coprime matrix fraction description using sylvester matrices. Linear Algebra Appl. 266(2), 107–125 (1997) 15. Ben-Israel, A., Greville, T.N.E.: Generalized Inverses: Theory and Applications, 2nd edn. Springer, New York (2003) 16. Benner, P., Damm, T.: Lyapunov equations, energy functionals, and model order reduction of bilinear and stochastic systems. SIAM J. Control Optim. 49(2), 686–711 (2011) 17. Benner, P., Li, R.C., Truhar, N.: On the ADI method for Sylvester equations. J. Comput. Appl. Math. 233(4), 1035–1045 (2009) 18. Benner, P., Quintana-Qrti, E.S., Quintana-Qrti, G.: Solving stable Stein equations on distrib- uted memory computers. Lect. Notes Comput. Sci. 1685, 1120–1123 (1999)


19. Betser, A., Cohen, N., Zeheb, E.: On solving the Lyapunov and Stein equations for a companion matrix. Syst. Control Lett. 25(3), 211–218 (1995) 20. Bevis, J.H., Hall, F.J., Hartwing, R.E.: The matrix equation AX¯ − XB = C and its special cases. SIAM J. Matrix Anal. Appl. 9(3), 348–359 (1988) 21. Bevis, J.H., Hall, F.J., Hartwing, R.E.: Consimilarity and the matrix equation AX¯ − XB = C. In: Current Trends in Matrix Theory, Auburn, Ala., 1986, pp. 51–64. North-Holland, New York (1987) 22. Bischof, C.H., Datta, B.N., Purkyastha, A.: A parallel algorithm for the Sylvester observer equation. SIAM J. Sci. Comput. 17(3), 686–698 (1996) 23. Bitmead, R.R.: Explicit solutions of the discrete-time Lyapunov matrix equation and Kalman- Yakubovich equations. IEEE Trans. Autom. Control, AC 26(6), 1291–1294 (1981) 24. Bitmead, R.R., Weiss, H.: On the solution of the discrete-time Lyapunov matrix equation in controllable canonical form. IEEE Trans. Autom. Control, AC 24(3), 481–482 (1979) 25. Borno, I.: Parallel computation of the solutions of coupled algebraic Lyapunov equations. Automatica 31(9), 1345–1347 (1995) 26. Borno, I., Gajic, Z.: Parallel algorithm for solving coupled algebraic Lyapunov equations of discrete-time jump linear systems. Comput. Math. Appl. 30(7), 1–4 (1995) 27. Boukas, E.K., Yang, H.: Stability of discrete-time linear systems with markovian jumping parameters. Math. Control Sig. 8(4), 390–402 (1995) 28. Brockett, R.W.: Introduction to Matrix Analysis. IEEE Transactions on Education, Baltimore (1970) 29. Budinich, P., Trautman, A.: The Spinorial Chessboard. Springer, Berlin (1988) 30. Calvetti, D., Lewis, B., Reichel, L.: On the solution of large Sylvester-observer equations. Numer. Linear Algebra Appl. 8, 435–451 (2001) 31. Carvalho, J., Datta, K., Hong, Y.:A new block algorithm for full-rank solution of the Sylvester- observer equation. IEEE Trans. Autom. Control 48(12), 2223–2228 (2003) 32. Carvalho, J.B., Datta, B.N.: An algorithm for generalized Sylvester-observer equation in state estimation of descriptor systems. In: Proceedings of the 41th Conference on Decision and Control, pp. 3021–3026 (2002) 33. Carvalho, J.B., Datta, B.N.: A new block algorithm for generalized Sylvester-observer equa- tion and application to state estimation of vibrating systems. In: Proceedings of the 43th Conference on Decision and Control, pp. 3613–3618 (2004) 34. Carvalho, J.B., Datta, B.N.: A new algorithm for generalized Sylvester-observer equation and its application to state and velocity estimations in vibrating systems. Numer. Linear Algebra Appl. 18(4), 719–732 (2011) 35. Chang, M.F., Wu, A.G., Zhang, Y.: On solutions of the con-Yakubovich matrix equation X − AXF = BY. In: Proceedings of the 33rd Chinese Control Conference, pp. 6148–6152 (2014) 36. Chen, B.S., Zhang, W.: Stochastic H2/H∞ control with state-dependent noise. IEEE Trans. Autom. Control 49(1), 45–57 (2004) 37. Chen, Y., Xiao, H.: The explicit solution of the matrix equation AX − XB = C-to the memory of Prof Guo Zhongheng. Appl. Math. Mech. 16(12), 1133–1141 (1995) 38. Chien, C.J., Fu, L.C.: A new approach to model reference control for a class of arbitrarily fast time-varying unknown plants. Automatica 28(2), 437–440 (1992) 39. Choi, J.W.: Left eigenstructure assignment via Sylvester equation. KSME Int. J. 12(6), 1034– 1040 (1998) 40. Chu, K.E.: Singular value and generalized singular value decompositions and the solution of linear matrix equations. Linear Algebra Appl. 88–89, 83–98 (1987) 41. 
Chu, K.E.: The solution of the matrix equations AXB − CXD = E and (YA − DZ, YC − BZ) = (E, F). Linear Algebra Appl. 93, 93–105 (1987) 42. Costa, O., Fragoso, M.D., Marques, R.P.: Discrete-time Markovian Jump Linear Systems. Springer, Berlin (2005) 43. Costa, O.L.V., Fragoso, M.D.: Stability results for discrete-time linear systems with Markovian jumping parameters. J. Math. Anal. Appl. 179(1), 154–178 (1993)

44. Datta, B.N., Hetti, C.: Generalized Arnoldi methods for the Sylvester-observer equation and the multi-input pole placement problem. In: Proceedings of the 36th Conference on Decision and Control, pp. 4379–4383 (1997) 45. Datta, B.N., Saad, Y.: Arnoldi methods for large Sylvester-like observer matrix equations, and an associated algorithm for partial spectrum assignment. Linear Algebra Appl. 154–156, 225–244 (1991) 46. Datta, K.: The matrix equation XA − BX = R and its applications. Linear Algebra Appl. 109, 91–105 (1988) 47. de Souza, E., Bhattacharyya, S.P.:Controllability, observability and the solution of AX−XB = C. Linear Algebra Appl. 39, 167–188 (1981) 48. de Teran, F., Dopico, F.M.: Consistency and efficient solution of the Sylvester equation for *-congruence. Electron. J. Linear Algebra 22, 849–863 (2011) 49. Dehghan, M., Hajarian, M.: An iterative algorithm for the reflexive solutions of the generalized coupled Sylvester matrix equations and its optimal approximation. Appl. Math. Comput. 202(2), 571–588 (2008) 50. Dehghan, M., Hajarian, M.: Efficient iterative method for solving the second-order Sylvester matrix equation EVF2 − AVF − CV = BW. IET Control Theor. Appl. 3(10), 1401–1408 (2009) 51. Dehghan, M., Hajarian, M.: An iterative method for solving the generalized coupled Sylvester matrix equations over generalized bisymmetric matrices. Appl. Math. Model. 34(3), 639–654 (2010) 52. Ding, F., Chen, T.: Gradient based iterative algorithms for solving a class of matrix equations. IEEE Trans. Autom. Control 50(8), 1216–1221 (2005) 53. Ding, F., Chen, T.: Hierarchical gradient-based identification of multivariable discrete-time systems. Automatica 41(2), 315–325 (2005) 54. Ding, F., Chen, T.: Hierarchical least squares identification methods for multivariable systems. IEEE Trans. Autom. Control 50(3), 397–402 (2005) 55. Ding, F., Chen, T.: Iterative least squares solutions of coupled Sylvester matrix equations. Syst. Control Lett. 54(2), 95–107 (2005) 56. Ding, F., Chen, T.: On iterative solutions of general coupled matrix equations. SIAM J. Control Optim. 44(6), 2269–2284 (2006) 57. Ding, F., Ding, J.: Least squares parameter estimation for systems with irregularly missing data. Int. J. Adapt. Control Sig. Proces. 24(7), 540–553 (2010) 58. Ding, F., Liu, P.X., Ding, J.: Iterative solutions of the generalized Sylvester matrix equations by using the hierarchical identification principle. Appl. Math. Comput. 197(1), 41–50 (2008) 59. Djaferis, T.E., Mitter, S.K.: Algebraic methods for the study of some linear matrix equations. Linear Algebra Appl. 44, 125–142 (1982) 60. Dooren, P.V.: Reduced order observers: a new algorithm and proof. Syst. Control Lett. 4(5), 243–251 (1984) 61. Dorato, P., Levis, A.: Optimal linear regulators: the discrete-time case. IEEE Trans. Autom. Control 16(6), 613–620 (1971) 62. Drakakis, K., Pearlmutter, B.A.: On the calculation of the l2 → l1 induced matrix norm. Int. J. Algebra 3(5), 231–240 (2009) 63. Duan, G.R.: Simple algorithm for robust pole assignment in linear output feedback. IEE Proc.-Control Theor. Appl. 139(5), 465–469 (1992) 64. Duan, G.R.: Solution to matrix equation AV + BW = EVF and eigenstructure assignment for descriptor linear systems. Automatica 28(3), 639–643 (1992) 65. Duan, G.R.: Robust eigenstructure assignment via dynamical compensators. Automatica 29(2), 469–474 (1993) 66. Duan, G.R.: Solutions of the equation AV +BW = VFand their application to eigenstructure assignment in linear systems. IEEE Trans. Autom. 
Control 38(2), 276–280 (1993) 67. Duan, G.R.: Eigenstructure assignment by decentralized output feedback: a complete parametric approach. IEEE Trans. Autom. Control 39(5), 1009–1014 (1994)

68. Duan, G.R.: Parametric approach for eigenstructure assignment in descriptor systems via output feedback. IEE Proc.-Control Theor. Appl. 142(6), 611–616 (1995) 69. Duan, G.R.: On the solution to the sylvester matrix equation AV + BW = EVF. IEEE Trans. Autom. Control 41(4), 612–614 (1996) 70. Duan, G.R.: Eigenstructure assignment and response analysis in descriptor linear systems with state feedback control. Int. J. Control 69(5), 663–694 (1998) 71. Duan, G.R.: Right coprime factorisations using system upper hessenberg forms the multi-input system case. IEE Proc.-Control Theor. Appl. 148(6), 433–441 (2001) 72. Duan, G.R.: Right coprime factorizations for single-input descriptor linear systems: a simple numerically stable algorithm. Asian J. Control 4(2), 146–158 (2002) 73. Duan, G.R.: Parametric eigenstructure assignment via output feedback based on singular value decompositions. IET Control Theor. Appl. 150(1), 93–100 (2003) 74. Duan, G.R.: Two parametric approaches for eigenstructure assignment in second-order linear systems. J. Control Theor. Appl. 1(1), 59–64 (2003) 75. Duan, G.R.: Linear Systems Theory, 2nd edn. (In Chinese). Harbin Institute of Technology Press, Harbin (2004) 76. Duan, G.R.: A note on combined generalised sylvester matrix equations. J. Control Theor. Appl. 2(4), 397–400 (2004) 77. Duan, G.R.: Parametric eigenstructure assignment in second-order descriptor linear systems. IEEE Trans. Autom. Control 49(10), 1789–1795 (2004) 78. Duan, G.R.: The solution to the matrix equation AV + BW = EVJ + R. Appl. Math. Lett. 17(10), 1197–1202 (2004) 79. Duan, G.R.: Parametric approaches for eigenstructure assignment in high-order linear systems. Int. J. Control, Autom. Syst. 3(3), 419–429 (2005) 80. Duan, G.R.: Solution to high-order generalized Sylvester matrix equations. In: Proceedings of the 44th IEEE Conference on Decision and Control and the European Control Conference 2005, pp. 7247–7252. Seville, Spain (2005) 81. Duan, G.R.: Analysis and Design of Descritor Linear Systems. Springer, New York (2010) 82. Duan, G.R.: On a type of generalized Sylvester equations. In: Proceedings of the 25th Chinese Control and Decision Conference, pp. 1264–1269 (2013) 83. Duan, G.R.: On a type of high-order generalized Sylvester equations. In: Proceedings of the 32th Chinese Control Conference, Xi’an, China, pp. 328–333 (2013) 84. Duan, G.R. On a type of second-order generalized Sylvester equations. In: Proceedings of the 9th Asian Control Conference (2013) 85. Duan, G.R.: Solution to a type of high-order nonhomogeneous generalized Sylvester equa- tions. In: Proceedings of the 32th Chinese Control Conference, Xi’an, China, pp. 322–327 (2013) 86. Duan, G.R.: Solution to a type of nonhomogeneous generalized Sylvester equations. In: Proceedings of the 25th Chinese Control and Decision Conference, pp. 163–168 (2013) 87. Duan, G.R.: Solution to second-order nonhomogeneous generalized Sylvester equations. In Proceedings of the 9th Asian Control Conference (2013) 88. Duan, G.R.: Parametric solutions to rectangular high-order Sylvester equations–case of F arbitrary. In: SICE Annual Conference 2014, Sapporo, Japan, pp. 1827–1832 (2014) 89. Duan, G.R.: Parametric solutions to rectangular high-order Sylvester equations–case of F jordan. In: SICE Annual Conference 2014, Sapporo, Japan, pp. 1820–1826 (2014) 90. Duan, G.R.: Generalized Sylvester Equations: Unified Parametric Solutions. CRC Press, Tay- lor and Francis Group, Boca Roton (2015) 91. 
Duan, G.R., Howe, D.: Robust magnetic bearing control via eigenstructure assignment dynamical compensation. IEEE Trans. Control Syst. Technol. 11(2), 204–215 (2003) 92. Duan, G.R., Hu, W.Y.: On matrix equation AX − XC = Y (In Chinese). Control Decis. 7(2), 143–147 (1992) 93. Duan, G.R., Li, J., Zhou, L.: Design of robust Luenberger observer (In Chinese). Acta Automatica Sinica 18(6), 742–747 (1992)

94. Duan, G.R., Liu, G.P.,Thompson, S.: Disturbance decoupling in descriptor systems via output feedback-a parametric eigenstructure assignment approach. In: Proceedings of the 39th IEEE Conference on Decision and Control, pp. 3660–3665 (2000) 95. Duan, G.R., Liu, G.P., Thompson, S.: Eigenstructure assignment design for proportional- integral observers: continuous-time case. IEE Proc.-Control Theor. Appl. 148(3), 263–267 (2001) 96. Duan, G.R., Liu, W.Q., Liu, G.P.: Robust model reference control for multivariable linear systems subject to parameter uncertainties. Proc. Inst. Mech. Eng. Part I: J. Syst. Control Eng. 215(6), 599–610 (2001) 97. Duan, G.R., Patton, R.J.: Eigenstructure assignment in descriptor systems via proportional plus derivative state feedback. Int. J. Control 68(5), 1147–1162 (1997) 98. Duan, G.R., Patton, R.J.: Robust fault detection using Luenberger-type unknown input observers–a parametric approach. Int. J. Syst. Sci. 32(4), 533–540 (2001) 99. Duan, G.R., Qiang, W.: Design of Luenberger observers with loop transfer recovery(B) (In Chinese). Acta Aeronautica ET Astronautica Sinica 14(7), A433–A436 (1993) 100. Duan, G.R., Qiang, W.,Feng, W.,Sun, L.: A complete parametric approach for model reference control system design (In Chinese). J. Astronaut. 15(2), 7–13 (1994) 101. Duan, G.R., Wu, A.G.: Robust fault detection in linear systems based on PI observers. Int. J. Syst. Sci. 37(12), 809–816 (2006) 102. Duan, G.R., Wu, G.Y.: Robustness analysis of pole assignment controllers for continuous systems (In Chinese). J. Harbin Inst. Technol. 2, 62–69 (1989) 103. Duan, G.R., Wu, G.Y., Huang, W.H.: Analysis of the robust stabiltiy of discrete pole assign- ment control system (In Chinese). J. Harbin Inst. Technol. 4, 57–64 (1988) 104. Duan, G.R., Wu, G.Y., Huang, W.H.: Robust stability analysis of pole assignment controllers for linear systems (In Chinese). Control Theor. Appl. 5(4), 103–108 (1988) 105. Duan, G.R., Wu, G.Y., Huang, W.H.: The eigenstructure assignment problem of linear time- varying systems (In Chinese). Sci. China: Series A 20(7), 769–776 (1990) 106. Duan, G.R., Xu, S.J., Huang, W.H.: Generalized positive definite matrix and its application in stability analysis (In Chinese). Acta Mechanica Sinica 21(6), 754–757 (1989) 107. Duan, G.R., Yu,H.H.: Observer design in high-order descriptor linear systems. In: Proceedings of SICE-ICASE International Joint Conference 2006, Busan, Korea, pp. 870–875 (2006) 108. Duan, G.R., Yu, H.H.: Parametric approaches for eigenstructure assignment in high-order descriptor linear systems. In: Proceedings of the 45th IEEE Conference on Decision and Control, San Diego, USA, 2006, pp. 1399–1404 (2006) 109. Duan, G.R., Yu, H.H.: Robust pole assignment in high-order descriptor linear systems via proportional plus derivative state feedback. IET Control Theor. Appl. 2(4), 277–287 (2008) 110. Duan, G.R., Yu, H.H., Tan, F.: Parametric control systems design with applications in missile control. Sci. China Series F: Inform. Sci. 52(11), 2190–2200 (2009) 111. Duan, G.R., Zhang, B.: Robust model-reference control for descriptor linear systems subject to parameter uncertainties. J. Control Theor. Appl. 5(3), 213–220 (2007) 112. Duan, G.R., Zhang, X.: Dynamical order assignment in linear descriptor systems via state derivative feedback. In: Proceedings of the 41st IEEE Conference on Decision and Control, pp. 4533–4538 (2002) 113. Duan, G.R., Zhou, B.: Solution to the second-order sylvester matrix equation MVF2 +DVF+ KV = BW. IEEE Trans. Autom. 
Control 51(5), 805–809 (2006) 114. Duan, X., Li, C., Liao, A., Wang, R.: On two classes of mixed-type Lyapunov equations. Appl. Math. Comput. 219, 8486–8495 (2013) 115. Fahmy, M., O’Reilly, J.: Eigenstructure assignment in linear multivariable systems: a parametric solution. IEEE Trans. Autom. Control 28(10), 990–994 (1983) 116. Fahmy, M.M., Hanafy, A.A.R.: A note on the extremal properties of the discrete-time Lyapunov matrix equation. Inform. Control 40, 285–290 (1979) 117. Fairman, F.W.: Linear Control Theory: The State Space Approach. Wiley, West Sussex, England (1998)

118. Fang, C.: A simple approach to solving the Diophantine equation. IEEE Trans. Autom. Control 37(1), 152–155 (1992) 119. Fang, C., Chang, F.: A novel approach for solving Diophantine equations. IEEE Trans. Circ. Syst. 37(11), 1455–1457 (1990) 120. Fang, Y.: A new general sufficient condition for almost sure stability of jump linear systems. IEEE Trans. Autom. Control 42(3), 378–382 (1997) 121. Fang, Y., Loparo, K.A.: Stochastic stability of jump linear systems. IEEE Trans. Autom. Control 47(7), 1204–1208 (2002) 122. Fang, Y., Loparo, K.A., Feng, X.: Almost sure and δ-moment stability of jump linear systems. Int. J. Control 59(5), 1281–1307 (1994) 123. Fang, Y., Loparo, K.A., Feng, X.: Stability of discrete-time jump linear systems. J. Math. Syst. Estimation Control 5(3), 275–321 (1995) 124. Fang, Y., Loparo, K.A., Feng, X.: New estimates for solutions of Lyapunov equations. IEEE Trans. Autom. Control 42(3), 408–411 (1997) 125. Ferrante, A., Pavon, M., Ramponi, F.:Hellinger versus kullback-leibler multivariable spectrum approximation. IEEE Trans. Autom. Control 53(4), 954–967 (2008) 126. Flanders, H., Wimmer, H.K.: On the matrix equation AX −XB = C and AX −YB = C.SIAM J. Appl. Math. 32(4), 707–710 (1977) 127. Frank, P.M.: Fault diagnosis in dynamic systems using analytical and knowledge-based redundancy-a survey and some new results. Automatica 26(3), 459–474 (1990) 128. Gajic, Z.: Lyapunov Matrix Equations in System Stability and Control. Academic Press, New York (1995) 129. Gardiner, J.D., Laub, A.J., Amato, J.J., Moler, C.B.: Solution of the Sylvester matrix equation AXBT + CXDT = E. ACM Trans. Math. Softw. 18(2), 223–231 (1992) 130. Gavin, K.R., Bhattacharyya, S.P.: Robust and well-conditioned eigenstructure assignment via sylvester’s equation. Optimal Control Appl. Meth. 4, 205–212 (1983) 131. Ghavimi, A.R., Laub, A.J.: Backward error, sensitivity, and refinement of computed solutions of algebraic Riccati equations. Numer. Linear Algebra Appl. 2(1), 29–49 (1995) 132. Golub, G.H., Loan, C.F.V.: Matrix Computations, 3rd edn. Johns Hopkins, Baltimore (1996) 133. Golub, G.H., Nash, S., Loan, C.V.:A Hessenberg-Schur method for the problem AX+XB = C. IEEE Trans. Autom. Control, AC 24(6), 909–913 (1979) 134. Gugercin, S., Sorensen, D.C., Antoulas, A.C.: A modified low-rank Smith method for large- scale Lyapunov equations. Numer. Algorithms 32, 27–55 (2003) 135. Hajarian, M.: Matrix form of the Bi-CGSTAB method for solving the coupled Sylvester matrix equations. IET Control Theor. Appl. 7(14), 1828–1833 (2013) 136. Hammarling, S.J.: Numerical solution of the stable, nonnegative definite Lyapunov equation. IMAJ.Numer.Anal.2(3), 303–323 (1982) 137. Hanzon, B., Peeters, R.L.M.: A Faddeev sequence method for solving Lyapunov and Sylvester equations. Linear Algebra Appl. 241–243, 401–430 (1996) 138. Hernandez, V., Gasso, M.: Explicit solution of the matrix equation AXB − CXD = E. Linear Algebra Appl. 121, 333–344 (1989) 139. Heyouni, M.: Extended arnoldi methods for large low-rank Sylvester matrix equations. Appl. Numer. Math. 60, 1171–1182 (2010) 140. Higham, N.J.: Perturbation theory and backward error for AX −XB = C.BIT33(1), 124–136 (1993) 141. Hong, Y.P., Horn, R.A.: A canonical form for matrices under consimilarity. Linear Algebra Appl. 102, 143–168 (1988) 142. Horn, R.A., Johnson, C.R.: Matrix Analysis. Cambridge University Press, Cambridge (1990) 143. Horn, R.A., Johnson, C.R.: Topics in Matrix Analysis. Cambridge University Press, Cam- bridge (1991) 144. 
Hou, S.: A simple proof of the Leverrier-Faddeev characteristic polynomial algorithm. SIAM Rev. 40(3), 706–709 (1998) 145. Hu, G.D., Hu, G.D.: A relation between the weighted logarithmic norm of a matrix and the Lyapunov equation. BIT 40, 606–610 (2000)

146. Huang, L., Liu, J.: The extension of Roth’s theorem for matrix equations over a ring. Linear Algebra Appl. 259, 229–235 (1997) 147. Huang, L.: The explicit solutions and solvability of linear matrix equations. Linear Algebra Appl. 311, 195–199 (2000) 148. Huang, L.: Consimilarity of quaternion matrices and complex matrices. Linear Algebra Appl. 331, 21–30 (2001) 149. Huang, L., Zeng, Q.: The matrix equation AXB+CYD = E over a simple artinian ring. Linear Multilinear Algebra 38, 225–232 (1995) 150. Jacobson, N.: Pseudo-linear transformation. Ann. Math. 38, 484–507 (1937) 151. Jadav, R.A., Patel, S.S.: Applications of singular value decomposition in image processing. Indian J. Sci. Technol. 3(2), 148–150 (2010) 152. Jameson, A.: Solutions of the equation AX − XB = C by inversion of an M × M or N × N matrix. SIAM J. Appl. Math. 16, 1020–1023 (1968) 153. Ji, Y., Chizeck, H.J., Feng, X., Loparo, K.: Stability and control of discrete-time jump linear systems. Control-Theor. Adv. Tech. 7(2), 247–270 (1991) 154. Jiang, T., Cheng, X., Chen, L.: An algebraic relation between consimilarity and similarity of complex matrices and its applications. J. Phys. A: Math. Gen. 39, 9215–9222 (2006) 155. Jiang, T., Wei, M.: On solutions of the matrix equations X − AXB = C and X − AXB¯ = C. Linear Algebra Appl. 367, 225–233 (2003) 156. Young Jr., D.M.: Iterative Solution of Large Linear Systems. Academic Press (1971) 157. Jones Jr., J., Lew, C.: Solutions of the Lyapunov matrix equation BX − XA = C. IEEE Trans. Autom. Control, AC 27(2), 464–466 (1982) 158. Kagstrom, B., Dooren, P.V.:A generalized state-space approach for the additive decomposition of a transfer matrix. Int. J. Numer. Linear Algebra Appl. 1(2), 165–181 (1992) 159. Kagstrom, B., Westin, L.: Generalized Schur methods with condition estimators for solving the generalized sylvester equation. IEEE Trans. Autom. Control 34(7), 745–751 (1989) 160. Kailath, T.: Linear Systems. Prentice Hall, New Jersey (1980) 161. Kalman, R.: On the general theory of control systems. IRE Trans. Autom. Control 4(3), 481–492 (1959) 162. Kalman, R., Koepcke, R.: Optimal synthesis of linear sampling control systems using gener- alized performance indices. Trans. ASME 80, 1820–1826 (1958) 163. Kantorovich, L.V., Akilov, G.P.: Functional Analysis in Normed Spaces. Macmillan, New York (1964) 164. Kautsky, J.,Nichols, N.K., Van Dooren, P.: Robust pole assignment in linear state feedback. Int. J. Control 41(5), 1129–1155 (1985) 165. Kim, H.C., Choi, C.H.: Closed-form solution of the continuous-time Lyapunov matrix equa- tion. IEE Proc.-Control Theor. Appl. 141(5), 350–356 (1994) 166. Kim, Y., Kim, H.S., Junkins, J.L.: Eigenstructure assignment algorithm for mechanical second-order systems. J. Guidance, Control Dyn. 22(5), 729–731 (1999) 167. Kleinman, D.L., Rao, P.K.: Extensions to the Bartels-Stewart algorithm for linear matrix equations. IEEE Trans. Autom. Control, AC 23(1), 85–87 (1978) 168. Kressner, D., Schroder, C., Watkins, D.S.: Implicit QR algorithms for palindromic and even eigenvalue problems. Numer. Algorithms 51, 209–238 (2009) 169. Kwon, B., Youn, M.: Eigenvalue-generalized eigenvector assignment by output feedback. IEEE Trans. Autom. Control 32(5), 417–421 (1987) 170. Lai, Y.S.: An algorithm for solving the matrix polynomial equation A(s)X(s) + B(s)Y(s) = C(s). IEEE Trans. Circ. Syst. 36(8), 1087–1089 (1989) 171. Lancaste, P.: Explicit solutions of linear matrix equations. SIAM Rev. 12(4), 544–566 (1970) 172. 
Lancaster, P., Tismenetsky, M.: The Theory of Matrices, 2nd edn. Academic Press, London, United Kingdom (1998) 173. Lee, C.H., Kung, F.C.: Upper and lower matrix bounds of the solutions for the continuous and discrete Lyapunov equations. J. Franklin Inst. 334B(4), 539–546 (1997) 174. Lewis, F.L.: Further remarks on the Cayley-Hamilton theorem and Leverrier’s method for matrix pencil (sE − A). IEEE Trans. Autom. Control 31(9), 869–870 (1986)

175. Li, C.J., Ma, G.F.: Optimal Control (In Chinese). Science Press, Beijing (2011) 176. Li, H., Gao, H., Shi, P., Zhao, X.: Fault-tolerant control of markovian jump stochastic systems via the augmented sliding mode observer approach. Automatica 50(7), 1825–1834 (2014) 177. Liao, A.P., Lei, Y.: Least-squares solution with the minimum-norm for the matrix equation (AXB, GXH) = (C, D). Comput. Math. Appl. 50(3–4), 539–549 (2005) 178. Liu, Y., Wang, D., Ding, F.: Least-squares based iterative algorithms for identifying Box- Jenkins models with finite measurement data. Digit. Sig. Process. 20(5), 1458–1467 (2010) 179. Liu, Y., Xiao, Y., Zhao, X.: Multi-innovation stochastic gradient algorithm for multiple-input single-output systems using the auxiliary model. Appl. Math. Comput. 215(4), 1477–1483 (2009) 180. Liu, Y.H.: Ranks of solutions of the linear matrix equation AX + YB = C. Comput. Math. Appl. 52(6–7), 861–872 (2006) 181. Liu, Y.J., Yu, L., Ding, F.: Multi-innovation extended stochastic gradient algorithm and its performance analysis. Circ. Syst. Sig. Process. 29(4), 649–667 (2010) 182. Lu, T.: Solution of the matrix equation AX − XB = C. Computing 37, 351–355 (1986) 183. Ma, E.C.: A finite series solution of the matrix equation AX − XB = C. SIAM J. Appl. Math. 14(3), 490–495 (1966) 184. Maligranda, L.: A simple proof of the Holder and the minkowski inequality. Am. Math. Mon. 102(3), 256–259 (1995) 185. Mariton, M.: Jump Linear Systems in Automatic Control. Marcel Dekker, New York (1990) 186. Mertzios, B.G.: Leverrier’s algorithm for singular systems. IEEE Trans. Autom. Control 29(7), 652–653 (1984) 187. Meyer, C.D.: Matrix Analysis and Applied Linear Algebra. SIAM (2000) 188. Miller, D.F.: The iterative solution of the matrix equation XA + BX + C = 0. Linear Algebra Appl. 105, 131–137 (1988) 189. Misra, P., Quintana, E.S., Van Dooren, P.M.: Numerically reliable computation of character- istic polynomials. In: Proceedings of the 1995 American Control Conference, USA, 1995, pp. 4025–4029 (1995) 190. Mitra, S.K.: The matrix equation AXB + CXD = E. SIAM J. Appl. Math. 32(4), 823–825 (1977) 191. Mom, T., Fukuma, N., Kuwahara, M.: Eigenvalue bounds for the discrete Lyapunov matrix equation. IEEE Trans. Autom. Control, AC 30(9), 925–926 (1985) 192. Moore, B.C.: On the flexibility offered by state feedback in multivariable systems beyond closed loop eigenvalue assignment. IEEE Trans. Autom. Control, AC 21(5), 689–692 (1976) 193. Mori, T., Fukuma, N., Kuwahara, M.: Explicit solution, eigenvalue bounds in the Lyapunov matrix equation. IEEE Trans. Autom. Control, AC 31 (7), 656–658 (1986) 194. Navarro, E., Company, R., Jodar, L.: Bessel matrix differential equations: explicit solutions of initial and two-point boundary value problems. Applicationes Mathematicae 22(1), 11–23 (1993) 195. Nijmeijer, H., Savaresi, S.M.: On approximate model-reference control of SISO discrete-time nonlinear systems. Automatica 34(10), 1261–1266 (1998) 196. Ozguler, A.B.: The equations AXB+CYD = E over a principal ideal domain. SIAM J. Matrix Anal. Appl. 12(3), 581–591 (1991) 197. Paraskevopoulos, P., Christodoulou, M., Boglu, A.: An algorithm for the computation of the transfer function matrix for singular systems. Automatica 20(2), 259–260 (1984) 198. Pavon, M., Ferrante, A.: On the georgiou-lindquist approach to constrained kullback-leibler approximation of spectral densities. IEEE Trans. Autom. Control 51(4), 639–644 (2006) 199. 
Peng, Z., Peng, Y.: An efficient iterative method for solving the matrix equation AXB + CYD = E. Numer. Linear Algebra Appl. 13(6), 473–485 (2006) 200. Penzl, T.: A cyclic low-rank Smith method for large sparse Lyapunov equations. SIAM J. Sci. Comput. 21(4), 1401–1418 (2000) 201. Qian, Y.Y., Pang, W.J.: An implicit sequential algorithm for solving coupled Lyapunov equations of continuous-time Markovian jump systems. Automatica 60(8), 245–250 (2015)

202. Qiao, Y.P., Qi, H.S., Cheng, D.Z.: Parameterized solution to a class of Sylvester matrix equa- tions. Int. J. Autom. Comput. 7(4), 479–483 (2010) 203. Qu, Z., Dorsey, J.F., Dawson, D.M.: Model reference robust control of a class of siso systems. IEEE Trans. Autom. Control 39(11), 2219–2234 (1994) 204. Ramadan, M.A., El-Danaf, T.S., Bayoumi, A.M.E.: Finite iterative algorithm for solving a class of general complex matrix equation with two unknowns of general form. Appl. Comput. Math. 3(5), 273–284 (2014) 205. Ramadan, M.A., El-Shazly, N.M., Selim, B.I.: A Hessenberg method for the numerical solu- tions to types of block sylvester matrix equations. Math. Comput. Model. 52(9–10), 1716– 1727 (2010) 206. Rao, C.R.: A note on a generalized inverse of a matrix with applications to problems in mathematical statistics. J. Roy. Statist. Soc. Ser. B. 24(1), 152–158 (1962) 207. Regalia, P.A., Mitra, S.K.: Kronecker products, unitary matrices and signal processing appli- cations. SIAM Rev. 31(4), 586–613 (1989) 208. Rohde, C.A.: Generalized inverses of partitioned matrices. J. Soc. Indust. Appl. Math. 13(4), 1033–1035 (1965)   209. Rohn, J.: Computing the norm A ∞,1 is NP-hard. Linear Multilinear Algebra 47(3), 195–204 (2000) 210. Roth, W.E.: The equations AX − YB = C and AX − XB = C in matrices. Proc. Am. Math. Soc. 3(3), 392–396 (1952) 211. Saberi, A., Stoorvogel, A.A., Sannuti, P.: Control of Linear Systems with Regulation and Input Constraints. Springer, Berlin (1999) 212. Sadkane, M.: Estimates from the discrete-time Lyapunov equation. Appl. Math. Lett. 16(3), 313–316 (2003) 213. Sakthivel, R., Raja, R., Anthoni, S.M.: Linear matrix inequality approach to stochastic stability of uncertain delayed bam neural networks. IMA J. Appl. Math. 78(6), 1156–1178 (2013) 214. Shafai, B., Bhattacharyya, S.P.: An algorithm for pole assignment in high order multivariable systems. IEEE Trans. Autom. Control 33(9), 870–876 (1988) 215. Shi, Y.,Ding, F., Chen, T.: 2-norm based recursive design of transmultiplexers with designable filter length. Circ. Syst. Sig. Process. 25(4), 447–462 (2006) 216. Shi, Y., Ding, F., Chen, T.: Multirate crosstalk identification in xDSL systems. IEEE Trans. Commun. 54(10), 1878–1886 (2006) 217. Smith, P.G.: Numerical solution of the matrix equation AX + XAT + B = 0. IEEE Trans. Autom. Control, AC 16(3), 278–279 (1971) 218. Smith, R.A.: Matrix equation XA + BX = C. SIAM J. Appl. Math. 16(1), 198–201 (1968) 219. Song, C., Chen, G.: An efficient algorithm for solving extended Sylvester-conjugate transpose matrix equations. Arab J. Math. Sci. 17(2), 115–134 (2011) 220. Sorensen D. C., Zhou, Y.: Direct methods for matrix Sylvester and Lyapunov equations. J. Appl. Math. 6, 277–303 (2003) 221. Sreeram, V., Agathoklis, P.: Solution of Lyapunov equation with system matrix in companion form. IEE Proc.-D: Control Theor. Appl. 138(6), 529–534 (1991) 222. Sreeram, V., Agathoklis, P., Mansour, M.: The generation of discrete-time q-Markov covers via inverse solution of the Lyapunov equation. IEEE Trans. Autom. Control 39(2), 381–385 (1994) 223. Sun, J., Olbrot, A.W., Polis, M.P.: Robust stabilisation and robust performance using model reference control and modelling error compensation. IEEE Trans. Autom. Control 39(3), 630–635 (1994) 224. Syrmos, V.L., Lewis, F.L.: Output feedback eigenstructure assignment using two Sylvester equations. IEEE Trans. Autom. Control 38(3), 495–499 (1993) 225. 
Teran, F.D., Dopico, F.M., Guillery, N., Montealegre, D., Reyes, N.: The solution of the equation AX + X∗B = 0. Linear Algebra Appl. 438, 2817–2860 (2013) 226. Tian, Y.: Ranks of solutions of the matrix equation AXB = C. Linear Multilinear Algebra 51(2), 111–125 (2003)

227. Tippett, M.K., Marchesin, D.: Upper bounds for the solution of the discrete algebraic Lyapunov equation. Automatica 35(8), 1485–1489 (1999) 228. Tismenetsky, M.: Factorizations of hermitian block hankel matrices. Linear Algebra Appl. 166, 45–63 (1992) 229. Tong, L., Wu, A.G., Duan, G.R.: Finite iterative algorithm for solving coupled Lyapunov equations appearing in discrete-time Markov jump linear systems. IET Control Theor. Appl. 4(10), 2223–2231 (2010) 230. Tsiligkaridis T., Hero A. O.: Covariance estimation in high dimensions via Kronecker product expansions. IEEE Trans. Sig. Process. 61(21), 5347–5360 (2013) 231. Tsui, C.C.: A complete analytical solution to the equation TA −FT = LC and its applications. IEEE Trans. Autom. Control 32(8), 742–744 (1987) 232. VanLoan, C.F.: The ubiquitous Kronecker product. J. Comput. Appl. Math. 123(1–2), 85–100 (2000) 233. Wan, K.C., Sreeram, V.: Solution of the bilinear matrix equation using Astrom-Jury-Agniel algorithm. IEE Proc.-Control Theor. Appl. 142(6), 603–610 (1995) 234. Wang, G., Lin, Y.: A new extension of Leverrier’s algorithm. Linear Algebra Appl. 180, 227–238 (1993) 235. Wang, L., Xie, L., Wang, X.: The residual based interactive stochastic gradient algorithms for controlled moving average models. Appl. Math. Comput. 211(2), 442–449 (2009) 236. Wang Q.W.: A system of matrix equations and a linear matrix equation over arbitrary regular rings with identity. Linear Algebra Appl. 384, 43–54 (2004) 237. Wang, Q., Lam, J., Wei, Y., Chen, T.: Iterative solutions of coupled discrete Markovian jump Lyapunov equations. Comput. Math. Appl. 55(4), 843–850 (2008) 238. Wimmer, H.K.: The matrix equation X −AXB = C and an analogue of Roth’s theorem. Linear Algebra Appl. 109, 145–147 (1988)  i 239. Wimmer, H.K.: Explicit solutions of the matrix equation A XDi = C. SIAM J. Matrix Anal. Appl. 13(4), 1123–1130 (1992) 240. Wimmer, H.K.: Consistency of a pair of generalized Sylvester equations. IEEE Trans. Autom. Control 39(5), 1014–1016 (1994) 241. Wimmer, H.K.: Roth’s theorems for matrix equations with symmetry constraints. Linear Algebra Appl. 199, 357–362 (1994) 242. Wu, A.G.: Explicit solutions to the matrix equation EXF − AX = C. IET Control Theor. Appl. 7(12), 1589–1598 (2013) 243. Wu, A.G., Chang, M.F., Zhang, Y.: Model reference control for discrete-time antilinear sys- tems. In: Proceedings of the 33rd Chinese Control Conference, Nanjing, 2014, pp. 6153–6157 (2014) 244. Wu, A.G., Duan, G.R.: Design of generalized PI observers in descriptor linear systems. IEEE Trans. Circ. Syst. I: Regular Papers 53(12), 2828–2837 (2006) 245. Wu, A.G., Duan, G.R.: Design of PI observers for continuous-time descriptor linear systems. IEEE Trans. Syst. Man Cybern.-Part B: Cybern. 36(6), 1423–1431 (2006) 246. Wu, A.G., Duan, G.R.: IP observer design for descriptor linear systems. IEEE Trans. Circ. Syst.-II: Express. Briefs 54(9), 815–819 (2007) 247. Wu, A.G., Duan, G.R.: On solution to the generalized sylvester matrix equation AV + BW = EVF. IET Control Theor. Appl. 1(1), 402–408 (2007) 248. Wu, A.G., Duan, G.R.: New iterative algorithms for solving coupled Markovian jump Lya- punov equations. IEEE Trans. Autom. Control 60(1), 289–294 (2015) 249. Wu, A.G., Duan, G.R, Dong, J., Fu, Y.M.: Design of proportional-integral observers for discrete-time descriptor linear systems. IET Control Theor. Appl. 3(1), 79–87 (2009) 250. Wu, A.G., Duan, G.R., Feng, G., Liu, W.: On conjugate product of complex polynomials. Appl. Math. Lett. 24(5), 735–741 (2011) 251. 
Wu, A.G., Duan, G.R., Fu, Y.M.: Generalized PID observer design for descriptor linear systems. IEEE Trans. Syst. Man Cybern.-Part B: Cybern. 37(5), 1390–1395 (2007) 252. Wu, A.G., Duan, G.R., Fu, Y.M., Wu, W.J.: Finite iterative algorithms for the generalized Sylvester-conjugate matrix equation AX + BY = EXF + S. Computing 89, 147–170 (2010)

253. Wu, A.G., Duan, G.R., Hou, M.Z.: Parametric design approach for proportional multiple- integral derivative observers in descriptor linear systems. Asian J. Control 14(6), 1683–1689 (2012) 254. Wu, A.G., Duan, G.R., Liu, W.: Proportional multiple-integral observer design for continuous- time descriptor linear systems. Asian J. Control 14(2), 476–488 (2012) 255. Wu, A.G., Duan, G.R., Liu, W.: Implicit iterative algorithms for continuous Markovian jump Lyapunov matrix equations. IEEE Trans. Autom. Control, page to appear (2016) 256. Wu, A.G., Duan, G.R., Liu, W., Sreeram, V.: Controllability and stability of discrete-time antilinear systems. In: Proceedings of 3rd Australian Control Conference, Perth, 2013, pp. 403–408 (2013) 257. Wu, A.G., Duan, G.R., Xue, Y.: Kronecker maps and Sylvester-polynomial matrix equations. IEEE Trans. Autom. Control 52(5), 905–910 (2007) 258. Wu, A.G., Duan, G.R., Yu, H.H.: On solutions of the matrix equations XF − AX = C and XF − AX¯ = C. Appl. Math. Comput. 183(2), 932–941 (2006) 259. Wu, A.G., Duan, G.R., Yu, H.H.: Impulsive-mode controllablizability revisited for descriptor linear systems. Asian J. Control 11(3), 358–365 (2009) 260. Wu, A.G., Duan, G.R., Zhao, S.M.: Impulsive-mode controllablisability in descriptor linear systems. IET Control Theor. Appl. 1(3), 558–563 (2007) 261. Wu, A.G., Duan, G.R., Zhao, Y., Yu, H.H.: A revisit to I-controllablisability for descriptor linear systems. IET Control Theor. Appl. 1(4), 975–978 (2007) 262. Wu, A.G., Duan, G.R., Zhou, B.: Solution to generalized Sylvester matrix equations. IEEE Trans. Autom. Control 53(3), 811–815 (2008) 263. Wu, A.G., Feng, G., Duan, G.R.: Proportional multiple-integral observer design for discrete- time descriptor linear systems. Int. J. Syst. Sci. 43(8), 1492–1503 (2012) 264. Wu, A.G., Feng, G., Duan, G.R., Liu, W.: Iterative solutions to the Kalman-Yakubovich- conjugate matrix equation. Appl. Math. Comput. 217(9), 4427–4438 (2011) 265. Wu, A.G., Feng, G., Duan, G.R., Wu, W.J.: Iterative solutions to coupled Sylvester-conjugate matrix equations. Comput. Math. Appl. 60(1), 54–66 (2010) 266. Wu, A.G., Feng, G., Hu, J., Duan, G.R.: Closed-form solutions to the nonhomogeneous Yakubovich-conjugate matrix equation. Appl. Math. Comput. 214(2), 442–450 (2009) 267. Wu, A.G., Feng, G., Liu, W., Duan, G.R.: The complete solution to the Sylvester-polynomial- conjugate matrix equations. Math. Comput. Modell. 53(9–10), 2044–2056 (2011) 268. Wu, A.G., Feng, G., Wu, W.J., Duan, G.R.: Closed-form solutions to Sylvester-conjugate matrix equations. Comput. Math. Appl. 60(1), 95–111 (2010) 269. Wu, A.G., Fu, Y.M., Duan, G.R.: On solutions of matrix equations V − AVF = BW and V − AVF¯ = BW. Math. Comput. Modell. 47(11–12), 1181–1197 (2008) 270. Wu, A.G., Li, B., Zhang, Y., Duan, G.R.: Finite iterative solutions to coupled Sylvester- conjugate matrix equations. Appl. Math. Modell. 35(3), 1065–1080 (2011) 271. Wu, A.G., Liu, W., Duan, G.R.: On the conjugate product of complex polynomial matrices. Math. Comput. Modell. 53(9–10), 2031–2043 (2011) 272. Wu, A.G., Lv, L., Duan, G.R.: Iterative algorithms for solving a class of complex conjugate and transpose matrix equations. Appl. Math. Comput. 217, 8343–8353 (2011) 273. Wu, A.G., Lv, L., Duan, G.R., Liu, W.: Parametric solutions to Sylvester-conjugate matrix equations. Comput. Math. Appl. 62(9), 3317–3325 (2011) 274. Wu, A.G., Lv, L., Hou, M.Z.: Finite iterative algorithms for a common solution to a group of complex matrix equations. Appl. Math. Comput. 
218(4), 1191–1202 (2011) 275. Wu, A.G., Lv, L., Hou, M.Z.: Finite iterative algorithms for extended Sylvester-conjugate matrix equations. Math. Comput. Modell. 54(9–10), 2363–2384 (2011) 276. Wu, A.G., Qian, Y.Y., Liu, W.: Stochastic stability for discrete-time antilinear systems with Markovian jumping parameters. IET Control Theor. Appl. 9(9), 1399–1410 (2015) 277. Wu, A.G., Qian, Y.Y., Liu, W., Sreeram, V.: Linear quadratic regulation for discrete-time antilinear systems: an anti-Riccati matrix equation approach. J. Franklin Inst. 353(5), 1041–1060 (2016)

278. Wu, A.G., Sun, Y., Feng, G.: Closed-form solution to the non-homogeneous generalised Sylvester matrix equation. IET control Theory Appl. 4(10), 1914–1921 (2010) 279. Wu, A.G., Tong, L., Duan, G.R.: Finite iterative algorithm for solving coupled Lyapunov equations appearing in continuous-time Markov jump linear systems. Int. J. Syst. Sci. 44(11), 2082–2093 (2013) 280. Wu, A.G., Wang, H.Q., Duan, G.R.: On matrix equations X − AXF = C and X − AXF¯ = C. J. Comput. Appl. Math. 230, 690–698 (2009) 281. Wu, A.G., Zeng, X., Duan, G.R., Wu, W.J.: Iterative solutions to the extended Sylvester- conjugate matrix equations. Appl. Math. Comput. 217(1), 130–142 (2010) 282. Wu, A.G., Zhang, E., Liu, F.: On closed-form solutions to the generalized Sylvester-conjugate matrix equation. Appl. Math. Comput. 218, 9730–9741 (2012) 283. Wu, Y., Duan, G.R.: Design of dynamical compensators for matrix second-order linear sys- tems: A parametric approach. In: Proceedings of the 44th IEEE Conference on Decision and Control, SPAIN, 2005, pp. 7253–7257 (2005) 284. Xiao, C.S., Feng, Z.M., Shan, X.M.: On the solution of the continuous-time Lyapunov matrix equation in two canonical forms. IEE Proc.-D, Control Theor. Appl. 139(3), 286–290 (1992) 285. Xie, L., Ding, J., Ding, F.: Gradient based iterative solutions for general linear matrix equa- tions. Comput. Math. Appl. 58(7), 1441–1448 (2009) 286. Xing, W., Zhang, Q., Zhang, J., Wang, Q.: Some geometric properties of Lyapunov equation and LTI system. Automatica 37(2), 313–316 (2001) 287. Xiong, Z., Qin, Y., Yuan, S.: The maximal and minimal ranks of matrix expression with applications. J. Inequal. Appl. 54 accepted (2012) 288. Xu, G., Wei, M., Zheng, D.: On solutions of matrix equation AXB+CYD = F. Linear Algebra Appl. 279(1–3), 93–109 (1998) 289. Xu, S., Cheng, M.: On the solvability for the mixed-type Lyapunov equation. IMA J. Appl. Math. 71(2), 287–294 (2006) 290. Yamada, M., Zun, P.C., Funahashi, Y.: On solving Diophantine equations by real matrix manipulation. IEEE Trans. Autom. Control 40(1), 118–122 (1995) 291. Yang, C., Liu, J., Liu, Y.: Solutions of the generalized Sylvester matrix equation and the application in eigenstructure assignment. Asian J. Control 14(6), 1669–1675 (2012) 292. Yu, H.H., Duan, G.R.: ESA in high-order linear systems via output feedback. Asian J. Control 11(3), 336–343 (2009) 293. Yu, H.H., Duan, G.R.: ESA in high-order descriptor linear systems via output feedback. Int. J. Control, Autom. Syst. 8(2), 408–417 (2010) 294. Yu, H.H., Duan, G.R., Huang, L.: Asymptotically tracking in high-order descriptor linear systems. In: Proceedings of the 27th Chinese Control Conference, Kunming, China, 2008, pp. 762–766 (2008) 295. Zhang, H., Duan, G.R., Xie, L.: Linear quadratic regulation for linear time-varying systems with multiple input delays. Automatica 42, 1465–1476 (2006) 296. Zhang, L., Zhu, A.F., Wu, A.G., Lv, L.L.: Parametric solutions to the regulator-conjugate matrix equation. J. Ind. Manage. Optim. (to appear) (2015) 297. Zhang, X.: Matrix Analysis and Applications. Tsinghua University Press, Beijing (2004) 298. Zhang, X., Thompson, S., Duan, G.R.: Full-column rank solutions of the matrix equation AV = EVJ. Appl. Math. Comput. 151(3), 815–826 (2004) 299. Zhao, H., Zhang, H., Wang, H., Zhang, C.: Linear quadratic regulation for discrete-time systems with input delay: spectral factorization approach. J. Syst. Sci. Complex. 21(1), 46–59 (2008) 300. 
Zhou, B., Duan, G.R.: Parametric solutions to the generalized Sylvester matrix equation AX − XF = BY and the regulator equation AX − XF = BY + R. Asian J. Control 9(4), 475–483 (2007) 301. Zhou, B., Duan, G.R.: An explicit solution to the matrix equation AX − XF = BY. Linear Algebra Appl. 402, 345–366 (2005) 302. Zhou, B., Duan, G.R.: An explicit solution to polynomial matrix right coprime factorization with application in eigenstructure assignment. J. Control Theor. Appl. 4(2), 147–154 (2006)

303. Zhou, B., Duan, G.R.: A new solution to the generalized Sylvester matrix equation AV − EVF = BW. Syst. Control Lett. 55(3), 193–198 (2006) 304. Zhou, B., Duan, G.R.: Parametric approach for the normal Luenberger function observer design in second-order linear systems. In: Proceedings of the 45th IEEE Conference on Decision and Control, San Diego, CA, USA, 2006, pp. 1423–1428 (2006) 305. Gershon, E., Shaked, U.: Applications. In: Gershon, E., Shaked, U. (eds.) Advanced Topics in Control and Estimation of State-multiplicative Noisy Systems. LNCIS, vol. 439, pp. 201–216. Springer, Heidelberg (2013) 306. Zhou, B., Duan, G.R., Li, Z.Y.: Gradient based iterative algorithm for solving coupled matrix equations. Syst. Control Lett. 58(5), 327–333 (2009) 307. Zhou, B., Duan, G.R., Lin, Z.: A parametric Lyapunov equation approach to the design of low gain feedback. IEEE Trans. Autom. Control 53(6), 1548–1554 (2008) 308. Zhou, B., Lam, J., Duan, G.R.: Toward solution of matrix equation X = Af(X)B + C. Linear Algebra Appl. 435(6), 1370–1398 (2011) 309. Zhou, B., Lam, J., Duan, G.R.: Convergence of gradient-based iterative solution of coupled Markovian jump Lyapunov equations. Comput. Math. Appl. 56(12), 3070–3078 (2008) 310. Zhou, B., Lam, J., Duan, G.R.: On Smith-type iterative algorithms for the Stein matrix equation. Appl. Math. Lett. 22(7), 1038–1044 (2009) 311. Zhou, B., Li, Z.Y., Duan, G., Wang, Y.: Weighted least squares solutions to general coupled Sylvester matrix equations. J. Comput. Appl. Math. 224(2), 759–776 (2009) 312. Zhou, B., Li, Z.Y., Duan, G.R., Wang, Y.: Solutions to a family of matrix equations by using the Kronecker matrix polynomials. Appl. Math. Comput. 212, 327–336 (2009) 313. Zhou, B., Lin, Z.: Parametric Lyapunov equation approach to stabilization of discrete-time systems with input delay and saturation. IEEE Trans. Circ. Syst.-I: Regular Papers 58(11), 2741–2754 (2011) 314. Zhou, B., Lin, Z., Duan, G.: A parametric Lyapunov equation approach to low gain feedback design for discrete-time systems. Automatica 45(1), 238–244 (2009) 315. Zhou, B., Lin, Z., Duan, G.: Properties of the parametric Lyapunov equation-based low-gain design with applications in stabilization of time-delay systems. IEEE Trans. Autom. Control 54(7), 1698–1704 (2009) 316. Zhou, K., Doyle, J.C., Glover, K.: Robust and Optimal Control. Prentice Hall, New Jersey (1996) 317. Zhu, Q., Hu, G.D., Zeng, L.: Estimating the spectral radius of a real matrix by discrete Lyapunov equation. J. Differ. Equ. Appl. 17(4), 603–611 (2011)

Index

A
Absolute homogeneity, 52
Adjoint matrix, 233, 244, 261, 270
Algebraic anti-Riccati matrix equation, 464
Alternating power
    left alternating power, 99
    right alternating power, 99
Annihilating polynomial, 10
Anti-Lyapunov matrix equation, 410
Anti-Riccati matrix equation, 459, 462, 467
Antilinear mapping, 405

B
Bartels-Stewart algorithm, 10
Bezout identity, 10

C
Cayley-Hamilton identity, 44
Cayley-Hamilton theorem, 10
Characteristic polynomial, 6, 233, 244, 250, 261, 300
Column sum norm, 57
Common left divisor, 362
Common right divisor, 362
Companion form, 6
Companion matrix, 11, 42
Con-controllability matrix, 339, 342
Con-Kalman-Yakubovich matrix equation, 31, 241, 294, 344
Con-observability matrix, 339, 342
Con-Sylvester mapping, 81
Con-Sylvester matrix equation, 250, 254, 336, 441, 446
Con-Sylvester-polynomial matrix, 394
Con-Sylvester sum, 389, 392, 398
Con-Yakubovich matrix equation, 31, 164, 259, 263, 265, 343
Condiagonalization, 74
Coneigenvalue, 73
Coneigenvector, 73
Conequivalence, 373
Conjugate gradient, 88
Conjugate gradient method, 5
Conjugate product, 356, 368, 392
Consimilarity, 30, 73, 227, 440, 441
Continuous-time Lyapunov matrix equation, 8
Controllability matrix, 7, 10, 12, 225, 276, 280
Controllable canonical form, 6, 8
Convergence
    linear convergence, 107
    quadratic convergence, 107
    superlinear convergence, 107
Coordinate, 80
Coprimeness
    left coprimeness, 365
    right coprimeness, 365
Coupled anti-Lyapunov matrix equation, 416, 417, 423
Coupled con-Sylvester matrix equation, 135
Coupled Lyapunov matrix equation, 27
Cramer's rule, 2


D
Degrees of freedom, 262, 277, 294, 298, 323, 338, 345
Diophantine polynomial matrix equation, 339
Discrete minimum principle, 451
Discrete-time antilinear system, 440
Discrete-time Lyapunov matrix equation, 5, 8
Division with remainder, 362
Divisor
    left divisor, 359
    right divisor, 359
Dynamic programming principle, 454

E
Eigenstructure, 440
Elementary column transformation, 373
Elementary row transformation, 371
Euclidean space, 53
Explicit solution, 10, 233, 244
Extended con-Sylvester matrix equation, 121, 179, 267, 307, 447

F
Feedback stabilizing gain, 444
Feedforward compensation gain, 444
Frobenius norm, 55, 66, 163

G
Gauss-Seidel iteration, 3, 4
Generalized con-Sylvester matrix equation, 163, 272, 321, 328, 330, 332
Generalized eigenstructure assignment, 440
Generalized inverse, 2, 15
Generalized Leverrier algorithm, 49, 244, 248, 261, 265, 299, 311, 315, 332
Generalized Sylvester matrix equation, 18, 174, 270
Gradient, 89
Greatest common left divisor, 362, 363
Greatest common right divisor, 362

H
Hankel matrix, 12
Hessian matrix, 89
Hierarchical principle, 121, 150
Hölder inequality, 54
Homogeneous con-Sylvester matrix equation, 276

I
Image, 81
Inner product, 30, 83
Inverse, 371

J
Jacobi iteration, 3, 4
Jordan form, 21

K
Kalman-Yakubovich matrix equation, 9, 12, 13, 114
Kernel, 81
Kronecker product, 12, 36, 41, 125
    left Kronecker product, 36

L
Laurent expansion, 15
Left coprimeness, 393
Leverrier algorithm, 42, 233, 238, 250, 256, 279
Linear quadratic regulation, 450
Lyapunov matrix equation, 5

M
Markovian jump antilinear system, 410
Matrix exponential function, 7
Maximal singular value norm, 58
Minkowski inequality, 55
Mixed-type Lyapunov matrix equation, 16
Model reference tracking control, 443
Multiple
    left multiple, 359
    right multiple, 359

N
Nonhomogeneous con-Sylvester matrix equation, 284, 289
Nonhomogeneous con-Sylvester-polynomial matrix equation, 397
Nonhomogeneous con-Yakubovich matrix equation, 296, 299
Norm, 52
    1-norm, 57
    2-norm, 58, 66, 142
Normal con-Sylvester mapping, 230
Normal con-Sylvester matrix equation, 30, 164, 172, 226, 230, 238, 277, 293

Normal Sylvester matrix equation, 9, 11, 226, 229, 233, 281, 282
Normed space, 52

O
Observability matrix, 10, 12, 225, 276, 280
Operator norm, 56

P
Permutation matrix, 41
Polynomial matrix, 9

Q
Quadratic performance index, 450

R
Real basis, 80, 202
Real dimension, 80, 86, 172, 191, 202
Real inner product, 84, 172, 191
Real inner product space, 84, 172, 191, 202
Real linear combination, 77
Real linear mapping, 81, 227, 228
Real linear space, 76
Real linear subspace, 77
Real representation, 31, 63, 123, 139, 230, 231, 242, 259, 262, 300, 335, 426
Relaxation factor, 4
Roth matrix equation, 17
Roth's removal rule, 9
Roth's theorem, 15
Routh table, 6
Row sum norm, 59

S
Schur-Cohn matrix, 6
Schur decomposition, 9
Second-order generalized Sylvester matrix equation, 24
Singular value, 49
Singular value decomposition (SVD), 49, 50, 52
Singular vector
    left-singular vector, 49
    right-singular vector, 49
Smith accelerative iteration, 13
Smith iteration, 8, 103
Smith normal form, 15, 275, 285, 295, 299, 324, 376
Solvability, 229, 241
Squared Smith iteration, 108
Stochastic Lyapunov function, 411, 417
Stochastic stability, 411, 417
Successive over-relaxation, 4
Sylvester matrix equation, 250
Sylvester-observer matrix equation, 18
Symmetric operator matrix, 225
Symmetry, 84

T
Toeplitz matrix, 12
Transformation approach, 3, 28
Triangle inequality, 52

U
Unimodular matrix, 373, 392, 398
Unitary matrix, 38, 50, 62
Upper Hankel matrix, 11

V
Vectorization, 39

Y
Young's inequality, 54