Lecture Notes in Control and Information Sciences 389

Editors: M. Thoma, F. Allgöwer, M. Morari

Rafael Bru and Sergio Romero-Vivó (Eds.)

Positive Systems

Proceedings of the third Multidisciplinary International Symposium on Positive Systems: Theory and Applications (POSTA 2009), Valencia, Spain, September 2–4, 2009

Series Advisory Board: P. Fleming, P. Kokotovic, A.B. Kurzhanski, H. Kwakernaak, A. Rantzer, J.N. Tsitsiklis

Editors Rafael Bru Instituto Universitario de Matemática Multidisciplinar, Universidad Politécnica de Valencia, Camí de Vera s/n, 46022 Valencia Spain E-mail: [email protected]

Sergio Romero-Vivó Instituto Universitario de Matemática Multidisciplinar, Universidad Politécnica de Valencia, Camí de Vera s/n, 46022 Valencia Spain E-mail: [email protected]

ISBN 978-3-642-02893-9 e-ISBN 978-3-642-02894-6

DOI 10.1007/978-3-642-02894-6

Lecture Notes in Control and Information Sciences ISSN 0170-8643

Library of Congress Control Number: Applied for

© 2009 Springer-Verlag Berlin Heidelberg

This work is subject to copyright. All rights are reserved, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilm or in any other way, and storage in data banks. Duplication of this publication or parts thereof is permitted only under the provisions of the German Copyright Law of September 9, 1965, in its current version, and permission for use must always be obtained from Springer. Violations are liable for prosecution under the German Copyright Law. The use of general descriptive names, registered names, trademarks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

Typeset & Cover Design: Scientific Publishing Services Pvt. Ltd., Chennai, India.

Printed on acid-free paper

springer.com

Preface

Nowadays, researchers in Control Theory and its Applications, as well as in Matrix Analysis, are well aware of the importance of Positive Systems. This volume contains the proceedings of the “Third Multidisciplinary Symposium on Positive Systems: Theory and Applications (POSTA09)” held in Valencia, Spain, September 2–4, 2009. At present, this is the only world congress whose main topic is focused on this field. After this third event, we think that we have established the basis of a regular triennial event, building on the work of the previous organizing committees, and we hope to have met their expectations.

This POSTA09 meeting has been organized by members of the “Universitat Politècnica de València”, who have taken their doctor’s degree in the subjects of the conference. We are grateful to all of them for their enthusiasm, effort and especially their collaboration, without which the outcome would never have been the same. We also greatly appreciate that Christian Commault put his experience at our disposal. Besides that, we wish to thank the members of the International Programme Committee and the additional referees for their outstanding work in the review process of the contributions. We are happy to say that their constructive suggestions and positive comments have improved the quality of the presentations.

We are very much obliged to the following organizations for having financially backed this congress: “Ministerio de Educación y Ciencia”, “Conselleria d’Educació de la Generalitat Valenciana”, “Universitat Politècnica de València (UPV)” and “Sociedad Española de Matemática Aplicada (SEMA)”. We extend our gratitude to the “International Linear Algebra Society (ILAS)”, “Institut Universitari de Matemàtica Multidisciplinària (IMM)” and the “Álgebra Lineal, Análisis Matricial y Aplicaciones (ALAMA)” network for having endorsed this meeting.

Finally, we would like to thank all of the participants for their attendance in spite of these hard times. Their presence here reasserts the success of this conference and we hope to see them again at future editions.

Valencia, September 2009
Rafael Bru
Sergio Romero-Vivó

Organization

Program Committee

Chairman

Rafael Bru, Universitat Politècnica de València, Spain

Members

Georges Bastin, Université Catholique de Louvain, Belgium
Luca Benvenuti, Università di Roma “La Sapienza”, Italy
Vincent Blondel, Université Catholique de Louvain, Belgium
Rafael Cantó, Universitat Politècnica de València, Spain
Carmen Coll, Universitat Politècnica de València, Spain
Bart De Moor, Katholieke Universiteit Leuven, Belgium
Alberto De Santis, Università di Roma “La Sapienza”, Italy
Elena De Santis, Università dell’Aquila, Italy
Lorenzo Farina, Università di Roma “La Sapienza”, Italy
Stephane Gaubert, INRIA, Ecole Polytechnique, France
Alessandro Giua, Università di Cagliari, Italy
Jean-Luc Gouzé, INRIA Sophia Antipolis, France
Diederich Hinrichsen, Universität Bremen, Germany
Tadeusz Kaczorek, Warsaw University of Technology, Poland
Ulrich Krause, Universität Bremen, Germany
Volker Mehrmann, Technische Universität Berlin, Germany
Ventsi Rumchev, Curtin University of Technology, Australia
Maria Pia Saccomani, Università di Padova, Italy
Elena Sánchez, Universitat Politècnica de València, Spain
Jan H. van Schuppen, CWI, Amsterdam, The Netherlands
Robert Shorten, The Hamilton Institute, Co. Kildare, Ireland
Anton A. Stoorvogel, University of Twente, The Netherlands
Ana M. Urbano, Universitat Politècnica de València, Spain
Maria Elena Valcher, Università di Padova, Italy
Paul Van Dooren, Université Catholique de Louvain, Belgium
Joseph Winkin, Université Notre-Dame de la Paix, Belgium

Organizing Committee

Begoña Cantó, Universitat Politècnica de València, Spain
Rafael Cantó, Universitat Politècnica de València, Spain
Beatriz Ricarte, Universitat Politècnica de València, Spain
Sergio Romero-Vivó, Universitat Politècnica de València, Spain

Additional Referees

Esteban Bailo
Gregory Batt
Begoña Cantó
Bart De Schutter
Zong Woo “Victor” Geem
Josep Gelonch
Bernd Heidergott
Julien Hendrickx
Onésimo Hernández-Lerma
Alexandros Karatzoglou
Jerzy Klamka
Thomas G. Kurtz
Hongwei Lin
James H. Liu
Maya Mincheva
Francisco Pedroche
Juan Manuel Peña
Harish Pillai
Beatriz Ricarte
Sergio Romero-Vivó
Bartek Roszak
Boris Shapiro
Hal L. Smith
Juan Ramon Torregrosa
Elena Virnik
Yimin Wei
Eva Zerz

Contents

Plenary Talks

Reputation Systems and Nonnegativity
Cristobald de Kerchove, Paul Van Dooren

Lyapunov Exponents and Uniform Weak Normally Repelling Invariant Sets
Paul Leonard Salceanu, Hal L. Smith

Analysis for Different Classes of Positive Systems
Maria Elena Valcher

Invited Sessions

On the Positive LQ-Problem for Linear Discrete Time Systems
Charlotte Beauthier, Joseph J. Winkin

The Importance of Being Positive: Admissible Dynamics for Positive Systems
Luca Benvenuti, Lorenzo Farina

Detectability, Observability, and Asymptotic Reconstructability of Positive Systems
Tobias Damm, Cristina Ethington

Stability Radii of Interconnected Positive Systems with Uncertain Couplings
Diederich Hinrichsen

Linear Operators Preserving the Set of Positive (Nonnegative) Polynomials
Olga M. Katkova, Anna M. Vishnyakova

Convergence to Consensus by General Averaging
Dirk A. Lorenz, Jan Lorenz

Stability and D-stability for Switched Positive Systems
Oliver Mason, Vahid S. Bokharaie, Robert Shorten

On Positivity and Stability of Linear Volterra-Stieltjes Differential Systems
Pham Huu Anh Ngoc

Eigenvalue Localization for Totally Positive Matrices
Juan Manuel Peña

Positivity Preserving Model Reduction
Timo Reis, Elena Virnik

The Minimum Energy Problem for Positive Discrete-Time Linear Systems with Fixed Final State
Ventsi Rumchev, Siti Chotijah

A Rollout Algorithm for Multichain Markov Decision Processes with Average Cost
Tao Sun, Qianchuan Zhao, Peter B. Luh

Analysis of Degenerate Chemical Reaction Networks
Markus Uhr, Hans-Michael Kaltenbach, Carsten Conradi, Jörg Stelling

k-Switching Reachability Sets of Continuous-Time Positive Switched Systems
Maria Elena Valcher

Contributed Papers

Inverse-Positive Matrices with Checkerboard Pattern
Manuel F. Abad, María T. Gassó, Juan R. Torregrosa

Some Remarks on Links between Positive Invariance, Monotonicity, Strong Lumpability and Coherency in Max-Plus Algebra
Mourad Ahmane, Laurent Truffet

Stability Analysis and Synthesis for Linear Positive Systems with Time-Varying Delays
Mustapha Ait Rami

Linear Programming Approach for 2-D Stabilization and Positivity
Mohammed Alfidi, Abdelaziz Hmamed, Fernando Tadeo

An Algorithmic Approach to Orders of Magnitude in a Biochemical System
Eric Benoît, Jean-Luc Gouzé

Structural Identifiability of Linear Singular Dynamic Systems
Begoña Cantó, Carmen Coll, Elena Sánchez

On Positivity of Discrete-Time Singular Systems and the Realization Problem
Rafael Cantó, Beatriz Ricarte, Ana M. Urbano

Multi-Point Iterative Methods for Systems of Nonlinear Equations
Alicia Cordero, José L. Hueso, Eulalia Martínez, Juan R. Torregrosa

Identifiability of Nonaccessible Nonlinear Systems
Leontina D’Angiò, Maria Pia Saccomani, Stefania Audoly, Giuseppina Bellu

Trajectory Tracking Control of a Timed Event Graph with Specifications Defined by a P-time Event Graph
Philippe Declerck, Abdelhak Guezzi

Tropical Scaling of Polynomial Matrices
Stéphane Gaubert, Meisam Sharify

Scrutinizing Changes in the Water Demand Behavior
Manuel Herrera, Rafael Pérez-García, Joaquín Izquierdo, Idel Montalvo

Characterization of Matrices with Nonnegative Group-Projector
Alicia Herrero, Francisco J. Ramírez, Néstor Thome

Robust Design of Water Supply Systems through Evolutionary Optimization
Joaquín Izquierdo, Idel Montalvo, Rafael Pérez-García, Manuel Herrera

Applications of Linear Co-positive Lyapunov Functions for Switched Linear Positive Systems
Florian Knorn, Oliver Mason, Robert Shorten

A Problem in Positive Systems Stability Arising in Topology Control
Florian Knorn, Rade Stanojevic, Martin Corless, Robert Shorten

Control of Uncertain (min,+)-Linear Systems
Euriell Le Corronc, Bertrand Cottenceau, Laurent Hardouin

On a Class of Stochastic Models of Cell Biology: Periodicity and Controllability
Ivo Marek

Implementation of 2D Strongly Autonomous Behaviors by Full and Partial Interconnections
Diego Napp Avelli, Paula Rocha

Ordering of Matrices for Iterative Aggregation - Disaggregation Methods
Ivana Pultarová

The Positive Servomechanism Problem under LQcR Control
Bartek Roszak, Edward J. Davison

Author Index

Reputation Systems and Nonnegativity

Cristobald de Kerchove and Paul Van Dooren

Abstract. We present a voting system based on an iterative method that assigns a reputation to n + m items, n objects and m raters, applying some filter to the votes. Each rater evaluates a subset of objects, leading to an n × m rating matrix with a given sparsity pattern. From this rating matrix a formula is defined for the reputation of raters and objects. We propose a natural and intuitive nonlinear formula and also provide an iterative algorithm that converges linearly to the unique vector of reputations, and this for any rating matrix. In contrast to classical outlier detection, no evaluation is discarded in this method, but each one is taken into account with different weights for the reputations of the objects. The complexity of one iteration step is linear in the number of evaluations, making our algorithm efficient for large data sets.

1 Introduction

Many measures of reputation have been proposed under the names of reputation, voting, ranking or trust systems, and they deal with various contexts ranging from the classification of football teams to the reliability of each individual in peer-to-peer systems. Surprisingly enough, the most used method for reputation on the Web amounts simply to averaging the votes. In that case, the reputation is, for instance, the average of scores represented by 5 stars in YouTube, or the percentage of positive transactions in eBay. Such a method therefore trusts each rater of the system evenly. Besides this method, many other algorithms exploit the structure of networks generated by the votes: raters and evaluated items are nodes connected by votes. A great part of these methods use efficient eigenvector based techniques or trust propagation over the network to obtain the reputation of every node [7, 9, 13–17]. They can be interpreted as a distribution of some reputation flow over the network where reputations satisfy some transitivity: you have a high reputation if you have several incoming links coming from nodes with a high reputation.

The average method, the eigenvector based techniques and trust propagation may suffer from noise in the data and bias from dishonest raters. For this reason, they are sometimes accompanied by statistical methods for spam detection [10, 19], like in the context of web pages trying to boost their PageRank scores by adding artificial incoming links [2, 8]. Detected spam is then simply removed from the data. This describes the three main strategies for voting systems: simple methods averaging votes where raters are evenly trusted, eigenvector based techniques and trust propagation where reputations directly depend on reputations of the neighbours, and finally statistical measures to classify and possibly remove some of the items.

Concerning the Iterative Filtering (IF) systems which we introduce here, we will make the following assumption:

Raters diverging often from other raters’ opinion are less taken into account.

We label this the IF-property and will formally define it later on. This property is at the heart of the filtering process and implies that all votes are taken into account, but with a continuous validation scale, in contrast with the direct deletion of outliers. Moreover, the weight of each rater depends on the distance between his votes and the reputation of the objects he evaluates: typically, weights of random raters and outliers decrease during the iterative filtering.

The main criticism one can have about the IF-property is that it discriminates against “marginal” evaluators, i.e., raters who vote differently from the average opinion for many objects. However, IF systems may have different basins of attraction, each corresponding to a group of people with a coherent opinion.

Votes, raters and objects can appear, disappear or change, making the system dynamical. This is for example the case when we consider a stream of news like in [5]: news sources and articles are ranked according to their publications over time. Nowadays, most sites driven by raters involve dynamical opinions. For instance, blogs, the site Digg and the site Flickr are good places to exchange and discuss ideas, remarks and votes about various topics ranging from political elections to photos and videos. We will see that IF systems allow us to consider evolving voting matrices and then provide time-varying reputations.

Cristobald de Kerchove and Paul Van Dooren: Université catholique de Louvain (UCL), Department of Applied Mathematics, Avenue Georges Lemaître, 4, B-1348 Louvain-la-Neuve, Belgium, e-mail: [email protected], [email protected]

R. Bru and S. Romero-Vivó (Eds.): Positive Systems, LNCIS 389, pp. 3–16. springerlink.com © Springer-Verlag Berlin Heidelberg 2009

2 Iterative Filtering Systems

We first consider the case where the votes are fixed, i.e., the voting matrix does not change over time, and all objects are evaluated by all raters, i.e., the voting matrix is full. With these assumptions, we present the main properties of IF systems and then we restrict ourselves to the natural case of quadratic IF systems where the reputations are given by a linear combination of the votes and the weights of the raters are based on the Euclidean distance between the reputations and the votes.

Let $X \in \mathbb{R}^{n \times m}$ be the voting matrix, $r \in \mathbb{R}^{n}$ be the reputation vector of the objects and $w \in \mathbb{R}^{m}$ be the weight vector of the raters. The entry $X_{ij}$ is the vote to object $i$ given by rater $j$ and the vector $x_j$, the $j$th column of $X$, represents the votes of rater $j$: $X = [x_1 \dots x_m]$. The bipartite graph formed by the objects, the raters and their votes is represented by the $n \times m$ adjacency matrix $A$, i.e., $A_{ij} = 1$ if object $i$ is evaluated by rater $j$, and $0$ otherwise. For the sake of simplicity, we assume in this section that every object has been evaluated by all raters:

$$A_{ij} = 1 \quad \text{for all } i, j. \tag{1}$$

The general case, where the bipartite graph is not complete, will be handled later. The belief divergence $d_j$ of rater $j$ is the normalized distance between his votes and the reputation vector $r$ (for a particular choice of norm):

$$d_j = \frac{1}{n}\,\|x_j - r\|^2. \tag{2}$$

Let us already remark that when the bipartite graph is not complete, i.e., Eq. (1) is not satisfied, the number of votes varies from one rater to another. Therefore the normalization of the belief divergence $d_j$ in Eq. (2) will change depending on this number. Before introducing IF systems, we define the two basic functions of these systems:

(1) the reputation function

$$F : \mathbb{R}^{m} \to \mathbb{R}^{n} : F(w) = r,$$

that gives the reputation vector depending on the weights of the raters and implicitly on the voting matrix $X$;

(2) the filtering function

$$G : \mathbb{R}^{m} \to \mathbb{R}^{m} : G(d) = w,$$

that gives the weight vector for the raters depending on the belief divergence $d$ of each rater defined in Eq. (2).

We formalize the so-called IF-property described in the introduction, which claims that raters diverging often from the opinion of other raters are less taken into account. We will make the reasonable assumption that raters with identical belief divergence receive equal weights. Hence, we can write

$$G(d) = \begin{bmatrix} g(d_1) \\ \vdots \\ g(d_m) \end{bmatrix}. \tag{3}$$

We call the scalar function $g$ the discriminant function associated with $G$.
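As a minimal sketch of Eq. (2) and Eq. (3) for a full voting matrix, the following Python computes the belief divergence of every rater and then applies one discriminant function to all of them. The function names, the toy matrix and the choice g(d) = 1 - 0.01 d are illustrative assumptions, not taken from the paper.

```python
def belief_divergence(X, r):
    """Eq. (2): d[j] = (1/n) * ||x_j - r||^2, where x_j is column j of X."""
    n, m = len(X), len(X[0])
    return [sum((X[i][j] - r[i]) ** 2 for i in range(n)) / n for j in range(m)]


def filtering(d, g):
    """Eq. (3): the same scalar discriminant function g is applied to every rater."""
    return [g(dj) for dj in d]


# Hypothetical data: 2 objects (rows) rated by 3 raters (columns).
X = [[1.0, 2.0, 9.0],
     [3.0, 4.0, 0.0]]
r = [2.0, 3.0]                               # some reputation vector
d = belief_divergence(X, r)
w = filtering(d, lambda u: 1.0 - 0.01 * u)   # assumed affine g
```

In this toy example the third rater votes far from `r`, so it gets the largest divergence and the smallest weight.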

A filtering function $G$ satisfies the IF-property if its associated discriminant function $g$ is positive and decreases with $d$. Therefore, the IF-property merely implies that a decrease in belief divergence $d_j$ for any rater $j$ corresponds to a larger weight $w_j$. Eq. (3) indicates that every rater has the same discriminant function $g$, but we could also consider personalized functions $g_j$ penalizing the raters differently. In [4], three choices of function $g$ are shown to have interesting properties:

$$g(d) = d^{-k}, \tag{4}$$
$$g(d) = e^{-kd}, \tag{5}$$
$$g(d) = 1 - kd. \tag{6}$$

All three discriminant functions $g$ are positive and decrease with $d$ for positive $k$ and therefore satisfy the IF-property. However, $k$ must be small enough to keep $g$ positive in Eq. (6) and hence to avoid negative weights.
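The three discriminant functions of Eq. (4) to Eq. (6) can be written down directly. The sketch below uses hypothetical function names and only illustrates the remark on positivity: the affine choice becomes negative as soon as k d exceeds 1.

```python
import math


def g_power(d, k):
    """Eq. (4): g(d) = d^(-k); only defined for d > 0."""
    return d ** (-k)


def g_exponential(d, k):
    """Eq. (5): g(d) = exp(-k d); positive for every d."""
    return math.exp(-k * d)


def g_affine(d, k):
    """Eq. (6): g(d) = 1 - k d; positive only while k * d < 1."""
    return 1.0 - k * d
```

For instance, with k = 0.1 the affine function is positive at d = 2 but negative at d = 20, whereas the exponential one stays positive in both cases.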

Definition 1. IF systems are systems of equations in the reputations $r^t$ of the objects and the weights $w^t$ of the raters that evolve over discrete time $t$ according to the voting matrix $X$:

$$r^{t+1} = F(w^t), \tag{7}$$
$$w^{t+1} = G(d^{t+1}) \quad \text{with} \quad d_j^{t+1} = \frac{1}{n}\,\|x_j - r^{t+1}\|^2 \tag{8}$$

for $j = 1, \dots, m$ and some initial vector of weights $w^0$.

Definition 1 does not imply any convergence properties, nor robustness to initial conditions. The system (7-8) can have several converging solutions and it allows the existence of cycles in the iterative processes. The fixed points of (7-8) satisfy

$$r^* = F(w^*), \tag{9}$$
$$w^* = G(d^*) \quad \text{with} \quad d_j^* = \frac{1}{n}\,\|x_j - r^*\|^2 \tag{10}$$

for $j = 1, \dots, m$. Let us remark that IF systems can be interpreted as a particular iterative search method to find the stable fixed points of Eq. (9-10). IF systems are a simple iterative scheme for this system, with the advantage of being easily extended to take into account dynamical voting matrices $X^t$ with $t \ge 0$. In this paper, we focus on IF systems where we fix the reputation function $F$ appearing in Eq. (7,9) and the norm $\|\cdot\|$ given in the definition of the belief divergence in Eq. (2).
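The iteration of Definition 1 can be sketched as a loop that alternates Eq. (7) and Eq. (8). In the sketch below, `if_system` is a hypothetical name, and the concrete choices passed to it (a weighted average for F and an exponential g) are assumptions made only so that the example runs; the definition itself leaves F and G open.

```python
import math


def if_system(F, g, X, w0, steps):
    """Run `steps` iterations of Eq. (7)-(8) for a full voting matrix X."""
    n, m = len(X), len(X[0])
    w, r = list(w0), None
    for _ in range(steps):
        r = F(w)                                    # Eq. (7)
        d = [sum((X[i][j] - r[i]) ** 2 for i in range(n)) / n
             for j in range(m)]                     # belief divergence, Eq. (2)
        w = [g(dj) for dj in d]                     # Eq. (8), G built from g
    return r, w


# Hypothetical data and illustrative choices of F and g.
X = [[1.0, 2.0, 9.0],
     [3.0, 4.0, 0.0]]


def weighted_average(w):
    s = sum(w)
    return [sum(x * wj for x, wj in zip(row, w)) / s for row in X]


r, w = if_system(weighted_average, lambda d: math.exp(-0.1 * d), X, [1.0] * 3, 1)
```

With equal initial weights, the first reputation vector is simply the row-wise mean of the votes.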

Definition 2. Quadratic IF systems are IF systems where the reputation function $F$ and the belief divergence are respectively given by

$$F(w) = X\,\frac{w}{\mathbf{1}^T w}, \tag{11}$$
$$d = \frac{1}{n}\,(X^T - \mathbf{1} r^T)^{\circ 2}\,\mathbf{1}, \tag{12}$$

where $\mathbf{1}$ is the vector of all 1’s and $(X^T - \mathbf{1} r^T)^{\circ 2}$ is the componentwise product $(X^T - \mathbf{1} r^T) \circ (X^T - \mathbf{1} r^T)$.

In that definition, the reputation function $F(w)$ is naturally given by taking the weighted average of the votes, and the belief divergence $d$ (given in matrix form) is defined using the Euclidean norm. Therefore Eq. (12) are quadratic equations in $r$ and amount to considering an estimate of the variances of the votes for every rater according to a given reputation vector $r$. For any positive vector $w$, the reputation vector $r$ then belongs to the polytope

$$P = \Big\{ r \in \mathbb{R}^n \;\Big|\; r = \sum_{j=1}^{m} \lambda_j x_j \ \text{with} \ \sum_{j=1}^{m} \lambda_j = 1 \ \text{and} \ \lambda_j \ge 0 \Big\}. \tag{13}$$

From Eq. (11), the iterations and the fixed point in Eq. (7,9) are given by quadratic equations in $r$ and $w$:

$$r^{t+1}\,(\mathbf{1}^T w^t) = X w^t, \tag{14}$$
$$r^{*}\,(\mathbf{1}^T w^*) = X w^*. \tag{15}$$

The next theorem establishes the correspondence between the iterations of quadratic IF systems and some steepest descent methods minimizing some energy function. The fixed points in Eq. (14,15) are then the stationary points of that energy function.

Theorem 1. (see [4]) The fixed points of quadratic IF systems with integrable discriminant function $g$ are the stationary points of the energy function

$$E(r) = \sum_{j=1}^{m} \int_0^{d_j(r)} g(u)\,du, \tag{16}$$

where $d_j$ is the belief divergence of rater $j$ that depends on $r$. Moreover, one iteration step in quadratic IF systems corresponds to a steepest descent direction with a particular step size:

$$r^{t+1} = r^t - \alpha^t\,\nabla_r E(r^t), \tag{17}$$

with $\alpha^t = \dfrac{n}{2\,(\mathbf{1}^T w^t)}$.
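The second claim of Theorem 1 can be checked numerically. The sketch below assumes the exponential discriminant function of Eq. (5), for which each integral in Eq. (16) has the closed form (1 - exp(-K d_j)) / K, and approximates the gradient by central differences. The value of `K`, the function names and the test point are illustrative assumptions.

```python
import math

K = 0.05  # illustrative parameter of g(d) = exp(-K d), Eq. (5)


def divergences(X, r):
    n, m = len(X), len(X[0])
    return [sum((X[i][j] - r[i]) ** 2 for i in range(n)) / n for j in range(m)]


def energy(X, r):
    """Eq. (16) for g(u) = exp(-K u): each integral is (1 - exp(-K d_j)) / K."""
    return sum((1.0 - math.exp(-K * dj)) / K for dj in divergences(X, r))


def numerical_gradient(X, r, h=1e-6):
    """Central finite-difference approximation of the gradient of E at r."""
    grad = []
    for i in range(len(r)):
        up = r[:i] + [r[i] + h] + r[i + 1:]
        down = r[:i] + [r[i] - h] + r[i + 1:]
        grad.append((energy(X, up) - energy(X, down)) / (2.0 * h))
    return grad


def if_step(X, r):
    """One quadratic IF step: weights from r, then the weighted average of Eq. (11)."""
    w = [math.exp(-K * dj) for dj in divergences(X, r)]
    s = sum(w)
    return [sum(x * wj for x, wj in zip(row, w)) / s for row in X], w


def gradient_step(X, r):
    """Eq. (17) with step size alpha = n / (2 * 1^T w)."""
    _, w = if_step(X, r)
    alpha = len(X) / (2.0 * sum(w))
    return [ri - alpha * gi for ri, gi in zip(r, numerical_gradient(X, r))]
```

Up to the finite-difference error, `if_step(X, r)[0]` and `gradient_step(X, r)` return the same vector for any starting point `r`.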

3 Iterative Filtering with Affine Discriminant Function

We look at the quadratic IF system with the discriminant function $g$ defined in Eq. (6), where the iterations are given by

$$r^{t+1} = F(w^t) = X\,\frac{w^t}{\mathbf{1}^T w^t}, \tag{18}$$
$$w^{t+1} = G(d^{t+1}) = \mathbf{1} - k\,d^{t+1}, \tag{19}$$

starting with equal weights $w^0 = \mathbf{1}$. By substituting $w$, the fixed point of the system is given by a system of cubic equations in $r^*$:

$$(X - r^* \mathbf{1}^T)\Big(\mathbf{1} - \frac{k}{n}\,(X^T - \mathbf{1}(r^*)^T)^{\circ 2}\,\mathbf{1}\Big) = 0, \tag{20}$$

with $r^*$ in the polytope $P$ defined in Eq. (13). Theorem 2 claims that $r^*$ is unique in $P$ if $k$ is such that the weights are strictly positive for all vectors of reputations $r \in P$. This result uses the associated energy function that we define for affine IF systems.
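A minimal sketch of the iteration (18,19), together with the residual of the fixed-point equation (20), could look as follows. The voting matrix (three raters in rough agreement and one outlier), the value k = 0.05 and the function names are illustrative assumptions.

```python
def affine_if(X, k, steps=100):
    """Iterate Eq. (18)-(19) from equal weights w^0 = 1 on a full voting matrix."""
    n, m = len(X), len(X[0])
    w = [1.0] * m
    r = None
    for _ in range(steps):
        s = sum(w)
        r = [sum(x * wj for x, wj in zip(row, w)) / s for row in X]   # Eq. (18)
        d = [sum((X[i][j] - r[i]) ** 2 for i in range(n)) / n
             for j in range(m)]
        w = [1.0 - k * dj for dj in d]                                # Eq. (19)
    return r, w


def fixed_point_residual(X, r, w):
    """Largest entry of |(X - r 1^T) w|, which vanishes at a fixed point of Eq. (20)."""
    return max(abs(sum((x - ri) * wj for x, wj in zip(row, w)))
               for row, ri in zip(X, r))


# Hypothetical votes: 3 objects (rows), 4 raters (columns); the last rater disagrees.
X = [[4.0, 5.0, 4.0, 1.0],
     [2.0, 2.0, 3.0, 5.0],
     [5.0, 4.0, 5.0, 1.0]]
r, w = affine_if(X, k=0.05)
```

On this data the outlier keeps a strictly positive weight, but a smaller one than the three other raters, which is the continuous validation scale described in the introduction.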

3.1 The Energy Function

The energy function in Eq. (16) associated with system (18,19) is given by

$$E(r) = -\frac{1}{2k}\,w^T w + \text{constant}, \tag{21}$$

where $w$ depends on $r$ according to the function $G(r)$. We will see later that this energy function decreases with the iterations, i.e., $(E(r^t))_{t \ge 0}$ decreases, and under some assumption on $k$, it converges to the unique minimum. The iterations in system (18,19) can be written as a particular minimization step on the function $E$:

$$r^{t+1} = \arg\min_{r} \Big[ -\frac{1}{2k}\,G(r)^T G(r^t) \Big].$$

Therefore, we have for all $t$ that $(w^{t+1})^T (w^t) \ge (w^t)^T (w^t)$.
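The decrease of the energy along the iterations can be observed numerically. The sketch below records E(r^t) from Eq. (21), with the constant dropped, while iterating system (18,19). The data and the value of k are illustrative assumptions, chosen so that all weights stay positive.

```python
def affine_energy_trace(X, k, steps):
    """Return [E(r^1), E(r^2), ...] with E(r) = -(w^T w) / (2k) and w = 1 - k d(r)."""
    n, m = len(X), len(X[0])
    w = [1.0] * m
    energies = []
    for _ in range(steps):
        s = sum(w)
        r = [sum(x * wj for x, wj in zip(row, w)) / s for row in X]   # Eq. (18)
        d = [sum((X[i][j] - r[i]) ** 2 for i in range(n)) / n
             for j in range(m)]
        w = [1.0 - k * dj for dj in d]                                # Eq. (19)
        energies.append(-sum(wj * wj for wj in w) / (2.0 * k))        # Eq. (21)
    return energies


# Hypothetical votes: 3 objects (rows), 4 raters (columns).
X = [[4.0, 5.0, 4.0, 1.0],
     [2.0, 2.0, 3.0, 5.0],
     [5.0, 4.0, 5.0, 1.0]]
trace = affine_energy_trace(X, k=0.05, steps=30)
```

Each recorded value is no larger than the previous one, up to rounding once the iteration has converged.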

3.2 Uniqueness

The following theorem proves that the stable point of quadratic IF systems with $g$ defined in Eq. (6) is unique, under some condition on the parameter $k$. This result follows directly from the energy function $E$, which is a fourth-order polynomial in $r$.

Theorem 2. (see [4]) The system (18,19) has a unique fixed point $r^*$ in $P$ if

$$k < \min_{r \in P} \|d\|_\infty^{-1}.$$
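The bound of Theorem 2 can be evaluated without searching over the whole polytope $P$, using an argument that is ours and is not spelled out in the text: each $d_j(r) = \frac{1}{n}\|x_j - r\|^2$ is convex in $r$, so its maximum over $P$ is attained at a vertex, and every vertex of $P$ is one of the columns $x_i$. Under that reasoning, the bound equals the inverse of the largest normalized squared distance between two columns of $X$. The function name below is hypothetical.

```python
import math


def k_upper_bound(X):
    """Inverse of max over column pairs (a, b) of (1/n) * ||x_a - x_b||^2.

    This equals min over r in P of 1 / ||d(r)||_inf, assuming the convexity
    argument given in the lead-in. Returns math.inf if all columns coincide.
    """
    n, m = len(X), len(X[0])
    worst = 0.0
    for a in range(m):
        for b in range(m):
            dist = sum((X[i][a] - X[i][b]) ** 2 for i in range(n)) / n
            worst = max(worst, dist)
    return math.inf if worst == 0.0 else 1.0 / worst


# Hypothetical votes: 3 objects (rows), 4 raters (columns).
X = [[4.0, 5.0, 4.0, 1.0],
     [2.0, 2.0, 3.0, 5.0],
     [5.0, 4.0, 5.0, 1.0]]
bound = k_upper_bound(X)
```

Any `k` strictly below `bound` keeps every weight $1 - k d_j$ strictly positive for all $r \in P$.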

[Figure 1: four surface plots of the energy function over the unit square, panels (a) to (d).]

Fig. 1 Four energy functions with two objects and increasing values of k. We have in the unit square: (a) a unique minimum; (b) a unique minimum but other stationary points are close to the boundary; (c) a unique minimum and other stationary points; (d) a unique maximum.

3.3 Convergence of the Method

We analyze the convergence of system (18,19) that reaches the minimum of the energy function $E$ in $P$. Let $r^t$ and $r^{t+1}$ be two subsequent points of the iterations given by some search method. Then the next point $r^{t+1}$ is obtained by choosing a vector $v$ and a scalar $\gamma$ such that

$$r^{t+1} = r^t + \gamma v. \tag{22}$$

This corresponds to some line search on the scalar energy function

$$e(y) = E(r^t + y v), \tag{23}$$

that is a polynomial of degree 4. We have that $e(0)$ is the energy at $r^t$ and $e(\gamma)$ is the energy at $r^{t+1}$. Finally, it is useful for the sequel to define the scalar that minimizes $e$, given by

$$\beta = \arg\min_{y \,:\, r^t + y v \in P} E(r^t + y v). \tag{24}$$

System (18,19) provides a steepest descent method with a particular step size. The direction $v$ and the scalar $\gamma$ in Eq. (22) are

$$v = -\nabla_r E(r^t) \quad \text{and} \quad \gamma = \alpha^t = \frac{n}{2\,(\mathbf{1}^T w^t)},$$

so that we recover Eq. (17). This particular step size $\alpha^t$ can be compared to the step size $\beta$ that minimizes $E$ in the same direction, given in Eq. (24). In numerical simulations, the particular step size $\alpha^t$ is generally smaller than $\beta$, meaning that the step stops before reaching the minimum of the energy function $E$ in the direction $v$. The sequence $(E(r^t))_{t \ge 0}$ can be shown to decrease, so that we have the following convergence result.

Theorem 3. (see [4]) The steepest descent method given by system (18,19) converges to the unique fixed point in $P$ if

$$k < \min_{r \in P} \|d\|_\infty^{-1}.$$

There exist greater values of $k$ such that the minimum of $E$ remains unique and the previous methods converge to this minimum. By increasing $k$, we allow maxima of $E$ to appear in the polytope $P$, see Fig. 1(c). Then we need to verify during the iterations whether $(r^t)$ remains in the basin of attraction of the minimum of $E$.

Theorem 4. (see [4]) If the energy function $E$ in Eq. (21) has a minimum, then the system (18,19) is locally convergent and its asymptotic rate of convergence is linear.

Let us remark that for a singular matrix $X$, the rate of convergence will be faster. In particular, when $X$ is a rank 1 matrix, we have $X = r^* \mathbf{1}^T$ (every object receives $m$ identical votes from the raters) and the method converges in one step. When we take greater values of $k$, maxima of the function $E$ may appear in $P$. However, if the sequence $(\mathbf{1}^T w^t)$ remains positive, the sequence $(E(r^t))$ remains decreasing and converges to a stationary point of $E$. In order to avoid saddle points and maxima, we need to make sure that the iterations reach a minimum. The idea of increasing $k$ is to make the discriminant function $g$ more penalizing and therefore to have a better separation between honest and dishonest raters. We refer to [4] for more details on this.
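The rank 1 remark can be illustrated with a few lines of Python: when every rater gives the same votes, the weighted average of Eq. (18) returns those votes for any positive weights, so the first iterate is already the fixed point. The data and the function name are illustrative assumptions.

```python
def reputation(X, w):
    """Eq. (18): weighted average of the votes, r = X w / (1^T w)."""
    s = sum(w)
    return [sum(x * wj for x, wj in zip(row, w)) / s for row in X]


# Hypothetical rank 1 voting matrix X = r* 1^T with m = 3 identical raters.
r_star = [2.5, 4.0]
X = [[ri] * 3 for ri in r_star]

r1 = reputation(X, [1.0, 1.0, 1.0])        # first iterate from w^0 = 1
r_other = reputation(X, [0.2, 0.5, 0.3])   # any other positive weights
```

Both calls return `r_star` up to rounding; all belief divergences are then zero and the weights stay equal to 1.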

4 Sparsity Pattern and Dynamical Votes

This section extends some previous results to the case where the voting matrix has some sparsity pattern, that is when an object is not evaluated by all raters. Moreover we analyze dynamical voting matrices representing votes that evolve over time.

4.1 Sparsity Pattern

In general, the structure of real data is sparse. We hardly find a set of raters and objects with a vote for all possible pairs. An absence of vote for object $i$ from rater $j$ will imply that the entry $(i, j)$ of the matrix $X$ is equal to zero, that is, by using the adjacency matrix $A$, if $A_{ij} = 0$, then $X_{ij} = 0$. These entries must not be considered as votes but instead as missing values. Therefore the previous equations presented in matrix form require some modifications that will include the adjacency matrix $A$. We write the new equations and their implications using the order of the previous section. Let us already mention that some theorems will simply be stated without proof whenever their extensions with an adjacency matrix $A$ are straightforward. The belief divergence for IF systems in Eq. (2) becomes

$$d_j = \frac{1}{n_j}\,\|x_j - a_j \circ r\|^2,$$

where $a_j$ is the $j$th column of the adjacency matrix $A$ and $n_j$ is the $j$th entry of the vector $\mathbf{n}$ containing the number of votes given by each rater, i.e.,

$$\mathbf{n} = A^T \mathbf{1}.$$

On the other hand, the scalar $n$ remains the total number of objects, i.e., the number of rows in $A$. Therefore, when $A$ is full, $\mathbf{n} = n \mathbf{1}$. Eq. (11-12) for quadratic IF systems can be replaced by the following ones: the reputation function, which remains the weighted average of the votes, is given in matrix form by

$$F(w) = \frac{[Xw]}{[Aw]},$$

where $\frac{[\cdot]}{[\cdot]}$ is the componentwise division. Let us remark that every entry of $Aw$ must be strictly positive. This means that every object is evaluated by at least one rater with nonzero weight. Then all possible vectors of reputations $r$ are included in the polytope

$$\bar{P} = \Big\{ r \in \mathbb{R}^n \;\Big|\; r_i = \sum_{j=1}^{m} \lambda_j x_{ij} \ \text{with} \ \sum_{j=1}^{m} \lambda_j a_{ij} = 1 \ \text{and} \ \lambda_j \ge 0 \Big\},$$

with $x_{ij}$ and $a_{ij}$ the entries of $X$ and $A$. Equation (12) for the belief divergence with the Euclidean norm is changed into

$$d = \frac{[(X^T - A^T \circ \mathbf{1} r^T)^{\circ 2}\,\mathbf{1}]}{[A^T \mathbf{1}]}. \tag{25}$$
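The two modified formulas can be sketched entry by entry as follows. The function names, the error handling and the toy data are illustrative assumptions; the adjacency matrix `A` is used as a mask so that missing entries of `X` never count as votes.

```python
def sparse_reputation(X, A, w):
    """r_i = (Xw)_i / (Aw)_i, the componentwise division [Xw] / [Aw]."""
    r = []
    for x_row, a_row in zip(X, A):
        denom = sum(a * wj for a, wj in zip(a_row, w))
        if denom <= 0:
            raise ValueError("every object needs a rater with positive weight")
        r.append(sum(a * x * wj for a, x, wj in zip(a_row, x_row, w)) / denom)
    return r


def sparse_divergence(X, A, r):
    """Eq. (25): d_j = sum_i A_ij (X_ij - r_i)^2 / n_j, with n = A^T 1."""
    n, m = len(X), len(X[0])
    d = []
    for j in range(m):
        n_j = sum(A[i][j] for i in range(n))
        if n_j == 0:
            raise ValueError("rater %d has cast no vote" % j)
        d.append(sum(A[i][j] * (X[i][j] - r[i]) ** 2 for i in range(n)) / n_j)
    return d


# Hypothetical data: 2 objects, 3 raters; a zero in A marks a missing vote.
X = [[4.0, 0.0, 5.0],
     [2.0, 3.0, 0.0]]
A = [[1, 0, 1],
     [1, 1, 0]]
r = sparse_reputation(X, A, [1.0, 1.0, 1.0])
d = sparse_divergence(X, A, r)
```

Here each object is averaged over the two raters who actually voted for it, and each divergence is normalized by the number of votes cast by that rater.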

With these modifications, the iterations and the fixed point in Eq. (7,9) are given by quadratic equations in $r$ and $w$:

$$(A \circ r^{t+1} \mathbf{1}^T)\,w^t = X w^t, \tag{26}$$
$$(A \circ r^{*} \mathbf{1}^T)\,w^* = X w^*. \tag{27}$$

Hence we expect an energy function to exist and Theorem 1 is generalized by the following theorem.

Theorem 5. (see [4]) The fixed points of quadratic IF systems with integrable discriminant function $g$ are the stationary points of the energy function

$$E(r) = \frac{1}{n} \sum_{j=1}^{m} n_j \int_0^{d_j(r)} g(u)\,du, \tag{28}$$

where $d_j$ is the belief divergence of rater $j$ that depends on $r$ and $n_j$ is the number of votes of rater $j$. Moreover, one iteration step in quadratic IF systems corresponds to a dilated steepest descent direction with a particular step size:

$$r^{t+1} = r^t - \alpha^t \circ \nabla_r E(r^t), \tag{29}$$

with $\alpha^t = \dfrac{n}{2}\,\dfrac{[\mathbf{1}]}{[A w^t]}$.

The number of votes $n_j$ somehow gives a weight of importance for the minimization of the surface $\int_0^{d_j} g(u)\,du$. Therefore a rater with more votes receives more attention in the minimization process.

4.2 Affine Quadratic IF Systems

The system for the discriminant function $g(d) = 1 - kd$ is given by

$$r^{t+1} = F(w^t) = \frac{[X w^t]}{[A w^t]}, \tag{30}$$
$$w^{t+1} = G(d^{t+1}) = \mathbf{1} - k\,d^{t+1}, \tag{31}$$

with the belief divergence defined in Eq. (25). The energy function is given by

$$E(r) = -\frac{1}{2kn}\,w^T [w \circ \mathbf{n}] + \text{constant}, \tag{32}$$

where $w$ depends on $r$ according to the function $G(r)$ and $\mathbf{n} = A^T \mathbf{1}$ is the vector containing the number of votes of each rater. Theorem 2 remains valid for the system (30-31) and the arguments are similar. The steepest descent method adapted to the system (30-31) converges with the property that the sequence $(E(r^t))$ decreases. The proofs are closely related to the ones presented in Theorem 3.

Fig. 2 Trajectory of reputations (circles) for a 5-periodic voting matrix

Theorem 6. (see [4]) The steepest descent method given by system (30,31) converges to the unique fixed point in $\bar{P}$ if

$$k < \min_{r \in \bar{P}} \|d\|_\infty^{-1}.$$

The choice of $k$ can be made larger to better separate honest from dishonest raters. Theorem 4 remains valid with a few modifications in its proof to take into account the adjacency matrix $A$.

Theorem 7. (see [4]) If the energy function $E$ in Eq. (32) has a minimum, then the system (30,31) is locally convergent and its asymptotic rate of convergence is linear.

This section shows that most of the earlier analysis can still be applied when we introduce a sparsity pattern in the voting matrix.

4.3 Dynamical Votes

We consider in this section the case of time-varying votes. Formally, we have discrete sequences

$$(X^t)_{t \ge 0}, \quad (A^t)_{t \ge 0}$$

of voting matrices and adjacency matrices evolving over time $t$. Hence the IF system (7,8) takes into account the new voting matrix $X^{t+1}$ in the functions $F_{t+1}$ and $G_{t+1}$ that become time-dependent:

$$r^{t+1} = F_{t+1}(w^t), \tag{33}$$
$$w^{t+1} = G_{t+1}(d^{t+1}). \tag{34}$$

The system (30,31) for dynamical voting matrices is then given by

$$r^{t+1} = F_{t+1}(w^t) = \frac{[X^{t+1} w^t]}{[A^{t+1} w^t]}, \tag{35}$$
$$w^{t+1} = G_{t+1}(d^{t+1}) = \mathbf{1} - k\,d^{t+1}, \tag{36}$$

with the belief divergence $d^{t+1}$ defined as in Eq. (25) after replacing $X$, $A$ and $r$ by $X^{t+1}$, $A^{t+1}$ and $r^{t+1}$. We already know that for subsequent constant matrices $X^t$ with $T_1 \le t \le T_2$, the iterations on $r^t$ and $w^t$ of system (35,36) tend to fixed vectors $r^*$ and $w^*$ provided that $k$ is not too large. In [4] we give stronger results for the case of 2-periodic voting sequences.
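A minimal sketch of system (35,36) is given below: at each time step a new pair of voting and adjacency matrices is consumed, and the weights are carried over from the previous step. The function name, the data and the value of k are illustrative assumptions, and the sketch assumes that every rater casts at least one vote and every object keeps a positive weighted vote count at every step.

```python
def dynamic_affine_if(votes, k):
    """Iterate Eq. (35)-(36); `votes` is a list of (X, A) pairs for t = 1, 2, ..."""
    m = len(votes[0][0][0])
    w = [1.0] * m                       # equal initial weights w^0 = 1
    history = []
    for X, A in votes:
        n = len(X)
        Aw = [sum(A[i][j] * w[j] for j in range(m)) for i in range(n)]
        r = [sum(A[i][j] * X[i][j] * w[j] for j in range(m)) / Aw[i]
             for i in range(n)]                                      # Eq. (35)
        n_votes = [sum(A[i][j] for i in range(n)) for j in range(m)]
        d = [sum(A[i][j] * (X[i][j] - r[i]) ** 2 for i in range(n)) / n_votes[j]
             for j in range(m)]                                      # Eq. (25)
        w = [1.0 - k * dj for dj in d]                               # Eq. (36)
        history.append(r)
    return history, w


# Hypothetical sparse votes: 3 objects, 4 raters; the last rater is an outlier.
X = [[4.0, 5.0, 4.0, 1.0],
     [2.0, 0.0, 3.0, 5.0],
     [5.0, 4.0, 0.0, 1.0]]
A = [[1, 1, 1, 1],
     [1, 0, 1, 1],
     [1, 1, 0, 1]]
history, w = dynamic_affine_if([(X, A)] * 60, k=0.05)
```

With a constant sequence, as here, the reputations settle to a fixed vector; passing a list that alternates between two different `(X, A)` pairs would give a 2-periodic voting sequence instead.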

5 Concluding Remarks

The general definition of Iterative Filtering systems provides a new framework to analyze and evaluate voting systems. We emphasized the need for a differentiation of trusts between the raters, unlike what is usually done on the Web. The originality of the approach lies in the continuous validation scale for the votes. Next, we assumed that the set of raters is characterized by various possible behaviors, including raters who are clumsy or partly dishonest. However, the outliers that are in obvious disagreement with the other votes remain detectable by the system, as shown in the simulations in the cases of alliances, random votes and spammers. Our paper focuses on the subclass of quadratic IF systems and we show the existence of an energy function that allows us to link a steepest descent to each step of the iteration. It then follows that the system minimizes the belief divergence according to some norm defined from the choice of the discriminant function. This method was illustrated in [4] using two data sets: (i) the votes of 43 countries during the final of Eurovision 2008 and (ii) the votes of 943 movie lovers on the MovieLens website. It was shown that the IF method penalizes certain types of votes. In the first set of data, this yielded a difference between the ranking used by Eurovision and the ranking obtained by our method, in the sense that countries trading votes with, e.g., neighboring countries would get a smaller weight. The second set of data was used to verify the desired property mentioned in the introduction: raters who often diverge from other raters' opinions are taken less into account.
We see two application areas of voting systems: first, the general definition of IF systems offers the possibility to analyze various systems depending on the context and the objectives we aim for; second, the experimental tests and the comparisons are crucial to validate the desired properties (including dynamical properties) and to discuss the choice of the IF systems.

Acknowledgements. This paper presents research results of the Belgian Programme on Interuniversity Attraction Poles, initiated by the Belgian Federal Science Policy Office, and a grant Action de Recherche Concertée (ARC) of the Communauté Française de Belgique. The scientific responsibility rests with its authors.

References

1. Akerlof, G.: The Market for Lemons: Quality Uncertainty and the Market Mechanism. Quarterly Journal of Economics 84, 488–500 (1970)
2. Baeza-Yates, R., Castillo, C., López, V.: PageRank Increase under Different Collusion Topologies. In: First International Workshop on Adversarial Information Retrieval on the Web (2005), http://airweb.cse.lehigh.edu/2005/baeza-yates.pdf
3. de Kerchove, C., Van Dooren, P.: Reputation Systems and Optimization. SIAM News (March 14, 2008)
4. de Kerchove, C., Van Dooren, P.: Iterative Filtering in Reputation Systems (submitted, 2009)
5. Del Corso, G.M., Gullí, A., Romani, F.: Ranking a stream of news. In: Proceedings of the 14th International Conference on World Wide Web (2005)
6. Ginsburgh, V., Noury, A.: Cultural Voting. The Eurovision Song Contest. Mimeo (2004)
7. Guha, R., Kumar, R., Raghavan, P., Tomkins, A.: Propagation of Trust and Distrust. In: Proceedings of the 13th International Conference on World Wide Web, pp. 403–412 (2004)
8. Gyöngyi, Z., Garcia-Molina, H.: Link spam alliances. In: VLDB 2005: Proceedings of the 31st International Conference on Very Large Data Bases, pp. 517–528 (2005)
9. Kamvar, S., Schlosser, M., Garcia-Molina, H.: The Eigentrust Algorithm for Reputation Management in P2P Networks. In: Proceedings of the 12th International Conference on World Wide Web, pp. 640–651 (2003)
10. Kotsovinos, E., Zerfos, P., Piratla, N.M., Cameron, N., Agarwal, S.: Jiminy: A Scalable Incentive-Based Architecture for Improving Rating Quality. In: Stølen, K., Winsborough, W.H., Martinelli, F., Massacci, F. (eds.) iTrust 2006. LNCS, vol. 3986, pp. 221–235. Springer, Heidelberg (2006)
11. Laureti, P., Moret, L., Zhang, Y.-C., Yu, Y.-K.: Information Filtering via Iterative Refinement. Europhysics Letters 75, 1006–1012 (2006)
12. McLachlan, G., Krishnan, T.: The EM algorithm and extensions. John Wiley & Sons, New York (1996)
13. Mui, L., Mohtashemi, M., Halberstadt, A.: A Computational Model of Trust and Reputation. In: Proceedings of the 35th Annual Hawaii International Conference, pp. 2431–2439 (2002)
14. O'Donovan, J., Smyth, B.: Trust in recommender systems. In: Proceedings of the 10th International Conference on Intelligent User Interfaces, pp. 167–174 (2005)
15. Page, L., Brin, S., Motwani, R., Winograd, T.: The PageRank Citation Ranking: Bringing Order to the Web. Stanford Digital Library Technologies Project (1998)
16. Richardson, M., Agrawal, R., Domingos, P.: Trust Management for the Semantic Web. In: Fensel, D., Sycara, K., Mylopoulos, J. (eds.) ISWC 2003. LNCS, vol. 2870, pp. 351–368. Springer, Heidelberg (2003)
17. Theodorakopoulos, G., Baras, J.: On Trust Models and Trust Evaluation Metrics for Ad Hoc Networks. IEEE Journal on Selected Areas in Communications 24(2), 318–328 (2006)
18. Yu, Y.-K., Zhang, Y.-C., Laureti, P., Moret, L.: Decoding information from noisy, redundant, and intentionally distorted sources. Physica A 371(2), 732–744 (2006)
19. Zhang, S., Ouyang, Y., Ford, J., Make, F.: Analysis of a Lowdimensional Linear Model under Recommendation Attacks. In: Proceedings of the 29th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 517–524 (2006)

Lyapunov Exponents and Uniform Weak Normally Repelling Invariant Sets

Paul Leonard Salceanu and Hal L. Smith

Abstract. Let $M$ be a compact invariant set contained in a boundary hyperplane of the positive orthant of $\mathbb{R}^n$ for a discrete or continuous time dynamical system defined on the positive orthant. Using elementary arguments, we show that $M$ is uniformly weakly repelling in directions normal to the boundary in which $M$ resides provided all normal Lyapunov exponents are positive. This result is useful in establishing uniform persistence of the dynamics.

1 Introduction

Dynamical systems models in population biology are typically defined on the nonnegative cone in Euclidean space. In order to establish persistence of some or all components (species) in the model, it is often necessary to show that a compact invariant set $M$ on the boundary of the cone is an isolated invariant set and that it is repelling, at least in some directions normal to $M$. See [4, 5, 7, 15, 19, 20] for recent work in the theory of persistence, sometimes called permanence. In this paper, building on the work of [4, 15] and [11, 14], we show that Lyapunov exponents can be used to establish the requisite repelling properties for both discrete and continuous time systems. This is well known when $M$ is a fixed point or periodic orbit but not so when the dynamics on $M$ is more complicated. We use only elementary arguments rather than appealing to the multiplicative ergodic theorem [1, 2, 4, 15]. This extends our earlier work in [12–14], which covered only the discrete case.

The use of Lyapunov exponents in the study of biological models was pioneered by Metz [8] and Metz et al. [9], who proposed that the dominant Lyapunov exponent gives the best measure of invasion fitness, and by Rand et al. [10], who used it to characterize the invasion "speed" of a rare species. See also the more recent review by Ferriere and Gatto [3], which deals with computational aspects. Roughly, a positive dominant Lyapunov exponent corresponding to a potential invading species in the environment set by a resident species attractor implies that the invader can successfully invade. Our results will give a mathematically rigorous interpretation of this for the nonlinear dynamics. Ashwin et al. [2] use "normal" Lyapunov exponents and invariant measures to answer the following question: if $f : M \to M$ is a smooth map on a smooth finite dimensional manifold, $N$ is a lower dimensional submanifold for which $f(N) \subseteq N$, and $A \subseteq N$ is an attractor for $f|_N$, is $A$ an attractor for $f$, or is it an unstable saddle?

Paul Leonard Salceanu and Hal L. Smith School of Mathematical and Statistical Sciences, Arizona State University, Tempe, AZ 85287, USA, e-mail: [email protected], e-mail: [email protected]

R. Bru and S. Romero-Vivó (Eds.): Positive Systems, LNCIS 389, pp. 17–27. springerlink.com © Springer-Verlag Berlin Heidelberg 2009

2 Main Results

Due to our need to use both subscripts and occasionally superscripts for sequences, we adopt the notation $x = (x^{(1)}, \dots, x^{(m)})^T \in \mathbb{R}^m$. Let $|x| = \sum_i |x^{(i)}|$ denote the norm on $\mathbb{R}^m$ and $d(z, M)$ the distance of $z \in \mathbb{R}^m$ to $M \subset \mathbb{R}^m$. Denote by $\mathbb{R}^m_+ = \{x \in \mathbb{R}^m : x^{(i)} \ge 0, \forall i\}$ the nonnegative cone in $\mathbb{R}^m$. We write $x \ge 0$ when $x \in \mathbb{R}^m_+$; if $A = (a_{ij})$ is an $n \times n$ matrix, then $A \ge 0$ if $a_{ij} \ge 0$ for all $i, j$. Observe that $|x + y| = |x| + |y|$ for vectors $x, y \in \mathbb{R}^m_+$. We let $Z_+ = \mathbb{R}^m_+$. We consider the discrete dynamical system

$$z_{n+1} = F(z_n), \quad z_0 \in Z_+ \tag{1}$$
and the continuous dynamical system
$$z'(t) = F(z(t)), \quad z_0 := z(0) \in Z_+ \tag{2}$$
on the nonnegative cone $Z_+$. It is assumed that (1) and (2) generate a semi-dynamical system on $Z_+$. In case of (1), $F : Z_+ \to Z_+$ is continuous; in case (2), $F : Z_+ \to \mathbb{R}^m$ satisfies $z \in Z_+,\ z^{(i)} = 0 \Rightarrow F^{(i)}(z) \ge 0$ and sufficient regularity properties such that solutions of (2) exist and are unique. We assume that $m = p + q$, $p, q \ge 1$, and that $Z_+$ is decomposed as follows:
$$z = (x, y) \in Z_+ \equiv \mathbb{R}^m_+ = \mathbb{R}^p_+ \times \mathbb{R}^q_+.$$
Compatible with this decomposition, assume that $F(z) = (f(z), g(z))$. Define
$$X = \{z = (x, y) \in Z_+ : y = 0\}.$$
We assume:
$$X \text{ and } Z_+ \setminus X \text{ are positively invariant sets.} \tag{3}$$
Positive invariance of $X$ for both (1) and (2) means that $g(x, 0) = 0$, $(x, 0) \in X$. If $F$ satisfies additional smoothness hypotheses, then it would follow from positive invariance of $X$ that (1) and (2) can be expressed as
$$\begin{cases} x_{n+1} = f(z_n) \\ y_{n+1} = A(z_n)\, y_n \end{cases} \tag{4}$$
and, respectively, as
$$\begin{cases} x' = f(z) \\ y' = A(z)\, y \end{cases} \tag{5}$$
where the matrix function $z \mapsto A(z)$ is continuous and satisfies:

$$A(z) \ge 0, \quad z \in X \tag{6}$$
in case of (4), and
$$A_{ij}(z) \ge 0, \quad i \ne j, \quad z \in X \tag{7}$$
in case of (5). Rather than assume the required smoothness of $F$, we simply assume

hereafter that (6) holds for (4) and that (7) holds for (5).

Let $\mathbb{T}_+$ denote either $\mathbb{Z}_+$ or $\mathbb{R}_+$. When we write $t \in \mathbb{T}_+$, that means we consider both discrete and continuous cases. To make the notation more general, let also $z_t$, $y_t$ etc. denote $z(t)$, $y(t)$ etc., when $t \in \mathbb{R}_+$.

Let $(\phi(t))_{t \in \mathbb{T}_+}$ be the dynamical system generated by (4) (for $t \in \mathbb{Z}_+$) or by (5) (for $t \in \mathbb{R}_+$), respectively. Let $O^+(z) := \{\phi(t, z) : t \in \mathbb{T}_+\}$, which we will refer to as the positive orbit through $z$. Let $I$ denote the identity matrix. Let $P(n, z)$ and $P(t, z)$ denote the fundamental matrix solutions for
$$u_{n+1} = A(\phi(n, z))\, u_n \tag{8}$$
and for
$$v'(t) = A(\phi(t, z))\, v(t). \tag{9}$$
They satisfy:
$$P(n + 1, z) = A(\phi(n, z))\, P(n, z), \quad P(0, z) = I \tag{10}$$
for discrete time and
$$\frac{d}{dt} P(t, z) = A(\phi(t, z))\, P(t, z), \quad P(0, z) = I \tag{11}$$
for continuous time. In either case, it follows from (6), (7) that
$$P(t, z) \ge 0, \quad \forall z \in X,\ \forall t \in \mathbb{T}_+. \tag{12}$$

Let $M \subseteq X$ be a compact and positively invariant set. We envision that in typical applications, $M$ will be an invariant set in the interior of the face $X$ of the cone $Z_+$. In this paper, we will focus on the behavior of solutions near $M$ in $Z_+ \setminus X$. Following Arnold [1], $P(n, z)$ (or $P(t, z)$) is a matrix co-cycle generated by (8) (or by (9)). It is trivial to check that $P$ has the following (co-cycle) property:

$$P(t_2, \phi(t_1, z))\, P(t_1, z) = P(t_1 + t_2, z), \quad \forall z \in Z_+,\ \forall t_1, t_2 \in \mathbb{T}_+. \tag{13}$$
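As a quick numerical sanity check of (10), (12) and the co-cycle property (13) in the discrete case, one can build $P(n, z)$ as a matrix product along an orbit in $X$. The sketch below is an illustration only: the map `f` and the matrix function `A` are hypothetical choices (a logistic map on $[0, 1]$ and a $2 \times 2$ nonnegative matrix), not taken from the paper.

```python
import numpy as np

def f(x):
    # Hypothetical dynamics on the face X (logistic map on [0, 1]).
    return 3.7 * x * (1.0 - x)

def A(x):
    # Hypothetical nonnegative matrix function, so that (6) holds for x in [0, 1].
    return np.array([[0.5 + x, 0.1],
                     [0.2, 1.5 - x]])

def orbit_point(n, x):
    """phi(n, z) restricted to X: apply f n times."""
    for _ in range(n):
        x = f(x)
    return x

def P(n, x):
    """Fundamental matrix from (10): P(n+1) = A(phi(n, z)) P(n), P(0) = I."""
    M = np.eye(2)
    for _ in range(n):
        M = A(x) @ M
        x = f(x)
    return M

x0, t1, t2 = 0.3, 4, 7
lhs = P(t2, orbit_point(t1, x0)) @ P(t1, x0)   # left side of (13)
rhs = P(t1 + t2, x0)                           # right side of (13)
cocycle_holds = np.allclose(lhs, rhs)
```

The two sides agree up to floating-point rounding because both are the same ordered product of the matrices $A(\phi(s, z))$, $s = 0, \dots, t_1 + t_2 - 1$, merely grouped differently.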

Hereafter, when we take $t \in \mathbb{Z}_+$, we refer to (4), and when we take $t \in \mathbb{R}_+$, we refer to (5).

Following [1, 2, 6], for any $z \in M$ and $\eta \in \mathbb{R}^q$ we define the normal Lyapunov exponent $\lambda(z, \eta)$ as

$$\lambda(z, \eta) = \limsup_{t \to \infty} \frac{1}{t} \ln |P(t, z)\eta|, \quad t \in \mathbb{T}_+. \tag{14}$$

As noted in [1], $\lambda(z, \eta) \in \{-\infty\} \cup \mathbb{R}$ and $\lambda(z, \eta) = \lambda(z, a\eta)$, $\forall a \in \mathbb{R} \setminus \{0\}$. We only consider the case that $\eta \in \mathbb{R}^q_+$ because in that case $z + (0, \eta) \in Z_+$ and $(0, \eta)$ represents a normal vector to $M$ at $z$. The co-cycle property (13) can be used to show that
$$\lambda(z, \eta) = \lambda(\phi(s, z), P(s, z)\eta), \quad s \ge 0. \tag{15}$$

Definition 1. We call the compact positively invariant set $M$ a uniformly weak normally repelling set if there exists $\varepsilon > 0$ such that

$$\limsup_{t \to \infty} d(\phi(t, z), M) > \varepsilon, \quad \forall z \in Z_+ \setminus X.$$

Equivalently, in view of (3), there exists a neighborhood V of M in Z+ such that

$$\forall z \in V \setminus X,\ \exists\, t = t(z) > 0,\ \phi(t, z) \notin V.$$

We stress that $M$ may be an attractor relative to the dynamics restricted to the positively invariant set $X$ but we are concerned with the behavior of solutions near $M$ in the positively invariant set $Z_+ \setminus X$. In [14], we used the terminology "$M$ a uniformly weak repeller" for the definition above; we believe the current terminology gives a more accurate description. The adjective "uniform" reflects that $\varepsilon$ is independent of $z$; "weak" reflects that limit superior, rather than limit inferior, appears in the definition; "normal" indicates that we are only interested in the behavior of solutions in $Z_+ \setminus X$. First, we give a lemma adapted from [11, 14] that gives an alternative formulation for the "positivity" of Lyapunov exponents. Let
$$U = \{\eta \in \mathbb{R}^q_+ : |\eta| = 1\}.$$
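The normal Lyapunov exponent (14) can be approximated by evaluating $\frac{1}{n} \ln |P(n, z)\eta|$ for a large but finite $n$. The following sketch does this for a hypothetical discrete system whose dynamics on $X$ is a 2-periodic orbit; the map `f`, the matrix function `A` and all numerical values are assumptions made for illustration. Because a finite $n$ replaces the limit superior, the result is only an estimate.

```python
import numpy as np

def f(x):
    # Hypothetical dynamics on X with the 2-periodic orbit {1, 2}.
    return 3.0 - x

def A(x):
    # Hypothetical positive matrix function along that orbit.
    return np.array([[0.8 * x, 0.1],
                     [0.2, 0.6 * x]])

def lyapunov_estimate(x, eta, n):
    """Finite-time value of (1/n) ln |P(n, z) eta|, with |.| the 1-norm and |eta| = 1."""
    v = np.asarray(eta, dtype=float)
    log_norm = 0.0
    for _ in range(n):
        v = A(x) @ v
        x = f(x)
        s = np.abs(v).sum()
        log_norm += np.log(s)   # accumulate the log of the one-step growth factor
        v /= s                  # renormalise to avoid overflow
    return log_norm / n

lam = lyapunov_estimate(1.0, [0.5, 0.5], 2000)

# Along a periodic orbit, the growth rate is governed by the spectral
# radius of the matrix product over one period.
rho = max(abs(np.linalg.eigvals(A(2.0) @ A(1.0))))
reference = 0.5 * np.log(rho)
```

In this toy case `lam` and `reference` agree closely and are positive, which is the situation in which the results of this section apply to the periodic orbit.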

Lemma 1. Let $K \subset X$ be compact. Assume that
$$\forall (z, \eta) \in K \times U,\ \exists\, \tau = \tau(z, \eta) \in \mathbb{T}_+ \setminus \{0\} \text{ such that } |P(\tau, z)\eta| > 1. \tag{16}$$

Then $\exists\, c > 1$, $\exists\, V$ a bounded neighborhood of $K$ in $Z_+$, such that if $L \subseteq V$ is a positively invariant set, then $L \subset X$ and
$$\forall (z, \eta) \in L \times U,\ \exists\, (\nu_p)_p \subseteq \mathbb{T}_+,\ \nu_p \to \infty,\ |P(\nu_p, z)\eta| > c^p,\ \forall p \ge 1. \tag{17}$$

If, in addition, $K$ is positively invariant, then (16) is equivalent to

$$\lambda(z, \eta) > 0, \quad \forall (z, \eta) \in K \times U. \tag{18}$$

Proof. Let $W = K \times U$ (so $W$ is compact) and $\hat w = (\hat z, \hat\eta) \in K \times U$. From (16) we have that there exists $\hat\tau = \hat\tau(\hat z, \hat\eta) \in \mathbb{T}_+ \setminus \{0\}$ such that $|P(\hat\tau, \hat z)\hat\eta| > 1$. The function $(z, \eta) \mapsto |P(\hat\tau, z)\eta|$ being continuous, there exist $\delta_{\hat w} > 0$, $c_{\hat w} > 1$ such that

$$|P(\hat\tau, z)\eta| > c_{\hat w}, \quad \forall w = (z, \eta) \in B_{\delta_{\hat w}}(\hat w) := \{\tilde w \in Z_+ \times U : |\tilde w - \hat w| < \delta_{\hat w}\}. \tag{19}$$

Since $W$ is compact, there exists a finite set $\{w^1, \dots, w^k\} \subseteq W$ such that $W \subset C := \bigcup_{i=1}^{k} B_{\delta_{w^i}}(w^i)$, where for every $i = 1, \dots, k$, $\delta_{w^i}$ is the quantity corresponding to $w^i$, coming from (19) (i.e., for every $i = 1, \dots, k$, (19) is satisfied with $\hat w$ replaced by $w^i$). To simplify notation, let $\tau_i := \tau(w^i)$, $\delta_i := \delta_{w^i}$, $i = 1, \dots, k$. Also, let $c := \min_i c_{w^i}$ (hence $c > 1$) and $\tau = \max_i \tau_i$. Thus, from (19) we have that
$$|P(\tau_i, z)\eta| > c, \quad \forall w = (z, \eta) \in B_{\delta_i}(w^i),\ \forall i = 1, \dots, k. \tag{20}$$

Now let $V \subset Z_+$ be a bounded neighborhood of $K$ such that $V \times U \subseteq C$ and let $L \subseteq V$ be positively invariant. We prove that $L \subset X$ arguing by contradiction: suppose $L \setminus X \ne \emptyset$. Let $a = (a_x, a_y) \in L \setminus X$. Since $|a_y| > 0$, we can define $\alpha := a_y / |a_y|$. Note that $\alpha \in U$. We will show that
$$\exists\, (\nu_p)_p \subset \mathbb{T}_+,\ \nu_p \to \infty, \text{ such that } |P(\nu_p, a)\alpha| > c^p,\ \forall p \ge 1, \tag{21}$$
by inductively constructing the sequence $(\nu_p)_p$. Thus, there exists $i \in \{1, \dots, k\}$ such that $(a, \alpha) \in B_{\delta_i}(w^i)$. Then, from (20) we have $|P(\nu_1, a)\alpha| > c$, where $\nu_1 = \tau_i$. Now suppose $|P(\nu_p, a)\alpha| > c^p$ for some $p \ge 1$. Let $\tilde\alpha = P(\nu_p, a)\alpha / |P(\nu_p, a)\alpha|$. Since $L \setminus X$ is positively invariant, $\phi(\nu_p, a) \in L \setminus X$, hence $\phi^{(2)}(\nu_p, a) > 0$, where $\phi^{(2)}(t, z)$ denotes the vector formed with the last $q$ components of $\phi(t, z)$. So

$$P(\nu_p, a)\alpha = \frac{1}{|a_y|} P(\nu_p, a)\, a_y = \frac{1}{|a_y|} \phi^{(2)}(\nu_p, a) > 0.$$

Thus, $\tilde\alpha \in U$. There exists $j \in \{1, \dots, k\}$ such that $(\phi(\nu_p, a), \tilde\alpha) \in B_{\delta_j}(w^j)$. Then again, from (20) we have

$$|P(\tau_j, \phi(\nu_p, a))\tilde\alpha| > c, \text{ which implies } |P(\tau_j, \phi(\nu_p, a))\, P(\nu_p, a)\alpha| > c^{p+1}.$$

This means, using (13), that $|P(\nu_{p+1}, a)\alpha| > c^{p+1}$, where we define $\nu_{p+1} = \nu_p + \tau_j$. Note that, by construction, $\nu_p \to \infty$ as $p \to \infty$. Hence (21) holds. Then, we have that

$$|\phi^{(2)}(\nu_p, a)| = |P(\nu_p, a)\, a_y| > c^p |a_y|, \quad \forall p \ge 1,$$

which implies that $|\phi^{(2)}(\nu_p, a)| \to \infty$ as $p \to \infty$. But this is a contradiction to $L$ being bounded. Hence, $L \subset X$. Now (17) can be proved identically as for (21), using that $L \subset X$ is positively invariant and that $P(t, z) \ge 0$, $\forall z \in X$, $\forall t \in \mathbb{T}_+$.

Now assume that $K$ is also positively invariant. The implication (18) $\Rightarrow$ (16) is trivial. For the converse, using (17) and the fact that $\nu_p \le p\tau$, $\forall p \ge 1$, we have, for all $(z, \eta) \in K \times U$, that

$$|P(\nu_p, z)\eta|^{1/\nu_p} > c^{p/\nu_p} \ge c^{1/\tau} \ \Rightarrow\ \frac{1}{\nu_p} \ln |P(\nu_p, z)\eta| > \frac{1}{\tau} \ln c, \quad \forall p \ge 1.$$
Hence
$$\lambda(z, \eta) = \limsup_{t \to \infty} \frac{1}{t} \ln(|P(t, z)\eta|) \ge \frac{1}{\tau} \ln c > 0.$$
This completes our proof. $\square$

In the next result we establish sufficient conditions for M to be a uniformly weak normally repelling set. Let

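$$\Omega(M) = \bigcup_{z \in M} \omega(z), \tag{22}$$
where $\omega(z)$ represents the omega limit set of $z$.

Theorem 1. Let $M \subset X$ be a nonempty, compact and positively invariant set. $M$ is a uniformly weak normally repelling set if

$$\lambda(z, \eta) > 0, \quad \forall (z, \eta) \in M \times U. \tag{23}$$

If
$$\forall (z, \eta) \in M \times U,\ \forall t \in \mathbb{T}_+, \quad P(t, z)\eta \ne 0 \tag{24}$$
and
$$\lambda(z, \eta) > 0, \quad \forall (z, \eta) \in \Omega(M) \times U \tag{25}$$
then (23) holds.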

Proof. First we show that (23) implies that $M$ is a uniformly weak normally repelling set. For this, we argue by contradiction: suppose $M$ is not a uniformly weak normally repelling set. So, there exists a sequence $(\tilde z^m)_m \subseteq Z_+ \setminus X$ such that

$$\limsup_{t \to \infty} d(\phi(t, \tilde z^m), M) < 1/m, \quad \forall m \ge 1.$$
Hence there exists a sequence $(\tau_m)_m \subset \mathbb{T}_+$ such that, for each $m \ge 1$, we have

$$d(\phi(t, \tilde z^m), M) < 1/m, \quad \forall t \ge \tau_m. \tag{26}$$

Let $z^m = (x^m, y^m) = \phi(\tau_m, \tilde z^m)$. Using the positive invariance of $Z_+ \setminus X$, we have that

$$y^m > 0, \quad \forall m \ge 1. \tag{27}$$

From the semiflow property of $\phi$ and from (26) we get
$$d(\phi(t, z^m), M) < 1/m, \quad \forall t \in \mathbb{T}_+,\ m \ge 1. \tag{28}$$

Using (23), we obtain from Lemma 1 (applied with $K = M$) that there exists $V$, a bounded neighborhood of $M$ in $Z_+$, having the property that any positively invariant set contained in $V$ is a subset of $X$. Then there exists $m \in \mathbb{N}$ such that $B_m := \{z \in Z_+ \mid d(z, M) \le 1/m\}$ is contained in $V$. The set $L = \{\phi(n, z^m) \mid n \ge 0\}$ is positively invariant and, according to (28), it is contained in $B_m$. Also (see (27)) $L \setminus X \ne \emptyset$. But this is a contradiction, according to Lemma 1. Hence, $M$ is a uniformly weak normally repelling set.

Now we prove the final assertion. Let $(a, \alpha) \in M \times U$. Using (25) and the fact that $\omega(a) \subset X$ is compact and invariant, we can again apply Lemma 1, now with $K = \omega(a)$. So let $V_a$ be a neighborhood of $\omega(a)$ and $c > 1$ as in the above mentioned lemma. Since $\phi(t, a) \to \omega(a)$ as $t \to \infty$, there exists $\tau_a \in \mathbb{T}_+$ such that $\phi(t, a) \in V_a$, $\forall t \ge \tau_a$. Let $L = \{\phi(t, a) \mid t \ge \tau_a\}$. Then $L$ is a positively invariant set contained in $V_a$. Let $\tilde\alpha = P(\tau_a, a)\alpha / |P(\tau_a, a)\alpha|$. Note that $\tilde\alpha$ is well defined, due to (24), and that $\tilde\alpha \in U$. So, from (17), there exists a sequence $\nu_p \to \infty$ such that $|P(\nu_p, \phi(\tau_a, a))\tilde\alpha| > c^p$, $\forall p \ge 1$. Thus, using (13) we get

$$|P(\nu_p + \tau_a, a)\alpha| > c^p |P(\tau_a, a)\alpha|, \quad \forall p \ge 1.$$

We can find $p$ large enough to have $c^p |P(\tau_a, a)\alpha| > 1$. So, we proved that
$$\forall (z, \eta) \in M \times U,\ \exists\, \tau \in \mathbb{T}_+ \setminus \{0\} \text{ such that } |P(\tau, z)\eta| > 1,$$
which is equivalent to (23), by Lemma 1. This completes our proof. $\square$

Note that (24) is automatically satisfied in the continuous case. In the discrete case, it is equivalent to $A(z)\eta \ne 0$, $\forall z \in M$, $\forall \eta \in U$.

As will be seen below, when the matrix $A(z)$ satisfies stronger positivity conditions, the Lyapunov exponents are independent of the unit vector $\eta$. Let $\|A\| = \sup\{|A\xi| : |\xi| = 1\}$ denote the norm of an $n \times n$ matrix. For matrices $A, B$, we write $A \le B$ if $a_{ij} \le b_{ij}$, $\forall i, j$; the inequality $A \gg 0$ means all entries of $A$ are positive.

Proposition 1. Let $z \in X$ have compact orbit closure $\overline{O^+(z)}$. In the discrete case, assume that
$$\exists N, \quad P(n, z_0) \gg 0, \quad n \ge N,\ z_0 \in \overline{O^+(z)}. \tag{29}$$
In the continuous case, assume that

$$A(z_0) \text{ is irreducible}, \quad z_0 \in \overline{O^+(z)}. \tag{30}$$

Then
$$\lambda(z, \eta) = \limsup_{t \to \infty} \frac{1}{t} \ln \|P(t, z)\|, \quad \forall \eta \in U. \tag{31}$$
In particular, if (29), respectively (30), holds for each $z \in M$, then (31) holds for every $z \in M$.

Proof. First we give the proof for the discrete case ($t \in \mathbb{N}$). Let $P(n) := P(n, z)$.

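We have that $\forall n \in \mathbb{N}$, $\exists\, k_n, p_n \in \mathbb{N}$, with $0 \le p_n \le N - 1$, such that $n = k_n N + p_n$. Let $B_s := P(N, z_{sN})$, $\forall s \ge 0$. Here, $z_0 = z$. Hence $B_s \gg 0$, $\forall s \ge 0$. Let $\tilde P(n) = B_{k_n - 1} \cdots B_0$. So $P(n) = P(p_n, z_{k_n N})\, \tilde P(n)$. First, we want to apply Theorem 3.4 in [16] for the sequence of matrices $B_0, \dots, B_s, \dots$. Since $\overline{O^+(z)}$ is compact, it follows that there exist constant matrices $C, D \gg 0$ such that $D \ge B_s \ge C$, $\forall s \ge 0$. Let $\delta = \min_{i,j}(C_{ij})$ and $\gamma = \max_{i,j}(D_{ij})$. So, the following hold:

a) $\min_{i,j}(B_s)_{ij} \ge \delta > 0$, $\forall s \ge 0$;
b) $\max_{i,j}(B_s)_{ij} \le \gamma < \infty$.

Thus, the hypotheses of [16, Theorem 3.4] hold and (see exercise 3.6 in [16]) we have that
$$\frac{\tilde P(n)_{li}}{\tilde P(n)_{lj}} \to c_{ij} > 0 \text{ as } n \to \infty, \tag{32}$$
for some $c_{ij}$ independent of $l$. Denote by $\tilde P(n)^i$ the $i$th column of $\tilde P(n)$. Then (32) implies that
$$\lim_{n \to \infty} \frac{|\tilde P(n)^i|}{|\tilde P(n)^j|} = c_{ij}. \tag{33}$$
Let $e_i \in U$ be the unit vector whose $i$th component equals one, and the other components are zero. Then, using (33), we get, for any $i \in \{1, \dots, q\}$, that

$$\lambda(z, e_i) = \limsup_{n \to \infty} \frac{1}{n} \ln |P(p_n, z_{k_n N})\, \tilde P(n) e_i| \le \limsup_{n \to \infty} \frac{1}{n} \left( \ln \|P(p_n, z_{k_n N})\| + \ln |\tilde P(n)^i| \right) = \limsup_{n \to \infty} \frac{1}{n} \ln |\tilde P(n)^i|. \tag{34}$$
On the other hand,

$$\lambda(z, e_i) \ge \limsup_{n \to \infty} \frac{1}{k_n N} \ln |P(k_n N, z) e_i| = \limsup_{n \to \infty} \frac{1}{k_n N} \ln |\tilde P(n)^i| = \limsup_{n \to \infty} \frac{n}{k_n N} \cdot \frac{1}{n} \ln |\tilde P(n)^i| = \limsup_{n \to \infty} \frac{1}{n} \ln |\tilde P(n)^i|. \tag{35}$$
Thus, from (34) and (35) we have that

$$\lambda(z, e_i) = \limsup_{n \to \infty} \frac{1}{n} \ln |\tilde P(n)^i|. \tag{36}$$

But, for any $i, j \in \{1, \dots, q\}$ we have that
$$\limsup_{n \to \infty} \frac{1}{n} \ln |\tilde P(n)^i| = \limsup_{n \to \infty} \frac{1}{n} \ln \left( \frac{|\tilde P(n)^i|}{|\tilde P(n)^j|}\, |\tilde P(n)^j| \right) = \limsup_{n \to \infty} \left( \frac{1}{n} \ln \frac{|\tilde P(n)^i|}{|\tilde P(n)^j|} + \frac{1}{n} \ln |\tilde P(n)^j| \right) = \limsup_{n \to \infty} \frac{1}{n} \ln |\tilde P(n)^j|.$$
Thus, let

$$c = \lambda(z, e_i) = \limsup_{n \to \infty} \frac{1}{n} \ln |\tilde P(n)^i|, \quad \forall i = 1, \dots, q. \tag{37}$$
Let $\eta \in U$. There exist $p_1, \dots, p_q \in [0, 1]$ such that $\eta = \sum_{i=1}^{q} p_i e_i$. Then, using (37), we obtain
$$\lambda(z, \eta) = \limsup_{n \to \infty} \frac{1}{n} \ln |P(n)\eta| = \limsup_{n \to \infty} \frac{1}{n} \ln \Big| \sum_{i=1}^{q} p_i P(n) e_i \Big| \ge \limsup_{n \to \infty} \frac{1}{n} \sum_{i=1}^{q} p_i \ln |P(n) e_i| = \limsup_{n \to \infty} \frac{1}{n} \ln |P(n) e_i| = c. \tag{38}$$
On the other hand, we have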

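$$\lambda(z, \eta) = \lambda\Big(z, \sum_{i=1}^{q} p_i e_i\Big) \le \max_{i = 1, \dots, q} \lambda(z, e_i) = c, \tag{39}$$
where we used the following two properties of Lyapunov exponents (see [1], page 114):

1) $\lambda(z, \eta_1 + \eta_2) \le \max\{\lambda(z, \eta_1), \lambda(z, \eta_2)\}$, and
2) $\lambda(z, a\eta) = \lambda(z, \eta)$, $\forall a \in \mathbb{R} \setminus \{0\}$.

From (38) and (39) we obtain $\lambda(z, \eta) = c$. It is clear that

$$c = \lambda(z, \eta) \le \limsup_{n \to \infty} \frac{1}{n} \ln \|P(n)\|. \tag{40}$$
Now, we want to show the opposite inequality. Because all norms on a finite dimensional normed linear space are equivalent, there exist constants $a, b > 0$ such that $a\|B\|_1 \le \|B\| \le b\|B\|_1$ for all matrices $B$, where $\|B\|_1 = \max_i |B^i|$. Then
$$\limsup_{n \to \infty} \frac{1}{n} \ln \|P(n)\| \le \limsup_{n \to \infty} \frac{1}{n} \ln \left( \|P(p_n, z_{k_n N})\| \cdot \|\tilde P(n)\| \right) = \limsup_{n \to \infty} \frac{1}{n} \ln \|\tilde P(n)\| \le \limsup_{n \to \infty} \frac{1}{n} \ln \left( b \|\tilde P(n)\|_1 \right) = \limsup_{n \to \infty} \frac{1}{n} \ln \|\tilde P(n)\|_1 = \lim_{k \to \infty} \frac{1}{n_k} \ln \|\tilde P(n_k)\|_1, \tag{41}$$
for some sequence $(n_k)_k \subseteq \mathbb{N}$, $n_k \to \infty$ as $k \to \infty$. There exists $j \in \{1, \dots, q\}$ such that $\|\tilde P(n_k)\|_1 = |\tilde P(n_k)^j|$ for infinitely many $k$'s. Hence, there exists a subsequence $(\tilde n_k)_k$ of $(n_k)_k$ such that $\|\tilde P(\tilde n_k)\|_1 = |\tilde P(\tilde n_k)^j|$, $\forall k$. Then, from (41) we have that

$$\limsup_{n \to \infty} \frac{1}{n} \ln \|P(n)\| \le \lim_{k \to \infty} \frac{1}{\tilde n_k} \ln |\tilde P(\tilde n_k)^j| \le \limsup_{n \to \infty} \frac{1}{n} \ln |\tilde P(n)^j| = c. \tag{42}$$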

Now, (40) and (42) imply $\lambda(z, \eta) = \limsup_{n \to \infty} \frac{1}{n} \ln \|P(n)\|$. Since $\eta \in U$ was arbitrarily chosen, the proof for the discrete case is complete.

Now let us consider the continuous case ($t \in \mathbb{R}_+$). Again, considering $z \in X$ fixed, we can denote $P(t, z)$, in short, by $P(t)$. Denote by $[\cdot] : \mathbb{R} \to \mathbb{Z}$ the greatest integer function. The same argument used to prove (36) leads to
$$\lambda(z, \eta) = \limsup_{t \to \infty} \frac{1}{[t]} \ln |P([t])\eta|. \tag{43}$$

Let $n := [t]$. Then $\lambda(z, \eta) = \limsup_{n \to \infty} \frac{1}{n} \ln |P(n)\eta|$. Let $B_n := P(1, z_n)$, $\forall n \in \mathbb{N}$. Then $P(n) = B_{n-1} \cdots B_0$. Our hypotheses on the matrix $A(z)$ guarantee that $B_n \gg 0$, $\forall n \ge 0$ (see [17, Theorem 1.1]). Now the same proof as for the discrete case, applied with $N = 1$ (hence $k_n = n$, $p_n = 0$ and $\tilde P(n) = P(n)$), carries over and leads to
$$\limsup_{n \to \infty} \frac{1}{n} \ln |P(n)\eta| = \limsup_{n \to \infty} \frac{1}{n} \ln \|P(n)\|.$$
But
$$\limsup_{n \to \infty} \frac{1}{n} \ln \|P(n)\| = \limsup_{t \to \infty} \frac{1}{[t]} \ln \|P([t])\| = \limsup_{t \to \infty} \frac{1}{t} \ln \|P(t)\|.$$
Indeed, the left side is clearly less than or equal to the right hand side and the opposite inequality is obtained as in (34). This completes our proof. $\square$
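The conclusion of Proposition 1 in the discrete case can be observed numerically: for products of entrywise positive matrices, every column of $P(n)$ grows at the same exponential rate as the norm $\|P(n)\|$. The sketch below is a hedged illustration with a hypothetical 2-periodic sequence of positive matrices (so that $N = 1$ works in (29)); the function name `growth_rates` and the matrices are assumptions, and the finite $n$ gives only an approximation of the limit superior.

```python
import numpy as np

def growth_rates(mats, n):
    """Return (1/n) ln |P(n) e_i| for every i, and (1/n) ln ||P(n)||.

    P(n) is the product of the periodic sequence `mats`, applied in order.
    |.| is the 1-norm, so ||.|| is the maximal column sum.
    """
    q = mats[0].shape[0]
    P = np.eye(q)
    log_scale = 0.0
    for s in range(n):
        P = mats[s % len(mats)] @ P
        scale = P.sum()
        P /= scale                 # keep entries O(1) to avoid overflow
        log_scale += np.log(scale)
    col_rates = (np.log(P.sum(axis=0)) + log_scale) / n
    norm_rate = col_rates.max()    # log of the largest column sum, divided by n
    return col_rates, norm_rate

# Hypothetical positive matrices along a 2-periodic orbit.
mats = [np.array([[0.8, 0.1], [0.2, 0.6]]),
        np.array([[1.6, 0.1], [0.2, 1.2]])]
col_rates, norm_rate = growth_rates(mats, 2000)
rho = max(abs(np.linalg.eigvals(mats[1] @ mats[0])))
```

For this periodic example the common rate is close to $\frac{1}{2} \ln \rho$, where $\rho$ is the spectral radius of the product over one period, in line with the remark on Floquet matrices made after the proof.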

Theorem 1 shows that M is a uniformly weak normally repelling set provided λ(z,η) > 0 for all z ∈ M and all η ∈ U. Furthermore, under a mild hypothesis, it suffices to show λ(z,η) > 0 for z ∈ Ω(M) = ∪z∈M ω(z) and η ∈ U. Proposition 1 gives conditions for λ(z,η) to be independent of η ∈ U for all z ∈ M. In this case, using (15), we see that λ(z) = λ(φ(s,z)), s ≥ 0, is constant on forward orbits. We assume hereafter that λ(z) = λ(z,η) depends only on z ∈ M. As a consequence, if the hypotheses of Theorem 1 hold, and if Ω(M) consists of a finite number of periodic orbits Oi, i = 1,2,...,p, then it suffices to show that λ(zi) > 0 for some choice zi ∈ Oi, i = 1,2,...,p. In this case, only finitely many exponents must be computed. See [3, 11, 14], where λ(zi) is related to the spectral radii of a certain Floquet matrix. According to the multiplicative ergodic theorem [1, 6], if M is invariant and there exists an ergodic F-invariant Borel probability measure μ on M, then λ(z) is a constant on M, almost surely. Unfortunately, for our results almost sure positivity of the Lyapunov exponent does not suffice.

References

1. Arnold, L.: Random Dynamical Systems. Springer, Heidelberg (1998)
2. Ashwin, P., Buescu, J., Stewart, I.: From attractor to chaotic saddle: a tale of transverse instability. Nonlinearity 9, 703–737 (1996)
3. Ferriere, R., Gatto, M.: Lyapunov Exponents and the Mathematics of Invasion in Oscillatory or Chaotic Populations. Theor. Population Biol. 48, 126–171 (1995)
4. Garay, B.M., Hofbauer, J.: Robust Permanence for Ecological Differential Equations, Minimax, and Discretizations. SIAM J. Math. Anal. 34, 1007–1039 (2003)
5. Hirsch, M.W., Smith, H.L., Zhao, X.-Q.: Chain transitivity, attractivity and strong repellors for semidynamical systems. J. Dynamics and Diff. Eqns. 13, 107–131 (2001)
6. Katok, A., Hasselblatt, B.: Introduction to the Modern Theory of Dynamical Systems. Cambridge University Press, New York (1995)
7. Magal, P., Zhao, X.-Q.: Global Attractors and Steady States for Uniformly Persistent Dynamical Systems. SIAM J. Math. Anal. 37, 251–275 (2005)
8. Metz, J.A.J.: Fitness. Evol. Ecol. 2, 1599–1612 (2008)
9. Metz, J.A.J., Nisbet, R.M., Geritz, S.A.H.: How Should We Define "Fitness" for General Ecological Scenarios? Tree 7, 198–202 (1992)
10. Rand, D.A., Wilson, H.B., McGlade, J.M.: Dynamics and Evolution: Evolutionarily Stable Attractors, Invasion Exponents and Phenotype Dynamics. Philosophical Transactions: Biological Sciences 343, 261–283 (1994)
11. Salceanu, P.L.: Lyapunov exponents and persistence in dynamical systems, with applications to some discrete-time models. PhD Thesis, Arizona State University (2009)
12. Salceanu, P.L., Smith, H.L.: Persistence in a Discrete-time, Stage-structured Epidemic Model. J. Difference Equ. Appl. (to appear, 2009)
13. Salceanu, P.L., Smith, H.L.: Persistence in a Discrete-time Stage-structured Fungal Disease Model. J. Biol. Dynamics 3, 271–285 (2009)
14. Salceanu, P.L., Smith, H.L.: Lyapunov Exponents and Persistence in Discrete Dynamical Systems. Discrete and Continuous Dynamical Systems-B (to appear, 2009)
15. Schreiber, S.J.: Criteria for C^r Robust Permanence. J. Differ. Equations 162, 400–426 (2000)
16. Seneta, E.: Non-negative Matrices, an Introduction to Theory and Applications. Halsted Press, New York (1973)
17. Smith, H.L.: Monotone Dynamical Systems: an introduction to the theory of competitive and cooperative systems. Amer. Math. Soc. Surveys and Monographs 41 (1995)
18. Smith, H.L., Zhao, X.-Q.: Robust Persistence for Semidynamical Systems. Nonlinear Anal. 47, 6169–6179 (2001)
19. Thieme, H.R.: Mathematics in Population Biology. Princeton University Press, New Jersey (2003)
20. Zhao, X.-Q.: Dynamical Systems in Population Biology. Springer, New York (2003)

Reachability Analysis for Different Classes of Positive Systems

Maria Elena Valcher

Abstract. In this survey paper, reachability properties for discrete-time positive systems, two-dimensional discrete state-space models and discrete-time positive switched systems are introduced and characterized. Comparisons among the results obtained in these three settings are presented, thus highlighting which results can be easily extended and which aspects are, at present, still challenging open problems.

1 Introduction

Since the early seventies, positive systems have been the object of noteworthy interest in the literature. Positive linear systems [14] naturally arise in various fields, such as bioengineering (compartmental models), economic modeling, behavioral science, and stochastic processes (Markov chains or hidden Markov models). Generally speaking, these systems provide the natural framework for modeling physical systems whose describing variables necessarily take nonnegative values. It is clear, however, that apart from the nonnegativity constraint, various additional features may be relevant when capturing the system dynamics. These instances led to the introduction of different classes of positive systems, in particular, (one-dimensional) positive systems, two-dimensional (2D) positive systems and switched positive systems. Even though for each of these classes of systems several theoretical problems have been thoroughly investigated, a common research topic for all these classes of systems has been the analysis of structural properties and, in particular, of reachability. The aim of this paper is to provide a brief survey of the reachability characterizations obtained within these three settings, with special attention to the discrete-time cases (even though similar analyses have been performed in the continuous-time cases). Specifically, reachability of discrete-time positive systems will be the object of Sect. 2, and it has been investigated, just to quote some of the available references, in [5–7, 10–13, 24]; reachability of discrete two-dimensional positive systems will be discussed in Sect. 3, and it has been the object of [1, 2, 17, 18, 21, 22]; finally, reachability of discrete-time positive switched systems will be addressed in Sect. 4 (see [9, 20, 26–28]).

Maria Elena Valcher Dip. Ingegneria dell'Informazione, Università di Padova, Italy, e-mail: [email protected]

R. Bru and S. Romero-Vivó (Eds.): Positive Systems, LNCIS 389, pp. 29–41. springerlink.com © Springer-Verlag Berlin Heidelberg 2009

Notation. The $(i, j)$th entry of a matrix $A$ is denoted by $[A]_{i,j}$. In the special case of a vector $v$, its $i$th entry is $[v]_i$. The symbol $\mathbb{R}_+$ denotes the semiring of nonnegative real numbers. A matrix $A$ (in particular, a vector) with entries in $\mathbb{R}_+$ is said to be nonnegative ($A \ge 0$). If $A \ge 0$ and at least one entry is positive, $A$ is said to be positive ($A > 0$). We let $e_i$ denote the $i$th vector of the canonical basis in $\mathbb{R}^n$ (where $n$ is always clear from the context). Given a vector $v$, the nonzero pattern of $v$ is the set of indices corresponding to its nonzero entries, namely $\mathrm{ZP}(v) := \{i : [v]_i \ne 0\}$. A vector $v \in \mathbb{R}^n_+$ is an $i$th monomial vector if $\mathrm{ZP}(v) = \mathrm{ZP}(e_i) = \{i\}$. A monomial matrix is a nonsingular square positive matrix whose columns are (distinct) monomial vectors. The Hurwitz products of two $n \times n$ matrices $A_1$ and $A_2$ are inductively defined [15] as

$$A_1^{\,i} \sqcup\!\sqcup^{\,j} A_2 = 0, \quad \text{when either } i \text{ or } j \text{ is negative},$$
$$A_1^{\,i} \sqcup\!\sqcup^{\,0} A_2 = A_1^i, \ \text{if } i \ge 0, \qquad A_1^{\,0} \sqcup\!\sqcup^{\,j} A_2 = A_2^j, \ \text{if } j \ge 0,$$
$$A_1^{\,i} \sqcup\!\sqcup^{\,j} A_2 = A_1 (A_1^{\,i-1} \sqcup\!\sqcup^{\,j} A_2) + A_2 (A_1^{\,i} \sqcup\!\sqcup^{\,j-1} A_2), \quad \text{if } i, j > 0.$$

Notice that $\sum_{i+j=\ell} A_1^{\,i} \sqcup\!\sqcup^{\,j} A_2 = (A_1 + A_2)^{\ell}$.

Basic definitions and results about cones may be found, for instance, in [3, 4]. We recall here only those facts that will be used within this paper. A set $K \subset \mathbb{R}^n$ is said to be a cone if $\alpha K \subset K$ for all $\alpha \ge 0$; a cone is convex if it contains, with any two points, the line segment between them. A cone $K$ is said to be polyhedral if it can be expressed as the set of nonnegative linear combinations of a finite set of generating vectors. This amounts to saying that a positive integer $k$ and an $n \times k$ matrix $C$ can be found, such that (s.t.) $K$ coincides with the set of nonnegative combinations of the columns of $C$. In this case, we adopt the notation $K := \mathrm{Cone}(C)$.

To efficiently introduce our results, we also need some definitions borrowed from the algebra of non-commutative polynomials [25]. Given the alphabet $\Xi = \{\xi_1, \xi_2, \dots, \xi_p\}$, the free monoid $\Xi^*$ with base $\Xi$ is the set of all words $w = \xi_{i_1} \xi_{i_2} \cdots \xi_{i_k}$, $k \in \mathbb{N}$, $\xi_{i_h} \in \Xi$. The integer $k$ is called the length of $w$ and is denoted by $|w|$, while $|w|_i$ represents the number of occurrences of $\xi_i$ in $w$. If $\tilde w = \xi_{j_1} \xi_{j_2} \cdots \xi_{j_p}$ is another element of $\Xi^*$, the product is defined by concatenation $w \tilde w = \xi_{i_1} \xi_{i_2} \cdots \xi_{i_k} \xi_{j_1} \xi_{j_2} \cdots \xi_{j_p}$. This produces a monoid with $\varepsilon = \emptyset$, the empty word, as unit element. Clearly, $|w \tilde w| = |w| + |\tilde w|$ and $|\varepsilon| = 0$.

$\mathbb{R}\langle \xi_1, \xi_2, \dots, \xi_p \rangle$ is the algebra of polynomials in the noncommuting indeterminates $\xi_1, \xi_2, \dots, \xi_p$. For every family of $p$ matrices in $\mathbb{R}^{n \times n}$, $\mathcal{A} := \{A_1, A_2, \dots, A_p\}$, the map $\psi$ defined on $\{\varepsilon, \xi_1, \xi_2, \dots, \xi_p\}$ by the assignments $\psi(\varepsilon) = I_n$ and $\psi(\xi_i) = A_i$, $i \in \{1, \dots, p\}$, uniquely extends to an algebra morphism of $\mathbb{R}\langle \xi_1, \xi_2, \dots, \xi_p \rangle$ into $\mathbb{R}^{n \times n}$ (as an example, $\psi(\xi_1 \xi_2) = A_1 A_2 \in \mathbb{R}^{n \times n}$). If $w$ is a word in $\Xi^*$ (i.e. a monic monomial in $\mathbb{R}\langle \xi_1, \xi_2, \dots, \xi_p \rangle$), the $\psi$-image of $w$ is denoted by $w(A_1, A_2, \dots, A_p)$.

2 Reachability of Discrete-Time Positive Systems

A (discrete-time) positive system is a state-space model

\[ x(k+1) = A x(k) + B u(k), \qquad k = 0, 1, 2, \dots, \tag{1} \]

where $x(k)$ and $u(k)$ denote the $n$-dimensional state variable and the $m$-dimensional input variable, respectively, at the time instant $k$, while $A \in \mathbb{R}^{n\times n}_+$ and $B \in \mathbb{R}^{n\times m}_+$. Under the nonnegativity constraint on the system matrices, the state trajectories of the system are constrained within the positive orthant, provided that the initial condition $x(0)$ and the input sequence $u(k)$, $k \in \mathbb{Z}_+$, are nonnegative. The reachability property for this type of system focuses only on nonnegative states, reached by means of nonnegative inputs, and hence it is defined as follows.

Definition 1. Given the positive system (1), a state $x_f \in \mathbb{R}^n_+$ is said to be reachable if there exist $k_f \in \mathbb{N}$ and a nonnegative input sequence $u(k)$, $k = 0, 1, \dots, k_f - 1$, that transfers the state of the system from the origin at $k = 0$ to $x_f$ at time $k = k_f$. The positive system (1) is monomially reachable if every monomial vector (equivalently, every canonical vector $e_i$, $i \in \langle n \rangle := \{1, \dots, n\}$) is reachable, and reachable if every state $x_f \in \mathbb{R}^n_+$ is reachable.

It is easily seen that monomial reachability is a necessary and sufficient condition for reachability. Necessity is obvious. On the other hand, if each canonical vector $e_i$, $i \in \langle n \rangle$, is reachable at some time $k_i$ by means of a nonnegative input sequence $u^i(k)$, then each of them is reachable at $k_f := \max_i k_i$ by means of a suitably right-shifted version, say $\bar u^i(k)$, of the sequence $u^i(k)$. Consequently, by linearity, every positive vector $x_f$ can be reached at time $k_f$ by means of the nonnegative input sequence $\sum_{i=1}^{n} \bar u^i(k)\,[x_f]_i$. This simple remark allows one to convert the reachability problem into the easier monomial reachability problem. An algebraic characterization of monomial reachability, and hence of reachability, can be easily obtained by resorting to the reachability matrix of the system.
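The superposition argument above can be checked numerically. The following is a minimal sketch in plain Python, assuming a hypothetical 2-state, single-input system that is not taken from the text; the helper names (`simulate`, `right_shift`, `combine`) are illustrative only.

```python
def mat_vec(M, v):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def simulate(A, B, inputs):
    """Return x(k_f) for x(k+1) = A x(k) + B u(k) with x(0) = 0.

    inputs = [u(0), ..., u(k_f - 1)], each u(k) a list of length m.
    """
    x = [0] * len(A)
    for u in inputs:
        x = [a + b for a, b in zip(mat_vec(A, x), mat_vec(B, u))]
    return x


def right_shift(inputs, k_f):
    """Delay a sequence by prepending zero inputs until it has length k_f."""
    m = len(inputs[0])
    return [[0] * m for _ in range(k_f - len(inputs))] + inputs


def combine(sequences, weights):
    """Time-step-wise nonnegative combination sum_i weights[i] * sequences[i]."""
    k_f, m = len(sequences[0]), len(sequences[0][0])
    return [[sum(w * s[k][j] for s, w in zip(sequences, weights))
             for j in range(m)] for k in range(k_f)]


# Hypothetical example system (an assumption, not from the text).
A = [[0, 1], [0, 0]]
B = [[0], [1]]
u_e1 = [[1], [0]]   # drives the state to e_1 at time 2
u_e2 = [[1]]        # drives the state to e_2 at time 1
k_f = 2
shifted = [right_shift(u_e1, k_f), right_shift(u_e2, k_f)]
x_f = [3, 5]
u = combine(shifted, x_f)     # weights are the entries of x_f
print(simulate(A, B, u))      # state reached at time k_f
```

The combined input stays nonnegative because it is a nonnegative combination of nonnegative sequences, which is the only property the argument needs.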

Definition 2. The reachability matrix at time $k$ of system (1) is

\[ \mathcal{R}_k(A,B) := [\,B \mid AB \mid \dots \mid A^{k-1}B\,]. \]

As the expression of the state at time $k$, starting from the zero initial condition $x(0) = 0$ and under the (nonnegative) forcing input $u(\cdot)$, is given by

\[
x(k) = [\,B \mid AB \mid \dots \mid A^{k-1}B\,]
\begin{bmatrix} u(k-1) \\ u(k-2) \\ \vdots \\ u(0) \end{bmatrix},
\]

it is clear that the monomial vector $e_i$ is reachable at time $k$ if and only if the reachability matrix $\mathcal{R}_k(A,B)$ includes an $i$th monomial column. Therefore:

Proposition 1. For the $n$-dimensional positive system (1), the following facts are equivalent:
• the system is reachable;
• the system is monomially reachable;
• there exists $k_f \in \mathbb{N}$ such that the reachability matrix $\mathcal{R}_{k_f}(A,B)$ includes an $n \times n$ monomial matrix.

All the results reported up to now are quite straightforward. A quite nontrivial step, instead, was taken by Coxson, Larson and Schneider [10] in proving that if a positive system (1) is reachable then the index k f , in the third item of the previous proposition, can always be chosen equal to the system dimension n.
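The resulting test (look for an n × n monomial submatrix among the columns of the reachability matrix at time n) is easy to sketch in code. The function names below are illustrative assumptions, and the two small systems in the usage lines are hypothetical, not taken from the text.

```python
def mat_mul(M, N):
    """Product of two matrices stored as lists of rows."""
    cols = list(zip(*N))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in M]


def reachability_columns(A, B, k):
    """Columns of R_k(A, B) = [B | AB | ... | A^{k-1} B], each as a list."""
    cols, block = [], B
    for _ in range(k):
        cols.extend(list(c) for c in zip(*block))
        block = mat_mul(A, block)
    return cols


def monomial_index(col):
    """Return i (0-based) if col has exactly one nonzero entry, in position i; else None."""
    support = [i for i, x in enumerate(col) if x != 0]
    return support[0] if len(support) == 1 else None


def is_reachable(A, B):
    """Check whether R_n(A, B) contains an i-th monomial column for every i."""
    n = len(A)
    found = {monomial_index(c) for c in reachability_columns(A, B, n)}
    return all(i in found for i in range(n))


# Hypothetical examples (assumptions for illustration only).
print(is_reachable([[0, 1], [0, 0]], [[0], [1]]))   # columns e_2, e_1
print(is_reachable([[1, 1], [0, 1]], [[0], [1]]))   # columns e_2, [1, 1]
```

Only the zero pattern of the columns matters, so the same check works for integer, rational or floating-point nonnegative entries (with exact zeros).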

Proposition 2. The $n$-dimensional positive system (1) is reachable if and only if the reachability matrix $\mathcal{R}_n(A,B)$ includes an $n \times n$ monomial matrix.

If we define the reachability index of a reachable positive system (1) as the smallest $k_f \in \mathbb{N}$ such that $\mathrm{Cone}(\mathcal{R}_{k_f}(A,B)) = \mathbb{R}^n_+$, the previous proposition tells us that the reachability index cannot exceed the system dimension. The proof of this result is rather involved and it resorts to the valuable graph-theoretic approach to the study of the structural properties of positive systems.

Indeed, to every $n$-dimensional system (1) with $m$ inputs we may associate [7, 8, 29] a digraph (directed graph) $\mathcal{D}(A,B)$, with $n$ vertices, indexed by $1, 2, \dots, n$, and $m$ sources $s_1, s_2, \dots, s_m$. There is an arc $(j,i)$ from $j$ to $i$ if and only if $[A]_{ij} > 0$, and there is an arc $(s_j, i)$ from the source $s_j$ to vertex $i$ if and only if $[B]_{ij} > 0$. A sequence $s_j \to i_0 \to i_1 \to \cdots \to i_{k-1}$, starting from the source $s_j$ and passing through the vertices $i_0, \dots, i_{k-1}$, is an s-path from $s_j$ to $i_{k-1}$ (of length $k$) provided $(s_j, i_0), (i_0, i_1), \dots, (i_{k-2}, i_{k-1})$ are all arcs of $\mathcal{D}(A,B)$. It is easily seen that there is a path of length $k$ from $s_j$ to some vertex $i$ if and only if the $(i,j)$th entry of $A^{k-1}B$ is positive.

Clearly, leaving from some source $s_j$, after $k$ steps one can reach several distinct vertices. This corresponds to saying that the $j$th column of $A^{k-1}B$ has, in general, more than one nonzero entry. We say that an s-path of length $k$ from $s_j$ deterministically reaches some vertex $i$ if no other vertex of the digraph can be reached in $k$ steps starting from $s_j$. If so, we refer to such an s-path as a deterministic path (of length $k$) to $i$. Again, it is obvious that a vertex $i$ can be deterministically reached from some source $s_j$ by means of a path of length $k$ if and only if the $j$th column of $A^{k-1}B$ is an $i$th monomial vector. So, monomial reachability (and hence reachability) of a positive system (1) can be tested by simply verifying that for each vertex $i \in \langle n \rangle$ there is a source and a deterministic path from that source reaching the vertex $i$.

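Example 1. Consider the positive system (1) with

\[
A = \begin{bmatrix}
0 & 1 & 0 & 0 & 0\\
1 & 0 & 0 & 0 & 0\\
0 & 0 & 0 & 1 & 0\\
0 & 0 & 0 & 0 & 1\\
0 & 0 & 0 & 0 & 0
\end{bmatrix},
\qquad
B = \begin{bmatrix}
0 & 0\\
1 & 0\\
1 & 0\\
0 & 0\\
0 & 1
\end{bmatrix}.
\]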

By inspecting the associated digraph D(A,B) one easily sees that, starting from s2, vertex 5 can be reached deterministically in one step, 4 in two steps and 3 in three steps. On the other hand, starting from s1, vertex 1 can be reached deterministically in two steps, while 2 in three steps. So, the system is reachable.

Fig. 1 Graph description of the system of Example 1 (digraph D(A,B) with vertices 1, ..., 5 and sources s1, s2; drawing not reproduced)
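The deterministic paths of Example 1 can be recovered mechanically from the columns of A^{k-1}B. The sketch below uses the matrices of Example 1 as given in the text; the function name `deterministic_reach_times` is an illustrative assumption.

```python
def mat_mul(M, N):
    """Product of two matrices stored as lists of rows."""
    cols = list(zip(*N))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in M]


# Matrices of Example 1.
A = [[0, 1, 0, 0, 0],
     [1, 0, 0, 0, 0],
     [0, 0, 0, 1, 0],
     [0, 0, 0, 0, 1],
     [0, 0, 0, 0, 0]]
B = [[0, 0],
     [1, 0],
     [1, 0],
     [0, 0],
     [0, 1]]


def deterministic_reach_times(A, B):
    """Map vertex i -> (k, j): smallest k <= n such that column j of A^{k-1} B
    is an i-th monomial vector (vertices and sources are numbered from 1)."""
    n = len(A)
    times, block = {}, B
    for k in range(1, n + 1):
        for j, col in enumerate(zip(*block), start=1):
            support = [i for i, x in enumerate(col, start=1) if x != 0]
            if len(support) == 1 and support[0] not in times:
                times[support[0]] = (k, j)
        block = mat_mul(A, block)
    return times


times = deterministic_reach_times(A, B)
print(times)
print(max(k for k, _ in times.values()))   # smallest k for which R_k has all monomial columns
```

For this system the sketch reports vertex 5 at step 1, vertex 4 at step 2 and vertex 3 at step 3 from s2, and vertex 1 at step 2 and vertex 2 at step 3 from s1, in agreement with the inspection of the digraph.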

Coxson, Larson and Schneider proved [10] (by using a slightly different terminology, though) that if there exists a deterministic path from some source $s_j$ to some vertex $i$, then there exists a deterministic path from $s_j$ to $i$ of length not greater than $n$, the number of vertices. This led to Proposition 2. This graph-theoretic interpretation turned out to be very profitable, as it allowed the derivation of canonical forms for reachable positive systems. A first result about canonical forms was derived in [29]. More refined results were later obtained by Bru and co-workers in [5, 7] (see also [6], where the concept of reachability index of a positive system (1) was generalized and characterized in graph-theoretic terms).

3 Reachability of Discrete 2D Positive Systems

A (discrete) two-dimensional (2D) positive system is a 2D state-space model described by the following first order state-updating equation [15]:

\[
x(h+1,k+1) = A_1 x(h,k+1) + A_2 x(h+1,k) + B_1 u(h,k+1) + B_2 u(h+1,k), \tag{2}
\]

where the $n$-dimensional local states $x(\cdot,\cdot)$ and the $m$-dimensional inputs $u(\cdot,\cdot)$ take nonnegative values, $A_1$ and $A_2$ are nonnegative $n \times n$ matrices, $B_1$ and $B_2$ are nonnegative $n \times m$ matrices, and the initial conditions are assigned by specifying the (nonnegative) values of the state vectors on the separation set $\mathcal{C}_0 := \{(h,k) : h,k \in \mathbb{Z},\ h + k = 0\}$, namely by assigning all local states of the initial global state $\mathcal{X}_0 := \{x(h,k) : (h,k) \in \mathcal{C}_0\}$. All input sequences involved have supports included in the half-plane $\{(h,k) \in \mathbb{Z} \times \mathbb{Z} : h + k \ge 0\}$.

For this class of systems (even when no positivity constraint is introduced) reachability represents a rather articulate concept [15, 16]. This is an immediate consequence of the fact that, when defining this concept, we may either refer to the local states or to the global states $\mathcal{X}_t := \{x(h,k) : (h,k) \in \mathcal{C}_t\}$, which collect all local states lying on the separation set $\mathcal{C}_t := \{(h,k) : h,k \in \mathbb{Z},\ h + k = t\}$.

Definition 3. A 2D positive system (2) is said to be
• locally reachable if, upon assuming $\mathcal{X}_0 = 0$, for every $x^* \in \mathbb{R}^n_+$ there exist $(h,k) \in \mathbb{Z} \times \mathbb{Z}$, $h + k > 0$, and a nonnegative input sequence $u(\cdot,\cdot)$ such that $x(h,k) = x^*$. When so, we will say that $x^*$ is reachable in $h + k$ steps;
• globally reachable if, upon assuming $\mathcal{X}_0 = 0$, for every global state $\mathcal{X}^*$ with entries in $\mathbb{R}^n_+$, there exist $N \in \mathbb{Z}_+$ and a nonnegative input sequence $u(\cdot,\cdot)$ such that the global state $\mathcal{X}_N$ coincides with $\mathcal{X}^*$. When so, we will say that $\mathcal{X}^*$ is reachable in $N$ steps.
If all local (global) states are reachable, system (2) is locally (globally) reachable, and the smallest number of steps which allows one to reach every nonnegative local (global) state represents its local (global) reachability index $I_{LR}$ ($I_{GR}$).

Clearly, as in the standard (nonpositive) case, global reachability ensures local reachability, while the converse is not true.

3.1 Local Reachability of 2D Positive Systems

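In order to characterize local reachability, we first introduce the reachability matrix in $k$ steps [15] of the 2D positive system (2), i.e.

\[
\begin{aligned}
\mathcal{R}_k(A_1,A_2,B_1,B_2)
&= [\,B_1 \mid B_2 \mid A_1B_1 \mid A_1B_2 + A_2B_1 \mid A_2B_2 \mid A_1^2B_1 \mid (A_1^{\,1}\sqcup\!\sqcup^{\,1}A_2)B_1 + A_1^2B_2 \mid \dots \mid A_2^{k-1}B_2\,] \\
&= \big[\,(A_1^{\,i-1}\sqcup\!\sqcup^{\,j}A_2)B_1 + (A_1^{\,i}\sqcup\!\sqcup^{\,j-1}A_2)B_2\,\big]_{i,j \ge 0,\ 0 < i+j \le k}.
\end{aligned}
\]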
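The Hurwitz products that appear in the blocks of the reachability matrix in k steps can be computed directly from their inductive definition. The following sketch assumes matrices stored as lists of rows; the names `hurwitz` and `reachability_blocks` and the small test matrices are illustrative assumptions, not part of the original text.

```python
def mat_mul(M, N):
    """Product of two matrices stored as lists of rows."""
    cols = list(zip(*N))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in M]


def mat_add(M, N):
    """Entrywise sum of two matrices of equal size."""
    return [[a + b for a, b in zip(r, s)] for r, s in zip(M, N)]


def hurwitz(A1, A2, i, j):
    """Hurwitz product of A1 (taken i times) and A2 (taken j times).

    A single recursion covers all cases of the definition: the product is
    zero for a negative index, the identity for i = j = 0, and otherwise
    A1 * H(i-1, j) + A2 * H(i, j-1).
    """
    n = len(A1)
    memo = {}

    def rec(i, j):
        if i < 0 or j < 0:
            return [[0] * n for _ in range(n)]
        if i == 0 and j == 0:
            return [[1 if r == c else 0 for c in range(n)] for r in range(n)]
        if (i, j) not in memo:
            memo[(i, j)] = mat_add(mat_mul(A1, rec(i - 1, j)),
                                   mat_mul(A2, rec(i, j - 1)))
        return memo[(i, j)]

    return rec(i, j)


def reachability_blocks(A1, A2, B1, B2, k):
    """Blocks of the reachability matrix in k steps, indexed by (i, j), 0 < i + j <= k."""
    blocks = {}
    for length in range(1, k + 1):
        for i in range(length + 1):
            j = length - i
            blocks[(i, j)] = mat_add(mat_mul(hurwitz(A1, A2, i - 1, j), B1),
                                     mat_mul(hurwitz(A1, A2, i, j - 1), B2))
    return blocks


# Hypothetical matrices, used only to exercise the functions.
A1, A2 = [[1, 2], [0, 1]], [[0, 1], [1, 0]]
B1, B2 = [[1], [0]], [[0], [1]]
blocks = reachability_blocks(A1, A2, B1, B2, 2)
print(blocks[(1, 1)])   # equals A2*B1 + A1*B2
```

A convenient sanity check is the identity stated in the Notation paragraph: summing the Hurwitz products over all pairs with i + j = l gives the l-th power of A1 + A2.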

Proposition 3. [18] Given a 2D positive system (2), the following facts are equivalent:
• the system is locally reachable;
• the system is “locally monomially reachable”, i.e. all monomial vectors can be reached (starting from $\mathcal{X}_0 = 0$) by means of nonnegative inputs;
• there exists $k \in \mathbb{N}$ such that the reachability matrix in $k$ steps, $\mathcal{R}_k(A_1,A_2,B_1,B_2)$, includes an $n \times n$ monomial submatrix;
• there exist $n$ pairs $(h_i,k_i) \in \mathbb{Z}_+ \times \mathbb{Z}_+$ and $n$ indices $j = j(i) \in \langle m \rangle$, $i \in \langle n \rangle$, s.t.

\[
\mathrm{ZP}\big( (A_1^{\,h_i-1}\sqcup\!\sqcup^{\,k_i}A_2)B_1e_j + (A_1^{\,h_i}\sqcup\!\sqcup^{\,k_i-1}A_2)B_2e_j \big) = \{i\}.
\]

If so,

\[
I_{LR} = \max_i \min_{h_i,k_i} \big\{ h_i + k_i : \exists\, j = j(i) \text{ s.t. } (A_1^{\,h_i-1}\sqcup\!\sqcup^{\,k_i}A_2)B_1e_j + (A_1^{\,h_i}\sqcup\!\sqcup^{\,k_i-1}A_2)B_2e_j \text{ is an } i\text{th monomial vector} \big\}.
\]

As for positive systems (1), local reachability of 2D positive systems is a structural property, meaning that it only depends on the nonzero patterns of the system matrices and not on the specific values of their nonzero elements. Consequently, it can be investigated by resorting to a graph-theoretic approach. To every 2D positive system (2), of size $n$, with $m$ inputs, we associate a 2D influence digraph $\mathcal{D}^{(2)}(A_1,A_2,B_1,B_2)$ with $n$ vertices, $1, 2, \dots, n$, and $m$ sources $s_1, s_2, \dots, s_m$. There is an $A_1$-arc (an $A_2$-arc) from $j$ to $i$ if and only if the $(i,j)$th entry of $A_1$ (of $A_2$) is nonzero. There is a $B_1$-arc (a $B_2$-arc) from $s_j$ to $i$ if and only if the $(i,j)$th entry of $B_1$ (of $B_2$) is nonzero.

Example 2. The positive system with a single input described by the matrices

\[
(A_1, A_2, B_1, B_2) = \left(
\begin{bmatrix} 1 & 5 & 0 & 0\\ 0 & 0 & 0 & 0\\ 1 & 0 & 0 & 0\\ 0 & 0 & 0 & 0 \end{bmatrix},
\begin{bmatrix} 0 & 0 & 0 & 0\\ 0 & 0 & 4 & 0\\ 2 & 0 & 0 & 1\\ 0 & 0 & 0 & 0 \end{bmatrix},
\begin{bmatrix} 0\\ 0\\ 0\\ 1 \end{bmatrix},
\begin{bmatrix} 2\\ 0\\ 0\\ 0 \end{bmatrix}
\right) \tag{3}
\]

corresponds to the 2D digraph, with 4 vertices and a single source, of Fig. 2. $A_1$-arcs and $B_1$-arcs have been represented by means of thick lines, while $A_2$-arcs and $B_2$-arcs by means of thin lines.

Fig. 2 2D influence digraph corresponding to (3) (drawing not reproduced)

A path $p$ in $\mathcal{D}^{(2)}(A_1,A_2,B_1,B_2)$ is a sequence of adjacent arcs and, in particular, an $s_j$-path is a path which originates from the source $s_j$. A path $p$ in a 2D digraph is specified by assigning its vertices and the type of arcs they are connected by. If we denote by $|p|_1$ the number of $A_1$-arcs and $B_1$-arcs and by $|p|_2$ the number of $A_2$-arcs and $B_2$-arcs occurring in $p$, then $[|p|_1\ |p|_2]$ is the composition of $p$ and $|p| = |p|_1 + |p|_2$ its length.

Once we have introduced these concepts, local reachability admits an interesting and useful characterization in terms of the 2D influence digraph associated with the system. Indeed, saying that $(A_1^{\,h_i-1}\sqcup\!\sqcup^{\,k_i}A_2)B_1e_j + (A_1^{\,h_i}\sqcup\!\sqcup^{\,k_i-1}A_2)B_2e_j$ is an $i$th monomial vector just means that the set of $s_j$-paths $p$ of composition $[|p|_1\ |p|_2] = [h_i\ k_i]$ is not empty and each of them reaches the vertex $i$ alone. If so, we will say that the vertex $i$ is deterministically reached by all $s_j$-paths of composition $[h_i\ k_i]$. As a consequence, the 2D system (2) is locally reachable if and only if for every $i \in \langle n \rangle$ there exists $j = j(i)$ such that the vertex $i$ is deterministically reached by all $s_j$-paths of some composition $[h_i\ k_i]$. Moreover,

\[
I_{LR} = \max_i \min_{h_i,k_i} \big\{ h_i + k_i : \exists\, j = j(i) \text{ s.t. all } s_j\text{-paths of composition } [h_i\ k_i] \text{ deterministically reach } i \big\}.
\]

Even though the graph-theoretic approach represents a noteworthy tool in the study of the local reachability index of a locally reachable 2D positive system, unfortunately no upper bound on the local reachability index has been derived so far. Nonetheless, upper bounds for specific classes of systems have been derived in [18] and in [1, 2]. As a result, we can now at least claim [2] that the upper bound on $I_{LR}$ is not smaller than

\[
I_{LR} = \Big( \Big\lfloor \frac{n}{2} \Big\rfloor + 1 \Big)\Big( n + 1 - \Big\lfloor \frac{n}{2} \Big\rfloor \Big).
\]

3.2 Global Reachability of 2D Positive Systems

When addressing global reachability, it suffices, again, to focus on the reachability of those global states which consist of all zero (local) states except for one of them, which coincides with $e_i$, $i \in \langle n \rangle$ (“global monomial reachability” for the class of 2D positive systems). Moreover, the reachability of such global states can be, again, described in terms of conditions on certain columns of the reachability matrix.

Proposition 4. [18] Given a 2D positive system (2), the following facts are equivalent:
• the system is globally reachable;
• the system is “globally monomially reachable”;
• there exist $n$ pairs $(h_i,k_i) \in \mathbb{Z}_+ \times \mathbb{Z}_+$ and $n$ indices $j = j(i) \in \langle m \rangle$, $i \in \langle n \rangle$, s.t.

\[
\mathrm{ZP}\big( (A_1^{\,h_i-1}\sqcup\!\sqcup^{\,k_i}A_2)B_1e_j + (A_1^{\,h_i}\sqcup\!\sqcup^{\,k_i-1}A_2)B_2e_j \big) = \{i\}, \tag{4}
\]
\[
\mathrm{ZP}\big( (A_1^{\,h-1}\sqcup\!\sqcup^{\,k}A_2)B_1e_j + (A_1^{\,h}\sqcup\!\sqcup^{\,k-1}A_2)B_2e_j \big) = \emptyset, \qquad \forall\, (h,k) \neq (h_i,k_i) \text{ with } h + k = h_i + k_i. \tag{5}
\]

Proposition 4 can be interpreted in graph-theoretic terms: the 2D system (2) is globally reachable if and only if for every $i \in \langle n \rangle$ there exists $j = j(i) \in \langle m \rangle$ such that the vertex $i$ is deterministically reached by all $s_j$-paths of a given composition $[h_i\ k_i]$, and no $s_j$-path exists having the same length $h_i + k_i$ and a different composition. Moreover,

\[
I_{GR} = \max_i \min_{h_i,k_i} \big\{ h_i + k_i : \exists\, j = j(i) \text{ s.t. all } s_j\text{-paths of composition } [h_i\ k_i] \text{ deterministically reach } i \text{ and there is no } s_j\text{-path of length } h_i + k_i \text{ and different composition} \big\}.
\]

In [18] a canonical form for globally reachable positive 2D systems has been obtained. Moreover, it has been proved that the global reachability index of a (globally reachable) 2D system never exceeds the system dimension $n$, a result which makes global reachability easy to test, unlike local reachability.
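Conditions (4) and (5) translate into a finite search, because the global reachability index never exceeds n. The sketch below handles the single-input case only and reuses the column recursion c(h,k) = A1 c(h-1,k) + A2 c(h,k-1), c(1,0) = B1, c(0,1) = B2; the function names are illustrative assumptions, and the outcome reported for the matrices of (3) is what the sketch computes, not a statement made in the text.

```python
def mat_vec(M, v):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def impulse_columns(A1, A2, b1, b2, max_len):
    """Columns c[(h, k)] of the 2D reachability matrix (single input), 0 < h + k <= max_len."""
    zero = [0] * len(b1)
    c = {(1, 0): list(b1), (0, 1): list(b2)}
    for length in range(2, max_len + 1):
        for h in range(length + 1):
            k = length - h
            left = mat_vec(A1, c[(h - 1, k)]) if h >= 1 else zero
            right = mat_vec(A2, c[(h, k - 1)]) if k >= 1 else zero
            c[(h, k)] = [a + b for a, b in zip(left, right)]
    return c


def globally_reached(c):
    """Map vertex i -> shortest length h + k such that c[(h, k)] is an i-th monomial
    vector (condition (4)) and every other column of the same length is zero
    (condition (5))."""
    best = {}
    for (h, k), col in c.items():
        support = [i for i, x in enumerate(col, start=1) if x != 0]
        if len(support) != 1:
            continue
        length = h + k
        others_vanish = all(not any(c[(g, length - g)])
                            for g in range(length + 1) if g != h)
        if others_vanish:
            i = support[0]
            best[i] = min(best.get(i, length), length)
    return best


def is_globally_reachable(A1, A2, b1, b2):
    n = len(b1)
    return len(globally_reached(impulse_columns(A1, A2, b1, b2, n))) == n


# Matrices of Example 2, equation (3).
A1 = [[1, 5, 0, 0], [0, 0, 0, 0], [1, 0, 0, 0], [0, 0, 0, 0]]
A2 = [[0, 0, 0, 0], [0, 0, 4, 0], [2, 0, 0, 1], [0, 0, 0, 0]]
print(is_globally_reachable(A1, A2, [0, 0, 0, 1], [2, 0, 0, 0]))

# Hypothetical scalar system (an assumption): only the B1-arc exists.
print(is_globally_reachable([[0]], [[0]], [1], [0]))
```

For the matrices of (3) condition (5) fails already at length 1, since both B1 and B2 are nonzero, so vertex 4 (reachable only through the B1-arc) cannot be globally reached.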

4 Reachability of Discrete-Time Positive Switched Systems

A discrete-time positive switched system is described, at each time instant $k \in \mathbb{Z}_+$, by a first-order difference equation of the following type:

\[ x(k+1) = A_{\sigma(k)} x(k) + B_{\sigma(k)} u(k), \tag{6} \]

where $x(k)$ and $u(k)$ denote the $n$-dimensional state variable and the $m$-dimensional input variable, respectively, at the time instant $k$, while $\sigma$ is a switching sequence, defined on $\mathbb{Z}_+$ and taking values in a finite set $\mathcal{P} = \langle p \rangle$. For each $i \in \mathcal{P}$, the pair $(A_i, B_i)$ represents a discrete-time positive system (1), which means that $A_i \in \mathbb{R}^{n\times n}_+$ and $B_i \in \mathbb{R}^{n\times m}_+$.

The definition of reachability for discrete-time positive switched systems may be given by suitably adjusting the definition given in [19, 30], in order to introduce the nonnegativity constraint on the state and input variables.

Definition 4. Given the positive switched system (6), a state $x_f \in \mathbb{R}^n_+$ is said to be reachable if there exist $k_f \in \mathbb{N}$, a switching sequence $\sigma : [0, k_f - 1] \to \mathcal{P}$ and an input sequence $u : [0, k_f - 1] \to \mathbb{R}^m_+$ that lead the state trajectory from $x(0) = 0$ to $x(k_f) = x_f$. The positive switched system (6) is monomially reachable if every monomial vector is reachable, and reachable if every state $x_f \in \mathbb{R}^n_+$ is reachable.

We refer to the cardinality of the discrete time interval $[0, k_f - 1]$ as the length $|\sigma|$ of the switching sequence $\sigma$ (in this case, $|\sigma| = k_f$). When the reachability property is ensured, a natural goal one may want to pursue is that of determining the maximum number of steps required to reach every nonnegative state.

Definition 5. Given a reachable positive switched system (6), we define its reachability index as

\[ I_R := \sup_{x \in \mathbb{R}^n_+} \min\{k : x \text{ is reachable at time } k\}. \]

As we will see, reachable systems can be found endowed with an infinite $I_R$. This fact represents a significant difference with respect to both standard switched systems and positive systems. Another significant fact, which makes switched positive systems different from the previous ones we considered, is that monomial reachability is no longer sufficient for reachability. This is rather intuitive as, indeed, nonnegative combinations of switching sequences do not generally lead to admissible switching sequences (as they do not take values in $\mathcal{P}$).

To explore monomial reachability and reachability, it is first convenient to provide the explicit expression of the state at any time instant $k \in \mathbb{N}$, starting from the initial condition $x(0) = 0$, under the effect of the input sequence $u(0), u(1), \dots, u(k-1)$, and of the switching sequence $\sigma(0), \sigma(1), \dots, \sigma(k-1)$. It turns out (see, for instance, [19]) that

\[
x(k) = A_\sigma^{k-1 \mid 1} B_{\sigma(0)} u(0) + A_\sigma^{k-1 \mid 2} B_{\sigma(1)} u(1) + \dots + B_{\sigma(k-1)} u(k-1), \tag{7}
\]

where we have resorted to the following shorthand notation:

\[
A_\sigma^{k-1 \mid l} :=
\begin{cases}
A_{\sigma(k-1)} A_{\sigma(k-2)} \cdots A_{\sigma(l)}, & \text{if } l < k;\\
I_n, & \text{if } l = k.
\end{cases}
\]

It is immediately seen that, when the input sequence $u(\cdot)$ is nonnegative, the state at the time instant $k$ belongs to the polyhedral cone generated by the (columns of the) matrices $A_\sigma^{k-1 \mid l} B_{\sigma(l-1)}$, as $l$ ranges from 1 to $k$, namely to the cone generated by the columns of the reachability matrix associated with the switching sequence $\sigma$ of length $k$:

\[
\mathcal{R}_k(\sigma) := \big[\, B_{\sigma(k-1)} \mid A_\sigma^{k-1 \mid k-1} B_{\sigma(k-2)} \mid \dots \mid A_\sigma^{k-1 \mid 2} B_{\sigma(1)} \mid A_\sigma^{k-1 \mid 1} B_{\sigma(0)} \,\big].
\]

When dealing with standard discrete-time switched systems, it has been proved [19] that the system is reachable if and only if there exists a switching sequence $\sigma$ (of length say $k$) such that $\mathrm{Im}(\mathcal{R}_k(\sigma)) = \mathbb{R}^n$. For positive switched systems, instead, if a switching sequence $\sigma$ of length $k$ exists such that $\mathrm{Cone}(\mathcal{R}_k(\sigma)) = \mathbb{R}^n_+$, then the system is reachable, but the converse is not true [27]. Even the weaker condition that there exists a finite number of switching sequences of finite lengths, such that the union of the cones generated by the columns of their reachability matrices covers the positive orthant, is only sufficient for the system reachability.

Proposition 5. [27] If there exist switching sequences $\sigma_1, \sigma_2, \dots, \sigma_\ell$ (of lengths $k_1, k_2, \dots, k_\ell$, respectively) such that $\bigcup_{i=1}^{\ell} \mathrm{Cone}(\mathcal{R}_{k_i}(\sigma_i)) = \mathbb{R}^n_+$, the positive switched system (6) is reachable.

Example 3. [27] Consider the positive system, switching among the following subsystems:

\[
(A_1, B_1) = \left( \begin{bmatrix} 0 & 0\\ 0 & 0 \end{bmatrix}, \begin{bmatrix} 1\\ 0 \end{bmatrix} \right),
\qquad
(A_2, B_2) = \left( \begin{bmatrix} 1 & 1\\ 0 & 1 \end{bmatrix}, \begin{bmatrix} 0\\ 1 \end{bmatrix} \right).
\]

It is clearly seen that every 1st monomial vector $x_f = [x_1\ 0]^T$, $x_1 > 0$, can be reached in a single step, by setting $\sigma_1(0) = 1$ (and $u(0) = x_1$). On the other hand, for every $x_f = [x_1\ x_2]^T \ge 0$, with $x_2 > 0$, there exists a sufficiently large $k \in \mathbb{Z}_+$, $k \ge 2$, such that

\[
x_f \in \mathrm{Cone}\left( \begin{bmatrix} 0 & 1 & 2 & \dots & k-1\\ 1 & 1 & 1 & \dots & 1 \end{bmatrix} \right) = \mathrm{Cone}(\mathcal{R}_k(\sigma_2)),
\]

where $\sigma_2(i) = 2$ for every $i \in [0, k-1]$. In particular, from Eq. (7) together with the expression of $\mathcal{R}_k(\sigma_2)$ we may deduce that $x_1(k) \le (k-1)\,x_2(k)$. Thus, $x_f$ can be reached in a minimum of $k = \lceil x_1/x_2 \rceil + 1$ steps. As a particular case, when $x_1 = 0$ (hence $k = 1$) it is sufficient to set $u(0) = x_2$; if $k > 1$, then $x_f$ can be reached by setting $u(0) = \frac{x_1}{k-1}$ and $u(k-1) = x_2 - u(0)$, where the nonnegativity of $u(k-1)$ is ensured by the definition of $k$. So, $x_f$ can be reached in $k$ steps. This ensures that

\[
\mathbb{R}^2_+ = \mathrm{Cone}(\mathcal{R}_1(\sigma_1)) \cup \bigcup_{k=0}^{+\infty} \mathrm{Cone}(\mathcal{R}_k(\sigma_2)),
\]

and hence the system is reachable. However, since every nonnegative vector in $\mathbb{R}^2_+$ which is not a 1st monomial vector can only be reached by steadily setting the switching sequence to the value 2, we may deduce that:

1) for every finite $k \in \mathbb{Z}_+$, $\mathrm{Cone}(\mathcal{R}_1(\sigma_1)) \cup \mathrm{Cone}(\mathcal{R}_k(\sigma_2)) \neq \mathbb{R}^2_+$, thus proving that Proposition 5 gives only a sufficient condition;
2) there is no upper bound on $\sup_{x \in \mathbb{R}^2_+} \min\{k : \exists\, \sigma \text{ with } |\sigma| = k \text{ such that } x \in \mathrm{Cone}(\mathcal{R}_k(\sigma))\} = I_R$.

Thus the system is reachable, but $I_R$ is not finite. Luckily, monomial reachability is easily captured.
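The step count and the input construction of Example 3 can be verified with exact rational arithmetic. The sketch below covers only targets with x2 > 0, keeps the switching sequence fixed at subsystem 2, and uses illustrative function names that are assumptions of the sketch.

```python
from fractions import Fraction
from math import ceil

# Subsystem 2 of Example 3 (single input, so B2 is stored as a vector).
A2 = [[1, 1], [0, 1]]
B2 = [0, 1]


def min_steps(x1, x2):
    """Minimum number of steps k = ceil(x1 / x2) + 1 needed to reach [x1, x2], x2 > 0."""
    return ceil(Fraction(x1) / Fraction(x2)) + 1


def inputs_for(x1, x2):
    """Input sequence u(0), ..., u(k-1) from the example: only u(0) and u(k-1) are nonzero."""
    k = min_steps(x1, x2)
    if k == 1:                       # happens exactly when x1 = 0
        return [Fraction(x2)]
    u = [Fraction(0)] * k
    u[0] = Fraction(x1) / (k - 1)
    u[k - 1] = Fraction(x2) - u[0]
    return u


def simulate(inputs):
    """State reached from the origin when sigma(i) = 2 at every step."""
    x = [Fraction(0), Fraction(0)]
    for u in inputs:
        Ax = [sum(a * xi for a, xi in zip(row, x)) for row in A2]
        x = [ax + b * u for ax, b in zip(Ax, B2)]
    return x


u = inputs_for(5, 2)
print(min_steps(5, 2), u, simulate(u))
```

Since the number of steps grows like x1/x2, targets with a small second entry require arbitrarily long switching sequences, which is the mechanism behind the infinite reachability index.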

Proposition 6. [27] The switched positive system (6) is monomially reachable if and only if there exists $N \in \mathbb{N}$ such that the reachability matrix in $N$ steps

\[
\mathcal{R}_N = \big[\, w(A_1,\dots,A_p)B_1 \ \ w(A_1,\dots,A_p)B_2 \ \ \dots \ \ w(A_1,\dots,A_p)B_p \,\big]_{w \in \Xi^*,\ 0 \le |w| \le N-1}
\]

includes an $n \times n$ monomial submatrix.

At this point a natural question arises: if the system is monomially reachable and we let N denote the minimum positive integer such that RN includes an n × n monomial matrix, what is the maximum value that N may reach? This amounts to defining a “monomial reachability index” and to searching for an upper bound on it. It turns out [27] that the upper bound is 2n − 1 and it is strict, meaning that examples can be given of (single-input) positive switched systems whose monomial reachability index has just that value.
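For small systems the monomial reachability index can be found by brute force, enumerating the images of all words of increasing length. The sketch below is only an illustration: the function name, the search cap `max_N` and the test systems are assumptions, and the enumeration grows like p^N, so it is not meant for large N.

```python
def mat_mul(M, N):
    """Product of two matrices stored as lists of rows."""
    cols = list(zip(*N))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in M]


def monomial_reachability_index(As, Bs, max_N):
    """Smallest N <= max_N such that the reachability matrix in N steps contains an
    i-th monomial column for every i; None if no such N is found.

    As = [A_1, ..., A_p], Bs = [B_1, ..., B_p], all stored as lists of rows.
    """
    n = len(As[0])
    found = set()
    # Images w(A_1, ..., A_p) of all words of length N - 1; starts with the empty word.
    current = [[[1 if r == c else 0 for c in range(n)] for r in range(n)]]
    for N in range(1, max_N + 1):
        for W in current:
            for B in Bs:
                for col in zip(*mat_mul(W, B)):
                    support = [i for i, x in enumerate(col) if x != 0]
                    if len(support) == 1:
                        found.add(support[0])
        if len(found) == n:
            return N
        current = [mat_mul(W, A) for W in current for A in As]
    return None


# Hypothetical two-mode system (an assumption for illustration).
As = [[[0, 0], [1, 0]], [[0, 0], [0, 0]]]
Bs = [[[1], [0]], [[0], [0]]]
print(monomial_reachability_index(As, Bs, max_N=3))
```

Here B_1 supplies the first monomial vector at N = 1, and the word of length one mapped to A_1 supplies the second at N = 2.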

References

1. Bailo, E., Bru, R., Gelonch, J., Romero, S.: On the reachability index of positive 2-D systems. IEEE Transactions on Circuits and Systems II 5(10), 1–7 (2006)
2. Bailo, E., Gelonch, J., Romero, S.: An upper bound on the reachability index for a special class of positive 2-D systems. Electronic Journal of Linear Algebra 18, 1–12 (2009)
3. Barker, G.P.: Theory of cones. Lin. Alg. Appl. 39, 263–291 (1981)
4. Berman, A., Plemmons, R.J.: Nonnegative matrices in the mathematical sciences. Academic Press, New York (1979)
5. Bru, R., Caccetta, L., Rumchev, V.G.: Monomial subgraphs of reachable and controllable positive discrete–systems. International Journal of Applied Mathematics and Computer Science 15, 159–166 (2005)
6. Bru, R., Coll, C., Romero, S., Sánchez, E.: Reachability indices of positive linear systems. Electronic Linear Algebra 11, 88–102 (2004)
7. Bru, R., Romero, S., Sánchez, E.: Canonical forms for positive discrete-time linear control systems. Linear Algebra & its Appl. 310, 49–71 (2000)
8. Brualdi, R.A., Ryser, H.J.: Combinatorial matrix theory. Cambridge Univ. Press, Cambridge (1991)
9. Conner, L.T., Stanford, D.P.: The structure of the controllable set for multimodal systems. Linear Algebra & its Appl. 95, 171–180 (1987)
10. Coxson, P.G., Larson, L.C., Schneider, H.: Monomial patterns in the sequence A^k b. Lin. Alg. Appl. 94, 89–101 (1987)
11. Coxson, P.G., Shapiro, H.: Positive reachability and controllability of positive systems. Lin. Alg. Appl. 94, 35–53 (1987)
12. Fanti, M.P., Maione, B., Turchiano, B.: Controllability of linear single-input positive discrete time systems. Int. J. of Control 50, 2523–2542 (1989)
13. Fanti, M.P., Maione, B., Turchiano, B.: Controllability of multi-input positive discrete time systems. Int. J. of Control 51, 1295–1308 (1990)
14. Farina, L., Rinaldi, S.: Positive linear systems: theory and applications. Series on Pure and Applied Mathematics. Wiley-Interscience, New York (2000)
15. Fornasini, E., Marchesini, G.: Doubly indexed dynamical systems. Math. Sys. Theory 12, 59–72 (1978)
16. Fornasini, E., Marchesini, G.: Global properties and in 2-D systems. Systems & Control Letters 2(1), 30–38 (1982)
17. Fornasini, E., Valcher, M.E.: On the positive reachability of 2D positive systems. In: Farina, L., Benvenuti, L., De Santis, A. (eds.) Positive Systems. LNCIS, pp. 297–304 (2003)
18. Fornasini, E., Valcher, M.E.: Controllability and reachability of 2D positive systems: a graph theoretic approach. IEEE Trans. Circuits and Systems, Part I: Regular Papers 52(3), 576–585 (2005)
19. Ge, S.S., Sun, Z., Lee, T.H.: Reachability and controllability of switched linear discrete-time systems. IEEE Trans. Aut. Contr. 46(9), 1437–1441 (2001)
20. Conner Jr., L.T., Stanford, D.P.: The structure of the controllable set for multi-modal systems. Linear Algebra & its Appl. 95, 171–180 (1987)
21. Kaczorek, T.: Reachability and controllability of 2D positive linear systems with state feedback. Control and Cybernetics 29(1), 141–151 (2000)
22. Kaczorek, T.: Positive 1D and 2D systems. Springer, London (2002)
23. Maeda, H., Kodama, S.: Positive realization of difference equations. IEEE Trans. Circ. Sys. CAS-28, 39–47 (1981)
24. Rumchev, V.G., James, D.J.G.: Controllability of positive linear discrete time systems. Int. J. of Control 50, 845–857 (1989)
25. Salomaa, A., Soittola, M.: Automata theoretic aspects of formal power series. Springer, Heidelberg (1978)
26. Santesso, P., Valcher, M.E.: Reachability properties of discrete-time positive switched systems. In: Proceedings of the 45th Conference on Decision and Control (CDC 2006), San Diego (CA), pp. 4087–4092 (2006)
27. Santesso, P., Valcher, M.E.: Monomial reachability and zero-controllability of discrete-time positive switched systems. Systems and Control Letters 57, 340–347 (2008)
28. Stanford, D.P., Conner Jr., L.T.: Controllability and stabilizability in multi-pair systems. SIAM J. Contr. Optim. 18(5), 488–497 (1980)
29. Valcher, M.E.: Controllability and reachability criteria for discrete time positive systems. Int. J. of Control 65, 511–536 (1996)
30. Xie, G., Wang, L.: Reachability realization and stabilizability of switched linear discrete-time systems. J. Math. Anal. Appl. 280, 209–220 (2003)

On the Positive LQ-Problem for Linear Discrete Time Systems

Charlotte Beauthier and Joseph J. Winkin

Abstract. The finite horizon Linear-Quadratic (LQ) optimal control problem with nonnegative state constraints (denoted by LQ+) is studied for positive linear systems in discrete time. Necessary and sufficient optimality conditions are obtained by using the maximum principle. These conditions lead to a computational method for the solution of the LQ+ problem by means of a corresponding Hamiltonian system. In addition, necessary and sufficient conditions are reported for the LQ+-optimal control to be given by the standard LQ-optimal state feedback law. Sufficient conditions are also reported for the positivity of the LQ-optimal closed-loop system. In particular, such conditions are obtained for the problem of minimal energy control with penalization of the final state. Moreover a positivity criterion for the LQ-optimal closed-loop system is derived for positive systems with a positively invertible (dynamics) generator.

1 Introduction

An important question in system and control theory is the invariance of the nonnegative orthant of the state space for linear systems. When they satisfy that property, such systems are called positive linear systems. They encompass controlled dynamical models where all the variables, i.e. the state and output variables, should remain nonnegative for any nonnegative initial conditions and input functions. An overview of the state of the art in positive systems theory is given e.g. in [9], [16] and [19]. Typical examples of positive systems are economic models, chemical processes or age-structured populations (see e.g. [9, 10]).

Charlotte Beauthier and Joseph J. Winkin University of Namur (FUNDP), Department of mathematics, Rempart de la Vierge, 8, 5000 Namur, Belgium, e-mail: [email protected], [email protected]

R. Bru and S. Romero-Vivó (Eds.): Positive Systems, LNCIS 389, pp. 45–53. springerlink.com © Springer-Verlag Berlin Heidelberg 2009

Several system theoretic problems have already been investigated for positive systems. In particular the LQ problem with nonnegative control constraints has been studied for (general) linear systems: see e.g. [13] and references therein for the LQ problem with positive controls and [16] for the minimal energy positive control problem for reachable positive systems.

When synthesizing a feedback control law for a positive system, it is meaningful, from the modeling point of view, to aim at keeping the positivity property of the open-loop system for the designed closed-loop system. In [8], using a controllable block companion transformation, sufficient conditions on the weighting matrices of a quadratic cost criterion are established to ensure that the closed-loop system is positive. This idea was generalized in [14] in order to remove the restrictive positivity assumption that was required on such a transformation.

Here we report some results concerning the finite-horizon positive linear quadratic (LQ+) problem for positive linear discrete time systems [4]. The infinite horizon LQ+ problem in continuous time is studied in [17] by means of a Newton-type iterative scheme which is inspired by the one developed in [11], and the LQ+ problem for positive linear continuous time systems is studied in [3]. Due to the lack of space, only hints or short proofs are given for the main results and numerical examples are omitted. More details are available in [4].

2 Problem Statement

Let $X$ and $Y$ be matrices in $\mathbb{R}^{p\times q}$. The property that, for all $i = 1, \dots, p$ and for all $j = 1, \dots, q$, $x_{ij} \ge y_{ij}$ ($x_{ij} > y_{ij}$, respectively) is denoted by $X \ge Y$ ($X \gg Y$, respectively). A matrix $M \in \mathbb{R}^{p\times q}$ is said to be nonnegative (strictly positive, respectively) if $M \ge 0$ ($M \gg 0$, respectively). In particular, these notations and definitions obviously apply to the case $q = 1$, i.e. to vectors $x \in \mathbb{R}^p$. Consider the following linear time-invariant system description in discrete time, denoted by $[A, B]$:

\[ x_{i+1} = A x_i + B u_i, \quad i = 0, \dots, N-1, \qquad x_0 = \hat x_0 \ge 0, \tag{1} \]

where the state $x_i$ and the control $u_i$ are in $\mathbb{R}^n$ and $\mathbb{R}^m$ respectively, $A$ and $B$ are real matrices and $\hat x_0 \in \mathbb{R}^n$ denotes any fixed initial state.

Definition 1. The system $[A, B]$ given by (1) is said to be positive if, for all initial conditions $\hat x_0 \ge 0$ and for all controls $(u_i)_{i=0}^{N-1} \ge 0$, the state trajectories are nonnegative, i.e. for all $i = 0, \dots, N$, $x_i \ge 0$.

The following characterization is well-known (see e.g. [9], [16]).

Proposition 1. The system $[A, B]$ is positive if and only if $A$ and $B$ are nonnegative matrices.

The finite horizon positive LQ problem in discrete time, which will be denoted by $LQ_+^N$, consists of minimizing the quadratic functional

\[
J\big(\hat x_0, (u_i)_{i=0}^{N-1}\big) := \frac{1}{2} \left[ \sum_{i=0}^{N-1} \big( \|R^{1/2} u_i\|^2 + \|C x_i\|^2 \big) + x_N^T S x_N \right] \tag{2}
\]

for a given positive linear system described by (1), where the initial state $\hat x_0 \ge 0$ is fixed, under the constraints

\[ \forall\, i \in \{0, \dots, N\}, \quad x_i \ge 0, \tag{3} \]

where $N$ is a fixed final time, $R \in \mathbb{R}^{m\times m}$ is a symmetric positive definite matrix, $C$ belongs to $\mathbb{R}^{p\times n}$ and $S \in \mathbb{R}^{n\times n}$ is a symmetric positive semidefinite matrix.

In other words, the $LQ_+^N$ problem consists of minimizing a quadratic functional for a given positive system while requiring that the state trajectories be nonnegative for any fixed nonnegative initial state, whence the positivity property should be kept for the optimal state trajectories. In this framework, it is not required that the input function $(u_i)_{i=0}^{N-1}$ be nonnegative.
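To make the problem statement concrete, the following sketch evaluates the cost (2) along the trajectory of (1) for a given control sequence and checks the state constraint (3). It is only an evaluation routine, not a solver; the function names and the scalar data are hypothetical. The term ‖R^{1/2}u‖² is computed as uᵀRu, which is valid because R is symmetric positive definite.

```python
def mat_vec(M, v):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def quad(M, v):
    """Quadratic form v^T M v."""
    return sum(vi * wi for vi, wi in zip(v, mat_vec(M, v)))


def lq_cost(A, B, C, R, S, x0, controls):
    """Return (J, states): the cost (2) and the trajectory x_0, ..., x_N of (1)."""
    x = list(x0)
    states = [x]
    running = 0.0
    for u in controls:
        Cx = mat_vec(C, x)
        running += quad(R, u) + sum(c * c for c in Cx)
        x = [a + b for a, b in zip(mat_vec(A, x), mat_vec(B, u))]
        states.append(x)
    return 0.5 * (running + quad(S, x)), states


def satisfies_state_constraint(states):
    """Constraint (3): every state of the trajectory is entrywise nonnegative."""
    return all(xi >= 0 for x in states for xi in x)


# Hypothetical scalar data with N = 2 (an assumption for illustration).
A, B, C, R, S = [[1]], [[1]], [[1]], [[1]], [[1]]
J, states = lq_cost(A, B, C, R, S, [1], [[-1], [0]])
print(J, states, satisfies_state_constraint(states))
```

Note that the admissible control in this toy case is negative at the first step, which is allowed: the constraint bears on the states only. A larger negative control, such as u_0 = -2, would drive the state out of the nonnegative orthant and violate (3).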

3 Optimality Conditions

Applying the discrete time maximum principle with state constraints (see e.g. [12]), i.e. the Karush-Kuhn-Tucker optimality conditions, yields a characterization, together with a computational procedure, for an $LQ_+^N$ optimal control.

Theorem 1 (Optimality conditions based on the Maximum Principle).
a) The $LQ_+^N$ problem has a solution $(u_i)_{i=0}^{N-1}$ if and only if there exist multipliers $\lambda_i$ such that $u_i = -R^{-1}B^T p_i$, $i = 0,\dots,N-1$, where $[x_i^T \; p_i^T]^T \in \mathbb{R}^{2n}$ is the solution of the recurrent Hamiltonian equation

$$\begin{bmatrix} x_i \\ p_i \end{bmatrix} = H \begin{bmatrix} x_{i+1} \\ p_{i+1} \end{bmatrix} - \begin{bmatrix} 0 \\ \lambda_i \end{bmatrix}, \quad i = N-1,\dots,0,$$

with $x_0 = \hat{x}_0$, $p_N = S x_N - \lambda_N$, where

$$H = \begin{bmatrix} A^{-1} & A^{-1}BR^{-1}B^T \\ C^TCA^{-1} & A^T + C^TCA^{-1}BR^{-1}B^T \end{bmatrix}$$

is the Hamiltonian matrix, and for all $i = 0,\dots,N$, $x_i \ge 0$, $\lambda_i \ge 0$ and $\lambda_i^T x_i = 0$ (complementarity condition).
b) By using the matrix form of the recurrent Hamiltonian equation, $(u_i)_{i=0}^{N-1}$ is a solution of the $LQ_+^N$ problem if and only if there exist multiplier matrices $\Lambda_i$ such that $u_i = K_i(\hat{x}_0)\, x_i := -R^{-1}B^T Y_i X_i^{-1} x_i$, $i = 0,\dots,N-1$, where

$$\begin{bmatrix} X_i \\ Y_i \end{bmatrix} = H \begin{bmatrix} X_{i+1} \\ Y_{i+1} \end{bmatrix} - \begin{bmatrix} 0 \\ \Lambda_i \end{bmatrix}, \quad i = 0,\dots,N-1,$$

with the final condition $X_N = I$ and $Y_N = S - \Lambda_N$, and for all $i = 0,\dots,N$,

$$\Lambda_i X_0^{-1} \hat{x}_0 \ge 0, \tag{4}$$

$$\hat{x}_0^T X_0^{-T} \Lambda_i^T X_i X_0^{-1} \hat{x}_0 = 0 \quad \text{(complementarity condition)} \tag{5}$$

and

$$X_i X_0^{-1} \hat{x}_0 \ge 0. \tag{6}$$

Hints. a) This result follows directly from the Karush-Kuhn-Tucker optimality conditions with state constraints (by using the discrete-time analogue of e.g. [12, Theorem 4.1]), for necessity, and from the fact that the functional (2) is convex and the dynamics and inequality constraints (1) and (3) are defined by linear functions, for sufficiency.
b) This proof is a straightforward extension of the one of [7, Theorem 167, pp. 63-66]. The main fact is the invertibility of the matrices $X_i$, which can be proved by using an evaluation lemma, as in [7, Corollary 134, p. 61]: see [2]. □

Remark 1. a) The terminology used here is borrowed from [7]. The optimality conditions in Theorem 1 (a) are also called Euler-Lagrange equations in the classical optimization literature. Observe also that $H$ is a symplectic matrix.
b) A priori, in view of conditions (4)-(6), the function $K_i(\hat{x}_0)$ in Theorem 1 (b) clearly depends upon the choice of the initial state $\hat{x}_0$. Stronger conditions are needed in order to make it independent of the initial state, i.e. such that the optimal control law be of the state feedback type $u_i = K_i x_i$. Such conditions are reported next.

Proposition 2. The conditions (4)-(6) are satisfied for all initial states $\hat{x}_0 \ge 0$ if and only if the following conditions hold for all $i = 0,\dots,N$:

$$\Lambda_i X_0^{-1} \ge 0, \tag{7}$$

$$\Lambda_i^T X_i + X_i^T \Lambda_i = 0 \tag{8}$$

and

$$X_i X_0^{-1} \ge 0. \tag{9}$$

The proof of this result is based on the following lemma.

Lemma 1. A matrix $M \in \mathbb{R}^{n\times n}$ is a skew-symmetric matrix, i.e. $M = -M^T$, if and only if

$$\text{for all } x \ge 0, \quad x^T M x = 0. \tag{10}$$

Proof of Proposition 2. The fact that conditions (4) and (6) hold for all $\hat{x}_0 \ge 0$ is obviously equivalent to conditions (7) and (9). By Lemma 1, condition (5) holds for all $\hat{x}_0 \ge 0$ if and only if the matrix $X_0^{-T} \Lambda_i^T X_i X_0^{-1}$ is skew-symmetric, or equivalently $\Lambda_i^T X_i$ is a skew-symmetric matrix, i.e. (8) holds. □

Remark 2. a) Conditions (7)-(9) can be hard to check in general. However, they obviously hold with $\Lambda_i = 0$ in an important particular case. See Corollary 1 below.
b) The optimality conditions in Theorem 1 and Proposition 2 also hold for linear systems (1) that are not positive. However, the positivity assumption plays a crucial role for obtaining the criteria reported in Sect. 4.

In view of the analysis above, it is easy to get conditions such that the $LQ_+^N$ problem has a solution. These conditions are based on the standard problem, which will be denoted by $LQ^N$ and which consists of minimizing the quadratic functional (2) for a given positive linear system described by (1) (without any nonnegativity constraint on the state trajectory). It is well known that its solution is given by $u_i = K_i x_i := -R^{-1}B^T Y_i X_i^{-1} x_i$, $i = 0,\dots,N-1$, where $[X_i^T \; Y_i^T]^T \in \mathbb{R}^{2n\times n}$ is the solution of the matrix recurrent Hamiltonian equation

$$\begin{bmatrix} X_i \\ Y_i \end{bmatrix} = H \begin{bmatrix} X_{i+1} \\ Y_{i+1} \end{bmatrix}, \qquad \begin{bmatrix} X_N \\ Y_N \end{bmatrix} = \begin{bmatrix} I \\ S \end{bmatrix}. \tag{11}$$

Equivalently, the solution of the $LQ^N$ problem is given, for all $i = 0,\dots,N-1$, by $u_i = -R^{-1}B^T P_{i+1}\,[I + BR^{-1}B^T P_{i+1}]^{-1} x_i$, where $P_i$ is the solution of the recurrent Riccati equation (RRE), $i = N,\dots,1$ (see e.g. [7]):

$$P_{i-1} = C^TC + A^T P_i A - A^T P_i B\,(I + R^{-1}B^T P_i B)^{-1} R^{-1} B^T P_i A, \qquad P_N = S. \tag{12}$$
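The backward recursion of the recurrent Riccati equation can be sketched numerically. The code below is a hedged illustration restricted to a single input (B is a column b and R is a scalar r > 0), so that the inverse in the recursion reduces to a scalar division, using the identity (I + R^{-1}B^T P B)^{-1}R^{-1} = (R + B^T P B)^{-1}. The gains are computed in the standard textbook form K_i = -(r + b^T P_{i+1} b)^{-1} b^T P_{i+1} A, which is an assumption of this sketch; its indexing convention may differ from the one used in the text. The function name `riccati_backward` and the data are hypothetical.

```python
# Illustrative sketch only: single-input finite-horizon Riccati recursion.

def matmul(X, Y):
    """Matrix product of two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def transpose(X):
    return [list(col) for col in zip(*X)]

def riccati_backward(A, b, C, r, S, N):
    """Return (P, K) with P[N] = S and, for i = N, ..., 1,
    P[i-1] = C^T C + A^T P[i] A - (A^T P[i] b)(b^T P[i] A) / (r + b^T P[i] b).
    K[i-1] is the textbook gain row built from P[i] (an assumption, see lead-in)."""
    n = len(A)
    At = transpose(A)
    CtC = matmul(transpose(C), C)
    P = [None] * (N + 1)
    K = [None] * N
    P[N] = [row[:] for row in S]
    for i in range(N, 0, -1):
        Pi = P[i]
        Pb = [sum(Pi[k][j] * b[j] for j in range(n)) for k in range(n)]    # P_i b
        denom = r + sum(b[k] * Pb[k] for k in range(n))                   # r + b^T P_i b
        AtPA = matmul(At, matmul(Pi, A))                                  # A^T P_i A
        AtPb = [sum(At[k][j] * Pb[j] for j in range(n)) for k in range(n)]  # A^T P_i b
        P[i - 1] = [[CtC[k][l] + AtPA[k][l] - AtPb[k] * AtPb[l] / denom
                     for l in range(n)] for k in range(n)]
        K[i - 1] = [-AtPb[l] / denom for l in range(n)]                   # gain row
    return P, K

# Scalar check that can be done by hand: a = 2, b = 1, c = 0, r = 1, S = 1, N = 2.
P_scalar, K_scalar = riccati_backward([[2.0]], [1.0], [[0.0]], 1.0, [[1.0]], 2)

# Hypothetical two-state example.
A = [[0.5, 0.2], [0.1, 0.7]]
P2, K2 = riccati_backward(A, [1.0, 0.0], [[1.0, 0.0]], 2.0,
                          [[1.0, 0.0], [0.0, 1.0]], 3)
```

In the scalar case the recursion gives P_2 = 1, P_1 = 4 - 4/2 = 2 and P_0 = 8 - 16/3 = 8/3, which is easy to confirm by hand.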

Corollary 1 (Optimality conditions based on admissibility). The solution of the (standard) $LQ^N$ problem is a solution of the $LQ_+^N$ problem for all $\hat{x}_0 \ge 0$ if and only if the $LQ^N$ optimal state trajectories are admissible, i.e. nonnegative for all $\hat{x}_0 \ge 0$, or equivalently, one of the following equivalent conditions holds:
a) The standard closed-loop matrix $A + BK_i$ is a nonnegative matrix for all $i = 0,\dots,N-1$, i.e.

$$\forall\, k, l, \; \forall\, i = 0,\dots,N-1, \quad (BR^{-1}B^T P_i)_{kl} \le a_{kl}. \tag{13}$$

b) The matrix solution of the matrix recurrent Riccati equation (12) is such that for all $i = 0,\dots,N$, $X_i X_0^{-1} \ge 0$.

Hints. Corollary 1 follows from Theorem 1 and Proposition 2 by applying the discrete time version of a known characterization of the positivity of homogeneous linear time-varying systems in continuous time, see e.g. [1] and [15]. In addition, the solution of the $LQ^N$ problem is given as in Theorem 1 where the multiplier matrices $\Lambda_i$ are identically equal to zero. □
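Condition a) of Corollary 1 is an entrywise test on the closed-loop matrices, which the following minimal sketch makes concrete. The gain matrices are supplied by the caller (for instance from a Riccati recursion); the two gain sequences below are made-up numbers chosen only to show one admissible and one non-admissible case, and the names `closed_loop` and `admissible` are hypothetical.

```python
# Illustrative sketch only: entrywise nonnegativity of A + B K_i.

def closed_loop(A, B, K):
    """Return the closed-loop matrix A + B K (lists of rows)."""
    n, m = len(A), len(B[0])
    return [[A[k][l] + sum(B[k][j] * K[j][l] for j in range(m))
             for l in range(n)] for k in range(n)]

def admissible(A, B, gains):
    """True if A + B K_i >= 0 entrywise for every gain K_i in the sequence."""
    return all(entry >= 0
               for K in gains
               for row in closed_loop(A, B, K)
               for entry in row)

A = [[0.5, 0.2], [0.1, 0.7]]
B = [[1.0], [0.0]]

# Hypothetical gain sequences (one row per input, one column per state).
mild_gains = [[[-0.3, -0.1]], [[-0.2, 0.0]]]
strong_gains = [[[-0.3, -0.1]], [[-0.6, -0.1]]]

mild_ok = admissible(A, B, mild_gains)
strong_ok = admissible(A, B, strong_gains)
```

With the stronger feedback the (1,1) entry of the closed-loop matrix becomes 0.5 - 0.6 < 0, so the closed-loop system is no longer positive and the unconstrained optimal control would not be admissible for every nonnegative initial state.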

4 Positivity Criteria

In this section, the $LQ^N$ problem is studied with the aim of finding conditions on the problem data such that the standard closed-loop system is positive, i.e. such that the conditions of Corollary 1 hold. This can be interpreted as solving an inverse $LQ_+^N$ problem.

4.1 Minimal Energy Control

In the sequel, $\sigma(A)$ and $\rho(A)$ denote the spectrum and the spectral radius of a matrix $A$, respectively. The matrix norm that is used here is the one induced by the Euclidean vector norm. Consider the particular problem of minimal energy control with penalization of the final state, i.e. the $LQ^N$ problem (1)-(2) where $C$ is equal to zero. By computing the expression of $P_i$ in terms of the matrix solution of the recurrent Hamiltonian equation, we obtain the following result:

Theorem 2. If $A \gg 0$ and if the spectral radius $\rho(S)$ of the final state penalty matrix is sufficiently small such that

$$\rho(S) = \max_{\mu_i \in \sigma(S)} \mu_i < \gamma := \begin{cases} \dfrac{\lambda_{\min}(R)\,(1-\sigma)}{\|B\|^2}, & \text{if } \sigma < 1, \\[2mm] \dfrac{\lambda_{\min}(R)\,(\sigma-1)}{\|B\|^2\,\sigma^N}, & \text{if } \sigma > 1, \\[2mm] \dfrac{\lambda_{\min}(R)}{\|B\|^2\,N}, & \text{if } \sigma = 1, \end{cases} \tag{14}$$

where $\sigma := \sigma_{\min}(A)\,\sigma_{\max}(A)$, with $\sigma_{\min}(A)$ ($\sigma_{\max}(A)$, respectively) denoting the smallest (the largest, respectively) singular value of $A$ and $\lambda_{\min}(R) := \min\{\lambda : \lambda \in \sigma(R)\}$, then the $LQ^N$ closed-loop system is positive and therefore the solution of the $LQ^N$ problem is a solution of the $LQ_+^N$ problem.

Hints. The positivity constraint on the closed-loop matrix can be written in terms of the solution $P_i$ of the RRE (see condition (13)), where $B \ge 0$. In addition,

$$P_i = Y_i X_i^{-1} = (A^T)^{N-i}\, S\, [I + G(N,i)\,S]^{-1} A^{N-i},$$

where

$$G(N,i) := \sum_{k=0}^{N-i-1} (A^{-1})^{i-N+k+1}\, B R^{-1} B^T (A^T)^{N-i-k-1},$$

and, for $\sigma \ne 1$,

$$\|G(N,i)\,S\| \le \frac{\|B\|^2}{\lambda_{\min}(R)}\, \frac{1-\sigma^{N-i}}{1-\sigma}\, \rho(S) \le \frac{\rho(S)}{\gamma}.$$

Thus, if (14) holds, then

$$\big\|S\,[I + G(N,i)\,S]^{-1}\big\| \le \frac{\rho(S)}{1 - \dfrac{\rho(S)}{\gamma}},$$

whence, by choosing $\rho(S)$ sufficiently small, condition (13) will hold, since $\forall\, k, l$, $a_{kl} > 0$ and the sequences $\big((A^T)^{N-i}\big)_{i=0}^{N}$ and $\big(A^{N-i}\big)_{i=0}^{N}$ are bounded. □

Remark 3. a) If $\sigma \ge 1$ and if the time horizon $N$ is increased, $\rho(S)$ has to be decreased accordingly for condition (14) to be satisfied with a fixed matrix $R$. This reveals a tradeoff between positivity and stability of the closed-loop system in a receding horizon approach.
b) The minimal energy control problem with nonnegative controls and with a final state equality constraint is solved in [16, Subsection 3.4.1] for reachable systems. Here we use a penalization term in the cost instead of a final state constraint, it is not assumed that the system is reachable, and it is not required that the input function $(u_i)_{i=0}^{N-1}$ be nonnegative.
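The threshold in condition (14) is a closed-form expression, so the tradeoff described in Remark 3 a) can be tabulated directly. The sketch below assumes the three-case reading of (14), namely λ_min(R)(1 − σ)/‖B‖² for σ < 1, λ_min(R)(σ − 1)/(‖B‖² σ^N) for σ > 1 and λ_min(R)/(‖B‖² N) for σ = 1. The quantities λ_min(R), ‖B‖ and σ are passed in as plain numbers rather than computed from matrices, and the function name `gamma_bound` is hypothetical.

```python
# Illustrative sketch only: evaluates the threshold gamma under the
# three-case reading of (14) stated in the lead-in.

def gamma_bound(lambda_min_R, norm_B, sigma, N):
    """Upper bound gamma on rho(S), given lambda_min(R), ||B||, sigma and N."""
    base = lambda_min_R / norm_B ** 2
    if sigma < 1:
        return base * (1 - sigma)
    if sigma > 1:
        return base * (sigma - 1) / sigma ** N
    return base / N

# For sigma > 1 the admissible terminal penalty shrinks as the horizon grows.
bounds_by_horizon = [gamma_bound(1.0, 1.0, 2.0, N) for N in (3, 5, 10)]
```

For σ < 1 the bound does not depend on N, whereas for σ ≥ 1 it decreases with the horizon, which is the behavior noted in Remark 3 a).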

4.2 Nonnegative Hamiltonian Matrix

Theorem 3. If the Hamiltonian matrix $H$ and the penalty matrix $S$ are nonnegative and if the solution of the matrix recurrent Hamiltonian equation is such that $X_0^{-1} \ge 0$, then the $LQ^N$ closed-loop system is positive and therefore the solution of the $LQ^N$ problem is a solution of the $LQ_+^N$ problem.

Proof. Multiplying the matrix recurrent Hamiltonian equation (11) on the right by $X_0^{-1}$ gives

$$\begin{bmatrix} X_i X_0^{-1} \\ Y_i X_0^{-1} \end{bmatrix} = H \begin{bmatrix} X_{i+1} X_0^{-1} \\ Y_{i+1} X_0^{-1} \end{bmatrix} \quad \text{with} \quad \begin{bmatrix} X_N X_0^{-1} \\ Y_N X_0^{-1} \end{bmatrix} = \begin{bmatrix} X_0^{-1} \\ S X_0^{-1} \end{bmatrix}.$$

It follows by induction that, for all $i = 0,\dots,N$, $X_i X_0^{-1} \ge 0$. Then, by using Corollary 1 (b), one gets the conclusion. □

4.3 Monomial Systems

A nonnegative matrix $M$ is monomial if $M$ is a diagonal matrix up to a permutation, i.e. $M = DP = \mathrm{diag}[m_i]_{i=1}^{n}\, P$, where $D$ is a positive definite diagonal matrix and $P$ is a permutation matrix, or equivalently $M^{-1} \ge 0$, see e.g. [5] and [18].
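The definition of a monomial matrix translates into a simple structural test: the matrix is square and nonnegative, with exactly one positive entry in each row and each column. The following sketch checks this and builds the (nonnegative) inverse explicitly; the function names `is_monomial` and `monomial_inverse` are hypothetical and the matrices are made-up examples.

```python
# Illustrative sketch only: structural test for monomial matrices.

def is_monomial(M):
    """True if M is square, entrywise nonnegative, and has exactly one
    positive entry in every row and every column."""
    n = len(M)
    if any(len(row) != n for row in M):
        return False
    if any(entry < 0 for row in M for entry in row):
        return False
    rows_ok = all(sum(1 for e in row if e > 0) == 1 for row in M)
    cols_ok = all(sum(1 for row in M if row[j] > 0) == 1 for j in range(n))
    return rows_ok and cols_ok

def monomial_inverse(M):
    """Inverse of a monomial matrix: transpose the pattern, invert the entries."""
    n = len(M)
    inv = [[0.0] * n for _ in range(n)]
    for i, row in enumerate(M):
        for j, e in enumerate(row):
            if e > 0:
                inv[j][i] = 1.0 / e
    return inv

M = [[0.0, 2.0, 0.0],
     [0.0, 0.0, 0.5],
     [4.0, 0.0, 0.0]]
M_inv = monomial_inverse(M)
not_monomial = [[1.0, 1.0], [0.0, 1.0]]
```

The inverse has the transposed sparsity pattern with reciprocal entries, so it is again nonnegative, in line with the equivalence stated above.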

Definition 2. A positive system $[A, B]$, described by (1), is said to be monomial if $A$ is a monomial matrix and $B$ is of the form

$$B = \begin{bmatrix} \mathrm{diag}[b_i]_{i=1}^{m} \\ 0_{(n-m)\times m} \end{bmatrix}.$$

Definition 3. Let $L$ and $M$ be monomial matrices. $L$ and $M$ are said to be structurally similar, denoted by $L \overset{s}{=} M$, if and only if there exist positive definite diagonal matrices $D_1$ and $D_2$ such that $L = D_1 M D_2$.

It is easy to check that "$\overset{s}{=}$" is an equivalence relation on the set of monomial matrices. The following straightforward result will be needed below:

Lemma 2. Let $L$ and $M$ be monomial matrices. Let $P$ be a permutation matrix such that $L = D_1 P$ and $M = P D_2$, where $D_1$ and $D_2$ are positive definite diagonal matrices. Then $L$ and $M$ are structurally similar.

Theorem 4. Consider a monomial system described by (1) and the quadratic cost (2) where $C$, $R$ and $S$ are diagonal matrices. Then the $LQ^N$ closed-loop system is positive and therefore the solution of the $LQ^N$ problem is a solution of the $LQ_+^N$ problem.

Proof. By using the explicit form of $H$ where $A = DP$, $P^T = P^{-1}$ and the fact that $C^TC\,P^{-1} \overset{s}{=} P^{-1}\,C^TC$ (see Lemma 2), one gets:

$$H = \begin{bmatrix} P^{-1} & 0 \\ 0 & P^{-1} \end{bmatrix} \begin{bmatrix} D^{-1} & D^{-1}BR^{-1}B^T \\ C^TC\,\bar{D}\,D^{-1} & D + C^TC\,\bar{D}\,D^{-1}BR^{-1}B^T \end{bmatrix}$$

where $\bar{D}$ is a positive definite diagonal matrix. Therefore $H$ is nonnegative. Now, by using the matrix recurrent Hamiltonian equation with this expression of $H$, it can be shown by induction that, for all $i = 0,\dots,N-1$, $X_i = (P^{-1})^{N-i} D_{X,i}$ and $Y_i = (P^{-1})^{N-i} D_{Y,i}$, where $D_{X,i}$ and $D_{Y,i}$ are positive definite diagonal matrices. Therefore, $X_0 = (P^{-1})^{N} D_0$ where $D_0$ is a positive definite diagonal matrix. Hence $X_0$ is a monomial matrix and $X_0^{-1} \ge 0$. It follows by Theorem 3 that the $LQ^N$ closed-loop system is positive. □

5 Concluding Remarks

The results reported in this paper have been obtained for the finite horizon $LQ_+$ problem in discrete time. Their possible extensions to the infinite horizon problem are currently under investigation. The continuous time case was studied in [3]. Another perspective for this work is the case where the positivity of the open-loop system is not required. Indeed, one can observe that the analysis and the results in Sect. 3 and Subsect. 4.1 can be generalized to the case where the system $[A, B]$ is not positive.

Acknowledgements. The authors wish to thank the reviewers of this paper for their helpful suggestions and comments.

References

1. Angeli, D., Sontag, E.D.: Monotone control systems. IEEE Transactions on Automatic Control 48(10), 1684–1698 (2003)
2. Beauthier, C.: Le Problème linéaire quadratique positif. Mémoire de DEA (Master Thesis), FUNDP, Namur (2006)
3. Beauthier, C., Winkin, J.J.: Finite horizon LQ-optimal control for continuous time positive systems. In: Proceedings of the Eighteenth International Symposium on Mathematical Theory of Networks and Systems (MTNS 2008), Virginia Tech, Blacksburg, Virginia, USA (2008)
4. Beauthier, C., Winkin, J.J.: LQ-optimal control of positive linear systems (submitted 2009)
5. Berman, A., Plemmons, R.J.: Inverses of nonnegative matrices. Linear and Multilinear Algebra 2, 161–172 (1974)
6. Bixby, R.E.: Implementation of the simplex method: the initial basis. ORSA Journal on Computing 4(3) (1992)
7. Callier, F.M., Desoer, C.A.: Linear System Theory. Springer, New York (1991)
8. Castelein, R., Johnson, A.: Constrained optimal control. IEEE Transactions on Automatic Control 34(1), 122–126 (1989)
9. Farina, L., Rinaldi, S.: Positive Linear Systems. John Wiley, New York (2000)
10. Godfrey, K.: Compartmental Models and Their Applications. Academic Press, London (1983)
11. Guo, C.-h., Laub, A.J.: On a Newton-like Method for Solving Algebraic Riccati Equations. SIAM J. Matrix Anal. Appl. 21(2), 694–698 (2000)
12. Hartl, R.F., Sethi, S.P., Vickson, R.G.: A survey of the maximum principles for optimal control problems with state constraints. SIAM Review 37(2), 181–218 (1995)
13. Heemels, W.P.M.H., Van Eijndhoven, S.J.L., Stoorvogel, A.A.: Linear quadratic regulator problem with positive controls. Int. J. Control 70(4), 551–578 (1998)
14. Johnson, A.: LQ state-constrained control. In: Proceedings of the IEEE/IFAC Joint Symposium on Computer-Aided Control System Design, pp. 423–428 (1994)
15. Kaczorek, T.: Externally and internally positive time-varying linear systems. Int. J. Appl. Math. Comput. Sci. 11(4), 957–964 (2001)
16. Kaczorek, T.: Positive 1D and 2D Systems. Springer, London (2002)
17. Laabissi, M., Winkin, J., Beauthier, C.: On the positive LQ-problem for linear continuous-time systems. In: Proceedings of the 2nd Multidisciplinary International Symposium on Positive Systems: Theory and Applications (POSTA 2006), Grenoble, France. LNCIS, pp. 295–302. Springer, Heidelberg (2006)
18. Plemmons, R.J., Cline, R.E.: The generalized inverse of a nonnegative matrix. In: Proceedings of the American Mathematical Society, vol. 31(1) (1972)
19. Van Schuppen, J.H.: Control and System Theory of Positive Systems. Lecture Notes (2007)

The Importance of Being Positive: Admissible Dynamics for Positive Systems

Luca Benvenuti and Lorenzo Farina

Abstract. Positive linear systems display peculiar dynamics due to the positivity constraints on input, state and output variables. In this paper we review such peculiarities for externally and internally positive linear systems. The properties of externally positive systems are shown in terms of poles and zeros location and input–output response, and those of internally positive systems in terms of eigenvalues location. Open problems are also presented. The presentation style of this paper is very informal, aiming to convey to the reader just a taste of the “importance of being positive”.

1 Introduction

Positivity constraints on a system’s variables are often found in engineering applications. In fact, this is the case for any variable representing any possible type of resource measured by a quantity such as time [11, 29], money and goods [17, 19], buffer size and queues [27], data packets flowing in a network [23], human, animal and plant populations [20], concentration of any substance [13, 15, 28] including mRNAs, proteins and molecules [8], electric charge [2, 6, 12], and light intensity levels [3, 22]. Moreover, probabilities are also positive quantities, so that hidden Markov models [1, 26] and phase-type distribution models [24, 25] are subject to positivity constraints as well. Such constraints impose limitations on the input-output dynamics and also on the parameters describing the system’s behavior, that is, on poles, zeros and eigenvalues location. These limitations are quite stringent and often make it possible to characterize qualitatively the behavior of the system and therefore dramatically simplify its analysis.

Luca Benvenuti and Lorenzo Farina, Dipartimento di Informatica e Sistemistica “A. Ruberti”, Sapienza Università di Roma, via Ariosto 25, 00185 Roma, Italy, e-mail: [email protected],[email protected]

R. Bru and S. Romero-Vivó (Eds.): Positive Systems, LNCIS 389, pp. 55–62. springerlink.com © Springer-Verlag Berlin Heidelberg 2009

In this paper we briefly review the most important features of positive SISO linear systems by considering externally and internally positive systems.

2 External and Internal Positivity

Given a discrete-time SISO linear system, its “internal” representation is given by

$$x(k+1) = A x(k) + b u(k), \qquad y(k) = c x(k),$$

and its “external” representation (transfer function) by

$$H(z) = c\,(zI - A)^{-1} b,$$

where $H(z)$ is the Z-transform of the system impulse response $h(k)$. In the case of a continuous-time SISO linear system, the “internal” representation is

$$\dot{x}(t) = A x(t) + b u(t), \qquad y(t) = c x(t),$$

and the “external” representation (transfer function) is

$$H(s) = c\,(sI - A)^{-1} b,$$

where $H(s)$ is the Laplace transform of the system impulse response $h(t)$.

A system is said to be externally positive if for any nonnegative input and zero initial state, the output is always nonnegative. By contrast, a system is said to be internally positive if for any nonnegative input and nonnegative initial state, the state trajectory and the output are always nonnegative. It is plain that an internally positive system is also externally positive, whereas the converse does not always hold. The problem of finding conditions for an externally positive system to be also internally positive is the so-called positive realization problem, which is reviewed in the tutorial paper [4]. In the next sections we will illustrate the properties of externally positive linear systems in terms of poles and zeros location and input–output response, and those of internally positive linear systems in terms of eigenvalues location.
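For a discrete-time system with zero initial state, the impulse response generated by the internal representation is h(0) = 0 and h(k) = c A^{k-1} b for k ≥ 1, so external positivity can at least be probed on a finite window of samples. The sketch below does this for two made-up systems; the function name `impulse_response` and the data are hypothetical, and a finite check of this kind is only a necessary test, not a proof of external positivity.

```python
# Illustrative sketch only: first samples of h(k) = c A^(k-1) b, k = 1, ..., K.

def impulse_response(A, b, c, K):
    """Return [h(1), ..., h(K)] for x(k+1) = A x(k) + b u(k), y(k) = c x(k)."""
    samples = []
    v = list(b)                                   # v holds A^(k-1) b
    for _ in range(K):
        samples.append(sum(ci * vi for ci, vi in zip(c, v)))
        v = [sum(a * x for a, x in zip(row, v)) for row in A]
    return samples

# Internally positive example: A, b, c are entrywise nonnegative.
h_pos = impulse_response([[0.5, 0.2], [0.1, 0.7]], [1.0, 0.0], [0.0, 1.0], 6)

# A rotation by 90 degrees: the impulse response changes sign.
h_rot = impulse_response([[0, 1], [-1, 0]], [1, 0], [1, 0], 4)
```

The first system has nonnegative samples, as it must since it is internally positive, while the rotation example produces a negative sample at k = 3 and is therefore not externally positive.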

3 Dynamic Properties of Externally Positive Linear Systems

We first review two time-domain properties of externally positive linear systems.

The first property refers to nonnegativity of the impulse response, which stems immediately from the definition itself. In fact, the Dirac delta function is a nonnegative input. As shown in [14], this property can be exploited in the design of a feedback control system that maximizes the size of an unknown-but-bounded disturbance without violating any hard constraint.

Another interesting property is the monotonicity of the step response, since it implies that no undershoot and overshoot are present. This may be a desired property in the design of a control system in several applications, such as a manipulator performing pick and place operations close to a wall, filling a tank with a fluid in minimum time without spilling over, or temperature control in a hazardous environment [7, 9].

We consider now two frequency-domain properties of externally positive linear systems, the first related to the location of the transfer function poles, the second to the location of zeros.

The first property refers to the location of the dominant poles, where a dominant pole (or eigenvalue) is that of maximum modulus [maximal real part] for discrete-time systems [continuous-time systems]. The property simply states that one of the dominant poles is real. Moreover, for discrete-time systems, such a real dominant pole is nonnegative. In fact, if only complex dominant poles were present, then the long term behavior of the impulse response would be oscillating, thus contradicting the nonnegativity assumption. The same reasoning applies for negative real dominant poles for discrete-time systems [10]. It is worth noting that complex pairs can be dominant poles, as illustrated by the following examples. Consider the discrete-time system with transfer function

$$H(z) = \frac{2}{z-1} + \frac{z - \cos\varphi}{z^2 - (2\cos\varphi)\,z + 1} \tag{1}$$

whose poles are $1$ and $e^{\pm i\varphi}$ and are all dominant. The transfer function has a pair of dominant complex poles and its impulse response

$$h(k) = 2 + \cos[(k-1)\varphi]$$

is clearly nonnegative for all $k \ge 1$. As an additional example, consider the continuous-time system with transfer function

$$H(s) = \frac{2}{s+1} + \frac{1}{s^2 + 2s + 2} \tag{2}$$

whose poles are $-1$ and $-1 \pm i$ and are all dominant. The transfer function has a pair of dominant complex poles and its impulse response

$$h(t) = e^{-t}\,(2 + \sin t)$$

is clearly nonnegative for all $t \ge 0$.

The second property refers to the location of real zeros. More precisely, the real zeros of a continuous-time [discrete-time] externally positive linear system are smaller in value [in modulus] than the dominant real pole. In fact, the transfer function of the system is

$$H(s) = \int_0^{\infty} h(t)\, e^{-st}\, dt$$

and since $h(t)$ is nonnegative and $e^{-st}$ is positive for any real value of $s$, $H(s)$ cannot be zero for any real value of $s$ in the region of convergence of the Laplace transform, that is, for any real $s$ greater than the real dominant pole [10].

Finally, we end this section by stating a simple but useful criterion for checking BIBO stability for externally positive systems. In fact, an externally positive continuous-time linear system is BIBO stable if and only if all the coefficients of the transfer function denominator have the same sign. To prove the sufficiency¹ of this assertion, consider that if the system is not BIBO stable, then it has a nonnegative dominant real pole, thus arriving at a contradiction, since a monic polynomial with positive coefficients cannot be zero when evaluated at a nonnegative real number such as the dominant real pole of the system (Frobenius pole).
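The discrete-time example with poles at 1 and e^{±iφ} can be checked numerically by feeding a unit impulse through the difference equations of its two terms, 2/(z − 1) and (z − cos φ)/(z² − 2 cos φ z + 1), and comparing the result with the closed form 2 + cos[(k − 1)φ]. The following sketch does this with the standard library only; the function name `impulse_response_example` and the choice φ = 1 rad (so that φ/π is irrational) are assumptions made for illustration.

```python
import math

# Illustrative sketch only: impulse response of
# H(z) = 2/(z - 1) + (z - cos(phi)) / (z^2 - 2 cos(phi) z + 1).

def impulse_response_example(phi, K):
    """Return [h(0), ..., h(K)] obtained from the two difference equations."""
    c = math.cos(phi)
    u = [1.0] + [0.0] * K                 # unit impulse applied at k = 0
    y1 = [0.0] * (K + 1)                  # output of 2 / (z - 1)
    y2 = [0.0] * (K + 1)                  # output of the second-order term
    for k in range(1, K + 1):
        y1[k] = y1[k - 1] + 2.0 * u[k - 1]
        y2_prev2 = y2[k - 2] if k >= 2 else 0.0
        u_prev2 = u[k - 2] if k >= 2 else 0.0
        y2[k] = 2.0 * c * y2[k - 1] - y2_prev2 + u[k - 1] - c * u_prev2
    return [a + b for a, b in zip(y1, y2)]

phi = 1.0
K = 50
h = impulse_response_example(phi, K)
closed_form = [2.0 + math.cos((k - 1) * phi) for k in range(1, K + 1)]
max_error = max(abs(h[k] - closed_form[k - 1]) for k in range(1, K + 1))
```

Every sample h(k), k ≥ 1, lies between 1 and 3, so the response is nonnegative even though the complex pair on the unit circle is dominant.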

4 Dynamic Properties of Internally Positive Linear Systems

It is well known that positivity imposes a specific sign pattern on the dynamic system matrix $A$. In particular, an internally positive discrete-time linear system is fully characterized by having a nonnegative matrix $A$, while an internally positive continuous-time linear system is characterized by having a Metzler matrix $A$ (i.e. a matrix with nonnegative off-diagonal entries). Moreover, an internally positive linear system is such that the vectors $b$ and $c$ are nonnegative.

Nonnegativity of the matrix $A$ enforces strong conditions on the admissible location of the eigenvalues of an internally positive discrete-time linear system [5, 21]. We begin by discussing the location of dominant eigenvalues described by the celebrated Perron-Frobenius theorem (see [21]).² In fact, the Perron-Frobenius theorem, which holds for irreducible nonnegative matrices, can be used to gain insight also into the case of a generic nonnegative matrix, since this matrix can be reduced, by reordering state variables, to a block triangular matrix with irreducible diagonal blocks. Moreover, the spectrum of such a matrix is the union of the spectra of the diagonal blocks, so that the dominant eigenvalues of a nonnegative square matrix $A$ of dimension $n$ are among the roots of

$$\lambda^k - \rho(A)^k = 0 \tag{3}$$

for some (possibly more than one) values of $k = 1,\dots,n$. In particular, one of the dominant eigenvalues is positive real³ and any other dominant eigenvalue has a degree⁴ not greater than that of the dominant positive real eigenvalue. From this result it follows that there does not exist an internally positive linear system having the externally positive system transfer function (1) when $\varphi/\pi$ is irrational. In fact, there is no integer $k$ such that equation (3) holds, that is, $e^{ik\varphi} - 1 = 0$.

¹ Necessity of such a condition is known to hold also for non externally positive systems.
² The case of nonnegative nilpotent matrices is not considered since, in this case, all the eigenvalues are located at the origin of the complex plane.

The admissible location of the dominant eigenvalues of an internally positive continuous-time linear system can be obtained by noting that for any Metzler matrix $A$ there exists a nonnegative value $\alpha$ such that $A + \alpha I$ is a nonnegative matrix. As a consequence, the spectrum of a Metzler matrix is that of the nonnegative matrix $A + \alpha I$ leftward shifted by $\alpha$. Therefore, the dominant eigenvalue of an internally positive continuous-time linear system is unique (possibly multiple) and real⁵. This result clearly shows that there is no internally positive system having the externally positive system transfer function (2).

Let us now consider the limitation imposed by internal positivity on the whole spectrum of the system.

We first recall a fundamental result on eigenvalue location for nonnegative matrices. This result will be used hereafter to study the spectrum of discrete and continuous-time internally positive systems.

Denote by $\Theta_n^{\rho}$ the set of points in the complex plane that are eigenvalues of nonnegative $n \times n$ matrices with Frobenius eigenvalue $\rho$. For example, the region $\Theta_2^{\rho}$ consists of points on the segment $[-\rho, \rho]$ and the region $\Theta_3^{\rho}$ consists of points in the interior and on the boundary of the triangle with vertices $\rho$, $\rho e^{2\pi i/3}$, $\rho e^{4\pi i/3}$ and on the segment $[-\rho, \rho]$. A complete characterization of the regions $\Theta_n^{\rho} = \rho\,\Theta_n^{1}$ for any $n$ has been given by Karpelevich [16] (see also [18], and Theorem 1.2 at page 168 of reference [21]). The region $\Theta_n^{1}$ is symmetric with respect to the real axis, is included in the disc $|z| \le 1$, and intersects the circle $|z| = 1$ at points $e^{2\pi i a/b}$, where $a$ and $b$ run over the relatively prime integers satisfying $0 \le a \le b \le n$. The boundary of $\Theta_n^{1}$ consists of these points and of curvilinear arcs connecting them in circular order. The endpoints of an arc are $e^{2\pi i a_1/b_1}$ and $e^{2\pi i a_2/b_2}$ ($b_1 \le b_2$) and each arc is given by the following parametric equation:

$$\lambda^{b_2}\,(\lambda^{b_1} - s)^{[n/b_1]} = (1-s)^{[n/b_1]}\,\lambda^{b_1 [n/b_1]}$$

where the real parameter $s$ runs over the interval $0 \le s \le 1$.

Since the dynamic matrix $A$ of an internally positive discrete-time linear system is nonnegative, its spectrum is contained in the region $\Theta_n^{\rho}$, where $n$ is the internal dimension of the system and $\rho$ its Frobenius eigenvalue. For the sake of illustration, the regions $\Theta_3^{\rho}$ and $\Theta_4^{\rho}$ are depicted in Figure 1.

The spectrum of an internally positive continuous-time linear system can be characterized using the previous result. In fact, since the spectrum of a Metzler matrix

³ This eigenvalue is called the Frobenius eigenvalue.
⁴ The degree of an eigenvalue is the size of the largest diagonal block containing it in the Jordan canonical form of the matrix.
⁵ This eigenvalue is called the Frobenius eigenvalue.

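[Figure 1: two complex-plane plots with axes Re and Im. Left panel labels: ρ(A), −ρ(A), ρ(A)e^{2πi/3}, ρ(A)e^{4πi/3}. Right panel labels: ρ(A), −ρ(A), iρ(A), −iρ(A), ρ(A)e^{2πi/3}, ρ(A)e^{4πi/3}.]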

Fig. 1 The regions bounding the spectrum of a generic third-order (left) and fourth order (right) internally positive discrete-time linear system.

A is that of the nonnegative matrix $A + \alpha I$ leftward shifted by $\alpha$, then the spectrum of a generic Metzler matrix is obtained by considering a value of $\alpha$ arbitrarily large. Consequently, the spectrum of an internally positive continuous-time linear system of dimension $n$ and with Frobenius eigenvalue $\rho$ is contained in the angular region with opening angle

$$\pi - \frac{2\pi}{n}$$

symmetric with respect to the real axis with vertex located at $\rho$. For the sake of illustration, the region for $n = 3$ is depicted in Figure 2. This result shows again that the externally positive system with transfer function (2) cannot have a finite-dimensional internally positive representation.
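The two facts used in this argument, the shift that turns a Metzler matrix into a nonnegative one and the angular region bounding the spectrum, can be sketched as follows. The code assumes that the region is the cone with vertex ρ, opening toward the left, symmetric about the real axis, with total opening angle π − 2π/n (so half-angle π/2 − π/n). The function names `is_metzler`, `nonnegative_shift` and `in_cone`, the tolerance and the sample data are hypothetical.

```python
import math

# Illustrative sketch only: Metzler test, shift alpha, and cone membership.

def is_metzler(A):
    """True if all off-diagonal entries of the square matrix A are nonnegative."""
    n = len(A)
    return all(A[i][j] >= 0 for i in range(n) for j in range(n) if i != j)

def nonnegative_shift(A):
    """Smallest alpha >= 0 such that A + alpha I has a nonnegative diagonal."""
    return max(0.0, -min(A[i][i] for i in range(len(A))))

def in_cone(lam, rho, n, tol=1e-12):
    """True if the complex number lam lies in the leftward cone with vertex rho
    and total opening angle pi - 2 pi / n, symmetric about the real axis."""
    half_angle = math.pi / 2 - math.pi / n
    dx = rho - lam.real                   # distance to the left of the vertex
    dy = abs(lam.imag)
    if dx < -tol:
        return False                      # strictly to the right of the vertex
    if dx <= tol and dy <= tol:
        return True                       # the vertex itself
    return math.atan2(dy, dx) <= half_angle + tol

A = [[-3.0, 1.0], [2.0, -1.0]]
alpha = nonnegative_shift(A)

# Poles of the continuous-time example: -1 (Frobenius eigenvalue) and -1 +/- i.
excluded_for_all_n = all(not in_cone(complex(-1.0, 1.0), -1.0, n)
                         for n in range(2, 200))
```

The pair −1 ± i sits on the vertical line through the vertex ρ = −1, at an angle of π/2 from the negative real direction, which exceeds the half-angle π/2 − π/n for every finite n. This is consistent with the statement that no finite-dimensional internally positive representation exists for that example.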


Fig. 2 The region bounding the spectrum of a generic third-order internally positive continuous-time system.

5 Open Problems

There is a considerable number of open issues related to positive linear systems. Examples of useful results that seem to be very hard to obtain are:
1. determine the zeros location for externally positive linear systems;
2. infer nonnegativity of the whole impulse response directly from the system parameters, or at least
3. determine directly from the system parameters the minimum number of samples of the impulse response to be checked in order to infer nonnegativity of the whole impulse response;
4. fully characterize the frequency response of externally positive linear systems.

References

1. Anderson, B.D.O.: From Wiener to hidden Markov models. IEEE Control Systems, 41–51 (June 1999)
2. Benvenuti, L., Farina, L.: Discrete–time filtering via charge routing networks. Signal Processing 49(3), 207–215 (1996)
3. Benvenuti, L., Farina, L.: The design of fiber–optic filters. IEEE/OSA Journal of Lightwave Technology 19(9), 1366–1375 (2001)
4. Benvenuti, L., Farina, L.: A tutorial on the positive realization problem. IEEE Transactions on Automatic Control 49(5), 651–664 (2004)
5. Benvenuti, L., Farina, L.: Eigenvalue regions for positive systems. Systems & Control Letters 51, 325–330 (2004)
6. Benvenuti, L., Farina, L., Anderson, B.D.O.: Filtering through combination of positive filters. IEEE Transactions on Circuits and Systems I: Fundamental Theory and Applications 46(12), 1431–1440 (1999)
7. Darbha, S., Bhattacharyya, S.P.: On the synthesis of controllers for a non-overshooting step response. IEEE Transactions on Automatic Control 48(5), 797–800 (2003)
8. De Jong, H.: Modeling and simulation of genetic regulatory systems: a literature review. Journal of Computational Biology 9(1), 67–103 (2002)
9. Deodhare, G., Vidyasagar, M.: Design of non-overshooting feedback control systems. In: Proceedings of the 29th IEEE Conference on Decision and Control, Honolulu, USA (1990)
10. Farina, L., Rinaldi, S.: Positive linear systems: theory and applications. In: Pure and Applied Mathematics. Wiley–Interscience, New York (2000)
11. Gaubert, S., Butkovic, P., Cuninghame-Green, R.: Minimal (max,+) realization of convex sequences. SIAM Journal on Control and Optimization 36, 137–147 (1998)
12. Gersho, A., Gopinath, B.: Charge-routing networks. IEEE Transactions on Circuits and Systems 26(1), 81–92 (1979)
13. Jacquez, J.A.: Compartmental analysis in biology and medicine, 2nd edn. University of Michigan Press, Ann Arbor (1985)
14. Jayasuriya, S.: On the determination of the worst allowable persistent bounded disturbance for a system with constraints. Journal of Dynamic Systems, Measurement, and Control 117, 126–133 (1995)
15. Kajiya, F., Kodama, S., Abe, H.: Compartmental Analysis - Medical Applications and Theoretical Background. Karger, Basel (1984)
16. Karpelevich, F.I.: On the characteristic roots of matrices with nonnegative elements. Izv. Akad. Nauk SSSR Ser. Mat. 15, 361–383 (1951) (in Russian); In Eleven Papers Translated from Russian, American Mathematical Society 140 (1988)
17. Krause, U.: Positive nonlinear systems in economics. In: Maruyama, T., Takahashi, W. (eds.) Nonlinear and Convex Analysis in Economic Theory. Lecture Notes in Economics and Mathematical Systems, vol. 419, pp. 181–195. Springer, Heidelberg (1995)
18. Ito, H.: A new statement about the theorem determining the region of eigenvalues of stochastic matrices. Linear Algebra and its Applications 246, 241–246 (1997)
19. Leontief, W.W.: The Structure of the American Economy 1919–1939. Oxford University Press, New York (1951)
20. Leslie, P.H.: On the use of matrices in certain population mathematics. Biometrika 35, 183–212 (1945)
21. Minc, H.: Nonnegative Matrices. John Wiley & Sons, New York (1987)
22. Moslehi, B., Goodman, J.W., Tur, M., Shaw, H.J.: Fiber–optic signal processing. In: Proceedings of the IEEE, vol. 72, pp. 909–930 (1984)
23. Mounier, H., Bastin, G.: Compartmental modelling for traffic control in communication networks. IEEE Transactions on Automatic Control (2002) (submitted)
24. O’Cinneide, C.A.: Characterization of phase–type distributions. Stochastic Models 6, 1–57 (1990)
25. O’Cinneide, C.A.: Phase–type distributions: open problems and a few properties. Stochastic Models 15, 731–757 (1999)
26. Rabiner, L.R.: A tutorial on hidden Markov models and selected applications in speech recognition. In: Proceedings of the IEEE, vol. 77, pp. 257–286 (1989)
27. Saaty, T.L.: Elements of Queueing Theory. McGraw-Hill, New York (1961)
28. Segel, I.H.: Enzyme Kinetics. Wiley, New York (1975)
29. Xi, N., Tarn, T.J., Bejczy, A.K.: Intelligent planning and control for multi robot coordination: an event based approach. IEEE Transactions on Robotics and Automation 12, 439–452 (1996)

Detectability, Observability, and Asymptotic Reconstructability of Positive Systems

Tobias Damm and Cristina Ethington

Abstract. We give a survey of detectability, observability and reconstructability concepts for positive systems and sketch some applications to the analysis of stochastic equations.

1 Introduction

There have been a number of recent contributions to the definition of observability and detectability for different classes of linear control systems, see [3, 4, 8, 10, 14, 21, 24]. A common feature of these papers is the distinction between properties which are equivalent for deterministic linear time-invariant systems (compare also [20, 23]). In particular, the following properties are of interest.

Usually, one defines a system to be detectable (resp. observable) if all its unstable (resp. nontrivial) modes produce a non-zero output, i.e. if vanishing of the output, $y(t) = 0$ for all $t$, implies that the state $x(t)$ converges to zero (resp. is equal to zero). In the deterministic case, it follows that a system is detectable if and only if the dual system is stabilizable, which again is equivalent to the existence of an asymptotically stable linear dynamic state observer. Here this property will be called asymptotic reconstructability. Moreover, there are equivalent algebraic criteria, the so-called Hautus test [15] (or Popov-Belevich-Hautus test), which play an important role in the discussion of algebraic Lyapunov and Riccati equations.

For many classes of stochastic systems, however, these properties fall apart and their usefulness differs. Therefore different concepts have been developed. Several authors (e.g. [5, 9, 12, 13, 21]) have chosen mean-square stabilizability of the dual system as a defining property for detectability. Some authors (e.g. [3]) call this property MS-detectability. However, as discussed in [8], this choice is not well motivated. First, no clear interpretation with respect to dynamical properties of the underlying stochastic control system has been given (as mentioned e.g. in [21]). Second, there is no simple equivalent algebraic criterion like the Hautus test for deterministic systems. Third, in applications to generalized algebraic Lyapunov and Riccati equations only the generalized Hautus test is used, which is weaker than stabilizability of the dual system (e.g. [13]).

In [8] a generalized version of the Hautus test was given and shown (for stochastic differential equations) to be equivalent to the system being detectable (in the sense that all unstable modes produce non-zero output). This property is called e.g. W-detectability in [4] or β-detectability in [7].

In the present note, we suggest abstract definitions and characterizations of detectability, observability and asymptotic reconstructability in terms of positive operators and clarify some relations between these notions. We then show how these definitions are related to the dynamical properties of different classes of systems.

Tobias Damm and Cristina Ethington
Department of Mathematics, TU Kaiserslautern, D-67663 Kaiserslautern, Germany, e-mail: [email protected],[email protected]

R. Bru and S. Romero-Vivó (Eds.): Positive Systems, LNCIS 389, pp. 63–70. springerlink.com © Springer-Verlag Berlin Heidelberg 2009

2 Resolvent Positive Operators

We summarize some results on resolvent positive operators which have been collected e.g. in [1] or [6, 7]. Let $\mathcal{H}$ be some finite-dimensional real vector space, ordered by a closed, solid, pointed convex cone $\mathcal{H}_+$. On $\mathcal{H}$ we consider a scalar product $\langle\cdot,\cdot\rangle$. By $\mathcal{H}_+^*$ we denote the dual cone, and for a linear operator $T : \mathcal{H} \to \mathcal{H}$ we denote the adjoint operator by $T^*$.

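Definition 1. A linear operator $T : \mathcal{H} \to \mathcal{H}$ is called positive, $T \ge 0$, if $T(\mathcal{H}_+) \subset \mathcal{H}_+$. It is called resolvent positive if there exists an $\alpha_0 \in \mathbb{R}$ such that for all $\alpha \ge \alpha_0$ the resolvent $(\alpha I - T)^{-1}$ is positive, i.e.

$$(\alpha I - T)^{-1}(\mathcal{H}_+) \subset \mathcal{H}_+ .$$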

There are many other equivalent characterizations of resolvent positivity (e.g. [6, 11]). Here, the following will be relevant.

Proposition 1. A linear operator $T : \mathcal{H} \to \mathcal{H}$ is resolvent positive if and only if it is exponentially positive, which means that $e^{Ts} \ge 0$ for all $s \ge 0$.

Example 1. (i) Let $\mathcal{H} = \mathbb{R}^n$ be ordered by the cone $\mathcal{H}_+ = \mathbb{R}^n_+$. A matrix $A \in \mathbb{R}^{n\times n}$, regarded as a mapping $A : \mathcal{H} \to \mathcal{H}$, is positive if and only if all its entries are nonnegative. It is resolvent positive if and only if all off-diagonal entries of $A$ are nonnegative, i.e. $A$ is a Metzler matrix. This is, by definition, equivalent to saying that $-A$ is a Z-matrix. We call $A$ stable if $\sigma(A) \subset \mathbb{C}_-$. Hence, again by definition, $A$ is resolvent positive and stable if and only if $-A$ is an M-matrix (see [2, 16, 18]).

(ii) Let $\mathcal{H} = \mathcal{H}^n$ denote the space of $n \times n$ symmetric matrices ordered by the cone $\mathcal{H}_+ = \mathcal{H}^n_+$ of nonnegative definite matrices. Then for any $A \in \mathbb{R}^{n\times n}$, the operator $\Pi_A : \mathcal{H}^n \to \mathcal{H}^n$, $\Pi_A(X) = A^* X A$, is positive, whereas both the continuous-time Lyapunov operator $L_A : \mathcal{H}^n \to \mathcal{H}^n$, $L_A(X) = A^T X + XA$, and the discrete-time Lyapunov operator (also Stein operator) $S_A : \mathcal{H}^n \to \mathcal{H}^n$, $S_A(X) = A^* X A - X$, are resolvent positive but, in general, not positive.

Note that in both examples the cone $\mathcal{H}_+$ is self-dual, i.e. $\mathcal{H}_+ = \mathcal{H}_+^*$. The following result goes back to [19].

Proposition 2. Let $T : \mathcal{H} \to \mathcal{H}$ be resolvent positive and set $\beta = \beta(T) = \max \operatorname{Re} \sigma(T)$. Then there exists an eigenvector $V \in \mathcal{H}_+$, $V \neq 0$, such that $TV = \beta V$. Moreover, the following are equivalent (where $X > 0$ means $X \in \operatorname{int} \mathcal{H}_+$):
(a) $\beta(T) < 0$,
(b) $\exists X > 0$: $T(X) < 0$,
(c) $\forall Y < 0$: $\exists X > 0$: $T(X) = Y$.
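As a numerical illustration of Proposition 1 and Example 1(i), the following sketch checks the Metzler property and evaluates $e^{As}$ through the shift $e^{As} = e^{-cs} e^{(A + cI)s}$. The function names `is_metzler` and `expm_metzler` and the sample matrix are illustrative assumptions, not taken from the source, and the truncated power series is only adequate for small, well-scaled matrices.

```python
import numpy as np

def is_metzler(A):
    """True if all off-diagonal entries of A are nonnegative."""
    A = np.asarray(A, dtype=float)
    off_diagonal = A - np.diag(np.diag(A))
    return bool(np.all(off_diagonal >= 0.0))

def expm_metzler(A, s, terms=80):
    """Approximate e^{As} for a Metzler matrix A and s >= 0.

    Uses e^{As} = e^{-cs} e^{(A + cI)s} with c chosen so that A + cI >= 0.
    Every term of the power series of e^{(A + cI)s} is then entrywise
    nonnegative, which makes the positivity of e^{As} visible.
    """
    A = np.asarray(A, dtype=float)
    n = A.shape[0]
    c = max(0.0, -float(np.min(np.diag(A))))
    P = (A + c * np.eye(n)) * s
    term = np.eye(n)
    total = np.eye(n)
    for k in range(1, terms):
        term = term @ P / k
        total = total + term
    return np.exp(-c * s) * total

# Hypothetical sample data: a stable Metzler matrix.
A = np.array([[-2.0, 1.0],
              [0.5, -3.0]])
spectral_bound = float(np.max(np.linalg.eigvals(A).real))

# Proposition 2(b): X = -A^{-1} 1 is strictly positive and A X = -1 < 0.
X = -np.linalg.solve(A, np.ones(2))
```

For this sample matrix the spectral bound is negative, so by Proposition 2 a strictly positive vector `X` with `A @ X < 0` must exist; the choice $X = -A^{-1}\mathbf{1}$ is one convenient candidate.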

3 Detectability and Observability

Definition 2. Let $T : \mathcal{H} \to \mathcal{H}$ be resolvent positive and $Y \in \mathcal{H}_+^*$. Consider the positive linear system

$$\dot X = T(X), \qquad y(t) = \langle X(t), Y\rangle .$$

The solution of the initial value problem with $X(0) = X_0 \in \mathcal{H}_+$ will be denoted by $X(t, X_0)$. We call the pair $(T, Y)$

(i) detectable, if $y(t, X_0) = 0$ for all $t \ge 0$ implies $X(t, X_0) \to 0$ for $t \to \infty$,
(ii) observable, if $y(t, X_0) = 0$ for all $t \ge 0$ implies $X_0 = 0$.

The following result is a positive analogue of the Hautus criterion.

Proposition 3. The pair $(T, Y)$ as in the previous definition is

(a) detectable, if and only if $\langle V, Y\rangle > 0$ for any eigenvector $V \in \mathcal{H}_+$ of $T$ corresponding to an eigenvalue $\lambda \ge 0$,
(b) observable, if and only if $\langle V, Y\rangle > 0$ for any eigenvector $V \in \mathcal{H}_+$ of $T$ corresponding to an arbitrary eigenvalue.

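Proof. (a) Assume that the criterion in (a) does not hold, i.e. there exist $\lambda \ge 0$, $X_0 \in \mathcal{H}_+ \setminus \{0\}$, so that $T(X_0) = \lambda X_0$ and $\langle X_0, Y\rangle = 0$. Since $X(t, X_0) = e^{\lambda t} X_0$, we have $y(t, X_0) = 0$ for all $t$, but $X(t, X_0) \not\to 0$ as $t \to \infty$. Hence the system is not detectable.

Vice versa, assume that the criterion holds and for some $X_0 \in \mathcal{H}_+ \setminus \{0\}$ and all $t \ge 0$ we have $y(t, X_0) = 0$. Then we have to show that $X(t, X_0) \to 0$ as $t \to \infty$. Let $\mathcal{X}_+ = \operatorname{cl}\operatorname{conv}\{X(t, X_0) \mid t \ge 0\}$ denote the closed convex hull of the positive orbit of $X(t, X_0)$. Let further $\mathcal{X} = \mathcal{X}_+ - \mathcal{X}_+$ be the minimal subspace of $\mathcal{H}$ containing $\mathcal{X}_+$. Then $\mathcal{X}_+$ is a closed solid pointed convex cone in $\mathcal{X}$. By construction, both $\mathcal{X}_+$ and $\mathcal{X}$ are invariant with respect to $\dot X = T(X)$. That means $T(\mathcal{X}) \subset \mathcal{X}$, and the restriction $T|_{\mathcal{X}}$ is resolvent positive with respect to $\mathcal{X}_+$. Let $\beta_{\mathcal{X}}$ be the spectral bound of $T|_{\mathcal{X}}$. By Proposition 2 there exists an eigenvector $V_{\mathcal{X}} \in \mathcal{X}_+ \subset \mathcal{H}_+$, such that $T(V_{\mathcal{X}}) = \beta_{\mathcal{X}} V_{\mathcal{X}}$. Since $\langle X(t, X_0), Y\rangle = 0$ for all $t \ge 0$ we conclude $\langle V, Y\rangle = 0$ for all $V \in \mathcal{X}$. In particular $\langle V_{\mathcal{X}}, Y\rangle = 0$. It follows now from the detectability criterion that $\beta_{\mathcal{X}} < 0$, which implies asymptotic stability of $X(t, X_0)$ for all $X_0 \in \mathcal{X}$.

(b) If the criterion does not hold, then, as in (a), we have a nontrivial solution $X(t, X_0) = e^{\lambda t} X_0 \in \mathcal{H}_+$ with $y(t, X_0) = 0$. Conversely, if $y(t, X_0) = 0$ for all $t \ge 0$, then, as in (a), we have the $T$-invariant subspace $\mathcal{X}$ with $\langle \mathcal{X}, Y\rangle = \{0\}$. If $X_0 \neq 0$ then $\mathcal{X}$ contains an eigenvector, i.e. the criterion is violated. □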
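A small worked instance of the criterion in Proposition 3 may help. The data below (the space $\mathbb{R}^2$ with the cone $\mathbb{R}^2_+$ and $T = \operatorname{diag}(1, -1)$) are chosen purely for illustration and do not appear in the source.

```latex
% Illustrative data (not from the source): H = R^2, H_+ = R^2_+, T = diag(1,-1).
% Eigenvectors of T in H_+, up to positive scaling:
%   e_1 for \lambda = 1 and e_2 for \lambda = -1.
\begin{align*}
Y = (0,1)^T &: \quad \langle e_1, Y\rangle = 0 \text{ with } \lambda = 1 \ge 0
  &&\Rightarrow \text{not detectable (and not observable)},\\
Y = (1,0)^T &: \quad \langle e_1, Y\rangle = 1 > 0,\ \langle e_2, Y\rangle = 0
  &&\Rightarrow \text{detectable, but not observable},\\
Y = (1,1)^T &: \quad \langle e_1, Y\rangle = \langle e_2, Y\rangle = 1 > 0
  &&\Rightarrow \text{observable (hence detectable)}.
\end{align*}
```

In the second case the initial state $X_0 = e_2$ produces $y \equiv 0$ while $X(t, X_0) = e^{-t} e_2 \to 0$, which is consistent with detectability without observability.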

4 Asymptotic Reconstructability

Note that detectability and observability in the sense of Definition 2 do not imply any means to reconstruct the state of the system. In fact, it is obvious that the measurement $y = \langle X, Y\rangle$ in general will not be sufficient to distinguish two different solutions of the system.

Example 2. Consider $\mathcal{H} = \mathbb{R}^2$ ordered by $\mathbb{R}^2_+$. For $T(X) = X$, the differential equation $\dot X = T(X)$ defines a positive system, and for any $Y > 0$ the pair $(T, Y)$ is observable. But $y(t, X_0) = e^{t}\langle X_0, Y\rangle$ just depends on $\langle X_0, Y\rangle$. If for instance $Y = [1, 1]^T$, then $y(t, [0, 1]^T) = y(t, [1, 0]^T)$ for all $t$.

Note that $Y > 0$ always implies observability. In this case $Y^\perp \cap \mathcal{H}_+ = \{0\}$. The smaller the dimension of $\operatorname{span}(Y^\perp \cap \mathcal{H}_+)$, the smaller the number of nonobservable modes is likely to be. When using positive systems e.g. on $\mathcal{H}^n$ to analyze stochastic systems on $\mathbb{R}^n$, this is a natural requirement. We therefore aim at a concept of asymptotic reconstructability which also has this property. To this end we consider a different condition on the pair $(T, Y)$, which can easily be formulated for arbitrary ordered vector spaces. For the important special cones introduced in Example 1, we show how it leads to positive asymptotic observer equations.

Definition 3. Let $T : \mathcal{H} \to \mathcal{H}$ be resolvent positive and $Y \in \mathcal{H}_+^*$. We call the pair $(T, Y)$ asymptotically reconstructable if $T^*(Z) - Y < 0$ for some $Z > 0$.

Let us note a simple general implication.

Lemma 1. If $(T, Y)$ is asymptotically reconstructable then it is detectable.

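Proof. Let $T^*(Z) - Y < 0$ for some $Z \in \operatorname{int} \mathcal{H}_+^*$. If $TV = \lambda V$ for some eigenpair $(\lambda, V) \in \mathbb{R} \times \mathcal{H}_+ \setminus \{0\}$ satisfying $\langle V, Y\rangle = 0$ then

$$0 > \langle V, T^*(Z) - Y\rangle = \langle T(V), Z\rangle = \lambda \langle V, Z\rangle .$$

Since $\langle V, Z\rangle \ge 0$, we have $\lambda < 0$. □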

The converse implication does not hold in general as was demonstrated in Example 2. Now let us consider two special cases.

4.1 Asymptotic Reconstructability on $\mathbb{R}^n$

This case may look a bit artificial, but it illustrates the concept. Let $\mathcal{H} = \mathbb{R}^n$ and $\mathcal{H}_+ = \mathbb{R}^n_+$. Then $T$ is a Metzler matrix and $Y \in \mathbb{R}^n_+$. Let $\operatorname{diag}(Y)$ denote the diagonal matrix whose diagonal contains the entries of $Y$ and assume that $y = \operatorname{diag}(Y) X$. Consider the extended system

$$\dot X = TX, \qquad y = \operatorname{diag}(Y) X,$$
$$\dot{\hat X} = T \hat X + K\bigl(\operatorname{diag}(Y) \hat X - y\bigr),$$

where $\hat X$ is the state of the observer parametrized by the diagonal matrix $K$. For the error $E = \hat X - X$ we have

$$\dot E = (T + K \operatorname{diag}(Y)) E =: T_K E . \tag{1}$$

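Lemma 2. There exists a diagonal matrix $K$, so that $\sigma(T_K) \subset \mathbb{C}_-$, if and only if $(T, Y)$ is asymptotically reconstructable.

Proof. Note that $T_K$ is Metzler and $\beta(T_K) = \beta(T_K^*)$. If $T^* Z - Y < 0$ for some $Z \in \operatorname{int} \mathcal{H}_+^*$, then we set $K = -\operatorname{diag}(Z)^{-1}$, so that $T_K^* Z = T^* Z - Y < 0$, i.e. $\beta(T_K) < 0$ by Proposition 2.

Conversely, if $\beta(T_K) < 0$, then $T_K^* \tilde Z = T^* \tilde Z + \operatorname{diag}(Y) K \tilde Z < 0$ for some $\tilde Z \in \operatorname{int} \mathcal{H}_+^*$. It is easy to see that $\operatorname{diag}(Y) K \tilde Z \ge -\alpha Y$ for some $\alpha > 0$. Hence $T^* Z - Y < 0$ for $Z = \tilde Z / \alpha$. □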
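The first half of the proof of Lemma 2 is constructive and can be sketched numerically. In the snippet below the helper name `reconstructability_gain` and the sample data `T`, `Y`, `Z` are hypothetical; the code only verifies a given certificate $Z$ and does not search for one.

```python
import numpy as np

def reconstructability_gain(T, Y, Z):
    """Return K = -diag(Z)^{-1} if Z > 0 certifies T^T Z - Y < 0, else None."""
    T = np.asarray(T, dtype=float)
    Y = np.asarray(Y, dtype=float)
    Z = np.asarray(Z, dtype=float)
    certificate = T.T @ Z - Y
    if np.all(Z > 0.0) and np.all(certificate < 0.0):
        return -np.diag(1.0 / Z)
    return None

# Hypothetical data: T is Metzler and unstable; only the first state is measured.
T = np.array([[0.5, 1.0],
              [0.0, -1.0]])
Y = np.array([2.0, 0.0])
Z = np.array([1.0, 2.0])

K = reconstructability_gain(T, Y, Z)
T_K = T + K @ np.diag(Y)          # error dynamics matrix from (1)
error_eigenvalues = np.linalg.eigvals(T_K)
```

With these numbers $T^T Z - Y = (-1.5, -1)^T$, and the resulting error matrix $T_K$ has both eigenvalues in the open left half-plane, as the lemma predicts.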

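Remark 1. For general systems with $A \in \mathbb{R}^{n\times n}$, $C \in \mathbb{R}^{n\times n}$ the existence of a matrix $K$ so that $\sigma(A + KC) \subset \mathbb{C}_-$ is equivalent to detectability in the usual sense, i.e. $\operatorname{rank}(sI - A^T, C^T) = n$ for all $s \in \mathbb{C}_+$. But the characterization in Lemma 2 requires the assumptions that $A$ is Metzler and that $C = \operatorname{diag}(Y)$ is diagonal and nonnegative. For instance, the pair

$$(A, C) = \left( \begin{pmatrix} 1 & -2 \\ 0 & -1 \end{pmatrix}, \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix} \right), \quad \text{with } Y = \begin{pmatrix} 0 \\ 0 \end{pmatrix},$$

is not detectable, but $AZ - Y < 0$ for $Z = [1, 1]^T$. Similarly, the pair

$$(A, C) = \left( \begin{pmatrix} 0 & 0 \\ 0 & -1 \end{pmatrix}, \begin{pmatrix} -1 & 0 \\ 0 & 0 \end{pmatrix} \right), \quad \text{with } Y = \begin{pmatrix} -1 \\ 0 \end{pmatrix},$$

is detectable, but $AZ - Y \not< 0$ for any $Z > 0$. Definition 3 thus is specific for positive systems.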

4.2 Asymptotic Reconstructability on $\mathcal{H}^n$

Let $\mathcal{H} = \mathcal{H}^n$ and $\mathcal{H}_+ = \mathcal{H}^n_+$. Let $y = YX$ and consider the extended system

$$\dot X = T(X), \qquad y = YX,$$
$$\dot{\hat X} = T(\hat X) + K\bigl(Y \hat X - y\bigr) + \bigl(\hat X Y - y^T\bigr) K^T, \tag{2}$$

where $K \in \mathbb{R}^{n\times n}$. For the error $E = \hat X - X$ we have

$$\dot E = T(E) + KYE + EYK^T =: T_K(E) . \tag{3}$$

Lemma 3. There exists a matrix $K \in \mathbb{R}^{n\times n}$, so that $\sigma(T_K) \subset \mathbb{C}_-$, if and only if $(T, Y)$ is asymptotically reconstructable.

Proof. Note that $T_K$ is resolvent positive for all $K$ and thus $\sigma(T_K) \subset \mathbb{C}_-$ if and only if $T_K^*(Z) = T^*(Z) + ZKY + YK^T Z < 0$ for some $Z > 0$. If $T^*(Z) - Y < 0$ then we may just set $K = -\tfrac{1}{2} Z^{-1}$ to get $T_K^*(Z) < 0$.

Vice versa, assume that $T_K^*(Z) < 0$ with $Z > 0$. Let $U = [U_1, U_2] \in \mathbb{R}^{n\times n}$ be orthogonal with the columns of $U_1$ spanning $\operatorname{Ker} Y$. Then $U_1^T Y U_1 = 0$ together with $U_1^T T_K^*(Z) U_1 < 0$ implies $U_1^T T^*(Z) U_1 < 0$. For $\alpha > 0$ we have

$$U^T \bigl(T^*(\alpha Z) - Y\bigr) U = \begin{pmatrix} \alpha U_1^T T^*(Z) U_1 & \alpha U_1^T T^*(Z) U_2 \\ \alpha U_2^T T^*(Z) U_1 & \alpha U_2^T T^*(Z) U_2 - U_2^T Y U_2 \end{pmatrix},$$

where $U_2^T Y U_2 > 0$. If $\alpha$ is sufficiently small the Schur complement

$$-U_2^T Y U_2 + \alpha \Bigl( U_2^T T^*(Z) U_2 - U_2^T T^*(Z) U_1 \bigl(U_1^T T^*(Z) U_1\bigr)^{-1} U_1^T T^*(Z) U_2 \Bigr)$$

is negative, proving $T^*(\alpha Z) - Y < 0$. Since $\alpha Z > 0$, this proves that $(T, Y)$ is asymptotically reconstructable. □

As a consequence we obtain a simple characterization of detectability.

Corollary 1. Let $A \in \mathbb{R}^{n\times n}$, $C \in \mathbb{R}^{p\times n}$. The pair $(A, C)$ is detectable in the usual sense, i.e. $\operatorname{rank}(sI - A^T, C^T) = n$, if and only if there exists a positive definite matrix $Z$, so that $AZ + ZA^T - C^T C < 0$.
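The matrix inequality in Corollary 1 and the gain used in the proof of Lemma 3 can be checked on a small example. The sketch below assumes $T(X) = A^T X + XA$ (so that $T^*(Z) = AZ + ZA^T$) and $Y = C^T C$; the helper names and the sample matrices are hypothetical and only a given candidate $Z$ is verified.

```python
import numpy as np

def is_negative_definite(M, tol=1e-12):
    """True if the symmetric part of M has only eigenvalues below -tol."""
    M = np.asarray(M, dtype=float)
    return bool(np.max(np.linalg.eigvalsh((M + M.T) / 2.0)) < -tol)

def lyapunov_certificate(A, C, Z):
    """Evaluate A Z + Z A^T - C^T C for a symmetric candidate Z."""
    return A @ Z + Z @ A.T - C.T @ C

# Hypothetical data: A has the unstable eigenvalue 0.5, which is seen by C.
A = np.array([[0.5, 1.0],
              [0.0, -1.0]])
C = np.array([[2.0, 0.0]])
Z = np.eye(2)

certificate = lyapunov_certificate(A, C, Z)

# Gain from the proof of Lemma 3 with Y = C^T C: K = -1/2 Z^{-1}.
K = -0.5 * np.linalg.inv(Z)
# Under the assumption T(X) = A^T X + X A, the error operator (3) reads
# T_K(E) = F^T E + E F with F = A + Y K^T.
F = A + (C.T @ C) @ K.T
```

Here `certificate` is negative definite and `F` has its spectrum in the open left half-plane, so the error operator $T_K$ is stable for this choice of $K$.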

5 Some Applications

We sketch two set-ups where our concepts can be applied directly. An in-depth analysis of these examples is beyond the scope of this contribution.

5.1 Stochastic Differential Equations

Consider the Itô-type stochastic differential equation

$$d\xi = A\xi\, dt + A_0 \xi\, dw, \qquad \eta = C\xi .$$

Then $X = E(\xi\xi^T)$ satisfies $\dot X = AX + XA^T + A_0 X A_0^T = T(X)$, which is a positive system on $\mathcal{H}^n$. If from the measurements $\eta$ it is possible to estimate $E(\eta\eta^T) = CXC^T$, then also the output $y = \operatorname{trace} XC^T C = \langle X, Y\rangle$ is available. Then the system is detectable if and only if the pair $(T, Y)$ with $Y = C^T C$ is detectable. If further we assume that even the output $YX = C^T C X$ is available, then we can set up the observer Eq. (2) for the second moments $X$. Note, however, that this requires different measurements than just $\eta = C\xi$ and may be unrealistic, but it is exactly the underlying assumption in the concept of MS-detectability (compare [8]).
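To make the operator $T(X) = AX + XA^T + A_0 X A_0^T$ concrete, the following sketch represents it as an $n^2 \times n^2$ matrix via Kronecker products (acting on the row-major vectorization `X.reshape(-1)`) and computes its spectral bound $\beta(T)$ in the sense of Proposition 2. The function names and the sample matrices are illustrative assumptions, not part of the source.

```python
import numpy as np

def second_moment_operator(A, A0):
    """Matrix of T(X) = A X + X A^T + A0 X A0^T acting on X.reshape(-1).

    For row-major vectorization, vec(P X Q) = kron(P, Q^T) vec(X).
    """
    n = A.shape[0]
    I = np.eye(n)
    return np.kron(A, I) + np.kron(I, A) + np.kron(A0, A0)

def apply_T(A, A0, X):
    """Direct evaluation of T(X), used to cross-check the Kronecker form."""
    return A @ X + X @ A.T + A0 @ X @ A0.T

# Hypothetical sample data.
A = np.diag([-1.0, -2.0])
A0 = 0.5 * np.eye(2)

T_matrix = second_moment_operator(A, A0)
spectral_bound = float(np.max(np.linalg.eigvals(T_matrix).real))
```

For this diagonal example the eigenvalues of $T$ are $a_i + a_j + 0.25$, so the spectral bound is $-1.75$ and the second moments decay.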

5.2 Markov Jump Linear Systems

Here we consider n-dimensional systems of the form (e.g. [22])

$$\dot\xi(t) = A(\theta(t))\xi(t), \qquad \eta(t) = C(\theta(t))\xi(t), \tag{4}$$

where $\theta(t)$ is a Markov process in continuous time on a finite sample space $S = \{1, 2, \dots, N\}$. The distributions $p \in [0, 1]^N$ with $p_j = P(\theta = j)$ of the process are subject to the transition equation $\dot p(t) = \Lambda p(t)$ where $e^{\Lambda}$ is a stochastic matrix. For $X_j = E(\xi(t)\xi(t)^T \delta_{\theta(t), j})$, $j = 1, \dots, N$, where $\delta_{i,j}$ is Kronecker's delta, we have the coupled set of Lyapunov equations (cf. [17])

$$\dot X_i = A_i^T X_i + X_i A_i + \sum_{j=1}^{N} \lambda_{ij} X_j = T_i(X), \qquad i = 1, \dots, N, \qquad (\lambda_{ij}) = \Lambda . \tag{5}$$

Here we may consider the space $\mathcal{H} = (\mathcal{H}^n)^N$ ordered by $\mathcal{H}_+ = (\mathcal{H}^n_+)^N$. Then $T = (T_1, \dots, T_N)$ is resolvent positive, and $(C_1^T C_1, \dots, C_N^T C_N) = Y \in \mathcal{H}_+^*$. Detectability, observability and reconstructability properties can now immediately be transferred to the system (4) as in the previous subsection.

6 Summary and Outlook

Positive linear systems on matrix spaces are used as auxiliary systems for various linear control problems. The notions of detectability, observability and asymptotic reconstructability can be formulated conveniently in terms of resolvent positive operators. While the interpretation of the first two is clear for arbitrary spaces $\mathcal{H}$, we have clarified the meaning of asymptotic reconstructability only for $(\mathcal{H}, \mathcal{H}_+) = (\mathbb{R}^n, \mathbb{R}^n_+)$ and for $(\mathcal{H}, \mathcal{H}_+) = (\mathcal{H}^n, \mathcal{H}^n_+)$. A general analysis will be an issue for further research. Similarly, a unified treatment of different classes of stochastic systems (e.g. from networked control) shall be carried out in more detail.

References

1. Berman, A., Neumann, M., Stern, R.: Nonnegative Matrices in Dynamic Systems. John Wiley and Sons Inc., New York (1989)
2. Berman, A., Plemmons, R.J.: Nonnegative Matrices in the Mathematical Sciences. In: Classics in Applied Mathematics, vol. 9. SIAM Publications, Philadelphia (1994)
3. Costa, E., do Val, J.: On the detectability and observability of discrete-time Markov jump linear systems. Syst. Control Lett. 44(2), 135–145 (2001)
4. Costa, E., do Val, J., Fragoso, M.: A new approach to detectability of discrete-time infinite Markov jump linear systems. SIAM J. Control Optim. 43(6), 2132–2156 (2005)
5. Da Prato, G., Ichikawa, A.: Stability and quadratic control for linear stochastic equations with unbounded coefficients. Boll. Unione Mat. Ital., VI. Ser., B 6, 987–1001 (1985)
6. Damm, T.: Stability of linear systems and positive semigroups of symmetric matrices. In: Benvenuti, L., Santis, A.D., Farina, L. (eds.) Positive Systems. LNCIS, vol. 294, pp. 207–214. Springer, Heidelberg (2003)
7. Damm, T.: Rational Matrix Equations in Stochastic Control. LNCIS, vol. 297. Springer, Heidelberg (2004)
8. Damm, T.: On detectability of stochastic systems. Automatica 43(5), 928–933 (2007)
9. Drăgan, V., Halanay, A., Stoica, A.: A small gain theorem for linear stochastic systems. Syst. Control Lett. 30, 243–251 (1997)
10. Drăgan, V., Morozan, T.: Stochastic observability and applications. IMA J. Math. Control & Information 21(3), 323–344 (2004)
11. Elsner, L.: Quasimonotonie und Ungleichungen in halbgeordneten Räumen. Linear Algebra Appl. 8, 249–261 (1974)
12. Fragoso, M.D., Costa, O.L.V., de Souza, C.E.: A new approach to linearly perturbed Riccati equations in stochastic control. Appl. Math. Optim. 37, 99–126 (1998)
13. Freiling, G., Hochhaus, A.: On a class of rational matrix differential equations arising in stochastic control. Linear Algebra Appl. 379, 43–68 (2004)
14. Härdin, H.M., Van Schuppen, J.H.: Observers for linear positive systems. Linear Algebra Appl. 425(2-3), 571–607 (2007)
15. Hautus, M.L.J.: Controllability and observability conditions of linear autonomous systems. Indag. Math. 31, 443–448 (1969)
16. Horn, R.A., Johnson, C.R.: Topics in Matrix Analysis. Cambridge University Press, Cambridge (1991)
17. Morozan, T.: Parametrized Riccati equations for controlled linear differential systems with jump Markov perturbations. Stochastic Anal. Appl. 16(4), 661–682 (1998)
18. Ostrowski, A.M.: Über die Determinanten mit überwiegender Hauptdiagonale. Comm. Math. Helv. 10, 69–96 (1937)
19. Schneider, H.: Positive operators and an inertia theorem. Numer. Math. 7, 11–17 (1965)
20. Sontag, E.D.: Mathematical Control Theory, Deterministic Finite Dimensional Systems, 2nd edn. Springer, New York (1998)
21. Tessitore, G.: Some remarks on the detectability condition for stochastic systems. In: Da Prato, G. (ed.) Partial differential equation methods in control and shape analysis. Lect. Notes Pure Appl. Math., vol. 188, pp. 309–319. Marcel Dekker, New York (1997)
22. Ugrinovskii, V.: Randomized algorithms for robust stability and guaranteed cost control of stochastic jump parameter systems with uncertain switching policies. J. Optim. Th. & Appl. 124(1), 227–245 (2005)
23. Van Willigenburg, L., De Koning, W.: Linear systems theory revisited. Automatica 44, 1686–1696 (2008)
24. Zhang, W., Chen, B.-S.: On stabilizability and exact observability of stochastic systems with their applications. Automatica 40, 87–94 (2004)

Stability Radii of Interconnected Positive Systems with Uncertain Couplings

Diederich Hinrichsen

Abstract. We analyze interconnections of finitely many (exponentially) stable positive systems which interact via uncertain couplings of arbitrarily prescribed structure. We view the couplings as structured perturbations of the direct sum of the subsystems and derive computable formulas for the corresponding stability radius with respect to some customary perturbation norms.

1 Introduction

A basic problem in the stability analysis of large-scale systems is to determine under which conditions on the magnitude of the couplings the stability of the subsystems implies the stability of the overall system. In the literature one can find many sufficient criteria which guarantee the stability of the interconnected system given the stability of its subsystems, see [15] and the references therein. The purpose of this paper is to derive results which are, in a certain sense, necessary and sufficient. We restrict ourselves to time-invariant linear couplings.

We consider composite systems $\Sigma_{\mathrm{comp}}$ with a prescribed interconnection structure. If $\Sigma_{\mathrm{comp}}$ consists of $N$ subsystems $\Sigma_i$, the interconnection structure is given by a nonnegative matrix $E = (e_{ij}) \in \mathbb{R}_+^{N\times N}$ specifying that the output of $\Sigma_j$ is coupled to the input of $\Sigma_i$ if and only if $e_{ij} > 0$. Gershgorin [6] had the idea of viewing any square matrix as a perturbation of the corresponding diagonal matrix. Analogously we regard the coupled system $\Sigma_{\mathrm{comp}}$ as a perturbation of the ("nominal") blockdiagonal system $\Sigma = \oplus \Sigma_i$ which describes the collection of decoupled subsystems. This view of the couplings as perturbations is particularly appropriate in our context, since the interconnections between the subsystems are considered as uncertain. However, we deviate from Gershgorin's approach in two respects: First, we do not suppose that we know exactly the "diagonal elements" (the decoupled subsystems) but allow for some parametric uncertainty of these systems; secondly, we do not allow arbitrary off-diagonal perturbations (couplings between any two subsystems) but only those which preserve the given coupling structure. In this latter respect our approach resembles that of Brualdi [2] who used the zero pattern of a given matrix in order to sharpen Gershgorin's and Brauer's inclusion theorems, see [13]. Our aim is to determine the stability radius of the system $\Sigma$ with respect to the given perturbation/coupling structure.

A standard tool for the robustness analysis of uncertain systems with structured perturbations is μ-analysis [3]. However, μ-analysis provides, in general, only estimates for stability radii, see [7], [8]. In contrast, our objective is to obtain exact formulas. Moreover, a reformulation of our problem in terms of blockdiagonal perturbations, as usual in μ-analysis, would blow up the dimension of the matrices to multiples of the overall system dimension. For large-scale systems this may cause severe computational difficulties. Therefore we aim at reduced order formulas for the stability radii which involve only matrices of order $N$ (the number of interconnected subsystems). In most cases $N$ will be significantly smaller than the dimension of the overall system.

To obtain precise results, a specific investigation of the special class of uncertain systems, the particular perturbation structure and perturbation norm is necessary. In this paper we deal especially with large-scale systems composed of positive subsystems. For an analysis of spectral value sets and stability radii of composite systems consisting of general time-invariant linear systems, see [13].

It is well known that stability radii problems are greatly simplified under positivity assumptions. For positive systems with full block uncertainty the real and the complex stability radii coincide and can be computed by a simple formula [9], [17], whereas for general systems the two radii are different and the computation of the real stability radius is an intricate problem, see [7, §5.3]. Meanwhile many papers have appeared which extend the early results of [9], [17] to other classes of positive systems and different types of uncertainty. For instance, block-diagonal and positive-affine perturbations have been considered in [8], infinite dimensional positive systems in [4], [5], positive delay systems in [16], [12] and positive Volterra systems in [14]. However, up till now, there are no papers on the stability radius problem for interconnected positive systems with uncertain couplings of prescribed structure.

The paper is organized as follows. After the preliminaries in the next section we present a detailed description of the interconnected system in Section 3. All the new results of this paper are contained in Section 4. We end the paper with a conjecture the proof or disproof of which remains an open question.

Diederich Hinrichsen
Zentrum für Technomathematik, Universität Bremen, Postfach 330 440, 28334 Bremen, Germany, e-mail: [email protected]

R. Bru and S. Romero-Vivó (Eds.): Positive Systems, LNCIS 389, pp. 71–81. springerlink.com © Springer-Verlag Berlin Heidelberg 2009

2 Preliminaries

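In this section we introduce some basic concepts and fix the notation. The symbols $\mathbb{N}$, $\mathbb{R}$, $\mathbb{R}_+$, $\mathbb{C}$ denote the sets of positive integers, real numbers, non-negative real numbers and complex numbers, respectively. For any $N \in \mathbb{N}$ we set $\underline{N} := \{1, 2, \dots, N\}$. By $\mathbb{K}^{n\times m}$ we denote the set of $n$ by $m$ matrices with entries in $\mathbb{K}$ where $\mathbb{K} = \mathbb{R}$ or $\mathbb{C}$. If $A = (a_{ij}) \in \mathbb{C}^{n\times m}$ we define $|A| := (|a_{ij}|)$ and for real matrices $A = (a_{ij}), B = (b_{ij}) \in \mathbb{R}^{n\times m}$ we write $A \le B$ if $a_{ij} \le b_{ij}$ for all $i \in \underline{n}$, $j \in \underline{m}$. If $A$ is square then $\sigma(A)$ and $\rho(A) = \max\{|\lambda|;\ \lambda \in \sigma(A)\}$ denote its spectrum and spectral radius, respectively. The spectral radius has the following monotonicity property, see [10, §8.1]:

$$\forall A \in \mathbb{C}^{n\times n},\ B \in \mathbb{R}_+^{n\times n}: \quad |A| \le B \ \Rightarrow\ \rho(A) \le \rho(|A|) \le \rho(B). \tag{1}$$

By $\mathcal{L}_{n,l,q}(\mathbb{K})$ we denote the set of triples of matrices $(A, B, C)$ with $A \in \mathbb{K}^{n\times n}$, $B \in \mathbb{K}^{n\times l}$, $C \in \mathbb{K}^{q\times n}$, $n, l, q \in \mathbb{N}$. The open left half-plane is denoted by $\mathbb{C}_- = \{s \in \mathbb{C};\ \operatorname{Re} s < 0\}$ and $A \in \mathbb{C}^{n\times n}$ is called stable if $\sigma(A) \subset \mathbb{C}_-$. We use the conventions

$$0^{-1} = \infty, \qquad \infty^{-1} = 0, \qquad \inf \emptyset = \infty. \tag{2}$$

In this section we suppose that $(A, B, C) \in \mathcal{L}_{n,l,q}(\mathbb{K})$ and $\boldsymbol{\Delta} \subset \mathbb{K}^{l\times q}$ is a $\mathbb{K}$-linear subspace provided with a norm $\|\cdot\|_{\boldsymbol{\Delta}}$. We consider perturbations of the form

$$A \leadsto A(\Delta) = A + B\Delta C, \qquad \Delta \in \boldsymbol{\Delta}. \tag{3}$$

Definition 1. Given a system $(A, B, C) \in \mathcal{L}_{n,l,q}(\mathbb{K})$ and given a perturbation space $(\boldsymbol{\Delta}, \|\cdot\|_{\boldsymbol{\Delta}})$, the stability radius of $A$ with respect to perturbations of the form (3) is defined by

$$r_{\boldsymbol{\Delta}}(A, B, C) = \inf\{\|\Delta\|_{\boldsymbol{\Delta}};\ \Delta \in \boldsymbol{\Delta},\ \sigma(A(\Delta)) \not\subset \mathbb{C}_-\}. \tag{4}$$

It is easily seen that the infimum in (4) is in fact a minimum if $r_{\boldsymbol{\Delta}}(A, B, C)$ is finite. In this case the stability radius is the norm of a smallest perturbation in $\boldsymbol{\Delta}$ which destabilizes $A$. $r_{\boldsymbol{\Delta}}(A, B, C) = 0$ if and only if $\sigma(A) \not\subset \mathbb{C}_-$. $r_{\boldsymbol{\Delta}}(A, B, C) = \infty$ if and only if $\sigma(A + B\Delta C) \subset \mathbb{C}_-$ for all $\Delta \in \boldsymbol{\Delta}$.

For arbitrary perturbation spaces $(\boldsymbol{\Delta}, \|\cdot\|_{\boldsymbol{\Delta}})$ the determination of $r_{\boldsymbol{\Delta}}(A, B, C)$ is a difficult problem and, in general, only estimates are available. In the full-block case, however, where $\boldsymbol{\Delta} = \mathbb{K}^{l\times q}$ and the stability radius $r_{\boldsymbol{\Delta}}(A, B, C)$ is denoted by $r_{\mathbb{K}}(A, B, C)$, computable formulas are available for both the complex stability radius $r_{\mathbb{C}}(A, B, C)$ and the real stability radius $r_{\mathbb{R}}(A, B, C)$ (if $(A, B, C) \in \mathcal{L}_{n,l,q}(\mathbb{R})$ and $\|\cdot\|_{\boldsymbol{\Delta}}$ is the spectral norm), see [7, §5.3].

A linear system $\dot x(t) = Ax(t)$, $A \in \mathbb{R}^{n\times n}$, is said to be positive if the positive orthant $\mathbb{R}^n_+$ is invariant for the corresponding flow, i.e., $e^{At} \ge 0$ for all $t \ge 0$. In this case the matrix $A$ is called a Metzler matrix. $A \in \mathbb{R}^{n\times n}$ is a Metzler matrix if and only if all the off-diagonal entries of $A$ are nonnegative, i.e., $tI_n + A \ge 0$ for some $t \ge 0$. A norm $\|\cdot\|$ on $\mathbb{K}^n$ is said to be monotone if it satisfies

$$|x| \le |y| \ \Rightarrow\ \|x\| \le \|y\|, \qquad x, y \in \mathbb{K}^n.$$

Every $p$-norm $\|\cdot\|_p$ on $\mathbb{C}^n$, $1 \le p \le \infty$, is monotone. The operator norm $\|\cdot\|$ associated with a pair of monotone vector norms need not be monotone. However, we have the following properties, see [17].

Lemma 1. Suppose that $\mathbb{C}^m$, $\mathbb{C}^n$ are provided with monotone norms $\|\cdot\|_{\mathbb{C}^m}$, $\|\cdot\|_{\mathbb{C}^n}$, and $\|\cdot\|$ denotes the corresponding operator norm on $\mathbb{C}^{m\times n}$. Then

1. For every $P \in \mathbb{R}_+^{m\times n}$ there exists $u \in \mathbb{R}^n_+$, $\|u\|_{\mathbb{C}^n} = 1$, such that $\|Pu\|_{\mathbb{C}^m} = \|P\|$.
2. If $P \in \mathbb{C}^{m\times n}$, $Q \in \mathbb{R}_+^{m\times n}$ and $|P| \le Q$, then $\|P\| \le \||P|\| \le \|Q\|$.
3. If $P \in \mathbb{C}^{m\times n}$ is of rank one, then $\|P\| = \||P|\|$.

The following proposition illustrates that the stability radius problem is substantially simplified if the system $(A, B, C) \in \mathcal{L}_{n,l,q}(\mathbb{R})$ is positive, i.e. $A$ is a Metzler matrix and $B, C \ge 0$.

Proposition 1. [17] Suppose $A \in \mathbb{R}^{n\times n}$ is a stable Metzler matrix, $(B, C) \in \mathbb{R}_+^{n\times l} \times \mathbb{R}_+^{q\times n}$, and $\mathbb{C}^l$, $\mathbb{C}^q$ are provided with monotonic norms. Then, with respect to the induced operator norm $\|\cdot\|_{\mathcal{L}(\mathbb{K}^q, \mathbb{K}^l)}$ on $\mathbb{K}^{l\times q}$, $\mathbb{K} = \mathbb{R}, \mathbb{C}$,

$$r_{\mathbb{C}}(A, B, C) = r_{\mathbb{R}}(A, B, C) = r_{\mathbb{R}_+}(A, B, C) = \|CA^{-1}B\|^{-1}_{\mathcal{L}(\mathbb{R}^l, \mathbb{R}^q)}, \tag{5}$$

where $r_{\mathbb{R}_+}(A, B, C) = \inf\{\|\Delta\|_{\mathcal{L}(\mathbb{R}^q, \mathbb{R}^l)};\ \Delta \in \mathbb{R}_+^{l\times q},\ \sigma(A(\Delta)) \not\subset \mathbb{C}_-\}$.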
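Formula (5) is easy to evaluate numerically. The sketch below uses the spectral norm (induced by the Euclidean norms, which are monotone) and a small single-input single-output example; the function names and the sample data are assumptions made for illustration only.

```python
import numpy as np

def positive_stability_radius(A, B, C):
    """1 / ||C A^{-1} B||_2 for a stable Metzler A and B, C >= 0, as in (5).

    Returns inf if C A^{-1} B = 0, following the convention 0^{-1} = inf.
    """
    G0 = C @ np.linalg.solve(A, B)
    norm = np.linalg.norm(G0, 2)
    return np.inf if norm == 0.0 else 1.0 / norm

def spectral_abscissa(M):
    """Largest real part of the eigenvalues of M."""
    return float(np.max(np.linalg.eigvals(M).real))

# Hypothetical positive system: A is a stable Metzler matrix, B and C are nonnegative.
A = np.array([[-2.0, 1.0],
              [1.0, -3.0]])
B = np.array([[1.0], [0.0]])
C = np.array([[0.0, 1.0]])

r = positive_stability_radius(A, B, C)
```

For this example $CA^{-1}B = -0.2$, so $r = 5$. The nonnegative perturbation $\Delta = 5$ makes $A + B\Delta C$ singular, whereas any smaller nonnegative $\Delta$ leaves it stable, in line with $r_{\mathbb{R}_+} = r_{\mathbb{R}} = r_{\mathbb{C}}$.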

3 Interconnected Systems

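To develop a framework for the analysis of interconnected positive systems with uncertain couplings we need some additional notation, see [13]. In the following, $l, n, q$ are finite $N$-tuples $l = (l_1, \dots, l_N)$, $n = (n_1, \dots, n_N)$, $q = (q_1, \dots, q_N)$, $l_j, n_j, q_j, N \in \mathbb{N}$. By $\mathbb{K}^{l\times q}$ we denote the set of $N \times N$ block matrices

$$\Delta = [\Delta_{ij}]_{i,j \in \underline{N}} = \begin{bmatrix} \Delta_{11} & \cdots & \Delta_{1N} \\ \vdots & & \vdots \\ \Delta_{N1} & \cdots & \Delta_{NN} \end{bmatrix}, \qquad \Delta_{ij} \in \mathbb{K}^{l_i \times q_j},\ (i, j) \in \underline{N} \times \underline{N}. \tag{6}$$

Suppose that $E \in \mathbb{R}_+^{N\times N}$ is a given nonnegative matrix with entries $e_{ij} \ge 0$. A block matrix $\Delta = (\Delta_{ij}) \in \boldsymbol{\Delta} := \mathbb{C}^{l\times q}$ is said to be of structure $E$ if $e_{ij} = 0$ implies $\Delta_{ij} = 0$. Let $\boldsymbol{\Delta}_E \subset \boldsymbol{\Delta}$ be the set of all the block matrices $\Delta \in \mathbb{C}^{l\times q}$ of structure $E$.

Given $X = (x_{ij}) \in \mathbb{C}^{h\times k}$, $Y = (y_{ij}) \in \mathbb{C}^{h\times k}$ where $h, k$ are positive integers, the Hadamard product of $X$ and $Y$, denoted by $X \circ Y$, is defined by $X \circ Y = (x_{ij} y_{ij}) \in \mathbb{C}^{h\times k}$. For $k \in \mathbb{Z}$ the $k$-th Hadamard power of $X$ is defined by $X^{\circ k} := (x_{ij}^{\circ k})$ where $x_{ij}^{\circ k} := x_{ij}^k$ if $x_{ij} \neq 0$ and $x_{ij}^{\circ k} := 0$ if $x_{ij} = 0$. Given a matrix $X = (x_{ij}) \in \mathbb{C}^{N\times N}$ and a block matrix $Y = [Y_{jk}]_{j,k \in \underline{N}} \in \mathbb{C}^{l\times q}$, the Hadamard block product of $X$ and $Y$ is, by definition, the block matrix $X \circ Y := [x_{ij} Y_{ij}]_{i,j \in \underline{N}} \in \mathbb{C}^{l\times q}$. Note that for every $\Delta = (\Delta_{ij}) \in \mathbb{C}^{l\times q}$ we have $E \circ \Delta = [e_{ij} \Delta_{ij}]_{i,j \in \underline{N}} \in \boldsymbol{\Delta}_E$.

Given $N$ subsystems $(A_i, B_i, C_i) \in \mathcal{L}_{n_i, l_i, q_i}(\mathbb{K})$, $i \in \underline{N}$, consider the system

$$\Sigma: \quad \dot x(t) = Ax(t) + Bu(t), \qquad y(t) = Cx(t) \tag{7}$$

where $A, B, C$ are the block-diagonal matrices $A = \oplus_{i=1}^N A_i \in \mathbb{K}^{n\times n}$, $B = \oplus_{i=1}^N B_i \in \mathbb{K}^{n\times l}$, $C = \oplus_{i=1}^N C_i \in \mathbb{K}^{q\times n}$. $\Sigma$ is the direct sum of the $N$ subsystems

$$\Sigma_i: \quad \dot x_i(t) = A_i x_i(t) + B_i u_i(t), \qquad y_i(t) = C_i x_i(t), \qquad i \in \underline{N}. \tag{8}$$

The transfer matrix of $\Sigma$ is the direct sum of the transfer matrices of the subsystems:

$$G(s) = C(sI - A)^{-1}B = \oplus_{i=1}^N G_i(s), \qquad G_i(s) = C_i(sI_{n_i} - A_i)^{-1}B_i, \qquad i \in \underline{N}. \tag{9}$$

Introducing, for a given $\Delta = (\Delta_{ij}) \in \boldsymbol{\Delta} := \mathbb{K}^{l\times q}$, the couplings

$$u_i(t) = \sum_{j \in \underline{N}} e_{ij} \Delta_{ij} y_j(t), \qquad i \in \underline{N}, \tag{10}$$

one obtains the coupled system equations

$$\dot x_i(t) = A_i x_i(t) + B_i \sum_{j \in \underline{N}} e_{ij} \Delta_{ij} C_j x_j(t), \qquad i \in \underline{N}, \tag{11}$$

which altogether describe the interconnected system

$$\Sigma_\Delta: \quad \begin{bmatrix} \dot x_1 \\ \vdots \\ \dot x_N \end{bmatrix} = \bigl(A + B(E \circ \Delta)C\bigr) \begin{bmatrix} x_1 \\ \vdots \\ x_N \end{bmatrix}. \tag{12}$$

In this description the scaling matrix $E$ has a double role. On the one hand it defines the structure of the admissible couplings: The subsystem $\Sigma_j$ is coupled to the subsystem $\Sigma_i$ (by the uncertain coupling $e_{ij}\Delta_{ij}$) if and only if $e_{ij} > 0$. On the other hand the positive entries $e_{ij}$ of $E$ provide weights for the blocks $\Delta_{ij}$. Since these weights cannot, in general, be absorbed by the matrices $B_i$ and/or $C_j$, they provide an additional scaling flexibility.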
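The assembly of the interconnected system matrix in (12) can be sketched as follows. The helper names `block_diag` and `coupled_system_matrix`, as well as the convention that `Delta[i][j]` is a NumPy array of shape $(l_i, q_j)$, are assumptions made for this illustration.

```python
import numpy as np

def block_diag(blocks):
    """Direct sum of a list of 2-D arrays."""
    rows = sum(b.shape[0] for b in blocks)
    cols = sum(b.shape[1] for b in blocks)
    out = np.zeros((rows, cols))
    r = c = 0
    for b in blocks:
        out[r:r + b.shape[0], c:c + b.shape[1]] = b
        r += b.shape[0]
        c += b.shape[1]
    return out

def coupled_system_matrix(As, Bs, Cs, E, Delta):
    """Return A + B (E o Delta) C, where Delta[i][j] has shape (l_i, q_j)."""
    N = len(As)
    A, B, C = block_diag(As), block_diag(Bs), block_diag(Cs)
    # Hadamard block product: block (i, j) of Delta is weighted by e_ij.
    weighted = np.block([[E[i, j] * Delta[i][j] for j in range(N)]
                         for i in range(N)])
    return A + B @ weighted @ C

# Hypothetical example with two scalar subsystems coupled in a ring.
As = [np.array([[-1.0]]), np.array([[-2.0]])]
Bs = [np.array([[1.0]]), np.array([[1.0]])]
Cs = [np.array([[1.0]]), np.array([[1.0]])]
E = np.array([[0.0, 1.0],
              [1.0, 0.0]])
Delta = [[np.array([[7.0]]), np.array([[2.0]])],
         [np.array([[3.0]]), np.array([[9.0]])]]

A_coupled = coupled_system_matrix(As, Bs, Cs, E, Delta)
```

Because $e_{11} = e_{22} = 0$, the diagonal blocks of `Delta` have no effect: only the couplings permitted by the structure matrix $E$ enter the system matrix.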

4 Stability Radii

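In this section we derive computable formulas for the stability radii of the block-diagonal system $\dot x = Ax$ with respect to structured perturbations of the form

$$A \leadsto A(\Delta) := A + B(E \circ \Delta)C, \qquad \Delta \in \boldsymbol{\Delta}, \tag{13}$$

see (12). The perturbed matrix $A(\Delta)$ defined in (13) is the system matrix of the coupled system $\Sigma_\Delta$. We continue to use the set-up of the previous section. Let $\|\cdot\|_{Y_j}$ be a norm on $Y_j := \mathbb{K}^{q_j}$, $\|\cdot\|^D_{Y_j}$ its dual norm, see [10], and $\|\cdot\|_{U_i}$ a norm on $U_i := \mathbb{K}^{l_i}$. The corresponding operator norms on $\mathbb{K}^{l_i \times q_j}$ (resp. $\mathbb{K}^{q_i \times l_j}$) are denoted by $\|\cdot\|_{\mathcal{L}(Y_j, U_i)}$ (resp. $\|\cdot\|_{\mathcal{L}(U_j, Y_i)}$). For any $\Delta = (\Delta_{ij}) \in \mathbb{K}^{l\times q}$ and $M = (M_{ij}) \in \mathbb{K}^{q\times l}$ we define

$$\tilde\Delta := \bigl(\|\Delta_{ij}\|_{\mathcal{L}(Y_j, U_i)}\bigr) \in \mathbb{R}_+^{N\times N}, \qquad \tilde M := \bigl(\|M_{ij}\|_{\mathcal{L}(U_j, Y_i)}\bigr) \in \mathbb{R}_+^{N\times N}. \tag{14}$$

Lemma 2. If $M = (M_{ij}) \in \mathbb{K}^{q\times l}$ and $E \in \mathbb{R}_+^{N\times N}$ then

$$\rho((E \circ \Delta)M) \le \rho((E \circ \tilde\Delta)\tilde M), \qquad \Delta = (\Delta_{ij}) \in \mathbb{K}^{l\times q}. \tag{15}$$

Proof. Let $Y = (Y_{ij})$, $Y_{ij} = ((E \circ \Delta)M)_{ij} = \sum_{k=1}^N e_{ik} \Delta_{ik} M_{kj}$; then by [13, Lemma 4.1] $\rho(Y) \le \rho(\tilde Y)$ where $\tilde Y = (\|Y_{ij}\|_{\mathcal{L}(U_j, U_i)})$. Since

$$\|Y_{ij}\|_{\mathcal{L}(U_j, U_i)} \le \sum_{k=1}^N e_{ik} \|\Delta_{ik}\|_{\mathcal{L}(Y_k, U_i)} \|M_{kj}\|_{\mathcal{L}(U_j, Y_k)} = ((E \circ \tilde\Delta)\tilde M)_{ij}$$

for $i, j \in \underline{N}$, we have $0 \le \tilde Y \le (E \circ \tilde\Delta)\tilde M$ and hence (15) follows from (1). □

Let $\mathcal{N}(\cdot)$ be any vector norm on $\mathbb{R}^{N\times N}$ and define the perturbation norm on $\boldsymbol{\Delta} = \mathbb{K}^{l\times q}$ by

$$\|\Delta\|_{\boldsymbol{\Delta}} = \mathcal{N}(\tilde\Delta), \qquad \Delta \in \mathbb{K}^{l\times q}. \tag{16}$$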

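Lemma 3. If $M = \oplus_{j=1}^N M_j \in \mathbb{K}^{q\times l}$ is block-diagonal and $E \in \mathbb{R}_+^{N\times N}$, then, for every matrix $\delta = (\delta_{ij}) \in \mathbb{R}_+^{N\times N}$, there exists $\Delta \in \mathbb{K}^{l\times q}$ such that

$$\rho((E\circ\Delta)M) = \rho((E\circ\tilde\Delta)\tilde M) \quad\text{and}\quad \tilde\Delta = (\delta_{ij}). \tag{17}$$

If $M_j \ge 0$ for all $j \in \underline{N}$, then $\Delta$ can be chosen to be nonnegative.

Proof. For $j \in \underline{N}$ let $u_j \in \mathbb{K}^{l_j}$ and $y_j \in \mathbb{K}^{q_j}$ be such that $\|u_j\|_{U_j} = \|y_j\|^D_{Y_j} = 1$ and $y_j^* M_j u_j = \|M_j\|_{\mathcal{L}(U_j,Y_j)}$. Define

$$D_u := \operatorname{diag}(u_1,\dots,u_N) \in \mathbb{K}^{l\times N}, \qquad D_{y^*} := \operatorname{diag}(y_1^*,\dots,y_N^*) \in \mathbb{K}^{N\times q}.$$

Then $\tilde M = \operatorname{diag}(y_1^* M_1 u_1,\dots,y_N^* M_N u_N) = D_{y^*} M D_u$ and so

$$(E\circ\delta)\tilde M = (E\circ\delta)D_{y^*}MD_u, \qquad \rho\bigl((E\circ\delta)\tilde M\bigr) = \rho\bigl(D_u(E\circ\delta)D_{y^*}M\bigr). \tag{18}$$

Setting $\Delta_{ij} := \delta_{ij}u_i y_j^* \in \mathbb{K}^{l_i\times q_j}$ we obtain a block matrix $\Delta := (\Delta_{ij}) \in \mathbb{K}^{l\times q}$ satisfying $\|\Delta_{ij}\|_{\mathcal{L}(Y_j,U_i)} = \delta_{ij}$ for all $i, j \in \underline{N}$, i.e. $\tilde\Delta = \delta$, and

$$E\circ\Delta = (e_{ij}\Delta_{ij}) = (e_{ij}\delta_{ij}u_i y_j^*) = D_u(E\circ\delta)D_{y^*},$$

hence $\rho((E\circ\Delta)M) = \rho(D_u(E\circ\delta)D_{y^*}M) = \rho((E\circ\delta)\tilde M) = \rho((E\circ\tilde\Delta)\tilde M)$ by (18). This proves the first part of the lemma. If, additionally, $M_j \ge 0$ for all $j \in \underline{N}$ then the above vectors $u_j, y_j$ can be chosen nonnegative by Lemma 1. But then $\Delta = (\Delta_{ij}) = (\delta_{ij}u_i y_j^*) \ge 0$. This proves the second part of the lemma. □

Now suppose we are given the following data: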

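$$\begin{gathered}(A_i,B_i,C_i) \in L_{n_i,l_i,q_i}(\mathbb{K}), \quad G_i(s) = C_i(sI - A_i)^{-1}B_i, \quad i \in \underline{N}, \quad E \in \mathbb{R}_+^{N\times N},\\ A = \oplus_{i=1}^N A_i, \quad \sigma(A) \subset \mathbb{C}_-, \quad B = \oplus_{i=1}^N B_i, \quad C = \oplus_{i=1}^N C_i, \quad G(s) = \oplus_{i=1}^N G_i(s).\end{gathered} \tag{19}$$

For any $G \in \mathbb{C}^{N\times N}$ we define

$$\mu^{\mathcal{N}}(G;E) = \inf\{\mathcal{N}(\delta);\ \delta \in \mathbb{R}_+^{N\times N},\ \rho((E\circ\delta)G) \ge 1\}. \tag{20}$$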

The following theorem reduces the stability radius problem for the complex interconnected system $\Sigma_\Delta$ of dimension $|n| = \sum_{i=1}^N n_i$ to an $N$-dimensional problem with non-negative data.

Theorem 1. Suppose (19) and let $\boldsymbol{\Delta} = \mathbb{C}^{l\times q}$ be provided with the norm (16), then the stability radius of $A$ with respect to perturbations of the form (13) is given by

$$r_{\mathbb{C}}(A,B,C;E) := \inf\{\|\Delta\|_{\boldsymbol{\Delta}};\ \Delta \in \mathbb{C}^{l\times q},\ \sigma(A(\Delta)) \not\subset \mathbb{C}_-\} = \min_{\omega\in\mathbb{R}} \mu^{\mathcal{N}}(D(\imath\omega);E) \tag{21}$$

where $D(s) = \operatorname{diag}\bigl(\|G_1(s)\|_{\mathcal{L}(U_1,Y_1)},\dots,\|G_N(s)\|_{\mathcal{L}(U_N,Y_N)}\bigr)$. If all the subsystems $(A_i,B_i,C_i)$, $i \in \underline{N}$ are positive and the spaces $U_i, Y_i$ are provided with monotone norms then

$$r_{\mathbb{C}}(A,B,C;E) = r_{\mathbb{R}}(A,B,C;E) = r_{\mathbb{R}_+}(A,B,C;E) = \mu^{\mathcal{N}}(D(0);E) \tag{22}$$

where $r_{\mathbb{R}}(A,B,C;E)$, $r_{\mathbb{R}_+}(A,B,C;E)$ are defined analogously to $r_{\mathbb{C}}(A,B,C;E)$.

Proof. By [7, Lemma 5.2.7] the following equivalence holds for all $s \in \rho(A)$ and all $\Delta \in \mathbb{C}^{l\times q}$:

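$$s \in \sigma(A + B(E\circ\Delta)C) \iff 1 \in \sigma((E\circ\Delta)G(s)). \tag{23}$$

Since $\widetilde{G(s)} = D(s)$, it follows from Lemma 2 that $1 \in \sigma((E\circ\Delta)G(s))$ implies $\rho((E\circ\delta)D(s)) \ge 1$ for $\delta := \tilde\Delta$. Conversely, if $\rho := \rho((E\circ\delta)D(s)) \ge 1$ for some $\delta \in \mathbb{R}_+^{N\times N}$ then there exists by Lemma 3 $\Delta \in \mathbb{C}^{l\times q}$ such that $\rho((E\circ\Delta)G(s)) = \rho$ and $\tilde\Delta = \delta$, whence $\|\Delta\|_{\boldsymbol{\Delta}} = \mathcal{N}(\delta)$. Let $\lambda \in \sigma((E\circ\Delta)G(s))$ satisfy $|\lambda| = \rho \ge 1$, then $1 \in \sigma((E\circ(\lambda^{-1}\Delta))G(s))$ and $\lambda^{-1}\Delta \in \mathbb{C}^{l\times q}$ is of norm $\|\lambda^{-1}\Delta\|_{\boldsymbol{\Delta}} \le \mathcal{N}(\delta)$. Therefore it follows from the continuity of the spectrum that

$$\begin{aligned} r_{\mathbb{C}}(A,B,C;E) &= \min_{\omega\in\mathbb{R}} \inf\{\|\Delta\|_{\boldsymbol{\Delta}};\ \Delta \in \boldsymbol{\Delta},\ 1 \in \sigma((E\circ\Delta)G(\imath\omega))\}\\ &= \min_{\omega\in\mathbb{R}} \inf\{\mathcal{N}(\delta);\ \delta \in \mathbb{R}_+^{N\times N},\ \rho((E\circ\delta)D(\imath\omega)) \ge 1\}\\ &= \min_{\omega\in\mathbb{R}} \mu^{\mathcal{N}}(D(\imath\omega);E). \end{aligned}$$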

This proves (21). Now assume that $(A,B,C)$ is a positive system and the spaces $U_i, Y_i$ are provided with monotone norms. Then $\|G_i(\imath\omega)\|_{\mathcal{L}(U_i,Y_i)} \le \|G_i(0)\|_{\mathcal{L}(U_i,Y_i)}$ for all $i \in \underline{N}$, $\omega \in \mathbb{R}$ (see [17, §4]) and hence $\rho((E\circ\delta)D(\imath\omega)) \le \rho((E\circ\delta)D(0))$ for all $\delta \in \mathbb{R}_+^{N\times N}$ by (1). It follows that $\min_{\omega\in\mathbb{R}} \mu^{\mathcal{N}}(D(\imath\omega);E) = \mu^{\mathcal{N}}(D(0);E)$ whence $r_{\mathbb{C}}(A,B,C;E) = \mu^{\mathcal{N}}(D(0);E)$ by (21). Trivially we have $r_{\mathbb{C}} \le r_{\mathbb{R}} \le r_{\mathbb{R}_+}$. Hence it suffices by (20) and (23) to prove that for every $\delta \in \mathbb{R}_+^{N\times N}$ such that $\rho((E\circ\delta)D(0)) \ge 1$ there exists $\Delta \in \mathbb{R}_+^{l\times q}$ satisfying (17). But this follows by application of Lemma 3 to the nonnegative block-diagonal matrix $M := \oplus_{i=1}^N G_i(0) \ge 0$. □

Remark 1. Theorem 1 generalizes known results for single systems ($N = 1$) to interconnected systems with an arbitrarily prescribed coupling structure. In fact, let

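$N = 1$ and choose $E = (1) \in \mathbb{R}_+^{1\times 1}$, $\mathcal{N}(\delta) = |\delta|$ for $\delta \in \mathbb{K}^{1\times 1} = \mathbb{K}$. Then $(A,B,C) = (A_1,B_1,C_1)$ and, omitting the subindices, we have $D(s) = \|G(s)\|_{\mathcal{L}(\mathbb{K}^l,\mathbb{K}^q)} = \|C(sI - A)^{-1}B\|_{\mathcal{L}(\mathbb{K}^l,\mathbb{K}^q)}$, $D(0) = \|CA^{-1}B\|_{\mathcal{L}(\mathbb{K}^l,\mathbb{K}^q)}$, $r_{\mathbb{K}}(A,B,C;E) = r_{\mathbb{K}}(A,B,C)$, and

$$\mu^{\mathcal{N}}(D(s);E) = \inf\{|\delta|;\ \delta \in \mathbb{R}_+,\ \rho(\delta D(s)) \ge 1\} = 1/\|G(s)\|_{\mathcal{L}(\mathbb{K}^l,\mathbb{K}^q)}, \quad s \in \mathbb{C}\setminus\sigma(A).$$

Hence (21) and (22) specialize to

$$r_{\mathbb{C}}(A,B,C) = \min_{\omega\in\mathbb{R}} 1/\|G(\imath\omega)\|_{\mathcal{L}(\mathbb{C}^l,\mathbb{C}^q)}$$

and

$$r_{\mathbb{C}}(A,B,C) = r_{\mathbb{R}}(A,B,C) = r_{\mathbb{R}_+}(A,B,C) = 1/\|CA^{-1}B\|_{\mathcal{L}(\mathbb{R}^l,\mathbb{R}^q)},$$

respectively. The latter formula coincides with (5) and the first one coincides with formula (22) in [7, § 5.3.2].

We will now apply the previous theorem to specific norms on $\mathbb{R}^{N\times N}$ in order to derive explicit and computable formulas for the stability radius $r_{\mathbb{C}}(A,B,C;E)$ of interconnected positive systems. For this we need to determine $\mu^{\mathcal{N}}(D(0);E)$. Since $D(0) = \operatorname{diag}(d_1,\dots,d_N)$, $d_i = \|G_i(0)\|_{\mathcal{L}(U_i,Y_i)}$ is diagonal, we have $(E\circ\delta)D(0) = \delta\circ(ED(0))$ for $\delta = (\delta_{ij}) \in \mathbb{R}_+^{N\times N}$ and hence by (20)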

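$$\mu^{\mathcal{N}}(D(0),E) = \inf\{\mathcal{N}(\delta);\ \delta \in \mathbb{R}_+^{N\times N},\ \rho(\delta\circ(ED(0))) \ge 1\}. \tag{24}$$

We first consider the maximum norm on $\mathbb{R}^{N\times N}$:

$$\mathcal{N}(\delta) = \|\delta\|_{\max} := \max_{i,j\in\underline{N}} |\delta_{ij}|, \quad \delta = (\delta_{ij}) \in \mathbb{R}^{N\times N}. \tag{25}$$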

Corollary 1. Suppose (19) and let $\boldsymbol{\Delta} = \mathbb{C}^{l\times q}$ be provided with the norm (16) where $\mathcal{N}$ is the maximum norm (25). If all the subsystems $(A_i,B_i,C_i)$, $i \in \underline{N}$ are positive and the spaces $U_i, Y_i$ are provided with monotone norms then

$$r_{\mathbb{C}}(A,B,C;E) = r_{\mathbb{R}}(A,B,C;E) = r_{\mathbb{R}_+}(A,B,C;E) = 1/\rho(ED(0)). \tag{26}$$

Proof. If $\delta \in \mathbb{R}_+^{N\times N}$ satisfies $\rho(\delta\circ(ED(0))) \ge 1$ then $\|\delta\|_{\max}\,\rho(ED(0)) \ge 1$ by (1) and hence $\mu^{\mathcal{N}}(D(0),E) \ge 1/\rho(ED(0))$ by (20). On the other hand, if $\mathbb{1}$ denotes the $N\times N$-matrix with all entries equal to one, then $\delta := (1/\rho(ED(0)))\mathbb{1}$ satisfies $\|\delta\|_{\max} = 1/\rho(ED(0))$ and $\rho(\delta\circ(ED(0))) = 1$ whence $\mu^{\mathcal{N}}(D(0),E) \le 1/\rho(ED(0))$ by (24). We conclude that $\mu^{\mathcal{N}}(D(0),E) = 1/\rho(ED(0))$ and so (26) follows from (22). □
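Formula (26) reduces the stability radius to one spectral radius of an $N\times N$ nonnegative matrix. The following Python sketch illustrates this under stated assumptions: the helper names, the shifted power iteration and the numerical data are illustrative choices, not part of the paper, and the routine assumes $\rho(ED(0)) > 0$ with a well-separated dominant eigenvalue.

```python
from typing import List

Matrix = List[List[float]]


def spectral_radius(m: Matrix, iters: int = 2000, tol: float = 1e-13) -> float:
    """Approximate rho(m) for an entrywise nonnegative square matrix m.

    Power iteration is run on m + I (so periodic matrices do not make the
    iteration oscillate) and rho(m) = rho(m + I) - 1 is returned.
    Convergence is slow for defective or nilpotent matrices.
    """
    n = len(m)
    shifted = [[m[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    v = [1.0 / n] * n          # positive start vector with entries summing to one
    estimate = 0.0
    for _ in range(iters):
        w = [sum(shifted[i][j] * v[j] for j in range(n)) for i in range(n)]
        total = sum(w)         # >= 1 because of the identity shift
        v = [x / total for x in w]
        converged = abs(total - estimate) < tol
        estimate = total
        if converged:
            break
    return estimate - 1.0


def stability_radius_max_norm(e: Matrix, d: List[float]) -> float:
    """Evaluate 1 / rho(E D(0)) as in (26); d holds the diagonal of D(0)."""
    n = len(d)
    ed0 = [[e[i][j] * d[j] for j in range(n)] for i in range(n)]
    return 1.0 / spectral_radius(ed0)   # assumes rho(E D(0)) > 0


if __name__ == "__main__":
    # Hypothetical data: two subsystems coupled in a ring, then fully coupled.
    print(stability_radius_max_norm([[0.0, 1.0], [1.0, 0.0]], [2.0, 0.5]))
    print(stability_radius_max_norm([[1.0, 1.0], [1.0, 1.0]], [1.0, 1.0]))
```

For the two hypothetical examples the printed values should be close to 1 and 0.5, since $ED(0)$ has spectral radius 1 and 2, respectively.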

We now consider operator norms on $\mathbb{R}^{N\times N}$ which are induced by weakly monotone and permutation invariant norms on $\mathbb{R}^N$ (e.g. $p$-norms, $1 \le p \le \infty$). A norm $\|\cdot\|_{\mathbb{R}^N}$ on $\mathbb{R}^N$ is called permutation invariant if $\|(x_i)\|_{\mathbb{R}^N} = \|(x_{\pi(i)})\|_{\mathbb{R}^N}$ for all $(x_i) \in \mathbb{R}^N$ and all permutations $\pi$ of $\underline{N}$. It is called weakly monotone if $\|x\|_{\mathbb{R}^N} \ge \|y\|_{\mathbb{R}^N}$ for all $x, y \in \mathbb{R}^N$ such that, for all $i \in \underline{N}$, either $y_i = x_i$ or $y_i = 0$.

Definition 2. [11, Def. 5.7.20] The maximum cycle geometric mean of a nonnegative matrix $G = (g_{ij}) \in \mathbb{R}_+^{N\times N}$ is

$$c(G) := \max\Bigl(\prod_{j=1}^{k} g_{i_j i_{j+1}}\Bigr)^{1/k} \tag{27}$$

in which $k + 1$ is identified with 1 and the maximum is taken over all sequences of distinct indices $i_1,\dots,i_k \le N$ and over all $k \in \underline{N}$.

Given two nonnegative matrices $G, H \in \mathbb{R}_+^{N\times N}$ the following inequalities hold, see [11, Theorem 5.7.21].

c(G) ≤ ρ(G) ≤ Nc(G) and ρ(G ◦ H) ≤ c(G)ρ(H). (28)
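To make Definition 2 concrete, here is a brute-force Python sketch that evaluates $c(G)$ by enumerating all simple cycles and checks the first chain of inequalities in (28) on a small example. The function names and the test matrix are illustrative assumptions; the enumeration is exponential in $N$ and is only meant for small matrices, and the closed-form spectral radius is valid for $2\times 2$ nonnegative matrices only.

```python
import math
from itertools import permutations
from typing import List

Matrix = List[List[float]]


def max_cycle_geometric_mean(g: Matrix) -> float:
    """Maximum cycle geometric mean (27) of a nonnegative matrix, by enumeration."""
    n = len(g)
    best = 0.0
    for k in range(1, n + 1):
        for cycle in permutations(range(n), k):
            # Each cyclic sequence appears k times; keep the rotation that
            # starts at its smallest index.
            if cycle[0] != min(cycle):
                continue
            prod = 1.0
            for pos in range(k):
                prod *= g[cycle[pos]][cycle[(pos + 1) % k]]   # index k+1 wraps to 1
            best = max(best, prod ** (1.0 / k))
    return best


def spectral_radius_2x2(g: Matrix) -> float:
    """Closed-form spectral radius of a nonnegative 2x2 matrix (real eigenvalues)."""
    trace = g[0][0] + g[1][1]
    det = g[0][0] * g[1][1] - g[0][1] * g[1][0]
    return (trace + math.sqrt(trace * trace - 4.0 * det)) / 2.0


if __name__ == "__main__":
    g = [[3.0, 1.0], [1.0, 0.0]]           # hypothetical example matrix
    c = max_cycle_geometric_mean(g)
    rho = spectral_radius_2x2(g)
    print(c, rho, c <= rho <= len(g) * c)  # checks c(G) <= rho(G) <= N c(G)
```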

Corollary 2. Suppose (19) and let $\boldsymbol{\Delta} = \mathbb{C}^{l\times q}$ be provided with the norm (16) where $\mathcal{N}$ is any operator norm induced by a permutation invariant and weakly monotone norm on $\mathbb{R}^N$. If all the subsystems $(A_i,B_i,C_i)$, $i \in \underline{N}$ are positive and the spaces $U_i, Y_i$ are provided with monotone norms then

$$r_{\mathbb{C}}(A,B,C;E) = r_{\mathbb{R}}(A,B,C;E) = r_{\mathbb{R}_+}(A,B,C;E) = 1/c(ED(0)). \tag{29}$$

Proof. If $\delta \in \mathbb{R}_+^{N\times N}$ satisfies $\rho(\delta\circ(ED(0))) \ge 1$ then $\mathcal{N}(\delta) \ge \rho(\delta) \ge 1/c(ED(0))$ by (28), hence $\mu^{\mathcal{N}}(D(0),E) \ge 1/c(ED(0))$ by (24) and equality holds if $c(ED(0)) = 0$. To prove the converse inequality, suppose $c(ED(0)) > 0$ and let $i_1,\dots,i_k \le N$ be distinct indices such that $\prod_{j=1}^k g_{i_j i_{j+1}} = c(G)^k$ for $G := ED(0)$. Define $\delta \in \mathbb{R}_+^{N\times N}$ by $\delta_{i_j i_{j+1}} = 1/c(G)$, $j \in \underline{k}$, and $\delta_{ij} = 0$ for $(i,j) \in \underline{N}^2 \setminus \{(i_j, i_{j+1});\ j \in \underline{k}\}$. Then $\delta$ has at most one non-zero entry ($= 1/c(G)$) in each row so that $\|\delta x\|_{\mathbb{R}^N} \le (1/c(G))\|x\|_{\mathbb{R}^N}$ by weak monotonicity and permutation invariance. Moreover $H := \delta\circ(ED(0))$ satisfies

$$c(H)^k \ge \prod_{j=1}^k h_{i_j i_{j+1}} = \prod_{j=1}^k \delta_{i_j i_{j+1}}\, g_{i_j i_{j+1}} = c(G)^{-k} c(G)^k = 1.$$

Therefore $\rho(\delta\circ(ED(0))) \ge c(H) \ge 1$ by (28) and $\mathcal{N}(\delta) \le 1/c(G)$, whence $\mu^{\mathcal{N}}(D(0),E) \le 1/c(ED(0))$. This shows $\mu^{\mathcal{N}}(D(0),E) = 1/c(ED(0))$ and concludes the proof. □

Finally, we consider the case where $\mathcal{N}$ is a mixed norm of the form

$$\mathcal{N}(\delta) := \max_{i\in\underline{N}} \Bigl(\sum_{j\in\underline{N}} |\delta_{ij}|^p\Bigr)^{1/p}, \quad \delta \in \mathbb{R}^{N\times N} \tag{30}$$

where $1 < p < \infty$ is given.

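Corollary 3. Suppose (19), $1 < p, p^* < \infty$ and $1/p + 1/p^* = 1$. Let $\boldsymbol{\Delta} = \mathbb{C}^{l\times q}$ be provided with the norm (16) where $\mathcal{N}$ is given by (30). If all the subsystems $(A_i,B_i,C_i)$, $i \in \underline{N}$ are positive and the spaces $U_i, Y_i$ are provided with monotone norms then

$$r_{\mathbb{C}}(A,B,C;E) = r_{\mathbb{R}}(A,B,C;E) = r_{\mathbb{R}_+}(A,B,C;E) = \bigl[\rho\bigl(E^{\circ p^*} D(0)^{p^*}\bigr)\bigr]^{-1/p^*}. \tag{31}$$

Proof. Suppose that $\delta \in \mathbb{R}_+^{N\times N}$ satisfies $\rho(\delta\circ(ED(0))) = 1$. By Perron-Frobenius theory there exists $y \in \mathbb{R}_+^N$, $y \ne 0$ such that $(\delta\circ(ED(0)))y = y$, i.e. $\sum_{j\in\underline{N}} \delta_{ij} d_j e_{ij} y_j = y_i$ for $i \in \underline{N}$. If $1 < p < \infty$ then by Hölder's inequality

$$y_i \le \Bigl(\sum_{j\in\underline{N}} |\delta_{ij}|^p\Bigr)^{1/p} \Bigl(\sum_{j\in\underline{N}} |d_j e_{ij} y_j|^{p^*}\Bigr)^{1/p^*}.$$

Setting $z_i = y_i^{p^*}$ we obtain from (30) that $z_i \le \mathcal{N}(\delta)^{p^*} \sum_{j\in\underline{N}} d_j^{p^*} e_{ij}^{p^*} z_j$ for $i \in \underline{N}$. By [1, Thm. 2.1.11] this implies $\rho(E^{\circ p^*} D(0)^{p^*}) \ge 1/\mathcal{N}(\delta)^{p^*}$, hence $\mu^{\mathcal{N}}(D(0),E) \ge [\rho(E^{\circ p^*} D(0)^{p^*})]^{-1/p^*}$ by (24).

To prove the converse inequality, let $\rho := [\rho(E^{\circ p^*} D(0)^{p^*})]^{1/p^*}$ and $z \in \mathbb{R}_+^N$, $z \ne 0$ such that $E^{\circ p^*} D(0)^{p^*} z = \rho^{p^*} z$, i.e. $\sum_{j\in\underline{N}} d_j^{p^*} e_{ij}^{p^*} z_j = \rho^{p^*} z_i$ for $i \in \underline{N}$. Let $y_i := z_i^{1/p^*}$, $i \in \underline{N}$. Since $\|\cdot\|_p$ is the dual norm of $\|\cdot\|_{p^*}$ there exists, for every $i \in \underline{N}$, a vector $\delta^{(i)} = (\delta_{ij})_{j\in\underline{N}}$ of $p$-norm $\|\delta^{(i)}\|_p = 1/\rho$ such that for $i \in \underline{N}$

$$\sum_{j\in\underline{N}} \delta_{ij} d_j e_{ij} y_j = \frac{1}{\rho}\bigl\|(d_j e_{ij} y_j)_{j\in\underline{N}}\bigr\|_{p^*} = \frac{1}{\rho}\Bigl(\sum_{j\in\underline{N}} d_j^{p^*} e_{ij}^{p^*} y_j^{p^*}\Bigr)^{1/p^*} = \frac{1}{\rho}\bigl(\rho^{p^*} y_i^{p^*}\bigr)^{1/p^*} = y_i.$$

Hence $\delta := (\delta_{ij}) \in \mathbb{R}_+^{N\times N}$ is of norm $\mathcal{N}(\delta) = \max_{i\in\underline{N}} \|\delta^{(i)}\|_p = 1/\rho$ and satisfies $(\delta\circ(ED(0)))y = y$, whence $\rho(\delta\circ(ED(0))) \ge 1$. This proves $\mu^{\mathcal{N}}(D(0),E) = 1/\rho = [\rho(E^{\circ p^*} D(0)^{p^*})]^{-1/p^*}$ and so the corollary follows from Theorem 1. □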
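The construction in the proof above can be checked numerically. The Python sketch below evaluates the right-hand side of (31) and the mixed norm (30), and then verifies on one hypothetical example ($E$ the all-ones matrix, $D(0) = \operatorname{diag}(1,2)$, $p = p^* = 2$) that the perturbation $\delta$ built as in the proof has norm equal to the radius and satisfies $\rho(\delta\circ(ED(0))) = 1$. All names and data are illustrative assumptions, and the power iteration presupposes a well-separated dominant eigenvalue.

```python
from typing import List

Matrix = List[List[float]]


def spectral_radius(m: Matrix, iters: int = 2000, tol: float = 1e-13) -> float:
    """Shifted power iteration for an entrywise nonnegative matrix (sketch)."""
    n = len(m)
    shifted = [[m[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    v = [1.0 / n] * n
    estimate = 0.0
    for _ in range(iters):
        w = [sum(shifted[i][j] * v[j] for j in range(n)) for i in range(n)]
        total = sum(w)
        v = [x / total for x in w]
        converged = abs(total - estimate) < tol
        estimate = total
        if converged:
            break
    return estimate - 1.0


def mixed_norm(delta: Matrix, p: float) -> float:
    """The norm (30): maximum over rows of the p-norm of each row."""
    return max(sum(abs(x) ** p for x in row) ** (1.0 / p) for row in delta)


def stability_radius_mixed_norm(e: Matrix, d: List[float], p: float) -> float:
    """Right-hand side of (31); d holds the diagonal of D(0), 1 < p < infinity."""
    p_star = p / (p - 1.0)                      # conjugate exponent, 1/p + 1/p* = 1
    n = len(d)
    m = [[(e[i][j] * d[j]) ** p_star for j in range(n)] for i in range(n)]
    return spectral_radius(m) ** (-1.0 / p_star)


if __name__ == "__main__":
    e = [[1.0, 1.0], [1.0, 1.0]]
    d = [1.0, 2.0]
    r = stability_radius_mixed_norm(e, d, 2.0)
    # delta from the proof for this example: each row is (1, 2) / 5.
    delta = [[0.2, 0.4], [0.2, 0.4]]
    h = [[delta[i][j] * e[i][j] * d[j] for j in range(2)] for i in range(2)]
    print(r, mixed_norm(delta, 2.0), spectral_radius(h))
```

In this example $E^{\circ 2}D(0)^2$ has spectral radius 5, so the first two printed numbers should both be close to $5^{-1/2}$ and the third close to 1.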

Concluding Remark

In the robust control of large scale systems it is important to make sure that the closed loop interconnected system remains stable if the links between certain subsystems break down or change with time. Accordingly, Šiljak's concept of connective stability requires robustness against time-varying perturbations/couplings [15]. However, counterparts to the above results for time-varying perturbations (of prescribed structure $E$) are not yet available. To determine stability radii with respect to time-varying perturbations is known to be a difficult problem [18]. However, under positivity assumptions, the problem may become tractable. I conclude the paper with the following conjecture: Formulas (26), (29), and (31) also hold true if the perturbations are allowed to be time-varying. The proof or disproof of this conjecture is an open problem.

References

1. Berman, A., Plemmons, R.J.: Nonnegative Matrices in the Mathematical Sciences. In: Classics in Applied Mathematics, vol. 9. SIAM Publications, Philadelphia (1994)
2. Brualdi, R.A.: Matrices, eigenvalues, and directed graphs. Linear Multilinear Algebra 11(2), 143–165 (1982)
3. Doyle, J.C.: Analysis of feedback systems with structured uncertainties. IEE Proc. Part D 129, 242–250 (1982)
4. Fischer, A.: Stability radii of infinite-dimensional positive systems. Math. Control Signals Syst. 10, 223–236 (1997)
5. Fischer, A., Hinrichsen, D., Son, N.K.: Stability radii of Metzler operators. Vietnam J. of Mathematics 26, 147–163 (1998)
6. Gershgorin, S.A.: Über die Abgrenzung der Eigenwerte einer Matrix. Izvestia Akad. Nauk SSSR, Ser. Fis-Mat. 6, 749–754 (1931)
7. Hinrichsen, D., Pritchard, A.J.: Mathematical Systems Theory I. Modelling, State Space Analysis, Stability and Robustness. Springer, Berlin (2005)
8. Hinrichsen, D., Son, N.K.: μ-analysis and robust stability of positive linear systems. Appl. Math. and Comp. Sci. 8, 253–268 (1998)
9. Hinrichsen, D., Son, N.K.: Stability radii of positive discrete-time systems. In: Proc. 3rd Int. Conf. Approximation and Optimization in the Caribbean, Aportaciones Mat., Comunicaciones, Puebla, México, vol. 24, pp. 113–124 (1995). Sociedad Matemática Mexicana, México (1998)
10. Horn, R.A., Johnson, C.R.: Matrix Analysis. Cambridge University Press, Cambridge (1990)
11. Horn, R.A., Johnson, C.R.: Topics in Matrix Analysis. Cambridge University Press, Cambridge (1991)
12. Hua, G., Davison, E.J.: Real stability radii of linear time-invariant time-delay systems. Syst. Control Lett. 50, 209–219 (2003)
13. Karow, M., Hinrichsen, D., Pritchard, A.J.: Interconnected systems with uncertain couplings: explicit formulae for μ-values, spectral value sets and stability radii. SIAM J. Control Optim. 45(3), 856–884 (2006)
14. Ngoc, P.H.A.: Stability radii of positive linear Volterra-Stieltjes equations. Journal of Differential Equations 243, 101–122 (2007)
15. Šiljak, D.D.: Large Scale Dynamic Systems. Stability and Structure. Series in System Science and Engineering. North-Holland, New York (1978)
16. Son, N., Ngoc, P.: Robust stability of positive linear time-delay systems under affine parameter perturbations. Acta Math. Vietnamica 24, 353–372 (1999)
17. Son, N.K., Hinrichsen, D.: Robust stability of positive continuous time systems. Numer. Functional Anal. Optim. 17, 649–659 (1996)
18. Wirth, F.: On the calculation of real time-varying stability radii. Int. J. Robust & Nonlinear Control 8, 1043–1058 (1998)

Linear Operators Preserving the Set of Positive (Nonnegative) Polynomials

Olga M. Katkova and Anna M. Vishnyakova

Abstract. This note deals with linear operators preserving the set of positive (nonnegative) polynomials. Numerous works of prominent mathematicians in fact contain the exhaustive description of linear operators preserving the set of positive (nonnegative) polynomials. In spite of this, since this description was not formulated explicitly, it is almost lost for possible applications. In the paper we formulate and prove these classical results and give some applications. For example, we prove that there are no linear ordinary differential operators of order $m \in \mathbb{N}$ with polynomial coefficients which map the set of nonnegative (positive) polynomials of degree $\le 2(\lfloor m/2 \rfloor + 1)$ into the set of nonnegative polynomials. This result is a generalization of a Theorem by Guterman and Shapiro.

1 Introduction and Statement of Results

A real polynomial $P$ is called a positive (nonnegative) polynomial if $P(x) > 0$ ($P(x) \ge 0$) for every $x \in \mathbb{R}$. Positive (nonnegative) polynomials arise in different branches of mathematics, physics, engineering and other sciences (see [3] for a number of open problems connected with the topic, see also a recent paper [2]). Numerous works of prominent mathematicians are devoted to linear operators preserving the set of positive (nonnegative) polynomials and connected questions (see, for example, [1], [6]-[12] and the references therein). These classical works in fact contain the exhaustive description of linear operators preserving the set of positive (nonnegative) polynomials. In spite of this, since this description is not formulated explicitly, it is almost lost for possible applications. In this short note we formulate and prove these classical results and give some applications.

Olga M. Katkova and Anna M. Vishnyakova Department of Mathematics, Kharkov National University, 4 Svobody sq., 61077 Kharkov, Ukraine, e-mail: [email protected], [email protected]

R. Bru and S. Romero-Vivó (Eds.): Positive Systems, LNCIS 389, pp. 83–90. springerlink.com © Springer-Verlag Berlin Heidelberg 2009

We will use two standard notations: $\mathbb{R}[x]$ will denote the set of real polynomials and $\mathbb{R}_m[x] = \{Q \in \mathbb{R}[x] : \deg Q \le m\}$, $m = 0, 1, \dots$. The following theorem is a classical result. Unfortunately we cannot name its authors, but it is based on the ideas which arose in works by Euler, Gauss and Sylvester.

Theorem 1. Let $A : \mathbb{R}[x] \to \mathbb{R}[x]$ be a linear operator. Let $P_k(x) := A(x^k)$, $k = 0, 1, \dots$.

1. The operator $A$ preserves the set of nonnegative polynomials if and only if all principal minors of the infinite matrix $\mathcal{A}(x) := (P_{i+j}(x))_{i,j=0}^{\infty}$ are nonnegative for every $x \in \mathbb{R}$.

2. The operator $A : \mathbb{R}_{2m}[x] \to \mathbb{R}[x]$ maps the set of nonnegative polynomials of degree $\le 2m$ into the set of nonnegative polynomials if and only if all principal minors of the $(m+1)\times(m+1)$ matrix $\mathcal{A}(x,m) := (P_{i+j}(x))_{i,j=0}^{m}$ are nonnegative for every $x \in \mathbb{R}$.
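The criterion in part 2 of Theorem 1 can be probed numerically. The Python sketch below builds the matrix $\mathcal{A}(x,m)$ from a user-supplied function returning $P_k(x) = A(x^k)$ and tests all principal minors on a finite grid of sample points. The function names, the grid and the tolerance are illustrative assumptions; passing the test on a grid is only evidence, not a proof, since the theorem requires nonnegativity for every real $x$.

```python
from itertools import combinations
from typing import Callable, List, Sequence

Matrix = List[List[float]]


def determinant(a: Matrix) -> float:
    """Determinant by Gaussian elimination with partial pivoting."""
    n = len(a)
    m = [row[:] for row in a]
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if m[pivot][col] == 0.0:
            return 0.0
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det
        det *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n):
                m[r][c] -= factor * m[col][c]
    return det


def principal_minors_nonnegative(a: Matrix, tol: float = 1e-9) -> bool:
    """True if every principal minor of a is >= -tol."""
    n = len(a)
    for size in range(1, n + 1):
        for idx in combinations(range(n), size):
            sub = [[a[i][j] for j in idx] for i in idx]
            if determinant(sub) < -tol:
                return False
    return True


def passes_minor_test(p: Callable[[int, float], float], m: int, xs: Sequence[float]) -> bool:
    """Check the (m+1)x(m+1) matrix (P_{i+j}(x)) at each sample point x in xs."""
    for x in xs:
        a = [[p(i + j, x) for j in range(m + 1)] for i in range(m + 1)]
        if not principal_minors_nonnegative(a):
            return False
    return True


def identity_moments(k: int, x: float) -> float:
    """P_k for the identity operator: A(x^k) = x^k."""
    return x ** k


def derivative_moments(k: int, x: float) -> float:
    """P_k for A = d/dx: A(x^k) = k x^(k-1), and A(1) = 0."""
    return 0.0 if k == 0 else k * x ** (k - 1)


if __name__ == "__main__":
    grid = [-2.0, -0.5, 0.0, 0.5, 1.0, 2.0]
    print(passes_minor_test(identity_moments, 2, grid))
    print(passes_minor_test(derivative_moments, 1, grid))
```

For the derivative the $2\times 2$ matrix is $\begin{pmatrix}0 & 1\\ 1 & 2x\end{pmatrix}$ with determinant $-1$, so the test fails, in line with the fact that $d/dx$ maps the nonnegative polynomial $x^2$ to $2x$.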

The following theorem is a corollary of Theorem 1 and a remark of Guterman and Shapiro ([4], see also [5]).

Theorem 2. Let $A : \mathbb{R}[x] \to \mathbb{R}[x]$ be a linear operator. Denote by $P_k(x) := A(x^k)$, $k = 0, 1, \dots$.

1. The operator $A$ preserves the set of positive polynomials if and only if all principal minors of the infinite matrix $\mathcal{A}(x) := (P_{i+j}(x))_{i,j=0}^{\infty}$ are nonnegative for every $x \in \mathbb{R}$ and $P_0(x) > 0$ for every $x \in \mathbb{R}$.

2. The operator $A : \mathbb{R}_{2m}[x] \to \mathbb{R}[x]$ maps the set of positive polynomials of degree $\le 2m$ into the set of positive polynomials if and only if all principal minors of the $(m+1)\times(m+1)$ matrix $\mathcal{A}(x,m) := (P_{i+j}(x))_{i,j=0}^{m}$ are nonnegative for every $x \in \mathbb{R}$ and $P_0(x) > 0$ for every $x \in \mathbb{R}$.

With the help of Theorems 1 and 2 we obtain the following two results.

Theorem 3. Given a natural number $l$ and two real sequences $\{\alpha_k\}_{k=0}^{\infty}$ and $\{\beta_k\}_{k=0}^{\infty}$ define a linear operator $A : \mathbb{R}[x] \to \mathbb{R}[x]$ by $A(x^k) = \alpha_k x^k + \beta_k x^{k+2l}$, $k = 0, 1, \dots$. The operator $A$ preserves the set of nonnegative polynomials if and only if all principal minors of the infinite matrices $\mathcal{A}_1 := (\alpha_{i+j})_{i,j=0}^{\infty}$ and $\mathcal{A}_2 := (\beta_{i+j})_{i,j=0}^{\infty}$ are nonnegative.

Theorem 4. Consider an arbitrary natural number $m$, a sequence of natural numbers $\{n(k)\}_{k=0}^{\infty}$, and a double sequence of polynomials $\{Q_j(x,k)\}_{j\in\mathbb{Z}}$, $k = 0, 1, 2, \dots$. Suppose that $Q_j(x,k) \equiv 0$ for $k + j < 0$ and $Q_{-m}(0,m) \ne 0$. Define a linear operator $A : \mathbb{R}[x] \to \mathbb{R}[x]$ by $A(x^k) = \sum_{j=-m}^{n(k)} Q_j(x,k)\,x^{k+j}$, $k = 0, 1, \dots$. Then the operator $A$ does not map the set of nonnegative (positive) polynomials from $\mathbb{R}_{2(\lfloor m/2\rfloor+1)}[x]$ into the set of nonnegative polynomials.

The last theorem is a generalization of the following surprising result by Guterman and Shapiro.

Theorem 5. [4, Theorem A] Let $A : \mathbb{R}[x] \to \mathbb{R}[x]$ be a linear ordinary differential operator of order $m \ge 1$ with polynomial coefficients $Q = (q_0(x), q_1(x), \dots, q_m(x))$, $q_m(x) \not\equiv 0$:

$$A(f) = q_0(x) f(x) + q_1(x) f'(x) + \dots + q_m(x) f^{(m)}(x), \quad f \in \mathbb{R}[x].$$

Then for any coefficient sequence $Q$ the operator $A$ does not map the set of nonnegative (positive) polynomials from $\mathbb{R}_{2m}[x]$ into the set of nonnegative polynomials.

Moreover, the following more precise theorem is a direct corollary of Theorem 4.

Theorem 6. Let $A : \mathbb{R}[x] \to \mathbb{R}[x]$ be a linear ordinary differential operator of order $m \ge 1$ with polynomial coefficients $Q = (q_0(x), q_1(x), \dots, q_m(x))$, $q_m(x) \not\equiv 0$:

$$A(f) = q_0(x) f(x) + q_1(x) f'(x) + \dots + q_m(x) f^{(m)}(x), \quad f \in \mathbb{R}[x].$$

Then for any coefficient sequence $Q$ the operator $A$ does not map the set of nonnegative (positive) polynomials from $\mathbb{R}_{2(\lfloor m/2\rfloor+1)}[x]$ into the set of nonnegative polynomials.

Note that the degree $2(\lfloor m/2\rfloor + 1)$ in the previous Theorem could not be decreased: for every $m \in \mathbb{N}$ there exist linear ordinary differential operators $A : \mathbb{R}[x] \to \mathbb{R}[x]$ of order $m$ which map the set of nonnegative polynomials from $\mathbb{R}_{2\lfloor m/2\rfloor}[x]$ into the set of nonnegative polynomials. As a simple example of such an operator one could take $A(f) = f^{(m)}$ (or $A(f) = f + f^{(m)}$). Some important examples of such differential operators were described by R. Remak.

Theorem 7. ([11], see also [10, VII, Problem 38]). Let $A : \mathbb{R}[x] \to \mathbb{R}[x]$ be a linear ordinary differential operator of order $2m$, $m \in \mathbb{N}$, with constant coefficients $a_0, a_1, \dots, a_{2m}$:

$$A(f) = \frac{a_0}{0!} f(x) + \frac{a_1}{1!} f'(x) + \dots + \frac{a_{2m}}{(2m)!} f^{(2m)}(x), \quad f \in \mathbb{R}[x].$$

Then the operator $A$ maps the set of nonnegative polynomials from $\mathbb{R}_{2m}[x]$ into the set of nonnegative polynomials if and only if the quadratic form $\sum_{i=0}^{m}\sum_{j=0}^{m} a_{i+j}\,\xi_i\xi_j$ is nonnegative definite.

Using the method which was applied to prove Theorems 1, 6 and 7 we obtain the following result.

Theorem 8. Let $A : \mathbb{R}[x] \to \mathbb{R}[x]$ be a linear ordinary differential operator of order $m \ge 1$ with polynomial coefficients $Q = (q_0(x), q_1(x), \dots, q_m(x))$:

$$A(f) = \frac{q_0(x)}{0!} f(x) + \frac{q_1(x)}{1!} f'(x) + \dots + \frac{q_m(x)}{m!} f^{(m)}(x), \quad f \in \mathbb{R}[x].$$

Then the operator $A$ maps the set of nonnegative polynomials from $\mathbb{R}_{2\lfloor m/2\rfloor}[x]$ into the set of nonnegative polynomials if and only if all principal minors of the following $(\lfloor m/2\rfloor + 1)\times(\lfloor m/2\rfloor + 1)$ matrix $\mathcal{A}_q(x) := (q_{i+j}(x))_{i,j=0}^{\lfloor m/2\rfloor}$ are nonnegative for every $x \in \mathbb{R}$.

2 Proof of Theorems 1 and 2

Proof of Theorem 1. 1. It is well known that a real polynomial is nonnegative if and only if it can be represented as a sum of squares of two real polynomials (see, for example, [10, VI, Problem 44] or [1, Chapter 1, §1]). Whence a linear operator $A$ preserves the set of nonnegative polynomials if and only if $A(Q^2)$ is a nonnegative polynomial for every $Q \in \mathbb{R}[x]$, that is for every $n = 0, 1, 2, \dots$ and for every set of real numbers $\xi_0, \xi_1, \dots, \xi_n$ the polynomial $A\bigl((\xi_0 + \xi_1 x + \dots + \xi_n x^n)^2\bigr)$ is nonnegative. We have

$$A\bigl((\xi_0 + \xi_1 x + \dots + \xi_n x^n)^2\bigr) = \sum_{i,j=0}^{n} \xi_i\xi_j A(x^{i+j}) = \sum_{i,j=0}^{n} P_{i+j}(x)\,\xi_i\xi_j.$$

Hence $A$ preserves the set of nonnegative polynomials if and only if for every $n = 0, 1, 2, \dots$ and for every $x \in \mathbb{R}$ the quadratic form $\sum_{i,j=0}^{n} P_{i+j}(x)\,\xi_i\xi_j$ is nonnegative, and the statement of the theorem follows from the well-known criterion of nonnegative definiteness.

2. To prove this statement it is sufficient to note that a real polynomial from $\mathbb{R}_{2m}[x]$ is nonnegative if and only if it can be represented as a sum of squares of two real polynomials from $\mathbb{R}_m[x]$, and to apply the reasoning above. □

Proof of Theorem 2. Using continuity arguments we obtain that if $A$ preserves the set of positive polynomials then all principal minors of the infinite matrix $\mathcal{A}(x)$ are nonnegative for every real $x$; moreover $A(x^0) = P_0(x)$ is a positive polynomial. Suppose now that all principal minors of $\mathcal{A}(x)$ are nonnegative for every $x \in \mathbb{R}$ and $P_0(x) > 0$ for every $x \in \mathbb{R}$. By Theorem 1 we obtain that $A$ preserves the set of nonnegative polynomials. Let $P$ be a positive polynomial. Denote $\mu := \min_{x\in\mathbb{R}} P(x)$ (obviously $\mu > 0$). Since $Q(x) := P(x) - \frac{\mu}{2}$ is a positive polynomial, $A(Q) = A(P) - \frac{\mu}{2} P_0$ is a nonnegative polynomial, thus $A(P)$ is a positive polynomial. □

3 Proof of Theorems 3, 4, 6 and 8

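Proof of Theorem 3. Suppose that the operator $A$ preserves the set of nonnegative polynomials. Then by Theorem 1 all principal minors of the infinite matrix $\mathcal{A}(x) := \bigl(\alpha_{i+j}x^{i+j} + \beta_{i+j}x^{i+j+2l}\bigr)_{i,j=0}^{\infty}$ are nonnegative for every $x \in \mathbb{R}$. Let us choose an arbitrary natural number $n$ and an arbitrary set of integers $0 \le s_1 < s_2 < \dots < s_n$. Factoring $x^{s_i}$ out of the $i$-th row and $x^{s_j}$ out of the $j$-th column we have

$$\det\bigl(\alpha_{s_i+s_j}x^{s_i+s_j} + \beta_{s_i+s_j}x^{s_i+s_j+2l}\bigr)_{i,j=1}^{n} = x^{2(s_1+s_2+\dots+s_n)} \det\bigl(\alpha_{s_i+s_j} + \beta_{s_i+s_j}x^{2l}\bigr)_{i,j=1}^{n} \ge 0 \tag{1}$$

for every $x \in \mathbb{R}$. Hence for every $x \in \mathbb{R}\setminus\{0\}$

$$\det\bigl(\alpha_{s_i+s_j} + \beta_{s_i+s_j}x^{2l}\bigr)_{i,j=1}^{n} \ge 0.$$

Letting $x \to 0$ it follows that

$$\det\bigl(\alpha_{s_i+s_j}\bigr)_{i,j=1}^{n} \ge 0.$$

So we obtain that all principal minors of the matrix $\mathcal{A}_1 := (\alpha_{i+j})_{i,j=0}^{\infty}$ are nonnegative. Analogously we have

$$\det\bigl(\alpha_{s_i+s_j}x^{s_i+s_j} + \beta_{s_i+s_j}x^{s_i+s_j+2l}\bigr)_{i,j=1}^{n} = x^{2(s_1+s_2+\dots+s_n+nl)} \det\bigl(\alpha_{s_i+s_j}x^{-2l} + \beta_{s_i+s_j}\bigr)_{i,j=1}^{n} \ge 0 \tag{2}$$

for every $x \in \mathbb{R}\setminus\{0\}$. Hence for every $x \in \mathbb{R}\setminus\{0\}$

$$\det\bigl(\alpha_{s_i+s_j}x^{-2l} + \beta_{s_i+s_j}\bigr)_{i,j=1}^{n} \ge 0.$$

Tending $x \to +\infty$ we conclude that

$$\det\bigl(\beta_{s_i+s_j}\bigr)_{i,j=1}^{n} \ge 0.$$

So we obtain that all principal minors of the matrix $\mathcal{A}_2 := (\beta_{i+j})_{i,j=0}^{\infty}$ are nonnegative.

Suppose now that all principal minors of the two infinite matrices $\mathcal{A}_1 := (\alpha_{i+j})_{i,j=0}^{\infty}$ and $\mathcal{A}_2 := (\beta_{i+j})_{i,j=0}^{\infty}$ are nonnegative. Let us consider two linear operators $A_1, A_2 : \mathbb{R}[x] \to \mathbb{R}[x]$ defined by $A_1(x^k) = \alpha_k x^k$, $k = 0, 1, \dots$ and $A_2(x^k) = \beta_k x^{k+2l}$, $k = 0, 1, \dots$. Using Theorem 1 it is easy to check that both linear operators preserve the set of nonnegative polynomials. Hence the given operator $A = A_1 + A_2$ also preserves the set of nonnegative polynomials. □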

Proof of Theorem 4. Suppose that the operator $A$ maps the set of nonnegative (positive) polynomials from $\mathbb{R}_{2(\lfloor m/2\rfloor+1)}[x]$ into the set of nonnegative polynomials. Denote $s := \lfloor m/2\rfloor + 1$. By Theorem 1 all principal minors of the matrix $\mathcal{A}(x,s) := \bigl(A(x^{i+j})\bigr)_{i,j=0}^{s}$ are nonnegative for every $x \in \mathbb{R}$.

Suppose $m = 2t$, $t \in \mathbb{N}$. Then $s = t + 1$. The following $2\times 2$ principal minor of $\mathcal{A}(x,t+1)$ formed by rows and columns with numbers $t-1$ and $t+1$ is nonnegative:

$$\det\begin{pmatrix} \sum_{j=-2t}^{n(2t-2)} Q_j(x,2t-2)\,x^{2t-2+j} & \sum_{j=-2t}^{n(2t)} Q_j(x,2t)\,x^{2t+j} \\ \sum_{j=-2t}^{n(2t)} Q_j(x,2t)\,x^{2t+j} & \sum_{j=-2t}^{n(2t+2)} Q_j(x,2t+2)\,x^{2t+2+j} \end{pmatrix} \ge 0$$

for every $x \in \mathbb{R}$. Since $Q_j(x,k) \equiv 0$ for $k + j < 0$ we rewrite the last inequality as

$$\det\begin{pmatrix} \sum_{j=-2t+2}^{n(2t-2)} Q_j(x,2t-2)\,x^{2t-2+j} & \sum_{j=-2t}^{n(2t)} Q_j(x,2t)\,x^{2t+j} \\ \sum_{j=-2t}^{n(2t)} Q_j(x,2t)\,x^{2t+j} & \sum_{j=-2t}^{n(2t+2)} Q_j(x,2t+2)\,x^{2t+2+j} \end{pmatrix} \ge 0.$$

Let us set $x = 0$ in this inequality. We have

$$\det\begin{pmatrix} Q_{-2t+2}(0,2t-2) & Q_{-2t}(0,2t) \\ Q_{-2t}(0,2t) & 0 \end{pmatrix} \ge 0,$$

that is equivalent to $(Q_{-2t}(0,2t))^2 \le 0$. Since $m = 2t$ the last inequality contradicts the assumption $Q_{-m}(0,m) \ne 0$.

Suppose $m = 2t - 1$, $t \in \mathbb{N}$. Then $s = t$. The following $2\times 2$ principal minor of $\mathcal{A}(x,t)$ formed by rows and columns with numbers $t-1$ and $t$ is nonnegative:

$$\det\begin{pmatrix} \sum_{j=-2t+1}^{n(2t-2)} Q_j(x,2t-2)\,x^{2t-2+j} & \sum_{j=-2t+1}^{n(2t-1)} Q_j(x,2t-1)\,x^{2t-1+j} \\ \sum_{j=-2t+1}^{n(2t-1)} Q_j(x,2t-1)\,x^{2t-1+j} & \sum_{j=-2t+1}^{n(2t)} Q_j(x,2t)\,x^{2t+j} \end{pmatrix} \ge 0$$

for every $x \in \mathbb{R}$. Since $Q_j(x,k) \equiv 0$ for $k + j < 0$ we rewrite the last inequality as

$$\det\begin{pmatrix} \sum_{j=-2t+2}^{n(2t-2)} Q_j(x,2t-2)\,x^{2t-2+j} & \sum_{j=-2t+1}^{n(2t-1)} Q_j(x,2t-1)\,x^{2t-1+j} \\ \sum_{j=-2t+1}^{n(2t-1)} Q_j(x,2t-1)\,x^{2t-1+j} & \sum_{j=-2t+1}^{n(2t)} Q_j(x,2t)\,x^{2t+j} \end{pmatrix} \ge 0.$$

Let us set $x = 0$ in this inequality. We have

$$\det\begin{pmatrix} Q_{-2t+2}(0,2t-2) & Q_{-2t+1}(0,2t-1) \\ Q_{-2t+1}(0,2t-1) & 0 \end{pmatrix} \ge 0,$$

that is equivalent to $(Q_{-2t+1}(0,2t-1))^2 \le 0$. Since $m = 2t - 1$ the last inequality contradicts the assumption $Q_{-m}(0,m) \ne 0$. □

Proof of Theorem 6. Suppose that there exists a coefficient sequence $Q$ such that the operator $A(f) = q_0(x) f(x) + q_1(x) f'(x) + \dots + q_m(x) f^{(m)}(x)$, $q_m(x) \not\equiv 0$, maps the set of nonnegative (positive) polynomials from $\mathbb{R}_{2(\lfloor m/2\rfloor+1)}[x]$ into the set of nonnegative polynomials. Let us choose $\alpha \in \mathbb{R}$ such that $q_m(\alpha) \ne 0$ and consider a linear operator $B : \mathbb{R}[x] \to \mathbb{R}[x]$ of the form $B(f) := q_0(x+\alpha) f(x) + q_1(x+\alpha) f'(x) + \dots + q_m(x+\alpha) f^{(m)}(x)$. Obviously the operator $B$ also maps the set of nonnegative (positive) polynomials from $\mathbb{R}_{2(\lfloor m/2\rfloor+1)}[x]$ into the set of nonnegative polynomials. Namely, let $f(x)$ be an arbitrary nonnegative (positive) polynomial. Then $g(x) = f(x - \alpha)$ is also a nonnegative (positive) polynomial, so $A(g)(y)$ is nonnegative for every $y \in \mathbb{R}$, and thus $A(g)(x+\alpha) = B(f)(x)$ is a nonnegative polynomial. The operator $B$ satisfies the conditions of Theorem 4 with $n(k) \equiv 0$, thus it does not map the set of nonnegative (positive) polynomials from $\mathbb{R}_{2(\lfloor m/2\rfloor+1)}[x]$ into the set of nonnegative polynomials, a contradiction. □

Proof of Theorem 8. Obviously a polynomial $f(x)$ is nonnegative if and only if the polynomial $h(x) = f(x + x_0)$ is nonnegative (where $x_0$ is an arbitrary real number). So, as we noted in the proof of Theorem 6, the operator $A$ maps the set of nonnegative polynomials from $\mathbb{R}_{2\lfloor m/2\rfloor}[x]$ into the set of nonnegative polynomials if and only if for every nonnegative polynomial $f \in \mathbb{R}_{2\lfloor m/2\rfloor}[x]$ and for every $x \in \mathbb{R}$

$$A(f)(0) = \frac{q_0(x)}{0!} f(0) + \frac{q_1(x)}{1!} f'(0) + \dots + \frac{q_m(x)}{m!} f^{(m)}(0) \ge 0.$$

A real polynomial from $\mathbb{R}_{2\lfloor m/2\rfloor}[x]$ is nonnegative if and only if it can be represented as a sum of squares of two real polynomials from $\mathbb{R}_{\lfloor m/2\rfloor}[x]$. Whence $A(f)(0) \ge 0$ for every nonnegative polynomial $f \in \mathbb{R}_{2\lfloor m/2\rfloor}[x]$ if and only if $A(g^2)(0) \ge 0$ for every $g \in \mathbb{R}_{\lfloor m/2\rfloor}[x]$, that is for every set of real numbers $\xi_0, \xi_1, \dots, \xi_{\lfloor m/2\rfloor}$ and for every $x \in \mathbb{R}$ the value $A\bigl((\xi_0 + \xi_1 x + \dots + \xi_{\lfloor m/2\rfloor} x^{\lfloor m/2\rfloor})^2\bigr)(0)$ is nonnegative. We have

$$A\bigl((\xi_0 + \xi_1 x + \dots + \xi_{\lfloor m/2\rfloor} x^{\lfloor m/2\rfloor})^2\bigr)(0) = \sum_{i,j=0}^{\lfloor m/2\rfloor} \xi_i\xi_j\, q_{i+j}(x) \ge 0.$$

Hence $A$ maps the set of nonnegative polynomials from $\mathbb{R}_{2\lfloor m/2\rfloor}[x]$ into the set of nonnegative polynomials if and only if for every $x \in \mathbb{R}$ the quadratic form $\sum_{i,j=0}^{\lfloor m/2\rfloor} q_{i+j}(x)\,\xi_i\xi_j$ is nonnegative. The statement of the theorem follows from the well-known criterion of nonnegative definiteness. □

Acknowledgements. The authors express their deep gratitude to the organizers and participants of the Workshop "Pólya-Schur-Lax problems: hyperbolicity and stability preservers" (2007, Palo-Alto, USA) for the fruitful discussions and interesting new problems. The authors are very grateful to the referees for valuable comments and suggestions.

References

1. Akhiezer, N.I.: The classical moment problem and some related questions in analysis. In: Kemmer, N. (ed.) Translated from the Russian, 253 pages. Hafner Publishing Co., New York (1965)
2. Borcea, J.: Classifications of linear operators preserving elliptic, positive and non-negative polynomials. arXiv:0811.4374
3. Borcea, J., Brändén, P., Csordas, G., Vinnikov, V.: Pólya-Schur-Lax problems: hyperbolicity and stability preservers, http://www.aimath.org/pastworkshops/polyaschurlax.html
4. Guterman, A., Shapiro, B.: On linear operators preserving the set of positive polynomials. JFPTA 3(2), 411–429 (2008)
5. Guterman, A., Shapiro, B.: A note on positivity preservers. Math. Res. Lett. 15 (2008)
6. Hamburger, H.: Über eine Erweiterung des Stieltjesschen Momentenproblems. Parts I, II, III. Math. Ann. 81, 235–319 (1920); ibid. 82, 20–164, 168–187 (1921)
7. Hurwitz, A.: Über definite Polynome. Math. Ann. 73, 173–176 (1913)
8. Krein, M.G., Nudelman, A.A.: The Markov moment problem and extremal problems. Ideas and problems of P. L. Chebyshev and A. A. Markov and their further development. Translated from the Russian by D. Louvish. In: Translations of Mathematical Monographs, Vol. 50, 417 pages. American Mathematical Society, Providence (1977)
9. Pólya, G., Schur, I.: Über zwei Arten von Faktorenfolgen in der Theorie der algebraischen Gleichungen. J. Reine Angew. Math. 144, 89–113 (1914)
10. Pólya, G., Szegö, G.: Problems and Theorems in Analysis. Reprint of the 1976 English translation. Classics in Mathematics, vol. II. Springer, Berlin (1998)
11. Remak, R.: Bemerkung zu Herrn Stridsbergs Beweis des Waringschen Theorems. Math. Ann. 72, 153–156 (1912)
12. Schur, I.: Bemerkungen zur Theorie der beschränkten Bilinearformen mit unendlich vielen Veränderlichen. J. für Math. 140 (1911)

Convergence to Consensus by General Averaging

Dirk A. Lorenz and Jan Lorenz

Abstract. We investigate sufficient conditions for a discrete nonlinear non-homogeneous dynamical system to converge to consensus. We formulate a theorem which is based on the notion of averaging maps. Further on, we give examples that demonstrate that the theory of convergence to consensus is still not complete.

1 Introduction

We consider the problem of consensus formation under the action of general nonlinear averaging maps (or general means). We consider a set of agents $\underline{n} = \{1,\dots,n\}$ where each of them has coordinates in a $d$-dimensional opinion space $S \subset \mathbb{R}^d$. The individual coordinates of agent $i$ at time $t \in \mathbb{N}$ are labeled $x^i(t) \in S$, and $x(t) \in S^n \subset (\mathbb{R}^d)^n$ is called the profile at time $t \in \mathbb{N}$. Hence, we study discrete dynamical systems in $(\mathbb{R}^d)^n$ of the following form

$$x(t+1) = f_t(x(t)) \tag{1}$$

where $f_t : S^n \to S^n$. We denote the component functions by $f_t^i$. Accordingly we use upper indices for the number of the agents and lower indices for the dimension of the opinion space, e.g. $x_k^i$ denotes the $k$-th component of the opinion of the $i$-th agent. We assume that the maps $f_t$ are averaging maps (see below), and we are interested in conditions that ensure that the solution converges to a consensus, i.e. there is $\gamma$ such that $x^i(t) \to \gamma$ for every $i$. With matrices $A_t \in \mathbb{R}^{n\times n}$ we get a linear example by $f_t(x) = A_t x$. To recover averaging maps (as defined below) $A_t$ must be row-stochastic for all $t \in \mathbb{N}$. This problem is solved in special cases, e.g. when $A$ is independent of $t$ [9], in the case of finitely many $A$ [10] or when the $A_t$'s have positive diagonals, are type-symmetric and have positive minima uniformly bounded from below for all $t$ [3, 6]. Here we treat the non-linear case similar to [5, 7, 8].

Dirk A. Lorenz, Institute for Analysis and Algebra, TU Braunschweig, 38092 Braunschweig, Germany, e-mail: [email protected]
Jan Lorenz, Chair of Systems Design, Department of Management, Technology, and Economics, ETH Zürich, Zürich, Switzerland, e-mail: [email protected]

R. Bru and S. Romero-Vivó (Eds.): Positive Systems, LNCIS 389, pp. 91–99. springerlink.com © Springer-Verlag Berlin Heidelberg 2009

2 General Averaging Maps

In this section we define averaging maps (or general means, respectively). First we consider $d = 1$ and hence, an appropriate opinion space $S$ is an interval. A general mean is a function $g : S^n \to S$ such that the sandwich inequality $\min_{i \in \underline{n}} x^i \le g(x) \le \max_{i \in \underline{n}} x^i$ holds. An example is the power mean for real $p \neq 0$ and positive $x$:

$$P_p(x) = \Bigl(\tfrac{1}{n}\bigl((x^1)^p + \dots + (x^n)^p\bigr)\Bigr)^{1/p}$$

The power mean includes the arithmetic ($p = 1$) and harmonic ($p = -1$) means. The geometric mean $\sqrt[n]{x^1 \cdots x^n}$ is approached for $p \to 0$, and $\max\{x^1,\dots,x^n\}$ and $\min\{x^1,\dots,x^n\}$ for $p \to \infty$ respectively $p \to -\infty$.

Further on, there are weighted means: for nonnegative numbers $\alpha_1,\dots,\alpha_n$ which sum up to one there is the weighted arithmetic mean $\sum_{i=1}^n \alpha_i x^i$ and the weighted geometric mean $\prod_{i=1}^n (x^i)^{\alpha_i}$.

Another generalization is the $f$-mean. For a continuous and injective function $f : S \to \mathbb{R}$ (which is thus invertible on its range) the $f$-mean of $x^1,\dots,x^n$ is

$$P_f(x) = f^{(-1)}\Bigl(\tfrac{1}{n} \sum_{i=1}^n f(x^i)\Bigr).$$
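The one-dimensional means above are easy to evaluate numerically. The following Python sketch is only an illustration: the function names `power_mean`, `f_mean` and `satisfies_sandwich` and the sample opinions are assumptions made here, not part of the paper. It computes the arithmetic, harmonic and geometric means of three positive numbers and checks the sandwich inequality.

```python
import math

def power_mean(xs, p):
    """Power mean P_p of positive numbers xs for real p != 0."""
    n = len(xs)
    return (sum(x ** p for x in xs) / n) ** (1.0 / p)

def f_mean(xs, f, f_inv):
    """f-mean: f_inv applied to the arithmetic mean of the values f(x^i)."""
    return f_inv(sum(f(x) for x in xs) / len(xs))

def satisfies_sandwich(xs, value, tol=1e-12):
    """Check min(xs) <= value <= max(xs) up to a rounding tolerance."""
    return min(xs) - tol <= value <= max(xs) + tol

# Hypothetical sample opinions, chosen only for illustration.
opinions = [1.0, 2.0, 4.0]
arithmetic = power_mean(opinions, 1)               # p = 1
harmonic = power_mean(opinions, -1)                # p = -1
geometric = f_mean(opinions, math.log, math.exp)   # f = log
near_geometric = power_mean(opinions, 1e-6)        # small p approaches the geometric mean
```

All four values lie between the smallest and the largest opinion, as the sandwich inequality requires.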

The power mean is represented here as $f(x) = x^p$, the geometric mean as $f(x) = \log(x)$. Of course, more means can be defined by means of means.

Now we extend the definition of a general mean to $d$-dimensional opinion vectors for $d \ge 2$. So let $S \subset \mathbb{R}^d$ be an appropriate opinion space. All means mentioned for the case $d = 1$ can be generalized to higher dimensions by taking them componentwise. Further on, one may define different one-dimensional means in each component.

What is a proper generalization of the one-dimensional sandwich inequality? We are going to answer this question with the help of generalized barycentric coordinate maps, but first we discuss two straightforward generalizations. First, a function $g : S^n \to S$ is called a convex-hull mean if it fulfills the convex-hull sandwich inclusion: $g(x) \in \operatorname{conv}_{i \in \underline{n}} x^i$. Obviously, the componentwise weighted arithmetic mean fulfills this property. But, e.g., the componentwise geometric mean does not (see Figure 1). Another possibility is to use $\operatorname{cube}_{i \in \underline{n}} x^i := [\min_{i \in \underline{n}} x^i, \max_{i \in \underline{n}} x^i]$ instead of the convex hull. Notice that max and min are componentwise and that the interval is multidimensional. So cube represents the smallest closed hypercube in $\mathbb{R}^d$ which covers all vectors $x^1,\dots,x^n$. Figure 1 shows an example of the convex hull and the cube of a set of points in $\mathbb{R}^2$. A function $g : S^n \to S$ is called a cube mean if it fulfills the cube sandwich inclusion: $g(x) \in \operatorname{cube}_{i \in \underline{n}} x^i$.
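The difference between the two inclusions can be seen on a two-point profile. The Python sketch below is an illustration only (the helper names and the sample profile are assumptions made here). For the opinions (1, 4) and (4, 1) the convex hull is the segment on the line $x_1 + x_2 = 5$, and the componentwise geometric mean stays in the cube but leaves that segment.

```python
import math

def componentwise_geometric_mean(points):
    """Geometric mean taken separately in each coordinate (positive entries assumed)."""
    n = len(points)
    dim = len(points[0])
    return tuple(math.prod(p[k] for p in points) ** (1.0 / n) for k in range(dim))

def cube(points):
    """Componentwise minimum and maximum, i.e. the lower and upper corner of the cube."""
    dim = len(points[0])
    lower = tuple(min(p[k] for p in points) for k in range(dim))
    upper = tuple(max(p[k] for p in points) for k in range(dim))
    return lower, upper

def in_cube(z, points, tol=1e-12):
    """Cube sandwich inclusion for a single point z."""
    lower, upper = cube(points)
    return all(lo - tol <= zk <= up + tol for zk, lo, up in zip(z, lower, upper))

# Hypothetical profile of two opinions in the plane.
profile = [(1.0, 4.0), (4.0, 1.0)]
g = componentwise_geometric_mean(profile)      # approximately (2, 2)
on_segment = abs(g[0] + g[1] - 5.0) < 1e-9     # necessary for lying in the convex hull
```

Here `in_cube(g, profile)` holds while `on_segment` is false, so the componentwise geometric mean is a cube mean but not a convex-hull mean on this profile.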

Fig. 1 A set of six points in $\mathbb{R}^2_{\ge 0}$. The gray area is their convex hull (left), all possible weighted geometric means (center) and their cube (right).

The convex-hull means and the cube means can be generalized further with the help of a generalized barycentric coordinate map. To motivate the construction note that the cube is indeed the convex hull of the $2^d$ points which lie at its vertices. Hence, we could write $\operatorname{cube}_{i \in \underline{n}} x^i = \operatorname{conv}_{i \in \underline{2^d}} y^i(x)$ with an appropriate map $y : (\mathbb{R}^d)^n \to (\mathbb{R}^d)^{2^d}$ which takes $n$ points in $\mathbb{R}^d$ and maps them to the $2^d$ possible combinations of taking componentwise max and min.

To formalize this further we call $y : S^n \to S^m$ a generalized barycentric coordinate map if for every $k \in \underline{n}$ it holds that $x^k \in \operatorname{conv}_{i \in \underline{m}} y^i(x)$. It is now natural to call $\operatorname{conv}_{i \in \underline{m}} y^i(x)$ the $y$-convex hull of $x$. So, a $y$-convex hull is a set-valued function from $S^n$ to the compact and convex subsets of $S$. We call a set $y$-convex, if it is the union of the $y$-convex hulls of all $n$ of its points. A function $g$ is a $y$-convex hull mean if $g(x) \in \operatorname{conv}_{i \in \underline{m}} y^i(x)$. Note that the convex hull is obtained from the $y$-convex hull with $m = n$ and $y$ the identity. The cube is obtained with $m = 2^d$ and an appropriate mapping $y$. Many other examples fit into this setting: the smallest interval for any basis of $\mathbb{R}^d$ [1, Example 2], or the smallest polytope with faces parallel to a set of $k \ge d + 1$ hyperplanes [1, Example 3] containing $x^1,\dots,x^n$ (the generalized barycentric coordinates are then the extreme points of the polytope, perhaps with multiplicities e.g. if the number of extreme points is smaller than $n$).

Now, we define the central notion of this paper.

Definition 1. Let $S \subset \mathbb{R}^d$, $y : S^n \to S^m$ be a generalized barycentric coordinate map such that $S$ is $y$-convex. A mapping $f : S^n \to S^n$ is called a $y$-averaging map, if for every $x \in S^n$ it holds

$$\operatorname{conv}_{i \in \underline{m}} y^i(f(x)) \subset \operatorname{conv}_{i \in \underline{m}} y^i(x). \qquad (2)$$

Furthermore, a proper $y$-averaging map is a $y$-averaging map, such that for every $x \in S^n$ which is not a consensus, the above inclusion is strict.

3 Convergence to Consensus

To state a sufficient condition on the maps $f_t$ which ensures that the solution of (1) converges to consensus we need a little more notation. Let $d(x, C)$ denote the distance of a point $x$ and a compact set $C$. The Hausdorff metric for compact non-empty sets is defined as

$$d_H(B, C) := \max\Bigl\{\sup_{b \in B} d(b, C),\ \sup_{c \in C} d(c, B)\Bigr\}.$$

If $B \subset C$ it holds $d_H(B, C) = \sup_{c \in C} d(c, B)$, since $d(b, C) = 0$ for every $b \in B$. The next notion we need is 'equiproper averaging map':

Definition 2. Let $y$ be a generalized barycentric coordinate map and let $F$ be a family of proper $y$-averaging maps. $F$ is called equiproper, if for every $x \in S^n$ which is not a consensus, there is $\delta(x) > 0$ such that for all $f \in F$

$$d_H\bigl(\operatorname{conv}_{i \in \underline{m}} y^i(f(x)),\ \operatorname{conv}_{i \in \underline{m}} y^i(x)\bigr) > \delta(x). \qquad (3)$$

We can state the following lemma:

Lemma 1. Let $f_t$ be a sequence of averaging maps forming an equiproper family of averaging maps such that $f_t \to g$ pointwise. Then $g$ is a proper averaging map.

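Proof. First we show that $g$ is an averaging map. Take $x \in S^n$ and let $\varepsilon > 0$. Due to convergence of $f_t^i$ to $g^i$ and continuity of $y$ there is $t_0$ such that for all $t > t_0$ it holds $\|y^i(f_t(x)) - y^i(g(x))\| < \varepsilon$. Due to $y^i(f_t(x)) \in \operatorname{conv}_{i \in \underline{m}} y^i(x)$ it follows that the maximal distance of $y^i(g(x))$ to $\operatorname{conv}_{i \in \underline{m}} y^i(x)$ is less than $\varepsilon$, and thus $y^i(g(x)) \in \operatorname{conv}_{i \in \underline{m}} y^i(x)$ because $\operatorname{conv}_{i \in \underline{m}} y^i(x)$ is closed.

Now we show that $g$ is proper. To this end, let $x \in S^n$ be not a consensus. We have to show that there is $z^* \in \operatorname{conv}_{i \in \underline{m}} y^i(x)$ but $z^* \notin \operatorname{conv}_{i \in \underline{m}} y^i(g(x))$. (Note that $z^* \in S$, while $x \in S^n$ and $y(x) \in S^m$.) We know that there is for each $t \in \mathbb{N}$ a $z(t) \in \operatorname{conv}_{i \in \underline{m}} y^i(x)$ with $z(t) \notin \operatorname{conv}_{i \in \underline{m}} y^i(f_t(x))$. According to the equiproper property it can be chosen such that the distance of $z(t)$ to $\operatorname{conv}_{i \in \underline{m}} y^i(f_t(x))$ is bigger than $\delta(x)/2 > 0$ for all $t \in \mathbb{N}$. Further on, we know that the set difference $\operatorname{conv}_{i \in \underline{m}} y^i(x) \setminus \operatorname{conv}_{i \in \underline{m}} y^i(f_t(x))$ is non-empty and bounded, thus there is a subsequence $t_s$ such that $z(t_s)$ converges to a $z^* \in \operatorname{conv}_{i \in \underline{m}} y^i(x)$. Because of the construction it also holds $z^* \notin \operatorname{conv}_{i \in \underline{m}} y^i(g(x))$. □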

Lemma 2. Let $x(t)$ be a solution of (1) and let $C(t) = \operatorname{conv}_{i \in \underline{m}} y^i(x(t))$. There exists a subsequence $t_s$ such that $x(t_s) \to c$ for $s \to \infty$ and it holds $c^i \in C = \bigcap_t C(t)$ for every $i$.

Proof. Due to compactness of $C(0)^n$ there exists a convergent subsequence of $x(t)$ and we call its limit $c$. It remains to show that $c^i \in C$. Let $T \in \mathbb{N}$ and $\varepsilon > 0$. Take $s$ large enough to have $\|x(t_s) - c\| \le \varepsilon$. If we take $s$ even larger, we have $x^i(t_s) \in C(T)$ which shows $c^i \in C(T)$ since $C(T)$ is closed. Hence, $c^i \in C(T)$ for every $T$. □

Before we state our main theorem we need one more notion: We call a family $F$ of continuous functions equicontinuous if for every $\varepsilon > 0$ there exists $\delta > 0$ such that for every $f \in F$ it holds that $\|x - x'\| \le \delta$ implies $\|f(x) - f(x')\| \le \varepsilon$. Note that $\delta$ is chosen independently of $f$. Now, the main theorem is as follows:

Theorem 1. Let $S \subset \mathbb{R}^d$, $y$ be a generalized barycentric coordinate map such that $S$ is $y$-convex, and $F$ be an equicontinuous family of equiproper $y$-averaging maps on $S^n$. Then it holds for any sequence $(f_t)_{t \in \mathbb{N}}$ with $f_t \in F$ and any $x(0) \in S^n$ that the solution of (1) converges to a consensus.

Proof. In the first step one shows that $C$ from Lemma 2 fulfills $C = \operatorname{conv}_{i \in \underline{m}} y^i(c)$. This step uses that, due to the Theorem of Arzelà-Ascoli and Lemma 1, there is a uniformly convergent subsequence of $f_t$ with a proper $y$-averaging map $g$ as a limit. Now one shows that $\operatorname{conv}_{i \in \underline{m}} y^i(g(c)) = \operatorname{conv}_{i \in \underline{m}} y^i(c)$ which implies that $c$ is a consensus since $g$ is proper. For details we refer to [5].

It remains to show that the whole sequence $x(t)$ converges to $c = (\gamma,\dots,\gamma)$. This can be seen as follows: For $\varepsilon > 0$ there exists $s_0$ such that for $s \ge s_0$ we have $\|y^i(x(t_s)) - \gamma\| \le \varepsilon$. Moreover, for $t > t_{s_0}$ we have $x^i(t) \in C(t_{s_0})$ and hence $x^i(t) = \sum_j a^j y^j(x(t_{s_0}))$ is a convex combination. We conclude

$$\|x^i(t) - \gamma\| = \Bigl\|\sum_j a^j \bigl(y^j(x(t_{s_0})) - \gamma\bigr)\Bigr\| \le \sum_j \|y^j(x(t_{s_0})) - \gamma\| \le m\varepsilon \qquad (4)$$

and hence, $x(t) \to c = (\gamma,\dots,\gamma)$. □

The proof follows the lines of the main theorem in [4] which now appears as a corollary since there the system x(t + 1)= f (x(t)) with just one proper averaging map f is considered. Some more corollaries can be deduced.

Corollary 1. Let $F = \{f_1,\dots,f_m\}$ be a finite family of proper $y$-averaging maps on $S^n \subset (\mathbb{R}^d)^n$, with $S$ an appropriate opinion space. Let $F$ be uniformly continuous. Then it holds for a sequence $(f_t)_{t \in \mathbb{N}}$ with $f_t \in F$ and $x(0) \in S^n$ that the solution of (1) converges to consensus.

Proof. Since the family is finite and each member is uniformly continuous, it is equicontinuous and equiproper. □

Corollary 2. Let $F$ be a family of averaging maps on $S^n \subset (\mathbb{R}^d)^n$, with $S$ an appropriate opinion space. Let $F$ be uniformly equicontinuous, let at least one element in $F$ be proper and let all proper elements of $F$ be equiproper. Let $(f_t)_{t \in \mathbb{N}}$ be a sequence with $f_t \in F$ and $t_s$ be a subsequence such that $f_{t_s}$ is proper. Then, for $x(0) \in S^n$, the solution of (1) converges to consensus.

Proof. Theorem 1 for the sequence $f_{t_s}$ gives subsequential convergence to consensus. An estimate similar to (4) gives convergence of the whole sequence. □

The proof of the next corollary uses similar techniques.

Corollary 3. Let $F$ be a family of averaging maps on $S^n \subset (\mathbb{R}^d)^n$, with $S$ an appropriate opinion space. Let $F$ be uniformly equicontinuous. Let $(f_t)_{t \in \mathbb{N}}$ be a sequence with $f_t \in F$ and $(t_s)_{s \in \mathbb{N}}$ be a subsequence (with $t_0 = 0$) such that the $\tilde f_s := f_{t_{s+1}-1} \circ f_{t_{s+1}-2} \circ \dots \circ f_{t_s+1} \circ f_{t_s}$ are equiproper. Then, for $x(0) \in S^n$, the solution of (1) converges to consensus.

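In the spirit of [1] we state another generalization of Theorem 1. The generalization deals with deformations of the hull. To this end, let $S, T \subset \mathbb{R}^d$ be compact and $\varphi : T \to S$ be a homeomorphism. For a generalized barycentric coordinate map $y : S^n \to S^m$ we define the $y,\varphi$-hull as $\varphi^{-1}(\operatorname{conv}_{i \in \underline{m}} y^i(\varphi(x)))$. Now, a $y,\varphi$-averaging map $g$ is defined analogous to Definition 1:

$$\varphi^{-1}\bigl(\operatorname{conv}_{i \in \underline{m}} y^i(\varphi(g(x)))\bigr) \subset \varphi^{-1}\bigl(\operatorname{conv}_{i \in \underline{m}} y^i(\varphi(x))\bigr).$$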

Note, that the y,φ-hull is not necessarily convex, see [1, Example 6]. The extension of the notions ‘proper’ and ‘equiproper’ is straightforward. For the proof of the following theorem we refer to [5].

Theorem 2. Let $\varphi : T \to S$ be continuous with Lipschitz continuous inverse and let $y$ be a generalized barycentric coordinate map such that $S$ is $y$-convex. Let $G$ be a family of equicontinuous, equiproper $y,\varphi$-averaging maps on $T^n$. Then it holds for any sequence $(g_t)_{t \in \mathbb{N}}$ with $g_t \in G$ and any $x(0) \in T^n$ that the solution of $x(t+1) = g_t(x(t))$ converges to a consensus.

4 Examples and Counterexamples

We give some examples that illustrate the role of the different assumptions in Theorem 1.

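Example 1. Let $f : (\mathbb{R}_{\ge 0})^2 \to (\mathbb{R}_{\ge 0})^2$ with

$$f(x^1, x^2) := \begin{cases} \bigl(\tfrac{3}{4} x^1 + \tfrac{1}{4} x^2,\ \tfrac{1}{4} x^1 + \tfrac{3}{4} x^2\bigr) & \text{if } x^1 + x^2 > 10, \\ \bigl((x^1)^{3/4} (x^2)^{1/4},\ (x^1)^{1/4} (x^2)^{3/4}\bigr) & \text{otherwise.} \end{cases}$$

This averaging map converges to consensus but is not continuous. For example for $x(0) = (1, 9)$ it converges to $(3, 3)$ but for $x(0) = (1 + \varepsilon, 9)$ it converges to $(5 + \varepsilon/2, 5 + \varepsilon/2)$. So continuity is not necessary for convergence.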
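The two limits stated in Example 1 can be reproduced numerically. The following Python sketch is a plain floating-point iteration, offered only as an illustration; the step count and the value of `eps` are arbitrary choices made here.

```python
def f(x):
    """Example 1: arithmetic weights if x1 + x2 > 10, geometric weights otherwise."""
    x1, x2 = x
    if x1 + x2 > 10:
        return (0.75 * x1 + 0.25 * x2, 0.25 * x1 + 0.75 * x2)
    return (x1 ** 0.75 * x2 ** 0.25, x1 ** 0.25 * x2 ** 0.75)

def iterate(x, steps=200):
    """Apply f repeatedly, i.e. x(t + 1) = f(x(t))."""
    for _ in range(steps):
        x = f(x)
    return x

eps = 0.1                                       # arbitrary small perturbation
limit_geometric = iterate((1.0, 9.0))           # close to (3, 3)
limit_arithmetic = iterate((1.0 + eps, 9.0))    # close to (5 + eps/2, 5 + eps/2)
```

The geometric branch preserves the product $x^1 x^2 = 9$ and the arithmetic branch preserves the sum $x^1 + x^2 = 10 + \varepsilon$, which explains the jump between the two consensus values.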

Example 2. Let $f : \mathbb{R}^3 \to \mathbb{R}^3$ with

$$f(x^1, x^2, x^3) := \begin{cases} \bigl(x^1,\ x^2,\ \tfrac{1}{2} x^3 + \tfrac{1}{2} \min\{x^1, x^2\}\bigr) & \text{if } x^3 < \min\{x^1, x^2\}, \\ (x^1, x^1, x^1) & \text{otherwise.} \end{cases}$$

Starting with $x(0) = (2, 3, 1)$ the discrete dynamical system $x(t+1) = f(x(t))$ will converge to $(2, 3, 2)$, although $f$ is a proper averaging map. But it is not continuous at all points where $x^3 = \min\{x^1, x^2\}$ and $x^1 \neq x^2$.
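The behaviour in Example 2 can be followed exactly with rational arithmetic. The sketch below is an illustration under the assumption that exact fractions are used; in double precision the third coordinate would eventually round to exactly 2, after which the second branch of the map would fire and produce a spurious consensus.

```python
from fractions import Fraction

def f(x):
    """Example 2: the third opinion moves halfway towards min(x1, x2) while it is below it."""
    x1, x2, x3 = x
    low = min(x1, x2)
    if x3 < low:
        return (x1, x2, (x3 + low) / 2)
    return (x1, x1, x1)

x = (Fraction(2), Fraction(3), Fraction(1))    # initial profile x(0)
gaps = []
for _ in range(60):
    x = f(x)
    gaps.append(2 - x[2])    # distance of the third opinion from its limit 2
```

After $t$ steps the gap equals $2^{-t}$, so the third opinion approaches 2 without reaching it and the first two opinions never move.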

Example 3. Let $f_t : \mathbb{R}^2 \to \mathbb{R}^2$ with

$$f_t(x^1, x^2) := \bigl((1 - 4^{-t}) x^1 + 4^{-t} x^2,\ 4^{-t} x^1 + (1 - 4^{-t}) x^2\bigr).$$

It is easy to see that these $f_t$'s are proper and that for $t \ge 1$ and $x(1) = (0, 1)$ it holds that $x^1(t) < \tfrac{1}{3}$ and $x^2(t) > \tfrac{2}{3}$. Obviously, $\{f_t \mid t \in \mathbb{N}\}$ is not equiproper because $f_t$ converges to the identity as $t \to \infty$.
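The bounds in Example 3 and the failure of the equiproper property can be checked with exact rational arithmetic. This Python sketch is an illustration only; the horizon of 40 steps and the variable names are choices made here. For the profile $(0, 1)$ the hull of $f_t((0, 1))$ is the interval $[4^{-t}, 1 - 4^{-t}]$, so its Hausdorff distance to $[0, 1]$ is $4^{-t}$.

```python
from fractions import Fraction

def f_t(t, x):
    """Example 3: symmetric averaging with weight 4^(-t) on the other agent."""
    w = Fraction(1, 4 ** t)
    x1, x2 = x
    return ((1 - w) * x1 + w * x2, w * x1 + (1 - w) * x2)

x = (Fraction(0), Fraction(1))       # profile x(1)
trajectory = [x]
for t in range(1, 40):
    x = f_t(t, x)                    # x(t + 1) = f_t(x(t))
    trajectory.append(x)

# Hausdorff distance between [0, 1] and the hull of f_t((0, 1)), for t = 1, ..., 9.
hausdorff_gaps = [f_t(t, (Fraction(0), Fraction(1)))[0] for t in range(1, 10)]
```

The distances in `hausdorff_gaps` shrink like $4^{-t}$, so no single $\delta(x) > 0$ works for all $t$, while the trajectory keeps the two opinions on either side of the gap between $1/3$ and $2/3$.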

Example 4. Let $f_t : \mathbb{R}^2 \to \mathbb{R}^2$ with

$$f_t(x^1, x^2) := \bigl((1 - \tfrac{1}{t}) x^1 + \tfrac{1}{t} x^2,\ x^2\bigr).$$

This example is not equiproper, because $f_t$ converges to the identity for $t \to \infty$. But for $t \ge 2$ and any initial values $(x^1(2), x^2(2)) \in \mathbb{R}^2$ the system $x(t+1) = f_t(x(t))$ has the solution $x(t) = \bigl(\tfrac{1}{t-1} x^1(2) + \tfrac{t-2}{t-1} x^2(2),\ x^2(2)\bigr)$ and thus converges to consensus at $x^2(2)$ for all initial values.
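The closed form in Example 4 can be confirmed by induction. The following worked step is a sketch that assumes the formula holds at time $t \ge 2$ (it clearly holds at $t = 2$) and uses that the second opinion stays at $x^2(2)$:

```latex
\begin{align*}
x^1(t+1) &= \Bigl(1 - \tfrac{1}{t}\Bigr) x^1(t) + \tfrac{1}{t}\, x^2(2) \\
         &= \tfrac{t-1}{t} \Bigl( \tfrac{1}{t-1}\, x^1(2) + \tfrac{t-2}{t-1}\, x^2(2) \Bigr) + \tfrac{1}{t}\, x^2(2) \\
         &= \tfrac{1}{t}\, x^1(2) + \tfrac{t-1}{t}\, x^2(2).
\end{align*}
```

This is the stated formula with $t$ replaced by $t + 1$, and the weight $\tfrac{1}{t}$ on $x^1(2)$ tends to zero, which gives the consensus value $x^2(2)$.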

Of course, equicontinuity is not necessary. Example 1 gives a one-element family of not equicontinuous averaging maps which converge. Now, we show that a family of uniformly continuous proper averaging maps is not enough to ensure convergence. The example is inspired by bounded confidence.

Example 5 (Vanishing confidence). Let $f_t : \mathbb{R}^n \to \mathbb{R}^n$ with

$$(f_t)_i(x) := \frac{\sum_{j=1}^n D_t(|x^i - x^j|)\, x^j}{\sum_{j=1}^n D_t(|x^i - x^j|)}$$

and $D_t : \mathbb{R}_{\ge 0} \to \mathbb{R}_{\ge 0}$. Now, $f_t$ is an averaging map for any choice of $D_t$. Further on, $f_t$ is continuous if $D_t$ is, and $f_t$ is proper if $D_t$ is strictly positive. The Hegselmann-Krause model [2, 7] with homogeneous bound of confidence $\varepsilon > 0$ comes out for $D_t$ being a non-continuous cutoff function

$$D_t(y) = \begin{cases} 1 & \text{if } y \le \varepsilon, \\ 0 & \text{otherwise.} \end{cases}$$

We chose $D_t(y) := e^{-(y/\varepsilon)^t}$ as a sequence of functions which has the cutoff function as a limit function. So, $D_t$ is continuous but $\{D_t \mid t \in \mathbb{N}\}$ is not equicontinuous. Rough estimates show that with $x(0) = (0, 8)$, $\varepsilon = 1$ the process $x(t+1) = f_t(x(t))$ does not converge to consensus although only proper averaging maps are involved.
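The rough estimate for Example 5 can be complemented by a short simulation. The Python sketch below is an illustration under two assumptions made here: time is taken to start at $t = 1$ (for $t = 0$ the weights $D_0$ are constant and would average both opinions immediately), and only 30 steps are run so that the power $(y/\varepsilon)^t$ stays within floating-point range.

```python
import math

def D(t, y, eps=1.0):
    """Smooth confidence weight D_t(y) = exp(-(y / eps)^t)."""
    return math.exp(-((y / eps) ** t))

def f_t(t, x, eps=1.0):
    """Each agent takes the mean of all opinions weighted by D_t(|x^i - x^j|)."""
    new_profile = []
    for xi in x:
        weights = [D(t, abs(xi - xj), eps) for xj in x]
        total = sum(w * xj for w, xj in zip(weights, x))
        new_profile.append(total / sum(weights))
    return new_profile

x = [0.0, 8.0]
for t in range(1, 31):       # assumption: time starts at t = 1
    x = f_t(t, x)            # x(t + 1) = f_t(x(t))
spread = max(x) - min(x)
```

The two opinions move by only a few thousandths in the first step and practically not at all afterwards, so the spread stays close to 8.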

5 Comparison with a Theorem of Moreau

Theorem 1 is similar to a theorem of Moreau [8, Theorem 2]. We cite it here to discuss similarities and differences. It incorporates changing communication networks into self-maps with averaging properties.

Theorem 3 (Moreau). Let $(f_t)_{t \in \mathbb{N}}$ be a sequence of self maps on $S^n \subset (\mathbb{R}^d)^n$, with $S$ convex and closed. Let $(N(t))_{t \in \mathbb{N}}$ be a communication regime where all networks have positive diagonals. Moreover, assume that there is $T \in \mathbb{N}$ such that for all $t_0 \in \mathbb{N}$ the network $\operatorname{inc}\bigl(\sum_{t=t_0}^{t_0+T} N(t)\bigr)$ has only one essential class$^1$. Further on, it should exist for each network with positive diagonal $N$, each $x \in S^n$ and each agent $k \in \underline{n}$ a compact set $e_k(x, N)$ such that

1. For all $t \in \mathbb{N}$ it holds $(f_t)_k(x) \in e_k(x, N)$,
2. $e_k(x, N) \subset \operatorname{ri} \operatorname{conv}_{i \in \mathrm{nb}(k, N)} \{x^i\}$,
3. $e_k(x, N)$ depends continuously on $x$.

Then it holds for $x(0) \in S^n$ that the solution of (1) converges to a consensus.

The theorem has been significantly reformulated in comparison with the original to fit it in our vocabulary. Especially the original theorem is about "uniform global attractivity of the system with respect to the set of equilibrium solutions $x^1 = \dots = x^n =$ constant" which is equivalent to convergence to consensus for every $x(0) \in S^n$.

Items 1 and 2 in the assumptions of Theorem 3 are similar to a 'proper convex hull averaging map with respect to the current network'. It is averaging due to conv, and proper due to ri (actually ri is a stronger assumption than proper). The continuity assumption in item 3 shows similarity to the assumption of equicontinuity in Theorem 1. Equiproper from Theorem 1 finds its analog in Theorem 3 in the fact that in item 1 it holds $(f_t)_k(x) \in e_k(x, N)$ and $e_k$ is independent of $t$. Especially, the assumption that the $e_k$'s are in the relative interior of convex hulls is more strict than the assumption of properness. But the theorems cannot be compared directly. Moreau's Theorem allows changing communication topologies and poses assumptions on this. Our theorem does not deal with communication networks, but with arbitrary switching update maps from an equiproper set.

Example 6. Let $g_1, g_2, g_3, g_4 : (\mathbb{R}^d)^3 \to \mathbb{R}^d$ with $g_1(x) := \max\{x^1, x^2, x^3\}$, $g_2(x) := \tfrac{1}{3}(x^1 + x^2 + x^3)$, $g_3(x) := \sqrt[3]{x^1 x^2 x^3}$ and $g_4(x) := \min\{x^1, x^2, x^3\}$ be general multidimensional means (all computations componentwise) and $f^{\sigma_1 \sigma_2 \sigma_3} : (\mathbb{R}^d)^3 \to (\mathbb{R}^d)^3$ with $f^{\sigma_1 \sigma_2 \sigma_3} := (g_{\sigma_1}, g_{\sigma_2}, g_{\sigma_3})$ be averaging maps. Now it is easy to verify that

$$F := \{ f^{\sigma_1 \sigma_2 \sigma_3} \mid (\sigma_1, \sigma_2, \sigma_3) \in \{1, 2, 3, 4\}^3 \text{ but 1 and 4 not both in } (\sigma_1, \sigma_2, \sigma_3) \}$$

is an equicontinuous and equiproper set of averaging maps w.r.t. cube. Thus, for any sequence $f_t$ with elements from $F$ and $x(0) \in (\mathbb{R}^d)^3$ it holds that $x(t+1) = f_t(x(t))$ converges to consensus due to Theorem 1. Theorem 3 is not applicable because item 2 does not hold for all elements of $F$.

$^1$ With inc we denote the incidence matrix, i.e. $\operatorname{inc}(A)_{i,j} = 1$ if $A_{i,j} \neq 0$ and $0$ otherwise.
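The switching process of Example 6 can be simulated directly. The Python sketch below is an illustration under assumptions made here: opinions are scalar ($d = 1$) and positive so that the geometric mean is well defined, the initial profile is arbitrary, and the maps are drawn at random with a fixed seed.

```python
import itertools
import random

def g_max(x): return max(x)
def g_arith(x): return sum(x) / 3
def g_geo(x): return (x[0] * x[1] * x[2]) ** (1.0 / 3)
def g_min(x): return min(x)

MEANS = {1: g_max, 2: g_arith, 3: g_geo, 4: g_min}

# Index triples that do not contain both 1 (max) and 4 (min).
FAMILY = [s for s in itertools.product((1, 2, 3, 4), repeat=3)
          if not (1 in s and 4 in s)]

def apply_map(sigma, x):
    """Apply f^sigma = (g_sigma1, g_sigma2, g_sigma3) to the profile x."""
    return tuple(MEANS[k](x) for k in sigma)

rng = random.Random(0)            # fixed seed for reproducibility
x = (1.0, 5.0, 10.0)              # hypothetical positive initial profile
for _ in range(500):
    x = apply_map(rng.choice(FAMILY), x)    # arbitrary switching within F
spread = max(x) - min(x)
```

The family contains 46 of the 64 possible index triples, and the spread of the profile shrinks to rounding level regardless of the switching order drawn.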

References

1. Angeli, D., Bliman, P.A.: Stability of leaderless multi-agent systems. Extension of a result by Moreau. Mathematics of Control, Signals & Systems 18(4), 293–322 (2006)
2. Hegselmann, R., Krause, U.: Opinion Dynamics Driven by Various Ways of Averaging. Computational Economics 25(4), 381–405 (2004)
3. Hendrickx, J.M., Blondel, V.D.: Convergence of Different Linear and Non-Linear Vicsek Models. CESAME research report 2005.57 (2005)
4. Krause, U.: Compromise, consensus, and the iteration of means. Elemente der Mathematik 63, 1–8 (2008)
5. Lorenz, D.A., Lorenz, J.: On conditions for convergence to consensus. arXiv.org/abs/0803.2211 (March 2008)
6. Lorenz, J.: A Stabilization Theorem for Dynamics of Continuous Opinions. Physica A 355(1), 217–223 (2005)
7. Lorenz, J.: Repeated Averaging and Bounded Confidence – Modeling, Analysis and Simulation of Continuous Opinion Dynamics. PhD thesis, Universität Bremen (March 2007)
8. Moreau, L.: Stability of Multiagent Systems with Time-Dependent Communication Links. IEEE Transactions on Automatic Control 50(2) (2005)
9. Seneta, E.: Non-Negative Matrices and Markov Chains, 2nd edn. Springer, Heidelberg (1981)
10. Wolfowitz, J.: Products of Indecomposable, Aperiodic, Stochastic Matrices. In: Proceedings of the American Mathematical Society Eugene, vol. 15, pp. 733–737 (1963)

Stability and D-stability for Switched Positive Systems

Oliver Mason, Vahid S. Bokharaie and Robert Shorten

Abstract. We consider a number of questions pertaining to the stability of positive switched linear systems. Recent results on common quadratic, diagonal, and copositive Lyapunov function existence are reviewed and their connection to the stability properties of switched positive linear systems is highlighted. We also generalise the concept of D-stability to positive switched linear systems and present some preliminary results on this topic.

1 Introduction

While the stability properties of positive linear time-invariant (LTI) systems have been thoroughly investigated and are now completely understood, the theory for nonlinear, uncertain and time-varying positive systems is considerably less well-developed. In fact, many natural and fundamental questions on the stability of such systems remain unanswered. It is clear that for many practical applications there is a need to extend the theory for positive LTI systems to broader and more realistic system classes incorporating nonlinearities and time-varying parameters. Another separate and interesting line of recent research has focussed on extending the stability properties of positive LTI systems to positive descriptor systems [11].

Our principal focus in the present paper is on extending the stability theory of positive LTI systems to switched positive linear systems [9]. We review recent work on the stability of these systems, highlighting the connection between various notions of stability and the existence of corresponding types of common Lyapunov function. We also consider an extension of the concept of D-stability to positive switched linear systems, present some preliminary results for this question and highlight some directions for future research.

Oliver Mason, Vahid S. Bokharaie and Robert Shorten Hamilton Institute, National University of Ireland Maynooth, Co. Kildare, Ireland, e-mail: [email protected],[email protected], [email protected]

R. Bru and S. Romero-Vivó (Eds.): Positive Systems, LNCIS 389, pp. 101–109. springerlink.com © Springer-Verlag Berlin Heidelberg 2009

2 Notation and Background

Throughout, $\mathbb{R}$ denotes the field of real numbers, $\mathbb{R}_+$ denotes the set of non-negative real numbers, $\mathbb{R}^n$ stands for the vector space of all $n$-tuples of real numbers and $\mathbb{R}^{m \times n}$ is the space of $m \times n$ matrices with real entries. For $x$ in $\mathbb{R}^n$, $x_i$ denotes the $i$th component of $x$, and the notation $x \succ 0$ ($x \succeq 0$) means that $x_i > 0$ ($x_i \ge 0$) for $1 \le i \le n$. The notations $x \prec 0$ and $x \preceq 0$ are defined in the obvious manner. We write $A^T$ for the transpose of $A \in \mathbb{R}^{n \times n}$ and for a symmetric $P$ in $\mathbb{R}^{n \times n}$ the notation $P > 0$ means that the matrix $P$ is positive definite.

Throughout the paper, in an abuse of notation, for LTI systems we shall use the term stability to denote asymptotic stability. Also, when referring to switched linear systems, stability shall be used to denote asymptotic stability under arbitrary switching [9]. For a positive LTI system

$$\dot{x}(t) = A x(t) \qquad (1)$$

where $A \in \mathbb{R}^{n \times n}$ is a Metzler matrix (meaning that the off-diagonal entries of $A$ are non-negative), the equivalences we collect in the following result are well known.

Proposition 1. [4] Let $A \in \mathbb{R}^{n \times n}$ be a Metzler matrix. The following statements are equivalent:
(a) The LTI system (1) is stable;
(b) $A$ is Hurwitz, meaning that its eigenvalues lie in the open left half plane;
(c) There exists $P > 0$ such that $A^T P + P A < 0$;
(d) There exists a diagonal matrix $D > 0$ such that $A^T D + D A < 0$;
(e) There exists a vector $v \succ 0$ in $\mathbb{R}^n$ with $A v \prec 0$;
(f) For any diagonal matrix $D > 0$, the system $\dot{x}(t) = D A x(t)$ is stable.

While the equivalence of (a), (b) and (c) in the previous result also holds for any LTI system, properties (d), (e) and (f) are specific to positive LTI systems. With regard to point (e), as $A$ is Hurwitz and Metzler if and only if $A^T$ is, an equivalent condition for stability for positive LTI systems is the existence of $v \succ 0$ satisfying $A^T v \prec 0$. Such a $v$ can be used to define a copositive linear Lyapunov function $V(x) = v^T x$ for the system (1). The property described in (f) is known as D-stability and establishes that stability of positive LTI systems is robust with respect to parametric uncertainties given by diagonal scaling. Later in the paper, we shall be concerned with investigating the connection between concepts similar to those in (e) and (f) for switched positive linear systems. Before this, in the following section, we shall review some recent work on the stability of switched positive linear systems.
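Condition (e) of Proposition 1 lends itself to a simple computational test. The Python sketch below is an illustration, not a method taken from the paper: it relies on the standard M-matrix fact (assumed here) that for a Hurwitz Metzler matrix the inverse of $-A$ is entrywise non-negative, so the solution of $Av = -\mathbf{1}$ is a natural candidate certificate. The helper names and the sample matrix are hypothetical.

```python
from fractions import Fraction

def solve(A, b):
    """Solve A v = b exactly by Gauss-Jordan elimination; return None if A is singular."""
    n = len(A)
    M = [[Fraction(a) for a in row] + [Fraction(bi)] for row, bi in zip(A, b)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if M[r][col] != 0), None)
        if pivot is None:
            return None
        M[col], M[pivot] = M[pivot], M[col]
        M[col] = [entry / M[col][col] for entry in M[col]]
        for r in range(n):
            if r != col:
                factor = M[r][col]
                M[r] = [a - factor * c for a, c in zip(M[r], M[col])]
    return [row[n] for row in M]

def is_metzler(A):
    """Off-diagonal entries must be non-negative."""
    n = len(A)
    return all(A[i][j] >= 0 for i in range(n) for j in range(n) if i != j)

def positive_certificate(A):
    """Return v with A v = -(1, ..., 1) if that v is entrywise positive, else None."""
    v = solve(A, [-1] * len(A))
    if v is None or any(vi <= 0 for vi in v):
        return None
    return v

A = [[-2, 1], [1, -3]]               # hypothetical Metzler matrix
v = positive_certificate(A)          # certificate for condition (e)
```

For this matrix the certificate is $v = (4/5, 3/5)$, and by construction $Av = (-1, -1) \prec 0$; for a Metzler matrix a returned `None` indicates, under the assumption above, that no such certificate exists.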

3 Lyapunov Functions and Stability for Switched Positive Linear Systems

It is well known that a switched positive linear system of the form

$$\dot{x}(t) = A(t) x(t), \qquad A(t) \in \{A_1, A_2\} \qquad (2)$$

can be unstable for certain choices of switching sequence even when the individual system matrices $A_1, A_2$ are asymptotically stable [9]. This observation has led to great interest in the stability of such systems under arbitrary switching regimes. A key result in this connection is that stability of (2) is equivalent to the existence of a common Lyapunov function for the individual component LTI systems [9]. In the light of Proposition 1, three classes of Lyapunov function naturally suggest themselves for positive switched linear systems:

• Common Quadratic Lyapunov Functions (CQLFs): $V(x) = x^T P x$ where $P = P^T > 0$ and $A_i^T P + P A_i < 0$ for $i = 1, 2$;
• Common Diagonal Lyapunov Functions (CDLFs): $V(x) = x^T D x$ where $D = \operatorname{diag}(d_1,\dots,d_n)$, $D > 0$ and $A_i^T D + D A_i < 0$ for $i = 1, 2$;
• Common Linear Copositive Lyapunov Functions (CLLFs): $V(x) = v^T x$ where $v \succ 0$ and $A_i^T v \prec 0$ for $i = 1, 2$.

In the interests of brevity, we shall abuse notation slightly and say that the matrices $A_1, A_2$ have a CQLF, CDLF or CLLF rather than always referring to the associated LTI systems. Recall the following well-known necessary condition for the stability of positive switched linear systems (in fact this is a necessary condition for stability for general switched linear systems) [9].

Lemma 1. Let $A_1, A_2 \in \mathbb{R}^{n \times n}$ be Metzler and Hurwitz. Suppose that the associated switched positive linear system (2) is stable. Then for any real $\gamma \ge 0$, $A_1 + \gamma A_2$ is Hurwitz.

Common Quadratic Lyapunov Functions (CQLFs)

In [5], the relationship between the existence of CQLFs, the stability of all matrices of the form $A_1 + \gamma A_2$ with $\gamma \ge 0$, and the stability of the system (2) was considered. For 2-dimensional systems, the following result was established.

Theorem 1. Let $A_1, A_2 \in \mathbb{R}^{2 \times 2}$ be Hurwitz and Metzler. Then the following statements are equivalent:

(a) $A_1, A_2$ have a CQLF;
(b) The switched system (2) is stable;
(c) $A_1 + \gamma A_2$ is Hurwitz for all real $\gamma \ge 0$.

Further, the equivalence of (b) and (c) can be extended to the case of an arbitrary finite number of positive LTI systems. Formally, it was shown in [5] that given Metzler, Hurwitz matrices $A_1,\dots,A_k$ in $\mathbb{R}^{2 \times 2}$, the switched system $\dot{x}(t) = A(t) x(t)$, $A(t) \in \{A_1,\dots,A_k\}$ is stable if and only if $A_1 + \gamma_2 A_2 + \dots + \gamma_k A_k$ is Hurwitz for all real $\gamma_2 \ge 0, \dots, \gamma_k \ge 0$.

The equivalence of (a), (b) and (c) fails immediately for 3-dimensional systems. Moreover, the equivalence of (b) and (c) is not true for arbitrary dimensions [5]. In fact, in a very recent paper [3], a 3-dimensional example of an unstable switched system for which $A_1 + \gamma A_2$ was Hurwitz for all $\gamma \ge 0$ was explicitly described. In connection with CQLF existence and the stability of positive switched linear systems, it has been shown in [7] for 2 and 3 dimensional systems that if $\operatorname{rank}(A_2 - A_1) = 1$, and $A_2, A_1$ are both Hurwitz, then the associated LTI systems always possess a CQLF and the switched linear system (2) is stable.

Common Diagonal Lyapunov Functions (CDLFs)

As stable positive LTI systems have diagonal Lyapunov functions, it is natural to ask under what conditions families of such systems will possess a common diagonal Lyapunov function. In the paper [6], the following result was derived for systems with irreducible system matrices (for the definition of irreducible matrices, see [1]).

Theorem 2. Let $A_1, A_2 \in \mathbb{R}^{n \times n}$ be irreducible, Metzler and Hurwitz. $A_1, A_2$ have a CDLF if and only if $A_1 + D A_2 D$ is Hurwitz for all diagonal matrices $D > 0$.

The above result allows us to establish a connection between the existence of a CDLF and a form of robust stability for switched positive linear systems. First of all, note that for $A_1, A_2$ irreducible, Metzler and Hurwitz, Theorem 2 shows that if $A_1, A_2$ have a CDLF, then so do $D_1 A_1 D_1$, $D_2 A_2 D_2$ for any choice of diagonal matrices $D_1 > 0$, $D_2 > 0$. Hence the existence of a CDLF guarantees the stability of the positive switched linear system

$$\dot{x}(t) = A(t) x(t), \qquad A(t) \in \{D_1 A_1 D_1, D_2 A_2 D_2\} \qquad (3)$$

for any diagonal matrices $D_1 > 0$, $D_2 > 0$. Conversely, if $A_1, A_2$ do not have a CDLF, then it follows from Theorem 2 that there is some diagonal matrix $D > 0$ such that $A_1 + D A_2 D$ is not Hurwitz. This then immediately implies from Lemma 1 that the switched system (3) is not stable with $D_1 = I$ and $D_2 = D$. This discussion establishes the following result.

Proposition 2. Let $A_1, A_2 \in \mathbb{R}^{n \times n}$ be irreducible, Metzler and Hurwitz. The switched system (3) is stable for any diagonal matrices $D_1 > 0$, $D_2 > 0$ if and only if $A_1, A_2$ have a CDLF.

Common Linear Copositive Lyapunov Functions (CLLFs)

It is also possible to establish the stability of positive switched linear systems using copositive linear Lyapunov functions. As noted in [2], traditional Lyapunov functions may give conservative stability conditions for positive switched systems as they fail to take into account that trajectories are naturally constrained to the positive orthant. The existence of a CLLF for a pair of Metzler, Hurwitz matrices $A_1, A_2$ is equivalent to the feasibility of the linear inequalities $v \succ 0$, $A_1^T v \prec 0$, $A_2^T v \prec 0$. For the most part, we shall be concerned with the feasibility of the related system of inequalities $v \succ 0$, $A_1 v \prec 0$, $A_2 v \prec 0$ as this is more relevant to the extension of the concept of D-stability for switched positive linear systems that interests us. Conditions for the feasibility of this system of inequalities (for compact sets of matrices) have been given in terms of P-matrix sets in the paper [10].

It is important for what follows to make clear the distinction between the existence of a common $v \succ 0$ satisfying $A_i^T v \prec 0$ for $i = 1, 2$ (CLLF existence), and the existence of a common $v \succ 0$ such that $A_i v \prec 0$ for $i = 1, 2$. For switched systems (in contrast with the LTI case), these two conditions are not equivalent. This can be seen from the following simple $2 \times 2$ example.

Example 1.

$$A_1 = \begin{pmatrix} -1 & 2 \\ 1 & -3 \end{pmatrix}, \qquad A_2 = \begin{pmatrix} -6 & 6 \\ 2 & -6 \end{pmatrix}$$

It can be verified algebraically that $A_i v \prec 0$ for $i = 1, 2$ where $v = (5\ 2)^T$. However, it is easy to show that there can be no $v \succ 0$ satisfying $A_i^T v \prec 0$ for $i = 1, 2$.

An algebraic condition for CLLF existence was derived in [8]. In the interests of brevity, we shall not explicitly state this result here but rather state the following technical result which follows from Theorem 3.1 in that paper. This fact shall prove useful in our later discussion.

Lemma 2. Let $A_1, A_2 \in \mathbb{R}^{n \times n}$ be Metzler and Hurwitz. Suppose that there is no non-zero $v \succeq 0$ in $\mathbb{R}^n$ with $A_1 v \preceq 0$, $A_2 v \preceq 0$. Then there is some diagonal $D > 0$ such that $A_1 + D A_2$ is singular.
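Returning to Example 1 of this section, both claims can be checked with a few lines of Python. The sketch below is an illustration only: the products $A_i v$ are computed exactly, while the infeasibility of the transposed system is merely scanned on a grid of directions $v = (1, s)$, which is a heuristic and not a proof (the inequalities are homogeneous, so normalising the first entry to 1 loses no generality).

```python
def matvec(A, v):
    """Matrix-vector product for small dense matrices given as nested lists."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

A1 = [[-1, 2], [1, -3]]
A2 = [[-6, 6], [2, -6]]
v = [5, 2]

right_products = [matvec(A, v) for A in (A1, A2)]    # A_1 v and A_2 v

def transposed_system_feasible_on_grid(steps=400):
    """Scan v = (1, s), s on a grid in (0, 4], for A_i^T v < 0 with i = 1, 2."""
    for k in range(1, steps + 1):
        s = 4.0 * k / steps
        w = [1.0, s]
        if all(c < 0 for A in (A1, A2) for c in matvec(transpose(A), w)):
            return True
    return False
```

The scan finds no feasible direction, in line with the algebra: $A_1^T v \prec 0$ forces $v_2 < v_1$ whereas $A_2^T v \prec 0$ forces $v_2 > v_1$.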

4 Switched Positive Linear Systems and D-Stability: The 2-d Case

In this and the following section, we shall investigate the following generalisation of the notion of D-stability to positive switched linear systems.

Definition 1. Let $A_1, A_2 \in \mathbb{R}^{n \times n}$ be Metzler and Hurwitz. The associated switched positive linear system (2) is said to be D-stable if for any diagonal matrices $D_1, D_2 \in \mathbb{R}^{n \times n}$ with $D_1 > 0$, $D_2 > 0$, the system

$$\dot{x}(t) = A(t) x(t), \qquad A(t) \in \{D_1 A_1, D_2 A_2\} \qquad (4)$$

is stable.

For positive LTI systems, Proposition 1 shows that stability and D-stability are equivalent. Our first observation, in Example 2, is to note that this equivalence is not true in the switched case. First of all, we note the following simple necessary condition for D-stability, which follows immediately from Lemma 1.

Lemma 3. Let $A_1, A_2 \in \mathbb{R}^{n \times n}$ be Metzler and Hurwitz. Suppose that the associated switched positive linear system (2) is D-stable. Then for any diagonal matrix $D > 0$, $A_1 + D A_2$ is Hurwitz.

Example 2. Consider the Metzler, Hurwitz matrices in $\mathbb{R}^{2 \times 2}$

$$A_1 = \begin{pmatrix} -2 & 0 \\ 1 & -4 \end{pmatrix}, \qquad A_2 = \begin{pmatrix} -1 & 5 \\ 0 & -1 \end{pmatrix}$$

It is straightforward to verify that $A_1 + \gamma A_2$ is Hurwitz for all $\gamma \ge 0$. Hence by Theorem 1, the associated switched system is stable. On the other hand, choosing

$$D = \begin{pmatrix} 20 & 0 \\ 0 & 0.5 \end{pmatrix}$$

it is easily verified that $A_1 + D A_2$ is not Hurwitz. Hence by Lemma 3 the associated switched system is not D-stable.

The above example illustrates that for switched positive linear systems, the concepts of stability and D-stability are not equivalent, in contrast to the LTI system case. In the following result, we show that the necessary condition given in Lemma 3 is also sufficient for D-stability for 2-dimensional systems.

Theorem 3. Let $A_1, A_2 \in \mathbb{R}^{2 \times 2}$ be Metzler and Hurwitz. The positive switched linear system (2) is D-stable if and only if $A_1 + D A_2$ is Hurwitz for all diagonal matrices $D > 0$.

Proof. Lemma 3 has already established the necessity of this condition. For sufficiency let $D_1 > 0$, $D_2 > 0$ be diagonal matrices and let $\gamma \ge 0$ be any non-negative real number. By hypothesis, $A_1 + \gamma D_1^{-1} D_2 A_2$ is Hurwitz for $\gamma > 0$ and it is trivially true for $\gamma = 0$. However, this matrix is also Metzler and hence by point (f) of Proposition 1, $D_1 A_1 + \gamma D_2 A_2 = D_1 (A_1 + \gamma D_1^{-1} D_2 A_2)$ is also Hurwitz. It now follows immediately from Theorem 1 that the switched system (4) associated with $D_1 A_1, D_2 A_2$ is stable. As this is true for any diagonal $D_1 > 0$, $D_2 > 0$, the system (2) is D-stable. □
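The computations behind Example 2 above reduce to trace and determinant checks for $2 \times 2$ matrices. The Python sketch below is an illustration only; the helper names are assumptions made here, and the sweep over $\gamma$ samples a finite grid and therefore supports, but does not prove, the claim for all $\gamma \ge 0$ (analytically, $\det(A_1 + \gamma A_2) = \gamma^2 + \gamma + 8 > 0$).

```python
def is_hurwitz_2x2(M):
    """A real 2x2 matrix is Hurwitz iff its trace is negative and its determinant positive."""
    trace = M[0][0] + M[1][1]
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    return trace < 0 and det > 0

def combine(A1, A2, d1, d2):
    """Return A1 + D A2 for D = diag(d1, d2); D scales the rows of A2."""
    scale = (d1, d2)
    return [[A1[i][j] + scale[i] * A2[i][j] for j in range(2)] for i in range(2)]

A1 = [[-2, 0], [1, -4]]
A2 = [[-1, 5], [0, -1]]

# gamma * I is a special diagonal scaling, so combine also covers A1 + gamma * A2.
pencil_ok = all(is_hurwitz_2x2(combine(A1, A2, g / 10, g / 10)) for g in range(0, 1001))
scaled = combine(A1, A2, 20, 0.5)    # D = diag(20, 0.5)
```

The scaled matrix has determinant $-1$, so it has a positive eigenvalue, while every sampled member of the pencil $A_1 + \gamma A_2$ passes the Hurwitz test.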

The next result establishes a connection between the existence of a common solution to the inequalities $v \succ 0$, $A_i v \prec 0$ for $i = 1,2$ and D-stability for (2).

Corollary 1. Let $A_1, A_2 \in \mathbb{R}^{2\times 2}$ be Metzler and Hurwitz. Then:

(i) If there is some $v \succ 0$ with $A_1 v \prec 0$, $A_2 v \prec 0$ then the system (2) is D-stable;
(ii) If (2) is D-stable then there exists some non-zero $v \succeq 0$ with $A_1 v \preceq 0$, $A_2 v \preceq 0$.

Proof. (i) Suppose there is some $v \succ 0$ with $A_i v \prec 0$ for $i = 1,2$. Then for any diagonal $D > 0$, $DA_2 v \prec 0$ and $(A_1 + DA_2)v \prec 0$. Moreover, $A_1 + DA_2$ is Metzler. Hence, from point (e) of Proposition 1, it follows that $A_1 + DA_2$ is Hurwitz. Theorem 3 now implies that the switched system (2) is D-stable.

(ii) If (2) is D-stable, then Theorem 3 implies that $A_1 + DA_2$ is Hurwitz for all diagonal $D > 0$. It now follows from Lemma 2 that there must exist some non-zero $v \succeq 0$ with $A_1 v \preceq 0$, $A_2 v \preceq 0$. □

Note that the sufficient condition for D-stability presented in point (i) of Corollary 1 is not necessary, as demonstrated by the following example.

Example 3. Consider the Metzler, Hurwitz matrices $A_1, A_2$ given by:

$$A_1 = \begin{pmatrix} -2 & 1 \\ 2 & -2 \end{pmatrix}, \qquad A_2 = \begin{pmatrix} -3 & 1 \\ 2 & -1 \end{pmatrix}.$$

Using Theorem 4.1 of [8] it is straightforward to show that there is no vector $v \succ 0$ with $A_1 v \prec 0$, $A_2 v \prec 0$. On the other hand, it can be verified algebraically that for any diagonal $D > 0$, $A_1 + DA_2$ is Hurwitz and hence the switched system (2) is D-stable by Theorem 3.
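A quick numerical sanity check of Example 3 is sketched below. It assumes the matrices read $A_1 = \begin{pmatrix} -2 & 1 \\ 2 & -2 \end{pmatrix}$, $A_2 = \begin{pmatrix} -3 & 1 \\ 2 & -1 \end{pmatrix}$. Random sampling of $D$ and a grid search over $v = (1, r)^T$ are stand-ins chosen for illustration; they do not replace the algebraic argument or Theorem 4.1 of [8].

```python
import numpy as np

A1 = np.array([[-2.0, 1.0], [2.0, -2.0]])
A2 = np.array([[-3.0, 1.0], [2.0, -1.0]])

def is_hurwitz(M):
    """Return True if every eigenvalue of M has negative real part."""
    return bool(np.all(np.linalg.eigvals(M).real < 0))

# Random positive diagonal scalings spread over several orders of magnitude.
rng = np.random.default_rng(0)
samples = 10.0 ** rng.uniform(-3, 3, size=(2000, 2))
all_hurwitz = all(is_hurwitz(A1 + np.diag(d) @ A2) for d in samples)

# Look for v = (1, r) with r > 0 such that A1 v < 0 and A2 v < 0.
# Positive rescaling of v does not change the inequalities, so fixing v_1 = 1 loses nothing.
ratios = np.linspace(1e-3, 1e3, 200001)
V = np.vstack([np.ones_like(ratios), ratios])   # each column is a candidate v
common_v = bool(np.any(np.all(A1 @ V < 0, axis=0) & np.all(A2 @ V < 0, axis=0)))

print("A1 + D*A2 Hurwitz for all sampled D:", all_hurwitz)
print("common v found on the grid:", common_v)
```

The outcome agrees with a direct computation: $A_1 v \prec 0$ forces $1 < v_2/v_1 < 2$ whereas $A_2 v \prec 0$ forces $2 < v_2/v_1 < 3$, and $\det(A_1 + DA_2) = 2 + 4d_1 + d_1 d_2 > 0$ with negative trace for $D = \mathrm{diag}(d_1, d_2) > 0$.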

5 D-Stability in Higher Dimensions

In this section, we present a result extending Corollary 1 to higher dimensional positive switched linear systems. While the following result is stated for switched systems with two constituent systems, the argument can easily be amended to derive a corresponding result for an arbitrary number of constituent systems.

Theorem 4. Let $A_1, A_2 \in \mathbb{R}^{n\times n}$ be Metzler and Hurwitz. Then:

(i) If there is some $v \succ 0$ with $A_1 v \prec 0$, $A_2 v \prec 0$ then the system (2) is D-stable;
(ii) If (2) is D-stable then there exists some non-zero $v \succeq 0$ with $A_1 v \preceq 0$, $A_2 v \preceq 0$.

Proof. (i) The first step in proving (i) is to show that the existence of such a $v$ is sufficient for the stability of the switched system (2). With this in mind, suppose that there exists some $v \succ 0$ satisfying $A_1 v \prec 0$, $A_2 v \prec 0$, and let a (piecewise-constant) switching signal $\sigma : \mathbb{R}_+ \to \{1, 2\}$ be given such that $A(t) = A_{\sigma(t)}$ for all $t \ge 0$. Furthermore, let $0 = t_0, t_1, t_2, \ldots, t_k, \ldots$ be the switching times or points of discontinuity of $\sigma$. As is standard for switching systems [9], we assume that there is some non-vanishing dwell-time $\tau > 0$ such that $t_{k+1} - t_k \ge \tau$ for all $k \ge 0$. Let $x^{\sigma}(\cdot, x_0)$ denote the unique, piecewise $C^1$ solution of (2) corresponding to the initial condition $x_0$ and the switching signal $\sigma$. Also, for $i = 1,2$, let $x^{(i)}(\cdot, x_0)$ denote the unique solution of the stable positive LTI system $\dot{x} = A_i x$ corresponding to the initial state $x_0$. Note the following readily verifiable facts concerning the solutions of the positive LTI systems with system matrices $A_1, A_2$.

(a) For $i = 1,2$, if $x_0 \succeq 0$, $x_1 \succeq 0$ satisfy $x_0 \preceq x_1$, then $x^{(i)}(t, x_0) \preceq x^{(i)}(t, x_1)$ for all $t \ge 0$. This simply records the well-known fact that positive LTI systems are monotone;
(b) For $i = 1,2$, as $\frac{d}{dt}x^{(i)}(0, v) = A_i v \prec 0$, it follows that there is some $\delta > 0$ such that $x^{(i)}(t, v) \prec v$ for $0 < t \le \delta$.

Combining (a) and (b) we see immediately that for $0 \le t \le \delta$, and $i = 1,2$,

$$x^{(i)}(t + \delta, v) = x^{(i)}(t, x^{(i)}(\delta, v)) \preceq x^{(i)}(t, v) \preceq v.$$

Simply iterating this process, it is easy to see that for $i = 1,2$, $x^{(i)}(t, v) \preceq v$ for all $t \ge 0$.

Now consider the solution $x^{\sigma}(\cdot, v)$ of (2) corresponding to the initial condition $v$ and the switching signal $\sigma$. The argument in the previous paragraph guarantees that for $0 \le t \le t_1$, $x^{\sigma}(t, v) \preceq v$ (as the dynamics in this interval are given by one of the constituent positive LTI systems). But in the second interval $[t_1, t_2)$, the system dynamics are again given by a positive LTI system with $x^{\sigma}(t_1, v) \preceq v$ as initial condition. Hence from the previous argument combined with point (a) above, we can conclude that for $t_1 \le t \le t_2$, $x^{\sigma}(t, v) \preceq v$. Continuing in this way, we can easily see that for all $t \ge 0$, we have $x^{\sigma}(t, v) \preceq v$. As the switching signal $\sigma$ was arbitrary, we can conclude that $x^{\sigma}(t, v) \preceq v$ holds for all switching signals.

It is now straightforward to show that the solutions of (2) are uniformly bounded. In fact, for any $x_0 \succeq 0$ such that $\|x_0\|_\infty \le K_1$, if $v_{\min} = \min\{v_1, \ldots, v_n\}$, then $x_0 \preceq (K_1/v_{\min})v$. It now follows that for all $t \ge 0$, $x^{\sigma}(t, x_0) \preceq (K_1/v_{\min})v$ and hence that $\|x^{\sigma}(t, x_0)\|_\infty \le K_1(v_{\max}/v_{\min})$ for all $t \ge 0$ where $v_{\max} = \max\{v_1, \ldots, v_n\}$.

Now if there is some $v \succ 0$ with $A_i v \prec 0$ for $i = 1,2$, then $(A_i + \varepsilon I)v \prec 0$ for sufficiently small $\varepsilon > 0$. Therefore, the trajectories of the switched system corresponding to $A_1 + \varepsilon I$, $A_2 + \varepsilon I$ are also uniformly bounded for small enough $\varepsilon > 0$. Since every trajectory of (2) equals $e^{-\varepsilon t}$ times the corresponding trajectory of this shifted system, the original system (2) is globally asymptotically stable.

To complete the proof of (i), note that for any positive definite diagonal matrices $D_1, D_2$, the matrices $D_1A_1$ and $D_2A_2$ are Metzler and Hurwitz. Moreover, if $A_i v \prec 0$ for $i = 1,2$, then $D_iA_i v \prec 0$ for $i = 1,2$. The above argument now immediately implies that the system (4) is stable and hence the original system (2) is D-stable as claimed.

The result given by (ii) follows immediately from Lemma 3 and Lemma 2. □
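The invariance argument in the proof of (i), namely that trajectories started at $v$ never leave the box $\{x : 0 \preceq x \preceq v\}$ under arbitrary switching, can be illustrated with a small simulation. This is a sketch under stated assumptions: the matrices `A[0]`, `A[1]` and the vector `v` are hypothetical, and the continuous-time flow is replaced by a forward Euler step $x \mapsto (I + hA_i)x$ with $h$ small enough that $I + hA_i$ is entrywise nonnegative. For such a step, $(I + hA_i)x \preceq (I + hA_i)v = v + hA_iv \prec v$ whenever $0 \preceq x \preceq v$, which mirrors facts (a) and (b).

```python
import numpy as np

# Hypothetical Metzler, Hurwitz pair with a common v > 0 such that A_i v < 0.
A = [np.array([[-3.0, 1.0], [1.0, -2.0]]),
     np.array([[-2.0, 1.0], [2.0, -4.0]])]
v = np.array([1.0, 1.0])
assert all(np.all(Ai @ v < 0) for Ai in A)

h = 0.01            # Euler step; I + h*A_i is entrywise nonnegative for this h
dwell_steps = 25    # each mode is held for dwell_steps*h time units (dwell time)
rng = np.random.default_rng(1)

x = v.copy()
stays_below_v = True
for k in range(4000):
    if k % dwell_steps == 0:
        mode = rng.integers(0, 2)          # arbitrary (here random) switching signal
    x = x + h * (A[mode] @ x)              # one Euler step of x' = A_mode x
    stays_below_v &= bool(np.all(x <= v + 1e-12))

final_norm = float(np.max(np.abs(x)))
print("trajectory stayed below v:", stays_below_v)
print("sup-norm of the state at the final time:", final_norm)
```

Changing the seed or the dwell length changes the switching signal but not the conclusion, in line with the fact that the bound $x^{\sigma}(t, v) \preceq v$ does not depend on $\sigma$.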

Note that the result given by (i) provides a condition for stability of (2) that is distinct from, although related to, the condition given by CLLF existence.

6 Concluding Remarks

In this paper, we have discussed a number of problems relating to the stability properties of switched positive linear systems. In particular, we have reviewed recent work on common quadratic, copositive and diagonal Lyapunov functions for these systems and on the relationship between the existence of such functions and various notions of stability for switched positive systems. We have also discussed the notion of D-stability for positive switched systems and presented separate necessary and sufficient conditions for D-stability for n-dimensional systems. More detailed and complete results have also been given for 2-dimensional systems.

A number of interesting directions for future research emerge from the work described here. For instance, it would be interesting to investigate the possibility of Theorem 3 extending to dimensions higher than 2, even for some restricted system class. Also, the question of whether stability and D-stability are equivalent for any subclass of positive switched linear systems arises naturally. It is straightforward to show that this is true for upper (or lower) triangular positive systems, for example, but are there any more interesting such classes?

Acknowledgements. The authors are very grateful to the organisers of the session on Stability and Control of Positive Systems for their kind invitation to contribute to the session. This work has been supported by the Irish Higher Education Authority (HEA) PRTLI Network Mathematics grant, by Science Foundation Ireland (SFI) grant 08/RFP/ENE1417 and by SFI PI Award 07/IN.1/1901.

References

1. Berman, A., Plemmons, R.J.: Non-negative Matrices in the Mathematical Sciences. SIAM Classics in Applied Mathematics (1994)
2. Camlibel, M.K., Schumacher, J.M.: Copositive Lyapunov Functions. In: Unsolved Problems in Mathematical Systems and Control Theory. Princeton University Press, Princeton (2004), http://press.princeton.edu/math/blondel
3. Fainshil, L., Margaliot, M., Chigansky, P.: Positive Switched Linear Systems are not Uniformly Stable, even for n = 3. IEEE Transactions on Automatic Control (to appear 2009)
4. Farina, L., Rinaldi, S.: Positive Linear Systems. Theory and Applications. Pure and Applied Mathematics. John Wiley & Sons, Inc., New York (2000)
5. Gurvits, L., Shorten, R., Mason, O.: On the Stability of Switched Positive Linear Systems. IEEE Transactions on Automatic Control 52(6), 1099–1103 (2007)
6. Mason, O., Shorten, R.: On the Simultaneous Diagonal Stability of a Pair of Positive Linear Systems. Linear Algebra and its Applications 413, 13–23 (2006)
7. Mason, O., Shorten, R.: Quadratic and Copositive Lyapunov Functions and the Stability of Positive Switched Linear Systems. In: Proceedings of American Control Conference, New York (2007)
8. Mason, O., Shorten, R.: On Linear Copositive Lyapunov Functions and the Stability of Switched Positive Linear Systems. IEEE Transactions on Automatic Control 52(7), 1346–1349 (2007)
9. Shorten, R., Wirth, F., Mason, O., Wulff, K., King, C.: Stability Theory for Switched and Hybrid Systems. SIAM Review 49(4), 545–592 (2007)
10. Song, Y., Seetharama-Gowda, M., Ravindran, G.: On Some Properties of P-matrix Sets. Linear Algebra and its Applications 290, 237–246 (1999)
11. Virnik, E.: Stability Analysis of Positive Descriptor Systems. Linear Algebra and its Applications 429, 2640–2659 (2008)

On Positivity and Stability of Linear Volterra-Stieltjes Differential Systems

Pham Huu Anh Ngoc

Abstract. An explicit criterion for positive linear Volterra-Stieltjes differential systems is given. Then new explicit criteria for uniform asymptotic stability and exponential asymptotic stability of positive linear Volterra-Stieltjes differential systems are presented. Finally, a crucial difference between the uniform asymptotic stability and the exponential asymptotic stability of linear Volterra-Stieltjes differential systems is shown.

1 Preliminaries

Let $\mathbb{K} := \mathbb{R}$ or $\mathbb{C}$. For integers $l, q \ge 1$, $\mathbb{K}^l$ denotes the $l$-dimensional vector space over $\mathbb{K}$. Inequalities between real matrices or vectors will be understood componentwise, i.e. for two real matrices $A = (a_{ij})$ and $B = (b_{ij})$ in $\mathbb{R}^{l\times q}$ we write $A \ge B$ iff $a_{ij} \ge b_{ij}$ for $i = 1,\ldots,l$, $j = 1,\ldots,q$. We denote by $\mathbb{R}_+^{l\times q}$ the set of all nonnegative matrices $A \ge 0$. Similar notations are adopted for vectors. For $x \in \mathbb{K}^n$ and $P \in \mathbb{K}^{l\times q}$ we define $|x| = (|x_i|)$ and $|P| = (|p_{ij}|)$.

A norm $\|\cdot\|$ on $\mathbb{K}^n$ is said to be monotonic if $\|x\| \le \|y\|$ whenever $x, y \in \mathbb{K}^n$, $|x| \le |y|$. Every $p$-norm on $\mathbb{K}^n$, $1 \le p \le \infty$, is monotonic. Throughout the paper, if not stated otherwise, the norm of a matrix $P \in \mathbb{K}^{l\times q}$ is understood as its operator norm associated with a given pair of monotonic vector norms on $\mathbb{K}^l$ and $\mathbb{K}^q$, that is $\|P\| = \max\{\|Py\| ; \|y\| = 1\}$. Then one has

$$P \in \mathbb{K}^{l\times q},\ Q \in \mathbb{R}_+^{l\times q},\ |P| \le Q \ \Rightarrow\ \|P\| \le \||P|\| \le \|Q\|, \tag{1}$$

see, e.g. [7]. In what follows, we denote $\mathbb{C}_\gamma := \{z \in \mathbb{C} : \operatorname{Re} z \ge \gamma\}$ with given $\gamma \in \mathbb{R}$. For any matrix $A \in \mathbb{K}^{n\times n}$ the spectral abscissa of $A$ is denoted by $\mu(A) = \max\{\operatorname{Re}\lambda : \lambda \in \sigma(A)\}$, where $\sigma(A) := \{s \in \mathbb{C} ; \det(sI_n - A) = 0\}$ is the spectrum of $A$. A matrix $A \in \mathbb{R}^{n\times n}$ is called a Metzler matrix if all off-diagonal elements of $A$ are nonnegative.

Let $J$ be an interval of $\mathbb{R}$. A matrix function $\eta(\cdot) : J \to \mathbb{R}^{l\times q}$ is called increasing on $J$ if $\eta(\theta_2) \ge \eta(\theta_1)$ for $\theta_1, \theta_2 \in J$, $\theta_1 < \theta_2$.

Pham Huu Anh Ngoc Institute of Mathematics, Ilmenau Technical University, Weimarer Straße 25, 98693 Ilmenau, DE, e-mail: [email protected]

R. Bru and S. Romero-Vivó (Eds.): Positive Systems, LNCIS 389, pp. 111–121. springerlink.com © Springer-Verlag Berlin Heidelberg 2009

2 Explicit Criterion for Positive Linear Volterra-Stieltjes Differential Systems

Consider a linear Volterra-Stieltjes differential system of the form

$$\dot{x}(t) = Ax(t) + \int_0^t d[B(s)]\,x(t - s), \quad \text{for a.a. } t \in \mathbb{R}_+, \tag{2}$$

where $A \in \mathbb{R}^{n\times n}$ is a given matrix and $B(\cdot) : \mathbb{R}_+ \to \mathbb{R}^{n\times n}$ is a given matrix function of locally bounded variation on $\mathbb{R}_+$. Furthermore, we always assume that $B(\cdot)$ is normalized to be right-continuous on $\mathbb{R}_+$ and vanishes at 0. From the theory of integro-differential systems (see e.g. [1]), it is well-known that there exists a unique locally absolutely continuous matrix function $R(\cdot) : \mathbb{R}_+ \to \mathbb{R}^{n\times n}$ such that

$$\dot{R}(t) = AR(t) + \int_0^t d[B(s)]\,R(t - s), \quad \text{a.a. } t \in \mathbb{R}_+, \quad R(0) = I_n. \tag{3}$$

Then $R(\cdot)$ is called the resolvent of (2). Moreover, for given $f \in L^1_{loc}(\mathbb{R}_+, \mathbb{R}^n)$, the following nonhomogeneous system

$$\dot{x}(t) = Ax(t) + \int_0^t d[B(s)]\,x(t - s) + f(t), \quad \text{a.a. } t \in \mathbb{R}_+, \tag{4}$$

has a unique solution $x(\cdot)$ satisfying the initial condition $x(0) = x_0 \in \mathbb{R}^n$ and it is represented by the variation of constants formula

$$x(t) = R(t)x_0 + \int_0^t R(t - s)f(s)\,ds, \quad t \in \mathbb{R}_+, \tag{5}$$

see e.g. [1, p. 81].

Definition 1. Let $\sigma \in \mathbb{R}_+$ and $\varphi \in C([0,\sigma], \mathbb{R}^n)$. A vector function $x(\cdot) : \mathbb{R}_+ \to \mathbb{R}^n$ is called a solution of (2) through $(\sigma, \varphi)$ if $x(\cdot)$ is absolutely continuous on any compact subinterval of $[\sigma, +\infty)$ and satisfies (2) for almost all $t \in [\sigma, +\infty)$ and $x(t) = \varphi(t)$, $\forall t \in [0, \sigma]$. We denote it by $x(\cdot\,; \sigma, \varphi)$.

Remark 1. By the fact mentioned above on the solution of the nonhomogeneous system (4), it is easy to check that for fixed $\sigma \in \mathbb{R}_+$ and given $\varphi \in C([0,\sigma], \mathbb{R}^n)$, there exists a unique solution of (2) through $(\sigma, \varphi)$ and it is given by

$$x(t + \sigma; \sigma, \varphi) = R(t)\varphi(\sigma) + \int_0^t R(t - u)\left(\int_u^{u+\sigma} d[B(s)]\,\varphi(u + \sigma - s)\right)du, \quad t \in \mathbb{R}_+. \tag{6}$$

In the above, it is understood that $\int_u^{u+\sigma} d[B(s)]\,\varphi(u + \sigma - s) = 0$ when $\sigma = 0$.

Definition 2. We say that (2) is positive, if for any $\sigma \ge 0$ and any $\varphi \in C([0,\sigma], \mathbb{R}^n)$, $\varphi \ge 0$, the corresponding solution $x(\cdot\,; \sigma, \varphi)$ is also nonnegative, that is $x(t; \sigma, \varphi) \ge 0$, $\forall t \ge \sigma$.

We are now in the position to prove the first main result of this paper.

Theorem 1. The system (2) is positive if and only if $A \in \mathbb{R}^{n\times n}$ is a Metzler matrix and $B(\cdot)$ is an increasing matrix function on $\mathbb{R}_+$.

Proof. (The "if" part) Let $\sigma \ge 0$ and $\varphi \in C([0,\sigma], \mathbb{R}^n)$, $\varphi \ge 0$. Then $x(\cdot\,; \sigma, \varphi)$ is given by (6). Since $B(\cdot)$ is increasing on $\mathbb{R}_+$ and $\varphi \ge 0$, it follows that

$$\int_u^{u+\sigma} d[B(s)]\,\varphi(u + \sigma - s) \ge 0, \quad \forall u \ge 0.$$

Thus $x(\cdot\,; \sigma, \varphi) \ge 0$ provided $R(t) \ge 0$, $t \in \mathbb{R}_+$. It remains to show that $x(t; 0, x_0) = R(t)x_0 \ge 0$, $t \in \mathbb{R}_+$, for any $x_0 \ge 0$. Note that $x(t) := x(t; 0, x_0)$, $t \in \mathbb{R}_+$, satisfies

$$x(t) = e^{At}x_0 + \int_0^t e^{A(t-u)}\left(\int_0^u d[B(s)]\,x(u - s)\right)du, \quad t \in \mathbb{R}_+.$$

Fix $a > 0$. We consider the operator

$$T : C([0,a], \mathbb{R}^n) \to C([0,a], \mathbb{R}^n), \quad \varphi \mapsto T\varphi(t) := e^{At}x_0 + \int_0^t e^{A(t-u)}\left(\int_0^u d[B(s)]\,\varphi(u - s)\right)du, \quad t \in [0,a].$$

By induction, it is easy to show that for $\varphi_1, \varphi_2 \in C([0,a], \mathbb{R}^n)$ and $k \in \mathbb{N}$, we have

$$\|T^k\varphi_2(t) - T^k\varphi_1(t)\| \le \frac{M^k t^k}{k!}\|\varphi_2 - \varphi_1\|, \quad \forall t \in [0,a],$$

where $M := M_1M_2$ and $M_1 := \max_{s\in[0,a]}\|e^{As}\|$, $M_2 := \mathrm{Var}(B; 0, a)$. This implies that $T^k$ is a contraction for $k \in \mathbb{N}$ sufficiently large. Fix $k_0 \in \mathbb{N}$ sufficiently large; by the contraction mapping principle, there exists a unique solution of the equation $x = Tx$ in $C([0,a], \mathbb{R}^n)$. Moreover, the sequence $(T^{mk_0}\varphi_0)_{m\in\mathbb{N}}$ with an arbitrary $\varphi_0 \in C([0,a], \mathbb{R}^n)$ converges to this solution in the space $C([0,a], \mathbb{R}^n)$. Choose $\varphi_0 \in C([0,a], \mathbb{R}^n)$, $\varphi_0 \ge 0$. Since $A \in \mathbb{R}^{n\times n}$ is a Metzler matrix and $B(\cdot)$ is increasing on $\mathbb{R}_+$, it follows that $T^{mk_0}\varphi_0 \ge 0$, $\forall m \in \mathbb{N}$. Thus $x(t; 0, x_0) \ge 0$, $\forall t \in [0,a]$. Since $a > 0$ is arbitrary, $x(t; 0, x_0) \ge 0$, $\forall t \in \mathbb{R}_+$.

(The "only if" part) We first show that $A$ is a Metzler matrix. Let $\{e_1, e_2, \ldots, e_n\}$ be the standard basis of $\mathbb{R}^n$. Fix $j \in \{1,2,\ldots,n\}$. Let $x(t) := x(t; 0, e_j)$, $t \in \mathbb{R}_+$. Then $x(\cdot) \ge 0$ satisfies (2) for a.a. $t \in \mathbb{R}_+$. Fix $i \in \{1,2,\ldots,n\}$ with $i \ne j$ and $k \in \mathbb{N}$. Since $x(\cdot)$ is absolutely continuous on $[0, 1/k]$, we have $x(t) = e_j + \int_0^t \dot{x}(s)\,ds$, $t \in [0, 1/k]$. It follows that there exists $t_k \in [0, 1/k]$ such that

$$\dot{x}(t_k) = Ax(t_k) + \int_0^{t_k} d[B(s)]\,x(t_k - s) \quad \text{and} \quad e_i^T\dot{x}(t_k) \ge 0.$$

This gives $\lim_{k\to+\infty} e_i^T\dot{x}(t_k) = e_i^TAe_j \ge 0$ for $i \ne j$. Thus $A$ is a Metzler matrix.

Let $B(\cdot) = (b_{ij}(\cdot))$. We now prove that $b_{ij}(\cdot)$ is increasing on $\mathbb{R}_+$ for every $i, j \in \{1,2,\ldots,n\}$. To do so, let $\psi \in C([0,\sigma], \mathbb{R})$, $\psi \ge 0$, $\psi(\sigma) = 0$ with given $\sigma > 0$. For fixed $i_0 \in \{1,2,\ldots,n\}$ let us define $\varphi := (\varphi_1, \ldots, \varphi_n)^T \in C([0,\sigma], \mathbb{R}^n)$, where $\varphi_i = \psi$ if $i = i_0$, otherwise $\varphi_i = 0$. Set $x(t) := x(t; \sigma, \varphi) = (x_1(t), x_2(t), \ldots, x_n(t))^T \ge 0$, $\forall t \ge \sigma$. Note that $x(\cdot)$ satisfies (2) for a.a. $t \in [\sigma, \sigma + 1/k]$ for given $k \in \mathbb{N}$. Since $x_1(\cdot)$ is absolutely continuous on $[\sigma, \sigma + 1/k]$, $x_1(\cdot) \ge 0$, $x_1(\sigma) = 0$, there exists $t_k \in [\sigma, \sigma + 1/k]$ such that $x(\cdot)$ satisfies (2) at $t_k$ and $\dot{x}_1(t_k) \ge 0$. This yields

$$\lim_{k\to+\infty}\dot{x}_1(t_k) = \int_0^\sigma \psi(\sigma - s)\,d[b_{1i_0}(s)] \ge 0.$$

Thus, the linear functional defined by $L : C([0,\sigma], \mathbb{R}) \to \mathbb{R}$, $\psi \mapsto L\psi := \int_0^\sigma \psi(\sigma - s)\,d[b_{1i_0}(s)]$, is positive. Taking into account [6, Lemma 3.4], we conclude that $b_{1i_0}(\cdot)$ is increasing on $[0,\sigma]$. Since $\sigma > 0$ is arbitrary, $b_{1i_0}(\cdot)$ is increasing on $\mathbb{R}_+$. In a similar way, we can show that $b_{ij}(\cdot)$ is increasing on $\mathbb{R}_+$ for any $i, j \in \{1,2,\ldots,n\}$. This completes the proof. □

The following is immediate from Theorem 1.

Corollary 1. Let $A \in \mathbb{R}^{n\times n}$ be a given matrix and let $B(\cdot) : \mathbb{R}_+ \to \mathbb{R}^{n\times n}$ be a given continuous matrix function. Then, a linear Volterra integro-differential system of the convolution type

$$\dot{x}(t) = Ax(t) + \int_0^t B(t - s)x(s)\,ds, \quad t \in \mathbb{R}_+, \tag{7}$$

is positive if and only if $A$ is a Metzler matrix and $B(t) \in \mathbb{R}_+^{n\times n}$ for all $t \ge 0$.
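The "if" direction of Corollary 1 can be illustrated on a discretized version of the convolution system (7). The sketch below is an assumption-laden illustration, not part of the original argument: the matrices `A`, `B0` and the kernel $B(t) = e^{-t}B_0$ are hypothetical, the derivative is replaced by a forward Euler step, and the integral by a rectangle rule. With a step $h$ such that $I + hA$ is entrywise nonnegative, every update is a sum of nonnegative terms, so the discrete trajectory cannot become negative.

```python
import numpy as np

A = np.array([[-2.0, 1.0], [0.5, -3.0]])      # Metzler (hypothetical data)
B0 = np.array([[0.2, 0.1], [0.0, 0.3]])       # entrywise nonnegative (hypothetical data)

def kernel(t):
    """Continuous kernel B(t) = exp(-t) * B0, nonnegative for all t >= 0."""
    return np.exp(-t) * B0

h, steps = 0.01, 500                           # I + h*A is entrywise nonnegative
x = np.zeros((steps + 1, 2))
x[0] = [1.0, 0.0]                              # nonnegative initial state

for k in range(steps):
    # Rectangle rule for the convolution integral over [0, t_k].
    conv = h * sum(kernel((k - j) * h) @ x[j] for j in range(k + 1))
    # Forward Euler step for x'(t) = A x(t) + (B * x)(t).
    x[k + 1] = x[k] + h * (A @ x[k] + conv)

min_entry = float(x.min())
print("smallest entry along the discrete trajectory:", min_entry)
```

Replacing an off-diagonal entry of `A` or an entry of `B0` by a negative number is a simple way to see, in the same discretization, how positivity can be lost.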

3 Stability of Positive Linear Volterra-Stieltjes Systems

In this section, we offer new criteria for uniform asymptotic stability and exponential asymptotic stability of positive linear Volterra-Stieltjes differential systems of the form (2). Throughout this section, we assume that

$$\int_0^{+\infty}|dB(t)| < +\infty. \tag{8}$$

Then the Laplace-Stieltjes transform of $B(\cdot)$: $\tilde{B}(z) := \int_0^{+\infty} e^{-zs}\,dB(s)$, is well-defined for $z \in \mathbb{C}_0$.

3.1 Explicit Criterion for Uniform Asymptotic Stability

Definition 3. (i) The zero solution of (2) is said to be uniformly stable (US) if for each $\varepsilon > 0$, there exists $\delta > 0$ such that

$$\varphi \in C([0,\sigma], \mathbb{R}^n),\ \|\varphi\| < \delta \ \Rightarrow\ \|x(t; \sigma, \varphi)\| < \varepsilon, \quad \forall t \ge \sigma.$$

(ii) The zero solution of (2) is said to be uniformly asymptotically stable (UAS) if it is US and if there exists $\delta_0 > 0$ such that $\forall \varepsilon > 0$, $\exists\, T(\varepsilon) > 0$:

$$\varphi \in C([0,\sigma], \mathbb{R}^n),\ \|\varphi\| < \delta_0 \ \Rightarrow\ \|x(t; \sigma, \varphi)\| < \varepsilon, \quad \forall t \ge T(\varepsilon) + \sigma.$$

If the zero solution of (2) is US (UAS) then we also say that (2) is US (UAS), respectively.

Denote

$$\Delta(z) := zI_n - A - \int_0^{+\infty} e^{-zt}\,dB(t), \quad z \in \mathbb{C}_0, \tag{9}$$

the characteristic matrix of (2).

Theorem 2. [7, Th. 5.3] Let (8) hold. Then the following statements are equivalent

(i) $\det\Delta(z) \ne 0$, $\forall z \in \mathbb{C}_0$;
(ii) the resolvent $R(\cdot)$ of (2) belongs to $L^1(\mathbb{R}_+, \mathbb{R}^{n\times n})$;
(iii) the system (2) is UAS.

Theorem 3. Suppose that (8) holds and (2) is positive. Then, (2) is UAS if and only if

$$\mu\left(A + \int_0^{+\infty} dB(t)\right) < 0. \tag{10}$$

Proof. Assume that $\mu\left(A + \int_0^{+\infty} dB(t)\right) < 0$. Since (2) is positive, $A \in \mathbb{R}^{n\times n}$ is a Metzler matrix and $B(\cdot)$ is an increasing matrix function on $\mathbb{R}_+$, by Theorem 1. For an arbitrary $z \in \mathbb{C}_0$, we have

$$\left|\int_0^{+\infty} e^{-zt}\,dB(t)\right| \le \int_0^{+\infty} e^{-\operatorname{Re}z\,t}\,dB(t) \le \int_0^{+\infty} dB(t).$$

By a standard property of Metzler matrices (see e.g. [6, Th. 2.1 (iv)]),

$$\mu\left(A + \int_0^{+\infty} e^{-zt}\,dB(t)\right) \le \mu\left(A + \int_0^{+\infty} dB(t)\right) < 0.$$

It follows that $z \notin \sigma\left(A + \int_0^{+\infty} e^{-zt}\,dB(t)\right)$. That is, $\det\Delta(z) \ne 0$. By Theorem 2, (2) is UAS.

Conversely, suppose (2) is UAS. Then, $\det\Delta(z) \ne 0$ $\forall z \in \mathbb{C}_0$, by Theorem 2. Consider the real function defined by $f(\theta) = \theta - \mu\left(A + \int_0^{+\infty} e^{-\theta t}\,dB(t)\right)$, $\theta \in [0, +\infty)$. Clearly, $f$ is continuous and $\lim_{\theta\to+\infty} f(\theta) = +\infty$. We show that $f(0) > 0$. Seeking a contradiction, assume that $f(0) \le 0$. Then there is $\lambda_1 \ge 0$ such that $f(\lambda_1) = 0$. That is, $\lambda_1 = \mu\left(A + \int_0^{+\infty} e^{-\lambda_1 t}\,dB(t)\right)$. Since $A + \int_0^{+\infty} e^{-\lambda_1 t}\,dB(t)$ is a Metzler matrix, $\lambda_1 \in \sigma\left(A + \int_0^{+\infty} e^{-\lambda_1 t}\,dB(t)\right)$, by the Perron-Frobenius theorem, see e.g. [5, Th. 2.1]. Thus, $\det\Delta(\lambda_1) = 0$. This is a contradiction. Hence, $\mu\left(A + \int_0^{+\infty} dB(t)\right) = -f(0) < 0$, as required. This completes the proof. □

The following is immediate from Theorems 1, 3.

Corollary 2. Let $A \in \mathbb{R}^{n\times n}$ be a Metzler matrix and let $B(\cdot) : \mathbb{R}_+ \to \mathbb{R}^{n\times n}$ be a given nonnegative continuous matrix function. If

$$\int_0^{+\infty}\|B(s)\|\,ds < +\infty,$$

then the system (7) is UAS if and only if $\mu\left(A + \int_0^{+\infty} B(t)\,dt\right) < 0$.
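The criterion of Corollary 2 reduces a stability question for an integro-differential system to one spectral abscissa computation. The following sketch shows the test under stated assumptions: the data `A`, `B0` are hypothetical, and the kernel is taken to be $B(t) = c\,e^{-t}B_0$ so that $\int_0^{+\infty}B(t)\,dt = c\,B_0$ in closed form. The helper names are illustrative only.

```python
import numpy as np

def spectral_abscissa(M):
    """Largest real part among the eigenvalues of M."""
    return float(np.max(np.linalg.eigvals(M).real))

def is_metzler(M):
    """True if all off-diagonal entries of M are nonnegative."""
    off_diagonal = M - np.diag(np.diag(M))
    return bool(np.all(off_diagonal >= 0))

# Hypothetical data: A Metzler, kernel B(t) = c * exp(-t) * B0 with B0 >= 0,
# so that the integral of B over [0, +inf) equals c * B0.
A = np.array([[-3.0, 1.0], [0.5, -2.0]])
B0 = np.array([[1.0, 0.5], [0.5, 0.5]])
assert is_metzler(A) and np.all(B0 >= 0)

mu_small = spectral_abscissa(A + 1.0 * B0)   # c = 1
mu_large = spectral_abscissa(A + 2.0 * B0)   # c = 2

print("c = 1: mu =", mu_small, "-> UAS" if mu_small < 0 else "-> not UAS")
print("c = 2: mu =", mu_large, "-> UAS" if mu_large < 0 else "-> not UAS")
```

For this data $A + B_0$ has eigenvalues $-0.5$ and $-3$, whereas $A + 2B_0$ has the eigenvalue $-1 + \sqrt{3} > 0$, so doubling the kernel destroys uniform asymptotic stability according to the criterion.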

Remark 2. Necessity of positivity. The assumption of positivity of (2) in Theorem 3 cannot, in general, be omitted. To see this, we consider the scalar equation

$$\dot{x}(t) = Ax(t) + \int_0^t d[B(s)]\,x(t - s) \quad \text{for a.a. } t \ge 0, \tag{11}$$

where $A = 0$ and $B(\cdot) : \mathbb{R}_+ \to \mathbb{R}$ is defined for $s \ge 0$ by

$$B(s) := \int_0^s b(\tau)\,d\tau; \qquad b : \mathbb{R}_+ \to \mathbb{R}, \quad b(\tau) = \begin{cases} \dfrac{3e^2}{e^2 - 1}\,e^{2\tau}, & \tau \in [0,1), \\[2mm] \dfrac{-4}{e^2 - 1}\,e^{2\tau}, & \tau \in [1,2), \\[2mm] 0, & \tau \in [2,\infty). \end{cases}$$

An easy computation yields

$$\int_0^{+\infty}|dB(s)| < +\infty, \qquad A + \int_0^{+\infty} dB(s) = -\frac{e^2}{2} < 0,$$

and so the equation (11) satisfies (8), (10). We show that (11) is not uniformly asymptotically stable. For $z \in \mathbb{C}$, consider the characteristic equation of (11)

$$0 = z - \int_0^{+\infty} e^{-zs}\,dB(s) = z - \int_0^{+\infty} e^{-zs}b(s)\,ds.$$

Writing

$$g : \mathbb{R}_+ \to \mathbb{R}, \quad t \mapsto g(t) := t - \int_0^{+\infty} e^{-ts}b(s)\,ds,$$

we obtain

$$g(0) = -\int_0^2 b(s)\,ds = \frac{e^2}{2} > 0, \qquad g(2) = 2 - \int_0^2 e^{-2s}b(s)\,ds = -1 + \frac{1}{e^2 - 1} < 0,$$

and hence there exists $t^* \in (0,2)$ such that $g(t^*) = 0$. Now Theorem 2 implies that (11) is not uniformly asymptotically stable. Note that since $B(\cdot)$ is not increasing on $\mathbb{R}_+$, (11) is not positive, by Theorem 1.
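The computation in Remark 2 can be reproduced numerically. The sketch below assumes that the density reads $b(\tau) = \frac{3e^2}{e^2-1}e^{2\tau}$ on $[0,1)$, $b(\tau) = \frac{-4}{e^2-1}e^{2\tau}$ on $[1,2)$ and $b(\tau) = 0$ afterwards; with this reading the stated values of $g(0)$ and $g(2)$ are recovered. The function names and the use of bisection are choices made for this illustration.

```python
import math

E2 = math.e ** 2
C1 = 3.0 * E2 / (E2 - 1.0)     # coefficient of exp(2*tau) on [0, 1)
C2 = -4.0 / (E2 - 1.0)         # coefficient of exp(2*tau) on [1, 2)

def exp_integral(a, lo, hi):
    """Integral of exp(a*s) over [lo, hi], with the limit a -> 0 handled separately."""
    if abs(a) < 1e-12:
        return hi - lo
    return (math.exp(a * hi) - math.exp(a * lo)) / a

def g(t):
    """g(t) = t - int_0^inf exp(-t*s) b(s) ds for the assumed density b."""
    a = 2.0 - t                 # exp(-t*s) * exp(2*s) = exp((2 - t)*s)
    return t - C1 * exp_integral(a, 0.0, 1.0) - C2 * exp_integral(a, 1.0, 2.0)

# Bisection on [0, 2]: g(0) > 0 and g(2) < 0, and g is continuous.
lo, hi = 0.0, 2.0
for _ in range(200):
    mid = 0.5 * (lo + hi)
    if g(mid) > 0:
        lo = mid
    else:
        hi = mid
t_star = 0.5 * (lo + hi)

print("g(0) =", g(0.0), " expected e^2/2 =", E2 / 2.0)
print("g(2) =", g(2.0), " expected -1 + 1/(e^2 - 1) =", -1.0 + 1.0 / (E2 - 1.0))
print("real zero of the characteristic equation in (0, 2):", t_star)
```

A nonnegative real zero $t^*$ of the characteristic equation means $\det\Delta(t^*) = 0$ with $t^* \in \mathbb{C}_0$, which is exactly the situation excluded by statement (i) of Theorem 2.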

3.2 Explicit Criterion for Exponential Asymptotic Stability

Definition 4. The zero solution of (2) is said to be exponentially asymptotically stable (EAS) if there exist $M \ge 1$, $\alpha > 0$ such that

$$\forall \sigma \ge 0,\ \forall \varphi \in C([0,\sigma], \mathbb{R}^n),\ \forall t \ge \sigma : \quad \|x(t; \sigma, \varphi)\| \le Me^{-\alpha(t-\sigma)}\|\varphi\|.$$

If the zero solution of (2) is EAS then we also say that (2) is EAS.

Theorem 4. Suppose that (8) holds and (2) is positive. Then the following statements are equivalent

(i) (2) is EAS;
(ii) (2) is UAS and

$$\exists\,\alpha > 0 : \quad \left\|\int_0^{+\infty} e^{\alpha s}\,dB(s)\right\| < +\infty; \tag{12}$$

(iii) (12) holds and $\mu\left(A + \int_0^{+\infty} dB(t)\right) < 0$.

Proof. Note that (ii) ⇔ (iii) is immediate from Theorem 3. It remains to show that (i) ⇔ (ii).

(ii) ⇒ (i) STEP 1: We show that

$$\exists\, K, \varepsilon > 0\ \ \forall t \in \mathbb{R}_+ : \quad \|R(t)\| \le Ke^{-\varepsilon t}.$$

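Since $B(\cdot)$ is increasing on $\mathbb{R}_+$, we have $\left|\int_0^{+\infty} e^{-zs}\,dB(s)\right| \le \int_0^{+\infty} e^{\alpha s}\,dB(s)$ for any $z \in \mathbb{C}_{-\alpha}$. Thus, if $\det\Delta(z) = 0$ for some $z \in \mathbb{C}_{-\alpha}$ then

$$|z| \le \left\|A + \int_0^{+\infty} e^{-zs}\,dB(s)\right\| \le \|A\| + \left\|\int_0^{+\infty} e^{\alpha s}\,dB(s)\right\| =: T_0,$$

by (1). Hence $\det\Delta(z) \ne 0$ for $z \in \mathbb{C}$ with $-\alpha \le \operatorname{Re}z \le 0$ and $|\operatorname{Im}z| \ge T_0 + 1$. Since $\det\Delta(\cdot)$ is analytic on the interior $\mathring{\mathbb{C}}_{-\alpha}$ of $\mathbb{C}_{-\alpha}$, it has at most a finite number of zeros in $D := \{z \in \mathbb{C} : -\alpha/2 \le \operatorname{Re}z \le 0,\ |\operatorname{Im}z| \le T_0 + 1\}$. In addition, since (2) is UAS, $\det\Delta(z) \ne 0$ $\forall z \in \mathbb{C}_0$ by Theorem 2. It follows that $c_0 := \sup\{\operatorname{Re}z : z \in \mathbb{C},\ \det\Delta(z) = 0\} < 0$. Choose $\varepsilon \in (0, \min\{-c_0, \alpha\})$. Then, it is easy to check that $R_1(\cdot) := e^{\varepsilon\cdot}R(\cdot)$ and $\Delta_1(\cdot) := \Delta(\cdot - \varepsilon)$ are, respectively, the resolvent and the characteristic matrix of the equation

$$\dot{y}(t) = (A + \varepsilon I_n)y(t) + \int_0^t d[F(s)]\,y(t - s), \quad t \in \mathbb{R}_+, \tag{13}$$

where $F(\cdot)$ is defined by $F(s) := \int_0^s e^{\varepsilon\tau}\,d[B(\tau)]$, $s \ge 0$. By (12), we have

$$\int_0^{+\infty}|dF(s)| < +\infty.$$

Since $\det\Delta(z) \ne 0$ for all $z \in \mathbb{C}_{-\varepsilon}$, it follows that $\det\Delta_1(z) \ne 0$ for all $z \in \mathbb{C}_0$. Applying Theorem 2 to (13), we get $R_1(\cdot) \in L^1(\mathbb{R}_+; \mathbb{R}^{n\times n})$. This implies that $\int_0^t d[F(s)]\,R_1(t - s) \in L^1(\mathbb{R}_+, \mathbb{R}^{n\times n})$, by a standard property of convolutions, see e.g. [1, page 96]. From the resolvent equation, $\dot{R}_1(t) = (A + \varepsilon I_n)R_1(t) + \int_0^t d[F(s)]\,R_1(t - s)$, a.e. on $\mathbb{R}_+$, it follows that $\dot{R}_1(\cdot) \in L^1(\mathbb{R}_+, \mathbb{R}^{n\times n})$. Since $\dot{R}_1(\cdot), R_1(\cdot) \in L^1(\mathbb{R}_+, \mathbb{R}^{n\times n})$, we have $R_1(t) \to 0$ as $t \to +\infty$. In particular, $R_1(\cdot) \in L^\infty(\mathbb{R}_+, \mathbb{R}^{n\times n})$, which yields $\|R(t)\| \le Ke^{-\varepsilon t}$ $\forall t \in \mathbb{R}_+$, for some $K > 0$.

STEP 2: We show that

$$\forall \sigma \ge 0,\ \forall \varphi \in C([0,\sigma], \mathbb{R}^n),\ \forall t \ge \sigma : \quad \|x(t; \sigma, \varphi)\| \le Me^{-\varepsilon(t-\sigma)}\|\varphi\|,$$

for some $M > 0$. By Step 1, $\|R(t)\| \le Ke^{-\varepsilon t}$ $\forall t \in \mathbb{R}_+$. Taking into account the formula (6), it remains to show that $\exists\, K_1 > 0$, $\forall(\sigma, \varphi) \in \mathbb{R}_+ \times C([0,\sigma]; \mathbb{R}^n)$, $\forall t \ge \sigma$:

$$\left\|\int_0^t R(t - u)\left(\int_u^{u+\sigma} d[B(s)]\,\varphi(u + \sigma - s)\right)du\right\| \le K_1e^{-\varepsilon(t-\sigma)}\|\varphi\|.$$

Let $V_B(s) := \mathrm{Var}(B; 0, s)$, $s \in \mathbb{R}_+$. Since $B(\cdot)$ is increasing on $\mathbb{R}_+$, (12) is equivalent to $\int_0^{+\infty} e^{\alpha s}\,d[V_B(s)] < +\infty$. Then we have for $t \in \mathbb{R}_+$,

$$\begin{aligned}
\left\|\int_0^t R(t - u)\left(\int_u^{u+\sigma} d[B(s)]\,\varphi(u + \sigma - s)\right)du\right\|
&\le \int_0^t Ke^{-\varepsilon(t-u)}\left\|\int_u^{u+\sigma} d[B(s)]\,\varphi(u + \sigma - s)\right\|du \\
&\le K\|\varphi\|\int_0^t e^{-\varepsilon(t-u)}\int_u^{u+\sigma} d[V_B(s)]\,du \\
&\le Ke^{-\varepsilon t}\|\varphi\|\int_0^t e^{(\varepsilon-\alpha)u}\int_u^{u+\sigma} e^{\alpha s}\,d[V_B(s)]\,du \\
&\le Ke^{-\varepsilon t}\|\varphi\|\int_0^{+\infty} e^{(\varepsilon-\alpha)u}\,du\int_0^{+\infty} e^{\alpha s}\,d[V_B(s)] \\
&\le \frac{K}{\alpha - \varepsilon}\left(\int_0^{+\infty} e^{\alpha s}\,d[V_B(s)]\right)e^{-\varepsilon(t-\sigma)}\|\varphi\|.
\end{aligned}$$

(i) ⇒ (ii) STEP 1: We show, by induction, that

$$\forall m \in \mathbb{N} : \quad \left\|\int_0^{+\infty} t^m\,dB(t)\right\| < +\infty. \tag{14}$$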

In what follows, we may consider without restriction of generality, the norm $\|U\| := \sum_{i,j=1}^n |u_{ij}|$ for $U := (u_{ij}) \in \mathbb{C}^{n\times n}$. Since (2) is EAS, we have

$$\|R(t)\| \le M_1e^{-\beta t}, \quad t \in \mathbb{R}_+, \tag{15}$$

for some $\beta, M_1 > 0$. Choose $\alpha \in (0, \beta)$. Let $\hat{R}(\cdot)$ be the Laplace transform of $R(\cdot)$. Then, (15) implies that $\hat{R}(\cdot)$ is analytic on $\mathbb{C}_{-\alpha}$. Taking Laplace transforms to two sides of the differential equation in (3), we get $\Delta(z)\hat{R}(z) = I_n$, $z \in \mathbb{C}_0$. Thus, $\det\hat{R}(0) \ne 0$. Since the function $z \mapsto \det\hat{R}(z)$ is continuous at $z = 0$, there exists $\alpha_0 \in (0, \alpha)$ such that $\det\hat{R}(z) \ne 0$ for all $z \in B_{\alpha_0}(0)$. Thus $\hat{R}(\cdot)^{-1}$ exists on $B_{\alpha_0}(0)$. Since the entries of $\hat{R}(\cdot)$ are analytic on $B_{\alpha_0}(0)$, so must be the entries of $\hat{R}(\cdot)^{-1}$. Therefore

$$V(z) := zI_n - A - \hat{R}(z)^{-1},$$

is analytic on $B_{\alpha_0}(0)$. Note that $V(z) = \tilde{B}(z)$ for $z \in \mathbb{C}_0$. By a standard property of the Stieltjes-Laplace transform, we have

$$\forall m \in \mathbb{N}\ \ \forall s \in B_{\alpha_0}(0) \cap \mathring{\mathbb{C}}_0 : \quad V^{(m)}(s) = (-1)^m\int_0^{+\infty} t^me^{-st}\,dB(t). \tag{16}$$

Set $M := \|V'(0)\|$. For $m = 1$, seeking a contradiction, we suppose that

$$\exists\, T > 1 : \quad \left\|\int_0^T (t - 1)\,dB(t)\right\| > M. \tag{17}$$

Choose $\delta_0 > 0$ sufficiently small such that

$$\forall h \in (0, \delta_0)\ \ \forall t \in [0, T] : \quad \frac{1 - e^{-ht}}{h} \ge t - 1. \tag{18}$$

Since $B(\cdot)$ is increasing on $\mathbb{R}_+$, we have for $h > 0$ sufficiently small,

$$\left\|\frac{V(h) - V(0)}{h}\right\| = \sum_{p,q=1}^n\int_0^{+\infty}\frac{1 - e^{-ht}}{h}\,dB_{pq}(t) \ge \sum_{p,q=1}^n\int_0^T (t - 1)\,dB_{pq}(t) = \left\|\int_0^T (t - 1)\,dB(t)\right\|. \tag{19}$$

By invoking (17), (18), (19) and continuity of the norm, we arrive at the contradiction

$$M = \|V'(0)\| = \lim_{h\to0^+}\left\|\frac{V(h) - V(0)}{h}\right\| \ge \left\|\int_0^T (t - 1)\,dB(t)\right\| > M.$$

Therefore, (14) holds for $m = 1$. If (14) holds for $m$, then it can be shown analogously as in the previous paragraph for $m = 1$ that (14) holds for $m + 1$ by replacing $V(\cdot)$ by $V^{(m)}(\cdot)$. This proves Step 1.

STEP 2: We show that (12) holds. Note that by (14), (16), we have for every $m \in \mathbb{N}$,

$$V^{(m)}(0) = \lim_{s\to0^+}(-1)^m\int_0^{+\infty} t^me^{-st}\,dB(t) = (-1)^m\int_0^{+\infty} t^m\,dB(t).$$

Since $V(\cdot)$ is analytic on $B_{\alpha_0}(0)$, the Maclaurin series $\sum_{k=0}^\infty \frac{V^{(k)}(0)}{k!}s^k$ is, for some $\alpha_1 > 0$, absolutely convergent in $B_{\alpha_1}(0)$. Therefore, by (14)-(16),

$$\sum_{k=0}^\infty\frac{\alpha_1^k}{k!}\int_0^{+\infty} t^k\,dB_{pq}(t) = \sum_{k=0}^\infty\frac{|V_{pq}^{(k)}(0)|}{k!}\alpha_1^k < +\infty,$$

for $p, q \in \{1,2,\ldots,n\}$ and so, in view of the increasing property of $B(\cdot)$, we have

$$\left\|\int_0^{+\infty} e^{\alpha_1t}\,dB(t)\right\| = \sum_{p,q=1}^n\int_0^{+\infty} e^{\alpha_1t}\,dB_{pq}(t) = \sum_{p,q=1}^n\int_0^{+\infty}\left(\sum_{k=0}^{+\infty}\frac{(\alpha_1t)^k}{k!}\right)dB_{pq}(t) = \sum_{p,q=1}^n\sum_{k=0}^{+\infty}\frac{\alpha_1^k}{k!}\int_0^{+\infty} t^k\,dB_{pq}(t) < +\infty.$$

This completes the proof. □

Remark 3. Uniform asymptotic stability versus exponential asymptotic stability. By definition, it is easy to see that the exponential asymptotic stability of (2) (not necessarily positive) implies its uniform asymptotic stability. However, the converse of this statement does not hold even for positive systems. To see this, we consider the scalar Volterra-Stieltjes equation

$$\dot{x}(t) = -x(t) + \int_0^t d[B(s)]\,x(t - s), \tag{20}$$

where $B : \mathbb{R}_+ \to \mathbb{R}$, $s \mapsto B(s) := \int_0^s \frac{1}{2(\tau + 1)^2}\,d\tau$. Note that (20) is positive, by Theorem 1. First, we show that (20) is uniformly asymptotically stable. Since (20) is positive and

$$-1 + \int_0^{+\infty} dB(s) = -1 + \int_0^{+\infty}\frac{1}{2(s + 1)^2}\,ds = -\frac{1}{2},$$

(20) is uniformly asymptotically stable, by Theorem 3. Secondly, since

$$\forall \gamma > 0\ \ \exists\, T_0 = T_0(\gamma) \ge 0\ \ \forall t \ge T_0 : \quad e^{\gamma t}\frac{1}{2(t + 1)^2} > 1,$$

we conclude

$$\forall \gamma > 0 : \quad \int_0^{+\infty} e^{\gamma s}\,dB(s) = \int_0^{+\infty} e^{\gamma s}\frac{1}{2(s + 1)^2}\,ds = +\infty.$$

Therefore, (20) is not exponentially asymptotically stable, by Theorem 4. Finally, if $B(\cdot)$ in (20) is defined by

$$B(t) := \frac{1 - e^{-t}}{2}, \quad t \ge 0,$$

then it is easy to see that (20) is exponentially asymptotically stable, by Theorem 4.
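The contrast drawn in Remark 3 between the two kernels can be seen numerically. The sketch below is an illustration under stated assumptions: it reads the two densities as $\frac{1}{2(s+1)^2}$ and $\frac{1}{2}e^{-s}$, truncates every integral to a finite interval, and uses a hand-written trapezoid rule (the helper `trapezoid` and the chosen value of `gamma` are hypothetical). Growth of the truncated exponential moments is only suggestive of divergence; the proof is the estimate given in the remark.

```python
import numpy as np

def trapezoid(f, a, b, n=200_000):
    """Composite trapezoid rule for f on [a, b] with n subintervals."""
    s = np.linspace(a, b, n + 1)
    y = f(s)
    return float((b - a) / n * (y.sum() - 0.5 * (y[0] + y[-1])))

def slow(s):
    """Density of B in (20): algebraic decay."""
    return 0.5 / (s + 1.0) ** 2

def fast(s):
    """Density of B(t) = (1 - exp(-t))/2: exponential decay."""
    return 0.5 * np.exp(-s)

gamma = 0.1

# Total mass of the slowly decaying kernel (exact value 1/2), so mu = -1 + 1/2 < 0.
mass_slow = trapezoid(slow, 0.0, 1000.0)

# Truncated exponential moments of the slow kernel keep growing with the horizon T.
moments_slow = [trapezoid(lambda s: np.exp(gamma * s) * slow(s), 0.0, T)
                for T in (50.0, 100.0, 200.0)]

# For the exponential kernel the same moment is finite: 1/(2*(1 - gamma)).
moment_fast = trapezoid(lambda s: np.exp(gamma * s) * fast(s), 0.0, 200.0)

print("mass of slow kernel (about 1/2):", mass_slow)
print("truncated moments of slow kernel:", moments_slow)
print("moment of fast kernel:", moment_fast, " closed form:", 0.5 / (1.0 - gamma))
```

Both kernels give $-1 + \int_0^{+\infty}dB(s) = -\tfrac12 < 0$, so both equations are UAS by Theorem 3; only the exponentially decaying kernel satisfies (12).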

Acknowledgements. Dedicated to Professor Nguyen Khoa Son on the occasion of his 60th birthday. The author is supported by the Alexander von Humboldt Foundation.

References

1. Gripenberg, G., Londen, S.O., Staffans, O.: Volterra Integral and Functional Equations. Cambridge University Press, New York (1990)
2. Murakami, S.: Exponential asymptotic stability for scalar linear Volterra equations. Differential Integral Equations 4, 519–525 (1991)
3. Naito, T., Shin, J.S., Murakami, S., Ngoc, P.H.A.: Characterizations of positive linear Volterra differential equations. Integral Equations and Operator Theory 58, 255–272 (2007)
4. Ngoc, P.H.A.: On exponential asymptotic stability of linear Volterra-Stieltjes differential systems (in preparation, 2009)
5. Ngoc, P.H.A., Naito, T., Shin, J.S.: Characterizations of positive linear functional differential equations. Funkcialaj Ekvacioj 50, 1–17 (2007)
6. Ngoc, P.H.A., Naito, T., Shin, J.S., Murakami, S.: On stability and robust stability of positive linear Volterra equations. SIAM Journal on Control and Optimization 47, 975–996 (2008)
7. Ngoc, P.H.A., Murakami, S., Naito, T., Shin, J.S., Nagabuchi, Y.: On positive linear Volterra-Stieltjes equations. Integral Equations and Operator Theory (in press) (2009)

Eigenvalue Localization for Totally Positive Matrices

Juan Manuel Peña

Abstract. We survey eigenvalue localization results for totally positive matrices and show their potential application to several problems. We first recall some localization results for the real eigenvalues of real matrices, which can be considered an alternative to Gerschgorin disks because they provide sharper information in cases where the intervals corresponding to Gerschgorin disks are not sharp. An exclusion interval for the real eigenvalues of real matrices is also recalled. The applications of these results are illustrated. On the other hand, a new lower bound for the minimal eigenvalue of a totally positive matrix is announced, and it improves the information derived from the Gerschgorin disks. We also survey some applications of stochastic totally positive matrices to birth and death processes and to Computer Aided Geometric Design. Birth and death processes occur in many chemical phenomena. In some recent problems in Computer Aided Geometric Design it is very important to know a good lower bound for the minimal eigenvalue of stochastic totally positive matrices. We apply to these problems the localization results mentioned above. Finally, we present a class of totally positive matrices to which we can apply a recently obtained direct method for the eigenvalue computation.

1 Introduction and Basic Results

A matrix such that all its minors are nonnegative is called totally positive (TP). If all minors are positive, then we say that the matrix is strictly totally positive (STP). TP matrices have many applications (see [1, 9, 18]) in several fields such as Statistics, Approximation Theory, Computer Aided Geometric Design (C.A.G.D.) or Economics.

Juan Manuel Peña Departamento de Matemática Aplicada/IUMA, Universidad de Zaragoza, 50009 Zaragoza, Spain, e-mail: [email protected]

R. Bru and S. Romero-Vivó (Eds.): Positive Systems, LNCIS 389, pp. 123–130. springerlink.com © Springer-Verlag Berlin Heidelberg 2009

In this paper we revisit some results on eigenvalue localization and relate them to some problems on TP matrices. Let us start by recalling the following basic facts which can be found, for instance, in the following results of [1]: Corollary 6.6, Theorem 6.2 and Theorem 3.3.

Theorem 1. Let $A$ be an $n \times n$ TP matrix. Then:

(i) All the eigenvalues of $A$ are nonnegative. If $A$ is STP, then they are also distinct (and positive).
(ii) Given the $n \times n$ diagonal matrix $J := \mathrm{diag}(1, -1, 1, \ldots, (-1)^{n-1})$, if $A$ is in addition nonsingular, then the matrix $JA^{-1}J$ is TP.

Theorem 1 (i) assures the positivity of the eigenvalues of a nonsingular TP matrix and, taking into account that $J^{-1} = J$, Theorem 1 (ii) shows that the inverse of a nonsingular TP matrix has the spectrum of another TP matrix. Therefore the results on eigenvalue localization of TP matrices presented in this paper can be also applied to their inverses. As for the eigenvectors of an STP matrix, it is also well-known (cf. Theorem 6.3 of [1]) that, if its eigenvalues are $\lambda_1 > \lambda_2 > \cdots > \lambda_n$ and they are associated to eigenvectors $x_1, \ldots, x_n$, then $x_k$ has exactly $k - 1$ variations of signs.

In Section 2 we first recall some results on real eigenvalue localization. These results are an alternative to the classical Gerschgorin disks and we also illustrate their use. We also announce a result for bounding the minimal eigenvalue of a nonsingular TP matrix which improves the information provided by the Gerschgorin disks. Results of Section 2 will have application to the problems discussed in Section 3, which deals with TP stochastic matrices. Finally, Section 4 presents a class of TP matrices to which we can apply a recently obtained direct method for the eigenvalue computation.

2 Eigenvalue Localization Results

As recalled in Sect. 1, the eigenvalues of a TP matrix are nonnegative and so real. Let us start this section by presenting some results which are considered alternatives to the classical Gerschgorin disks (see [14, 19–22] for more related results). Let $A = (a_{ik})_{1 \le i,k \le n}$ be a real matrix. We shall use the following notations: for each $i = 1, \dots, n$,

\[ r_i^+ := \max\{0, a_{ij} \mid j \ne i\}, \qquad r_i^- := \min\{0, a_{ij} \mid j \ne i\}. \tag{1} \]

The following result providing inclusion intervals for the real eigenvalues of a real matrix was proved in Theorem 3.5 of [19].

Theorem 2. Let $A = (a_{ik})_{1 \le i,k \le n}$ be a real matrix; let $r_i^+, r_i^-$ be as in (1); and let $\lambda$ be a real eigenvalue of $A$. Then

\[ \lambda \in S := \bigcup_{i=1}^{n} \Big[ a_{ii} - r_i^+ - \sum_{k \ne i} |r_i^+ - a_{ik}|, \; a_{ii} - r_i^- + \sum_{k \ne i} |r_i^- - a_{ik}| \Big]. \]

Now we introduce the following notations: for each $i = 1, \dots, n$,

\[ s_i^+ := \max\{0, \min\{a_{ij} \mid j \ne i\}\}, \qquad s_i^- := \min\{0, \max\{a_{ij} \mid j \ne i\}\}. \tag{2} \]

The following result providing an exclusion interval for the real eigenvalues of a real matrix was proved in Proposition 2.9 of [19].

Theorem 3. Let $A = (a_{ik})_{1 \le i,k \le n}$ be a real matrix, let $s_i^+, s_i^-$ be as in (2) and let $\lambda$ be a real eigenvalue of $A$. Then

\[ \lambda \notin E := \Big( \max_{i=1,\dots,n} \Big\{ \sum_{j=1}^{n} a_{ij} - n s_i^+ \Big\}, \; \min_{i=1,\dots,n} \Big\{ \sum_{j=1}^{n} a_{ij} - n s_i^- \Big\} \Big). \]
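The inclusion intervals of Theorem 2 and the exclusion interval of Theorem 3 are straightforward to evaluate. The sketch below is a minimal Python illustration; the function names are introduced here for convenience, and the constant test matrix with $n = 4$ and $k = 2$ is a hypothetical choice of values.

```python
import numpy as np


def inclusion_intervals(A):
    """Row-wise inclusion intervals of Theorem 2, one (low, high) pair per row."""
    intervals = []
    for i in range(A.shape[0]):
        off = np.delete(A[i], i)              # off-diagonal entries of row i
        r_plus = max(0.0, off.max())          # r_i^+ from (1)
        r_minus = min(0.0, off.min())         # r_i^- from (1)
        low = A[i, i] - r_plus - np.abs(r_plus - off).sum()
        high = A[i, i] - r_minus + np.abs(r_minus - off).sum()
        intervals.append((low, high))
    return intervals


def exclusion_interval(A):
    """Open exclusion interval (low, high) of Theorem 3."""
    n = A.shape[0]
    lows, highs = [], []
    for i in range(n):
        off = np.delete(A[i], i)
        s_plus = max(0.0, off.min())          # s_i^+ from (2)
        s_minus = min(0.0, off.max())         # s_i^- from (2)
        lows.append(A[i].sum() - n * s_plus)
        highs.append(A[i].sum() - n * s_minus)
    return max(lows), min(highs)


# Hypothetical values: constant 4 x 4 matrix with all entries equal to k = 2.
n, k = 4, 2.0
A = np.full((n, n), k)
S = inclusion_intervals(A)
E = exclusion_interval(A)
print("inclusion intervals:", S)
print("exclusion interval:", E)
```

For this matrix every inclusion interval is $[0, nk]$ and the exclusion interval is $(0, nk)$, so only the endpoints $0$ and $nk$ remain as candidates for real eigenvalues.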

Let us illustrate the use of the previous results with the matrix

\[ A = \begin{pmatrix} k & \cdots & k \\ \vdots & & \vdots \\ k & \cdots & k \end{pmatrix}, \qquad k > 0, \]

which has eigenvalues $0$ (with multiplicity $n - 1$) and $nk$ (with multiplicity 1). The interval provided by the Gerschgorin discs is $[(-n + 2)k, nk]$. Since $r_i^+ = k$ and $r_i^- = 0$, the interval provided by Theorem 2 is $[0, nk]$. Besides, since $s_i^+ = k$ and $s_i^- = 0$, the exclusion interval provided by Theorem 3 is $(0, nk)$. In conclusion, the combination of both theorems provides exactly the eigenvalues $0$ and $nk$.

We finish this section by announcing a result (Theorem 4.4 of [23]) for bounding the minimal eigenvalue of a nonsingular TP matrix which improves the information provided by the Gerschgorin disks. It uses the following notation for index subsets: given $i \in \{1, \dots, n\}$, let $J_i := \{ j \mid |j - i| \text{ is odd} \}$.

Theorem 4. Let $A$ be a nonsingular TP matrix, and let $\lambda_{\min}$ ($> 0$) be its minimal eigenvalue. For each $i \in \{1, \dots, n\}$, let $J_i$ be the index subset defined above. Then:

\[ \lambda_{\min} \ge \min_{i} \Big\{ a_{ii} - \sum_{j \in J_i} a_{ij} \Big\}. \]

A problem where this bound can be applied will be presented in the next section.
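To see how the bound of Theorem 4 compares with the Gerschgorin lower bound, the following Python sketch evaluates both for a small symmetric matrix. The matrix is a hypothetical example chosen for this illustration (its minors were checked by hand to be positive, so it is a nonsingular TP matrix), and the function names are introduced here.

```python
import numpy as np


def tp_min_eigenvalue_bound(A):
    """Lower bound of Theorem 4: min_i (a_ii - sum of a_ij over |j - i| odd)."""
    n = A.shape[0]
    bounds = []
    for i in range(n):
        odd_distance = [j for j in range(n) if abs(j - i) % 2 == 1]
        bounds.append(A[i, i] - A[i, odd_distance].sum())
    return float(min(bounds))


def gerschgorin_lower_bound(A):
    """Left end of the union of Gerschgorin intervals (real case)."""
    radii = np.abs(A).sum(axis=1) - np.abs(np.diag(A))
    return float(np.min(np.diag(A) - radii))


# Hypothetical nonsingular TP matrix (assumption: chosen only for illustration).
A = np.array([[1.00, 0.30, 0.04],
              [0.30, 2.00, 0.30],
              [0.04, 0.30, 1.00]])

lam_min = float(np.linalg.eigvalsh(A).min())
print("Theorem 4 bound  :", tp_min_eigenvalue_bound(A))
print("Gerschgorin bound:", gerschgorin_lower_bound(A))
print("lambda_min       :", lam_min)
```

Only the entries at odd distance from the diagonal are subtracted in Theorem 4, so the entry $a_{13}$ does not weaken the bound for the first row, whereas it does enlarge the corresponding Gerschgorin radius.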

3 Eigenvalue Localization for Stochastic TP Matrices and Applications

Let us recall that a nonnegative matrix is called row stochastic (or simply stochastic) if all its row sums are 1. The following result was proved in Proposition 3.2 of [21] and provides an upper bound for the real eigenvalues different from 1 of a stochastic matrix in terms of the least off-diagonal element.

Theorem 5. Let $A = (a_{ik})_{1 \le i,k \le n}$ be a stochastic matrix and let $s^+$ and $w$ be the least off-diagonal and diagonal entries of $A$, respectively. If $\lambda$ is a real eigenvalue of $A$, then either $\lambda = 1$ (with algebraic multiplicity 1 if $s^+ > 0$) or $2w - 1 \le \lambda \le 1 - ns^+$.

Let us now present two different applications of stochastic TP matrices related to birth and death processes and to Computer Aided Geometric Design (C.A.G.D.), respectively.

Birth and death processes occur in many chemical phenomena, such as in aerosol chemistry (see [16]), in applications of industrial chemistry (see [24]), or in stochastic models of chemical reactions (see [17]). A birth and death process is a stationary Markov process whose state space is the nonnegative integers and whose transition probability matrix $(P_{ij}(t))_{i,j = 0,1,2,\dots}$, $t \ge 0$, with

\[ P_{ij}(t) = \Pr\{x(t) = j \mid x(0) = i\} \]

satisfies the conditions (as $t \to 0$) $P_{i,i+1}(t) = \lambda_i t + o(t)$, $P_{i,i-1}(t) = \mu_i t + o(t)$ and $P_{i,i}(t) = 1 - (\lambda_i + \mu_i)t + o(t)$, where $\lambda_i > 0$ for all $i \ge 0$, $\mu_i > 0$ for all $i \ge 1$ and $\mu_0 \ge 0$. By the results of [11] (see also [12]), the matrices associated with these processes are TP. Moreover, they are STP for all $t > 0$. In fact, in [12] it was proved that, for all $i_1 < i_2 < \cdots < i_n$ and $j_1 < j_2 < \cdots < j_n$, the determinants

\[ \det \begin{pmatrix} P_{i_1,j_1}(t) & \cdots & P_{i_1,j_n}(t) \\ \vdots & & \vdots \\ P_{i_n,j_1}(t) & \cdots & P_{i_n,j_n}(t) \end{pmatrix} \tag{3} \]

have the following interpretation. Suppose that $n$ labelled particles start out in states $i_1, \dots, i_n$ and execute the process simultaneously and independently. Then the determinant (3) is equal to the probability that at time $t$ the particles will be found in states $j_1, \dots, j_n$, respectively, without any two of them ever having been coincident (in the same state) in the intervening time. See also the related papers [10–13].

Let us now present the second application, where we see that localization results for the minimal eigenvalue of a nonsingular TP matrix have an application in C.A.G.D. We start with some basic definitions of this field. Let us recall that the collocation matrix of a system of univariate real functions $(u_0(t), \dots, u_m(t))$ at the points $(t_i)_{i=0}^{r}$ in $\mathbb{R}$ is given by

\[ M \begin{pmatrix} u_0, \dots, u_m \\ t_0, \dots, t_r \end{pmatrix} := (u_j(t_i))_{0 \le i \le r;\; 0 \le j \le m}. \]

In C.A.G.D. it is convenient that the system of functions is blending, that is, $u_i(t) \ge 0$ for all $t$ and $i$, and $\sum_{i=0}^{m} u_i(t) = 1$ for all $t$. Obviously, $(u_0, \dots, u_m)$ is blending if and only if all its collocation matrices are stochastic. A basis $(u_0, \dots, u_m)$ is totally positive (TP) when all its collocation matrices are TP. A blending basis that is TP is said to be a normalized totally positive (NTP) basis. A basis is NTP if and only if all its collocation matrices are TP and stochastic. It is well known (cf. [18]) that the bases providing shape preserving representations are precisely the NTP bases.
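As a small numerical illustration of these definitions, the following Python sketch builds a collocation matrix of the quadratic Bernstein basis, checks that it is stochastic, and compares its eigenvalues with the bounds of Theorem 5. The evaluation points $0.2, 0.5, 0.8$ and the helper name `bernstein_collocation` are hypothetical choices made for this example.

```python
from math import comb

import numpy as np


def bernstein_collocation(points, m):
    """Collocation matrix (b_j(t_i)) of the degree-m Bernstein basis on [0, 1]."""
    return np.array([[comb(m, j) * t**j * (1.0 - t)**(m - j)
                      for j in range(m + 1)]
                     for t in points])


# Hypothetical evaluation points in (0, 1).
M = bernstein_collocation([0.2, 0.5, 0.8], m=2)
n = M.shape[0]

row_sums = M.sum(axis=1)                         # all equal to 1 for a blending basis
s_plus = M[~np.eye(n, dtype=bool)].min()         # least off-diagonal entry
w = np.diag(M).min()                             # least diagonal entry

eigenvalues = np.sort(np.linalg.eigvals(M).real)
others = eigenvalues[:-1]                        # eigenvalues different from 1

print("row sums:", row_sums)
print("eigenvalues:", eigenvalues)
print("Theorem 5 interval:", (2 * w - 1, 1 - n * s_plus))
```

Here the matrix is positive, so $s^+ > 0$ and Theorem 5 confines every eigenvalue other than 1 to the interval $[2w - 1, 1 - ns^+]$.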
By Theorem 4.2 of [3], a space with a normalized totally positive basis always has a unique normalized B-basis, which is the basis with optimal shape preserving properties. For instance, the Bernstein basis is the normalized B-basis in the case of the space of polynomials of degree at most n on a compact interval [2] and the B-splines form the normalized B-basis of their corresponding space [3].

All normalized totally positive bases satisfy the property known as the progressive iterative approximation property, which assures convergence of an interpolation process (see [5, 15]). In [5] it was proved that the normalized B-basis is the NTP basis with the fastest convergence rates for interpolating a curve and that the tensor product of normalized B-bases also presents the fastest convergence rates for interpolating a surface. The proofs also show that the smallest eigenvalue of a nonsingular collocation matrix of the normalized B-basis is always greater than or equal to the smallest eigenvalue of the corresponding collocation matrix of another NTP basis. In fact, the convergence rate for the progressive iterative approximation property of the NTP basis depends on the minimal eigenvalue of the corresponding collocation matrix. This matrix is TP and stochastic, and this explains why bounding this minimal eigenvalue for TP matrices is so important in C.A.G.D.

Let us mention some key facts which allowed us to prove the mentioned result about the fastest convergence rates of the normalized B-basis. The matrix of change of basis $K$ between an NTP basis $(u_0, \dots, u_m)$ and the normalized B-basis $(b_0, \dots, b_m)$ (i.e., $K$ such that $(u_0, \dots, u_m) = (b_0, \dots, b_m)K$) is TP and stochastic, and then it provides (if we factorize this matrix as a product of nonnegative bidiagonal and stochastic matrices) a corner cutting algorithm (see p. 240 of [7] and [8]) from the control polygon of the given NTP basis to the control polygon of the normalized B-basis.
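The dependence of the convergence rate on the minimal eigenvalue can be illustrated numerically. The sketch below assumes the standard form of the progressive iterative approximation, $P^{(k+1)} = P^{(k)} + (Q - BP^{(k)})$, where $B$ is the collocation matrix and $Q$ holds the data points; this formula is not spelled out in the text above and is stated here as an assumption. The quadratic Bernstein basis, the evaluation points and the data are hypothetical.

```python
from math import comb

import numpy as np


def bernstein_collocation(points, m):
    """Collocation matrix (b_j(t_i)) of the degree-m Bernstein basis on [0, 1]."""
    return np.array([[comb(m, j) * t**j * (1.0 - t)**(m - j)
                      for j in range(m + 1)]
                     for t in points])


B = bernstein_collocation([0.2, 0.5, 0.8], m=2)      # TP and stochastic
Q = np.array([[0.0, 0.0], [1.0, 2.0], [2.0, 0.0]])   # hypothetical data points

# Assumed iteration: start from the data and add the interpolation residual.
P = Q.copy()
for _ in range(200):
    P = P + (Q - B @ P)

# The iteration matrix is I - B; its spectral radius governs the convergence.
rate = np.abs(np.linalg.eigvals(np.eye(3) - B)).max()
lam_min = np.linalg.eigvals(B).real.min()

print("spectral radius of I - B:", rate)
print("1 - lambda_min(B)       :", 1.0 - lam_min)
print("max interpolation error :", np.abs(B @ P - Q).max())
```

Because the eigenvalues of $B$ lie in $(0, 1]$, the spectral radius of $I - B$ equals $1 - \lambda_{\min}(B)$, so a larger minimal eigenvalue means faster convergence.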
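Since $K$ is TP and stochastic, by Theorem 2.6 of [18] we can write

\[ K = F_{n-1} F_{n-2} \cdots F_1 G_1 \cdots G_{n-2} G_{n-1}, \]

with

\[ F_i = \begin{pmatrix}
1 & & & & & & \\
0 & 1 & & & & & \\
 & \ddots & \ddots & & & & \\
 & & 0 & 1 & & & \\
 & & & \alpha_{i+1,1} & 1 - \alpha_{i+1,1} & & \\
 & & & & \ddots & \ddots & \\
 & & & & & \alpha_{n,n-i} & 1 - \alpha_{n,n-i}
\end{pmatrix} \]

and

\[ G_i = \begin{pmatrix}
1 & 0 & & & & & \\
 & \ddots & \ddots & & & & \\
 & & 1 & 0 & & & \\
 & & & 1 - \alpha_{1,i+1} & \alpha_{1,i+1} & & \\
 & & & & \ddots & \ddots & \\
 & & & & & 1 - \alpha_{n-i,n} & \alpha_{n-i,n} \\
 & & & & & & 1
\end{pmatrix}, \]

where, for all $(i, j)$, $0 \le \alpha_{i,j} < 1$. The previous factorization has been a key property used in the mentioned proof.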
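The converse direction of this factorization is easy to experiment with: any product of such bidiagonal stochastic factors is again TP and stochastic. The following Python sketch builds the factors for randomly drawn parameters and checks both properties by brute force. The placement of the $\alpha$ entries follows one reading of the displays for $F_i$ and $G_i$ and is an assumption, as are the helper names and the choice $n = 4$.

```python
from itertools import combinations

import numpy as np


def F(n, i, alpha):
    """Lower bidiagonal factor F_i: rows i+1..n (1-based) carry alpha, 1 - alpha."""
    M = np.eye(n)
    for r in range(i, n):                       # 0-based row r is 1-based row r + 1
        a = alpha[r, r - i]                     # alpha_{r+1, r+1-i}
        M[r, r - 1], M[r, r] = a, 1.0 - a
    return M


def G(n, i, alpha):
    """Upper bidiagonal factor G_i: rows i..n-1 (1-based) carry 1 - alpha, alpha."""
    M = np.eye(n)
    for r in range(i - 1, n - 1):
        a = alpha[r - i + 1, r + 1]             # alpha_{r+2-i, r+2}
        M[r, r], M[r, r + 1] = 1.0 - a, a
    return M


def is_totally_positive(A, tol=1e-12):
    """Brute-force check that all minors are nonnegative (small n only)."""
    n = A.shape[0]
    return all(np.linalg.det(A[np.ix_(rows, cols)]) >= -tol
               for size in range(1, n + 1)
               for rows in combinations(range(n), size)
               for cols in combinations(range(n), size))


n = 4
rng = np.random.default_rng(0)
alpha = rng.uniform(0.0, 1.0, size=(n, n))      # hypothetical parameters in [0, 1)

K = np.eye(n)
for i in range(n - 1, 0, -1):                   # F_{n-1} ... F_1
    K = K @ F(n, i, alpha)
for i in range(1, n):                           # G_1 ... G_{n-1}
    K = K @ G(n, i, alpha)

print("row sums of K:", K.sum(axis=1))
print("K is TP:", is_totally_positive(K))
```

Each factor performs convex combinations of neighbouring rows or columns, which is the algebraic form of one corner cutting step.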

4 A Class of TP Matrices with a Direct Method for Eigenvalue Computation

In [6] the class of r-convexity preserving matrices for all $r \le k$, important in many applications (see [4]), is considered and a direct method to compute their $k$ largest eigenvalues is presented. The computational cost of the corresponding method is $O(kn^2)$ elementary operations for computing the $k$ largest eigenvalues of an $n \times n$ matrix. The good stability properties of the method are also discussed in that paper. In the particular case of an r-convexity preserving matrix for all $r \le n - 1$ we provide a direct method of $O(n^3)$ elementary operations to compute all its eigenvalues. Here we show that a class of TP matrices satisfies the previous hypotheses. Let us start by recalling the basic notations.

Let $k$ be a nonnegative integer. A vector $v = (v_1, v_2, \dots, v_n)^T \in \mathbb{R}^n$ is said to be k-convex if $\Delta^k v_i \ge 0$ for all $i \in \{1, \dots, n - k\}$, where

\[ \Delta^k v_i := \sum_{j=0}^{k} \binom{k}{j} (-1)^{k-j} v_{i+j}. \]

A vector $v \in \mathbb{R}^n$ is said to be k-concave if the vector $-v$ is k-convex. Observe that a vector is 0-convex if and only if it is nonnegative and a vector is 1-convex if and only if it is monotonically increasing. A matrix $A$ is said to be k-convexity preserving if for any k-convex vector $v$, the vector $Av$ is also k-convex. Let us observe that $A$ is 0-convexity preserving if and only if it transforms nonnegative vectors into nonnegative vectors, which is equivalent to $A \ge 0$. A matrix $A$ is 1-convexity preserving if and only if it is monotonicity preserving. $P_{k-1}$ will denote the set of vectors that are both k-convex and k-concave.

We shall denote by $E$ the lower triangular matrix

\[ E := \begin{pmatrix} 1 & 0 & \cdots & 0 \\ \vdots & \ddots & \ddots & \vdots \\ 1 & \cdots & 1 & 0 \\ 1 & \cdots & 1 & 1 \end{pmatrix}, \qquad
E^{-1} = \begin{pmatrix} 1 & 0 & \cdots & \cdots & 0 \\ -1 & 1 & \ddots & & \vdots \\ 0 & -1 & 1 & \ddots & \vdots \\ \vdots & \ddots & \ddots & \ddots & 0 \\ 0 & \cdots & 0 & -1 & 1 \end{pmatrix}. \tag{4} \]
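The finite differences and the matrices in (4) can be written down directly. The following Python sketch is a minimal illustration with a hypothetical test vector (the squares $0, 1, 4, 9, 16$); the function names are introduced here.

```python
from math import comb

import numpy as np


def forward_difference(v, k):
    """k-th forward differences: sum_j (-1)^(k-j) * C(k, j) * v[i+j]."""
    v = np.asarray(v, dtype=float)
    n = len(v)
    return np.array([sum((-1) ** (k - j) * comb(k, j) * v[i + j]
                         for j in range(k + 1))
                     for i in range(n - k)])


def is_k_convex(v, k, tol=1e-12):
    """A vector is k-convex if all its k-th forward differences are nonnegative."""
    return bool(np.all(forward_difference(v, k) >= -tol))


v = np.array([0.0, 1.0, 4.0, 9.0, 16.0])        # hypothetical test vector
n = len(v)

E = np.tril(np.ones((n, n)))                     # lower triangular matrix of ones
E_inv = np.eye(n) - np.eye(n, k=-1)              # 1 on the diagonal, -1 below it

for k in range(4):
    print(k, forward_difference(v, k), is_k_convex(v, k))
print("E @ E_inv == I:", np.allclose(E @ E_inv, np.eye(n)))
```

The test vector is nonnegative, increasing and convex, and its third differences vanish, so it is both 3-convex and 3-concave and therefore belongs to $P_2$.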

For each $j \in \{1, \dots, n\}$, let $E_j$ be the following $n \times n$ matrix: $E_1 := E$ and, for $j \ge 2$,

\[ E_j := \begin{pmatrix} I_{j-1} & 0 \\ 0 & E \end{pmatrix}, \qquad E_j^{-1} = \begin{pmatrix} I_{j-1} & 0 \\ 0 & E^{-1} \end{pmatrix}, \]

where $I_{j-1}$ is the $(j - 1) \times (j - 1)$ identity matrix and $E$ is the $(n - j + 1) \times (n - j + 1)$ matrix given by (4). Corollary 4.5 of [4], which is recalled below, provides a class of $n \times n$ TP matrices which are r-convexity preserving matrices for all $r \le n - 1$. So, we can apply to them the direct method of [6] for computing their eigenvalues.

Theorem 6. Let $A$ be a nonsingular totally positive matrix such that $A P_r \subseteq P_r$, $r = 0, \dots, n - 1$. Then $A$ is an r-convexity preserving matrix for all $r$ and $A$ is similar to an upper triangular matrix of the form:

\[ (E_1 \cdots E_{n-1})^{-1} A (E_1 \cdots E_{n-1}) = \begin{pmatrix} \lambda_1 & \cdots & \cdots & \cdots \\ 0 & \lambda_2 & \cdots & \vdots \\ \vdots & \ddots & \ddots & \vdots \\ 0 & \cdots & 0 & \lambda_n \end{pmatrix} \]

with $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_n > 0$.

In [4] we also have a source of examples of TP matrices satisfying the hypothesis of the previous result.

Acknowledgements. Supported by the Spanish Research Grant MTM2006-03388 and by Gobierno de Aragón and Fondo Social Europeo.

References

1. Ando, T.: Totally Positive Matrices. Linear Algebra Appl. 90, 165–219 (1987)
2. Carnicer, J.M., Peña, J.M.: Shape preserving representations and optimality of the Bernstein basis. Advances in Computational Mathematics 1, 173–196 (1993)
3. Carnicer, J.M., Peña, J.M.: Totally positive bases for shape preserving curve design and optimality of B-splines. Computer Aided Geometric Design 11, 633–654 (1994)
4. Carnicer, J.M., Peña, J.M.: Generalized convexity preserving transformations. Computer Aided Geometric Design 13, 179–197 (1996)
5. Delgado, J., Peña, J.M.: Progressive iterative approximation and bases with the fastest convergence rates. Computer Aided Geometric Design 24, 10–18 (2007)
6. Delgado, J., Peña, J.M.: Computation of the eigenvalues of convexity preserving matrices. Applied Mathematics Letters (to appear, 2009)
7. Goodman, T.N.T., Micchelli, C.A.: Corner cutting algorithms for the Bézier representation of free form curves. Linear Algebra Appl. 99, 225–252 (1988)
8. Goodman, T.N., Said, H.B.: Shape preserving properties of the generalized Ball basis. Computer Aided Geometric Design 8, 115–121 (1991)
9. Karlin, S.: Total Positivity. Stanford University Press, Stanford (1968)
10. Karlin, S., McGregor, J.: A characterization of birth and death processes. Proc. Nat. Acad. Sci. 45, 375–379 (1959)
11. Karlin, S., McGregor, J.: Coincidence properties of birth and death processes. Pacific J. Math. 9, 1109–1140 (1959)
12. Karlin, S., McGregor, J.: Coincidence probabilities. Pacific J. Math. 9, 1141–1164 (1959)
13. Karlin, S., McGregor, J.: Classical diffusion processes and total positivity. J. Math. Anal. Appl. 1, 163–183 (1960)
14. Li, H.-B., Huang, T.-Z., Li, H.: On some subclasses of P-matrices. Numer. Linear Algebra Appl. 14, 391–405 (2007)
15. Lin, H., Bao, H., Wang, G.: Totally positive bases and progressive iteration approximation. Computers & Mathematics with Applications 50, 575–586 (2005)
16. Losert-Valiente Kroon, C.M., Ford, I.J.: Stochastic Birth and Death Equations to Treat Chemistry and Nucleation in Small Systems. In: 17th International Conference: Nucleation and Atmospheric Aerosols, pp. 332–336. Springer, Galway (2007)
17. Mitrophanov, A.Y.: Note on Zeifman's bounds on the rate of convergence for birth-death processes. J. Appl. Probab. 41, 593–596 (2004)
18. Peña, J.M.: Shape preserving representations in Computer Aided-Geometric Design. Nova Science Publishers, Newark (1999)
19. Peña, J.M.: A class of P-matrices with applications to the localization of the eigenvalues of a real matrix. SIAM J. Matrix Anal. Appl. 22, 1027–1037 (2001)
20. Peña, J.M.: On an alternative to Gerschgorin circles and ovals of Cassini. Numer. Math. 95, 337–345 (2003)
21. Peña, J.M.: Exclusion and inclusion intervals for the real eigenvalues of positive matrices. SIAM J. Matrix Anal. Appl. 26, 908–917 (2005)
22. Peña, J.M.: Refining Gerschgorin disks through new criteria for nonsingularity. Numerical Linear Algebra with Applications 14, 665–671 (2007)
23. Peña, J.M.: Eigenvalue bounds for some classes of P-matrices. Preprint
24. Ross, J.V., Pollett, P.K.: Extinction times for a birth-death process with two phases. Math. Biosci. 202, 310–322 (2006)

Positivity Preserving Model Reduction

Timo Reis and Elena Virnik

Abstract. We propose a model reduction method for positive systems that ensures the positivity of the reduced model. Our approach is based on constructing diagonal solutions of Lyapunov inequalities. These are linear matrix inequalities (LMIs), which are shown to be feasible. Stability is preserved and an error bound in the $H_\infty$-norm is provided.

1 Introduction

We consider linear time-invariant systems in continuous time

\[ \begin{aligned} \dot{x}(t) &= Ax(t) + Bu(t), \qquad x(0) = x_0, \\ y(t) &= Cx(t) + Du(t), \end{aligned} \tag{1} \]

where $A \in \mathbb{R}^{n \times n}$, $B \in \mathbb{R}^{n \times p}$, $C \in \mathbb{R}^{p \times n}$, $D \in \mathbb{R}^{p \times p}$ are real constant coefficient matrices. The state $x$, input $u$ and output $y$ are real-valued vector functions. We focus on (internally) positive systems. These are systems whose state and output variables take only nonnegative values at all times $t$ for any nonnegative initial state and any nonnegative input, see, e.g., [5, 7, 9].

We consider the problem of model reduction which preserves the positivity of a system. We generalize the model reduction methods of standard balanced truncation and singular perturbation balanced truncation such that positivity is preserved. Our technique uses a linear matrix inequality (LMI) approach, and we show that stability is preserved and an error bound in the $H_\infty$-norm is provided. We present the results for continuous-time systems; however, all results can be extended to the discrete-time case [12].

Timo Reis and Elena Virnik, Institut für Mathematik, Technische Universität Berlin, Straße des 17. Juni 136, 10623 Berlin, Germany, e-mail: [email protected], [email protected]

R. Bru and S. Romero-Vivó (Eds.): Positive Systems, LNCIS 389, pp. 131–139. springerlink.com © Springer-Verlag Berlin Heidelberg 2009

This paper is organized as follows. In Section 2, we introduce the notation, define the positivity concept and give a well-known characterization. We review the basics of balanced truncation and introduce positivity-preserving generalizations in Section 3. The applicability of the proposed methods is demonstrated by means of an example in Section 4.

2 Preliminaries

Let the matrix quadruple $[A, B, C, D]$ denote the system (1). The function $G(s) = C(sI - A)^{-1}B + D$ is called the transfer function of the system and $s$ is called the frequency variable of the transfer function. Conversely, $[A, B, C, D]$ is called a realization of $G$. Let $H_\infty$ be the space of all transfer functions that are analytic and bounded in the open right half-plane $\mathbb{C}^+$. The continuous-time $H_\infty$-norm is defined by

\[ \|G\|_\infty = \sup_{s \in \mathbb{C}^+} \|G(s)\|_2, \tag{2} \]

where $\|\cdot\|_2$ denotes the spectral matrix norm.

A matrix $A \in \mathbb{R}^{n \times n}$ is called c-stable if all its eigenvalues are located in the open left complex half-plane. We call a realization $[A, B, C, D]$ of a continuous-time system stable if $A$ is c-stable.

A matrix $A \in \mathbb{R}^{m \times n}$ is called nonnegative (positive) and we write $A \ge 0$ ($A > 0$) if all entries are nonnegative (positive). A matrix $A \in \mathbb{R}^{n \times n}$ is called a Z-matrix if its off-diagonal entries are non-positive. A matrix for which $-A$ is a Z-matrix we call a $-Z$-matrix. For a matrix $A$ we have that $e^{At} \ge 0$ for all $t \ge 0$ if and only if $A$ is a $-Z$-matrix, see, e.g., [11]. Let $B \in \mathbb{R}^{n \times n}$, $B \ge 0$, with spectral radius $\rho(B)$. A matrix $A$ of the form $A = \alpha I - B$, with $\alpha > 0$ and $\alpha \ge \rho(B)$, is called an M-matrix. If $\alpha > \rho(B)$ then $A$ is a nonsingular M-matrix; if $\alpha = \rho(B)$ then $A$ is a singular M-matrix. The class of M-matrices is a subclass of the Z-matrices. Accordingly, a matrix $A \in \mathbb{R}^{n \times n}$ for which $-A$ is an M-matrix is called a $-M$-matrix. Note that for a nonsingular M-matrix $A$, we have $A^{-1} \ge 0$ [11].

A symmetric matrix $A$ is called positive (semi)definite and we write ($A \succeq 0$) $A \succ 0$ if for all $x \ne 0$ we have ($x^T A x \ge 0$) $x^T A x > 0$. If this holds for $-A$ then $A$ is called negative (semi)definite and we write ($A \preceq 0$) $A \prec 0$. For matrices $A, B$ we write ($A \preceq B$) $A \prec B$ if ($B - A \succeq 0$) $B - A \succ 0$.

Next we define the class of positive systems and state a well-known characterization, see, e.g., [5, 7].

Definition 1. System (1) is called positive if for all $t \in \mathbb{R}_+$ the state $x(t)$ and the output $y(t)$ are nonnegative for any nonnegative initial state $x_0$ and any measurable input function $u : \mathbb{R} \to \mathbb{R}^p$ with $u(\tau) \ge 0$ for all $\tau \in [0, t]$.

Theorem 1. System (1) is positive if and only if $A$ is a $-Z$-matrix and $B, C, D \ge 0$.
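The characterization in Theorem 1 translates directly into a sign test on the system matrices. The following Python sketch is a minimal illustration; the function names, the example system and the forward Euler simulation (with a step size small enough that $I + hA \ge 0$) are assumptions made for this example and are not part of the method described in the text.

```python
import numpy as np


def is_minus_Z(A):
    """True if all off-diagonal entries of A are nonnegative (-A is a Z-matrix)."""
    off_diagonal = A[~np.eye(A.shape[0], dtype=bool)]
    return bool(np.all(off_diagonal >= 0))


def is_positive_system(A, B, C, D):
    """Sign test of Theorem 1: A is a -Z-matrix and B, C, D >= 0."""
    return is_minus_Z(A) and all(bool(np.all(M >= 0)) for M in (B, C, D))


# Hypothetical single-input, single-output example.
A = np.array([[-2.0, 1.0], [0.5, -3.0]])
B = np.array([[1.0], [0.0]])
C = np.array([[0.0, 1.0]])
D = np.array([[0.0]])

# Forward Euler simulation with a nonnegative input and initial state.
h, x = 0.01, np.array([1.0, 0.0])
states = [x]
for step in range(1000):
    u = np.array([1.0 + np.sin(0.01 * step)])    # nonnegative input signal
    x = x + h * (A @ x + B @ u)
    states.append(x)
states = np.array(states)

print("positive system:", is_positive_system(A, B, C, D))
print("smallest simulated state entry:", states.min())
```

The simulation only illustrates the definition for one input signal; the sign test itself is what decides positivity.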

3 Balanced Truncation for Positive Systems

Consider the system $[A, B, C, D]$ in continuous time and assume that $A$ is c-stable. The method of balanced truncation is based on Lyapunov equations. It generates c-stable reduced-order systems. However, positivity is not preserved in general. Instead, we consider model reduction based on the Lyapunov inequalities

\[ AP + PA^T + BB^T \preceq 0, \qquad A^T Q + QA + C^T C \preceq 0, \tag{3} \]

with diagonal matrices $P, Q \succeq 0$. In the following we show that for positive systems the inequalities (3) are solvable. Moreover, we show the existence of a positive diagonal transformation $T \in \mathbb{R}^{n \times n}$ such that for the transformed system $[A_b, B_b, C_b, D_b]$ given by

\[ A_b = T^{-1} A T, \quad B_b = T^{-1} B, \quad C_b = CT, \quad \text{and} \quad D_b = D, \tag{4} \]

the corresponding Lyapunov inequalities

\[ A_b P_b + P_b A_b^T + B_b B_b^T \preceq 0, \qquad A_b^T Q_b + Q_b A_b + C_b^T C_b \preceq 0 \tag{5} \]

are fulfilled for

\[ P_b = \mathrm{diag}(\Sigma, \Sigma_c, 0_{n_o}, 0_{n_{co}}), \qquad Q_b = \mathrm{diag}(\Sigma, 0_{n_c}, \Sigma_o, 0_{n_{co}}) \tag{6} \]

with $0 \prec \Sigma_c \in \mathbb{R}^{n_c \times n_c}$, $0 \prec \Sigma_o \in \mathbb{R}^{n_o \times n_o}$ and

\[ \Sigma = \mathrm{diag}(\sigma_1, \sigma_2, \dots, \sigma_k) \quad \text{for some } \sigma_1 \ge \sigma_2 \ge \dots \ge \sigma_k > 0. \tag{7} \]

We will call $[A_b, B_b, C_b, D_b]$ a positive balanced realization. Note that the indices $n_o$ and $n_c$ may be zero.

Theorem 2. Consider the c-stable continuous-time positive standard system (1). Then, there exists a diagonal matrix $T \succ 0$ such that the positive system $[A_b, B_b, C_b, D_b]$ given by (4) is positive balanced, i.e. there exist diagonal matrices $P_b \succeq 0$, $Q_b \succeq 0$ as in (6), such that the Lyapunov inequalities in (5) hold.

Proof. A $-M$-matrix is diagonally stable, i.e., there exist diagonal positive definite matrices $X, Y$ such that

\[ AX + XA^T \prec 0 \quad \text{and} \quad A^T Y + YA \prec 0, \]

see, e.g., [1, Theorem 1]. In particular, there exist diagonal matrices $P \succeq 0$, $Q \succeq 0$ such that (3) holds. We define a permutation matrix $\Pi$ such that

\[ \Pi^T P \Pi = \mathrm{diag}(X_{11}, X_{22}, 0_{n_o}, 0_{n_{co}}), \qquad \Pi^T Q \Pi = \mathrm{diag}(Y_{11}, 0_{n_c}, Y_{33}, 0_{n_{co}}) \]

with the additional property that

\[ X_{11} = \mathrm{diag}(x_1, \dots, x_k), \qquad Y_{11} = \mathrm{diag}(y_1, \dots, y_k) \]

satisfy $x_1 y_1 \ge x_2 y_2 \ge \dots \ge x_k y_k > 0$. Setting

\[ \bar{T} = \mathrm{diag}\big( (X_{11} Y_{11}^{-1})^{\frac{1}{4}}, I_{n_c}, I_{n_o}, I_{n_{co}} \big), \qquad T = \Pi \bar{T}, \]

we have that $P_b = T^{-1} P T^{-T}$, $Q_b = T^T Q T$ have the desired form. The transformed system is given by $[A_b, B_b, C_b, D_b]$ as defined in (4). Since $A_b$ is a $-Z$-matrix and $B_b, C_b, D_b \ge 0$, the transformed system is again positive by Theorem 1. □
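The construction in this proof can be followed numerically. The Python sketch below does not solve an LMI; instead it swaps in a simple, non-optimal way of producing diagonal solutions of (3), which is an assumption of this example: for a c-stable $-Z$-matrix $A$, take $v = -A^{-1}\mathbf{1}$ and $w = -A^{-T}\mathbf{1}$, use $\mathrm{diag}(v_i / w_i)$ and $\mathrm{diag}(w_i / v_i)$ as diagonally stabilizing matrices, and scale them until the inequalities hold. The example matrices and the helper name `scale_to_feasible` are hypothetical. Since the resulting $P$ and $Q$ are positive definite, the zero blocks in (6) are absent here.

```python
import numpy as np

# Hypothetical c-stable positive system (row sums of A are negative).
A = np.array([[-3.0, 1.0, 0.5],
              [1.0, -2.0, 0.2],
              [0.0, 1.0, -4.0]])
B = np.array([[1.0], [0.0], [0.5]])
C = np.array([[1.0, 0.0, 2.0]])

ones = np.ones(A.shape[0])
v = -np.linalg.solve(A, ones)        # A v = -1 < 0 with v > 0
w = -np.linalg.solve(A.T, ones)      # A^T w = -1 < 0 with w > 0


def scale_to_feasible(M, X, R):
    """Scale diagonal X (with M X + X M^T < 0) so that M X + X M^T + R <= 0."""
    L = M @ X + X @ M.T
    c = np.linalg.eigvalsh(R).max() / -np.linalg.eigvalsh(L).max()
    return c * X


P = scale_to_feasible(A, np.diag(v / w), B @ B.T)        # A P + P A^T + B B^T <= 0
Q = scale_to_feasible(A.T, np.diag(w / v), C.T @ C)      # A^T Q + Q A + C^T C <= 0

# Permutation ordering the products p_i q_i decreasingly, then diagonal scaling.
p, q = np.diag(P), np.diag(Q)
order = np.argsort(-(p * q))
Pi = np.eye(len(p))[:, order]
t = (p[order] / q[order]) ** 0.25
T = Pi @ np.diag(t)
T_inv = np.diag(1.0 / t) @ Pi.T

P_b = T_inv @ P @ T_inv.T
Q_b = T.T @ Q @ T
A_b, B_b, C_b = T_inv @ A @ T, T_inv @ B, C @ T
sigma = np.diag(P_b)

print("sigma:", sigma)
print("P_b equals Q_b:", np.allclose(P_b, Q_b))
```

The diagonal of $P_b = Q_b$ contains the values $\sigma_i = \sqrt{p_i q_i}$ in decreasing order. Solutions obtained this way are feasible but generally far from the trace-minimizing ones an LMI solver would return.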

The numbers $\sigma_1, \dots, \sigma_k$ play the role of Hankel singular values for conventional balanced truncation [6]. Consider a partition

\[ A_b = \begin{pmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{pmatrix}, \qquad B_b = \begin{pmatrix} B_1 \\ B_2 \end{pmatrix}, \qquad C_b = \begin{pmatrix} C_1 & C_2 \end{pmatrix}, \tag{8} \]

where $A_{11} \in \mathbb{R}^{\ell \times \ell}$ and either $\ell = k$ or $\ell < k$ such that $\sigma_{\ell+1} < \sigma_\ell$. The matrices $B$ and $C$ are partitioned accordingly. By means of balanced realizations, reduced-order models

\[ \begin{aligned} \dot{\hat{x}}(t) &= \hat{A}\hat{x}(t) + \hat{B}u(t), \\ \hat{y}(t) &= \hat{C}\hat{x}(t) + \hat{D}u(t) \end{aligned} \tag{9} \]

can now be constructed, where $\hat{A}, \hat{B}, \hat{C}, \hat{D}$ are defined by

\[ \hat{A} = A_{11}, \quad \hat{B} = B_1, \quad \hat{C} = C_1, \quad \hat{D} = D. \tag{10} \]

An alternative method for the construction of reduced-order models is

\[ \begin{aligned} \hat{A} &= A_{11} - A_{12} A_{22}^{-1} A_{21}, & \hat{B} &= B_1 - A_{12} A_{22}^{-1} B_2, \\ \hat{C} &= C_1 - C_2 A_{22}^{-1} A_{21}, & \hat{D} &= D - C_2 A_{22}^{-1} B_2. \end{aligned} \tag{11} \]

For the reduced-order models, we have the following result.

Theorem 3. Let $[A, B, C, D]$ be a realization of $G(s)$ that is c-stable. Moreover, let $A_b, B_b, C_b, D_b$ be constructed as in (4), such that (5) holds for $P_b \succeq 0$, $Q_b \succeq 0$ as in (6). Let $[\hat{A}, \hat{B}, \hat{C}, \hat{D}]$ be the realization that is constructed via either (10) or (11). Then, the system $[\hat{A}, \hat{B}, \hat{C}, \hat{D}]$ is positive and the transfer function $\hat{G}(s) = \hat{C}(sI - \hat{A})^{-1}\hat{B} + \hat{D}$ satisfies

\[ \|G - \hat{G}\|_\infty \le 2 \sum_{i=\ell+1}^{k} \sigma_i. \tag{12} \]

Proof. It suffices to show the positivity of the reduced-order systems. For a proof of the error bound in (12), we refer to [8]. The reduced-order system defined in (10) is again positive, since $\hat{B} \ge 0$, $\hat{C} \ge 0$, $\hat{D} \ge 0$ and $\hat{A}$ is a $-M$-matrix as a submatrix of a $-M$-matrix.

The positivity of the reduced-order system defined in (11) can be seen as follows: the $-M$-matrix property of $\hat{A}$ is preserved, since it is a Schur complement of $A$ [11]. Furthermore, since $A_{22}$ is also a $-M$-matrix, we have $A_{22}^{-1} \le 0$, and hence, $\hat{B}, \hat{C}, \hat{D} \ge 0$. □

The main difference between the reduced-order models (10) and (11) is that the model (10) is exact for $s = \infty$, meaning that $G(\infty) = \hat{G}(\infty)$, whereas (11) is exact at $s = 0$. In balanced truncation based on Lyapunov equations, the first method is called standard balanced truncation, whereas the second method is called singular perturbation balanced truncation.

Note that for the computation of positive reduced-order models, there is no need to compute a balanced realization explicitly. Instead, for solutions

\[ P = \mathrm{diag}(p_1, \dots, p_n) \succeq 0 \quad \text{and} \quad Q = \mathrm{diag}(q_1, \dots, q_n) \succeq 0 \tag{13} \]

of (3), indices $i_1, \dots, i_n$ have to be found such that we have

\[ p_{i_1} q_{i_1} \ge \dots \ge p_{i_\ell} q_{i_\ell} > p_{i_{\ell+1}} q_{i_{\ell+1}} \ge \dots \ge p_{i_n} q_{i_n}. \tag{14} \]

Then, a reduced-order model (9) can be obtained in the following way: Let $I_1 = [i_1, \dots, i_\ell]$, $I_2 = [i_{\ell+1}, \dots, i_n]$ and

\[ \begin{aligned} \bar{A}_{11} &= A(I_1, I_1), & \bar{A}_{12} &= A(I_1, I_2), & \bar{B}_1 &= B(I_1, [1, \dots, m]), \\ \bar{A}_{21} &= A(I_2, I_1), & \bar{A}_{22} &= A(I_2, I_2), & \bar{B}_2 &= B(I_2, [1, \dots, m]), \\ \bar{C}_1 &= C([1, \dots, p], I_1), & \bar{C}_2 &= C([1, \dots, p], I_2), \end{aligned} \tag{15} \]

and either define the reduced-order model by

\[ [\hat{A}, \hat{B}, \hat{C}, \hat{D}] = [\bar{A}_{11}, \bar{B}_1, \bar{C}_1, D] \tag{16} \]

or by

\[ [\hat{A}, \hat{B}, \hat{C}, \hat{D}] = [\bar{A}_{11} - \bar{A}_{12}\bar{A}_{22}^{-1}\bar{A}_{21}, \; \bar{B}_1 - \bar{A}_{12}\bar{A}_{22}^{-1}\bar{B}_2, \; \bar{C}_1 - \bar{C}_2\bar{A}_{22}^{-1}\bar{A}_{21}, \; D - \bar{C}_2\bar{A}_{22}^{-1}\bar{B}_2]. \tag{17} \]

These systems are linked to the reduced-order models (10) and (11), respectively, via a positive diagonal state-space transformation. Therefore, the positivity as well as the error bound (12) are still valid. Summarizing, we present the reduction procedure in algorithmic form.

Algorithm 4. Positivity-preserving model reduction by a) standard balanced truncation, b) singular perturbation balanced truncation. Given a positive system $G = [A, B, C, D]$, compute a positive reduced-order model $\hat{G} = [\hat{A}, \hat{B}, \hat{C}, \hat{D}]$.
1. Solve the Lyapunov inequalities (3) for diagonal $P$ and $Q$ as in (13).
2. Form distinct indices $i_1, \dots, i_n$ such that $p_{i_k} q_{i_k} \ge p_{i_l} q_{i_l}$ for $k < l$.
3. Choose $\ell \in \{1, \dots, n\}$ such that $p_{i_\ell} q_{i_\ell} > p_{i_{\ell+1}} q_{i_{\ell+1}}$ and define the vectors of indices $I_1 = [i_1, \dots, i_\ell]$, $I_2 = [i_{\ell+1}, \dots, i_n]$.
4. Build the reduced-order model a) by (16) or b) by (17).
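Steps 2 to 4 of Algorithm 4 amount to index selection and, for variant b), one linear solve. The following Python sketch assumes that diagonal solutions $P = \mathrm{diag}(p)$, $Q = \mathrm{diag}(q)$ of (3) are already available; here they are hand-picked for a hypothetical two-state system rather than computed by an LMI solver. The function names `reduce_positive` and `transfer` are introduced for this illustration, and the returned bound is $2\sum \sqrt{p_i q_i}$ over the truncated indices.

```python
import numpy as np


def reduce_positive(A, B, C, D, p, q, ell, method="truncation"):
    """Steps 2-4 of Algorithm 4 for given diagonal solutions p, q of (3)."""
    order = np.argsort(-(p * q), kind="stable")          # step 2: sort p_i q_i
    I1, I2 = order[:ell], order[ell:]                    # step 3: split indices
    if p[I1[-1]] * q[I1[-1]] <= p[I2[0]] * q[I2[0]]:
        raise ValueError("ell must separate strictly different values p_i q_i")
    A11, A12 = A[np.ix_(I1, I1)], A[np.ix_(I1, I2)]
    A21, A22 = A[np.ix_(I2, I1)], A[np.ix_(I2, I2)]
    B1, B2, C1, C2 = B[I1, :], B[I2, :], C[:, I1], C[:, I2]
    bound = 2.0 * np.sqrt(p[I2] * q[I2]).sum()
    if method == "truncation":                           # variant a), cf. (16)
        return A11, B1, C1, D, bound
    X = np.linalg.solve(A22, np.hstack([A21, B2]))       # A22^{-1} [A21, B2]
    return (A11 - A12 @ X[:, :ell], B1 - A12 @ X[:, ell:],   # variant b), cf. (17)
            C1 - C2 @ X[:, :ell], D - C2 @ X[:, ell:], bound)


def transfer(A, B, C, D, s):
    """Evaluate C (sI - A)^{-1} B + D at the complex frequency s."""
    return C @ np.linalg.solve(s * np.eye(A.shape[0]) - A, B) + D


# Hypothetical positive system with hand-picked diagonal solutions of (3).
A = np.array([[-2.0, 0.1], [0.1, -1.0]])
B = np.array([[0.1], [1.0]])
C = np.array([[0.1, 1.0]])
D = np.array([[0.0]])
p = np.array([0.05, 1.0])
q = np.array([0.05, 1.0])

omegas = np.concatenate([[0.0], np.logspace(-3, 3, 200)])
results = {}
for method in ("truncation", "singular_perturbation"):
    Ar, Br, Cr, Dr, bound = reduce_positive(A, B, C, D, p, q, 1, method)
    error = max(np.linalg.norm(transfer(A, B, C, D, 1j * om)
                               - transfer(Ar, Br, Cr, Dr, 1j * om), 2)
                for om in omegas)
    results[method] = (Ar, Br, Cr, Dr, bound, error)
    print(method, "error on grid:", error, "bound:", bound)
```

The error is only sampled on a frequency grid, so the printed value is a lower estimate of the $H_\infty$-norm of the error system; it can be compared with the bound but does not replace it.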

[Figure: schematic of the reservoirs $R_1, \dots, R_n$ with fill levels $h_i$, inflow $u$ into $R_1$, connecting flows $f_{ij}$ and outflows $f_{o,i}$]

Fig. 1 System of n water reservoirs

One can see that the error of the reduced-order model can be estimated by

\[ \|G - \hat{G}\|_\infty \le 2 \sum_{k=\ell+1}^{n} \sqrt{p_{i_k} q_{i_k}}. \tag{18} \]

Let us finally give a remark on the Lyapunov inequalities (3). It is clear that their solutions are not unique and one should look for solutions $P = \mathrm{diag}(p_1, \dots, p_n)$, $Q = \mathrm{diag}(q_1, \dots, q_n)$ such that $\sqrt{PQ}$ has a large number of small diagonal elements. This yields components of the state which are candidates to truncate. A good heuristic for this is the minimization of the trace of $P$ and $Q$ [4]. For getting even sharper bounds, the Lyapunov inequalities can be solved once more while now minimizing the sum of those diagonal elements of $P$ and $Q$ corresponding to the candidates for truncation.

4 Example

In this section we present a numerical example to demonstrate the properties of the discussed model reduction approach for positive systems. The numerical tests were run in MATLAB® Version 7.4.0 on a PC with an Intel(R) Pentium(R) 4 CPU 3.20GHz processor.

Example 1. Consider a system of $n$ water reservoirs such as schematically shown in Fig. 1. All reservoirs $R_1, \dots, R_n$ are assumed to be located on the same level. The base area of $R_i$ and its fill level are denoted by $a_i$ and $h_i$, respectively. The first reservoir $R_1$ has an inflow $u$ which is the input of the system, and for each $i \in \{1, \dots, n\}$, $R_i$ has an outflow $f_{o,i}$ through a pipe with diameter $d_{o,i}$. The output of the system is assumed to be the sum of all outflows. Furthermore, each $R_i$ and $R_j$ are connected by a pipe with diameter $d_{ij} = d_{ji} \ge 0$. The direct flow from $R_i$ to $R_j$ is denoted by $f_{ij}$. We assume that the flow depends linearly on the difference between the pressures on both ends. This leads to the equations

\[ f_{ij}(t) = d_{ij}^2 \cdot c \cdot (h_i(t) - h_j(t)), \qquad f_{o,i}(t) = d_{o,i}^2 \cdot c \cdot h_i(t), \]

where $c$ is a constant that depends on the viscosity and density of the medium and gravity. The fill level of $R_i$ thus satisfies the following differential equation

\[ \dot{h}_i(t) = \frac{c}{a_i} \Big( -d_{o,i}^2 h_i(t) + \sum_{j=1}^{n} d_{ij}^2 (h_j(t) - h_i(t)) \Big) + \frac{1}{a_i} \delta_{1i} u(t), \]

where $\delta_{1i}$ denotes the Kronecker symbol, that is, $\delta_{1i} = 1$ if $i = 1$ and zero otherwise. Then, we obtain system (1) with $D = 0$ and matrices $A = [a_{ij}]_{i,j=1,\dots,n}$, $B = [b_i]_{i=1,\dots,n}$, $C = [c_j]_{j=1,\dots,n}$ with $b_i = \frac{\delta_{1i}}{a_1}$, $c_i = c \cdot d_{o,i}^2$, and

\[ a_{ij} = \frac{c}{a_i} \cdot \begin{cases} -d_{o,i}^2 - \sum_{k=1}^{n} d_{ik}^2, & i = j, \\ d_{ij}^2, & i \ne j, \end{cases} \]

where we define $d_{ii} = 0$.

For our illustrative computation, we have constructed the presented compartment model with ten states. We assume that we have two well connected substructures each consisting of five reservoirs, where each reservoir is connected with every other reservoir by a pipe of diameter 1. The substructures are connected with each other by a pipe of diameter 0.2 between reservoirs one and ten. For simplicity, we also set all base areas of the reservoirs to 1 and also $c = 1$. Solving the Lyapunov inequalities (3), we obtain the values $\sigma_1 = \sigma_2 = \sigma_3 = \sigma_4 = 23.3581$, $\sigma_5 = 14.9864$, $\sigma_6 = 0.0097$, $\sigma_7 = \sigma_8 = \sigma_9 = \sigma_{10} = 0.0055$. We show only the results obtained with singular perturbation balanced truncation. However, qualitatively similar results were obtained via standard balanced truncation. The reduced model with five states is again positive with

\[ A_r = \begin{bmatrix} -5.0000 & 1.0000 & 1.0000 & 1.0000 & 0.8022 \\ 1.0000 & -5.0000 & 1.0000 & 1.0000 & 0.8022 \\ 1.0000 & 1.0000 & -5.0000 & 1.0000 & 0.8022 \\ 1.0000 & 1.0000 & 1.0000 & -5.0000 & 0.8022 \\ 1.2466 & 1.2466 & 1.2466 & 1.2466 & -5.0400 \end{bmatrix}, \]

\[ B_r = \begin{bmatrix} 0 \\ 0 \\ 0 \\ 0 \\ 0.1483 \end{bmatrix}, \qquad C_r = \begin{bmatrix} 8.4087 \\ 8.4087 \\ 8.4087 \\ 8.4087 \\ 7.0113 \end{bmatrix}^T, \qquad D_r = 0. \]
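The full-order matrices of this compartment model are easy to assemble from the formulas above. The Python sketch below does so for the ten-reservoir configuration; the outflow pipe diameters are not stated in the text, and the value $d_{o,i} = 1$ used here is an assumption (it is consistent with the diagonal entries $-5$ and $-5.04$ of the reduced matrix $A_r$). Variable names are chosen for this illustration.

```python
import numpy as np

n, c = 10, 1.0
area = np.ones(n)            # base areas a_i = 1
d_out = np.ones(n)           # assumption: outflow pipe diameters d_{o,i} = 1

# Pipe diameters d_ij: two fully connected groups of five reservoirs,
# linked by a pipe of diameter 0.2 between reservoirs one and ten.
d = np.zeros((n, n))
for group in (range(0, 5), range(5, 10)):
    for i in group:
        for j in group:
            if i != j:
                d[i, j] = 1.0
d[0, 9] = d[9, 0] = 0.2

# a_ij = (c / a_i) * d_ij^2 off the diagonal,
# a_ii = -(c / a_i) * (d_{o,i}^2 + sum_k d_ik^2) on the diagonal.
A = (c / area)[:, None] * (d**2 - np.diag(d_out**2 + (d**2).sum(axis=1)))
B = np.zeros((n, 1))
B[0, 0] = 1.0 / area[0]      # inflow enters the first reservoir only
C = (c * d_out**2)[None, :]  # output is the sum of all outflows
D = np.zeros((1, 1))

off_diagonal = A[~np.eye(n, dtype=bool)]
print("A is a -Z-matrix:", bool(np.all(off_diagonal >= 0)))
print("largest eigenvalue of A:", np.linalg.eigvalsh(A).max())
```

With unit areas the matrix $A$ is symmetric, so `eigvalsh` applies; its largest eigenvalue is negative, which confirms that the model is c-stable and, by the sign pattern of $A, B, C, D$, positive.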

The frequency responses, i.e. the transfer function at values $s = j\omega$, for $\omega \in [0, 10]$, of the original and of the reduced-order model are depicted in the upper diagram of Fig. 2. The lower diagram shows the frequency response of the error system along with the mutual error bound 0.0636.

[Figure: upper panel "Frequency plots of original and reduced order model" (curves: original model, reduced-order model); lower panel "Frequency plots of error system and error bound" (curves: error plot, error bound); horizontal axis from 0 to 10 in both panels]

Fig. 2 Frequency plot showing original and reduced order model

5 Conclusion

In this paper, we have presented a model reduction technique that preserves the positivity of a system in continuous time. The presented method can be extended to the discrete-time case. The proposed method is based on the existence of a diagonal solution of the corresponding Lyapunov inequalities, which may be obtained via LMI solution methods. The reduction may then be performed by standard balanced truncation or singular perturbation balanced truncation methods. It is shown that both methods preserve positivity. Furthermore, a numerical example in the continuous-time case is provided and illustrates the functionality of the proposed algorithm.

Acknowledgements. Supported by the DFG Research Center MATHEON in Berlin.

References

1. Araki, M.: Application of M-matrices to the stability problems of composite dynamical systems. J. Math. Anal. Appl. 52, 309–321 (1975)
2. Beck, C., Doyle, J., Glover, K.: Model reduction of multidimensional and uncertain systems. IEEE Trans. Automat. Control 41, 1466–1477 (1996)
3. Berman, A., Plemmons, R.J.: Nonnegative Matrices. In: The Mathematical Sciences, Classics in Applied Mathematics, 2nd edn., vol. 9. Society for Industrial and Applied Mathematics (SIAM), Philadelphia (1994)
4. Boyd, S., El Ghaoui, L., Feron, E., Balakrishnan, V.: Linear matrix inequalities. In: System and Control Theory, SIAM Studies in Applied Mathematics, vol. 15. Society for Industrial and Applied Mathematics (SIAM), Philadelphia (1994)
5. Farina, L., Rinaldi, S.: Positive Linear Systems. Theory and applications. John Wiley and Sons Inc., New York (2000)
6. Glover, K.: All optimal Hankel-norm approximations of linear multivariable systems and their L∞-error bounds. Internat. J. Control 39, 1115–1193 (1984)
7. Kaczorek, T.: Positive 1D and 2D Systems. Springer, London (2002)
8. Liu, Y., Anderson, B.D.O.: Singular perturbation approximation of balanced systems. Internat. J. Control 50, 1379–1405 (1989)
9. Luenberger, D.G.: Introduction to Dynamic Systems. John Wiley and Sons Inc., New York (1979)
10. Reis, T., Virnik, E.: Positivity preserving balanced truncation for descriptor systems. SIAM J. Control Optim. (to appear, 2009)
11. Varga, R.S.: Matrix Iterative Analysis. In: Springer Series in Computational Mathematics, vol. 27. Springer, Berlin (2000)
12. Virnik, E.: Analysis of positive descriptor systems, PhD thesis, TU Berlin (2008)

The Minimum Energy Problem for Positive Discrete-Time Linear Systems with Fixed Final State

Ventsi Rumchev and Siti Chotijah

Abstract. The non-negativity of controls in positive linear discrete-time systems usually gives rise to complementarity conditions in the first-order Karush-Kuhn-Tucker optimality conditions; this complicates the analytic solution and usually leads to numerical solutions. The intrinsic relationship between reachable sets and the minimum-energy problem is exploited in this paper to obtain an analytic solution of the minimum-energy control problem for positive linear discrete-time systems with any pair of fixed terminal (initial and final) states.

1 Introduction

The minimum-energy problem for time-invariant linear systems is a classical problem in control theory. It has nice analytic solutions if no restrictions are imposed on the state and control variables [1, 8]. Positive discrete-time linear systems (PDLS) are defined on cones and not on linear spaces since the control and trajectory are to be non-negative. The non-negativity of control in such systems gives rise to complementarity conditions in the first-order Karush-Kuhn-Tucker optimality conditions [3], which complicates the analytic solution and usually leads to numerical solutions. At the same time the appeal and the advantages of analytic solutions are well appreciated. To the best of our knowledge the only analytical solution to the minimum-energy problem for PDLS with fixed final state is obtained by Kaczorek [6]. The result is developed under some assumptions, among which the assumption of zero initial state seems to be quite restrictive. The relation of the problem with the reachable sets, i.e. the geometry of the problem, is not studied in [6] either.

Ventsi Rumchev Department of Mathematics and Statistics, Curtin University of Technology. GPO Box U 1987 Perth, WA 6845, Perth, Australia e-mail: [email protected]

R. Bru and S. Romero-Vivó (Eds.): Positive Systems, LNCIS 389, pp. 141–149. springerlink.com © Springer-Verlag Berlin Heidelberg 2009

Related work for continuous-time systems with non-negative controls is published in [4, 5, 7, 9], but the positivity of the system is not exploited in these papers, except in [7], where conditions that guarantee the positivity of the closed-loop linear quadratic optimal system with free final state are developed. Positivity is an intrinsic property of positive systems and in many cases it helps to simplify the analysis and the results. This paper is concerned with the minimum-energy problem for PDLS with any pair of fixed terminal (initial and final) states. We focus on the scalar case, as the simpler one, in order to expose the relationship between the reachable sets and the minimum-energy problem. Using the dynamic programming approach [1] we obtain a nice analytic solution to the problem.

2 Problem Formulation and some Preliminaries

The minimum-energy problem for scalar positive discrete-time linear systems (PDLS) with fixed final state is formulated as follows [6]. Minimize

$$J = \frac{1}{2} \sum_{t=0}^{T-1} u^2(t) \tag{1}$$

subject to

$$x(t+1) = a\,x(t) + b\,u(t), \quad t = 0, \ldots, T-1, \tag{2}$$

$$a, b \ge 0, \quad u(t) \in \mathbb{R}_+, \tag{3}$$

where $x(t)$ is the state at time $t = 0, \ldots, T$, $u(t) \in \mathbb{R}_+$ is the control sequence, the symbol $\mathbb{R}_+$ denotes the set of all non-negative real numbers, $T$ is a finite time horizon, and the initial and final states are given by

$$x(0) = x_0 \ge 0 \quad \text{and} \quad x(T) = x_T \ge 0. \tag{4}$$

The state variables $x(t)$, $t = 0, \ldots, T$, are, clearly, non-negative for any non-negative initial state $x_0 \ge 0$ and any (non-negative) control sequence $u(t)$, $t = 0, \ldots, T-1$.

Definition 1. The set of all states $R_t(0)$ of PDLS (2)-(3) reachable from the origin in $t$ steps by admissible (i.e., non-negative) control sequences $\{u(0), u(1), \ldots, u(t-1)\}$ is called a $t$-step reachable set. It is defined as

$$R_t(0) = \left\{ x \;\middle|\; x = \sum_{k=0}^{t-1} a^{t-k-1} b\,u(k);\ a, b, x_0 \ge 0 \text{ and } u(k) \ge 0 \text{ for } k = 0, 1, \ldots, t-1 \right\}, \quad t = 0, \ldots, T-1,$$

with $R_0(0) \equiv 0$. Any non-negative state can, clearly, be reached by non-negative controls in (at least) one step if $b > 0$. If $b = 0$, the control does not affect the system state and no positive state can be reached from the origin. Consequently, we have

$$R_t(0) = \begin{cases} \mathbb{R}_+, & b > 0 \\ \{0\}, & b = 0. \end{cases}$$

Definition 2. The set of all states $R_t(x_0)$ of PDLS (2)-(3) reachable from the initial state $x_0 = x(0)$ in $t$ steps by non-negative control sequences $\{u(0), u(1), \ldots, u(t-1)\}$ is defined as

$$R_t(x_0) = \left\{ x \;\middle|\; x = a^t x_0 + \sum_{k=0}^{t-1} a^{t-k-1} b\,u(k);\ a, b, x_0 \ge 0 \text{ and } u(k) \ge 0 \text{ for } k = 0, 1, \ldots, t-1 \right\} = a^t x_0 + R_t(0), \tag{5}$$

$t = 1, 2, \ldots, T$, with $R_0(x_0) \equiv x_0$. It is not difficult to see that

$$R_t(x_0) = \begin{cases} a^t x_0 + \mathbb{R}_+ = [a^t x_0, \infty), & b > 0 \\ \{a^t x_0\}, & b = 0. \end{cases}$$

So, the reachable set $R_t(0)$ is a particular case of $R_t(x_0)$ for $x_0 = 0$.

The PDLS (2)-(3) is asymptotically stable if and only if $0 \le a < 1$ [2]. It is stable if $0 \le a \le 1$. Let $R_t^{s}(x_0)$ denote the reachable sets of a stable PDLS. For $b > 0$ the reachable sets of a stable system possess the nested property

$$R_{t-1}^{s}(x_0) \subseteq R_t^{s}(x_0), \quad t = 1, 2, \ldots, T, \tag{6}$$

where the inclusion is strict if the system is asymptotically stable, i.e. $0 \le a < 1$. The inclusion property (6) is in the opposite direction,

$$R_t^{us}(x_0) \subset R_{t-1}^{us}(x_0), \quad t = 1, 2, \ldots, T,$$

if the PDLS is unstable ($a > 1$) and $b > 0$, where $R_t^{us}(x_0)$ denotes the $t$-step reachable set of an unstable PDLS (2)-(3).
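The closed-form description of the reachable sets lends itself to a direct membership test. The following is a minimal sketch, assuming the scalar system (2)-(3); the function names `reachable_interval` and `is_reachable` are illustrative and do not come from the paper.

```python
import math


def reachable_interval(a, b, x0, t):
    """Return (lower, upper) bounds of the t-step reachable set R_t(x0).

    Follows the closed form stated after (5): the set is the half-line
    [a^t x0, inf) when b > 0 and the single point {a^t x0} when b = 0.
    For t = 0 the set is the point x0 itself.
    """
    if a < 0 or b < 0 or x0 < 0 or t < 0:
        raise ValueError("a, b, x0 and t must be non-negative")
    lower = a ** t * x0
    if t == 0 or b == 0:
        return (lower, lower)   # a single point
    return (lower, math.inf)    # a half-line


def is_reachable(a, b, x0, t, x_target):
    """Check whether x_target belongs to R_t(x0)."""
    lower, upper = reachable_interval(a, b, x0, t)
    return lower <= x_target <= upper
```

For a stable system with b > 0 the lower bound a^t x0 does not increase with t, which is the nested property (6) seen numerically.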

The PDLS (2)-(3) is

(a) reachable if and only if $b > 0$, and then any non-negative state $x$ can be reached from the origin by an admissible (that is, a non-negative) control in one step;
(b) null-controllable (in finite time) or controllable-to-the-origin if and only if $a = 0$, and then the origin can be reached from any non-negative state by zero control in one step;
(c) controllable if and only if it is reachable ($b > 0$) and null-controllable ($a = 0$), and then the system can be driven by a (non-negative) control from any non-negative initial state $x_0 \ge 0$ into any terminal state $x \ge 0$ in one step (see [2, 10]).

3 Main Results

Theorem 1. Let $x_T \in R_T(x_0)$. Then, the optimal control sequence that minimizes the cost function (1) in the minimum-energy problem (1)-(4) with fixed final state is given by

$$u^*(t) = \begin{cases} \dfrac{a^{T-(t+1)} \left[ x_T - a^{T-t} x^*(t) \right]}{b \sum_{i=t}^{T-1} a^{2(T-(i+1))}}, & b > 0 \\[2ex] 0, & b = 0 \end{cases} \qquad t = 0, 1, 2, \ldots, T-1, \tag{7}$$

where $x^*(t)$ is the corresponding optimal trajectory, and the optimal value of the cost function (1) is

$$J_0^* = \begin{cases} \dfrac{1}{2} \dfrac{\left( x_T - a^{T} x_0 \right)^2}{b^2 \sum_{i=0}^{T-1} a^{2(T-i-1)}}, & b > 0 \\[2ex] 0, & b = 0. \end{cases} \tag{8}$$

Proof. The hypothesis $x_T \in R_T(x_0)$ implies that there exists a solution to the two-point boundary-value problem (2)-(4). In other words, there exists an admissible (non-negative) control sequence $\{u(t) \ge 0,\ t = 0, 1, \ldots, T-1\}$ such that the corresponding trajectory $\{x_0, x(1), \ldots, x(T-1), x_T\}$ is feasible (that is, non-negative).

When $b = 0$ the PDLS (2)-(3) is not reachable and the reachable set $R_T(x_0)$ consists of the point $a^T x_0$ only. Then, the only solution to the minimum-energy problem (1)-(4) is the trivial one: $\{u^*(t) = 0,\ t = 0, 1, \ldots, T-1\}$, $\{x_0, a x_0, \ldots, a^{T-1} x_0, x_T\}$ and $J_0^* = 0$.

Let $b > 0$. Since by hypothesis $x_T \in R_T(x_0)$, the two-point boundary-value problem (2)-(4) is consistent and there exists at least one solution. To find the solution that minimises the cost function (1) we apply the dynamic programming procedure [1]. The Bellman equation for the minimum-energy problem (1)-(4) can be written as

$$J_t(x) = \min_{u \ge 0} \left\{ \frac{1}{2} u^2 + J_{t+1}(x) \right\}, \quad u = u(t),\ x = x(t),\ t = 0, \ldots, T-1,$$

with $J_T(x) = 0$. Moving backwards we try for $t = T-1$, $t = T-2$ and formulate the induction hypothesis:

$$J_t(x) = \frac{1}{2} \frac{\left( x_T - a^{T-t} x \right)^2}{b^2 \sum_{i=t}^{T-1} a^{2(T-i-1)}} \tag{9}$$

and

$$u(t) = \frac{a^{T-(t+1)} b \left( x_T - a^{T-t} x \right)}{b^2 \sum_{i=t}^{T-1} a^{2(T-(i+1))}} \ge 0. \tag{10}$$

Let expressions (9) and (10) be true for $t = k+1$, that is

$$J_{k+1}(x) = \frac{1}{2} \frac{\left( x_T - a^{T-(k+1)} x \right)^2}{b^2 \sum_{i=k+1}^{T-1} a^{2(T-(i+1))}}, \tag{11}$$

and, respectively,

$$u(k+1) = \frac{a^{T-(k+2)} b \left( x_T - a^{T-(k+1)} x \right)}{b^2 \sum_{i=k+1}^{T-1} a^{2(T-(i+1))}} \ge 0. \tag{12}$$

We prove that (9) and (10) are true for $t = k$, that is

$$J_k(x) = \frac{1}{2} \frac{\left( x_T - a^{T-k} x \right)^2}{b^2 \sum_{i=k}^{T-1} a^{2(T-i-1)}}$$

and

$$u(k) = \frac{a^{T-(k+1)} b \left( x_T - a^{T-k} x \right)}{b^2 \sum_{i=k}^{T-1} a^{2(T-(i+1))}} \ge 0.$$

For $t = k$ the Bellman equation is specified as

$$J_k(x) = \min_{u \ge 0} \left\{ \frac{1}{2} u^2 + J_{k+1}(x) \right\}. \tag{13}$$

A substitution of state equation (2) into (11) yields

$$J_k(x) = \min_{u \ge 0} \left\{ \frac{1}{2} u^2 + \frac{1}{2} \frac{\left[ x_T - a^{T-k-1} (a x + b u) \right]^2}{b^2 \sum_{i=k+1}^{T-1} a^{2(T-i-1)}} \right\}, \tag{14}$$

where $u = u(k)$ and $x = x(t)$ is to be specified by the initial condition $x(0) = x_0$. The differentiation of the expression

$$\frac{1}{2} u^2 + \frac{1}{2} \frac{\left[ x_T - a^{T-k-1} (a x + b u) \right]^2}{b^2 \sum_{i=k+1}^{T-1} a^{2(T-i-1)}}$$

with respect to $u$ results in

$$u(k) = \frac{a^{T-(k+1)} b \left( x_T - a^{T-k} x \right)}{b^2 \sum_{i=k}^{T-1} a^{2(T-(i+1))}} \ge 0. \tag{15}$$

The substitution of (15) in (14) leads to

$$J_k(x) = \frac{1}{2} \frac{\left( x_T - a^{T-k} x \right)^2}{b^2 \sum_{i=k}^{T-1} a^{2(T-i-1)}}.$$

Thus, the assumptions (9) and (10) are true for $t = k$, and, therefore, they are true by induction for any $t$. For $t = 0$ we have

$$u(0) = \frac{a^{T-1} b \left( x_T - a^{T} x_0 \right)}{b^2 \sum_{i=0}^{T-1} a^{2(T-(i+1))}} \ge 0.$$

This concludes the proof of the theorem. 
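For readers who want the intermediate algebra of the induction step, one way to spell it out is sketched below for $k \le T-2$. The shorthand symbols $D_k$, $r$ and $c$ are introduced here for brevity only and are not part of the original notation.

```latex
% Shorthand (assumed, not in the original): D_k, r, c
\[
D_k := b^2 \sum_{i=k}^{T-1} a^{2(T-i-1)}, \qquad
r := x_T - a^{T-k} x, \qquad
c := a^{T-k-1} b, \qquad
D_k = D_{k+1} + c^2 .
\]
% The bracket in (14) is x_T - a^{T-k-1}(ax + bu) = r - c u, so the
% first-order condition of the expression minimised in (14) reads
\[
u - \frac{c\,(r - c u)}{D_{k+1}} = 0
\;\Longrightarrow\;
u \left( D_{k+1} + c^2 \right) = c\, r
\;\Longrightarrow\;
u = \frac{c\, r}{D_k},
\]
% which is (15). Substituting back, r - c u = r D_{k+1} / D_k, hence
\[
\frac{1}{2} u^2 + \frac{1}{2} \frac{(r - c u)^2}{D_{k+1}}
= \frac{1}{2} \frac{r^2}{D_k^2} \left( c^2 + D_{k+1} \right)
= \frac{1}{2} \frac{r^2}{D_k}.
\]
```

The unconstrained minimiser is non-negative exactly when $r \ge 0$, that is when $x_T \ge a^{T-k} x$, so the constraint $u \ge 0$ is inactive under that condition.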

Under the hypotheses of the theorem, the optimal control sequence is given by (7), the corresponding optimal trajectory by

$$x^*(t) = a^t x_0 + b \sum_{j=0}^{t-1} a^{t-1-j} u^*(j), \quad t = 1, 2, \ldots, T-1,$$

and the optimal cost function is represented by (8).
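The feedback form (7) can be checked numerically by simulating the state equation (2) under it. The sketch below is only an illustration under the assumptions of Theorem 1; the names `gramian_tail` and `min_energy_feedback` are hypothetical, and the reachability test uses the closed form of the reachable set for b > 0.

```python
def gramian_tail(a, b, t, T):
    """b^2 * sum_{i=t}^{T-1} a^(2(T-i-1)), the denominator in (9)-(10)."""
    return b * b * sum(a ** (2 * (T - i - 1)) for i in range(t, T))


def min_energy_feedback(a, b, x0, xT, T):
    """Simulate (2) under the feedback law (7).

    Returns (controls, trajectory, cost), where cost is evaluated from (1).
    """
    if b == 0:
        # Not reachable: only the zero control is available.
        return [0.0] * T, [a ** t * x0 for t in range(T + 1)], 0.0
    if xT < a ** T * x0:
        raise ValueError("xT is outside the reachable set R_T(x0)")
    x = float(x0)
    controls, trajectory = [], [x]
    for t in range(T):
        # Feedback law (7), written as in (10) with the current state x.
        u = a ** (T - t - 1) * b * (xT - a ** (T - t) * x) / gramian_tail(a, b, t, T)
        x = a * x + b * u          # state equation (2)
        controls.append(u)
        trajectory.append(x)
    cost = 0.5 * sum(u * u for u in controls)
    return controls, trajectory, cost
```

With a = 0.5, b = 2, x0 = 1, xT = 3 and T = 4, the simulated trajectory ends at xT and the accumulated cost agrees with (8) up to rounding. With a = 0 the sequence collapses to a single non-zero control at the last step, as in the controllable case.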

Remark 1.
1. If $x_T \notin R_T(x_0)$ then the two-point boundary-value problem (2)-(4) is inconsistent and, therefore, the minimum-energy problem with fixed final state (1)-(4) has no solution.
2. If $b > 0$ and $x_T$ belongs to the boundary of $R_T(x_0)$, that is $x_T = a^T x_0$, the optimal control sequence (7) is a zero sequence and the corresponding optimal trajectory becomes $x_0, a x_0, \ldots, a^{T-1} x_0, x_T$.
3. Let the system (2)-(3) be controllable, that is let $a = 0$ and $b > 0$. Then, it is not difficult to see using limits that expression (7) is reduced to

$$u^*(t) = 0 \ \text{for}\ t = 0, 1, 2, \ldots, T-2 \quad \text{and} \quad u^*(T-1) = \frac{x_T}{b},$$

and, consequently, the minimal value of the cost function becomes

$$J_0^*(x) = \frac{1}{2} \frac{x_T^2}{b^2}.$$

4. For $b > 0$ and $t = 0$ expression (7) becomes

$$u^*(0) = \frac{a^{T-1} b \left( x_T - a^{T} x_0 \right)}{b^2 \sum_{i=0}^{T-1} a^{2(T-(i+1))}}.$$

The above expression for $u^*(0)$ for $x_0 = 0$ agrees with the result in [6]. The expression (7), however, is obtained for any non-negative pair $\{x_0, x_T\}$ such that $x_T \in R_T(x_0)$ and, therefore, is more general than that in [6], where (among the other assumptions) the reachability of PDLS and a zero initial state are required. The expression

$$b^2 \sum_{i=0}^{T-1} a^{2(T-(i+1))} = (b, ab, a^2 b, \ldots, a^{T-1} b) \cdot (b, ab, a^2 b, \ldots, a^{T-1} b)^{T} \ge 0$$

is the gramian of PDLS (2)-(3).
5. The optimal control law (7) for $b > 0$ can be treated as a feedback control since it depends on the current state. As a matter of fact, it can also be represented as an open-loop control that depends on the initial and final states, as the corollary below shows.

Corollary 1. Under the assumptions of Theorem 1, the optimal control can be represented as an open-loop control, namely

$$u^*(t) = \begin{cases} \dfrac{a^{T-(t+1)} \left( x_T - a^{T} x_0 \right) \left( 1 - a^2 \right)}{b \left( 1 - a^{2T} \right)}, & b > 0 \\[2ex] 0, & b = 0 \end{cases} \qquad t = 0, 1, 2, \ldots, T-1. \tag{16}$$

The optimal trajectory corresponding to (16), then, becomes

$$x^*(t) = \begin{cases} a^t x_0 + \dfrac{a^{T-t} \left( x_T - a^{T} x_0 \right) \left( 1 - a^{2t} \right)}{1 - a^{2T}}, & b > 0 \\[2ex] a^t x_0, & b = 0 \end{cases} \qquad t = 0, 1, 2, \ldots, T, \tag{17}$$

and the cost function (1) is given by

$$J_0^*(x) = \begin{cases} \dfrac{1}{2} \dfrac{\left( x_T - a^{T} x_0 \right)^2 \left( 1 - a^2 \right)}{b^2 \left( 1 - a^{2T} \right)}, & b > 0 \\[2ex] 0, & b = 0. \end{cases} \tag{18}$$
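The open-loop form needs only the terminal pair. A minimal sketch is given below; the function name `min_energy_open_loop` is hypothetical. The closed-form ratio (1 - a^2)/(1 - a^(2T)) is undefined at a = 1, so the sketch treats that value separately by using the plain geometric sum, which equals T there; this special case is an assumption of the sketch, not a statement taken from the corollary.

```python
def min_energy_open_loop(a, b, x0, xT, T):
    """Open-loop controls in the spirit of (16) and the cost (18).

    Returns (controls, cost). The sum s = sum_{i=0}^{T-1} a^(2i) equals
    (1 - a^(2T)) / (1 - a^2) for a != 1 and T for a == 1 (assumed case).
    """
    if b == 0:
        return [0.0] * T, 0.0
    s = float(T) if a == 1 else (1 - a ** (2 * T)) / (1 - a * a)
    gap = xT - a ** T * x0              # distance from the free response
    if gap < 0:
        raise ValueError("xT is outside the reachable set R_T(x0)")
    controls = [a ** (T - t - 1) * gap / (b * s) for t in range(T)]
    cost = 0.5 * gap * gap / (b * b * s)
    return controls, cost
```

Feeding the returned controls through the state equation (2) should reproduce the final state xT, and successive controls differ by the factor 1/a when a > 0.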

Remark 2.
1. The expressions (16) and (17) clearly tell us that the optimal control sequence and the optimal trajectory are non-negative.
2. It is easy to see from (17) that the optimal trajectory ends at the desired final state, that is $x^*(T) = x_T$.
3. The optimal control sequence (16) is easy to calculate since $u^*(t+1) = u^*(t)/a$.
4. If the initial state $x_0$ is zero, expressions (16), (17) and (18) become even simpler:

$$u^*(t) = \frac{a^{T-(t+1)} \left( 1 - a^2 \right) x_T}{b \left( 1 - a^{2T} \right)}, \quad t = 0, 1, 2, \ldots, T-1,$$

$$x^*(t) = \frac{a^{T-t} \left( 1 - a^{2t} \right) x_T}{1 - a^{2T}}, \quad t = 0, 1, 2, \ldots, T,$$

and

$$J_0^*(x) = \frac{1}{2} \frac{\left( 1 - a^2 \right) x_T^2}{b^2 \left( 1 - a^{2T} \right)}.$$

The meaning of the above expressions is transparent.
5. It is worth noting that the expression (16) for the optimal control sequence and the expression (18) for the minimal value of the cost function are the same as those when no restrictions are imposed on controls. This is because the minimum of $J_0$ as a function of $u(0), u(1), \ldots, u(T-1)$ is achieved at an interior point $u^*(k) > 0$, $k = 0, 1, 2, \ldots, T-1$, for $b > 0$ and $x_T \in R_T(x_0)$.

4 Concluding Remarks

Using the dynamic programming approach, an analytic solution of the minimum-energy problem for positive discrete-time linear systems with any pair of fixed terminal states and scalar controls is obtained and analysed in the paper. The relationship between the problem and the geometric properties of the system is revealed and exploited. The optimal control sequence is represented in two different (equivalent) forms: a feedback form and an open-loop form. The minimum-energy problem has a trivial (zero) solution if the positive discrete-time linear system does not possess the reachability property. It does not have a solution if the final state does not belong to the $T$-step reachable set $R_T(x_0)$. The optimal solution becomes quite simple if the system is controllable or the initial state is zero. We discuss the minimum-energy problem for positive discrete-time linear systems with vector controls and any fixed terminal pair of states in another paper.

References

1. Bertsekas, D.P.: Dynamic Programming and Optimal Control, 3rd edn. Athena Scientific, Belmont (2005)
2. Caccetta, L., Rumchev, V.: A survey of reachability and controllability for positive linear systems. Annals of Operations Research 98, 101–122 (2000)
3. Facchinei, F., Pang, J.-S.: Finite-Dimensional Variational Inequalities and Complementarity Problems, vol. 1. Springer, New York (2003)
4. Heemels, W., Van Eijndhoven, S., Stoorvogel, A.: A linear quadratic regulator problem with positive controls. International Journal of Control 70(4), 551–578 (1998)
5. Hu, Y., Zhou, X.Y.: Constrained stochastic LQ-control with random coefficients and application to portfolio selection. SIAM Journal on Control and Optimization 44(2), 441–466 (2005)
6. Kaczorek, T.: Positive 1D and 2D Systems. Springer, London (2002)
7. Laabissi, M., Winkin, J., Beauthier, C.: On the positive LQ-problem for linear continuous-time systems. In: Commault, C., Marchand, N. (eds.) Positive Systems. LNCIS, vol. 341, pp. 295–302. Springer, Heidelberg (2006)
8. Lewis, F., Syrmos, V.: Optimal Control, 2nd edn. John Wiley & Sons, New York (1995)
9. Pachter, M.: The linear-quadratic optimal control problem with positive controllers. International Journal of Control 32(4), 589–608 (1980)
10. Rumchev, V., James, G.: Controllability of positive linear discrete-time systems. International Journal of Control 50(3), 845–857 (1989)

A Rollout Algorithm for Multichain Markov Decision Processes with Average Cost

Tao Sun, Qianchuan Zhao and Peter B. Luh

Abstract. Many simulation-based learning algorithms have been developed to obtain near-optimal policies for Markov decision processes (MDPs) with large state spaces. However, most of them are for unichain problems. Since some applications involve multichain processes and it is NP-hard to determine whether an MDP is unichain or not, it is desirable to obtain an algorithm that is applicable to multichain problems as well. This paper presents a rollout algorithm for multichain MDPs with average cost. A preliminary analysis of the estimation error and of the parameter settings is provided based on the problem structure, i.e., the mixing time of the transition matrix. Ordinal optimization and Optimal Computing Budget Allocation are also suggested to improve the efficiency of the algorithm.

1 Introduction

This paper deals with Markov decision processes (MDPs) with the average cost criterion. The entire MDP theory is based on solving the so-called optimality equation, which describes the optimality principle [15]. There are two classical methods for solving the optimality equation of an MDP, namely policy iteration and value iteration ([3] and [15]). They are both iterative methods based on solving a series of linear equations. When the state space of the problem is large, neither policy iteration nor value iteration can be used to obtain exact solutions, for two reasons: first, the system of linear equations involved is too large to be solved; second, even the solutions (either a policy or a value function) cannot be stored. In practice, approximations are often used instead, and simulation is one way to estimate the policy performance.

Tao Sun
Center for Intelligent and Networked Systems (CFINS), Department of Automation and TNLIST Lab, Tsinghua University, Beijing 100084, China and China Mobile Research Institute, Beijing, 100053, China, e-mail: [email protected]

Qianchuan Zhao (Corresponding author. Tel. +8610-62783612, Fax +8610-62796115)
Center for Intelligent and Networked Systems (CFINS), Department of Automation and TNLIST Lab, Tsinghua University, Beijing 100084, China, e-mail: [email protected]

Peter B. Luh
Center for Intelligent and Networked Systems (CFINS), Department of Automation and TNLIST Lab, Tsinghua University, Beijing 100084, China and Department of Electrical and Computer Engineering, University of Connecticut, Storrs, CT 06269-2157 USA, e-mail: [email protected]

R. Bru and S. Romero-Vivó (Eds.): Positive Systems, LNCIS 389, pp. 151–162. springerlink.com © Springer-Verlag Berlin Heidelberg 2009

In this paper, we focus on how to do policy iteration approximately by simulation. The policy iteration method has two steps: policy evaluation and policy improvement. When simulation is used at the policy evaluation step, the approximate methods are known as simulation-based policy iteration or on-line policy iteration (see, e.g., [11]). When storing a policy is impossible, one can only perform policy evaluation and policy improvement for the states of a sample path. Such methods are known as rollout methods ([4] and [9]).

The rollout algorithm was first developed for finite-horizon total cost MDPs in [4]. The simulation runs to the end of the time horizon to evaluate the goodness of an action. For average cost MDPs, the time horizon is infinite, so a simulation cannot look ahead to the end of the time horizon as it does for total cost MDPs. Moreover, the optimization for an average cost MDP should consider the underlying structure of the transition probability matrix, i.e., the number of recurrent classes in the state space. An MDP is said to be unichain if there is only one recurrent class in the state space under any policy. If there is some policy that leads to more than one recurrent class, the problem is called a multichain MDP [15]. Multichain MDPs appear in inventory control problems [15] and maintenance problems [16]. Compared with unichain MDPs, the optimization for multichain MDPs should select recurrent classes whose average costs are small. Moreover, the problem of determining whether an MDP is unichain or not is NP-hard [19]. Therefore, it is important to obtain an algorithm applicable to multichain MDPs. However, most of the existing learning or simulation-based online optimization algorithms (see, e.g., [7], [8], [11] and [17]) are developed for unichain models. Only a few algorithms proposed in recent years consider multichain cases (see, e.g., [1], [14]).

In this paper a rollout algorithm which aims to address large multichain MDPs is developed. In Section 2, the formulation of the average cost MDP is introduced and the policy iteration for solving multichain MDPs is presented. A multichain rollout algorithm is developed in Section 3 and its properties are presented. In Section 4, the estimation error of the rollout algorithm caused by the finite look-ahead horizon of simulation is analyzed through the mixing time, which reflects the structure of the problem. Some techniques are discussed to improve the efficiency of the algorithm. Section 5 concludes this paper with some remarks.

2 Formulation of Markov Decision Processes

In this section, we first introduce the average cost MDP. Then, the policy iteration method is presented to solve a general multichain MDP. The method is the basis for the multichain rollout algorithm to be developed in Section 3.

2.1 Average Cost Markov Decision Processes

We consider a discrete-time Markov chain, $X = \{X_k, k = 0, 1, \ldots\}$, with a finite state space $\mathcal{S} = \{1, \ldots, |\mathcal{S}|\}$, where $|\cdot|$ denotes the set cardinality. Let $\mathcal{A}$ be a finite set of actions and let $\mathcal{A}(i)$ stand for all feasible actions for the state $i$. We consider the set of stationary policies denoted by $\mathcal{E}$. A policy $\mu \in \mathcal{E}$ is a time-independent mapping $\mu : \mathcal{S} \to \mathcal{A}$. Under policy $\mu$, the action $\mu(i) \in \mathcal{A}(i)$ taken for state $i$ leads to the state transition probability from $i$ to $j$ described by $p_{\mu(i)}(i, j)$, $j = 1, 2, \ldots, |\mathcal{S}|$. The Markov chain evolves following the transition probability matrix $P_\mu$, with $[P_\mu]_{i,j} = p_{\mu(i)}(i, j)$.

We consider the average cost over an infinite time horizon. Let $\eta_\mu$ be a vector which stands for the performance of the policy $\mu$. The element $\eta_\mu(i)$ is the expected average cost starting from the state $i$:

$$\eta_\mu(i) = \lim_{K \to \infty} \frac{1}{K} E\left\{ \sum_{k=0}^{K-1} f_\mu(X_k, \mu(X_k)) \right\}, \tag{1}$$

where $X_0 = i$, $i \in \mathcal{S}$. The optimization problem is to find an optimal policy $\mu^*$ that minimizes the average cost for any initial state, i.e.,

$$\mu^* = \arg\min_{\mu \in \mathcal{E}} \left\{ \eta_\mu \right\}. \tag{2}$$

For a transition matrix $P$, the limiting matrix is defined as:

$$P^* = \lim_{K \to \infty} \frac{1}{K} \sum_{k=0}^{K-1} P^k. \tag{3}$$

By the definition of $P^*$ in (3), the average cost can be calculated as follows [3]:

$$\eta_\mu = \lim_{K \to \infty} \frac{1}{K} \sum_{k=0}^{K-1} P_\mu^k f_\mu = P_\mu^* f_\mu. \tag{4}$$

Moreover, it can be obtained that [3, 15]:

$$\eta_\mu = P_\mu \eta_\mu. \tag{5}$$
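To make the state dependence of the average cost in a multichain model concrete, the sketch below approximates the limiting matrix (3) by a finite Cesaro average and then evaluates (4). It is only an illustration: the function names, the truncation length `K` and the three-state example are assumptions, and a finite average carries an error of order 1/K.

```python
def mat_mul(A, B):
    """Plain-Python product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][m] * B[m][j] for m in range(n)) for j in range(n)]
            for i in range(n)]


def limiting_matrix(P, K=5000):
    """Approximate P* = lim (1/K) sum_{k=0}^{K-1} P^k by a finite average."""
    n = len(P)
    power = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]  # P^0
    total = [[0.0] * n for _ in range(n)]
    for _ in range(K):
        for i in range(n):
            for j in range(n):
                total[i][j] += power[i][j]
        power = mat_mul(power, P)
    return [[v / K for v in row] for row in total]


def average_cost(P, f, K=5000):
    """Approximate eta = P* f as in (4)."""
    P_star = limiting_matrix(P, K)
    return [sum(p * c for p, c in zip(row, f)) for row in P_star]


# Hypothetical multichain example: states 0 and 1 are absorbing (two
# recurrent classes), state 2 is transient.
P_example = [[1.0, 0.0, 0.0],
             [0.0, 1.0, 0.0],
             [0.3, 0.2, 0.5]]
eta_example = average_cost(P_example, [1.0, 3.0, 10.0])
```

In this example the average cost differs across initial states (about 1, 3 and 1.8), which is exactly why a vector is needed in the multichain case.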

2.2 Policy Iteration for Multichain Markov Decision Processes

Policy iteration is a classical method to solve an average cost MDP. The method starts from an initial policy and results in an optimal policy by iteratively conducting policy evaluation and policy improvement steps. In the following, we first present policy iteration for multichain MDPs. For unichain MDPs, the policy iteration can be simplified, as pointed out afterwards.

Before we proceed, let us define a function $g_\mu$ of the states and the policy $\mu$ which is known as the "bias" [3, 15]. Let $P_\mu$ be the state transition matrix of a policy $\mu$ and let the cost vector be $f_\mu$; then the bias under the policy is defined as:

$$g_\mu = \lim_{K \to \infty} g_\mu^K. \tag{6}$$

In the above,

$$g_\mu^K \equiv \sum_{k=0}^{K-1} \left( P_\mu^k - P_\mu^* \right) f_\mu. \tag{7}$$

The difference $g(i) - g(j)$ between the bias of the two states $i, j$ is the difference between the total costs starting from state $i$ instead of $j$. Therefore, bias is also called "relative value" [15]. We are ready to introduce the steps of the policy iteration algorithm for multichain cases [3].

Policy Evaluation is to obtain the average cost $\eta_\mu$ and the bias $g_\mu$ of a policy $\mu$ by solving a set of linear equations for $(\eta, g, u)$ as follows:

$$\eta = P_\mu \eta, \tag{8}$$
$$\eta + g = f_\mu + P_\mu g, \tag{9}$$
$$g + u = P_\mu u. \tag{10}$$

Policy Improvement updates a policy $\mu^n$ based on the policy evaluation step. According to the average cost, a new policy $\mu^{n+1}$ will make the states evolve among the recurrent classes with small average costs. Namely, if $\min_{\mu \in \mathcal{E}} \{ P_\mu \eta_{\mu^n} \} \ne \eta_{\mu^n}$, let

$$\mu^{n+1} = \arg\min_{\mu \in \mathcal{E}} \left\{ P_\mu \eta_{\mu^n} \right\}. \tag{11}$$

Otherwise, let

$$\mu^{n+1} = \arg\min_{\mu \in \bar{\mathcal{E}}} \left\{ f_\mu + P_\mu g_{\mu^n} \right\}. \tag{12}$$

In (12), $\bar{\mathcal{E}}$ is the set of policies which satisfy the equation $\min_{\mu \in \mathcal{E}} \{ P_\mu \eta_{\mu^n} \} = \eta_{\mu^n}$.

It can be seen that the policy improvement step for the multichain case is conducted in an embedded manner. Firstly, select the recurrent class by (11). Then, if the policy is not changed, the action update is conducted according to (12).

For unichain problems, both the policy evaluation and the policy improvement steps are simplified. This is caused by the fact that the average cost is independent of states when the problem is unichain. Namely, for unichain problems, $\eta$ can be represented by a scalar instead of by a vector as for multichain problems. Therefore, in the policy evaluation step, there is no need to solve (8) and (10), which are used to characterize the vector $\eta$. Moreover, in the policy improvement step, the optimization in (11) is not necessary because it is satisfied automatically.
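As a small numerical companion to the definition of the bias, the sketch below accumulates the truncated sum in (7) for a given transition matrix and a given limiting matrix. It is an illustration only: the function name `truncated_bias`, the truncation length and the two-state chain are assumptions, and the limiting matrix is supplied by the caller rather than computed.

```python
def truncated_bias(P, P_star, f, K=200):
    """Return (eta, g_K) with eta = P* f and g_K = sum_{k<K} (P^k - P*) f."""
    n = len(P)
    eta = [sum(P_star[i][j] * f[j] for j in range(n)) for i in range(n)]
    power = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]  # P^0
    g = [0.0] * n
    for _ in range(K):
        for i in range(n):
            # (P^k f)_i - (P* f)_i is the i-th entry of (P^k - P*) f
            g[i] += sum(power[i][j] * f[j] for j in range(n)) - eta[i]
        power = [[sum(power[i][m] * P[m][j] for m in range(n)) for j in range(n)]
                 for i in range(n)]
    return eta, g


# Hypothetical ergodic two-state chain with stationary distribution (2/3, 1/3).
P_demo = [[0.9, 0.1], [0.2, 0.8]]
P_star_demo = [[2 / 3, 1 / 3], [2 / 3, 1 / 3]]
eta_demo, g_demo = truncated_bias(P_demo, P_star_demo, [1.0, 4.0])
```

For this chain the truncated bias is close to (-10/3, 20/3) and, together with the average cost 2, it satisfies the evaluation equation (9) up to the truncation error.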

3 A Rollout Algorithm for Multichain Markov Decision Processes

In this section, a rollout algorithm for multichain MDPs is developed. There are two key differences compared with traditional policy iteration. Firstly, the policy evaluation step for a policy is performed by simulation, which avoids the difficulty of solving linear equations. Secondly, the algorithm is performed online and only optimizes actions for the states encountered on the sample path. In contrast, traditional policy iteration updates actions for all the states at each iteration. Since the rollout algorithm runs simulations based on an existing policy $\mu$ for improvement, such a policy $\mu$ is called the "base policy." In what follows, the state transition matrix $P$, the average cost $\eta$, the bias $g$ and the cost vector $f$, together with the Q-factor defined below, are all related to the policy $\mu$. To simplify the notation, $\mu$ will be omitted when there is no ambiguity.

3.1 A Rollout Algorithm for Multichain Models

The policy improvement equations (11), (12) in the policy iteration process can be conducted state-wise for multichain problems [15]. Namely, if the action for a state cannot be updated by equation (11), one can turn to (12) for the state to achieve a possible improvement. This procedure is described below.

State-wise Policy Improvement. For a state $i$, select the action which can make the system run under a recurrent class with the lowest average cost. Let the row vector $p_{i,a}$ be the state distribution when taking an action $a$ at a state $i$. Accordingly, the solution set of the optimization equation (11) for state $i$ is:

$$\bar{\mathcal{A}}(i) = \left\{ a \;\middle|\; a \in \arg\min_{a \in \mathcal{A}(i)} \{ p_{i,a} \eta \} \right\}. \tag{13}$$

Let $a_\mu(i)$ be the action taken by the policy $\mu$ at the state $i$. If $a_\mu(i) \notin \bar{\mathcal{A}}(i)$, then define an action $\hat{a}(i)$ such that $\hat{a}(i) \in \bar{\mathcal{A}}(i)$. Otherwise, perform the optimization by equation (12) by choosing an action

$$\hat{a}(i) \in \arg\min_{a \in \bar{\mathcal{A}}(i)} \left\{ f(i, a) + p_{i,a} g \right\}. \tag{14}$$

By the definition of $g_\mu$ in (6), the Q-factor is defined as:

$$Q(i, a) = \lim_{K \to \infty} \left\{ f(i, a) + p_{i,a} \sum_{k=0}^{K-1} \left( P^k - P^* \right) f \right\}. \tag{15}$$

For a state, equation (14) chooses the smallest Q-factor. For unichain MDPs, the elements of the average cost vector $\eta$ are the same. Therefore, $\bar{\mathcal{A}}(i) = \mathcal{A}(i)$ and the policy improvement step is only performed based on (14). For multichain problems, optimizing actions for states is performed within the set $\bar{\mathcal{A}}_\mu(i)$ obtained by (13). For every pair of actions $a_1$, $a_2$ within the set $\bar{\mathcal{A}}_\mu(i)$, we have $(p_{i,a_1} - p_{i,a_2}) \eta = 0$. Therefore,

$$
\begin{aligned}
& Q(i, a_1) - Q(i, a_2) \\
&= \lim_{K \to \infty} \left\{ f(i, a_1) + p_{i,a_1} \sum_{k=0}^{K-1} \left( P^k - P^* \right) f - f(i, a_2) - p_{i,a_2} \sum_{k=0}^{K-1} \left( P^k - P^* \right) f \right\} \\
&= \lim_{K \to \infty} \left\{ f(i, a_1) - f(i, a_2) + (p_{i,a_1} - p_{i,a_2}) \sum_{k=0}^{K-1} P^k f - (p_{i,a_1} - p_{i,a_2}) \sum_{k=0}^{K-1} P^* f \right\} \\
&= \lim_{K \to \infty} \left\{ f(i, a_1) - f(i, a_2) + (p_{i,a_1} - p_{i,a_2}) \sum_{k=0}^{K-1} P^k f - K (p_{i,a_1} - p_{i,a_2}) \eta \right\} \\
&= \lim_{K \to \infty} \left\{ f(i, a_1) - f(i, a_2) + (p_{i,a_1} - p_{i,a_2}) \sum_{k=0}^{K-1} P^k f \right\}
\end{aligned} \tag{16}
$$

$$= Q_K(i, a_1) - Q_K(i, a_2) + \Gamma(K, a_1, a_2). \tag{17}$$

In (17),

$$Q_K(i, a) = f(i, a) + p_{i,a} \sum_{k=0}^{K-1} P^k f. \tag{18}$$

The third term $\Gamma(K, a_1, a_2)$ in (17) converges to 0 when $K \to \infty$. This is because

$$\Gamma(K, a_1, a_2) = (p_{i,a_1} - p_{i,a_2}) \sum_{k=K}^{\infty} P^k f = (p_{i,a_1} - p_{i,a_2}) \sum_{k=K}^{\infty} \left( P^k - P^* \right) f.$$

In the above, $\sum_{k=K}^{\infty} (P^k - P^*) f$ converges to 0 when $K \to \infty$ (p. 339 in [15]).

Based on the above analysis, we can get an online rollout algorithm for multichain MDPs. Let $\mu$ be the base policy. For a state $i$ encountered on a sample path at any time, the rollout algorithm chooses actions according to the following algorithm:

Algorithm 1. The Rollout Algorithm for Multichain MDPs

1) Select recurrent classes with small average costs. Estimate the set $\bar{\mathcal{A}}(i)$ defined in (13) by $\hat{\mathcal{A}}(i)$, which is obtained based on an estimation of $p_{i,a} \eta$ as:

$$\hat{\mathcal{A}}(i) = \left\{ a \;\middle|\; a \in \arg\min_{a \in \mathcal{A}(i)} \left\{ \frac{1}{W} \frac{1}{K} \bar{Q}_{K,W}(i, a) \right\} \right\}, \tag{19}$$

where $\bar{Q}_{K,W}(i, a)$ is computed by simulation:

$$\bar{Q}_{K,W}(i, a) = \left. \sum_{w=1}^{W} \sum_{k=1}^{K} f_\mu\big( i_k(\xi^w), \mu(i_k(\xi^w)) \big) \;\right|\; i_0 = i,\ a(i_0) = a. \tag{20}$$

If $a_\mu(i) \notin \hat{\mathcal{A}}_\mu(i)$, choose an action $\hat{a}(i)$ for the state $i$ by: $\hat{a}(i) \in \hat{\mathcal{A}}_\mu(i)$. Otherwise, execute step 2). The symbol $\xi^w$ stands for a realization of uncertainties on a simulation run and $i_k(\xi^w)$ is the state at time $k$ under the realization $\xi^w$.

2) Obtain the estimation $\bar{Q}_K(i, a)$ of $Q_K(i, a)$ as follows:

$$\bar{Q}_K(i, a) \equiv f(i, a) + \frac{1}{W} \bar{Q}_{K,W}(i, a),$$

and determine an action $\hat{a}(i)$ for the state $i$ such that

$$\hat{a}(i) \in \arg\min_{a \in \hat{\mathcal{A}}(i)} \left\{ \bar{Q}_K(i, a) \right\}. \tag{21}$$

In Algorithm 1, W is the number of simulation replications and K is the length of look ahead at each simulation.
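The two steps of Algorithm 1 can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the callback interface (`actions`, `base_policy`, `step`, `cost`), the function name `rollout_action`, the tie tolerance `tol` and the seeding scheme are all assumptions made for the sketch. In particular, estimated averages are rarely exactly equal, so the sketch treats actions within `tol` of the smallest estimate as members of the estimated set in (19).

```python
import random


def rollout_action(i, actions, base_policy, step, cost, K, W, seed=0, tol=1e-9):
    """One rollout decision at state i in the spirit of Algorithm 1.

    actions(s)      -> list of feasible actions at state s        (hypothetical)
    base_policy(s)  -> action of the base policy at state s       (hypothetical)
    step(s, a, rng) -> sampled next state                         (hypothetical)
    cost(s, a)      -> one-step cost f(s, a)                      (hypothetical)
    """
    q_kw = {}
    for a in actions(i):
        total = 0.0
        for w in range(W):
            # Same seed for every action: common random numbers.
            rng = random.Random(seed + w)
            state = step(i, a, rng)              # i_1, after taking a at i_0 = i
            for _ in range(K):                   # costs at i_1, ..., i_K as in (20)
                act = base_policy(state)
                total += cost(state, act)
                state = step(state, act, rng)
        q_kw[a] = total
    # Step 1: estimated set of actions with the smallest average cost, cf. (19).
    avg = {a: q / (W * K) for a, q in q_kw.items()}
    best = min(avg.values())
    candidates = [a for a in avg if avg[a] <= best + tol]
    if base_policy(i) not in candidates:
        return candidates[0]
    # Step 2: smallest estimated truncated Q-factor among the candidates, cf. (21).
    q_k = {a: cost(i, a) + q_kw[a] / W for a in candidates}
    return min(q_k, key=q_k.get)
```

Only sampled transitions and observed costs are used, so the transition probabilities never need to be written down explicitly. Reusing the seeds across actions is one simple way to apply common random numbers when comparing actions.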

3.2 Properties of the Multichain Rollout Algorithm

Equation (20) calculates the cost of $W$ simulation runs; the cost of each simulation is the total cost within the time horizon $K$. Each simulation starts from the state $i$ and employs the action $a$. Moreover, for the states encountered in the subsequent steps 1 to $K$, the action is chosen according to the base policy $\mu$. As a consequence of the law of large numbers, we have the following theorem.

Theorem 2. For the simulation-based multichain rollout algorithm (Algorithm 1), the following results hold:

1) When $W \to \infty$, $\bar{Q}_K(i, a) \to Q_K(i, a)$.
2) When $K \to \infty$, $W \to \infty$, $\frac{1}{W} \frac{1}{K} \bar{Q}_{K,W}(i, a) \to p_{i,a} \eta$. Therefore, $\hat{\mathcal{A}}(s) \to \bar{\mathcal{A}}(s)$.
3) For the comparison of actions $a_1, a_2 \in \bar{\mathcal{A}}(s)$, $\bar{Q}_K(i, a_1) - \bar{Q}_K(i, a_2) \to Q(i, a_1) - Q(i, a_2)$ when $K \to \infty$, $W \to \infty$.

It can be seen that Algorithm 1 is essentially one step of simulation-based policy iteration. Further comments on the multichain rollout algorithm are summarized below.

1. Compared with policy iteration, which aims at optimal policies, the algorithm reflects the idea of "goal softening": only evaluate and improve the states of interest; only conduct a single step of policy iteration. The length of look-ahead $K$ is truncated and is not infinite as in the definition (15). The sample mean can be estimated by a number of replications. To reduce the comparison variances, "common random numbers" (using the same $\xi^w$) may be applied.
2. The algorithm has the advantage of dealing with problems for which the underlying model is not completely known. By Algorithm 1, it can be seen that the algorithm only calculates the cost and does not need to know the explicit information of the problem, i.e., the state transition probabilities. This is the common advantage of learning algorithms.
3. Although Algorithm 1 is developed for multichain problems, the computation effort is similar to unichain cases. This is because the calculation in step 1 can be directly used in step 2 and will not need additional simulation effort. For unichain average cost problems, the average cost is independent of initial states. Therefore, Algorithm 1 will be simplified and only performs step 2.

4 Analysis of Rollout Algorithms

Rollout algorithms estimate means by conducting $W$ simulations, each of which looks ahead $K$ time steps. Therefore, the length of look-ahead and the number of replications are two aspects of the computational load. They are similar to the classical "Exploration and Exploitation" [14] trade-off in optimization. In this section, we analyze these two parameters and investigate how to improve the efficiency of simulations.

4.1 The Effect of Look Ahead

The issue of choosing the length of look-ahead exists in many learning algorithms. It is shown in [7] that, when the difference between two states is compared, the length of look-ahead should be comparable with the mean arrival time between them. Another way is to use a "regenerative point" ([6] and [11]), setting a reference state which can be reached from any state. The bias is estimated based on the sample paths from the state of interest to the reference state. However, the first arrival time between two states may be quite long for large problems. Moreover, there may be no information helping us to find such a reference state.

Investigating the look-ahead length is important to identify the estimation error caused by the truncation. In addition, it helps us to know the goodness of the near-optimal policies and provides guidance on the allocation of the computational load. Related results include applying stationary policies to total cost MDPs (see, e.g., [2] and [12]) and applying policies obtained under total cost to average cost criteria ([8] and [12]).

Step 1 in Algorithm 1 is executed only for transient states, while step 2 is executed for both unichain and multichain MDPs. Therefore, we analyze how the look-ahead length K affects the comparison of Q-factors in step 2. Here, we give some preliminary results obtained for unichain cases. When the Markov chain under a stationary policy is ergodic, there is a single stationary distribution π and $\lim_{k\to\infty} P^k = P^*$. Starting from state i, after k time units the state distribution differs from the stationary distribution by

$$\Delta_i(k) = \frac{1}{2} \sum_{j \in S} \left| p^k_{i,j} - \pi(j) \right|, \tag{22}$$

where $p^k_{i,j}$ is the probability that state i transits to state j in k steps. The “mixing time” [5] is used to measure the speed of a Markov chain’s convergence and is defined as

$$T_{mix} = \max_i \min\{k : \Delta_i(k) < \varepsilon\}, \quad i \in S. \tag{23}$$

In the above, ε is a small positive number. For the truncation error of Q-factors, we have the following result related to the mixing time.

Theorem 3. Let K be the look-ahead length. The estimation error of a Q-factor can be bounded as follows:

$$|Q(s,a) - Q_K(s,a)| \le \varepsilon \|g\|, \quad \text{if } T_{mix} \le K. \tag{24}$$

Here $\|g\|$ is the sup-norm of g.

Proof. By the definition of g in (6), the following holds:

$$g - g_K = \sum_{k=K}^{\infty} (P^k - P^*) f = P^K \sum_{k=0}^{\infty} (P^k - P^*) f = P^K g = (P^K - P^*) g.$$

In the above, the last equality holds because $P^* g = 0$. For the rollout algorithm:

$$|Q(i,a) - Q_K(i,a)| = \Big| \sum_{j \in S} p(j|i,a) \, [g(j) - g_K(j)] \Big| = \Big| \sum_{j \in S} p(j|i,a) \sum_{i \in S} \big( p^K_{j,i} - \pi(i) \big) g(i) \Big| \le \varepsilon \|g\|.$$

Fast mixing means that the mixing time grows polynomially in the logarithm of the number of states [5]. If we know that the state transition matrix of a problem has the fast-mixing property, we may choose a proper K for a small problem. When the problem scale increases, we only need to increase K polynomially.

The topology of the transitions between states generally exhibits the “small world” phenomenon, which induces fast mixing [18]. It is known that an aperiodic Markov chain converges to its stationary distribution exponentially fast [15], and most of the difference is eliminated in the first few transitions. All of this suggests that a large K may not be needed.
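As a small numerical illustration of definitions (22) and (23), the sketch below computes $\Delta_i(k)$ and $T_{mix}$ for a given transition matrix. The two-state chain at the end is a hypothetical example, and the stationary distribution is obtained by plain power iteration, which assumes an ergodic chain.

```python
def mat_mul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][l] * B[l][j] for l in range(n)) for j in range(n)]
            for i in range(n)]


def stationary_distribution(P, iterations=1000):
    """Approximate pi = pi P by power iteration (assumes an ergodic chain)."""
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(iterations):
        pi = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
    return pi


def total_variation(row, pi):
    """Delta_i(k) of (22) for one row of P^k."""
    return 0.5 * sum(abs(p - q) for p, q in zip(row, pi))


def mixing_time(P, eps, k_max=10_000):
    """T_mix of (23): the largest, over start states i, of the first k
    with Delta_i(k) < eps."""
    n = len(P)
    pi = stationary_distribution(P)
    Pk = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]  # P^0
    first_hit = [None] * n
    for k in range(k_max + 1):
        for i in range(n):
            if first_hit[i] is None and total_variation(Pk[i], pi) < eps:
                first_hit[i] = k
        if all(h is not None for h in first_hit):
            return max(first_hit)
        Pk = mat_mul(Pk, P)
    raise ValueError("chain did not mix within k_max steps")


# Hypothetical two-state chain with second eigenvalue 0.5.
P_EXAMPLE = [[0.7, 0.3], [0.2, 0.8]]
T_MIX_EXAMPLE = mixing_time(P_EXAMPLE, eps=0.01)
```

For a two-state chain the distance decays geometrically with the second eigenvalue, so halving ε adds only about one step to the mixing time. This is the behaviour the paragraph above appeals to when it argues that a moderate K may suffice.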

4.2 Conducting Simulations Efficiently

The simulation of stochastic systems has been well investigated, which makes it possible to use existing techniques to improve the efficiency of simulations for rollout algorithms. Here, we present some techniques that were introduced in [16] for this purpose.

The rollout algorithm performs Monte Carlo simulations to estimate Q-factors. The estimation accuracy (confidence interval) improves slowly, no faster than $O(1/\sqrt{W})$ ([10] and [13]). However, from equation (16) it can be seen that the key of the algorithm is to determine the correct ranking of the Q-factors and select the best action. From another point of view, rollout algorithms use a crude model to estimate the real ranks of Q-factors. This is because the base policy is obtained by heuristics and is not optimal, whereas an “accurate model” would employ an optimal policy as the base policy. Therefore, both the rank comparisons and the crude model are consistent with the idea of “ordinal optimization” [13]. Ordinal optimization tells us that determining the “rank” is much easier than determining the “value”. Compared with the slow convergence of values, ranks converge exponentially [13]. This is intuitively reasonable, since it is easier to tell which of two quantities is larger than to tell by how much. By incorporating ordinal optimization, our idea is that there is no need to estimate Q-factors precisely with a large number of simulation samples.

Besides ordinal optimization, “Optimal Computing Budget Allocation” (OCBA) [10] can be used to improve the efficiency of simulation when the rollout algorithm is implemented. Since each Q-factor is a random variable, the most straightforward way is to draw the same number of samples for each of them to estimate their means. This may not be a good choice, because the values of unpromising Q-factors need not be estimated as precisely as those of the good ones. With OCBA, a few simulations (samples) are first conducted for each candidate action to get rough estimates of the means and variances. Then, the number of simulations to be conducted next is determined by these estimated quantities and the total number of simulation runs to be allocated. This procedure is repeated until the entire simulation budget is used. In this way, the actions that achieve small Q-factors are sampled more than the others.

The estimation of Q-factors has intrinsic parallelism. Except for the action to be evaluated, all other factors are the same, e.g., the current state, the simulation model, and the uncertainties generated with common random numbers. Thus, the algorithm can easily be implemented in a “Single Program Multiple Data” manner to speed up the optimization.
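The allocation step of OCBA described above can be sketched as follows. The rule used here is the commonly cited asymptotic OCBA allocation for a minimization problem; it is stated as an assumption and is not taken from this paper, which only describes the sequential procedure. The function name and the sample numbers are hypothetical, and the sketch assumes distinct sample means and positive standard deviations.

```python
import math


def ocba_fractions(means, stds):
    """Fraction of the simulation budget assigned to each candidate action.

    Assumed rule (asymptotic OCBA, minimization): with b the action of
    smallest sample mean and d_i = mean_i - mean_b,
        N_i proportional to (std_i / d_i)^2               for i != b,
        N_b = std_b * sqrt(sum_{i != b} (N_i / std_i)^2).
    """
    n = len(means)
    best = min(range(n), key=means.__getitem__)
    weights = [0.0] * n
    for i in range(n):
        if i != best:
            weights[i] = (stds[i] / (means[i] - means[best])) ** 2
    weights[best] = stds[best] * math.sqrt(
        sum((weights[i] / stds[i]) ** 2 for i in range(n) if i != best)
    )
    total = sum(weights)
    return [w / total for w in weights]


# Rough first-stage estimates for three candidate actions (illustrative).
FRACTIONS = ocba_fractions(means=[1.0, 1.2, 3.0], stds=[1.0, 1.0, 1.0])
```

In this example the current best action and its close competitor receive almost the whole budget, while the clearly inferior third action receives very little, in line with the remark that unpromising Q-factors need not be estimated precisely.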

5 Conclusions

In this paper, a multichain rollout algorithm is developed to address MDPs with large state spaces. It is an online learning-based algorithm for multichain MDPs, for which few learning algorithms exist in the literature. Through the structural property reflected by the mixing time, preliminary results are obtained under unichain conditions for the estimation errors caused by the truncated look-ahead length. Further research will test the algorithm on numerical examples and apply it to real problems.

Acknowledgements. This work was supported by NSFC Grant (60574067, 60736027, 60721003, 60704008) and in part by the Programme of Introducing Talents of Discipline to Universities (National 111 International Collaboration Project) B06002.

References

1. Auer, P., Ortner, R.: Logarithmic online regret bounds for undiscounted reinforcement learning. In: Advances in Neural Information Processing Systems, vol. 19. MIT Press, Cambridge (2007)
2. Bertsekas, D.P.: Dynamic Programming and Optimal Control, 3rd edn., vol. I. Athena Scientific, Belmont (2005)
3. Bertsekas, D.P.: Dynamic Programming and Optimal Control, 3rd edn., vol. II. Athena Scientific, Belmont (2007)
4. Bertsekas, D.P., Castañón, D.A.: Rollout algorithms for stochastic scheduling problems. Journal of Heuristics 5, 89–108 (1999)
5. Boyd, S., Diaconis, P., Xiao, L.: Fastest mixing Markov chain on a graph. SIAM Review 46(4), 667–689 (2004)
6. Cao, X.R.: Stochastic learning and optimization – A Sensitivity based approach. Springer, New York (2007)
7. Cao, X.R., Wan, Y.: Algorithms for sensitivity analysis of Markov systems through potentials and perturbation realization. IEEE Transactions on Control Systems Technology 6(4), 482–494 (1998)
8. Chang, H.S., Marcus, S.I.: Approximate receding horizon approach for Markov decision processes: average reward case. Journal of Mathematical Analysis and Applications 286, 636–651 (2003)
9. Chang, H.S., Givan, R., Chong, E.K.P.: Parallel rollout for online solution of partially observable Markov decision processes. Discrete Event Dynamic Systems: Theory and Applications 14, 309–341 (2004)
10. Chen, C.H., Yücesan, J., Lin, E., Chick, S.E.: Simulation budget allocation for further enhancing the efficiency of ordinal optimization. Discrete Event Dynamic Systems: Theory and Applications 10, 251–270 (2000)
11. Fang, H.T., Cao, X.R.: Potential-based on-line policy iteration algorithms for Markov decision processes. IEEE Transactions on Automatic Control 49, 493–505 (2004)
12. Hernandez-Lerma, O., Lasserre, J.B.: Error bounds for rolling horizon policies in discrete-time Markov control processes. IEEE Transactions on Automatic Control 35, 1118–1124 (1990)
13. Ho, Y.C., Zhao, Q.C., Jia, Q.S.: Ordinal Optimization: Soft Optimization for Hard Problems. Springer, New York (2007)
14. Kearns, M., Singh, S.: Near-optimal reinforcement learning in polynomial time. Machine Learning 49(22), 209–232 (2002)
15. Puterman, M.L.: Markov Decision Process: Discrete Stochastic Dynamic Programming. John Wiley & Sons, New York (1994)
16. Sun, T., Zhao, Q.C., Luh, P.B., Tomastik, R.N.: Optimization of joint replacement policies for multi-part systems by a rollout framework. IEEE Transactions on Automation Science and Engineering 5(4), 609–619 (2008)
17. Sutton, R.S., Barto, A.G.: Reinforcement Learning: An Introduction. MIT Press, Cambridge (1998)
18. Tahbaz-Salehi, A., Jadbabaie, A.: Small World phenomenon, rapidly mixing Markov chains, and average consensus algorithms. In: Proceedings of IEEE Conference on Decision and Control, New Orleans, LA. pp. 276–281 (2007)
19. Tsitsiklis, J.N.: NP-Hardness of checking the unichain condition in average cost MDPs. Operations research letters 35(3), 319–323 (2007)

Analysis of Degenerate Chemical Reaction Networks

Markus Uhr, Hans-Michael Kaltenbach, Carsten Conradi and Jörg Stelling

Abstract. Positivity of states and parameters in dynamic models for chemical reaction networks is exploited by Chemical Reaction Network Theory (CRNT) to predict the potential for multistationarity of ‘regular’ networks without knowledge of parameter values. Especially for biochemical systems, however, CRNT’s large application potential cannot be realized because most realistic networks are degenerate in the sense of CRNT. Here, we show how degenerate networks can be regularized such that the theorems and algorithms of CRNT apply. We employ the method in a case study for a bacterial reaction network of moderate size.

1 Introduction

Chemical and biochemical reaction networks are sets of chemical compounds connected through reactions. The formal analysis of their dynamic properties, for instance in the area of systems biology, remains difficult because network topologies and kinetic parameters are often uncertain or unknown. Hence, there is a general interest in developing formal analysis methods that consider the structure of the induced dynamic system alone [2]. In modeling signal transduction or the cell cycle, for example, the connection between the network structure and the existence of multiple positive steady states (multistationarity) is critical.

Chemical reaction network dynamics can be represented by a system of ordinary differential equations (ODEs). If, as in this paper, all kinetics are of the mass action form, then the right hand sides of the ODEs are polynomials in the concentrations c and rate constants k (bold-face symbols denote vectors and matrices). Importantly, meaningful concentrations and rate constants need to be positive. These positivity constraints and the special structure induced by mass-action kinetics have been exploited to develop the Chemical Reaction Network Theory (CRNT) [3, 4]. It connects network topology and the existence of multiple positive steady states independent of parameter values. CRNT employs a nonnegative integer δ obtained solely from the stoichiometry associated with the network. If, for example, δ = 1 for a network and if that network is ‘regular’ in the sense of CRNT, then the existence of multiple steady state solutions to the polynomial steady state equations requires feasibility of at least one of a (potentially large) number of linear inequality systems. However, realistic networks often have δ > 1, leading to polynomial inequalities.

To circumvent this, we previously suggested a decomposition of the overall network into subnetworks with δ = 1. If the resulting subnetwork is regular, then one can use the deficiency one algorithm to establish multistationarity for this subnetwork. To confirm multistationarity for the overall network, we presented sufficient conditions that allow the extension of multistationarity to the overall network (see [1] and especially [5] for the extension of solutions from the subnetwork to the overall network). Because many realistic networks in biology have only degenerate subnetworks, we here develop approaches for the regularization of degenerate reaction networks. Thus the present contribution extends the ideas of [1] and [5] in the following sense: one can now use the deficiency one algorithm to establish multistationarity for the regularized subnetworks, while it is still possible to use the results of [1] and [5] to extend multistationarity from the (regularized) subnetwork to the overall network.

Markus Uhr, Hans-Michael Kaltenbach and Jörg Stelling
Dept. Biosystems Science & Engineering, ETH Zurich, 8092 Zürich, Switzerland, e-mail: [email protected],[email protected]
Carsten Conradi
Max-Planck-Institute Magdeburg, Magdeburg, Germany, e-mail: [email protected]

R. Bru and S. Romero-Vivó (Eds.): Positive Systems, LNCIS 389, pp. 163–171. springerlink.com © Springer-Verlag Berlin Heidelberg 2009

2 Chemical Reaction Network Theory

A reaction network in CRNT consists of a set S of chemical species and a set R of reactions, each mapping a multiset of species to another multiset of species. Each such multiset is called a complex; together the complexes form the set C. Denote by m = |S|, n = |C|, and r = |R| the number of species, complexes, and reactions of a network, respectively. Fixing any order of the species, we will further identify a complex by its stoichiometry vector $y = [y_1, \dots, y_m]^T \in \mathbb{R}^m$, where $y_i$ denotes the number of species i in the complex.

$$C \leftarrow A \rightarrow B, \qquad B + C \rightarrow 2A \tag{1}$$

Consider the example reaction network (1) with species S = {A, B, C}, complexes C = {{A}, {B}, {C}, {B, C}, {A, A}} and reactions R = {A → B, A → C, B + C → 2A}. The complex B + C is represented, e.g., by the vector $y_{B+C} = [0, 1, 1]^T$. The state variables of such a reaction network are the species concentrations $c_i(t) \in \mathbb{R}_{\ge 0}$, which give the amount of species i per unit volume at time t. We will write $c \equiv c(t) \in \mathbb{R}^m_{\ge 0}$ for the vector of species concentrations. For each reaction j with $y \to y'$, there is a column $n_j = (y' - y)$ in the stoichiometric matrix N and a corresponding reaction rate $v_j$. Assuming mass action kinetics, the rate is given by $v_j = k_j \cdot c^{y} \triangleq k_j \cdot \prod_{i=1}^{m} c_i^{y_i}$, where the ‘$\triangleq$’ sign means ‘equal by definition’. The parameter $k_j \in \mathbb{R}_{>0}$ is the rate constant for that reaction. By collecting all rates $v_j$ in a vector v(k,c), we obtain the ODE system describing the concentration dynamics:

$$\dot{c} = f(k,c) = N \cdot v(k,c). \tag{2}$$
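For the example network (1), the right hand side of (2) can be assembled directly from the stoichiometry vectors. The following is a minimal sketch; the function names and the species order (A, B, C) are choices made here for illustration, and the rate constants passed in are arbitrary.

```python
import math

# Reactions of network (1) as (substrate complex, product complex) pairs;
# each complex is a stoichiometry vector over the species order (A, B, C).
REACTIONS = [
    ((1, 0, 0), (0, 1, 0)),  # A -> B
    ((1, 0, 0), (0, 0, 1)),  # A -> C
    ((0, 1, 1), (2, 0, 0)),  # B + C -> 2A
]


def stoichiometric_matrix(reactions):
    """N has one row per species and one column n_j = y' - y per reaction."""
    m = len(reactions[0][0])
    return [[product[i] - substrate[i] for substrate, product in reactions]
            for i in range(m)]


def reaction_rates(k, c, reactions):
    """Mass action rates v_j = k_j * prod_i c_i ** y_i (y = substrate complex)."""
    return [
        k_j * math.prod(c_i ** y_i for c_i, y_i in zip(c, substrate))
        for k_j, (substrate, _) in zip(k, reactions)
    ]


def rhs(k, c, reactions=REACTIONS):
    """Right hand side f(k, c) = N v(k, c) of the ODE system (2)."""
    N = stoichiometric_matrix(reactions)
    v = reaction_rates(k, c, reactions)
    return [sum(n_ij * v_j for n_ij, v_j in zip(row, v)) for row in N]
```

Every column of N for this network sums to zero, so the components of `rhs(k, c)` sum to zero for any k and c; the total concentration of A, B and C is therefore constant along solutions, which is an instance of the conservation relations discussed next.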

Often the matrix N does not have full row rank; let $S \triangleq \operatorname{im}(N)$ and $s \triangleq \operatorname{rank}(N)$ with $s < m$, and let $W \in \mathbb{R}^{m \times (m-s)}$ be a basis of $\ker(N^T)$. Then, one has the conservation relations $W^T c(t) = \text{const.}$ and thus $c(t) \in c_0 + S$ for solutions c(t) of (2) with $c(0) = c_0$, motivating the following original definition of multistationarity:

Definition 1 (Multistationarity, cf. [3]). Consider a reaction network with associated ODE system (2). We say that the network admits multistationarity if there exist at least two distinct, positive vectors $c^*$ and $c_*$ and a positive vector k with

$$f(k, c^*) = 0 \tag{3a}$$
$$f(k, c_*) = 0 \tag{3b}$$
$$W^T (c^* - c_*) = 0. \tag{3c}$$

An important observation of CRNT is that the dynamics can be written in terms of complexes rather than species. Species concentrations are first mapped onto “complex concentrations” by $\psi(c) = \sum_y c^y \cdot e_y$, with the sum running over all complexes and $e_y$ the corresponding canonical basis vector of $\mathbb{R}^n$. The dynamics is then given by $A = I_a I_K$, where $I_K$ and $I_a$ can be interpreted as a complex flux and a complex stoichiometry, respectively. Mapping the derivatives of complex concentrations back to species concentration changes by the matrix of stoichiometry vectors $Y = [y_1 | \cdots | y_n]$ completes the decomposition $f(c) = Y I_a I_K \psi(c)$ (Fig. 1 and [6]). Importantly, although f is non-linear, the only non-linearity occurring in the decomposition is the map ψ.

Fig. 1 Commutative diagram of the different decompositions of $\dot{c} = f(c)$. The diagram relates the spaces $\mathbb{R}^m$, $\mathbb{R}^n$ and $\mathbb{R}^r$ through the maps ψ, $I_K$, $I_a$, A, Y, v, N and f.

The matrix $I_K$ gives the relation between the reactions and their associated substrate complexes. This can be modeled as a bipartite graph where one set of nodes are the reactions and the other set are the complexes. For every reaction of the network, there is an edge in this graph connecting the reaction with its substrate complex. Every edge is weighted by a rate constant $k_i$. Then, the matrix $I_K$ is the adjacency matrix of this weighted bipartite graph. The matrix $I_a$ is the incidence matrix of the network graph defined as follows.

Definition 2 (Network Graph). The network graph of a chemical reaction network (S, C, R) is the directed graph G = (C, R) with the complexes as vertices and the reactions as edges.

The following three notions are central in the CRNT literature and are based on the network graph [3]. Most importantly, the deficiency of a network captures the degrees of freedom in mapping complex concentration changes onto species concentration changes such that the latter are zero.

Definition 3 (Linkage Classes). The (strongly) connected components of the network graph are called (strong) linkage classes. A strong linkage class and its complexes are called terminal if there is no reaction mapping one of its complexes out of this class.

Definition 4 (Deficiency Space; Deficiency). The deficiency δ of a reaction network is the dimension of its deficiency space:

$$\delta \triangleq \dim[\ker(Y) \cap \operatorname{im}(I_a)].$$

Lemma 1. [7] The deficiency δ satisfies the relation

δ = n − l − s, where n is the number of complexes, l the number of linkage classes and s = rank(N). Note that n − l = rank(Ia).
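Lemma 1 makes the deficiency easy to compute. The sketch below does so for the example network (1), using exact rational arithmetic for the rank and a union-find structure for the linkage classes; the helper names and the ordering of the complexes are assumptions made for this illustration.

```python
from fractions import Fraction

# Example network (1): complexes as stoichiometry vectors over (A, B, C) and
# reactions as (substrate index, product index) pairs into COMPLEXES.
COMPLEXES = [(1, 0, 0), (0, 1, 0), (0, 0, 1), (0, 1, 1), (2, 0, 0)]  # A, B, C, B+C, 2A
REACTIONS = [(0, 1), (0, 2), (3, 4)]  # A -> B, A -> C, B + C -> 2A


def matrix_rank(M):
    """Rank by Gaussian elimination over the rationals (no round-off)."""
    rows = [[Fraction(x) for x in row] for row in M]
    rank = 0
    for col in range(len(rows[0]) if rows else 0):
        pivot = next((i for i in range(rank, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        for i in range(rank + 1, len(rows)):
            factor = rows[i][col] / rows[rank][col]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[rank])]
        rank += 1
        if rank == len(rows):
            break
    return rank


def count_linkage_classes(n, reactions):
    """Number of connected components of the (undirected) network graph."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for substrate, product in reactions:
        parent[find(substrate)] = find(product)
    return len({find(x) for x in range(n)})


def incidence_matrix(n, reactions):
    """I_a: column j has -1 at the substrate and +1 at the product complex."""
    Ia = [[0] * len(reactions) for _ in range(n)]
    for j, (substrate, product) in enumerate(reactions):
        Ia[substrate][j] -= 1
        Ia[product][j] += 1
    return Ia


def deficiency(complexes, reactions):
    """delta = n - l - s with s = rank(N) and N = Y I_a."""
    n, m = len(complexes), len(complexes[0])
    Ia = incidence_matrix(n, reactions)
    N = [[sum(complexes[c][i] * Ia[c][j] for c in range(n))
          for j in range(len(reactions))] for i in range(m)]
    return n - count_linkage_classes(n, reactions) - matrix_rank(N)
```

For network (1) this gives n = 5 complexes, l = 2 linkage classes ({A, B, C} and {B + C, 2A}) and s = 2, hence δ = 1. The incidence matrix has rank 3 = n − l, in agreement with the remark after Lemma 1.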

3 Degenerate Chemical Reaction Networks with Deficiency One

An algorithm for determining whether a network of deficiency δ = 1 admits multistationarity and for computing a pair of steady states, the so-called deficiency one algorithm, is given in [4]. It is applicable to any regular reaction network, that is, a network satisfying the following conditions (cf. [4] or [1]):
(C.1) The deficiency of the network is δ = 1.
(C.2) The deficiency of every linkage class is δ = 0.
(C.3) ker(N) contains a positive vector.
(C.4) Every linkage class contains exactly one terminal strong linkage class.
(C.5) Terminal strong linkage classes do not contain any cycles.

As many reaction networks have δ > 1, an application of the algorithm to the complete network is not possible. Therefore, we suggested an elementary flux mode (EFM) decomposition of a network [1].

Definition 5 (EFM; Stoichiometric Generator ([6] and [1])). The generators of the pointed polyhedral cone $\ker(Y I_a) \cap \mathbb{R}^r_{\ge 0}$ are called elementary flux modes (EFMs). In particular, an EFM is a vector $v \in \mathbb{R}^r_{\ge 0}$ that satisfies

$$Y I_a v = 0. \tag{4a}$$

Given two EFMs v, v', we have that

$$\{ i \mid v'_i \ne 0 \} \subseteq \{ i \mid v_i \ne 0 \} \;\Rightarrow\; v' = 0 \text{ or } v' = \alpha v. \tag{4b}$$

A stoichiometric generator is an EFM g that additionally satisfies

$$I_a g \ne 0. \tag{4c}$$

Subnetworks defined by stoichiometric generators are guaranteed to satisfy conditions (C.1)–(C.3) and (C.5), but not necessarily condition (C.4) [1]. Biochemical reaction networks describing metabolic reactions, for example, have a prominent source of degeneracy even for a sufficiently detailed degree of modeling: species uptake. To describe open systems, CRNT introduces the zero complex ∅. For example, ∅ → A + B would be an uptake reaction for the complex A + B with a certain rate. However, if more than one complex is taken up, this method leads to a linkage class with two or more terminal strong linkage classes, thus violating condition (C.4). This motivates our following definition of a degenerate reaction network:

Definition 6 (Degenerate; ∅-degenerate). A reaction network is called degenerate iff it violates condition (C.4). That is, the network has at least one linkage class with more than one terminal strong linkage class. It is called ∅-degenerate if, moreover, it is generated by a single stoichiometric generator (thus δ = 1) and the only violation of condition (C.4) is due to several uptake reactions from the zero complex, resulting in several terminal strong linkage classes each linked to the zero complex by a single reaction.

The zero complex renders the reaction equations for the species uptakes particularly simple because the entries $v_i$ of these reactions are independent of the species concentrations. This allows the ∅-degeneracy to be removed from the network: individual uptake reactions with rate constants $k_i$ are replaced by a single reaction for the uptake of a combined complex, as shown in (5). The idea is formalized in Definition 7.

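$$\begin{aligned} \text{degenerate:} \quad & C \xrightarrow{k_3} \emptyset, \qquad \emptyset \xrightarrow{k_1} A, \qquad \emptyset \xrightarrow{k_2} 2B \\ \text{regularized:} \quad & C \xrightarrow{k_3} \emptyset \xrightarrow{k^*} g_1 A + g_2 B \end{aligned} \tag{5}$$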

Definition 7 (Regularized Network). Consider a ∅-degenerate reaction network with n complexes, r reactions, and u > 1 uptake reactions such that the zero complex gives rise to a linkage class with more than one terminal strong linkage class. Let $y_1, \dots, y_u$ be the complexes taken up with rate constants $k_1, \dots, k_u$, respectively. The regularized network is then given by deleting all u uptake reactions and introducing a new uptake reaction with rate constant $k^*$ and uptake complex $\bar{y} \triangleq g_1 y_1 + \cdots + g_u y_u$, where the $g_i$ are elements of the unique (up to scalar multiplication) generator g of $\ker(Y I_a) \cap \mathbb{R}^r_{\ge 0}$.

Thus, the regularized network has the same concentration vector c but different numbers of reactions and rate constants $k_i$. To analyze the connection between the dynamical systems derived from the two networks, split k, g, v and N into components belonging to uptake reactions (indicated by ˜) and components not belonging to uptake reactions (indicated by ˆ):

$$\begin{array}{ll} \text{subnetwork} & \text{regularized network} \\ k = [\tilde{k}^T, \hat{k}^T]^T & k^r = [k^*, \hat{k}^T]^T \\ g = [\tilde{g}^T, \hat{g}^T]^T & g^r = [1, \hat{g}^T]^T \\ v(k,c) = [\tilde{k}^T, \hat{v}(\hat{k},c)^T]^T & v^r(k^r,c) = [k^*, \hat{v}(\hat{k},c)^T]^T \\ N = [\, y_1 \dots y_u \mid \hat{N} \,] & N^r = [\, g_1 y_1 + \dots + g_u y_u \mid \hat{N} \,] \end{array}$$

The following lemma establishes the equivalence of the dynamics of the two systems.

Lemma 2 (Regularization Lemma). With $f^r(k^r, c) \triangleq N^r v^r(k^r, c)$ the regularized version of $f(k,c) \triangleq N v(k,c)$, the following two statements hold:

(a) If there exists $(k, c) \in \mathbb{R}^r \times \mathbb{R}^m$ with $f(k,c) = 0$, then

$$\tilde{k} = \alpha \tilde{g}. \tag{6}$$

(b) For $\tilde{k} = \alpha \tilde{g}$ as above, any $c > 0$, any $\alpha > 0$, and any $\hat{k} \in \mathbb{R}^{r-u}$ we have that

$$f\big( [\alpha \tilde{g}^T, \hat{k}^T]^T, c \big) \equiv f^r\big( [\alpha, \hat{k}^T]^T, c \big). \tag{7}$$

Proof. (a) Consider $(k, c) \in \mathbb{R}^r \times \mathbb{R}^m$ with $f(k,c) = 0$. Then $v(k,c) = \alpha g$, where g is the unique (up to scalar multiplication, cf. [1]) generator of $\ker(Y I_a) \cap \mathbb{R}^r_{\ge 0}$. Then, $[\tilde{k}^T, \hat{v}(\hat{k},c)^T]^T = \alpha [\tilde{g}^T, \hat{g}^T]^T$ and thus $\tilde{k} = \alpha \tilde{g}$.

(b) With the decomposition into uptake and non-uptake parts,

$$f\big( [\tilde{k}^T, \hat{k}^T]^T, c \big) = [\, y_1 \dots y_u \mid \hat{N} \,] \begin{bmatrix} \alpha \tilde{g} \\ \hat{v} \end{bmatrix} = [\, g_1 y_1 + \dots + g_u y_u \mid \hat{N} \,] \begin{bmatrix} \alpha \\ \hat{v} \end{bmatrix} \equiv f^r\big( [\alpha, \hat{k}^T]^T, c \big), \tag{8}$$

which is identical to $f^r(k^r, c)$, with $k^* = \alpha$. □

Corollary 1 (Equivalence of conservation relation). Following the notation in Lemma 2, let $c_f(t)$ and $c_{f^r}(t)$ be solutions for the systems f and $f^r$, respectively, with the same initial condition $c_0$. If $\tilde{k} = \alpha \tilde{g}$, then $c_f(t) \equiv c_{f^r}(t)$ for all t ≥ 0. Further, with $W^r$ a basis for the left kernel of $N^r$, $(W^r)^T c_{f^r}(t) \equiv (W^r)^T c_0$ implies $(W^r)^T c_f(t) \equiv (W^r)^T c_0$.

This result motivates the following new definition of multistationarity for ∅-degenerate reaction networks, which is a stronger version of Definition 1:

Definition 8 (Multistationarity of ∅-degenerate networks). A ∅-degenerate network is called multistationary iff its regularized version admits multistationarity.
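The identity in Lemma 2(b), on which Corollary 1 rests, can be checked numerically on a small example. The network below is hypothetical and was chosen only for illustration: uptake reactions ∅ → A and ∅ → B, a conversion A + 2B → C and an outflow C → ∅, for which the kernel of N is spanned by g = (1, 2, 1, 1). Whether this toy network meets all of conditions (C.1)–(C.5) after regularization is not examined here; the sketch only compares the two right hand sides.

```python
def mat_vec(M, x):
    """Matrix-vector product for a matrix given as a list of rows."""
    return [sum(m_ij * x_j for m_ij, x_j in zip(row, x)) for row in M]


# Species order (A, B, C). Hypothetical network (an assumption, see above):
#   0 -> A (k1),  0 -> B (k2),  A + 2B -> C (k3),  C -> 0 (k4)
Y_UPTAKE = [(1, 0, 0), (0, 1, 0)]        # uptake complexes y_1 = A, y_2 = B
N_HAT = [(-1, 0), (-2, 0), (1, -1)]      # non-uptake columns, one row per species
G_TILDE = (1, 2)                         # uptake part of the generator g = (1, 2, 1, 1)


def v_hat(k_hat, c):
    """Mass action rates of the non-uptake reactions A + 2B -> C and C -> 0."""
    a, b, cc = c
    return [k_hat[0] * a * b ** 2, k_hat[1] * cc]


def f_original(k_tilde, k_hat, c):
    """f(k, c) = [y_1 ... y_u | N_hat] [k_tilde; v_hat]."""
    N = [tuple(y[i] for y in Y_UPTAKE) + N_HAT[i] for i in range(3)]
    return mat_vec(N, list(k_tilde) + v_hat(k_hat, c))


def f_regularized(k_star, k_hat, c):
    """f^r(k^r, c) = [g_1 y_1 + ... + g_u y_u | N_hat] [k_star; v_hat]."""
    y_bar = [sum(g * y[i] for g, y in zip(G_TILDE, Y_UPTAKE)) for i in range(3)]
    N_r = [(y_bar[i],) + N_HAT[i] for i in range(3)]
    return mat_vec(N_r, [k_star] + v_hat(k_hat, c))
```

With $\tilde{k} = \alpha \tilde{g} = (\alpha, 2\alpha)$ and $k^* = \alpha$ the two functions return the same vector for any concentrations, as (7) states; if the uptake rate constants are not proportional to $\tilde{g}$, the two right hand sides differ, which is consistent with part (a) of the lemma.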

4 Application Example

We use the upper part of glycolysis (including the pentose phosphate pathway) as an application example for EFM-based analysis of multistationarity. The corresponding network shown in Fig. 2 is an important part of bacterial metabolism. It is the central pathway for growth on monosaccharides (e.g. glucose) and for producing energy for the cell. It is also important for the synthesis of precursors for amino acids and nucleotides. Notably, even for such well-characterized pathways the dynamic features are not fully understood.

Our network model consists of 4 uptake reactions, 7 outflows, and 13 enzyme-catalyzed reactions. This results in a mass-action model with 53 species, 79 complexes and 78 reactions. The network deficiency is 8, which hampers CRNT-based analysis without prior network decomposition. In analyzing the model, we found 123 EFMs, 88 of which were stoichiometric generators. The subnetworks that correspond to these generators have between 19 and 34 reactions (see Fig. 2 for an example), and they all are degenerate. However, the only degenerate linkage classes are those containing the uptake reactions, and thus the network is ∅-degenerate.

We regularized the subnetworks to enable analysis with the deficiency one algorithm. Interestingly, none of the subnetworks corresponding to an EFM can admit multiple steady states. However, as an EFM-based analysis only provides sufficient conditions for multistationarity of the entire network, multistationarity of the overall network is still possible [1]. Moreover, multistationarity in the overall network seems very likely, given its complex structure, which contains several feedback loops (e.g. via the energy carrier ATP).

Fig. 2 Model for the upper part of glycolysis and the pentose phosphate pathway. Black reactions form an EFM subnetwork that can be analyzed with the deficiency one algorithm. Gray reactions are not active in the EFM. (Node labels in the diagram: metabolites G6P, F6P, F16P, DHAP, G3P, PGlac, PGluc, Rl5P, X5P, R5P, E4P, S7P, C2, C3, ATP, NADPH, CO2; enzymes glk, pgi, pfkA, fbp, fba, tpiA, zwf, pgl, gnd, rpe, rpiA, tktAB, talB.)

5 Conclusions and Perspectives

The application of some of the main methods and theorems of CRNT (in particular the deficiency one algorithm) is currently restricted to regular reaction networks, which limits their applicability to real-world problems. Here, we developed and applied a simple method for model regularization to circumvent the limitations for (at least certain classes of) biochemical reaction networks. Future generalizations of more elements of CRNT, for instance of the deficiency one theorem, are of interest, and preliminary results indicate that such extensions of the theory are feasible. Finally, our case study of a medium-sized network model indicates that this is a promising approach, for instance, to understand the remarkable robustness of biological systems [8].

Acknowledgements. This work was supported in part by the European Union FP6 project ‘BaSysBio’ and the European Union FP7 project ‘UniCellSys’.

References

1. Conradi, C., Flockerzi, D., Raisch, J., Stelling, J.: Subnetwork analysis reveals dynamic features of complex (bio)chemical networks. Proc. Natl. Acad. Sci. 104(49), 19175–19180 (2007)
2. Doyle, F.J., Stelling, J.: Systems interface biology. J. R. Soc. Interface 10(3), 603–616 (2006)
3. Feinberg, M.: Chemical reaction network structure and the stability of complex isothermal reactors – I. The deficiency zero and deficiency one theorems. Chemical Engineering Science 42(10), 2229–2268 (1987)
4. Feinberg, M.: Chemical reaction network structure and the stability of complex isothermal reactors – II. Multiple steady states for networks of deficiency one. Chemical Engineering Science 43(1), 1–25 (1988)
5. Flockerzi, D., Conradi, C.: Subnetwork analysis for multistationarity in mass action kinetics. Journal of Physics: Conference Series 138, 36 pages (012006) (2008)
6. Gatermann, K., Wolfrum, M.: Bernstein’s second theorem and Viro’s method for sparse polynomial systems in chemistry. Advances in Applied Mathematics 34(2), 252–294 (2005)
7. Gunawardena, J.: Chemical reaction network theory for in-silico biologists. Bauer Center for Genomics Research (2003)
8. Stelling, J., Sauer, U., Szallasi, Z., Doyle, F.J., Doyle, J.: Robustness of cellular functions. Cell 118(6), 675–685 (2004)

k-Switching Reachability Sets of Continuous-Time Positive Switched Systems

Maria Elena Valcher

Abstract. In the paper, the k-switching reachability set $R_k$ of a continuous-time positive switched system is introduced, and a necessary and sufficient condition for the chain of these sets $\{R_k, k \in \mathbb{N}\}$ to stop increasing after some finite index k is given. It is shown that, for special classes of (multiple-input) positive switched systems, reachability always ensures that $R_n = \mathbb{R}^n_+$, n being the system dimension.

1 Introduction

“Switched linear systems” are systems whose describing equations change, according to some switching law, within a (possibly infinite) family of (linear) subsystems. On the other hand, the positivity requirement is often introduced in the system model whenever the physical nature of the describing variables constrains them to take only positive (or at least nonnegative) values. As a result, positive linear systems [1] naturally arise in fields such as bioengineering (compartmental models), economic modeling, behavioral science, and stochastic processes (Markov chains or hidden Markov models).

In this perspective, switched positive systems are mathematical models which take into account two different aspects: the fact that the system dynamics can be suitably described by means of a family of subsystems, each of them formalizing the system laws under specific operating conditions, among which the system commutes, and the nonnegativity constraint the physical variables are subject to. This is the case when trying to describe certain physiological and pharmacokinetic processes, like the insulin-sugar metabolism. Of course, the need for this class of systems in specific research contexts has stimulated an interest in theoretical issues related to them, in particular, reachability/controllability properties [4, 5, 7], and stability issues [2, 3].

Maria Elena Valcher
Dip. Ingegneria dell’Informazione, Università di Padova, Italy, e-mail: [email protected]

R. Bru and S. Romero-Vivó (Eds.): Positive Systems, LNCIS 389, pp. 173–181. springerlink.com © Springer-Verlag Berlin Heidelberg 2009

In this paper, we introduce the concept of k-switching reachability set $R_k$ of a positive switched system. This definition, as well as some preliminary results on the chain of sets $\{R_k, k \in \mathbb{N}\}$ for the class of single-input continuous-time positive switched systems, has been derived in [6]. In this paper it is shown that, for special classes of (multiple-input) positive switched systems, reachability always ensures that $R_n = \mathbb{R}^n_+$, n being the system dimension.

Notation. For every $k \in \mathbb{N}$, we set $\langle k \rangle := \{1, 2, \dots, k\}$. $\mathbb{R}_+$ is the semiring of nonnegative real numbers, $\mathbb{R}^n_+$ the set of n-dimensional vectors with entries in $\mathbb{R}_+$, and $\mathbb{R}^{n \times p}_+$ the set of n × p matrices with entries in $\mathbb{R}_+$.

A matrix A with entries in $\mathbb{R}_+$ is a nonnegative matrix; if A is nonnegative and $A \ne 0$, A is a positive matrix, while if all its entries are positive it is a strictly positive matrix. A Metzler matrix, on the other hand, is a real square matrix whose off-diagonal entries are nonnegative. The nonzero pattern of a vector v is the set of indices corresponding to its nonzero entries, namely $ZP(v) := \{i : [v]_i \ne 0\}$, where $[v]_i$ is the ith entry of v. We let $e_i$ denote the ith vector of the canonical basis in $\mathbb{R}^n$ (where n is always clear from the context), whose entries are all zero except for the ith, which is unitary. We say that a vector $v \in \mathbb{R}^n_+$ is an ith monomial vector if $ZP(v) = ZP(e_i) = \{i\}$. For any set $S \subseteq \langle n \rangle$, we set $e_S := \sum_{i \in S} e_i$.

2 Reachability Property and k-Switching Reachability Sets

A (continuous-time) positive switched system is described by the following equation

$$\dot{x}(t) = A_{\sigma(t)} x(t) + B_{\sigma(t)} u(t), \quad t \in \mathbb{R}_+, \tag{1}$$

where x(t) and u(t) denote the n-dimensional state variable and the m-dimensional input, respectively, at the time t, and σ is a switching sequence, taking values in a finite set $P = \{1, 2, \dots, p\} = \langle p \rangle$. We assume that the switching sequence is piece-wise constant, and hence in every time interval [0,t[ there is a finite number of discontinuities, which corresponds to a finite number (say k, including the initial time) of switching instants $0 = t_0 < t_1 < \cdots < t_{k-1} < t$. For each i ∈ P, the pair $(A_i, B_i)$ represents a continuous-time positive system, which means that $A_i$ is an n × n Metzler matrix and $B_i$ is an n × m nonnegative matrix.

As a first step, we recall the definition of monomial reachability and of reachability for positive switched systems.

Definition 1. [5] A state $x_f \in \mathbb{R}^n_+$ is said to be reachable if there exist some time $t > 0$, a switching sequence $\sigma : [0, t[ \to \mathcal{P}$ and an input $u : [0, t[ \to \mathbb{R}^m_+$ that lead the state trajectory from $x(0) = 0$ to $x(t) = x_f$. A positive switched system is monomially reachable if every monomial vector (equivalently, every canonical vector $e_i$, $i \in \langle n \rangle$) is reachable, and reachable if every state $x_f \in \mathbb{R}^n_+$ is reachable.

Monomial reachability is a necessary (but, unfortunately, not sufficient) condition for reachability [4] and it admits a rather easy characterization.

Proposition 1. [5] Given a positive switched system (1), commuting among p subsystems $(A_i, B_i)$, $i \in \mathcal{P}$, the following conditions are equivalent:
i) the system is monomially reachable;
ii) $\forall i \in \langle n \rangle$ there exists an index $j = j(i) \in \mathcal{P}$ such that $A_j e_i = \alpha_i e_i$, for some $\alpha_i \ge 0$, and one column of $B_j$ is an $i$th monomial vector.

In the special case of single-input positive switched systems (1), monomial reachability is equivalent to the fact that there exists a relabeling of the p subsystems $(A_i, B_i)$, $i \in \mathcal{P}$, such that the first n subsystems satisfy

\[ A_i e_i = \alpha_i e_i \quad \text{and} \quad B_i = \beta_i e_i, \qquad (2) \]

for suitable $\alpha_i \ge 0$ and $\beta_i > 0$. We now introduce the definition of k-switching reachability set.

Definition 2. [6] Given a positive switched system (1) and a positive integer $k$, we define the k-switching reachability set, and denote it by $\mathcal{R}_k$, as the set of states that can be reached in finite time by the system, by making use (of a nonnegative input signal $u(\cdot)$ and) of a switching sequence $\sigma$ that commutes no more than $k - 1$ times, meaning that the switching instants of the switching sequence are no more than $k$ (i.e., $0 = t_0 < t_1 < \cdots < t_{\ell-1}$ with $\ell \le k$).

It is easily seen that $\mathcal{R}_k$ is a cone, since if $x_f$ belongs to $\mathcal{R}_k$ then $\alpha \cdot x_f$ surely does, for every $\alpha \ge 0$. However, in general, it is neither convex nor polyhedral. Of course, we are interested in investigating how the cone $\mathcal{R}_k$ varies, as $k$ varies over the positive integers. To this end, we recall that the state at the time $t$, starting from the zero initial condition, under the action of the input $u(\tau)$, $\tau \in [0, t[$, and of the switching sequence $\sigma : [0, t[ \to \mathcal{P}$, with switching instants $0 = t_0 < t_1 < \cdots < t_{k-1} < t$ and switching values $i_0, i_1, \dots, i_{k-1} \in \mathcal{P}$ (i.e., $i_\ell = \sigma(t)$ for $t \in [t_\ell, t_{\ell+1}[$), can be expressed as follows:

\[
\begin{aligned}
x(t) ={} & e^{A_{i_{k-1}}(t - t_{k-1})} \cdots e^{A_{i_1}(t_2 - t_1)} \int_{t_0}^{t_1} e^{A_{i_0}(t_1 - \tau)} B_{i_0} u(\tau)\, d\tau \\
& + e^{A_{i_{k-1}}(t - t_{k-1})} \cdots e^{A_{i_2}(t_3 - t_2)} \int_{t_1}^{t_2} e^{A_{i_1}(t_2 - \tau)} B_{i_1} u(\tau)\, d\tau + \cdots \\
& + \int_{t_{k-1}}^{t} e^{A_{i_{k-1}}(t - \tau)} B_{i_{k-1}} u(\tau)\, d\tau. \qquad (3)
\end{aligned}
\]

Therefore, a vector $x_f \in \mathbb{R}^n_+$ belongs to $\mathcal{R}_k$ if and only if it can be expressed as in (3), for suitable $u(\cdot) \ge 0$, $t, t_\ell \in \mathbb{R}_+$ and $i_\ell \in \mathcal{P}$, $\ell \in \{0, 1, \dots, k-1\}$. Even more, it is easily seen that $x_f \in \mathcal{R}_k$ if and only if

\[ x_f = e^{A_{i_{k-1}}(t - t_{k-1})} w + \int_{t_{k-1}}^{t} e^{A_{i_{k-1}}(t - \tau)} B_{i_{k-1}} u(\tau)\, d\tau, \qquad (4) \]

for some $0 < t_{k-1} < t$, some $i_{k-1} \in \mathcal{P}$, a nonnegative signal $u(\cdot)$ and some vector $w \in \mathcal{R}_{k-1}$. Clearly, $\mathcal{R}_k \subseteq \mathcal{R}_{k+1}$, and hence $\mathcal{R}_1 \subseteq \mathcal{R}_2 \subseteq \cdots \subseteq \mathcal{R}_k \subseteq \cdots$. Moreover, if the above chain of subsets of $\mathbb{R}^n_+$ stops at some stage, namely $\mathcal{R}_k = \mathcal{R}_{k+1}$ for some $k \in \mathbb{N}$, then it cannot be increased any more [6].

We want to investigate under which conditions an index $k \in \mathbb{N}$ can be found, such that $\mathcal{R}_k = \mathcal{R}_{k+1}$. To this end, we denote by $\mathcal{R}_t(A_i, B_i)$ the cone of (positive) states which are reachable at time $t > 0$ (by means of nonnegative inputs) by the single subsystem $\dot{x}(t) = A_i x(t) + B_i u(t)$. Notice that, differently from what happens to standard linear systems, $\mathcal{R}_t(A_i, B_i)$ typically grows with $t$ [1]. As an immediate consequence of equation (4), we obtain the following identity

\[ \mathcal{R}_{k+1} = \bigcup_{i \in \mathcal{P}} \bigcup_{t > 0} \left( e^{A_i t} \mathcal{R}_k + \mathcal{R}_t(A_i, B_i) \right), \qquad (5) \]

which leads to the following result.

Proposition 2. Given a positive switched system (1), commuting among p single-input subsystems, the following facts are equivalent:
i) there exists $k \in \mathbb{N}$ such that $\mathcal{R}_k = \mathcal{R}_{k+1}$;
ii) there exists $k \in \mathbb{N}$ such that $e^{A_i t} \mathcal{R}_k + \mathcal{R}_t(A_i, B_i) \subseteq \mathcal{R}_k$, $\forall t > 0$, $\forall i \in \mathcal{P}$.

Proof. i) ⇒ ii) If $\mathcal{R}_k = \mathcal{R}_{k+1}$, then the set of states which are reachable (in finite time) coincides with $\mathcal{R}_k$. Clearly, if a state $x_f > 0$ is reachable, then $e^{A_i t} x_f + v(i, t)$ is reachable, too, for every $i \in \mathcal{P}$, every $t > 0$ and every vector $v(i, t) \in \mathcal{R}_t(A_i, B_i)$. Indeed, once $x_f$ has been reached, it is sufficient to switch to the $i$th subsystem and apply a suitable input signal for a lapse of time equal to $t$. This ensures that $e^{A_i t} \mathcal{R}_k + \mathcal{R}_t(A_i, B_i) \subseteq \mathcal{R}_k$, $\forall t > 0$, $i \in \mathcal{P}$.

ii) ⇒ i) If $e^{A_i t} \mathcal{R}_k + \mathcal{R}_t(A_i, B_i) \subseteq \mathcal{R}_k$ for every $t > 0$ and every $i \in \mathcal{P}$, then (by (5)) $\mathcal{R}_{k+1} \subseteq \bigcup_{i \in \mathcal{P}} \bigcup_{t > 0} \mathcal{R}_k = \mathcal{R}_k$, and since the converse inclusion $\mathcal{R}_k \subseteq \mathcal{R}_{k+1}$ is always true, this implies that i) holds. □

At this stage of our research, it is not yet clear whether, for a reachable system (1), an index $k$ can always be found such that $\mathcal{R}_k = \mathbb{R}^n_+$. There are classes of systems, however, for which this is surely true and it turns out that reachability ensures that $\mathcal{R}_n = \mathbb{R}^n_+$. This is the case of single-input positive switched systems of dimension $n = 2$ or $n = 3$ [6]. Further classes of systems endowed with these properties will be investigated in the following sections.

To conclude the section, we introduce a technical lemma we will use in the following. The lemma makes use of the following notation. Given $n \in \mathbb{N}$, a set of $n \times n$ Metzler matrices $\{A_1, A_2, \dots, A_p\} =: \{A_i, i \in \mathcal{P}\}$, and a subset $\mathcal{S} \subseteq \langle n \rangle$, we define the set

\[ \mathcal{I}_{\mathcal{S}} := \{ i \in \mathcal{P} : \mathrm{ZP}(e^{A_i} e_{\mathcal{S}}) = \mathcal{S} \}. \qquad (6) \]

Lemma 1. If the n-dimensional continuous-time positive switched system (1) is reachable, then for every set $\mathcal{S} \subseteq \langle n \rangle$ the index set $\mathcal{I}_{\mathcal{S}}$ is non-empty.

Proof. Let $x_f \in \mathbb{R}^n_+$ be a reachable state that can be reached through the switching sequence $\sigma$, taking, in order, the values $i_0, i_1, \dots, i_{k-1} \in \mathcal{P}$, and set $\mathcal{S} := \mathrm{ZP}(x_f)$. Clearly, $x_f$ belongs to $\mathcal{R}_k$ for some $k \in \mathbb{N}$ and hence it can be expressed as in (4) for some $0 < t_{k-1} < t$, some $i_{k-1} \in \mathcal{P}$, a nonnegative signal $u(\cdot)$ and some vector $w \in \mathcal{R}_{k-1}$. If $\mathrm{ZP}(x_f) = \mathcal{S} = \mathrm{ZP}(e^{A_{i_{k-1}}(t - t_{k-1})} w)$, then [5] $i_{k-1} \in \mathcal{I}_{\mathcal{S}}$ and hence $\mathcal{I}_{\mathcal{S}} \neq \emptyset$. If $\mathrm{ZP}(x_f) = \mathcal{S} \supsetneq \mathrm{ZP}(e^{A_{i_{k-1}}(t - t_{k-1})} w)$, then it is sufficient to notice that, due to the nonnegativity of all matrix functions and signals involved,

\[ \emptyset \neq \mathrm{ZP}\left( \int_{t_{k-1}}^{t} e^{A_{i_{k-1}}(t - \tau)} B_{i_{k-1}} u(\tau)\, d\tau \right) = \mathrm{ZP}\left( e^{A_{i_{k-1}}} B_{i_{k-1}} z \right), \]

where $z$ is any positive vector satisfying: $d \in \mathrm{ZP}(z)$ if and only if $[u]_d(\cdot)$, the $d$th entry of the vector function $u(\cdot)$, is positive on a non-zero measure time interval within $]t_{k-1}, t[$. This ensures that

\[ \mathrm{ZP}(x_f) = \mathcal{S} = \mathrm{ZP}(e^{A_{i_{k-1}}(t - t_{k-1})} w) \cup \mathrm{ZP}(e^{A_{i_{k-1}}} B_{i_{k-1}} z) = \mathrm{ZP}(e^{A_{i_{k-1}}} w) \cup \mathrm{ZP}(e^{A_{i_{k-1}}} B_{i_{k-1}} z) = \mathrm{ZP}(e^{A_{i_{k-1}}} (w + B_{i_{k-1}} z)), \]

and hence [5] $i_{k-1} \in \mathcal{I}_{\mathcal{S}}$. □

3 Positive Switched Systems with a Constant State Matrix

In this section we consider the special class of positive switched systems which commute among p subsystems having the same system matrix $A$, but different input-to-state matrices $B_i$, $i \in \mathcal{P}$, and hence described by the differential equation:

\[ \dot{x}(t) = A x(t) + B_{\sigma(t)} u(t), \qquad t \in \mathbb{R}_+, \qquad (7) \]

where $A$ is an $n \times n$ Metzler matrix, while the matrices $B_i$, $i \in \mathcal{P}$, are nonnegative. For these systems a strong characterization of reachability can be derived.

Proposition 3. Given a continuous-time positive switched system (7), which switches among p subsystems $(A, B_i)$, $i \in \mathcal{P}$, sharing the same system matrix, the following facts are equivalent:
i) the system is reachable;
ii) the system is monomially reachable;
iii) $A$ is a nonnegative diagonal matrix, and $\forall i \in \langle n \rangle$ there exists an index $j = j(i) \in \mathcal{P}$ such that one column of $B_j$ is an $i$th monomial vector.

If any of the previous equivalent conditions holds, then every positive vector $x_f$ belongs to $\mathcal{R}_{|\mathrm{ZP}(x_f)|}$ and hence, in particular, $\mathcal{R}_n = \mathbb{R}^n_+$.

Proof. The proof of i) ⇒ ii) is obvious, while ii) ⇒ iii) follows immediately from Proposition 1 in the special case when $A_j = A$ for every $j \in \mathcal{P}$. So, it remains to be shown that iii) ⇒ i).

To this end, assume first that $m = 1$, namely that we are dealing with single-input systems. If so, condition iii) can be rephrased as follows: $A$ is a nonnegative diagonal matrix, $|\mathcal{P}| \ge n$, and there exists a relabeling of the subsystems such that the first n vectors $B_i$ are linearly independent monomial vectors. In other words, after a suitable relabeling, we get (for suitable $\alpha_i \ge 0$ and $\beta_i > 0$)

\[ A = \begin{bmatrix} \alpha_1 & & & \\ & \alpha_2 & & \\ & & \ddots & \\ & & & \alpha_n \end{bmatrix}, \qquad \begin{bmatrix} B_1 & B_2 & \cdots & B_n \end{bmatrix} = \begin{bmatrix} \beta_1 & & & \\ & \beta_2 & & \\ & & \ddots & \\ & & & \beta_n \end{bmatrix}. \qquad (8) \]

Let, now, $x_f \in \mathbb{R}^n_+$ be any positive vector. Set $\mathrm{ZP}(x_f) = \{i_0, i_1, \dots, i_{k-1}\}$. Notice that all indices $i_j$ belong to $\langle n \rangle$. If we assume $t = k$ as final time, $t_0 = 0 < t_1 = 1 < \cdots < t_{k-1} = k - 1$ as switching instants, and we assume that the input $u(\cdot)$ takes some suitable positive constant value $u_i$ in the time interval $[t_i, t_{i+1}[ = [i, i+1[$, then equation (3) becomes

\[
\begin{aligned}
x(k) ={} & e^{A(k-1)} \int_0^1 e^{A(1 - \tau)} B_{i_0} u_0\, d\tau + e^{A(k-2)} \int_1^2 e^{A(2 - \tau)} B_{i_1} u_1\, d\tau + \cdots + \int_{k-1}^{k} e^{A(k - \tau)} B_{i_{k-1}} u_{k-1}\, d\tau \\
={} & e^{A(k-1)} \int_0^1 e^{A(1 - \tau)} B_{i_0}\, d\tau \cdot u_0 + e^{A(k-2)} \int_0^1 e^{A(1 - \tau)} B_{i_1}\, d\tau \cdot u_1 + \cdots + \int_0^1 e^{A(1 - \tau)} B_{i_{k-1}}\, d\tau \cdot u_{k-1}.
\end{aligned}
\]

So, upon noticing that

\[ e^{A(k-1-j)} \int_0^1 e^{A(1 - \tau)} B_{i_j}\, d\tau = e^{A(k-1-j)} \int_0^1 e^{\alpha_{i_j}(1 - \tau)} \beta_{i_j} e_{i_j}\, d\tau = \gamma_{i_j} e_{i_j}, \]

for some $\gamma_{i_j} > 0$, it follows that $x(k) = \sum_{j=0}^{k-1} \gamma_{i_j} e_{i_j} u_j$. So, it is easily seen that by properly choosing the values $u_j > 0$ we can always ensure that $x(k) = x_f$, and hence i) holds. Notice that the proof also shows that the last part of the proposition statement holds for $m = 1$.
On the other hand, if we are dealing with the multiple-input case and condition iii) holds, we may associate with system (7) a new single-input positive switched system, described as in (7), that commutes among all subsystems $(A, B_i^*)$, with $B_i^*$ a column of some $B_j$, $j \in \mathcal{P}$. This amounts to considering the dynamics of system (7) under the constraint that at every instant $t$ only one input is active. Clearly, the associated single-input system satisfies condition iii), too, and hence, by the previous part of the proof, it is reachable. But this means that, for system (7), all positive states can be reached by making use of a single input at every instant, and hence (7) is reachable. So, the result is proved. □
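The single-input construction used in the proof of Proposition 3 can be sketched numerically. The following is a minimal illustration, not part of the original paper: it assumes the relabeled form (8), uses 0-based indices, and the function names (`unit_step_gain`, `piecewise_constant_inputs`, `simulate`) are hypothetical. It computes the constants $\gamma_{i_j}$ in closed form and sets $u_j = [x_f]_{i_j} / \gamma_{i_j}$.

```python
import math

def unit_step_gain(alpha):
    """Integral of exp(alpha * (1 - tau)) for tau in [0, 1]."""
    return 1.0 if alpha == 0 else math.expm1(alpha) / alpha

def piecewise_constant_inputs(alphas, betas, x_f):
    """Return (modes, inputs): modes[j] is the subsystem active on [j, j+1[
    and inputs[j] is the constant input value u_j applied on that interval."""
    modes = [i for i, value in enumerate(x_f) if value > 0]  # ZP(x_f), 0-based
    k = len(modes)
    inputs = []
    for j, i in enumerate(modes):
        # gamma_{i_j} = e^{alpha (k-1-j)} * beta * int_0^1 e^{alpha (1 - tau)} d tau
        gamma = math.exp(alphas[i] * (k - 1 - j)) * betas[i] * unit_step_gain(alphas[i])
        inputs.append(x_f[i] / gamma)
    return modes, inputs

def simulate(alphas, betas, modes, inputs):
    """Propagate x' = A x + B_sigma u exactly over unit intervals (A diagonal)."""
    x = [0.0] * len(alphas)
    for i, u in zip(modes, inputs):
        x = [math.exp(a) * xi for a, xi in zip(alphas, x)]  # free evolution e^{A}
        x[i] += betas[i] * unit_step_gain(alphas[i]) * u    # forced response
    return x
```

With, say, `alphas = [0.0, 0.5, 1.0]`, `betas = [2.0, 1.0, 3.0]` and target `x_f = [4.0, 0.0, 1.5]`, only two subsystems are visited, in agreement with the claim that $x_f \in \mathcal{R}_{|\mathrm{ZP}(x_f)|}$.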

4 Positive Switched Systems with a Constant Input-to-State Matrix

Let us address, now, the class of positive switched systems described by the differential equation:

\[ \dot{x}(t) = A_{\sigma(t)} x(t) + B u(t), \qquad t \in \mathbb{R}_+, \qquad (9) \]

where $A_i$, $i \in \mathcal{P}$, are $n \times n$ Metzler matrices, while the $n \times m$ matrix $B$ is nonnegative. Also in this case, we start by adapting the monomial reachability characterization provided in Proposition 1 to the specific class of systems we are considering. It turns out that system (9) is monomially reachable if and only if there exist indices $j_i \in \mathcal{P}$, $i \in \langle n \rangle$, such that $A_{j_i} e_i = \alpha_i e_i$ for some $\alpha_i \ge 0$, and $B$ has an $n \times n$ monomial submatrix. It is easily seen that if $B$ has an $n \times n$ monomial submatrix and we are dealing with nonnegative input signals, $B u(t)$ is an arbitrary vector in $\mathbb{R}^n_+$ at every time $t$. As a consequence, system (9) is reachable if and only if the following system is reachable:

\[ \dot{x}(t) = A_{\sigma(t)} x(t) + \bar{u}(t), \qquad t \in \mathbb{R}_+. \qquad (10) \]

By making use of this necessary and sufficient condition, we provide a complete characterization of the reachability property for the class of systems (9).

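Proposition 4. A continuous-time positive switched system (9), switching among $|\mathcal{P}|$ subsystems $(A_i, B)$, $i \in \mathcal{P}$, is reachable if and only if the following two conditions hold:
a) $B$ has an $n \times n$ monomial submatrix and
b) $\forall \mathcal{S} \subseteq \langle n \rangle$, $\mathcal{I}_{\mathcal{S}} \neq \emptyset$, i.e. $\exists j(\mathcal{S}) \in \mathcal{P}$ such that $\mathrm{ZP}(e^{A_{j(\mathcal{S})}} e_{\mathcal{S}}) = \mathcal{S}$.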

If any of the previous equivalent conditions holds, then every positive vector $x_f$ belongs to $\mathcal{R}_{|\mathrm{ZP}(x_f)|}$ and hence, in particular, $\mathcal{R}_n = \mathbb{R}^n_+$.

Proof. Suppose, first, that system (9) is reachable; then b) is true by Lemma 1. On the other hand, the system is a fortiori monomially reachable, and hence a) holds.

Conversely, assume that a) and b) hold. Condition b) applied to all sets $\mathcal{S}$ of unitary cardinality ensures that for every $i \in \langle n \rangle$ there exists an index $j_i \in \mathcal{P}$ such that $A_{j_i} e_i = \alpha_i e_i$ for some $\alpha_i \ge 0$. So, under assumptions a) and b), monomial reachability is ensured, and hence, by the preliminary discussion, it remains to be shown that the associated switched system (10) is reachable.

Given any positive vector $x_f \in \mathbb{R}^n_+$, set $r := |\mathrm{ZP}(x_f)|$ and assume w.l.o.g. that $\mathrm{ZP}(x_f) = \{i_1, i_2, \dots, i_r\}$, with $i_1 < i_2 < \cdots < i_r$. For every $h \in \{1, 2, \dots, r\}$, set $\mathcal{S}_h := \{i_1, i_2, \dots, i_h\}$ and let $j(\mathcal{S}_h)$ be the index which makes assumption b) satisfied, and hence $\mathrm{ZP}(e^{A_{j(\mathcal{S}_h)}} e_{\mathcal{S}_h}) = \mathcal{S}_h$. Now, we show that by suitably choosing a final time $t > 0$, the values of the switching instants $t_h$, $h = 0, 1, \dots, r-1$, with $0 = t_0 < \cdots < t_{r-1} < t$, and a nonnegative input, which takes as a value a suitable monomial vector $e_{i_h} \bar{u}_h$ in every time interval $[t_{h-1}, t_h)$, we may ensure that

\[
x_f = e^{A_{j(\mathcal{S}_r)}(t - t_{r-1})} \cdots e^{A_{j(\mathcal{S}_2)}(t_2 - t_1)} \int_{t_0}^{t_1} e^{A_{j(\mathcal{S}_1)}(t_1 - \tau)}\, d\tau\; e_{i_1} \bar{u}_1 + \cdots + \int_{t_{r-1}}^{t} e^{A_{j(\mathcal{S}_r)}(t - \tau)}\, d\tau\; e_{i_r} \bar{u}_r \qquad (11)
\]

(which amounts to proving that every positive vector $x_f$ is reachable in $|\mathrm{ZP}(x_f)|$ steps for system (10)). By the previous considerations, every term in (11) has a nonzero pattern included in $\mathcal{S}_r$. Moreover, it is easy to conclude that, since every exponential matrix can be made, by choosing the time interval between two consecutive switching instants sufficiently small, as close as we want to the identity matrix (see Lemma 2 in [4]), and since $e_{i_h}$ is an $i_h$th monomial vector, with $\{i_h\} \subseteq \mathrm{ZP}(e^{A_{j(\mathcal{S}_h)}} e_{i_h}) \subseteq \mathcal{S}_h \subseteq \mathcal{S}_r$, then each positive term

\[
v_h := e^{A_{j(\mathcal{S}_r)}(t - t_{r-1})} \cdots e^{A_{j(\mathcal{S}_{h+1})}(t_{h+1} - t_h)} \int_{t_{h-1}}^{t_h} e^{A_{j(\mathcal{S}_h)}(t_h - \tau)}\, d\tau\; e_{i_h}
\]

can be made as close as we want to the monomial vector $e_{i_h}$ (and, of course, with nonzero pattern included in $\mathcal{S}_r$). Once the switching instants have been selected so as to ensure that the aforementioned vectors $v_h$ are the desired approximations of all the monomial vectors $e_{i_h}$, $i_h \in \mathcal{S}_r$, and that $\mathrm{ZP}(v_h) \subseteq \mathcal{S}_r$, surely $x_f$ is an internal point of the cone generated by the vectors $v_1, v_2, \dots, v_r$. So, nonnegative values $\bar{u}_h$ can be found such that (11) holds. This part of the proof also ensures that the last part of the proposition statement holds true. □
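Conditions a) and b) of Proposition 4 are purely combinatorial, so they can be checked by enumeration for small $n$. The sketch below is illustrative only and rests on one assumption not spelled out in the text: for a Metzler matrix $A$, the entry $(i, j)$ of $e^{A}$ is positive exactly when $i = j$ or the directed graph with an edge $j \to i$ for each positive off-diagonal entry $a_{ij}$ contains a path from $j$ to $i$. All function names are hypothetical and indices are 0-based.

```python
from itertools import combinations

def exp_pattern(A):
    """Boolean nonzero pattern of e^{A} for a Metzler matrix A (graph closure)."""
    n = len(A)
    reach = [[i == j or A[i][j] > 0 for j in range(n)] for i in range(n)]
    for k in range(n):  # Warshall transitive closure
        for i in range(n):
            for j in range(n):
                reach[i][j] = reach[i][j] or (reach[i][k] and reach[k][j])
    return reach

def index_set(matrices, S):
    """I_S: indices i such that ZP(e^{A_i} e_S) == S."""
    result = []
    for idx, A in enumerate(matrices):
        pattern = exp_pattern(A)
        image = {i for i in range(len(A)) if any(pattern[i][j] for j in S)}
        if image == set(S):
            result.append(idx)
    return result

def has_monomial_submatrix(B):
    """True if, for every i, some column of B is an i-th monomial vector."""
    n = len(B)
    found = set()
    for column in zip(*B):
        support = [r for r in range(n) if column[r] > 0]
        if len(support) == 1:
            found.add(support[0])
    return len(found) == n

def is_reachable(matrices, B):
    """Conditions a) and b) of Proposition 4, checked over all nonempty S."""
    n = len(B)
    subsets = (set(c) for r in range(1, n + 1) for c in combinations(range(n), r))
    return has_monomial_submatrix(B) and all(index_set(matrices, S) for S in subsets)
```

The enumeration visits $2^n - 1$ subsets, so this is only practical for small state dimensions.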

References

1. Farina, L., Rinaldi, S.: Positive linear systems: theory and applications. Series on Pure and Applied Mathematics. Wiley-Interscience, New York (2000)
2. Mason, O., Shorten, R.N.: Some results on the stability of positive switched linear systems. In: Proceedings of the 43rd Conference on Decision and Control, Paradise Island, Bahamas, pp. 4601–4606 (2004)
3. Mason, O., Shorten, R.N.: On linear copositive Lyapunov functions and the stability of switched positive linear systems. IEEE Transactions on Automatic Control 52, 1346–1349 (2007)
4. Santesso, P., Valcher, M.E.: An algebraic approach to the structural properties of continuous time positive switched systems. In: Commault, C., Marchand, N. (eds.) LNCIS, pp. 185–192. Springer, Heidelberg (2006)
5. Santesso, P., Valcher, M.E.: Controllability and reachability of switched positive systems. In: Proceedings of the Seventeenth International Symposium on Mathematical Theory of Networks and Systems, Kyoto, Japan (2006); File MoP01.3.pdf
6. Valcher, M.E.: On the k-switching reachability sets of single-input positive switched systems. Submitted to the 2009 American Control Conference, St. Louis (2009)
7. Valcher, M.E., Santesso, P.: On the reachability of single-input positive switched systems. In: Proceedings of the 47th Conference on Decision and Control, Cancun, Mexico, pp. 947–952 (2008)

Inverse-Positive Matrices with Checkerboard Pattern

Manuel F. Abad, María T. Gassó and Juan R. Torregrosa

Abstract. A nonsingular real matrix A is said to be inverse-positive if all the elements of its inverse are nonnegative. This class of matrices contains the M-matrices, from which it inherits some of its properties and applications, especially in economics. In this work we analyze the inverse-positive concept for a particular type of pattern: the checkerboard pattern. In addition, we study the Hadamard product of certain classes of inverse-positive matrices whose entries have a particular sign pattern.

1 Introduction

In economics as well as other sciences, the inverse-positivity of real square matrices has been an important topic. A nonsingular real matrix A is said to be inverse-positive if all the elements of its inverse are nonnegative. An inverse-positive matrix that is also a Z-matrix is a nonsingular M-matrix, so the class of inverse-positive matrices contains the nonsingular M-matrices, which have been widely studied and whose applications, for example, in iterative methods, dynamic systems, economics, mathematical programming, etc., are well known. Of course, not every inverse-positive matrix is an M-matrix. For instance,

\[ A = \begin{pmatrix} -1 & 2 \\ 3 & -1 \end{pmatrix} \]

is an inverse-positive matrix that is not an M-matrix.

The concept of inverse-positivity is preserved by multiplication, left or right positive diagonal multiplication, positive diagonal similarity and permutation similarity.

Manuel F. Abad, María T. Gassó and Juan R. Torregrosa
Instituto de Matemática Multidisciplinar, Universidad Politécnica de Valencia, 46022 Valencia, Spain, e-mail: [email protected],[email protected],[email protected]

R. Bru and S. Romero-Vivó (Eds.): Positive Systems, LNCIS 389, pp. 185–194. springerlink.com © Springer-Verlag Berlin Heidelberg 2009

So we may assume, without loss of generality, that all diagonal entries are equal to 1 when they are positive.

Now, we present some examples of inverse-positive matrices that appear in different numerical processes.

Example 1. The following $n \times n$ tridiagonal matrix

\[
T = \begin{pmatrix}
1 + \frac{a}{a-b} & -1 & 0 & \cdots & 0 & 0 \\
-1 & 2 & -1 & \cdots & 0 & 0 \\
0 & -1 & 2 & \cdots & 0 & 0 \\
\vdots & \vdots & \vdots & & \vdots & \vdots \\
0 & 0 & 0 & \cdots & 2 & -1 \\
0 & 0 & 0 & \cdots & -1 & 1
\end{pmatrix},
\]

with $a > 0$ and $a > b$, is an inverse-positive matrix since $T^{-1} = (1/a) C$ with $C = (c_{ij})$, where $c_{ij} = \min\{ai - b, aj - b\}$, $i, j = 1, 2, \dots, n$.

Example 2. Let $x = [x_1, x_2, \dots, x_n]^T$ be a vector of $\mathbb{R}^n_+$, that is, $x_i > 0$, $i = 1, 2, \dots, n$. The lower bidiagonal matrix

\[
P(x, n) = \begin{pmatrix}
\frac{1}{x_1} & 0 & 0 & \cdots & 0 & 0 \\
-\frac{1}{x_1} & \frac{1}{x_2} & 0 & \cdots & 0 & 0 \\
0 & -\frac{1}{x_2} & \frac{1}{x_3} & \cdots & 0 & 0 \\
\vdots & \vdots & \vdots & & \vdots & \vdots \\
0 & 0 & 0 & \cdots & \frac{1}{x_{n-1}} & 0 \\
0 & 0 & 0 & \cdots & -\frac{1}{x_{n-1}} & \frac{1}{x_n}
\end{pmatrix}
\]

is an inverse-positive matrix since

\[
P(x, n)^{-1} = \begin{pmatrix}
x_1 & 0 & 0 & \cdots & 0 & 0 \\
x_2 & x_2 & 0 & \cdots & 0 & 0 \\
x_3 & x_3 & x_3 & \cdots & 0 & 0 \\
\vdots & \vdots & \vdots & & \vdots & \vdots \\
x_{n-1} & x_{n-1} & x_{n-1} & \cdots & x_{n-1} & 0 \\
x_n & x_n & x_n & \cdots & x_n & x_n
\end{pmatrix}.
\]
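The closed-form inverses stated in Examples 1 and 2 can be checked with exact rational arithmetic. The following sketch is only an illustration under stated assumptions: the helper names (`inverse`, `example1`, `example2`) are hypothetical, `example1` assumes $n \ge 2$ and integer parameters, and the elimination routine assumes a nonsingular input.

```python
from fractions import Fraction

def inverse(M):
    """Exact inverse of a nonsingular matrix by Gauss-Jordan elimination."""
    n = len(M)
    aug = [[Fraction(v) for v in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(M)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        aug[col] = [v / aug[col][col] for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def example1(n, a, b):
    """Tridiagonal matrix T of Example 1 (n >= 2, a > 0, a > b)."""
    T = [[Fraction(0)] * n for _ in range(n)]
    for i in range(n):
        T[i][i] = Fraction(2)
        if i > 0:
            T[i][i - 1] = T[i - 1][i] = Fraction(-1)
    T[0][0] = 1 + Fraction(a) / Fraction(a - b)
    T[n - 1][n - 1] = Fraction(1)
    return T

def example2(x):
    """Lower bidiagonal matrix P(x, n) of Example 2 (all x_i > 0)."""
    n = len(x)
    P = [[Fraction(0)] * n for _ in range(n)]
    for i in range(n):
        P[i][i] = 1 / Fraction(x[i])
        if i > 0:
            P[i][i - 1] = -1 / Fraction(x[i - 1])
    return P
```

For instance, `inverse(example1(4, 3, 1))` can be compared entry by entry with $(1/a)\min\{ai - b, aj - b\}$, and `inverse(example2([2, 5, 7]))` with the lower triangular matrix whose $i$th row is filled with $x_i$.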

Johnson in [3] studied the possible sign patterns of a matrix which are compatible with inverse-positiveness. Following his results we analyze the inverse-positive concept for a particular type of pattern: the checkerboard pattern.

In addition, it is well known that the inverse of a nonsingular sign regular matrix is either a matrix with checkerboard pattern or is the opposite of a matrix with checkerboard pattern. In particular, the inverse of a nonsingular totally nonnegative matrix has checkerboard pattern.

Definition 1. An $n \times n$ real matrix $A = (a_{ij})$ is said to have a checkerboard pattern if $\mathrm{sign}(a_{ij}) = (-1)^{i+j}$, $i, j = 1, 2, \dots, n$.

We study in this paper the inverse-positivity of bidiagonal, tridiagonal and lower (upper) triangular matrices with checkerboard pattern. We obtain characterizations of the inverse-positivity for each class of matrices.

Definition 2. The Hadamard (or entry-wise) product of two n × n matrices A = (aij) and B =(bij) is A ◦ B =(aijbij).

Several authors have investigated the Hadamard product of matrices. Johnson [2] showed that, if the sign pattern is properly adjusted, the Hadamard product of M-matrices is again an M-matrix, and that for any pair M, N of M-matrices the Hadamard product $M \circ N^{-1}$ is again an M-matrix. This result does not hold in general for inverse-positive matrices. We analyze when, for inverse-positive matrices M and N with checkerboard pattern, the Hadamard product $M \circ N^{-1}$ is again inverse-positive.

The submatrix of a matrix A of size $n \times n$, lying in rows $\alpha$ and columns $\beta$, in which $\alpha, \beta \subseteq N = \{1, \dots, n\}$, is denoted by $A[\alpha|\beta]$, and the principal submatrix $A[\alpha|\alpha]$ is abbreviated to $A[\alpha]$. On the other hand, $A(\alpha|\beta)$ is the submatrix obtained from A by deleting the rows indexed by $\alpha$ and the columns indexed by $\beta$.

2 Checkerboard Inverse-Positive Matrices

For matrices of size $2 \times 2$, it is easy to prove the following result.

Proposition 1. Let A be a $2 \times 2$ nonsingular matrix. A is an inverse-positive matrix if and only if
a) A has a checkerboard pattern and $\det(A) > 0$, or
b) $-A$ has a checkerboard pattern and $\det(A) < 0$.

In general, this result does not hold for nonsingular matrices of size $n \times n$, $n \ge 3$, as we can see in the following example.

Example 3. The nonsingular matrix

\[ A = \begin{pmatrix} 1 & 0 & 0 \\ -1 & 1 & 0 \\ 2 & -1 & 1 \end{pmatrix} \]

has positive determinant and checkerboard pattern, but it is not an inverse-positive matrix.

Now, we are going to analyze the inverse-positivity of bidiagonal, tridiagonal and lower (upper) triangular checkerboard matrices of size $n \times n$, $n \ge 3$. If A is a bidiagonal matrix with checkerboard pattern, then A is an M-matrix, so

Proposition 2. If A is an n×n bidiagonal nonsingular matrix with checkerboard pattern, then A is an inverse-positive matrix.

On the other hand, we can observe that if A is a bidiagonal matrix and $-A$ has checkerboard pattern, then A is never an inverse-positive matrix. Note that, in general, a tridiagonal matrix with checkerboard pattern is not an inverse-positive matrix, as we can observe in the following example.

Example 4. The tridiagonal matrix

\[ A = \begin{pmatrix} 1 & -1 & 0 \\ -2 & 1 & -3 \\ 0 & -4 & 1 \end{pmatrix} \]

is nonsingular with checkerboard pattern, but it is not an inverse-positive matrix.

In the following results we present necessary and sufficient conditions for a tridi- agonal matrix to be inverse-positive.

Theorem 1. Let $A = (a_{ij})$ be an $n \times n$ tridiagonal nonsingular matrix with checkerboard pattern. Then, A is an inverse-positive matrix if and only if

\[ \det A[\alpha] \ge 0, \qquad \alpha \subseteq \{1, 2, \dots, n\}, \ |\alpha| \ge 2. \qquad (1) \]

Proof. We are going to prove the sufficiency of condition (1). By hypothesis we have that all diagonal entries are positive, so we can assume that matrix A has the form

\[
A = \begin{pmatrix}
1 & -a_{12} & \cdots & 0 & 0 \\
-a_{21} & 1 & \cdots & 0 & 0 \\
\vdots & \vdots & & \vdots & \vdots \\
0 & 0 & \cdots & 1 & -a_{n-1,n} \\
0 & 0 & \cdots & -a_{n,n-1} & 1
\end{pmatrix}.
\]

Let $A^{-1} = (b_{ij})$ with $b_{ij} = \frac{1}{\det A} (-1)^{i+j} \det A(j|i)$. By using condition (1) it is easy to prove that $b_{ij} \ge 0$, for all $i, j$.

The necessity of condition (1) follows by a similar reasoning, since if A is an inverse-positive tridiagonal matrix with checkerboard pattern, then $a_{ii} \neq 0$, $i = 1, 2, \dots, n$. □
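Condition (1) and the cofactor formula for $A^{-1}$ used in the proof can be evaluated directly on small matrices. The sketch below is a hedged illustration with hypothetical function names and 0-based indices; it uses exact rational arithmetic and enumerates all principal submatrices, so it is meant for small $n$ only.

```python
from fractions import Fraction
from itertools import combinations

def det(M):
    """Exact determinant by fraction-valued Gaussian elimination."""
    M = [[Fraction(v) for v in row] for row in M]
    n, sign, result = len(M), 1, Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if M[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            M[col], M[pivot] = M[pivot], M[col]
            sign = -sign
        result *= M[col][col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            M[r] = [v - factor * w for v, w in zip(M[r], M[col])]
    return sign * result

def submatrix(A, rows, cols):
    return [[A[r][c] for c in cols] for r in rows]

def adjugate_inverse(A):
    """b_ij = (-1)^(i+j) det A(j|i) / det A, the formula used in the proof."""
    n, d = len(A), det(A)
    idx = list(range(n))
    return [[(-1) ** (i + j)
             * det(submatrix(A, [r for r in idx if r != j], [c for c in idx if c != i])) / d
             for j in range(n)] for i in range(n)]

def principal_minors_nonnegative(A):
    """Condition (1): det A[alpha] >= 0 for every alpha with |alpha| >= 2."""
    n = len(A)
    return all(det(submatrix(A, a, a)) >= 0
               for k in range(2, n + 1) for a in combinations(range(n), k))

def is_inverse_positive(A):
    return det(A) != 0 and all(v >= 0 for row in adjugate_inverse(A) for v in row)
```

Applied to the matrix of Example 4, `principal_minors_nonnegative` fails already on $\alpha = \{1, 2\}$, where $\det A[\alpha] = -1$, and `is_inverse_positive` returns `False`, consistently with Theorem 1.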

When $-A$ has a checkerboard pattern, we can establish the following result.

Proposition 3. Let A be a $3 \times 3$ tridiagonal nonsingular matrix such that $-A$ has a checkerboard pattern. Then, A is an inverse-positive matrix if and only if $\det A > 0$, $\det A[\{1,2\}] \ge 0$ and $\det A[\{2,3\}] \ge 0$.

We can observe that, if A =(aij) is an n×n, n > 3, tridiagonal nonsingular matrix with |aij| > 0, |i − j| < 2, such that −A has a checkerboard pattern, then A is not an inverse-positive matrix. Finally, when A is a nonsingular lower (upper) triangular matrix, with checker- board pattern, the nonnegativity of its inverse is not guaranteed.

Example 5. Let us consider the lower triangular matrix

\[ A = \begin{pmatrix} 1 & 0 & 0 & 0 \\ -2 & 1 & 0 & 0 \\ 3 & -1 & 1 & 0 \\ -4 & 5 & -1 & 1 \end{pmatrix}. \]

It is easy to check that A is not inverse-positive.

We introduce a condition related with the associated graph of a matrix, which we call P-condition.

Definition 3. Let $A = (a_{ij})$ be an $n \times n$ lower (upper) triangular matrix. A satisfies the P-condition if $a_{ij} \le a_{ik} a_{kj}$, $i > k > j$ ($i < k < j$).

We need the next lemma in order to get the main result for the inverse-positivity of this class of matrices.

Lemma 1. Let A be an n × n nonsingular lower triangular matrix, with checker- board pattern, that satisfies the P-condition. Then

\[ \mathrm{sign}(\det A[\{i, i+1, \dots, n\}|\{i-1, i, \dots, n-1\}]) = (-1)^{n+i-1}, \qquad i = 2, 3, \dots, n. \]

If A is an upper triangular matrix, the thesis of the above lemma is

\[ \mathrm{sign}(\det A[\{i-1, i, \dots, n-1\}|\{i, i+1, \dots, n\}]) = (-1)^{n+i-1}, \qquad i = 2, 3, \dots, n. \]

Theorem 2. Let A be an n × n nonsingular lower (upper) triangular matrix with checkerboard pattern, that satisfies the P-condition. Then, A is an inverse-positive matrix.

Proof. Suppose that A is upper triangular. The proof is by induction on n. For $n = 3$,

\[
A_3 = \begin{pmatrix} 1 & -a_{12} & a_{13} \\ 0 & 1 & -a_{23} \\ 0 & 0 & 1 \end{pmatrix}
\quad \text{and} \quad
A_3^{-1} = \begin{pmatrix} 1 & a_{12} & a_{12}a_{23} - a_{13} \\ 0 & 1 & a_{23} \\ 0 & 0 & 1 \end{pmatrix} \ge 0,
\]

by using the P-condition. A matrix of size $n \times n$ can be partitioned as

\[ A_n = \begin{pmatrix} A_{n-1} & v_{n-1} \\ 0 & 1 \end{pmatrix}, \]

where $A_{n-1}$ is the upper triangular submatrix $A[\{1, 2, \dots, n-1\}]$ and $v_{n-1}$ is the submatrix $A[\{1, 2, \dots, n-1\}|\{n\}]$. We can observe that

\[ A_n^{-1} = \begin{pmatrix} A_{n-1}^{-1} & -A_{n-1}^{-1} v_{n-1} \\ 0 & 1 \end{pmatrix}, \]

where

\[ -A_{n-1}^{-1} v_{n-1} = \left[ (-1)^{n+1} \det A[\{1, \dots, n-1\}|\{2, \dots, n\}], \ \dots, \ (-1)^{2n-1} \det A[\{n-1\}|\{n\}] \right]^T. \]

The hypothesis of induction and Lemma 1 allow us to assure that $A_n^{-1} \ge 0$. □

In general, the converse does not hold, as the following example shows.

Example 6. Consider the lower triangular matrix

\[ A = \begin{pmatrix} 1 & 0 & 0 & 0 \\ -7 & 1 & 0 & 0 \\ 0 & -1 & 1 & 0 \\ -1 & 1 & -1 & 1 \end{pmatrix}. \]

It is easy to check that A is inverse-positive, but $a_{41} > a_{42} a_{21}$.
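The relationship between the P-condition and inverse-positivity in Examples 3, 5 and 6 can be reproduced with a few lines of exact arithmetic. This is a small sketch for lower triangular matrices only, with hypothetical helper names and 0-based indices; the inverse is obtained by forward substitution.

```python
from fractions import Fraction

def lower_triangular_inverse(A):
    """Inverse of a nonsingular lower triangular matrix by forward substitution."""
    n = len(A)
    B = [[Fraction(0)] * n for _ in range(n)]
    for j in range(n):
        B[j][j] = 1 / Fraction(A[j][j])
        for i in range(j + 1, n):
            s = sum(A[i][k] * B[k][j] for k in range(j, i))
            B[i][j] = -s / A[i][i]
    return B

def satisfies_p_condition(A):
    """a_ij <= a_ik * a_kj for all i > k > j (lower triangular case)."""
    n = len(A)
    return all(A[i][j] <= A[i][k] * A[k][j]
               for i in range(n) for k in range(i) for j in range(k))

def is_inverse_positive(A):
    return all(v >= 0 for row in lower_triangular_inverse(A) for v in row)

example5 = [[1, 0, 0, 0], [-2, 1, 0, 0], [3, -1, 1, 0], [-4, 5, -1, 1]]
example6 = [[1, 0, 0, 0], [-7, 1, 0, 0], [0, -1, 1, 0], [-1, 1, -1, 1]]
```

Example 5 fails both tests, while Example 6 is inverse-positive although it violates the P-condition at $(i, k, j) = (4, 2, 1)$, which is why the P-condition is only sufficient for $n > 3$.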

However, we can establish that

Theorem 3. Let A be a $3 \times 3$ nonsingular lower (upper) triangular matrix with checkerboard pattern. Then A satisfies the P-condition if and only if A is an inverse-positive matrix.

In order to obtain a necessary condition in the general case, we introduce the following notation (see [4]). Given an $n \times n$ matrix A and the positive integers $1 \le m_1 < m_2 < \cdots < m_k \le n$, we denote

\[ a_{m_1, m_2, \dots, m_k} = (-1)^k \det A[\{m_1, m_2, \dots, m_{k-1}\}|\{m_2, m_3, \dots, m_k\}]. \]

We can establish the following result.

Theorem 4. Let A be an $n \times n$ nonsingular lower (upper) triangular matrix with checkerboard pattern. Then, A is an inverse-positive matrix if and only if for any $1 \le m_1 < m_2 < \cdots < m_k \le n$, $1 \le k \le n$,

\[ a_{m_1, m_2, \dots, m_k} \le 0. \]

Proof. Suppose that A is lower triangular. If $A^{-1} = (b_{ij})$ it is easy to observe that

\[ b_{ij} = -a_{j, j+1, \dots, i}. \]

Analogously, if A is upper triangular we have

\[ b_{ij} = -a_{i, i+1, \dots, j}. \]

□

We can extend the P-condition for general matrices in the following way: Given an n × n matrix A =(pij), A satisfies the P-condition if

\[ p_{ij} \le p_{ik} p_{kj}, \qquad i \neq j \neq k. \]

Finally, by using the above condition we establish the following result.

Theorem 5. Let A be an $n \times n$ P-matrix with checkerboard pattern. If A satisfies the P-condition, then A is an inverse-positive matrix.

3 The Hadamard Product of Inverse-Positive Matrices

Several authors have investigated the Hadamard product of matrices. A celebrated and well-known theorem of Schur states that if A and B are positive semidefinite (nonnegative definite) matrices of the same size, then so is $A \circ B$. For M-matrices, though they have a great many analogies to the positive definite matrices, the Hadamard product of two M-matrices is not, in general, an M-matrix. Johnson [2] showed that the Hadamard product of an M-matrix and its own inverse is again an M-matrix. Recently several authors have investigated the Hadamard product of inverse M-matrices. For example, Wang et al. in [4] proved that the class of inverse M-matrices is Hadamard-closed if and only if $n \le 3$. Our purpose here is to study the inverse-positivity of the Hadamard product $A \circ B^{-1}$ when A and B are inverse-positive matrices. In general, for matrices of size $2 \times 2$, it is easy to prove that

Proposition 4. If A and B are $2 \times 2$ inverse-positive matrices and $\mathrm{sign}(\det A) = \mathrm{sign}(\det B)$, then $A \circ B^{-1}$ is an inverse-positive matrix.

Now, we analyze when the class of lower (upper) triangular inverse-positive ma- trix, with checkerboard pattern, is closed under the Hadamard product. First, we consider the following technical result.

Proposition 5. Let $A = (a_{ij})$ and $B = (b_{ij})$ be upper triangular matrices with unit main diagonal. If $(B^{-1})_{ij}$ denotes the $(i, j)$ entry of $B^{-1}$, we have

\[ (B^{-1})_{ij} = -b_{i, i+1, \dots, j} \quad \text{and} \quad (B^{-1})_{i,i+1} = -b_{i,i+1} \]

and, being $C = A \circ B^{-1} = (c_{ij})$,

\[ c_{ij} = -a_{ij} b_{i, i+1, \dots, j}, \quad \text{for all } i < j. \qquad (2) \]

We can establish a similar technical result for lower triangular matrices.

Theorem 6. Let A and B be $n \times n$, $n \ge 3$, upper (lower) triangular matrices, with checkerboard pattern and unit main diagonal, satisfying the P-condition. Then $A \circ B^{-1}$ is an inverse-positive matrix.

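Proof. Suppose that A and B are upper triangular matrices (the proof for lower triangular matrices is similar). The proof is by induction on n. For $n = 3$,

\[
A = \begin{pmatrix} 1 & -a_{12} & a_{13} \\ 0 & 1 & -a_{23} \\ 0 & 0 & 1 \end{pmatrix},
\qquad
B = \begin{pmatrix} 1 & -b_{12} & b_{13} \\ 0 & 1 & -b_{23} \\ 0 & 0 & 1 \end{pmatrix}
\]

and

\[
B^{-1} = \begin{pmatrix} 1 & b_{12} & b_{12}b_{23} - b_{13} \\ 0 & 1 & b_{23} \\ 0 & 0 & 1 \end{pmatrix} \ge 0.
\]

We observe that

\[
C = A \circ B^{-1} = \begin{pmatrix} 1 & -a_{12}b_{12} & a_{13} \det B[\{1,2\}|\{2,3\}] \\ 0 & 1 & -a_{23}b_{23} \\ 0 & 0 & 1 \end{pmatrix}.
\]

It is easy to prove that $\det C[\{1,2\}|\{2,3\}] \ge 0$, so C satisfies the P-condition and therefore C is inverse-positive.

Now, let A and B be upper triangular matrices of size $n \times n$, $n > 3$,

\[
A = \begin{pmatrix} A_{11} & \bar{a}_{12} \\ 0 & 1 \end{pmatrix}
\quad \text{and} \quad
B = \begin{pmatrix} B_{11} & \bar{b}_{12} \\ 0 & 1 \end{pmatrix}.
\]

Note that

\[
C = A \circ B^{-1} = \begin{pmatrix} A_{11} \circ B_{11}^{-1} & -\bar{a}_{12} \circ B_{11}^{-1} \bar{b}_{12} \\ 0 & 1 \end{pmatrix}
\]

and

\[
C^{-1} = \begin{pmatrix} (A_{11} \circ B_{11}^{-1})^{-1} & [r_{1n}, r_{2n}, \dots, r_{n-1,n}]^T \\ 0 & 1 \end{pmatrix}.
\]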

By using the hypothesis of induction, we only need to prove that $r_{jn} \ge 0$, $j = 1, 2, \dots, n-1$, where $r_{jn} = (C^{-1})_{jn} = -c_{j, j+1, \dots, n}$. By using (2), we have

\[ r_{n-1,n} = -c_{n-1,n} = -a_{n-1,n}(-b_{n-1,n}) \ge 0. \]

Now,

\[ r_{n-2,n} = -c_{n-2,n-1,n} = \det C[\{n-2, n-1\}|\{n-1, n\}] = c_{n-2,n-1} c_{n-1,n} - c_{n-2,n} \]

and by (2)

\[ r_{n-2,n} = (-a_{n-2,n-1} b_{n-2,n-1})(-a_{n-1,n} b_{n-1,n}) - (-a_{n-2,n} b_{n-2,n-1,n}). \]

Taking into account that A and B have checkerboard pattern and satisfy the P-condition, we obtain

\[ r_{n-2,n} = (a_{n-2,n-1} a_{n-1,n} - a_{n-2,n})(b_{n-2,n-1} b_{n-1,n}) + a_{n-2,n} b_{n-2,n} \ge 0. \]

In a similar way,

\[
\begin{aligned}
r_{n-3,n} &= -c_{n-3,n-2,n-1,n} = -\det C[\{n-3, n-2, n-1\}|\{n-2, n-1, n\}] \\
&= -c_{n-3,n-2}\, r_{n-2,n} + c_{n-3,n-1} c_{n-1,n} - c_{n-3,n}.
\end{aligned}
\]

Since C has checkerboard pattern we have

\[ -c_{n-3,n-2}\, r_{n-2,n} \ge 0 \]

and, by using the P-condition and the checkerboard pattern of A and B, we obtain

\[
c_{n-3,n-1} c_{n-1,n} - c_{n-3,n} = (a_{n-3,n-1} a_{n-1,n} - a_{n-3,n})(b_{n-3,n-2,n-1} b_{n-1,n}) + a_{n-3,n}(-b_{n-3,n-2} b_{n-2,n} + b_{n-3,n}) \ge 0.
\]

Therefore, $r_{n-3,n} \ge 0$. In a similar way, we prove that $r_{jn} \ge 0$, for $j = n-4, n-5, \dots, 2$. Finally, we are going to prove that $r_{1n} \ge 0$:

\[
\begin{aligned}
r_{1n} = (C^{-1})_{1n} &= (-1)^{n+1} \det C[\{1, 2, \dots, n-1\}|\{2, 3, \dots, n\}] \\
&= (-1)^{n+1} \big[ c_{12} \det C[\{2, 3, \dots, n-1\}|\{3, 4, \dots, n\}] - c_{13} \det C[\{3, 4, \dots, n-1\}|\{4, 5, \dots, n\}] \\
&\qquad + c_{14} \det C[\{4, \dots, n-1\}|\{5, \dots, n\}] + \cdots + (-1)^{n-1} c_{1,n-1} c_{n-1,n} + (-1)^n c_{1n} \big].
\end{aligned}
\]

By using a reasoning similar to that for $r_{n-3,n}$ if n is even, and similar to that for $r_{n-2,n}$ when n is odd, we obtain that $r_{1n} \ge 0$. □
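Theorem 6 can be probed numerically on small upper triangular instances. The sketch below is illustrative only: the helper names are hypothetical, indices are 0-based, and the checks cover just the unit-diagonal upper triangular case with exact rational arithmetic.

```python
from fractions import Fraction

def upper_triangular_inverse(A):
    """Inverse of a nonsingular upper triangular matrix by back substitution."""
    n = len(A)
    B = [[Fraction(0)] * n for _ in range(n)]
    for j in range(n):
        B[j][j] = 1 / Fraction(A[j][j])
        for i in range(j - 1, -1, -1):
            s = sum(A[i][k] * B[k][j] for k in range(i + 1, j + 1))
            B[i][j] = -s / A[i][i]
    return B

def hadamard(A, B):
    """Entry-wise product A o B."""
    return [[a * b for a, b in zip(row_a, row_b)] for row_a, row_b in zip(A, B)]

def satisfies_p_condition_upper(A):
    """a_ij <= a_ik * a_kj for all i < k < j (upper triangular case)."""
    n = len(A)
    return all(A[i][j] <= A[i][k] * A[k][j]
               for i in range(n) for k in range(i + 1, n) for j in range(k + 1, n))

def hadamard_with_inverse_is_inverse_positive(A, B):
    """Check that C = A o B^{-1} has an entry-wise nonnegative inverse."""
    C = hadamard(A, upper_triangular_inverse(B))
    return all(v >= 0 for row in upper_triangular_inverse(C) for v in row)
```

For example, with $A = \begin{pmatrix} 1 & -1 & 1 \\ 0 & 1 & -1 \\ 0 & 0 & 1 \end{pmatrix}$ and $B = \begin{pmatrix} 1 & -2 & 3 \\ 0 & 1 & -2 \\ 0 & 0 & 1 \end{pmatrix}$, both matrices satisfy the P-condition and $C = A \circ B^{-1}$ turns out to be inverse-positive, as the theorem predicts.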

Acknowledgements. Supported by Ministerio de Ciencia y Tecnología MTM2007-64477.

References

1. Berman, A., Plemmons, R.J.: Nonnegative matrices in the Mathematical Sciences. SIAM, Philadelphia (1994)
2. Johnson, C.R.: A Hadamard Product Involving M-matrices. Linear Algebra and its Applications 4, 261–264 (1977)
3. Johnson, C.R.: Sign patterns of inverse nonnegative matrices. Linear Algebra and its Applications 55, 69–80 (1983)
4. Wang, B.Y., Zhang, X., Zhang, F.: On the Hadamard Product of Inverse M-matrices. Linear Algebra and its Applications 305, 23–31 (2000)

Some Remarks on Links between Positive Invariance, Monotonicity, Strong Lumpability and Coherency in Max-Plus Algebra

Mourad Ahmane and Laurent Truffet

Abstract. In this paper, we make explicit the links in Max-Plus algebra between four apparently different concepts encountered in various domains: Positive invariance, Monotonicity, Strong lumpability and Coherency. The first concept concerns Positive invariance of a particular set by a (linear) map. The second concept concerns Monotonicity of a given matrix. The third concept concerns Strong lumpability and the last concept concerns Coherency. To achieve these objectives, we first recall the idempotent version of Haar's lemma, which gives necessary and sufficient conditions for the inclusion of "one sided" idempotent polyhedra. Then, we generalize this result to the case of the inclusion of "two sided" idempotent polyhedra. Finally, these results allow us to formulate conditions for each concept and to give links between them. Recall that Strong lumpability and Coherency are used for the aggregation (reduction) of systems. All the proofs are based on residuation theory, which plays a central role in duality theory.

Basic Notations

• All vectors are column vectors. $\le_n$ denotes the component-wise (or product) order on $X^n$, where $(X,\le)$ is a poset (i.e. $x \le_n y \iff \forall i,\ x_i \le y_i$).
• $S^{m \times k}$ denotes the set of all $m \times k$-dimensional matrices over a semifield $S$.
• For any matrix $A$, $a_{i,j}$, $a_{l,\cdot}$ and $a_{\cdot,k}$ denote the entry $(i,j)$, the $l$-th row and the $k$-th column of $A$, respectively.
• If $A, B \in S^{n \times m}$ then $A \le B$ denotes the entry-wise comparison of $A$ and $B$.

Mourad Ahmane
SET Laboratory, University of Technology of Belfort-Montbéliard, 90010 Belfort, France, e-mail: [email protected]
Laurent Truffet
Ecole des Mines de Nantes, 4 rue Alfred Kastler, BP.20722, 44307 Nantes Cedex3, France, e-mail: [email protected]

R. Bru and S. Romero-Vivó (Eds.): Positive Systems, LNCIS 389, pp. 195–204. springerlink.com © Springer-Verlag Berlin Heidelberg 2009

1 Introduction

In our previous work [1], the following result, formulated in Max-Plus algebra, characterizes in algebraic form the inclusion of "one sided" idempotent polyhedra.

Result 1 (Idempotent version of Haar's Lemma). Let $(S,\oplus,\otimes,\varepsilon,e)$ be a complete idempotent semifield. Consider two polyhedra $\mathcal{P}(P,p) = \{x \in S^d : P \otimes x \le_m p\}$ and $\mathcal{P}(Q,q) = \{x \in S^d : Q \otimes x \le_{m'} q\}$, where $P$ (resp. $Q$) is an $m \times d$ (resp. $m' \times d$) matrix and $p$ (resp. $q$) is an $m$ (resp. $m'$) dimensional column vector. Define $\mathrm{supp}(v) = \{l : v_l \neq \varepsilon\}$, $T_v = \{l : v_l = \top\}$, and let $\bar{\zeta}$ denote the complementary set of $\zeta$. Assume the following hypotheses:

(H0): $\forall j$, $\mathrm{supp}(P_{\cdot,j}) \neq \emptyset$ (i.e. $P$ has non null columns);
(H1): $\forall j$, $\mathrm{supp}(q) \cap T_{P_{\cdot,j}} = \emptyset$.

The assertion $\mathcal{P}(P,p) \subseteq \mathcal{P}(Q,q)$ is true if and only if there exists a matrix $H \in S^{m' \times m}$ such that the following conditions hold:

(i). $Q \le H \otimes P$ and (ii). $H \otimes p \le_{m'} q$. (1)

It is by now classical that the linear system over an idempotent semiring or semifield $(S,\oplus,\otimes,\varepsilon,e)$ defined by

$$(d,A): \quad x(0) \in S^d,\ A \in S^{d \times d}, \qquad x(n) = A \otimes x(n-1),\ n \ge 1, \qquad (2)$$

models linear systems on dioids of practical interest (e.g. some manufacturing systems, communication protocols (TCP), transmission of flows in networks, etc.). For more details, see e.g. [2], [8]. Recall that the $i$-th component $(A \otimes x(n-1))_i$ of System (2) is expanded as $\max_{j=1,\ldots,d} (A_{i,j} + x_j(n-1))$.

The objectives of this paper are as follows. First, we generalize the idempotent version of Haar's Lemma given by Result 1, which gives necessary and sufficient conditions for the inclusion of "one sided" idempotent polyhedra, to the case of the inclusion of "two sided" idempotent polyhedra (see Proposition 1). Then, according to Result 1 or Proposition 1, we give conditions for each property: Positive invariance, Monotonicity, Strong lumpability and Coherency. Finally, we show the links between these different properties. We recall the following definitions for each property:

• The first concept is Positive invariance of particular sets by a (linear) map. This concept is of particular importance because it leads to control strategies (see e.g. [6]). A set $E \subset S^d$ is said to be positively invariant by the map $f : S^d \to S^d$ if $f(E) \subset E$. In this paper, $f$ will be a linear map, i.e. $f(x) = A \otimes x$, where $A \in S^{d \times d}$. In this case, we say that $E$ is $A$-invariant.

• The second concept concerns Monotonicity of a given matrix. Given two matrices $W, W' \in S^{m \times d}$, a matrix $A$ is said to be $(W,W')$-monotone if $\forall x,y \in S^d : x \le_{W,W'} y \Longrightarrow A \otimes x \le_{W,W'} A \otimes y$. The relation $\le_{W,W'}$ is defined by: $\forall x,y \in S^d : x \le_{W,W'} y \iff W \otimes x \le_m W' \otimes y$. The links between Positive invariance and Monotonicity have already been noticed in the context of nonlinear dynamical systems (see e.g. [5]), and in the context of linear dynamical systems over idempotent semirings (see e.g. [15], [14]).

• The last two concepts concern Strong lumpability and Coherency. These properties are used for the aggregation of systems (see e.g. [13, pp. 16, 17]). A basic way to reduce the dimension of these linear systems over dioids is to lump or collapse some states into a single mega-state. We thus obtain a partition $\{C(1),\ldots,C(N)\}$ of the state set $\mathcal{S} = \{1,\ldots,d\}$ of System (2) into $\Sigma = \{1,\ldots,N\}$ classes with $N \le d$. Given such a partition, we define a non-decreasing surjective map $\Phi$ from the state set $\mathcal{S}$ into the set $\Sigma$ by $\forall k \in \mathcal{S}, \forall l \in \Sigma : \Phi(k) = l \iff k \in C(l)$. The map $\Phi$ will be referred to as a lumping map. We associate with the map $\Phi$ a lumping matrix $V \in S^{N \times d}$ defined by $\forall I \in \Sigma, \forall j \in \mathcal{S} : v_{I,j} = \delta_{\{\Phi(j) = I\}}$, where the $\{\varepsilon,e\}$-valued function $\delta_{\{\cdot\}}$ is $e$ if the logical assertion is true, and $\varepsilon$ otherwise. Then we deal with the system:

$$(d,A,V): \quad x(0) \in S^d, \qquad x(n) = A \otimes x(n-1),\ n \ge 1 \quad (I), \qquad z(n) = V \otimes x(n). \qquad (3)$$

In general, the sequence of vectors $z(n)$ does not satisfy a difference equation of the form (3,(I)). A condition under which there exists a matrix $\hat{A} \in S^{N \times N}$ such that $z(n+1) = \hat{A} \otimes z(n)$ is called a lumping condition (see e.g. [13]). These lumpability conditions are the counterparts of those existing for Markov chains ([12]).

The paper is organized as follows. In Section 2, we introduce the main definitions of Max-Plus algebra (see e.g. [2], [3], [11]). In Section 3, we generalize the idempotent version of Haar's Lemma given by Result 1 from "one sided" to "two sided" idempotent polyhedra (see Proposition 1). In Section 4, we express each concept (Positive invariance, Monotonicity, Strong lumpability and Coherency) in the form of Result 1 or Proposition 1. In Section 5, we conclude by stating the links between the various concepts mentioned above.

2 Basic Algebraic Structures

For any set $S$, $(S,\oplus,\otimes,\varepsilon,e)$ is a semiring if $(S,\oplus,\varepsilon)$ is a commutative monoid, $(S,\otimes,e)$ is a monoid, $\otimes$ distributes over $\oplus$, and the neutral element $\varepsilon$ for $\oplus$ is also absorbing for $\otimes$, i.e. $\forall a \in S$, $\varepsilon \otimes a = a \otimes \varepsilon = \varepsilon$; here $e$ is the neutral element for $\otimes$. $(S,\oplus,\otimes,\varepsilon,e)$ is an idempotent semiring (also called a dioid) if it is a semiring and $\oplus$ is idempotent, i.e. $\forall a \in S$, $a \oplus a = a$. If $(S,\otimes,e)$ is a commutative monoid, then the idempotent semiring $(S,\oplus,\otimes,\varepsilon,e)$ is said to be commutative.

$(S,\oplus,\otimes,\varepsilon,e)$ is an idempotent semifield if it is an idempotent semiring and $(S \setminus \{\varepsilon\},\otimes,e)$ is a group, i.e. a monoid in which all elements are invertible ($\forall a \in S \setminus \{\varepsilon\}$, $\exists a^{-1} : a \otimes a^{-1} = a^{-1} \otimes a = e$). If moreover $(S \setminus \{\varepsilon\},\otimes,e)$ is commutative, then the idempotent semifield $(S,\oplus,\otimes,\varepsilon,e)$ is said to be commutative.

Let $(S,\oplus,\otimes,\varepsilon,e)$ be an idempotent semiring. Each element of $S^n$ is an $n$-dimensional column vector. We equip $S^n$ with the two laws $\oplus$ and $\otimes$ as follows:

$$\forall x,y \in S^n,\ (x \oplus y)_i := x_i \oplus y_i; \qquad \forall s \in S,\ (s.x)_i = (s \otimes x)_i := s \otimes x_i, \qquad i = 1,\ldots,n.$$

The addition $\oplus$ and the multiplication $\otimes$ are naturally extended to matrices with compatible dimensions. Any $n \times p$ matrix $A$ is associated with a $(\oplus,\otimes)$-linear map $A : S^p \to S^n$. The $(i,j)$ entry, the $l$-th row vector and the $k$-th column vector of matrix $A$ are denoted $A_{i,j}$, $A_{l,\cdot}$ and $A_{\cdot,k}$, respectively. Let $(S,\oplus,\otimes,\varepsilon,e)$ be an idempotent semiring or an idempotent semifield; then $(S,\oplus,\varepsilon)$ is an idempotent monoid, which can be equipped with the natural order relation $\le$ defined by

$$\forall a,b \in S : \quad a \le b \ \overset{\text{def}}{\iff}\ a \oplus b = b. \qquad (4)$$
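As an illustration of these definitions, the following minimal sketch instantiates the semiring as the max-plus algebra on floats, assuming $\oplus = \max$, $\otimes = +$, $\varepsilon = -\infty$ and $e = 0$. The names `EPS`, `oplus`, `mat_vec` and `leq` are hypothetical helpers chosen for this example, not notation from the paper.

```python
import math

EPS = -math.inf  # epsilon: neutral for max, absorbing for +
E = 0.0          # e: neutral for +


def oplus(a, b):
    """Max-plus addition."""
    return max(a, b)


def mat_vec(A, x):
    """(A (x) x)_i = max_j (A[i][j] + x[j])."""
    return [max(a_ij + x_j for a_ij, x_j in zip(row, x)) for row in A]


def leq(a, b):
    """Natural order (4): a <= b iff a (+) b == b."""
    return oplus(a, b) == b


# Two steps of the recursion x(n) = A (x) x(n-1) of System (2).
A = [[2.0, EPS],
     [1.0, 3.0]]
x0 = [0.0, 1.0]
x1 = mat_vec(A, x0)
x2 = mat_vec(A, x1)
```

With these choices the natural order (4) coincides with the usual order on the extended reals, which is why `leq` reduces to a comparison with `max`.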

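We say that $(S,\oplus,\otimes,\varepsilon,e)$ is complete if it is complete as a naturally ordered set and if the respective left and right multiplications $\lambda_a, \rho_a : S \to S$, $\lambda_a(x) = a \otimes x$, $\rho_a(x) = x \otimes a$, are continuous for all $a \in S$. We adopt the following notations for their residuals:

$$\forall a,b \in S : \quad \lambda_a^{\sharp}(b) = a \backslash\!\!\circ\, b := \bigoplus \{x \in S : a \otimes x \le b\}, \qquad \rho_a^{\sharp}(b) = b /\!\!\circ\, a := \bigoplus \{x \in S : x \otimes a \le b\}.$$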

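A typical example of a complete dioid is the top completion of an idempotent semifield. Note that if $a \in S$ is invertible then $a \backslash\!\!\circ\, b = a^{-1} \otimes b$ and $b /\!\!\circ\, a = b \otimes a^{-1}$. Note also that, as $S$ is complete, it possesses a top element $\bigoplus S$, denoted $\top = +\infty$. We have by convention the following identities:

$$\varepsilon \otimes \top = \top \otimes \varepsilon = \varepsilon, \quad \text{and} \quad \forall a \in S,\ a \oplus \top = \top,\ a \wedge \top = \top \wedge a = a.$$

We suppose in addition that (for a discussion, see e.g. [2, pp. 163-164]):

$$\forall a \neq \varepsilon, \quad a \otimes \top = \top \otimes a = \top. \qquad (5)$$

By definition of $/\!\!\circ$ (and likewise for $\backslash\!\!\circ$), the properties of $\top$ and $\varepsilon$, and (5), we have:

$$\forall a \in S : \quad a /\!\!\circ\, \varepsilon = \top, \quad \top /\!\!\circ\, a = \top, \quad a /\!\!\circ\, \top = \begin{cases} \varepsilon & \text{if } a \neq \top \\ \top & \text{if } a = \top \end{cases}, \quad \varepsilon /\!\!\circ\, a = \begin{cases} \varepsilon & \text{if } a \neq \varepsilon \\ \top & \text{if } a = \varepsilon \end{cases}. \qquad (6)$$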

The operations $\cdot /\!\!\circ\, \cdot$ and $\cdot \backslash\!\!\circ\, \cdot$ are extended to matrices and vectors with compatible dimensions, assuming that all the entries of these matrices and vectors are in a complete set $S$, where $\le$ denotes the entry-wise comparison of matrices:

$$(A \backslash\!\!\circ\, y)_i := \bigwedge_j (a_{j,i} \backslash\!\!\circ\, y_j); \qquad (A \backslash\!\!\circ\, B)_{i,j} := \Big(\bigoplus \{X : A \otimes X \le B\}\Big)_{i,j} = \bigwedge_k (a_{k,i} \backslash\!\!\circ\, b_{k,j});$$
$$(D /\!\!\circ\, C)_{i,j} := \Big(\bigoplus \{X : X \otimes C \le D\}\Big)_{i,j} = \bigwedge_l (d_{i,l} /\!\!\circ\, c_{j,l}).$$
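The matrix residuation formulas can be sketched numerically in the max-plus algebra. This is a minimal illustration assuming all entries are finite reals, so that $a \backslash\!\!\circ\, b = b - a$ and the conventions for $\varepsilon$ and $\top$ in (5) and (6) never come into play; the function names are hypothetical.

```python
def mat_mul(A, B):
    """Max-plus product: (A (x) B)[i][j] = max_k (A[i][k] + B[k][j])."""
    return [[max(A[i][k] + B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def left_residual(A, B):
    """(A \\ B)[i][j] = min_k (B[k][j] - A[k][i]): greatest X with A (x) X <= B."""
    m, d, p = len(A), len(A[0]), len(B[0])
    return [[min(B[k][j] - A[k][i] for k in range(m)) for j in range(p)]
            for i in range(d)]


def right_residual(D, C):
    """(D / C)[i][j] = min_l (D[i][l] - C[j][l]): greatest X with X (x) C <= D."""
    return [[min(D[i][l] - C[j][l] for l in range(len(C[0])))
             for j in range(len(C))] for i in range(len(D))]


def mat_leq(A, B):
    """Entry-wise comparison of two matrices of the same shape."""
    return all(a <= b for ra, rb in zip(A, B) for a, b in zip(ra, rb))


A = [[1.0, 2.0], [0.0, 3.0]]
B = [[5.0], [4.0]]
X = left_residual(A, B)      # greatest solution of A (x) X <= B

D = [[3.0, 6.0]]
C = [[1.0, 2.0], [0.0, 5.0]]
Y = right_residual(D, C)     # greatest solution of Y (x) C <= D
```

Raising any entry of `X` (or `Y`) by a positive amount violates the corresponding inequality, which is the sense in which the residual is the greatest sub-solution.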

3 Generalization of Idempotent Version of Haar’s Lemma

The aim of this section is to generalize the idempotent version of Haar's Lemma given in Result 1, which provides necessary and sufficient conditions for the inclusion of "one sided" idempotent polyhedra, to the case of the inclusion of "two sided" idempotent polyhedra. Before that, we need the following result.

Result 2 ([1]). Let $(S,\oplus,\otimes,\varepsilon,e)$ be a complete idempotent semifield. Let $A \in S^{m \times d}$ be a matrix and $b, g \in S^m$ be two vectors. Assume that Hypotheses (H0) and (H1) of Result 1 are satisfied in this case. Then the following equality holds:

b/◦(A\◦g)=(b/◦g) ⊗ A. (7)

Proposition 1. Let $(S,\oplus,\otimes,\varepsilon,e)$ be a complete idempotent semifield. Consider the "two sided" idempotent polyhedra $\mathcal{P}(P,f(y)) = \{x \in S^d, y \in S^d : P \otimes x \le_m f(y)\}$ and $\mathcal{P}(Q,g(y)) = \{x \in S^d, y \in S^d : Q \otimes x \le_{m'} g(y)\}$, where $P \in S^{m \times d}$, $Q \in S^{m' \times d}$ and $f : S^d \to S^m$, $g : S^d \to S^{m'}$ are two functions. Assume that Hypotheses (H0) and (H1) of Result 1 are satisfied. The assertion $\forall y \in S^d : \mathcal{P}(P,f(y)) \subseteq \mathcal{P}(Q,g(y))$ is true if and only if there exists a matrix $H \in S^{m' \times m}$ such that the following conditions hold:

$$\forall y \in S^d : \quad \text{(i). } Q \le H \otimes P \quad \text{and} \quad \text{(ii). } H \otimes f(y) \le_{m'} g(y). \qquad (8)$$

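Proof. The assertion $\forall y \in S^d : \mathcal{P}(P,f(y)) \subseteq \mathcal{P}(Q,g(y))$ can be rewritten as follows:

$$\forall y \in S^d,\ \forall x \in S^d : \quad P \otimes x \le_m f(y) \ \Rightarrow\ Q \otimes x \le_{m'} g(y),$$

which is equivalent (by definition of $\backslash\!\!\circ$) to:

$$\forall y \in S^d,\ \forall x \in S^d : \quad x \le_d P \backslash\!\!\circ\, f(y) \ \Rightarrow\ Q \otimes x \le_{m'} g(y).$$

Since $\otimes$ is non-decreasing, this last assertion is true if and only if

$$\forall y \in S^d : \quad Q \otimes (P \backslash\!\!\circ\, f(y)) \le_{m'} g(y),$$

which is equivalent (by definition of $/\!\!\circ$) to: $\forall y \in S^d : Q \le g(y) /\!\!\circ\, (P \backslash\!\!\circ\, f(y))$. Using Result 2, we obtain:

$$\forall y \in S^d : \quad Q \le (g(y) /\!\!\circ\, f(y)) \otimes P. \qquad (9)$$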

It is sufficient to take $H = g(y) /\!\!\circ\, f(y)$. From the definition of residuation, it is clear that condition (ii) of Proposition 1 holds, and from (9) condition (i) is satisfied too, which ends the proof. □
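The argument of the proof can be checked numerically in the simplest setting. The sketch below treats the "one sided" case of Result 1 (constant right-hand sides $p$ and $q$) in the max-plus algebra, assuming all entries are finite so that (H0) and (H1) raise no difficulty. It compares the direct inclusion test $Q \otimes (P \backslash\!\!\circ\, p) \le q$ with the certificate $H = q /\!\!\circ\, p$. The function names are hypothetical.

```python
def mat_vec(A, x):
    """Max-plus matrix-vector product."""
    return [max(a + b for a, b in zip(row, x)) for row in A]


def mat_mul(A, B):
    """Max-plus matrix product."""
    return [[max(A[i][k] + B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def included(P, p, Q, q):
    """Direct test: the greatest point P \\ p of P(P, p) must lie in P(Q, q)."""
    x_max = [min(p[k] - P[k][j] for k in range(len(P)))
             for j in range(len(P[0]))]
    return all(v <= qi for v, qi in zip(mat_vec(Q, x_max), q))


def haar_certificate(P, p, Q, q):
    """Return H = q / p if it satisfies (i) Q <= H (x) P and (ii) H (x) p <= q."""
    H = [[qi - pk for pk in p] for qi in q]
    HP = mat_mul(H, P)
    cond_i = all(Q[i][j] <= HP[i][j]
                 for i in range(len(Q)) for j in range(len(Q[0])))
    cond_ii = all(v <= qi for v, qi in zip(mat_vec(H, p), q))
    return H if cond_i and cond_ii else None


P, p = [[0.0, 0.0]], [3.0]       # {x : max(x1, x2) <= 3}
Q1, q = [[1.0, 0.0]], [5.0]      # {x : x1 <= 4, x2 <= 5}: contains P(P, p)
Q2 = [[3.0, 0.0]]                # {x : x1 <= 2, x2 <= 5}: does not
```

On these two examples the direct test and the certificate agree, as the equivalence predicts.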

4 Positive Invariance, Monotonicity, Strong Lumpability and Coherency

In this section, we give the formulation of each property under the form of Result 1 or Proposition 1.

Remark 1. In each case of the rest of the paper, Hypotheses (H0) and (H1) of Result 1 must be respected.

4.1 Positive Invariance

A set $E \subset S^d$ is said to be positively invariant by the map $f : S^d \to S^d$ if $f(E) \subset E$. In this paper, $f$ will be a linear map, i.e. $f(x) = A \otimes x$, where $A \in S^{d \times d}$. In this case, we say that $E$ is $A$-invariant.

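Proposition 2. Let $(S,\oplus,\otimes,\varepsilon,e)$ be a complete idempotent semifield. Consider the set $E := \{x \in S^d, y \in S^{d'} : K \otimes x \le_m K' \otimes y\}$, where $K \in S^{m \times d}$ and $K' \in S^{m \times d'}$ have non null columns. Let $z = \begin{pmatrix} x \\ y \end{pmatrix} \in S^{d+d'}$ be a vector and $M \in S^{(d+d') \times (d+d')}$ be a matrix. The set $E$ is $M$-invariant if there exists a matrix $H \in S^{(m+m) \times (m+m)}$ such that the following conditions hold:

$$\text{(i). } T \otimes M \le H \otimes T \quad \text{and} \quad \text{(ii). } H \otimes L \otimes z \le_{m+m} L \otimes M \otimes z, \qquad (10)$$

where $T = \begin{pmatrix} K & K' \\ \mathrm{E} & K' \end{pmatrix} \in S^{(m+m) \times (d+d')}$, $L = \begin{pmatrix} \mathrm{E} & K' \\ K & K' \end{pmatrix} \in S^{(m+m) \times (d+d')}$ and $\mathrm{E}$ is the $m \times d$ matrix whose entries are all $\varepsilon$.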

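Proof. From assertion (4), the inequality $K \otimes x \le_m K' \otimes y$ defining the set $E$ can be rewritten in Max-Plus algebra as $(K \otimes x) \oplus (K' \otimes y) = K' \otimes y$. This last assertion can be written as the pair of inequalities

$$(K \otimes x) \oplus (K' \otimes y) \le_m K' \otimes y, \qquad K' \otimes y \le_m (K \otimes x) \oplus (K' \otimes y),$$

which is equivalent to:

$$\left\{ z = \begin{pmatrix} x \\ y \end{pmatrix} \in S^{d+d'} : \begin{pmatrix} K & K' \\ \mathrm{E} & K' \end{pmatrix} \otimes z \le_{m+m} \begin{pmatrix} \mathrm{E} & K' \\ K & K' \end{pmatrix} \otimes z \right\} = \{ T \otimes z \le_{m+m} L \otimes z \}.$$

The set $E$ is said to be $M$-invariant if $z \in E \Rightarrow M \otimes z \in E$, which is equivalent to:

$$T \otimes z \le_{m+m} L \otimes z \ \Rightarrow\ T \otimes M \otimes z \le_{m+m} L \otimes M \otimes z. \qquad (11)$$

From Proposition 1, taking $Q = T \otimes M$, $P = T$, $f(z) = L \otimes z$ and $g(z) = L \otimes M \otimes z$, equation (1) becomes:

$$\forall z \in S^{d+d'} : \quad \mathcal{P}(T, L \otimes z) \subseteq \mathcal{P}(T \otimes M, L \otimes M \otimes z),$$

which reads

$$\forall g \in S^{d+d'},\ \forall z \in S^{d+d'} : \quad T \otimes g \le L \otimes z \ \Rightarrow\ T \otimes M \otimes g \le L \otimes M \otimes z. \qquad (12)$$

We remark that equation (12) implies equation (11) (take $g = z$), and the result is proved. □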

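Result 3 ([1]). Let $(S,\oplus,\otimes,\varepsilon,e)$ be a complete idempotent semifield and $D, G \in S^{m \times d}$ be two matrices. The following equality holds:

$$\bigwedge_{y \in S^d} \big( (D \otimes y) /\!\!\circ\, (G \otimes y) \big) = D /\!\!\circ\, G. \qquad (13)$$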

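Proposition 3. Let $(S,\oplus,\otimes,\varepsilon,e)$ be a complete idempotent semifield. Consider the set $E := \{x \in S^d, y \in S^{d'} : K \otimes x \le_m K' \otimes y\}$, where $K \in S^{m \times d}$ and $K' \in S^{m \times d'}$ have non null columns. Let $z = \begin{pmatrix} x \\ y \end{pmatrix} \in S^{d+d'}$ be a vector and $M \in S^{(d+d') \times (d+d')}$ be a matrix. The set $E$ is $M$-invariant if there exists a matrix $H \in S^{(m+m) \times (m+m)}$ such that the following conditions hold:

$$\text{(i). } T \otimes M \le H \otimes T \quad \text{and} \quad \text{(ii). } H \otimes L \le L \otimes M, \qquad (14)$$

where $T = \begin{pmatrix} K & K' \\ \mathrm{E} & K' \end{pmatrix} \in S^{(m+m) \times (d+d')}$, $L = \begin{pmatrix} \mathrm{E} & K' \\ K & K' \end{pmatrix} \in S^{(m+m) \times (d+d')}$ and $\mathrm{E}$ is the $m \times d$ matrix whose entries are all $\varepsilon$.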

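Proof. By residuation, Condition (ii) of assertion (1) becomes:

$$H \le (L \otimes M \otimes z) /\!\!\circ\, (L \otimes z).$$

Taking the infimum over $z$ and using Result 3, we obtain $H \le (L \otimes M) /\!\!\circ\, L$, that is, $H \otimes L \le L \otimes M$. Condition (i) is the same as Condition (i) of Proposition 2, which ends the proof. □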

4.2 Monotonicity

Definition 1. Let $A \in S^{d \times d}$ and $W, W' \in S^{m \times d}$. The matrix $A$ is said to be $(W,W')$-monotone if

$$\forall x,y \in S^d : \quad x \le_{W,W'} y \ \Rightarrow\ A \otimes x \le_{W,W'} A \otimes y. \qquad (15)$$

In the literature, this property is used to simplify some problems of performance assessment of linear systems on dioids. Here, we can characterize some monotone operations in a more general case and may therefore hope to build more elaborate simplification techniques.

Result 4 ([1]). Let $(S,\oplus,\otimes,\varepsilon,e)$ be a complete idempotent semifield and $W, W' \in S^{m \times d}$ be two matrices. Then a matrix $A \in S^{d \times d}$ is $(W,W')$-monotone if and only if

$$W \otimes A \le \big( (W' \otimes A) /\!\!\circ\, W' \big) \otimes W. \qquad (16)$$

Proposition 4. Let $(S,\oplus,\otimes,\varepsilon,e)$ be a complete idempotent semifield. Consider three matrices $A \in S^{d \times d}$ and $W, W' \in S^{m \times d}$. Assume that Hypotheses (H0) and (H1) of Result 1 are satisfied. Then the matrix $A$ is $(W,W')$-monotone if and only if there exists a matrix $H \in S^{m \times m}$ such that the following conditions hold:

$$\text{(i). } W \otimes A \le H \otimes W \quad \text{and} \quad \text{(ii). } H \otimes W' \le W' \otimes A. \qquad (17)$$

Proof. From equation (16), it suffices to take $H = (W' \otimes A) /\!\!\circ\, W'$ and the proof is complete. □

Proposition 5. When $W = W'$ in Proposition 4, a matrix $A$ is $(W,W)$-monotone if and only if there exists a matrix $H \in S^{m \times m}$ such that the following condition holds:

$$W \otimes A = H \otimes W. \qquad (18)$$

Remark 2. Proposition 5 coincides with Theorem 4.2 of [15], obtained by a different proof.
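Condition (18) lends itself to a direct numerical test. The following minimal sketch works in the max-plus algebra with finite entries only (an assumption that sidesteps (H0) and (H1)). It uses the observation that if some $H$ satisfies (18), then the greatest sub-solution $(W \otimes A) /\!\!\circ\, W$ satisfies it too, so it is enough to test that single candidate. The function names are hypothetical.

```python
def mat_mul(A, B):
    """Max-plus matrix product."""
    return [[max(A[i][k] + B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def right_residual(D, C):
    """(D / C)[i][j] = min_l (D[i][l] - C[j][l]): greatest X with X (x) C <= D."""
    return [[min(D[i][l] - C[j][l] for l in range(len(C[0])))
             for j in range(len(C))] for i in range(len(D))]


def monotone_certificate(A, W):
    """Return H with W (x) A == H (x) W if one exists, else None."""
    WA = mat_mul(W, A)
    H = right_residual(WA, W)        # greatest H with H (x) W <= W (x) A
    return H if mat_mul(H, W) == WA else None


W = [[0.0, 0.0]]                     # x <=_W y  iff  max(x) <= max(y)
A_good = [[1.0, 2.0], [2.0, 1.0]]
A_bad = [[1.0, 2.0], [0.0, 1.0]]
```

For `A_bad`, the vectors $x = (0, -10)$ and $y = (-10, 0)$ have the same maximum, yet $\max(A \otimes x) = 1 < 2 = \max(A \otimes y)$, so the order is not preserved in both directions and no certificate exists.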

4.3 Strong Lumpability

Let $\Phi$ be a lumping map from $\mathcal{S}$ into $\Sigma$ and $V$ the corresponding lumping matrix ($\forall I \in \Sigma, \forall j \in \mathcal{S} : v_{I,j} = \delta_{\{\Phi(j) = I\}}$). We can define a partition of $\mathcal{S}$ into $N$ aggregates $\Phi^{-1}(J) = [m_J, M_J]$ such that $\mathrm{cardinal}(\Phi^{-1}(J)) = d_J$, $J \in \Sigma$. Recall the system given by (3):

$$(d,A,V): \quad x(0) \in S^d, \qquad x(n) = A \otimes x(n-1),\ n \ge 1, \qquad z(n) = V \otimes x(n).$$

The sequence $(x(n), n \ge 0)$ of system (3) with a given initial data $x(0)$ is said to be lumpable if the aggregated sequence $(z(n), n \ge 0)$ satisfies the reduced equation $z(n+1) = \hat{A} \otimes z(n)$ for some matrix $\hat{A} \in S^{N \times N}$.

Definition 2. Let $V \in S^{N \times d}$ be a lumping matrix. The matrix $A \in S^{d \times d}$ is said to be strongly lumpable by $V$, or simply $V$-lumpable [13], if there exists $\hat{A} \in S^{N \times N}$ such that

$$V \otimes A = \hat{A} \otimes V, \qquad (19)$$

or equivalently: $\forall I \in \Sigma, \forall J \in \Sigma, \forall j \in \Phi^{-1}(J)$, $\bigoplus_{i \in \Phi^{-1}(I)} a_{i,j} = \hat{a}_{I,J}$.
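The entry-wise form of (19) suggests a simple procedure for building $\hat{A}$ when it exists. The sketch below is a minimal illustration in the max-plus algebra, assuming $\varepsilon = -\infty$, $e = 0$, 0-based class labels and a surjective lumping map given as a list `phi`. All names are hypothetical.

```python
import math

EPS = -math.inf  # epsilon
E = 0.0          # e


def mat_mul(A, B):
    """Max-plus matrix product (entries may be EPS)."""
    return [[max(A[i][k] + B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def lumping_matrix(phi, n_classes):
    """V[I][j] = e if phi[j] == I, epsilon otherwise."""
    return [[E if phi[j] == I else EPS for j in range(len(phi))]
            for I in range(n_classes)]


def lump(A, phi, n_classes):
    """Return A_hat with V (x) A == A_hat (x) V, or None if A is not V-lumpable."""
    d = len(A)
    A_hat = [[None] * n_classes for _ in range(n_classes)]
    for I in range(n_classes):
        for j in range(d):
            # (V (x) A)[I][j] = max of a[i][j] over the states i of class I
            val = max(A[i][j] for i in range(d) if phi[i] == I)
            J = phi[j]
            if A_hat[I][J] is None:
                A_hat[I][J] = val
            elif A_hat[I][J] != val:
                return None          # value depends on j inside class J
    return A_hat


phi = [0, 0, 1]                      # states 0 and 1 lumped, state 2 alone
V = lumping_matrix(phi, 2)
A1 = [[1.0, 3.0, 0.0], [3.0, 2.0, 0.0], [4.0, 4.0, 5.0]]
A2 = [[1.0, 2.0, 0.0], [3.0, 2.0, 0.0], [4.0, 4.0, 5.0]]
A1_hat = lump(A1, phi, 2)
```

When `lump` succeeds, the reduced recursion $z(n+1) = \hat{A} \otimes z(n)$ can be iterated on $N$-dimensional vectors instead of $d$-dimensional ones.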

4.4 Coherency

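Definition 3. A matrix $A \in S^{d \times d}$ is $C$-coherent [13] with respect to the lumping map $\Phi$ if there exists a matrix $\hat{A} \in S^{N \times N}$ such that:

$$A \otimes C = C \otimes \hat{A}. \qquad (20)$$

In particular, $V \otimes C = I_N$, where $I_N = (\delta_{\{I = J\}})_{I,J = 1,\ldots,N}$ is the $N \times N$ identity matrix.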

5 Conclusion

In this section, we give the different links between the different notions seen in the previous subsections.

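Proposition 6. Let $(S,\oplus,\otimes,\varepsilon,e)$ be a complete idempotent semifield. Consider the set $E := \{x \in S^d, y \in S^{d'} : K \otimes x \le_m K' \otimes y\}$, where $K \in S^{m \times d}$ and $K' \in S^{m \times d'}$ have non null columns. Let $z = \begin{pmatrix} x \\ y \end{pmatrix} \in S^{d+d'}$ be a vector, $M \in S^{(d+d') \times (d+d')}$, $T = \begin{pmatrix} K & K' \\ \mathrm{E} & K' \end{pmatrix} \in S^{(m+m) \times (d+d')}$ and $L = \begin{pmatrix} \mathrm{E} & K' \\ K & K' \end{pmatrix} \in S^{(m+m) \times (d+d')}$, where $\mathrm{E}$ is the $m \times d$ matrix whose entries are all $\varepsilon$. Then we have the following implication:

$$M \text{ is } (T,L)\text{-monotone} \ \Longrightarrow\ E \text{ is } M\text{-invariant}. \qquad (21)$$

Proof. Obvious from equations (14) and (17). □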

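Proposition 7. Let $(S,\oplus,\otimes,\varepsilon,e)$ be a complete idempotent semifield. Let $\Phi : \{1,\ldots,d\} \to \Sigma = \{1,\ldots,N\}$ be the aggregation function associated with the matrix $V$ defined by ($\forall I \in \Sigma, \forall j \in \mathcal{S} : v_{I,j} = \delta_{\{\Phi(j) = I\}}$), and let $v$ be a $d$-dimensional vector such that $\forall i$, $v_i \neq \varepsilon$. Let $\Pi_y$ be the $y$-th $d \times d$ matrix defined by: $\Pi_y(x,x) = e$ if $x \in \Phi^{-1}(y)$ and $\Pi_y(x_1,x_2) = \varepsilon$ otherwise. Define $A^* := D_v \otimes A^T \otimes D_v^{-1}$ and $C = [\Pi_y \otimes v;\ y \in \Sigma]$. Assume that $\forall y \in \Sigma : \mathbf{1}^T \otimes \Pi_y \otimes v = e$. Then

$$A^* \text{ is strongly lumpable} \iff A \text{ is coherent}. \qquad (22)$$

Proof.

$$A^* \text{ is strongly lumpable} \iff \exists H : V \otimes A^* = H \otimes V \iff \exists H : V \otimes D_v \otimes A^T \otimes D_v^{-1} = H \otimes V \iff \exists H : (V \otimes D_v) \otimes A^T = H \otimes (V \otimes D_v).$$

Now, let us remark that $(V \otimes D_v) = [\Pi_y \otimes v;\ y \in \Sigma]^T = C^T$, thus:

$$\exists H : (V \otimes D_v) \otimes A^T = H \otimes (V \otimes D_v) \iff \exists H : C^T \otimes A^T = H \otimes C^T \iff A \text{ is coherent}. \qquad \square$$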

Proposition 8. In the case where ξ = Im(C), from Proposition 7 we have:

A∗ is strongly lumpable ⇐⇒ ξ is A − invariant. (23)

Proposition 9. Let $(S,\oplus,\otimes,\varepsilon,e)$ be a complete idempotent semifield. We have the following assertions:
1. $A^*$ is $C$-coherent $\iff$ $\xi$ is $A$-invariant.
2. $A$ is $V$-lumpable $\iff$ $A$ is $(V,V)$-monotone.
3. $A$ is $C$-coherent $\iff$ $A^T$ is $(C^T,C^T)$-monotone.

Proof. 1. From equations (22) and (23). 2. From equations (17) and (19). 3. From equations (17) and (20). □

References

1. Ahmane, M., Truffet, L.: Idempotent versions of Haar's Lemma: links between comparison of discrete event systems with different state spaces and control. Kybernetika Journal 43(3), 369–391 (2007)
2. Baccelli, F., Cohen, G., Olsder, G.J., Quadrat, J.P.: Synchronisation and Linearity. John Wiley and Sons, Chichester (1992)
3. Blyth, T.S., Janowitz, M.F.: Residuation Theory. Pergamon Press, Oxford (1972)
4. Birkhoff, G.: Lattice Theory, vol. XXV. AMS Colloquium Publicat (1967)
5. Bitsoris, G., Gravalou, E.: Comparison Principle, Positively Invariance and Constrained Regulation of Nonlinear Systems. Automatica 31, 217–222 (1995)
6. Blanchini, F.: Set Invariance in Control. Automatica 35, 1747–1767 (1999)
7. Ledoux, J., Truffet, L.: Comparison and aggregation of max-plus linear systems. Linear Algebra and its Applications 378C, 245–272 (2004)
8. De Vries, R., De Schutter, B., De Moor, B.: On max-plus algebraic models for transportation networks. In: Proceeding of the International Workshop on Discrete Event Systems, pp. 457–462 (1998)
9. Cohen, G., Gaubert, S., Quadrat, J.P.: Duality and Separation Theorems in Idempotent Semimodules. Linear Algebra and its Applications 379, 395–422 (2004)
10. Dorea, C.E.T., Hennet, J.C.: (A,B)-Invariant Polyhedral Sets of Linear Discrete-Time Systems. Journal of Optimization Theory and Applications 103 (1999)
11. Golan, J.S.: The theory of semiring with Applications in Mathematics and Theorical Computer Science. Longman Sci. Tech. 54 (1992)
12. Ledoux, J.: A geometric Invariant in Weak Lumpability of Finite Markov Chains. Journal of Applied Probability 34, 847–858 (1997)
13. Quadrat, J.P., Max-Plus, W.: Min-plus linearity and statistical mechanics. Markov Processes and related Fields 3, 565–597 (1997)
14. Truffet, L., Wagneur, E.: Monotonicity and Positive Invariance of Linear Systems Over Dioids. Journal on Discrete Mathematics 150, 29–39 (2005)
15. Truffet, L.: Monotone Linear Dynamical Systems Over Dioids. In: Benvenuti, L., De Santis, A., Farina, L. (eds.) Positive Systems: Theory and Applications. Lecture Notes in Control and Information Sciences, vol. 294, pp. 39–46. Springer, Heidelberg (2003)
16. Wagneur, E.: Duality in the max-algebra, Nantes (1998)

Stability Analysis and Synthesis for Linear Positive Systems with Time-Varying Delays

Mustapha Ait Rami

Abstract. This paper provides necessary and sufficient conditions for the asymptotic stability of linear positive systems subject to time-varying delays. It introduces an original method for directly solving the proposed stability and stabilization problems without using the Lyapunov theory that is commonly used in the field of stability analysis. In this way, and for the reader's convenience, the paper avoids long and tedious calculations.

1 Introduction

The reaction of real world systems to exogenous signals is never instantaneous and is always affected by certain time delays. Differential delay systems, also known as hereditary systems or systems with aftereffects, represent a class of infinite-dimensional systems that can model the influence of delays on a wide range of systems such as propagation phenomena, population dynamics and many physical, biological and chemical processes. The study of the effects of delays on the stability and control of dynamical systems (delays in the state and/or in the input) is of great practical interest.

For general linear systems, even nominally stable systems may, when affected by delays, exhibit very complex behaviors such as oscillations, instability and poor performance. In addition, it is well known that small constant delays may destabilize some systems, while large constant delays may stabilize others. Note that the effect of time-varying delays is still not well understood for general linear systems.

In contrast, for positive linear time-delay systems (systems whose state variables take only nonnegative values are said to be positive, see [11, 12, 19, 22] for general references), it was shown at the very beginning of the 1980s that the presence of constant delays does not affect the stability performance of the system [20, 21, 23] (see also the recent works [6, 7, 13, 16, 17]). Since then, no one has conjectured that this fact holds true for time-varying delays. In this paper, we go a step further and show this remarkable fact: the stability performance of positive linear time-delay systems is insensitive to any kind of time-varying delays.

The aim of this paper is to present a new method and techniques for the stability analysis and synthesis of linear positive systems in the presence of time-varying delays. The proposed approach to stability analysis is quite new and does not rely on any Lyapunov-based technique. This paper develops theoretical results with necessary and sufficient conditions for the stability and stabilizability of linear positive delayed systems. Specifically, we will show that the stabilization problem can be cast either as an LP problem or as an LMI problem. Since there exists powerful LP software (such as CPLEX) that can efficiently solve very large problems, we believe that the LP approach is simpler and can have a legitimate numerical advantage over the LMI approach. We stress that the proposed LP formulation was introduced by the author in the context of positive observation of delayed systems [6, 7] and interval observers [5]. An earlier LP formulation was introduced by the author in [1–4] for dealing with positive observers and positive systems with state and control constraints. It has been adapted for positive systems with constant delays [16, 17] and for positive 2D systems [8, 15].

The remainder of the paper is organized as follows. In Section 2, some preliminary facts and results are given. Section 3 provides necessary and sufficient conditions for the stability of positive linear systems with time-varying delays. Section 4 solves the stabilization problem for standard state-feedback controls and also for nonnegative state-feedback controls. Finally, Section 5 gives some conclusions.

Mustapha Ait Rami
Dept. Ingenieria de Sistemas y Automatica, Universidad de Valladolid, 47005 Valladolid, Spain, e-mail: [email protected]

R. Bru and S. Romero-Vivó (Eds.): Positive Systems, LNCIS 389, pp. 205–215. springerlink.com © Springer-Verlag Berlin Heidelberg 2009

Notations

$\mathbb{R}^n_+$ denotes the non-negative orthant of the $n$-dimensional real space $\mathbb{R}^n$. $M^T$ denotes the transpose of the real matrix $M$. For a real matrix $M$, $M > 0$ means that its components are positive: $M_{ij} > 0$, and $M \ge 0$ means that its components are nonnegative: $M_{ij} \ge 0$. $\mathrm{diag}(\lambda)$ is the diagonal matrix whose diagonal is formed by the components of the vector $\lambda$. $M \succ 0$, where $M$ is a symmetric real matrix, means that $M$ is positive definite.

2 Statements and Preliminaries

This section provides the preliminary statements and technical tools that are essential for the characterization and treatment of positive systems satisfying a delay differential equation. The facts and results introduced here will be used in the derivation of our main stability result.

The system under investigation is described by a general forced linear delay differential equation:

$$\frac{dx}{dt} = A x(t) + \sum_{i=1}^{m} A_i\, x(t - \tau_i(t)) + B u(t), \qquad (1)$$

where the given matrices $A, A_1, \ldots, A_m \in \mathbb{R}^{n \times n}$ and $B \in \mathbb{R}^{n \times n_u}$ are time-invariant, and $0 \le \tau_1(\cdot), \ldots, 0 \le \tau_m(\cdot)$ are time-varying delays that are supposed to be Lebesgue measurable. Throughout the paper we use the notation

$$\tau := \max_{1 \le i \le m}\ \sup_{t \ge 0}\ \tau_i(t).$$

The vector $x(t) \in \mathbb{R}^n$ is the instantaneous system state at time $t$ and $u(t) \in \mathbb{R}^{n_u}$ represents an external input. The whole state at time $t$ of system (1) is infinite dimensional and is given by the set $\{x(s) \mid -\tau \le s \le t\}$.

Following [14], it can be shown that the solution to the system's equation (1) exists, is unique and is totally determined by any given locally Lebesgue integrable initial vector function $\phi(\cdot)$ such that

x(s)=φ(s) for − τ ≤ s ≤ 0.

Throughout this paper, the free system is assumed to satisfy a positivity constraint on its states as follows.

Definition 1. System (1) is said to be positive if, for any nonnegative initial condition $\phi(t) \in \mathbb{R}^n_+$ such that $x(t) = \phi(t)$ for $-\tau \le t \le 0$, the corresponding trajectory is nonnegative, that is, $x(t) \in \mathbb{R}^n_+$ for all $t \ge 0$.

We stress that the intrinsic properties of the positivity behavior of the delayed system are related to Metzler matrices and positive matrices.

Definition 2. A real matrix $M$ is called a Metzler matrix if its off-diagonal elements are nonnegative: $M_{ij} \ge 0$, $i \neq j$.

Definition 3. A real matrix $M$ is called a positive matrix if all its elements are nonnegative: $M_{ij} \ge 0$.

The following result shows how Metzler matrices are intrinsically connected to positivity.

Lemma 1. (a) A real matrix $M$ is Metzler $\iff$ $e^{tM} \ge 0$, $\forall t \ge 0$. (b) If $M$ is Metzler and $v > 0$, then $e^{tM} v > 0$, $\forall t \ge 0$.

Proof. Item (a) is well known [22]. Item (b) is trivial. □

The following result can be interpreted as an extension of the classical result on positive linear systems (see [22]); its proof follows the same line of reasoning and is therefore omitted. This result also offers an easy test for checking the positivity of the free system.

Proposition 1. System (1) with $u = 0$ is positive if and only if $A$ is a Metzler matrix and $A_1, \ldots, A_m$ are positive matrices.
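The positivity test of Proposition 1 is purely a sign check on the entries, as the following minimal sketch illustrates. Matrices are given as nested lists, and the function names are hypothetical helpers for this example.

```python
def is_metzler(M):
    """All off-diagonal entries are nonnegative (Definition 2)."""
    n = len(M)
    return all(M[i][j] >= 0 for i in range(n) for j in range(n) if i != j)


def is_nonnegative(M):
    """All entries are nonnegative (a 'positive matrix', Definition 3)."""
    return all(v >= 0 for row in M for v in row)


def is_positive_delay_system(A, delayed):
    """Test of Proposition 1 for the free system (u = 0):
    A must be Metzler and every delayed matrix A_i must be nonnegative."""
    return is_metzler(A) and all(is_nonnegative(Ai) for Ai in delayed)


A = [[-3.0, 1.0], [0.0, -2.0]]
A1 = [[0.5, 0.0], [0.2, 0.1]]
A1_bad = [[0.5, -0.1], [0.2, 0.1]]   # one negative entry
A_bad = [[-3.0, -1.0], [0.0, -2.0]]  # negative off-diagonal entry
```

Note that the test does not involve the delays $\tau_i(\cdot)$ at all: positivity of the free system depends only on the sign pattern of $A, A_1, \ldots, A_m$.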

3 Main Result

In this section, the stability of autonomous linear positive systems subject to time-varying delays is studied. The derived result involves necessary and sufficient conditions. Previous results and equivalent conditions for a Metzler matrix to be Hurwitz can be found in many places in the literature, see for example [4, 10, 18]. In the following, some well-known facts are presented; they will be used to derive our main result.

Lemma 2. Let $M$ be a Metzler matrix. Then the following conditions are equivalent.
i) $M$ is Hurwitz (all its eigenvalues have negative real part).
ii) The inverse of $M$ exists and all its components are nonpositive: $M^{-1} \le 0$.
iii) There exists a vector $\lambda > 0$ such that $M \lambda < 0$.
iv) There exists a diagonal matrix $D \succ 0$ such that $M^T D + D M \prec 0$.

In the sequel, conditions for the asymptotic stability of the general positive linear time-delay system (1) are derived. Before that, we need the following technical lemma.

Lemma 3. Let $A, A_1, \ldots, A_m \in \mathbb{R}^{n \times n}$ be constant matrices and assume that the matrix $A$ is Metzler and $A_1, \ldots, A_m$ are nonnegative. Consider the following delayed linear system with constant delay $\tau$:

$$\frac{dz}{dt} = A z(t) + \sum_{i=1}^{m} A_i\, z(t - \tau), \qquad z(s) = \lambda \ \text{ for } -\tau \le s \le 0. \qquad (2)$$

Then, if $(A + \sum_{i=1}^{m} A_i)\lambda < 0$ and $\lambda > 0$, $z(t)$ is strictly decreasing for $t \ge 0$. Moreover, $z(t)$ converges asymptotically to zero. That is, $\dot{z}(t) < 0$, $\forall t > 0$, and $\lim_{t \to \infty} z(t) = 0$.

Proof. Since $A$ is Metzler and $A_1, \ldots, A_m$ are nonnegative, the negated derivative $-\dot{z}$ of the solution to system (2) satisfies a linear delayed positive system equation, and it can easily be shown that $-\dot{z}(t) > 0$, $\forall t \ge 0$. Also, because $(A + \sum_{i=1}^{m} A_i)\lambda < 0$ and $\lambda > 0$, $z(t)$ converges asymptotically to zero (for this fact, see [6, 13]), and our claim is proved. □
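The behavior described in Lemma 3 can be observed on a scalar example. The sketch below is only a numerical illustration, not part of the proof: it applies an explicit Euler discretization (an assumption of this example, with hypothetical parameter values $a = -2$, $a_1 = 1$, $\lambda = 1$, $\tau = 1$) to $\dot{z}(t) = a z(t) + a_1 z(t - \tau)$ with constant history $z(s) = \lambda$.

```python
def simulate(a, a1, lam, tau, h, t_end):
    """Explicit Euler scheme for z'(t) = a*z(t) + a1*z(t - tau),
    with constant history z(s) = lam on [-tau, 0]."""
    d = round(tau / h)               # delay expressed in time steps
    z = [lam] * (d + 1)              # samples on [-tau, 0]; z[-1] is z(0)
    for _ in range(round(t_end / h)):
        z_now, z_delayed = z[-1], z[-1 - d]
        z.append(z_now + h * (a * z_now + a1 * z_delayed))
    return z[d:]                     # samples z(0), z(h), z(2h), ...


# Here a is Metzler (scalar case), a1 >= 0 and (a + a1) * lam = -1 < 0.
traj = simulate(a=-2.0, a1=1.0, lam=1.0, tau=1.0, h=0.01, t_end=30.0)
```

For this step size the discrete recursion is a nonnegative combination of past values, so the sampled trajectory stays positive and decreases at every step, in line with the statement of the lemma.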

Theorem 1. Assume that system (1) is positive, or equivalently that the matrix $A$ is Metzler and $A_1, \ldots, A_m$ are positive matrices. Then the following statements are equivalent.

i) There exist constant delays $\tau_1^*, \ldots, \tau_m^*$ and a nonnegative initial functional condition $\phi^*(\cdot)$ with $\phi^*(0) > 0$ for which the free system (1) ($u = 0$) is asymptotically stable.
ii) System (1) is asymptotically stable for every nonnegative initial condition $\phi(\cdot) \ge 0$ and for any bounded arbitrary time-varying delays.
iii) System (1) is asymptotically stable for every initial condition taking values in $\mathbb{R}^n$ ($\phi(\cdot)$ has indefinite sign) and for any bounded arbitrary time-varying delays.
iv) There exists $\lambda \in \mathbb{R}^n$ such that

$$\Big(A + \sum_{i=1}^{m} A_i\Big)\lambda < 0, \qquad \lambda > 0. \qquad (3)$$

Proof. The implications (iii) $\Rightarrow$ (ii) $\Rightarrow$ (i) are obvious. The rest of the proof proceeds in three steps. We emphasize that the second step is the most delicate and subtle part of the proof.

Step 1: (i) $\Rightarrow$ (iv). By integrating System (1) we have

$$x(T) - x(0) = A \int_0^T x(t)\,dt + \sum_{i=1}^{m} A_i \int_0^T x(t - \tau_i^*)\,dt,$$

which, by a change of variable, can be expressed as the following identity:

$$\Big(A + \sum_{i=1}^{m} A_i\Big) \int_0^T x(t)\,dt = x(T) + \sum_{i=1}^{m} A_i \int_{T - \tau_i^*}^{T} x(t)\,dt - \sum_{i=1}^{m} A_i \int_{-\tau_i^*}^{0} \phi^*(t)\,dt - x(0).$$

Since $x(T)$ goes to zero, so does $\sum_{i=1}^{m} A_i \int_{T - \tau_i^*}^{T} x(t)\,dt$. Moreover, since $\phi^*$ is nonnegative and $x(0)$ is positive, the term $\sum_{i=1}^{m} A_i \int_{-\tau_i^*}^{0} \phi^*(t)\,dt + x(0)$ is constant and positive. Thus, it suffices to select a sufficiently large $T$ to get

$$\Big(A + \sum_{i=1}^{m} A_i\Big)\lambda < 0, \qquad \lambda > 0,$$

where $\lambda$ is defined as $\lambda = \int_0^T x(t)\,dt$, which is positive due to the fact that $x(0)$ is positive and the trajectory $x(t)$ is continuous.

Step 2: (iv) $\Rightarrow$ (ii). Let $\phi(\cdot) \ge 0$ be any initial functional condition and consider its associated trajectory $x(\cdot)$. Take any vector $\lambda > 0$ satisfying $(A + \sum_{i=1}^{m} A_i)\lambda < 0$. There exists a constant scalar $\alpha > 0$ such that

$$\alpha \phi(s) < \lambda, \qquad \forall s : -\tau \le s \le 0.$$

Note that, by linearity, the trajectory associated with $\alpha\phi(\cdot)$ is $\alpha x(\cdot)$, so that the scaled trajectory satisfies

$$\alpha x(t) = e^{tA} \alpha x(0) + \int_0^t e^{(t-s)A} \sum_{i=1}^{m} A_i\, \alpha x(s - \tau_i(s))\,ds. \qquad (4)$$

Now consider the following delayed linear system with constant delay $\tau$ such that $\tau \ge \max_{1 \le i \le m} \sup_{t \ge 0} \tau_i(t)$:

$$\frac{dz}{dt} = A z(t) + \sum_{i=1}^{m} A_i\, z(t - \tau), \qquad z(s) = \lambda \ \text{ for } -\tau \le s \le 0. \qquad (5)$$

We claim that $\alpha x(t) < z(t)$, $\forall t \ge 0$. If this does not hold, let $t^*$ be the first time at which at least one component $x_i(t^*)$ of $x(t^*)$ satisfies $\alpha x_i(t^*) \ge z_i(t^*)$, so that $\alpha x(s) < z(s)$, $\forall s : -\tau \le s < t^*$. Based on the integral expression (4), we perform a comparison at time $t^*$, using the fact that $e^{t^*A}(z(0) - \alpha x(0)) > 0$ (since $z(0) > \alpha x(0)$) and that $e^{(t^*-s)A} \ge 0$ if $t^* \ge s$ (apply Lemma 1). Since the matrices $A_1, \ldots, A_m$ are positive and $\alpha x(s) < z(s)$, $\forall s : -\tau \le s < t^*$, we obtain

$$\alpha x(t^*) < e^{t^*A} z(0) + \int_0^{t^*} e^{(t^*-s)A} \sum_{i=1}^{m} A_i\, z(s - \tau)\,ds. \qquad (6)$$

To see why this holds, we use Lemma 3, which asserts that $z(t)$ is strictly decreasing. From this fact, we see that

$$\alpha x(t - \tau_i(t)) < z(t - \tau_i(t)) \le z(t - \tau), \qquad \forall t : -\tau \le t < t^*.$$

Multiplying $\alpha x(t - \tau_i(t)) - z(t - \tau) \le 0$ by $e^{(t^*-t)A} A_i \ge 0$, integrating and summing from $1$ up to $m$, we obtain

$$\int_0^{t^*} e^{(t^*-s)A} \sum_{i=1}^{m} A_i\, \alpha x(s - \tau_i(s))\,ds \le \int_0^{t^*} e^{(t^*-s)A} \sum_{i=1}^{m} A_i\, z(s - \tau)\,ds,$$

so that, keeping in mind the strict inequality $e^{t^*A} z(0) > \alpha e^{t^*A} x(0)$, inequality (6) indeed holds at time $t^*$.

We now reach the crucial conclusion. The right-hand side of inequality (6) is nothing other than $z(t^*)$. Consequently, $\alpha x(t^*) < z(t^*)$, which contradicts the fact that there is a component such that $\alpha x_i(t^*) \ge z_i(t^*)$. We therefore have $0 \le \alpha x(t) < z(t)$, $\forall t \ge 0$, and $z(t)$ goes to zero (since $A + \sum_{i=1}^{m} A_i$ is Hurwitz; see Lemma 3). Hence, we have shown that system (1) is asymptotically stable for every initial functional condition $\phi(\cdot)$ taking values in $\mathbb{R}^n_+$.

Step 3: (ii) $\Rightarrow$ (iii). This implication results from the linearity of the system and the fact that $\phi$ can be decomposed as $\phi = \phi^+ - \phi^-$, where $\phi^+ \ge 0$ and $\phi^- \ge 0$. The proof is complete. □

Corollary 1. Assume that system (1) is positive, or equivalently that the matrix $A$ is Metzler and $A_1,\dots,A_m$ are positive matrices. Then, the following statements are equivalent.
i) There exist constant time delays $\tau_1^*,\dots,\tau_m^*$ and a nonnegative initial functional condition $\phi^*(\cdot)$ with $\phi^*(0) > 0$ for which the free system (1) ($u = 0$) is asymptotically stable.
ii) System (1) is asymptotically stable for every initial condition and for arbitrary bounded time-varying delays.
iii) The inverse of $A + \sum_{i=1}^m A_i$ exists and all its components are nonpositive:
$$\Big(A + \sum_{i=1}^m A_i\Big)^{-1} \le 0.$$
iv) There exists a vector $\lambda$ such that $(A + \sum_{i=1}^m A_i)\lambda < 0$, $\lambda > 0$.
v) There exists a diagonal matrix $D$ such that
$$\Big(A + \sum_{i=1}^m A_i\Big)D + D\Big(A + \sum_{i=1}^m A_i\Big)^T \prec 0, \quad D \succ 0.$$
vi) $A + \sum_{i=1}^m A_i$ is a Hurwitz matrix.

Proof. It suffices to apply Theorem 1 and Proposition 2.
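The equivalences in Corollary 1 can be illustrated numerically on a small case. The following Python sketch uses hypothetical 2×2 data (the matrices and all variable names are ours, not taken from the paper) and only the standard library. It checks that the inverse of the Metzler matrix $A + A_1$ is entrywise nonpositive, that $\lambda = -(A + A_1)^{-1}[1\ 1]^T$ is a positive vector with $(A + A_1)\lambda < 0$, and that the matrix is Hurwitz by the usual trace and determinant test for 2×2 matrices.

```python
# Hypothetical 2x2 data (not from the paper): A is Metzler, A1 is nonnegative.
A = [[-3.0, 1.0], [0.5, -2.0]]
A1 = [[0.5, 0.2], [0.3, 0.4]]

# M = A + A1 is again Metzler (nonnegative off-diagonal entries).
M = [[A[r][c] + A1[r][c] for c in range(2)] for r in range(2)]

# Closed-form inverse of a 2x2 matrix.
det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
M_inv = [[M[1][1] / det, -M[0][1] / det],
         [-M[1][0] / det, M[0][0] / det]]

# Statement "inverse exists and is entrywise nonpositive".
inverse_nonpositive = all(M_inv[r][c] <= 0 for r in range(2) for c in range(2))

# Statement "there is lam > 0 with M lam < 0": take lam = -M^{-1} [1, 1]^T,
# so that M lam = -[1, 1]^T up to rounding.
lam = [-(M_inv[r][0] + M_inv[r][1]) for r in range(2)]
M_lam = [M[r][0] * lam[0] + M[r][1] * lam[1] for r in range(2)]

# A real 2x2 matrix is Hurwitz iff its trace is negative and its determinant positive.
trace = M[0][0] + M[1][1]
is_hurwitz = trace < 0 and det > 0

print(inverse_nonpositive, lam, M_lam, is_hurwitz)
```

The choice $\lambda = -(A + A_1)^{-1}\mathbf{1}$ is just one convenient certificate; any positive vector satisfying the inequality would serve.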

4 Controllers Design

The aim of this section is to show how our stability result can be applied to compute stabilizing feedback controllers. In particular, control laws that take only nonnegative values will be considered, owing to their importance in practice. The following result provides necessary and sufficient conditions for the existence of a stabilizing control law that preserves the positivity of the system. It also provides an easy and efficient approach for checking the solvability of the stabilization problem and for computing a stabilizing state-feedback control by using either LP or LMI software.

Theorem 2. Assume that $A_1,\dots,A_m$ are positive matrices. Then, the following statements are equivalent.
i) There exists a stabilizing memoryless state-feedback law $u(t) = Kx(t)$ such that the resulting closed-loop system (1) is positive and asymptotically stable for arbitrary bounded time-varying delays.
ii) There exists a matrix $K \in \mathbb{R}^{n_u\times n}$ such that $A + BK$ is a Metzler matrix and $A + BK + \sum_{i=1}^m A_i$ is a Hurwitz matrix.
iii) The following LP problem in the variables $\lambda \in \mathbb{R}^n$ and $Z \in \mathbb{R}^{n_u\times n}$ is feasible:
$$\begin{cases} \Big(A + \sum_{i=1}^m A_i\Big)\lambda + BZ\begin{bmatrix}1\\ \vdots\\ 1\end{bmatrix} < 0,\\[4pt] A\,\mathrm{diag}(\lambda) + BZ + I \ge 0,\\[2pt] \lambda > 0. \end{cases} \tag{7}$$
Moreover, a gain matrix $K$ satisfying conditions i) and ii) can be computed as $K = Z\,\mathrm{diag}(\lambda)^{-1}$, where the vector $\lambda$ and the matrix $Z$ are any feasible solution to the above LP problem.
iv) The following LMI problem in the variables $D \in \mathbb{R}^{n\times n}$ and $Y \in \mathbb{R}^{n_u\times n}$ is feasible:
$$\begin{cases} \Big(A + \sum_{i=1}^m A_i\Big)D + D\Big(A^T + \sum_{i=1}^m A_i^T\Big) + BY + Y^TB^T \prec 0,\\[4pt] AD + BY + I \ge 0,\\[2pt] D \succ 0. \end{cases} \tag{8}$$
Moreover, a gain matrix $K$ satisfying conditions i) and ii) can be computed as $K = YD^{-1}$, where the matrices $D$ and $Y$ are any feasible solution to the above LMI problem.

Proof. The equivalence between i) and ii) is straightforward from Theorem 1. Now let us show that ii) and iii) are equivalent. First, consider the implication ii) ⇒ iii). Note that since $A + BK$ is Metzler and $A_1,\dots,A_m$ are positive matrices, $A + BK + \sum_{i=1}^m A_i$ is Metzler. Hence, by Corollary 1, $A + BK + \sum_{i=1}^m A_i$ is Hurwitz if and only if there exists a vector $\lambda > 0$ such that
$$\Big(A + BK + \sum_{i=1}^m A_i\Big)\lambda < 0.$$

Now, define $K = Z\,\mathrm{diag}(\lambda)^{-1}$. Since $K\lambda = Z\,\mathrm{diag}(\lambda)^{-1}\lambda = Z[1\ \cdots\ 1]^T$, with this change of variable the above inequality is exactly the first inequality in condition iii). The second inequality in the LP constraints is obtained as follows. Note that $A + BK$ is Metzler if and only if $(A + BK)\,\mathrm{diag}(\lambda)$ is Metzler, or equivalently (by adding the identity matrix $I$)
$$(A + BK)\,\mathrm{diag}(\lambda) + I \ge 0;$$
this holds true by choosing $\lambda$ with sufficiently small components (since the stability condition is homogeneous in $\lambda$). Thus, recalling that $K = Z\,\mathrm{diag}(\lambda)^{-1}$, the above inequality is nothing else than the second inequality $A\,\mathrm{diag}(\lambda) + BZ + I \ge 0$ in the LP constraints. The reverse implication iii) ⇒ ii) is obtained by the same matrix manipulation. Also, to show the equivalence between ii) and iv), it suffices to use the LMI condition given by Corollary 1, make the change of variable $K = YD^{-1}$ and follow the same line of argument as for the LP formulation. Thus, the proof is complete.
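The change of variable used in the proof can be traced on a small instance. The following Python sketch takes hypothetical data (the matrices, the candidate point and all names are ours, not from the paper), checks the three constraints of LP (7) at a given pair $(\lambda, Z)$, recovers $K = Z\,\mathrm{diag}(\lambda)^{-1}$, and then confirms that $A + BK$ is Metzler and that $(A + BK + A_1)\lambda < 0$. It verifies a candidate point only; it is not an LP solver.

```python
# Hypothetical data (not from the paper): n = 2 states, n_u = 1 input, m = 1 delayed term.
A = [[1.0, 1.0], [0.5, -3.0]]
B = [[1.0], [0.0]]
A1 = [[0.2, 0.1], [0.1, 0.3]]
n = 2

# Candidate LP variables: lam in R^2 and Z in R^{1x2}.
lam = [0.25, 0.25]
Z = [[-1.0, 0.0]]


def matmul(P, Q):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(P[r][k] * Q[k][c] for k in range(len(Q))) for c in range(len(Q[0]))]
            for r in range(len(P))]


BZ = matmul(B, Z)

# First constraint of (7): (A + A1) lam + B Z [1 ... 1]^T < 0.
first = [sum((A[r][c] + A1[r][c]) * lam[c] + BZ[r][c] for c in range(n)) for r in range(n)]

# Second constraint of (7): A diag(lam) + B Z + I >= 0 (entrywise).
second = [[A[r][c] * lam[c] + BZ[r][c] + (1.0 if r == c else 0.0) for c in range(n)]
          for r in range(n)]

lp_feasible = (all(v < 0 for v in first)
               and all(v >= 0 for row in second for v in row)
               and all(v > 0 for v in lam))

# Gain recovery: K = Z diag(lam)^{-1}, i.e. column j of Z divided by lam[j].
K = [[Z[0][c] / lam[c] for c in range(n)]]

# Check statement ii): A + BK is Metzler and (A + BK + A1) lam < 0.
BK = matmul(B, K)
closed = [[A[r][c] + BK[r][c] for c in range(n)] for r in range(n)]
is_metzler = all(closed[r][c] >= 0 for r in range(n) for c in range(n) if r != c)
stability = [sum((closed[r][c] + A1[r][c]) * lam[c] for c in range(n)) for r in range(n)]

print(lp_feasible, K, is_metzler, stability)
```

In this instance the pair $\lambda = (1, 1)$, $Z = [-4\ \ 0]$ gives the same gain but violates the second constraint, which shows why the proof rescales $\lambda$ to have small components.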

Now, the following result provides necessary and sufficient conditions for the existence of a stabilizing nonnegative control law that preserves the positivity of the system.

Theorem 3. The following statements are equivalent.
i) There exists a stabilizing nonnegative memoryless state-feedback law $u(t) = Kx(t) \ge 0$ such that the resulting closed-loop system (1) is positive and asymptotically stable for arbitrary time-varying delays.
ii) There exists a matrix $K \in \mathbb{R}^{n_u\times n}$ such that $K \ge 0$, $A + BK$ is a Metzler matrix and $A + BK + \sum_{i=1}^m A_i$ is a Hurwitz matrix.
iii) The following LP problem in the variables $\lambda \in \mathbb{R}^n$ and $Z \in \mathbb{R}^{n_u\times n}$ is feasible:
$$\begin{cases} \Big(A + \sum_{i=1}^m A_i\Big)\lambda + BZ\begin{bmatrix}1\\ \vdots\\ 1\end{bmatrix} < 0,\\[4pt] A\,\mathrm{diag}(\lambda) + BZ + I \ge 0,\\[2pt] Z \ge 0,\\[2pt] \lambda > 0. \end{cases} \tag{9}$$
Moreover, a gain matrix $K$ satisfying conditions i) and ii) can be computed as $K = Z\,\mathrm{diag}(\lambda)^{-1}$, where the vector $\lambda$ and the matrix $Z$ are any feasible solution to the above LP problem.
iv) The following LMI problem in the variables $D \in \mathbb{R}^{n\times n}$ and $Y \in \mathbb{R}^{n_u\times n}$ is feasible:
$$\begin{cases} \Big(A + \sum_{i=1}^m A_i\Big)D + D\Big(A^T + \sum_{i=1}^m A_i^T\Big) + BY + Y^TB^T \prec 0,\\[4pt] AD + BY + I \ge 0,\\[2pt] Y \ge 0,\\[2pt] D \succ 0. \end{cases} \tag{10}$$
Moreover, a gain matrix $K$ satisfying conditions i) and ii) can be computed as $K = YD^{-1}$, where the matrices $D$ and $Y$ are any feasible solution to the above LMI problem.

Proof. It is easy to see that $u(t) = Kx(t) \ge 0$ for every nonnegative state $x(t)$ is equivalent to $K \ge 0$. The rest of the proof mimics that of Theorem 2 and is therefore omitted.
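The step linking the sign of the gain to the sign of the LP variable can be written out explicitly. The short derivation below is only a sketch of that step, based on the change of variable $K = Z\,\mathrm{diag}(\lambda)^{-1}$ with $\lambda > 0$ stated in the theorem:

$$K = Z\,\mathrm{diag}(\lambda)^{-1} \iff K_{ij} = \frac{Z_{ij}}{\lambda_j}, \qquad \lambda_j > 0 \;\Longrightarrow\; \big(K \ge 0 \iff Z \ge 0\big).$$

So the additional requirement $K \ge 0$ appears in the LP simply as the linear constraint $Z \ge 0$. The same argument applies to $K = YD^{-1}$ under the assumption that $D$ is diagonal with positive diagonal entries, as in Corollary 1.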

5 Conclusions

We have provided necessary and sufficient conditions for the asymptotic stability of linear positive systems subject to time-varying delays. We have introduced an original method for directly solving the proposed stability and stabilization problems. All the proposed conditions are necessary and sufficient, and they can be checked by solving LP or LMI problems.

Acknowledgements. This work is funded by Ramon y Cajal grant, Spain.

References

1. Ait Rami, M., Tadeo, F.: Controller Synthesis for Linear Systems to Impose Positiveness in Closed-Loop States. In: Proceedings of the IFAC World Congress, Prague (2005)
2. Ait Rami, M., Tadeo, F.: Positive observation for positive discrete linear systems. In: IEEE CDC (2006)
3. Ait Rami, M., Tadeo, F.: Linear programming approach to impose positiveness in closed-loop and estimated states. In: Proceedings Sixteenth International Symposium on Mathematical Theory of Networks and Systems, Kyoto, Japan (2006)
4. Ait Rami, M., Tadeo, F.: Controller Synthesis for Positive Linear Systems with Bounded controls. IEEE Trans. on Circuits and Sys. II 54, 151–155 (2007)
5. Ait Rami, M., Cheng, C.H., de Prada, C.: Tight robust interval observers: an LP approach. In: IEEE CDC (2008)
6. Ait Rami, M., Helmke, U., Tadeo, F.: Positive observation problem for time-delays linear positive systems. In: Proceedings of the 15th IEEE Med. conf., Athens (2007)
7. Ait Rami, M., Helmke, U., Tadeo, F.: Positive observation problem for linear time-lag positive systems. In: The 3rd IFAC Symposium on System, Structure & Control, Foz do Iguaçu, Brazil (2007)
8. Ait Rami, M., Hmamed, A., Alfidi, M.: L1 stability and stabilization of positive 2D continuous systems. Syst. and Control Letters (2008) (accepted)
9. Ait Rami, M., Tadeo, F., Benzaouia, A.: Control of constrained positive discrete systems. In: Proceedings of the American Control Conf., New York (2007)
10. Berman, A., Plemmons, R.J.: Nonnegative matrices in the mathematical sciences. SIAM Classics Appl. Maths (1994)
11. Berman, A., Neumann, M., Stern, R.J.: Nonnegative Matrices in Dynamic Systems. Wiley, New York (1989)
12. Farina, L., Rinaldi, S.: Positive Linear Systems: Theory and Applications. Wiley, New York (2000)
13. Haddad, W.M., Chellaboina, V.: Stability theory for nonnegative and compartmental dynamical systems with time delay. Syst. and Control Letters 51, 355–361 (2004)
14. Hale, J.K.: Theory of Functional Differential Equations. Springer, New York (1977)
15. Hmamed, A., Ait Rami, M., Alfidi, M.: Controller synthesis for positive 2D systems described by the Roesser model. In: IEEE CDC (2008)
16. Hmamed, A., Benzaouia, A., Ait Rami, M., Tadeo, F.: Positive stabilization of discrete-time systems with unknown delay and bounded controls. In: European Control Conference, Kos, Greece (2007)
17. Hmamed, A., Benzaouia, A., Ait Rami, M., Tadeo, F.: Memoryless control to drive states of delayed continuous-time systems within the nonnegative orthant. In: Proceedings of the 17th World Congress, Seoul, Korea (2008)
18. Horn, R., Johnson, C.: Topics in Matrix Analysis. Cambridge Univ. Press, Cambridge (1991)
19. Kaczorek, T.: Positive 1D and 2D Systems. Springer, UK (2001)
20. Lewis, R.M., Anderson, B.D.O.: Insensitivity of a class of Nonlinear Compartmental Systems to the Introduction of Arbitrary Time Delays. IEEE Trans. on Circuit and Sys. 27, 604–612 (1980)
21. Lewis, R.M., Anderson, B.D.O.: Necessary and Sufficient Condition for Delay-Independent Stability of Linear Autonomous Systems. IEEE Trans. Aut. Contr. 25, 735–739 (1980)
22. Luenberger, D.G.: Introduction to Dynamic Systems. Wiley, New York (1979)
23. Ohta, Y.: Stability Criteria for Off-Diagonally Monotone Nonlinear Dynamical Systems. IEEE Trans. on Circuit and Sys. 27, 956–962 (1980)

Linear Programming Approach for 2-D Stabilization and Positivity

Mohammed Alfidi, Abdelaziz Hmamed and Fernando Tadeo

Abstract. The problem of synthesizing stabilizing state-feedback controllers is solved when the closed-loop system is required to remain positive, for the class of 2-D linear systems described by the Fornasini-Marchesini second model. First, a constructive necessary and sufficient condition expressed as a Linear Programming problem is provided for stabilization of these systems when the states must be nonnegative (assuming that the boundary conditions are nonnegative). It is shown that additional constraints (such as positive controls) are simple to include. Moreover, this result is also extended to include uncertainty in the model, making it possible to synthesize robust state-feedback controllers by solving Linear Programming problems. Some numerical examples are included to illustrate the proposed approach for different design problems.

Keywords: positive 2-D systems, stabilization, Fornasini-Marchesini second model, linear programming.

Mohammed Alfidi, Abdelaziz Hmamed
LESSI, Department of Physics, Faculty of Sciences Dhar El Mehraz, B.P. 1796, 30000 Fes-Atlas, Morocco, e-mail: [email protected], [email protected]
Fernando Tadeo
Universidad de Valladolid, Dept. Ingenieria de Sistemas y Automatica, 47005 Valladolid, Spain, e-mail: [email protected]

R. Bru and S. Romero-Vivó (Eds.): Positive Systems, LNCIS 389, pp. 217–232. springerlink.com © Springer-Verlag Berlin Heidelberg 2009

1 Introduction

During the last two decades, two-dimensional (2-D) systems theory has received considerable attention from many researchers. These 2-D linear models were introduced in the seventies [7, 10] and have found many applications in areas such as digital data filtering, image processing [18], partial differential equations [17], etc. In connection with the Roesser [18] and Fornasini-Marchesini [7] models, some important problems, such as realization, controllability or minimum energy control, have been extensively investigated (see for example [12]). On the other hand, the stabilization problem has not been fully investigated and is still not completely solved. In particular, it has not been completely solved for positive 2-D systems, that is, 2-D systems where the states always remain nonnegative. Recently, a growing interest in both the theory and the application of positive 2-D systems has been seen [2, 4, 13, 19, 21]. For a complete monograph on positive 2-D systems, see [13].

In the present paper, we first analyze the stability of linear positive 2-D models [7], following ideas borrowed from 1-D systems [1], already used by some of the authors for Roesser models [11], deriving a necessary and sufficient condition for 2-D stability based on simple linear inequalities. From this result, a simple numerical method is proposed for a complete treatment of the stabilization problem of these positive 2-D systems, when they can be described by a Fornasini-Marchesini second model (a parallel result for Roesser 2-D systems is presented in [2]). Moreover, based on this approach, necessary and sufficient conditions, expressed in terms of a Linear Programming problem, are proposed for the stabilization problem.

We must point out that although it is known that the stability of linear digital 2-D systems can be reduced to checking the stability of a 2-D characteristic polynomial [3, 22], it seems difficult to apply this result in practice for the control synthesis problem. Thus, in the literature, various types of easily checkable but only sufficient conditions for asymptotic stability and stabilization problems for 2-D linear systems have been proposed [9, 15, 16, 23]. In contrast, the results in this paper are given as necessary and sufficient conditions.

This paper is organized as follows: Section 2 presents the problem formulation and some preliminary results. Section 3 presents first the 2-D stability problem, which is then used to derive the main controller synthesis results, which are also extended to other related problems (bounded control and uncertainty). Finally, after presenting some illustrative examples in Section 4, some conclusions are given.

Notations

$\mathbb{R}^n_+$ denotes the non-negative orthant of the n-dimensional real space $\mathbb{R}^n$. $M^T$ denotes the transpose of the real matrix $M$. For a real matrix $M$, $M > 0$ denotes a positive matrix, with all its components positive ($m_{ij} > 0$), and $M \ge 0$ denotes a nonnegative matrix, with none of its components negative ($m_{ij} \ge 0$). $I$ denotes the identity matrix of appropriate order and $\mathbb{N}$ denotes the set of integer numbers. $\rho(M)$ denotes the spectral radius of $M \in \mathbb{R}^{n\times n}$.

2 Problem Formulation and Preliminaries

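Consider a linear homogeneous 2-D system described by the following Fornasini-Marchesini second model [7]:

$$x(i+1, j+1) = A\begin{bmatrix} x(i+1, j)\\ x(i, j+1)\end{bmatrix} + B\begin{bmatrix} u(i+1, j)\\ u(i, j+1)\end{bmatrix} \tag{1}$$

where $A = [A_1\ A_2] \in \mathbb{R}^{n\times 2n}$ and $B = [B_1\ B_2] \in \mathbb{R}^{n\times 2m}$ are given real matrices, $x(i,j) \in \mathbb{R}^n$ is the state vector and $u(i,j) \in \mathbb{R}^m$ is the input vector. The boundary conditions for (1) are given by
$$\begin{cases} x(i,0) = x_{i0} & \forall\, i \in \mathbb{N},\\ x(0,j) = x_{j0} & \forall\, j \in \mathbb{N}. \end{cases} \tag{2}$$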
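To make the recursion (1) with boundary conditions (2) concrete, the following Python sketch simulates the zero-input model on a finite grid. The matrices, the grid size and the boundary values are hypothetical choices of ours (not from the paper); the index convention follows (1), so $A_1$ multiplies $x(i+1,j)$ and $A_2$ multiplies $x(i,j+1)$.

```python
# Hypothetical nonnegative matrices (not from the paper); zero input u = 0.
A1 = [[0.2, 0.1], [0.0, 0.3]]
A2 = [[0.1, 0.2], [0.3, 0.1]]
n = 2    # state dimension
N = 30   # the grid covers 0 <= i, j <= N


def matvec(M, v):
    """Plain-Python matrix-vector product."""
    return [sum(M[r][c] * v[c] for c in range(len(v))) for r in range(len(M))]


# x[i][j] holds the state vector x(i, j).
x = [[[0.0] * n for _ in range(N + 1)] for _ in range(N + 1)]

# Boundary conditions (2): nonnegative, and nonzero only for the first few indices.
for k in range(N + 1):
    edge = [1.0, 1.0] if k <= 5 else [0.0, 0.0]
    x[k][0] = list(edge)  # x(i, 0)
    x[0][k] = list(edge)  # x(0, j)

# Recursion (1) with u = 0: x(i+1, j+1) = A1 x(i+1, j) + A2 x(i, j+1).
for i in range(N):
    for j in range(N):
        p = matvec(A1, x[i + 1][j])
        q = matvec(A2, x[i][j + 1])
        x[i + 1][j + 1] = [p[r] + q[r] for r in range(n)]

all_nonnegative = all(v >= 0 for row in x for cell in row for v in cell)
far_corner = x[N][N]
print(all_nonnegative, far_corner)
```

With nonnegative $A_1$, $A_2$ and nonnegative boundary data every computed state stays nonnegative, and for this particular choice the state far from the boundary is numerically negligible.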

In the sequel, the following definition will be used.

Definition 1. System (1)-(2) with zero input ($u = 0$) is said to be a positive 2-D Fornasini-Marchesini system if, for any given nonnegative boundary conditions $x_{j0} \ge 0$ and $x_{i0} \ge 0$, the resulting states are always nonnegative, that is, $x(i,j) \ge 0$ for all $i, j \in \mathbb{N}$.

The following result shows how one can check the positiveness of System (1) (see [13]).

Proposition 1. System (1)-(2) with zero input ($u = 0$) is positive if and only if $A \in \mathbb{R}^{n\times 2n}_+$.

The asymptotic stability of general Fornasini-Marchesini second models [7] has been extensively studied in the literature. For example, a well-known necessary and sufficient frequency condition for asymptotic stability is stated in the following result.

Lemma 1. Let $A_1 \in \mathbb{R}^{n\times n}$ and $A_2 \in \mathbb{R}^{n\times n}$ be given constant real matrices. Then, the 2-D system described by the Fornasini-Marchesini second model (1) with zero input is asymptotically stable if and only if any of the following conditions holds:

$$\begin{array}{ll} \text{(i)} & \rho(A_1 + zA_2) < 1, \quad |z| = 1,\\ \text{(ii)} & \det(I - z_1A_1 - z_2A_2) \ne 0, \quad \forall (z_1,z_2) \in \{(z_1,z_2) : |z_1| \le 1, |z_2| \le 1\}. \end{array} \tag{3}$$

In the sequel, our purpose is to investigate the existence of state-feedback control laws of the form
$$u(i,j) = Kx(i,j), \tag{4}$$
such that the resulting closed-loop system

$$x(i+1, j+1) = \bar{A}\begin{bmatrix} x(i+1, j)\\ x(i, j+1)\end{bmatrix} \tag{5}$$

is positive and asymptotically stable, where $K = [k_{ij}] \in \mathbb{R}^{m\times n}$ is the controller gain to be determined and

$$\bar{A} = A + B\bar{K}, \qquad \bar{K} = \begin{bmatrix} K & 0\\ 0 & K\end{bmatrix}. \tag{6}$$
Of course, if one directly uses the results of Lemma 1 and Proposition 1, one obtains the following necessary and sufficient condition for the closed-loop system to be positive and asymptotically stable:
$$\begin{cases} \bullet\ A_1 + B_1K \text{ is a nonnegative matrix},\\ \bullet\ A_2 + B_2K \text{ is a nonnegative matrix, and}\\ \bullet\ \det\big(I_n - z_1(A_1 + B_1K) - z_2(A_2 + B_2K)\big) \ne 0, \quad \forall (z_1,z_2) \in \{(z_1,z_2) : |z_1| \le 1, |z_2| \le 1\}. \end{cases} \tag{7}$$

However, this formulation leads to a problem that is hard to solve, since a linear constraint (the positivity constraint) is mixed with a highly nonlinear, infinite-dimensional constraint (the asymptotic stabilizability constraint). A significant contribution of this paper is reflected by the simplicity and completeness of the solution provided. Effectively, all the provided main results involve easily checkable necessary and sufficient conditions. In fact, it will be shown how we can completely solve problem (7) in terms of a Linear Programming problem, which avoids unnecessary computational burdens.

Previous results and equivalent conditions for a nonnegative matrix $M$ to be Schur (or equivalently $\rho(M) < 1$) can be found in the literature [6]. In the following, some known results are presented and will be used in order to derive our main results.

Proposition 2. [1] Let $M$ be a nonnegative matrix. Then, the following conditions are equivalent:
(i) The 1-D system $x(k+1) = Mx(k)$ is asymptotically stable (or equivalently, $\rho(M) < 1$).
(ii) There exists a positive vector $\lambda > 0$ such that $(M - I)\lambda < 0$.

3 Stabilization with Positivity of 2-D Fornasini-Marchesini Systems

3.1 Stability Analysis for Positive 2-D Systems

This section provides preliminary stability results for the free linear 2-D system described by the Fornasini-Marchesini second model (1) when $A \ge 0$:

$$x(i+1, j+1) = A\begin{bmatrix} x(i+1, j)\\ x(i, j+1)\end{bmatrix} \tag{8}$$
In fact, it will be shown that the asymptotic stability of System (8) (under the positivity constraint) is equivalent to the stability and positivity of the following 1-D linear discrete-time system:

$$\tilde{x}(k+1) = (A_1 + A_2)\tilde{x}(k), \tag{9}$$
where system (9) is called positive if for any given nonnegative initial condition the resulting trajectories are also nonnegative. In fact, it has already been proved that if $A_1 \ge 0$ and $A_2 \ge 0$, the system (9) is positive [1]. Now, some results are needed in order to establish our main stability result.

Theorem 1. [19, 20] Assume that the system (8) is positive (or equivalently that $A_1$ and $A_2$ are nonnegative). Then, the following statements are equivalent:
i) $\det(I_n - z_1A_1 - z_2A_2) \ne 0$, $\forall (z_1,z_2) \in \{(z_1,z_2) : |z_1| \le 1, |z_2| \le 1\}$.
ii) $\rho(A_1 + A_2) < 1$.
iii) The positive 2-D system (8) is asymptotically stable.
iv) The positive 1-D system (9) is asymptotically stable.

Now, we are in a position to state the result that will be used in the rest of the paper.

Corollary 1. Assume that the matrices $A_1 \ge 0$ and $A_2 \ge 0$. Then, the following statements are equivalent:
i) The 2-D system described by the Fornasini-Marchesini second model (8) is positive and asymptotically stable.
ii) The 1-D system described by (9) is positive and asymptotically stable.
iii) The matrices $A_1$ and $A_2$ are nonnegative and there exists a vector $d \in \mathbb{R}^n$ such that the following Linear Program condition is fulfilled:

$$(A_1 + A_2 - I)d < 0, \quad d > 0. \tag{10}$$

Proof. ii) ⇔ iii) results from Proposition 2. i) ⇒ iii): Using Proposition 1 we have that $A_1$ and $A_2$ are nonnegative, so from Lemma 1 the asymptotic stability of the 2-D system (8) is equivalent to $\det(I_n - z_1A_1 - z_2A_2) \ne 0$, $\forall (z_1,z_2) \in \{(z_1,z_2) : |z_1| \le 1, |z_2| \le 1\}$, which by Theorem 1 is equivalent to $\rho(A_1 + A_2) < 1$, and using Proposition 2 implies iii). Finally, iii) ⇒ i) follows from Proposition 1 combined with Proposition 2, Theorem 1 and Lemma 1.
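Condition (10) can be checked constructively when $\rho(A_1 + A_2) < 1$. The following Python sketch, with hypothetical nonnegative matrices of our own choosing, builds the vector $d = (I - A_1 - A_2)^{-1}\mathbf{1} = \sum_{k\ge 0}(A_1 + A_2)^k\mathbf{1}$ through the fixed-point iteration $d \leftarrow \mathbf{1} + (A_1 + A_2)d$ and then verifies $d > 0$ and $(A_1 + A_2 - I)d < 0$. This particular certificate is one convenient choice, not the only one.

```python
# Hypothetical nonnegative matrices (not from the paper).
A1 = [[0.2, 0.1], [0.0, 0.3]]
A2 = [[0.1, 0.2], [0.3, 0.1]]
n = 2
M = [[A1[r][c] + A2[r][c] for c in range(n)] for r in range(n)]

# Fixed-point iteration d <- 1 + M d. When rho(M) < 1 it converges to
# d = (I - M)^{-1} 1, the sum of the Neumann series of M applied to the ones vector.
d = [1.0] * n
for _ in range(500):
    d = [1.0 + sum(M[r][c] * d[c] for c in range(n)) for r in range(n)]

# Residual (M - I) d, which should be close to -[1, 1]^T and hence negative.
residual = [sum(M[r][c] * d[c] for c in range(n)) - d[r] for r in range(n)]

condition_10_holds = all(v > 0 for v in d) and all(v < 0 for v in residual)
print(d, residual, condition_10_holds)
```

If $\rho(A_1 + A_2) \ge 1$ the iteration diverges instead, which is consistent with (10) having no solution in that case.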

3.2 Proposal for Synthesis of Stabilizing Controllers

This section studies the stabilization problem of linear 2-D systems described by the Fornasini-Marchesini second model under state feedback of the form $u(i,j) = Kx(i,j)$. This control law is designed to ensure the positivity and the asymptotic stability of the resulting closed-loop system. As will be shown, the proposed approach does not impose any restriction on the dynamics of the governed system. For instance, the free Fornasini-Marchesini second model may be nonpositive. In this case, our synthesis design can be interpreted as enforcing the system to be positive. Now, consider the closed-loop Fornasini-Marchesini second model:

$$x(i+1, j+1) = (A + B\bar{K})\begin{bmatrix} x(i+1, j)\\ x(i, j+1)\end{bmatrix} \tag{11}$$

where $A = [A_1\ A_2] \in \mathbb{R}^{n\times 2n}$ and $B = [B_1\ B_2] \in \mathbb{R}^{n\times 2m}$ are supposed to be any real matrices (not necessarily nonnegative). In what follows we provide the main result of this section.

Theorem 2. The closed-loop system (11) is positive and asymptotically stable for any boundary conditions if and only if there exist $n+1$ vectors $d = [d_1\ \dots\ d_n]^T \in \mathbb{R}^n$ and $y_1, \dots, y_n \in \mathbb{R}^m$ such that
$$\begin{cases} (A_1 + A_2 - I_n)d + (B_1 + B_2)\sum_{i=1}^n y_i < 0,\\[2pt] d > 0,\\[2pt] a_{1ij}d_j + b_{1i}y_j \ge 0, \quad 1 \le i, j \le n,\\[2pt] a_{2ij}d_j + b_{2i}y_j \ge 0, \quad 1 \le i, j \le n, \end{cases} \tag{12}$$

with $A_1 = [a_{1ij}]$, $A_2 = [a_{2ij}]$, $B_1^T = [b_{11}^T\ \dots\ b_{1n}^T]$ and $B_2^T = [b_{21}^T\ \dots\ b_{2n}^T]$. Moreover, the gain matrix $K$ is given by:

$$K = [d_1^{-1}y_1\ \dots\ d_n^{-1}y_n]. \tag{13}$$

Proof. Assume that condition (12) is satisfied and define the matrix $K = [k_1, \dots, k_n]$ with columns constructed as $k_i = d_i^{-1}y_i$, for $i = 1, \dots, n$, together with the block matrix $\bar{K}$ built from $K$ as in (6). By this construction, it is easy to see that $A_1 + B_1K$ and $A_2 + B_2K$ are nonnegative matrices. Effectively, from the last inequalities in condition (12) we have for $i, j = 1, \dots, n$:
$$0 \le (a_{1ij}d_j + b_{1i}y_j)d_j^{-1} = a_{1ij} + b_{1i}k_j = (A_1 + B_1K)_{ij},$$

$$0 \le (a_{2ij}d_j + b_{2i}y_j)d_j^{-1} = a_{2ij} + b_{2i}k_j = (A_2 + B_2K)_{ij}.$$
Next, we show the asymptotic stability under the feedback control $u = Kx$. Using the previous gain, we obtain by calculation $B_1Kd + B_2Kd = (B_1 + B_2)\big(\sum_{i=1}^n y_i\big)$, which is used in condition (12) and leads to $(A_1 + B_1K + A_2 + B_2K - I_n)d < 0$. Now, since $d > 0$ and $A_1 + B_1K$ and $A_2 + B_2K$ are nonnegative, by using Corollary 1 we conclude that the 2-D system described by the closed-loop Fornasini-Marchesini second model (11) is asymptotically stable. The rest of the proof follows the same line of argument, so it is omitted.

Remark 1. We emphasize that the LP formulation proposed in Theorem 2 does not impose any restriction on the dynamics of the governed system. In fact, the matrix $A$ may have negative components, or equivalently, the free system may be nonpositive. In this case the proposed synthesis methodology can be viewed as enforcing a nonpositive system to be positive.

It must be pointed out that the requirement of positivity for the controls can also be handled by an LP approach similar to (12), which is now provided and can be proved following the same ideas:

Theorem 3. The following statements are equivalent:
i) There exists a positive state-feedback law $u(i,j) = Kx(i,j) \ge 0$ such that the closed-loop system (11) is positive and asymptotically stable for any initial boundary conditions.
ii) There exists a matrix $K \in \mathbb{R}^{m\times n}$ such that $K \ge 0$, $A_1 + B_1K$ and $A_2 + B_2K$ are nonnegative matrices and $A_1 + B_1K + A_2 + B_2K$ is a Schur matrix (that is, $\rho(A_1 + B_1K + A_2 + B_2K) < 1$).
iii) The following LP problem in the variables $d = [d_1\ \dots\ d_n]^T \in \mathbb{R}^n$ and $y_1, \dots, y_n \in \mathbb{R}^m$ is feasible:
$$\begin{cases} (A_1 + A_2 - I_n)d + (B_1 + B_2)\sum_{i=1}^n y_i < 0,\\[2pt] d > 0,\\[2pt] y_i \ge 0, \quad 1 \le i \le n,\\[2pt] a_{1ij}d_j + b_{1i}y_j \ge 0, \quad 1 \le i, j \le n,\\[2pt] a_{2ij}d_j + b_{2i}y_j \ge 0, \quad 1 \le i, j \le n, \end{cases} \tag{14}$$
with $A_1 = [a_{1ij}]$, $A_2 = [a_{2ij}]$, $B_1^T = [b_{11}^T\ \dots\ b_{1n}^T]$ and $B_2^T = [b_{21}^T\ \dots\ b_{2n}^T]$. Moreover, the gain matrix in conditions i) and ii) can be chosen as $K = [d_1^{-1}y_1\ \dots\ d_n^{-1}y_n]$, where $d$ and $y_1, \dots, y_n$ are given by any feasible solution to the above LP problem.

Remark 2. Note that if a negative state-feedback control law is to be considered, it suffices to impose $y_i \le 0$ instead of $y_i \ge 0$ in the previous LP formulation.

3.3 Synthesis with Uncertain Plant

An important issue when designing control systems for practical applications is ensuring robust stability, that is, the closed-loop system should remain stable under uncertainty in the plant model. In fact, the system matrices that describe a 2-D system are frequently uncertain [5, 8].

Thus, this section considers the robust stabilization of Fornasini-Marchesini second models when the dynamics are not exactly known. More precisely, we assume that the uncertainties can be captured using a polytopic domain representation [8]. That is, we consider that our 2-D system can be represented by the following uncertain model:

$$x(i+1, j+1) = A(\alpha)\begin{bmatrix} x(i+1, j)\\ x(i, j+1)\end{bmatrix} + B(\alpha)\begin{bmatrix} u(i+1, j)\\ u(i, j+1)\end{bmatrix}, \tag{15}$$

where $A(\alpha) = [A_1(\alpha)\ A_2(\alpha)] \in \mathbb{R}^{n\times 2n}$ and $B(\alpha) = [B_1(\alpha)\ B_2(\alpha)] \in \mathbb{R}^{n\times 2m}$ are supposed to be not exactly known, but they belong to the following convex set:

$$[A_1(\alpha)\ A_2(\alpha)\ B_1(\alpha)\ B_2(\alpha)] \in \mathcal{D}, \tag{16}$$

$$\mathcal{D} := \Big\{ \sum_{i=1}^{l} \alpha_i\,[A_1^i\ A_2^i\ B_1^i\ B_2^i], \quad \sum_{i=1}^{l}\alpha_i = 1, \quad \alpha_i \ge 0 \Big\}, \tag{17}$$
where $[A_1^1\ A_2^1\ B_1^1\ B_2^1], \dots, [A_1^l\ A_2^l\ B_1^l\ B_2^l]$ are known matrices.

Thus, the robust synthesis design problem consists in finding a single constant gain matrix $K$ for which the following closed-loop system is positive and asymptotically stable for every $[A_1(\alpha)\ A_2(\alpha)\ B_1(\alpha)\ B_2(\alpha)] \in \mathcal{D}$:

$$x(i+1, j+1) = (A(\alpha) + B(\alpha)\bar{K})\begin{bmatrix} x(i+1, j)\\ x(i, j+1)\end{bmatrix}. \tag{18}$$

Using a simple convexity property, the LP formulation proposed in Theorem 2 can be easily extended to these systems with polytopic uncertainties, by just repeating the LP conditions for all the vertices of the polytope, using ideas similar to those proposed by some of the authors in [11], as stated in the following result.

Theorem 4. There exists a robust state-feedback law $u(i,j) = Kx(i,j)$ such that the resulting closed-loop system (18) is positive and asymptotically stable for any initial boundary conditions and for every $[A_1(\alpha)\ A_2(\alpha)\ B_1(\alpha)\ B_2(\alpha)] \in \mathcal{D}$, if the following LP problem in the variables $d = [d_1\ \dots\ d_n]^T \in \mathbb{R}^n$ and $y_1, \dots, y_n \in \mathbb{R}^m$ is feasible:

$$\begin{cases} (A_1^k + A_2^k - I_n)d + (B_1^k + B_2^k)\big(\sum_{i=1}^n y_i\big) < 0, \quad \text{for } k = 1, \dots, l,\\[2pt] d > 0,\\[2pt] a_{1ij}^k d_j + b_{1i}^k y_j \ge 0, \quad 1 \le i, j \le n, \quad k = 1, \dots, l,\\[2pt] a_{2ij}^k d_j + b_{2i}^k y_j \ge 0, \quad 1 \le i, j \le n, \quad k = 1, \dots, l, \end{cases} \tag{19}$$

with $A_1^k = [a_{1ij}^k]$, $A_2^k = [a_{2ij}^k]$, $B_1^{kT} = [b_{11}^{kT}\ \dots\ b_{1n}^{kT}]$ and $B_2^{kT} = [b_{21}^{kT}\ \dots\ b_{2n}^{kT}]$, where $n = n_1 + n_2$ and $k = 1, \dots, l$. Moreover, the gain matrix can be computed as $K = [d_1^{-1}y_1\ \dots\ d_n^{-1}y_n]$, where $d$ and $y_1, \dots, y_n$ are given by any feasible solution to the LP problem (19).

Proof. If a solution $\{d_1, \dots, d_n, y_1, \dots, y_n\}$ fulfills (19), then it also fulfills the following LP for any $[A_1^*\ A_2^*\ B_1^*\ B_2^*] \in \mathcal{D}$:
$$\begin{cases} (A_1^* + A_2^* - I_n)d + (B_1^* + B_2^*)\sum_{i=1}^n y_i < 0,\\[2pt] d > 0,\\[2pt] a_{1ij}^* d_j + b_{1i}^* y_j \ge 0, \quad 1 \le i, j \le n,\\[2pt] a_{2ij}^* d_j + b_{2i}^* y_j \ge 0, \quad 1 \le i, j \le n. \end{cases} \tag{20}$$
Thus, by convexity it is only necessary to check the LP (12) at the vertices of the polytope, which gives the result.
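The convexity step invoked in the proof can be spelled out. The following lines are a sketch of that argument in the notation of (17) and (19), writing $A_r(\alpha) = \sum_{k=1}^l \alpha_k A_r^k$ and $B_r(\alpha) = \sum_{k=1}^l \alpha_k B_r^k$ for $r = 1, 2$, with $\alpha_k \ge 0$ and $\sum_{k=1}^l \alpha_k = 1$:

$$\big(A_1(\alpha) + A_2(\alpha) - I_n\big)d + \big(B_1(\alpha) + B_2(\alpha)\big)\sum_{i=1}^n y_i = \sum_{k=1}^{l}\alpha_k\Big[(A_1^k + A_2^k - I_n)d + (B_1^k + B_2^k)\sum_{i=1}^n y_i\Big] < 0,$$

$$a_{rij}(\alpha)\,d_j + b_{ri}(\alpha)\,y_j = \sum_{k=1}^{l}\alpha_k\big(a_{rij}^k d_j + b_{ri}^k y_j\big) \ge 0, \qquad r = 1, 2.$$

The first identity uses $\sum_k \alpha_k = 1$ to distribute the term $-I_n d$ over the vertices. A convex combination of strictly negative vectors is strictly negative, and a convex combination of nonnegative numbers is nonnegative, so a pair $(d, y)$ feasible at every vertex is feasible for every plant in the polytope, with the same gain $K$.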

4 Numerical Examples

4.1 Stabilization Example

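As a first illustration of the proposed design methodology for stabilization, we deal with the system given by (1) and the following matrices:
$$A_1 = \begin{bmatrix} -1.5 & 0.1 & 0\\ 0.2 & 0.5 & 0.3\\ 0 & 0 & 0\end{bmatrix}, \quad A_2 = \begin{bmatrix} 0 & 0 & 0\\ 0 & 0 & 0\\ 1 & 1 & 0.2\end{bmatrix}, \quad B_1 = \begin{bmatrix} 1\\ 1\\ 0\end{bmatrix} \quad \text{and} \quad B_2 = \begin{bmatrix} 0\\ 0\\ 0\end{bmatrix}.$$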

Of course, as the matrix $A_1$ is not nonnegative, the free system (i.e., when $u = 0$) is not positive. This fact is also illustrated by the evolution of the free system depicted in Figure 1 (starting from nonnegative boundary conditions). According to the result given in Lemma 1, the system in open-loop is also unstable, i.e., the spectral radius $\rho(A_1 + zA_2) > 1$, where $z = e^{j\omega}$, for all $\omega \in [0, 2\pi]$ (see Figure 2).

Then, the objective is to design a state-feedback controller that stabilizes the system and enforces it to be positive. For this purpose, it suffices to use the result of Theorem 2 and find a solution fulfilling the inequalities (12).

Since we have shown in Theorem 2 that the gain of a stabilizing control is given by $K = [y_1d_1^{-1}\ \ y_2d_2^{-1}\ \ y_3d_3^{-1}]$, we have used the following feasible solution to the corresponding inequalities (12):
$$\begin{bmatrix} d_1\\ d_2\\ d_3\\ y_1\\ y_2\\ y_3\end{bmatrix} = \begin{bmatrix} 8.5299\\ 138.5801\\ 202.3965\\ 14.5909\\ -11.4508\\ 0.8102\end{bmatrix},$$
to obtain the following gain of a stabilizing controller: $K = [1.7106\ \ {-0.0826}\ \ 0.0040]$.
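The reported solution can be checked directly against the inequalities (12). The following Python sketch uses only the standard library and the numbers quoted in this example; the variable names are ours. It evaluates the stability rows and the entrywise positivity constraints, then recomputes the gain from (13).

```python
# Data of the stabilization example, as read from the text (single input, m = 1).
A1 = [[-1.5, 0.1, 0.0], [0.2, 0.5, 0.3], [0.0, 0.0, 0.0]]
A2 = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0], [1.0, 1.0, 0.2]]
B1 = [1.0, 1.0, 0.0]
B2 = [0.0, 0.0, 0.0]
d = [8.5299, 138.5801, 202.3965]
y = [14.5909, -11.4508, 0.8102]
n = 3

# First line of (12): (A1 + A2 - I) d + (B1 + B2) * sum(y) < 0, row by row.
y_sum = sum(y)
stability_rows = [
    sum((A1[i][j] + A2[i][j] - (1.0 if i == j else 0.0)) * d[j] for j in range(n))
    + (B1[i] + B2[i]) * y_sum
    for i in range(n)
]

# Last two lines of (12): a_{1ij} d_j + b_{1i} y_j >= 0 and a_{2ij} d_j + b_{2i} y_j >= 0.
positivity_1 = [[A1[i][j] * d[j] + B1[i] * y[j] for j in range(n)] for i in range(n)]
positivity_2 = [[A2[i][j] * d[j] + B2[i] * y[j] for j in range(n)] for i in range(n)]

feasible = (all(v < 0 for v in stability_rows)
            and all(v > 0 for v in d)
            and all(v >= 0 for row in positivity_1 for v in row)
            and all(v >= 0 for row in positivity_2 for v in row))

# Gain (13): K = [y_1 / d_1, ..., y_n / d_n].
K = [y[j] / d[j] for j in range(n)]
print(feasible, [round(k, 4) for k in K])
```

Note that $y_2$ is negative here, which is allowed by Theorem 2; only the nonnegative-control variant adds sign constraints on the $y_i$.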

Fig. 1 Open-loop evolution of x1(i, j)

[Plot: spectral radius (vertical axis, approximately 1.5 to 1.525) versus w (horizontal axis, 0 to 7)]

Fig. 2 Spectral radius of the open-loop system

The corresponding system matrices in closed-loop are given by:
$$A_1 + B_1K = \begin{bmatrix} 0.2106 & 0.0174 & 0.0040\\ 1.9106 & 0.4174 & 0.3040\\ 0 & 0 & 0\end{bmatrix} \quad \text{and} \quad A_2 + B_2K = \begin{bmatrix} 0 & 0 & 0\\ 0 & 0 & 0\\ 1 & 1 & 0.2\end{bmatrix}.$$

Hence, it suffices to look at the entries of the matrices $A_1 + B_1K$ and $A_2 + B_2K$ to conclude that the closed-loop system is positive (according to Proposition 1). In addition, according to Corollary 1, the closed-loop system is asymptotically stable (it can be checked that the matrix $A_1 + B_1K + A_2 + B_2K$ has all its eigenvalues inside the unit circle, namely $\lambda_1 = 0.92$, $\lambda_2 = 0.17$ and $\lambda_3 = -0.26$).
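The two closed-loop claims can be reproduced with a few lines of standard-library Python. The sketch below takes the closed-loop matrices as printed in this example (variable names are ours), checks that every entry is nonnegative, and estimates the spectral radius of their sum by power iteration. Power iteration is our choice here; it is adequate because the summed matrix has strictly positive entries, so its dominant eigenvalue is real, positive and simple.

```python
# Closed-loop matrices as printed in the example.
A1_cl = [[0.2106, 0.0174, 0.0040], [1.9106, 0.4174, 0.3040], [0.0, 0.0, 0.0]]
A2_cl = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0], [1.0, 1.0, 0.2]]
n = 3

# Positivity test of Proposition 1: all entries of both matrices are nonnegative.
closed_loop_nonnegative = all(v >= 0 for mat in (A1_cl, A2_cl) for row in mat for v in row)

# Stability test of Corollary 1: spectral radius of the sum is below one.
M = [[A1_cl[r][c] + A2_cl[r][c] for c in range(n)] for r in range(n)]

# Power iteration with max-norm scaling; rho converges to the dominant eigenvalue.
v = [1.0] * n
rho = 0.0
for _ in range(200):
    w = [sum(M[r][c] * v[c] for c in range(n)) for r in range(n)]
    rho = max(w)
    v = [x / rho for x in w]

print(closed_loop_nonnegative, round(rho, 2))
```

The estimate agrees with the dominant eigenvalue reported above to two decimal places.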

Figure 3 shows the evolution of the closed-loop state x2(i, j), using the proposed feedback law, from nonnegative boundary conditions: we can see that, effectively, all the states are nonnegative and converge to zero.

Fig. 3 Closed-loop evolution of x2(i, j) under the designed state feedback

4.2 Positive Feedback Example

Let us consider the non-positive 2-D system described by (5) with the following system matrices:

$$A_1 = \begin{bmatrix} 0.2 & 0.7\\ 0 & 0\end{bmatrix}, \quad A_2 = \begin{bmatrix} 0 & 0\\ 0.9 & -0.8\end{bmatrix}, \quad B_1 = \begin{bmatrix} 0\\ 0\end{bmatrix} \quad \text{and} \quad B_2 = \begin{bmatrix} 0\\ 1\end{bmatrix}.$$
Note that the system is open-loop unstable, since the spectral radius $\rho(A_1 + e^{j\omega}A_2)$ has values greater than 1.0 for some $\omega \in [0, 2\pi]$ (see Figures 4 and 5). Here, our task is to use a positive state-feedback control in order to stabilize the system and enforce the state to be positive. Based on the result provided by Theorem 3, the following inequalities must be fulfilled:
$$\begin{bmatrix} -0.8 & 0.7 & 0 & 0\\ 0.9 & -1.8 & 1 & 1\\ -1 & 0 & 0 & 0\\ 0 & -1 & 0 & 0\\ 0 & 0 & -1 & 0\\ 0 & 0 & 0 & -1\end{bmatrix}\begin{bmatrix} d_1\\ d_2\\ y_1\\ y_2\end{bmatrix} < 0,$$

Fig. 4 Uncertain plant example: Open-loop evolution of x1(i, j)

[Plot: spectral radius (vertical axis, approximately 0.8 to 1.3) versus w (horizontal axis, 0 to 7)]

Fig. 5 Uncertain plant example: Open-loop spectral radius

\[
\begin{bmatrix}
-0.2 & 0 & 0 & 0 \\
0 & -0.7 & 0 & 0 \\
-0.9 & 0 & -1 & 0 \\
0 & 0.8 & 0 & -1
\end{bmatrix}
\begin{bmatrix} d_1 \\ d_2 \\ y_1 \\ y_2 \end{bmatrix} \le 0.
\]

Now, from the following feasible solution to the above LP problem:

\[
\begin{bmatrix} d_1 \\ d_2 \\ y_1 \\ y_2 \end{bmatrix}
=
\begin{bmatrix} 97.3594 \\ 106.6566 \\ 6.6901 \\ 89.5453 \end{bmatrix},
\]

we obtain (as stated in Theorem 3) that a stabilizing controller has the gain $K = [\,0.0687 \;\; 0.8396\,]$ (computed from $K = [\,y_1 d_1^{-1} \;\; y_2 d_2^{-1}\,]$). Thus, the dynamic matrices of the closed-loop system are:

\[
A_1 + B_1 K = \begin{bmatrix} 0.2 & 0.7 \\ 0 & 0 \end{bmatrix}
\quad\text{and}\quad
A_2 + B_2 K = \begin{bmatrix} 0 & 0 \\ 0.9687 & 0.0396 \end{bmatrix}.
\]

It can be seen that all the states of the system converge to zero (see Figure 6). It can also be checked that the matrix A1 + B1K + A2 + B2K has all its eigenvalues inside the unit circle, namely at λ1 = 0.9471 and λ2 = −0.7076. From this, it follows that the system is asymptotically stable. Moreover, by simply looking at the entries of the dynamic matrices A1 + B1K and A2 + B2K, we can conclude that, according to Proposition 2.1, the governed system is positive.
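The numbers of this example can be cross-checked with a few lines of Python. The sketch below evaluates the nontrivial rows of the two LP inequality systems at the reported solution, rebuilds the gain, and computes the eigenvalues of the 2×2 sum matrix from its trace and determinant. The variable names are illustrative assumptions, not notation from the paper.

```python
import math

# LP solution reported in the text for the positive feedback example.
d1, d2, y1, y2 = 97.3594, 106.6566, 6.6901, 89.5453

# Rows of the strict inequality system that involve the plant data.
row1 = -0.8 * d1 + 0.7 * d2
row2 = 0.9 * d1 - 1.8 * d2 + y1 + y2
# The only row of the non-strict system that is not trivially satisfied,
# rewritten as -0.8*d2 + y2 >= 0.
pos_row = -0.8 * d2 + y2

# Gain K = [y1/d1, y2/d2] and sum matrix A1 + B1 K + A2 + B2 K.
K = [y1 / d1, y2 / d2]
M = [[0.2, 0.7],
     [0.9 + K[0], -0.8 + K[1]]]

# Eigenvalues of a 2x2 matrix from its trace and determinant.
tr = M[0][0] + M[1][1]
det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
disc = math.sqrt(tr * tr - 4.0 * det)
eigs = [(tr + disc) / 2.0, (tr - disc) / 2.0]

print(row1 < 0, row2 < 0, pos_row >= 0, max(abs(e) for e in eigs) < 1)
```

Both eigenvalues are real here because the determinant is negative, so the closed-form square root is safe.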

Fig. 6 Positive feedback example: Closed-loop evolution of x3(i, j)

4.3 Uncertain Plant Example

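In this example, we consider an uncertain system (15) subject to a parametric perturbation, described by the following matrices:

\[
A_1 = \begin{bmatrix} -1.5 & 0.1 & 0 \\ 0.2 & 0.5 & 0.3 \\ 0 & 0 & 0 \end{bmatrix},\quad
A_2(\alpha) = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 1 & 1 & 0.2 - 0.01\alpha \end{bmatrix},\quad
B_1(\alpha) = \begin{bmatrix} 1 \\ 1 - 0.01\alpha \\ 0 \end{bmatrix}
\quad\text{and}\quad
B_2 = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}.
\]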

The uncertain parameter is 0 ≤ α ≤ 1. We are looking for a robust state-feedback control which stabilizes and enforces the positivity of all the plants between the two extreme plants (α = 0 and α = 1). By applying Theorem 4 (based on $K = [\,y_1 d_1^{-1} \;\; y_2 d_2^{-1} \;\; y_3 d_3^{-1}\,]$), we obtain the following gain of a robust stabilizing controller: $K = [\,1.7446 \;\; -0.0836 \;\; 0.0039\,]$.

Fig. 7 Uncertain plant example: Closed-loop state evolution for the extreme plant corresponding to α = 0

Hence, with this gain all the closed-loop systems between the two extreme plants (from α = 0 to α = 1) are positive and asymptotically stable. As an illustration, the state evolution of the two extreme plants (α = 0 and α = 1), starting from initial positive boundaries, is depicted in Figures 7 and 8, respectively.

5 Conclusion

An approach for solving the stability synthesis problem for 2-D systems described by the Fornasini-Marchesini second model under the requirement of positivity of the closed-loop system is presented. For this, necessary and sufficient conditions for the solvability of the stabilization problem have been proposed, including the presence of bounds on the control signal and uncertainty in the plant. Moreover, it has been shown that the proposed conditions are solvable in terms of Linear Programs, which are simple to solve using off-the-shelf software. Some numerical examples illustrate the proposed approach. It must be pointed out that the proposed approach is quite general and can be extended to more involved problems. In fact, work is being carried out to extend these ideas to more general 2-D systems, 2-D systems with delays, n-D systems, etc.

Fig. 8 Uncertain plant example: Closed-loop state evolution for the extreme plant corresponding to α = 1

Acknowledgements. This work is funded by MiCInn project DPI2007-66718-C04-02. The authors would like to thank Dr. Ait Rami for many helpful discussions.

An Algorithmic Approach to Orders of Magnitude in a Biochemical System

Eric Benoît and Jean-Luc Gouzé

Abstract. We use orders of magnitude of variables and parameters of a chemical system described by an ordinary differential equation to obtain a partition of the state space in boxes (hyper-rectangles). From the fast system in each box, we derive rules of transition and obtain a transition graph. This graph can be used for a qualitative simulation and validation of the system.

1 Introduction

For biological models, the positivity of the variables is of primary importance, but the units (dimensions) and the scaling (order of magnitude) of the variables and the parameters also play a central role. Making the model unitless makes it possible to compare the variables with one another, and to group parameters to reduce their number ([8]). The orders of magnitude are a good way of simplifying a big model, by keeping only the "large" or "fast" part and neglecting the "small" or "slow" part. To be more precise, consider a classical biochemical model, described by an ordinary differential equation. The well-known method of "quasi-steady state approximation" ([6]) (or, more mathematically, the singular perturbation method [7]) is related to the above kind of approximation: the order of magnitude of some groups of parameters implies that the rates of variation with respect to time of some variables are far greater than those of others, and (under some assumptions) these "fast" variables can be put to their quasi-equilibria, leading to a simpler differential system with only "slow" variables.

Eric Benoît, Laboratoire de Mathématiques et Applications, Université de la Rochelle, and COMORE, INRIA Sophia-Antipolis, France, e-mail: [email protected]
Jean-Luc Gouzé, COMORE, INRIA Sophia-Antipolis, France, e-mail: [email protected]

R. Bru and S. Romero-Vivó (Eds.): Positive Systems, LNCIS 389, pp. 233–241. springerlink.com © Springer-Verlag Berlin Heidelberg 2009

But the order of magnitude of the variables is also important, although more rarely used [6]. For a biologist, it is very important to check that, given the parameters, the system will evolve in some domain where, for example, the concentration of some complex is very small with respect to the concentrations of the other reactants: this gives a means of checking the model qualitatively against experimental measurements and a priori knowledge.

Here our aim will be to combine the different approaches, using the orders of magnitude of the variables and of the parameters, and also the positivity of some terms and the structure of the system (chemical systems). Our goal will be to analyze large chemical systems, in an algorithmic way, with the help of a computer.

This paper gives the first steps of this methodology. We define what we call orders of magnitude for the variables and parameters. We partition the space of the variables (without units) into hyper-rectangles (called boxes), each box representing an order of magnitude. Within a box, we study the system and separate it into a "fast" and a "slow" part. Then we study the fast subsystem, and show that, under suitable hypotheses, we can obtain information on the full system. Roughly, we can distinguish two types of boxes. In the first type, any solution will exit the box in finite time and go to other boxes. The box is transitory, and our algorithm gives the possible transitions between the transitory boxes. We give sufficient conditions on the parameters so that the exit is possible or forbidden through a given face of the box. In the second type of box, a solution can remain for a "long" time, possibly until reaching the equilibrium. In this kind of box, it is possible to proceed to a singular or regular perturbation analysis to simplify the system ([7]). Finally, we obtain a qualitative simulation through the transition graph of boxes, and the biologist can compare this simulation to his experiments. In particular, he can check that the temporal order of the changes in the orders of magnitude is compatible with the graph.

This approach is quite similar to qualitative simulation and hybrid system approaches, where the transition graph is an abstraction (in the hybrid system sense) of the continuous system. For example, this method was used to describe the qualitative behavior of large genetic networks [5]: in each box, a piecewise linear system is given by the model, but changes from box to box. The qualitative simulations can be compared with experiments and noisy data. It is also possible to check properties of the transition graph by model checking techniques [1, 2]. The notion of reachability (the set of boxes that can be attained from an initial set) plays an important role. Another related work, more oriented toward control aspects, concerns the multi-affine systems defined inside hyper-rectangles ([3]). The authors derive sufficient conditions for driving all the solutions in the rectangle through desired faces of exit in finite time.

The paper is organized as follows: we first define the system (of biochemical type), then the scaling and the boxes. We give some mathematical lemmas describing the behavior of the solutions inside a box or on a face of a box; then we describe the algorithm implemented with Maple. We take the classical example of the Michaelis-Menten mechanism for enzymes, which was the subject of numerous papers concerning the different quasi-steady-state approximations obtainable from this system ([9]). Here we give the transition graph obtained for given values of the parameters.

2 The System

We consider a biochemical system with N nonnegative variables $X_i$, $i = 1, \dots, N$ (these variables are typically chemical concentrations), and P positive parameters $K_j$, and we write the general system: $\dot{X} = F(X, K)$. In general, the variables $X_i$ could have different units, but we suppose here, for the sake of simplicity, that we have scaled the variables by defining $X_i^0 = X_i / X_i^e$, where $X_i^e$ is the chosen unit for $X_i$ and $X_i^0$ is the new unitless variable. The new system can be written $\dot{X}^0 = F(X^0, K^0)$, where the $K_j^0$ are the new unitless parameters defined from the parameters $K_j$ and from the units $X_i^e$. We keep the same notations. In the following, we will need the explicit form of the system. We suppose that it can be written in the following form:

\[
\dot{X}_i = A_{i00} + \sum_{j \neq i} A_{ij0} X_j + \sum_{j \neq i,\; k \neq i,\; j \le k} A_{ijk} X_j X_k - X_i \Bigl( B_{i0} + \sum_j B_{ij} X_j \Bigr) \tag{1}
\]

In this expression, the positive terms represent the reactants of chemical reactions producing $X_i$. The first term $A_{i00}$ is a constant input, the second term $A_{ij0} X_j$ describes the kinetics of the reaction $X_j \to X_i$, and the third positive term $A_{ijk} X_j X_k$ describes the kinetics of the reaction $X_j + X_k \to X_i$. The negative terms represent the decay of the reactant $X_i$. The first term $B_{i0} X_i$ can be seen as a degradation rate toward the exterior of the system, or as a term due to a reaction $X_i \to X_j + X_k$ or $X_i \to X_j$, and the second term describes the kinetics of the reaction $X_j + X_i \to X_k$. We have made some (reasonable) hypotheses: in particular we suppose that the reactions are at most of order two (bi-molecular). We have chosen to restrict our study to such a particular form to be able to write explicit formulas and give explicit lemmas in the following. Yet, the principle of our method remains valid for more complex systems. The main hypothesis is that we are able to separate the right-hand side into groups with very different orders of magnitude. These hypotheses on the structure of the system give us the nonnegativity of the solutions, by showing that the field is repulsive on the boundary ($X_i = 0 \Rightarrow \dot{X}_i \ge 0$).

3 The Scaling

We first define a hyper-rectangular "box" corresponding to an order of magnitude. Let ω be a (large) real number that will define the scale (ω > 1). The box B in the space of the variables is described by the notation $B = [n_1, \dots, n_N]$ (where the $n_j$ are integers) and means that, for each $i = 1, \dots, N$, the variable $X_i$ is such that

\[
\omega^{n_i - 1/2} \le X_i \le \omega^{n_i + 1/2}.
\]

This box is of (logarithmic) length ω and of center $\omega^{n_i}$. Now we can scale the variables inside each box by the change of variables

\[
x_i = X_i / \omega^{n_i}
\]

so that, in the box, the new variable verifies $\omega^{-1/2} \le x_i \le \omega^{+1/2}$. The new parameters (i.e., the coefficients of the monomials in the equations) after the scaling of the variables are denoted by k. The new scaled system inside some particular box $[n_1, \dots, n_N]$ is now written as $\dot{x} = f(x, k)$ and the explicit form is the same as equation (1) with scaled variables and new parameters:

\[
\dot{x}_i = a_{i00} + \sum_{j \neq i} a_{ij0} x_j + \sum_{j \neq i,\; k \neq i,\; j \le k} a_{ijk} x_j x_k - x_i \Bigl( b_{i0} + \sum_j b_{ij} x_j \Bigr) \tag{2}
\]

The numbers $a_{ijk}$, $-b_{ij}$ will now be called the coefficients of $f_i$ and designated as $\mathrm{Coef}(f_i)$ (a vector). We remark that they are positive or negative with this definition. We will write x > 0 to say that the vector x has nonnegative coordinates, not all equal to zero. To be more explicit, easy computations give that

\[
a_{ijk} = \omega^{(n_j + n_k - n_i)} A_{ijk}, \qquad b_{ij} = \omega^{n_j} B_{ij}.
\]
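For completeness, the computation behind these formulas can be sketched as follows, under the convention (assumed here, not stated explicitly in the text) that the index 0 carries no scaling, i.e. $n_0 = 0$. Substituting $X_i = \omega^{n_i} x_i$ into (1) and dividing by $\omega^{n_i}$ gives

```latex
\dot{x}_i = \omega^{-n_i}\,\dot{X}_i
  = \omega^{-n_i} A_{i00}
  + \sum_{j \neq i} \omega^{\,n_j - n_i} A_{ij0}\, x_j
  + \sum_{j \neq i,\; k \neq i,\; j \le k} \omega^{\,n_j + n_k - n_i} A_{ijk}\, x_j x_k
  - x_i \Bigl( B_{i0} + \sum_j \omega^{\,n_j} B_{ij}\, x_j \Bigr).
```

Matching each monomial with the corresponding one in (2) yields the stated expressions for $a_{ijk}$ and $b_{ij}$; the factor $\omega^{-n_i}$ cancels in the negative part because $X_i = \omega^{n_i} x_i$ appears there as a common factor.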

In a given box, because of the scaling, the coefficients $a_{ijk}$ and $-b_{ij}$ will often have different orders of magnitude, and the right-hand side of equation (2) will be the sum of a large (or fast) part and of a small (or slow) part. Let us remark that, until now, we have not scaled the time. Therefore we can choose a new time scale τ, with τ = Ct, such that the larger coefficients are of order greater than one, and are separated from the other coefficients by a gap of length G. Let us explain that more precisely.

Definition 1. In a box, for the appropriate change of time, the system (2) can be written in the form $\frac{dx}{d\tau} = f^f(x, k) + f^s(x, k)$, where the non-zero coefficients of $f^f$ are greater than G in absolute value, and the non-zero coefficients of $f^s$ are smaller than 1 in absolute value. The number G is called the gap between the coefficients in the box.

This gap is nothing but the interval between the larger coefficients and the others. The grouping of the parameters around the gap is not unique. We will show below that, if the gap G is large enough, then the full system can be well approximated by the fast subsystem $\frac{dx}{d\tau} = f^f(x, k)$, and we deduce several properties concerning the fact that the solutions in the box have to escape this box via some faces, and that the exit through some other faces is forbidden.
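To make the notion of gap concrete, here is a minimal Python sketch of one way to look for a fast/slow split in a list of scaled coefficients. The function name, the rule of picking the split with the largest ratio, and the sample numbers are all assumptions made for illustration; the paper's Maple implementation is not shown in the text and may proceed differently.

```python
def find_gap(coeffs, G):
    """Split nonzero coefficients into (slow, fast) parts.

    A split is accepted when the smallest fast magnitude exceeds G times
    the largest slow magnitude, so that a suitable time rescaling makes
    |slow| < 1 and |fast| > G. Returns None if no such split exists.
    Choosing the split with the largest ratio is a choice of this sketch.
    """
    mags = sorted(abs(c) for c in coeffs if c != 0)
    best = None
    for s in range(1, len(mags)):
        ratio = mags[s] / mags[s - 1]
        if ratio > G and (best is None or ratio > best[0]):
            best = (ratio, s)
    if best is None:
        return None
    threshold = mags[best[1] - 1]      # largest slow magnitude
    slow = [c for c in coeffs if c != 0 and abs(c) <= threshold]
    fast = [c for c in coeffs if abs(c) > threshold]
    return slow, fast

# Hypothetical coefficients of one scaled equation, with G = 1e6.
example = find_gap([3.0e7, -2.0e6, 0.5, -1e-3], 1e6)
no_gap = find_gap([1.0, 2.0, 3.0], 1e6)
print(example, no_gap)
```

The sketch does not treat the case where every coefficient is fast (an empty slow part), which Definition 1 allows.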

4 The Behavior in a Given Box

Suppose, as above, that the solution is in some box $B = [n_1, \dots, n_N]$. The following lemma gives a sufficient condition for one coordinate to be monotone in B. We first define four different disjoint sets of indexes, each being a subset of $\{1, \dots, N\}$. Let $E_0 = \{i;\ \mathrm{Coef}(f_i^f) = 0\}$, $E_+ = \{i;\ \mathrm{Coef}(f_i^f) > 0\}$, $E_- = \{i;\ \mathrm{Coef}(f_i^f) < 0\}$. The set $E_*$ will contain the rest of the indexes, i.e. the indexes such that the fast part $f_i^f$ has positive and negative coefficients in the box. Of course $E_+ \cup E_- \cup E_0 \cup E_* = \{1, \dots, N\}$.

Lemma 1. Suppose that $i \in E_+$. Then, if $G > (N + 1)\omega^2$, then $\frac{dx_i}{d\tau} > 0$, i.e. the i-th coordinate is increasing in B. The box is exited in finite time. The face $x_i = \omega^{+1/2}$ is a possible face of exit. The solution cannot exit through the face $x_i = \omega^{-1/2}$.

Proof. Because $i \in E_+$, all the coefficients of $f_i^f$ are zero or positive and greater than G, and at least one is positive. The coefficients of $f_i^s$ are lower than 1 in absolute value, and of any sign. In the box B, the following inequalities are valid for any j:

\[
\omega^{-1/2} \le x_j \le \omega^{+1/2}.
\]

Now, given the explicit form (2), we can bound $\dot{x}_i$ from below with the above inequalities. The first part of the right-hand side, $a_{i00} + \sum_{j \neq i} a_{ij0} x_j + \sum_{j \neq i,\, k \neq i} a_{ijk} x_j x_k$, is (because of the positivity of the elements) bounded below by $G$ or $G\omega^{-1/2}$ or $G/\omega$; we keep $G/\omega$ because ω > 1. The second (negative) part, $-x_i (b_{i0} + \sum_j b_{ij} x_j)$, is bounded below by $-\omega^{+1/2}(1 + N\omega^{+1/2})$, more roughly by $-(N + 1)\omega$ because ω > 1. Finally we obtain that, if $G > (N + 1)\omega^2$, then $\frac{dx_i}{d\tau} > 0$. □

Now we give a lemma about the possible exit on a face. This lemma is used by the algorithm to compute if the exit by some face is possible.

Lemma 2. Suppose that $i \in E_+$. Then, if $G > (N + 1)\omega$, a solution in the box cannot exit through the face $x_i = \omega^{-1/2}$.

The proof is similar to the preceding one, except that $x_i$ has the value $\omega^{-1/2}$. The third lemma is similar to Lemma 1, but for negative coefficients in the fast system.

Lemma 3. Suppose that $i \in E_-$. Then, if $G > N(N + 1)\omega^2/2$, then $\frac{dx_i}{d\tau} < 0$, i.e. the i-th coordinate is decreasing in B. The box is exited in finite time. The face $x_i = \omega^{-1/2}$ is a possible face of exit. The solution cannot exit through the face $x_i = \omega^{1/2}$.

Lemma 4. Suppose that $i \in E_-$. Then, if $G > N(N + 1)\omega/2$, a solution in the box cannot exit through the face $x_i = \omega^{1/2}$.

The next lemma is technical and gives an upper bound for the time needed for a solution to escape the box when the assumptions of Lemma 1 are fulfilled.

Lemma 5. Suppose that $i \in E_+$. Then, if $G > (N + 1)\omega^q$, with real q > 2, then a solution $x(\tau)$ is such that $x_i(\tau)$ cannot stay between $\omega^{-1/2}$ and $\omega^{1/2}$ longer than time $T = \frac{1}{N+1}\,\omega^{3/2 - q}\,(1 + O(1/\omega))$. Suppose that $i \in E_-$. Then, if $G > (N + 1)\omega^q$, with real q > 2, then a solution $x(\tau)$ is such that $x_i(\tau)$ cannot stay between $\omega^{-1/2}$ and $\omega^{1/2}$ longer than time $T = \frac{1}{N+1}\,\omega^{1/2 - q} \ln \omega + O(\omega^{5/2 - 2q})$.

The next lemma makes use of the above estimates of the maximal time during which a coordinate $x_i$ stays in the box. We suppose now that the velocities of some coordinate i of the fast system are equal to zero (for these coordinates, the coefficients are all slow), and that some other coordinates j of the fast system have a positive or negative velocity. Then we show that, roughly (up to some approximation depending on the gap), the solutions will escape the box through the faces corresponding to $x_j$, and not $x_i$.

Lemma 6. Suppose that $i \in E_0$, and that $E_+ \cup E_- \neq \emptyset$. Then if $x_i(0) < B\,\omega^{1/2}$, with $B = 1 - \frac{N}{2}\,\omega^{(2-q)} + o(\omega^{(2-q)})$, then $x(\tau)$ cannot escape through the face $x_i = \omega^{1/2}$. Similarly, suppose that $i \in E_0$, and that $E_+ \cup E_- \neq \emptyset$. Then if $x_i(0) > B'\,\omega^{-1/2}$, with $B' = 1 + \omega^{(2-q)} + o(\omega^{(2-q)})$, then $x(\tau)$ cannot escape through the face $x_i = \omega^{-1/2}$.

Remark that, if the gap G is large, the bounds $B$ and $B'$ in the lemma are close to 1. In the algorithm below, the faces $x_i$ will be called "almost forbidden" to express the fact that the algorithm will consider that no solution escapes through the faces $x_i = \omega^{\pm 1/2}$. Due to the lack of space, we do not give the proofs (see [4]).

Remark 1 (Stoichiometric invariants). Very often, a chemical or biochemical system has invariants that are linear first integrals of the system. Indeed, the system is written $\dot{x} = S\,v(x)$, where the $v(x)$ are monomials of degree at most two. The invariants are the elements of the left kernel of the stoichiometric matrix S, and they are given by (here X is not scaled) $I(X) = \sum_i k_i X_i$. They are used in the algorithm to reduce the possible number of faces of exit.

5 The Algorithm

We wrote a Maple program to perform this study. The structure is briefly described below:

• Specify the data:
  – The stoichiometric matrix S and the rates of the reactions v(X) as in (1).
  – The numerical values of the parameters K, and an initial condition.
  – Choose the values of ω and q, which are control parameters for the algorithm. Then $G = (N + 1)\omega^q$.
• Preliminaries:
  – Write the equation in the form (1).
  – Determine the left kernel of S to define the stoichiometric invariants.
  – Determine the initial box and the value of the invariants in this box.
• The main loop:
  – Scale the equation to obtain (2).
  – Sort the coefficients of f, and find (if possible) a gap greater than G.
  – Write the "fast" vector field $f^f$.
  – When $E_+$ or $E_-$ is nonempty, use the lemmas to write a list of forbidden (Lemmas 2 and 4) or almost forbidden (Lemma 6) exit faces.
  – Eliminate from the list of the possible transitions the boxes that are not compatible with the value of the stoichiometric invariants.
  – Return to the loop with a new box.
• The program ends when all the encountered boxes have been studied or when the number of boxes is greater than a given large number.
• The result of the algorithm is a graph with boxes as vertices and possible transitions as edges.
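One step of the main loop, the classification of each coordinate and the resulting list of forbidden faces, can be sketched in Python as follows. This is an illustrative reading of the index sets and of Lemmas 2 and 4, not the authors' Maple code; the function names, the face labels and the sample coefficients are hypothetical.

```python
def classify(fast_coeffs):
    """Return 'E0', 'E+', 'E-' or 'E*' from the fast coefficients of f_i."""
    nonzero = [c for c in fast_coeffs if c != 0]
    if not nonzero:
        return "E0"
    if all(c > 0 for c in nonzero):
        return "E+"
    if all(c < 0 for c in nonzero):
        return "E-"
    return "E*"

def forbidden_faces(classes, G, N, omega):
    """Faces excluded by Lemma 2 (i in E+) or Lemma 4 (i in E-).

    (i, 'lower') stands for the face x_i = omega**-0.5 and
    (i, 'upper') for the face x_i = omega**0.5.
    """
    faces = []
    for i, cls in enumerate(classes):
        if cls == "E+" and G > (N + 1) * omega:
            faces.append((i, "lower"))
        elif cls == "E-" and G > N * (N + 1) * omega / 2:
            faces.append((i, "upper"))
    return faces

# Hypothetical fast coefficients for a system with N = 4 coordinates.
fast_parts = [[5e6], [-3e6, -2e7], [], [4e6, -9e6]]
classes = [classify(c) for c in fast_parts]
faces = forbidden_faces(classes, G=1e6, N=4, omega=100.0)
print(classes, faces)
```

Coordinates in E0 would additionally be handled through Lemma 6 ("almost forbidden" faces), which this sketch leaves out.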

6 Example: Michaelis-Menten Equation

We now test the Maple algorithm with the classical Michaelis-Menten equation. Of course, our goal is to apply this algorithm to systems of higher dimensions, but this simple example will illustrate our approach.

\[
\begin{cases}
\dot{S} = -k_+ S E + k_- C \\
\dot{E} = -k_+ S E + k_- C + k_2 C \\
\dot{C} = k_+ S E - k_- C - k_2 C \\
\dot{P} = k_2 C
\end{cases}
\]

We choose $k_+ = 1$, $k_- = 2$ and $k_2 = 3$, all of the same order of magnitude. We choose $\omega = 10^2$ and $G = 10^6$, and an initial box [0, −2, −25, −15]. Therefore the concentrations of C and of P are almost zero. The invariants are S + C + P and E + C. If we only consider the orders of magnitude of the variables, these invariants can be computed as max(n1, n3, n4) and max(n2, n3). Their values are 0 and −2. Because C is always lower than $\omega^{-2}$, the first invariant can be simplified to max(n1, n4) = 0. For this simple example, we have a planar representation of the graph of boxes: because max(n1, n4) = 0, the value of n1 − n4 characterizes n1 and n4. So we can draw a box as the point (n1 − n4, n2 − n3). For technical reasons, an edge oriented from left to right is drawn with a specific mark in the figure. Therefore, a thick segment corresponds to two opposite edges. A vertex surrounded by a small gray disk (left of the figure) is a vertex such that $E_+ \cup E_- = \emptyset$: a more precise study could be done in this box using singular perturbation theory ([9]), but it is not our aim here.
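The two invariants and the planar representation mentioned above are easy to check numerically. The following Python sketch evaluates the right-hand side of the Michaelis-Menten system at an arbitrary state and verifies that S + C + P and E + C have zero time derivative; it also maps a box to its planar point. The function names and the test state are illustrative assumptions.

```python
def michaelis_menten_rhs(S, E, C, P, k_plus=1.0, k_minus=2.0, k2=3.0):
    """Right-hand side of the Michaelis-Menten system as written in the text."""
    dS = -k_plus * S * E + k_minus * C
    dE = -k_plus * S * E + k_minus * C + k2 * C
    dC = k_plus * S * E - k_minus * C - k2 * C
    dP = k2 * C
    return dS, dE, dC, dP

# Arbitrary (hypothetical) positive state, used only to test the invariants.
dS, dE, dC, dP = michaelis_menten_rhs(0.7, 0.02, 0.3, 0.1)
inv1_rate = dS + dC + dP      # time derivative of S + C + P
inv2_rate = dE + dC           # time derivative of E + C

def planar_point(box):
    """Map a box [n1, n2, n3, n4] to the point (n1 - n4, n2 - n3)."""
    n1, n2, n3, n4 = box
    return (n1 - n4, n2 - n3)

start = planar_point([0, -2, -25, -15])
print(abs(inv1_rate) < 1e-12, abs(inv2_rate) < 1e-12, start)
```

The planar point of the initial box is the starting point of the trajectory of boxes discussed in the text.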

On the picture, the trajectory of boxes starts at point n = [0, −2, −25, −15], corresponding to the point (15, 23). Then the transition graph shows that n2 is decreasing and n1, n3, n4 stay constant until n2 is between −7 and −10. Up to here, the pathway is unique. After that, n4 and n3 are increasing, with a mean slope around 1/2. In a third phase, n3 and n1 are decreasing. The trajectory enters the region where singular perturbation analysis is possible. A biologist can therefore compare this qualitative trajectory through orders of magnitude with his experiments. In fact, the number of paths found by the algorithm is excessive, because we did not give enough constraints. For example, some trajectories stay on a vertical line n1 − n4 = 5, with n2 − n3 moving between −6 and 4, which is not possible. Improving the lemmas will drastically decrease the number of trajectories.

7 Conclusion

The paper gives the first steps of the method; room is left for many improvements. In particular, it is possible to obtain stronger results on the possibility of exiting via a face of the box, using the invariants and the structure of the chemical system. The transition graph that is obtained at the end of the procedure can be very big. Existing computer tools, like model checking, could be adapted to be able to manage large networks and verify given properties, in a spirit similar to [2].

Acknowledgements. The authors acknowledge the support of the French ANR Biosys Metagenoreg program.

References

1. Batt, G., Belta, C., Weiss, R.: Model checking genetic regulatory networks with parameter uncertainty. In: Bemporad, A., Bicchi, A., Buttazzo, G. (eds.) HSCC 2007. LNCS, vol. 4416, pp. 61–75. Springer, Heidelberg (2007)
2. Batt, G., Ropers, D., de Jong, H., Geiselmann, J., Mateescu, R., Page, M., Schneider, D.: Validation of qualitative models of genetic regulatory networks by model checking: analysis of the nutritional stress response in Escherichia coli. Bioinformatics 21(supp. 1), i19–i28 (2005)
3. Belta, C., Habets, L.: Controlling a Class of Nonlinear Systems on Rectangles. IEEE Transactions on Automatic Control 51(11), 1749 (2006)
4. Benoît, E., Gouzé, J.-L.: A mathematical and algorithmic approach to orders of magnitude in a biochemical system. Research report, INRIA (2009)
5. de Jong, H., Gouzé, J.-L., Hernandez, C., Page, M., Sari, T., Geiselmann, J.: Qualitative simulation of genetic regulatory networks using piecewise-linear models. Bull. Math. Biol. 6(2), 301–340 (2004)
6. Heinrich, R., Schuster, S.: The Regulation of Cellular Systems. Chapman & Hall, Boca Raton (1996)
7. Khalil, H.: Nonlinear systems. Prentice Hall, Upper Saddle River (2002)
8. Lin, C., Segel, L.: Mathematics Applied to Deterministic Problems in the Natural Sciences. Society for Industrial Mathematics (1988)
9. Murray, J.: Mathematical Biology: I. An Introduction. Springer, Heidelberg (2002)

Structural Identifiability of Linear Singular Dynamic Systems

Begoña Cantó, Carmen Coll and Elena Sánchez

Abstract. Structured singular systems depending on a parametric vector are considered. The identification of the parameters is analyzed in terms of the input-output behavior of the system. The role of the reachability and observability properties in this analysis is studied and a characterization of the structural identifiability property is given. Finally, the structural identifiability of a positive reachable system is studied.

1 Introduction

Structured systems are used in the modeling of mechanical, electrical, biological and economic systems. Normally, the mathematical model incorporates parameters that symbolize empirical relations among variables. Given a parametrized state-space model, structural identifiability is concerned with whether the unknown parameters within the model can be identified from the experiment considered. A structural global identifiability analysis of the model is important in the modeling process and is necessary for system identification or parameter estimation. It means uniqueness of the parametric structure in the model. If the structural global identifiability property holds, it is possible to determine the values of the parameters uniquely in terms of known quantities. Several results on structured linear systems and on global identifiability have been published. For instance, a survey on linear structured systems is given in [6] and the references therein. For references on global identifiability see [1] and [8]. In the singular case, few results have been given; see for instance [2] and [7]. In this paper a characterization of the structural identifiability of linear singular systems is given.

Begoña Cantó, Carmen Coll and Elena Sánchez, Institut de Matemàtica Multidisciplinar, Universitat Politècnica de València, 46071, Valencia, Spain, e-mail: [email protected],[email protected],[email protected]

R. Bru and S. Romero-Vivó (Eds.): Positive Systems, LNCIS 389, pp. 243–249. springerlink.com © Springer-Verlag Berlin Heidelberg 2009

The motivation for the study of structural identifiability of positive systems is the use of positive structured linear systems in economics, chemistry and other research areas. In these systems the structure of the coefficient matrices plays an important role because of the positivity restrictions on the behavior of the system. A powerful tool used in the solution of singular systems is the Drazin inverse. Let $M \in \mathbb{R}^{n \times n}$ be a square matrix. The matrix $M^D$ is the Drazin inverse of $M$ if it satisfies $M^D M M^D = M^D$, $M M^D = M^D M$ and $M^{k+1} M^D = M^k$, where $k = \operatorname{ind}(M)$, the least nonnegative integer such that $\operatorname{rank}(M^k) = \operatorname{rank}(M^{k+1})$, is the index of the matrix $M$. Consider a structured singular system

\[
\begin{aligned}
E(p)\,x(k+1) &= A(p)\,x(k) + B(p)\,u(k), \\
y(k) &= C(p)\,x(k)
\end{aligned}
\tag{1}
\]

where the matrices $E(p), A(p) \in \mathbb{R}^{n \times n}$, $B(p) \in \mathbb{R}^{n \times m}$ and $C(p) \in \mathbb{R}^{m \times n}$ have a fixed structure, with $p$ belonging to a parameter set $P$ of real vectors. The matrix $E(p)$ is singular with $n_1 = \operatorname{rank}(E^D(p)E(p))$ and $l = \operatorname{ind}(E(p))$. This structured singular system is denoted by $S(p) = (E(p), A(p), B(p), C(p), P)$. If there exists $\lambda \in \mathbb{C}$ such that $\det[\lambda E(p) - A(p)] \neq 0$, the system $S(p)$ has a solution. The output of the system is given by

\[
\begin{aligned}
y(k) ={}& C(p)\bigl(E^D(p)A(p)\bigr)^k E^D(p)E(p)\,x(0) \\
&+ \sum_{i=0}^{k-1} C(p)E^D(p)\bigl(E^D(p)A(p)\bigr)^{k-i-1} B(p)\,u(i) \\
&- C(p)\bigl(I - E^D(p)E(p)\bigr) \sum_{i=0}^{l-1} \bigl(E(p)A^D(p)\bigr)^i A^D(p)B(p)\,u(k+i),
\end{aligned}
\]

if $E(p)A(p) = A(p)E(p)$ and $\ker(E(p)) \cap \ker(A(p)) = \{0\}$. The set of admissible initial conditions, $X_0$, is given by

\[
X_0 = \operatorname{Im}\bigl[\,E^D(p)E(p) \;\; H_0 \;\; \dots \;\; H_{q-1}\,\bigr]
\]

where $H_i = \bigl(I - E^D(p)E(p)\bigr)\bigl(E(p)A^D(p)\bigr)^i A^D(p)$, $i = 0, \dots, l-1$.

Before solving the identifiability problem we recall the definitions of the reachability, observability and similarity properties. A system $S(p)$ is called structurally reachable if, for all $p \in P$, the reachability matrices

\[
\begin{aligned}
R_f(S(p)) &= \bigl[\,E^D(p)B(p) \;\; \dots \;\; E^D(p)\bigl(E^D(p)A(p)\bigr)^{n-1}B(p)\,\bigr], \\
R_b(S(p)) &= \bigl[\,\bigl(I - E^D(p)E(p)\bigr)A^D(p)B(p) \;\; \dots \;\; \bigl(I - E^D(p)E(p)\bigr)\bigl(E(p)A^D(p)\bigr)^{l-1}A^D(p)B(p)\,\bigr]
\end{aligned}
\]

have full rank, that is $\operatorname{rank}(R_f(S(p))) = n_1$ and $\operatorname{rank}(R_b(S(p))) = n - n_1$. And a system $S(p)$ is called structurally observable if, for all $p \in P$, the observability matrices

\[
\begin{aligned}
O_f(S(p)) &= \bigl[\,\bigl(C(p)E^D(p)\bigr)^T \;\; \dots \;\; \bigl(C(p)E^D(p)\bigl(E^D(p)A(p)\bigr)^{n-1}\bigr)^T\,\bigr], \\
O_b(S(p)) &= \bigl[\,\bigl(C(p)\bigl(I - E^D(p)E(p)\bigr)A^D(p)\bigr)^T \;\; \dots \;\; \bigl(C(p)\bigl(I - E^D(p)E(p)\bigr)\bigl(E(p)A^D(p)\bigr)^{l-1}A^D(p)\bigr)^T\,\bigr]
\end{aligned}
\]

have full rank, that is $\operatorname{rank}(O_f(S(p))) = n_1$ and $\operatorname{rank}(O_b(S(p))) = n - n_1$. Two systems $S(p)$ and $S(q)$, $p, q \in P$, are structurally similar if there exists an invertible matrix $T$ such that $E(p) = TE(q)T^{-1}$, $A(p) = TA(q)T^{-1}$, $B(p) = TB(q)$ and $C(p) = C(q)T^{-1}$.

2 Structural Identifiability Problem

The identifiability of the parameters of the system is concerned with their determination from the external behavior of the system. The response of the system to a given input can be analyzed in the z-domain or by using the input-output map. That is, to determine the input-output behavior (io) of a model $S(p)$ we can use the transfer matrix

\[
G(z,p) = C(p)\bigl(zE(p) - A(p)\bigr)^{-1}B(p)
\]

or the Markov parameters associated with the system $S(p)$. These parameters are given by

\[
\begin{aligned}
V(j,p) &= C(p)E^D(p)\bigl(E^D(p)A(p)\bigr)^{j}B(p), && j \ge 0,\\
H(j,p) &= C(p)\bigl(I - E(p)E^D(p)\bigr)\bigl(E(p)A^D(p)\bigr)^{j}A^D(p)B(p), && j = 0,\dots,l-1.
\end{aligned}
\tag{2}
\]

The concept of structural identifiability is given in the following definition.

Definition 1. The system $S(p)$ is structurally identifiable if and only if, for almost any two candidate parameter vectors $p, q \in \mathcal{P}$, $io(p) = io(q)$ implies $p = q$, where $io(\cdot)$ denotes the input-output behavior of the system $S(p)$.

This concept is also called global identifiability in the literature (see [2], [8] for the continuous-time case). Structural reachability and observability are related to the structural identifiability property. However, these properties are neither necessary nor sufficient conditions for global identifiability. For example, consider the system $S(p)$ with the following structure

\[
E(p) = \begin{pmatrix} p_1 & p_2 & 0 & 0\\ 0 & p_3 & 0 & 0\\ 0 & 0 & 0 & p_4\\ 0 & 0 & 0 & 0 \end{pmatrix},\quad
A(p) = I + E(p),\quad
B = \begin{pmatrix} 0 & 1\\ 1 & 0\\ 0 & 0\\ 1 & 1 \end{pmatrix},
\tag{3}
\]

and

\[
C = \begin{pmatrix} 1 & 0 & 1 & 0\\ 0 & 1 & 0 & 1 \end{pmatrix},
\]

where the parametric vector $p \in \mathcal{P} = \{(p_1, p_2, p_3, p_4) \in \mathbb{R}^4_+ \,/\, p_i > 0\}$. Constructing the reachability and observability matrices, it is easy to see that they have full rank, so the system is structurally reachable and observable, but the system is unidentifiable since its transfer matrix is

\[
G(z,p) = \begin{pmatrix}
\dfrac{-p_2(z-1)}{(zp_1 - 1 - p_1)(zp_3 - 1 - p_3)} & p_4(1-z)\\[2ex]
\dfrac{1}{zp_3 - 1 - p_3} & -1
\end{pmatrix}
\]

and the parameters are undetermined. Hence they cannot be estimated from input-output data, even with well-designed experiments.

On the other hand, there exist structurally globally identifiable systems which are not structurally reachable. For example, consider the structured system $(A(p), B)$ given by

\[
A(p) = \begin{pmatrix} 1-\alpha & 0\\ 0 & 1-\alpha \end{pmatrix},\quad
B = \begin{pmatrix} 1\\ 0 \end{pmatrix},
\]

with the parametric vector $p \in \{\alpha \in \mathbb{R} \,/\, \alpha \neq 0\}$. Since the reachability matrix satisfies

\[
\operatorname{rank}\begin{pmatrix} 1 & 1-\alpha\\ 0 & 0 \end{pmatrix} = 1,
\]

the system is not reachable, but as its transfer function is

\[
G(z,p) = \frac{1}{z + \alpha - 1}\begin{pmatrix} 1\\ 0 \end{pmatrix},
\]

the parameter $\alpha$ is uniquely determined and the system is globally identifiable.

The following result characterizes the structural global identifiability property when the system is structurally reachable and observable.

Theorem 1. Consider the structured system $S(p)$ given in (1), structurally reachable and structurally observable. Then $S(p)$ is structurally identifiable if and only if $S(p)$ and $S(q)$ being structurally similar implies that $p = q$ and $T = I$.

Proof. Consider $S(p)$ and $S(q)$ structurally similar; then they have the same Markov parameters. That is, the input-output behavior is the same, $io(p) = io(q)$, and by hypothesis $p = q$. Conversely, consider $S(p)$ and $S(q)$ with the same input-output behavior. It is known that each of them is equivalent to (see [5])

\[
\bar{S}(p) = \left(\begin{pmatrix} I & O\\ O & N(p) \end{pmatrix}, \begin{pmatrix} A_1(p) & O\\ O & I \end{pmatrix}, \begin{pmatrix} B_1(p)\\ B_2(p) \end{pmatrix}, \bigl(C_1(p)\;\; C_2(p)\bigr)\right),
\]

\[
\bar{S}(q) = \left(\begin{pmatrix} I & O\\ O & N(q) \end{pmatrix}, \begin{pmatrix} A_1(q) & O\\ O & I \end{pmatrix}, \begin{pmatrix} B_1(q)\\ B_2(q) \end{pmatrix}, \bigl(C_1(q)\;\; C_2(q)\bigr)\right),
\]

and they satisfy $io(p) = io(q)$. Using the definition (2) of the Markov parameters and the definition of the reachability and observability matrices we show that

\[
\begin{aligned}
O_f(S(p))B_1(p) &= O_f(S(q))B_1(q),\\
C_1(p)R_f(S(p)) &= C_1(q)R_f(S(q)),\\
O_f(S(p))A_1(p)R_f(S(p)) &= O_f(S(q))A_1(q)R_f(S(q)),
\end{aligned}
\tag{4}
\]

and

\[
\begin{aligned}
O_b(S(p))B_2(p) &= O_b(S(q))B_2(q),\\
C_2(p)R_b(S(p)) &= C_2(q)R_b(S(q)),\\
O_b(S(p))N(p)R_b(S(p)) &= O_b(S(q))N(q)R_b(S(q)).
\end{aligned}
\tag{5}
\]

Constructing $T = \begin{pmatrix} T_f & O\\ O & T_b \end{pmatrix}$ with

\[
\begin{aligned}
T_f &= R_f(S(p))R_f^T(S(q))\bigl(R_f(S(q))R_f^T(S(q))\bigr)^{-1},\\
T_b &= R_b(S(p))R_b^T(S(q))\bigl(R_b(S(q))R_b^T(S(q))\bigr)^{-1},
\end{aligned}
\]

and using (4)-(5), it is easy to show that $T$ is nonsingular and that the systems $\bar{S}(p)$ and $\bar{S}(q)$ are similar under the transformation matrix $T$. By hypothesis, this implies that $T = I$ and $p = q$, and hence the structured system is identifiable.

We indicated above that the structured system $S(p)$ given by (3) is unidentifiable; we can now observe that it does not satisfy the condition of Theorem 1 either. Consider two structured systems $S(p)$ and $S(q)$ of type (3) such that

\[
E(p) = \begin{pmatrix} p_1 & p_2 & 0 & 0\\ 0 & p_3 & 0 & 0\\ 0 & 0 & 0 & p_4\\ 0 & 0 & 0 & 0 \end{pmatrix}
\quad\text{and}\quad
E(q) = \begin{pmatrix} p_1 & p_2 & 0 & 0\\ 0 & p_3 & 0 & 0\\ 0 & 0 & 0 & q_4\\ 0 & 0 & 0 & 0 \end{pmatrix}.
\]

We can prove that these systems are similar by means of the transformation matrix

\[
L = \begin{pmatrix} 1 & 0 & 0 & 0\\ 0 & 1 & 0 & 0\\ \dfrac{p_4}{q_4} & 0 & 1 & 0\\ 0 & 0 & 0 & 1 \end{pmatrix}.
\]

Hence, if we choose $p_4 \neq q_4$ then $p = (p_1, p_2, p_3, p_4) \neq (p_1, p_2, p_3, q_4) = q$ and $L \neq I$.

3 Structural Identifiability of Positive Systems

When positivity restrictions are considered, a concept of reachability of positive states by means of positive controls is used. This concept is called positive reachability (see [3]). In this context, we are interested in the structural identifiability problem when the system satisfies this property. Thus, we say that a structured system $S(p)$ is

(i) a structurally identifiable positive system if it is positive for all $p \in \mathcal{P}$ and it is structurally identifiable, and
(ii) a structurally reachable positive system if it is positive and positively reachable for all $p \in \mathcal{P}$.

Consider the system $S(p)$ where the matrices are given by

\[
E = \begin{pmatrix} D & O\\ O & N \end{pmatrix} \ge 0,\quad
A(p) = \begin{pmatrix} A_1(p) & O\\ O & I \end{pmatrix},\quad
B(p) = \begin{pmatrix} B_1(p)\\ B_2 \end{pmatrix},
\tag{6}
\]

where $D$ is a nonsingular diagonal matrix, $N$ is a nilpotent matrix, $B_2 \le 0$, and

\[
A_1(p) = \begin{pmatrix}
0 & p_1 & 0 & \cdots & 0\\
0 & 0 & \ddots & & 0\\
\vdots & & \ddots & \ddots & \vdots\\
0 & & & 0 & p_{n_1-1}\\
p_{n_1} & 0 & \cdots & 0 & 0
\end{pmatrix},
\quad\text{and}\quad
B_1(p) = \begin{pmatrix} 0\\ \vdots\\ 0\\ b \end{pmatrix},
\]

with $\mathcal{P} = \{p = (p_1, \dots, p_{n_1}, b) \in \mathbb{R}^{n_1+1}_+ \,/\, p_i, b > 0\}$. This structured system is positive since it satisfies the conditions established in [4] for a system to be positive, that is, $E^DE \ge 0$, $E^DA(p) \ge 0$, $E^DB(p) \ge 0$, and $(I - E^DE)\bigl(EA^D(p)\bigr)^{i}A^D(p)B(p) \le 0$, $i = 0, 1, \dots, l-1$, where $l$ is the index of $E$. Moreover, the reachability matrices $R_f(S(p))$ and $R_b(S(p))$ contain a monomial matrix of size $n_1$ and $n - n_1$, respectively, so the positive structural reachability property holds (see [3]). In the following result we prove that the system is also structurally globally identifiable.

Theorem 2. Consider the positive structured system (6). This system is structurally globally identifiable.

Proof. We consider two positive structured systems $S(p)$ and $S(q)$ of type (6) with $p, q \in \mathcal{P}$ such that they have the same input-output behavior (io)

\[
\begin{aligned}
V(j,p) &= V(j,q), && j \ge 0,\\
H(j,p) &= H(j,q), && j = 0,\dots,l-1,
\end{aligned}
\]

and we shall prove that $p = q$. If $p = (p_1, \dots, p_{n_1}, b)$ and $q = (q_1, \dots, q_{n_1}, \bar{b})$, by the definition of the Markov parameters in (2) and by the structure of the matrices of the system (6) we have

\[
\begin{aligned}
E^DB(p) = E^DB(q) &\;\Rightarrow\; b = \bar{b},\\
E^DA(p)B(p) = E^DA(q)B(q) &\;\Rightarrow\; p_{n_1-1} = q_{n_1-1},\\
&\;\;\vdots\\
\bigl(E^DA(p)\bigr)^{n_1-1}B(p) = \bigl(E^DA(q)\bigr)^{n_1-1}B(q) &\;\Rightarrow\; p_1 = q_1,\\
\bigl(E^DA(p)\bigr)^{n_1}B(p) = \bigl(E^DA(q)\bigr)^{n_1}B(q) &\;\Rightarrow\; p_{n_1} = q_{n_1}.
\end{aligned}
\]

Hence, p = q. 
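The peeling argument in this proof can be illustrated numerically. The following is only a minimal sketch under simplifying assumptions that are not in the text: it takes $D = I$ (so that the forward block of $E^D$ is the identity), looks only at the forward vectors $A_1(p)^j B_1(p)$, and uses hypothetical helper names (`cyclic_matrix`, `markov_vectors`, `recover_parameters`). Each product has a single nonzero entry, and the ratio of consecutive entries returns one parameter at a time, in the order $b, p_{n_1-1}, \dots, p_1, p_{n_1}$.

```python
from fractions import Fraction


def cyclic_matrix(p):
    """A1(p): p[0..n-2] on the superdiagonal, p[n-1] in the bottom-left corner."""
    n = len(p)
    a = [[Fraction(0)] * n for _ in range(n)]
    for i in range(n - 1):
        a[i][i + 1] = Fraction(p[i])
    a[n - 1][0] = Fraction(p[n - 1])
    return a


def mat_vec(a, v):
    """Plain matrix-vector product."""
    return [sum(a[i][j] * v[j] for j in range(len(v))) for i in range(len(a))]


def markov_vectors(p, b):
    """Return A1^j B1 for j = 0..n1 (assumption: D = I, forward part only)."""
    n = len(p)
    a = cyclic_matrix(p)
    v = [Fraction(0)] * (n - 1) + [Fraction(b)]
    out = [v]
    for _ in range(n):
        v = mat_vec(a, v)
        out.append(v)
    return out


def recover_parameters(vectors):
    """Recover (p, b) from the vectors A1^j B1, assuming all parameters are > 0."""
    n = len(vectors) - 1
    b = vectors[0][n - 1]
    p = [None] * n
    prev = b
    # For 1 <= j <= n-1 the only nonzero entry of A1^j B1 sits at index n-1-j.
    for j in range(1, n):
        cur = vectors[j][n - 1 - j]
        p[n - 1 - j] = cur / prev
        prev = cur
    # After n steps the cycle closes and the entry returns to the last position.
    p[n - 1] = vectors[n][n - 1] / prev
    return p, b
```

For instance, `recover_parameters(markov_vectors([2, 3, 5], 7))` returns the parameter list `[2, 3, 5]` and `b = 7` as exact fractions, so two parameter vectors producing the same vectors must coincide.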

4 Conclusions

The problem of structural global identifiability of a model consists in determining the uniqueness of the parameter vector when an input-output response is considered. This problem has been studied for structured singular systems. The relation between the structural properties and the structural identifiability has been analyzed for these systems and a characterization of this property has been given. Finally, the structural identifiability of a class of positive reachable systems has been treated.

Acknowledgements. Supported by Spanish DGI grant MMT2007-64477.

References

1. Audoly, S., D'Angiò, L., Saccomani, M.P., Cobelli, C.: Global identifiability of linear compartmental models. IEEE Trans. Biomed. Eng. 45, 36–47 (1998)
2. Ben-Zvi, A., McLellan, P.J., McAuley, K.B.: Identifiability of linear time-invariant differential-algebraic systems. I. The generalized Markov parameter approach. Ind. Eng. Chem. Res. 42, 6607–6618 (2003)
3. Bru, R., Coll, C., Romero, S., Sánchez, E.: Some problems about structural properties of positive descriptor systems. LNCIS, vol. 294, pp. 233–240. Springer, Heidelberg (2003)
4. Bru, R., Coll, C., Sánchez, E.: Structural properties of positive linear time-invariant difference-algebraic equations. Linear Algebra and its Applications 349, 1–10 (2002)
5. Dai, L.: Singular Control Systems. LNCIS, vol. 118. Springer, Heidelberg (1989)
6. Dion, J.M., Commault, C., Van der Woude, J.: Generic properties and control of linear structured systems: a survey. Automatica 39, 1125–1144 (2003)
7. Miyamura, A., Kazuyuki, K.: Identifiability of delayed singular systems. In: Proceedings 5th Asian Control Conference, Melbourne, Australia, pp. 789–797 (2004)
8. Tayakout-Fayolle, M., Jolimaitre, E., Jallut, C.: Consequence of structural identifiability properties on state model formulation for linear inverse chromatography. Chemical Eng. Science 55, 2945–2956 (2000)

On Positivity of Discrete-Time Singular Systems and the Realization Problem

Rafael Cantó, Beatriz Ricarte and Ana M. Urbano

Abstract. In this work we introduce different positivity concepts for singular systems. Discrete-time regular singular systems are considered and the minimal realization problem is discussed for the case of weakly positive systems and internally positive systems.

1 Introduction

Positive singular systems are widely applied in different fields: engineering problems such as electrical circuit networks, power systems, aerospace engineering or chemical processing, and social, economic or biological systems, among others. In the literature singular systems have been called by different names, for example descriptor variable systems or generalized state-space systems [2, 13]. Moreover, different concepts of positivity can be found in systems theory, such as weak positivity, internal positivity or simple positivity [2, 7, 10, 11, 13]. In this work we consider those definitions and their properties for singular systems. Then discrete-time regular singular systems are considered and the realization problem is discussed for the case of weakly positive systems and internally positive systems. Moreover, we try to give conditions to obtain the minimal positive realization when it is possible.

We recall that, given a transfer matrix $T(z) = \dfrac{M(z)}{d(z)} \in \mathbb{R}^{r\times s}(z)$, matrices $E, A \in \mathbb{R}^{n\times n}$, $B \in \mathbb{R}^{n\times s}$ and $C \in \mathbb{R}^{r\times n}$ such that

\[
T(z) = C[zE - A]^{-1}B
\]

are called a realization of $T(z)$. It is denoted by $(E, A, B, C)$. The size of $A$ is called the dimension of the realization. The realization is minimal if it has minimum dimension.

Rafael Cantó, Beatriz Ricarte and Ana M. Urbano: Institut de Matemàtica Multidisciplinar, Universitat Politècnica de València, 46071 València, Spain

R. Bru and S. Romero-Vivó (Eds.): Positive Systems, LNCIS 389, pp. 251–258. springerlink.com © Springer-Verlag Berlin Heidelberg 2009

2 Preliminary Results

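In this paper we consider the discrete-time singular system

\[
\begin{aligned}
\bar{E}\bar{x}(k+1) &= \bar{A}\bar{x}(k) + \bar{B}\bar{u}(k),\\
\bar{y}(k) &= \bar{C}\bar{x}(k),
\end{aligned}
\tag{1}
\]

where $\bar{x}(k) \in \mathbb{R}^{n}$ is the state vector, $\bar{u}(k) \in \mathbb{R}^{s}$ is the input vector, $\bar{y}(k) \in \mathbb{R}^{r}$ is the output vector and $\bar{E}, \bar{A} \in \mathbb{R}^{n\times n}$, $\bar{B} \in \mathbb{R}^{n\times s}$ and $\bar{C} \in \mathbb{R}^{r\times n}$. If $\bar{E} = I$, the system is called standard; if $\bar{E}$ is a singular matrix, then the system is called singular. System (1) is denoted by $(\bar{E}, \bar{A}, \bar{B}, \bar{C})$.

It is well known that if the system (1) satisfies the regularity condition, i.e. there exists a scalar $\lambda \in \mathbb{C}$ such that

\[
\det[\lambda\bar{E} - \bar{A}] \neq 0,
\tag{2}
\]

then (1) is equivalent to the canonical forward-backward form (see [7]) given by

\[
\begin{aligned}
Ex(k+1) &= Ax(k) + Bu(k),\\
y(k) &= Cx(k),
\end{aligned}
\tag{3}
\]

with $E = \operatorname{diag}(I_{n_1}, N)$, $A = \operatorname{diag}(A_1, I_{n_2})$, $n_1 + n_2 = n$, where $n_1$ is the degree of the polynomial $\det[zE - A]$, $A_1 \in \mathbb{R}^{n_1\times n_1}$ and $N \in \mathbb{R}^{n_2\times n_2}$ is a nilpotent matrix with index $q$, i.e., $N^{q-1} \neq 0$ and $N^{q} = 0$.

Consider $B = \begin{bmatrix} B_1\\ B_2 \end{bmatrix}$ with $B_1 \in \mathbb{R}^{n_1\times s}$ and $B_2 \in \mathbb{R}^{n_2\times s}$, and let $C = [\,C_1\;\; C_2\,]$ with $C_1 \in \mathbb{R}^{r\times n_1}$ and $C_2 \in \mathbb{R}^{r\times n_2}$. Then the system (3) can be broken down into the following subsystems:

• the standard (or forward) subsystem

\[
\begin{aligned}
x_1(k+1) &= A_1x_1(k) + B_1u(k),\\
y_1(k) &= C_1x_1(k),
\end{aligned}
\tag{4}
\]

• the complete singular (or backward) subsystem

\[
\begin{aligned}
Nx_2(k+1) &= x_2(k) + B_2u(k),\\
y_2(k) &= C_2x_2(k),
\end{aligned}
\tag{5}
\]

where $x(k) = \begin{bmatrix} x_1(k)\\ x_2(k) \end{bmatrix}$ and $y(k) = y_1(k) + y_2(k)$.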

Lemma 1. [10, Lemma 4.2] The transfer matrix of the system (3) is equal to the sum $T(z) = T_{sp}(z) + W(z)$ of the strictly proper transfer matrix of the subsystem (4)

\[
T_{sp}(z) = C_1[zI_{n_1} - A_1]^{-1}B_1
\]

and the polynomial transfer matrix of the subsystem (5)

\[
W(z) = C_2(zN - I_{n_2})^{-1}B_2.
\]
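As a brief supporting derivation, using only the symbols of Lemma 1 and the nilpotency index $q$ of $N$: because $N^{q} = O$, the geometric series for $(I_{n_2} - zN)^{-1}$ terminates, which shows why $W(z)$ is a polynomial matrix of degree at most $q-1$.

```latex
\[
(I_{n_2} - zN)\sum_{i=0}^{q-1} z^{i}N^{i} = I_{n_2} - z^{q}N^{q} = I_{n_2}
\quad\Longrightarrow\quad
(zN - I_{n_2})^{-1} = -\sum_{i=0}^{q-1} z^{i}N^{i},
\]
\[
W(z) = C_2(zN - I_{n_2})^{-1}B_2 = -\sum_{i=0}^{q-1}\bigl(C_2N^{i}B_2\bigr)z^{i}.
\]
```

In particular, the coefficient of $z^{i}$ in $W(z)$ is $-C_2N^{i}B_2$.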

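Definition 1. Matrices $A_1 \in \mathbb{R}^{n_1\times n_1}$, $B_1 \in \mathbb{R}^{n_1\times s}$, $B_2 \in \mathbb{R}^{n_2\times s}$, $N \in \mathbb{R}^{n_2\times n_2}$, $C_1 \in \mathbb{R}^{r\times n_1}$, $C_2 \in \mathbb{R}^{r\times n_2}$ are called the realization in Weierstrass canonical form (WCF realization) of the matrix $T(z) \in \mathbb{R}^{r\times s}(z)$.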

From Lemma 1 we deduce that the realization problem for regular singular systems can be treated as two realization subproblems: a realization problem for a standard system and a realization problem for a complete singular system.

The positive realization problem has been widely studied for standard systems; see, for instance, [1, 3, 4, 8, 12] for SISO systems or [5, 6] for MIMO systems. With respect to polynomial matrices of complete singular systems, some algorithms have been developed to obtain minimal realizations (see, for instance, [6, 7]).

From now on, we denote by $J_{t,\alpha} \in \mathbb{R}^{t\alpha\times t\alpha}$ the following nilpotent matrix of nilpotent index $t$:

\[
J_{t,\alpha} = \begin{bmatrix}
O & O & \cdots & O & O\\
I_\alpha & O & \cdots & O & O\\
O & I_\alpha & \cdots & O & O\\
\vdots & \vdots & & \vdots & \vdots\\
O & O & \cdots & I_\alpha & O
\end{bmatrix}.
\]

In this work we rely on the following result, which can be found in [6].

Proposition 1. Consider $W(z) = W_{t-1}z^{t-1} + \cdots + W_1z + W_0 \in \mathbb{R}^{r\times s}[z]$.

(1) If $\operatorname{rank} W_{t-1} = s$ then a minimal realization $(N, I_{ts}, B, C)$ of $W(z)$ with a nilpotent matrix $N \in \mathbb{R}^{ts\times ts}$, $B \in \mathbb{R}^{ts\times s}$ and $C \in \mathbb{R}^{r\times ts}$ is given by

\[
N = J_{t,s},\quad
B = \begin{bmatrix} I_s\\ O\\ \vdots\\ O \end{bmatrix}
\quad\text{and}\quad
C = [\,-W_0\;\; -W_1\;\; \cdots\;\; -W_{t-2}\;\; -W_{t-1}\,].
\]

(2) If $\operatorname{rank} W_{t-1} = r$ then a minimal realization $(N, I_{tr}, B, C)$ of $W(z)$ with a nilpotent matrix $N \in \mathbb{R}^{tr\times tr}$, $B \in \mathbb{R}^{tr\times s}$ and $C \in \mathbb{R}^{r\times tr}$ is given by

\[
N = J_{t,r}^{T},\quad
B = \begin{bmatrix} -W_0\\ -W_1\\ \vdots\\ -W_{t-2}\\ -W_{t-1} \end{bmatrix}
\quad\text{and}\quad
C = [\,I_r\;\; O\;\; \cdots\;\; O\;\; O\,].
\]
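The construction in part (1) of Proposition 1 can be checked mechanically. The sketch below is illustrative only: the function names are hypothetical, it uses plain nested lists, and it verifies only that the triple $(N, B, C)$ reproduces the coefficients of $W(z)$ through $W(z) = -\sum_i z^i\, C N^i B$. It does not test the rank condition or minimality.

```python
def block_shift(t, s):
    """J_{t,s}: a (t*s) x (t*s) matrix with identity blocks I_s on the block subdiagonal."""
    n = t * s
    j = [[0] * n for _ in range(n)]
    for i in range(s, n):
        j[i][i - s] = 1
    return j


def mat_mul(a, b):
    """Plain matrix product of nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def realization_case1(coeffs):
    """coeffs = [W_0, ..., W_{t-1}], each r x s; returns (N, B, C) as in part (1)."""
    t, r, s = len(coeffs), len(coeffs[0]), len(coeffs[0][0])
    N = block_shift(t, s)
    # B = [I_s; O; ...; O]
    B = [[1 if i == j else 0 for j in range(s)] for i in range(t * s)]
    # C = [-W_0  -W_1  ...  -W_{t-1}]
    C = [[-coeffs[k][i][j] for k in range(t) for j in range(s)] for i in range(r)]
    return N, B, C


def polynomial_coefficients(N, B, C, t):
    """Coefficients of C (zN - I)^{-1} B = -sum_i z^i C N^i B, for i = 0..t-1."""
    out = []
    M = B  # holds N^i B
    for _ in range(t):
        out.append([[-x for x in row] for row in mat_mul(C, M)])
        M = mat_mul(N, M)
    return out
```

For example, with `W0 = [[1, 1], [0, 0]]` and `W1 = [[1, 0], [0, 1]]`, calling `polynomial_coefficients(*realization_case1([W0, W1]), 2)` returns `[W0, W1]` again.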

As we commented in the Introduction, there are many different concepts of positivity in the literature. In the next sections we consider those definitions and the realization problem.

3 Weakly Positive System

Definition 2. [10, p. 92] The system described by Equation (1) is called weakly positive if and only if

\[
\bar{E} \in \mathbb{R}^{n\times n}_{+},\quad \bar{A} \in \mathbb{R}^{n\times n}_{+},\quad \bar{B} \in \mathbb{R}^{n\times s}_{+}\quad\text{and}\quad \bar{C} \in \mathbb{R}^{r\times n}_{+}.
\]

From now on, we consider regular singular systems in canonical forward-backward form. Therefore, a regular singular system will be weakly positive if and only if

\[
A_1 \in \mathbb{R}^{n_1\times n_1}_{+},\quad B_1 \in \mathbb{R}^{n_1\times s}_{+},\quad B_2 \in \mathbb{R}^{n_2\times s}_{+},\quad N \in \mathbb{R}^{n_2\times n_2}_{+},\quad C_1 \in \mathbb{R}^{r\times n_1}_{+}\quad\text{and}\quad C_2 \in \mathbb{R}^{r\times n_2}_{+},
\]

i.e., we obtain a positive realization in Weierstrass canonical form.

Recall that by Lemma 1 we can break down the problem of obtaining a positive realization in Weierstrass canonical form into two subproblems: the computation of a positive realization $(A_1, B_1, C_1)$ of the strictly proper matrix $T_{sp}(z)$ and the computation of a positive realization $(N, I, B_2, C_2)$ of the polynomial matrix $W(z)$. Let us consider this second subproblem.

Corollary 1. Let $W(z) = W_{t-1}z^{t-1} + \cdots + W_1z + W_0 \in \mathbb{R}^{r\times s}[z]$ be a transfer matrix. If $\operatorname{rank} W_{t-1} = s$ or $\operatorname{rank} W_{t-1} = r$, and $-W_i \in \mathbb{R}^{r\times s}_{+}$ for $i = 0, 1, \dots, t-1$, then there exists a minimal realization $(N, I, B, C)$ of $W(z) = C(zN - I)^{-1}B$ with $N \ge O$, $B \ge O$ and $C \ge O$.

Proof. By Proposition 1, there exists a minimal realization $(N, I, B, C)$ of $W(z)$ where $N$, $B$ and $C$ are nonnegative matrices if $-W_i \ge O$ for $i = 0, 1, \dots, t-1$.

Taking $B_2 = B$ and $C_2 = C$ we obtain the desired realization.

4 Internally Positive System

Definition 3. [2, 11] The singular system (1) is called internally positive (or simply positive) if for any admissible initial condition $\bar{x}(0) \in \mathbb{R}^{n}_{+}$ and for every input control sequence $\bar{u}(k) \in \mathbb{R}^{s}_{+}$, $k \in \mathbb{Z}_{+}$, we have $\bar{x}(k) \in \mathbb{R}^{n}_{+}$ and $\bar{y}(k) \in \mathbb{R}^{r}_{+}$ for $k \in \mathbb{Z}_{+}$.

The following algebraic characterization of positive singular systems is given in [2, Proposition 1]. We denote by $M^D$ the Drazin inverse of a matrix $M$, and by $q = \operatorname{ind}(\bar{E})$ the index of $\bar{E}$, that is, $q$ is the smallest nonnegative integer such that $\operatorname{rank}(\bar{E}^{q}) = \operatorname{rank}(\bar{E}^{q+1})$.

Proposition 2. Consider the system $(\bar{E}, \bar{A}, \bar{B})$. Suppose that $\bar{E}\bar{E}^D \ge O$ and $\bar{E}\bar{A} = \bar{A}\bar{E}$. The system $(\bar{E}, \bar{A}, \bar{B})$ is positive if and only if $\bar{E}^D\bar{A} \ge O$, $\bar{E}^D\bar{B} \ge O$ and $(I - \bar{E}^D\bar{E})(\bar{E}\bar{A}^D)^{i}\bar{A}^D\bar{B} \le O$, $i = 0, 1, \dots, q-1$, where $q = \operatorname{ind}(\bar{E})$.

If the system $(\bar{E}, \bar{A}, \bar{B})$ satisfies the regularity condition (2), then it is equivalent to the canonical forward-backward form given by

\[
Ex(k+1) = Ax(k) + Bu(k).
\]

The authors of [2] prove that in this case $E^DE = \operatorname{diag}(I_{n_1}, O)$, $E^DA = \operatorname{diag}(A_1, O)$, $E^DB = [\,B_1^T\;\; O\,]^T$ and $(I - E^DE)(EA^D)^{i}A^DB = [\,O\;\; (N^{i}B_2)^T\,]^T$. Hence, a forward-backward system is positive if and only if $A_1 \ge 0$, $B_1 \ge 0$ and $N^{i}B_2 \le 0$, $i = 0, 1, \dots, q-1$.

The next result gives necessary and sufficient conditions on the matrices $\bar{E}$, $\bar{A}$, $\bar{B}$ and $\bar{C}$ such that the system $(\bar{E}, \bar{A}, \bar{B}, \bar{C})$ is positive [9, Theorem 2.1].

Corollary 2. Consider the system $(\bar{E}, \bar{A}, \bar{B}, \bar{C})$. Suppose that $\bar{E}\bar{E}^D \ge O$, $\bar{E}\bar{A} = \bar{A}\bar{E}$ and $\ker(\bar{E}) \cap \ker(\bar{A}) = \{0\}$. The system $(\bar{E}, \bar{A}, \bar{B}, \bar{C})$ is internally positive if and only if the following conditions hold, for $i = 0, 1, \dots, \operatorname{ind}(\bar{E}) - 1$:

1. $\bar{E}^D\bar{A} \ge O$,
2. $\bar{E}^D\bar{B} \ge O$,
3. $\bar{C}\bar{E}^D\bar{E} \ge O$,
4. $(I - \bar{E}^D\bar{E})(\bar{E}\bar{A}^D)^{i}\bar{A}^D\bar{B} \le O$,
5. $\bar{C}(I - \bar{E}\bar{E}^D)(\bar{E}\bar{A}^D)^{i}\bar{A}^D\bar{B} \le O$.

If the system $(\bar{E}, \bar{A}, \bar{B}, \bar{C})$ satisfies the regularity condition (2), then it is equivalent to the canonical forward-backward form given by (3). We now introduce the following algebraic characterization for this system to be internally positive.

Proposition 3. A singular system given in the canonical forward-backward form is internally positive if and only if the following conditions hold, for $i = 0, 1, \dots, \operatorname{ind}(E) - 1$:

1. $A_1 \ge O$,
2. $B_1 \ge O$,
3. $C_1 \ge O$,
4. $N^{i}B_2 \le O$,
5. $C_2N^{i}B_2 \le O$.

Proof. By Corollary 2 and taking into account that

\[
CE^DE = [\,C_1\;\; C_2\,]\begin{bmatrix} I_{n_1} & 0\\ 0 & 0 \end{bmatrix} = [\,C_1\;\; 0\,] \ge 0,\quad\text{then } C_1 \ge O,
\]

\[
C(I - EE^D)(EA^D)^{i}A^DB = [\,C_1\;\; C_2\,]\begin{bmatrix} O\\ N^{i}B_2 \end{bmatrix} = C_2N^{i}B_2 \le 0.
\]
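The five conditions of Proposition 3 translate directly into a finite test. The following minimal sketch assumes the matrices are given as nested lists and that $\operatorname{ind}(E)$ coincides with the nilpotency index of $N$ for $E = \operatorname{diag}(I_{n_1}, N)$; the function names are hypothetical and chosen only for illustration.

```python
def mat_mul(a, b):
    """Plain matrix product of nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def nonneg(m):
    return all(x >= 0 for row in m for x in row)


def nonpos(m):
    return all(x <= 0 for row in m for x in row)


def nilpotency_index(N):
    """Smallest q >= 1 with N^q = 0; raises ValueError if N is not nilpotent."""
    P = N  # holds N^q
    for q in range(1, len(N) + 1):
        if all(x == 0 for row in P for x in row):
            return q
        P = mat_mul(P, N)
    raise ValueError("N is not nilpotent")


def is_internally_positive(A1, B1, C1, N, B2, C2):
    """Check conditions 1-5 of Proposition 3 for i = 0, ..., ind(E) - 1."""
    if not (nonneg(A1) and nonneg(B1) and nonneg(C1)):
        return False
    M = B2  # holds N^i B2
    for _ in range(nilpotency_index(N)):
        if not (nonpos(M) and nonpos(mat_mul(C2, M))):
            return False
        M = mat_mul(N, M)
    return True
```

Because $N$ is nilpotent, the loop runs at most $n_2$ times, so the test always terminates.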

Directly from the previous result we obtain the following corollary.

Corollary 3. Consider a singular system in canonical forward-backward form expressed by (4) and (5). If

\[
A_1 \ge O,\quad B_1 \ge O,\quad C_1 \ge O,\quad N \ge O,\quad B_2 \le O,\quad C_2 \ge O,
\]

then this system is internally positive.

Remark 1. Note that, in general, a weakly positive singular system (1) is not internally positive, and an internally positive system is not weakly positive, except for the particular case when $B_2 = O$. Kaczorek [11] observed a similar conclusion.

By Corollary 3 and Lemma 1, the problem of obtaining an internally positive realization of a singular system given in the canonical forward-backward form (4) and (5) can be separated into two subproblems: the computation of a positive realization $(A_1, B_1, C_1)$ of the strictly proper matrix $T_{sp}(z)$ and the computation of a realization $(N, I, B_2, C_2)$ of the polynomial matrix $W(z)$ with $N \ge O$, $B_2 \le O$ and $C_2 \ge O$. Let us consider this second subproblem.

Corollary 4. Let $W(z) = W_{t-1}z^{t-1} + \cdots + W_1z + W_0 \in \mathbb{R}^{r\times s}[z]$ be a transfer matrix. If $\operatorname{rank} W_{t-1} = s$ or $\operatorname{rank} W_{t-1} = r$, and $W_i \in \mathbb{R}^{r\times s}_{+}$ for $i = 0, 1, \dots, t-1$, then there exists a minimal realization $(N, I, B, C)$ of $W(z) = C(zN - I)^{-1}B$ with $N \ge O$, $B \le O$ and $C \ge O$.

Proof. By Proposition 1, we can obtain a minimal realization $(N, I, B, C)$ of $W(z)$ where $N \ge O$, $B \le O$ and $C \ge O$ if $W_i \ge O$ for $i = 0, 1, \dots, t-1$.

Taking B2 = B and C2 = C we obtain the desired realization.

5 Conclusions

Taking into account the different kinds of positivity existing in system theory, if specific conditions are satisfied in each particular case then positive realizations of a transfer matrix $T(z) \in \mathbb{R}^{r\times s}(z)$ can be computed by means of the following procedure.

Step 1. Given a transfer matrix $T(z)$, express it as the sum of a strictly proper rational matrix $T_{sp}(z)$ and a polynomial matrix $W(z)$.

Step 2. Compute a positive realization $(A_1, B_1, C_1)$ for $T_{sp}(z)$.

Step 3. Compute a minimal realization $(N, I, B_2, C_2)$ for $W(z)$.

Step 4. Compose the positive realization $(E, A, B, C)$ for $T(z)$ from the realizations given in Steps 2 and 3.

Note that if the realization obtained in Step 2 is minimal, then the global realization in Step 4 is also minimal.

6 Example

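Obtain an internally positive realization of the transfer matrix $T(z)$:

\[
T(z) = \frac{1}{(z-1)(z-0.7)}\begin{bmatrix} 2z - 1.7 & 4z - 3.4\\ 0.6 & 1.2 \end{bmatrix} + \begin{bmatrix} z+1 & 1\\ 0 & z \end{bmatrix}.
\]

Step 1. $T(z) = T_{sp}(z) + W(z)$ with

\[
T_{sp}(z) = \frac{M(z)}{d(z)} = \frac{1}{(z-1)(z-0.7)}\begin{bmatrix} 2z - 1.7 & 4z - 3.4\\ 0.6 & 1.2 \end{bmatrix},
\]

\[
W(z) = \begin{bmatrix} z+1 & 1\\ 0 & z \end{bmatrix} = \begin{bmatrix} 1 & 0\\ 0 & 1 \end{bmatrix}z + \begin{bmatrix} 1 & 1\\ 0 & 0 \end{bmatrix} = W_1z + W_0,
\]

where $W_0 \ge 0$, $W_1 \ge 0$ and $\det W_1 \neq 0$ (Corollary 4).

Step 2. By [5] we obtain the minimal positive realization $(A_1, B_1, C_1)$ of $T_{sp}(z)$:

\[
A_1 = \begin{bmatrix} 0.85 & 0.15\\ 0.15 & 0.85 \end{bmatrix},\quad
B_1 = \begin{bmatrix} 0 & 0\\ 1 & 2 \end{bmatrix},\quad
C_1 = \begin{bmatrix} 0 & 2\\ 4 & 0 \end{bmatrix}.
\]

Step 3. We obtain a minimal realization $(N, I_4, B_2, C_2)$ by Proposition 1:

\[
N = \begin{bmatrix} 0 & 0 & 0 & 0\\ 0 & 0 & 0 & 0\\ 1 & 0 & 0 & 0\\ 0 & 1 & 0 & 0 \end{bmatrix},\quad
B_2 = \begin{bmatrix} -1 & 0\\ 0 & -1\\ 0 & 0\\ 0 & 0 \end{bmatrix},\quad
C_2 = \begin{bmatrix} 1 & 1 & 1 & 0\\ 0 & 0 & 0 & 1 \end{bmatrix}.
\]

Step 4. We compose the minimal internally positive realization $(E, A, B, C)$:

\[
E = \begin{bmatrix}
1 & 0 & 0 & 0 & 0 & 0\\
0 & 1 & 0 & 0 & 0 & 0\\
0 & 0 & 0 & 0 & 0 & 0\\
0 & 0 & 0 & 0 & 0 & 0\\
0 & 0 & 1 & 0 & 0 & 0\\
0 & 0 & 0 & 1 & 0 & 0
\end{bmatrix},\quad
A = \begin{bmatrix}
0.85 & 0.15 & 0 & 0 & 0 & 0\\
0.15 & 0.85 & 0 & 0 & 0 & 0\\
0 & 0 & 1 & 0 & 0 & 0\\
0 & 0 & 0 & 1 & 0 & 0\\
0 & 0 & 0 & 0 & 1 & 0\\
0 & 0 & 0 & 0 & 0 & 1
\end{bmatrix},
\]

\[
B = \begin{bmatrix} 0 & 0\\ 1 & 2\\ -1 & 0\\ 0 & -1\\ 0 & 0\\ 0 & 0 \end{bmatrix},\quad
C = \begin{bmatrix} 0 & 2 & 1 & 1 & 1 & 0\\ 4 & 0 & 0 & 0 & 0 & 1 \end{bmatrix}.
\]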
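The composed realization can be sanity-checked by evaluating $C(zE - A)^{-1}B$ at a few sample points and comparing with the closed form of $T(z)$. The sketch below is a check under stated assumptions, not part of the procedure itself: it uses exact rational arithmetic from the Python standard library, the helper names are hypothetical, and the sample points must avoid the poles $z = 1$ and $z = 0.7$.

```python
from fractions import Fraction as F

A1 = [[F(17, 20), F(3, 20)], [F(3, 20), F(17, 20)]]   # 0.85, 0.15
B1 = [[0, 0], [1, 2]]
C1 = [[0, 2], [4, 0]]
N = [[0, 0, 0, 0], [0, 0, 0, 0], [1, 0, 0, 0], [0, 1, 0, 0]]
B2 = [[-1, 0], [0, -1], [0, 0], [0, 0]]
C2 = [[1, 1, 1, 0], [0, 0, 0, 1]]


def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]


def block_diag(X, Y):
    n, m = len(X), len(Y)
    return [row + [0] * m for row in X] + [[0] * n + row for row in Y]


E = block_diag(identity(2), N)
A = block_diag(A1, identity(4))
B = B1 + B2                                  # stack B1 over B2
C = [r1 + r2 for r1, r2 in zip(C1, C2)]      # place C1 next to C2


def solve(M, R):
    """Return M^{-1} R by Gauss-Jordan elimination (M assumed nonsingular)."""
    n = len(M)
    aug = [list(M[i]) + list(R[i]) for i in range(n)]
    for c in range(n):
        piv = next(r for r in range(c, n) if aug[r][c] != 0)
        aug[c], aug[piv] = aug[piv], aug[c]
        pv = aug[c][c]
        aug[c] = [x / pv for x in aug[c]]
        for r in range(n):
            if r != c and aug[r][c] != 0:
                f = aug[r][c]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[c])]
    return [row[n:] for row in aug]


def transfer_from_realization(z):
    z = F(z)
    M = [[z * E[i][j] - A[i][j] for j in range(6)] for i in range(6)]
    X = solve(M, B)
    return [[sum(C[i][k] * X[k][j] for k in range(6)) for j in range(2)]
            for i in range(2)]


def transfer_from_formula(z):
    z = F(z)
    d = (z - 1) * (z - F(7, 10))
    return [[(2 * z - F(17, 10)) / d + z + 1, (4 * z - F(17, 5)) / d + 1],
            [F(3, 5) / d, F(6, 5) / d + z]]
```

Agreement at several points is consistent with $(E, A, B, C)$ realizing $T(z)$; it is a numerical check, not a proof of minimality or positivity.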

Acknowledgements. Supported by the Spanish DGI grant MTM2007-64477 and by the UPV under its research program.

References

1. Benvenuti, L., Farina, L.: A Tutorial on the Positive Realization Problem. IEEE Transactions on Automatic Control 49(5), 651–664 (2004)
2. Bru, R., Coll, C., Sánchez, E.: Structural properties of positive linear time-invariant difference-algebraic equations. Linear Algebra and its Applications 349, 1–10 (2002)
3. Bru, R., Cantó, R., Ricarte, B., Rumchev, V.: A Basic Canonical Form of Discrete-Time Compartmental Systems. International Journal of Contemporary Mathematical Sciences 2(6), 261–273 (2007)
4. Cantó, R., Ricarte, B., Urbano, A.M.: On Positive Realizations of Irreducible Transfer Matrices. In: Commault, C., Marchand, N. (eds.). LNCIS, vol. 341, pp. 41–48. Springer, Heidelberg (2006)
5. Cantó, R., Ricarte, B., Urbano, A.M.: Positive Realizations of Transfer Matrices with real poles. IEEE Trans. Circuits Syst. II, Expr. Briefs 54(6), 517–521 (2007)
6. Cantó, R., Ricarte, B., Urbano, A.M.: Computation of realizations of complete singular systems (submitted)
7. Dai, L.: Singular Control Systems. LNCIS, vol. 118. Springer, Heidelberg (1989)
8. Halmschlager, A., Matolcsi, M.: Minimal Positive Realizations for a Class of Transfer Functions. IEEE Trans. Circuits Syst. II, Expr. Briefs 52(4), 177–180 (2005)
9. Herrero, A., Ramírez, A., Thome, N.: An algorithm to check the nonnegativity of singular systems. Applied Mathematics and Computation 189, 355–365 (2007)
10. Kaczorek, T.: Positive 1D and 2D Systems. Springer, London (2002)
11. Kaczorek, T.: Externally and Internally Positive Singular Discrete-Time Linear Systems. International Journal of Appl. Math. Comput. Sci. 12(2), 197–202 (2002)
12. Nagy, B., Matolcsi, M.: Minimal Positive Realizations of Transfer Functions with Nonnegative Multiple Poles. IEEE Transactions on Automatic Control 50(9), 1447–1450 (2005)
13. Virnik, E.: Stability analysis of positive descriptor systems. Linear Algebra Appl. 429(10), 2640–2659 (2008)

Multi-Point Iterative Methods for Systems of Nonlinear Equations

Alicia Cordero, José L. Hueso, Eulalia Martínez and Juan R. Torregrosa

Abstract. A family of multi-point iterative methods for solving systems of nonlin- ear equations is described. Some classical methods are included in the mentioned family. Under certain conditions, convergence order is proved to be 2d +1, where d is the order of the partial derivatives required to be zero in the solution. Moreover, different numerical tests confirm the theoretical results and allow us to compare these variants with Newton’s method.

1 Introduction

Let us consider the problem of finding a real zero of a function $F : D \subseteq \mathbb{R}^n \to \mathbb{R}^n$, that is, a solution $\alpha \in D$ of the nonlinear system $F(x) = 0$. This solution can be obtained by means of a fixed point iteration method. The best known fixed point method is the classical Newton's method.

Many types of applications require finding a solution of a nonlinear system. For example, many physical systems can be described by a system of differential equations

Alicia Cordero, José L. Hueso, Juan R. Torregrosa: Instituto de Matemática Multidisciplinar, Universidad Politécnica de Valencia, 46022 Valencia, Spain

Eulalia Martínez: Instituto de Matemática Pura y Aplicada, Universidad Politécnica de Valencia, 46071 Valencia, Spain

R. Bru and S. Romero-Vivó (Eds.): Positive Systems, LNCIS 389, pp. 259–267. springerlink.com © Springer-Verlag Berlin Heidelberg 2009

\[
\left.
\begin{aligned}
\frac{dx_1}{dt} &= f_1(x_1, x_2, \dots, x_n)\\
\frac{dx_2}{dt} &= f_2(x_1, x_2, \dots, x_n)\\
&\;\;\vdots\\
\frac{dx_n}{dt} &= f_n(x_1, x_2, \dots, x_n)
\end{aligned}
\right\}
\tag{1}
\]

Equations of this type arise quite often in biological and physical applications, economic models, etc.

The existence of equilibrium values $\bar{x} = (\bar{x}_1, \bar{x}_2, \dots, \bar{x}_n)^T$, for which $x(t) \equiv \bar{x}$ are solutions of (1), is an important problem in the qualitative theory of differential equations. An equilibrium value $\bar{x}$ is actually a solution of the nonlinear system $F(x) = 0$, where $F$ is a vector function with $f_1, f_2, \dots, f_n$ as its coordinate functions.

Bifurcations can also be analyzed by solving the corresponding nonlinear system of equations for different values of a parameter as, for example, in the Lorenz equations:

\[
\left.
\begin{aligned}
\frac{dx}{dt} &= \sigma(-x + y)\\
\frac{dy}{dt} &= -xz + \mu x - y\\
\frac{dz}{dt} &= xy - \beta z
\end{aligned}
\right\}
\]

These equations have an equilibrium point at the origin for $\mu \le 1$, and two new equilibrium points appear for $\mu > 1$, at $x = y = \pm\sqrt{\beta(\mu - 1)}$, $z = \mu - 1$.

The construction of numerical methods for approximating the solution $\alpha$ of a nonlinear system is an interesting task in numerical mathematics and applied scientific branches. There is a collection of papers concerned with multi-point iterative methods; for example, for a nonlinear system $F(x) = 0$, the authors suggest in [1] extending the use of quadrature formulas to develop new variants of Newton's method. In [2] a family of modified Newton's methods is obtained whose general expression is

\[
x^{(k+1)} = x^{(k)} - \left[\sum_{h=1}^{m} A_h J_F\bigl(\eta_h(x^{(k)})\bigr)\right]^{-1} F(x^{(k)}),
\tag{2}
\]

with $\eta_h(x^{(k)}) = x^{(k)} - \tau_h J_F(x^{(k)})^{-1}F(x^{(k)})$, where $J_F(x)$ is the Jacobian matrix of $F$.

In this paper we analyze a collection of multi-point iterative methods obtained from Newton's method by replacing $F(x^{(k)})$ by a linear combination of values of $F(x)$ at different points. Specifically, the general method is

\[
x^{(k+1)} = x^{(k)} - J_F(x^{(k)})^{-1}\left[\sum_{h=1}^{m} A_h F\bigl(\eta_h(x^{(k)})\bigr)\right],
\tag{3}
\]

where $\tau_h$ and $A_h$ are parameters to be chosen in $[0,1]$ and $\mathbb{R}$, respectively, and $m$ is a positive integer. As we will see, the value of these parameters plays an important role in the order of convergence of the method.

We consider the definition of efficiency index (see [3]) as $p^{1/d}$, where $p$ is the order of convergence and $d$ is the total number of new function evaluations per step. Since (3) can be considered as an iterative fixed point formula, we study the convergence of the different methods by using the following result.

Theorem 1. ([4]) Let $G(x)$ be a fixed point function with continuous partial derivatives of order $p$ with respect to all components of $x$. The iterative method $x^{(k+1)} = G(x^{(k)})$ is of order $p$ if

\[
\begin{aligned}
&G(\alpha) = \alpha;\\
&\frac{\partial^{k} g_i(\alpha)}{\partial x_{j_1}\partial x_{j_2}\cdots\partial x_{j_k}} = 0 \quad\text{for all } 1 \le k \le p-1,\; 1 \le i, j_1, \dots, j_k \le n;\\
&\frac{\partial^{p} g_i(\alpha)}{\partial x_{j_1}\partial x_{j_2}\cdots\partial x_{j_p}} \neq 0 \quad\text{for at least one value of } i, j_1, \dots, j_p,
\end{aligned}
\]

where $g_i$ are the component functions of $G$.

The rest of the paper is structured as follows. In Section 2 we show some technical results that we need to prove the convergence of the methods. In Section 3, we analyze the general iterative formula (3) and we study the conditions that the parameters $\tau_h$ and $A_h$ must satisfy in order to obtain a method with a particular order of convergence. Finally, the last section is dedicated to the numerical results obtained by applying some of the described methods to several nonlinear systems.

2 Preliminary Results

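We consider $x \in \mathbb{R}^n$, $n > 1$, and denote by $J_{ij}(x)$ the $(i,j)$-entry of the Jacobian matrix of $F$, and by $H_{ij}(x)$ the respective entry of its inverse, so

\[
\sum_{j=1}^{n} H_{ij}(x)J_{jk}(x) = \delta_{ik}.
\tag{4}
\]

If $f_j(x)$, $j = 1, 2, \dots, n$, denote the coordinate functions of $F(x)$, it is easy to prove that:

\[
\sum_{i=1}^{n} \frac{\partial H_{ji}(x)}{\partial x_l}\frac{\partial f_i(x)}{\partial x_r} = -\sum_{i=1}^{n} H_{ji}(x)\frac{\partial^{2} f_i(x)}{\partial x_l\partial x_r},
\tag{5}
\]

\[
\begin{aligned}
\sum_{i=1}^{n} \frac{\partial^{2} H_{ji}(x)}{\partial x_s\partial x_l}\frac{\partial f_i(x)}{\partial x_r} ={}& -\sum_{i=1}^{n} \frac{\partial H_{ji}(x)}{\partial x_l}\frac{\partial^{2} f_i(x)}{\partial x_s\partial x_r} - \sum_{i=1}^{n} \frac{\partial H_{ji}(x)}{\partial x_s}\frac{\partial^{2} f_i(x)}{\partial x_r\partial x_l}\\
&- \sum_{i=1}^{n} H_{ji}(x)\frac{\partial^{3} f_i(x)}{\partial x_s\partial x_r\partial x_l}.
\end{aligned}
\tag{6}
\]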

The following results are useful in the proof of the main theorem.

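Lemma 1. Let $\lambda(x)$ be the iteration function of the classical Newton's method, whose coordinates are $\lambda_j(x) = x_j - \sum_{i=1}^{n} H_{ji}(x)f_i(x)$, $j = 1, \dots, n$. Then,

\[
\frac{\partial\lambda_j(\alpha)}{\partial x_l} = 0,
\tag{7}
\]

\[
\frac{\partial^{2}\lambda_j(\alpha)}{\partial x_r\partial x_l} = \sum_{i=1}^{n} H_{ji}(\alpha)\frac{\partial^{2} f_i(\alpha)}{\partial x_r\partial x_l},
\tag{8}
\]

\[
\begin{aligned}
\frac{\partial^{3}\lambda_j(\alpha)}{\partial x_s\partial x_r\partial x_l} ={}& \sum_{i=1}^{n}\left[\frac{\partial H_{ji}(\alpha)}{\partial x_r}\frac{\partial^{2} f_i(\alpha)}{\partial x_s\partial x_l} + \frac{\partial H_{ji}(\alpha)}{\partial x_s}\frac{\partial^{2} f_i(\alpha)}{\partial x_r\partial x_l} + \frac{\partial H_{ji}(\alpha)}{\partial x_l}\frac{\partial^{2} f_i(\alpha)}{\partial x_s\partial x_r}\right]\\
&+ 2\sum_{i=1}^{n} H_{ji}(\alpha)\frac{\partial^{3} f_i(\alpha)}{\partial x_s\partial x_r\partial x_l},
\end{aligned}
\tag{9}
\]

for $j, l, r, s \in \{1, 2, \dots, n\}$.

Let us note that the convergence order of Newton's method can be obtained by applying Theorem 1 and using expressions (7) and (8).

Lemma 2. Let $\eta_k(x)$ be the iteration functions $\eta_k(x) = x - \tau_k J_F^{-1}(x)F(x)$, where $\tau_k \in [0,1]$, for $k = 1, \dots, m$. Then,

\[
\left.\frac{\partial(\eta_k(x))_q}{\partial x_l}\right|_{x=\alpha} = (1 - \tau_k)\delta_{ql},
\tag{10}
\]

\[
\left.\frac{\partial^{2}(\eta_k(x))_q}{\partial x_r\partial x_l}\right|_{x=\alpha} = \tau_k\sum_{i=1}^{n} H_{qi}(\alpha)\frac{\partial^{2} f_i(\alpha)}{\partial x_r\partial x_l},
\tag{11}
\]

and

\[
\begin{aligned}
\left.\frac{\partial^{3}(\eta_k(x))_q}{\partial x_s\partial x_r\partial x_l}\right|_{x=\alpha} ={}& \tau_k\sum_{i=1}^{n}\left[\frac{\partial H_{qi}(\alpha)}{\partial x_r}\frac{\partial^{2} f_i(\alpha)}{\partial x_s\partial x_l} + \frac{\partial H_{qi}(\alpha)}{\partial x_s}\frac{\partial^{2} f_i(\alpha)}{\partial x_r\partial x_l} + \frac{\partial H_{qi}(\alpha)}{\partial x_l}\frac{\partial^{2} f_i(\alpha)}{\partial x_s\partial x_r}\right]\\
&+ 2\tau_k\sum_{i=1}^{n} H_{qi}(\alpha)\frac{\partial^{3} f_i(\alpha)}{\partial x_s\partial x_r\partial x_l},
\end{aligned}
\tag{12}
\]

for $q, l, r, s \in \{1, 2, \dots, n\}$.

Lemma 3. Let $\eta_k(x)$, $k = 1, \dots, n$, be the functions used in Lemma 2 and $f_i(x)$, $i = 1, \dots, n$, the coordinate functions of $F(x)$. Then,

\[
\left.\frac{\partial f_i(\eta_k(x))}{\partial x_l}\right|_{x=\alpha} = (1 - \tau_k)J_{il}(\alpha),
\tag{13}
\]

\[
\left.\frac{\partial^{2} f_i(\eta_k(x))}{\partial x_r\partial x_l}\right|_{x=\alpha} = (1 + \tau_k^{2} - \tau_k)\frac{\partial^{2} f_i(\alpha)}{\partial x_r\partial x_l},
\tag{14}
\]

and

\[
\begin{aligned}
\left.\frac{\partial^{3} f_i(\eta_k(x))}{\partial x_s\partial x_r\partial x_l}\right|_{x=\alpha} ={}& (1 - \tau_k^{3} + 3\tau_k^{2} - \tau_k)\frac{\partial^{3} f_i(\alpha)}{\partial x_s\partial x_r\partial x_l}\\
&- \tau_k^{2}\sum_{q=1}^{n} H_{qi}(\alpha)\left[\frac{\partial^{2} f_i(\alpha)}{\partial x_q\partial x_l}\frac{\partial^{2} f_i(\alpha)}{\partial x_s\partial x_r} - \frac{\partial^{2} f_i(\alpha)}{\partial x_r\partial x_q}\frac{\partial^{2} f_i(\alpha)}{\partial x_s\partial x_l} - \frac{\partial^{2} f_i(\alpha)}{\partial x_s\partial x_q}\frac{\partial^{2} f_i(\alpha)}{\partial x_r\partial x_l}\right],
\end{aligned}
\tag{15}
\]

for $i, l, r, s \in \{1, 2, \dots, n\}$.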

3 Description and Convergence Analysis of the Methods

Let F : D ⊆ Rn −→ Rn be a sufficiently differentiable function and α ∈ D a zero of the nonlinear system F(x)=0. Let G be the fixed point function that allows us to describe (3), $ % m −1 −1 G(x)=λ(x)+JF (x) F(x) − JF(x) ∑ AhF(ηh(x)) , h=1

−1 −1 where ηh(x)=x − τhJF (x) F(x), τh ∈ [0,1] and λ(x)=x − JF(x) F(x). The ith component of this function can be rewritten as $ % n n m ∑ Jij(x) g j(x) − λ j(x) − ∑ Hjp(x) fp(x) + ∑ Ah fi(ηh(x)) = 0. (16) j=1 p=1 h=1

By direct differentiation of (16), being i and l arbitrary and fixed, $ % n n ∂Jij(x) ∑ g j(x) − λ j(x) − ∑ Hjp(x) fp(x) + = ∂xl = $ j 1 p 1 % n n n ∂g j(x) ∂λj(x) ∂Hjp(x) ∂ fp(x) + ∑ Jij(x) − − ∑ fp(x) − ∑ Hjp(x) + = ∂xl ∂xl = ∂xl = ∂xl j 1  p 1 p 1 m n ∂ (η ( )) ∂(η ( )) + fi h x h x q = . ∑ Ah ∑ ∂(η ( )) ∂ 0 (17) h=1 q=1 h x q xl

When $x = \alpha$, by applying (4), (7), (10) and taking into account that $g_j(\alpha)=\alpha_j$, $\lambda_j(\alpha)=\alpha_j$ and $f_i(\alpha)=0$, we have
$$\sum_{j=1}^n J_{ij}(\alpha)\frac{\partial g_j(\alpha)}{\partial x_l} + J_{il}(\alpha)\left(-1 + \sum_{h=1}^m A_h(1-\tau_h)\right) = 0.$$

Therefore, we can conclude the following result.

Proposition 1. If parameters $A_h$ and $\tau_h$ satisfy $\sum_{h=1}^m A_h(1-\tau_h) = 1$ and matrix $J_F(x)$ is continuous and nonsingular at $x = \alpha$, then the iterative method (3) has at least order of convergence 2.

If we take $A_1$ and $\tau_1$ such that $A_1(1-\tau_1)=1$, we obtain methods whose order of convergence is at least 2. For example, by using $A_1 = 1$ and $\tau_1 = 0$, we have Newton’s method. Now, by direct differentiation of (17), with $r$ arbitrary and fixed, by substituting $x = \alpha$ and applying (4), (5), (7), (8) and (14) and taking into account that $\sum_{h=1}^m A_h(1-\tau_h)=1$, we obtain
$$\sum_{j=1}^n J_{ij}(\alpha)\frac{\partial^2 g_j(\alpha)}{\partial x_r\,\partial x_l} + \frac{\partial^2 f_i(\alpha)}{\partial x_r\,\partial x_l}\left(-1 + \sum_{h=1}^m A_h\tau_h^2\right) = 0.$$

So, we can affirm:

Proposition 2. If parameters Ah and τh satisfy

$$\sum_{h=1}^m A_h(1-\tau_h) = 1 \quad\text{and}\quad \sum_{h=1}^m A_h\tau_h^2 = 1, \tag{18}$$
and matrix $J_F(x)$ is continuous and nonsingular at $x = \alpha$, then the iterative method (3) has at least order of convergence 3.

In this case, for $m = 1$ we obtain the iterative method
$$x^{(k+1)} = x^{(k)} - \frac{3+\sqrt{5}}{2}\,J_F(x^{(k)})^{-1} F(\eta_1(x^{(k)})), \tag{19}$$
where $\eta_1(x^{(k)}) = x^{(k)} - \frac{\sqrt{5}-1}{2}\,J_F(x^{(k)})^{-1} F(x^{(k)})$. The order of this method is 3 and its efficiency index is $3^{1/(n^2+2n)}$. We note that this index is greater than the efficiency index of Newton’s method, whose value is $2^{1/(n^2+n)}$. For $m = 2$, parameters $A_j$ and $\tau_j$ must satisfy a system that has infinitely many solutions. One of them, $A_1 = A_2 = 1$, $\tau_1 = 0$ and $\tau_2 = 1$, gives us the method described by Traub in [4]. In an analogous way, by a new direct differentiation of (17), with $s$ arbitrary and fixed, by taking $x = \alpha$ and by applying (5), (6), (8), (9), (15) and conditions (18), we have
$$\frac{\partial^3 f_i(\alpha)}{\partial x_s\,\partial x_r\,\partial x_l}\left(1 - \sum_{h=1}^m A_h\tau_h^3\right) + \sum_{j=1}^n J_{ij}(\alpha)\frac{\partial^3 g_j(\alpha)}{\partial x_s\,\partial x_r\,\partial x_l} - \sum_{q=1}^n H_{qi}(\alpha)\frac{\partial^2 f_i(\alpha)}{\partial x_q\,\partial x_l}\frac{\partial^2 f_i(\alpha)}{\partial x_s\,\partial x_r} - \sum_{q=1}^n H_{qi}(\alpha)\frac{\partial^2 f_i(\alpha)}{\partial x_q\,\partial x_r}\frac{\partial^2 f_i(\alpha)}{\partial x_s\,\partial x_l} + \sum_{q=1}^n H_{qi}(\alpha)\frac{\partial^2 f_i(\alpha)}{\partial x_s\,\partial x_q}\frac{\partial^2 f_i(\alpha)}{\partial x_r\,\partial x_l} = 0.$$

Therefore, we can establish the following proposition:

Proposition 3. If parameters Ah and τh satisfy

$$\sum_{h=1}^m A_h(1-\tau_h) = 1, \quad \sum_{h=1}^m A_h\tau_h^2 = 1 \quad\text{and}\quad \sum_{h=1}^m A_h\tau_h^3 = 1, \tag{20}$$

and $\dfrac{\partial^2 f_i(\alpha)}{\partial x_a\,\partial x_b} = 0$, $\forall a,b,i \in \{1,2,\ldots,n\}$, then (3) has at least order 4.

For $m = 3$ one of the infinitely many solutions allows us to obtain the method
$$x^{(k+1)} = x^{(k)} - J_F(x^{(k)})^{-1}\left[4F(\eta_1(x^{(k)})) - 6F(\eta_2(x^{(k)})) + 4F(\eta_3(x^{(k)}))\right], \tag{21}$$
where $\tau_1 = 1/4$, $\tau_2 = 1/2$ and $\tau_3 = 3/4$. This method has order 4 and efficiency index $4^{1/(n^2+4n)}$. Again, with $u$ arbitrary and fixed, using the conditions of Proposition 3 and the results of the previous section, it can be proved that
$$\frac{\partial^4 f_i(\alpha)}{\partial x_u\,\partial x_s\,\partial x_r\,\partial x_l}\left(-1 + \sum_{h=1}^m A_h\tau_h^4\right) + \sum_{j=1}^n J_{ij}(\alpha)\frac{\partial^4 g_j(\alpha)}{\partial x_u\,\partial x_s\,\partial x_r\,\partial x_l} = 0.$$

Therefore, we can establish a similar result to the previous proposition.

Proposition 4. If parameters Ah and τh satisfy

$$\sum_{h=1}^m A_h(1-\tau_h) = 1, \quad \sum_{h=1}^m A_h\tau_h^2 = 1, \quad \sum_{h=1}^m A_h\tau_h^3 = 1 \quad\text{and}\quad \sum_{h=1}^m A_h\tau_h^4 = 1, \tag{22}$$

and $\dfrac{\partial^2 f_i(\alpha)}{\partial x_a\,\partial x_b} = 0$, $\forall a,b,i \in \{1,2,\ldots,n\}$, then the iterative method (3) has at least order 5.

In general, we can establish the following result:

Theorem 2. Let $F : D \subseteq \mathbb{R}^n \to \mathbb{R}^n$ be sufficiently differentiable at each point of an open neighborhood $D$ of $\alpha \in \mathbb{R}^n$, which is a solution of the system $F(x)=0$. Let us suppose that $J_F(x)$ is continuous and nonsingular at $\alpha$. Then the sequence $\{x^{(k)}\}_{k\ge 0}$ obtained using the iterative expression (3) converges to $\alpha$ with convergence order:

• $2d$, if $\sum_{h=1}^m A_h(1-\tau_h) = 1$, $\sum_{h=1}^m A_h\tau_h^p = 1$, $p = 2,3,\ldots,2d-1$, and $\dfrac{\partial^j f_i(\alpha)}{\partial x_{a_1}\partial x_{a_2}\cdots\partial x_{a_j}} = 0$, $1 \le i,a_1,\ldots,a_j \le n$, $j = 2,3,\ldots,d$.
• $2d+1$, if $\sum_{h=1}^m A_h(1-\tau_h) = 1$, $\sum_{h=1}^m A_h\tau_h^p = 1$, $p = 2,3,\ldots,2d$, and $\dfrac{\partial^j f_i(\alpha)}{\partial x_{a_1}\partial x_{a_2}\cdots\partial x_{a_j}} = 0$, $1 \le i,a_1,\ldots,a_j \le n$, $j = 2,3,\ldots,d$.
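The family analysed above can be tried out numerically. The following is only a minimal sketch under stated assumptions: it takes the iteration in the form $x - J_F(x)^{-1}\sum_h A_h F(\eta_h(x))$ described by $G$, uses test function (e) of Section 4 with a hand-derived Jacobian, and solves the 2×2 linear systems by Cramer's rule. The helper names (`solve2`, `conditions`, `step`, `iterate`) are illustrative and not part of the original paper.

```python
import math

def solve2(J, b):
    """Solve the 2x2 linear system J z = b by Cramer's rule."""
    det = J[0][0] * J[1][1] - J[0][1] * J[1][0]
    return [(b[0] * J[1][1] - J[0][1] * b[1]) / det,
            (J[0][0] * b[1] - J[1][0] * b[0]) / det]

def F(x):
    """Test function (e) of Section 4, with zero alpha = (0, 0)."""
    return [x[0] + math.exp(x[1]) - math.cos(x[1]),
            3.0 * x[0] - x[1] - math.sin(x[1])]

def JF(x):
    """Jacobian matrix of F, derived by hand for this sketch."""
    return [[1.0, math.exp(x[1]) + math.sin(x[1])],
            [3.0, -1.0 - math.cos(x[1])]]

def norm(v):
    return math.sqrt(sum(c * c for c in v))

def conditions(A, tau, pmax):
    """Return sum_h A_h (1 - tau_h) and [sum_h A_h tau_h^p for p = 2..pmax]."""
    first = sum(a * (1.0 - t) for a, t in zip(A, tau))
    powers = [sum(a * t ** p for a, t in zip(A, tau)) for p in range(2, pmax + 1)]
    return first, powers

def step(x, A, tau):
    """One iteration: x - J_F(x)^{-1} * sum_h A_h F(eta_h(x))."""
    J = JF(x)
    newton = solve2(J, F(x))          # J_F(x)^{-1} F(x)
    acc = [0.0, 0.0]
    for a, t in zip(A, tau):
        eta = [x[i] - t * newton[i] for i in range(2)]
        Fe = F(eta)
        acc = [acc[i] + a * Fe[i] for i in range(2)]
    corr = solve2(J, acc)
    return [x[i] - corr[i] for i in range(2)]

def iterate(x0, A, tau, tol=1e-12, kmax=50):
    """Iterate until ||x_new - x|| + ||F(x)|| < tol, or kmax steps."""
    x = list(x0)
    for k in range(1, kmax + 1):
        x_new = step(x, A, tau)
        crit = norm([x_new[i] - x[i] for i in range(2)]) + norm(F(x))
        x = x_new
        if crit < tol:
            return x, k
    return x, kmax

s5 = math.sqrt(5.0)
METHODS = {
    "CN": ([1.0], [0.0]),                                # Newton
    "NM1": ([(3.0 + s5) / 2.0], [(s5 - 1.0) / 2.0]),     # method (19)
    "TM": ([1.0, 1.0], [0.0, 1.0]),                      # Traub
    "NM2": ([4.0, -6.0, 4.0], [0.25, 0.5, 0.75]),        # method (21)
}

if __name__ == "__main__":
    for name, (A, tau) in METHODS.items():
        x, k = iterate([1.0, 1.0], A, tau)
        print(name, conditions(A, tau, 3), x, k)
```

The `conditions` helper makes it easy to see which of the sums in (18) and (20) a given choice of $A_h$, $\tau_h$ satisfies; iteration counts from this sketch need not match Table 1 exactly, since the norm and implementation details are assumptions.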

4 Numerical Examples

In this section we apply classical Newton’s method (CN), Traub’s method (TM) and the new methods described by (19) and (21) (denoted by NM1 and NM2, respectively) in order to estimate the zeros of the following nonlinear functions.

(a) $F(x_1,x_2) = (\sin(x_1) + x_2\cos(x_1),\; x_1 - x_2)$, $\alpha = (0,0)^T$.
(b) $F(x_1,x_2) = (\exp(x_1^2) - \exp(\sqrt{2}\,x_1),\; x_1 - x_2)$, $\alpha = (0,0)^T$.
(c) $F(x_1,x_2) = \left(-\frac{x_2^2}{2} + \exp(x_2) + x_1 - 2,\; x_2 - 2x_1 + 2\right)$, $\alpha = (1,0)^T$.
(d) $F(x) = (f_1(x), f_2(x), \ldots, f_n(x))$, where $x = (x_1,x_2,\ldots,x_n)^T$ and $f_i : \mathbb{R}^n \to \mathbb{R}$, $i = 1,2,\ldots,n$, such that $f_i(x) = x_i x_{i+1} - 1$, $i = 1,2,\ldots,n-1$ and $f_n(x) = x_n x_1 - 1$. The exact zeros of $F(x)$ are $\alpha_1 = (1,1,\ldots,1)$ and $\alpha_2 = (-1,-1,\ldots,-1)$ when $n$ is odd. Results appearing in Table 1 are obtained for $n = 999$ and all the methods converge to $\alpha_1$.
(e) $F(x_1,x_2) = (x_1 + \exp(x_2) - \cos(x_2),\; 3x_1 - x_2 - \sin(x_2))$, $\alpha = (0,0)^T$.
(f) $F(x_1,x_2,x_3) = \left(-\sin(x_1) + \cos(x_2),\; x_3^{x_1} - \frac{1}{x_2},\; \exp(x_1) - x_3^2\right)$, $\alpha = (0.9095695, 0.6612268, 1.575834)^T$.

These nonlinear functions have been chosen in order to have different points of view: the second partial derivatives of functions (a) to (c) are null at their respective solutions, so that the convergence order of the methods increases; the case of function (d) is that of a big-sized system; functions (e) and (f) have a singular, respectively badly conditioned, Jacobian matrix.

All computations were done using MATLAB. The stopping criterion used is $\|x^{(k+1)} - x^{(k)}\| + \|F(x^{(k)})\| < 10^{-12}$. For every method, we analyze the number of iterations needed to converge to the solution and the order of convergence estimated by
$$p \approx \frac{\ln\left(\|x^{(k+1)} - \alpha\| / \|x^{(k)} - \alpha\|\right)}{\ln\left(\|x^{(k)} - \alpha\| / \|x^{(k-1)} - \alpha\|\right)}. \tag{23}$$
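As an illustration of estimate (23), the sketch below computes $p$ from three consecutive error norms. The scalar equation $x^2 - 2 = 0$ solved by Newton's method is a hypothetical example chosen here for brevity; it is not one of the test functions of the paper.

```python
import math

def estimated_order(e_prev, e_curr, e_next):
    """Estimate the convergence order from three consecutive error norms, as in (23)."""
    return math.log(e_next / e_curr) / math.log(e_curr / e_prev)

# Newton's method on x^2 - 2 = 0 (illustrative scalar example, alpha = sqrt(2)).
alpha = math.sqrt(2.0)
x = 3.0
errors = [abs(x - alpha)]
for _ in range(4):
    x = x - (x * x - 2.0) / (2.0 * x)
    errors.append(abs(x - alpha))

# Use the last three errors, where the asymptotic regime has been reached.
p = estimated_order(errors[2], errors[3], errors[4])

if __name__ == "__main__":
    print("estimated order:", p)
```

For this quadratically convergent iteration the estimate should be close to 2; in practice the errors must stay well above machine precision for the ratio of logarithms to be meaningful.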

Table 1 Numerical results for nonlinear systems

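F(x) | x(0) | Solution | Iterations: CN | NM1 | TM | NM2 | p: CN | NM1 | TM | NM2
(a) | (0.4,0.4)^T | α | 5 | 4 | 4 | 4 | 3.0 | 3.0 | 5.0 | 5.0
(a) | (1,−2)^T | α | 6 | 5 | 4 | 4 | 3.0 | 3.0 | 4.9 | 4.8
(b) | (−0.5,0.5)^T | α | 5 | 4 | 4 | 4 | 3.0 | 3.0 | 5.1 | 4.8
(b) | (−1,−0.5)^T | α | 6 | 5 | 5 | 5 | 3.1 | - | 5.1 | 4.7
(c) | (−1,−2)^T | α | 5 | 5 | 4 | 4 | 2.8 | 3.0 | 3.6 | 3.7
(c) | (2.5,1.5)^T | α | 5 | 5 | 4 | 4 | 3.0 | 3.0 | 4.4 | 3.9
(d) | (2,...,2)^T | α1 | 6 | 5 | 5 | 5 | 2.0 | 3.0 | 2.9 | 3.0
(d) | (3,−3,...,3)^T | α1 | 7 | 5 | 5 | 5 | 2.2 | 3.1 | 3.2 | 2.9
(e) | (2,−2)^T | α | n.c. | 43 | n.c. | 75 | - | 3.1 | - | 2.9
(e) | (1,1)^T | α | 6 | 5 | 5 | 5 | 2.0 | 2.9 | 2.9 | 2.9
(f) | (1,1,1)^T | α | 8 | 8 | n.c. | n.c. | - | - | - | -
(f) | (0.5,0.1,0.7)^T | α | 10 | 7 | 7 | 7 | 2.0 | - | - | -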

In Table 1 we can observe several results obtained using the previously described methods in order to estimate the zeros of functions (a) to (f). For every function, the following items are specified: the initial estimation $x^{(0)}$ and, for each method, the approximate solution found, the number of iterations needed (n.c. denotes that the method does not converge) and the estimated computational order of convergence $p$. The value of $p$ that appears in Table 1 is the last coordinate of vector $p$ when the variation between its coordinates is small. When this does not happen, the value of $p$ is said to be not conclusive, and is denoted by "-" in the table.

Acknowledgements. Supported by Ministerio de Ciencia y Tecnología MTM2007-64477.

References

1. Cordero, A., Torregrosa, J.R.: Variants of Newton’s method for functions of several variables. Applied Mathematics and Computation 183, 199–208 (2006)
2. Frontini, M., Sormani, E.: Third-order methods from quadrature formulae for solving systems of nonlinear equations. Applied Mathematics and Computation 149, 771–782 (2004)
3. Ostrowski, A.M.: Solutions of equations and systems of equations. Academic Press, New York (1966)
4. Traub, J.F.: Iterative methods for the solution of equations. Chelsea Publishing Company, New York (1982)

Identifiability of Nonaccessible Nonlinear Systems

Leontina D’Angiò, Maria Pia Saccomani, Stefania Audoly and Giuseppina Bellu

Abstract. Identifiability is a fundamental prerequisite for model identification. Differential algebra tools have been applied to study identifiability of dynamic systems described by nonlinear polynomial equations. In a previous paper a differential algebra method for testing identifiability for locally and globally non accessible systems has been proposed. In this paper we describe a strategy to simplify the above differential algebra method to test identifiability of systems which are non accessible from everywhere. In particular we make the method more efficient and thus of more general applicability. A strategy for testing identifiability also of nonlinear models described by non polynomial equations is proposed.

1 Introduction

Global identifiability concerns uniqueness of the model parameters determined from input-output data, under ideal conditions of noise-free observations and error-free model structure. Different methods have been proposed to check global identifiability of nonlinear systems [2, 3, 7–9, 11, 13, 14]. In particular the differential algebra approach has been recently generalized [11] to deal with both locally and globally non accessible systems. Unfortunately in this last case, i.e. systems nonaccessible from everywhere, the proposed method requires the calculation of

Leontina D’Angiò and Giuseppina Bellu
Department of Mathematics, University of Cagliari, 09100 Cagliari, Italy, e-mail: [email protected],[email protected]
Maria Pia Saccomani
Department of Information Engineering, University of Padova, 35131 Padova, Italy, e-mail: [email protected]
Stefania Audoly
Department of Structural Engineering, University of Cagliari, 09100 Cagliari, Italy, e-mail: [email protected]

R. Bru and S. Romero-Vivó (Eds.): Positive Systems, LNCIS 389, pp. 269–277. springerlink.com © Springer-Verlag Berlin Heidelberg 2009

a closed form expression for the integral of a suitable differential form, which of course is a difficult task. To the best of our knowledge, a general procedure to do this integration is not known. The principal goal of this paper is to show that, to check global identifiability of globally nonaccessible systems, the above integration is not necessary. In this paper we also provide the theoretical basis for correctly testing global identifiability of systems involving non polynomial, for example exponential or logarithmic, functions. As an elementary example consider a (non-polynomial) system like

$$\dot{x}_1 = a\exp(-x_2) + u$$

$$\dot{x}_2 = -b x_1$$
where $a$ and $b$ are the unknown parameters, $u$ the input, $x_1$ and $x_2$ the state variables. This system can be rendered polynomial by introducing a new state $x_3 = a\exp(-x_2)$; differentiating it provides the additional equation $\dot{x}_3 = -\dot{x}_2 x_3$. This differential equation turns the model into a third order system of the following form

$$\dot{x}_1 = x_3 + u$$

$$\dot{x}_2 = -b x_1$$

$$\dot{x}_3 = b x_1 x_3$$
which is indeed polynomial (and time-invariant). Furthermore, in many biological and physiological applications, the differential equations describing the phenomena very often contain time-varying coefficients with a known functional form but depending on some unknown parameters. Consider for example a system like

$$\dot{x}_1 = a\exp(-bt)\,x_1 + u$$

This system can also be rendered polynomial by introducing a new state $x_2 = \exp(-bt)$ and an additional equation $\dot{x}_2 = -b x_2$, which will turn it into the following second order polynomial (and time-invariant) system

$$\dot{x}_1 = a x_1 x_2 + u$$

$$\dot{x}_2 = -b x_2$$

With this technique one can handle many classical situations where time-varying coefficients of known functional form appear, and in fact even non algebraic nonlinearities. This gives rise to an augmented model which is trivially globally nonaccessible, since the evolution of the system obtained by adding the new state variable is constrained to take place in some invariant submanifold.
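The state-augmentation trick can be checked numerically. The sketch below is a minimal illustration under assumptions that are not in the paper: the parameter values, the input $u(t) = \sin t$, the initial state and the fixed-step Runge-Kutta integrator are all hypothetical choices. It integrates the time-varying system $\dot{x}_1 = a\exp(-bt)x_1 + u$ and its polynomial augmentation, started with $x_2(0) = 1$, and compares the results.

```python
import math

A_PAR, B_PAR = 0.5, 1.2        # hypothetical values of the unknown parameters a, b

def u(t):
    return math.sin(t)         # hypothetical known input

def original(t, x):
    """Time-varying system: x1' = a exp(-b t) x1 + u."""
    return [A_PAR * math.exp(-B_PAR * t) * x[0] + u(t)]

def augmented(t, x):
    """Polynomial, time-invariant system with the new state x2 = exp(-b t)."""
    return [A_PAR * x[0] * x[1] + u(t), -B_PAR * x[1]]

def rk4_step(f, t, x, h):
    """One classical fourth-order Runge-Kutta step."""
    k1 = f(t, x)
    k2 = f(t + h / 2, [xi + h / 2 * ki for xi, ki in zip(x, k1)])
    k3 = f(t + h / 2, [xi + h / 2 * ki for xi, ki in zip(x, k2)])
    k4 = f(t + h, [xi + h * ki for xi, ki in zip(x, k3)])
    return [xi + h / 6 * (a + 2 * b + 2 * c + d)
            for xi, a, b, c, d in zip(x, k1, k2, k3, k4)]

def simulate(f, x0, t_end=2.0, steps=2000):
    h = t_end / steps
    x = list(x0)
    for i in range(steps):
        x = rk4_step(f, i * h, x, h)
    return x

x_orig = simulate(original, [1.0])
x_aug = simulate(augmented, [1.0, 1.0])    # x2(0) = exp(0) = 1

if __name__ == "__main__":
    print("x1 (original): ", x_orig[0])
    print("x1 (augmented):", x_aug[0])
    print("x2 vs exp(-b t):", x_aug[1], math.exp(-B_PAR * 2.0))
```

The new state is tied to a fixed initial value, so the augmented trajectory stays on the curve $x_2 = \exp(-bt)$; this is the invariant submanifold mentioned above.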

2 Background

2.1 A priori Identifiability

This section provides the reader with the definitions which are necessary to set the notations used in the paper. For a formal treatment of differential algebra, see [10]. Consider a nonlinear dynamic system described in state space form
$$\begin{cases} \dot{x}(t) = f(x(t),p) + \sum_{i=1}^m g_i(x(t),p)\,u_i(t) \\ y(t) = h(u(t),x(t),p) \end{cases} \tag{1}$$

where the state variable $x$ evolves in an open set $X$ of the $n$-dimensional space $\mathbb{R}^n$; $u$ is the $m$-dimensional input ranging on some vector space of piecewise smooth (infinitely differentiable) functions and $y$ is the $r$-dimensional output. The constant unknown $p$-dimensional parameter vector $p$ belongs to some open subset $P$ of the $p$-dimensional Euclidean space $\mathbb{R}^p$. Whenever initial conditions are specified, the relevant equation $x(0)=x_0$ is added to the system. The essential assumptions here are that there is no feedback, that the system is affine in $u$, and that $f, g_1,\ldots,g_m$ and $h$ are vectors of rational functions in $x$. The dependence on $p$ may be rational. We adopt the identifiability definitions used in [11] and recall here only that of global identifiability. Let $y = \psi_{x_0}(p,u)$ be the input-output map of the system (1) started at the initial state $x_0$ (we assume that this map exists).

Definition 1. The system (1) is a priori globally (or uniquely) identifiable from input-output data if, for at least a generic set of points $p^* \in P$, there exists (at least) one input function $u$ such that the equation
$$\psi_{x_0}(p,u) = \psi_{x_0}(p^*,u) \tag{2}$$
has only one solution $p = p^*$ for almost all initial states $x_0 \in X \subseteq \mathbb{R}^n$.

2.2 Accessibility

Here we recall a concept of geometric nonlinear control theory [4, 5, 12] called accessibility. In particular, accessibility can be viewed as a weak counterpart of the concept of reachability (from an arbitrary initial state).

Definition 2. The system (1) is accessible from $x_0$ if the set of states reachable from $x_0$ (at any finite time) has a nonempty interior, i.e. contains an open ball in $\mathbb{R}^n$.

To study accessibility one looks at the Control Lie Algebra, i.e. the smallest Lie algebra $\mathcal{C}$ containing the vector fields $f, g_1,\ldots,g_m$ of (1) and invariant under Lie bracketing with $f, g_1,\ldots,g_m$. To the Lie algebra $\mathcal{C}$ we associate the distribution $\Delta_{\mathcal{C}}$ mapping each $x \in \mathbb{R}^n$ into the vector space

$$\Delta_{\mathcal{C}}(x) = \mathrm{span}\{\tau(x) : \tau \in \mathcal{C}\}$$
We recall from the literature [4, 5, 12] the so called accessibility rank condition.

Theorem 1. For analytic, in particular polynomial, systems, a necessary and sufficient condition for accessibility from $x_0$ is that $\dim\Delta_{\mathcal{C}}(x_0) = n$.

3 Identifiability of Globally Nonaccessible Systems

Our goal here is to investigate the differential algebra algorithm, to see whether it correctly tests identifiability of globally nonaccessible systems. The starting point will be the results in [11] holding for locally nonaccessible systems. We know that the characteristic set of the ideal generated by the polynomials defining the dynamical system is independent of the system initial conditions; in fact the pseudodivision algorithm for calculating the characteristic set does not take into account the initial conditions. If the system is accessible, whatever the initial point, the whole space where the solutions evolve is correctly described by the ideal generated by the differential polynomials describing the system [11]. In case of systems nonaccessible from initial points belonging to a thin set, the above ideal no longer describes the space where the system solution evolves. In this case the ideal to be considered in the identifiability test should also include the invariant submanifold where the solution of the system, starting from that particular point, evolves. This submanifold can be obtained by the Frobenius Theorem, calculating the Control Lie algebra as indicated at the end of the previous section. Since our system is polynomial (and hence analytic), there exists a unique maximal submanifold $M_{x_0}$ of $\mathbb{R}^n$ through $x_0$ which carries all the trajectories of the control system (1) started at $x_0$. In particular, if the dimension of $\Delta_{\mathcal{C}}(x_0)$ is $n$ then $\dim(M_{x_0}) = n$ [4]. Thus the ideal which correctly describes the whole space where the solutions evolve is the one generated by the polynomials defining the dynamical system plus the polynomials defining the invariant submanifold. The case of a globally nonaccessible system is different. In this case many invariant submanifolds exist, one for each initial point. We would like to stress that, in this case, the initial state does not belong to a thin set, but rather to a generic set of points in the state space.
This implies that, when the polynomials defining the invariant submanifold are added to the original polynomials, no additional conditions are added to the original system. We have shown this in many examples. We have also observed that the polynomials defining the invariant submanifold could be calculated by integrating some suitable combination of the polynomials describing the dynamic system. By following this line of reasoning, we can conclude that, in case of a globally nonaccessible model, the system obtained by adding the polynomials defining the invariant submanifold to the original ones is equivalent to the original system itself where an integration in closed form has been performed.

Note that this integration in closed form can be performed in both the accessible and nonaccessible models, providing the equation of the invariant submanifold. In fact, by adding this equation to those defining the dynamical system and by suitably reducing them, the system order always decreases, i.e. the maximum order derivative variable disappears (the only difference in case of an accessible model being that the invariant submanifold depends on the system input). Thus, we conclude that to test identifiability of a nonlinear dynamical model, one has first to test the accessibility of the model from its initial conditions. If the model is globally non accessible, the identifiability test on the original system gives the correct result, independently of the knowledge of the invariant submanifold where the solution of the system evolves.

3.1 Two Examples

To give evidence to the method discussed in the previous section, we present two examples of globally nonaccessible systems.

Example 1. Consider the following model
$$\begin{cases} \dot{x}_1 = p_1 u x_3 \\ \dot{x}_2 = p_2 x_1 \\ \dot{x}_3 = p_3 x_1 x_2 \\ y = p_4 x_2 \end{cases} \tag{3}$$

It is easy to see that by combining the second and the third differential equations we obtain $\dot{x}_3/\dot{x}_2 - (p_3/p_2)x_2 = 0$. By integrating it, one can see that, no matter which input is chosen, the system evolves in the algebraic submanifold of $\mathbb{R}^3$ described by
$$\phi(x) = -p_3 x_2^2/2 + p_2 x_3 + p_3 x_{20}^2/2 - p_2 x_{30} = 0 \tag{4}$$
which clearly shows that we have a system which is not accessible from any point in $\mathbb{R}^3$. To check this formally, one computes the matrix made with the vector fields $f, g$ and the Lie brackets $[f,g]$, $[f,[f,g]]$, $[f,[f,[f,g]]]$, .... Since all Lie brackets in the above sequence are zero after the fifth one, the distribution is involutive and all covectors $\mu$ orthogonal to the distribution $\Delta_{\mathcal{C}}$ have the form:

$$\mu = \alpha(x)\,[0, -p_3 x_2, p_2]^T \tag{5}$$
where $\alpha(x)$ is an arbitrary non-zero smooth function.

Therefore $\dim\Delta_{\mathcal{C}} < 3$ for all $x \in \mathbb{R}^3$. By applying the Frobenius Theorem in a relatively open neighborhood of any $x_0$ for which $\dim\Delta_{\mathcal{C}}(x_0) = 2$, it must follow that for some suitable $\alpha$, the covector $\mu$ generates a closed differential form. In fact, it is easy to check that $\phi$ has differential

$$d\phi(x) = -p_3 x_2\,dx_2 + p_2\,dx_3 \tag{6}$$
which, by integration, provides just eq. (4).
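The invariance of the submanifold (4) can be illustrated numerically. The following sketch integrates system (3) for one arbitrary input and evaluates $\phi$ along the trajectory. The parameter values, the initial state, the input and the Runge-Kutta integrator are hypothetical choices made for illustration only.

```python
import math

P1, P2, P3 = 1.0, 0.8, 0.6      # hypothetical parameter values p1, p2, p3
X0 = [1.0, 0.5, 2.0]            # hypothetical initial state [x10, x20, x30]

def u(t):
    return 1.0 + math.sin(3.0 * t)   # arbitrary input chosen for this sketch

def f(t, x):
    """Right-hand side of system (3)."""
    return [P1 * u(t) * x[2], P2 * x[0], P3 * x[0] * x[1]]

def phi(x):
    """Left-hand side of eq. (4); it should vanish along every trajectory."""
    return -P3 * x[1] ** 2 / 2 + P2 * x[2] + P3 * X0[1] ** 2 / 2 - P2 * X0[2]

def rk4_step(t, x, h):
    """One classical fourth-order Runge-Kutta step."""
    k1 = f(t, x)
    k2 = f(t + h / 2, [xi + h / 2 * ki for xi, ki in zip(x, k1)])
    k3 = f(t + h / 2, [xi + h / 2 * ki for xi, ki in zip(x, k2)])
    k4 = f(t + h, [xi + h * ki for xi, ki in zip(x, k3)])
    return [xi + h / 6 * (a + 2 * b + 2 * c + d)
            for xi, a, b, c, d in zip(x, k1, k2, k3, k4)]

def max_phi_along_trajectory(t_end=1.0, steps=1000):
    h = t_end / steps
    x = list(X0)
    worst = abs(phi(x))
    for i in range(steps):
        x = rk4_step(i * h, x, h)
        worst = max(worst, abs(phi(x)))
    return worst, x

worst, x_final = max_phi_along_trajectory()

if __name__ == "__main__":
    print("max |phi| along the trajectory:", worst)
    print("final state:", x_final)
```

Up to integration error, $\phi$ stays at zero whatever input is used, which is consistent with the trajectory being confined to a two-dimensional slice of $\mathbb{R}^3$.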

To test the identifiability of system (3), consider first the dynamic system regardless of accessibility. In this case an easy computation provides the following characteristic set:
$$\begin{aligned} A_1 &\equiv -\dot{u}\,\ddot{y}\,p_4 + \dddot{y}\,u\,p_4 - \dot{y}\,p_1 p_3 u^2 y \\ A_2 &\equiv \dot{y} - p_2 p_4 x_1 \\ A_3 &\equiv y - p_4 x_2 \\ A_4 &\equiv \ddot{y} - p_1 p_2 p_4 u x_3 \end{aligned} \tag{7}$$
hence the exhaustive summary is $p_1 p_3/p_4$ and the system (3) is nonidentifiable. Now assume that the system is started from a generic initial condition $x_0 = [x_{10}, x_{20}, x_{30}]^T$ with all the components different from zero so that $\dim\Delta_{\mathcal{C}}(x_0) = 2$. In this case the solution of system (3) evolves in the slice described by $\phi(x) = 0$. Note that $\dim\Delta_{\mathcal{C}} = 2$ for all $x \in \{\mathbb{R}^3 - T\}$ where $T$ is the “thin” set of equilibrium points where $\dim\Delta_{\mathcal{C}} = 0$:

T := {x : x1 = 0, x2 = x20, x3 = 0} (8)

By following the same line of reasoning applied to the generically accessible systems, we add the equation $\phi(x) = 0$ to the system equations (3) and the characteristic set turns out to be
$$\begin{aligned} \hat{A}_1 &\equiv 2\ddot{y}\,p_4 + p_1\left(-2 p_2 p_4^2 u x_{30} + p_3\left(p_4^2 u x_{20}^2 - u y^2\right)\right) \\ \hat{A}_2 &\equiv \dot{y} - p_2 p_4 x_1 \\ \hat{A}_3 &\equiv y - p_4 x_2 \\ \hat{A}_4 &\equiv p_2 p_4^2 (2x_3 - 2x_{30}) + p_3\left(p_4^2 x_{20}^2 - y^2\right) \end{aligned} \tag{9}$$
By comparing this with the previous characteristic set (7), one can see that the system order is decreased. However, the identifiability test based on this characteristic set gives exactly the same identifiability results, i.e. $(p_1 p_3/p_4)$, obtained regardless of initial conditions. Thus it is not necessary to calculate a closed form expression for the integral of the differential form (6).

Example 2. It may well happen that the invariant submanifold is not algebraic. In this case one cannot enlarge the ideal in the ring of differential polynomials. Consider the system
$$\begin{cases} \dot{x}_1 = p_1 u x_3 + x_2 \\ \dot{x}_2 = p_2 x_1 \\ \dot{x}_3 = p_3 x_1 x_3 \\ y = x_3 \end{cases} \tag{10}$$

By integrating $\dot{x}_3/\dot{x}_2 - (p_3/p_2)x_3 = 0$, one can see that, no matter which input is chosen, the system evolves in the submanifold described by

$$\phi(x) = p_2(\log(x_3) - \log(x_{30})) - p_3(x_2 - x_{20}) = 0 \tag{11}$$

where $x_0 = [x_{10}, x_{20}, x_{30}]^T$. Hence the system is not accessible from any point in $\mathbb{R}^3$. Given the form of eq. (11), there is no algebraic polynomial vanishing at the solutions of the system started at any $x_0$, i.e. $M_{x_0}$ is not algebraic, see section 2.2. Formally, by computing the matrix with the vector fields $f, g$ and the Lie brackets $[f,g]$, $[f,[f,g]]$, $[f,[f,[f,g]]]$, ..., it can be shown that all covectors $\mu$ orthogonal to the distribution $\Delta_{\mathcal{C}}$ must be of the form:

$$\mu = \alpha(x)\,[0, -p_3 x_3, p_2]^T \tag{12}$$
where $\alpha(x)$ is an arbitrary non-zero smooth function. Therefore $\dim\Delta_{\mathcal{C}} < 3$ for all $x \in \mathbb{R}^3$. By applying the Frobenius Theorem it follows that the covector $\mu$ generates the following closed differential form:

dφ(x)=−p3x3dx2 + p2dx3 (13) which, by integration provides just eq. (11). First, identifiability of system (10) regardless of initial conditions is tested. The characteristic set is the following

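$$\begin{aligned} A_1 &\equiv \dot{u}\,y^4 p_1 p_3 - \dddot{y}\,y^2 + 3\ddot{y}\,y\,\dot{y} - 2\dot{y}^3 + \dot{y}\,u\,y^3 p_1 p_3 + \dot{y}\,y^2 p_2 \\ A_2 &\equiv \dot{y} - p_3 x_1 y \\ A_3 &\equiv -y\ddot{y} + \dot{y}^2 + u y^3 p_1 p_3 + x_2 y^2 p_3 \\ A_4 &\equiv -x_3 + y \end{aligned} \tag{14}$$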

hence the exhaustive summary is $(p_2, p_1 p_3)$, showing that $p_2$ has one solution while $p_1$ and $p_3$ have an infinite number of solutions. Note that if the initial condition is known, the system becomes globally identifiable. Now assume that the system is started from an unknown initial condition $x_0 = [x_{10}, x_{20}, x_{30}]^T$ with all components different from zero so that $\dim\Delta_{\mathcal{C}}(x_0) = 2$. The solution of system (10) evolves in the slice described by the nonalgebraic equation $\phi(x) = 0$. This obviously cannot be added to the system equations (10). However, by suitably manipulating the polynomials defining the characteristic set (14) together with the transcendental function defining the invariant submanifold (11), we obtain the following new set of functions

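$$\begin{aligned} \hat{A}_1 &\equiv -y\ddot{y} + \dot{y}^2 + (\log(y) - \log(x_{30}))\,p_2 y^2 + p_1 p_3 u y^3 + p_3 x_{20} y^2 \\ \hat{A}_2 &\equiv \dot{y} - p_3 x_1 y \\ \hat{A}_3 &\equiv (\log(y) - \log(x_{30}))\,p_2 - (x_2 - x_{20})\,p_3 \\ \hat{A}_4 &\equiv -x_3 + y \end{aligned} \tag{15}$$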

It is easy to see that $\hat{A}_1$, which now also depends on the initial condition, provides exactly the same identifiability result as that obtained regardless of initial conditions, i.e. the system (10) is nonidentifiable. Again, if the initial condition is known, the system becomes globally identifiable. Note that this example also shows that it is not necessary to calculate a closed form expression for the integral of the differential form (13).

4 A Biological Model

Consider the following biological model. It is a twelve-compartment model recently proposed in [6] to describe the nitrogen metabolism in humans.
$$\begin{cases} \dot{g} = -k_{21}\,g + u \\ \dot{il} = k_{21}\,g - (k_{32} + k_{42})\,il + k_{24}\,sa \\ \dot{e} = k_{32}\,il \\ \dot{sa} = k_{42}\,il + k_{45}\,scp + k_{47}\,pl + k_{410}\,bu - (k_{24} + k_{54} + k_{64} + k_{74} + k_{104} + k_{124})\,sa \\ \dot{scp} = k_{54}\,sa - k_{45}\,scp \\ \dot{sep} = k_{64}\,sa \\ \dot{pl} = k_{74}\,sa + k_{78}\,pa - (k_{47} + k_{87})\,pl \\ \dot{pa} = k_{87}\,pl + k_{89}\,pp - (k_{78} + k_{98} + k_{128})\,pa \\ \dot{pp} = k_{98}\,pa - k_{89}\,pp \\ \dot{bu} = k_{104}\,sa - (k_{1110} + k_{410})\,bu \\ \dot{uu} = k_{1110}\,bu \\ \dot{ua} = k_{124}\,sa + k_{128}\,pa \end{cases} \tag{16}$$
The measurement equations are:

y1 = e, y2 = sep, y3 = pl, y4 = bu, y5 = uu, y6 = ua (17)

The unknown parameters are: p =[k21,k32,k24,k42,k45,k47,k54,k64,k74,k104,k124,k78,k87,k89,k98,k128,k410,k1110]

To study the accessibility of this model, one should apply the Frobenius theorem and calculate dimΔC (x0). Given the high dimension of the model this calculation is very complex. However, it is easy to see that the model is globally nonaccessible simply by manipulating some of its equations. For example one can calculate:

$$bu = \dot{uu}/k_{1110}, \qquad sa = \dot{sep}/k_{64} \tag{18}$$
which, substituted in the 10th equation of system (16), provide:

$$k_{1110} k_{64}\,\dot{bu} + k_{64}\,\dot{uu}\,(k_{1110} + k_{410}) - k_{1110} k_{104}\,\dot{sep} = 0 \tag{19}$$

This equation can be finitely integrated

$$\hat{\phi} = k_{64} k_{1110}\,bu + k_{64}\,uu\,(k_{1110} + k_{410}) - k_{1110} k_{104}\,sep + \text{constant} \tag{20}$$

The solution of the model thus evolves in this submanifold, showing that the model is globally nonaccessible ($\hat{\phi}$ is not necessarily the equation of the whole invariant submanifold $\phi$, which should be calculated with the Frobenius theorem). As discussed in the previous section, one can correctly perform the identifiability test of this model by calculating the ideal generated only by the polynomials defining the system. The model is globally identifiable [6].
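The linear combination in (19) can be checked directly from three equations of system (16). The sketch below is a minimal illustration: it draws hypothetical positive rate constants and state values at random, evaluates the derivatives of bu, uu and sep from the model equations, and confirms that the combination vanishes up to rounding. The function name and the sampling ranges are assumptions.

```python
import random

def residual_19(k, sa, bu):
    """Evaluate the left-hand side of (19) using the equations of system (16)."""
    bu_dot = k["k104"] * sa - (k["k1110"] + k["k410"]) * bu   # 10th equation
    uu_dot = k["k1110"] * bu                                   # 11th equation
    sep_dot = k["k64"] * sa                                    # 6th equation
    return (k["k1110"] * k["k64"] * bu_dot
            + k["k64"] * uu_dot * (k["k1110"] + k["k410"])
            - k["k1110"] * k["k104"] * sep_dot)

random.seed(0)
residuals = []
for _ in range(100):
    # Hypothetical rate constants and compartment contents, for illustration only.
    k = {name: random.uniform(0.1, 2.0) for name in ("k104", "k1110", "k410", "k64")}
    sa = random.uniform(0.0, 5.0)
    bu = random.uniform(0.0, 5.0)
    residuals.append(abs(residual_19(k, sa, bu)))

max_residual = max(residuals)

if __name__ == "__main__":
    print("largest |residual| over random samples:", max_residual)
```

Since the combination is identically zero for any values of the states, its time integral (20) is constant along every trajectory, whatever the input.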

5 Conclusions

In this paper we show that the differential algebra method based on the characteristic set of the ideal generated only by the polynomials defining the system can be successfully used in testing the identifiability of globally nonaccessible systems. This result is particularly relevant since it allows one to correctly test the identifiability of many models described by nonpolynomial equations. A software tool, DAISY [1], to test identifiability of nonlinear systems under the conditions described in this paper is available at http://www.dei.unipd.it/~pia/.

References

1. Bellu, G., Saccomani, M.P., Audoly, S., D’Angiò, L.: DAISY: a new software tool to test global identifiability of biological and physiological systems. Computer Methods and Programs in Biomedicine 88, 52–61 (2007)
2. Chapman, M.J., Godfrey, K.R., Chappell, M.J., Evans, N.D.: Structural identifiability of non-linear systems using linear/non-linear splitting. Int. J. Control 76(3), 209–216 (2003)
3. Chappell, M.J., Godfrey, K.R.: Structural identifiability of the parameters of a nonlinear batch reactor model. Math. Biosci. 108, 245–251 (1992)
4. Hermann, R., Krener, A.J.: Nonlinear controllability and observability. IEEE Trans. Automatic Control AC-22(5), 728–740 (1977)
5. Isidori, A.: Nonlinear control systems, 3rd edn. Springer, London (1995)
6. Juillet, B., Saccomani, M.P., Bos, C., Gaudichon, C., Tomé, D., Fouillet, E.: Conceptual, methodological and computational issues in compartmental modeling of a complex biological system: the postprandial inter-organ metabolism of dietary nitrogen in humans. Math. Biosci. 204, 282–309 (2006)
7. Ljung, L., Glad, S.T.: On global identifiability for arbitrary model parameterizations. Automatica 30(2), 265–276 (1994)
8. Margaria, G., Riccomagno, E., Chappell, M.J., Wynn, H.P.: Differential algebra methods for the study of the structural identifiability of rational function state-space models in the biosciences. Math. Biosci. 174, 1–26 (2001)
9. Ollivier, F.: Le problème de l’identifiabilité structurelle globale: étude théorique, méthodes effectives et bornes de complexité. Thèse de Doctorat en Science, École Polytechnique, Paris, France (1990)
10. Ritt, J.F.: Differential Algebra. American Mathematical Society, Providence, RI (1950)
11. Saccomani, M.P., Audoly, S., D’Angiò, L.: Parameter identifiability of nonlinear systems: the role of initial conditions. Automatica 39, 619–632 (2003)
12. Sontag, E.D.: Mathematical control theory, 2nd edn. Springer, Berlin (1998)
13. Denis-Vidal, L., Joly-Blanchard, G.: Equivalence and identifiability analysis of uncontrolled nonlinear dynamical systems. Automatica 40, 287–292 (2004)
14. Walter, E., Lecourtier, Y.: Global approaches to identifiability testing for linear and nonlinear state space models. Math. Biosci. 24, 472–482 (1982)

Trajectory Tracking Control of a Timed Event Graph with Specifications Defined by a P-time Event Graph

Philippe Declerck and Abdelhak Guezzi

Abstract. The aim of this paper is a trajectory tracking control of Timed Event Graphs with specifications defined by a P-time Event Graph. Two problems are solved on a fixed horizon knowing the current state: the optimal control for favorable past evolution; the prediction of the earliest future evolution of the process. These two parts make up an on-line control which is used on a sliding horizon. Completely defined in (max, +) algebra, the proposed approach is a Model Predictive Control using the componentwise order relation.

1 Introduction

In this paper, we focus on the trajectory tracking control of Timed Event Graphs with a reference model defined by a P-time Event Graph. The P-time Event Graph describes the desired behavior of the interconnections of all the internal transitions. Some events are stated as controllable, meaning that the corresponding transitions (input) may be delayed from firing until some arbitrary time provided by a supervisor. We wish to determine the greatest input in order to obtain the desired behavior defined by the desired output and the specifications. This problem is denoted problem 1 in the document. Moreover, the aim of this paper is also the extension of problem 1 to Predictive Control on an infinite horizon. This extension is denoted problem 2. Using a receding horizon principle, Model Predictive Control is a form of control in which the current control is obtained by solving on-line a finite open-loop optimal control problem at each sampling instant. The current state of the process is considered as the initial

Philippe Declerck and Abdelhak Guezzi LISA EA4014, University of Angers, 62 avenue Notre-Dame du Lac, 49000 Angers, France, e-mail: [email protected], [email protected]

R. Bru and S. Romero-Vivó (Eds.): Positive Systems, LNCIS 389, pp. 279–290. springerlink.com © Springer-Verlag Berlin Heidelberg 2009

state. The optimization yields an optimal control sequence but only the first control in this sequence is applied to the plant. This procedure can be repeated infinitely. In this paper, we complete the approaches developed in [4] and [6] by introducing specifications defined by a P-time Event Graph as in the preliminary study [5], which is generalized to a sliding horizon. The framework of this proposed study can be found in [6] where a comparison with [15] is given in the standard algebra. The approach is based on the concept of earliest desired output which was introduced in [4] to the best of our knowledge. A similar concept was also considered in [11]: as this last approach uses the past control and not the current state, we can prove that the relevant updated desired output (called reference input in [11]) can be lower. Let us recall that a simple forward technique gives the earliest desired output while the control is given by the classical backward approach. However, this simple technique does not hold if some specifications are introduced in the problem as shown in parts 2 and 3: the structure of matrix $D_h$ in part 2.2 shows the forward and backward connections of inequality $X \ge D_h \otimes X$ for instance. In this paper, we consider that each transition is observable: the event date of each transition firing is assumed to be available. Let us note that we have developed software written in Scilab composed of estimation, prediction and control. No hypothesis is taken on the structure of the Event Graphs, which do not need to be strongly connected. The initial marking should only satisfy the classical liveness condition and the usual hypothesis that places should be First In First Out (FIFO) is taken.
In the context of the trajectory tracking control (problem 2), we consider different structures of matrix $B$. Defined in part 3.3.2, the case of fully controlled transitions can be found in the modeling of railway systems where each train departure must be controlled [2, 14]. This structure is also considered in urban bus networks where the timetable must be respected at each stop [9].

The paper is structured as follows: the optimal control on a fixed horizon (problem 1) and its extension to a sliding horizon (problem 2) are successively considered. The resolution of problem 2 is based on the prediction of the earliest desired output. For lack of space, we cannot give a complete presentation of the preliminary remarks, but the reader can easily find more information in [1] and [8]. The presentation of the model of the P-time Event Graph is also omitted: the reader can find the preliminaries and the presentation of the models in [5].

Maximization and addition operations are denoted respectively $\oplus$ and $\otimes$. The set of $n.n$ matrices with entries in dioid $\mathcal{D} = \overline{\mathbb{R}}_{\max} = (\mathbb{R} \cup \{-\infty\} \cup \{+\infty\}, \oplus, \otimes)$, including the two operations $\oplus$ and $\otimes$, is a dioid, which is denoted $\mathcal{D}^{n.n}$. Mapping $f$ is said to be residuated if for all $y \in \mathcal{D}$, the least upper bound of subset $\{x \in \mathcal{D} \mid f(x) \le y\}$ exists and lies in this subset. Mapping $x \in (\overline{\mathbb{R}}_{\max})^n \mapsto A \otimes x$, defined over $\overline{\mathbb{R}}_{\max}$, is residuated (see [1]) and the left $\otimes$-residuation of $B$ by $A$ is denoted by: $A \backslash B = \max\{x \in (\overline{\mathbb{R}}_{\max})^n \text{ such that } A \otimes x \le B\}$. The following theorem uses the Kleene star defined by: $A^* = \bigoplus_{i=0}^{+\infty} A^i$.

Theorem 1. [1, Theorem 4.75 part 1] Consider equation $x = A \otimes x \oplus B$ and inequality $x \ge A \otimes x \oplus B$ with $A$ and $B$ in complete dioid $\mathcal{D}$. Then, $A^* B$ is the least solution to these two relations.

Below, variable $x_i(k)$ is the date of the $k$-th firing of transition $x_i$.
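The operations recalled above can be made concrete with a small sketch. The following Python code is only an illustration under stated assumptions, not the authors' Scilab software: matrices are lists of rows, `EPS` and `TOP` stand for $-\infty$ and $+\infty$, all function names are hypothetical, and the Kleene star is truncated at $A^{n-1}$, which is valid only when the precedence graph of $A$ has no circuit of positive weight.

```python
EPS = float("-inf")  # epsilon: neutral element of the max operation
TOP = float("inf")   # top element of the complete dioid


def otimes(a, b):
    """Scalar product (ordinary +) with epsilon absorbing: eps (x) top = eps."""
    return EPS if EPS in (a, b) else a + b


def mat_mul(A, B):
    """Max-plus matrix product of two matrices given as lists of rows."""
    return [[max(otimes(A[i][k], B[k][j]) for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def mat_add(A, B):
    """Entrywise maximum of two matrices of the same size."""
    return [[max(a, b) for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


def identity(n):
    """Max-plus identity: 0 on the diagonal, epsilon elsewhere."""
    return [[0.0 if i == j else EPS for j in range(n)] for i in range(n)]


def kleene_star(A):
    """I + A + ... + A^(n-1) in max-plus; assumes no positive-weight circuit."""
    n = len(A)
    star = power = identity(n)
    for _ in range(n - 1):
        power = mat_mul(power, A)
        star = mat_add(star, power)
    return star


def ldiv_scalar(a, b):
    """Greatest scalar x such that a (x) x <= b."""
    if a == EPS or b == TOP:
        return TOP
    if a == TOP or b == EPS:
        return EPS
    return b - a


def left_residual(A, b):
    """Left residuation of vector b by A: component i is min_j A[j][i] \\ b[j]."""
    return [min(ldiv_scalar(A[j][i], b[j]) for j in range(len(A)))
            for i in range(len(A[0]))]


# Theorem 1 on a 2x2 example: least solution of x = A (x) x (+) b.
A = [[EPS, EPS], [2.0, EPS]]
b = [[1.0], [0.0]]
x = mat_mul(kleene_star(A), b)  # column vector as a list of one-element rows
```

In this example `x` satisfies $x = A \otimes x \oplus b$, and `left_residual` returns the greatest vector `u` with $A \otimes u \le b$.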

2 Control on a Fixed Horizon (Problem 1)

Let us consider the objective of problem 1.

2.1 Objective

The problem of this paper is the determination of the greatest control of a plant described by a Timed Event Graph when the state and control trajectories are constrained by additional specifications defined by a P-time Event Graph. Applications of P-time Event Graphs can be found in production systems, microcircuit design, transportation systems, real-time systems and the food industry. The objective is to calculate the greatest control $u$ on horizon $[k_s + 1, k_f]$ such that its application to the Timed Event Graph defined by

$$\begin{cases} x(k+1) = A \otimes x(k) \oplus B \otimes u(k+1) \\ y(k) = C \otimes x(k) \end{cases} \qquad (1)$$

satisfies the following conditions:

a) $y \le \underline{z}$, knowing the trajectory of the desired output $\underline{z}$ on a fixed horizon $[k_s + 1, k_f]$ with $h = k_f - k_s \in \mathbb{N}$;

b) The state trajectory follows the model of the autonomous P-time Event Graph defined by

$$\begin{pmatrix} x(k) \\ x(k+1) \end{pmatrix} \ge \begin{pmatrix} \varepsilon & A^+ \\ A^- & A^= \end{pmatrix} \otimes \begin{pmatrix} x(k) \\ x(k+1) \end{pmatrix} \qquad (2)$$

for $k \ge k_s$;

c) The first state vector of the state trajectory $x(k)$ for $k \ge k_s$ is finite and is the known vector $\underline{x}(k_s)$. This "non-canonical" initial condition can be the result of a past evolution of a process.

Underlined symbols like $\underline{x}(k_s)$, $\underline{z}(k)$ correspond to known data of the problem, and $x(k)$ and $y(k)$ are estimated in the following resolutions based on the information available at number of events $k_s$.

A simple example of this problem is a production system composed of two tasks, the cooking of a product and its packaging, with an additional constraint: the cooking time must not be excessive, otherwise the product would be damaged.

In the following part 2.2, we present the relations which describe a trajectory of a Timed Event Graph satisfying the specifications defined by a P-time Event Graph (constraint b)). The introduction of the "Just-in-time" objective (constraint a)) in part 2.3 allows the resolution of the control problem on a fixed horizon.

2.2 Trajectory Description

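From (1) and (2), we deduce a system which describes the trajectories on horizon $[k_s, k_f]$. Let us introduce the following notations. Let

$$X = \begin{pmatrix} x(k_s)^t & x(k_s+1)^t & x(k_s+2)^t & \cdots & x(k_f-1)^t & x(k_f)^t \end{pmatrix}^t$$

($t$: transposed) and

$$D_h = \begin{pmatrix}
\varepsilon & A^+ & \varepsilon & \cdots & \varepsilon & \varepsilon & \varepsilon \\
A \oplus A^- & A^= & A^+ & \cdots & \varepsilon & \varepsilon & \varepsilon \\
\varepsilon & A \oplus A^- & A^= & \cdots & \varepsilon & \varepsilon & \varepsilon \\
\cdots & \cdots & \cdots & \cdots & \cdots & \cdots & \cdots \\
\varepsilon & \varepsilon & \varepsilon & \cdots & A^= & A^+ & \varepsilon \\
\varepsilon & \varepsilon & \varepsilon & \cdots & A \oplus A^- & A^= & A^+ \\
\varepsilon & \varepsilon & \varepsilon & \cdots & \varepsilon & A \oplus A^- & A^=
\end{pmatrix}.$$

Matrix $D_h$ presents an original block tridiagonal structure: this is a square matrix, composed of a lower diagonal (square submatrices $A \oplus A^-$), a main diagonal (square submatrices $A^=$ except the first element) and an upper diagonal (square submatrices $A^+$), with all other blocks being zero matrices ($\varepsilon$). As $n$ is the dimension of $x$, $D_h$ is an $n(h+1) \times n(h+1)$ matrix.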
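The block structure of $D_h$ can be assembled mechanically from the four $n \times n$ matrices. The sketch below is a hedged illustration: the function name `build_Dh`, the list-of-rows representation and the use of `float("-inf")` for $\varepsilon$ are assumptions made here and do not come from the paper.

```python
EPS = float("-inf")  # epsilon, the max-plus zero


def build_Dh(A, A_minus, A_eq, A_plus, h):
    """Assemble the n(h+1) x n(h+1) block tridiagonal matrix D_h.

    Block row 0 is (eps, A+, eps, ...). Block row r >= 1 carries
    A (+) A- on the lower diagonal, A= on the main diagonal and,
    while r < h, A+ on the upper diagonal.
    """
    n = len(A)
    lower = [[max(A[i][j], A_minus[i][j]) for j in range(n)] for i in range(n)]
    size = n * (h + 1)
    D = [[EPS] * size for _ in range(size)]

    def put(block_row, block_col, M):
        # Copy the n x n block M at block position (block_row, block_col).
        for i in range(n):
            for j in range(n):
                D[block_row * n + i][block_col * n + j] = M[i][j]

    for r in range(h + 1):
        if r >= 1:
            put(r, r - 1, lower)
            put(r, r, A_eq)
        if r < h:
            put(r, r + 1, A_plus)
    return D


# Scalar example (n = 1, h = 2) with hypothetical durations.
D_example = build_Dh([[2.0]], [[1.0]], [[EPS]], [[-6.0]], 2)
```

For the scalar example, the result has $-6$ on the upper diagonal, $2 = 2 \oplus 1$ on the lower diagonal and $\varepsilon$ elsewhere.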

Theorem 2. The state trajectories of a Timed Event Graph (1) starting from $\underline{x}(k_s)$ and following the specifications defined by a P-time Event Graph (2) on horizon $[k_s, k_f]$ satisfy the following system

$$\begin{cases}
X \ge D_h \otimes X \\
x(k) \ge B \otimes u(k) & \text{for } k \in [k_s+1, k_f] \\
x(k) \le A \otimes x(k-1) \oplus B \otimes u(k) & \text{for } k \in [k_s+1, k_f] \\
x(k_s) = \underline{x}(k_s)
\end{cases} \qquad (3)$$

Proof. System (3) is directly deduced from the models of the Timed Event Graph (1) and the P-time Event Graph (2). For instance, equality (1) is equivalent to

$$\begin{cases} A \otimes x(k-1) \oplus B \otimes u(k) \le x(k) \\ x(k) \le A \otimes x(k-1) \oplus B \otimes u(k) \end{cases} \quad \text{for } k \ge k_s. \qquad \square$$

2.3 Greatest Trajectory

We now introduce the "Just-in-time" objective defined by constraint a). Using the previous description of the state and control trajectories (3), the problem is rewritten under a general fixed-point formulation $x \le f(x)$ which allows the resolution of control problem 1. The greatest estimated state trajectory $X$ and its relevant state $x(k)$ are denoted $X^+$ and $x^+(k)$, respectively.

Theorem 3. The greatest state and control trajectory of a Timed Event Graph (1) starting from $\underline{x}(k_s)$ and following specifications defined by a P-time Event Graph (2) on horizon $[k_s, k_f]$ is the greatest solution of the following fixed-point inequality system

$$\begin{cases}
X \le D_h \backslash X \\
u(k) \le B \backslash x(k) & \text{for } k \in [k_s+1, k_f] \\
x(k) \le [A \otimes x(k-1) \oplus B \otimes u(k)] \wedge C \backslash \underline{z}(k) & \text{for } k \in [k_s+1, k_f] \\
x(k_s) \le \underline{x}(k_s)
\end{cases} \qquad (4)$$

with condition $\underline{x}(k_s) \le x^+(k_s)$.

Proof. From $D_h \otimes X \le X$ and $B \otimes u(k) \le x(k)$, we deduce $X \le D_h \backslash X$ and $u(k) \le B \backslash x(k)$ on horizon $[k_s+1, k_f]$. The constraints of the desired output $y \le \underline{z}$ and $y(k) = C \otimes x(k)$ can be introduced in the fixed-point formulation with $x(k) \le C \backslash \underline{z}(k)$. So, $x(k) \le [A \otimes x(k-1) \oplus B \otimes u(k)] \wedge C \backslash \underline{z}(k)$ on horizon $[k_s+1, k_f]$. The constraint $x(k_s) = \underline{x}(k_s)$ can be written $x(k_s) \le \underline{x}(k_s)$ and $\underline{x}(k_s) \le x(k_s)$. Therefore, a condition is $\underline{x}(k_s) \le x(k_s)$. $\square$

If condition $\underline{x}(k_s) \le x^+(k_s)$ is satisfied, then $\underline{x}(k_s) = x^+(k_s)$ and condition c) is satisfied. Therefore, the calculated state trajectory for $k \ge k_s$ is consistent with the past evolution $k \le k_s$: in other words, the Timed Event Graph can follow the calculated trajectory $X^+$ after $k_s$, which obeys the specifications defined by the P-time Event Graph.

System (4) leads to a fixed-point formulation whose general form is $x \le f(x)$. Containing the (min, max, +) term $[A \otimes x(k-1) \oplus B \otimes u(k)] \wedge C \backslash \underline{z}(k)$, $f$ is also a (min, max, +) function. It can be defined by the following grammar: $f = b, x_1, x_2, \ldots, x_n \mid f \otimes a \mid f \wedge f \mid f \oplus f$, where $a$, $b$ are arbitrary real numbers ($a, b \in \mathbb{R}$).

The effective calculation of the greatest control can be made by a classical iterative algorithm of Mc Millan and Dill [12], which particularizes the algorithm of Kleene to (min, max, +) expressions. The general resolution of $x \le f(x)$ is given by the iterations $x_i \leftarrow x_{i-1} \wedge f(x_{i-1})$ if the finite starting point is greater than the final solution. Here, index $i$ is the iteration number and not the number of a component of vector $x$. The general algorithm of Mc Millan and Dill [12] is known to be pseudo-polynomial in practice.

The aim of the following part is the extension of problem 1 to predictive control.
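The iteration $x_i \leftarrow x_{i-1} \wedge f(x_{i-1})$ can be sketched in a few lines of Python. This is a generic illustration and not the algorithm of [12] as implemented by the authors: the function names, the iteration cap `max_iter` and the two-variable (min, max, +) function `f_example` are hypothetical choices made here.

```python
def greatest_fixed_point(f, x0, max_iter=10_000):
    """Iterate x <- x ^ f(x) componentwise (^ is min) until x stops changing.

    x0 is assumed finite and greater than the solution looked for. The
    iteration cap is a safeguard: when no finite solution exists, the
    sequence can decrease without bound.
    """
    x = list(x0)
    for _ in range(max_iter):
        fx = f(x)
        nxt = [min(a, b) for a, b in zip(x, fx)]
        if nxt == x:
            return x
        x = nxt
    raise RuntimeError("no convergence within max_iter iterations")


def f_example(x):
    """A small (min, max, +) function of two variables (illustrative only)."""
    return [min(x[1] - 1.0, 10.0), min(max(x[0] + 3.0, 2.0), 7.0)]


x_plus = greatest_fixed_point(f_example, [100.0, 100.0])
```

Starting from `[100.0, 100.0]`, the iterates are `[10.0, 7.0]` and then `[6.0, 7.0]`, which is the greatest vector satisfying $x \le f(x)$ for this example.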

3 Predictive Control (Problem 2)

We present below the principle of the sliding horizon in predictive control and the general technique of the proposed approach. Another description can be found in [6], where the control of a Timed Event Graph without specification is described in standard algebra.

We assume that each event date of transition firing is available for the current number of events $k$: at step $k = k_s$, $\underline{u}_{k_s}$ and $\underline{x}_{k_s}$ are known. A future control sequence $u(k)$ for $k \in [k_s+1, k_s+h]$ is determined such that this control is the optimal solution of the problem. The first element of the optimal sequence (here $u(k_s+1)$) is applied to the process. At the next number of events $k_s+1$, the horizon is shifted: at step $k_s+1$, the problem is updated with new information $\underline{u}_{k_s+1}$ and $\underline{x}_{k_s+1}$, and a new optimization is performed.

3.1 Principle of the Proposed Approach

After the calculation of state trajectory $x^+$ and control $u$ at step $k_s$, condition c) $\underline{x}_{k_s} = x^+(k_s)$ must be checked in order to guarantee the coherence of the state trajectory between each iteration: this verification shows that the future trajectory ($k \ge k_s + 1$) is the extension of the past trajectory ($k \le k_s$). The on-line comparison of the two vectors $\underline{x}_{k_s}$ and $x^+(k_s)$ is similar to the comparator of the closed loop of classical continuous control, which compares a desired trajectory and its measure: when the two data are equal, the objective is obtained. In our context, an optimal control is similarly found. Let us consider the different cases.

• If condition $\underline{x}_{k_s} = x^+(k_s)$ is satisfied, we can conclude that control problem 1 has a solution for data $\underline{z}$ and $\underline{x}_{k_s}$: there is an optimal control such that, starting from the current state $\underline{x}_{k_s}$, the Timed Event Graph can follow a trajectory obeying the specifications defined by a P-time Event Graph with a Just-in-time criterion.

• If $\underline{x}_{k_s} \ne x^+(k_s)$, we can conclude that control problem 1 has no solution for data $\underline{z}$ and $\underline{x}_{k_s}$: the process presents some delays produced, for instance, by a disruption of the process activity. The Timed Event Graph cannot (provisionally) follow a trajectory obeying the constraints of the problem, i.e. the three conditions a), b) and c). Consequently, at least one specification and/or the Just-in-time criterion is not satisfied if we directly apply the calculated control of part 2.3 to the Timed Event Graph starting from the initial condition $\underline{x}_{k_s}$.

Therefore, the problem must be modified such that condition c) $\underline{x}_{k_s} = x^+(k_s)$ is satisfied. In this paper, we consider that the model of the Timed Event Graph cannot be modified. If we assume that the initial condition is the result of a past evolution, $\underline{x}_{k_s}$ is a datum of the problem and only condition a) and/or condition b) can be changed.

3.2 Predictive Control Objective

Suppose that the fulfillment of the specifications (condition b)) is essential. In consequence, the only possibility is to modify the Just-in-time criterion of condition a) and to put the desired output back such that problem 1 presents a solution.

Therefore, an aim is the determination of a desired output such that control problem 1 presents a solution. Particularly, the state trajectory must start from current state $\underline{x}_{k_s}$. As a minimal desired output allows the limitation of the delays, the problem is to find the earliest desired output denoted $z^-$ such that

• there is a control such that its application to the Timed Event Graph generates a state trajectory which starts from the current state $\underline{x}_{k_s}$ (condition c));

• this state trajectory follows the additional specifications defined by the P-time Event Graph on horizon $[k_s+1, k_s+h]$ (condition b)).

This earliest desired output is a limit such that the Timed Event Graph cannot follow a lower trajectory satisfying the different constraints of the problem. Knowing earliest desired output $z^-$, the optimal approach of part 2.3 can be applied to the modified desired output trajectory $z_m(k) = \underline{z}(k) \oplus z^-(k)$ for $k \in [k_s+1, k_s+h]$, such that this procedure yields a control which can be applied to the process. Therefore, condition a) is satisfied for the modified desired output $z_m$ and the relevant calculated control is optimal for $z_m$.

Below, we characterize an arbitrary state trajectory obeying the specifications (condition b)). System (3) will be rewritten under a fixed-point formulation $f(x) \le x$ allowing the resolution of the prediction problem of the earliest desired output $z^-$.

3.3 Prediction of the Earliest Desired Output z−

An arbitrary state trajectory obeying the specifications is now described with a fixed-point form. From system (3), we deduce the following system

$$\begin{cases}
X \ge D_h \otimes X \\
x(k) \ge B \otimes u(k) & \text{for } k \in [k_s+1, k_f] \\
x(k_s) = \underline{x}(k_s)
\end{cases} \qquad (5)$$

which allows the determination of an interesting desired output. Indeed, this system is a fixed-point form $f(X) \le X$ where $f$ is a (max, +) function (if we assume that control $u$ is known). Therefore, we can apply the concept of componentwise order to the desired output as follows: the resolution gives the prediction of the earliest state trajectory $x^-(k)$ for $k \in [k_s+1, k_s+h]$ and so, of the earliest output trajectory $z^-(k) = C \otimes x^-(k)$. The modified desired output $z_m$ is consequently obtained: $z_m(k) = \underline{z}(k) \oplus z^-(k)$ for $k \in [k_s+1, k_s+h]$.

We now characterize the set of trajectories of systems (3) and (5).

Property 1. Each trajectory of system (3) satisfies (5).

Proof. Immediate: as system (3) contains an additional constraint, any trajectory of this system satisfies relaxed system (5). $\square$

3.3.1 Earliest Firing Rule

As $x(k) \ge A \otimes x(k-1) \oplus B \otimes u(k)$ is already satisfied in (5), constraint $x(k) \le A \otimes x(k-1) \oplus B \otimes u(k)$ guarantees the earliest firing rule. In this part, we determine the conditions such that this last relation can be disregarded in the determination of the trajectory.

If we now consider only inequality $x(k) \ge B \otimes u(k)$ of system (5), the greatest control is obviously $u(k) = B \backslash x(k)$. This control law is considered below. We assume that no row of $B$ is null.

Theorem 4. A trajectory $x$ of (5) satisfies (3) if this state trajectory $x$ also satisfies condition $B \otimes (B \backslash x(k)) = x(k)$ for $k \in [k_s+1, k_f]$.

Proof. Let us prove that inequality $x(k) \le A \otimes x(k-1) \oplus B \otimes u(k)$ of system (3) is also satisfied in (5). The relaxation of $x(k) = A \otimes x(k-1) \oplus B \otimes u(k)$ for $k \in [k_s+1, k_f]$ gives $x(k) \ge A \otimes x(k-1) \oplus B \otimes u(k)$ or, equivalently, $x(k) \ge B \otimes u(k)$ and $x(k) \ge A \otimes x(k-1)$. This last inequality is expressed in (5) with $X \ge D_h \otimes X$. Let us suppose that an arbitrary trajectory denoted $X'$ satisfies system (5). Particularly, $X'$ satisfies $X' \ge D_h \otimes X'$ and so inequality $x'(k) \ge A \otimes x'(k-1)$ is satisfied. We want the Timed Event Graph defined by its state equation to follow the given trajectory $X'$ (neither earlier, nor later) by applying a specific control. For given $x'(k)$, a possible control is $u(k) = B \backslash x'(k)$, which is the greatest control satisfying inequality $x'(k) \ge B \otimes u(k)$. As $B \otimes u(k) = B \otimes (B \backslash x'(k)) = x'(k)$ and $x'(k) \ge A \otimes x'(k-1)$, we can deduce that $A \otimes x'(k-1) \oplus B \otimes u(k)$ is equal to $x'(k)$. Particularly, equality $x'(k) = A \otimes x'(k-1) \oplus B \otimes u(k)$ implies inequality $x'(k) \le A \otimes x'(k-1) \oplus B \otimes u(k)$. This control guarantees the values of trajectory $X'$ and consequently, the consistency of $X' \ge D_h \otimes X'$. $\square$

Therefore, condition $B \otimes (B \backslash x(k)) = x(k)$ on the state trajectory leads to a control satisfying $x(k) = B \otimes u(k)$ (and not only $x(k) \ge B \otimes u(k)$). The relation expressing the earliest firing rule $x(k) \le A \otimes x(k-1) \oplus B \otimes u(k)$ can be disregarded in the determination of the trajectory.

3.3.2 Structures 1 and 2

As above, we assume that no row of $B$ is null. Moreover, we assume that each column of $B$ contains at most one non-null element (but a row can contain more than one element). With this structure of matrix $B$ (denoted structure 1), there is a control such that $B \otimes u(k) = x(k)$, and condition $B \otimes (B \backslash x(k)) = x(k)$ is satisfied. Indeed, as a general result of residuation is $(A \backslash b)_i = \bigwedge_{j=1}^{m} A_{ji} \backslash b_j$ where $A$ is an $m \times n$ matrix, we obtain $u_i = (B \backslash x(k))_i = B_{ji} \backslash x_j(k)$ for a specific row $j$, and equality $B_{ji} \otimes u_i = x_j(k)$ is satisfied. As no row of $B$ is null, $B \otimes u(k) = x(k)$.

A more restrictive condition (structure 2) is as follows. We can also assume that each column and each row of $B$ contain at most one non-null element. This last assumption also corresponds to the hypothesis of "fully controlled" transitions, i.e. $B = I$. Therefore, the firing of each transition can be delayed by the control and all the transitions are said to be controllable. Modeling of transportation networks with timetables often leads to this assumption [2, 3, 9, 14]. Consequently, $B \otimes (B \backslash x(k)) = x(k)$ is always satisfied for any state trajectory and the control law is obviously $u(k) = x(k)$.
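The effect of structure 1 on condition $B \otimes (B \backslash x(k)) = x(k)$ can be checked numerically. The following sketch is only an illustration with hypothetical names and data; it assumes a finite state vector `x` and matrices whose entries are finite or $\varepsilon$ (written `EPS`).

```python
EPS = float("-inf")  # epsilon (null element)
TOP = float("inf")


def mat_vec(B, u):
    """Max-plus product of matrix B with vector u (epsilon is absorbing)."""
    return [max(b + ui if b != EPS and ui != EPS else EPS
                for b, ui in zip(row, u)) for row in B]


def left_residual(B, x):
    """Greatest u with B (x) u <= x, for finite x: u_i = min_j (x_j - B_ji)."""
    return [min(x[j] - B[j][i] if B[j][i] != EPS else TOP
                for j in range(len(B))) for i in range(len(B[0]))]


def has_structure_1(B):
    """No null row, and at most one non-null element in each column."""
    no_null_row = all(any(b != EPS for b in row) for row in B)
    columns_ok = all(sum(B[j][i] != EPS for j in range(len(B))) <= 1
                     for i in range(len(B[0])))
    return no_null_row and columns_ok


x = [5.0, 7.0]

# Structure 1: the greatest control reproduces x exactly.
B1 = [[0.0, 2.0, EPS],
      [EPS, EPS, 1.0]]
u1 = left_residual(B1, x)

# One input driving two transitions: structure 1 fails, and so does equality.
B2 = [[0.0],
      [0.0]]
u2 = left_residual(B2, x)
```

With `B1`, the product `mat_vec(B1, u1)` equals `x`; with `B2`, the single input can only be set to the smaller date, so the second component of `x` is not reached.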

Using the Kleene star, a simple resolution of relaxed fixed-point form (5) in (max, +) algebra can now give the earliest state trajectory denoted $X^-$ and so, the earliest output trajectory $z^-(k) = C \otimes x^-(k)$, where $x^-(k)$ is the earliest state vector for $k \in [k_s+1, k_f]$.

Let us now determine the earliest state trajectory $X^-$ of the prediction problem. Let

$$E = \begin{pmatrix} \underline{x}(k_s) \\ \varepsilon \\ \varepsilon \\ \cdots \\ \varepsilon \\ \varepsilon \end{pmatrix}.$$

As constraint $x(k_s) = \underline{x}(k_s)$ can be written $x(k_s) \le \underline{x}(k_s)$ and $\underline{x}(k_s) \le x(k_s)$, the earliest state trajectory $X^-$ is given by the resolution of $X \ge D_h \otimes X \oplus E$ with condition $\underline{x}(k_s) \ge x^-(k_s)$. The application of the Kleene star by Theorem 1 gives the lowest solution $X^- = (D_h)^* \otimes E$ with condition $\underline{x}(k_s) \ge x^-(k_s)$. The control is given by $u^-(k) = B \backslash x^-(k)$.
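The product $(D_h)^* \otimes E$ can be evaluated without forming the star explicitly, by iterating $X \leftarrow D_h \otimes X \oplus E$ from $X = E$; after $i$ steps the iterate equals $(I \oplus D_h \oplus \cdots \oplus D_h^i) \otimes E$. The sketch below is a hedged illustration of this standard equivalence, not the authors' implementation: the names and the scalar data ($n = 1$, $h = 2$, hypothetical durations) are assumptions made here.

```python
EPS = float("-inf")  # epsilon


def mat_vec(D, X):
    """Max-plus product of matrix D with vector X (epsilon is absorbing)."""
    return [max(d + x if d != EPS and x != EPS else EPS
                for d, x in zip(row, X)) for row in D]


def earliest_trajectory(D, E):
    """Least solution of X >= D (x) X (+) E, i.e. D* (x) E when it is finite.

    With N = len(D), the iterates stabilise within N steps when no circuit
    of positive weight is reachable; otherwise the dates keep growing and
    an error is raised.
    """
    X = list(E)
    for _ in range(len(D)):
        nxt = [max(d, e) for d, e in zip(mat_vec(D, X), E)]
        if nxt == X:
            return X
        X = nxt
    raise ValueError("no convergence: a positive-weight circuit is likely")


# n = 1, h = 2: A (+) A- = 2 below the diagonal, A+ = -6 above it.
D_h = [[EPS, -6.0, EPS],
       [2.0, EPS, -6.0],
       [EPS, 2.0, EPS]]
x_ks = 3.0
E = [x_ks, EPS, EPS]

X_minus = earliest_trajectory(D_h, E)
coherent = x_ks >= X_minus[0]  # condition on the first component
```

For this example the earliest dates are 3, 5 and 7, and the first component equals the known state, so the coherence condition holds. With `B = I` (structure 2), the control $u^-(k)$ would simply equal $x^-(k)$.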

3.3.3 Generalization

Condition $B \otimes (B \backslash x(k)) = x(k)$ on the state trajectory leads to control $u(k) = B \backslash x(k)$, which produces the exact calculated state trajectory. The same result can be obtained with assumptions on the structure of matrix $B$ (structure 1 or 2). In fact, this technique can be generalized, as we can consider only the transitions whose dates obey the additional constraints and neglect the other ones. Using the previously calculated state trajectory, the application of control $u(k) = B \backslash x(k)$ must lead to the exact firing dates of the first class but can minimize the firing dates of the second class.

The structure of $B$ is defined as follows. Divide the set of transitions $TR$ into $T_c$, defined below, and its complement $T_{nc}$, with $TR = T_c \cup T_{nc}$. Set $T_c$ is the set of transitions $x_i$ such that there is a non-null coefficient $A^-_{ij}$ or $A^=_{ij}$ or $A^+_{ij}$. Recall that $x_i(k+1) \ge A^-_{ij} \otimes x_j(k)$, $x_i(k+1) \ge A^=_{ij} \otimes x_j(k+1)$ and $x_i(k) \ge A^+_{ij} \otimes x_j(k+1)$, for $k \ge k_s$.

After reorganization of the rows and columns, matrix $B$ is as follows: vector $x_c$ (respectively $x_{nc}$) expresses the firing dates of transitions $x_i \in T_c$ (respectively $x_i \in T_{nc}$). With no conditions on $B_{12}$ and $B_{22}$,

$$\begin{pmatrix} x_c(k) \\ x_{nc}(k) \end{pmatrix} = \begin{pmatrix} B_{11} & B_{12} \\ B_{21} & B_{22} \end{pmatrix} \otimes \begin{pmatrix} u_1(k) \\ u_2(k) \end{pmatrix}$$

where $B_{11}$ follows structure 1 and $B_{21} = \varepsilon$. So, the control can satisfy $x_c(k) = B_{11} \otimes u_1(k)$ with $x_c(k) \ge B_{12} \otimes u_2(k)$ and $x_{nc}(k) \ge B_{22} \otimes u_2(k)$.

3.4 Causality

Approaches based on a feedback defined by a Petri net are limited by the condition that the temporization and initial marking of each added place are non-negative. The existence of a linear state feedback is discussed in [10]: this problem is reminiscent of difficulties of the theory of linear dynamical systems over rings [7]. Similarly, Model Predictive Control is limited by the following behavior. Let $u_m$ (respectively, $u^-$) be the calculated control corresponding to the modified desired output $z_m$ (respectively, earliest desired output $z^-$). In general, not all components of $x(k)$ are known at the same time and some of the components of $x(k_s + j)$ for some $j > 0$ might be known when the control $u_m$ is calculated. In this part, we only consider the usual procedure used in model predictive control. Consequently, the application of control $u_m(k_s+1)$ must be made after the dates of $x(k_s)$, which are data of the problem. So, each component $(u_m)_i(k_s+1)$ must be greater than the date of the possible application, which is the addition (in standard algebra) of the maximum of the components of $x(k_s)$ and the CPU time $T_{CPU}$. More formally, the causality condition is $u_m(k_s+1) \ge F_u \otimes x(k_s)$, where $F_u$ is the $\otimes$-product of $T_{CPU}$ and a full matrix of zeros with appropriate dimensions. Moreover, each calculated date $x_i(k_s+1)$ is the result of the application of the control and we can similarly write $x(k_s+1) \ge F_x \otimes x(k_s)$, where $F_x$ is defined as $F_u$ with appropriate dimensions. As the complete analysis of these conditions needs an extensive study (see part 7.2 "Directions for future research" in [13]), we only give the following results.

Remark 1. If matrix B has no null row, then the first causality relation implies the second one. Indeed, x(ks + 1) ≥ B ⊗ um(ks + 1) ≥ B ⊗ Fu ⊗ x(ks) ≥ Fx ⊗ x(ks). Different authors give examples following this assumption on B (see [15] and chapters 3 and 4 in [13], for instance).

The following result assumes that the predictive control approach gives $x^-(k_s) = x(k_s)$.

Property 2. Suppose that the control procedure gives a control $u^-$ such that $B \otimes u^-(k) = x^-(k)$ and $x^-(k_s) = x(k_s)$. The causality conditions $x(k_s+1) \ge F_x \otimes x(k_s)$ and $u_m(k_s+1) \ge F_u \otimes x(k_s)$ are satisfied for any $x(k_s)$ if $I \oplus A \oplus A^- \ge F_x$ and $B \backslash [(I \oplus A \oplus A^-)] \ge F_u$, respectively.

Proof. Let us consider the causality condition on state $x$. So,

$$x(k_s+1) \ge x^-(k_s+1) \ge (A \oplus A^-) \otimes x^-(k_s) \oplus B \otimes u^-(k_s+1) = (A \oplus A^-) \otimes x^-(k_s) \oplus x^-(k_s+1) = (I \oplus A \oplus A^-) \otimes x^-(k_s) = (I \oplus A \oplus A^-) \otimes x(k_s).$$

As relation $x(k_s+1) \ge (I \oplus A \oplus A^-) \otimes x(k_s)$ is always satisfied and assumption $I \oplus A \oplus A^- \ge F_x$ is made, we can deduce that $x(k_s+1) \ge F_x \otimes x(k_s)$.

Let us consider the causality condition on control $u_m$. So,

$$u_m(k_s+1) \ge u^-(k_s+1) = B \backslash x^-(k_s+1) \ge B \backslash [(I \oplus A \oplus A^-) \otimes x(k_s)] \ge B \backslash [(I \oplus A \oplus A^-)] \otimes x(k_s)$$

(property f12 in [1]). If assumption $B \backslash [(I \oplus A \oplus A^-)] \ge F_u$ is made, we can deduce that $u_m(k_s+1) \ge F_u \otimes x(k_s)$. $\square$
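The causality condition on the control has a direct reading in standard algebra: since $F_u$ is $T_{CPU}$ times a full matrix of zeros, every component of $F_u \otimes x(k_s)$ equals $T_{CPU} + \max_j x_j(k_s)$. The short Python sketch below illustrates this check; the function names and the numerical values are hypothetical and are not taken from the paper.

```python
def causality_bound(x_ks, t_cpu):
    """Common value of every component of F_u (x) x(k_s).

    F_u is T_CPU (x) a full matrix of zeros, so each row of the max-plus
    product gives T_CPU + max_j x_j(k_s) in standard algebra.
    """
    return t_cpu + max(x_ks)


def is_causal(u_next, x_ks, t_cpu):
    """True when u_m(k_s + 1) >= F_u (x) x(k_s) holds componentwise."""
    bound = causality_bound(x_ks, t_cpu)
    return all(u_i >= bound for u_i in u_next)


x_ks = [3.0, 5.0]  # last known firing dates (hypothetical)
t_cpu = 0.5        # computation time (hypothetical)

ok = is_causal([6.0, 7.0], x_ks, t_cpu)
too_early = is_causal([5.0, 7.0], x_ks, t_cpu)
```

Here the bound is 5.5, so the first candidate control can be applied while the second one asks for a firing before the computation has finished.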

4 Conclusion

In this paper, we present a trajectory tracking control of Timed Event Graphs with specifications defined by a P-time Event Graph. The proposed approach presents the following characteristics.

The approach is completely defined in (max, +) algebra and does not use standard algebra. Except for the Kleene star algorithm, every mathematical tool used is presented in the document, which gives a complete description of the approach.

The two parts of the trajectory tracking control are: a) the optimal control; b) the updating of a desired output based on a prediction of the earliest possible desired output trajectory. These two parts use a special block tridiagonal matrix. This type of matrix is often encountered in numerical solutions of engineering problems (e.g. computational fluid dynamics, finite element method).

In the general case, a pseudo-polynomial algorithm gives the control and proposes an initial condition which must satisfy a condition of coherence of the state trajectory. This technique is sufficient when the control system can apply the calculated initial condition to the process. For different structures of matrix B, the proposed trajectory tracking control is composed of two polynomial algorithms. Trials show that the approach can be applied on-line for relatively large Event Graphs and calculation horizons. It can offset unfavorable initial situations while the specifications are met.

References

1. Baccelli, F., Cohen, G., Olsder, G.J., Quadrat, J.P.: Synchronization and Linearity: An Algebra for Discrete Event Systems. Wiley, New York (1992), http://maxplus.org
2. Braker, J.G.: Algorithms and Applications in Timed Discrete Event Systems. PhD thesis, Department of Technical Mathematics and Informatics, Delft University of Technology, Delft, the Netherlands (1993)
3. Cofer, D.D., Garg, V.K.: A max-algebra solution to the supervisory control problem for real-time discrete event systems. In: 11th International Conference on Analysis and Optimization of Systems: Discrete Event Systems, Sophia-Antipolis (June 15-17, 1994)
4. Declerck, P.: Control synthesis using the state equations and the "ARMA" model in Timed Event Graphs. In: 5th IEEE Mediterranean Conference on Control and Systems, CD-ROM, invited session, Paphos, Cyprus (July 1997)
5. Declerck, P., Didi Alaoui, M.K.: Extremal trajectories in P-time Event Graphs: application to control synthesis with specifications. In: Proc. 44th IEEE Conference on Decision and Control and European Control Conference ECC, CDC-ECC 2005, Seville, Spain, pp. 7621–7626 (2005), http://www.istia.univ-angers.fr/~declerck/
6. Guezzi, A., Declerck, P., Boimond, J.-L.: From monotone inequalities to Model Predictive Control. In: ETFA 2008, Hamburg, Germany (September 15-18, 2008), http://www.istia.univ-angers.fr/~declerck/
7. Hautus, M.L.J.: Controlled invariance in systems over rings. In: Feedback Control of Linear and Nonlinear Systems, vol. 39, pp. 107–122. Springer, Berlin (1982)
8. Heidergott, B., Olsder, G.J., van der Woude, J.: Max Plus at Work. Princeton University Press, Princeton (2006)
9. Houssin, L., Lahaye, S., Boimond, J.-L.: Just in Time Control of constrained (max,+)-Linear Systems. Discrete Event Dynamic Systems 17(2), 159–178 (2007)
10. Katz, R.D.: Max-Plus (A,B)-Invariant Spaces and Control of Timed Discrete-Event Systems. IEEE Transactions on Automatic Control 52(2), 229–241 (2007)
11. Menguy, E., Boimond, J.-L., Hardouin, L.: Optimal control of discrete event systems in case of updated reference input. In: Proceedings of the IFAC conference on system structure and control, Nantes, France, July 1998, pp. 601–607 (1998)
12. Mc Millan, K., Dill, D.: Algorithms for interface timing verification. In: Proceedings of the IEEE, International Conference on Computer Design: VLSI in Computers and Processors (1992)
13. Necoara, I.: Model Predictive Control for Max-Plus-Linear and Piecewise Affine Systems. PhD Thesis, Delft Center for Systems and Control, Delft University of Technology, The Netherlands (October 2006)
14. Olsder, G.J., Subiono, S., Mc Gettrik, M.: On time tables and allocation of trains. Wodes 1998, Cagliari, Italy (1998)
15. De Schutter, B., van den Boom, T.: Model predictive control for max-plus-linear discrete event systems. Automatica 37(7), 1049–1056 (2001)

Tropical Scaling of Polynomial Matrices

Stéphane Gaubert and Meisam Sharify

Abstract. The eigenvalues of a matrix polynomial can be determined classically by solving a generalized eigenproblem for a linearized matrix pencil, for instance by writing the matrix polynomial in companion form. We introduce a general scaling technique, based on tropical algebra, which applies in particular to this companion form. This scaling, which is inspired by an earlier work of Akian, Bapat, and Gaubert, relies on the computation of "tropical roots". We give explicit bounds, in a typical case, indicating that these roots provide accurate estimates of the order of magnitude of the different eigenvalues, and we show by experiments that this scaling improves the accuracy (measured by normwise backward error) of the computations, particularly in situations in which the data have various orders of magnitude. In the case of quadratic polynomial matrices, we recover in this way a scaling due to Fan, Lin, and Van Dooren, which coincides with the tropical scaling when the two tropical roots are equal. If not, the eigenvalues generally split into two groups, and the tropical method leads to making one specific scaling for each of the groups.

1 Introduction

A classical problem is to compute the eigenvalues of a matrix polynomial

$$P(\lambda) = A_0 + A_1 \lambda + \cdots + A_d \lambda^d,$$

where $A_l \in \mathbb{C}^{n \times n}$, $l = 0, \ldots, d$, are given. The eigenvalues are defined as the solutions of $\det(P(\lambda)) = 0$. If $\lambda$ is an eigenvalue, the associated right and left eigenvectors $x \in \mathbb{C}^n$ and $y \in \mathbb{C}^n$ are the non-zero solutions of the systems $P(\lambda) x = 0$ and $y^* P(\lambda) = 0$, respectively. A common way to solve this problem is to convert $P$ into a "linearized" matrix pencil

$$L(\lambda) = \lambda X + Y, \qquad X, Y \in \mathbb{C}^{nd \times nd},$$

with the same spectrum as $P$ and solve the eigenproblem for $L$ by standard numerical algorithms like the QZ method [16]. If $D$ and $D'$ are invertible diagonal matrices, and if $\alpha$ is a non-zero scalar, we may consider equivalently the scaled pencil $D L(\alpha\lambda) D'$. The problem of finding good linearizations and good scalings has received considerable attention. The backward error and conditioning of the matrix pencil problem and of its linearizations have been investigated in particular in works of Tisseur, Li, Higham, and Mackey, see [11, 12, 17].

Stéphane Gaubert and Meisam Sharify INRIA Saclay – Île-de-France & Centre de Mathématiques appliquées, Ecole Polytechnique, 91128 Palaiseau, France, e-mail: [email protected],[email protected]

R. Bru and S. Romero-Vivó (Eds.): Positive Systems, LNCIS 389, pp. 291–303. springerlink.com © Springer-Verlag Berlin Heidelberg 2009

A scaling on the eigenvalue parameter to improve the normwise backward error of a quadratic polynomial matrix was proposed by Fan, Lin, and Van Dooren [8]. This scaling only relies on the norms $\gamma_l := \|A_l\|$, $l = 0, 1, 2$. In this paper, we introduce a new family of scalings which also rely on these norms. The degree $d$ is now arbitrary.

These scalings originate from the work of Akian, Bapat, and Gaubert [1, 2], in which the entries of the matrices $A_l$ are functions, for instance Puiseux series, of a (perturbation) parameter $t$. The valuations (leading exponents) of the Puiseux series representing the different eigenvalues were shown to coincide, under some genericity conditions, with the points of non-differentiability of the value function of a parametric optimal assignment problem (the tropical eigenvalues), a result which can be interpreted in terms of amoebas [13]. Indeed, the definition of the tropical eigenvalues in [1, 2] makes sense in any field with valuation. In particular, when the coefficients belong to $\mathbb{C}$, we can take the map $z \mapsto \log|z|$ from $\mathbb{C}$ to $\mathbb{R} \cup \{-\infty\}$ as the valuation. Then, the tropical eigenvalues are expected to give, again under some non-degeneracy conditions, the correct order of magnitude of the different eigenvalues.

The tropical roots used in the present paper are an approximation of the tropical eigenvalues, relying only on the norms $\gamma_l = \|A_l\|$. A better scaling may be achieved by considering the tropical eigenvalues, but computing these eigenvalues requires $O(nd)$ calls to an optimal assignment algorithm, whereas the tropical roots considered here can be computed in $O(d)$ time, see Remark 3 below for more information. We examine such extensions in a further work.

As an illustration, consider the following quadratic polynomial matrix

$$P(\lambda) = \lambda^2 \, 10^{-18} \begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix} + \lambda \begin{pmatrix} -3 & 10 \\ 16 & 45 \end{pmatrix} + 10^{-18} \begin{pmatrix} 12 & 15 \\ 34 & 28 \end{pmatrix}$$

By applying the QZ algorithm on the first companion form of $P(\lambda)$ we get the eigenvalues -Inf, -7.731e-19, Inf, 3.588e-19; by using the scaling proposed in [8] we get -Inf, -3.250e-19, Inf, 3.588e-19. However, by using the tropical scaling we can find the four eigenvalues properly: -7.250e-18 ± 9.744e-18i, -2.102e+17 ± 7.387e+17i. The result was shown to be correct (actually, up to a 14-digit precision) with PARI, in which an arbitrarily large precision can be set. The above computations were performed in Matlab (version 7.3.0).

The paper is organized as follows. In Section 2, we recall some classical facts of max-plus or tropical algebra, and show that the tropical roots of a tropical polynomial can be computed in linear time, using a convex hull algorithm. Section 3 states preliminary results concerning matrix pencils, linearization and normwise backward error. In Section 4, we describe our scaling method. In Section 5, we give a theorem locating the eigenvalues of a quadratic polynomial matrix, which provides some theoretical justification of the method. Finally, in Section 6, we present the experimental results showing that the tropical scaling can highly reduce the normwise backward error of an eigenpair. We consider the quadratic case in Section 6.1 and the general case in Section 6.2. For the quadratic case, we compare our results with the scaling proposed in [8].

2 Tropical Polynomials

The max-plus semiring $\mathbb{R}_{\max}$ is the set $\mathbb{R} \cup \{-\infty\}$, equipped with max as addition, and the usual addition as multiplication. It is traditional to use the notation $\oplus$ for max (so $2 \oplus 3 = 3$), and $\otimes$ for $+$ (so $1 \otimes 1 = 2$). We denote by $\mathbb{0}$ the zero element of the semiring, which is such that $\mathbb{0} \oplus a = a$, here $\mathbb{0} = -\infty$, and by $\mathbb{1}$ the unit element of the semiring, which is such that $\mathbb{1} \otimes a = a \otimes \mathbb{1} = a$, here $\mathbb{1} = 0$. We refer the reader to [3, 4, 14] for more background.

A variant of this semiring is the max-times semiring $\mathbb{R}_{\max,\times}$, which is the set of nonnegative real numbers $\mathbb{R}_+$, equipped with max as addition, and $\times$ as multiplication. This semiring is isomorphic to $\mathbb{R}_{\max}$ by the map $x \mapsto \log x$. So, every notion defined over $\mathbb{R}_{\max}$ has an $\mathbb{R}_{\max,\times}$ analogue that we shall not redefine explicitly. In the sequel, the word "tropical" will refer indifferently to any of these algebraic structures.

Consider a max-plus (formal) polynomial of degree $n$ in one variable, i.e., a formal expression $P = \bigoplus_{0 \le k \le n} P_k X^k$ in which the coefficients $P_k$ belong to $\mathbb{R}_{\max}$, and the associated numerical polynomial, which, with the notation of the classical algebra, can be written as $p(x) = \max_{0 \le k \le n} (P_k + kx)$. Cuninghame-Green and Meijer showed [7] that the analogue of the fundamental theorem of algebra holds in the max-plus setting, i.e., that $p(x)$ can be written uniquely as

$$p(x) = P_n + \sum_{1 \le k \le n} \max(x, c_k) ,$$

where $c_1, \dots, c_n \in \mathbb{R}_{\max}$ are the roots, i.e., the points at which the maximum is attained at least twice. This is a special case of more general notions which have arisen recently in tropical geometry [13]. The multiplicity of the root $c$ is the cardinality of the set $\{k \in \{1, \dots, n\} \mid c_k = c\}$.

Define the Newton polygon $\Delta(P)$ of $P$ to be the upper boundary of the convex hull of the set of points $(k, P_k)$, $k = 0, \dots, n$. This boundary consists of a number of linear segments. An application of Legendre-Fenchel duality (see [2, Proposition 2.10]) shows that the opposites of the slopes of these segments are precisely the tropical roots, and that the multiplicity of a root coincides with the horizontal width of the corresponding segment. (Actually, min-plus polynomials are considered in [2], but the max-plus case reduces to the min-plus case by an obvious change of variable.) Since the Graham scan algorithm [10] allows us to compute the convex hull of a finite set of points by making $O(n)$ arithmetical operations and comparisons, provided that the given set of points is already sorted by abscissa, we get the following result.

Proposition 1. The roots of a max-plus polynomial in one variable can be computed in linear time.

The case of a max-times polynomial reduces to the max-plus case by replacing every coefficient by its logarithm. The exponentials of the roots of the transformed polynomial are the roots of the original polynomial.
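The construction behind Proposition 1 can be sketched in a few lines of Python. This is only an illustrative sketch, not code from the paper: the function names `maxplus_roots` and `maxtimes_roots` are hypothetical, and we assume that at least one coefficient is finite. The scan builds the upper boundary of the convex hull of the points $(k, P_k)$ in one left-to-right pass and reads the roots off as the opposites of the slopes.

```python
import math

def maxplus_roots(coeffs):
    """Roots, with multiplicity, of the max-plus polynomial max_k (coeffs[k] + k*x).

    coeffs[k] plays the role of P_k; use -math.inf for an absent monomial.
    Assumes at least one finite coefficient. Roots are returned in
    nondecreasing order.
    """
    pts = [(k, c) for k, c in enumerate(coeffs) if c != -math.inf]
    hull = []  # vertices of the upper boundary of the convex hull
    for p in pts:  # the points are already sorted by abscissa
        while len(hull) >= 2:
            (x1, y1), (x2, y2) = hull[-2], hull[-1]
            # discard hull[-1] when it lies on or below the chord hull[-2] -> p
            if (y2 - y1) * (p[0] - x1) <= (p[1] - y1) * (x2 - x1):
                hull.pop()
            else:
                break
        hull.append(p)
    # a lowest monomial of degree m > 0 contributes the root -inf, m times
    roots = [-math.inf] * hull[0][0]
    for (x1, y1), (x2, y2) in zip(hull, hull[1:]):
        slope = (y2 - y1) / (x2 - x1)
        roots.extend([-slope] * (x2 - x1))  # multiplicity = horizontal width
    return roots

def maxtimes_roots(gammas):
    """Roots of the max-times polynomial max_k gammas[k] * x**k (gammas[k] >= 0)."""
    logs = [math.log(g) if g > 0 else -math.inf for g in gammas]
    return [math.exp(c) for c in maxplus_roots(logs)]
```

For instance, `maxplus_roots([0, 1, 0])` corresponds to $p(x) = \max(0, 1 + x, 2x)$, whose maximum is attained twice at $x = -1$ and at $x = 1$.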

3 Matrix Pencil and Normwise Backward Error

Let us come back to the eigenvalue problem for the matrix pencil $P(\lambda) = A_0 + A_1\lambda + \cdots + A_d\lambda^d$. There are many ways to construct a "linearized" matrix pencil $L(\lambda) = \lambda X + Y$, $X, Y \in \mathbb{C}^{nd \times nd}$, with the same spectrum as $P(\lambda)$; see [15] for a general discussion. In particular, the first companion form $\lambda X_1 + Y_1$ is defined by

$$X_1 = \operatorname{diag}(A_d, I_{(d-1)n}), \qquad Y_1 = \begin{pmatrix} A_{d-1} & A_{d-2} & \cdots & A_0 \\ -I_n & 0 & \cdots & 0 \\ \vdots & \ddots & \ddots & \vdots \\ 0 & \cdots & -I_n & 0 \end{pmatrix} .$$

In the experimental part of this work, we use this linearization.

To estimate the accuracy of a numerical algorithm computing an eigenpair, we shall consider, as in [17], the normwise backward error. The latter arises when considering a perturbation

$$\Delta P = \Delta A_0 + \Delta A_1 \lambda + \cdots + \Delta A_d \lambda^d .$$

The backward error of an approximate eigenpair $(\tilde{x}, \tilde{\lambda})$ of $P$ is defined by

$$\eta(\tilde{x}, \tilde{\lambda}) = \min\{\varepsilon : (P(\tilde{\lambda}) + \Delta P(\tilde{\lambda}))\tilde{x} = 0,\ \|\Delta A_l\|_2 \le \varepsilon \|E_l\|_2,\ l = 0, \dots, d\} ,$$

the matrices $E_l$ representing tolerances. The following computable expression for $\eta(\tilde{x}, \tilde{\lambda})$ is given in the same reference,

$$\eta(\tilde{x}, \tilde{\lambda}) = \frac{\|r\|_2}{\tilde{\alpha} \|\tilde{x}\|_2} ,$$

where $r = P(\tilde{\lambda})\tilde{x}$ and $\tilde{\alpha} = \sum_{l} |\tilde{\lambda}|^l \|E_l\|_2$. In the sequel, we shall take $E_l = A_l$.

Our aim is to reduce the normwise backward error, by a scaling of the eigenvalue $\lambda = \alpha\mu$, where $\alpha$ is the scaling parameter. This kind of scaling for quadratic polynomial matrices was proposed by Fan, Lin and Van Dooren [8]. We next introduce a new scaling, based on the tropical roots.
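The computable expression for the backward error translates directly into code. The following is a minimal sketch, assuming NumPy is available and taking $E_l = A_l$ as in the text; the function name `backward_error` and the convention that `coeffs` lists $A_0, \dots, A_d$ in increasing degree are our own choices for illustration.

```python
import numpy as np

def backward_error(coeffs, x, lam):
    """Normwise backward error of the approximate eigenpair (x, lam).

    coeffs = [A_0, ..., A_d]; tolerances are taken as E_l = A_l.
    """
    # residual r = P(lam) x
    r = sum(lam**l * A for l, A in enumerate(coeffs)) @ x
    # alpha_tilde = sum_l |lam|^l * ||A_l||_2 (spectral norm)
    alpha = sum(abs(lam)**l * np.linalg.norm(A, 2) for l, A in enumerate(coeffs))
    return np.linalg.norm(r) / (alpha * np.linalg.norm(x))
```

An exact eigenpair gives a zero residual, hence a zero backward error; a slightly perturbed eigenvalue gives a small positive value.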

4 Construction of the Tropical Scaling

Consider the matrix pencil modified by the substitution $\lambda = \alpha\mu$,

$$\tilde{P}(\mu) = \tilde{A}_0 + \tilde{A}_1 \mu + \cdots + \tilde{A}_d \mu^d ,$$

where $\tilde{A}_i = \beta\alpha^i A_i$. The tropical scaling which we next introduce is characterized by the property that $\alpha$ and $\beta$ are such that $\tilde{P}(\mu)$ has at least two matrices $\tilde{A}_i$ with an (induced) Euclidean norm equal to one, whereas the Euclidean norms of the other matrices are all bounded by one. This scaling is inspired by the work of M. Akian, R. Bapat and S. Gaubert [1], which concerns the perturbation of the eigenvalues of a matrix pencil. The theorem on the location of the eigenvalues which is stated in the next section provides some justification for the present scaling.

We associate to the original pencil the max-times polynomial

$$tp(x) = \max(\gamma_0, \gamma_1 x, \dots, \gamma_d x^d) , \qquad \text{where } \gamma_i := \|A_i\|$$

(the symbol $t$ stands for "tropical"). Let $\alpha_1 \le \alpha_2 \le \dots \le \alpha_d$ be the tropical roots of $tp(x)$ counted with multiplicities. For each $\alpha_i$, the maximum is attained by at least two monomials. Consequently, the transformed polynomial $q(x) := \beta_i\, tp(\alpha_i x)$, with $\beta_i := (tp(\alpha_i))^{-1}$, has two coefficients of modulus one, and all the other coefficients have modulus less than or equal to one. Thus $\alpha = \alpha_i$ and $\beta = \beta_i$ will satisfy the goal.

The idea is to apply this scaling for all the tropical roots of $tp(x)$ and, each time, to compute $n$ out of $nd$ eigenvalues of the corresponding scaled matrix pencil, because replacing $P(\lambda)$ by $P(\alpha_i\mu)$ is expected to decrease the backward error for the eigenvalues of order $\alpha_i$, while possibly increasing the backward error for the other ones.

More precisely, let $\alpha_1 \le \alpha_2 \le \dots \le \alpha_d$ denote the tropical roots of $tp(x)$. Also let

$$\mu_1, \dots, \mu_n,\ \mu_{n+1}, \dots, \mu_{2n},\ \dots,\ \mu_{(d-1)n+1}, \dots, \mu_{nd}$$

be the eigenvalues of $\tilde{P}(\mu)$ sorted by increasing modulus, computed by setting $\alpha = \alpha_i$ and $\beta = tp(\alpha_i)^{-1}$, and partitioned in $d$ different groups. Now, we choose the $i$th group of $n$ eigenvalues, multiply it by $\alpha_i$ and put it in the list of computed eigenvalues. By applying this iteration for all $i = 1, \dots, d$, we will get the list of the eigenvalues of $P(\lambda)$. Taking into account this description, we arrive at Algorithm 1. It should be understood here that in the sequence $\mu_1, \dots, \mu_{nd}$ of eigenvalues above, only the eigenvalues of order $\alpha_i$ are hoped to be computed accurately. Indeed, in some extreme cases in which the tropical roots have very different orders of magnitude (as in the example shown in the introduction), the eigenvalues of order $\alpha_i$ turn out to be accurate whereas the groups of higher orders have some eigenvalues Inf or NaN. So, Algorithm 1 merges into a single picture several snapshots of the spectrum, each of them being accurate on a different part of the spectrum.

Algorithm 1 Computing the eigenvalues using the tropical scaling
INPUT: Matrix pencil P(λ)
OUTPUT: List of eigenvalues of P(λ)
1. Compute the corresponding tropical polynomial tp(x)
2. Find the tropical roots of tp(x)
3. For each tropical root αi do
   3.1 Compute the tropical scaling based on αi
   3.2 Compute the eigenvalues using the QZ algorithm and sort them by increasing modulus
   3.3 Choose the ith group of the eigenvalues
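Algorithm 1 can be sketched as follows. This is a hedged illustration rather than the authors' implementation (their experiments used Matlab and Scilab): we assume NumPy and SciPy are available, that $A_0$ and $A_d$ are nonzero, and we rely on `scipy.linalg.eig` with two matrix arguments, which solves the generalized eigenvalue problem through LAPACK's QZ-based driver. The names `tropical_roots` and `tropical_scaling_eigenvalues` are hypothetical. When a tropical root is repeated, the sketch simply recomputes the same scaled pencil, which is wasteful but keeps the code short.

```python
import math
import numpy as np
from scipy.linalg import eig

def tropical_roots(gammas):
    """Max-times roots, with multiplicity and increasing, of max_k gammas[k]*x**k.
    Assumes gammas[0] > 0 and gammas[-1] > 0."""
    pts = [(k, math.log(g)) for k, g in enumerate(gammas) if g > 0]
    hull = []  # upper boundary of the Newton polygon of the logarithms
    for p in pts:
        while len(hull) >= 2:
            (x1, y1), (x2, y2) = hull[-2], hull[-1]
            if (y2 - y1) * (p[0] - x1) <= (p[1] - y1) * (x2 - x1):
                hull.pop()
            else:
                break
        hull.append(p)
    roots = []
    for (x1, y1), (x2, y2) in zip(hull, hull[1:]):
        roots += [math.exp(-(y2 - y1) / (x2 - x1))] * (x2 - x1)
    return roots

def tropical_scaling_eigenvalues(coeffs):
    """coeffs = [A_0, ..., A_d], square NumPy arrays of the same size n."""
    d, n = len(coeffs) - 1, coeffs[0].shape[0]
    gammas = [np.linalg.norm(A, 2) for A in coeffs]          # step 1
    alphas = tropical_roots(gammas)                          # step 2
    result = []
    for i, alpha in enumerate(alphas):                       # step 3
        # step 3.1: beta = 1 / tp(alpha), scaled coefficients beta*alpha^k*A_k
        beta = 1.0 / max(g * alpha**k for k, g in enumerate(gammas))
        scaled = [beta * alpha**k * A for k, A in enumerate(coeffs)]
        # first companion form mu*X + Y of the scaled pencil
        X = np.eye(n * d, dtype=complex)
        X[:n, :n] = scaled[d]
        Y = np.zeros((n * d, n * d), dtype=complex)
        for j in range(d):
            Y[:n, j * n:(j + 1) * n] = scaled[d - 1 - j]
        Y[n:, :-n] = -np.eye(n * (d - 1))
        # step 3.2: (mu*X + Y) v = 0  is the same as  (-Y) v = mu X v
        mu = eig(-Y, X, right=False)
        mu = mu[np.argsort(np.abs(mu))]
        # step 3.3: keep the i-th group and undo the substitution lambda = alpha*mu
        result.extend(alpha * mu[i * n:(i + 1) * n])
    return np.array(result)
```

On the scalar example $P(\lambda) = \lambda^2 - 3\lambda + 2$, the tropical roots are $2/3$ and $3$, and the two passes recover the eigenvalues $1$ and $2$ respectively.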

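To illustrate the algorithm, let $P(\lambda) = A_0 + A_1\lambda + A_2\lambda^2$ be a quadratic polynomial matrix and let $tp(\lambda) = \max(\gamma_0, \gamma_1\lambda, \gamma_2\lambda^2)$ be the tropical polynomial corresponding to this quadratic polynomial matrix. We refer to the tropical roots of $tp(x)$ by $\alpha^+ \ge \alpha^-$. If $\alpha^+ = \alpha^-$, which happens when $\gamma_1^2 \le \gamma_0\gamma_2$, then $\alpha = \sqrt{\gamma_0/\gamma_2}$ and $\beta = tp(\alpha)^{-1} = \gamma_0^{-1}$. This case coincides with the scaling of [8], in which $\alpha^* = \sqrt{\gamma_0/\gamma_2}$.

When $\alpha^+ \neq \alpha^-$, we will have two different scalings based on $\alpha^+ = \gamma_1/\gamma_2$, $\alpha^- = \gamma_0/\gamma_1$, and two different $\beta$ corresponding to the two tropical roots:

$$\beta^+ = tp(\alpha^+)^{-1} = \frac{\gamma_2}{\gamma_1^2} , \qquad \beta^- = tp(\alpha^-)^{-1} = \frac{1}{\gamma_0} .$$

To compute the eigenvalues of $P(\lambda)$ by using the first companion form linearization, we apply the scaling based on $\alpha^+$, which yields

$$\lambda \begin{pmatrix} \frac{1}{\gamma_2} A_2 & \\ & I \end{pmatrix} + \begin{pmatrix} \frac{1}{\gamma_1} A_1 & \frac{\gamma_2}{\gamma_1^2} A_0 \\ -I & 0 \end{pmatrix} ,$$

to compute the $n$ biggest eigenvalues. We apply the scaling based on $\alpha^-$, which yields

$$\lambda \begin{pmatrix} \frac{\gamma_0}{\gamma_1^2} A_2 & \\ & I \end{pmatrix} + \begin{pmatrix} \frac{1}{\gamma_1} A_1 & \frac{1}{\gamma_0} A_0 \\ -I & 0 \end{pmatrix} ,$$

to compute the $n$ smallest eigenvalues.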
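As a quick sanity check of the quadratic case, the closed-form scaling parameters can be evaluated and the defining property (at least two scaled norms equal to one, the others bounded by one) verified numerically. The sketch below is illustrative only; the function names are hypothetical and the inputs are the three norms $\gamma_0, \gamma_1, \gamma_2$, assumed positive.

```python
import math

def quadratic_tropical_scalings(g0, g1, g2):
    """Return the list of (alpha, beta) pairs for tp(x) = max(g0, g1*x, g2*x**2),
    smallest tropical root first. Assumes g0, g1, g2 > 0."""
    if g1 * g1 <= g0 * g2:
        # double tropical root: coincides with the scaling of [8]
        return [(math.sqrt(g0 / g2), 1.0 / g0)]
    return [
        (g0 / g1, 1.0 / g0),          # (alpha-, beta-)
        (g1 / g2, g2 / (g1 * g1)),    # (alpha+, beta+)
    ]

def scaled_norms(g0, g1, g2, alpha, beta):
    """Norms of the scaled coefficients beta * alpha**i * A_i, i = 0, 1, 2."""
    return [beta * g0, beta * alpha * g1, beta * alpha**2 * g2]
```

With $(\gamma_0, \gamma_1, \gamma_2) = (2, 3, 1)$, the scaling based on $\alpha^- = 2/3$ gives scaled norms $(1, 1, 2/9)$ and the scaling based on $\alpha^+ = 3$ gives $(2/9, 1, 1)$.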

In general, let $\alpha_1 \le \alpha_2 \le \dots \le \alpha_d$ be the tropical roots of $tp(x)$ counted with multiplicities. To compute the $i$th group of eigenvalues (in increasing order of modulus), we perform the scaling for $\alpha_i$, which yields the following linearization:

$$\lambda \begin{pmatrix} \beta\alpha_i^d A_d & & & \\ & I & & \\ & & \ddots & \\ & & & I \end{pmatrix} + \begin{pmatrix} \beta\alpha_i^{d-1} A_{d-1} & \cdots & \beta\alpha_i A_1 & \beta A_0 \\ -I & 0 & \cdots & 0 \\ 0 & -I & \ddots & \vdots \\ \vdots & \ddots & \ddots & 0 \\ 0 & \cdots & -I & 0 \end{pmatrix} ,$$

where $\beta = tp(\alpha_i)^{-1}$. Doing the same for all the distinct tropical roots, we can compute all the eigenvalues.

Remark 1. The interest of Algorithm 1 lies in its accuracy (since it allows us to solve instances in which the data have various orders of magnitude). Its drawback is that it calls the QZ algorithm several times (once for each distinct tropical root, and so, at most d times). However, we may partition the different tropical roots into groups, each consisting of roots of the same order of magnitude, and then the speed factor we would lose would be reduced to the number of different groups.

5 Splitting of the Eigenvalues in Tropical Groups

In this section we state a simple theorem concerning the location of the eigenvalues of a quadratic polynomial matrix, showing that under a nondegeneracy condition, the two tropical roots do provide the correct estimate of the modulus of the eigenvalues. We shall need to compare spectra, which may be thought of as unordered sets; therefore, we define the following metric (eigenvalue variation), which appeared in [9]. We shall use the notation spec for the spectrum of a matrix or a pencil.

Definition 1. Let $\lambda_1, \dots, \lambda_n$ and $\mu_1, \dots, \mu_n$ denote two sequences of complex numbers. The variation between $\lambda$ and $\mu$ is defined by

$$v(\lambda, \mu) := \min_{\pi \in S_n} \Big\{ \max_i |\mu_{\pi(i)} - \lambda_i| \Big\} ,$$

where $S_n$ is the set of permutations of $\{1, 2, \dots, n\}$. If $A, B \in \mathbb{C}^{n \times n}$, the eigenvalue variation of $A$ and $B$ is defined by $v(A, B) := v(\operatorname{spec} A, \operatorname{spec} B)$.

Recall that the quantity $v(\lambda, \mu)$ can be computed in polynomial time by solving a bottleneck assignment problem. We shall need the following theorem of Bhatia, Elsner, and Krause [5].

Theorem 1. [5] Let $A, B \in \mathbb{C}^{n \times n}$. Then $v(A, B) \le 4 \times 2^{-1/n} (\|A\| + \|B\|)^{1 - 1/n} \|A - B\|^{1/n}$.
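For small $n$, the variation of Definition 1 can be evaluated directly from the definition. The sketch below is a brute-force illustration that enumerates all $n!$ permutations; it is not the polynomial-time bottleneck assignment method mentioned in the text, and the function name `variation` is our own.

```python
from itertools import permutations

def variation(lam, mu):
    """v(lam, mu) = min over permutations pi of max_i |mu[pi(i)] - lam[i]|.

    Brute force over all permutations, so only suitable for short sequences.
    Both sequences must have the same length n >= 1.
    """
    if len(lam) != len(mu):
        raise ValueError("sequences must have the same length")
    n = len(lam)
    return min(
        max(abs(mu[p[i]] - lam[i]) for i in range(n))
        for p in permutations(range(n))
    )
```

For example, matching $(1, 2, 3)$ against $(3.1, 1.2, 2.0)$ pairs $1$ with $1.2$, $2$ with $2.0$ and $3$ with $3.1$, so the variation is about $0.2$; since the sequences are treated as unordered, reordering either argument leaves the value unchanged.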

The following result shows that when the parameter δ measuring the separation between the two tropical roots is sufficiently large, and when the matrices A2, A1 are well conditioned, then there are precisely n eigenvalues of the order of the maximal tropical root. By applying the same result to the reciprocal pencil, we deduce, under the same separation condition, that when A1, A0 are well conditioned, there are precisely n eigenvalues of the order of the minimal tropical root. So, under such conditions, the tropical roots provide accurate a priori estimates of the order of the eigenvalues of the pencil.

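Theorem 2 (Tropical splitting of eigenvalues). Let $P(\lambda) = \lambda^2 A_2 + \lambda A_1 + A_0$, where $A_i \in \mathbb{C}^{n \times n}$, and $\gamma_i := \|A_i\|$, $i = 0, 1, 2$. Assume that the max-times polynomial $p(\lambda) = \max(\lambda^2\gamma_2, \lambda\gamma_1, \gamma_0)$ has two distinct tropical roots, $\alpha^+ := \gamma_1/\gamma_2$ and $\alpha^- := \gamma_0/\gamma_1$, and let $\delta := \alpha^+/\alpha^-$. Assume that $A_2$ is invertible. Let $\xi_1, \dots, \xi_n$ denote the eigenvalues of the pencil $\lambda A_2 + A_1$, and let us set $\xi_{n+1} = \cdots = \xi_{2n} = 0$. Then,

$$v(\operatorname{spec} P, \xi) \le \frac{C\alpha^+}{\delta^{1/2n}} ,$$

where

$$C := 4 \times 2^{-1/2n} \Big( 2 + 2\operatorname{cond} A_2 + \frac{\operatorname{cond} A_2}{\delta} \Big)^{1 - 1/2n} (\operatorname{cond} A_2)^{1/2n} ,$$

and

$$\alpha^+ (\operatorname{cond} A_1)^{-1} \le |\xi_i| \le \alpha^+ \operatorname{cond} A_2 , \qquad 1 \le i \le n . \tag{1}$$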

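Proof. Let us make the scaling corresponding to the maximal tropical root $\alpha^+ = \gamma_1/\gamma_2$, with $\beta^+ = \gamma_2/\gamma_1^2$, which amounts to considering the new polynomial matrix

$$Q(\mu) = \beta^+ P(\alpha^+\mu) = \bar{A}_2\mu^2 + \bar{A}_1\mu + \bar{A}_0 ,$$

where

$$\bar{A}_2 = \gamma_2^{-1} A_2 , \qquad \bar{A}_1 = \gamma_1^{-1} A_1 , \qquad \bar{A}_0 = \frac{\gamma_2}{\gamma_1^2} A_0 .$$

Since $A_2$ is invertible, $\lambda$ is an eigenvalue of the pencil $P$ if and only if $\lambda = \alpha^+\mu$, where $\mu$ is an eigenvalue of the matrix

$$X = \begin{pmatrix} -\bar{A}_2^{-1}\bar{A}_1 & -\bar{A}_2^{-1}\bar{A}_0 \\ I & 0 \end{pmatrix} .$$

Let $\mu_i$, $i = 1, \dots, 2n$, denote the eigenvalues of this matrix. Consider

$$Y = \begin{pmatrix} -\bar{A}_2^{-1}\bar{A}_1 & 0 \\ I & 0 \end{pmatrix} .$$

Observe that $\|\bar{A}_1\| = 1$ and $\|\bar{A}_0\| = \gamma_2\gamma_0/\gamma_1^2 = 1/\delta$. Since the induced Euclidean norm $\|\cdot\|$ is an algebra norm, we get

$$\|X\| \le \|I\| + \|\bar{A}_2^{-1}\bar{A}_1\| + \|\bar{A}_2^{-1}\bar{A}_0\| \le 1 + \|A_2^{-1}\|\|A_2\| + \|A_2^{-1}\|\|A_2\|\|\bar{A}_0\| = 1 + \operatorname{cond} A_2 (1 + 1/\delta) .$$

Moreover, $\|Y\| \le 1 + \operatorname{cond} A_2$ and $\|X - Y\| \le (\operatorname{cond} A_2)/\delta$. Using Theorem 1, we deduce that

$$v(\operatorname{spec} X, \operatorname{spec} Y) \le C/\delta^{1/2n} .$$

Since the family of eigenvalues of $P$ coincides with $\alpha^+(\operatorname{spec} X)$, and since the family of numbers $\xi_i$ coincides with $\alpha^+(\operatorname{spec} Y)$, the first part of the result is proved.

If $\xi$ is an eigenvalue of $A_2\lambda + A_1$, then we can write $\xi = \alpha^+\zeta$, where $\zeta$ is an eigenvalue of $\bar{A}_2\mu + \bar{A}_1$. We deduce that $|\zeta| \le \|\bar{A}_2^{-1}\|\|\bar{A}_1\| = \operatorname{cond} A_2$, which establishes the second inequality in (1). The first inequality is established along the same lines, by considering the reciprocal pencil of $\bar{A}_2\mu + \bar{A}_1$. ∎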

Remark 2. Theorem 2 is a typical, but special, instance of a general class of results that we discuss in a further work. In particular, this theorem can be extended to matrix polynomials of an arbitrary degree, with a different proof technique. Indeed, the idea of the proof above works only for the two "extreme" groups of eigenvalues, whereas in the degree d case, the eigenvalues are split in d groups (still under nondegeneracy conditions). Note also that the exponent in $\delta^{1/2n}$ is suboptimal.

Remark 3. In [1, 2], the tropical eigenvalues are defined as follows. The permanent of an $n \times n$ matrix $B = (b_{ij})$ with entries in $\mathbb{R}_{\max}$ is defined by

$$\operatorname{per} B := \max_{\sigma \in S_n} \sum_{1 \le i \le n} b_{i\sigma(i)} .$$

This is nothing but the value of the optimal assignment problem with weights $(b_{ij})$. The characteristic polynomial of a matrix $C = (c_{ij})$ is defined as the map from $\mathbb{R}_{\max}$ to itself,

$$x \mapsto P_C(x) := \operatorname{per}(C \oplus xI) ,$$

where $I$ is the max-plus identity matrix, with diagonal entries equal to $0$ and off-diagonal entries equal to $-\infty$. The sum $C \oplus xI$ is interpreted in the max-plus sense, so

$$(C \oplus xI)_{ij} = \begin{cases} c_{ij} & \text{if } i \neq j , \\ \max(c_{ii}, x) & \text{if } i = j . \end{cases}$$

The tropical eigenvalues are defined as the roots of the characteristic polynomial. The previous definition has an obvious generalization to the case of tropical matrix polynomials: if $C_0, \dots, C_d$ are $n \times n$ matrices with entries in $\mathbb{R}_{\max}$, the eigenvalues of the matrix polynomial $C(x) := C_0 \oplus C_1 x \oplus \cdots \oplus C_d x^d$ are defined as the roots of the polynomial function $x \mapsto \operatorname{per}(C(x))$. The roots of this function can be computed in polynomial time by $O(nd)$ calls to an optimal assignment solver (the case in which $C(x) = C_0 \oplus xI$ was solved by Burkard and Butkovič [6]; the generalization to the degree $d$ case was pointed out in [1]).

When the matrices $A_0, \dots, A_d$ are scalars, the logarithms of the tropical roots considered in the present paper are readily seen to coincide with the tropical eigenvalues of the pencil in which $C_k$ is the logarithm of the modulus of $A_k$, for $0 \le k \le d$. When these matrices are not scalars, in view of the asymptotic results of [1], the exponentials of the tropical eigenvalues are expected to provide more accurate estimates of the moduli of the complex roots. This alternative approach is the object of a further work. However, the comparative interest of the tropical roots considered here lies in their simplicity: they only depend on the norms of $A_0, \dots, A_d$, and can be computed in linear time from these norms. They can also be used as a measure of ill-posedness of the problem (when the tropical roots have different orders of magnitude, the standard methods in general fail).
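To make the definitions of Remark 3 concrete, the max-plus permanent and the characteristic polynomial function can be evaluated by brute force on small matrices. This sketch enumerates all permutations instead of calling an optimal assignment solver, so it is only meant for illustration; the function names are hypothetical, and $-\infty$ entries are represented by `-math.inf`.

```python
import math
from itertools import permutations

def maxplus_permanent(B):
    """per B = max over permutations sigma of sum_i B[i][sigma(i)]."""
    n = len(B)
    return max(
        sum(B[i][s[i]] for i in range(n))
        for s in permutations(range(n))
    )

def maxplus_char_poly(C, x):
    """Value at x of the characteristic polynomial per(C (+) x I)."""
    n = len(C)
    shifted = [
        [max(C[i][j], x) if i == j else C[i][j] for j in range(n)]
        for i in range(n)
    ]
    return maxplus_permanent(shifted)
```

For $C = \begin{pmatrix} 1 & 4 \\ 2 & 0 \end{pmatrix}$, the characteristic polynomial is $\max(\max(1, x) + \max(0, x),\, 6)$; the two terms tie at $x = 3$, which is therefore a tropical eigenvalue.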

6 Experimental Results

6.1 Quadratic Polynomial Matrices

Consider first $P(\lambda) = A_0 + A_1\lambda + A_2\lambda^2$ and its linearization $L = \lambda X + Y$. Let $z$ be the eigenvector computed by applying the QZ algorithm to this linearization. Both $\zeta_1 = z(1:n)$ and $\zeta_2 = z(n+1:2n)$ are eigenvectors of $P(\lambda)$. We present our results for both of these eigenvectors; $\eta_s$ denotes the normwise backward error for the scaling of [8], and $\eta_t$ denotes the same quantity for the tropical scaling.

Our first example coincides with Example 3 of [8], where $\|A_2\|_2 \approx 5.54 \times 10^{-5}$, $\|A_1\|_2 \approx 4.73 \times 10^{3}$, $\|A_0\|_2 \approx 6.01 \times 10^{-3}$ and $A_i \in \mathbb{C}^{10 \times 10}$. We used 100 randomly generated pencils normalized to get the mentioned norms, and we computed the average of the quantities mentioned in the following table for these pencils. Here we present the results for the 5 smallest eigenvalues; however, for all the eigenvalues, the backward error computed by using the tropical scaling is of order $10^{-16}$, which is the precision of the computation. The computations were carried out in SCILAB 4.1.2.

|λ|        η(ζ1,λ)    η(ζ2,λ)    ηs(ζ1,λ)   ηs(ζ2,λ)   ηt(ζ1,λ)   ηt(ζ2,λ)
2.98E-07   1.01E-06   4.13E-08   5.66E-09   5.27E-10   6.99E-16   1.90E-16
5.18E-07   1.37E-07   3.84E-08   8.48E-10   4.59E-10   2.72E-16   1.83E-16
7.38E-07   5.81E-08   2.92E-08   4.59E-10   3.91E-10   2.31E-16   1.71E-16
9.53E-07   3.79E-08   2.31E-08   3.47E-10   3.36E-10   2.08E-16   1.63E-16
1.24E-06   3.26E-08   2.64E-08   3.00E-10   3.23E-10   1.98E-16   1.74E-16

In the second example, we consider a matrix pencil with $\|A_2\|_2 \approx 10^{-6}$, $\|A_1\|_2 \approx 10^{3}$, $\|A_0\|_2 \approx 10^{5}$ and $A_i \in \mathbb{C}^{40 \times 40}$. Again, we use 100 randomly generated pencils with the mentioned norms and we compute the average of all the quantities presented in the next table. We present the results for the 5 smallest eigenvalues. This time, the computations shown are from MATLAB 7.3.0. Actually, the results are insensitive to this choice, since the versions of MATLAB and SCILAB we used both rely on the QZ algorithm of the LAPACK library (version 3.0).

|λ|        η(ζ1,λ)    η(ζ2,λ)    ηs(ζ1,λ)   ηs(ζ2,λ)   ηt(ζ1,λ)   ηt(ζ2,λ)
1.08E+01   2.13E-13   4.97E-15   8.98E-12   4.19E-13   5.37E-15   3.99E-16
1.75E+01   5.20E-14   4.85E-15   7.71E-13   4.09E-13   6.76E-16   3.95E-16
2.35E+01   4.56E-14   5.25E-15   6.02E-13   4.01E-13   5.54E-16   3.66E-16
2.93E+01   4.18E-14   5.99E-15   5.03E-13   3.97E-13   4.80E-16   3.47E-16
3.33E+01   3.77E-14   5.28E-15   4.52E-13   3.84E-13   4.67E-16   3.53E-16

6.2 Polynomial Matrices of Degree d

Consider now the polynomial matrix $P(\lambda) = A_0 + A_1\lambda + \cdots + A_d\lambda^d$, and let $L = \lambda X + Y$ be the first companion form linearization of this pencil. If $z$ is an eigenvector for $L$ then $\zeta_1 = z(1:n)$ is an eigenvector for $P(\lambda)$. In the following computations, we use $\zeta_1$ to compute the normwise backward error of the matrix pencil, although it is possible to use any $z(kn+1 : n(k+1))$ for $k = 0, \dots, d-1$.

To illustrate our results, we apply the algorithm to 20 different randomly generated matrix pencils and then compute the backward error for a specific eigenvalue of these matrix pencils. The 20 values on the x-axis in Figs. 1 and 2 identify the random instance, while the y-axis shows the log10 of the backward error for a specific eigenvalue. Also, we sort the eigenvalues in decreasing order of their absolute value. We first consider randomly generated matrix pencils of degree 5, where the order of magnitude of the Euclidean norm of $A_i$ is as follows:

‖A0‖       ‖A1‖      ‖A2‖      ‖A3‖       ‖A4‖       ‖A5‖
O(10^-3)   O(10^2)   O(10^2)   O(10^-1)   O(10^-4)   O(10^5)

Fig. 1 shows the results for this case, where the dotted line shows the backward error without scaling and the solid line shows the backward error using the tropical scaling. We show the results for the minimum eigenvalue, the "central" 50th eigenvalue and the maximum one, from top to bottom. In particular, the picture at the top shows a dramatic improvement, since the smallest of the eigenvalues is not computed accurately (backward error almost of order one) without the scaling, whereas for the biggest of the eigenvalues, the scaling typically improves the backward error by a factor 10. For the central eigenvalue, the improvement we get is intermediate.

The second example concerns randomly generated matrix pencils of degree 10, where the orders of the norms of the coefficient matrices are as follows:

‖A0‖       ‖A1‖       ‖A2‖       ‖A3‖       ‖A4‖      ‖A5‖
O(10^-5)   O(10^-2)   O(10^-3)   O(10^-4)   O(10^2)   O(1)

‖A6‖       ‖A7‖       ‖A8‖       ‖A9‖       ‖A10‖
O(10^3)    O(10^-3)   O(10^4)    O(10^2)    O(10^5)

In this example, the order of the norms ranges from $10^{-5}$ to $10^{5}$, and the matrices $A_i$ have dimension 8. Figure 2 shows the results for this case, where the dotted line shows the backward error without scaling and the solid line shows the backward error using tropical scaling. Again we show the results for the minimum eigenvalue, the 40th eigenvalue and the maximum one, from top to bottom.

Fig. 1 Backward error for randomly generated matrix pencils with n = 20, d = 5.

Fig. 2 Backward error for randomly generated matrix pencils with n = 8, d = 10.

References

1. Akian, M., Bapat, R., Gaubert, S.: Perturbation of eigenvalues of matrix pencils and optimal assignment problem. C. R. Acad. Sci. Paris, Série I 339, 103–108 (2004)
2. Akian, M., Bapat, R., Gaubert, S.: Min-plus methods in eigenvalue perturbation theory and generalised Lidskii-Vishik-Ljusternik theorem (2005), arxiv:math.SP/0402090
3. Akian, M., Bapat, R., Gaubert, S.: Max-plus algebras. In: Hogben, L. (ed.) Handbook of Linear Algebra, Discrete Mathematics and Its Applications, ch. 25, vol. 39, Chapman & Hall/CRC (2006)
4. Baccelli, F., Cohen, G., Olsder, G.J., Quadrat, J.P.: Synchronization and Linearity. Wiley, Chichester (1992)
5. Bhatia, R., Elsner, L., Krause, G.: Bounds for the variation of the roots of a polynomial and the eigenvalues of a matrix. Linear Algebra Appl. 142, 195–209 (1990)
6. Burkard, R.E., Butkovič, P.: Finding all essential terms of a characteristic maxpolynomial. Discrete Appl. Math. 130(3), 367–380 (2003)
7. Cuninghame-Green, R.A., Meijer, P.F.J.: An algebra for piecewise-linear minimax problems. Discrete Appl. Math. 2(4), 267–294 (1980)
8. Fan, H.-Y., Lin, W.-W., Van Dooren, P.: Normwise scaling of second order polynomial matrices. SIAM J. Matrix Anal. Appl. 26(1), 252–256 (2004)
9. Galántai, A., Hegedűs, C.J.: Perturbation bounds for polynomials. Numer. Math. 109(1), 77–100 (2008)
10. Graham, R.L.: An efficient algorithm for determining the convex hull of a finite planar set. Inf. Proc. Lett. 1(4), 132–133 (1972)
11. Higham, N.J., Li, R.-C., Tisseur, F.: Backward error of polynomial eigenproblems solved by linearization. SIAM J. Matrix Anal. Appl. 29(4), 1218–1241 (2007)
12. Higham, N.J., Mackey, D.S., Tisseur, F.: The conditioning of linearizations of matrix polynomials. SIAM J. Matrix Anal. Appl. 28(4), 1005–1028 (2006)
13. Itenberg, I., Mikhalkin, G., Shustin, E.: Tropical algebraic geometry. Oberwolfach seminars, Birkhäuser (2007)
14. Kolokoltsov, V.N., Maslov, V.P.: Idempotent analysis and its applications. In: Mathematics and its Applications, vol. 401. Kluwer Academic Publishers Group, Dordrecht (1997)
15. Mackey, D.S., Mackey, N., Mehl, C., Mehrmann, V.: Vector spaces of linearizations for matrix polynomials. SIAM J. Matrix Anal. Appl. 28(4), 971–1004 (2006)
16. Moler, C.B., Stewart, G.W.: An algorithm for generalized matrix eigenvalue problems. SIAM J. Numer. Anal. 10, 241–256 (1973)
17. Tisseur, F.: Backward error and condition of polynomial eigenvalue problems. Linear Algebra Appl. 309(1-3), 339–361 (2000); Proceedings of the International Workshop on Accurate Solution of Eigenvalue Problems, University Park, PA (1998)

Scrutinizing Changes in the Water Demand Behavior

Manuel Herrera, Rafael Pérez-García, Joaquín Izquierdo and Idel Montalvo

Abstract. Time series novelty or anomaly detection refers to the automatic identification of novel or abnormal events embedded in normal time series points. In the case of water demand, these anomalies may originate from external influences (such as climate factors, for example) or from internal causes (bad telemetry readings, pipe bursts, etc.). This paper will focus on the development of markers of different possible types of anomalies in water demand time series. The goal is to obtain early warning methods to identify, prevent, and mitigate likely damages in the water supply network, and to improve the current prediction model through adaptive processes. Besides, these methods may be used to explain the effects of different dysfunctions of the water network elements and to identify zones especially sensitive to leakage and other problematic areas, with the aim of including them in reliability plans. In this paper, we use a classical Support Vector Machine (SVM) algorithm to discriminate between nominal and anomalous data. SVM algorithms for classification project low-dimensional training data into a higher dimensional feature space, where data separation is easier. Next, we adapt a causal learning algorithm, based on reproducing kernel Hilbert spaces (RKHS), to look for possible causes of the detected anomalies. This last algorithm and the SVM's projection are achieved by using kernel functions, which are necessarily symmetric and positive definite functions.

1 Introduction

Manuel Herrera, Rafael Pérez-García, Joaquín Izquierdo and Idel Montalvo
CMMF, Universidad Politécnica de Valencia, 46022 Valencia, Spain, e-mail: [email protected],[email protected], [email protected],[email protected]

R. Bru and S. Romero-Vivó (Eds.): Positive Systems, LNCIS 389, pp. 305–313. springerlink.com © Springer-Verlag Berlin Heidelberg 2009

The anomaly detection of water demand time series aims to correct likely data errors in measurements from telemetry systems. These systems are used by most water companies in big cities for control and operation purposes. This will allow more accurate estimations that can be used to immediately detect severe anomalies, such as service disruption. Simultaneously, it can identify more rapidly the light anomalies which can develop insidiously and progressively [7]. If no errors are found in the data, a novelty signifies the occurrence of some physical change in the water supply network or in the demand behavior caused by external influences, such as climate factors.

Changes in time series behavior may exhibit permanent or transitional effects. The causes of these are diverse, but could be divided into external and internal causes. Examples of the first class are weather or calendar factors. For the internal causes one can have wrong telemetry readings, water leakage or failure of one or more valves. There is a need to divide the problem into two phases: anomaly detection and action taking. In this way, one can obtain early warning methods to identify and mitigate likely damages in the water supply network or to improve the current prediction model through some adaptive process.

To distinguish between normal and abnormal deviations, novelties will be sought in three specific cases: when data loggers identify a disruption of service, when the discrepancies between the last observations and their prediction are significant, and when the last observations lack the expected random characteristics. Here, we consider working with sliding time windows to include all possible cases. The sliding window method is based on a window size W; only the latest W observations are used for detection. As an observation arrives, the oldest observation in the sliding window expires.
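The sliding window mechanism can be sketched with a bounded double-ended queue. This is a minimal illustration, not the authors' implementation; the generator name `sliding_windows` is hypothetical, and we assume that detection only starts once W observations are available.

```python
from collections import deque

def sliding_windows(stream, W):
    """Yield the latest W observations each time a new observation arrives."""
    window = deque(maxlen=W)
    for obs in stream:
        # once the window is full, appending drops the oldest observation
        window.append(obs)
        if len(window) == W:
            yield tuple(window)
```

For example, `list(sliding_windows([1, 2, 3, 4], 3))` produces the two windows `(1, 2, 3)` and `(2, 3, 4)`.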
An alert processing method based on Support Vector Machines will be proposed to extract trends and to highlight isolated discrepancies between observed and predicted data [14]. To look for possible causes of the detected anomalies, we propose using the recently developed causal learning algorithm based on reproducing kernel Hilbert spaces [20]. By using this methodology, the statistical dependences can always be detected by correlations after the data are mapped into an appropriate feature space. The algorithm is an improvement of the inductive causation (IC) algorithm [18], which it generalizes in several ways. The control of the consequences of novelties in the earlier stages can avoid, among other things, economic and water losses, which are of great importance from the point of view of water as a scarce resource. This paper will focus on the development of markers of the different possible types of anomalies in water demand time series, explaining their causes, and proposing a feasible integration mechanism in the prediction system. Figure 1 summarizes the process.

The paper is organized as follows. Section 2 gives a brief literature review on the detection of anomalies in water distribution systems. In Section 3 we present a methodology to detect the possible types of novelties within time series data, to develop a way of classifying them and to discuss their causes. Section 4 gives some application results and summarizes the conclusions.

Fig. 1 Scheme of the methodology proposed

2 Brief Literature Review

Detecting novel events is an important ability of any signal classification scheme. This is the main reason why several models of novelty detection, which have proved to perform well on different data, exist. It is clear that there is no single best model for novelty detection, and success depends not only on the type of method used but also on the statistical properties of the data handled. Thus, several applications have been published in the literature with the goal of detecting and classifying possible outliers or abnormal data. In recent years, Neural Network approaches [1] have been replaced by Support Vector Machine applications in this regard [10, 14]. As alternatives, other methods include the control charts, proposed by Nong [12], and the techniques based on fuzzy rough clustering, tested by Chimphlee et al. [2], to increase the detection rates and reduce false positive rates in the intrusion detection system. Herrera et al. [5] have proposed hybrid nonlinear models for interpolation in the case of problems with telemetry readings of water consumption. Izquierdo et al. [7] have presented a neuro-fuzzy approach to fault detection in water supply systems.

In addition to novelty detection, we also aim to identify its causes. Pearl [13] has shown that, under reasonable assumptions, it is possible to get hints about causal relationships from non-experimental data. Schölkopf and Smola [14] have proposed the idea of measuring dependences by reproducing kernel Hilbert spaces. Sun et al. [20] and Fukumizu et al. [3] have worked on an algorithm describing the causal learning method, which we will follow in this paper. Being able to establish the cause-effect relationships in a water supply environment in the presence of anomalies would certainly produce a better understanding of the demand behavior.

3 Methodology

To obtain abnormal data (anomaly events) in an easy way and to train the Machine Learning procedures correctly, we propose working with our real system replicated by its EPANET [15] model. This way, we can run the water demand simulations under different novelty scenarios and check the response of our methodology. The next step will be to detect the abnormal data. Then, by using a kernel-based causal algorithm, we will try to establish the causes of the observed anomalies.

3.1 EPANET Simulation

The above methodology is tested on the consumption of water simulated with EPANET. The first premise is to work with a correct demand pattern curve. We propose generating curves by using the current model for prediction. For the sake of simplicity we use a simple and novel weighted pattern-based model for water forecasting, which has been tested by the authors with very good results [6]: this method is based on the pattern of the demand, which considers its seasonal properties.

This proposal contains two components: a first part that reflects the seasonal pattern of the water demand; and a second part that corrects/adjusts this initial forecast to account for the specificities of the day for which a prediction is being obtained. Both parts use exponentially decreasing weights, which give more importance to more recent values of the water demand. Equation (1) gives the formal definition of the model.

$$\hat{y}_k = \sum_{l=1}^{L} \alpha (1-\alpha)^{l-1}\, y_{k-24l} + \beta^{\,l-1} (1-\beta)^{l-2}\, \Delta_{k-24l}, \qquad (1)$$

where k = 25, 26, ...; L is the number of items to include in the predictor; Δ_h = y_h − ŷ_h; and α and β are the exponential coefficients of the weights. These weights are independent but, seeking the stability of the model, the weight of the seasonality pattern part, α, will usually be higher than the weight of the error part, β. The model can discriminate between the different days of the week. All the characteristics of this model are reproduced as a generator of EPANET's demand pattern curve. The different anomalies are also simulated in EPANET at randomized points in time: valve failures are modeled in a straightforward way (with a programmed change of their characteristics as a function of the loss ratio) and leaks, for example, may be modeled as shown in Figure 2. From the hydraulic point of view, a leak can be simulated by a model consisting of a valve and a node with zero gauge pressure. The loss ratio of the valve is proportional to the effective section of the fault, and depends on the coefficients of contraction and velocity.

Fig. 2 Leakage simulated under EPANET

3.2 Detecting Anomalies

Support Vector Machines (SVM) provide a novel approach to the classification problem, learning to perform the classification task through a supervised learning procedure. Vapnik [21, 22] and Shawe-Taylor and Cristianini [17] are two of the essential references for SVM. These are complemented by the works of Karatzoglou [8, 9], which implement SVM and a kernel methods environment in the R language. In this work, we propose a classical SVM to discriminate between the nominal and anomalous data obtained in the last sliding time window. The basis of the SVM algorithm for classification is the projection of the low-dimensional training data into a higher dimensional feature space, since it is easier to separate the input data in this higher dimensional feature space. This projection is achieved by using kernel functions. According to Mercer's theorem [11, 19], kernel functions are necessarily symmetric and positive definite functions. The proposed workflow follows this scheme:
1. measure the distance between the predicted and observed data within the last array of length W;
2. use the SVM algorithm to classify this array;
3. if an anomaly is detected, then reclassify it as outlier, trend or service disruption.
These are the necessary steps to complete the anomaly detection phase.
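Step 1 of the scheme above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the text does not fix the distance, so the Euclidean distance is assumed here, and the function names `window_distance` and `sliding_distances` are hypothetical. The resulting values would be the inputs handed to the SVM classifier in step 2.

```python
import math

def window_distance(observed, predicted, window):
    """Euclidean distance (an assumption) over the last `window` values."""
    if window <= 0 or len(observed) < window or len(predicted) < window:
        raise ValueError("need at least `window` values in both series")
    obs = observed[-window:]
    pred = predicted[-window:]
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)))

def sliding_distances(observed, predicted, window):
    """Distance for every full window of length `window`, oldest first."""
    n = min(len(observed), len(predicted))
    return [
        window_distance(observed[:end], predicted[:end], window)
        for end in range(window, n + 1)
    ]

# Made-up hourly demand values, for illustration only.
observed = [10, 10, 10, 10]
predicted = [10, 10, 7, 6]
distances = sliding_distances(observed, predicted, window=2)
```

A growing distance over consecutive windows, as in this toy series, is the kind of pattern the classifier is expected to separate from nominal behavior.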

3.3 Kernel-Based Causal Algorithm

As stated, identifying the cause-effect relationships in a water supply environment in the presence of anomalies will produce a better understanding of the demand behavior. To achieve this, we propose applying the kernel-based causal learning algorithm (KCL) developed by Sun et al. [20]. This approach assumes that a variable Z is likely to be a common effect of X and Y if conditioning on Z increases the dependence between X and Y. Based on this assumption, the algorithm collects "votes" for hypothetical causal directions and orients the edges by the majority principle. The algorithm improves on the inductive causation (IC) algorithm [18], generalizing it in several ways: first, it handles both discrete and continuous variables; next, it does not need to assume special kinds of distributions. Let $(\mathcal{X}, \mathcal{B}_{\mathcal{X}})$ and $(\mathcal{Y}, \mathcal{B}_{\mathcal{Y}})$ be measurable spaces and let $(\mathcal{H}_{\mathcal{X}}, K_{\mathcal{X}})$ and $(\mathcal{H}_{\mathcal{Y}}, K_{\mathcal{Y}})$ be reproducing kernel Hilbert spaces of functions on $\mathcal{X}$ and $\mathcal{Y}$, with positive definite kernels $K_{\mathcal{X}}$, $K_{\mathcal{Y}}$. We consider a random vector (X, Y) on $\mathcal{X} \times \mathcal{Y}$ such that the expectations $E_X[K_{\mathcal{X}}(X, X)]$ and $E_Y[K_{\mathcal{Y}}(Y, Y)]$ are finite. We define $\Sigma_{XY}$ as the cross-covariance operator and $\Sigma_{XY|Z}$ as the conditional cross-covariance operator. We have that $\Sigma_{XY} = 0 \iff X \perp Y$. The strength of the marginal and conditional dependence can be defined by

$$\mathbb{H}_{XY} := \|\Sigma_{XY}\|_{HS}^{2}, \qquad (2)$$

$$\mathbb{H}_{XY|Z} := \beta_Z \, \|\Sigma_{XY|Z}\|_{HS}^{2}, \qquad (3)$$

with $\beta_Z := 1/\|T_Z\|_{HS}^{2}$, where $T_Z$ is defined by $\langle h_2, T_Z h_1 \rangle = E[h_1(Z)\, h_2(Z)]$ for arbitrary $h_1, h_2 \in \mathcal{H}_Z$. Gretton et al. [4] obtained consistent estimators of these dependences. The algorithm is based on the following heuristic: conditioning on a common effect tends to generate dependence between the causes. This is true when the unconditional dependences between the causes are small. Based on this, a voting-like procedure for the orientation of edges is introduced: for any triple (X, Y, Z), one gets a vote for Z being a common effect of X and Y if and only if $\mathbb{H}_{XY|Z} > \lambda\, \mathbb{H}_{XY}$, with appropriate λ > 0. By continuing with these votes we may direct most edges in the majority direction. We choose $\lambda_1$ very large in the first run and set

$$\lambda_2 := \max\left( \frac{\mathbb{H}_{ZX|Y}}{\mathbb{H}_{ZX}},\ \frac{\mathbb{H}_{ZY|X}}{\mathbb{H}_{ZY}} \right)$$

in the second run. If the result is balanced, the edge is left undirected. See [20] for further details.
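The voting rule can be illustrated with a short Python sketch. It is a simplification and not the KCL implementation of [20]: it assumes the marginal and conditional dependence estimates are already available (here as hypothetical dictionaries filled with made-up numbers) and it only tallies, for each variable, how many pairs vote for it as a common effect; it does not orient edges.

```python
from itertools import combinations

def collect_votes(variables, marginal, conditional, lam):
    """Tally votes for each variable being a common effect of a pair.

    marginal[frozenset({x, y})]          -> estimated marginal dependence of x, y
    conditional[(frozenset({x, y}), z)]  -> estimated dependence of x, y given z
    A vote for z is cast when conditional > lam * marginal.
    """
    votes = {v: 0 for v in variables}
    for z in variables:
        others = [v for v in variables if v != z]
        for x, y in combinations(others, 2):
            pair = frozenset((x, y))
            if conditional[(pair, z)] > lam * marginal[pair]:
                votes[z] += 1
    return votes

# Hypothetical dependence estimates, for illustration only.
variables = ["pressure", "leak", "demand"]
marginal = {
    frozenset(("pressure", "leak")): 0.01,
    frozenset(("pressure", "demand")): 0.30,
    frozenset(("leak", "demand")): 0.25,
}
conditional = {
    (frozenset(("pressure", "leak")), "demand"): 0.20,
    (frozenset(("pressure", "demand")), "leak"): 0.28,
    (frozenset(("leak", "demand")), "pressure"): 0.20,
}
votes = collect_votes(variables, marginal, conditional, lam=2.0)
```

In this toy setting only "demand" receives a vote: pressure and leak are almost independent marginally, but become dependent once demand is conditioned on.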

3.4 Action Taking Phase

To take suitable actions when there is evidence that anomalous data have been detected, we propose the following steps:
1. Repair the data by interpolation to continue working with the current prediction model
2. Analyze the origin of the anomaly
   a. External cause
      i. Detect the cause type
      ii. Adapt the prediction model to these novelties
   b. Internal cause
      i. Detect the cause type
      ii. Repair it
      iii. Prevent it, explaining its characteristics and checking the points in common with other anomalies in the water supply network

4 Conclusions and Results

We have tested this methodology on an hourly EPANET simulation of the conditions of a water supply network zone over 100 days. This is a real-world case study exhibiting a high intensity of different dysfunctions, which makes it suitable for training the learning algorithms involved. Working with the demand variable, in a time window of 12 hours, the SVM algorithm is able to detect all the anomalies in the validation data (40 days) and the testing data (10 days). In this case, we not only managed to detect the anomalies in the water demand behavior, but were also able to find justifiable cause-effect relationships in the water demand environment. The causal model includes the continuous variables pressure, valve position (which represents the water entering our zone from other parts of the water supply network) and diameter of the leakage, as well as the discrete inputs from the previous classification stage. This model offers deeper knowledge of the water consumption behavior and of the supply network and its elements, as seen in our working example in Figure 3. This graph shows some effects obtained with the KCL algorithm (for example, valves are related to outliers but are only a cause regarding trend novelties). Future work will aim to obtain more information about the factors, to improve the identification of the anomaly causes, and to take actions to prevent them or mitigate their effects. The kernel-based independence measures benefit from the power of detecting nonlinear dependence and can keep, for example, type II errors (deciding independence when there is dependence) at a very low level. The methodology we have shown is a useful complement to current prediction models of water demand, since it can be easily adapted to various anomaly scenarios. In the future, methods for screening the water network to find especially sensitive or vulnerable zones, where abnormal events may have more important consequences, should be explored.

Acknowledgements. Supported by grants BES-2005-9708 and MAEC-AECI 0000202066, awarded to two of the authors.

Fig. 3 Final step of the KCL causal-effect structure

References

1. Augusteijn, M.F., Folkert, B.A.: Neural network classification and novelty detection. International Journal of Remote Sensing 23(14), 2891–2902 (2002)
2. Chimphlee, W., Abdullah, A.H., Sap, M.N., Srinoy, S., Chimphlee, S.: Anomaly-based intrusion detection using fuzzy rough clustering. In: 2006 International Conference on Hybrid Information Technology (ICHI 2006), vol. 1, pp. 329–334 (2006)
3. Fukumizu, K., Bach, F., Gretton, A.: Statistical consistency of kernel canonical correlation analysis. Journal of Machine Learning Research 8, 361–383 (2007)
4. Gretton, A., Bousquet, O., Smola, A.J., Schölkopf, B.: Measuring Statistical Dependence with Hilbert-Schmidt Norms. In: Jain, S., Simon, H.U., Tomita, E. (eds.) ALT 2005. LNCS, vol. 3734, pp. 63–77. Springer, Heidelberg (2005)
5. Herrera, M., García-Díaz, J.C., Pérez, R., Martínez, J.F., López, P.A.: Interpolación con redes neuronales artificiales en series temporales intervenidas para la predicción de la demanda urbana de agua. In: Proceedings NOLINEAL 2007. Ciudad Real, Spain (2007)
6. Herrera, M., Torgo, L., Izquierdo, J., Pérez, R.: Predictive models for forecasting hourly urban water demand (submitted, 2009)
7. Izquierdo, J., López, P.A., Martínez, F.J., Pérez, R.: Fault detection in water supply systems using hybrid (theory and data-driven) modelling. Mathematical and Computer Modelling 46, 341–350 (2007)
8. Karatzoglou, A.: Kernel methods software, algorithms and applications. PhD dissertation, Technische Universität Wien, Austria (2006)
9. Karatzoglou, A., Meyer, D., Hornik, K.: Support Vector Machines in R. Journal of Statistical Software 15(9) (2006), http://www.jstatsoft.org/v15/i09 (accessed on January 2009)
10. Ma, J., Perkins, S.: Time-series novelty detection using one-class support vector machines. In: Proceedings of the International Joint Conference on Neural Networks, vol. 3, pp. 1741–1745 (2003)
11. Mercer, J.: Functions of positive and negative type and their connection with the theory of integral equations. Philos. Trans. Royal Soc. 209, 415–446 (1909)
12. Nong, Y., Qian, C.: Computer intrusion detection through EWMA for autocorrelated and uncorrelated data. IEEE Transactions on Reliability 52(1), 75–82 (2003)
13. Pearl, J.: Causality: Models, reasoning, and inference. Cambridge University Press, Cambridge (2000)
14. Rocco, M.C., Zio, E.: A support vector machine integrated system for the classification of operation anomalies in nuclear components and systems. Reliability Eng. & System Safety 92, 593–600 (2007)
15. Rossman, L.: EPANET-User's Manual. United States Environmental Protection Agency (EPA), Cincinnati, OH (2000)
16. Schölkopf, B., Smola, A.: Learning with kernels. MIT Press, Cambridge (2002)
17. Shawe-Taylor, J., Cristianini, N.: An Introduction to Support Vector Machines. Cambridge University Press, Cambridge (2000)
18. Spirtes, P., Glymour, C.: An algorithm for fast recovery of sparse causal graphs. Social Science Computer Review 9, 67–72 (1991)
19. Sun, H.: Mercer theorem for RKHS on noncompact sets. Journal of Complexity 21(3), 337–349 (2005)
20. Sun, X., Janzing, D., Schölkopf, B., Fukumizu, K.: A kernel-based causal learning algorithm. In: Proc. 24th Annual International Conference on Machine Learning (ICML 2007), pp. 855–862 (2007)
21. Vapnik, V.: The Nature of Statistical Learning Theory. Springer, Heidelberg (1995)
22. Vapnik, V.: Statistical Learning Theory. John Wiley and Sons, Chichester (1998)

Characterization of Matrices with Nonnegative Group-Projector

Alicia Herrero, Francisco J. Ramírez and Néstor Thome

Abstract. In [Jain, Tynan, Linear Algebra and its Applications 379, 381–394, 2004], the authors showed that a nonnegative square matrix A satisfies AA# ≥ O, where A# is the group inverse of A, if and only if A is permutationally similar to a matrix with a special structure. In this paper, a similar, slightly simplified structure for this kind of matrices is presented, where the restriction of nonnegativity of the matrix A is omitted. In addition, this result is applied to characterize the {k}-group involutory matrices.

1 Introduction

For a given matrix $F \in \mathbb{R}^{n \times n}$, a matrix $G \in \mathbb{R}^{n \times n}$ is called its group inverse if the properties FGF = F, GFG = G, and FG = GF hold. When this matrix exists, it is unique and is denoted by F# [1]. Throughout this paper we assume that the matrices involved have a group inverse. We write A ≥ O for a matrix A with nonnegative entries and A^T for the transpose of A. We recall that a square matrix A is called {k}-group periodic if it satisfies A# = A^{k−1}, where k belongs to {2, 3, ...}. It is well known that AA# is a projector on the range of A along the null space of A#. In order to distinguish this projector from others defined using other generalized inverses, we call AA# the group-projector. The group inverse has been widely studied in the literature and applied to solve real problems. For instance, it is applied in modelling electric networks, Markov chains, symmetric singular control systems, numerical methods, etc. [1, 4–9, 11, 14, 15]. In particular, in [12] a form similar to the one studied in this paper has been used to obtain conditions that guarantee the nonnegativity of linear control systems. Moreover, the problem of characterizing group involutory matrices (that is, {2}-group periodic matrices) has been studied in [3]. The purpose of this paper is to give a characterization of square matrices A such that AA# ≥ O without any restriction on the matrix A. In addition, we apply the result in order to characterize the {k}-group periodic matrices.

Alicia Herrero and Néstor Thome, Instituto de Matemática Multidisciplinar, Universidad Politécnica de Valencia, 46022 Valencia, Spain, e-mail: [email protected], [email protected]
Francisco J. Ramírez, Instituto Tecnológico de Santo Domingo, Av. Los Próceres, Galá, Santo Domingo, República Dominicana, e-mail: [email protected]

R. Bru and S. Romero-Vivó (Eds.): Positive Systems, LNCIS 389, pp. 315–320. springerlink.com © Springer-Verlag Berlin Heidelberg 2009

2 Main Results

We start this section with a result partially given in [10]; the converse is also included here. We recall that all the matrices involved are assumed to have a group inverse.

Lemma 1. Let $A \in \mathbb{R}^{n \times n}$ be a nonzero matrix with rank(A) = r. Then A is an idempotent nonnegative matrix if and only if there exists a permutation matrix $P \in \mathbb{R}^{n \times n}$ such that

$$A = P \begin{bmatrix} XY & XYM & O \\ O & O & O \\ NXY & NXYM & O \end{bmatrix} P^T \qquad (1)$$

where M, N are arbitrary nonnegative matrices of appropriate sizes and $X = \mathrm{diag}(x_1, \ldots, x_r)$, $Y = \mathrm{diag}(y_1^T, \ldots, y_r^T)$, with $x_i$ and $y_j$ positive column vectors, i, j ∈ {1, ..., r}, such that YX = I.

Proof. The sufficiency is given in the proof of Lemma 2.1 in [10]. Since P, M, N, X, and Y are nonnegative matrices, from (1) we get that A ≥ O. Again from (1), a simple block product gives A² = A. The necessity is then proved. □

An important particular case is shown in the next corollary.

Corollary 1. Let $A \in \mathbb{R}^{n \times n}$ with rank(A) = r be such that AA# is a nonzero matrix. Then AA# ≥ O if and only if there exists a permutation matrix $P \in \mathbb{R}^{n \times n}$ such that

$$AA^{\#} = P \begin{bmatrix} XY & XYM & O \\ O & O & O \\ NXY & NXYM & O \end{bmatrix} P^T \qquad (2)$$

where M, N are arbitrary nonnegative matrices of appropriate size and $X = \mathrm{diag}(x_1, \ldots, x_r)$, $Y = \mathrm{diag}(y_1^T, \ldots, y_r^T)$, with $x_i$ and $y_j$ positive column vectors, i, j ∈ {1, ..., r}, such that YX = I.

Proof. The group inverse A# of the matrix A satisfies (AA#)² = AA#AA# = AA#, that is, AA# is an idempotent matrix. Then, the result follows directly by applying Lemma 1 to the matrix AA# when its nonnegativity is assumed, since rank(AA#) = rank(A). □

Now, we obtain the main result of this paper. The importance of this result is that the condition A ≥ O is suppressed and, besides, the form of the matrix A is simplified with respect to that given in Theorem 1 in [13]. More precisely, since AA# is idempotent for any given matrix $A \in \mathbb{R}^{n \times n}$, the nonnegativity of AA# allows this product to be factorized in a special form by Corollary 1. This factorization leads to the following result on the matrix A.

Theorem 1. Let $A \in \mathbb{R}^{n \times n}$ with rank(A) = r and AA# ≠ O. Then AA# ≥ O if and only if there exists a permutation matrix $P \in \mathbb{R}^{n \times n}$ such that

$$A = P \begin{bmatrix} XTY & XTYM & O \\ O & O & O \\ NXTY & NXTYM & O \end{bmatrix} P^T \qquad (3)$$

where M, N are arbitrary nonnegative matrices of appropriate size, $T \in \mathbb{R}^{r \times r}$ is nonsingular and $X = \mathrm{diag}(x_1, \ldots, x_r)$, $Y = \mathrm{diag}(y_1^T, \ldots, y_r^T)$, with $x_i$ and $y_j$ positive column vectors, i, j ∈ {1, ..., r}, such that YX = I. In this case,

$$A^{\#} = P \begin{bmatrix} XT^{-1}Y & XT^{-1}YM & O \\ O & O & O \\ NXT^{-1}Y & NXT^{-1}YM & O \end{bmatrix} P^T. \qquad (4)$$

Proof. From Corollary 1 we can write AA# in the form

$$AA^{\#} = P \begin{bmatrix} XY & XYM & O \\ O & O & O \\ NXY & NXYM & O \end{bmatrix} P^T \qquad (5)$$

with P, M, N, X, and Y as there. Now, we partition the matrix A as a 3 × 3 block matrix conformably with (5), as follows

$$A = P \begin{bmatrix} A_1 & A_2 & A_3 \\ A_4 & A_5 & A_6 \\ A_7 & A_8 & A_9 \end{bmatrix} P^T, \qquad (6)$$

and we apply the property (AA#)A = A. The form of AA# and the partition of A lead to: $A_4$, $A_5$, and $A_6$ are null matrices, $A_i = XYA_i$ for i = 1, 2, 3, $A_7 = NA_1$, $A_8 = NA_2$, and $A_9 = NA_3$. Since A#A = AA#, the same property in the form A(A#A) = A also yields $A_3 = O$, $A_1 = A_1XY$, and $A_2 = A_1M$. Summarizing,

$$A = P \begin{bmatrix} A_1 & A_1M & O \\ O & O & O \\ NA_1 & NA_1M & O \end{bmatrix} P^T, \qquad (7)$$

where $A_1 = XYA_1 = A_1XY$. Then $A_1 = XTY$ with $T = YA_1X$. Note that rank($A_1$) = rank(A) = r. Thus, r = rank(XTY) ≤ rank(T) ≤ r because $T \in \mathbb{R}^{r \times r}$. The converse is evident. □
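As an illustrative check of Theorem 1 (a sketch, not part of the original proof), the following Python snippet builds a matrix of the form (3) with P = I and a matrix T that has a negative entry, builds the candidate group inverse from (4), and verifies with exact rational arithmetic that the three group-inverse identities hold and that AA# ≥ O even though A itself is not nonnegative. The particular X, Y, M, N, T and all helper names are assumptions chosen for the example.

```python
from fractions import Fraction as Fr

def mat(rows):
    """Convert nested lists of numbers into exact Fractions."""
    return [[Fr(v) for v in row] for row in rows]

def matmul(a, b):
    """Plain matrix product of two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def zeros(rows, cols):
    return [[Fr(0)] * cols for _ in range(rows)]

def assemble(block_rows):
    """Glue a list of block rows into a single matrix."""
    out = []
    for blocks in block_rows:
        for parts in zip(*blocks):
            out.append([v for part in parts for v in part])
    return out

def form3(X, T, Y, M, N):
    """Matrix of the form (3), taking P = I for simplicity."""
    core = matmul(matmul(X, T), Y)            # X T Y
    n1, m, p = len(core), len(M[0]), len(N)   # sizes of the three diagonal blocks
    n_core = matmul(N, core)                  # N X T Y
    return assemble([
        [core, matmul(core, M), zeros(n1, p)],
        [zeros(m, n1), zeros(m, m), zeros(m, p)],
        [n_core, matmul(n_core, M), zeros(p, p)],
    ])

# Hypothetical data: x1 = (1,1)^T, y1 = (1/2,1/2)^T, x2 = y2 = (1), so YX = I.
X = mat([[1, 0], [1, 0], [0, 1]])
Y = mat([[Fr(1, 2), Fr(1, 2), 0], [0, 0, 1]])
T = mat([[1, -1], [0, 1]])       # nonsingular, but not nonnegative
T_inv = mat([[1, 1], [0, 1]])    # inverse of T
M = mat([[1], [0], [2]])         # arbitrary nonnegative matrix
N = mat([[0, 1, 1]])             # arbitrary nonnegative matrix

A = form3(X, T, Y, M, N)
A_sharp = form3(X, T_inv, Y, M, N)   # candidate group inverse, form (4)
projector = matmul(A, A_sharp)       # group-projector AA#

checks = {
    "A A# A = A": matmul(projector, A) == A,
    "A# A A# = A#": matmul(matmul(A_sharp, A), A_sharp) == A_sharp,
    "A A# = A# A": projector == matmul(A_sharp, A),
    "A A# >= O": all(v >= 0 for row in projector for v in row),
    "A has a negative entry": any(v < 0 for row in A for v in row),
}
```

Using `Fraction` avoids floating-point noise, so the equalities are tested exactly rather than up to a tolerance.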

In the next result the nonnegativity of the matrix A is added in order to compare it with the one presented by S. Friedland and E. Virnik in [10].

Corollary 2. Let $A \in \mathbb{R}^{n \times n}$ with rank(A) = r, A ≥ O, and AA# ≠ O. Then AA# ≥ O if and only if there exists a permutation matrix $P \in \mathbb{R}^{n \times n}$ such that

$$A = P \begin{bmatrix} XTY & XTYM & O \\ O & O & O \\ NXTY & NXTYM & O \end{bmatrix} P^T \qquad (8)$$

where M, N are arbitrary nonnegative matrices of appropriate size, $T \in \mathbb{R}^{r \times r}$ is a nonnegative and nonsingular matrix, and $X = \mathrm{diag}(x_1, \ldots, x_r)$, $Y = \mathrm{diag}(y_1^T, \ldots, y_r^T)$, with $x_i$ and $y_j$ positive column vectors, i, j ∈ {1, ..., r}, such that YX = I.

Proof. Applying Theorem 1, one only has to show that T ≥ O. From A ≥ O and P ≥ O we have that $P^T A P \ge O$ and thus, by (8), in particular we get XTY ≥ O. Since X ≥ O, Y ≥ O and YX = I, premultiplying and postmultiplying the inequality XTY ≥ O by Y and X, respectively, we obtain T ≥ O. □

Note that, in general, for a nonnegative matrix A the condition A² = A implies that AA# ≥ O, which corresponds to T = I in Corollary 2. However, the converse is not always true. In fact, the following nonnegative matrix

$$A = \begin{bmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}$$

satisfies AA# ≥ O but A² ≠ A. This implies that Corollary 2 is a more general version of Lemma 2.1 in [10]. Now, we apply Theorem 1 in order to characterize the {k}-group periodic matrices.

Theorem 2. Let $A \in \mathbb{R}^{n \times n}$ with rank(A) = r, ind(A) = 1, and $A^k \ge O$ with $A^k \ne O$, where k ∈ {1, 2, ...}. Then $A^{k+1} = A$ if and only if there exists a permutation matrix $P \in \mathbb{R}^{n \times n}$ such that

$$A = P \begin{bmatrix} XTY & XTYM & O \\ O & O & O \\ NXTY & NXTYM & O \end{bmatrix} P^T \qquad (9)$$

where M, N are nonnegative matrices of appropriate size, $X = \mathrm{diag}(x_1, \ldots, x_r)$, $Y = \mathrm{diag}(y_1^T, \ldots, y_r^T)$, $x_i$ and $y_i$ are positive column vectors with i, j ∈ {1, ..., r} such that $y_i^T x_i = 1$, and $T \in \mathbb{R}^{r \times r}$ is a nonsingular matrix such that $T^k = I$.

Proof. Let k > 1. Following a similar reasoning as in [3], it is easy to see that whenever k > 1, the condition $A^{k+1} = A$ is equivalent to the condition $A^{\#} = A^{k-1}$. This implies that $AA^{\#} = A^k \ge O$ and, by hypothesis, AA# ≠ O. Then, from Theorem 1, we have that A and A# have the forms (3) and (4), respectively. Moreover, using these expressions, we can write the condition $A^{\#} = A^{k-1}$ as the equivalent one

$$\begin{bmatrix} XT^{-1}Y & XT^{-1}YM & O \\ O & O & O \\ NXT^{-1}Y & NXT^{-1}YM & O \end{bmatrix} = \begin{bmatrix} XT^{k-1}Y & XT^{k-1}YM & O \\ O & O & O \\ NXT^{k-1}Y & NXT^{k-1}YM & O \end{bmatrix}.$$

Then, it follows that $XT^{-1}Y = XT^{k-1}Y$. Premultiplying by Y and postmultiplying by X, and using that YX = I, we obtain $T^{-1} = T^{k-1}$, that is, $T^k = I$. To prove the converse it is enough to compute $A^{k+1}$ using the expression (9), taking into account that $T^k = I$. Note that the case k = 1 corresponds to Lemma 1 (where A² = A and T = I). □
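The 3 × 3 nonnegative matrix given before Theorem 2 is a convenient test case for k = 2. The short Python sketch below (helper names are hypothetical) checks that A³ = A, that A# = A^{k−1} = A satisfies the group-inverse identities, that AA# = A² ≥ O, and that A is nevertheless not idempotent. Its leading 2 × 2 block is the exchange matrix, which satisfies T² = I, in agreement with the condition $T^k = I$.

```python
def matmul(a, b):
    """Plain matrix product of two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def matpow(a, k):
    """k-th power of a square matrix for an integer k >= 0."""
    n = len(a)
    result = [[int(i == j) for j in range(n)] for i in range(n)]
    for _ in range(k):
        result = matmul(result, a)
    return result

A = [[0, 1, 0],
     [1, 0, 0],
     [0, 0, 0]]
k = 2
A_sharp = matpow(A, k - 1)          # candidate group inverse A^(k-1)
projector = matmul(A, A_sharp)      # group-projector AA#

is_group_inverse = (
    matmul(projector, A) == A
    and matmul(matmul(A_sharp, A), A_sharp) == A_sharp
    and projector == matmul(A_sharp, A)
)
is_periodic = matpow(A, k + 1) == A       # A^(k+1) = A
is_idempotent = matpow(A, 2) == A         # fails for this matrix
```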

From this result, we can also keep the form of A in the case A ≥ O, taking into account that the matrix T must be nonnegative in this case.

Corollary 3. Let $A \in \mathbb{R}^{n \times n}$ with rank(A) = r, ind(A) = 1, A ≥ O and $A^k \ne O$, where k ∈ {1, 2, ...}. Then $A^{k+1} = A$ if and only if there exists a permutation matrix $P \in \mathbb{R}^{n \times n}$ such that

$$A = P \begin{bmatrix} XTY & XTYM & O \\ O & O & O \\ NXTY & NXTYM & O \end{bmatrix} P^T \qquad (10)$$

where M, N are arbitrary nonnegative matrices of appropriate size, $X = \mathrm{diag}(x_1, \ldots, x_r)$, $Y = \mathrm{diag}(y_1^T, \ldots, y_r^T)$, with $x_i$ and $y_j$ positive column vectors, i, j ∈ {1, ..., r}, such that YX = I, and $T \in \mathbb{R}^{r \times r}$ is a nonnegative and nonsingular matrix satisfying $T^k = I$.

Acknowledgements. The authors would like to thank the referees for their valuable suggestions and comments, which resulted in a great improvement of the original manuscript. This paper has been partially supported by the DGI project with number MTM2007-64477.

References

1. Ben-Israel, A., Greville, T.: Generalized inverses: Theory and applications. Wiley, New York (1974)
2. Berman, A., Plemmons, R.J.: Nonnegative matrices in Mathematical Sciences. SIAM Academic Press, New York (1979)
3. Bru, R., Thome, N.: Group inverse and group involutory matrices. Linear and Multilinear Algebra 45(2-3), 207–218 (1998)
4. Campbell, S.L.: Singular systems of differential equations. Pitman, London (1980)
5. Campbell, S.L., Meyer Jr., C.D.: Generalized inverses of linear transformations. Dover, London (1979)
6. Chen, J., Xu, Z., Wei, Y.: Representations for the Drazin inverse of the sum P+Q+R+S and its applications. Linear Algebra and its Applications 430, 438–454 (2009)
7. Coll, C., Herrero, A., Sánchez, E., Thome, N.: Output feedback stabilization for symmetric control systems. Journal of the Franklin Institute 342, 814–823 (2005)
8. Deng, C.Y.: The Drazin inverses of products and differences of orthogonal projections. J. Math. Anal. Appl. 335, 64–71 (2007)
9. Deng, C.Y.: The Drazin inverses of sum and difference of idempotents. Linear Algebra and its Applications 430, 1282–1291 (2009)
10. Friedland, S., Virnik, E.: Nonnegativity of Schur complements of nonnegative idempotent matrices. Electronic Journal of Linear Algebra 17, 426–435 (2008)
11. Groß, J.: Nonsingularity of the difference of two oblique projectors. SIAM Journal on Matrix Analysis and Applications 21(2), 390–395 (1999)
12. Herrero, A., Ramírez, A., Thome, N.: An algorithm to check the nonnegativity of singular systems. Applied Mathematics and Computation 189, 355–365 (2007)
13. Jain, S.K., Tynan, J.: Nonnegative matrices A with AA# ≥ O. Linear Algebra and its Applications 379, 381–394 (2004)
14. Wei, Y.: Index splitting for the Drazin inverse and the singular linear systems. Applied Mathematics and Computation 95(2-3), 115–124 (1998)
15. Wei, Y.: On the perturbation of the group inverse and oblique projection. Applied Mathematics and Computation 98(1), 29–42 (1999)

Robust Design of Water Supply Systems through Evolutionary Optimization

Joaquín Izquierdo, Idel Montalvo, Rafael Pérez-García and Manuel Herrera

Abstract. Water Supply Systems (WSS) are clearly dynamical systems. Processes associated with WSS include design, planning, maintenance, control, management, rehabilitation, enlargement, etc. Modeling and simulation of these processes can be performed by using a number of variables and constraints that are non-negative in nature. Demands, diameters of pipes, flowrates, minimum pressure at demand nodes, and volume of reservoirs are only a few examples, taken from the purely technical context. In this paper we focus on the design of WSS. This is a mixed discrete-continuous constrained optimization problem that is addressed here by the use of an evolutionary technique based on swarm intelligence. Robustness is enforced by adding reliability to the system, both to cope with abnormal conditions and by considering the likelihood of different state and load conditions. Application to a real-world problem is also provided.

1 Introduction

Water Supply Systems are living beings. They are born, grow, age and deteriorate, need care (preventive care, but sometimes also surgery), are expected to work properly, have to meet basic requirements even under adverse circumstances, and so on. The aim is a long-lasting life of quality. As a consequence, the design of a WSS cannot be thought of as a single, material and static design. This is one reason why WSS design optimization is one of the most heavily researched areas in Hydraulics (see [5, 11, 22] for a detailed review).

The objective of the optimal design of a WSS is to determine the values of all involved variables so that all the demands are satisfied, even under certain failure conditions, while the investment and maintenance costs are minimal [8]. This optimal design problem involves minimizing a fitness function that includes costs for layout and sizing using new components, reusing or substituting existing components, creating a working system configuration that fulfils all water demands (including water quality), adhering to the design constraints, and guaranteeing a certain degree of reliability for the system [6, 7].

Reliability refers to the ability of the network to provide consumers with an adequate and high quality supply under normal and abnormal conditions. Both hydraulic and mechanical reliability are considered. The former refers to uncertainty coming mainly from nodal demand and pipe roughness. The latter usually refers to failures of system components, such as pipe breakage. There is no universal agreement about what the best measure of reliability, redundancy or resilience would be, or about what an acceptable level of these concepts is [19].

The approach considered here is twofold. Firstly, it considers hydraulic aspects. Working conditions can change due to changes in demand, changes in pipe roughness, technical failures, and so on. Robust optimization-based designs must consider these scenarios. Secondly, it also faces mechanical problems. Early work by Alperovits and Shamir [1] already showed that designs obtained by purely minimizing the cost of the pipes produce branched networks. But branched networks cannot guarantee the service to consumers downstream of a broken pipe. To get looped networks, able to cope with such abnormal situations, redundancy must be added to the system. Enforcing minimum diameters seems unnatural, and current trends point towards adding certain economic costs. In this paper we consider a proposal recently raised in [14]. We claim that it enforces a certain level of reliability by considering some costs incurred when the supply is not satisfied. Interestingly, the system improvement implies only a moderate increase in the initial investment costs.

Due to the sundry aspects considered in the fitness function and to the nature of the variables involved, a general purpose global optimization technique must be used. Recently, evolutionary algorithms (EA) have become the preferred, and more suitable, water system design optimization techniques for many researchers [3, 4, 12, 13, 15, 20, 23–25]. They use full network model simulation to evaluate solution quality and, as a consequence, may require substantial computing time when real networks are considered. On the other hand, they manage to get rid of the drawbacks associated with classical optimization methods.

In this paper, we apply a derivative of Particle Swarm Optimization (PSO), recently introduced by the authors [16, 17], to the design of WSS by using reliability ideas from [14]. This derivative is able to consider mixed discrete-continuous optimization, since the problem we tackle here involves the use of both types of variables. Also, it is able to find optimum or near-optimum solutions much more efficiently and with considerably less computational effort because of the richer population diversity it introduces. Finally, the cumbersome aspect of parameter selection, which is common to all metaheuristics, is tackled through self-adaptive dynamic parameter control.

Joaquín Izquierdo, Idel Montalvo, Rafael Pérez-García and Manuel Herrera, Centro Multidisciplinar de Modelación de Fluidos, Universidad Politécnica de Valencia, 46071 Valencia, Spain, e-mail: [email protected], [email protected], [email protected], [email protected]

R. Bru and S. Romero-Vivó (Eds.): Positive Systems, LNCIS 389, pp. 321–330. springerlink.com © Springer-Verlag Berlin Heidelberg 2009

2 The Optimization Problem

The optimization model considers a number, s, of scenarios, each with probability of occurrence $P_i$, with $\sum_{i=1}^{s} P_i = 1$. We now describe the fitness function and the constraints for a given scenario. Although the diameters of the pipes are the main decision variables, storage volumes, pump heads, the kind of rehabilitation to be performed, etc., are also frequently required. These variables, whether discrete, continuous or binary, share one characteristic: they are non-negative. The estimation of individual costs depends on these variables. The correct approach to assessing the costs of each element becomes important when defining the fitness function, which has to be fully adapted to the problem under consideration: design, enlargement, rehabilitation, operation design, etc. For the sake of simplicity, we include here only the cost of the pipes. Other costs can be included in a straightforward manner. For a given scenario, this cost is represented by

$$F_1(D) = \sum_{i=1}^{L} c_i(D_i)\, l_i, \qquad (1)$$

where the sum runs over all L individual pipes. $D = (D_1, \ldots, D_L)^t$ is the vector of pipe diameters. The cost per meter, which depends on the diameter of pipe i, is given by $c_i(D_i)$, and the corresponding length by $l_i$. Note that $D_i$ is chosen from a discrete set of available diameters and $c_i$ is a nonlinear function of the diameter. A number of constraints can be considered. Again, we restrict ourselves here to the provision of quality water in terms of minimum supply pressure. Accordingly, the piezometric head at demand nodes, H, must be greater than a certain positive value, $H_{min}$. These problem constraints are, obviously, positive. They are included as penalty costs in the fitness function, such that the violation of one of the imposed constraints provokes an increase in its value:

$$F_2(D) = \sum_{i=1}^{L} c_i(D_i)\, l_i + \sum_{j=1}^{N} H(H_{min} - H_j) \cdot p \cdot (H_{min} - H_j), \qquad (2)$$

where penalties are added over all N demand nodes. H(·) is the Heaviside function, and the factor p, which multiplies the head difference, represents a fixed value that becomes effective whenever the minimal head requirement is not met. Note that in this model the individual penalties grow linearly with this difference. The penalty is high enough to render the corresponding solution unfeasible. In addition, the distribution of flowrates through the network and the piezometric head values must satisfy the classical equations of continuity and energy enforced in the hydraulic model. The complete set of equations may be written, by using block matrix notation [9], as

$$\begin{pmatrix} A_{11}(q) & A_{12} \\ A_{12}^{t} & 0 \end{pmatrix} \begin{pmatrix} q \\ H \end{pmatrix} = \begin{pmatrix} -A_{10} H_f \\ Q \end{pmatrix}, \qquad (3)$$

where $A_{12}$ is the so-called connectivity matrix describing the way demand nodes are connected through the lines; its size is $L \times N_p$, $N_p$ being the number of demand nodes; q is the vector of the flowrates through the lines; H is the vector of unknown heads at demand nodes; $A_{10}$ is an $L \times N_f$ matrix, $N_f$ being the number of fixed head nodes with known head $H_f$; and Q is the $N_p$-dimensional vector of demands. Finally, $A_{11}(q)$ is an L × L diagonal matrix. System (3) is a nonlinear problem, whose solution is the state vector $x = (q, H)^t$ of the system. Continuity and energy equations are enforced by the use of EPANET2 [18], which is the benchmark hydraulic analysis tool used worldwide. Following [14], reliability is added here from an economic point of view, by considering the costs of the water not delivered due to problems in the system. Finally, the fitness function adds up this additional cost:

$$F_3(D) = \sum_{i=1}^{L} c_i(D_i)\, l_i + \sum_{j=1}^{N} H(H_{\min} - H_j) \cdot p \cdot (H_{\min} - H_j) + \sum_{i=1}^{L} w_i \cdot l_i \cdot D_i^{-u}. \tag{4}$$

Here, $w_i$ is a coefficient associated with each pipe, of the form $a \cdot t_f \cdot (c_f + c_a \cdot V_f)$; $a \cdot l_i \cdot D_i^{-u}$ gives the expected number of failures per year of one pipe, as a function of its diameter, $D_i$, and length, $l_i$ ($a$ and $u$ are known constants); $t_f$ is the average number of days required to repair the pipe; $c_f$ is the average daily repair cost; $c_a$ is the average cost of the water supplied to affected consumers, in monetary units per unit volume; and $V_f = 86400 \cdot Q_{break}$ is the daily volume of water that must be supplied to the affected consumers because of the loss $Q_{break}$, in cubic meters per second. The scenarios considered here "break" each pipe of a given design in turn and check whether all the constraints are still fulfilled under this circumstance. If a design fails the test, it is suitably penalized. In this way, designs develop increasing reliability. To carry out these tests, the system must be analyzed for each of those specific "breakages".
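The three terms of (4) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, the nodal heads are passed in as plain numbers (in the paper they come from the EPANET2 hydraulic analysis), and the coefficients `w`, `u` and `p` in the usage are made-up values, not the ones used by the authors.

```python
from typing import Mapping, Sequence


def pipe_cost(diameters: Sequence[int], lengths: Sequence[float],
              unit_cost: Mapping[int, float]) -> float:
    """Investment term of (1): sum of c_i(D_i) * l_i over all pipes."""
    return sum(unit_cost[d] * l for d, l in zip(diameters, lengths))


def head_penalty(heads: Sequence[float], h_min: float, p: float) -> float:
    """Penalty term of (2): p * (h_min - H_j) for every node with H_j below h_min."""
    return sum(p * (h_min - h) for h in heads if h < h_min)


def failure_cost(diameters: Sequence[int], lengths: Sequence[float],
                 w: Sequence[float], u: float) -> float:
    """Reliability term of (4): sum of w_i * l_i * D_i**(-u) over all pipes."""
    return sum(wi * l * d ** (-u) for wi, l, d in zip(w, lengths, diameters))


def fitness_f3(diameters, lengths, unit_cost, heads, h_min, p, w, u) -> float:
    """Fitness (4) as the sum of investment, head penalty and failure cost."""
    return (pipe_cost(diameters, lengths, unit_cost)
            + head_penalty(heads, h_min, p)
            + failure_cost(diameters, lengths, w, u))


# Hypothetical two-pipe, three-node example (illustrative values only).
example_cost = {100: 117.14, 150: 145.16}
example_f3 = fitness_f3([100, 150], [10.0, 20.0], example_cost,
                        heads=[16.0, 14.0, 15.0], h_min=15.0, p=1000.0,
                        w=[2.0, 2.0], u=1.0)
print(example_f3)
```

Keeping the three terms as separate functions mirrors the way F1, F2 and F3 are built up in the text, so F2 is obtained simply by leaving out `failure_cost`.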

3 Description of the PSO Variant Used

A swarm consists of an integer number, $M$, of particles, $X_i$, moving in the search space, $S \subset \mathbb{R}^d$, each representing a potential solution of the problem:

Find $\min_{X \in S} F(X)$, subject to appropriate constraints,

where $F$ is the fitness function associated with the problem, which, without loss of generality, we take to be a minimization problem. In each cycle of the evolution, $t$, each particle $i$, with position vector $X_i(t) = (x_{i1},\dots,x_{id})$, has an associated velocity vector, $V_i(t) = (v_{i1},\dots,v_{id})$, and an associated personal best position at which the best fitness was encountered by the particle, $Y_i(t) = (y_{i1},\dots,y_{id}) = \arg\min(F(X_i(t)), F(X_i(t-1)))$. Also, the position of the best particle of the swarm,

$Y^* = \arg\min\{F(X_i(t)),\ i = 1,\dots,M\}$, is identified for every $t$. In each generation, the velocity of each particle is updated:

$$V_i = \omega V_i + c_1\,\mathrm{rand}()\,(Y_i - X_i) + c_2\,\mathrm{rand}()\,(Y^* - X_i). \tag{5}$$

On each dimension, particle velocities are restricted to minimum and maximum velocities, which are user-defined parameters,

$$V_{\min} \le V_j \le V_{\max}, \tag{6}$$

to control excessive roaming of particles outside the search space. The position of each particle is also updated every generation:

$$X_i = X_i + V_i. \tag{7}$$

The parameters are as follows: ω is a factor of inertia [21] that controls the impact of the velocity history on the new velocity. Acceleration parameters c1 and c2 are typically two positive constants, called the cognitive and social parameters, respectively. rand() is a function that creates random numbers between 0 and 1, used to maintain the population diversity. The discussion so far has considered the standard PSO algorithm, which is applicable to continuous systems and cannot be used for mixed discrete-continuous problems. To tackle discrete variables, this algorithm takes the integer parts of the flying velocity vector's discrete components into account:

$$V_i = \mathrm{fix}\big(\omega V_i + c_1\,\mathrm{rand}()\,(Y_i - X_i) + c_2\,\mathrm{rand}()\,(Y^* - X_i)\big), \tag{8}$$

where fix(·) means that only the integer part of the result is taken. The role of the inertia, $\omega$, in (5) is considered critical for the convergence behavior of the PSO algorithm. As it facilitates the balancing of global and local searches, it has been suggested to let it decrease linearly with time, usually in a way that at first emphasizes global search and then, with each cycle of the iteration, increasingly prioritizes local search [21]. A significant improvement in the performance of PSO is achieved by using [10]

$$\omega = 0.5 + \frac{1}{2(\ln(t) + 1)}. \tag{9}$$

The acceleration coefficients and the clamping velocity, however, are neither set to a constant value, as in standard PSO, nor set as a time-varying function, as in adaptive PSO variants [2]. Here, instead, they are incorporated into the optimization problem. Each particle is allowed to self-adaptively set its own parameters by using the same process used by PSO and given by equations (5) or (8) and (7). To this end, these three parameters are considered as three new variables that are incorporated into the position vectors $X_i$.

Obviously, these new variables do not enter the fitness function, but rather they are manipulated by using the same mixed individual-social learning paradigm used in PSO. Note that $V_i$ and $Y_i$ also increase their dimension correspondingly. By using equations (5) or (8) and (7), each particle is additionally endowed with the ability to adjust its parameters by taking into account both the parameters it had at its best position in the past and the parameters of the leader, which facilitated this best particle's move to its privileged position. As a consequence, particles use their cognition of individual thinking and social cooperation not only to improve their positions but also to improve the way they improve their positions, by accommodating themselves to the best known conditions, namely, their own conditions and their leader's conditions when they achieved the thus-far best position. Finally, in [16], PSO was endowed with a re-generation-on-collision formulation, which further improves the performance of standard discrete PSO. The random re-generation of the many birds that tended to collide with the best birds was shown to avoid premature convergence, as it prevented clone populations from dominating the search. The inclusion of this procedure into the discrete PSO produces greatly increased diversity and improved convergence characteristics, and yields higher-quality final solutions. In this study, a population size of M = 100 particles has been used. Also, if there is no improvement after 800 iterations, the process is stopped. The performance of the approach introduced here can be observed from the results reported in the next section for a real-world problem.
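The update rules (5) to (9) for a single particle can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, a fresh random number is drawn for every dimension (the text does not say whether rand() is drawn per dimension or per particle), clamping is applied before taking the integer part, and neither the self-adaptive parameters nor the re-generation-on-collision step is included.

```python
import math
import random
from typing import List, Sequence, Tuple


def inertia(t: int) -> float:
    """Inertia weight of (9): 0.5 + 1 / (2 (ln t + 1)), for iteration t >= 1."""
    return 0.5 + 1.0 / (2.0 * (math.log(t) + 1.0))


def update_particle(x: Sequence[float], v: Sequence[float],
                    y: Sequence[float], y_star: Sequence[float],
                    t: int, c1: float, c2: float,
                    v_min: float, v_max: float,
                    discrete: Sequence[bool],
                    rng: random.Random) -> Tuple[List[float], List[float]]:
    """One velocity and position update for a single particle, following (5)-(8)."""
    w = inertia(t)
    new_x, new_v = [], []
    for j in range(len(x)):
        # Velocity update (5): inertia + cognitive pull + social pull.
        vj = (w * v[j]
              + c1 * rng.random() * (y[j] - x[j])
              + c2 * rng.random() * (y_star[j] - x[j]))
        # Velocity clamping (6).
        vj = max(v_min, min(v_max, vj))
        # Discrete components keep only the integer part, as fix(.) in (8).
        if discrete[j]:
            vj = float(math.trunc(vj))
        new_v.append(vj)
        # Position update (7).
        new_x.append(x[j] + vj)
    return new_x, new_v


# Hypothetical usage: one discrete and one continuous component.
demo_x, demo_v = update_particle([0.0, 0.0], [0.0, 0.0], [10.0, 10.0], [10.0, 10.0],
                                 t=1, c1=2.0, c2=2.0, v_min=-3.0, v_max=3.0,
                                 discrete=[True, False], rng=random.Random(0))
print(demo_x, demo_v)
```

Truncating after clamping keeps a discrete velocity component both integer-valued and inside the velocity bounds even when the bounds themselves are not integers.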

4 Case Study

In this case study the minimum pressure allowed is 15 m and the available commercial diameters are given in Table 1, which also includes the Hazen-Williams coefficient, C, used in the hydraulic model, and the unit cost of the pipes.

Table 1 Commercially available diameters

Diameter (mm)    C      Cost ($ units)
100              140    117.14
150              140    145.16
200              140    191.42
250              140    241.09
300              140    333.16

The problem is solved by using the two fitness functions F2 and F3 defined in (2) and (4). The same penalty factor was used in both cases. The layout of the network can be seen in Figure 1. To help in reading the results, a color code has been used. Regarding pipes, blue, green, yellow and red represent 100, 150, 200 and 250 mm pipes, respectively. Regarding nodes, dark blue means pressure above 15 m; light blue, between 14 and 15 m; green, between 12 and 14 m; yellow, between 10 and 12 m; and, finally, nodes with a pressure under 10 m are represented in red. This network, which is fed by a tank, has 294 lines amounting to 18.337 km of pipes and 240 nodes consuming 81.53 l/s in total. Figure 1 (left) presents the solution obtained by using F3 (including reliability). This solution is only 3.65% more expensive than the one obtained by using F2 (no reliability consideration), whose diameters can be seen in Figure 1 (right). Table 2 presents a comparison between the initial investment costs for both solutions.

Table 2 Comparison between costs for both solutions

                 Without reliability             With reliability
Diameter (mm)    Length (m)    Cost ($ units)    Length (m)    Cost ($ units)
100              17731.10      2077021.41        15822.31      1853425.63
150                606.39        88023.28         2077.69       301597.04
200                  0.00            0.00          328.79        62937.56
250                  0.00            0.00          108.70        26206.24
300                  0.00            0.00            0.00            0.00
Total cost ($ units)           2165044.69                      2244166.47
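The figures in Tables 1 and 2 can be cross-checked with a few lines of Python. The sketch below assumes that the unit cost of Table 1 is a cost per meter of pipe, and recomputes the investment from the rounded lengths of Table 2, so the totals agree with the table only to within about one monetary unit.

```python
# Unit costs from Table 1 ($ units per meter, by diameter in mm).
unit_cost = {100: 117.14, 150: 145.16, 200: 191.42, 250: 241.09, 300: 333.16}

# Installed lengths from Table 2 (m, by diameter in mm).
length_without = {100: 17731.10, 150: 606.39, 200: 0.0, 250: 0.0, 300: 0.0}
length_with = {100: 15822.31, 150: 2077.69, 200: 328.79, 250: 108.70, 300: 0.0}


def investment(lengths):
    """Initial investment: sum of unit cost times installed length."""
    return sum(unit_cost[d] * l for d, l in lengths.items())


cost_without = investment(length_without)
cost_with = investment(length_with)
increase_pct = 100.0 * (cost_with / cost_without - 1.0)

print(round(cost_without, 2), round(cost_with, 2), round(increase_pct, 2))
```

Both designs use the same total pipe length, which matches the 18.337 km quoted for the network, and the relative increase reproduces the 3.65% stated in the text.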

The effect of closing the pipe indicated by the arrow can be observed in Figure 1 (right) for the solution without reliability. It shows the great impact produced by a closed pipe. This does not happen for the more reliable design obtained from F3 (left), no matter which pipe is out of service.

Fig. 1 Solutions with (left) and without (right) reliability considerations for the case-study

Designs with and without reliability perform in completely different ways when a pipe is broken or closed. In Figure 1 (right) only some of the nodes close to the tank are able to maintain the minimum pressure of 15 m. On the other hand, in Figure 1 (left) the solution obtained considering reliability was able to re-distribute the flowrates and to guarantee the demand at the required pressure at all the demand nodes of the network. This represents a great advantage from the operating and management point of view. In addition, as said before, this is achieved with only a small increase in the initial investment. Finally, if not only the initial investment costs are considered, but also the costs derived from breakages, the solution with reliability proves much more advantageous from an economic point of view. The following table shows the pressure at the most critical nodes when the pipe indicated in Figure 1 is closed.

Table 3 Pressure at most critical nodes

Node ID    Without reliability    With reliability
           Pressure (m)           Pressure (m)
1111345    18.81                  3.21
1102108    18.77                  3.24
1112395    18.85                  3.24
1106799    18.90                  3.33
1098891    18.75                  3.35
1103578    19.19                  3.59
1113234    19.21                  3.59
1107987    19.26                  3.64
1100151    19.04                  3.65
1099662    19.14                  3.75
1094132    19.33                  4.02
1062222    19.23                  4.52
1049416    19.30                  4.89
1047213    19.32                  4.97

5 Conclusions

Most processes in WSS fall clearly under the category of positive systems. We have tackled here the robust design of such a WSS. The solution cannot ignore the evaluation of aspects related to different scenarios and certain failure conditions. Considering only the initial investment costs produces designs that are cheaper but that will experience serious difficulties in coping with abnormal situations. In this work, we have shown, through a case study, that more reliable designs need not involve an immoderate increase in investment. The same case study also shows the much better performance of reliable designs when failure events are represented by pipes being out of service. The concept of reliability we have used here takes into account the economic impact of the water not delivered due to this kind of failure event during the life of the network. Optimization has been carried out by using a variant of PSO devised by the authors that handles both discrete and continuous variables, has increased population diversity and manages its parameters self-adaptively. Having at one's disposal an optimization tool like the one used here is of paramount importance, since other terms (load or service conditions, rehabilitation costs, life-long costs, and so on) can be added to the fitness function without rendering the problem conceptually more complex. In addition, this tool can be combined easily with hydraulic network simulation modules, thus allowing great versatility in the analysis of candidate solutions. Finally, the multi-agent approach that permeates the optimization algorithm constitutes an open-door environment for multi-objective formulations regarding the design of WSS.

Acknowledgements. Supported by grants MAEC-AECI 0000202066 and BES-2005-9708, awarded to two of the authors.

References

1. Alperovits, E., Shamir, U.: Design of optimal water distribution systems. Water Resour. Res. 13(6), 885–900 (1977)
2. Arumugam, M.S., Rao, M.V.C.: On the improved performances of the particle swarm optimization algorithms with adaptive parameters, cross-over operators and root mean square (RMS) variants for computing optimal control of a class of hybrid systems. Appl. Soft Comput. 8(1), 324–336 (2008)
3. Cunha, M.C., Sousa, J.: Water distribution network design optimization: simulated annealing approach. J. Wat. Res. Plann. Mgmt. 125(4), 215–221 (1999)
4. Geem, Z.W.: Optimal cost design of water distribution networks using harmony search. Eng. Optm. 38(3), 259–280 (2006)
5. Goulter, I.C.: Systems analysis in water-distribution network design: From theory to practice. J. Wat. Res. Plann. Mgmt. 118(3), 238–248 (1992)
6. Goulter, I.C., Bouchart, F.: Reliability-Constrained Pipe Network Model. J. Hydraul. Eng. 116(2), 211–229 (1990)
7. Goulter, I.C., Coals, A.V.: Quantitative approaches to reliability assessment in pipe networks. J. Transp. Eng. 112(3), 287–301 (1986)
8. Izquierdo, J., Pérez, R., Iglesias, P.L.: Mathematical Models and Methods in the Water Industry. Math. Comput. Modelling (39), 1353–1374 (2004)
9. Izquierdo, J., Tung, M.M., Pérez, R., Martínez, F.J.: Estimation of fuzzy anomalies in Water Distribution Systems. In: Progress in Industrial Mathematics at ECMI 2006, vol. 12, pp. 801–805. Springer, Berlin (2008)
10. Jin, Y.X., Cheng, H.Z., Yan, J.Y., Zhang, L.: New discrete method for particle swarm optimization and its application in transmission network expansion planning. Electri. Power Syst. Res. 77(3-4), 227–233 (2007)
11. Lansey, K.E.: Optimal Design of Water Distribution Systems. In: Mays, L.W. (ed.) Water Distribution System Handbook. McGraw-Hill, New York (2000)
12. Liong, S.Y., Atiquzzama, M.: Optimal design of water distribution network using shuffled complex evolution. J. Inst. Eng. 44(1), 93–107 (2004)
13. Maier, H.R., Simpson, A.R., Zecchin, A.C., Foong, W.K., Phang, K.Y., Seah, H.Y., Tan, C.L.: Ant-colony optimization for design of water distribution systems. J. Wat. Res. Plann. Mgmt. 129(3), 200–209 (2003)
14. Martínez, J.B.: Quantifying the economy of water supply looped networks. J. Hydraul. Eng. 133(1), 88–97 (2007)
15. Matías, A.S.: Diseño de redes de distribución de agua contemplando la fiabilidad, mediante Algoritmos Genéticos. Departamento de Ingeniería Hidráulica y Medio Ambiente. Universidad Politécnica de Valencia. Doctoral dissertation (2003)
16. Montalvo, I., Izquierdo, J., Pérez, R., Iglesias, P.L.: A diversity-enriched variant of discrete PSO applied to the design of Water Distribution Networks. Engineering Optimization 40(7), 655–668 (2008)
17. Montalvo, I., Izquierdo, J., Pérez, R., Tung, M.M.: Particle Swarm Optimization applied to the design of water supply systems. Comput. Math. Appl. 56(3), 769–776 (2008)
18. Rossman, L.A.: EPANET, users manual, U.S. EPA, Cincinnati (2000)
19. Savic, D.A.: Coping with risk and uncertainty in urban water infrastructure rehabilitation planning. In: Acqua e città - i convegno nazionale di idraulica urbana, S'Agnello (NA), pp. 28–30 (2005)
20. Savic, D.A., Walters, G.A.: Genetic algorithms for least-cost design of water distribution networks. J. Wat. Res. Plann. Mgmt. 123(2), 67–77 (1997)
21. Shi, Y., Eberhart, R.C.: A modified particle swarm optimizer. In: Proceedings of the IEEE Congress on Evolutionary Computation, Piscataway, NJ, pp. 69–73 (1998)
22. Walski, T.M.: State of the Art: Pipe Network Optimization, Computer Applications in Water Resources, ASCE (1985)
23. Wu, Z.Y., Simpson, A.R.: Competent genetic-evolutionary optimization of water distribution systems. J. Comput. Civ. Eng. 15(2), 89–101 (2001)
24. Wu, Z.Y., Walski, T.: Self-Adaptive Penalty Approach Compared with Other Constraint-Handling Techniques for Pipeline Optimization. J. Wat. Res. Plann. Mgmt. 131(3), 181–192 (2005)
25. Zecchin, A.C., Simpson, A.R., Maier, H.R., Leonard, M., Roberts, A.J., Berrisford, M.J.: Application of two ant-colony optimisation algorithms to water distribution system optimisation. Math. Comput. Modelling 44(5-6), 451–468 (2006)

Applications of Linear Co-positive Lyapunov Functions for Switched Linear Positive Systems

Florian Knorn, Oliver Mason and Robert Shorten

Abstract. In this paper we review necessary and sufficient conditions for the existence of a common linear co-positive Lyapunov function for switched linear positive systems. Both the state dependent and arbitrary switching cases are considered and a number of applications are presented.

1 Introduction

Positive systems, that is, systems in which each state can only take positive values, play a key role in many diverse areas such as economics [11, 16], biology [1, 9], communication networks [4, 17], decentralised control [21] and synchronisation / consensus problems [10]. Although these, as well as switched systems, have been the focus of many recent studies in the control engineering and mathematics literature (to name but a few, [2, 3, 12, 20]), there are still many open questions relating to the stability of systems that fall into both categories: switched positive systems. Proving stability for switched systems involves determining a Lyapunov function that is common to all constituent subsystems [18]. In that context, work discussed in [14, 15] provides necessary and sufficient conditions for the existence of a particular type of Lyapunov function, namely a linear co-positive Lyapunov function (LCLF). It is the aim of this paper to review these results and provide examples of their use. Our brief paper is structured as follows. In Section 2 we present a number of examples from various applications to motivate the problem. We then summarise conditions for the existence of a common LCLF for switched systems evolving in the entire positive orthant, as well as when the positive orthant is partitioned into cones. Finally, in Section 4 we apply these results to the examples given at the beginning.

Florian Knorn, Oliver Mason and Robert Shorten Hamilton Institute, National University of Ireland Maynooth, Maynooth, Co. Kildare, Ireland, e-mail: [email protected],[email protected], [email protected]@nuim.ie

R. Bru and S. Romero-Vivó (Eds.): Positive Systems, LNCIS 389, pp. 331–338. springerlink.com © Springer-Verlag Berlin Heidelberg 2009

Notation and Mathematical Preliminaries

Throughout, $\mathbb{R}$ (resp. $\mathbb{R}_+$) denotes the field of real (resp. positive) numbers, $\mathbb{R}^n$ is the n-dimensional Euclidean space and $\mathbb{R}^{n \times n}$ the space of $n \times n$ matrices with real entries. A closed, pointed convex cone $\mathcal{C}$ is a subset of $\mathbb{R}^n$ such that $\alpha x + \beta y \in \mathcal{C}$ for any $x, y \in \mathcal{C}$ and non-negative scalars $\alpha, \beta$. Matrices or vectors are said to be positive (resp. non-negative) if all of their entries are positive (resp. non-negative); this is written as $A \succ 0$ (resp. $A \succeq 0$), where 0 is the zero matrix of appropriate dimension. A matrix $A$ is said to be Hurwitz if all its eigenvalues lie in the open left half of the complex plane. A matrix is said to be Metzler if all its off-diagonal entries are non-negative. We use $\Sigma_A$ to denote the linear time-invariant (LTI) system $\dot{x} = Ax$. Such a system is called positive if, for a positive initial condition, all its states remain in the positive orthant throughout time. A classic result shows that this will be the case if and only if $A$ is a Metzler matrix [3]. Similarly, a switched linear positive system is a dynamical system of the form $\dot{x} = A_{s(t)}x$, for $x(0) = x_0$, where $s : \mathbb{R} \to \{1,\dots,N\}$ is the so-called switching signal and $\{A_1,\dots,A_N\}$ are the system matrices of the constituent systems, which are Metzler matrices. See [5, 18] for more details on systems of this type. Below we will just write $\dot{x} = A(t)x$ for such a system. Finally, the function $V(x) = v^Tx$ is said to be a linear co-positive Lyapunov function (LCLF) for the positive LTI system $\Sigma_A$ if and only if $V(x) > 0$ and $\dot{V}(x) < 0$ for all $x \succeq 0$ and $x \neq 0$, or, equivalently, $v \succ 0$ and $v^TA \prec 0$.
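The definitions of a Metzler matrix and of an LCLF translate directly into entrywise checks. The sketch below uses hypothetical helper names and a made-up 2 × 2 matrix; matrices are plain nested lists, with `a[i][j]` the entry in row i and column j.

```python
from typing import Sequence

Matrix = Sequence[Sequence[float]]


def is_metzler(a: Matrix) -> bool:
    """True if every off-diagonal entry of the square matrix a is non-negative."""
    n = len(a)
    return all(a[i][j] >= 0 for i in range(n) for j in range(n) if i != j)


def is_lclf(v: Sequence[float], a: Matrix) -> bool:
    """True if V(x) = v^T x is an LCLF for x' = a x, i.e. v > 0 and v^T a < 0 entrywise."""
    n = len(a)
    if any(vi <= 0 for vi in v):
        return False
    # j-th entry of the row vector v^T a.
    vta = [sum(v[i] * a[i][j] for i in range(n)) for j in range(n)]
    return all(entry < 0 for entry in vta)


# Illustrative matrix (not taken from the paper).
a_example = [[-2.0, 1.0], [1.0, -3.0]]
print(is_metzler(a_example), is_lclf([1.0, 1.0], a_example), is_lclf([1.0, 5.0], a_example))
```

A vector that passes `is_lclf` for every matrix of a switched system is a common LCLF in the sense used in the rest of the paper.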

2 Motivating Examples

To motivate our results we shall first present a few situations to which they can be applied.

1) Classes of switched time-delay systems

Consider the class of n-dimensional linear positive systems with time-delay $\tau \ge 0$, similar to those considered by Haddad et al. in [6], but where both the system and the delay matrices may be switching over time:

$$\dot{x}(t) = A(t)x(t) + A_d(t)x(t - \tau), \qquad x(\theta) = \phi(\theta), \quad -\tau \le \theta \le 0 \tag{1}$$

where we assume that the system matrix $A(t) \in \{A_1,\dots,A_N\}$ is Metzler, the delay matrix $A_d(t) \in \{A_{d1},\dots,A_{dM}\}$ is non-negative, $A(t) + A_d(t)$ is Metzler and Hurwitz for all $t \ge 0$, and where $\phi : [-\tau, 0] \to \mathbb{R}^n$ is a continuous, vector-valued function specifying the initial condition of the system. How can stability of the system for arbitrary switching and delays be shown?

2) Switched positive systems with multiplicative noise

Consider the class of switched positive systems with feedback quantisation or where the states experience resets. In this type of system, the states on the right hand side are scaled by a (usually time-varying) diagonal matrix:

$$\dot{x} = A(t)D(t)x, \qquad A(t) \in \{A_1,\dots,A_N\}$$

where we assume that $A(t)$ is Metzler and Hurwitz for all $t$, and the diagonal matrix $D(t)$ has strictly positive and bounded diagonal entries for all $t$. Under which conditions would such a system be stable?

3) Robustness of switched positive systems with channel dependent multiplicative noise

An important class of positive systems is the class that arises in certain networked control problems. Here, the system of interest has the form:

$$\dot{x} = A(t)x + \big[C^{[1]}(t) + \cdots + C^{[n]}(t)\big]x$$

where $A(t)$ is Metzler and where $C^{[i]}(t) \succeq 0$ is an $n \times n$ matrix that describes the communication path from the network states to the ith state; namely, it is a matrix of unit rank with only one non-zero row. Usually, the network interconnection structure varies with time between $N$ different configurations, so that $A(t) \in \{A_1,\dots,A_N\}$ and $C^{[i]}(t) \in \{C^{[i]}_1,\dots,C^{[i]}_N\}$ for $i = 1,\dots,n$. Again, we assume that $A(t) + C^{[1]}(t) + \cdots + C^{[n]}(t)$ is also Metzler and Hurwitz for all $t$. What can be said regarding asymptotic stability here?

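4) Numerical example

Finally, to provide a more concrete example, assume we are given a switched linear positive system with the following three Metzler and Hurwitz matrices

$$A = \begin{bmatrix} -16 & 6 & 6 \\ 1 & -18 & 2 \\ 5 & 3 & -20 \end{bmatrix}, \quad B = \begin{bmatrix} -10 & 4 & 0 \\ 8 & -10 & 9 \\ 4 & 3 & -13 \end{bmatrix}, \quad C = \begin{bmatrix} -9 & 2 & 8 \\ 6 & -10 & 4 \\ 8 & 0 & -16 \end{bmatrix} \tag{2}$$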

Can we prove that it is stable under arbitrary switching?

3 Common Linear Co-positive Lyapunov Functions

As mentioned in the introduction, work reported in [14] discusses conditions for the existence of a common LCLF for switched linear positive systems comprised of sets of LTI systems, where each of the constituent systems is assumed to be associated with a convex region of the positive orthant of $\mathbb{R}^n$.

Let us briefly present two results. The first, more general result concerns situations where the state space (the positive orthant) is partitioned into smaller regions, and where only certain subsystems may be active in certain regions (this may be interpreted as state dependent switching). The other result focuses on the special case where each of those regions is the entire positive orthant itself, that is, the system can switch to any subsystem at any given point in the state space.

3.1 Switching in Partitioned Positive Orthant

Assume there are $N$ closed pointed convex cones $\mathcal{C}_j$ such that the closed positive orthant can be written as $\mathbb{R}^n_+ = \cup_{j=1}^{N} \mathcal{C}_j$. Moreover, assume that we are given stable positive LTI systems $\Sigma_{A_j}$ for $j = 1,\dots,N$ such that the jth system can only be active for states within $\mathcal{C}_j$. The following theorem then gives a necessary and sufficient condition for the existence of a common LCLF in this set-up.

Theorem 1. Given $N$ Metzler and Hurwitz matrices $A_1,\dots,A_N \in \mathbb{R}^{n \times n}$ and $N$ closed, convex pointed cones $\mathcal{C}_1,\dots,\mathcal{C}_N$ such that $\mathbb{R}^n_+ = \cup_{j=1}^{N} \mathcal{C}_j$, precisely one of the following statements is true:
1. There is a vector $v \in \mathbb{R}^n_+$ such that $v^T A_j x_j < 0$ for all non-zero $x_j \in \mathcal{C}_j$ and $j = 1,\dots,N$.
2. There are vectors $x_j \in \mathcal{C}_j$, not all zero, such that $\sum_{j=1}^{N} A_j x_j \succeq 0$.

Proof. 2 ⇒ ¬1 (that is, we show that if 2 is true, then 1 cannot hold): Assume 2 holds. Then, for any positive vector $v \succ 0$ we have $v^T A_1 x_1 + \cdots + v^T A_N x_N \ge 0$, which implies that 1 cannot hold.

¬2 ⇒ 1: Assume 2 does not hold, i.e. there are no vectors $x_j \in \mathcal{C}_j$, not all zero, such that $\sum_{j=1}^{N} A_j x_j \succeq 0$. This means that the following intersection of convex cones is empty:

$$\underbrace{\Big\{ \textstyle\sum_{j=1}^{N} A_j x_j : x_j \in \mathcal{C}_j, \text{ not all zero} \Big\}}_{\mathcal{O}_1} \cap \underbrace{\{ x \succeq 0 \}}_{\mathcal{O}_2} = \emptyset.$$

By scaling appropriately we can see that this is equivalent to:

$$\underbrace{\Big\{ \textstyle\sum_{j=1}^{N} A_j x_j : x_j \in \mathcal{C}_j,\ \sum_{j=1}^{N} \|x_j\|_1 = 1 \Big\}}_{\bar{\mathcal{O}}_1} \cap \underbrace{\{ x \succeq 0 \}}_{\mathcal{O}_2} = \emptyset \tag{3}$$

where $\|\cdot\|_1$ denotes the $L_1$-norm. Now, $\bar{\mathcal{O}}_1$ and $\mathcal{O}_2$ are disjoint non-empty closed convex sets and additionally $\bar{\mathcal{O}}_1$ is bounded. Thus, we can apply Corollary 4.1.3 from [8], which guarantees the existence of a vector $v \in \mathbb{R}^n$ such that

$$\max_{y \in \bar{\mathcal{O}}_1} v^T y < \inf_{y \in \mathcal{O}_2} v^T y \tag{4}$$

As the zero vector is in $\mathcal{O}_2$, it follows that $\inf_{y \in \mathcal{O}_2} v^T y \le 0$. However, as $\mathcal{O}_2$ is the cone $\{x \succeq 0\}$ it also follows that $\inf_{y \in \mathcal{O}_2} v^T y \ge 0$. Thus, $\inf_{y \in \mathcal{O}_2} v^T y = 0$. Hence, $v^T y \ge 0$ for all $y \succeq 0$ and thus $v \succeq 0$. Moreover, from (4), we can conclude that for any $j = 1,\dots,N$ and any $x_j \in \mathcal{C}_j$ with $\|x_j\|_1 = 1$ we have $v^T A_j x_j < 0$. As $\mathcal{C}_j \cap \{x \succeq 0 : \|x\|_1 = 1\}$ is compact, it follows from continuity that by choosing $\varepsilon > 0$ sufficiently small, we can guarantee that $v_\varepsilon := v + \varepsilon \mathbf{1} \succ 0$ satisfies $v_\varepsilon^T A_j x_j < 0$ for all $x_j \in \mathcal{C}_j \cap \{x \succeq 0 : \|x\|_1 = 1\}$ and all $j = 1,\dots,N$, where $\mathbf{1}$ is the vector of all ones. Finally, it is easy to see that $v_\varepsilon^T A_j x_j < 0$ is true even without the norm requirement on $x_j$. This completes the proof of the theorem. □

A very practical way of partitioning the state space would be to partition it using simplicial cones $\mathcal{C}_j$. These are cones generated by non-negative, non-singular generating matrices $Q_j \in \mathbb{R}^{n \times n}$:

$$\mathcal{C}_j := \Big\{ x : x = \textstyle\sum_{i=1}^{n} \alpha_i q_j^{(i)},\ \alpha_i \ge 0,\ i = 1,\dots,n \Big\} \tag{5}$$

where $j = 1,\dots,N$ and $q_j^{(i)}$ denotes the ith column of $Q_j$. In that case, we may include the cone generating matrices into the second statement of Theorem 1 to reword it slightly to:

[...]
2. There are vectors $w_j \succeq 0$, not all zero, s.t. $\sum_{j=1}^{N} B_j w_j \succeq 0$, with $B_j := A_j Q_j$.

This new statement 2 can now be easily tested by running a feasibility check on a suitably defined linear program, see [14] for more details.
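One possible way of writing the feasibility check is sketched below. The normalisation constraint is an assumption introduced here to exclude the trivial solution $w_1 = \dots = w_N = 0$; the formulation actually used in [14] may differ.

```latex
\begin{aligned}
\text{find}\quad & w_1,\dots,w_N \in \mathbb{R}^n \\
\text{subject to}\quad & \sum_{j=1}^{N} B_j w_j \succeq 0, \\
& w_j \succeq 0, \qquad j = 1,\dots,N, \\
& \sum_{j=1}^{N} \mathbf{1}^T w_j = 1 .
\end{aligned}
```

If this program is feasible, the reworded statement 2 holds and, by Theorem 1, no common LCLF exists; if it is infeasible, statement 1 holds and a common LCLF exists.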

3.2 Switching in Entire Positive Orthant

An important special case of the previous results is when the $Q_j$ matrices are the identity matrix, namely when we seek a common linear co-positive Lyapunov function for a finite set of linear positive systems. For that, some additional notation is required: Let the set containing all possible mappings $\sigma : \{1,\dots,n\} \to \{1,\dots,N\}$ be called $S_{n,N}$, for positive integers $n$ and $N$. Given $N$ matrices $A_j$, these mappings will then be used to construct matrices $A_\sigma(A_1,\dots,A_N)$ in the following way:

$$A_\sigma(A_1,\dots,A_N) := \begin{bmatrix} a_{\sigma(1)}^{(1)} & a_{\sigma(2)}^{(2)} & \cdots & a_{\sigma(n)}^{(n)} \end{bmatrix} \tag{6}$$

where $a_j^{(i)}$ denotes the ith column of $A_j$. In other words, the ith column $a_\sigma^{(i)}$ of $A_\sigma$ is the ith column of one of the $A_1,\dots,A_N$ matrices, depending on the mapping $\sigma \in S_{n,N}$ chosen. We then have the following condition:

Theorem 2. Given $N$ Hurwitz and Metzler matrices $A_1,\dots,A_N \in \mathbb{R}^{n \times n}$, the following statements are equivalent:
1. There is a vector $v \in \mathbb{R}^n_+$ such that $v^T A_j \prec 0$ for all $j = 1,\dots,N$.
2. $A_\sigma(A_1,\dots,A_N)$ is Hurwitz for all $\sigma \in S_{n,N}$.

Proof. Given in [14]. □

Remark 1. Since the submission of [14] it has come to our attention that this result may also be deduced from the more general results on P-matrix sets given in [19]. Theorem 2 states that $N$ positive LTI systems have a common linear co-positive Lyapunov function $V(x) = v^Tx$ if and only if the $A_\sigma(A_1,\dots,A_N)$ matrices are Hurwitz matrices, for all $\sigma \in S_{n,N}$. In that case, the switched system formed by these subsystems is uniformly asymptotically stable under arbitrary switching. Finally, note that when the $A_jQ_j$ in Theorem 1 (or its reworded version) are Metzler and Hurwitz, then the Hurwitz condition of Theorem 2 can also be used to give a solution to the state dependent switching problem.
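To make the construction in (6) concrete, consider the smallest non-trivial case $n = N = 2$. The entry names $a_{ik}$ and $b_{ik}$ below are introduced only for this illustration, and a mapping $\sigma$ is written as the pair $(\sigma(1), \sigma(2))$.

```latex
A_1 = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix},
\qquad
A_2 = \begin{bmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{bmatrix}
\quad\Longrightarrow\quad
A_{(1,1)} = A_1,\quad
A_{(2,2)} = A_2,\quad
A_{(1,2)} = \begin{bmatrix} a_{11} & b_{12} \\ a_{21} & b_{22} \end{bmatrix},\quad
A_{(2,1)} = \begin{bmatrix} b_{11} & a_{12} \\ b_{21} & a_{22} \end{bmatrix}.
```

In general there are $N^n$ such matrices, so the Hurwitz condition of Theorem 2 amounts to a finite number of eigenvalue checks.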

4 Solution to Motivating Examples

We shall now use these results to answer the problems posed in Section 2.

1) Classes of switched time-delay systems

We can show stability under arbitrary switching and delays if two conditions are met: (a) there is a matrix $\tilde{A}_d$ such that $A_d(t) - \tilde{A}_d \preceq 0$ for all $t$, i.e. there is a matrix $\tilde{A}_d$ that is entry-wise greater than or equal to $A_{di}$ for all $i = 1,\dots,M$; (b) for all $\sigma \in S_{n,N}$ the matrices $A_\sigma(A_1 + \tilde{A}_d,\dots,A_N + \tilde{A}_d)$ are Hurwitz. This can be seen by noting that (b) guarantees (by applying Theorem 2) the existence of a vector $v \succ 0$ such that $v^T(A(t) + \tilde{A}_d) \prec 0$. Then, consider the following Lyapunov-Krasovskii functional [7, 13]:

$$V(\psi) = v^T\psi(0) + v^T\tilde{A}_d \int_{-\tau}^{0} \psi(\theta)\,d\theta$$

for some $v \succ 0$. Clearly $V(\psi) \ge v^T\psi(0) \ge a\|\psi(0)\|_\infty$ with $a = \min_i\{v_i\} > 0$ and $\|\cdot\|_\infty$ being the maximum modulus norm. Next, define $x_t := \{x(t + \theta) \mid \theta \in [-\tau, 0]\}$ as the trajectory segment of the states in the interval $[t - \tau, t]$. Then, if condition (a) is met, the directional derivative of the above functional along the solutions of (1) will be

$$\begin{aligned}
\dot{V}(x_t) &= v^T\dot{x}(t) + v^T\tilde{A}_d\big(x(t) - x(t - \tau)\big) \\
&= v^T\big(A(t)x(t) + A_d(t)x(t - \tau)\big) + v^T\tilde{A}_d\big(x(t) - x(t - \tau)\big) \\
&= v^T\big(A(t) + \tilde{A}_d\big)x(t) + v^T\underbrace{\big(A_d(t) - \tilde{A}_d\big)}_{\preceq 0}x(t - \tau) \\
&\le -p^Tx(t) \\
&\le -\beta\|x(t)\|_\infty
\end{aligned}$$

where $p \succ 0$ is a vector with $v^T(A(t) + \tilde{A}_d) \preceq -p^T$ for all $t$ (it exists because $A(t)$ takes only finitely many values) and $\beta = \min_i\{p_i\} > 0$. It then follows (see for instance [7]) that the switched system is uniformly asymptotically stable.

2) Switched positive systems with multiplicative noise

Through Theorem 2 we know that if $A_\sigma(A_1,\dots,A_N)$ is a Hurwitz matrix for all $\sigma \in S_{n,N}$, then there exists a common LCLF for the system. In that case, since $D(t)x \succ 0$, the system will be stable for any $D(t)$.
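The reasoning behind this claim can be written out in one line. The sketch below assumes a common LCLF $V(x) = v^Tx$, that is, $v \succ 0$ with $v^TA_j \prec 0$ for all $j$; it spells out the argument rather than reproducing a derivation from the source.

```latex
\dot{V}(x) = v^T A(t) D(t)\, x = \big(v^T A(t)\big)\big(D(t)\,x\big) < 0
\qquad \text{for all } x \succeq 0,\ x \neq 0,
```

because $v^TA(t) \prec 0$, and $D(t)x$ is non-negative and non-zero whenever $x$ is, the diagonal entries of $D(t)$ being strictly positive.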

3) Robustness of switched positive systems with channel dependent multiplicative noise

Again, our principal result can be used to give conditions such that this system is stable. A sufficient requirement for asymptotic stability here would be that $A_\sigma(B_1,\dots,B_q)$ is a Metzler and Hurwitz matrix for all $\sigma \in S_{n,q}$, where $q = N^{(n+1)}$ and $B_1,\dots,B_q$ are all the matrices of the form $A_{i_0} + C^{[1]}_{i_1} + \cdots + C^{[n]}_{i_n}$ with $i_0,\dots,i_n \in \{1,\dots,N\}$. Further, by exploiting simple properties of Metzler matrices, we will also get the robust stability of the related system:

$$\dot{x} = A(t)x + \big[C^{[1]}(t)D^{[1]}(t) + \cdots + C^{[n]}(t)D^{[n]}(t)\big]x$$

where the $D^{[i]}(t)$ are non-negative diagonal matrices whose diagonal entries are strictly positive and bounded by one. This latter result is important as it can be used to model uncertain communication channel characteristics.

4) Numerical example

With $A, B, C$ given as in (2), it turns out that all $A_\sigma(A, B, C)$ are Hurwitz matrices, for any $\sigma \in S_{3,3}$; hence a switched linear positive system with these matrices will be uniformly asymptotically stable under arbitrary switching. If, however, the (3,1)-element of $C$ is changed from 8 to 14 (note that after the change $C$ is still a Metzler and Hurwitz matrix), then the matrix $A_{(3,2,3)} = \begin{bmatrix} c^{(1)} & b^{(2)} & c^{(3)} \end{bmatrix}$ will have an eigenvalue $\lambda \approx 1.7$, which violates the Hurwitz condition.
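The numerical claims can be reproduced with a short pure-Python script. Two choices here are ours and not part of the source: the Hurwitz test relies on the standard fact that a Metzler matrix is Hurwitz exactly when all leading principal minors of its negative are positive, and the offending eigenvalue is approximated by power iteration on a shifted copy of the matrix. All function names are hypothetical.

```python
from itertools import product

A = [[-16, 6, 6], [1, -18, 2], [5, 3, -20]]
B = [[-10, 4, 0], [8, -10, 9], [4, 3, -13]]
C = [[-9, 2, 8], [6, -10, 4], [8, 0, -16]]


def det(m):
    """Determinant by Laplace expansion along the first row (adequate for small n)."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(len(m)))


def metzler_is_hurwitz(m):
    """For a Metzler matrix m: Hurwitz iff all leading principal minors of -m are positive."""
    neg = [[-entry for entry in row] for row in m]
    return all(det([row[:k] for row in neg[:k]]) > 0 for k in range(1, len(m) + 1))


def a_sigma(mats, sigma):
    """Matrix whose i-th column is the i-th column of mats[sigma[i]] (0-based indices)."""
    n = len(sigma)
    return [[mats[sigma[col]][row][col] for col in range(n)] for row in range(n)]


def all_a_sigma_hurwitz(mats):
    """Check the Hurwitz condition of Theorem 2 over all N**n mappings sigma."""
    n = len(mats[0])
    return all(metzler_is_hurwitz(a_sigma(mats, sigma))
               for sigma in product(range(len(mats)), repeat=n))


def spectral_abscissa(m, iters=200):
    """Largest real part of the eigenvalues of a Metzler matrix m.

    Power iteration on the non-negative matrix m + c*I, assumed primitive
    so that the iteration converges to its Perron root.
    """
    n = len(m)
    c = max(abs(m[i][i]) for i in range(n))
    x = [1.0] * n
    rho = 0.0
    for _ in range(iters):
        y = [sum((m[i][j] + (c if i == j else 0)) * x[j] for j in range(n))
             for i in range(n)]
        rho = max(y)
        x = [yi / rho for yi in y]
    return rho - c


C_mod = [row[:] for row in C]
C_mod[2][0] = 14                        # the (3,1) entry changed from 8 to 14
M = a_sigma([A, B, C_mod], (2, 1, 2))   # sigma = (3, 2, 3) in the 1-based notation of the text

print(all_a_sigma_hurwitz([A, B, C]))
print(metzler_is_hurwitz(C_mod), metzler_is_hurwitz(M), spectral_abscissa(M))
```

The minor test uses exact integer arithmetic, so the yes/no answers do not depend on floating-point tolerances; only the eigenvalue estimate is approximate.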

5 Conclusion

In this paper, after presenting a few motivating examples, we have reviewed necessary and sufficient conditions for the existence of a certain type of Lyapunov function for switched linear positive systems. We then illustrated and commented on the implications of our results. Future work will consider switched positive systems with time delay, and we suspect that the results reviewed here will be of great value in this future study.

References

1. Arcak, M., Sontag, E.D.: Diagonal stability of a class of cyclic systems and its connection with the secant criterion. Automatica 42(9), 1531–1537 (2006)
2. Berman, A., Plemmons, R.J.: Nonnegative Matrices in the Mathematical Sciences. In: Computer science and applied mathematics. Academic Press, New York (1979)
3. Farina, L., Rinaldi, S.: Positive Linear Systems. Wiley-Interscience Series. John Wiley & Sons, Inc., New York (2000)
4. Foschini, G.J., Miljanic, Z.: A simple distributed autonomous power control algorithm and its convergence. IEEE Transactions on Vehicular Technology 42(4), 641–646 (1993)
5. Gurvits, L., Shorten, R., Mason, O.: On the stability of switched positive linear systems. IEEE Transactions on Automatic Control 52(6), 1099–1103 (2007)
6. Haddad, W.M., Chellaboina, V.: Stability theory for nonnegative and compartmental dynamical systems with time delay. Systems & Control Letters 51(5), 355–361 (2004)
7. Hale, J., Verduyn Lunel, S.M.: Introduction to Functional Differential Equations. In: Applied Mathematical Sciences, vol. 99. Springer, New York (1993)
8. Hiriart-Urruty, J.B., Lemaréchal, C.: Fundamentals of convex analysis. Grundlehren Text Editions. Springer, Heidelberg (2001)
9. Jacquez, J.A., Simon, C.P.: Qualitative theory of compartmental systems. SIAM Review 35(1), 43–79 (1993)
10. Jadbabaie, A., Lin, J., Morse, A.S.: Coordination of groups of mobile autonomous agents using nearest neighbor rules. IEEE Transactions on Automatic Control 48(6), 988–1001 (2003)
11. Johnson, C.R.: Sufficient conditions for D-stability. Journal of Economic Theory 9(1), 53–62 (1974)
12. Johnson, C.R., Mehrmann, V., Olesky, D.D.: Sign controllability of a nonnegative matrix and a positive vector. SIAM Journal on Matrix Analysis and Applications 14(2), 398–407 (1993)
13. Kharitonov, V.L.: Robust stability analysis of time delay systems: A survey. Annual Reviews in Control 23, 185–196 (1999)
14. Knorn, F., Mason, O., Shorten, R.: On linear co-positive Lyapunov functions for sets of linear positive systems. Automatica (2008) (to appear)
15. Mason, O., Shorten, R.: On linear copositive Lyapunov functions and the stability of switched positive linear systems. IEEE Transactions on Automatic Control 52(7), 1346–1349 (2007)
16. Meyn, S.P.: Control Techniques for Complex Networks. Cambridge University Press, New York (2008)
17. Shorten, R., Wirth, F., Leith, D.J.: A positive systems model of tcp-like congestion control: asymptotic results. IEEE/ACM Transactions on Networking 14(3), 616–629 (2006)
18. Shorten, R., Wirth, F., Mason, O., Wulff, K., King, C.: Stability criteria for switched and hybrid systems. SIAM Review 49(4), 545–592 (2007)
19. Song, Y., Gowda, M.S., Ravindran, G.: On some properties of P-matrix sets. Linear Algebra and its Applications 290(1-3), 237–246 (1999)
20. Virnik, E.: Analysis of positive descriptor systems. Ph.D. thesis, Technische Universität Berlin, Germany (2008)
21. Šiljak, D.D.: Large-Scale Dynamic Systems: Stability and Structure. North-Holland Series in System Science and Engineering, vol. 3. North-Holland Publishing Co., New York (1979)

A Problem in Positive Systems Stability Arising in Topology Control

Florian Knorn, Rade Stanojevic, Martin Corless and Robert Shorten

Abstract. We present a problem in the stability of switched positive systems that arises in network topology control. Preliminary results are given that guarantee stability of a network topology control problem under certain assumptions. Roughly speaking, these assumptions reduce the underlying stability problem to a nonlinear consensus problem with a driving term, which eventually becomes a Lur'e problem. Simulation results are given to illustrate our algorithm. While these results indicate that our assumptions can be removed, a proof of the general stability problem remains open.

1 Introduction

Recent years have witnessed a growing interest in the control community in problems that arise when dynamic systems evolve over graphs. While the most high-profile of these are consensus applications such as formation flying and synchronisation problems [4, 8, 11], many other applications have arisen where the manner in which network topologies change affects the performance of algorithms that are run over these networks. In such applications, an essential requirement is that the topology of the graph be such that some properties required to support communication and control are satisfied, the most basic of these being that the network is connected. Considerations of this kind have given rise to the emerging field of network topology control. Clearly, graph connectivity is an essential component in situations where a group of network nodes must work together, in a decentralised manner, to achieve some global task. This issue of graph connectivity is therefore very important and has received much attention in various contexts recently.

In this paper we describe a recently proposed decentralised topology control algorithm [6] and suggest a simple way of adding weights to the states. This algorithm was proposed to overcome some common assumptions in topology control (namely, that the underlying graph is symmetric). It exploits the fact that the rate of convergence of certain algorithms evolving over a graph is a good proxy for graph connectivity. Furthermore, this rate of convergence can be estimated in a decentralised manner, and can therefore be used to regulate graph connectivity. Under the assumption that the estimation problem and the control problem operate on different time-scales, stability can be demonstrated using elementary arguments. In particular, we show that the feedback system reduces to a consensus problem with an input, and that it eventually becomes a scalar nonlinear system that can be analysed in a Lur'e problem framework. Simulations are presented to illustrate the validity of our results. These results also indicate that the feedback system is stable even when a separation of time-scales is not present. This latter problem in positive systems remains open and is posed in the concluding remarks of the paper.

Florian Knorn, Rade Stanojevic, Robert Shorten: Hamilton Institute, National University of Ireland Maynooth, Maynooth, Co. Kildare, Ireland, e-mail: [email protected], [email protected], [email protected]
Martin Corless: School of Aeronautics & Astronautics, Purdue University, West Lafayette, IN 47907, USA, e-mail: [email protected]

R. Bru and S. Romero-Vivó (Eds.): Positive Systems, LNCIS 389, pp. 339–347. springerlink.com © Springer-Verlag Berlin Heidelberg 2009

2 Main Results

In the context of this paper and the topology control problem discussed in [6], we are interested in the following type of n-dimensional positive systems with an input term:

\[ x(k+1) = f\bigl(x(k), k\bigr) + u\bigl(x(k), k\bigr)\,\mathbf{1}, \qquad x(0) = x_0, \quad k = 0, 1, 2, \dots \]

where $x(k) \in \mathbb{R}^n_+$ are the states, $f : \mathbb{R}^n_+ \times \mathbb{N} \to \mathbb{R}^n_+$ is a continuous vector-valued function, and $u(x(k), k) \in \mathbb{R}$ is an input term that, through $\mathbf{1} = (1\ 1\ \dots\ 1)^T$, is added equally to all states and that we assume to be such that the system's states do not leave the positive orthant. We would then like to investigate under which conditions the system's states approach each other over time and, in the limit, all take the same value (which may be time-varying, depending on the input term). More formally, we are looking for conditions such that $\lim_{k\to\infty} \bigl(x_i(k) - x_j(k)\bigr) = 0$ for all $i, j \in \{1, \dots, n\}$.

2.1 Affine Case

Before stating our more general result we would like to present a more easily established result which can be obtained when the function f takes a particular linear form: $f(x, k) = P(k)\,x$, where $P(k) \in \mathbb{R}^{n \times n}$ is a sequence of primitive, row-stochastic (and thus non-negative) matrices with strictly positive main diagonal entries. This special form is often encountered in distributed averaging or consensus applications, for instance.

Theorem 1. Let $P(k) \in \mathbb{R}^{n \times n}$ be a sequence of matrices taken from a finite set of primitive, row-stochastic matrices with strictly positive main diagonal entries, and $u(x(k), k)$ a sequence of real, non-negative numbers. If $x(k) \in \mathbb{R}^n_+$ evolves for some $x(0) = x_0 \in \mathbb{R}^n_+$ according to

\[ x(k+1) = P(k)\,x(k) + u\bigl(x(k), k\bigr)\,\mathbf{1} \tag{1} \]

where $\mathbf{1} = (1\ \dots\ 1)^T$, the elements of x(k) will approach each other over time, that is, $\lim_{k\to\infty} \bigl(x_i(k) - x_j(k)\bigr) = 0$ for all $i, j \in \{1, \dots, n\}$.

Proof. Given in [6].
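The statement of Theorem 1 can be checked numerically on a toy example. The following is a minimal sketch, not the setup used in [6]: the two 3 × 3 matrices, the alternating switching rule and the input sequence are all hypothetical choices made only so that the hypotheses (primitive, row-stochastic, positive diagonal, non-negative input) hold.

```python
import math

# Two hypothetical primitive, row-stochastic matrices with positive diagonals.
P_A = [[0.5, 0.5, 0.0],
       [0.0, 0.5, 0.5],
       [0.5, 0.0, 0.5]]
P_B = [[0.6, 0.0, 0.4],
       [0.3, 0.7, 0.0],
       [0.0, 0.2, 0.8]]


def step(P, x, u):
    """One iteration of x(k+1) = P x(k) + u * 1."""
    return [sum(p_ij * x_j for p_ij, x_j in zip(row, x)) + u for row in P]


def spread(x):
    """Largest pairwise difference, max_i x_i - min_j x_j."""
    return max(x) - min(x)


def simulate(x0, steps):
    """Run the switched iteration and record the spread at every step."""
    x = list(x0)
    spreads = [spread(x)]
    for k in range(steps):
        P = P_A if k % 2 == 0 else P_B   # arbitrary (assumed) switching signal
        u = 0.1 * (1.0 + math.sin(k))    # arbitrary (assumed) non-negative input
        x = step(P, x, u)
        spreads.append(spread(x))
    return x, spreads


if __name__ == "__main__":
    x_final, spreads = simulate([1.0, 0.0, 4.0], 200)
    print("initial spread:", spreads[0])
    print("final spread:  ", spreads[-1])
```

Because the input is added equally to every state, it shifts the common value over time but leaves the pairwise differences untouched, which is why the spread shrinks even though the states themselves keep moving.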

2.2 General Case

The previous result can be extended to classes of nonlinear consensus operators using the recent results of [7]. Borrowing its notation, we get:

Theorem 2. Let $G(k) = \bigl(V, A(k)\bigr)$ be a sequence of strongly connected graphs¹, $u(x(k), k)$ a sequence of finite real numbers and f a map on $\mathbb{R}^n_+ \times \mathbb{N}$ satisfying the following. Associated to each directed graph $G = (V, A)$ with node set $V = \{1, \dots, n\}$, each node $i \in V$ and each state $x \in \mathbb{R}^n_+$, there is a compact set $E_i(A)(x) \subset \mathbb{R}$ satisfying:

1. $f_i(x, k) \in E_i\bigl(A(k)\bigr)(x)$ $\forall k \in \mathbb{N}$, $\forall x \in \mathbb{R}^n_+$,
2. $E_i(A)(x) = \{x_i\}$ whenever the states of node i and its neighbouring nodes j are all equal,
3. $E_i(A)(x)$ is contained in the relative interior of the convex hull of the states of node i and its neighbouring nodes j whenever the states of node i and its neighbouring nodes j are not all equal,
4. $E_i(A)(x)$ depends continuously on x, that is, the set-valued function $E_i(A) : \mathbb{R}^n_+ \rightrightarrows \mathbb{R}_+$ is continuous.

Then, if the states $x(k) \in \mathbb{R}^n_+$ evolve for some initial condition $x(0) = x_0 \in \mathbb{R}^n_+$ according to

\[ x(k+1) = f\bigl(x(k), k\bigr) + u\bigl(x(k), k\bigr)\,\mathbf{1} \]

where $\mathbf{1} = (1\ \dots\ 1)^T$, the elements of x(k) will approach each other over time, i.e. $\lim_{k\to\infty} \bigl(x_i(k) - x_j(k)\bigr) = 0$ for all $i, j \in \{1, \dots, n\}$.

Proof. Given in [6].

Remark 1. Put simply, the theorem's four conditions require that the updated state of each node must be a strict convex combination of its own and its neighbours' states, and that the update function must be continuous.

¹ That is, there is a directed path connecting any two nodes in the network.

3 Application of Main Result

3.1 Distributed Averaging and Topology Control

Consensus or distributed averaging algorithms have been the subject of an inordinate amount of attention in the past decade, as they arise in applications such as distributed sensing, clock synchronisation, flocking, or fusion of Kalman filter data; see for instance [2, 4, 9, 10]. Since the rate at which these algorithms converge strongly depends on structural properties of the network of nodes they are run on, it is an interesting problem to try to regulate the topology of the graph in order to ultimately control the speed at which consensus will be achieved. But as control usually requires some form of measurement and feedback of the quantity of interest, in this case one would need to be able to determine the level of connectivity. While the primary focus of the present paper is neither on the properties nor the dynamics of consensus algorithms, we recall that the second eigenvalue in magnitude of the averaging matrix² determines the rate at which the nodes in a network achieve consensus. Classically, the second smallest eigenvalue of the Laplacian (or transition Laplacian) matrix of a graph has been used as an algebraic measure for connectivity [1, 3]. However, Laplacians are usually only defined for symmetric graphs, a restriction that we would like to avoid. In that regard, the second eigenvalue of an averaging matrix is also an excellent candidate measurement to indicate the degree of connectivity of an entire graph (whether the underlying graph is directed or not). It also has the added benefit that it can be estimated locally. In [6] we describe several methods of estimating this important, global quantity in a distributed way. With the algorithms provided therein, each node in a network is able to estimate the second eigenvalue using only local, readily available information.
In wireless networks (or, on a more abstract level, geometric graphs), this would offer the above-mentioned possibility to control or maintain a certain level of interconnectedness: each node could reduce or expand its communication radius if the connectivity is estimated to be larger or smaller than required (as decreasing or increasing this radius will lead to reducing or increasing the number of neighbours, hence changing connectivity). That such a strategy is well posed follows from the basic observation that if all nodes increase their communication radii sufficiently, then the graph will eventually achieve the desired level of connectedness. Let us investigate this control application more concretely in the following.
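The link recalled above between the second eigenvalue of the averaging matrix and the rate of consensus can be seen on a small example. The sketch below is only an illustration and is not one of the distributed estimators of [6]: it uses a hypothetical averaging matrix on a 3-node path graph, whose eigenvalues are 1, 0.5 and 0, and reads the decay rate off the ratio of successive spreads of the state vector.

```python
def average_step(P, x):
    """One consensus iteration x(k+1) = P x(k)."""
    return [sum(p * v for p, v in zip(row, x)) for row in P]


def spread(x):
    """Disagreement between nodes, max_i x_i - min_j x_j."""
    return max(x) - min(x)


def decay_ratios(P, x0, steps):
    """Ratios spread(x(k+1)) / spread(x(k)) along a consensus run."""
    x, ratios = list(x0), []
    for _ in range(steps):
        x_next = average_step(P, x)
        ratios.append(spread(x_next) / spread(x))
        x = x_next
    return ratios


# Hypothetical averaging matrix on a 3-node path graph (eigenvalues 1, 0.5, 0).
P = [[0.50, 0.50, 0.00],
     [0.25, 0.50, 0.25],
     [0.00, 0.50, 0.50]]

if __name__ == "__main__":
    print(decay_ratios(P, [1.0, 0.0, 0.0], 5))
```

For this matrix the disagreement halves at every step, matching the second eigenvalue 0.5; a value closer to one would mean slower consensus and hence a less well connected graph.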

² Many distributed averaging algorithms can be written as x(k+1) = Px(k), where x(k) is the vector containing the states of all the nodes in the network at time k. The row-stochastic, so-called averaging matrix P describes how each node averages its own value with that of its neighbours.

3.2 Control Strategy

Given a wireless network, we wish to adjust the communication radii $r_1, \dots, r_n > 0$ of the nodes in the network using the estimates of the second eigenvalue in magnitude of the averaging matrix, λ, with the ultimate objective of regulating λ to some neighbourhood of a target value; namely so that $|\lambda - \lambda^*| < \varepsilon$ for some $\lambda^* \in (0, 1)$ and $\varepsilon > 0$. Since there will always be more than one set of communication radii $\{r_1, \dots, r_n\}$ that will guarantee this objective, we shall propose a control law that guarantees that the closed loop algorithm converges to a unique, single radius used by all nodes. Although this additional requirement is made to facilitate analytical tractability (that is, uniqueness of the solution), it can also be motivated from a practical standpoint: having all nodes use the same broadcast radius helps to achieve similar battery lifetimes of the nodes, which is desirable in many applications. However, our framework is sufficiently general to allow other quantities of interest to be included in the control law design (but the convergence proofs will change accordingly). For instance, relaxing the requirements on the communication radii, one may require all nodes to have an equal number of neighbours (which would also yield a unique radius distribution).

To achieve this, we propose updating the individual node radii using a convex combination of their neighbours' radii, plus an input term that depends on the estimated second largest eigenvalue; that is, we feed back the current level of connectivity.³ Specifically, we suggest the following decentralised control law

\[ r(k+1) = P_c(k)\,r(k) + \eta\,\bigl(\lambda(k) - \lambda^*\bigr)\,\mathbf{1} \tag{2} \]

for some initial, strictly positive radius distribution $r(0) = r_0$ which guarantees a strongly connected graph. Here $P_c(k)$ is a sequence of primitive, row-stochastic averaging matrices on the graphs induced by r(k), λ(k) is the magnitude of the second largest eigenvalue of the averaging matrix P as in Footnote 2 for the graph topology at time k, and $\eta > 0$ is a suitable control gain. To be fully precise, we could write $\lambda\bigl(r(k)\bigr)$ to highlight that the second eigenvalue is ultimately a function of the topology of the graph, which in turn is dependent on r(k). Unfortunately both dependencies are rather complex in nature and hard to determine or express analytically. However, it can be shown that λ(k) may be treated as a sector bounded nonlinearity so that it can be handled in a Lur'e framework [5]: both Theorems 1 and 2 will guarantee that the control law forces all the radii, over time, to a common value. In other words, (2) will eventually become a scalar relation, so that the stability and convergence properties of the controlled system will eventually be governed by the scalar, positive system

\[ x(k+1) = x(k) + u\bigl(x(k), k\bigr). \]

³ Note that we assume a certain separation of time scales between the estimation and the control scheme, i.e. we assume that the estimators have successfully converged to an exact estimate that is hence common to all nodes in the network.

Since the properties of such systems are well understood, the above theorems offer interesting possibilities for the design of further control laws. This also allows us to determine how the control gain η must be chosen so that the closed loop system is stable, which is reported in depth in [6]. Note again, that any other consensus scheme (to which Theorem 2 can be applied) may be used as well. Also, we would like to stress that the proposed controller is decentralised in that each node only requires the radius information of its neighbours — information that can easily be broadcast along the communication that is necessary to run the algorithm used for estimating λ(k) in the first place.
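To make the role of the control gain concrete, the scalar system above can be iterated for different values of η. The sketch below rests on a strong simplifying assumption: the true dependence of λ on the common radius is not available in closed form, so the function `lam_of_radius` is a purely hypothetical, linearly decreasing stand-in, and the resulting gain thresholds apply to this stand-in only (the actual gain conditions are those reported in [6]).

```python
def lam_of_radius(r):
    """Hypothetical stand-in for lambda(r): decreasing in the common radius."""
    return 1.0 - 0.9 * r


def regulate(r0, lam_target, eta, steps):
    """Iterate r(k+1) = r(k) + eta * (lambda(r(k)) - lam_target)."""
    trajectory = [r0]
    for _ in range(steps):
        r = trajectory[-1]
        trajectory.append(r + eta * (lam_of_radius(r) - lam_target))
    return trajectory


if __name__ == "__main__":
    # For this stand-in the error is multiplied by (1 - 0.9 * eta) each step:
    # monotone convergence for small eta, oscillation and finally divergence
    # as eta grows.
    for eta in (0.5, 2.0, 2.5):
        traj = regulate(r0=0.9, lam_target=0.55, eta=eta, steps=60)
        print(f"eta = {eta}: final radius {traj[-1]:.4f}")
```

With the small gain the radius approaches its limit from one side without overshoot, which is the behaviour one would want in order to avoid shrinking the radii too far in a single iteration.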

Remark 2. Let us comment on possible connectivity issues when using the above control law. When λ* is chosen very close to one, it may be possible that in some iteration the control law would adjust (reduce) the communication radii so much that the network becomes disconnected. This can be prevented either by using a much smaller control gain than necessary for stability (which guarantees that λ* is approached without overshoot), or by introducing a "minimum radius" that the nodes' radii are not allowed to fall below and that is large enough to guarantee that the graph always remains strongly connected.

3.3 Weighting

In the scheme presented above, all nodes eventually reach a common radius. Now, imagine a setting where some nodes (say, nodes 17 and 25) are equipped with a longer-lasting power supply than others. In that case, these "special" nodes should be allowed to use a larger broadcast radius relative to the consensus value of the other, "ordinary" nodes. This would be an example of a situation where a certain "weighting" is applied to the states of each node. We shall now see that this can easily be incorporated in our set-up, without changing the convergence proofs. Let $W := \mathrm{diag}\{w_i\}$ with $w_i > 0$ for $i = 1, \dots, n$ be the n × n diagonal matrix with positive entries $w_i$ along its main diagonal, and let $\tilde{r} := W^{-1} r$. Then, run the control strategy for the "auxiliary" states $\tilde{r}$ (which will converge to a common value) but recover and utilise the "weighted" radii using $r = W\tilde{r}$. In the example above, this would mean setting $w_{17} = w_{25} = 2$, and $w_i = 1$ for all other nodes.

Remark 3. The proposed weighting could also be used in a slightly more elaborate way. For instance, a node's weight could be made a function of the remaining battery power, such that it is decreased over time as its battery is drained. This way, nodes with little remaining battery power are allowed to use smaller radii than others so that they can "survive" a little bit longer: as their weight decreases, so will their communication radius, relative to the other nodes in the network. However, one will need to ensure that the radius is not decreased so much that the node disconnects from the network (see the remark at the end of the previous subsection).
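The change of variables behind the weighting can be sketched in a few lines. This is a simplified illustration under stated assumptions: the input term of the control law is omitted so that only the consensus part is run, the averaging matrix is a hypothetical complete-graph matrix on four nodes, and node 0 plays the role of the "special" node with weight 2.

```python
def consensus_step(P, x):
    """One averaging iteration x(k+1) = P x(k)."""
    return [sum(p * v for p, v in zip(row, x)) for row in P]


def weighted_radii(r0, weights, P, steps):
    """Run consensus on r_tilde = W^{-1} r and return r = W r_tilde."""
    r_tilde = [r / w for r, w in zip(r0, weights)]
    for _ in range(steps):
        r_tilde = consensus_step(P, r_tilde)
    return [w * rt for w, rt in zip(weights, r_tilde)]


# Hypothetical 4-node example: node 0 has weight 2, all other nodes weight 1.
P = [[0.25] * 4 for _ in range(4)]   # complete-graph averaging matrix (assumed)
weights = [2.0, 1.0, 1.0, 1.0]
r0 = [0.8, 0.2, 0.5, 0.3]

if __name__ == "__main__":
    print(weighted_radii(r0, weights, P, steps=10))
```

The auxiliary states agree on a single value, so the recovered radius of the weighted node ends up at twice the radius of its peers, which is the intended effect of the weights.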

4 Simulation Results

Let us now present some simulations that demonstrate our results. They all show experiments on networks with 200 nodes and randomly distributed initial radii in the [0.05, 0.95] interval,⁴ in which the second largest eigenvalue in magnitude was regulated to some desired value. Depicted are the evolution over time of the second eigenvalue together with the nodes' radii.

Fig. 1 Evolution of the second eigenvalue λ(k) of the averaging matrix (upper subplots) and the individual nodes' communication radii r_i(k) (lower subplots) over the time step k, in two networks of 200 nodes, for (a) λ* = 0.9 and (b) λ* = 0.2.

Figures 1(a) and 1(b) show situations where the nodes were to achieve λ* = 0.9 and λ* = 0.2, respectively, using a common communication radius. It can be seen that at first the radii converge to a common value and then, on a slower time scale, change such that the second eigenvalue reaches the desired value.

Figure 2 shows two examples where the nodes' states were weighted. In (a), we simulate the setting where two nodes are equipped with different power supplies than the others. This presents an application of weighting the states as mentioned in Subsection 3.3. We picked two nodes which we wanted to use twice and half the radius of the other nodes in the network, respectively. This was achieved by setting the corresponding weights to 2 and 0.5. As can be seen in the plot, the second largest eigenvalue of the network converges quickly to its desired value of λ* = 0.9, and the nodes' radii all converge to a common value except for the two special nodes of different weighting.

An example for the remark at the end of Subsection 3.3 is given in Figure 2(b). Whilst again regulating the second largest eigenvalue in magnitude, we started to successively reduce one node's weight starting at time k = 40. The plots show that the desired level of connectivity is, again, quickly achieved and maintained throughout. However, after k = 40 one node's radius decreases bit by bit whereas the other nodes' radii all increase slightly to counter the effect of the reduction in radius of the special node.

Fig. 2 Evolution of λ(k) and the individual nodes' radii r_i(k) over the time step k in two networks of 200 and 50 nodes, respectively, as the second eigenvalue is regulated to λ* = 0.9. In (a) two nodes were to have twice and half the radius of their peers; in (b) one node's weight was successively reduced starting at k = 40.

⁴ We deliberately chose different initial radii to show that consensus is achieved on these values.

5 Conclusion and Future Directions

In the context of consensus algorithms, we presented two theorems that provide conditions for the convergence of the states to a common value, even when there are inputs to the system. We also suggested a simple modification that allows for different weightings to be applied to the states. We then used these results to control the topology of wireless networks or geometric graphs in general. The proposed decentralised control law, which adjusts the communication radii of the nodes so that the overall network achieves a predefined level of connectivity, poses such a consensus problem with inputs, and possible weighting of the states. This leads us to the more general, open problem of finding consensus conditions for systems of the type

\[ x(k+1) = f\bigl(x(k), k\bigr) + g\bigl(x(k), \tilde{\lambda}(k), k\bigr) \]

where f is some convex function of the system's states, and g is a function of the (local) states and the (local) estimates $\tilde{\lambda}_i(k)$ of the second largest eigenvalue of the averaging matrix of the graph. These systems are encountered when we drop the assumption of separation of time scales of estimation and control scheme, or in certain distributed optimisation problems, when g represents the derivative of some convex cost function.

Acknowledgements. This work was supported by Science Foundation Ireland PI Award 07/IN.1/1901.

References

1. Chung, F.R.K. (ed.): Spectral Graph Theory. CBMS Regional Conference Series in Mathematics, vol. 92. American Mathematical Society, Providence (1997)
2. Estrin, D., Girod, L.D., Pottie, G.J., Srivastava, M.: Instrumenting the world with wireless sensor networks. In: Proc. of the Int. Conf. on Acoustics, Speech, and Signal Processing, Salt Lake City, UT, USA, vol. 4, pp. 2033–2036 (2001)
3. Fiedler, M.: Algebraic connectivity of graphs. Czech. Math. J. 23(98), 298–305 (1973)
4. Jadbabaie, A., Lin, J., Morse, A.S.: Coordination of groups of mobile autonomous agents using nearest neighbor rules. IEEE Trans. Autom. Control 48(6), 988–1001 (2003)
5. Khalil, H.K.: Nonlinear Systems. Macmillan Publishing Co., New York (1992)
6. Knorn, F., Stanojevic, R., Corless, M., Shorten, R.: A framework for decentralised feedback connectivity control with application to sensor networks. Int. J. Control (to appear, 2009)
7. Moreau, L.: Stability of multiagent systems with time-dependent communication links. IEEE Trans. Autom. Control 50(2), 169–182 (2005)
8. Olfati-Saber, R., Murray, R.M.: Consensus problems in networks of agents with switching topology and time-delays. IEEE Trans. Autom. Control 49(9), 1520–1533 (2004)
9. Reynolds, C.W.: Flocks, herds, and schools: A distributed behavioral model. In: Proc. of the 14th Annual Conf. on Computer Graphics and Interactive Techniques, Anaheim, CA, USA, pp. 25–34 (1987)
10. Vicsek, T., Czirók, A., Ben-Jacob, E., Cohen, I., Shochet, O.: Novel type of phase transition in a system of self-driven particles. Phys. Rev. Lett. 75(6), 1226–1229 (1995)
11. Zavlanos, M.M., Pappas, G.J.: Controlling connectivity of dynamic graphs. In: Proc. of the Joint 44th IEEE Conf. on Decision and Control, and the European Control Conf., Seville, Spain, pp. 6388–6393 (2005)

Control of Uncertain (min,+)-Linear Systems

Euriell Le Corronc, Bertrand Cottenceau and Laurent Hardouin

Abstract. This paper deals with the control of uncertain (min,+)-linear systems which belong to an interval. Thanks to the residuation theory, a precompensator controller placed upstream of the studied system is given in such a way that, even if the system's behavior is not perfectly known, it delays the input as much as possible while keeping the input/output behavior unchanged. This precompensator is called neutral.

1 Introduction

Discrete Event Dynamic Systems (DEDS) such as production systems, computing networks and transportation systems, which are characterized by delay and synchronization phenomena, can be described by linear models. Thanks to the particular algebraic structure called idempotent semiring (or dioid), this translation into a linear model is possible through, for instance, the (min,+)-algebra. This approach, detailed in [1] and [4], has numerous analogies with classical automatic control theory and, in particular, the control of these systems can be considered. For instance, some model matching problems are solved by way of different control structures (open-loop or closed-loop structures) as presented in [2], [6] and [8]. These results rely on the residuation theory and assume that the model is perfectly known. This paper puts forward a control synthesis problem when the system is modeled with some parametric uncertainties. More precisely, the following conditions are assumed:
• the system has a (min,+)-linear input/output behavior denoted h,
• because of uncertainties, h is unknown but belongs to an interval $[\underline{h}, \overline{h}]$, the bounds of which are known.
Under these assumptions, a precompensator controller p for the unknown system h is computed in order to achieve two goals:
• the precompensator p is as great as possible, i.e. it is the one which delays the input as much as possible,
• the input/output transfer¹ is unchanged, i.e. $h * p = h$.
In a manufacturing context, such a controller allows the work-in-process to be reduced while keeping the same process output. This makes it possible to preserve the input/output stream while decreasing internal congestion. It is important to note that our approach is different from the one presented in [7]. Indeed, in [7], the system also belongs to an interval ($h \in [\underline{h}, \overline{h}]$) but is subject to fluctuation² within the interval limits and admits a precompensator $p \in [\underline{p}, \overline{p}]$ such that $h * p \in [\underline{h}, \overline{h}]$. In this paper, a precompensator p is computed such that the equality $h * p = h$ is satisfied, provided that h is a stationary (min,+)-linear system.

In order to introduce this work, the paper is organized as follows. The second section recalls some algebraic tools required for the study of DEDS through idempotent semiring and residuation theory. In the third section, models and controls of (min,+)-linear systems are presented. Finally, in the fourth section, the neutral precompensator controller $\hat{p}$ is proposed and an example is given.

Euriell Le Corronc, Bertrand Cottenceau and Laurent Hardouin: Laboratoire d'Ingénierie des Systèmes Automatisés, Université d'Angers, 62, Avenue Notre Dame du Lac, 49000 Angers, France, e-mail: [email protected], [email protected], [email protected]

R. Bru and S. Romero-Vivó (Eds.): Positive Systems, LNCIS 389, pp. 349–357. springerlink.com © Springer-Verlag Berlin Heidelberg 2009

2 Algebraic Preliminaries

2.1 Dioid Theory

An idempotent semiring $\mathcal{D}$ is a set endowed with two inner operations denoted ⊕ and ⊗ (see [1, §4.2]). The sum ⊕ is associative, commutative, idempotent (i.e. $\forall a \in \mathcal{D}$, $a \oplus a = a$) and admits a neutral element denoted ε. The product ⊗ is associative, distributes over the sum and accepts e as neutral element. An idempotent semiring is said to be complete if it is closed for infinite sums and if the product distributes over infinite sums too. Moreover, the greatest element of $\mathcal{D}$ is denoted T (for Top) and represents the sum of all its elements. Due to the sum idempotency, an order relation can be associated with $\mathcal{D}$ by the following equivalences: $\forall a, b \in \mathcal{D}$, $a \succeq b \iff a = a \oplus b$ and $b = a \wedge b$. Because of the lattice properties of a complete idempotent semiring, $a \oplus b$ is the least upper bound of a and b whereas $a \wedge b$ is their greatest lower bound.

¹ Where ∗ is the convolution product.
² h is not necessarily (min,+)-linear.

Example 1 ($\overline{\mathbb{Z}}_{\min}$). The set $\overline{\mathbb{Z}}_{\min} = (\mathbb{Z} \cup \{-\infty, +\infty\})$, endowed with the min operator as sum ⊕ and the classical sum as product ⊗, is a complete idempotent semiring where ε = +∞, e = 0 and T = −∞. On $\overline{\mathbb{Z}}_{\min}$, the greatest lower bound ∧ takes the sense of the max operator.

2.2 Residuation Theory

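Residuation is a general notion in lattice theory which allows one to define a "pseudo-inverse" of some isotone maps (see [1]). In particular, the residuation theory provides optimal solutions to inequalities such as $f(x) \preceq b$ (respectively $f(x) \succeq b$), where f is an order-preserving mapping defined over ordered sets. A mapping f defined over ordered sets is isotone, respectively antitone, if $a \preceq b \Rightarrow f(a) \preceq f(b)$, respectively $f(a) \succeq f(b)$. Now, let $f : \mathcal{E} \to \mathcal{F}$ be an isotone mapping, where $(\mathcal{E}, \preceq)$ and $(\mathcal{F}, \preceq)$ are ordered sets. Mapping f is said to be residuated if, $\forall b \in \mathcal{F}$, the greatest element, denoted $f^\sharp(b)$, of the subset $\{x \in \mathcal{E} \mid f(x) \preceq b\}$ exists and belongs to this subset. Mapping $f^\sharp$ is called the residual of f. When f is residuated, $f^\sharp$ is the unique isotone mapping such that $f \circ f^\sharp \preceq \mathrm{Id}_{\mathcal{F}}$ and $f^\sharp \circ f \succeq \mathrm{Id}_{\mathcal{E}}$, where $\mathrm{Id}_{\mathcal{F}}$ (respectively $\mathrm{Id}_{\mathcal{E}}$) is the identity mapping on $\mathcal{F}$ (respectively on $\mathcal{E}$).

Example 2 (Left product). Mapping $L_a : x \mapsto a \otimes x$ defined over a complete idempotent semiring $\mathcal{D}$ is residuated. Its residual represents the optimal solution to the inequality $a \otimes x \preceq b$ and is usually denoted $L_a^\sharp : x \mapsto a \mathbin{\backslash\!\!\!\circ} x$ (left quotient).

Remark 1 (Isotony and antitony). $\forall x, y, a \in \mathcal{D}$, an ordered set, these properties are given:

\[ x \preceq y \;\Rightarrow\; \begin{cases} a \mathbin{\backslash\!\!\!\circ} x \preceq a \mathbin{\backslash\!\!\!\circ} y & (x \mapsto a \mathbin{\backslash\!\!\!\circ} x \text{ is isotone}), \\ x \mathbin{\backslash\!\!\!\circ} a \succeq y \mathbin{\backslash\!\!\!\circ} a & (x \mapsto x \mathbin{\backslash\!\!\!\circ} a \text{ is antitone}). \end{cases} \tag{1} \]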
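The definitions of Examples 1 and 2 can be made concrete for finite integers. The sketch below is an illustration only; the function names are hypothetical, and the closed form used for the left quotient (the ordinary difference b − a) is stated here for finite a and b, without treating the infinite elements. Note that the dioid order is the reverse of the natural order on numbers, since a ⊕ b = min(a, b).

```python
import math
from functools import reduce

EPS = math.inf   # neutral element of the sum (epsilon = +infinity)
E = 0            # neutral element of the product (e = 0)


def oplus(a, b):
    """Dioid sum: the minimum."""
    return min(a, b)


def otimes(a, b):
    """Dioid product: the classical sum."""
    return a + b


def preceq(a, b):
    """Dioid order a <= b, defined by a (+) b = b (numerically a >= b)."""
    return oplus(a, b) == b


def left_quotient(a, b):
    """Residual of x -> a (x) x for finite a and b: numerically b - a."""
    return b - a


def greatest_solution(a, b, candidates):
    """Greatest x (in the dioid order) among candidates with a (x) x <= b.

    The least upper bound in the dioid is the sum (+), so the greatest
    feasible candidate is the (+)-sum of all feasible candidates.
    """
    feasible = [x for x in candidates if preceq(otimes(a, x), b)]
    return reduce(oplus, feasible)


if __name__ == "__main__":
    print(greatest_solution(3, 7, range(-10, 11)), left_quotient(3, 7))
```

The brute-force search and the closed form agree, which is the optimality property that residuation guarantees for the left product.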

3 Models and Control of (min,+)-Linear Systems

3.1 Counter Functions

Some idempotent semiring algebras make it possible to model DEDS which involve synchronization and delay phenomena. The behavior of such systems can be represented by discrete functions called "counter" functions. More precisely, a discrete variable x(t) is associated to an event labeled x and represents the number of occurrences of x at time t (the numbering conventionally beginning at 0). For negative values of t, these variables are defined as constant so they can be manipulated as mappings from $\mathbb{Z}$ to $\overline{\mathbb{Z}}_{\min}$. Thanks to these counter functions, the studied DEDS can be modeled on the idempotent semiring $\overline{\mathbb{Z}}_{\min}$ by the following linear state representation:

\[ \begin{cases} x(t) = A\,x(t-1) \oplus B\,u(t), \\ y(t) = C\,x(t), \end{cases} \tag{2} \]

where $A \in \overline{\mathbb{Z}}_{\min}^{\,n \times n}$, $B \in \overline{\mathbb{Z}}_{\min}^{\,n \times p}$ and $C \in \overline{\mathbb{Z}}_{\min}^{\,q \times n}$, while n, p and q refer respectively to the sizes of the state vector x, the input vector u and the output vector y. In the SISO³ case (p = 1 and q = 1), the state equation leads to the following input/output relation:

\[ y(t) = \bigoplus_{\tau \ge 0} C A^{\tau} B\, u(t - \tau). \tag{3} \]

Moreover, setting $h(\tau) = C A^{\tau} B$, and defining the inf-convolution (or (min,+)-convolution) of two mappings f and g from $\mathbb{Z}$ to $\overline{\mathbb{Z}}_{\min}$ as follows (see [9] and [5]):

\[ (f * g)(t) \triangleq \bigoplus_{\tau \ge 0} \bigl[f(\tau) \otimes g(t - \tau)\bigr] = \min_{\tau \ge 0} \bigl[f(\tau) + g(t - \tau)\bigr], \]

relation (3) can be rewritten as $y(t) = (h * u)(t)$, which is actually the transfer relation of the considered system, with h(t) the transfer function⁴. According to [1, Theorem 5.39] and [3], a (min,+)-linear system defined as (2) is necessarily such that h(t) is periodic and causal, i.e.:

\[ \exists\, T_0, N, T \in \mathbb{N} \;\mid\; \forall t \ge T_0,\ h(t + T) = N \otimes h(t) \quad \text{[periodicity]}, \tag{4} \]

\[ \begin{cases} h(t) = h(0) & \text{for } t < 0, \\ h(t) \ge 0 & \text{for } t \ge 0 \end{cases} \quad \text{[causality]}. \tag{5} \]

Let us note that the set of nondecreasing⁵ mappings from $\mathbb{Z}$ to $\overline{\mathbb{Z}}_{\min}$, endowed with the two inner operations ⊕ as pointwise addition (that is, pointwise minimum) and ∗ as inf-convolution, is also an idempotent semiring, whose neutral elements ε and e are defined by:

\[ \forall t, \quad \varepsilon : \varepsilon(t) \mapsto +\infty \quad \text{and} \quad e : e(t) \mapsto \begin{cases} 0 & \text{for } t < 0, \\ +\infty & \text{for } t \ge 0. \end{cases} \tag{6} \]

In the MIMO⁶ case, the input/output relation becomes $Y(t) = (H * U)(t)$, where U is a vector of p such mappings, Y a vector of q such mappings, and H a q × p matrix of such mappings in which every $H_{ij}$ is periodic. The inf-convolution is then naturally extended to matrices as $Y_j(t) = \bigl(\bigoplus_{i=0}^{p} (H_{ij} * U_i)\bigr)(t)$.

³ Single Input Single Output.
⁴ Let us note that h(t) corresponds to the impulse response of the system, i.e. the output due to the particular input: u(t) = 0 if t < 0 and u(t) = +∞ if t ≥ 0.
⁵ Nondecreasing in the natural order, i.e. $t_1 > t_2 \Rightarrow h(t_1) \ge h(t_2)$.
⁶ Multiple Inputs Multiple Outputs.
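The inf-convolution defined above is easy to evaluate on finite data. The following sketch makes a simplifying assumption that departs from the conventions of this section: sequences are one-sided, indexed by t = 0, 1, …, T − 1, and any value at a negative time is treated as +∞, so that the minimum runs over 0 ≤ τ ≤ t only. Under that assumption the neutral element is the sequence (0, +∞, +∞, …) rather than the function e of (6).

```python
import math


def inf_convolution(f, g):
    """(f * g)(t) = min over 0 <= tau <= t of f(tau) + g(t - tau).

    f and g are lists holding the values at t = 0, 1, ...; the result is
    computed on the common horizon of the two sequences.
    """
    horizon = min(len(f), len(g))
    return [min(f[tau] + g[t - tau] for tau in range(t + 1))
            for t in range(horizon)]


if __name__ == "__main__":
    f = [0, 1, 3]
    g = [0, 2, 2]
    unit = [0, math.inf, math.inf]   # one-sided neutral element (assumed)
    print(inf_convolution(f, g))
    print(inf_convolution(f, unit))
```

Writing the operation as a plain double loop keeps the correspondence with the formula visible; it costs on the order of T² additions for a horizon of T samples.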

3.2 Precompensator Control

A specific (min,+)-linear controller, called precompensator $p$, can be placed upstream of process $h$ so that $u(t) = (p * v)(t)$, where $v$ is the controller input. In this semiring, the output of the controlled system becomes $y(t) = (h * p * v)(t)$. With this configuration, the controller $p$ aims at slowing down the system input. Moreover, the residuation theory shows (see [8]) that there exists an optimal neutral precompensator given by:

\[ \hat{p}(t) = \sup\{ p(t) \mid (h * p)(t) = h(t) \} = (h \backslash\!\!\!\circ h)(t), \]

where the mapping $x \mapsto a \backslash\!\!\!\circ x$ is in that case the residual of the inf-convolution product. This optimal controller is said to be neutral since it leaves the input/output behavior unchanged. Nevertheless, it delays the process input $u$ as much as possible in order to avoid useless accumulations in $h$. The computation of $\hat{p}$ thus requires the residual of the inf-convolution product (see [9]):

\[ \hat{p}(t) = (h \backslash\!\!\!\circ h)(t) = \bigwedge_{\tau \in \mathbb{Z}} [h(\tau - t) \backslash\!\!\!\circ h(\tau)] = \max_{\tau \in \mathbb{Z}} [h(\tau) - h(\tau - t)]. \tag{7} \]

Remark 2. (Periodicity). If function $h(t)$ is periodic, $(h \backslash\!\!\!\circ h)(t)$ is periodic too.

Remark 3. (Argument of the maximum). Let us note that if $h$ is periodic, there exists at least one $\tau_0$ (not necessarily unique) such that $\max_{\tau \in \mathbb{Z}} [h(\tau) - h(\tau - t)] = h(\tau_0) - h(\tau_0 - t)$, defined by:

\[ \tau_0 \in \arg\max_{\tau \in \mathbb{Z}} [h(\tau) - h(\tau - t)]. \tag{8} \]
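As an illustration only, the inf-convolution and the residual (7) can be evaluated numerically for integer-valued counter functions. The sketch below is not taken from the paper: the transfer function `h` (period T = 3, gain N = 2) is a hypothetical example, and the maximum in (7) is restricted to a finite window of values of τ, which is an assumption that is harmless here because `h` is periodic.

```python
def inf_convolution(f, g, t):
    """(f * g)(t) = min over tau >= 0 of f(tau) + g(t - tau).

    For a nondecreasing f and a causal g (g(s) = g(0) for s < 0), no term
    with tau > max(t, 0) can be smaller than the term at tau = max(t, 0),
    so a finite scan is enough.
    """
    return min(f(tau) + g(t - tau) for tau in range(0, max(t, 0) + 1))


def residual(h, t, window):
    """Equation (7) with tau restricted to `window`: max of h(tau) - h(tau - t)."""
    return max(h(tau) - h(tau - t) for tau in window)


def h(t):
    """Hypothetical transfer function: causal, nondecreasing, periodic (T = 3, N = 2)."""
    return 0 if t < 0 else 2 * (t // 3)


WINDOW = range(-40, 40)  # assumed finite search window for tau


def p_hat(t):
    """Neutral precompensator of h, evaluated on the finite window."""
    return residual(h, t, WINDOW)


# Neutrality check on a finite horizon: (h * p_hat)(t) == h(t).
neutral = all(inf_convolution(h, p_hat, t) == h(t) for t in range(20))
```

With these assumptions, `p_hat` takes the values 0, 2, 2, 2, 4 for t = 0, ..., 4, and `neutral` evaluates to `True`, in line with the statement that the precompensator leaves the input/output behavior unchanged.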

4 Neutral Precompensator for Unknown Systems

Usually, this optimal neutral precompensator is given for a (min,+)-linear system whose transfer function $h$ is perfectly known. This section deals with the problem of finding such a precompensator when $h$ presents some parametric uncertainties and belongs to an interval $[\underline{h}, \overline{h}]$. In such a case, we will see that $\hat{p} = e \oplus \overline{h} \backslash\!\!\!\circ \underline{h}$ is the greatest precompensator which is neutral for all systems, i.e. $\forall h \in [\underline{h}, \overline{h}],\ h * \hat{p} = h$.

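4.1 SISO Case

Proposition 1. Let $[\underline{h}, \overline{h}]$ be an interval with bounds $\underline{h}$ and $\overline{h}$:

\[ \forall h_i \in [\underline{h}, \overline{h}],\quad h_i \backslash\!\!\!\circ h_i \succeq \overline{h} \backslash\!\!\!\circ \underline{h}. \]

Proof. According to the left quotient isotony and antitony properties (1), $h_i \backslash\!\!\!\circ h_i \succeq \overline{h} \backslash\!\!\!\circ h_i \succeq \overline{h} \backslash\!\!\!\circ \underline{h}$.

Proposition 2. Let $[\underline{h}, \overline{h}]$ be an interval whose bounds $\underline{h}$ and $\overline{h}$ are two periodic and causal functions (see (4) and (5)):

\[ \forall t_i > 0,\ \exists h_i \in [\underline{h}, \overline{h}] \text{ such that } (h_i \backslash\!\!\!\circ h_i)(t_i) = (\overline{h} \backslash\!\!\!\circ \underline{h})(t_i). \]

Proof. Thanks to the residual of the inf-convolution product definition (see (7) and (8)):

\[ (\overline{h} \backslash\!\!\!\circ \underline{h})(t_i) = \bigwedge_{\tau \in \mathbb{Z}} \overline{h}(\tau - t_i) \backslash\!\!\!\circ \underline{h}(\tau) = \overline{h}(\tau_i - t_i) \backslash\!\!\!\circ \underline{h}(\tau_i) \quad \text{with } \tau_i \in \arg\max_{\tau \in \mathbb{Z}} [\underline{h}(\tau) - \overline{h}(\tau - t_i)]. \]

This $\tau_i$ leads to the following definition⁷ of $h_i$:

\[ h_i(t) \triangleq \begin{cases} \overline{h}(t) & \text{for } t < \tau_i,\\ \underline{h}(t) & \text{for } t \ge \tau_i. \end{cases} \tag{9} \]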

Fig. 1 illustrates an example of the $h_i$ function for which $t_i = 1$ and $\tau_i = 6$. In that case $(h_i \backslash\!\!\!\circ h_i)(1) = (\overline{h} \backslash\!\!\!\circ \underline{h})(1) = \underline{h}(6) - \overline{h}(5) = 4$. As specified in remark 3, $\tau_i$ is not unique for $t_i = 1$ and belongs to the set {3, 4, 6, 8, 10, ...}.

Fig. 1 Example of a $h_i$ function where $t_i = 1$ and $\tau_i = 6$. The arrow represents the distance (= 4) between $\underline{h}$ and $\overline{h}$ for these values.

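Then, equation (7) shows that $(h_i \backslash\!\!\!\circ h_i)(t_i) = \bigwedge_{\tau \in \mathbb{Z}} h_i(\tau - t_i) \backslash\!\!\!\circ h_i(\tau)$. This latter expression can be factorized, since $\forall \tau,\ \underline{h}(\tau) \preceq \overline{h}(\tau)$, and according to (1) we obtain:

\[ \bigwedge_{\tau < \tau_i} \overline{h}(\tau - t_i) \backslash\!\!\!\circ \overline{h}(\tau) \succeq \bigwedge_{\tau < \tau_i} \overline{h}(\tau - t_i) \backslash\!\!\!\circ \underline{h}(\tau) \succeq \overline{h}(\tau_i - t_i) \backslash\!\!\!\circ \underline{h}(\tau_i), \]

\[ \bigwedge_{\tau_i + t_i \le \tau} \underline{h}(\tau - t_i) \backslash\!\!\!\circ \underline{h}(\tau) \succeq \bigwedge_{\tau_i + t_i \le \tau} \overline{h}(\tau - t_i) \backslash\!\!\!\circ \underline{h}(\tau) \succeq \overline{h}(\tau_i - t_i) \backslash\!\!\!\circ \underline{h}(\tau_i). \]

Moreover:

\[ \big[ \overline{h}(\tau_i - t_i) \backslash\!\!\!\circ \underline{h}(\tau_i) \big] \wedge \Big[ \bigwedge_{\tau_i < \tau < \tau_i + t_i} \overline{h}(\tau - t_i) \backslash\!\!\!\circ \underline{h}(\tau) \Big] = \overline{h}(\tau_i - t_i) \backslash\!\!\!\circ \underline{h}(\tau_i). \]

Finally, by defining $h_i$ as in (9), $(h_i \backslash\!\!\!\circ h_i)(t_i) = \overline{h}(\tau_i - t_i) \backslash\!\!\!\circ \underline{h}(\tau_i) = (\overline{h} \backslash\!\!\!\circ \underline{h})(t_i)$.

⁷ It is important to recall that the order of $\overline{\mathbb{Z}}_{\min}$ is the opposite to the natural order of functions ($\preceq$ corresponds to $\ge$). Moreover, as illustrated in Fig. 1, $h_i(t)$ is still a nondecreasing function.

Remark 4. Let us note that for $t_i = 0$, $(h \backslash\!\!\!\circ h)(0) = 0$ whereas $(\overline{h} \backslash\!\!\!\circ \underline{h})(0) \preceq 0$ (i.e. $\ge 0$ in the natural order). For instance, on Fig. 1, $(\overline{h} \backslash\!\!\!\circ \underline{h})(0) = \underline{h}(2) - \overline{h}(2) = 4$.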

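These preliminary results lead to the following proposition.

Proposition 3. Let $[\underline{h}, \overline{h}]$ be an interval whose bounds $\underline{h}$ and $\overline{h}$ are two periodic and causal functions (see (4) and (5)):

\[ e \oplus \overline{h} \backslash\!\!\!\circ \underline{h} = \bigwedge_{h \in [\underline{h}, \overline{h}]} h \backslash\!\!\!\circ h. \]

Proof. According to proposition 2, $\forall t_i > 0,\ \exists h_i \in [\underline{h}, \overline{h}]$ such that $(h_i \backslash\!\!\!\circ h_i)(t_i) = (\overline{h} \backslash\!\!\!\circ \underline{h})(t_i)$. So, thanks to proposition 1, $\forall t > 0$, a subset of systems $\mathcal{H} \subset [\underline{h}, \overline{h}]$ exists such that $\bigwedge_{h_i \in \mathcal{H}} (h_i \backslash\!\!\!\circ h_i)(t) = (\overline{h} \backslash\!\!\!\circ \underline{h})(t)$. Moreover, for $t = 0$ and according to remark 4, $\forall t \ge 0$, $\bigwedge_{h_i \in \mathcal{H}} (h_i \backslash\!\!\!\circ h_i)(t) = e \oplus (\overline{h} \backslash\!\!\!\circ \underline{h})(t)$. To conclude, $\forall h \in [\underline{h}, \overline{h}]$, $h \backslash\!\!\!\circ h \succeq \bigwedge_{h_i \in \mathcal{H}} (h_i \backslash\!\!\!\circ h_i)(t)$, and finally $e \oplus \overline{h} \backslash\!\!\!\circ \underline{h} = \bigwedge_{h \in [\underline{h}, \overline{h}]} h \backslash\!\!\!\circ h$.

Proposition 3 must be interpreted as follows: the precompensator $\hat{p} = e \oplus \overline{h} \backslash\!\!\!\circ \underline{h}$ is the greatest precompensator which is neutral for all systems $h \in [\underline{h}, \overline{h}]$, i.e. $h * \hat{p} = h$.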
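The interpretation above can be checked numerically on a small example. The following sketch is only an illustration under explicit assumptions: the two bounds `h_under` and `h_over` are hypothetical, the maximum defining the residual is taken over a finite window of τ, and the value of the precompensator for t ≤ 0 is set to 0, which is how Remark 4 and the term e ⊕ are read here. The members of the interval that are tested are the two bounds and the switched functions built as in (9).

```python
def h_under(t):
    """Hypothetical bound carrying the larger counter values (natural order)."""
    return 0 if t < 0 else 3 * ((t + 1) // 2)


def h_over(t):
    """Hypothetical bound carrying the smaller counter values (natural order)."""
    return 0 if t < 0 else 3 * (t // 2)


WINDOW = range(-40, 40)  # assumed finite search window for tau


def p_hat(t):
    """Robust precompensator: 0 for t <= 0 (assumed reading of e (+) ...),
    otherwise max over tau of h_under(tau) - h_over(tau - t)."""
    if t <= 0:
        return 0
    return max(h_under(tau) - h_over(tau - t) for tau in WINDOW)


def switched(tau_i):
    """Member of the interval built as in (9): h_over before tau_i, h_under after."""
    return lambda t: h_over(t) if t < tau_i else h_under(t)


def inf_convolution(f, g, t):
    """(f * g)(t) = min over 0 <= tau <= max(t, 0) of f(tau) + g(t - tau)."""
    return min(f(tau) + g(t - tau) for tau in range(0, max(t, 0) + 1))


def is_neutral(h_i, horizon=20):
    """Check h_i * p_hat == h_i on a finite horizon."""
    return all(inf_convolution(h_i, p_hat, t) == h_i(t) for t in range(horizon))


candidates = [h_under, h_over] + [switched(k) for k in range(10)]
all_neutral = all(is_neutral(h_i) for h_i in candidates)
```

On this example `all_neutral` is `True`: the single precompensator `p_hat` leaves every tested member of the interval unchanged, which is the neutrality property stated by Proposition 3 (the maximality part is not checked by this sketch).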

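4.2 MIMO Extension

Proposition 3, given for all uncertain SISO systems belonging to an interval, can be extended to MIMO systems. For an uncertain (min,+)-linear system given by a $q \times p$ transfer matrix $H$ in an interval $[\underline{H}, \overline{H}]$, the greatest neutral precompensator is now defined by $\hat{P} = I \oplus \overline{H} \backslash\!\!\!\circ \underline{H}$, where $I$ is the $p \times p$ identity matrix.

Proposition 4. Let $[\underline{h}_a, \overline{h}_a]$ and $[\underline{h}_b, \overline{h}_b]$ be two intervals:

\[ \bigwedge_{\substack{h_a \in [\underline{h}_a, \overline{h}_a] \\ h_b \in [\underline{h}_b, \overline{h}_b]}} h_b \backslash\!\!\!\circ h_a = \overline{h}_b \backslash\!\!\!\circ \underline{h}_a. \]

Proof. Thanks to (1).

Proposition 5. Let $[\underline{H}, \overline{H}]$ be a matrix interval with $q \times p$ bounds $\underline{H}$ and $\overline{H}$, which represents the behavior of an uncertain p-input q-output system:

\[ I \oplus \overline{H} \backslash\!\!\!\circ \underline{H} = \bigwedge_{H \in [\underline{H}, \overline{H}]} H \backslash\!\!\!\circ H. \]

Proof. Thanks to [1, Equation (4.82)], $\forall H \in [\underline{H}, \overline{H}]$, $(H \backslash\!\!\!\circ H)_{ij} = \bigwedge_{k=1}^{n} H_{ki} \backslash\!\!\!\circ H_{kj}$.

On the one hand, if $i = j$, $(H \backslash\!\!\!\circ H)_{ii} = \bigwedge_{k=1}^{n} H_{ki} \backslash\!\!\!\circ H_{ki}$ with $H_{ki} \in [\underline{H}_{ki}, \overline{H}_{ki}]$. Thanks to proposition 3,

\[ \Big( \bigwedge_{H \in [\underline{H}, \overline{H}]} H \backslash\!\!\!\circ H \Big)_{ii} = \bigwedge_{k=1}^{n} \big( e \oplus \overline{H}_{ki} \backslash\!\!\!\circ \underline{H}_{ki} \big). \]

On the other hand, for $i \ne j$, $(H \backslash\!\!\!\circ H)_{ij} = \bigwedge_{k=1}^{n} H_{ki} \backslash\!\!\!\circ H_{kj}$ with $H_{ki} \in [\underline{H}_{ki}, \overline{H}_{ki}]$, $H_{kj} \in [\underline{H}_{kj}, \overline{H}_{kj}]$, and thanks to proposition 4,

\[ \Big( \bigwedge_{H \in [\underline{H}, \overline{H}]} H \backslash\!\!\!\circ H \Big)_{ij} = \bigwedge_{k=1}^{n} \big( \overline{H}_{ki} \backslash\!\!\!\circ \underline{H}_{kj} \big). \]

Finally, $\forall i, j$, $\big( \bigwedge_{H \in [\underline{H}, \overline{H}]} H \backslash\!\!\!\circ H \big)_{ij} = (I \oplus \overline{H} \backslash\!\!\!\circ \underline{H})_{ij}$.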

4.3 Example of Neutral Precompensator for MIMO Systems

A MIMO system with a $1 \times 2$ transfer matrix $H$ (two inputs, one output), the transfer function of which belongs to an interval $([\underline{H}_{11}, \overline{H}_{11}]\ [\underline{H}_{12}, \overline{H}_{12}])$, is considered. The bounds are the periodic and causal functions given in table 1.

Table 1 Bounds of H

  t                     0  1  2  3  4   periodic part
  \underline{H}_{11}    0  3            t ≥ 2, \underline{H}_{11}(t) = 3 ⊗ \underline{H}_{11}(t − 2)
  \overline{H}_{11}     0  0  0  1  2   t ≥ 5, \overline{H}_{11}(t) = 3 ⊗ \overline{H}_{11}(t − 2)
  \underline{H}_{12}    0  2            t ≥ 2, \underline{H}_{12}(t) = 2 ⊗ \underline{H}_{12}(t − 2)
  \overline{H}_{12}     0  0            t ≥ 2, \overline{H}_{12}(t) = 1 ⊗ \overline{H}_{12}(t − 1)

As previously said, (min,+)-linear systems are always characterized by periodic functions ([1, Theorem 5.39]) and, according to remark 2, residuals of the inf-convolution are periodic functions too. Thus, for this system and thanks to proposition 5, the computation of the $2 \times 2$ neutral precompensator $\hat{P}$ given by $\hat{P} = I \oplus \overline{H} \backslash\!\!\!\circ \underline{H}$ is described in table 2. Let us note that for this example, $\hat{P}_{21} = \varepsilon$ with $\varepsilon$ defined by (6).

Table 2 Neutral precompensator $\hat{P}$

  t                0  1  2   periodic part
  \hat{P}_{11}     0  5  7   t ≥ 3, \hat{P}_{11}(t) = 3 ⊗ \hat{P}_{11}(t − 2)
  \hat{P}_{12}     2         t ≥ 1, \hat{P}_{12}(t) = 1 ⊗ \hat{P}_{12}(t − 1)
  \hat{P}_{22}     0  3      t ≥ 2, \hat{P}_{22}(t) = 1 ⊗ \hat{P}_{22}(t − 1)

5 Conclusion

This paper has introduced the control of unknown (min,+)-linear systems belonging to an interval the bounds of which are known. A neutral precompensator placed upstream of these systems has been given: it does not change the input/output behavior while delaying the process input as much as possible. This precompensator applies both to SISO and MIMO systems, and an example has been given in order to illustrate these propositions.

References

1. Baccelli, F., Cohen, G., Olsder, G.J., Quadrat, J.P.: Synchronisation and linearity: an algebra for discrete event systems. Wiley and sons, Chichester (1992)
2. Cottenceau, B., Hardouin, L., Boimond, J.L., Ferrier, J.L.: Model reference control for timed event graphs in dioids. Automatica 37(9), 1451–1458 (2001)
3. Gaubert, S.: Théorie des systèmes linéaires dans les dioïdes. PhD Thesis. Ecole Nationale Supérieure des Mines de Paris (1992)
4. Heidergott, B., Olsder, G.J., Woude, J.: Max plus at work, modeling and analysis of synchronized systems. In: A course on max-plus algebra and its applications. Princeton University Press, Princeton (2006)
5. Le Boudec, J.Y., Thiran, P.: Network calculus: a theory of deterministic queuing systems for the internet. Springer, Heidelberg (2001)
6. Lhommeau, M., Hardouin, L., Cottenceau, B.: Optimal control for (max,+)-linear systems in the presence of disturbances. In: International Symposium on Positive Systems: Theory and Applications, Roma. POSTA 2003 (2003)
7. Lhommeau, M., Hardouin, L., Ferrier, J.L., Ouerghi, I.: Interval analysis in dioid: application to robust open-loop control for timed event graphs. In: 44th IEEE Conference on Decision and Control and European Control Conference, Seville. CDC-ECC 2005, pp. 7744–7749 (2005)
8. Maia, C.A., Hardouin, L., Santos-Mendes, R., Cottenceau, B.: Optimal closed-loop control of timed event graphs in dioids. IEEE Transactions on Automatic Control 48(12), 2284–2287 (2003)
9. Max Plus: Second order theory of min-linear systems and its application to discrete event systems. In: Proceedings of the 30th IEEE Conference on Decision and Control, Brighton. CDC 1991 (1991)

On a Class of Stochastic Models of Cell Biology: Periodicity and Controllability

Ivo Marek

Abstract. This contribution is a natural continuation of a series of papers devoted to analysis of models utilized by specialists in Cell Biology around E. Bohl and W. Boos. Our novelty may be seen in enriching the models in direction of controllability in the spirit of biology engineering. Besides the standard properties of the models such as existence of appropriate solutions and their uniqueness the following issues are of interest: Asymptotic behavior (e.g. steady states and pseudo-steady states), controllability and also special features such as various types of symmetries, periodicity etc. Our aim is focused on periodicity of solutions of models whose state objects share the properties of concentrations, i.e. probabilities.

1 Definitions and Notation

Let $\mathcal{X}$ be a generally infinite dimensional partially ordered Banach space, generated by a closed normal cone $\mathcal{K}$; e.g. the space $\mathcal{X} = L^p(\Omega)$ consists of classes of functions defined on $(0, +\infty)$ whose representatives possess a convergent integral

\[ \int_{\Omega} |x(t)|^p \, dt < +\infty, \quad x \in \{x\} \subset L^p(\Omega), \quad 1 \le p < +\infty. \]

In applications the role of the cone $\mathcal{K}$ is usually played by some version of the collection of classes of functions whose representatives assume nonnegative real values almost everywhere, i.e.

\[ \mathcal{K} = L^p_+(\Omega) = \{ \{x\} \subset \mathcal{X} : x(t) \ge 0 \text{ a.e. within } [0, +\infty) \}. \]

Ivo Marek Czech University of Technology, School of Civil Engineering, Thakurova 7, 166 29 Praha 6, Czech Republic, e-mail: [email protected]

R. Bru and S. Romero-Vivó (Eds.): Positive Systems, LNCIS 389, pp. 359–367. springerlink.com © Springer-Verlag Berlin Heidelberg 2009

Assume $\mathcal{X}$ is a Banach space. Then symbol $\mathcal{X}'$ denotes the dual of $\mathcal{X}$, i.e. the space of all bounded linear functionals mapping $\mathcal{X}$ into the reals $\mathbb{R}$. Space $\mathcal{X}'$ is equipped with the norm

\[ \|x'\|_{\mathcal{X}'} = \sup \{ |x'(x)| : \|x\|_{\mathcal{X}} \le 1 \}. \]

If the norm $\|.\|$ is defined via an inner product $[.,.]$, the space $\mathcal{X}$ is then a Hilbert space and the norm of $x \in \mathcal{X}$ reads $\|x\|_{\mathcal{X}} = [x, x]^{1/2}$ in this case.

Symbol $\rho(H)$ is exploited for denoting the spectral radius of $H$, i.e. $\rho(H) = \max\{ |\lambda| : \lambda \in \sigma(H) \}$, where $\sigma(H)$ is the spectrum of linear operator $H$.
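For a finite-dimensional illustration of the spectral radius, the sketch below estimates ρ(H) of a small entrywise positive matrix by power iteration and then forms H − ρ(H)I, the kind of shifted operator this paper uses in Problem 1. The matrix `H`, the iteration count and the use of power iteration are assumptions made for this example only; power iteration is guaranteed to converge here because `H` has strictly positive entries.

```python
def spectral_radius(H, iterations=200):
    """Power iteration; assumes H is entrywise positive so the estimate converges to rho(H)."""
    n = len(H)
    x = [1.0] * n
    rho = 0.0
    for _ in range(iterations):
        y = [sum(H[i][j] * x[j] for j in range(n)) for i in range(n)]
        rho = max(y)               # growth factor, since max(x) == 1
        x = [yi / rho for yi in y]  # renormalize so that max(x) == 1
    return rho


# Hypothetical positive matrix with eigenvalues 4 and -1.
H = [[1.0, 2.0], [3.0, 2.0]]
rho = spectral_radius(H)

# Shifted matrix G = H - rho(H) I.
G = [[H[i][j] - (rho if i == j else 0.0) for j in range(2)] for i in range(2)]
column_sums = [G[0][j] + G[1][j] for j in range(2)]
```

For this particular `H` both column sums equal ρ(H) = 4, so the columns of `G` sum to zero up to rounding; this is the finite-dimensional counterpart of a vector of ones being annihilated by the transposed operator.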

2 Periodicity and Controllability

Below we formulate a general problem that can be frequently met when modeling concrete situations of various research experiments; here we focus on research in Biology. We are going to show that the methods and results obtained in [4], [5], [3] are suitable tools to solve the problems just mentioned. In particular, the properties concerned with asymptotic behavior of the required solutions will appear as decisive for establishing a theory adequate for the experiments studied.

Problem 1. Let $u \in \mathcal{U}$, where $\mathcal{U}$ is the Banach space of classes of scalar functions on $(0, +\infty)$ equipped with the $L^1$-norm

\[ \|u\|_{\mathcal{U}} = \int_0^{+\infty} |u(t)| \, dt. \]

Let $\mathcal{L}\{\mathcal{X}\}$ denote the space of bounded linear operators mapping $\mathcal{X}$ into $\mathcal{X}$ and $G = G(u) : \mathcal{U} \to \mathcal{L}\{\mathcal{X}\}$ be an operator function each element of which is a bounded linear operator acting on $\mathcal{X}$ and $G(x) = H(x) - \rho(H(x))I$, $H(x)\mathcal{K} \subset \mathcal{K}$. Further, let $B$ be the densely defined infinitesimal generator of a semigroup of linear operators of class $C_0$ such that each of the operators of the semigroup $T(t; B)$ leaves invariant the cone $\mathcal{K}$. Furthermore, there is a vector $\hat{x}' \in \mathcal{K}'$ such that relations $\hat{x}'(x) > 0$ hold for all $0 \ne x \in \mathcal{K}$ and $B'\hat{x}' = 0 = [G(x)]'\hat{x}'$, where $B'$ denotes the dual of $B$ and similarly $G(u)'$ is the dual with respect to $G(u)$. Find a solution $x = x(t, u) \in \mathcal{X}$ such that

\[ \frac{dx(t)}{dt} = Bx(t) + G(x(t))\,x(t)\,u(t), \quad x(0) = x_0, \tag{1} \]

where $u \in \mathcal{U}$ is assumed to satisfy

\[ u_{\min} \le u(t) \le u_{\max}, \quad u_{\min}, u_{\max} \in \mathbb{R}_+. \]

Remark 1. There is a variety of possible applications behind the fact that Problem 1 as formulated above is free of a nonhomogeneous term. However, its possible presence produces no difficulties in the investigation.

Problem 1 formulated above forms a basis for applications. Some results have been published already, see [3], [4], showing existence, uniqueness and asymptotic properties of Bohl's models on an abstract level. A realization of a version of the model based on a system of ODE's devoted to multi-level time aspects (Michaelis-Menten kinetics) is presented in [5]. A natural prolongation of the results contained in the mentioned papers is presented in this contribution. It concerns existence of periodic solutions to Problem 1 and some properties of controllability with piecewise constant input. As far as applications in Biology are concerned, the earlier papers are related to Cell Biology and the present one to some issues related to microalgal growth in the spirit of the Thesis of Š. Papáček [13] and [14], see also [7].

2.1 Periodic Solutions of Models based on ODE’s

With no loss of generality, let $t_0 = 0$. Further we let

\[ u(t) = \begin{cases} u_a & \text{for } 0 \le t < t_1,\\ u_b & \text{for } t_1 \le t < t_2, \end{cases} \tag{2} \]

where $u_a$ and $u_b$ are positive real numbers, and

\[ u(s) = u(t), \quad \text{for any real positive } s \text{ such that } s - t = k[t_1 + t_2], \tag{3} \]

where $k$ ranges within the set of positive integers $\mathbb{N} = \{1, 2, \ldots\}$. A piecewise constant function $u$ defined in (2) and periodically prolonged onto $[0, +\infty)$ is called intermittent control. Assume that $\mathcal{X} = \mathbb{R}^n$ is equipped with some norm $\|.\|$ and let

\[ A(t) = A(u(t)) = B + C(u(t)), \tag{4} \]

where $B$ is a constant matrix and

\[ C = C(u) = (c_{jk}), \quad c_{jk} = c_{jk}(u(t)), \quad j, k = 1, \ldots, N, \tag{5} \]

with $u$ defined in (3). Moreover, let $B$ and $C(u)$ be negatives of generally singular M-matrices [1, p. 133].
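To make the setting of (2)–(5) concrete, here is a minimal simulation sketch. Everything specific in it is an assumption chosen for illustration: the 2×2 matrices `B` and `C(u)` (both negatives of singular M-matrices whose columns sum to zero), the control levels, the phase lengths, and the explicit Euler scheme. The second phase is taken to last a time `t2`, so that the period is `t1 + t2` as in (3).

```python
def intermittent_control(t, u_a, u_b, t1, t2):
    """Piecewise constant control with assumed period t1 + t2:
    u_a during the first t1 time units of each period, u_b afterwards."""
    phase = t % (t1 + t2)
    return u_a if phase < t1 else u_b


def system_matrix(u):
    """A(u) = B + C(u) for a hypothetical two-state model.
    Columns of B and C(u) sum to zero and off-diagonal entries are nonnegative."""
    B = [[-1.0, 2.0], [1.0, -2.0]]
    C = [[-u, 0.0], [u, 0.0]]
    return [[B[i][j] + C[i][j] for j in range(2)] for i in range(2)]


def mat_vec(A, x):
    """Matrix-vector product for small dense matrices stored as lists."""
    return [sum(A[i][j] * x[j] for j in range(len(x))) for i in range(len(A))]


def simulate(x0, t_end, dt, u_a, u_b, t1, t2):
    """Explicit Euler integration of dx/dt = A(u(t)) x, with u frozen on each step."""
    x = list(x0)
    for n in range(round(t_end / dt)):
        u = intermittent_control(n * dt, u_a, u_b, t1, t2)
        dx = mat_vec(system_matrix(u), x)
        x = [xi + dt * dxi for xi, dxi in zip(x, dx)]
    return x


x_final = simulate([1.0, 0.0], t_end=10.0, dt=0.01, u_a=0.5, u_b=3.0, t1=1.0, t2=2.0)
```

Because the columns of `A(u)` sum to zero, the components of the state keep a constant total, and with a step this small they also stay nonnegative, which matches the interpretation of the state as concentrations or probabilities.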

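The following collection of matrices will appear useful in approximating the matrix of system (4) under the periodicity hypothesis. Let $\varepsilon > 0$ be arbitrary. Define

\[ C_\varepsilon(t) = \begin{cases} C(u_a) - [C(u_b) - C(u_a)] \dfrac{t - \varepsilon}{\varepsilon} & \text{for } 0 \le t \le \varepsilon,\\[2mm] C(u_a) & \text{for } t \in [\varepsilon, t_1),\\[1mm] C(u_b) + [C(u_a) - C(u_b)] \dfrac{t - (t_1 + \varepsilon)}{t_1 - (t_1 + \varepsilon)} & \text{for } t \in [t_1, t_1 + \varepsilon],\\[2mm] C(u_b) & \text{for } t \in [t_1 + \varepsilon, t_1 + t_2) \end{cases} \]

and periodically for any $t \in \mathbb{R}^1_+$ as above in defining $C(t)$. Since

\[ v(t) = \exp\{Bt\} + \int_0^t \exp\{B(t - s)\}\, C(u(s))\, v(s) \, ds \]

and similarly,

\[ v_\varepsilon(t) = \exp\{Bt\} + \int_0^t \exp\{B(t - s)\}\, C_\varepsilon(u(s))\, v_\varepsilon(s) \, ds, \]

we get the estimate

\[ \max\{ \|v(s) - v_\varepsilon(s)\| : 0 \le s \le t_1 + t_2 \} \le \left\| \left( \int_0^{\varepsilon} + \int_{t_1}^{t_1 + \varepsilon} \right) \exp\{B(t - s)\} [C(u(s))v(s) - C_\varepsilon(u_\varepsilon(s))v_\varepsilon(s)] \, ds \right\| \le \kappa \varepsilon, \]

where $\kappa$ is a constant independent of $\varepsilon$ and $t \in [0, +\infty)$. According to this relation, $v$ is the uniform limit of the sequence $\{v_\varepsilon\}$ on any finite interval of the semiaxis $[0, +\infty)$. Our next aim is to prove

Theorem 1. Under the hypotheses of this section Problem 1 possesses a solution periodic in the time variable. The number of linearly independent periodic solutions is bounded above by the dimension of the kernel of the matrix of the system

\[ A(u) = B + C(u). \]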

Proof. We consider first the situation when matrix $C_\varepsilon$ is in place of matrix $C(t)$. Since, according to [5], there exists a constant $\kappa < +\infty$ such that

\[ \|\exp A(u(t))\| \le \kappa, \]

we see that any fundamental matrix $\Phi$ of the investigated system satisfies

\[ \|\Phi(t)\| \le \kappa. \]

From the well known fact [8, pp. 90-93] that the fundamental matrix of a homogeneous periodic system whose matrix $A = A(t)$ is continuous can be represented in the form

\[ \Phi(t) = P(t) \exp\{tR\}, \quad P(t + \nu) = P(t), \quad \nu = t_1 + t_2, \]

we deduce that the spectrum of the approximation $R_\varepsilon$, where $\Phi_\varepsilon(t) = P_\varepsilon(t) \exp\{tR_\varepsilon\}$ denotes the fundamental matrix of $C_\varepsilon$ constructed in this section, has the following structure:

\[ \sigma(R_\varepsilon) = \{0\} \cup \{\rho_1, \ldots, \rho_s\}, \quad \Re \rho_j < 0, \]

and due to the Markov property of our model, the block corresponding to eigenvalue zero is diagonalizable. Thus, taking one of the eigenvectors of $A_\varepsilon$ corresponding to eigenvalue 0, and denoting it by $v_0$, we check easily that $v_\varepsilon$ satisfying

\[ v_\varepsilon(t) = \exp\{Bt\} v_0 + \int_0^t \exp\{B(t - \tau)\}\, C_\varepsilon(u(\tau))\, v_\varepsilon(\tau) \, d\tau \]

is, by construction, periodic and thus it represents a periodic solution to the Cauchy problem

\[ \frac{d}{dt} X(t) = A_\varepsilon(u(t)) X(t), \quad X(0) = v_0. \]

Relations $B^T \hat{x} = C_\varepsilon(t)^T \hat{x} = 0$ for all $t \in \mathbb{R}^1_+$, where $\hat{x} = e = (1, \ldots, 1)^T \in \mathbb{R}^n$, imply that

\[ v(t) = \lim_{\varepsilon \to 0} v_\varepsilon(t) = \exp\{Bt\} v_0 + \int_0^t \exp\{B(t - \tau)\}\, C(u(\tau))\, v(\tau) \, d\tau \]

is a solution to

\[ \frac{d}{dt} X(t) = A(u(t)) X(t), \quad X(0) = v_0. \]

Moreover, since $v_0$ is $(t_1 + t_2)$-periodic, so is $v$. Since the part of the theorem concerned with the number of periodic solutions is obvious, the proof is complete.

It is interesting that an analog of the previous theorem remains valid for a broader class of problems; actually, we have

Theorem 2. In the following, let $\|.\|$ denote some norm in $\mathbb{R}^N$. Assume a system of operators $A(t)$ is defined via matrices $B$ and $G(t)$ possessing the following properties:

(i) $B$ is a constant $N \times N$ minus M-matrix;

(ii) $G = G(x; u)$, $x \in \mathbb{R}^N_+$, $u \in \mathcal{U}$, where $\mathcal{U}$ is a set of scalar real valued functions on the reals, is a minus M-matrix;

(iii) There is a positive constant $\kappa < +\infty$ independent of $x \in \mathbb{R}^N$ and $u \in \mathcal{U}$ such that $\|G(x; u)\| \le \kappa$;

(iv) There is a positive constant quantity $\nu < +\infty$ independent of $x \in \mathbb{R}^N$ and $u \in \mathcal{U}$ such that $\|G(x; u) - G(y; u)\| \le \nu \|x - y\|$.

Besides the hypotheses (i)-(iv), let $u = u(t)$, $u \in \mathcal{U}$, be periodic within the time interval $[0, +\infty)$ with period $\tau$. Then there exist $\tau$-periodic solutions to Problem 1. The number of linearly independent such periodic solutions coincides with the number of linearly independent solutions, belonging to $\mathbb{R}^N_+$, to the following nonlinear equation:

\[ Bw + G(w, u)w = 0. \tag{6} \]

Proof. First, according to [5], there exist solutions to (6). Let $w$ be one of such solutions. Then we can construct matrix $G(w, u(t))$. Since $w$ is independent of time, we see that this matrix, and hence also Problem 1 with this data, is periodic and satisfies the conditions of Theorem 1. It follows that the solution obtained according to the Theorem just mentioned is periodic. The proof of the remaining part of Theorem 2 is obvious and is thus omitted. The proof is complete.

2.2 Models based on PDE’s

This subsection is devoted to investigations generalizing the previous results and methods in order to cover by mathematical models situations without limitations to both sides of groups of researchers, the specialists in experimental as well as in theoretical area. Our approach gives freedom to both sides to consider very general situations on the one hand and broadens our experience for making realistic visions for a progress in the near future.

Typical in this direction is the appearance of a semigroup of operators in place of the exponential of a matrix. We are going to keep the generality in realistic bounds however by assuming that the infinite-dimensional objects can be relatively accurately approximated by their finite-dimensional counterparts. The class of problems we are going to consider in this subsection is characterized as follows:

(α) $\mathcal{X}$ is a Banach space equipped with norm $\|.\|$ and is generated by a closed normal cone $\mathcal{K}$ [11];
(β) Operator $B$ is densely defined on $D(B) \subset \mathcal{X}$ and generates a semigroup of bounded linear operators of class $C_0$ [10];
(γ) Each operator of the class of operators forming the semigroup mentioned in (β) leaves invariant the cone $\mathcal{K}$ [11];
(δ) For every $x \in \mathcal{X}$ the linear operator $G(x)$ is such that, uniformly with respect to $\mathcal{K}$, $\|G(x)\| \le \kappa < +\infty$;
(ε) Relations $G(x) = H(x) - \rho(H(x))I$, $H(x)\mathcal{K} \subset \mathcal{K}$ hold for every $x \in \mathcal{K}$;
(η) There is a constant $\nu$, $0 \le \nu < +\infty$, such that relations $\|G(x) - G(y)\| \le \nu \|x - y\|$ hold for any $x, y \in \mathcal{K}$.

Without proof we formulate an abstract result whose validity is based on our results established by solving some concrete problems, some of them presented in the present contribution. The proofs exploit the "finite-dimensional" techniques and convergence properties of some suitable approximations in order to establish appropriate generalizations in abstract infinite-dimensional spaces. Two examples of major interest are shown in the Appendix.

Theorem 3. Hypotheses (α)-(η) and periodicity of data imply that the initial value problem introduced as Problem 1 possesses solutions periodic in the time variable.

A crucial step in proving Theorem 3 consists of showing that there is a suitable discretization of data appearing in Problem 1 such that the corresponding approximate problems are determined by data for which Theorem 2 applies. For operators exposed in the Appendix the step just mentioned is based on the discrete maximum principle valid for piecewise linear finite element discretizations of the spatial variables and convergence of the corresponding approximate solutions to the exact ones. This is the case of Example 1 (see [9, pp. 285-286]). For the case of Example 2 a method of V.S. Vladimirov is appropriate [12, pp. 73-76].

3 Concluding Remarks

Our stochastic models introduced for analyzing actual problems of Biology and Chemistry have been further generalized and completed by enriching further structures such as more general partial order in order to examine formulation of more complex mathematical problems. Two areas of results are established: Existence of periodic solutions to some nonlinear boundary value problems in direction of theory belongs to the first category and an explanation of certain type of experiments concerned with growth of microalgae via intermittent control processes as application belongs to the second category.

4 Appendix

Example 1. Diffusion operator. Let $\Omega \subset \mathbb{R}^d$, $d \le 3$, be a bounded domain with the Lipschitz boundary $\partial\Omega$. Diffusion operator $L$ is defined as the following elliptic differential operator by formulas

\[ Lw(r) \equiv -\sum_{k=1}^{d} \frac{\partial}{\partial x_k} \left( D(r) \frac{\partial w}{\partial x_k}(r) \right), \quad r \in \Omega, \quad \text{and} \quad \frac{\partial w(r)}{\partial n} = 0 \ \text{for } r \in \partial\Omega. \]

Example 2. Transport operator. Let $\Omega \subset \mathbb{R}^d$, $d \le 3$, be a bounded convex domain with smooth boundary $\partial\Omega$. Let $n$ denote the direction of the outer normal. Further, let $V = [0, +\infty) \times \omega$, $\omega = \{ \mathbf{v} \in \mathbb{R}^3 : |\mathbf{v}|^2 = v_x^2 + v_y^2 + v_z^2 = 1 \}$, be the velocity space. By $w$ we denote a density of particles. Transport operator $L$ is then defined by relations

\[ Lw \equiv \mathbf{v} \cdot \operatorname{grad} w + v\,\Sigma(r, v)\,w + \mu w, \quad v = |\mathbf{v}|, \quad r \in \Omega, \]

and $w(r, \mathbf{v}) = 0$ for $r \in \partial\Omega$ and $\mathbf{v} \in V$ whenever $(n, \mathbf{v}) < 0$. The data in the definition of the transport operator are the total cross-section $\Sigma(r, v)$ and a quantity $\mu$ possessing various meanings depending on the matter of research. In our treatment, $\mu$ makes the transport operator satisfy our basic requirement: The semigroup of operators generated by corresponding infinitesimal generators should leave invariant some cone generating the appropriate space and preserve the total concentrations.

Acknowledgements. Supported by grant from the Grant Agency of the Czech Republic under contract Nr. 201/09/1544 and by the grant from the Ministry of Education, Sports and Youth under contract Nr. MSM 6840770010.

References

1. Berman, A., Plemmons, R.: Non-negative Matrices in the Mathematical Sciences. Academic Press, New York (1979)
2. Bohl, E., Boos, W.: Quantitative analysis of binding protein-mediated ABC transport system. J. Theor. Biology 186, 65–74 (1997)
3. Bohl, E., Marek, I.: A stability theorem for a class of linear evolution problems. Integral Equations Operator Theory 34, 251–269 (1999)
4. Bohl, E., Marek, I.: Existence and uniqueness results for nonlinear cooperative systems. Operator Theory: Advances and Applications 130, 153–170 (2001)
5. Bohl, E., Marek, I.: Input-output systems in Biology and Chemistry and a class of mathematical models characterizing them. Appl. Math. 50, 219–245 (2005)
6. Brenner, P., Thomée, V.: On rational approximations of semigroups. SIAM J. Numer. Anal. 16, 683–694 (1979)
7. Čelikovský, S.: On the Lipschitzian dependence of trajectories of bilinear systems of multi-input time dependent bilinear systems of control. Problems Control Information Theory 17, 231–238 (1988)
8. Coddington, E.A., Levinson, N.: Theory of Ordinary Differential Equations. McGraw-Hill Book Company, New York (1955); Russian translation Izdat. Innostrannoi Literatury, Moscow (1958)
9. Ern, A., Guermond, J.-L.: Theory and Practice of Finite Element Methods. Springer, Heidelberg (2005)
10. Hille, E., Phillips, R.S.: Functional Analysis and Semigroups. In: Amer. Math. Society Coll. Publ., vol. XXXI. Third printing of Revised edition, Providence, Rhode Island (1968)
11. Krein, M.G., Rutman, M.A.: Linear operators leaving invariant a cone in a Banach space. Uspekhi mat. nauk III Nr. 1, 3–95 (1948) (in Russian); English translation in AMS Translations 26 (1950)
12. Marchouk, G.I.: Methods of Nuclear Reactor Computation, Gosatomizdat, Moscow (1961) (in Russian)
13. Papáček, Š.: Photobioreactors for Cultivation of Microalgae Under Strong Irradiances Modelling: Simulation and Design. Ph.D. Thesis, Technical University Liberec (2005)
14. Papáček, Š., Čelikovský, S., Štys, D., Ruiz-León, J.: Bilinear system as a modelling framework for analysis of microalgal growth. Kybernetika 43, 1–20 (2007)
15. Sultangazin, U.M., Smelov, V.V., Akishev, A.S., Sabekov, A., Marek, I., Míka, S., Žitný, K.: Some Mathematical Problems of Kinetic Transport Theory, Nauka, Alma-Ata (1986) (in Russian)

Implementation of 2D Strongly Autonomous Behaviors by Full and Partial Interconnections

Diego Napp Avelli and Paula Rocha

Abstract. In this paper we study linear discrete two-dimensional systems in the behavioral context where control is viewed as interconnection. Within the behavioral framework a natural concept of interconnection has been introduced by J.C. Willems, called regular interconnection. We investigate regular interconnections that yield finite dimensional behaviors, and prove that when a finite dimensional behavior can be achieved from a given behavior B by regular full/partial interconnection then the controllable part/manifest part of B is rectifiable.

1 Introduction

As is well known, the central idea in the behavioral approach to control is the one of interconnection. This consists in the interconnection of a given behavior to be controlled B (the plant) with a suitable behavior (the controller), in order to obtain a desired behavior Bd. If this is possible, we say that Bd is implementable from B. In this context, there are two main situations to be considered: either all the system variables are available for control (i.e., are control variables) or only some of the variables are control variables. In order to distinguish these two cases, we respectively refer to full and partial control or interconnection.

In this paper we focus on a particular kind of interconnection that is called regular interconnection. In such interconnection, the restrictions imposed on the plant by the controller are independent of the restrictions already present in the plant, as happens, for instance, in a feedback interconnection (see [7, 15]). More concretely, we are interested in studying regular interconnections that yield finite dimensional behaviors, i.e., we wish to characterize the behaviors from which a finite dimensional behavior is implementable by regular interconnection. This can be seen as a relaxation of the control objective of implementing the zero behavior by regular interconnection from a given behavior B, a problem that has already been addressed in [10, 15]. In this sense, regular implementation of a finite dimensional behavior can be regarded as almost regular implementation of zero, see [4, 5].

Using a notion of stability defined with respect to a specified stability region by adapting the ideas in [6] to the discrete case, it was recently proven, in [9], that the stable behaviors considered there have the property of being finite dimensional linear subspaces of $(\mathbb{R}^q)^{\mathbb{Z}^2}$. Thus, the possibility of stabilizing a behavior B is strictly connected with the regular implementation of a finite dimensional behavior from B.

In the context of full interconnection, a complete characterization of the stabilization property was given in [9] under the assumption that the controllable part of the given behavior B, denoted by Bc, is rectifiable, i.e., is a direct summand of $(\mathbb{R}^q)^{\mathbb{Z}^2}$. This is a very strong property and allows one to derive several results that are in general only valid for the one dimensional case (1D), such as, for instance, the existence of a decomposition of the behavior into the direct sum of its controllable part and an autonomous part.

However, in this paper we prove that if a finite dimensional behavior is implementable by regular interconnection from a given behavior B, then Bc is rectifiable. As a consequence of this result we conclude that the assumption about the rectifiability of Bc, used in [9] in order to obtain several results on stabilization, is indeed not restrictive since it is a necessary condition for stabilization.

In contrast with the full interconnection case, we also treat the partial interconnection case and show that, under one condition, analogous results can be obtained for this new situation. We begin by introducing some necessary background from the field of 2D discrete behavioral theory. Section 3 is devoted to an exposition of regular interconnection and finite dimensional behaviors. In Section 4 we move from the full interconnection context to the partial one.

Diego Napp Avelli Research Unit of Mathematics and Applications, Department of Mathematics, University of Aveiro, e-mail: [email protected]
Paula Rocha Department of Electrical and Computer Engineering, Faculty of Engineering, University of Oporto, e-mail: [email protected]

R. Bru and S. Romero-Vivó (Eds.): Positive Systems, LNCIS 389, pp. 369–378. springerlink.com © Springer-Verlag Berlin Heidelberg 2009

2 Preliminaries

In order to state more precisely the questions to be considered, we introduce some preliminary notions and results. B 2 We consider 2D behaviors defined over  that can be described by a set of linear partial difference equations, i.e.,

− − B = kerR(σ,σ 1) := {w ∈ U | R(σ,σ 1)w ≡ 0}, Implementation of 2D Strongly Autonomous Behaviors 371

2 − q) σ =(σ ,σ ) σ 1 = where U is the trajectory universe, here taken to be (Ê , 1 2 , (σ −1,σ −1) σ σ ( )= 1 2 ,the i’s are the elementary 2D shift operators (defined by iw k

( 2 2  w k + ei),fork ∈  ,whereei is the ith element of the canonical basis of )and R(s,s−1) is an 2D Laurent-polynomial matrix known as representation of B. Instead of characterizing B by means of a representation matrix R,itisalso possible to characterize it by means of its orthogonal module Mod(B),which ( −1 1×q[ , −1] consists of all the 2D Laurent-polynomial rows r s,s ) ∈ Ê s s such that B ⊂ ( −1 [ , −1] ( ) ker r σ,σ ), and can be shown to coincide with the Ê s s -module RM R generated by the rows of R, i.e., Mod(B)=RM(R(s,s−1)). Theorem 1. [15, pag.1074] Let B1 and B2 be two 2D behaviors. Then, B1 +B2 1 1 2 and B ∩ B2 are also 2D behaviors. Moreover, one has that Mod(B + B )= Mod(B1) ∩ Mod(B2) and Mod(B1 ∩ B2)=Mod(B1)+Mod(B2).

2 B ⊂ ( q) ∈ Definition 1. Abehavior Ê is said to be controllable if for all w1, w2 B 2 ( , ) > δ there exits δ > 0 such that for all subsets U1, U2 ⊂  with d U1 U2 ,there ∈ B | | | | exists a w such that w U1 = w1 U1 and w U2 = w2 U2 . On the other hand, we say that a behavior is autonomous if it has no free vari- ables (or inputs). B = kerR(σ,σ −1) is autonomous if and only if R(s,s−1) has full [ , −1] column rank (over Ê s s ), [13]. In the 1D case, all autonomous behaviors are finite-dimensional vector spaces. For general multidimensional variable behaviors this is no longer true. In fact, an autonomous multidimensional behavior that is finite-dimensional is called strongly autonomous in [6]. As also shown in [13], every 2D behavior B can be decomposed into a sum B = Bc +Ba, where Bc is the controllable part of B (defined as the largest controllable sub-behavior of B)andBa is a (non-unique) autonomous sub-behavior said to be an autonomous part of B. If the controllable-autonomous decomposition happens to be a direct sum decomposition, i.e., if B = Bc ⊕Ba, we say that the autonomous part of Ba is an autonomous direct summand of B. An interesting case is when the controllable part Bc is rectifiable. A 2D be- 2 B = ( −1 q) havior kerR σ,σ ) ⊂ (Ê is said to be rectifiable if there exists an in- vertible operator U(σ,σ −1),whereU(s,s−1) is an 2D Laurent-polynomial matrix, −1 such that U(σ,σ )(B)=ker[Il 0],whereIl is the l × l identity matrix, for some l ∈{1,...,q}. It has been shown that a behavior B is rectifiable if and only if B is 2 ( q) direct summand of Ê , (see [10, Lemma 2.12] and [14, Th. 9 and Th. 10, page 819]). When a rectifying operator exists, it is possible to take advantage of the simplified form of the rectified behaviors in order to derive various results. In particular, one can obtain the next proposition, see [9, Prop.1].

Proposition 1. Let $\mathcal{B} = \ker R(\sigma,\sigma^{-1}) \subset (\mathbb{R}^q)^{\mathbb{Z}^2}$ be a behavior with rectifiable controllable part $\mathcal{B}^c$ and let $U(\sigma,\sigma^{-1})$ be a corresponding rectifying operator such that $U(\sigma,\sigma^{-1})(\mathcal{B}^c) = \ker [I_l \;\; 0]$. Then the following are equivalent:

1. $\mathcal{B} = \mathcal{B}^c \oplus \mathcal{B}^a$,
2. $\mathcal{B}^a = \ker \begin{bmatrix} P & 0 \\ X & I_{q-l} \end{bmatrix} U$, with $P(s,s^{-1})$ such that $RU^{-1} = [P \;\; 0]$ and $X(s,s^{-1})$ an arbitrary Laurent-polynomial matrix of suitable size.

Since the behaviors $\mathcal{B}^a$ of Proposition 1 always exist and are autonomous, the previous result shows that every behavior with rectifiable controllable part has autonomous direct summands and, moreover, provides a parametrization of all such summands.

Definition 2. A behavior $\mathcal{B}$ is said to be regular if $\mathrm{Mod}(\mathcal{B})$ is free (as an $\mathbb{R}[s,s^{-1}]$-module), or equivalently if there exists a polynomial matrix $R$ of full row rank such that $\mathcal{B} = \ker R$.

3 Control by Regular Interconnections

Given two behaviors $\mathcal{B}^1$ and $\mathcal{B}^2$, their interconnection is defined as the intersection $\mathcal{B}^1 \cap \mathcal{B}^2$. This interconnection is said to be regular if $\mathrm{Mod}(\mathcal{B}^1) \cap \mathrm{Mod}(\mathcal{B}^2) = \{0\}$ or equivalently if $\mathcal{B}^1 + \mathcal{B}^2 = (\mathbb{R}^q)^{\mathbb{Z}^2}$, see [10, Lemma 3, p. 115]. If the interconnection of $\mathcal{B}^1$ and $\mathcal{B}^2$ is regular then we denote it by $\mathcal{B}^1 \cap_{\mathrm{reg}} \mathcal{B}^2$. In a regular interconnection, the controller imposes restrictions which are not already present in the plant. In this sense a feedback controller is a simple example of a regular interconnection, where the controller imposes restrictions only on the plant input, which in the plant is unrestricted [7]. This notion of regularity of an interconnection is independent of the concept of a regular behavior.

Based on the notion of behavior interconnection it is possible to formulate a control problem in set theoretic terms. Indeed, if $\mathcal{B}$ is the behavior of the system to be controlled (the plant) and $\mathcal{C}$ is the set of all signals compatible with the additional restrictions to be imposed on $w$, i.e., the controller, then the resulting controlled behavior is given by the interconnection $\mathcal{B} \cap \mathcal{C}$ of the behaviors $\mathcal{B}$ and $\mathcal{C}$. Thus, in the behavioral setting, a control problem consists in, given a desired controlled behavior $\mathcal{B}^d$, finding a controller $\mathcal{C}$ such that its interconnection with the plant behavior $\mathcal{B}$ results in $\mathcal{B}^d$. In case this interconnection is regular, the desired behavior $\mathcal{B}^d$ is said to be achievable or implementable by regular interconnection. The following necessary condition for implementation by regular interconnection has been derived in [10, Th. 4.5, p. 124].

Theorem 2. Let $\mathcal{B}$ and $\mathcal{B}^d$ be two behaviors. If $\mathcal{B}^d$ is implementable by regular interconnection from $\mathcal{B}$, then $\mathcal{B} = \mathcal{B}^c + \mathcal{B}^d$.

Using this result it is possible to show the next useful lemma.

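Lemma 1. Let $\mathcal{B}$ and $\mathcal{K}$ be two 2D behaviors. If the interconnection of $\mathcal{B}$ and $\mathcal{K}$ is regular, then so is the interconnection of $\mathcal{B}^c$ and $\mathcal{K}$.

Proof. Let $\mathcal{B} \cap \mathcal{K} = \mathcal{B}^d$ with regular interconnection, i.e., $\mathrm{Mod}(\mathcal{B}) \oplus \mathrm{Mod}(\mathcal{K}) = \mathrm{Mod}(\mathcal{B}^d)$. Using Theorem 2 we have that $\mathcal{B} = \mathcal{B}^c + \mathcal{B}^d$ or equivalently $\mathrm{Mod}(\mathcal{B}) = \mathrm{Mod}(\mathcal{B}^c) \cap \mathrm{Mod}(\mathcal{B}^d) = \mathrm{Mod}(\mathcal{B}^c) \cap (\mathrm{Mod}(\mathcal{B}) \oplus \mathrm{Mod}(\mathcal{K}))$. Using that $\mathrm{Mod}(\mathcal{B}) \subset \mathrm{Mod}(\mathcal{B}^c)$, one easily shows that $\mathrm{Mod}(\mathcal{B}^c) \cap (\mathrm{Mod}(\mathcal{B}) \oplus \mathrm{Mod}(\mathcal{K})) = (\mathrm{Mod}(\mathcal{B}^c) \cap \mathrm{Mod}(\mathcal{K})) \oplus \mathrm{Mod}(\mathcal{B})$. Since $\mathrm{Mod}(\mathcal{B}) \cap \mathrm{Mod}(\mathcal{K}) = \{0\}$, we have that $\mathrm{Mod}(\mathcal{B}^c) \cap \mathrm{Mod}(\mathcal{K}) = \{0\}$. □

Lemma 1 shows that the controllable part of a behavior plays an important role in the context of regular interconnections. Indeed, a controller which does not interconnect with $\mathcal{B}^c$ in a regular way cannot interconnect with $\mathcal{B}$ regularly.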

The next results are crucial for the study of the regular implementation of finite dimensional behaviors.

Lemma 2. (see [4, Th. 12]) Let $\mathcal{B}$ be a behavior. Then there exists a unique regular behavior $\mathcal{B}^+$ such that $\mathcal{B}/\mathcal{B}^+$ is finite dimensional (as a vector space over $\mathbb{R}$), i.e., there exists a unique (up to isomorphism) free $\mathbb{R}[s,s^{-1}]$-module, denoted by $\mathrm{Mod}(\mathcal{B})^+$, which is contained in $\mathbb{R}^{1\times q}[s,s^{-1}]$ and such that $\mathrm{Mod}(\mathcal{B})^+/\mathrm{Mod}(\mathcal{B})$ has finite dimension (as a vector space over $\mathbb{R}$).

Note that $\mathrm{Mod}(\mathcal{B})^+$ is the smallest free module containing $\mathrm{Mod}(\mathcal{B})$ and its computation can be effectively implemented, see [4].

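Lemma 3. (see [4, Cor. 23]) Let $\mathcal{B}$ be a behavior and $\mathcal{B}^d \subset \mathcal{B}$ be a sub-behavior. Then there exists a controller $\mathcal{C}$ such that $(\mathcal{B} \cap_{\mathrm{reg}} \mathcal{C})/\mathcal{B}^d$ has finite dimension (as a vector space) if and only if $\mathrm{Mod}(\mathcal{B})^+$ is a direct summand of $\mathrm{Mod}(\mathcal{B}^d)^+$.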

The previous lemma has a clear system theoretic interpretation. It can be seen as the almost implementation of $\mathcal{B}^d$, since the controlled behavior and the desired one "differ" only by a finite dimensional behavior, which in the nD context is considered "small". The idea that two behaviors are "almost" equal if they differ by finite dimensional behaviors has been considered in several recent papers [2, 4, 5].

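Definition 3. Let $\mathcal{B}$ be a behavior and $\mathcal{B}^d \subset \mathcal{B}$ be a sub-behavior. If there exists a controller $\mathcal{C}$ such that $(\mathcal{B} \cap_{\mathrm{reg}} \mathcal{C})/\mathcal{B}^d$ has finite dimension (as a vector space), then we say that $\mathcal{B}^d$ is regularly almost implementable.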

Theorem 3. Let $\mathcal{B} \subset (\mathbb{R}^q)^{\mathbb{Z}^2}$ be a behavior. If there exists a controller behavior $\mathcal{C}$ such that $\mathcal{B}^{fd} = \mathcal{B} \cap_{\mathrm{reg}} \mathcal{C}$ is finite dimensional (strongly autonomous), then $\mathcal{B}^c$ is rectifiable.

Proof. Applying Lemma 3 for $\mathcal{B}^d = 0$ one obtains that $\mathcal{B} \cap \mathcal{C} = \mathcal{B}^{fd}$ is finite dimensional if and only if $\mathrm{Mod}(\mathcal{B})^+$ is a direct summand of $\mathrm{Mod}(0)^+ = \mathrm{Mod}(0) = \mathbb{R}^{1\times q}[s,s^{-1}]$. Define $\mathcal{B}^+$ as in Lemma 2. Using Theorem 1 one can obtain that $\mathcal{B}^+$ is a direct summand of $(\mathbb{R}^q)^{\mathbb{Z}^2}$. Thus, $\mathcal{B}^+$ is a rectifiable behavior and therefore also controllable. We have that for the 2D case $\mathrm{Mod}(\mathcal{B}^c)$ is free (see [6, Corollary 4, page 399]) and since $\mathrm{Mod}(\mathcal{B})^+$ is the smallest free module containing $\mathrm{Mod}(\mathcal{B})$ we have that $\mathcal{B}^c \subset \mathcal{B}^+ \subset \mathcal{B}$. Further, $\mathcal{B}^c$ is, by definition, the largest controllable sub-behavior, which implies $\mathcal{B}^+ \subset \mathcal{B}^c$, and therefore $\mathcal{B}^c = \mathcal{B}^+$ is rectifiable. □

Remark 1. According to [10, 15], rectifiability is equivalent to the possibility of obtaining the zero behavior by regular interconnection. Hence, the possibility of obtaining a finite dimensional behavior by regular interconnection from $\mathcal{B}$ can be regarded as almost regular interconnection to zero, since it represents the implementation of the zero behavior up to a finite dimensional one.

The following theorem extends the results obtained in [10, Lemma 3.5, p. 117] and [15, Cor. 5.2, p. 1083] on the characterization of regular implementation of the zero behavior, and also the related results on the dual problem of decomposition of behaviors obtained in [2].

Theorem 4. Let $\mathcal{B}$ be a behavior. Then a finite dimensional behavior is regularly implementable from $\mathcal{B}$ if and only if all sub-behaviors $\mathcal{B}' \subset \mathcal{B}$ are almost implementable.

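Proof. (⇒): Let $\mathcal{K}$ be a behavior such that $\mathcal{B} \cap_{\mathrm{reg}} \mathcal{K}$ has finite dimension. We apply Lemma 3 to obtain that $\mathrm{Mod}(\mathcal{B})^+ \oplus \mathrm{Mod}(\mathcal{K})^+ = \mathbb{R}^{1\times q}[s,s^{-1}]$. Further, a sub-behavior $\mathcal{B}' \subset \mathcal{B}$ is regularly almost implementable if and only if $\mathrm{Mod}(\mathcal{B})^+$ is a direct summand of $\mathrm{Mod}(\mathcal{B}')^+$, again by Lemma 3. Now it is easy to check that $\mathrm{Mod}(\mathcal{K})^+ \cap \mathrm{Mod}(\mathcal{B}')^+$ is the direct complement of $\mathrm{Mod}(\mathcal{B})^+$ in $\mathrm{Mod}(\mathcal{B}')^+$, i.e., $\mathrm{Mod}(\mathcal{B})^+ \oplus (\mathrm{Mod}(\mathcal{K})^+ \cap \mathrm{Mod}(\mathcal{B}')^+) = \mathrm{Mod}(\mathcal{B}')^+$. The inclusion $\mathrm{Mod}(\mathcal{B})^+ \oplus (\mathrm{Mod}(\mathcal{K})^+ \cap \mathrm{Mod}(\mathcal{B}')^+) \subset \mathrm{Mod}(\mathcal{B}')^+$ is trivial. For the other inclusion, take $m \in \mathrm{Mod}(\mathcal{B}')^+$. Obviously $m = a + b$ for some $a \in \mathrm{Mod}(\mathcal{B})^+$ and $b \in \mathrm{Mod}(\mathcal{K})^+$. Since $a \in \mathrm{Mod}(\mathcal{B})^+ \subset \mathrm{Mod}(\mathcal{B}')^+$, one has that $b = m - a \in \mathrm{Mod}(\mathcal{B}')^+$ and therefore $m \in \mathrm{Mod}(\mathcal{B})^+ \oplus (\mathrm{Mod}(\mathcal{K})^+ \cap \mathrm{Mod}(\mathcal{B}')^+)$.

(⇐): Obvious, since zero is regularly almost implementable. □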

4 Implementation of Finite Dimensional Behaviors by Partial Regular Interconnections

In contrast to the previous sections, where all system variables were assumed to be available for interconnection (full interconnections), in this section we consider the more general situation in which only some of the system variables may be used for the purpose of interconnection (partial interconnections), see [1, 3, 8, 11, 12]. The reference to partial and full is sometimes dropped when it is clear from the context which case is under consideration.

The variables whose trajectories we intend to shape, denoted by $w$, can be controlled through a set of control variables $c$, to which we can "attach" a controller. These are the variables that can be measured and/or actuated upon. Hence, suppose we have a system behavior, denoted by $\mathcal{B}_{(w,c)}$, with two types of variables: the variable to be controlled $w$ and the variable $c$ through which the system can be interconnected to a controller behavior. To interconnect the behavior to a controller means requiring that the $c$ trajectories in the behavior are also elements of the controller behavior. The space of $w$ trajectories in $\mathcal{B}_{(w,c)}$ is called the manifest behavior, and the space of $w$ trajectories in the interconnection of the behavior and the controller is called the manifest controlled behavior. A given ("desired") behavior is called regularly implementable by partial interconnection (through $c$) if it can be obtained as manifest controlled behavior through a regular interconnection. It is immediately apparent that restricting oneself to using only the control variables for interconnection is more involved.

In this section we prove that, under a certain condition, if a finite dimensional behavior is regularly implementable from a given behavior $\mathcal{B}_{(w,c)}$, then the controllable part of the corresponding manifest behavior $\mathcal{B}_w$ must be rectifiable.
In order to make the notion of partial control more precise we introduce the following notation. If the variables are partitioned as $(w,c)$, we assume that the kernel representation of $\mathcal{B}_{(w,c)}$ is partitioned accordingly as $\mathcal{B}_{(w,c)} = \ker(R \;\; M)$. On the other hand, a controller behavior $\mathcal{C}_c = \ker K$, with variable $c$, can be thought of as a behavior $\mathcal{C}_{(w,c)}$ with extended variable $(w,c)$ and kernel representation $\ker(0 \;\; K)$. For notational convenience we denote $\mathcal{C} := \mathcal{C}_{(w,c)}$. We say that the partial interconnection of $\mathcal{B}_{(w,c)}$ and $\mathcal{C}_c$ is regular if the full interconnection of $\mathcal{B}_{(w,c)}$ and $\mathcal{C}$ is regular. We define the $w$-behavior of $\mathcal{B}_{(w,c)}$ as $\mathcal{B}_w := \Pi_w \mathcal{B}_{(w,c)} := \{w \mid \exists\, c \text{ such that } (w,c) \in \mathcal{B}_{(w,c)}\}$. This is the manifest behavior and can be interpreted as the "to be controlled" behavior, the behavior we are interested in, see [12]. Using the fundamental principle of Ehrenpreis and Palamodov it can be shown that $\mathcal{B}_w$ is again a linear nD behavior. Indeed, if $\mathcal{B}_{(w,c)} = \ker(R \;\; M)$, then a kernel representation of $\mathcal{B}_w$ is constructed as follows: take a minimal left annihilator (MLA) $F$ of $M$, i.e., a polynomial matrix $F$ such that $\ker F = \operatorname{im} M$. Then $\mathcal{B}_w = \ker(FR)$. We proceed analogously with $\mathcal{B}_c$.
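The elimination step above can be illustrated on a toy example. The matrices $R$ and $M$ below are not taken from the paper; they are chosen here only because the minimal left annihilator can be read off by inspection, with one scalar variable $w$ and one scalar variable $c$.

```latex
% Hypothetical plant with scalar w and scalar c (illustrative choice of R and M):
\[
R(s,s^{-1}) = \begin{bmatrix} s_1 - 1 \\ s_2 - 1 \end{bmatrix},\qquad
M(s,s^{-1}) = \begin{bmatrix} 1 \\ 1 \end{bmatrix},\qquad
\mathcal{B}_{(w,c)} = \ker\,(R \;\; M).
\]
% F = [1  -1] satisfies FM = 0, and every row r with rM = 0 is a multiple of F,
% so F is a minimal left annihilator of M and ker F = im M.
\[
F = \begin{bmatrix} 1 & -1 \end{bmatrix},\qquad
FR = s_1 - s_2,\qquad
\mathcal{B}_w = \ker(\sigma_1 - \sigma_2)
  = \{\, w : w(k + e_1) = w(k + e_2) \ \text{for all } k \in \mathbb{Z}^2 \,\}.
\]
% Direct check: (\sigma_1 - 1)w + c = 0 and (\sigma_2 - 1)w + c = 0 hold for some c
% if and only if (\sigma_1 - \sigma_2)w = 0, in which case c = -(\sigma_1 - 1)w.
```

In this example the manifest behavior consists of the signals that are constant along the direction $e_1 - e_2$, and the variable $c$ is determined by $w$.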

Definition 4. Let $\mathcal{B}_{(w,c)} = \ker(R \;\; M)$ be a behavior. Then we define the hidden behavior $\mathcal{H} := \{(w,c) \mid (0,c) \in \mathcal{B}_{(w,c)}\}$. Clearly $\mathcal{H} = \ker(0 \;\; M)$.

The following remark collects some known results in this area; for more information see [8, 11].

Remark 2. Let $\mathcal{B}_{(w,c)} = \ker(R \;\; M)$ be a behavior and $\mathcal{C} = \ker(0 \;\; K)$ a controller behavior. Note that $\Pi_w(\mathcal{B}_{(w,c)} \cap \mathcal{C}) = \Pi_w(\mathcal{B}_{(w,c)} \cap (\mathcal{C} + \mathcal{H}))$ and $\mathcal{C} + \mathcal{H} = \ker(0 \;\; LM)$ for some polynomial matrix $L$. Also it is not difficult to see that $\mathcal{B}_{(w,c)} \cap \mathcal{C}$ is a regular interconnection if and only if $\mathcal{B}_{(w,c)} \cap (\mathcal{C} + \mathcal{H})$ is a regular interconnection. Moreover, the (partial) interconnection $\ker(R \;\; M) \cap \ker(0 \;\; K)$ is regular if and only if the (full) interconnection $\ker XM \cap \ker K$ is regular, where $X$ is the MLA of $R$. Analogously, the interconnection $\ker(R \;\; M) \cap \ker(T \;\; 0)$ is regular if and only if the interconnection $\ker FR \cap \ker T$ is regular, where $F$ is the MLA of $M$.

Lemma 4. Let $\mathcal{B}_{(w,c)}$ be a behavior and $\mathcal{C}$ a controller behavior. If the manifest controlled behavior, i.e., $\Pi_w(\mathcal{B}_{(w,c)} \cap_{\mathrm{reg}} \mathcal{C})$, has finite dimension, then $\Pi_w(\mathcal{B}_{(w,c)} \cap_{\mathrm{reg}} \mathcal{C}^c)$ has finite dimension, where $\mathcal{C}^c$ is the controllable part of $\mathcal{C}$.

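Proof. Obviously $\mathcal{B}_{(w,c)} \cap \mathcal{C}^c \subset \mathcal{B}_{(w,c)} \cap \mathcal{C}$ implies $\Pi_w(\mathcal{B}_{(w,c)} \cap \mathcal{C}^c) \subset \Pi_w(\mathcal{B}_{(w,c)} \cap \mathcal{C})$, and therefore if $\Pi_w(\mathcal{B}_{(w,c)} \cap \mathcal{C})$ is finite dimensional so is $\Pi_w(\mathcal{B}_{(w,c)} \cap \mathcal{C}^c)$. Finally, the intersection of $\mathcal{B}_{(w,c)}$ and $\mathcal{C}^c$ is regular by Lemma 1 and Remark 2, which concludes the proof. □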

Lemma 5. Let $\mathcal{B}_{(w,c)} = \ker(R \;\; M)$ be a behavior. If a desired behavior $\mathcal{B}^d$ is implementable by regular partial interconnection with a regular controller $\mathcal{C} = \ker(0 \;\; LM)$, then $\mathcal{B}^d = \mathcal{B}_w \cap_{\mathrm{reg}} \ker(LR)$, i.e., $\mathcal{B}^d$ can also be implemented by regular (full) interconnection from $\mathcal{B}_w$.

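Proof. Without loss of generality we suppose that the matrix $LM$ has full row rank, since $\mathcal{C}$ is a regular behavior. Further,

$$\begin{bmatrix} I & 0 \\ -L & I \end{bmatrix} \cdot \begin{bmatrix} R & M \\ 0 & LM \end{bmatrix} = \begin{bmatrix} R & M \\ -LR & 0 \end{bmatrix},$$

and the kernel of the right-hand side does not change when the sign of its second block row is reversed. Let $X$ be the MLA of $M$. Hence

$$\Pi_w(\mathcal{B}_{(w,c)} \cap \mathcal{C}) = \Pi_w\left(\ker \begin{bmatrix} R & M \\ LR & 0 \end{bmatrix}\right) = \ker \begin{bmatrix} XR \\ LR \end{bmatrix} = \mathcal{B}_w \cap \ker LR.$$

To see that the interconnection between $\mathcal{B}_w$ and $\ker LR$ is regular we prove (see Remark 2) that the interconnection $\ker(R \;\; M) \cap \ker(LR \;\; 0)$ is regular, i.e., that $v(R \;\; M) = z(LR \;\; 0)$ for some row vectors $v$ and $z$ implies $v(R \;\; M) = 0 = z(LR \;\; 0)$. Suppose that $v(R \;\; M) = z(LR \;\; 0)$. Note that $z(LR \;\; 0) = z[(0 \;\; -LM) + (LR \;\; LM)]$ and then $v(R \;\; M) - z(LR \;\; LM) = (v - zL)(R \;\; M) = z(0 \;\; -LM)$. By the assumption that the interconnection of $\mathcal{B}_{(w,c)}$ and $\mathcal{C}$ is regular, one has that $(v - zL)(R \;\; M) = z(0 \;\; -LM) = 0$, and since $LM$ has full row rank one obtains that $z = 0$ and therefore $v(R \;\; M) = z(LR \;\; 0) = 0$, which proves that the interconnection is regular. □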

Theorem 5. Let $\mathcal{B}_{(w,c)}$ be a behavior and assume that $\mathcal{H}$ is controllable. If a finite dimensional behavior is regularly implementable from $\mathcal{B}_{(w,c)}$ by partial interconnection, then the controllable part of the manifest behavior $\mathcal{B}_w$ is rectifiable.

Proof. Let $\mathcal{B}^d$ be a finite dimensional behavior that is regularly implementable from $\mathcal{B}_{(w,c)} = \ker(R \;\; M)$, i.e., $\Pi_w(\mathcal{B}_{(w,c)} \cap_{\mathrm{reg}} \mathcal{C}) = \mathcal{B}^d$ with $\mathcal{C}$ a controller behavior. By Remark 2 one can assume that $\mathcal{C} = \ker(0 \;\; LM)$, i.e., $\mathcal{H} \subset \mathcal{C}$. Applying Lemma 4 one obtains that $\Pi_w(\mathcal{B}_{(w,c)} \cap_{\mathrm{reg}} \mathcal{C}^c) = \mathcal{B}^d$ has finite dimension, where $\mathcal{C}^c$ is the controllable part of $\mathcal{C}$. By [6, Corollary 4, page 399] we have that $\mathrm{Mod}(\mathcal{C}^c)$ is free since $\mathcal{C}^c$ is controllable, i.e., $\mathcal{C}^c$ is a regular behavior. Further, $\mathcal{H} \subset \mathcal{C}$ implies that $\mathcal{H}^c \subset \mathcal{C}^c$, where $\mathcal{H}^c$ is the controllable part of $\mathcal{H}$. Since $\mathcal{H}$ is controllable, we have that $\mathcal{H}^c = \mathcal{H} \subset \mathcal{C}^c$ and $\mathcal{C}^c = \ker(0 \;\; LM)$ for some matrix $L$. It follows from Lemma 5 that $\mathcal{B}^d$ is regularly implementable from $\mathcal{B}_w$ by full interconnection. Since we are now in the context of full interconnections we can apply Theorem 3 in order to conclude the proof. □

5 Conclusions

In this paper we have investigated two fundamental notions in the behavioral approach to control theory, namely, regular implementation and controllability. We have shown that if a finite dimensional behavior can be implemented by a regular interconnection from a given behavior $\mathcal{B}$, then the controllable part of $\mathcal{B}$ is rectifiable, and therefore it is possible to derive important structural properties of $\mathcal{B}$. We have also treated the situation where not all system variables are available for interconnection.

The results rely strongly on the properties of two dimensional behaviors, and the proofs are not adaptable to the higher dimensional case, i.e., $\mathcal{B} \subset (\mathbb{R}^q)^{\mathbb{Z}^n}$ with $n > 2$. We expect some interesting problems to arise in the extension of the results presented in this paper to the higher dimensional case, where some new tools will need to be introduced.

References

1. Belur, M.N., Trentelman, H.L.: Stabilization, pole placement, and regular implementability. IEEE Trans. Automat. Control 47(5), 735–744 (2002)
2. Bisiacco, M., Valcher, M.E.: Two-dimensional behavior decompositions with finite-dimensional intersection: a complete characterization. Multidimens. Syst. Signal Process 16(3), 335–354 (2005)
3. Julius, A.A., Willems, J.C., Belur, M.N., Trentelman, H.L.: The canonical controllers and regular interconnection. Systems & Control Letters 54(8), 787–797 (2005)
4. Napp Avelli, D.: Almost direct sum decomposition and implementation of 2D behaviors. Math. Control Signals Systems 21(1), 1–19 (2009)
5. Oberst, U.: Almost regular interconnection of multidimensional behaviors. Accepted for publication in SIAM Journal on Control and Optimization (2008)
6. Pillai, H., Shankar, S.: A behavioral approach to control of distributed systems. SIAM J. Control Optim. 37(2), 388–408 (1998)
7. Rocha, P.: Feedback control of multidimensional behaviors. Systems & Control Letters 45, 207–215 (2002)
8. Rocha, P.: Canonical controllers and regular implementation of nD behaviors. In: Proceedings of the 16th IFAC World Congress, Prague, Czech Republic (2005)
9. Rocha, P.: Stabilization of multidimensional behaviors. Multidimens. Systems Signal Process 19, 273–286 (2008)
10. Rocha, P., Wood, J.: Trajectory control and interconnection of 1D and nD systems. SIAM J. Control Optim. 40(1), 107–134 (2001)
11. Trentelman, H.L., Napp Avelli, D.: On the regular implementability of nD systems. Systems and Control Letters 56(4), 265–271 (2007)
12. Willems, J.C.: On interconnections, control, and feedback. IEEE Trans. Automat. Control 42(3), 326–339 (1997)
13. Zerz, E.: Primeness of multivariate polynomial matrices. Systems Control Lett. 29(3), 139–145 (1996)
14. Zerz, E.: Multidimensional behaviours: an algebraic approach to control theory for (PDE). International Journal of Control 77(9), 812–820 (2004)
15. Zerz, E., Lomadze, V.: A constructive solution to interconnection and decomposition problems with multidimensional behaviors. SIAM J. Control Optim. 40(4), 1072–1086 (2002)

Ordering of Matrices for Iterative Aggregation - Disaggregation Methods

Ivana Pultarová

Abstract. In this short paper we show how the convergence of the iterative aggregation-disaggregation methods for computing the Perron eigenvector of a large sparse irreducible stochastic matrix can be improved by an appropriate ordering of the data and by the choice of a basic iteration matrix. Some theoretical estimates are introduced and a fast algorithm is proposed for obtaining the desired ordering. Numerical examples are presented.

1 Introduction

The problem of computing Perron eigenvectors of stochastic matrices appears in many applications: in web information retrieval, in computing the reliability of composed security electrical appliances, and in queuing problems. Due to the complexity of these tasks, efficient methods are needed. As in other computing disciplines, two- or multi-level approaches are well applicable in this field. These algorithms are called the iterative aggregation-disaggregation (IAD) methods.

Properties of basic IAD algorithms were introduced in [13]. Further theoretical results were obtained in [1, 4, 6] and some modifications were devised in order to solve some particular problems effectively [2, 7]. Progress in the convergence estimates was achieved in [2] for a special two-level IAD process, and a generalization of this idea was proved in [12].

We continue the study of the convergence properties of the IAD methods. From the results in [8, 12] it follows that a nonzero pattern of a stochastic matrix close to the pattern of a cyclic matrix may slow down the convergence of some IAD methods. Nevertheless, choosing an appropriate basic iteration method may increase the convergence speed significantly. We address this issue in this paper.

Ivana Pultarová, Department of Mathematics, Faculty of Civil Engineering, Czech Technical University in Prague, Thakurova 7, 166 29 Prague, Czech Republic, e-mail: [email protected]

R. Bru and S. Romero-Vivó (Eds.): Positive Systems, LNCIS 389, pp. 379–385. springerlink.com © Springer-Verlag Berlin Heidelberg 2009

In the next section, the IAD algorithm is described and its basic properties are recalled. In Section 3 we focus on the cases when the stochastic matrix is cyclic and several types of the IAD method with different basic iterations are theoretically examined. Two theorems are introduced regarding this problem. In Section 4 we present a simple algorithm which yields an ordering of a stochastic matrix which is close to that of a cyclic matrix. Some large scale numerical examples are introduced.

2 IAD Method

We assume an $N \times N$ irreducible column stochastic matrix $B$ and we want to obtain a Perron eigenvector $\hat{x}$ of $B$, i.e. $B\hat{x} = \hat{x}$, $e^T\hat{x} = 1$, where $e$ is the vector of all ones. The irreducibility of $B$ implies that $\hat{x}$ is unique. For a two-level iterative method, the set of indices $\{1,2,\dots,N\}$ is divided into $n \le N$ subgroups $G_1,\dots,G_n$, which are considered as a new set of macro-states of a higher level. We will use the following notation. Let $R$ be an $n \times N$ matrix for which $R_{ij} = 1$ if $j \in G_i$ and $R_{ij} = 0$ otherwise. For any positive vector $x$ we define an $N \times n$ matrix $S(x)$ with the elements

$$S(x)_{ij} = \frac{x_i}{\sum_{k \in G_j} x_k}$$

if $i \in G_j$ and $S(x)_{ij} = 0$ otherwise. We denote $P(x) = S(x)R$. Let $B_a(x)$ be the aggregated matrix of size $n$,

$$B_a(x) = RBS(x).$$

Then the algorithm of the IAD method is the following.

1. Starting with some positive vector $x^0$ we solve the equation $B_a(x^0) z = z$ for $z$.
2. Then $z$ is prolonged to the size $N$ by $y = S(x^0) z$.
3. Several steps of some basic iterative method are performed. We may use e.g. the power method, Jacobi or Gauss-Seidel methods or their block forms. Let $M - W$ be some weak nonnegative splitting of $I - B$, where $I$ is an identity matrix. Then the basic iteration matrix will be $T = M^{-1}W$. Then $x^1 = T^m y$ for some chosen integer $m$, and one loop of the IAD method is finished.

Matrix Ba(x) is an irreducible stochastic matrix for any positive x [6]. Obviously P(x) is a projection.
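One loop of the method described above can be sketched in NumPy as follows. This is a minimal dense illustration and not the author's implementation: the function names, the 0-based group labels, the use of a dense eigensolver for the small aggregated problem, and the final renormalization of the iterate to unit sum are all assumptions made here for the sake of a runnable example.

```python
import numpy as np

def aggregation_matrices(x, groups, n):
    """Return R (n x N) and S(x) (N x n) for 0-based group labels groups[i]."""
    N = x.size
    idx = np.arange(N)
    R = np.zeros((n, N))
    R[groups, idx] = 1.0                   # R[i, j] = 1 iff index j belongs to group i
    S = np.zeros((N, n))
    S[idx, groups] = x / (R @ x)[groups]   # x_i divided by the sum of x over its group
    return R, S

def perron_vector(A):
    """Perron eigenvector of a small irreducible column-stochastic matrix (dense solver)."""
    vals, vecs = np.linalg.eig(A)
    v = np.real(vecs[:, np.argmax(np.real(vals))])
    return v / v.sum()

def iad_step(B, x, groups, n, T, m=1):
    """One IAD loop: aggregate, solve the coarse problem, prolong, apply T m times."""
    R, S = aggregation_matrices(x, groups, n)
    z = perron_vector(R @ B @ S)   # step 1: B_a(x) z = z
    y = S @ z                      # step 2: prolongation
    for _ in range(m):             # step 3: basic iteration
        y = T @ y
    return y / y.sum()             # renormalization (an assumption of this sketch)

# Small illustrative run with the power method as basic iteration (T = B).
rng = np.random.default_rng(0)
B = rng.random((6, 6)) + 0.1
B /= B.sum(axis=0)                 # make B column stochastic
groups = np.array([0, 0, 1, 1, 2, 2])
x = np.full(6, 1.0 / 6.0)
for _ in range(20):
    x = iad_step(B, x, groups, 3, T=B, m=1)
```

For large sparse data one would replace the dense arrays by sparse matrices and the dense eigensolver by a method suited to the aggregated problem; the structure of the loop stays the same.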

3 Spectral Radius of the Error Matrix

It was shown [6, 9, 11] that the sequence of computed approximations $x^k$ satisfies

$$x^{k+1} - \hat{x} = J(x^k)(x^k - \hat{x}),$$

where the error matrix $J(x)$ is

$$J(x) = T (I - P(x)Z)^{-1}(I - P(x)),$$

where $Z = B - \hat{x}e^T$. In [2] the authors show that for a certain IAD scheme the global convergence of the algorithm can be proved for a matrix $B$ which has a special nonzero structure with respect to the choice of the aggregation groups $G_k$. As a generalization, we can show [12] that a locally convergent process is obtained when $T = B$ is chosen and the sparsity pattern of $B$ has a certain special property. A satisfactory general proof of the convergence of the IAD algorithm in dependence on the data (even in the local sense) has not been established yet. From some examples [8, 12] we may deduce that cyclicity of the data causes divergence of the algorithm for some choices of the aggregation groups and for some basic iteration matrices $T$. That is why we study such situations in this paper. The matrix $B$ is assumed to be cyclic,

$$B_{1,N} = 1, \quad B_{i+1,i} = 1 \ \text{for } i = 1,2,\dots,N-1, \qquad (1)$$

and $B_{ij} = 0$ otherwise. We will study the spectral radius of $J(\hat{x})$ either to determine the asymptotic rate of convergence of the IAD method or to prove its divergence. Since we study IAD methods in which various basic iteration matrices are used, we will denote the corresponding error matrices by $J(\hat{x},T^m)$ in the remaining part of the paper.

We assume that the numbering of the events and that of the groups are consecutive, i.e. if $i \in G_k$, $j \in G_m$, $k < m$, then $i < j$. Let us denote by $B_1$ the block-diagonal matrix composed of the diagonal blocks of $B$, where the indices of the particular blocks correspond to the aggregation groups $G_1,\dots,G_n$. Let $B_2 = B - B_1$. The asymptotic convergence behavior is determined by the spectral radius of $J(\hat{x},T)$. The asymptotic convergence for some special cases is described by the following two theorems.

Theorem 1. Let $n < N$ and let $B$ be defined by (1). The spectral radii of the error matrices corresponding to the IAD methods with the basic iteration matrices $B$, $B^N$ and $(I - B_1)^{-1}B_2$ are

$$\rho(J(\hat{x},B)) = 1, \qquad \rho(J(\hat{x},B^N)) = 1, \qquad \rho(J(\hat{x},(I - B_1)^{-1}B_2)) = 0,$$

respectively.

Proof. The first proposition $\rho(J(\hat{x},B)) = 1$ follows from [9, Theorem 1] and from [12, Theorem 3.10]. Since $B^N = I$, the second proposition $\rho(J(\hat{x},B^N)) = 1$ is equivalent to

$$\rho((I - P(\hat{x})Z)^{-1}(I - P(\hat{x}))) = 1.$$

The matrix $(I - P(\hat{x})Z)^{-1}(I - P(\hat{x}))$ is a projection which is not null for $n < N$, and we get the assertion. The off-diagonal blocks of $(I - B_1)^{-1}B_2$ are all rank-one matrices, so the third proposition follows directly from [7]. □
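The three spectral radii in Theorem 1 can be checked numerically on a very small instance. The sketch below is only an illustration under stated assumptions: the helper names are hypothetical, the matrices are dense, the groups are equal-sized and consecutive, and the Perron vector of the cyclic matrix (1) is taken to be the uniform vector, which holds because that matrix is a permutation matrix.

```python
import numpy as np

def cyclic_matrix(N):
    """The cyclic matrix (1): B[0, N-1] = 1 and B[i+1, i] = 1 (0-based indices)."""
    B = np.zeros((N, N))
    B[0, N - 1] = 1.0
    B[np.arange(1, N), np.arange(N - 1)] = 1.0
    return B

def block_jacobi_matrix(B, groups):
    """T = (I - B1)^{-1} B2, with B1 the block-diagonal part of B given by the groups."""
    same_group = groups[:, None] == groups[None, :]
    B1 = np.where(same_group, B, 0.0)
    B2 = B - B1
    return np.linalg.solve(np.eye(B.shape[0]) - B1, B2)

def error_matrix(B, xhat, groups, n, T):
    """J(xhat, T) = T (I - P Z)^{-1} (I - P) with P = S(xhat) R and Z = B - xhat e^T."""
    N = B.shape[0]
    idx = np.arange(N)
    R = np.zeros((n, N))
    R[groups, idx] = 1.0
    S = np.zeros((N, n))
    S[idx, groups] = xhat / (R @ xhat)[groups]
    P = S @ R
    Z = B - np.outer(xhat, np.ones(N))
    I = np.eye(N)
    return T @ np.linalg.solve(I - P @ Z, I - P)

def spectral_radius(A):
    return float(np.max(np.abs(np.linalg.eigvals(A))))

N, n = 4, 2
B = cyclic_matrix(N)
groups = np.repeat(np.arange(n), N // n)   # consecutive groups of equal size
xhat = np.full(N, 1.0 / N)                 # Perron vector of a permutation matrix

rho_power = spectral_radius(error_matrix(B, xhat, groups, n, B))
rho_full_cycle = spectral_radius(
    error_matrix(B, xhat, groups, n, np.linalg.matrix_power(B, N)))
rho_block_jacobi = spectral_radius(
    error_matrix(B, xhat, groups, n, block_jacobi_matrix(B, groups)))
# Theorem 1 predicts the values 1, 1 and 0, up to rounding errors.
print(rho_power, rho_full_cycle, rho_block_jacobi)
```

Because the block Jacobi error matrix is nilpotent here, its computed eigenvalues are small but not exactly zero; a loose tolerance is appropriate when comparing with the theoretical value.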

Theorem 2. [10, Theorem 7.6] Let the block rows of the lower block triangle of $B$ be all rank-one matrices. Then the IAD method with $T = (I - B_1)^{-1}B_2$ finishes after at most $n$ steps.

It might be assumed that the IAD methods converge, at least in the local sense, for a large part of the set of irreducible stochastic matrices. As an illustration, however, we present examples showing that cyclicity of $B$ pushes the spectral radius of $J(\hat{x},B^{N-1})$ close to 2. By the continuous dependence of the spectrum on the elements of the matrix, the convergence can be arbitrarily slow for data which have an almost cyclic structure. Figures 1 and 2 show the spectra (thick dots) of the error matrices $J(\hat{x},B^{N-1})$ and $J(\hat{x},B^{N/2-1})$ for 20 aggregation groups of 30 elements each. Two or three thin circles are also displayed in each figure to help locate the eigenvalues.

Fig. 1 The spectrum of $J(\hat{x},B^{N-1})$, where $B$ is defined by (1) and $N = 600$, $n = 20$, each aggregation group with 30 elements.

Fig. 2 The spectrum of $J(\hat{x},B^{N/2-1})$, where $B$ is defined by (1) and $N = 600$, $n = 20$, each aggregation group with 30 elements.

From Theorems 1 and 2 and from these two examples of the spectra of the error matrices we may conclude that the choice of the aggregation groups and of the basic iteration matrix, together with the ordering of the events, has a crucial impact on the convergence of the IAD method. Apparently (Theorem 1) a very effective algorithm can be obtained when the large elements of $B$ are concentrated in the diagonal blocks and $T = (I - B_1)^{-1}B_2$ is used as the basic iteration matrix. The inverse of $(I - B_1)$ may be replaced by $I + B_1 + B_1^2 + \cdots + B_1^k$ for some integer $k$. In practical large sparse examples we may try to reorder the columns and rows of the matrix $B$ in order to obtain a nonzero structure similar to that defined by (1). Then we may expect an acceleration of convergence. A simple algorithm for symmetric reordering of $B$ is discussed in the following section.
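The replacement of the inverse of $(I - B_1)$ by a truncated sum of powers of $B_1$ can be sketched as below. The helper names are hypothetical and the arrays are dense; the recurrence only uses products with $B_1$, which is the point of the approximation when $B_1$ is large and sparse. The sketch assumes that the powers of $B_1$ decay (for instance, column sums of $B_1$ below one) or that $B_1$ is nilpotent, as for the cyclic matrix (1).

```python
import numpy as np

def block_split(B, groups):
    """Split B into its block-diagonal part B1 (given by the groups) and B2 = B - B1."""
    same_group = groups[:, None] == groups[None, :]
    B1 = np.where(same_group, B, 0.0)
    return B1, B - B1

def truncated_block_jacobi(B1, B2, k):
    """Return (I + B1 + B1^2 + ... + B1^k) B2 without forming any inverse."""
    acc = B2.copy()
    for _ in range(k):
        acc = B2 + B1 @ acc        # Horner-type recurrence
    return acc

# Illustrative comparison with the exact block Jacobi matrix on a small dense example.
N = 6
groups = np.repeat(np.arange(3), 2)
B = np.full((N, N), 1.0 / N)       # a simple column-stochastic matrix
B1, B2 = block_split(B, groups)
T_exact = np.linalg.solve(np.eye(N) - B1, B2)
errors = [np.linalg.norm(T_exact - truncated_block_jacobi(B1, B2, k), 1)
          for k in (1, 5, 20)]
```

In this example the column sums of $B_1$ equal 1/3, so the error of the truncated matrix shrinks by that factor with each additional term.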

4 Numerical Examples

The set of data available in [3] is examined. It is the Stanford web matrix of size 281903. The matrix is very sparse and represents links within a set of web pages. The aim is to find a Perron eigenvector of a stochastic matrix $\alpha B + (1 - \alpha)pe^T$, where $B$ is the Stanford web matrix, $\alpha \in (0,1)$, $p$ is a positive vector and $e^T p = 1$. Before applying the IAD method, either no reordering of $B$ is performed or the rows and columns of $B$ are symmetrically reordered in the following manner. We start from an arbitrary column and find its largest entry. The corresponding row index is taken as the index of the next column. Then the maximal element of this column is found and its row index is taken as the index of the next column, and so on. No column may be visited more than once. When there is no remaining nonzero element in some column, the path continues in an arbitrary column. We call this process "following the maximal column value" (FMCV).

In a formal description of the FMCV algorithm we use a vector Perm in which we store the permutation vector, a set $C$ which contains the indices of the columns which have not been checked yet, and a variable $F$ which points to the last number (position) stored in Perm. The FMCV algorithm can then be performed in the following two steps.

1. Set $C := \{2,\dots,N\}$, $F := 1$, $\mathrm{Perm}(F) := 1$.
2. Repeat until $C$ is empty (or equivalently $F$ is equal to $N$):
   find $m$ such that $B_{m,\mathrm{Perm}(F)} = \max_{j \in C} B_{j,\mathrm{Perm}(F)}$,
   if $B_{m,\mathrm{Perm}(F)} = 0$ then $m := \min C$,
   $F := F + 1$, $\mathrm{Perm}(F) := m$, $C := C \setminus \{m\}$.

The desired nonzero pattern of $B$ is then obtained by $B := B(\mathrm{Perm},\mathrm{Perm})$. Of course the resulting permutation of $B$ is not unique. It depends on the choice of the initial column and of the columns chosen when the quantity $\max_{j \in C} B_{j,\mathrm{Perm}(F)}$ is zero. The complexity of the algorithm is equal to $N$.

We compare the sequence of approximations computed by the IAD method with $T = (I - B_1)^{-1}B_2$ for the original matrix $B$ with four applications of the IAD method to the reordered matrix $B$ with different basic iteration matrices: $T_1 = (I - B_1)^{-1}B_2$, $T_2 = (I + B_1 + B_1^2 + \cdots + B_1^{10})B_2$, and $T_3 = (I - B_{D,1})^{-1}B_{D,2}$, where $B_{D,1}$ is a block diagonal matrix, the diagonal blocks of which contain only the main diagonals and the second lower diagonals of the corresponding blocks of $B$, and $B_{D,2} = B - B_{D,1}$. The fourth set of approximations is computed for $T_4 = (I + B_{D,1} + \cdots + B_{D,1}^{10})B_{D,2}$. Thus we have five series of approximations. The numbers of steps in each set in which the error is decreased to $10^{-5}$ are shown in the five columns of Table 1, which are denoted by $T$, $T_1$, $T_2$, $T_3$, and $T_4$, respectively. The matrices $B$ are parts of the Stanford web matrix. The numbers of blocks times the sizes of blocks are 100 × 100, 200 × 200, 300 × 300 and 400 × 400, respectively. One entry equal to 1 is added to each empty column. We set $\alpha = 0.85$ and $p = (1,1,\dots,1)^T/N$.

Table 1 Number of steps of the five variants of the IAD method until the error $10^{-5}$ is reached.

number of blocks × block size    T     T1    T2    T3    T4
100 × 100                        59    30    48    33    34
200 × 200                        59    34    42    35    35
300 × 300                        55    37    40    36    36
400 × 400                        53    39    42    37    37

As we can see, while for smaller data the matrix $T_1$ yields the best results, for larger data working with only the diagonal and the first subdiagonal is more efficient.
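The FMCV reordering used in these experiments can be sketched as follows. This is a dense, 0-based illustration with hypothetical function names, not the code used for the reported results: it scans a whole column in every step, whereas an implementation for the sparse web matrix would visit only the stored nonzeros. Ties are resolved in favor of the smallest index, and the matrix is assumed to be nonnegative.

```python
import numpy as np

def fmcv_permutation(B):
    """Follow the maximal column value, starting from column 0 (0-based indices)."""
    N = B.shape[0]
    free = np.ones(N, dtype=bool)   # plays the role of the set C of unvisited columns
    free[0] = False
    perm = [0]
    while free.any():
        col = B[:, perm[-1]]
        # largest entry of the current column among the unvisited rows
        m = int(np.argmax(np.where(free, col, -np.inf)))
        if col[m] == 0:
            m = int(np.flatnonzero(free)[0])   # no nonzero left: take m = min C
        perm.append(m)
        free[m] = False
    return np.array(perm)

def fmcv_reorder(B):
    """Return the symmetrically reordered matrix B(Perm, Perm) and the permutation."""
    perm = fmcv_permutation(B)
    return B[np.ix_(perm, perm)], perm

# Illustration: a cyclic matrix with scrambled numbering is brought back to the form (1).
N = 7
cyclic = np.zeros((N, N))
cyclic[0, N - 1] = 1.0
cyclic[np.arange(1, N), np.arange(N - 1)] = 1.0
q = np.random.default_rng(1).permutation(N)
scrambled = cyclic[np.ix_(q, q)]
reordered, perm = fmcv_reorder(scrambled)
```

For a matrix that is only close to cyclic, the reordered matrix is not exactly of the form (1), but its large entries tend to be collected on the first subdiagonal, which is the structure exploited by the block iterations above.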

5 Discussion

We have shown some difficulties which can be encountered when the IAD method is applied blindly, and we propose an approach for their resolution. Achieving a structure of elements of a stochastic matrix similar to that defined by (1) and using the block Jacobi basic iteration usually increases the convergence speed. Let us stress that this approach is well applicable in practical large scale computing. The appropriate ordering and the use of the basic iteration matrix $T_2 = (I + B_1 + B_1^2 + \cdots + B_1^k)B_2$, which is an approximation of $T_1 = (I - B_1)^{-1}B_2$, is e.g. exactly what is performed by the algorithm proposed in [5].

Acknowledgements. Supported by the project of GAČR No. 201/09/P500 and by the research project CEZ MSM 6840770001.

References

1. Courtois, P.J., Semal, P.: Block iterative algorithms for stochastic matrices. Linear Algebra and its Applications 76, 59–80 (1986)
2. Ipsen, I., Kirkland, S.: Convergence analysis of a PageRank updating algorithm by Langville and Meyer. SIAM Journal on Matrix Analysis and Applications 27, 952–967 (2006)
3. Kamvar, S.: Data sets of Stanford Web Matrix and Stanford-Berkeley Web Matrix, http://www.cise.ufl.edu/research/sparse
4. Krieger, U.R.: Numerical solution of large finite Markov chains by algebraic multigrid techniques. In: Stewart, W.J. (ed.) Computations with Markov Chains, pp. 403–424. Kluwer Academic Publisher, Boston (1995)
5. Litvak, N., Robert, P.: Analysis of an on-line algorithm for solving large Markov chains. In: Proceedings SMCTools 2008, Athens, Greece (2008)
6. Marek, I., Mayer, P.: Convergence analysis of an aggregation/disaggregation iterative method for computation stationary probability vectors of stochastic matrices. Numerical Linear Algebra With Applications 5, 253–274 (1998)
7. Marek, I., Mayer, P.: Convergence theory of some classes of iterative aggregation/disaggregation methods for computing stationary probability vectors of stochastic matrices. Linear Algebra and its Applications 363, 177–200 (2003)
8. Marek, I., Mayer, P., Pultarová, I.: IAD methods based on splittings with cyclic iteration matrices. In: Dagstuhl Seminar Proceedings 07071 (2007)
9. Marek, I., Pultarová, I.: A note on local and global convergence analysis of iterative aggregation/disaggregation methods. Linear Algebra and its Applications 413, 327–341 (2006)
10. Marek, I., Pultarová, I.: An aggregation variation on the Google matrix (submitted)
11. Pultarová, I.: Local convergence analysis of iterative aggregation-disaggregation methods with polynomial correction. Linear Algebra and its Applications 421, 122–137 (2007)
12. Pultarová, I.: Necessary and sufficient local convergence condition of one class of iterative aggregation-disaggregation methods. Numerical Linear Algebra with Applications 15, 339–354 (2008)
13. Stewart, W.J.: Introduction to the Numerical Solution of Markov Chains. Princeton University Press, Princeton (1994)

The Positive Servomechanism Problem under LQcR Control

Bartek Roszak and Edward J. Davison

Abstract. This paper considers the servomechanism problem for MIMO positive LTI systems. In particular, the servomechanism problem of nonnegative constant reference signals for stable MIMO positive LTI systems with unmeasurable unknown constant nonnegative disturbances under strictly nonnegative control inputs is solved using a clamping LQ regulator.

1 Introduction

In this paper we consider the positive servomechanism problem for stable MIMO positive LTI systems using linear quadratic clamping regulators. The servomechanism problem for LTI systems has been captured and solved in [1]; however, the servomechanism problem for positive systems has received limited attention [6]–[8]. In the case of positive systems, [7] considers a subclass of the servomechanism problem under measurable disturbances with feedforward compensators and tuning regulators; [6, 8] treat the tracking/regulation problem for SISO positive LTI systems with almost-positivity and clamping tuning regulators. The above references only consider unknown systems (i.e. where the plant matrices (A,B,C,D) are unknown), and thus the control methods have been limited to tuning regulators with on-line tuning. In this paper, we assume that the plant matrices are known; in this case, we show that our results for tracking/disturbance rejection can incorporate “LQR clamping” control, which in general results in improved performance compared to [6]–[8].

Bartek Roszak and Edward J. Davison Systems Control Group, Electrical and Computer Engineering University of Toronto, Toronto, Ontario M5S 1A4, Canada, e-mail: [email protected],[email protected]

R. Bru and S. Romero-Vivó (Eds.): Positive Systems, LNCIS 389, pp. 387–395. springerlink.com © Springer-Verlag Berlin Heidelberg 2009

For topics related to this paper and to positive systems in general, see [2], [3], [5], [6–8], and references therein. The paper is organized as follows. Preliminaries and background are presented in Section 2, where the terminology, the problem of interest, and the control strategy are outlined. Section 3 provides the main theoretical results of the paper. An illustrative example is presented in Section 4, and concluding remarks close the paper.

2 Terminology and Problem Statement

Throughout the paper we use standard positive system terminology with several definitions given next.

Let the set $\mathbb{R}_+ := \{x \in \mathbb{R} \mid x \ge 0\}$ and the set $\mathbb{R}^n_+ := \{x = (x_1, x_2, \ldots, x_n) \in \mathbb{R}^n \mid x_i \in \mathbb{R}_+, \ \forall i = 1, \ldots, n\}$. If exclusion of 0 from the sets is necessary, we denote the sets in the standard way, $\mathbb{R}^n_+ \setminus \{0\}$. The $ij$th entry of a matrix $A$ will be denoted as $a_{ij}$. A nonnegative matrix $A$ has all of its entries greater than or equal to 0, $a_{ij} \in \mathbb{R}_+ \ \forall i, j$. A Metzler matrix $A$ is a matrix for which all off-diagonal elements of $A$ are nonnegative, i.e. $a_{ij} \in \mathbb{R}_+$ for all $i \ne j$. In this paper a plant is considered stable if all of its eigenvalues are located in the open left half of the complex plane. The plant of interest is defined next. Consider,

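$$\begin{aligned} \dot{x} &= Ax + Bu + E\omega \\ y &= Cx + Du + F\omega \\ e &:= y - y_{ref} \end{aligned} \qquad (1)$$

where $A$ is an $n \times n$ Metzler Hurwitz matrix, $B \in \mathbb{R}^{n \times m}_+$, $C \in \mathbb{R}^{r \times n}_+$, $D = 0$, $E\omega \in \Omega_1 \subset \mathbb{R}^n_+$, $F\omega \in \Omega_2 \subset \mathbb{R}^r_+$, $y_{ref} \in Y_{ref} \subset \mathbb{R}^r_+$, $\omega \in \Omega \subset \mathbb{R}^{\omega}_+$. Also, assume that $m = r$, i.e. the number of inputs equals the number of outputs. The problem of interest is outlined below.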
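The structural assumptions on the plant (1) are easy to test numerically. The following is a minimal sketch, assuming numpy is available; the helper names and the two-state plant are hypothetical and serve only to exercise the definitions of nonnegative, Metzler and stable matrices given above.

```python
import numpy as np

def is_nonnegative(M):
    """True if every entry of M is >= 0."""
    return bool(np.all(np.asarray(M, dtype=float) >= 0.0))

def is_metzler(A):
    """True if every off-diagonal entry of the square matrix A is >= 0."""
    A = np.asarray(A, dtype=float)
    off_diagonal = A[~np.eye(A.shape[0], dtype=bool)]
    return bool(np.all(off_diagonal >= 0.0))

def is_hurwitz(A):
    """True if all eigenvalues of A lie in the open left half-plane."""
    eigenvalues = np.linalg.eigvals(np.asarray(A, dtype=float))
    return bool(np.all(eigenvalues.real < 0.0))

def satisfies_plant_assumptions(A, B, C, D):
    """Check the assumptions on (1): A Metzler and Hurwitz, B and C nonnegative, D = 0, m = r."""
    m, r = B.shape[1], C.shape[0]
    return (is_metzler(A) and is_hurwitz(A) and is_nonnegative(B)
            and is_nonnegative(C) and not np.any(D) and m == r)

# Hypothetical two-state plant, chosen only to exercise the checks.
A = np.array([[-1.0, 0.5], [0.2, -2.0]])
B = np.array([[1.0], [0.0]])
C = np.array([[0.0, 1.0]])
D = np.zeros((1, 1))

print(satisfies_plant_assumptions(A, B, C, D))
```

The checks use exact comparisons with zero; for matrices obtained from measured data a small tolerance would be a reasonable modification.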

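Problem 1. Consider the plant (1). Assume that rank$(D - CA^{-1}B) = r$ and that the sets $\Omega$ and $Y_{ref}$ are chosen such that $E\omega \in \mathbb{R}^n_+$ and $F\omega \in \mathbb{R}^r_+$, with the steady-state values of the plant's states and outputs ($x_{ss}$ and $y_{ss}$) being nonnegative and the steady state of the input ($u_{ss}$) being positive, i.e. for all constant tracking and constant disturbance signals in question, it is assumed that the steady state of the system (1), given by

$$\begin{bmatrix} A & B \\ C & D \end{bmatrix} \begin{bmatrix} x_{ss} \\ u_{ss} \end{bmatrix} = -\begin{bmatrix} E & 0 \\ F & -I \end{bmatrix} \begin{bmatrix} \omega \\ y_{ref} \end{bmatrix} \qquad (2)$$

has the property that $x_{ss} \in \mathbb{R}^n_+$, $y_{ss} = Cx_{ss} + Du_{ss} + F\omega = y_{ref} \in \mathbb{R}^r_+$, $u^i_{ss} \in \mathbb{R}_+ \setminus \{0\}$, $\forall i \in \{1, \ldots, m\}$. It is to be noted that a solution to (2) exists if and only if rank$(D - CA^{-1}B) = r$.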

Then, find a controller that

(a) guarantees closed loop stability;
(b) ensures the plant (1) is nonnegative, i.e. the inputs u are nonnegative for all time; and
(c) ensures tracking of the reference signals, i.e. $e = y - y_{ref} \to 0$ as $t \to \infty$, $\forall y_{ref} \in Y_{ref}$ and $\forall \omega \in \Omega$.

In addition,

(d) assume that a controller has been found so that conditions (a), (b), (c) are satisfied; then for all perturbations of the nominal plant model which maintain properties (a) and (b), it is desired that the controller can still achieve asymptotic tracking and regulation, i.e. property (c) still holds.
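The role of the rank condition in (2) can be made explicit by eliminating $x_{ss}$. The following is a routine elimination, added as a sketch; it uses only that $A$ is Hurwitz (hence invertible) and that $m = r$.

```latex
\begin{aligned}
A x_{ss} + B u_{ss} = -E\omega
  \;&\Longrightarrow\; x_{ss} = -A^{-1}\bigl(B u_{ss} + E\omega\bigr), \\
C x_{ss} + D u_{ss} = y_{ref} - F\omega
  \;&\Longrightarrow\; (D - CA^{-1}B)\,u_{ss} = y_{ref} - (F - CA^{-1}E)\,\omega .
\end{aligned}
```

Since $D - CA^{-1}B$ is square, the second relation has a unique solution $u_{ss}$ for every $y_{ref}$ exactly when this matrix has rank $r$; the sign requirements on $x_{ss}$ and $u_{ss}$ are then restrictions on the admissible pairs $(\omega, y_{ref})$.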

Problem 1 will be referred to as the positive robust servomechanism problem. Note that condition (b) above also guarantees, by nonnegativity of u, that the states x and the outputs y will be nonnegative for all time.

Notice that the two assumptions in Problem 1 are in fact necessary for a positive system to adhere to any type of tracking constraint, i.e. the rank condition (rank$(D - CA^{-1}B) = r$) is a necessary condition for the servomechanism problem of LTI systems and hence must hold true for positive LTI systems; also, the steady-state condition for $x_{ss}$ and $y_{ss}$ is clearly a necessary condition due to the restriction of positive systems.

Also, it must be pointed out that in Problem 1 we do not make any assumption that the disturbance is known; thus, we are considering unmeasurable disturbances that abide by the steady-state conditions. See [7] for a closer discussion of this steady-state assumption.

Next, the linear quadratic clamping regulator (LQcR) is defined. This control law will be used in the sequel to solve Problem 1.

Controller 2.1 Assume rank$(D - CA^{-1}B) = r$. Given $\rho > 0$, the controller is described by:

$$\dot{\eta} = y - y_{ref}, \quad \eta(0) = 0; \qquad u = \alpha [K_x \ K_\eta][x \ \eta]^T, \qquad (3)$$

$$\alpha = \begin{cases} 0 & \text{if } \exists i \in \{1, \ldots, r\} \text{ s.t. } ([K_x \ K_\eta][x \ \eta]^T)_i \le 0 \\ 1 & \text{otherwise} \end{cases}$$

where $K_x \in \mathbb{R}^{m \times n}$ and $K_\eta \in \mathbb{R}^{m \times r}$ are found by solving the LQ control problem:

$$\int_0^\infty e^T Q e + \rho^2 \dot{u}^T \dot{u} \, d\tau \qquad (4)$$

with $\rho > 0$ and $Q = ((D - CA^{-1}B)^{-1})^T (D - CA^{-1}B)^{-1}$. Notice that with the given Q the conditions of stabilizability and detectability, which are needed to proceed with the LQ control problem, are still satisfied.
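The clamping rule in (3) switches the whole input vector off as soon as any component of the unclamped signal is nonpositive. A minimal sketch of this rule, assuming numpy; the function name `lqcr_input` and the gain used in the usage lines are hypothetical and not taken from the paper.

```python
import numpy as np

def lqcr_input(K, x, eta):
    """Clamped control law (3): u = alpha * K [x; eta].

    alpha = 0 if any component of the unclamped signal K [x; eta] is <= 0,
    alpha = 1 otherwise. K is the m x (n + r) gain [K_x  K_eta].
    """
    unclamped = K @ np.concatenate([x, eta])
    alpha = 0.0 if np.any(unclamped <= 0.0) else 1.0
    return alpha * unclamped

# Hypothetical gain with n = 2 states and m = r = 2, for illustration only.
K = np.array([[1.0, 0.0, 1.0, 0.0],
              [0.0, 1.0, 0.0, 1.0]])
x = np.array([1.0, 2.0])

u_active = lqcr_input(K, x, np.array([0.5, -0.5]))   # unclamped signal is [1.5, 1.5]
u_clamped = lqcr_input(K, x, np.array([-2.0, 0.0]))  # unclamped signal is [-1.0, 2.0]
print(u_active, u_clamped)
```

In the second call only the first component is negative, yet both inputs are set to zero; this all-or-nothing behaviour is what the definition of $\alpha$ prescribes.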

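For convenience, since we will be interested in letting $\rho \to \infty$, we divide (4) by $\rho^2$ and rewrite it as

$$\int_0^\infty \varepsilon^2 e^T Q e + \dot{u}^T \dot{u} \, d\tau \qquad (5)$$

where $\varepsilon = 1/\rho > 0$. This transformation allows us to treat the problem with $\varepsilon \to 0$ and still obtain the same value for $K = [K_x \ K_\eta]$, since scaling the cost by a positive constant does not change the minimizing control.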
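The paper does not state how the gain is computed numerically. One common route, sketched here as an assumption rather than as the authors' method, is to pose (5) as a standard LQ problem in the augmented state $[\dot{x};\, e]$ with input $\dot{u}$, solve the algebraic Riccati equation through the stable eigenvectors of the Hamiltonian matrix, and read off $[K_x \ K_\eta]$. The function name `lqcr_gain` is hypothetical; only numpy is used.

```python
import numpy as np

def lqcr_gain(A, B, C, D, eps):
    """Gain [K_x, K_eta] minimizing (5), via the Hamiltonian eigenvector method.

    Augmented model (constant omega and y_ref): d/dt [x_dot; e] =
    [[A, 0], [C, 0]] [x_dot; e] + [B; D] u_dot, with state weight
    eps^2 * Q on e and unit weight on u_dot.
    """
    n, _ = B.shape
    r = C.shape[0]
    G_inv = np.linalg.inv(D - C @ np.linalg.solve(A, B))
    Q = G_inv.T @ G_inv                       # Q as defined in Controller 2.1
    A_aug = np.block([[A, np.zeros((n, r))], [C, np.zeros((r, r))]])
    B_aug = np.vstack([B, D])
    Q_aug = np.zeros((n + r, n + r))
    Q_aug[n:, n:] = eps**2 * Q
    # Hamiltonian matrix of the ARE; its stable invariant subspace yields P.
    H = np.block([[A_aug, -B_aug @ B_aug.T], [-Q_aug, -A_aug.T]])
    eigvals, eigvecs = np.linalg.eig(H)
    stable = eigvecs[:, eigvals.real < 0.0]
    X1, X2 = stable[:n + r, :], stable[n + r:, :]
    P = np.real(X2 @ np.linalg.inv(X1))
    K = -B_aug.T @ P                          # u_dot = K [x_dot; e]
    return K[:, :n], K[:, n:]

# Scalar test plant x_dot = -x + u, y = x (hypothetical), with eps = 0.1.
A = np.array([[-1.0]]); B = np.array([[1.0]])
C = np.array([[1.0]]);  D = np.array([[0.0]])
eps = 0.1
K_x, K_eta = lqcr_gain(A, B, C, D, eps)
print(K_x, K_eta)
```

For this scalar plant the Riccati equation can be solved by hand, giving $K_x = 1 - \sqrt{1 + 2\varepsilon}$ and $K_\eta = -\varepsilon$, so the output can be compared against a closed form. The eigenvector method becomes ill-conditioned when the closed-loop eigenvalues are repeated; a Schur-based Riccati solver would be the more robust choice in practice.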

3 Main Results

In this section, the necessary and sufficient conditions for Problem 1 are presented via the use of the LQcR controller.

Theorem 1. Consider system (1), with $x_0 \in \mathbb{R}^n_+$, under controller (3). Then Problem 1 is solvable if and only if there exists an $\varepsilon^*$ for (5) such that for all $\varepsilon \in (0, \varepsilon^*]$ the controller (3) solves Problem 1.

Proof. Only a sketch of the proof will be given. Let us show that, with the given assumptions, the LQcR controller indeed solves Problem 1. We first concentrate on showing that tracking of $y_{ref}$ occurs. To this end, we break the proof down into two steps.

1. Our first step will be to show that if u = 0 (clamping occurs), then there exists a time $t_1 > 0$ such that the input will switch to the control law $u = [K_x \ K_\eta][x \ \eta]^T$, i.e. in (3) the input will eventually stop clamping.
2. Then, we'll show that if there exists a time $t_2$ such that u > 0, then there exists an $\varepsilon^*$ such that for all time $t \ge t_2$ and all $\varepsilon \in (0, \varepsilon^*]$ the controller (3) maintains nonnegativity of the input and solves Problem 1.

Each of the above steps is given next:

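By contradiction, assume there does not exist a time $t_1$, i.e. u = 0 for all time. Therefore, the closed loop system becomes

$$\dot{x} = Ax + E\omega, \qquad \dot{\eta} = Cx + F\omega - y_{ref},$$

and since A is stable, $x \to -A^{-1}E\omega =: \bar{x}_{ss}$ as $t \to \infty$, i.e. if u = 0 for all time t > 0, then the system state tends toward $\bar{x}_{ss}$ as $t \to \infty$. Using the steady state $(x_{ss}, u_{ss})$ of (2), this limit can also be expressed as:

$$\begin{aligned} \dot{x} = 0 &= Ax_{ss} + Bu_{ss} + E\omega \\ -A^{-1}E\omega &= x_{ss} + A^{-1}Bu_{ss} \\ \bar{x}_{ss} &= x_{ss} + A^{-1}Bu_{ss}. \end{aligned}$$

However, this implies that the derivative of the unclamped signal $[K_x \ K_\eta][x \ \eta]^T$ satisfies

$$\begin{aligned} \dot{u} &= K_x\dot{x} + K_\eta\dot{\eta} \\ &\to K_x(A\bar{x}_{ss} + E\omega) + K_\eta(C\bar{x}_{ss} + D(0) + F\omega - y_{ref}) \\ &\to K_x(0) + K_\eta(C(x_{ss} + A^{-1}Bu_{ss}) + D(0) + F\omega - y_{ref}) \\ &\to K_\eta(C(x_{ss} + A^{-1}Bu_{ss}) + D(u_{ss} - u_{ss}) + F\omega - y_{ref}) \\ &\to K_\eta(Cx_{ss} + Du_{ss} + F\omega - y_{ref}) - K_\eta(D - CA^{-1}B)u_{ss} \\ &\to 0 - K_\eta(D - CA^{-1}B)u_{ss} \\ &\to \varepsilon u_{ss} > 0 \quad \text{component-wise;} \end{aligned}$$

the last step follows from the fact that, as $\varepsilon \to 0$, $K_\eta \to -\varepsilon(D - CA^{-1}B)^{-1}$ (this can be shown via manipulation of the ARE and uniqueness of the gain matrix; the details are omitted). Hence the unclamped signal eventually grows in every component and becomes positive, which contradicts u = 0 for all time.

Next, we proceed to show that if for some time $t_2 \ge 0$ (which exists from above and satisfies $t_2 \ge t_1$), $u(t_2) > 0$, then there exists an $\varepsilon^*$ such that for all time $t \ge t_2$ and all $\varepsilon \in (0, \varepsilon^*]$ the controller (3) maintains nonnegativity of the input. In order to prove the above, we use the results of singular perturbation [4]. The closed loop system with the controller in place for u > 0 and shifted by its equilibrium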

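$$\begin{bmatrix} z \\ q \end{bmatrix} = \begin{bmatrix} x - x_{ss} \\ u - u_{ss} \end{bmatrix}$$

becomes

$$\begin{bmatrix} \dot{z} \\ \dot{q} \end{bmatrix} = \begin{bmatrix} A & B \\ \varepsilon(\bar{K}_x A + \bar{K}_\eta C) & \varepsilon(\bar{K}_x B + \bar{K}_\eta D) \end{bmatrix} \begin{bmatrix} z \\ q \end{bmatrix}. \qquad (6)$$

Note that $K = \varepsilon\bar{K}$ (this can be shown via manipulation of the sequence that solves the ARE of (5) [9, Ch. 12]; details are omitted due to space limitations). For convenience, rewrite (6) as

$$\begin{bmatrix} \dot{q} \\ \dot{z} \end{bmatrix} = \begin{bmatrix} \varepsilon(\bar{K}_x B + \bar{K}_\eta D) & \varepsilon(\bar{K}_x A + \bar{K}_\eta C) \\ B & A \end{bmatrix} \begin{bmatrix} q \\ z \end{bmatrix}. \qquad (7)$$

Next, let's scale the derivatives (i.e. scaling of time) by $\varepsilon\,dt = d\tau$, resulting in the transformed system

$$\begin{bmatrix} q' \\ \varepsilon z' \end{bmatrix} = \begin{bmatrix} (\bar{K}_x B + \bar{K}_\eta D) & (\bar{K}_x A + \bar{K}_\eta C) \\ B & A \end{bmatrix} \begin{bmatrix} q \\ z \end{bmatrix}, \qquad (8)$$

with $\varepsilon \frac{dq}{d\tau} = \varepsilon q' = \dot{q}$ and $\varepsilon \frac{dz}{d\tau} = \varepsilon z' = \dot{z}$. We have now transformed our model into the standard singular perturbation model. Now, since (8) is linear and time invariant and we are only interested in u, it suffices to show that the reduced model by singular perturbation yields exponential stability [4]; all other assumptions clearly hold. The reduced model obtained (we omit details) results in:

$$\begin{aligned} u &= q + u_{ss} \\ &= u_{ss} + e^{\bar{K}_\eta(D - CA^{-1}B)\tau}(u(t_2) - u_{ss}) \\ &= u_{ss} + e^{\varepsilon\bar{K}_\eta(D - CA^{-1}B)t}(u(t_2) - u_{ss}) \\ &= u_{ss} + e^{-\varepsilon t}(u(t_2) - u_{ss}) \end{aligned} \qquad (9)$$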

and since $u(t_2) > 0$, there exists an $\varepsilon^*$ such that u > 0 for all $\varepsilon \in (0, \varepsilon^*]$ and all time $t \ge t_2$, since u is monotonically approaching $u_{ss}$. Thus, $y \to y_{ref}$ as $t \to \infty$ if $u_{ss} > 0$. Finally, nonnegativity trivially holds since $u \ge 0$ for all time by the definition of the control law, and all other conditions of Problem 1 are satisfied by arguments similar to those used in [7] (details are omitted). Necessity can be deduced from the latter result and is omitted.
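The reduced model used in the proof can be recovered in a few lines from the time-scaled closed loop. The following is a sketch of the omitted step, writing $\bar{K} = K/\varepsilon$ for the scaled gain and a prime for $d/d\tau$ with $\tau = \varepsilon t$; it relies on the limit $\bar{K}_\eta \to -(D - CA^{-1}B)^{-1}$ quoted earlier in the proof.

```latex
\begin{aligned}
\varepsilon = 0:\quad 0 &= Bq + Az
  \;\Longrightarrow\; z = -A^{-1}Bq, \\
q' &= (\bar{K}_x B + \bar{K}_\eta D)\,q - (\bar{K}_x A + \bar{K}_\eta C)A^{-1}B\,q
    = \bar{K}_\eta\,(D - CA^{-1}B)\,q, \\
\bar{K}_\eta = -(D - CA^{-1}B)^{-1}:\quad q' &= -q
  \;\Longrightarrow\; q(\tau) = e^{-\tau}q(0) = e^{-\varepsilon t}q(0).
\end{aligned}
```

The fast (boundary-layer) dynamics are $\dot{z} = Az$, which are exponentially stable because $A$ is Hurwitz, and the slow dynamics $q' = -q$ are exponentially stable as well; these are the two stability properties that the singular perturbation argument of [4] requires.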

Theorem 1 has a simple interpretation: nonnegativity, tracking and disturbance rejection, and robustness can all be accomplished with an LQcR control law. Theorem 1 does not tell us how small or how large $\varepsilon^*$ can be; thus we cannot guarantee a settling time.
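The two phases of the proof sketch (clamping ends, then the input stays positive) can be observed on a toy problem. The sketch below simulates a hypothetical scalar plant $\dot{x} = -x + u + \omega$, $y = x$ with forward Euler; none of the numbers come from the paper. For this plant $Q = 1$ and the Riccati equation of (5) can be solved by hand, giving $K_x = 1 - \sqrt{1 + 2\varepsilon}$ and $K_\eta = -\varepsilon$, which are used directly.

```python
import numpy as np

# Hypothetical scalar plant: x_dot = -x + u + omega, y = x (A = -1, B = C = 1, D = 0).
eps = 0.5
K_x = 1.0 - np.sqrt(1.0 + 2.0 * eps)   # hand-derived LQ gain for this scalar plant
K_eta = -eps
omega, y_ref = 0.2, 1.0                # steady-state input is u_ss = 1 - omega = 0.8 > 0
dt, t_end = 0.01, 30.0                 # forward-Euler step and horizon (assumed values)

x, eta = 3.0, 0.0                      # nonnegative initial state, eta(0) = 0 as in (3)
u_hist, y_hist = [], []
for _ in range(int(t_end / dt)):
    unclamped = K_x * x + K_eta * eta
    u = unclamped if unclamped > 0.0 else 0.0   # clamping rule of (3) for m = 1
    u_hist.append(u)
    y_hist.append(x)
    x_dot = -x + u + omega
    eta_dot = x - y_ref
    x, eta = x + dt * x_dot, eta + dt * eta_dot

u_hist, y_hist = np.array(u_hist), np.array(y_hist)
release_time = dt * int(np.argmax(u_hist > 0.0))  # first instant with u > 0
print(release_time, u_hist[-1], y_hist[-1])
```

With the large initial state the controller starts clamped, releases after a few seconds, and the input then approaches $u_{ss} = 0.8$ while the output approaches $y_{ref} = 1$. The behaviour for other plants and other values of $\varepsilon$ is not implied by this example.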

4 LQcR MIMO Example

In this section we illustrate via an example the use of the LQcR controller.

Example 1. Consider the system of reservoirs of Figure 1; note that each reservoir is identified by a number (1, 2, ..., 6), where the water storage level $(x_1, x_2, \ldots, x_6)$ is a state of the system. Also, $\gamma$ and $\phi$ are the splitting coefficients of the flows at the branching points. The system is of order 6, as we assume the pump dynamics can be neglected. As pointed out in [2], the dynamics of each reservoir can be captured by a single differential equation: $\dot{x}_i = -\alpha_i x_i + v + e_i\omega$, $z = \alpha_i x_i$ for all $i = 1, \ldots, 6$, where $x_i$ is the water storage (in L) and $\alpha_i > 0$ is the ratio between outflow rate $z$ and storage, with $e_i\omega$ being the disturbance rate into the storage. The input into the reservoir is designated by $v$ and is measured in L/s. Consider the case where $\gamma = 0.5$, $\phi = 0.7$, $(\alpha_1, \ldots, \alpha_6) = (0.8, 0.7, 0.5, 1, 2, 0.8)$. Note that all the rates are measured in L/s. This results in the following system:

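[Figure 1: schematic of the reservoir network. Labels in the figure: $u_1 + \omega$, $\gamma$, $1 - \gamma$, reservoirs 1–6, pump, $\phi$, $1 - \phi$, $u_2$.]

Fig. 1 System set up for Example 1.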

$$\dot{x} = \begin{bmatrix} -0.8 & 0 & 0 & 0 & 2 & 0 \\ 0 & -0.7 & 0 & 0 & 0 & 0 \\ 0.8 & 0.7 & -0.5 & 0 & 0 & 0 \\ 0 & 0 & 0.15 & -1 & 0 & 0 \\ 0 & 0 & 0 & 1 & -2 & 0 \\ 0 & 0 & 0.35 & 0 & 0 & -0.8 \end{bmatrix} x + \begin{bmatrix} 0.5 & 0 \\ 0.5 & 0 \\ 0 & 0 \\ 0 & 0 \\ 0 & 0 \\ 0 & 1 \end{bmatrix} u + \begin{bmatrix} 0.5 \\ 0.5 \\ 0 \\ 0 \\ 0 \\ 0 \end{bmatrix} \omega \qquad (10)$$

$$y = \begin{bmatrix} 0 & 0 & 1 & 0 & 0 & 0 \\ 1 & 0 & 0 & 0 & 0 & 1 \end{bmatrix} x \qquad (11)$$

It is now desired to solve Problem 1 for this stable system. Assume the initial condition and the disturbance are $x_0 = [2\ 2\ 2\ 2\ 2\ 2]^T$ and $\omega = 0.5$, respectively. Additionally, assume that we would like to track the reference input $y_{ref} = [5\ 5]^T$. With the choice of $\varepsilon = 10$, the desired result is obtained. Figure 2 illustrates the simulated input and output response, and since the input is nonnegative it is clear that the states must also be nonnegative for all time. Notice that initially (0–3 seconds) the controller clamps to avoid negativity of the control input.
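The steady-state assumption of Problem 1 can be checked for this example by assembling the matrices of (10) and (11) from the coefficients and solving (2). The sketch below assumes numpy; the flow topology in the comments is read off from the nonzero entries of (10) and is an interpretation, not a statement from the paper.

```python
import numpy as np

gamma, phi = 0.5, 0.7
alpha = np.array([0.8, 0.7, 0.5, 1.0, 2.0, 0.8])

# State matrix of (10): outflow -alpha_i on the diagonal, inter-reservoir flows off it.
A = -np.diag(alpha)
A[2, 0] = alpha[0]                # reservoir 1 -> 3
A[2, 1] = alpha[1]                # reservoir 2 -> 3
A[3, 2] = (1.0 - phi) * alpha[2]  # reservoir 3 -> 4, share 1 - phi
A[5, 2] = phi * alpha[2]          # reservoir 3 -> 6, share phi
A[4, 3] = alpha[3]                # reservoir 4 -> 5
A[0, 4] = alpha[4]                # reservoir 5 -> 1 (pump)

B = np.zeros((6, 2)); B[0, 0], B[1, 0], B[5, 1] = gamma, 1.0 - gamma, 1.0
E = np.zeros((6, 1)); E[0, 0], E[1, 0] = gamma, 1.0 - gamma
C = np.zeros((2, 6)); C[0, 2], C[1, 0], C[1, 5] = 1.0, 1.0, 1.0
D = np.zeros((2, 2))
F = np.zeros((2, 1))

omega = np.array([0.5])
y_ref = np.array([5.0, 5.0])

# Equation (2): [[A, B], [C, D]] [x_ss; u_ss] = -[[E, 0], [F, -I]] [omega; y_ref].
lhs = np.block([[A, B], [C, D]])
rhs = -np.block([[E, np.zeros((6, 2))], [F, -np.eye(2)]]) @ np.concatenate([omega, y_ref])
solution = np.linalg.solve(lhs, rhs)
x_ss, u_ss = solution[:6], solution[6:]

print("x_ss =", x_ss)
print("u_ss =", u_ss)
```

For the stated data this gives $u_{ss} = (1.25,\ 0.625)$ and $x_{ss} = (2.03125,\ 1.25,\ 5,\ 0.75,\ 0.375,\ 2.96875)$, so the steady-state input is positive and the steady-state state is nonnegative, as Problem 1 requires.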

[Figure 2: plot titled "outputs and inputs", showing the traces $y_1$, $y_2$, $u_1$ and $u_2$ (vertical axis "y and u", range 0 to 6) against time (s), from 0 to 30 s.]

Fig. 2 Input and output response for Example 1.

5 Conclusion

In this paper, we have provided the necessary and sufficient conditions for the positive MIMO servomechanism problem (Problem 1). The results extend the SISO case of [6, 8] to the MIMO case under a new clamping LQcR control strategy.

Acknowledgements. Supported by NSERC under Grant No A4396.

References

1. Davison, E.J., Goldenberg, A.: Robust control of a general servomechanism problem: the servo-compensator. Automatica 11, 461–471 (1975)
2. Farina, L., Rinaldi, S.: Positive Linear Systems. Theory and Applications. Pure and Applied Mathematics. John Wiley & Sons, Inc., New York (2000)
3. Kaczorek, T.: 1D and 2D systems. Springer, New York (2002)
4. Khalil, H.K.: Nonlinear Systems. Prentice Hall, New Jersey (2002)
5. Luenberger, D.: Introduction to Dynamic Systems: Theory, Models and Applications. Wiley, New York (1979)
6. Roszak, B., Davison, E.J.: Tuning regulators for tracking SISO positive linear systems. In: Proceedings of the European Control Conference 2007, pp. 540–547 (2007)
7. Roszak, B., Davison, E.J.: The Servomechanism Problem for Unknown MIMO LTI Positive Systems: Feedforward and Robust Tuning Regulators. In: Proceedings of the American Control Conference 2008, pp. 4821–4826 (2008)
8. Roszak, B., Davison, E.J.: The servomechanism problem for unknown SISO positive systems using clamping. In: Proceedings of the 17th IFAC World Congress, pp. 353–358 (2008)
9. Wonham, W.M.: Linear Multivariable Control: A Geometric Approach, 3rd edn. Springer, New York (1985)

