Automatica 37 (2001) 573–580

Brief Paper

Robust maximum likelihood estimation in the linear model

Giuseppe Calafiore*, Laurent El Ghaoui

Dipartimento di Automatica e Informatica, Politecnico di Torino, Corso Duca degli Abruzzi 24, 10129 Torino, Italy
Electrical Engineering and Computer Sciences Department, University of California at Berkeley, USA

Received 15 April 1999; revised 4 July 2000; received in final form 6 October 2000

Abstract

This paper addresses the problem of maximum likelihood parameter estimation in linear models affected by Gaussian noise, whose mean and covariance matrix are uncertain. The proposed estimate maximizes a lower bound on the worst-case (with respect to the uncertainty) likelihood of the measured sample, and is computed by solving a semidefinite optimization problem (SDP). The problem of linear robust estimation is also studied in the paper, and the statistical and optimality properties of the resulting linear estimator are discussed. © 2001 Elsevier Science Ltd. All rights reserved.

Keywords: Robust estimation; Distributional robustness; Convex optimization; Linear matrix inequalities

1. Introduction

The problem of estimating parameters from noisy observations has a long history in engineering and experimental science in general. When the observations and the unknown parameters are related by a linear model, and a stochastic setting is assumed, then the application of the maximum likelihood (ML) principle (see for instance the monograph Berger & Wolpert, 1988) leads to the well-known least-squares (LS) parameter estimate. However, the well-established ML principle assumes that the true parametric model for the data is exactly known, a seldom verified assumption in practice, where models only approximate reality (Knight, 2000, Chapter 5). This paper introduces a family of estimators that are based on a robust version of the ML principle, where uncertainty in the underlying statistical model is explicitly taken into account. In particular, we will study estimators that maximize a lower bound on the worst-case (with respect to model uncertainty) value of the likelihood. Next, we will analyze the case of linear robust estimation, and discuss the bias, variance, and optimality properties of the resulting estimator. The undertaken minimax approach to robustness is in the spirit of the distributional robustness approach discussed in Huber (1981) for parametrized families of distributions. In our case, the minimax is performed with respect to unknown-but-bounded parameters appearing in the underlying statistical model. The techniques introduced in this paper may also be viewed as the stochastic counterpart of the deterministic robust estimation methods that appeared recently in Chandrasekaran, Golub, Gu, and Sayed (1998) and El Ghaoui and Lebret (1997). In particular, the model uncertainty will here be represented using the linear fractional transformation (LFT) formalism (El Ghaoui & Lebret, 1997), which allows us to treat cases where the regression matrix has a particular form, such as Toeplitz or Vandermonde, and where the uncertainty affects the data in a structured way.
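For the idealized case recalled above — an exactly known linear model with Gaussian noise — the ML principle indeed reduces to least squares. A minimal numerical sketch of this reduction (all data below are synthetic and purely illustrative, not from the paper):

```python
import numpy as np

# Synthetic illustration: with an exactly known regression matrix C and
# Gaussian noise, the ML estimate of x in y = C x + noise is the ordinary
# least-squares solution.
rng = np.random.default_rng(0)

n, m = 3, 200                       # number of parameters, measurements
x_true = np.array([1.0, -2.0, 0.5]) # illustrative "true" parameter
C = rng.standard_normal((m, n))     # known regression matrix
y = C @ x_true + 0.01 * rng.standard_normal(m)

# ML / LS estimate: minimize ||C x - y||^2
x_ls, *_ = np.linalg.lstsq(C, y, rcond=None)
print(np.round(x_ls, 3))            # close to x_true for low noise
```

When the model itself (here, C) is only approximately known, this estimate can degrade sharply — which is the motivation for the robust formulation developed in the paper.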
Robust estimation trades the accuracy which is best achieved using standard techniques such as LS or total least squares (TLS) (Van Huffel & Vandewalle, 1991) for robustness, i.e. insensitivity with respect to parameter variations. In this latter context, links between robust estimation, sensitivity, and regularization techniques, such as Tikhonov regularization (Tikhonov & Arsenin, 1977), may be found in Björck (1991), Elden (1985) and El Ghaoui and Lebret (1997) and references therein.

[Footnote] This paper was not presented at any IFAC meeting. This paper was recommended for publication in revised form by Associate Editor T. Sugie under the direction of Editor Roberto Tempo. This work was supported in part by Italy CNR funds. Research of the second author was partially supported via a National Science Foundation CAREER award.
* Corresponding author. Tel.: +39-011-564-7066; fax: +39-011-564-7099. E-mail address: [email protected] (G. Calafiore).

0005-1098/01/$ - see front matter © 2001 Elsevier Science Ltd. All rights reserved. PII: S0005-1098(00)00189-8

In this paper, we will study a mixed-uncertainty problem, where the regression matrix is affected by deterministic, structured and norm-bounded uncertainty, while the measure is affected by Gaussian noise whose covariance is also uncertain. In this setting, we will compute a robust (with respect to the deterministic model uncertainty) estimate via semidefinite programming (SDP). An example of application of the introduced theory to the estimation of dynamic parameters of a robot manipulator from real experimental data is presented in Section 4.

Notation. For a square matrix X, X ≻ 0 (resp. X ⪰ 0) means that X is symmetric, and positive-definite (resp. positive-semidefinite). λ_max(X), where X = X^T, denotes the maximum eigenvalue of X. ‖X‖ denotes the operator (maximum singular value) norm of X. For P ∈ R^{n×n}, with P ≻ 0, and x̄ ∈ R^n, the notation x ~ N(x̄, P) means that x is a Gaussian random vector with expected value x̄ and covariance matrix P.

2. Problem statement

We consider the problem of estimating a parameter from noisy observations that are related to the unknown parameter by a linear statistical model. To set up the problem, we shall take the Bayesian point of view, and assume an a priori distribution on the unknown parameter x ∈ R^n, i.e. x ~ N(x̄, P(Δ_N)), where x̄ ∈ R^n is the expected value of x, and the a priori covariance P(Δ_N) ∈ R^{n×n} depends on a matrix Δ_N of uncertain parameters, as will be discussed in detail in Section 2.1. Similarly, the observations vector y ∈ R^m is assumed to be independent of x, and with normal distribution y ~ N(ȳ, D(Δ_B)), with ȳ ∈ R^m, and D(Δ_B) ∈ R^{m×m}. The linear statistical model further assumes that the expected values of x and y are related by a linear relation which, in our case, is also uncertain:

ȳ = C(Δ_A)x̄.

Given some a priori estimate x_s of x, and given the vector of measurements y_s, we seek an estimate of x that maximizes a lower bound on the worst-case (with respect to the uncertainty) a posteriori probability of the observed event. When no deterministic uncertainty is present on the model, this is the celebrated maximum likelihood (ML) approach to parameter estimation, which enjoys special properties such as efficiency and unbiasedness (see for instance Berger and Wolpert (1988), Goodwin and Payne (1997), and Ljung (1987)). For the important special case of linear estimation, we will discuss in Section 3.2 how these properties extend to the robust estimator, and how the resulting estimate is related to the minimum a posteriori variance estimator.

To cast our problem in a ML setting, the log-likelihood function L is defined as the logarithm of the a posteriori joint probability density of x, y:

L(x, x_s, y_s) = log(f_x(x_s) f_y(y_s)),

where f_x, f_y are the probability density functions of x, y, respectively. Since x, y are independent Gaussian vectors, maximizing the log-likelihood is equivalent to minimizing the following function:

l(x, Δ) = (x_s − x)^T P^{−1}(Δ_N)(x_s − x) + (y_s − C(Δ_A)x)^T D^{−1}(Δ_B)(y_s − C(Δ_A)x),

where Δ is the total uncertainty matrix, containing the blocks Δ_N, Δ_B, Δ_A. We notice that, for fixed Δ, computing the ML estimate reduces to solving the following standard norm minimization problem:

x̂_ML(Δ) = arg min_x ‖F(Δ)x − g(Δ)‖²,

where

F(Δ) = [D^{−1/2}(Δ_B) C(Δ_A); P^{−1/2}(Δ_N)],  g(Δ) = [D^{−1/2}(Δ_B) y_s; P^{−1/2}(Δ_N) x_s].  (1)

If now Δ is allowed to vary in a given norm-bounded set, as specified in Section 2.1, we define the worst-case maximum likelihood (WCML) estimate x̂_WCML as

x̂_WCML = arg min_x max_Δ ‖F(Δ)x − g(Δ)‖².  (2)

The WCML estimate therefore provides a guaranteed level of the likelihood function, for any possible value of the uncertainty. In Section 2.1, we detail the uncertainty model used throughout the paper, and state a fundamental technical lemma.

2.1. LFT uncertainty models

We shall consider matrices subject to structured uncertainty in the so-called linear-fractional (LFT) form

M(Δ) = M + LΔ(I − HΔ)^{−1}R,  (3)

where M, L, H, R are constant matrices, while the uncertainty matrix Δ belongs to the set D_1, where D_1 := {Δ ∈ D : ‖Δ‖ ≤ 1}, and D is a linear subspace. The norm used is the spectral (maximum singular value) norm. The subspace D, referred to as the structure subspace in the sequel, defines the structure of the perturbation, which is otherwise only bounded in norm. Together, the matrices M, L, H, R and the subspace D constitute a linear-fractional representation of an uncertain model. We will make from now on the standard assumption that all LFT models are well posed over D_1, meaning that det(I − HΔ) ≠ 0 for all Δ ∈ D_1; see Fan, Tits, and Doyle
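The objects just introduced can be illustrated numerically. The sketch below (toy matrices, all values made up; Δ is taken unstructured for simplicity) evaluates an LFT of the form (3) and approximates the inner maximization of the WCML problem (2) by sampling norm-bounded perturbations. It assumes the joint representation [F(Δ) g(Δ)] = [F g] + LΔ(I − HΔ)^{−1}[R_F R_g] used later in the paper; this brute-force sampling only lower-bounds the worst case, which is precisely what the SDP machinery developed next avoids.

```python
import numpy as np

# Toy illustration: evaluate the joint LFT
#   [F(Delta) g(Delta)] = [F g] + L Delta (I - H Delta)^{-1} [R_F R_g]
# and approximate max_{||Delta|| <= 1} ||F(Delta) x - g(Delta)|| by sampling.
rng = np.random.default_rng(1)
m, n, q = 4, 2, 2

F0 = rng.standard_normal((m, n))
g0 = rng.standard_normal(m)
L = 0.3 * rng.standard_normal((m, q))
H = rng.standard_normal((q, q))
H /= 2.0 * np.linalg.norm(H, 2)   # ||H|| = 0.5 < 1: LFT well posed on ||Delta|| <= 1
R_F = rng.standard_normal((q, n))
R_g = rng.standard_normal(q)

def F_of(Delta):
    return F0 + L @ Delta @ np.linalg.solve(np.eye(q) - H @ Delta, R_F)

def g_of(Delta):
    return g0 + L @ Delta @ np.linalg.solve(np.eye(q) - H @ Delta, R_g)

# Nominal (Delta = 0) least-squares candidate for x
x = np.linalg.lstsq(F0, g0, rcond=None)[0]

# Sampled (lower-bound) approximation of the worst-case residual in (2)
deltas = [np.zeros((q, q))]
for _ in range(500):
    D = rng.standard_normal((q, q))
    deltas.append(D / max(1.0, np.linalg.norm(D, 2)))   # force ||Delta|| <= 1

residuals = [np.linalg.norm(F_of(D) @ x - g_of(D)) for D in deltas]
print(residuals[0], max(residuals))  # nominal vs. sampled worst-case residual
```

By construction the sampled worst case is at least the nominal residual (Δ = 0 is in the sample set); the guaranteed upper bound of Theorem 3 below replaces this sampling with a single convex program.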

(1991). We also introduce the following linear subspace B(D), referred to as the scaling subspace:

B(D) = {(S, T, G) : SΔ = ΔT, GΔ = −Δ^T G^T, for every Δ ∈ D}.

LFT models of uncertainty are general and now widely used in robust control (Fan et al., 1991; Zhou, Doyle, & Glover, 1996) (especially in conjunction with SDP techniques, see for instance Asai, Hara, & Iwasaki, 1996), in identification (Wolodkin, Rangan, & Poolla, 1997), and in filtering (El Ghaoui & Calafiore, 1999; Xie, Soh, & de Souza, 1994). This uncertainty framework includes the case when parameters perturb each coefficient of the data matrices in a (polynomial or) rational manner, as stated in the representation lemma in Dussy and El Ghaoui (1998).

The main results in this paper rely on the following lemma, which provides a sufficient condition for a linear matrix inequality (LMI; see Boyd, El Ghaoui, Feron, and Balakrishnan, 1994) to hold for any allowed value of the uncertainty.

Lemma 1. Consider the LFT model (3) in the matrix variable Δ ∈ R^{p×q}, and let W be a given real symmetric matrix. If there exists a triple (S, T, G) ∈ B(D) such that

S ≻ 0, T ≻ 0  (4)

and

[M L; I 0]^T W [M L; I 0] − [R H; 0 I]^T [T G; G^T −S] [R H; 0 I] ⪰ 0,  (5)

then the LFT model (3) is well posed for all Δ ∈ D_1 and

[M^T(Δ) I] W [M^T(Δ) I]^T ⪰ 0, ∀Δ ∈ D_1.  (6)

Conditions (4) and (5) are also necessary when D = R^{p×q}, i.e. in the case of unstructured perturbation.

The proof of this lemma is reported in Appendix A.

Remark 2. The above lemma provides in general a sufficient condition only. A discussion on the tightness of this condition and its approximation error is out of the scope of this paper; general results and a further discussion may be found in Ben-Tal, El Ghaoui, and Nemirovskii (2000) and El Ghaoui, Oustry, and Lebret (1998).

3. Robust estimation

In order to compute a robust estimate, we first set up the complete uncertainty model for (2) in LFT form. Let C(Δ_A) be given in the LFT form as C(Δ_A) = C + L_A Δ_A (I − H_A Δ_A)^{−1} R_A, and let

D^{−1/2}(Δ_B) = D^{−1/2} + L_B Δ_B (I − H_B Δ_B)^{−1} R_B,
P^{−1/2}(Δ_N) = P^{−1/2} + L_N Δ_N (I − H_N Δ_N)^{−1} R_N

be the LFT representations of the Cholesky factors of D(Δ_B) and P(Δ_N), respectively. Then, using the common rules for LFT operations (see for instance Zhou et al., 1996), we obtain an LFT representation

[F(Δ) g(Δ)] = [F g] + LΔ(I − HΔ)^{−1} [R_F R_g],

where F(Δ), g(Δ) are given in (1), Δ is a structured matrix containing the (possibly repeated) blocks Δ_A, Δ_B, Δ_N on the diagonal, and

F = [D^{−1/2} C; P^{−1/2}],  g = [D^{−1/2} y_s; P^{−1/2} x_s].  (7)

Using the Schur complement rule (see for instance Boyd et al., 1994), the WCML estimation problem (2) may then be cast as a robust semidefinite optimization problem (SDP; see El Ghaoui et al., 1998):

γ_WCML = arg min_{x, γ} γ

subject to

[I  F(Δ)x − g(Δ); (F(Δ)x − g(Δ))^T  γ] ⪰ 0, ∀Δ ∈ D_1.  (8)

3.1. Robust ML estimation

The WCML estimate x̂_WCML is defined as the value of x at the optimum of problem (8). However, the solution of the above problem is in general numerically hard to compute (Ben-Tal et al., 2000; El Ghaoui et al., 1998). In order to obtain a computable solution, we shall apply the robustness lemma to the robust LMI constraint in (8). In this way, we obtain a convex inner approximation of the feasible set, and the minimization of γ subject to this new constraint will provide an upper bound on the optimal objective of (8). The so-obtained solution x̂_RML will be called a robust maximum likelihood estimate (RML). This is summarized in the following theorem.

Theorem 3. The robust maximum likelihood estimate x̂_RML is obtained solving the SDP

γ_RML = arg min_{x, (S,G,T), γ} γ  (9)

subject to

(S, G, T) ∈ B(D), S ≻ 0, T ≻ 0,

[Φ(S, G, T)  (Fx − g; R_F x − R_g); (Fx − g; R_F x − R_g)^T  γ] ⪰ 0,

where

Φ(S, G, T) = [I − L T L^T   −L(T H^T + G); −(H T + G^T) L^T   S − H T H^T − H G − G^T H^T].  (10)

The optimal upper bound γ_RML is exact (i.e. γ_RML = γ_WCML) when D = R^{p×q}.

Proof. The result in the theorem follows immediately from the application of Lemma 1 to the LMI constraint in (8). In particular, the result is obtained setting

M(Δ) ≜ [F 0; g^T 0] + [R_F^T; R_g^T] Δ^T (I − H^T Δ^T)^{−1} [L^T 0],

W ≜ [0 0 x̃; 0 −I 0; x̃^T 0 γ],  x̃ ≜ [x; −1].

Remark 4. When there is no model uncertainty, we can set L = 0, H = 0, R_F = 0, R_g = 0. In this case, the results of the previous theorem reduce to the standard LS estimate, and the robust estimate is consistent with the idealized (uncertainty-free) model.

While the result of the previous theorem is useful to obtain a numerical estimate of the parameters, due to the complicated non-linear dependence of the estimate on the data x_s, y_s it is awkward to study further the statistical properties of the resulting estimator. To pursue this study, in Section 3.2 we will consider the additional constraint that the estimator should be linear in the data.

3.2. Robust linear estimation

The goal of this section is to compute a robust estimate which is linear in the observations. This is done in order to recover some of the nice features related to linear estimators, and to allow for further analysis of the bias and variance characteristics of the estimate. To this end, let K be an unknown gain matrix, and let x = Kz, with z ≜ [y_s^T x_s^T]^T, K ≜ [K_y K_x], and A ≜ [F^T R_F^T]^T, h ≜ [g^T R_g^T]^T ≜ Gz, where G is some given matrix that can be deduced from (1) and (7). Then, the main result on the optimal robust linear estimate (RLE) is provided by the following theorem.

Theorem 5. Let

λ_RLE = arg min_{K, (S,G,T), λ} λ  (11)

subject to

(S, G, T) ∈ B(D), S ≻ 0, T ≻ 0,

[Φ(S, G, T)  AK − G; (AK − G)^T  λI] ⪰ 0  (12)

and let K_RLE be the value of K at the optimum of (11); then

x̂_RLE = K_RLE z ≜ K_x x_s + K_y y_s

is a robust linear estimate (RLE) guaranteeing that ‖F(Δ)x̂ − g(Δ)‖ ≤ γ_RLE for all admissible values of the uncertainty, where

γ_RLE = ‖z‖² λ_RLE.

Thus, γ_RLE is a minimized upper bound on γ_RML (i.e. γ_RLE ≥ γ_RML), and the optimal gain K_opt is independent of the observations z.

Proof. We start from the result of Theorem 3, assume that z ≠ 0 (the case z = 0 may be trivially considered aside) and introduce a new variable K such that x = Kz. We now rewrite problem (9) in the equivalent form

γ_RML = arg min_{K, (S,G,T), γ} γ

subject to

(S, G, T) ∈ B(D), S ≻ 0, T ≻ 0,

[Φ(S, G, T)  (AK − G)z; z^T (AK − G)^T  γ] ⪰ 0.  (13)

The previous is only a restatement of (9), and the resulting estimate is not yet linear in the observations, as the optimal gain K will depend on z. However, we now show that condition (13) is satisfied whenever condition (12) is satisfied. This is because, taking Schur complements, (13) is equivalent to Φ ≻ 0 and

γ > z^T (AK − G)^T Φ^{−1} (AK − G) z.  (14)

Since

z^T (AK − G)^T Φ^{−1} (AK − G) z ≤ ‖z‖² λ_max((AK − G)^T Φ^{−1} (AK − G)),

(14) is implied by γ > ‖z‖² λ_max((AK − G)^T Φ^{−1} (AK − G)). Introducing the new variable λ = γ/‖z‖², this latter condition, together with Φ ≻ 0, may be restated in the form of (12), applying again the Schur complement rule. As the initial constraint has been replaced by a more stringent one, it immediately follows that all solutions to

(11) will be feasible for (9); therefore x̂ will be a robust estimate, and γ_RLE an upper bound on γ_RML. Minimizing over λ (i.e. solving problem (11)) amounts to finding the best possible upper bound on γ_RML, based on the premise that the estimate is linear in the samples.

3.3. Bias and variance of the robust linear estimate

In this section, we examine the bias and variance characteristics of the linear robust estimator, and present a result for the computation of a robust linear unbiased estimator.

The estimation bias is defined as b ≜ E{x̂ − x̄}, where x̄ is the (unknown) expected value of x, and x̂ is a linear estimate in the form

x̂ = K_x x_s + K_y y_s.  (15)

We then have that b = b(Δ_A) = E(K_x x_s + K_y y_s − x) = B(Δ_A)x̄, where

B(Δ_A) = B + K_y L_A Δ_A (I − H_A Δ_A)^{−1} R_A;  B ≜ K_x − I + K_y C;  (16)

therefore, the bias is a linear function of the unknown mean x̄, with uncertain coefficients. Notice that the robust estimate will in general be affected by bias. A condition for having robustly zero bias is of course given by B(Δ_A) = 0, that is B = 0, K_y L_A = 0. The first condition requires in particular that K_x = I − K_y C, which means that the estimate should be in the classical 'innovations' form x̂ = x_s + K_y(y_s − C x_s). The second condition requires orthogonality between the gain K_y and the matrix L_A describing the uncertainty on the regression matrix C(Δ_A). In particular, when there is no uncertainty on C (but we still allow for uncertainty in the covariance matrices D(Δ_B), P(Δ_N)), we have L_A = 0, and we can have unbiased estimates, provided that B = 0, i.e. K_x = I − K_y C. Further, notice that both conditions impose linear constraints on the gain K, which may easily be added (i.e. the resulting problem is still an SDP) to the constraints of problem (11), in order to compute linear robust unbiased estimates. Notice also that the additional constraints on the gain will not destroy feasibility, but simply decrease the level of the achievable robust ML performance.

Our result on robust linear unbiased estimation (RLUE) is summarized in the following theorem.

Theorem 6. Assume no uncertainty acts on the regression matrix C, i.e. L_A = 0. Let

x̂_RLUE = x_s + K_RLUE(y_s − C x_s),

where K_RLUE is the value of K_y at the optimum of

λ_RLUE = arg min_{K_y, (S,G,T), λ} λ

subject to

(S, G, T) ∈ B(D), S ≻ 0, T ≻ 0,

[Φ(S, G, T)  A[K_y  I − K_y C] − G; (A[K_y  I − K_y C] − G)^T  λI] ⪰ 0.

Then, x̂_RLUE is a robust linear unbiased estimate (RLUE) guaranteeing that ‖F(Δ)x̂ − g(Δ)‖ ≤ γ_RLUE for all admissible values of the uncertainty, where

γ_RLUE = ‖z‖² λ_RLUE.

γ_RLUE is the best possible upper bound on γ_RLE, based on the premise that the robust estimate must be linear and unbiased; therefore γ_RLUE ≥ γ_RLE ≥ γ_RML.

The proof of the above theorem follows from the previous discussion.

We now discuss the a posteriori covariance properties of the estimator in the parameter space. For a generic linear estimate in the form (15), the a posteriori covariance matrix is defined as R ≜ E{(x̂ − x̄)(x̂ − x̄)^T}. A standard manipulation then yields

R(Δ) = K_x P(Δ_N) K_x^T + K_y D(Δ_B) K_y^T + B(Δ_A) x̄ x̄^T B^T(Δ_A),  (17)

where B(Δ_A) is defined as in (16). We notice that (17) depends on the unknown mean x̄; therefore an empirical estimate of the covariance may be obtained by substituting the estimated value x̂ in place of the unknown x̄. However, the significance of the covariance matrix in the case of a biased estimate may be questionable. A more interesting result is obtained in the case of unbiased estimates (obtained by means of Theorem 6, when C is exactly known), where R(Δ) reduces to R(Δ) = K_x P(Δ_N) K_x^T + K_y D(Δ_B) K_y^T.

Remark 7. Notice that, using standard rules for operations with LFTs, one can determine an LFT representation for R(Δ), and use it in a recursive estimation framework, collecting a new observation y_s, and setting x_s ← x̂_RLUE, P(Δ_N) ← R(Δ), etc. Further study is however needed to analyze the behavior of the recursive RLUE.

It is important at this point to make some observations. It is well known that, when the linear model is perfectly known, the application of the ML principle provides estimates which are unbiased and efficient, in the sense that the a posteriori covariance in parameter space reaches the Cramér–Rao lower bound; see for instance Knight (2000, Section 6.4). In our context of robust estimation, we saw that unbiased estimates can be obtained (Theorem 6) only if no uncertainty is acting on the regression matrix C, while allowing for uncertainty in P(Δ_N) and D(Δ_B). When this is not the case, estimation bias will be unavoidable due to imperfect knowledge of the linear relation ȳ = C(Δ_A)x̄ between the mean values of x and y.

As for the a posteriori parameter covariance, one may ask how the robust maximization of the likelihood function is related to robust minimization of the a posteriori covariance. The answer to this question is that our robust ML linear estimator is also the one that minimizes an upper bound on the worst-case a posteriori covariance; therefore the estimate provided by Theorem 6 is also a minimum variance unbiased estimate (MVUE), in the robust sense explained above. The reason for this resides in a deep and general duality result between the maximization of the log-likelihood function and the minimization (in matrix sense) of the a posteriori covariance. For linear models, both problems have an equivalent formulation, provided that suitable dual bases are chosen to write the optimization problem; for a thorough discussion of this issue, the reader is referred to Kailath, Sayed, and Hassibi (2000, Chapter 15). Finally, we remark that the robust estimation framework proposed in this paper has similarities with the minimax approach to ML estimates and minimax variance estimates discussed in Huber (1981, Chapter 4), where robustness issues are considered with respect to parametrized families of distributions.

4. Example

In this section, we report a result of the application of the presented methodology to the experimental estimation of dynamic parameters of a SCARA two-link IMI manipulator available at the Politecnico di Torino Robotics Lab. Details related to the experimental setup, manipulator model and data treatment are discussed in Calafiore and Indri (1998). The goal is to estimate eight dynamic and friction parameters of the manipulator from noisy joint torque data. Denoting by q^T = [q_1, q_2] the joint positions, and by τ^T = [τ_1, τ_2] the measured joint torques, the following manipulator model was developed for identification: τ = C(q, q̇, q̈)θ + d, where θ ∈ R^8 is the vector of identifiable parameters, d ∈ R^2 is a zero-mean Gaussian noise vector, and C(q, q̇, q̈) ∈ R^{2×8} is the nominal regression matrix, which is a non-linear function of q, q̇, q̈. Since these data are experimentally acquired from the system, or reconstructed by means of filters, the regression matrix is intrinsically uncertain. We then developed a linearized model for the uncertainty in the LFT form, and collected data at six time instants on the reference trajectory. The relative torque measurements and LFT regression models were then stacked, obtaining the augmented regression model τ̃ = (C̃ + L̃ Δ̃ R̃)θ + d̃, where τ̃ ∈ R^{12}, C̃ ∈ R^{12×8}, and Δ̃ = diag(δ̃_1 I, …, δ̃_r I). The robust estimate was then computed solving (9) by means of a Matlab code based on the LMITool SDP solver (El Ghaoui & Commeau, 1999). The robust solution was then compared to the standard LS estimate. It is worth reminding that on the estimation data the LS estimate yields a smaller residual than the robust estimate; indeed we found e_LS ≜ ‖D̃^{−1/2}(τ̃ − C̃ θ̂_LS)‖ = 3.24 and e_RML ≜ ‖D̃^{−1/2}(τ̃ − C̃ θ̂_RML)‖ = 3.54. However, the quality of the two parameter estimates should be compared on data sets different from the one used for the estimation (validation data sets). We therefore computed the residuals e_LS and e_RML using data collected over several different trajectories. The average residual resulted to be 6.2 for the robust estimate and 6.76 for the LS estimate. More interestingly, if we compare the peak values of the residuals we obtain 10.1 for the robust estimate and 15.1 for the LS estimate, which is about 50% worse. Also, we noticed that the robust estimate is more 'regular' than the LS estimate; regularity may be measured by the variance of the residuals, which was 2.65 for the robust estimate and 4.52 for the LS estimate, i.e. again about 70% worse than the robust estimate. The actual data used in this example are available from the authors on request.

5. Conclusions

In this paper, we have shown that the maximum likelihood estimation problem with uncertainty in the regression matrix and in the observations covariance can be solved in a worst-case setting using convex programming. This implies that, in practice, these problems can be solved efficiently in polynomial time using available software.

The paper also presents specialized results for robust linear estimation and unbiased robust linear estimation. In particular, this latter estimator recovers most of the nice features of standard ML estimators, and seems to be suitable for implementation in a recursive estimation framework.

Robust estimation has been applied to an experimental problem of manipulator parameter identification, which has inherent uncertainty in the regression matrix. The reported results show that a consistent improvement can be obtained over standard estimation methods.

Appendix. Proof of Lemma 1

We first observe that the lower-right block of the matrix in (5) is H^T T H + H^T G + G^T H − S; therefore condition (5) implies well-posedness of the LFT (3), see for instance Fan et al. (1991) for a proof.
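The unstructured case of the lemma (D = R^{p×q}, scalings S = τI, T = τI, G = 0) can be checked numerically. The sketch below uses random toy matrices (all dimensions square for simplicity, values illustrative): it constructs W so that condition (5) holds by design, then verifies the robust inequality (6) on sampled norm-bounded perturbations.

```python
import numpy as np

# Numerical sanity check of Lemma 1, unstructured case (assumptions: square
# toy dimensions, scalings S = tau*I, T = tau*I, G = 0, tau = 1). We build W
# so that the left-hand side of (5) equals a PSD matrix N by construction,
# then verify (6), i.e. [M(Delta); I]^T W [M(Delta); I] >= 0, by sampling.
rng = np.random.default_rng(2)
n, tau = 3, 1.0

M = rng.standard_normal((n, n))
L = rng.standard_normal((n, n))
R = rng.standard_normal((n, n))
H = rng.standard_normal((n, n))
H /= 2.0 * np.linalg.norm(H, 2)           # enforce ||H|| = 0.5 < 1

I = np.eye(n)
Z = np.zeros((n, n))
J = np.block([[M, L], [I, Z]])
K = np.block([[R, H], [Z, I]])
D0 = np.block([[tau * I, Z], [Z, -tau * I]])

# W = J^{-T} (N + K^T D0 K) J^{-1}, with N PSD, makes (5)'s LHS equal N.
A = rng.standard_normal((2 * n, 2 * n))
N = A.T @ A
Jinv = np.linalg.inv(J)
W = Jinv.T @ (N + K.T @ D0 @ K) @ Jinv
W = 0.5 * (W + W.T)                       # symmetrize against round-off

def eigmin(X):
    return np.linalg.eigvalsh(0.5 * (X + X.T)).min()

lhs5 = J.T @ W @ J - K.T @ D0 @ K
assert eigmin(lhs5) > -1e-6               # condition (5) holds by design

worst = np.inf
for _ in range(300):
    D = rng.standard_normal((n, n))
    D /= max(1.0, np.linalg.norm(D, 2))   # ||Delta|| <= 1
    MD = M + L @ D @ np.linalg.solve(I - H @ D, R)
    V = np.vstack([MD, I])
    worst = min(worst, eigmin(V.T @ W @ V))
print(worst)                              # nonnegative up to round-off
```

The check mirrors the proof: since (5) holds, the quadratic form induced by W dominates the scaling term, which is itself nonnegative whenever p = Δ(Ru + Hp) for some ‖Δ‖ ≤ 1.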

Now, if the LFT for M(Δ) is well posed, then condition (6) is satisfied if and only if

f(u, p) := [u; p]^T [M L; I 0]^T W [M L; I 0] [u; p] > 0

for all u, p such that p = Δ(Ru + Hp) for some Δ ∈ D_1. Let then q = Ru + Hp, and (S, T, G) ∈ B(D), with T ≻ 0. The condition p = Δq for some Δ ∈ D_1 implies that q^T G p = q^T G Δ q = 0, by skew-symmetry of GΔ. In addition, we have

q^T T q − p^T S p = q^T (T − Δ^T S Δ) q = q^T T (I − T^{−1} Δ^T Δ T) q ≥ 0.

In the above, we have used the fact that SΔ = ΔT, and that the matrix T^{−1} Δ^T Δ T is actually symmetric, and has eigenvalues less than or equal to one. We conclude that

[u; p]^T [R H; 0 I]^T [T G; G^T −S] [R H; 0 I] [u; p] ≥ 0

for every u, p such that p = Δ(Ru + Hp) for some Δ ∈ D_1, and every triple (S, T, G) ∈ B(D) with T ≻ 0. Based on this fact, we obtain a sufficient condition for (6) to hold, i.e. that for every non-zero pair (u, p), we have

[u; p]^T [M L; I 0]^T W [M L; I 0] [u; p] > [u; p]^T [R H; 0 I]^T [T G; G^T −S] [R H; 0 I] [u; p]

for some triple (S, T, G) ∈ B(D), with T ≻ 0. The above condition is exactly the one stated in the theorem.

It remains to prove that our condition is also necessary in the unstructured case, D = R^{p×q}. In this case, the set B(D) reduces to the set of triples (S, T, G) with S = τI_p, T = τI_q, τ ∈ R, and G = 0. First, we note that the well-posedness sufficient condition H^T T H + H^T G + G^T H − S ⪯ 0 for some (S, T, G) ∈ B(D) is equivalent to ‖H‖ < 1, which is the exact well-posedness condition. Second, we note that for every u, p, we have p = Δ(Ru + Hp) for some Δ, ‖Δ‖ ≤ 1, if and only if

f₁(u, p) := [u; p]^T [R H; 0 I]^T [I_q 0; 0 −I_p] [R H; 0 I] [u; p] ≥ 0.

We note that the above inequality is strictly feasible, that is, there exists a pair (u, p) such that f₁(u, p) > 0 (since ‖H‖ < 1, it suffices to choose p = 0 and u such that Ru ≠ 0). In this case, the S-procedure (Boyd et al., 1994) provides a necessary and sufficient condition for the quadratic constraint f(u, p) > 0 to hold for every non-zero pair (u, p) such that f₁(u, p) ≥ 0. This condition is that there exists a scalar τ ≥ 0 such that, for every non-zero (u, p), we have f(u, p) > τ f₁(u, p). This is exactly the condition of the theorem in the unstructured case, written with S = τI_p, T = τI_q and G = 0.

References

Asai, T., Hara, S., & Iwasaki, T. (1996). Simultaneous modelling and synthesis for robust control by LFT scaling. Proceedings of the 13th IFAC world congress, vol. G, San Francisco, CA, USA (pp. 309-314).

Ben-Tal, A., El Ghaoui, L., & Nemirovskii, A. (2000). Robust semidefinite programming. In R. Saigal, L. Vandenberghe, & H. Wolkowicz (Eds.), Handbook of semidefinite programming. Dordrecht: Kluwer Academic Publishers.

Berger, J. O., & Wolpert, R. (1988). The likelihood principle. Hayward, CA: Institute of Mathematical Statistics.

Björck, A. (1991). Component-wise perturbation analysis and error bounds for linear least squares solutions. BIT, 31, 238-244.

Boyd, S., El Ghaoui, L., Feron, E., & Balakrishnan, V. (1994). Linear matrix inequalities in system and control theory. Studies in Applied Mathematics. Philadelphia: SIAM.

Calafiore, G., & Indri, M. (1998). Experiment design for robot dynamic calibration. Proceedings of the 1998 IEEE international conference on robotics and automation, Leuven, May 16-21.

Chandrasekaran, S., Golub, G., Gu, M., & Sayed, A. H. (1998). Parameter estimation in the presence of bounded data uncertainties. SIAM Journal on Matrix Analysis and Applications, 19(1), 235-252.

Dussy, S., & El Ghaoui, L. (1998). Measurement-scheduled control for the RTAC problem. International Journal of Robust and Nonlinear Control, 8(4-5), 377-400.

Elden, L. (1985). Perturbation theory for the least-squares problem with linear equality constraints. BIT, 24, 472-476.

El Ghaoui, L., & Calafiore, G. (1999). Deterministic state prediction under structured uncertainty. Proceedings of the American control conference, San Diego, California.

El Ghaoui, L., & Commeau, J.-L. (1999). lmitool version 2.0, January. Available via http://www.ensta.fr/gropco.

El Ghaoui, L., & Lebret, H. (1997). Robust solutions to least-squares problems with uncertain data. SIAM Journal on Matrix Analysis and Applications, 18(4), 1035-1064.

El Ghaoui, L., Oustry, F., & Lebret, H. (1998). Robust solutions to uncertain semidefinite programs. SIAM Journal on Optimization, 9(1), 33-52.

Fan, M. K. H., Tits, A. L., & Doyle, J. C. (1991). Robustness in the presence of mixed parametric uncertainty and unmodeled dynamics. IEEE Transactions on Automatic Control, 36(1), 25-38.

Goodwin, G. C., & Payne, R. L. (1997). Dynamic system identification: Experiment design and data analysis. New York: Academic Press.

Huber, P. J. (1981). Robust statistics. New York: Wiley.

Kailath, T., Sayed, A. H., & Hassibi, B. (2000). Linear estimation. Englewood Cliffs, NJ: Prentice-Hall.

Knight, K. (2000). Mathematical statistics. New York: Chapman & Hall/CRC.

Ljung, L. (1987). System identification: Theory for the user. Englewood Cliffs, NJ: Prentice-Hall.

Tikhonov, A., & Arsenin, V. (1977). Solutions of ill-posed problems. New York: Wiley.

Van Huffel, S., & Vandewalle, J. (1991). The total least-squares problem: Computational aspects and analysis. Philadelphia, PA: SIAM.

Wolodkin, G., Rangan, S., & Poolla, K. (1997). An LFT approach to parameter estimation. Proceedings of the American control conference, Albuquerque, New Mexico.

Xie, L., Soh, Y. C., & de Souza, C. E. (1994). Robust Kalman filtering for uncertain discrete-time systems. IEEE Transactions on Automatic Control, 39(6), 1310-1314.

Zhou, K., Doyle, J. C., & Glover, K. (1996). Robust and optimal control. Upper Saddle River: Prentice-Hall.

Giuseppe Carlo Calafiore was born in Torino, Italy, in December 1969. He received the Laurea degree in Electrical Engineering from Politecnico di Torino in 1993, and the Doctorate degree in Information and System Theory from Politecnico di Torino in 1997. Since 1998, he holds the position of Assistant Professor at Dipartimento di Automatica e Informatica, Politecnico di Torino. Dr. Calafiore held visiting positions at the Information Systems Laboratory, Stanford University, in 1995, at Ecole Nationale Supérieure de Techniques Avancées (ENSTA), Paris, in 1998, and at the University of California at Berkeley, in 1999. His research interests are in the fields of convex optimization, randomized algorithms, and identification, prediction and control of uncertain systems.

Laurent El Ghaoui graduated from Ecole Polytechnique, Palaiseau, France, in 1985. He obtained a Ph.D. in Aeronautics and Astronautics from Stanford University in 1990. He held a research position at Laboratoire Systemes et Perception, Arcueil, France, from 1990 to 1992. From 1992 until 1999, he was a Faculty member at the Ecole Nationale Superieure de Techniques Avancees in Paris. He is currently Associate Professor in the Department of Electrical Engineering and Computer Sciences of the University of California at Berkeley.