Chapter 8 Canonical Duality Theory: Connections between Nonconvex Mechanics and Global Optimization
David Y. Gao and Hanif D. Sherali
Dedicated to Professor Gilbert Strang on the occasion of his 70th birthday
Summary. This chapter presents a comprehensive review and some new developments on canonical duality theory for nonconvex systems. Based on a tricanonical form for quadratic minimization problems, an insightful relation between canonical dual transformations and nonlinear (or extended) Lagrange multiplier methods is presented. Connections between complementary variational principles in nonconvex mechanics and Lagrange duality in global optimization are also revealed within the framework of the canonical duality theory. Based on this framework, traditional saddle Lagrange duality and the so-called biduality theory, discovered in convex Hamiltonian systems and d.c. programming, are presented in a unified way; together, they serve as a foundation for the triality theory in nonconvex systems. Applications are illustrated by a class of nonconvex problems in continuum mechanics and global optimization. It is shown that by the use of the canonical dual transformation, these nonconvex constrained primal problems can be converted into certain simple canonical dual problems, which can be solved to obtain all extremal points. Optimality conditions (both local and global) for these extrema can be identified by the triality theory. Some new results on general nonconvex programming with nonlinear constraints are also presented as applications of this canonical duality theory. This review brings some fundamentally new insights into nonconvex mechanics, global optimization, and computational science.
Key words: Duality, triality, Lagrangian duality, nonconvex mechanics, global optimization, nonconvex variations, canonical dual transformations, critical point theory, semilinear equations, NP-hard problems, quadratic programming
David Y. Gao, Department of Mathematics, Virginia Tech, Blacksburg, VA 24061, U.S.A. e-mail: [email protected] Hanif D. Sherali, Grado Department of Industrial and Systems Engineering, Virginia Tech, Blacksburg, VA 24061, U.S.A., e-mail: [email protected]
D.Y. Gao, H.D. Sherali (eds.), Advances in Applied Mathematics and Global Optimization, Advances in Mechanics and Mathematics 17, DOI 10.1007/978-0-387-75714-8_8, © Springer Science+Business Media, LLC 2009
8.1 Introduction
Complementarity and duality are two inspiring, closely related concepts. Together they play fundamental roles in multidisciplinary fields of mathematical science, especially in engineering mechanics and optimization. The study of complementarity and duality in mathematics and mechanics has had a long history since the well-known Legendre transformation was formally introduced in 1787. This elegant transformation plays a key role in complementary duality theory. In classical mechanical systems, each energy function defined in a configuration space is linked via the Legendre transformation with a complementary energy in the dual (source) space, through which the Lagrangian and Hamiltonian can be formulated. In static systems, the convex total potential energy leads to a saddle Lagrangian through which a beautiful saddle min-max duality theory can be constructed. This saddle Lagrangian plays a central role in classical duality theory in convex analysis and constrained optimization. In convex dynamic systems, however, the total action is usually a nonconvex d.c. function, that is, the difference of convex kinetic energy and total potential functions. In this case, the classical Lagrangian is no longer a saddle function, but the Hamiltonian is convex in each of its variables. As a result, the Hamiltonian, rather than the Lagrangian, has been used extensively in convex dynamics. From a geometrical point of view, Lagrangian and Hamiltonian structures in convex systems and d.c. programming display an appealing symmetry, which was widely studied by their founders. Unfortunately, this symmetry breaks down in nonconvex systems. Consequently, tremendous effort and attention have recently been focused on the role of symmetry and symmetry-breaking in Hamiltonian mechanics in order to gain a deeper understanding of nonlinear and nonconvex phenomena (see Marsden and Ratiu, 1995).
The earliest examples of the Lagrangian duality in engineering mechanics are probably the complementary energy principles proposed by Haar and von Kármán in 1909 for elastoperfect plasticity and by Hellinger in 1914 for continuum mechanics. Since the boundary conditions in Hellinger's principle were clarified by E. Reissner in 1953 (see Reissner, 1996), the complementary-dual variational principles and methods have been studied extensively for more than 50 years by applied mathematicians and engineers (see Arthurs, 1980, Noble and Sewell, 1972).¹ The development of mathematical duality theory in convex variational analysis and optimization has had a similar history since W. Fenchel proposed the well-known Fenchel transformation in 1949. After the revolutionary concepts of superpotential and subdifferential were introduced by J. J. Moreau in 1966 in the study of frictional mechanics, the modern mathematical theory of duality has been well developed by celebrated mathematicians such as R. T. Rockafellar (1967, 1970, 1974), Moreau (1968), Ekeland (1977, 2003), I. Ekeland and R. Temam (1976), F. H. Clarke (1983, 1985), Auchmuty (1986, 2001), G. Strang (1979–1986), and Moreau, Panagiotopoulos, and Strang (1988). Mathematically speaking, in linear elasticity where the total potential energy is convex, the Hellinger–Reissner complementary variational principle in engineering mechanics is equivalent to a Fenchel–Moreau–Rockafellar type dual variational problem. The so-called generalized complementary variational principle is actually the saddle Lagrangian duality theory, which serves as the foundation for hybrid/mixed finite element methods, and has been subjected to extensive study during the past 40 years (see Strang and Fix (1973), Oden and Lee (1977), Pian and Tong (1980), Pian and Wu (2006), Han (2005), and the references cited therein).
¹ Eric Reissner (PhD 1938) was a professor in the Department of Mathematics at MIT from 1949 to 1969. According to Gil Strang, since Reissner moved to the Department of Mechanical and Aerospace Engineering at University of California, San Diego in 1969, many applied mathematicians in the field of continuum mechanics, especially solid mechanics, switched from mathematical departments to engineering schools in the United States.
Early in the last century, Haar and von Kármán (1909) had already realized that in nonlinear variational problems of continuum mechanics, the direct approaches for solving the minimum potential energy problem (the primal problem) can only provide upper bounding solutions. However, the minimum complementary energy principle (i.e., the maximum Lagrangian dual problem) provides a lower bound (the mathematical proof of the Haar–von Kármán principle was given by Greenberg in 1949).
In safety analysis of engineering structures, the upper and lower bounding approximations to the so-called collapse states of the elastoplastic structures are equally important to engineers. Therefore, the primal-dual variational methods have been studied extensively by engineers for solving nonsmooth nonlinear problems (see Gao, 1991, 1992, Maier, 1969, 1970, Temam and Strang, 1980, Casciaro and Cascini, 1982, Gao, 1986, Gao and Hwang, 1988, Gao and Cheung, 1989, Gao and Strang, 1989b, Gao and Wierzbicki, 1989, Gao and Onate, 1990, Tabarrok and Rimrott, 1994). The article by Maier et al. (2000) serves as an excellent survey on the developments for applications of the Lagrangian duality in engineering structural mechanics. In mathematical programming and computational science, the so-called primal-dual interior point methods, which have emerged as a revolutionary technique during the last 15 years, are also based on the Lagrangian duality theory. Complementary to the interior-point methods, the so-called pan-penalty finite element programming developed by Gao (1988a,b) is indeed a primal-dual exterior-point method. He proved that in rigid-perfectly plastic limit analysis, the exterior penalty functional and the associated perturbation method possess an elegant physical meaning, which led to an efficient dimension rescaling technique in large-scale nonlinear mixed finite element programming problems (Gao, 1988b).
In mathematical programming and analysis, the subject of complementarity is closely related to constrained optimization, variational inequality, and fixed point theory. Through the classical Lagrangian duality, the KKT conditions of constrained optimization problems lead to corresponding complementarity problems. The primal-dual schema has continued to evolve for linear and convex mathematical programming during the past 20 years (see Walk, 1989, Wright, 1998).
However, for nonconvex systems, it is well known that the KKT conditions are only necessary under certain regularity conditions for global optimality. Moreover, the underlying nonlinear complementarity problems are fundamentally difficult due to the nonmonotonicity of the nonlinear operators, and also, many problems in global optimization are NP-hard. The well-developed Fenchel–Moreau–Rockafellar duality theory generally produces a so-called duality gap between the primal problem and its Lagrangian dual. Therefore, how to formulate perfect dual problems (with a zero duality gap) is a challenging task in global optimization and nonconvex analysis. Extensions of the classical Lagrangian duality and the primal-dual schema to nonconvex systems are ongoing research endeavors (see Aubin and Ekeland, 1976, Ekeland, 1977, Thach, 1993, 1995, Thach, Konno, and Yokota, 1996, Singer, 1998, Gasimov, 2002). On the flip side, the Hellinger–Reissner complementary energy principle, emanating from large deformation mechanics, holds for both convex and nonconvex problems. It is very interesting to note that around the same time period of Reissner's work, the generalized potential variational principle in finite deformation elastoplasticity was proposed independently by Hu Hai-chang (1955) and K. Washizu (1955). These two variational principles are perfectly dual to each other (i.e., with zero duality gap) and play important roles in large deformation mechanics and computational methods. The inner relations between the Hellinger–Reissner and Hu–Washizu principles were discovered by Wei-Zang Chien in 1964 when he proposed a systematic method to construct generalized variational principles in solid mechanics (see Chien, 1980).
Mechanics and mathematics have been complementary partners since Newton's time, and the history of science shows much evidence of the beneficial influence of these disciplines on each other.
However, the independent developments of complementary-duality theory in mathematics and mechanics for more than half a century have generated a "duality gap" between the two partners. In modern analysis, the mathematical theory of duality was mainly based on the Fenchel transformation. During the last three decades, many modified versions of the Fenchel–Moreau–Rockafellar duality have been proposed. One, the so-called relaxation method in nonconvex mechanics, can be used to solve the relaxed convex problems (see Atai and Steigmann, 1998, Dacorogna, 1989, Ye, 1992). However, due to the duality gap, these relaxed solutions do not directly yield real solutions to the nonconvex primal problems. Thus, tremendous efforts have been focused recently on finding the so-called perfect duality theory in global optimization. On the other hand, it seems that most engineers and scientists prefer the classical Legendre transformation. Their attention has been mainly focused on how to use traditional Lagrange multiplier methods and complementary constitutive laws to correctly formulate complementary variational principles for numerical computational and application purposes. Although the generalized Hellinger–Reissner principle leads to a perfect duality between the nonconvex potential variational problem and its complementary-dual, and has many important consequences in large deformation theory and computational mechanics, the extremality property of this well-known principle, as well as the Hu–Washizu principle, remained an open problem for more than 40 years, and this raised many arguments in large deformation theory and nonconvex mechanics (see Levinson, 1965, Veubeke, 1972, Koiter, 1976, Ogden, 1975, 1977, Lee and Shield, 1980a,b, Guo, 1980).
Actually, this open problem was partially solved in 1989 in the joint work of Gao and Strang (1989a) on nonconvex/nonsmooth variational problems.
In order to recover the lost symmetry between the nonconvex primal problem and its dual, they introduced a so-called complementary gap function, which leads to a nonlinear Lagrangian duality theory in fully nonlinear variational problems. They proved that if this gap function is positive on a dual feasible space, the generalized Hellinger–Reissner energy is a saddle-Lagrangian. Therefore, this gap function provides a sufficient condition in nonconvex variational problems. However, the extremality conditions for a negative gap function were ignored until 1997, when Gao (1997) became involved with a project on postbuckling problems in nonconvex mechanics. He discovered that if this gap function is negative, the generalized Hellinger–Reissner energy (the so-called super-Lagrangian) is concave in each of its variables, which led to a biduality theory. From these results, a canonical duality theory has gradually developed, first in nonconvex mechanics, and then in global optimization (see Gao, 1990–2005). This new theory is composed mainly of a potentially useful canonical dual transformation and an associated triality theory, whose components comprise a saddle min-max duality and two pairs of double-min, double-max dualities. The canonical dual transformation can be used to formulate perfect dual problems without a duality gap, whereas the triality theory can be used to identify both global and local extrema.
The goal of this chapter is to present a comprehensive review on the canonical duality theory within a unified framework, and to expose its role in establishing connections between nonconvex mechanics and global optimization. Applications to constrained nonconvex optimization problems are shown to reveal some important new results that are fundamental to global optimization theory. This chapter should be of interest to both the operations research and applied mathematics communities. In order to make this presentation easy to follow by interdisciplinary readers, our attention here is mainly focused on smooth systems, although some concepts from nonsmooth analysis have been used in later sections.
8.2 Quadratic Minimization Problems
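Let us begin with the simplest quadratic minimization problem (in short, the primal problem $(\mathcal{P}_q)$):
$$(\mathcal{P}_q):\quad \min\left\{ P(u) = \frac{1}{2}\langle u, Au \rangle - \langle u, f \rangle : u \in \mathcal{U}_k \right\}, \tag{8.1}$$
where $\mathcal{U}_k$ is an open subset of a linear space $\mathcal{U}$; $A$ is a linear symmetrical operator, which maps each $u \in \mathcal{U}$ into its dual space $\mathcal{U}^*$; the bilinear form $\langle u, u^* \rangle : \mathcal{U} \times \mathcal{U}^* \to \mathbb{R}$ puts $\mathcal{U}$ and $\mathcal{U}^*$ in duality; $f \in \mathcal{U}^*$ is a given input, and $P : \mathcal{U} \to \mathbb{R}$ represents the total cost (action) of the system. The criticality condition $\delta P(u) = 0$ leads to a linear equation
$$Au = f, \tag{8.2}$$
which is called the fundamental equation (or equilibrium equation) in mathematical physics. By the fact that $A : \mathcal{U} \to \mathcal{U}^*$ is a symmetrical operator, we have the following canonical decomposition,
$$A = \Lambda^* D \Lambda, \tag{8.3}$$
where $\Lambda : \mathcal{U} \to \mathcal{V}$ is a so-called geometrical operator, which maps each $u \in \mathcal{U}$ into a so-called intermediate space $\mathcal{V}$, and the symmetrical operator $D$ links $\mathcal{V}$ with its dual space $\mathcal{V}^*$. The bilinear form $\langle v ; v^* \rangle : \mathcal{V} \times \mathcal{V}^* \to \mathbb{R}$ puts $\mathcal{V}$ and $\mathcal{V}^*$ in duality. We distinguish between the notations $\langle *, * \rangle$ and $\langle * ; * \rangle$ according to the differences of the dual spaces $\mathcal{U} \times \mathcal{U}^*$ and $\mathcal{V} \times \mathcal{V}^*$ on which they are respectively defined. The mapping $v^* = Dv \in \mathcal{V}^*$ is called the duality equation. The adjoint operator $\Lambda^* : \mathcal{V}^* \to \mathcal{U}^*$, defined by
$$\langle \Lambda u ; v^* \rangle = \langle u, \Lambda^* v^* \rangle,$$
is also called the balance operator. Thus, by the use of the intermediate pair $(v, v^*)$, the fundamental equation (8.2) can be split into the so-called tricanonical form
$$\left.\begin{array}{lll}
\text{(a)} & \text{geometrical equation:} & \Lambda u = v \\
\text{(b)} & \text{duality equation:} & D v = v^* \\
\text{(c)} & \text{balance equation:} & \Lambda^* v^* = f
\end{array}\right\} \;\Rightarrow\; \Lambda^* D \Lambda u = f. \tag{8.4}$$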
In mathematical physics, the duality equation $v^* = Dv$ is also recognized as the constitutive law and the operator $D$ depends on the physical properties of the system considered. The pair $(v, v^*)$ is said to be a canonical dual pair on $\mathcal{V}_a \times \mathcal{V}_a^* \subset \mathcal{V} \times \mathcal{V}^*$ if the duality mapping $D : \mathcal{V}_a \subset \mathcal{V} \to \mathcal{V}_a^* \subset \mathcal{V}^*$ is one-to-one and onto. Generally speaking, most physical variables appear in dual pairs; that is, there exists a Gâteaux differentiable function $V : \mathcal{V}_a \to \mathbb{R}$ such that the duality relation $v^* = \delta V(v) : \mathcal{V}_a \to \mathcal{V}_a^*$ is invertible, where $\delta V(v)$ represents the Gâteaux derivative of $V$ at $v$. In mathematical physics, such a function is called free energy. Its Legendre conjugate $V^*(v^*) : \mathcal{V}^* \to \mathbb{R}$, defined by the Legendre transformation
$$V^*(v^*) = \operatorname{sta}\{ \langle v ; v^* \rangle - V(v) : v \in \mathcal{V}_a \}, \tag{8.5}$$
is called complementary energy, where $\operatorname{sta}\{\,\}$ denotes finding stationary points of the statement in $\{\,\}$. In order to study the canonical duality theory, consider the following definition.
Definition 8.1. A real-valued function $V : \mathcal{V}_a \subset \mathcal{V} \to \mathbb{R}$ is called a canonical function on $\mathcal{V}_a$ if its Legendre conjugate $V^*(v^*)$ can be uniquely defined on $\mathcal{V}_a^* \subset \mathcal{V}^*$ such that the following relations hold on $\mathcal{V}_a \times \mathcal{V}_a^*$:
$$v^* = \delta V(v) \;\Leftrightarrow\; v = \delta V^*(v^*) \;\Leftrightarrow\; \langle v ; v^* \rangle = V(v) + V^*(v^*). \tag{8.6}$$
Clearly, if $D : \mathcal{V}_a \to \mathcal{V}_a^*$ is invertible, the quadratic function $V(v) = \frac{1}{2}\langle v ; Dv \rangle$ is canonical on $\mathcal{V}_a$ and its Legendre conjugate $V^*(v^*) = \frac{1}{2}\langle D^{-1} v^* ; v^* \rangle$ is a canonical function on $\mathcal{V}_a^*$. Generally speaking, if $V : \mathcal{V}_a \to \mathbb{R}$ is a canonical function and $v^* = \delta V(v)$, then $(v, v^*)$ is a canonical dual pair on $\mathcal{V}_a \times \mathcal{V}_a^*$. The one-to-one canonical duality relation serves as a foundation for the canonical dual transformation method reviewed in the following sections. The definition of the canonical pairs and functions can be generalized to nonsmooth systems where the Fenchel transformation and subdifferential have to be applied (see Gao, 2000a,c). This is discussed in the context of constrained global optimization problems in Section 8.8 of this chapter.
In order to study general problems, we denote the linear function $\langle u, f \rangle$ by $U(u)$. If the feasible space $\mathcal{U}_k$ can be written in the form of
$$\mathcal{U}_k = \{ u \in \mathcal{U}_a \mid \Lambda u \in \mathcal{V}_a \}, \tag{8.7}$$
then the problem $(\mathcal{P}_q)$ can be written in a general form
$$(\mathcal{P}):\quad \min\{ P(u) = V(\Lambda u) - U(u) : u \in \mathcal{U}_k \}. \tag{8.8}$$
This general form covers many problems in applications. In continuum mechanics, the feasible set $\mathcal{U}_k$ is usually called the kinetically admissible space. In statics, where the function $V(v)$ is viewed as an internal (or stored) energy and $U(u)$ is considered as an external energy, the cost function $P(u)$ is the so-called total potential and $(\mathcal{P})$ represents a minimal potential variational problem. In dynamical systems, if $V(v)$ is considered as a kinetic energy and $U(u)$ is the total potential, then $P(u)$ is called the total action of the system. In this case, the variational problem associated with the general form $(\mathcal{P})$ is the well-known least action principle. A diagrammatic representation of this tricanonical decomposition is shown in Figure 8.1.
The development of the $\Lambda^* D \Lambda$-operator theory was apparently initiated by von Neumann in 1932, and was subsequently extended and put into a more general setting in the studies of complementary variational principles in continuum mechanics by Rall (1969), Arthurs (1980), Tonti (1972a,b), Oden and Reddy (1983), and Sewell (1987). In mathematical analysis, the tricanonical form of $A = \Lambda^* D \Lambda$ has also been used to develop a mathematical theory
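$$\begin{array}{ccc}
u \in \mathcal{U}_a \subset \mathcal{U} & \longleftarrow\ \langle u, u^* \rangle\ \longrightarrow & \mathcal{U}^* \supset \mathcal{U}_a^* \ni u^* \\
\Lambda \big\downarrow & & \big\uparrow \Lambda^* \\
v \in \mathcal{V}_a \subset \mathcal{V} & \longleftarrow\ \langle v ; v^* \rangle\ \longrightarrow & \mathcal{V}^* \supset \mathcal{V}_a^* \ni v^*
\end{array}$$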
Fig. 8.1 Diagrammatic representation for quadratic systems.
of duality by Rockafellar (1970), Ekeland and Temam (1976), Toland (1978, 1979), Auchmuty (1983), Clarke (1985), and many others. In the excellent textbook by Strang (1986), the trifactorization A = Λ∗DΛ for linear oper- ators can be seen through an application of continuum theories to discrete systems. In what follows, we list some simple examples. More applications can be found in the monograph Gao (2000a).
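The canonical relations (8.6) are easy to check numerically in the quadratic case. The following is a minimal sketch, assuming a finite-dimensional setting in which the bilinear form is the dot product and $D$ is a randomly generated symmetric positive definite matrix; the function name `legendre_conjugate_quadratic` and all numerical values are illustrative choices, not part of the original text.

```python
import numpy as np

def legendre_conjugate_quadratic(D, v_star):
    """Stationary value of <v; v*> - V(v) for V(v) = 0.5 * v^T D v.

    Returns the conjugate value and the stationary point v = D^{-1} v*.
    """
    v = np.linalg.solve(D, v_star)            # stationarity condition: v* = D v
    return v @ v_star - 0.5 * v @ D @ v, v

rng = np.random.default_rng(0)
M = rng.standard_normal((3, 3))
D = M @ M.T + 3.0 * np.eye(3)                 # assumed: symmetric positive definite
v = rng.standard_normal(3)
v_star = D @ v                                # duality equation v* = D v

V = 0.5 * v @ D @ v                           # V(v)
V_conj, v_back = legendre_conjugate_quadratic(D, v_star)
closed_form = 0.5 * v_star @ np.linalg.solve(D, v_star)   # 0.5 <D^{-1} v*; v*>

print(np.isclose(V_conj, closed_form))        # conjugate matches the closed form
print(np.allclose(v_back, v))                 # v = dV*(v*) recovers v
print(np.isclose(v @ v_star, V + V_conj))     # <v; v*> = V(v) + V*(v*)
```

All three checks correspond, in order, to the closed-form conjugate stated above and to the second and third relations in (8.6).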
8.2.1 Quadratic Optimization Problems in $\mathbb{R}^n$
First, we consider $\mathcal{U}$ as a finite-dimensional space such that $\mathcal{U} = \mathcal{U}^* = \mathbb{R}^n$. Thus $A : \mathcal{U} \to \mathcal{U}^*$ is a symmetric matrix in $\mathbb{R}^{n \times n}$ and the bilinear form $\langle u, u^* \rangle = u^T u^*$ is simply a dot product in $\mathbb{R}^n$. By linear algebra, the canonical decomposition $A = \Lambda^* D \Lambda$ can be performed in many ways (see Strang, 1986), where $\Lambda : \mathbb{R}^n \to \mathbb{R}^m$ is a matrix, $D : \mathbb{R}^m \to \mathbb{R}^m$ is a symmetrical matrix, and $\Lambda^* = \Lambda^T$ maps $\mathcal{V}^* = \mathbb{R}^m$ back to $\mathcal{U}^* = \mathbb{R}^n$. The bilinear forms $\langle *, * \rangle$ and $\langle * ; * \rangle$ are simply dot products in $\mathbb{R}^n$ and $\mathbb{R}^m$, respectively, that is,
$$\langle \Lambda u ; v^* \rangle = \sum_{i=1}^{m} v_i^* \left( \sum_{j=1}^{n} \Lambda_{ij} u_j \right) = \sum_{j=1}^{n} u_j \left( \sum_{i=1}^{m} \Lambda_{ij} v_i^* \right) = \langle u, \Lambda^T v^* \rangle.$$
If the matrix $A$ is positive semidefinite, we can always choose a geometrical operator $\Lambda$ to ensure that the matrix $D \in \mathbb{R}^{m \times m}$ is positive definite. In this case the problem $(\mathcal{P})$ is a convex program and any solution of the fundamental equation $Au = f$ also solves the minimization problem $(\mathcal{P})$.
If the matrix $A$ is indefinite, the quadratic function $\frac{1}{2}\langle u, Au \rangle$ is nonconvex. From linear algebra, it follows then that by choosing a particular linear operator $\Lambda : \mathbb{R}^n \to \mathbb{R}^m$, the matrix $A$ can be written in the tricanonical form:
$$A = \left( \Lambda^T, \; I \right) \begin{pmatrix} D & 0 \\ 0 & -C \end{pmatrix} \begin{pmatrix} \Lambda \\ I \end{pmatrix}, \tag{8.9}$$
where $D \in \mathbb{R}^{m \times m}$ is positive definite, $C \in \mathbb{R}^{n \times n}$ is positive semidefinite, and $I$ is the identity matrix on $\mathbb{R}^n$. In this case, both $V(v) = \frac{1}{2}\langle v ; Dv \rangle$ and $U(u) = \frac{1}{2}\langle u, Cu \rangle + \langle u, f \rangle$ are convex quadratic functions, but
$$P(u) = V(\Lambda u) - U(u) = \frac{1}{2}\langle \Lambda u ; D \Lambda u \rangle - \frac{1}{2}\langle u, Cu \rangle - \langle u, f \rangle$$
is a nonconvex d.c. function, that is, a difference of convex functions. In this case, the problem $(\mathcal{P})$ is a nonconvex quadratic minimization and the solution of $Au = f$ is only a critical point of $P(u)$.
Nonconvex quadratic programming and d.c. programming are important from both the mathematical and application viewpoints. Sahni (1974) first showed that for a negative definite matrix $A$, the problem $(\mathcal{P})$ is NP-hard. This result was also proved by Vavasis (1990, 1991) and by Pardalos (1991). During the last decade, several authors have shown that the general quadratic programming problem $(\mathcal{P})$ is an NP-hard problem in global optimization (cf. Murty and Kabadi, 1987, Horst et al., 2000). It was shown by Pardalos and Vavasis (1991) that even when the matrix $A$ is of rank one with exactly one negative eigenvalue, the problem is NP-hard. Much effort has been devoted to solving this difficult problem during the last decade. Comprehensive surveys have been given by Floudas and Visweswaran (1995) for quadratic programming, and by Tuy (1995) for d.c. optimization.
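One possible way to build the splitting (8.9) for a given indefinite matrix is through its eigendecomposition. The sketch below is only one of the many admissible choices mentioned above, not a construction prescribed by the text: it takes the rows of $\Lambda$ to be the eigenvectors with positive eigenvalues, $D$ to be the diagonal matrix of those eigenvalues, and $C$ to be the negative spectral part of $A$ with its sign flipped. The helper name `tricanonical_split`, the tolerance, and the sample matrix are assumptions made for illustration.

```python
import numpy as np

def tricanonical_split(A, tol=1e-12):
    """Return (Lam, D, C) with A = Lam.T @ D @ Lam - C, D > 0 and C >= 0."""
    eigvals, Q = np.linalg.eigh(A)            # A = Q diag(eigvals) Q^T for symmetric A
    pos = eigvals > tol
    neg = eigvals < -tol
    Lam = Q[:, pos].T                         # m x n geometrical operator
    D = np.diag(eigvals[pos])                 # m x m, positive definite
    C = (Q[:, neg] * (-eigvals[neg])) @ Q[:, neg].T   # n x n, positive semidefinite
    return Lam, D, C

A = np.array([[2.0, 1.0, 0.0],
              [1.0, -1.0, 0.5],
              [0.0, 0.5, 0.3]])              # symmetric and indefinite (illustrative)
Lam, D, C = tricanonical_split(A)
m, n = Lam.shape

# Assemble the block form (8.9): (Lam^T, I) diag(D, -C) (Lam; I).
left = np.hstack([Lam.T, np.eye(n)])
middle = np.block([[D, np.zeros((m, n))],
                   [np.zeros((n, m)), -C]])
right = np.vstack([Lam, np.eye(n)])

print(np.allclose(left @ middle @ right, A))           # block form reproduces A
print(np.all(np.diag(D) > 0))                          # D is positive definite
print(np.all(np.linalg.eigvalsh(C) >= -1e-12))         # C is positive semidefinite
```

With this choice, $V(\Lambda u) = \frac{1}{2}\langle \Lambda u ; D \Lambda u \rangle$ and $\frac{1}{2}\langle u, Cu \rangle$ are both convex, so $P(u)$ appears explicitly as a difference of convex quadratics.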
8.2.2 Variational Problems in Continuum Mechanics
In continuous systems the linear space $\mathcal{U}$ is usually a function space over a time-space domain, and the linear mapping $A$ is a differential operator. In classical Newtonian dynamics, for example, the fundamental equation (8.2) is a second-order differential equation
$$Au = -m u'' = f,$$
where $f$ is an applied force field. In this case, $\Lambda = d/dt$ is a linear differential operator, $m > 0$ is a mass density, and $\Lambda^* = -d/dt$ can be defined by integrating by parts over a time domain $T \subset \mathbb{R}$ with boundary $\partial T$:
$$\langle \Lambda u ; v^* \rangle = \int_T u' v^* \, dt = \int_T u (-v^*)' \, dt = \langle u, \Lambda^* v^* \rangle,$$
subject to the boundary conditions $u(t) v^*(t) = 0$, $\forall t \in \partial T$. For Newton's law, $D = m$ is a constant and the tricanonical form $Au = \Lambda^* D \Lambda u = -m u'' = f$ is Newton's equilibrium equation. The quadratic form
$$V(\Lambda u) = \frac{1}{2}\langle u, Au \rangle = \frac{1}{2}\langle \Lambda u ; D \Lambda u \rangle = \int_T \frac{1}{2} m {u'}^2 \, dt$$
represents the internal (or kinetic) energy of the system, and the linear term
$$U(u) = \int_T u f \, dt$$
represents the external energy of the system. The function $P(u) = V(\Lambda u) - U(u)$ is called the total action, which is a convex functional.
For Einstein's law, however, $D = m(t) = m_o / \sqrt{1 - v^2/c^2}$ depends on the velocity $v = u'$, where $m_o > 0$ is a constant and $c$ is the speed of light. In this case, the tricanonical form $Au = f$ leads to Einstein's theory of special relativity:
$$-\frac{d}{dt}\left( \frac{m_o}{\sqrt{1 - {u'}^2/c^2}} \, \frac{d}{dt} u \right) = f.$$
The kinetic energy
$$V(v) = \int_T -m_o \sqrt{1 - v^2/c^2} \, dt$$
is no longer quadratic, but is still a convex functional on $\mathcal{V}_a = \{ v \in \mathcal{L}^\infty(T) \mid v(t)$ …
$$V(v) = \int_T \frac{1}{2} m v^2 \, dt, \qquad U(u) = \int_T \left( \frac{1}{2} k u^2 - u f \right) dt,$$
where the quadratic function $U(u)$ represents the total potential energy. The quadratic functional given by
$$P(u) = V(\Lambda u) - U(u) = \int_T \frac{1}{2} m u_{,t}^2 \, dt - \int_T \left[ \frac{1}{2} k u^2 - u f \right] dt \tag{8.11}$$
is the well-known total action, which is again a d.c. functional.
Actually, every function $P(u) \in C^2$ is d.c. on any compact convex set $\mathcal{U}_k$, and any d.c. optimization problem can be reduced to the canonical form (see Tuy, 1995):
$$\min\{ V(\Lambda u) : U(u) \le 0, \; G(u) \ge 0 \}, \tag{8.12}$$
where $V$, $U$, and $G$ are convex functions. In the next section, we demonstrate how the tricanonical $\Lambda^* D \Lambda$-operator theory serves as a framework for the Lagrangian duality theory.
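The Newtonian example of this subsection can be imitated on a grid. The following is a minimal sketch, assuming a uniform grid on $T = [0, 1]$, homogeneous boundary values $u(0) = u(1) = 0$ (so that the boundary term in the integration by parts vanishes), and a forward-difference matrix standing in for $\Lambda = d/dt$; the grid size, mass, and test function are illustrative values only.

```python
import numpy as np

N, m = 200, 1.5                          # grid intervals and mass (illustrative values)
h = 1.0 / N
t = np.linspace(0.0, 1.0, N + 1)

# Forward difference on interior values u_1..u_{N-1}, assuming u_0 = u_N = 0.
Lam = (np.eye(N, N - 1) - np.eye(N, N - 1, k=-1)) / h   # discrete d/dt, shape N x (N-1)
D = m * np.eye(N)                                        # constitutive law v* = m v
A = Lam.T @ D @ Lam                                      # discrete tricanonical operator

# The transpose acts as minus a difference quotient, the discrete analogue of -d/dt.
rng = np.random.default_rng(0)
v_star = rng.standard_normal(N)
adjoint_is_minus_diff = np.allclose(Lam.T @ v_star, -(v_star[1:] - v_star[:-1]) / h)

# A is the standard second-difference approximation of -m u''.
second_diff = (m / h**2) * (2 * np.eye(N - 1) - np.eye(N - 1, k=1) - np.eye(N - 1, k=-1))
matches_second_diff = np.allclose(A, second_diff)

# For u = sin(pi t), -m u'' = m pi^2 sin(pi t); the grid operator agrees up to O(h^2).
u = np.sin(np.pi * t[1:-1])
f = A @ u
close_to_exact = np.allclose(f, m * np.pi**2 * u, rtol=1e-3)

print(adjoint_is_minus_diff, matches_second_diff, close_to_exact)
```

The sign change in the adjoint is exactly the discrete counterpart of the integration by parts that defines $\Lambda^* = -d/dt$.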
8.3 Canonical Lagrangian Duality Theory
Classical Lagrangian duality was originally studied by Lagrange in analytical mechanics. In engineering mechanics it has been recognized as the complementary variational principle, and has been subjected to extensive study for several centuries. In this section, we show its connection to constrained optimization/variational problems. In addition to the well-known saddle Lagrangian duality theory, a so-called super-Lagrangian duality is presented within a unified framework, which leads to a biduality theorem in d.c. programming and convex Hamiltonian systems.
Recall the general primal problem (8.8)
$$(\mathcal{P}):\quad \min\{ P(u) = V(\Lambda u) - U(u) : u \in \mathcal{U}_k \}, \tag{8.13}$$
where $V : \mathcal{V}_a \subset \mathcal{V} \to \mathbb{R}$ is a canonical function, $U : \mathcal{U}_a \to \mathbb{R}$ is a Gâteaux differentiable function, either linear or canonical, and $\mathcal{U}_k = \{ u \in \mathcal{U}_a \mid \Lambda u \in \mathcal{V}_a \}$ is a convex feasible set. Without loss of generality, we assume that the geometrical operator $\Lambda : \mathcal{U}_a \to \mathcal{V}_a$ can be chosen in a way such that the canonical function $V : \mathcal{V}_a \to \mathbb{R}$ is convex. By the definition of the canonical function, the duality relation $v^* = \delta V(v) : \mathcal{V}_a \to \mathcal{V}_a^*$ leads to the following Fenchel–Young equality on $\mathcal{V}_a \times \mathcal{V}_a^*$,
$$V(v) = \langle v ; v^* \rangle - V^*(v^*).$$
Substituting this into equation (8.13), the Lagrangian $L(u, v^*) : \mathcal{U}_a \times \mathcal{V}_a^* \to \mathbb{R}$ associated with the canonical problem $(\mathcal{P})$ can be defined by
$$L(u, v^*) = \langle \Lambda u ; v^* \rangle - V^*(v^*) - U(u). \tag{8.14}$$
Definition 8.2. (Canonical Lagrangian) A function $L : \mathcal{U}_a \times \mathcal{V}_a^* \to \mathbb{R}$ associated with the problem $(\mathcal{P})$ is called a canonical Lagrangian if it is a canonical function on $\mathcal{V}_a^*$ and a canonical or linear function on $\mathcal{U}_a$.
The criticality condition $\delta L(\bar{u}, \bar{v}^*) = 0$ leads to the well-known Lagrange equations:
$$\begin{array}{l} \Lambda \bar{u} = \delta V^*(\bar{v}^*), \\ \Lambda^* \bar{v}^* = \delta U(\bar{u}). \end{array} \tag{8.15}$$
By the fact that $V : \mathcal{V}_a \to \mathbb{R}$ is a canonical function, so that the duality mapping $\delta V : \mathcal{V}_a \to \mathcal{V}_a^*$ is invertible, the Lagrange equations (8.15) are equivalent to $\Lambda^* \delta V(\Lambda \bar{u}) = \delta U(\bar{u})$. If $(\bar{u}, \bar{v}^*)$ is a critical point of $L(u, v^*)$, then $\bar{u}$ is a critical point of $P(u)$ on $\mathcal{U}_k$.
Because the canonical function $V$ is assumed to be convex on $\mathcal{V}_a$, the canonical Lagrangian $L(u, v^*)$ is concave on $\mathcal{V}_a^*$. Thus, the extremality conditions of the critical point of $L(u, v^*)$ depend on the convexity of the function $U(u)$. Two important duality theories are associated with the canonical Lagrangian, as shown in Sections 8.3.1 and 8.3.2 below.
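The Lagrange equations (8.15) and their equivalence with the primal criticality condition can be illustrated in finite dimensions. The sketch below assumes the simplest case covered by the discussion above: a quadratic $V(v) = \frac{1}{2}\langle v ; Dv \rangle$ with $D$ positive definite and a linear $U(u) = \langle u, f \rangle$, with randomly generated data; the names `P` and `L` mirror (8.13) and (8.14), and all dimensions and seeds are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
n, m = 3, 4
Lam = rng.standard_normal((m, n))             # geometrical operator (assumed full column rank)
M = rng.standard_normal((m, m))
D = M @ M.T + m * np.eye(m)                   # symmetric positive definite, so V is convex
f = rng.standard_normal(n)
D_inv = np.linalg.inv(D)

def P(u):
    """Primal function V(Lam u) - U(u) with U(u) = <u, f>."""
    v = Lam @ u
    return 0.5 * v @ D @ v - f @ u

def L(u, v_star):
    """Canonical Lagrangian (8.14) with V*(v*) = 0.5 <D^{-1} v*; v*>."""
    return (Lam @ u) @ v_star - 0.5 * v_star @ D_inv @ v_star - f @ u

# Critical point: solve Lam^T D Lam u = f, then apply the duality equation v* = D Lam u.
u_bar = np.linalg.solve(Lam.T @ D @ Lam, f)
v_star_bar = D @ Lam @ u_bar

grad_u = Lam.T @ v_star_bar - f               # balance equation residual
grad_v = Lam @ u_bar - D_inv @ v_star_bar     # residual of Lam u = dV*(v*)

# Sample the saddle inequality (8.16) at random perturbations of the critical point.
L_bar = L(u_bar, v_star_bar)
saddle_ok = all(
    L(u_bar + rng.standard_normal(n), v_star_bar) >= L_bar - 1e-9
    and L(u_bar, v_star_bar + rng.standard_normal(m)) <= L_bar + 1e-9
    for _ in range(100)
)

print(np.allclose(grad_u, 0.0, atol=1e-9), np.allclose(grad_v, 0.0, atol=1e-9))
print(np.isclose(L_bar, P(u_bar)), saddle_ok)
```

Because $U$ is linear here, $L(u, \bar{v}^*)$ is constant in $u$ at the critical point, so the left inequality in the saddle condition holds with equality up to rounding; the strict behavior comes from the concavity in $v^*$.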
8.3.1 Saddle-Lagrangian Duality
First, we assume that $U(u)$ is a concave function on $\mathcal{U}_a$. In this case, $L(u, v^*)$ is a saddle-Lagrangian; that is, $L(u, v^*)$ is convex on $\mathcal{U}_a$ and concave on $\mathcal{V}_a^*$. By the traditional definition, a pair $(\bar{u}, \bar{v}^*)$ is called a saddle point of $L(u, v^*)$ on $\mathcal{U}_a \times \mathcal{V}_a^*$ if
$$L(u, \bar{v}^*) \ge L(\bar{u}, \bar{v}^*) \ge L(\bar{u}, v^*), \quad \forall (u, v^*) \in \mathcal{U}_a \times \mathcal{V}_a^*. \tag{8.16}$$
The classical saddle-Lagrangian duality theory can be presented precisely by the following theorem.
Theorem 8.1. (Saddle-Min-Max Theorem) Suppose that the function $U : \mathcal{U}_a \to \mathbb{R}$ is concave and there exists a linear operator $\Lambda : \mathcal{U}_a \to \mathcal{V}_a$ such that the canonical Lagrangian $L : \mathcal{U}_a \times \mathcal{V}_a^* \to \mathbb{R}$ is a saddle function. If $(\bar{u}, \bar{v}^*) \in \mathcal{U}_a \times \mathcal{V}_a^*$ is a critical point of $L(u, v^*)$, then
$$\min_{u \in \mathcal{U}_k} \max_{v^* \in \mathcal{V}_a^*} L(u, v^*) = L(\bar{u}, \bar{v}^*) = \max_{v^* \in \mathcal{V}_k^*} \min_{u \in \mathcal{U}_a} L(u, v^*). \tag{8.17}$$
By using this theorem, the dual function $P^d(v^*)$ can be defined as
$$P^d(v^*) = \min_{u \in \mathcal{U}_a} L(u, v^*) = U^{\flat}(\Lambda^* v^*) - V^*(v^*), \tag{8.18}$$
where $U^{\flat} : \mathcal{U}^* \to \mathbb{R}$ is a Fenchel conjugate function of $U$ defined by the Fenchel transformation