Adv. Nonlinear Anal. 2 (2013), 271–307 DOI 10.1515/anona-2013-1001 © de Gruyter 2013
Variational methods in the presence of symmetry
Jonathan M. Borwein and Qiji J. Zhu
Abstract. The purpose of this paper is to survey and to provide a unified framework to connect a diverse group of results, currently scattered in the literature, that can be usefully viewed as consequences of applying variational methods to problems involving symmetry. Here, variational methods refer to mathematical treatment by way of constructing an appropriate action function whose critical points (or saddle points) correspond to or contain the desired solutions.
Keywords. Variational methods, convex analysis, symmetry, group action.
2010 Mathematics Subject Classification. 49J53, 58J70, 58E30.
1 Introduction
The purpose of this paper is to survey and to provide a unified framework to connect a diverse group of results, currently scattered in the literature, that can be usefully viewed as consequences of applying variational methods to problems involving symmetry. Here, variational methods refer to mathematical treatment by way of constructing an appropriate action function whose critical points correspond to or contain the desired solutions.
Variational methods can be viewed as a mathematical form of the least action principle in physics. Variational methods have been a powerful tool in both pure and applied mathematics ever since the systematic development of the calculus of variations commenced over 300 years ago. With the discovery of modern variational principles and the development of nonsmooth analysis, the range of application of such techniques has been extended dramatically (see the recent monographs [14, 18, 32, 39, 40]).
Symmetry is ubiquitous in the real world and its modelling, and it also arises frequently in variational problems. Often when the action function is symmetric, the solution also has a certain symmetry. Research on such symmetric variational problems is currently scattered in the literature and sometimes is treated only implicitly. Since Felix Klein introduced his Erlangen Program [28], symmetry has, in general, been treated as invariance with respect to group actions (see [45, 49]). This, however, proves to be inadequate in dealing with many variational problems involving symmetry.
For example, a key result leading to Adrian Lewis’ celebrated representation of subdifferentials of spectral functions [30] is:
Proposition 1.1. Let $f$ be a convex permutation invariant function of several real variables. Then $y \in \partial f(x) := \{ y : \langle y, z - x \rangle \le f(z) - f(x) \}$ if and only if
$$y^{\downarrow} \in \partial f(x^{\downarrow}) \quad \text{and} \quad \langle y, x \rangle = \langle y^{\downarrow}, x^{\downarrow} \rangle.$$
Here $x^{\downarrow}$ denotes the decreasing rearrangement of $x$. We note that in Proposition 1.1:
(a) the permutation invariance of the action function $f$ is not preserved by the solution (its subdifferential),
(b) the decreasing rearrangement is not invertible and, therefore, cannot be described by a group action.
These are not the only special features of variational problems involving symmetry. Some other ‘anomalies’ include:
(c) there may be no compatible topology for the symmetrization process,
(d) at times only approximate symmetries can be constructed.
A typical example involving (c) and (d) is the existence of Schwarz-symmetric solutions to Laplace-type partial differential equations as discussed in Section 3.4.2.
There have been some prior efforts to deal systematically with variational problems involving symmetry. An early result is the Palais principle of symmetric criticality [34]. When the action group consists of differentiable isometries, this principle states that finding symmetric critical points of a smooth action function requires only handling its restriction to the invariant sub-manifold corresponding to the group action. Limiting the application of the Palais principle is its strong smoothness requirement on the action function and the restrictive isometric property it needs for the group action.
In [29], Ledyaev and Zhu adopted the use of nonsmooth functions on smooth manifolds as a framework to deal with symmetry in variational problems. Spectral functions provide examples that can benefit from such a framework (see [14, Chapter 7]). The research in [29] established a set of tools for nonsmooth variational problems on smooth manifolds and illustrated that linearity of the underlying space is not essential in dealing with many variational problems. However, this approach still relies on characterizing symmetry as invariance with respect to group actions.
As a result the special features of variational problems involving symmetry alluded to above were not adequately addressed.
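As a small illustration of Proposition 1.1 and of feature (b), the following sketch takes the assumed test function f = max (convex and permutation invariant) on R^3 with integer data. The helper names and the sample grid are hypothetical, and the subgradient inequality is only sampled on a finite grid, so a passing check is evidence rather than proof.

```python
import itertools

def rearrange(v):
    # Decreasing rearrangement: sort the coordinates from largest to smallest.
    return sorted(v, reverse=True)

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

f = max  # assumed test case: a convex, permutation-invariant function

def is_subgradient(y, x, test_points):
    # Sampled check of <y, z - x> <= f(z) - f(x) over the given points z.
    return all(dot(y, [zi - xi for zi, xi in zip(z, x)]) <= f(z) - f(x)
               for z in test_points)

# Hypothetical sample grid of integer points in R^3 (contains the origin).
test_points = [list(z) for z in itertools.product(range(-3, 7), repeat=3)]

def rhs_of_prop(y, x):
    # The two conditions on the rearranged vectors in Proposition 1.1.
    yd, xd = rearrange(y), rearrange(x)
    return is_subgradient(yd, xd, test_points) and dot(y, x) == dot(yd, xd)

x = [1, 5, 2]
good = [0, 1, 0]   # selects the largest coordinate of x
bad = [1, 0, 0]    # selects a non-maximal coordinate of x

good_lhs, good_rhs = is_subgradient(good, x, test_points), rhs_of_prop(good, x)
bad_lhs, bad_rhs = is_subgradient(bad, x, test_points), rhs_of_prop(bad, x)

# Feature (b): two different vectors share one rearrangement, so the
# rearrangement map has no inverse.
same_rearrangement = rearrange([1, 5, 2]) == rearrange([2, 1, 5])
```

In this toy case both sides of the equivalence agree for each candidate: they hold together for `good` and fail together for `bad`.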
Recently, Van Schaftingen [46] and Squassina [42] proposed versions of symmetric minimax and variational principles, at different levels of generality, that are stimulated by polarization approximations of Steiner symmetry [15]. Tailoring variational principles to a specific type of symmetry helps in fitting such variational principles to the targeted problem. Yet this also limits the application of such principles to other problems. A more practicable approach to understanding variational problems involving symmetry is by providing a few simple overarching principles which lead to the development of systematic approaches for dealing with symmetries of diverse nature. We follow this path in the current work.
First we lay out a few such simple general variational principles. To make the general principles flexible enough and appropriate to the variational approach we define symmetry as an invariance with respect to a semigroup and study action functions that are ‘sub-invariant’ under the given semigroup action. The key to successfully applying these principles then relies on finding appropriate semigroups and related symmetrizations. We shall illustrate this with a suite of examples of contexts in which variational methods have already been used successfully in solving problems. As it is neither possible nor useful for our purpose to be comprehensive, we instead illustrate our approach using selected examples that exhibit the four special features, (a) through (d), highlighted above.
The organization of the remainder of the paper is as follows. We lay out the general principles in Section 2 and then turn to their applications in Section 3. Section 4 is devoted to discussion of saddle points in the presence of symmetry, and we draw various conclusions in Section 5.
2 Variational principles in presence of symmetry
2.1 Invariance and symmetry
As indicated, we frame symmetry as invariance or sub-invariance with respect to the action of a prescribed semigroup $G$ with identity (monoid) acting on the ambient space. Throughout the paper, unless stated otherwise, we shall assume each semigroup does possess an identity. Let $(X, d)$ be a complete metric space with the semigroup action
$$G \times X \to X, \qquad (g, x) \mapsto gx. \tag{2.1}$$
In the sequel we will always assume that, for any $g \in G$, the mapping $x \mapsto gx$ is continuous. For any subset $S$ of $X$ the $G$-orbit of $S$ is defined by
$$G \cdot S := \{ gs : g \in G,\ s \in S \}.$$
Consider an extended-valued lower semicontinuous (lsc) function $f : X \to \mathbb{R} \cup \{+\infty\}$. We denote the (lower) level set of $f$ at $a \in (-\infty, \infty)$ by
$$[f \le a] := \{ x \in X : f(x) \le a \}.$$
Definition 2.1 (Invariance). We say that an extended-valued function $f : X \to \mathbb{R} \cup \{+\infty\}$ is sub-invariant with respect to the semigroup action (2.1) if, for any $g \in G$ and $x \in X$,
$$f(gx) \le f(x). \tag{2.2}$$
If the inequality is strict for all $gx \ne x$, we say $f$ is strictly sub-invariant. We say that an extended-valued function $f : X \to \mathbb{R} \cup \{-\infty\}$ is (strictly) super-invariant with respect to the semigroup action (2.1) if the function $-f$ is (strictly) sub-invariant. If $f : X \to \mathbb{R}$ is both sub- and super-invariant, then we say $f$ is invariant.
Clearly, when $f$ is invariant, its level set $[f = a] := \{ x \in X : f(x) = a \}$ is invariant under the semigroup action, i.e.,
$$G \cdot [f = a] \subset [f = a].$$
This relationship is convenient when working with equality constraints but nonessential since
$$[f = a] = [f \le a] \cap [-f \le -a].$$
Thus, we will focus on sub-invariance below. We also observe that if $G$ is a group, then the concept of either sub-invariance or super-invariance coincides with invariance.
Although the invariance of the level set under the semigroup action is a type of symmetry property, in many situations the following stronger form of symmetry is more significant.
Definition 2.2 (Symmetrization). Let $G$ be a semigroup and let $f$ be a sub-invariant function. A map $S : X \to X$ is a $(G, f)$-symmetrization if, for any $x \in X$,
(i) for any $g \in G$, $S(gx) = gS(x) = S(x)$,
(ii) $S^2(x) = S(x)$,
(iii) $f(S(x)) \le f(x)$.
In particular, if $S(x) \in \operatorname{cl}(G \cdot x)$, then (iii) holds for any lsc function $f$. In this case we will simply call $S$ a $G$-symmetrization.
This framework relies on ideas in [42, 46]. The main difference is in condition (iii) in the definition of symmetrization. In our definition, the symmetrization is linked to a particular action function $f$. Verifying (iii) is often the key to application and is nontrivial. Such a verification is, however, usually much easier than trying to show $S(x) \in \operatorname{cl}(G \cdot x)$, which ensures that (iii) holds for every lower semicontinuous function $f$; this is an unnecessarily strong condition which fails in many cases.
An important known concrete example that fits this framework is establishing the existence of symmetric solutions of certain Dirichlet type problems. In those problems, as we shall see, often condition (iii) can be verified using the Palais–Smale property [35] of the action function. The framework herein is also more flexible.
2.2 Symmetric extremal principle
Many symmetry properties that one can derive using variational arguments are based on the following simple result.
Proposition 2.3 (Symmetric extremal principle). Let $f : X \to \mathbb{R} \cup \{+\infty\}$ be a sub-invariant extended-valued function with respect to the semigroup action (2.1). Then
$$G \cdot [f \le a] \subset [f \le a].$$
In particular, letting $a = \inf f$, we have
$$G \cdot \operatorname{argmin}(f) \subset \operatorname{argmin}(f).$$
Moreover, if $f$ is strictly sub-invariant, then, for any $x \in \operatorname{argmin}(f)$ and any $g \in G$, $gx = x$.
Proof. For any $x \in [f \le a]$ and any element $g \in G$, $f(gx) \le f(x) \le a$ implies $gx \in [f \le a]$. The two set inclusions follow directly. The last conclusion follows directly from the definition of strict sub-invariance.
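The content of Proposition 2.3 can be seen on a toy example. The sketch below is an assumption-laden illustration, not taken from the paper: X = R^2, the monoid G = {g_t : 0 <= t <= 1} acts by scaling g_t(x) = t x (g_1 is the identity, and no g_t with t < 1 is invertible within G), and f is the l1-norm, which is strictly sub-invariant. All function names are hypothetical.

```python
import random

def act(t, x):
    # Action of the scaling monoid: g_t(x) = t * x, with 0 <= t <= 1.
    return (t * x[0], t * x[1])

def f(x):
    # l1-norm: f(g_t x) = t * f(x) <= f(x), so f is sub-invariant under G.
    return abs(x[0]) + abs(x[1])

def check_sub_invariance(samples, scalings, tol=1e-12):
    return all(f(act(t, x)) <= f(x) + tol for x in samples for t in scalings)

def check_level_set_inclusion(samples, scalings, a, tol=1e-12):
    # Sampled version of G . [f <= a] being contained in [f <= a].
    level = [x for x in samples if f(x) <= a]
    return all(f(act(t, x)) <= a + tol for x in level for t in scalings)

random.seed(0)
samples = [(random.uniform(-2, 2), random.uniform(-2, 2)) for _ in range(200)]
scalings = [k / 10 for k in range(11)]

sub_invariant = check_sub_invariance(samples, scalings)
level_ok = check_level_set_inclusion(samples, scalings, a=1.0)

# f is strictly sub-invariant, and its unique minimizer is fixed by every g in G.
minimizer = (0.0, 0.0)
minimizer_fixed = all(act(t, minimizer) == minimizer for t in scalings)
```

Since G here is not a group, f is sub-invariant without being invariant, which is the situation the semigroup framework is designed to cover.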
Similarly to Proposition 2.3 we have
Proposition 2.4 (Symmetric minimization). Let $f : X \to \mathbb{R} \cup \{+\infty\}$ be a sub-invariant extended-valued function with respect to the semigroup action (2.1) and let $S$ be a $(G, f)$-symmetrization. Then
$$S(\operatorname{argmin}(f)) \subset \operatorname{argmin}(f).$$
Proof. It is impossible to lie strictly below the minimum! Indeed, condition (iii) gives $f(S(x)) \le f(x) = \min f$ for any $x \in \operatorname{argmin}(f)$.
2.3 Symmetric variational principles
The symmetric extremal principle only applies to a problem that attains its infimum, that is, has a minimum. When the existence of a minimum is not guaranteed, we need symmetric versions of a ‘variational principle’ [11, 21]. Versions of symmetric variational principles related to Steiner symmetrization have been discussed in [42]. These results require additional structure assumptions to ensure the compatibility with the Steiner symmetrization and its approximation by polarizations. We present simpler versions with more flexibility of application. The trade-off is a loss of precision when dealing with a specific Steiner symmetrization.
2.3.1 Symmetric Ekeland variational principle
Theorem 2.5 (Symmetric variational principle). Let $(X, d)$ be a complete metric space with a semigroup action $G$. Let $f : X \to \mathbb{R} \cup \{+\infty\}$ be a $G$-sub-invariant lsc function which is bounded from below. Suppose that
$$f(z) < \inf_X f + \varepsilon.$$
Then, for any $g \in G$ and $\lambda > 0$ there exists a $y$ such that
(i) $d(y, gz) \le \lambda$,
(ii) $f(y) + (\varepsilon/\lambda)\, d(y, gz) \le f(z)$,
(iii) $f(x) + (\varepsilon/\lambda)\, d(x, y) > f(y)$ for $x \ne y$.
Proof. Since $f$ is sub-invariant, we have
$$f(gz) \le f(z) < \inf_X f + \varepsilon.$$
Applying Ekeland’s variational principle [14, 21] to $gz$ suffices to complete the proof.
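The three conclusions of Theorem 2.5 can be checked by hand on a finite metric space, where the point y is reachable by a simple descent. The sketch below is only an illustration under assumed data: X is a rational grid in [-1, 1], the monoid is G = {id, g} with the non-invertible map g(x) = |x|, and f, eps, lam and z are hypothetical choices. The descent loop is a finite-space shortcut, not the general proof of Ekeland's principle. Exact rational arithmetic avoids rounding issues in the inequalities.

```python
from fractions import Fraction as Fr

# Assumed toy data: a finite (hence complete) metric space with d(x, y) = |x - y|.
X = [Fr(k, 50) for k in range(-50, 51)]

def d(x, y):
    return abs(x - y)

def g(x):
    # Non-invertible semigroup element: reflection onto the nonnegative half.
    return abs(x)

def f(x):
    # f(g(x)) <= f(x): the penalty term vanishes once x is replaced by |x|.
    return (abs(x) - Fr(3, 10)) ** 2 + Fr(1, 5) * max(Fr(0), -x)

def ekeland_point(start, eps, lam):
    """Descend from `start` until no x != y satisfies
    f(x) + (eps/lam) d(x, y) <= f(y). Each move strictly lowers f,
    so the loop stops on a finite space."""
    c = eps / lam
    y = start
    while True:
        better = [x for x in X if x != y and f(x) + c * d(x, y) <= f(y)]
        if not better:
            return y
        y = min(better, key=lambda x: f(x) + c * d(x, y))

eps, lam = Fr(1, 10), Fr(2)
z = Fr(-2, 5)
hypothesis = f(z) < min(f(x) for x in X) + eps   # f(z) < inf f + eps
sub_invariant = all(f(g(x)) <= f(x) for x in X)

y = ekeland_point(g(z), eps, lam)
cond_i = d(y, g(z)) <= lam
cond_ii = f(y) + (eps / lam) * d(y, g(z)) <= f(z)
cond_iii = all(f(x) + (eps / lam) * d(x, y) > f(y) for x in X if x != y)
```

Starting the descent at gz rather than z is exactly the step the proof adds to the classical principle; sub-invariance guarantees that gz is still an eps-minimizer.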
Remark 2.6. We note that for the trivial semigroup action $gx = x$, for all $g \in G$, any function on $X$ is invariant. Thus, Theorem 2.5 is a true, if easy, generalization of the Ekeland variational principle.
2.3.2 Symmetric Borwein–Preiss variational principle
Similarly we have the following symmetric special case of the Borwein–Preiss variational principle. Recall that a Banach space is super-reflexive if it possesses an equivalent uniformly convex norm or an equivalent uniformly smooth norm, see [11, 13]. Hilbert space and each abstract $L^p$ space with $1 < p < \infty$ are super-reflexive.
Figure 1. Ekeland (L) and Borwein–Preiss (R) variational principles.
Theorem 2.7 (Symmetric smooth variational principle). Let $(X, \|\cdot\|)$ be a super-reflexive Banach space with a semigroup action $G$. Let $f : X \to \mathbb{R} \cup \{+\infty\}$ be a $G$-sub-invariant lsc function which is bounded from below. Suppose that
$$f(z) < \inf_X f + \varepsilon.$$
Then, for any $g \in G$ and $\lambda > 0$, $p \ge 1$ there exists a $y$ such that
(i) $f(gz) < \inf_X f + \varepsilon$,
(ii) $\|y - gz\| \le \lambda$,
(iii) $f(y) + (\varepsilon/\lambda^p)\, \|y - gz\|^p \le f(gz)$,
(iv) $f(x) + (\varepsilon/\lambda^p)\, \|x - y\|^p \ge f(y)$.
Proof. This is similar to the proof of the symmetric Ekeland variational principle except we use the super-reflexive Borwein–Preiss variational principle [11] as a starting point.
Figure 1 illustrates the difference in the two principles.
3 Applications
Applying the framework in Section 2 to concrete problems requires us to carefully determine the action function $f$, to find a suitable semigroup $G$ with related symmetrization $S$, and to verify their compatibility. Although there are some general patterns, this process is largely problem-specific.
The main purpose of this section is to illustrate this process with several examples involving different forms of symmetry. We arrange our examples according to the special features alluded to in the introduction.
3.1 Symmetrization not compatible with topology
We start with a simple example:
Example 3.1 (Minimum of a symmetric function). Consider the problem of minimizing a permutation invariant lsc convex function $f : \mathbb{R}^N \to \mathbb{R} \cup \{+\infty\}$. Assume that $f$ is coercive (has compact lower level sets) and is bounded from below. Thus, $f$ attains its minimum. Let $G_1 := P(N)$ be the permutation group on $\{1, 2, \dots, N\}$ and define
$$S_1(x) := \bar{x}\,\vec{1},$$
where $\bar{x} := \sum_{n=1}^{N} x_n / N$ and the vector $\vec{1}$ has all components 1. We can directly verify that $S_1$ is a $(G_1, f)$-symmetrization. Thus, by the symmetric extremal principle, $f$ has a minimizer of the form $z = \bar{z}\,\vec{1}$. This reduces an $N$-dimensional optimization problem to a one-dimensional problem. We note that in this case $S_1(x) \notin \operatorname{cl}(G_1 \cdot x)$ unless $x = a\vec{1}$ for some $a \in \mathbb{R}$.
Since $f$ need only be an lsc function, the method in Example 3.1 can easily be applied to minimization problems with invariant constraints, on using an indicator function (as defined below) to turn the constrained minimization problem into an unconstrained one. As an illustration, we use symmetry with respect to $P(N)$ to give a proof of the well-known arithmetic-geometric mean inequality.
Example 3.2 (Arithmetic-geometric mean inequality). Consider
$$\min f(x) := -\sum_{n=1}^{N} \log(x_n) + \iota_C(x),$$
where $C := \{ x : \langle x, \vec{1} \rangle = K,\ x \ge 0 \}$, and
$$\iota_C(x) = \begin{cases} 0 & \text{for } x \in C, \\ +\infty & \text{otherwise,} \end{cases}$$
represents the indicator function of the set $C$. Then $f$ satisfies all the conditions for the function in Example 3.1 and, therefore, has a minimum of the form $S_1(x) = \bar{x}\,\vec{1}$. The constraint $S_1(x) \in C$ forces $\bar{x} = K/N$ and the minimum is $-N \log(K/N)$. This now easily leads to the arithmetic-geometric mean inequality.
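The averaging map S_1(x) = (mean of x) times the all-ones vector and the bound of Example 3.2 can be checked numerically. The following is a minimal sketch under assumptions: N = 4 and K = 10 are arbitrary choices, the sample points are random rationals on the constraint set, and the helper names are hypothetical. Conditions (i) and (ii) of a symmetrization are verified exactly in rational arithmetic, while condition (iii) and the inequality itself are verified in floating point with a small tolerance.

```python
import math
import random
from fractions import Fraction as Fr
from itertools import permutations

def S1(x):
    # S1(x) = mean(x) * (1, ..., 1)
    mean = sum(x) / len(x)
    return [mean] * len(x)

def f(x):
    # -sum log(x_n): the objective of Example 3.2 on the positive part of C.
    return -sum(math.log(t) for t in x)

rng = random.Random(1)
N, K = 4, Fr(10)   # assumed dimension and constraint level
points = []
for _ in range(50):
    w = [Fr(rng.randint(1, 20)) for _ in range(N)]
    points.append([K * t / sum(w) for t in w])   # exact rationals with sum K

# (i) S1(gx) = S1(x) for every permutation g; g S1(x) = S1(x) is immediate
# because S1(x) is a constant vector. (ii) S1 is idempotent.
cond_i = all(S1(list(p)) == S1(x) for x in points for p in permutations(x))
cond_ii = all(S1(S1(x)) == S1(x) for x in points)
# (iii) f(S1(x)) <= f(x), checked in floating point.
cond_iii = all(f(S1(x)) <= f(x) + 1e-12 for x in points)

# The symmetrized point attains the value -N log(K/N) ...
min_value = -N * math.log(K / N)
attained = all(abs(f(S1(x)) - min_value) < 1e-12 for x in points)

# ... and f(x) >= -N log(K/N) is the arithmetic-geometric mean inequality.
def geometric_mean(x):
    return math.exp(sum(math.log(t) for t in x) / len(x))

am_gm = all(geometric_mean(x) <= float(sum(x)) / N + 1e-12 for x in points)

# S1(x) is not in the (finite, hence closed) orbit of a non-constant x.
x0 = [Fr(1), Fr(2), Fr(3), Fr(4)]
in_orbit = S1(x0) in [list(p) for p in permutations(x0)]
```

The last check illustrates why condition (iii) is tied to the particular function f: the symmetrized point cannot be reached as a limit of permutations of x.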
Likewise, we prove the classical relative entropy inequality [9] using symmetry.
Example 3.3 (Relative entropy inequality). Consider
$$\min f(p, q) := \sum_{n=1}^{N} p_n \log(p_n / q_n) + \iota_C(p, q),$$
where $C := \{ (p, q) : \langle p, \vec{1} \rangle = \langle q, \vec{1} \rangle = 1,\ (p, q) \ge 0 \}$. We shall show that $f(p, q) \ge f(\vec{1}/N, \vec{1}/N) = 0$. Now $f$ is invariant with respect to the group action $G_2 : g(p, q) := (gp, gq)$, $g \in P(N)$, and $S_2(p, q) := (\bar{p}\,\vec{1}, \bar{q}\,\vec{1})$ is a $(P(N), f)$-symmetrization. Again, $f$ has a minimum $S_2(p, q) = (\bar{p}\,\vec{1}, \bar{q}\,\vec{1})$. The constraint $S_2(p, q) \in C$ forces $S_2(p, q) = (\vec{1}/N, \vec{1}/N)$ and the minimum to be 0 as needed.
Note that, in general, $f(p, q) > f(S_2(p, q))$, that is, the invariance of $f$ is not preserved by the symmetrization. Moreover, although $f$ is defined on $\mathbb{R}^{2N}$, it is not $P(2N)$-invariant. Carefully choosing the semigroup $G_2$ is very important.
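The role of the group chosen in Example 3.3 can be illustrated numerically. The sketch below uses assumed sample data (N = 5, random points of the simplex with strictly positive entries, and one hand-picked pair) and hypothetical helper names; it checks nonnegativity of the relative entropy, the value 0 at the symmetrized pair, invariance under a common permutation of p and q, and the failure of invariance under a permutation of all 2N coordinates that swaps p with q.

```python
import math
import random

def rel_entropy(p, q):
    # f(p, q) = sum_n p_n log(p_n / q_n) for strictly positive p and q.
    return sum(pn * math.log(pn / qn) for pn, qn in zip(p, q))

def random_simplex_point(n, rng):
    w = [rng.uniform(0.1, 1.0) for _ in range(n)]
    total = sum(w)
    return [t / total for t in w]

def S2(p, q):
    # Average p and q separately, as in the symmetrization of Example 3.3.
    n = len(p)
    return [sum(p) / n] * n, [sum(q) / n] * n

rng = random.Random(2)
N = 5
pairs = [(random_simplex_point(N, rng), random_simplex_point(N, rng))
         for _ in range(100)]

nonneg = all(rel_entropy(p, q) >= -1e-12 for p, q in pairs)
sym_is_zero = all(abs(rel_entropy(*S2(p, q))) < 1e-12 for p, q in pairs)

# A hand-picked pair in the simplex.
p = [0.7, 0.1, 0.1, 0.05, 0.05]
q = [0.2] * 5

# Applying the same permutation to p and q (the action G2) leaves f unchanged.
perm = [4, 2, 0, 3, 1]
gp, gq = [p[i] for i in perm], [q[i] for i in perm]
g2_invariant = abs(rel_entropy(gp, gq) - rel_entropy(p, q)) < 1e-12

# Swapping p and q is a permutation of the 2N coordinates; it changes f.
swap_changes_value = abs(rel_entropy(q, p) - rel_entropy(p, q)) > 1e-3
```

The failure under the swap is why the full permutation group P(2N) is the wrong choice here, while the diagonal action of P(N) works.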
Certainly, using convexity in these last two examples leads to shorter proofs. Nevertheless, the proofs here highlight the role of symmetry in such inequalities.
3.2 The role of sub-invariance
Much of the analysis of symmetrical properties using variational arguments follows the pattern in the three Examples 3.1, 3.2 and 3.3 described above. The hard work lies in verifying the conditions. Since the symmetric extremal principle is a consequence of the $G$-sub-invariance property of the given function $f$, it is unsurprising that the latter is often more powerful. We illustrate with two more subtle inequalities.
We start with the Muirhead inequality [33] which is described in terms of majorization (see e.g. [8]). Recall that, for vectors $x, y \in \mathbb{R}^N$, we say $x$ majorizes $y$, denoted by $x \succ y$, if
$$\sum_{n=1}^{k} x_n^{\downarrow} \ge \sum_{n=1}^{k} y_n^{\downarrow} \ \text{ for } 1 \le k < N \quad \text{and} \quad \sum_{n=1}^{N} x_n^{\downarrow} = \sum_{n=1}^{N} y_n^{\downarrow}.$$
We will use a semigroup characterization of majorization that follows the exposition in [25]. We define, for $\delta > 0$, the $\delta$-average operators $a_{ij}^{\delta}$ by
$$a_{ij}^{\delta} x := \begin{cases} x - \delta e^i + \delta e^j & \text{if } x_i - x_j > 2\delta, \\ x & \text{otherwise,} \end{cases}$$
and use $G_3$ to denote the semigroup of all the finite compositions of the $\delta$-average operators. Then [25, Lemma 4] can be stated as
Proposition 3.4 (Characterization of majorization). We have $x \succ y$ if and only if $y = gx$ for some $g \in G_3$.
The Muirhead inequality concerns the following function:
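$$T[y](x) := \sum_{\sigma \in P(N)} x_{\sigma(1)}^{y_1} x_{\sigma(2)}^{y_2} \cdots x_{\sigma(N)}^{y_N}, \qquad x, y \in \mathbb{R}^N_+.$$
Theorem 3.5 (Muirhead inequality). For any $x \in \mathbb{R}^N_+$, the function $y \mapsto T[y](x)$ is $G_3$-sub-invariant.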
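Proof. We need only to show that $a_{ij}^{\delta} y = z$ implies
$$T[y](x) \ge T[z](x).$$
If $y_i - y_j \le 2\delta$, then $y = z$ and $T[y](x) = T[z](x)$. We now consider the nontrivial case in which $y_i - y_j > 2\delta$ so that $z_i = y_i - \delta$, $z_j = y_j + \delta$ and $z_k = y_k$, $k \ne i, j$. Without loss of generality assuming $i < j$, and writing
$$\Pi_\sigma := x_{\sigma(1)}^{y_1} \cdots x_{\sigma(i-1)}^{y_{i-1}}\, x_{\sigma(i+1)}^{y_{i+1}} \cdots x_{\sigma(j-1)}^{y_{j-1}}\, x_{\sigma(j+1)}^{y_{j+1}} \cdots x_{\sigma(N)}^{y_N}$$
for the product of the factors not involving $i$ and $j$, we pair each permutation $\sigma$ with the permutation that exchanges the values $\sigma(i)$ and $\sigma(j)$ (which counts every term twice, hence the factor $\tfrac12$) and calculate
$$\begin{aligned}
T[y](x) - T[z](x)
&= \frac12 \sum_{\sigma \in P(N)} \Pi_\sigma \left( x_{\sigma(i)}^{y_i} x_{\sigma(j)}^{y_j} + x_{\sigma(j)}^{y_i} x_{\sigma(i)}^{y_j} - x_{\sigma(i)}^{y_i - \delta} x_{\sigma(j)}^{y_j + \delta} - x_{\sigma(j)}^{y_i - \delta} x_{\sigma(i)}^{y_j + \delta} \right) \\
&= \frac12 \sum_{\sigma \in P(N)} \Pi_\sigma\, x_{\sigma(i)}^{y_j} x_{\sigma(j)}^{y_j} \left( x_{\sigma(i)}^{y_i - y_j} + x_{\sigma(j)}^{y_i - y_j} - x_{\sigma(i)}^{y_i - y_j - \delta} x_{\sigma(j)}^{\delta} - x_{\sigma(j)}^{y_i - y_j - \delta} x_{\sigma(i)}^{\delta} \right) \\
&= \frac12 \sum_{\sigma \in P(N)} \Pi_\sigma\, x_{\sigma(i)}^{y_j} x_{\sigma(j)}^{y_j} \left( x_{\sigma(i)}^{y_i - y_j - \delta} - x_{\sigma(j)}^{y_i - y_j - \delta} \right) \left( x_{\sigma(i)}^{\delta} - x_{\sigma(j)}^{\delta} \right). 
\end{aligned} \tag{3.1}$$
Since $y_i - y_j - \delta > \delta > 0$, the last two factors always have the same sign. Hence all the summands are nonnegative and, therefore, $T[y](x) \ge T[z](x)$ as asserted.
For fixed $x \in \mathbb{R}^N_+$, the function $T[y](x)$ attains its minimum on any compact simplex $\{ y \in \mathbb{R}^N_+ : \langle y, \vec{1} \rangle = k \}$. Also, it is easy to check that $S_3(y) = \bar{y}\,\vec{1}$ is a $G_3$-symmetrization. Thus, by the symmetric extremal principle, for any $y, x \in \mathbb{R}^N_+$,
$$T[y](x) \ge T[\bar{y}\,\vec{1}](x). \tag{3.2}$$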
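The objects in this argument are easy to compute for small N. The sketch below is a brute-force illustration with assumed sample vectors x and y and hypothetical function names: it implements the delta-average operator, a majorization test, and T[y](x) by summing over all N! permutations, then checks one delta-average step and relation (3.2).

```python
from itertools import permutations
import math

def T(y, x):
    # T[y](x) = sum over all permutations s of prod_n x_{s(n)} ** y_n.
    return sum(math.prod(xs ** yn for xs, yn in zip(perm, y))
               for perm in permutations(x))

def delta_average(y, i, j, delta):
    # a_ij^delta: move delta from coordinate i to coordinate j
    # when y_i - y_j > 2 * delta; otherwise leave y unchanged.
    if y[i] - y[j] > 2 * delta:
        z = list(y)
        z[i] -= delta
        z[j] += delta
        return z
    return list(y)

def majorizes(x, y, tol=1e-12):
    # Partial sums of the decreasing rearrangements, with equal totals.
    xs, ys = sorted(x, reverse=True), sorted(y, reverse=True)
    partial = all(sum(xs[:k]) >= sum(ys[:k]) - tol for k in range(1, len(x)))
    return partial and abs(sum(xs) - sum(ys)) <= tol

x = [0.5, 1.0, 2.0, 3.5]          # assumed sample point with positive entries
y = [3.0, 1.0, 0.5, 0.0]          # assumed exponent vector
z = delta_average(y, 0, 3, 1.0)   # 3.0 - 0.0 > 2.0, so z = [2.0, 1.0, 0.5, 1.0]

y_majorizes_z = majorizes(y, z)
step_decreases = T(y, x) >= T(z, x)    # one step of the sub-invariance claim

ybar = [sum(y) / len(y)] * len(y)
bound_3_2 = T(y, x) >= T(ybar, x)      # relation (3.2)
```

The cost of `T` grows like N!, so this is only meant for checking small cases by hand.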
Relation (3.2) is very potent. Here are some attractive special cases:
Example 3.6. For $y = e^1$ we have $T[e^1](x) \ge T[\tfrac{1}{N}\vec{1}](x)$ or
$$(N-1)! \sum_{n=1}^{N} x_n \ge N!\, (x_1 \cdots x_N)^{1/N}.$$
Dividing both sides by $N!$ we get the AG-inequality.
Example 3.7. For $y = e^1 + e^2$ we have $T[e^1 + e^2](x) \ge T[\tfrac{2}{N}\vec{1}](x)$, which simplifies to
$$\sqrt{\frac{\sum_{n \ne m} x_n x_m}{N(N-1)}} \ge (x_1 \cdots x_N)^{1/N}.$$
Example 3.8. In general $T[e^1 + e^2 + \cdots + e^k](x) \ge T[\tfrac{k}{N}\vec{1}](x)$ gives us 1=k P x x ! 1 n1<