Calculus of Variations

In this chapter, we discuss the basics of the Calculus of Variations. This will provide us with the mathematical language, and the key tools, necessary for introducing and utilizing the Lagrangian formalism.

Functionals

The central notion of the Calculus of Variations is a functional, a real-valued function defined on a certain class of real- or complex-valued functions of one or many variables. The functional assigns a real value to one or more functions of the given class. We will be interested in functionals of the following form:

$$ F[f] = \int_{x_a}^{x_b} g(f, f_x, x)\, dx . \tag{1} $$

Here $f$, the argument of the functional $F$, is a differentiable function, $x$ is the argument of the function $f$, $f_x \equiv f'(x)$ is the derivative of the function $f$, and $g$ is a certain fixed differentiable function of three variables. For the sake of brevity, we consider the case of one real function of one variable. Apart from being differentiable, the function $f(x)$ is typically subject to certain constraints. For example,

$$ f(x_a) = y_a , \qquad f(x_b) = y_b , \tag{2} $$

with $y_a$ and $y_b$ certain fixed numbers. Our definition (1) is straightforwardly generalized to the cases of more than one function $f$, more than one variable $x$, and higher-order (partial) derivatives.

Example 1. Length of a line in a plane. Suppose we have a smooth line in the $xy$-plane passing through two given points, $(x_a, y_a)$ and $(x_b, y_b)$, and specified by the equation

$$ y = f(x) . \tag{3} $$

The length of the line is a functional of f of the form (1). Indeed,

$$ l[f] = \int dl = \int \sqrt{(dx)^2 + (dy)^2} = \int \sqrt{(dx)^2 + (f_x\, dx)^2} = \int_{x_a}^{x_b} \sqrt{1 + (f_x)^2}\, dx . \tag{4} $$

We see that $l[f]$ has the form (1) with

$$ g(f, f_x, x) = \sqrt{1 + (f_x)^2} . \tag{5} $$

In this example, the function g does not depend explicitly on f and x, which reflects two translational symmetries: f → f + const and x → x + const.
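As a numerical illustration of Eq. (4), the following minimal sketch evaluates the length functional with a simple midpoint rule. The helper name `arc_length` and the two trial curves (a straight line and a parabola through the same endpoints) are assumptions made for this example only.

```python
import math

def arc_length(fx, xa, xb, n=10_000):
    """Midpoint-rule approximation of l[f] = integral of sqrt(1 + fx(x)**2) dx, Eq. (4)."""
    h = (xb - xa) / n
    total = 0.0
    for i in range(n):
        x = xa + (i + 0.5) * h          # midpoint of the i-th subinterval
        total += math.sqrt(1.0 + fx(x) ** 2)
    return total * h

# Straight line y = x from (0, 0) to (1, 1): fx = 1 everywhere.
line = arc_length(lambda x: 1.0, 0.0, 1.0)
# Parabola y = x**2 through the same two points: fx = 2x.
parabola = arc_length(lambda x: 2.0 * x, 0.0, 1.0)
```

Note that the functional takes only the derivative `fx` as input, in line with the fact that $g$ in Eq. (5) does not depend explicitly on $f$ or $x$.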

Example 2. Potential energy of a flexible cable suspended from two fixed points. Choose the $y$ axis perpendicular to the surface of the Earth, the cable being represented by a curve (3) in the $xy$ plane, with the constraint (2). The potential energy of the element $dl$ is proportional to its height $y$ and its length $dl$. Hence, the total potential energy is given by the integral

$$ U[f] = \int y\, dl = \int_{x_a}^{x_b} f \sqrt{1 + (f_x)^2}\, dx . \tag{6} $$

We do not care about the units, and thus set the pre-factor equal to unity. We see that $U[f]$ has the form (1) with

$$ g(f, f_x, x) = f \sqrt{1 + (f_x)^2} . \tag{7} $$

Here the function $g$ explicitly depends on $f$ and $f_x$, but not on $x$ (reflecting the translational symmetry $x \to x + \text{const}$).

Example 3. Length of a line in a surface. Suppose we have a smooth line $L$ in a smooth surface $S$, passing through two given points, $(x_a, y_a, z_a)$ and $(x_b, y_b, z_b)$. Let the surface $S$ and the line $L$ be specified by the equations

$$ z = s(x, y) \quad \text{(the surface)} , \tag{8} $$

$$ y = f(x) , \qquad z = s(x, f(x)) \quad \text{(the line in the surface)} . \tag{9} $$

Let us show that the length of the line is a functional of the form (1). Introducing the convenient notation

$$ s_x = \frac{\partial s(x, y)}{\partial x} , \qquad s_y = \frac{\partial s(x, y)}{\partial y} , \tag{10} $$

and using

$$ dy = f_x\, dx , \qquad dz = s_x\, dx + s_y\, dy = s_x\, dx + s_y f_x\, dx , \tag{11} $$

we get

$$ l[f] = \int dl = \int \sqrt{(dx)^2 + (dy)^2 + (dz)^2} = \int_{x_a}^{x_b} g(f, f_x, x)\, dx , \tag{12} $$

where

$$ g(f, f_x, x) = \sqrt{1 + f_x^2 + [s_x(x, f) + s_y(x, f)\, f_x]^2} . \tag{13} $$
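The integrand (13) can be checked numerically on a case where the answer is known in advance. The sketch below is only an illustration: the helper `surface_line_length`, the tilted plane $z = x + y$, and the line $y = 2x$ are all assumptions chosen so that the curve is a straight segment from $(0, 0, 0)$ to $(1, 2, 3)$, of length $\sqrt{14}$.

```python
import math

def surface_line_length(f, fx, sx, sy, xa, xb, n=10_000):
    """Midpoint-rule approximation of Eq. (12) with the integrand g of Eq. (13)."""
    h = (xb - xa) / n
    total = 0.0
    for i in range(n):
        x = xa + (i + 0.5) * h
        y = f(x)
        dz_dx = sx(x, y) + sy(x, y) * fx(x)   # Eq. (11): dz = (s_x + s_y f_x) dx
        total += math.sqrt(1.0 + fx(x) ** 2 + dz_dx ** 2)
    return total * h

# Hypothetical test surface: the plane z = s(x, y) = x + y, so s_x = s_y = 1.
# Hypothetical line in it: y = f(x) = 2x for 0 <= x <= 1.
length = surface_line_length(
    f=lambda x: 2.0 * x, fx=lambda x: 2.0,
    sx=lambda x, y: 1.0, sy=lambda x, y: 1.0,
    xa=0.0, xb=1.0,
)
```

For a curved surface one would pass the actual partial derivatives of $s$ as `sx` and `sy`; the quadrature itself stays the same.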

Extrema. Euler’s equation

A typical problem arising in connection with a functional F [f] is the problem of finding a function f that minimizes (or maximizes) the functional. With our examples, this problem comes from natural questions: What is the shortest distance between two fixed points in a given surface? What is the shape of the suspended cable? Calculus of Variations provides mathematical tools for solving the problem. Suppose the function f is a (local) minimum/maximum of the functional F . Then, for any small variation of the function f, the variation of the functional has to be sign-definite. Let us see what are the implications of this fact. We introduce the symbol δf(x) for an infinitesimal variation of the function f and the symbol δF for corresponding variation of the functional F . That is

δF = F [f + δf] − F [f] . (14)

With a simple observation that

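$$ f \to f + \delta f \quad \Rightarrow \quad f_x \to f_x + (\delta f)' , \tag{15} $$

we have

$$ \delta F = \int_{x_a}^{x_b} g(f + \delta f, f_x + (\delta f)', x)\, dx - \int_{x_a}^{x_b} g(f, f_x, x)\, dx . \tag{16} $$

Then we use the infinitesimal smallness of $\delta f$:

$$ g(f + \delta f, f_x + (\delta f)', x) \to g(f, f_x, x) + \frac{\partial g}{\partial f}\, \delta f + \frac{\partial g}{\partial f_x}\, (\delta f)' . \tag{17} $$

$$ \delta F = \int_{x_a}^{x_b} \frac{\partial g}{\partial f}\, \delta f\, dx + \int_{x_a}^{x_b} \frac{\partial g}{\partial f_x}\, (\delta f)'\, dx . \tag{18} $$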

In the second integral, we integrate by parts, with Eq. (2) taken into account [implying $\delta f(x_a) = \delta f(x_b) = 0$]:

$$ \int_{x_a}^{x_b} \frac{\partial g}{\partial f_x}\, (\delta f)'\, dx = - \int_{x_a}^{x_b} \left( \frac{d}{dx} \frac{\partial g}{\partial f_x} \right) \delta f\, dx . \tag{19} $$

We get

$$ \delta F = \int_{x_a}^{x_b} A(x)\, \delta f(x)\, dx , \tag{20} $$

where

$$ A(x) = \frac{\partial g}{\partial f} - \frac{d}{dx} \frac{\partial g}{\partial f_x} . \tag{21} $$

Equation (20) generalizes the notion of the differential of a multivariable function. Indeed, if we discretize the variable $x$, so that it becomes a label for discrete variables $f(x)$, then $\delta f(x)$ and $A(x)$ acquire (respectively) the meaning of the differential of the variable $f(x)$ and the corresponding partial derivative. In view of this close analogy with multivariable calculus, $A(x)$ is called the variational derivative of $F$ with respect to $f$ at the point $x$, with a convenient symbolic notation (note the importance of keeping the “label” $x$ in the denominator; the symbolic character of the notation is already clear from the dimensionality of the variational derivative: $[A] = [F][f]^{-1}[x]^{-1} \neq [F][f]^{-1}$)

$$ A(x) \equiv \frac{\delta F}{\delta f(x)} \quad \text{(symbolic notation for the variational derivative)} . \tag{22} $$
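Equation (20) can be tested numerically by comparing the actual change of a functional under a small variation with the linear prediction built from $A(x)$. The sketch below is a hedged illustration, not part of the original derivation: the integrand $g = f_x^2/2 + f^2/2$ (for which Eq. (21) gives $A = f - f_{xx}$), the trial function $f = x^2$, and the variation $\delta f = \epsilon \sin(\pi x)$ on $[0, 1]$ are all assumptions chosen for simplicity.

```python
import math

def integrate(func, xa, xb, n=4000):
    """Midpoint rule for a function of x."""
    h = (xb - xa) / n
    return h * sum(func(xa + (i + 0.5) * h) for i in range(n))

# Assumed integrand (not from the text): g = fx**2 / 2 + f**2 / 2,
# for which Eq. (21) gives A(x) = f(x) - fxx(x).
def F(f, fx):
    return integrate(lambda x: 0.5 * fx(x) ** 2 + 0.5 * f(x) ** 2, 0.0, 1.0)

def f(x):      # trial function
    return x ** 2

def fx(x):     # its first derivative
    return 2.0 * x

def fxx(x):    # its second derivative
    return 2.0

eps = 1e-4     # size of the variation

def df(x):     # variation; vanishes at x = 0 and x = 1, as Eq. (2) requires
    return eps * math.sin(math.pi * x)

def dfx(x):    # derivative of the variation
    return eps * math.pi * math.cos(math.pi * x)

# Actual change of the functional, Eq. (14).
delta_F = F(lambda x: f(x) + df(x), lambda x: fx(x) + dfx(x)) - F(f, fx)
# Linear prediction, Eq. (20), with A = f - fxx.
linear = integrate(lambda x: (f(x) - fxx(x)) * df(x), 0.0, 1.0)
```

The two numbers should agree up to a remainder of second order in `eps`, which is the sense in which Eq. (20) is the "differential" of the functional.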

What really distinguishes the notion of the variational derivative from the generic notion of a partial derivative is the relation (21), which plays the key part in the Calculus of Variations.

According to Eq. (20), the variation $\delta F$ is a linear functional of $\delta f$. The linearity means that the sign of $\delta F$ changes upon the transformation $\delta f \to -\delta f$. Meanwhile, the condition of minimum/maximum requires that the sign of the variation not change. This is only possible if $A(x) \equiv 0$, and we arrive at the celebrated Euler’s equation, the central result of the Calculus of Variations:

$$ \frac{\partial g}{\partial f} - \frac{d}{dx} \frac{\partial g}{\partial f_x} = 0 . \tag{23} $$

As a simple illustration, let us make sure that the shortest distance between two points is a straight line. Applying Eq. (23) to the function $g$ of Example 1, we get

$$ f''(x) = 0 \quad \Rightarrow \quad f(x) = C_1 x + C_2 . \tag{24} $$

[The constants C1 and C2 are then fixed by the boundary conditions (2).]
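For completeness, the intermediate steps leading from Eq. (23) to Eq. (24) can be written out as follows, using only the function $g$ of Eq. (5):

```latex
\begin{align*}
\frac{\partial g}{\partial f} &= 0 , \qquad
\frac{\partial g}{\partial f_x} = \frac{f_x}{\sqrt{1 + f_x^2}} , \\
\frac{d}{dx} \frac{\partial g}{\partial f_x}
  &= \frac{f_{xx}}{\sqrt{1 + f_x^2}} - \frac{f_x^2 f_{xx}}{(1 + f_x^2)^{3/2}}
   = \frac{f_{xx}}{(1 + f_x^2)^{3/2}} , \\
0 &= \frac{\partial g}{\partial f} - \frac{d}{dx} \frac{\partial g}{\partial f_x}
   = - \frac{f_{xx}}{(1 + f_x^2)^{3/2}}
   \quad \Longrightarrow \quad f_{xx} = 0 .
\end{align*}
```

Since the denominator never vanishes, the only way to satisfy the equation is $f_{xx} = 0$.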

Alternate form of Euler’s equation

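If $f_x \neq 0$, then Euler’s equation is equivalent to

$$ \frac{d}{dx} \left( g - f_x \frac{\partial g}{\partial f_x} \right) - \frac{\partial g}{\partial x} = 0 . \tag{25} $$

Indeed, by the chain rule we have

$$ \frac{dg}{dx} = \frac{\partial g}{\partial f} f_x + \frac{\partial g}{\partial f_x} f_{xx} + \frac{\partial g}{\partial x} . \tag{26} $$

Taking into account that

$$ \frac{d}{dx} \left( f_x \frac{\partial g}{\partial f_x} \right) = f_{xx} \frac{\partial g}{\partial f_x} + f_x \frac{d}{dx} \frac{\partial g}{\partial f_x} , \tag{27} $$

we then get

$$ \frac{d}{dx} \left( g - f_x \frac{\partial g}{\partial f_x} \right) = \frac{\partial g}{\partial x} + f_x \left( \frac{\partial g}{\partial f} - \frac{d}{dx} \frac{\partial g}{\partial f_x} \right) , \tag{28} $$

and see that Eq. (25) is equivalent to

$$ f_x \left( \frac{\partial g}{\partial f} - \frac{d}{dx} \frac{\partial g}{\partial f_x} \right) = 0 , \tag{29} $$

which, in its turn, is equivalent to the original Euler’s equation (23) if $f_x \neq 0$. To put it differently, any solution of Eq. (23) is simultaneously a solution of Eq. (25), but, in contrast to Eq. (23), Eq. (25) also has a trivial solution $f = \text{const}$.

The alternate form of Euler’s equation is very important in the case when $g$ does not depend explicitly on $x$: $g = g(f, f_x)$. Here Eq. (25) simplifies to

$$ \frac{d}{dx} \left( g - f_x \frac{\partial g}{\partial f_x} \right) = 0 , \tag{30} $$

thus yielding the first integral of Euler’s equation:

$$ g - f_x \frac{\partial g}{\partial f_x} = \text{const} . \tag{31} $$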

Soap film. The equilibrium shape of a soap film corresponds to a (local) minimum of its surface energy, the latter being directly proportional to the surface area. Hence, to find the equilibrium shape of a soap film one has to find the (local) minimum of the surface area under the appropriate boundary conditions. Here we confine ourselves to the simplest example, in which the film is axially symmetric, being stretched between two parallel coaxial wire circles. Let the center of the first circle be at $x = 0$ and the center of the second circle be at $x = h$; the two radii are $R_a$ and $R_b$, respectively; $x$ is the symmetry axis. The surface is parameterized by the radius $r$ at a given $x$:

$$ r = f(x) , \qquad f(0) = R_a , \qquad f(h) = R_b . \tag{32} $$

The surface area, $A$, is then a functional of $f$:

$$ A[f] = \int 2\pi r\, dl = \int 2\pi r \sqrt{(dx)^2 + (dr)^2} = 2\pi \int_0^h f \sqrt{1 + f_x^2}\, dx . \tag{33} $$

We see that

$$ g = 2\pi f \sqrt{1 + f_x^2} \tag{34} $$

does not depend explicitly on $x$ (which reflects the translational symmetry of the problem), and we can enjoy Euler’s equation in the form of its first integral (31). This yields

$$ f \sqrt{1 + f_x^2} - \frac{f f_x^2}{\sqrt{1 + f_x^2}} = C_0 , \tag{35} $$

where $C_0$ is a constant. Straightforward algebra then leads to

$$ f = C_0 \sqrt{1 + f_x^2} \quad \Rightarrow \quad f^2 = C_0^2 (1 + f_x^2) \quad \Leftrightarrow \quad f_x = \pm \sqrt{(f/C_0)^2 - 1} . \tag{36} $$

This differential equation is easy to solve:

$$ \frac{df}{dx} = \pm \sqrt{(f/C_0)^2 - 1} , \tag{37} $$

$$ \int \frac{df}{\sqrt{(f/C_0)^2 - 1}} = \pm \int dx , \tag{38} $$

$$ C_0 \cosh^{-1}(f/C_0) = \pm x + \text{const} , \tag{39} $$

$$ f(x) = C_0 \cosh\left( \frac{x - x_0}{C_0} \right) . \tag{40} $$

Note that the sign is absorbed into the constant $x_0$. The solution (40) has two free constants; their values are fixed by the two boundary conditions:

$$ \begin{cases} \cosh(x_0/C_0) = R_a/C_0 , \\ \cosh[(h - x_0)/C_0] = R_b/C_0 . \end{cases} \tag{41} $$

For quite an instructive discussion of this solution (which goes beyond the scope of our course) please visit the following website: https://mathematicalgarden.wordpress.com/2014/09/06/soap-film-and-minimal-surface/
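The boundary conditions (41) are transcendental and in practice have to be solved numerically. The sketch below is a minimal illustration under stated assumptions: it treats only the symmetric case $R_a = R_b = R$, where $x_0 = h/2$ by symmetry and Eq. (41) reduces to $C_0 \cosh(h/2C_0) = R$; the values $h = R = 1$, the bisection bracket, and the helper names are all choices made for this example. For these values the reduced equation also has a second, smaller root; the bracket below selects the larger one.

```python
import math

def solve_c0(h, R, lo, hi, tol=1e-12):
    """Bisection for C0 in C0 * cosh(h / (2 * C0)) = R (symmetric case of Eq. (41)).

    Assumes the bracket [lo, hi] contains exactly one sign change.
    """
    def residual(c):
        return c * math.cosh(h / (2.0 * c)) - R
    if residual(lo) * residual(hi) > 0:
        raise ValueError("bracket does not contain a root")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if residual(lo) * residual(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

h, R = 1.0, 1.0            # illustrative values, not from the text
x0 = h / 2.0               # by symmetry when Ra = Rb
C0 = solve_c0(h, R, lo=0.5, hi=1.0)

def f(x):                  # Eq. (40)
    return C0 * math.cosh((x - x0) / C0)

def fx(x):                 # derivative of Eq. (40)
    return math.sinh((x - x0) / C0)

def first_integral(x):     # left-hand side of Eq. (35)
    root = math.sqrt(1.0 + fx(x) ** 2)
    return f(x) * root - f(x) * fx(x) ** 2 / root
```

Evaluating `first_integral` at several points between the rings gives a direct check that the left-hand side of Eq. (35) stays equal to `C0` along the solution.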

Functional of many functions

The generalization of the theory to the case of more than one function, $f \to f^{(1)}, f^{(2)}, \ldots, f^{(m)}$, is quite straightforward. Now we have

$$ F[f^{(1)}, f^{(2)}, \ldots, f^{(m)}] = \int_{x_a}^{x_b} g(f^{(1)}, f^{(2)}, \ldots, f^{(m)}, f_x^{(1)}, f_x^{(2)}, \ldots, f_x^{(m)}, x)\, dx , \tag{42} $$

$$ f^{(j)}(x_a) = y_a^{(j)} , \qquad f^{(j)}(x_b) = y_b^{(j)} \qquad (j = 1, 2, \ldots, m) . \tag{43} $$

Introducing the variations

$$ f^{(j)} \to f^{(j)} + \delta f^{(j)} , \tag{44} $$

and evaluating $\delta F$, we get

$$ \delta F = \sum_{j=1}^{m} \int_{x_a}^{x_b} A_j(x)\, \delta f^{(j)}(x)\, dx , \tag{45} $$

$$ A_j(x) = \frac{\partial g}{\partial f^{(j)}} - \frac{d}{dx} \frac{\partial g}{\partial f_x^{(j)}} . \tag{46} $$

Now if we are looking for a minimum/maximum of the functional $F$, then by definition we have $\delta F \geq 0$ for a minimum, or $\delta F \leq 0$ for a maximum. And this is only possible if

$$ A_j(x) \equiv 0 \qquad (j = 1, 2, \ldots, m) . \tag{47} $$

Indeed, if for some $j = j_0$ we had $A_{j_0}(x) \neq 0$, then we could set $\delta f^{(j)} \equiv 0$ for all $j$’s but $j_0$ and write

$$ \delta F = \int_{x_a}^{x_b} A_{j_0}(x)\, \delta f^{(j_0)}(x)\, dx . \tag{48} $$

And then we just repeat the reasoning we used to prove $A(x) \equiv 0$ in the case of just one function. Hence, for a set of functions $\{f^{(1)}, f^{(2)}, \ldots, f^{(m)}\}$ to correspond to a minimum/maximum of the functional (42) the following Euler’s equations have to be satisfied:

$$ \frac{\partial g}{\partial f^{(j)}} - \frac{d}{dx} \frac{\partial g}{\partial f_x^{(j)}} = 0 \qquad (j = 1, 2, \ldots, m) . \tag{49} $$

This system of $m$ differential equations defines the functions $\{f^{(1)}, f^{(2)}, \ldots, f^{(m)}\}$ up to $2m$ free constants. The constants are then fixed by the boundary conditions (43).

Important Theorem

What is the generalization, if any, of the alternate Euler’s equation (25) for the case of many functions? In the case of $m$ functions, we cannot replace the whole system of $m$ equations (49) with alternative ones, equivalent to (25). The generalization of Eq. (25) comes in the form of a theorem that might appear too abstract, but actually is extremely important and directly relevant to the conservation of energy in mechanics. Consider the quantity$^1$

$$ E\left( f^{(1)}(x), \ldots, f^{(m)}(x), f_x^{(1)}(x), \ldots, f_x^{(m)}(x), x \right) = \sum_{j=1}^{m} f_x^{(j)} \frac{\partial g}{\partial f_x^{(j)}} - g . \tag{50} $$

The theorem states that if the functions $\{f^{(1)}, f^{(2)}, \ldots, f^{(m)}\}$ extremize$^2$ the functional (42), then for the quantity $E(f^{(1)}(x), \ldots, f^{(m)}(x), f_x^{(1)}(x), \ldots, f_x^{(m)}(x), x)$ the following relation holds:

$$ \frac{dE}{dx} = - \frac{\partial g}{\partial x} . \tag{51} $$

In particular, if the function $g$ does not explicitly depend on $x$, then the right-hand side is zero and the quantity $E$ is just a constant.

The proof is a straightforward generalization of Eqs. (26)–(28):

$$ \frac{d}{dx} \sum_{j=1}^{m} f_x^{(j)} \frac{\partial g}{\partial f_x^{(j)}} = \sum_{j=1}^{m} f_{xx}^{(j)} \frac{\partial g}{\partial f_x^{(j)}} + \sum_{j=1}^{m} f_x^{(j)} \frac{d}{dx} \frac{\partial g}{\partial f_x^{(j)}} , \tag{52} $$

$$ \frac{dg}{dx} = \sum_{j=1}^{m} \frac{\partial g}{\partial f^{(j)}} f_x^{(j)} + \sum_{j=1}^{m} \frac{\partial g}{\partial f_x^{(j)}} f_{xx}^{(j)} + \frac{\partial g}{\partial x} . \tag{53} $$

Hence,

$$ \frac{dE}{dx} = - \frac{\partial g}{\partial x} + \sum_{j=1}^{m} f_x^{(j)} \left( \frac{d}{dx} \frac{\partial g}{\partial f_x^{(j)}} - \frac{\partial g}{\partial f^{(j)}} \right) . \tag{54} $$

So far we have performed only identity transformations, which are valid for any functions $\{f^{(1)}, f^{(2)}, \ldots, f^{(m)}\}$. Now we take into account that we are actually dealing with functions that extremize the functional (42) and thus satisfy Euler’s equations (49). Equations (49) state that each term in the parentheses on the r.h.s. of (54) is identically equal to zero, which completes the proof.
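The statement of the theorem is easy to test on a case where Euler’s equations can be solved in closed form. The following sketch is an illustration under assumptions that are not part of the text: it takes $m = 2$ and $g = \tfrac12 \big[(f_x^{(1)})^2 + (f_x^{(2)})^2\big] - \tfrac12 \big[(f^{(1)})^2 + 4 (f^{(2)})^2\big]$, which has no explicit $x$ dependence, and evaluates $E$ of Eq. (50) along one particular solution of Eqs. (49).

```python
import math

# Assumed example (not from the text): m = 2 and
# g = (f1x**2 + f2x**2) / 2 - (f1**2 + 4 * f2**2) / 2, with no explicit x.
# Euler's equations (49) then read f1xx = -f1 and f2xx = -4 * f2,
# solved for instance by f1 = cos(x) and f2 = sin(2x).
def g(f1, f2, f1x, f2x):
    return 0.5 * (f1x ** 2 + f2x ** 2) - 0.5 * (f1 ** 2 + 4.0 * f2 ** 2)

def E(x):
    """Eq. (50) evaluated on the extremal f1 = cos(x), f2 = sin(2x)."""
    f1, f1x = math.cos(x), -math.sin(x)
    f2, f2x = math.sin(2.0 * x), 2.0 * math.cos(2.0 * x)
    # For this g, the partial derivative of g with respect to f_x^(j) is f_x^(j).
    return f1x * f1x + f2x * f2x - g(f1, f2, f1x, f2x)

# Sample E at 50 points; by Eq. (51) it should not depend on x.
values = [E(0.1 * k) for k in range(50)]
```

If $x$ is read as time, this $g$ has the form "kinetic minus potential" and $E$ is the corresponding "kinetic plus potential" combination, which is the link to energy conservation mentioned above.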

Lagrange Multipliers

Let us look at the problem of the flexible cable, Example 2. Remarkably, the functional of this problem has precisely the same mathematical form as in the problem of the soap film. Nevertheless, the problem is distinctly different. It involves an extra feature, a constraint that the total length of the curve be fixed. To solve such problems we have to understand how to deal with global constraints. And the trick here is the same as in the calculus of functions: one can use Lagrange multipliers.

$^1$ We deliberately use the same letter and the same sign convention that we use for the energy.

$^2$ In the sense that $\delta F = 0$, be it a maximum, or minimum, or just a “saddle point.”

Suppose we want to find an extremum of the functional (1) on the set of functions subject to the boundary conditions (2) and also the constraint that

$$ Q[f] = C , \tag{55} $$

where $C$ is a given constant and

$$ Q[f] = \int_{x_a}^{x_b} q(f, f_x, x)\, dx \tag{56} $$

is a given functional. (In our Example 2, $Q[f]$ is the length of the curve.) Consider a new functional

$$ \tilde{F}[f] = F[f] - \lambda Q[f] , \tag{57} $$

where $\lambda$ is a fixed real number. Under the constraint (55), an extremum of the functional $F$ is simultaneously an extremum of the functional $\tilde{F}$, and vice versa, since the two differ by the fixed constant $\lambda C$. Now consider a genuine (i.e. unconstrained) extremum of the functional (57). At an arbitrary $\lambda$, this genuine extremum is not supposed to have anything to do with the extremum under the constraint. However, normally there exists a special choice of $\lambda = \lambda_*$ at which the function $f^{(\lambda)}$, the genuine extremum of the functional (57), satisfies the condition

$$ Q[f^{(\lambda_*)}] = C . \tag{58} $$

Clearly, the function $f^{(\lambda_*)}$ is the solution to the original problem, since the genuine extremum automatically implies the extremum under the given (and satisfied) constraint. Once we know the value of $\lambda_*$, the problem is solved. And to find $\lambda_*$ we simply solve the problem of the genuine extremum of the functional (57) with an arbitrary (undefined) $\lambda$, and then a posteriori find the proper value $\lambda = \lambda_*$ by requiring that the condition (58) be satisfied.

Solution to the problem of the flexible cable. We have (see Examples 1 and 2)

$$ g(f, f_x, x) = f \sqrt{1 + (f_x)^2} , \tag{59} $$

$$ q(f, f_x, x) = \sqrt{1 + (f_x)^2} . \tag{60} $$

Hence,

$$ \tilde{F}[f] = \int_{x_a}^{x_b} [g(f, f_x, x) - \lambda q(f, f_x, x)]\, dx = \int_{x_a}^{x_b} (f - \lambda) \sqrt{1 + (f_x)^2}\, dx . \tag{61} $$

With this particular form of the functional $\tilde{F}$ it is very convenient to introduce the new function

$$ \tilde{f} = f - \lambda , \tag{62} $$

in which case we get

$$ \tilde{F}[\tilde{f}] = \int_{x_a}^{x_b} \tilde{f} \sqrt{1 + (\tilde{f}_x)^2}\, dx . \tag{63} $$

Up to an irrelevant numeric pre-factor, this is the functional that we have minimized already in the context of the problem of the soap film. Hence, we just write the answer, Eq. (40):

$$ \tilde{f}(x) = C_0 \cosh\left( \frac{x - x_0}{C_0} \right) \quad \Rightarrow \quad f(x) = C_0 \cosh\left( \frac{x - x_0}{C_0} \right) + \lambda . \tag{64} $$

The values of the three constants, C0, x0, λ, should be chosen to satisfy two boundary conditions, Eq. (2), and the constraint that the total length of the curve is equal to a given value l.
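How the three constants are fixed in practice can be sketched numerically. The code below is an illustration under assumptions not made in the text: the supports are placed symmetrically at $(-a, y_0)$ and $(a, y_0)$, so that $x_0 = 0$ by symmetry; the length of the curve (64) between the supports is then $2 C_0 \sinh(a/C_0)$, which is solved for $C_0$ by bisection, and $\lambda$ follows from the boundary condition. The values $a = 1$, $y_0 = 0$, $l = 3$, the bracket, and the helper names are hypothetical.

```python
import math

def solve_cable(a, y0, length, lo=0.1, hi=10.0, tol=1e-12):
    """Constants of Eq. (64) for supports at (-a, y0) and (a, y0); x0 = 0 by symmetry.

    Solves 2 * C0 * sinh(a / C0) = length for C0 by bisection, assuming the
    root lies in [lo, hi], then gets lam from the boundary condition f(a) = y0.
    """
    def residual(c):
        return 2.0 * c * math.sinh(a / c) - length
    if residual(lo) * residual(hi) > 0:
        raise ValueError("bracket does not contain a root")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if residual(lo) * residual(mid) <= 0:
            hi = mid
        else:
            lo = mid
    c0 = 0.5 * (lo + hi)
    lam = y0 - c0 * math.cosh(a / c0)
    return c0, lam

a, y0, total_length = 1.0, 0.0, 3.0        # illustrative values
C0, lam = solve_cable(a, y0, total_length)

def f(x):                                  # Eq. (64) with x0 = 0
    return C0 * math.cosh(x / C0) + lam

def curve_length(n=20_000):
    """Midpoint-rule check of the constraint (55): integral of sqrt(1 + fx**2) over [-a, a]."""
    h = 2.0 * a / n
    return h * sum(math.sqrt(1.0 + math.sinh((-a + (i + 0.5) * h) / C0) ** 2)
                   for i in range(n))
```

The procedure mirrors the logic of the multiplier method: the extremal is found for arbitrary $\lambda$, and the constants are fixed afterwards from the boundary conditions and the constraint. A cable of length $l \le 2a$ cannot connect the supports, in which case the bracket check raises an error.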

It is also clear that the number of constraints, and, correspondingly, the number of associated Lagrange multipliers, can be arbitrarily large. The approach stays essentially the same. Indeed, enumerate all the constraints with the subscript $j$,

$$ Q_j[f] = C_j , \tag{65} $$

and construct the new functional

$$ \tilde{F}[f] = F[f] - \sum_j \lambda_j Q_j[f] . \tag{66} $$

Then the very same logic as before leads us to the problem of finding the absolute extremum of the functional (66), as a function of the yet unknown parameters $\{\lambda_j\}$, and then a posteriori fixing the values of these parameters to satisfy all the equations (65).
