Part IB — Variational Principles

Based on lectures by P. K. Townsend
Notes taken by Dexter Chua
Easter 2015

These notes are not endorsed by the lecturers, and I have modified them (often significantly) after lectures. They are nowhere near accurate representations of what was actually lectured, and in particular, all errors are almost surely mine.

Stationary points for functions on $\mathbb{R}^n$. Necessary and sufficient conditions for minima and maxima. Importance of convexity. Variational problems with constraints; method of Lagrange multipliers. The Legendre Transform; need for convexity to ensure invertibility; illustrations from thermodynamics. [4]

The idea of a functional and a functional derivative. First variation for functionals, Euler-Lagrange equations, for both ordinary and partial differential equations. Use of Lagrange multipliers and multiplier functions. [3]

Fermat's principle; geodesics; least action principles, Lagrange's and Hamilton's equations for particles and fields. Noether theorems and first integrals, including two forms of Noether's theorem for ordinary differential equations (energy and momentum, for example). Interpretation in terms of conservation laws. [3]

Second variation for functionals; associated eigenvalue problem. [2]

Contents

0 Introduction
1 Multivariate calculus
  1.1 Stationary points
  1.2 Convex functions
    1.2.1 Convexity
    1.2.2 First-order convexity condition
    1.2.3 Second-order convexity condition
  1.3 Legendre transform
  1.4 Lagrange multipliers
2 Euler-Lagrange equation
  2.1 Functional derivatives
  2.2 First integrals
  2.3 Constrained variation of functionals
3 Hamilton's principle
  3.1 The Lagrangian
  3.2 The Hamiltonian
  3.3 Symmetries and Noether's theorem
4 Multivariate calculus of variations
5 The second variation
  5.1 The second variation
  5.2 Jacobi condition for local minima of $F[x]$
0 Introduction

Consider a light ray travelling towards a mirror and being reflected.

[Figure: a light ray reflected off a mirror at the point $z$.]

We see that the light ray travels towards the mirror, gets reflected at $z$, and hits the (invisible) eye. What determines the path taken? The usual answer would be that the reflected angle shall be the same as the incident angle. However, the ancient Greek mathematician Hero of Alexandria provided a different answer: the path of the light minimizes the total distance travelled.

We can assume that light travels in a straight line except when reflected. Then we can characterize the path by a single variable $z$, the point where the light ray hits the mirror. We let $L(z)$ be the length of the path, and we can solve for $z$ by setting $L'(z) = 0$.

This principle sounds reasonable: in the absence of mirrors, light travels in a straight line, which is the shortest path between two points. But is it always true that the shortest path is taken? No! We only considered a plane mirror, and this doesn't hold if we have, say, a spherical mirror. However, it turns out that in all cases, the path is a stationary point of the length function, i.e. $L'(z) = 0$.

Fermat took this principle further. Assuming that light travels at a finite speed, the shortest path is the path that takes the minimum time. Fermat's principle thus states that

  Light travels on the path that takes the shortest time.

This alternative formulation has the advantage that it applies to refraction as well. Light travels at different speeds in different media. Hence when it passes from one medium to another, it changes direction such that the total time taken is minimized.

We usually define the refractive index $n$ of a medium to be $n = 1/v$, where $v$ is the velocity of light in the medium. Then we can write the variational principle as

\[
  \text{minimize} \int_{\text{path}} n \, \mathrm{d}s,
\]

where $\mathrm{d}s$ is the path length element. This is easy to solve if we have two distinct media.
Since light travels in a straight line in each medium, we can simply characterize the path by the point where the light crosses the boundary. However, in the general case, we should be considering any possible path between two points. In this case, we can no longer use ordinary calculus, and need new tools: the calculus of variations.

In calculus of variations, the main objective is to find a function $x(t)$ that minimizes an integral $\int f(x) \, \mathrm{d}t$ for some function $f$. For example, we might want to minimize $\int (x^2 + x) \, \mathrm{d}t$. This differs greatly from ordinary minimization problems. In ordinary calculus, we minimize a function $f(x)$ over all possible values of $x \in \mathbb{R}$. However, in calculus of variations, we will be minimizing the integral $\int f(x) \, \mathrm{d}t$ over all possible functions $x(t)$.

1 Multivariate calculus

Before we start calculus of variations, we will first have a brief overview of minimization problems in ordinary calculus. Unless otherwise specified, $f$ will be a function $\mathbb{R}^n \to \mathbb{R}$. For convenience, we write the argument of $f$ as $\mathbf{x} = (x_1, \cdots, x_n)$ and $x = |\mathbf{x}|$. We will also assume that $f$ is sufficiently smooth for our purposes.

1.1 Stationary points

The quest of minimization starts with finding stationary points.

Definition (Stationary points). Stationary points are points in $\mathbb{R}^n$ for which $\nabla f = \mathbf{0}$, i.e.

\[
  \frac{\partial f}{\partial x_1} = \frac{\partial f}{\partial x_2} = \cdots = \frac{\partial f}{\partial x_n} = 0.
\]

All minima and maxima are stationary points, but knowing that a point is stationary is not sufficient to determine which type it is. To know more about the nature of a stationary point, we Taylor expand $f$ about such a point, which we assume is $\mathbf{0}$ for notational convenience. With all derivatives evaluated at $\mathbf{0}$, where $\nabla f$ vanishes, we have

\[
\begin{aligned}
  f(\mathbf{x}) &= f(\mathbf{0}) + \mathbf{x} \cdot \nabla f + \frac{1}{2} \sum_{i,j} x_i x_j \frac{\partial^2 f}{\partial x_i \partial x_j} + O(x^3) \\
                &= f(\mathbf{0}) + \frac{1}{2} \sum_{i,j} x_i x_j \frac{\partial^2 f}{\partial x_i \partial x_j} + O(x^3).
\end{aligned}
\]

The second term is so important that we have a name for it:

Definition (Hessian matrix).
The Hessian matrix is

\[
  H_{ij}(\mathbf{x}) = \frac{\partial^2 f}{\partial x_i \partial x_j}.
\]

Using summation notation, we can write our result as

\[
  f(\mathbf{x}) - f(\mathbf{0}) = \frac{1}{2} x_i H_{ij} x_j + O(x^3).
\]

Since $H$ is symmetric, it is diagonalizable. Thus after rotating our axes to a suitable coordinate system, we have

\[
  H'_{ij} =
  \begin{pmatrix}
    \lambda_1 & 0 & \cdots & 0 \\
    0 & \lambda_2 & \cdots & 0 \\
    \vdots & \vdots & \ddots & \vdots \\
    0 & 0 & \cdots & \lambda_n
  \end{pmatrix},
\]

where $\lambda_i$ are the eigenvalues of $H$. Since $H$ is real symmetric, these are all real. In our new coordinate system, we have

\[
  f(\mathbf{x}) - f(\mathbf{0}) = \frac{1}{2} \sum_{i=1}^{n} \lambda_i (x'_i)^2.
\]

This is useful information. If all eigenvalues $\lambda_i$ are positive, then $f(\mathbf{x}) - f(\mathbf{0})$ must be positive (for small $\mathbf{x}$). Hence our stationary point is a local minimum. Similarly, if all eigenvalues are negative, then it is a local maximum.

If there are mixed signs, say $\lambda_1 > 0$ and $\lambda_2 < 0$, then $f$ increases in the $x_1$ direction and decreases in the $x_2$ direction. In this case we say we have a saddle point.

If some $\lambda = 0$, then we have a degenerate stationary point. To identify the nature of this point, we must look at even higher derivatives.

In the special case where $n = 2$, we do not need to explicitly find the eigenvalues. We know that $\det H$ is the product of the two eigenvalues. Hence if $\det H$ is negative, the eigenvalues have different signs, and we have a saddle. If $\det H$ is positive, then the eigenvalues are of the same sign. To determine if it is a maximum or minimum, we can look at the trace of $H$, which is the sum of eigenvalues. If $\operatorname{tr} H$ is positive, then we have a local minimum. Otherwise, it is a local maximum.

Example. Let $f(x, y) = x^3 + y^3 - 3xy$. Then

\[
  \nabla f = 3(x^2 - y, \; y^2 - x).
\]

This is zero iff $x^2 = y$ and $y^2 = x$. This is satisfied iff $y^4 = y$. So either $y = 0$, or $y = 1$. So there are two stationary points: $(0, 0)$ and $(1, 1)$.

The Hessian matrix is

\[
  H =
  \begin{pmatrix}
    6x & -3 \\
    -3 & 6y
  \end{pmatrix},
\]

and we have

\[
  \det H = 9(4xy - 1), \qquad \operatorname{tr} H = 6(x + y).
\]

At $(0, 0)$, $\det H < 0$. So this is a saddle point. At $(1, 1)$, $\det H > 0$, $\operatorname{tr} H > 0$. So this is a local minimum.
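The determinant and trace test for $n = 2$ can be checked mechanically on the example above. The following is a minimal Python sketch, not part of the lectured material: the names `grad`, `hessian` and `classify` are illustrative choices, and the gradient and Hessian are hard-coded from the formulas in the example rather than computed symbolically.

```python
def grad(x, y):
    """Gradient of f(x, y) = x**3 + y**3 - 3*x*y."""
    return (3 * (x * x - y), 3 * (y * y - x))


def hessian(x, y):
    """Hessian of the same f, as ((f_xx, f_xy), (f_yx, f_yy))."""
    return ((6 * x, -3), (-3, 6 * y))


def classify(x, y):
    """Classify a stationary point of f using det H and tr H (n = 2 only)."""
    (a, b), (c, d) = hessian(x, y)
    det = a * d - b * c   # product of the two eigenvalues
    tr = a + d            # sum of the two eigenvalues
    if det < 0:
        return "saddle point"
    if det > 0:
        return "local minimum" if tr > 0 else "local maximum"
    return "degenerate"   # some eigenvalue is zero; higher derivatives needed


for point in [(0, 0), (1, 1)]:
    # Both points should have zero gradient before we classify them.
    assert grad(*point) == (0, 0)
    print(point, classify(*point))
```

The `degenerate` branch only reports that the second-order test is inconclusive; it does not attempt the higher-derivative analysis mentioned in the text.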
1.2 Convex functions

1.2.1 Convexity

Convex functions are an important class of functions with a lot of nice properties. For example, stationary points of convex functions are all minima, and a convex function can have at most one minimum value. To define convex functions, we need to first define a convex set.

Definition (Convex set). A set $S \subseteq \mathbb{R}^n$ is convex if for any distinct $\mathbf{x}, \mathbf{y} \in S$ and $t \in (0, 1)$, we have $(1 - t)\mathbf{x} + t\mathbf{y} \in S$. Alternatively, any line segment joining two points in $S$ lies completely within $S$.

[Figure: a non-convex set and a convex set.]

Definition (Convex function). A function $f: \mathbb{R}^n \to \mathbb{R}$ is convex if

(i) The domain $D(f)$ is convex

(ii) The function $f$ lies below (or on) all its chords, i.e.

\[
  f((1 - t)\mathbf{x} + t\mathbf{y}) \leq (1 - t) f(\mathbf{x}) + t f(\mathbf{y}) \tag{$*$}
\]

for all $\mathbf{x}, \mathbf{y} \in D(f)$, $t \in (0, 1)$.

A function is strictly convex if the inequality is strict for distinct $\mathbf{x}$ and $\mathbf{y}$, i.e.

\[
  f((1 - t)\mathbf{x} + t\mathbf{y}) < (1 - t) f(\mathbf{x}) + t f(\mathbf{y}).
\]

[Figure: graph of a convex function with the chord from $\mathbf{x}$ to $\mathbf{y}$; at $(1 - t)\mathbf{x} + t\mathbf{y}$ the chord value $(1 - t) f(\mathbf{x}) + t f(\mathbf{y})$ lies above the graph.]

A function $f$ is (strictly) concave iff $-f$ is (strictly) convex.
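The chord inequality $(*)$ lends itself to a quick numerical sanity check for functions of one variable. The following is a rough Python sketch under stated assumptions: the helper names `violates_chord` and `looks_convex` are hypothetical, points are sampled at random from an interval `[lo, hi]`, and a small tolerance `tol` absorbs floating-point error. Finding no violation does not prove convexity; a single violation does disprove it.

```python
import random


def violates_chord(f, x, y, t, tol=1e-9):
    """True if f((1-t)x + ty) exceeds the chord value (1-t)f(x) + t f(y)."""
    lhs = f((1 - t) * x + t * y)
    rhs = (1 - t) * f(x) + t * f(y)
    return lhs > rhs + tol


def looks_convex(f, lo, hi, samples=10_000, seed=0):
    """Randomly test the chord inequality (*) for f on [lo, hi].

    Returns False as soon as one violation is found, True otherwise.
    """
    rng = random.Random(seed)  # fixed seed so runs are reproducible
    for _ in range(samples):
        x = rng.uniform(lo, hi)
        y = rng.uniform(lo, hi)
        t = rng.uniform(0.0, 1.0)
        if violates_chord(f, x, y, t):
            return False
    return True


print(looks_convex(lambda x: x * x, -5, 5))    # x^2 is convex
print(looks_convex(lambda x: x ** 3, -5, 5))   # x^3 is not convex on [-5, 5]
print(looks_convex(lambda x: -x * x, -5, 5))   # -x^2 is concave, not convex
```

Applying the same check to $-f$ tests concavity, in line with the definition above.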
