ALEXANDROV'S THEOREM ON THE SECOND DERIVATIVES OF CONVEX FUNCTIONS VIA RADEMACHER'S THEOREM ON THE FIRST DERIVATIVES OF LIPSCHITZ FUNCTIONS

RALPH HOWARD
DEPARTMENT OF MATHEMATICS
UNIVERSITY OF SOUTH CAROLINA
COLUMBIA, S.C. 29208, USA
[email protected]

Date: May 1998.

These are notes from lectures given to the functional analysis seminar at the University of South Carolina.

Contents
1. Introduction
2. Rademacher's Theorem
3. A General Sard Type Theorem for Maps Between Spaces of the Same Dimension
3.1. A more general result
4. An Inverse Function Theorem for Continuous Functions Differentiable at a Single Point
5. Derivatives of Set-Valued Functions and Inverses of Lipschitz Functions
6. Alexandrov's Theorem
7. Symmetry of the Second Derivatives
References

1. Introduction

A basic result in the regularity theory of convex sets and functions is the theorem of Alexandrov that a convex function has second derivatives almost everywhere. These notes give a proof of this following the ideas in the appendix of the article [4] of Crandall, Ishii, and Lions, who attribute the main idea of the proof to F. Mignot [5]. To make the notes more self-contained I have included a proof of Rademacher's theorem on the differentiability almost everywhere of Lipschitz functions, following the presentation in the book [8] of Ziemer (which I warmly recommend to anyone wanting to learn about the pointwise behavior of functions in Sobolev spaces or of bounded variation). Actually a slight generalization of Alexandrov's theorem is given in Theorem 5.3, which shows that set-valued functions that are inverses to Lipschitz functions are differentiable almost everywhere.

To simplify notation I have assumed that all functions have all of $\mathbb{R}^n$ as their domain. It is straightforward to adapt these proofs to locally Lipschitz functions or convex functions defined on convex open subsets of $\mathbb{R}^n$.

As to notation.
If $x, y \in \mathbb{R}^n$ then the inner product is denoted as usual by either $x \cdot y$ or $\langle x, y \rangle$. Explicitly, if $x = (x_1, \dots, x_n)$ and $y = (y_1, \dots, y_n)$ then
\[ x \cdot y = \langle x, y \rangle = x_1 y_1 + \cdots + x_n y_n. \]
The norm of $x$ is $\|x\| = \sqrt{x \cdot x}$. Lebesgue measure in $\mathbb{R}^n$ will be denoted by $\mathcal{L}^n$ and integrals with respect to this measure will be written as $\int_{\mathbb{R}^n} f(x)\,dx$. If $x_0 \in \mathbb{R}^n$ and $r > 0$ then the open and closed balls about $x_0$ will be denoted by
\[ B(x_0, r) := \{x \in \mathbb{R}^n : \|x - x_0\| < r\}, \qquad \overline{B}(x_0, r) := \{x \in \mathbb{R}^n : \|x - x_0\| \le r\}. \]

2. Rademacher's Theorem

We first review a little about Lipschitz functions in one variable. The following is a special case of a theorem of Lebesgue.

2.1. Theorem. Let $f \colon \mathbb{R} \to \mathbb{R}$ satisfy $|f(x_1) - f(x_0)| \le M |x_1 - x_0|$. Then the derivative $f'(t)$ exists for almost all $t$ and
\[ |f'(t)| \le M \]
holds at all points where it does exist. Also for $a < b$
\[ \int_a^b f'(t)\,dt = f(b) - f(a). \]

Note if $f \colon \mathbb{R} \to \mathbb{R}$ is Lipschitz and $\varphi \in C_0^\infty(\mathbb{R})$ then the product $\varphi(t) f(t)$ is also Lipschitz, and it vanishes outside a bounded interval. Applying the last result to $\varphi f$ on an interval containing the support of $\varphi$, and using $(\varphi f)' = \varphi' f + \varphi f'$ almost everywhere, gives
\[ \int_{\mathbb{R}} f'(t) \varphi(t)\,dt = -\int_{\mathbb{R}} f(t) \varphi'(t)\,dt. \]

Now if $f \colon \mathbb{R}^n \to \mathbb{R}$ then denote by $D_i f$ the partial derivative
\[ D_i f(x) = \frac{\partial f}{\partial x_i}(x) \]
at points where this partial derivative exists. Let $Df(x)$ denote
\[ Df(x) = (D_1 f(x), \dots, D_n f(x)). \]

2.2. Proposition. If $f \colon \mathbb{R}^n \to \mathbb{R}$ is Lipschitz, say $\|f(x_1) - f(x_0)\| \le M \|x_1 - x_0\|$, then $Df(x)$ exists for almost all $x \in \mathbb{R}^n$. Moreover all the partial derivatives $D_i f$ satisfy
\[ |D_i f(x)| \le M \]
at points where they exist. Thus
\[ \|Df(x)\| \le \sqrt{n}\, M. \tag{2.1} \]
Finally if $\varphi \in C_0^\infty(\mathbb{R}^n)$ then
\[ \int_{\mathbb{R}^n} D_i f(x) \varphi(x)\,dx = -\int_{\mathbb{R}^n} f(x) D_i \varphi(x)\,dx. \tag{2.2} \]

Proof. We show that $D_1 f(x)$ exists almost everywhere, the argument for $D_i f$ being identical. Write $x = (x_1, x_2, \dots, x_n)$ as $x = (x_1, x')$ where $x' = (x_2, \dots, x_n)$. Then for any $x' \in \mathbb{R}^{n-1}$ let
\[ N_{x'} := \{x_1 \in \mathbb{R} : D_1 f(x_1, x') \text{ does not exist}\}. \]
Then by the one variable result $N_{x'}$ is a set of measure zero in $\mathbb{R}$ for all $x' \in \mathbb{R}^{n-1}$.
Therefore by Fubini's theorem the set
\[ N = \bigcup_{x' \in \mathbb{R}^{n-1}} N_{x'} = \{x \in \mathbb{R}^n : D_1 f(x) \text{ does not exist}\} \]
is a set of measure zero. That $|D_i f(x)| \le M$ at points where it exists is clear (or follows from the one dimensional result). At points where $Df(x)$ exists
\[ \|Df(x)\| = \sqrt{D_1 f(x)^2 + \cdots + D_n f(x)^2} \le \sqrt{M^2 + \cdots + M^2} = \sqrt{n}\, M. \]
Finally we show (2.2) in the case of $i = 1$. Using the notation above, Fubini's theorem, and the one variable integration by parts formula, we have
\begin{align*}
\int_{\mathbb{R}^n} D_1 f(x) \varphi(x)\,dx
  &= \int_{\mathbb{R}^{n-1}} \int_{\mathbb{R}} D_1 f(x_1, x') \varphi(x_1, x')\,dx_1\,dx' \\
  &= -\int_{\mathbb{R}^{n-1}} \int_{\mathbb{R}} f(x_1, x') D_1 \varphi(x_1, x')\,dx_1\,dx' \\
  &= -\int_{\mathbb{R}^n} f(x) D_1 \varphi(x)\,dx.
\end{align*}
This completes the proof.

2.3. Definition. Let $f \colon \mathbb{R}^n \to \mathbb{R}$, then for a fixed vector $v \in \mathbb{R}^n$ define
\[ df(x, v) := \lim_{t \to 0} \frac{f(x + tv) - f(x)}{t}. \]
When this limit exists it is the directional derivative of $f$ in the direction of $v$ at the point $x$.

2.4. Proposition. Let $f \colon \mathbb{R}^n \to \mathbb{R}$ be Lipschitz and let $v \in \mathbb{R}^n$ be a fixed vector. Then $df(x, v)$ exists for almost all $x \in \mathbb{R}^n$ and is given by the formula
\[ df(x, v) = Df(x) \cdot v \tag{2.3} \]
for almost all $x$.

Proof. Note if $v = e_1$ where $e_1, \dots, e_n$ is the standard coordinate basis of $\mathbb{R}^n$ then $df(x, v) = df(x, e_1) = D_1 f(x)$ and the fact that $df(x, v)$ exists almost everywhere follows from Proposition 2.2. In the general case if $v \ne 0$ (and the case $v = 0$ is trivial) there is a linear coordinate system $\xi_1, \dots, \xi_n$ on $\mathbb{R}^n$ so that $df(x, v) = \frac{\partial f}{\partial \xi_1}$. But again Proposition 2.2 can be used to see that $df(x, v)$ exists for almost all $x \in \mathbb{R}^n$.

To see that the formula (2.3) holds let $\varphi \in C_0^\infty(\mathbb{R}^n)$. Then as $\varphi$ is smooth the usual form of the chain rule implies $d\varphi(x, v) = D\varphi(x) \cdot v$. Let $M$ be the Lipschitz constant of $f$. Then
\[ \left| \frac{f(x + tv) - f(x)}{t} \varphi(x) \right| \le \frac{M \|tv\|}{|t|} |\varphi(x)| \le M \|v\| \|\varphi\|_{L^\infty}. \]
Therefore for $0 < |t| \le 1$ the function $x \mapsto \frac{f(x + tv) - f(x)}{t} \varphi(x)$ is uniformly bounded and has compact support.
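
As a brief aside on why this bound is enough for a dominated convergence argument, write $K$ for the support of $\varphi$ (the symbol $K$ is notation introduced only for this remark). The estimate above can then be sketched as
\[ \left| \frac{f(x + tv) - f(x)}{t} \varphi(x) \right| \le M \|v\| \|\varphi\|_{L^\infty} \mathbf{1}_K(x) \quad \text{for all } t \ne 0, \qquad \int_{\mathbb{R}^n} M \|v\| \|\varphi\|_{L^\infty} \mathbf{1}_K(x)\,dx = M \|v\| \|\varphi\|_{L^\infty} \mathcal{L}^n(K) < \infty, \]
so the difference quotients multiplied by $\varphi$ are dominated by one fixed integrable function, and they converge to $df(x, v) \varphi(x)$ at every $x$ where $df(x, v)$ exists, which is almost every $x$.
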
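Thus by the dominated convergence theorem and the version of integration by parts given in Proposition 2.2
\begin{align*}
\int_{\mathbb{R}^n} df(x, v) \varphi(x)\,dx
  &= \lim_{t \to 0} \int_{\mathbb{R}^n} \frac{f(x + tv) - f(x)}{t} \varphi(x)\,dx \\
  &= \lim_{t \to 0} \frac{1}{t} \left( \int_{\mathbb{R}^n} f(x + tv) \varphi(x)\,dx - \int_{\mathbb{R}^n} f(x) \varphi(x)\,dx \right) \\
  &= \lim_{t \to 0} \frac{1}{t} \left( \int_{\mathbb{R}^n} f(x) \varphi(x - tv)\,dx - \int_{\mathbb{R}^n} f(x) \varphi(x)\,dx \right) \\
  &= \lim_{t \to 0} \int_{\mathbb{R}^n} f(x) \frac{\varphi(x - tv) - \varphi(x)}{t}\,dx \\
  &= -\int_{\mathbb{R}^n} f(x)\, d\varphi(x, v)\,dx \\
  &= -\int_{\mathbb{R}^n} f(x)\, D\varphi(x) \cdot v\,dx \\
  &= -\sum_{i=1}^n v_i \int_{\mathbb{R}^n} f(x) D_i \varphi(x)\,dx \\
  &= \sum_{i=1}^n v_i \int_{\mathbb{R}^n} D_i f(x) \varphi(x)\,dx \\
  &= \int_{\mathbb{R}^n} Df(x) \cdot v\, \varphi(x)\,dx.
\end{align*}
Thus $\int_{\mathbb{R}^n} df(x, v) \varphi(x)\,dx = \int_{\mathbb{R}^n} Df(x) \cdot v\, \varphi(x)\,dx$ for all $\varphi \in C_0^\infty(\mathbb{R}^n)$. Therefore $df(x, v) = Df(x) \cdot v$ for almost all $x \in \mathbb{R}^n$.

2.5. Definition. Let $f \colon \mathbb{R}^n \to \mathbb{R}^m$. Then $f$ is differentiable at $x_0 \in \mathbb{R}^n$ iff there is a linear map $L \colon \mathbb{R}^n \to \mathbb{R}^m$ so that
\[ f(x) - f(x_0) = L(x - x_0) + o(\|x - x_0\|). \]
In this case the linear map $L$ is easily seen to be unique and will be denoted by $f'(x_0)$.

By $o(\|x - x_0\|)$ we mean a function of the form $\|x - x_0\|\, g(x, x_0)$ where $\lim_{x \to x_0} g(x, x_0) = 0$. This definition could be given a little more formally by letting $S^{n-1} = \{u \in \mathbb{R}^n : \|u\| = 1\}$ be the unit sphere in $\mathbb{R}^n$. Then $f \colon \mathbb{R}^n \to \mathbb{R}^m$ is differentiable at $x_0$ with $f'(x_0) = L$ iff for all $\varepsilon > 0$ there is a $\delta > 0$ so that for all $u \in S^{n-1}$
\[ 0 < |t| \le \delta \quad \text{implies} \quad \left\| \frac{f(x_0 + tu) - f(x_0)}{t} - Lu \right\| < \varepsilon. \tag{2.4} \]

2.6. Theorem (Rademacher [6] 1919). If $f \colon \mathbb{R}^n \to \mathbb{R}^m$ is Lipschitz, say $\|f(x_1) - f(x_0)\| \le M \|x_1 - x_0\|$, then the derivative $f'(x)$ exists for almost all $x \in \mathbb{R}^n$. In the case $m = 1$, so that $f \colon \mathbb{R}^n \to \mathbb{R}$, then $f'(x)$ is given by
\[ f'(x) v = Df(x) \cdot v \]
for almost all $x \in \mathbb{R}^n$.

Proof. We first consider the case when $m = 1$ so that $f$ is scalar valued.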
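
As a numerical aside (it plays no role in the proof), the uniform condition (2.4) and the formula $f'(x)v = Df(x) \cdot v$ can be checked on a concrete example. The sketch below assumes the function $f(x) = |x_1| + \|x\|$, which is Lipschitz with constant $2$ and fails to be differentiable on the hyperplane $x_1 = 0$; this choice of $f$, the sample point, and all function names are illustrative assumptions, not part of the notes. Sampling finitely many unit vectors only approximates the requirement "for all $u \in S^{n-1}$".

```python
import math
import random


def f(x):
    """Illustrative Lipschitz function (an assumption): f(x) = |x_1| + ||x||."""
    return abs(x[0]) + math.sqrt(sum(c * c for c in x))


def Df(x):
    """Vector of partial derivatives of f; only valid where x_1 != 0."""
    r = math.sqrt(sum(c * c for c in x))
    grad = [c / r for c in x]
    grad[0] += math.copysign(1.0, x[0])
    return grad


def quotient(x, u, t):
    """Difference quotient (f(x + t u) - f(x)) / t appearing in (2.4)."""
    shifted = [xi + t * ui for xi, ui in zip(x, u)]
    return (f(shifted) - f(x)) / t


def random_unit_vectors(n, count, seed=0):
    """Sample `count` unit vectors in R^n by normalizing Gaussian vectors."""
    rng = random.Random(seed)
    vectors = []
    for _ in range(count):
        g = [rng.gauss(0.0, 1.0) for _ in range(n)]
        norm = math.sqrt(sum(c * c for c in g))
        vectors.append([c / norm for c in g])
    return vectors


def sup_error(x, t, directions):
    """Largest |quotient - Df(x).u| over the sampled u, for both t and -t."""
    grad = Df(x)
    worst = 0.0
    for u in directions:
        target = sum(g * c for g, c in zip(grad, u))
        for s in (t, -t):
            worst = max(worst, abs(quotient(x, u, s) - target))
    return worst


if __name__ == "__main__":
    x0 = [1.0, -2.0, 0.5]            # a point with x_1 != 0
    dirs = random_unit_vectors(3, 500)
    for t in (1e-1, 1e-2, 1e-3):
        print(t, sup_error(x0, t, dirs))

    # On the hyperplane x_1 = 0 the one-sided quotients in direction e_1 disagree.
    bad = [0.0, 1.0, 0.0]
    e1 = [1.0, 0.0, 0.0]
    print(quotient(bad, e1, 1e-3), quotient(bad, e1, -1e-3))
```

At the point with $x_1 \ne 0$ the sampled error shrinks as $t$ decreases, as (2.4) requires, while at the point with $x_1 = 0$ the quotients for $t > 0$ and $t < 0$ stay far apart. That point lies in a set of measure zero, so this is consistent with the theorem.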
