Background on Convex Analysis

John Duchi

Prof. John Duchi Outline

I Convex sets 1.1 Definitions and examples 2.2 Basic properties 3.3 Projections onto convex sets 4.4 Separating and supporting hyperplanes II Convex functions 1.1 Definitions 2.2 Subgradients and directional derivatives 3.3 Optimality properties 4.4 Calculus rules

Prof. John Duchi Why convex analysis and optimization?

Consider problem

minimize f(x)subjecttox X. x 2 When is this (eciently) solvable?

I When things are convex

I If we can formulate a numerical problem as minimization of a f over a X, then (roughly) it is solvable

Prof. John Duchi Convex sets

Definition AsetC Rn is convex if for any x, y C ⇢ 2 tx +(1 t)y C for all t C 2 2

Prof. John Duchi Examples

Hyperplane: Let a Rn, b R, 2 2 n C := x R : a, x = b . { 2 h i }

n m Polyhedron: Let a1,a2,...,am R , b R , 2 2 n C := x : Ax b = x R : ai,x bi {  } { 2 h i }

Prof. John Duchi Examples

Norm balls: let be any norm, k·k n C := x R : x 1 { 2 k k }

Prof. John Duchi Basic properties

Intersections: If C , ↵ , are all convex sets, then ↵ 2A

C := C↵ is convex. ↵ \2A

Minkowski addition: If C, D are convex sets, then C + D := x + y : x C, y D is convex. { 2 2 }

Prof. John Duchi Basic properties

Convex hulls: If x1,...,xm are points,

m m Conv x ,...,x := t x : t 0, t =1 . { 1 m} i i i i ⇢ Xi=1 Xi=1

Prof. John Duchi Projections

Let C be a closed convex set. Define

2 ⇡C (x) := argmin x y 2 . y C {k k } 2 Existence: Assume x =0and that 0 C.Showthatifx is such 62 n that xn 2 infy C y 2 then xn is Cauchy. k k ! 2 k k

Prof. John Duchi Projections

Characterization of projection

2 ⇡C (x) := argmin x y 2 y C k k 2 n o We have ⇡C (x) is the projection of x onto C if and only if

⇡ (x) x, y ⇡ (x) 0 for all y C. h C C i 2

Prof. John Duchi Separation Properties

Separation of a point and convex set: Let C be closed convex and x C.Thenthereexistsaseparating hyperplane:thereis 62 v =0and a constant b such that 6 v, x >b and v, y b for all y C. h i h i 2 Proof: Show the stronger result

v, x v, y + x ⇡ (x) 2 . h ih i k C k2

Prof. John Duchi Separation Properties

Separation of two convex sets: Let C be convex and compact and D be closed convex. Then there is a non-zero separating hyperplane v inf v, x > sup v, y . x C h i y D h i 2 2 Proof: D C is closed convex and 0 D C 62

Prof. John Duchi Supporting hyperplanes

Supporting hyperplane: Ahyperplane x : v, x = b supports { h i } the set C if C is contained in the halfspace x : v, x b and for { h i } some y bd C we have v, y = b 2 h i

Prof. John Duchi Supporting hyperplanes

Let C be a convex set with x bd C.Thenthereexistsa 2 non-zero supporting hyperplane H passing through x.Thatis,a v =0such that 6 C y : v, y b and v, x = b. ⇢{ h i } h i Proof:

Prof. John Duchi Convex functions

A function f is convex if its domain dom f is a convex set and for all x, y dom f we have 2 f(tx +(1 t)y) tf(x)+(1 t)f(y) for all t [0, 1].  2 (Define f(z)=+ for z dom f) 1 62

Prof. John Duchi Convex functions and epigraphs

Duality between convex sets and functions. A function f is convex if and only if its

epi f := (x, t):f(x) t, t R {  2 } is convex

Prof. John Duchi Minima of convex functions

Why convex? Let x be a local minimizer of f on the convex set C.Thenglobal minmization:

f(x) f(y) for all y C.  2 Proof: Note that for y C, 2 f(x + t(y x)) = f((1 t)x + ty) (1 t)f(x)+tf(y). 

Prof. John Duchi Subgradients

Avectorg is a subgradient of f at x if

f(y) f(x)+ g, y x for all y. h i

Prof. John Duchi Subdi↵erential

The subdi↵erential (subgradient set) of f at x is

@f(x):= g : f(y) f(x)+ g, y x for all y . { h i }

Prof. John Duchi Subdi↵erential examples

Let f(x)= x = max x, x .Then | | { } 1 if x>0 @f(x)= 1 if x<0 8 <>[ 1, 1] if x =0. :>

Prof. John Duchi Existence of subgradients Theorem: Let x int dom f.Then@f(x) = 2 6 ; Proof: Supporting hyperplane to epi f at the point (x, f(x))

Prof. John Duchi Optimality properties

Subgradient optimality A point x minimizes f if and only if 0 @f(x).Immediate: 2 f(y) f(x)+ g, y x h i

Prof. John Duchi Advanced optimality properties Subgradient optimality Consider problem

minimize f(x)subjecttox C. x 2 Then x solves problem if and only if there exists g @f(x) such 2 that g, y x 0 for all y C. h i 2

Prof. John Duchi Subgradient calculus m Addition Let f1,...,fm be convex and f = i=1 fi.Then m m P @f(x)= @f (x)= g : g @f (x) i i i 2 i Xi=1 ⇢ Xi=1 (Extends to infinite sums, integrals, etc.)

Prof. John Duchi Subgradient calculus

n m n Composition Let A R ⇥ and f : R R be convex, with 2 ! h(x)=f(Ax).Then

@h(x)=AT @f(Ax)= AT g : g @f(Ax) . { 2 }

Prof. John Duchi Subgradient calculus: maxima

Let fi, i =1,...,m, be convex

f(x) = max fi(x) i

Then with I(x)= i : f (x)=f(x) = max f (x) , { i j j } @f(x) = Conv g : g @f (x),i I(x) . { i i 2 i 2 }

Prof. John Duchi Subgradient calculus: suprema

Let f : ↵ be a collection of convex functions, { ↵ 2A}

f(x)=supf↵(x). ↵ 2A Then if the supremum is attained and (x)= ↵ : f (x)=f(x) , A { ↵ } @f(x) Conv g : g @f (x),↵ (x) ⇢ { ↵ ↵ 2 ↵ 2A }

Prof. John Duchi Examples of subgradients Norms: Recall ` -norm, , 1 k·k1

x = max xj k k1 j | | Then

@ x = Conv ei : ei,x = x ei : ei,x = x k k1 { h i k k1}[{ h i k k1} n o

Prof. John Duchi Examples of subgradients General norms: Recall of norm k·k⇤ k·k y =sup x, y and x =sup x, y . k k⇤ x: x 1 h i k k y: y 1 h i k k k k⇤ Then @ x = y : y 1, y, x = x . k k { k k⇤  h i k k}

Prof. John Duchi