CSCI 1951-G – Optimization Methods in Finance Part 10: Conic Optimization
April 6, 2018
This material is covered in the textbook, Chapters 9 and 10; some of the material is taken from it.
Some of the figures are from S. Boyd and L. Vandenberghe's book Convex Optimization, https://web.stanford.edu/~boyd/cvxbook/.
Outline
1. Cones and conic optimization
2. Converting quadratic constraints into cone constraints
3. Benchmark-relative portfolio optimization
4. Semidefinite programming
5. Approximating covariance matrices
6. SDP and approximation algorithms
Cones

A set C is a cone if for every x ∈ C and θ ≥ 0, θx ∈ C.

Example: {(x, |x|) : x ∈ R} ⊂ R^2

Is this set convex?
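As a quick sanity check, here is a minimal Python sketch of this example (the membership helper in_C is ours, not from the slides): scaling by θ ≥ 0 keeps points in the set, but a midpoint escapes it, so the set is a cone that is not convex.

```python
# Check numerically that C = {(x, |x|) : x in R} is a cone but not convex.
def in_C(p):
    x, y = p
    return y == abs(x)

# Cone property: scaling a member by theta >= 0 stays in the set.
p = (3.0, 3.0)  # the point (x, |x|) with x = 3
assert all(in_C((t * p[0], t * p[1])) for t in (0.0, 0.5, 2.0))

# Not convex: the midpoint of (1, 1) and (-1, 1) is (0, 1), not in C.
a, b = (1.0, 1.0), (-1.0, 1.0)
mid = ((a[0] + b[0]) / 2, (a[1] + b[1]) / 2)
assert in_C(a) and in_C(b) and not in_C(mid)
```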
Convex cones

A set C is a convex cone if, for every x1, x2 ∈ C and θ1, θ2 ≥ 0,

    θ1 x1 + θ2 x2 ∈ C.

Example:

Figure 2.4: The pie slice shows all points of the form θ1 x1 + θ2 x2, where θ1, θ2 ≥ 0. The apex of the slice (which corresponds to θ1 = θ2 = 0) is at 0; its edges (which correspond to θ1 = 0 or θ2 = 0) pass through the points x1 and x2.
Conic optimization
Conic optimization problem in standard form:

    min  c^T x
    s.t. Ax = b
         x ∈ C

where C is a convex cone in a finite-dimensional vector space X.

Note: linear objective function, linear constraints. If X = R^n and C = R^n_+, then ...we get an LP!

Conic optimization is a unifying framework for:
• linear programming,
• second-order cone programming (SOCP),
• semidefinite programming (SDP).
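As an illustration of the LP special case, here is a hedged sketch (assuming SciPy is available; the data are made up, and scipy.optimize.linprog's default bounds encode exactly the cone constraint x ∈ R^n_+):

```python
import numpy as np
from scipy.optimize import linprog

# LP as a conic program: min c^T x  s.t.  Ax = b,  x in R^n_+.
c = np.array([1.0, 2.0])    # objective coefficients
A = np.array([[1.0, 1.0]])  # equality constraints Ax = b
b = np.array([1.0])

# Default bounds (0, None) impose x >= 0, i.e. x in the nonnegative orthant.
res = linprog(c, A_eq=A, b_eq=b)
# All weight goes to the cheaper coordinate: x = (1, 0), objective 1.
```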
Norm cones
Let ||·|| be any norm on R^(n-1).

The norm cone associated to ||·|| is the set

    C = {x = (x1, . . . , xn) : x1 ≥ ||(x2, . . . , xn)||}

It is a convex set.
Second-order cone in R^3

The second-order cone is the norm cone for the Euclidean norm ||·||_2.
Figure 2.10: Boundary of the second-order cone in R^3, {(x1, x2, t) : (x1^2 + x2^2)^(1/2) ≤ t}.

It is (as the name suggests) a convex cone.

What happens when we slice the second-order cone, i.e., when we take the intersection with a hyperplane? We obtain ellipsoidal sets.

Example 2.3. The second-order cone is the norm cone for the Euclidean norm, i.e.,

    C = {(x, t) ∈ R^(n+1) : ||x||_2 ≤ t}
      = {(x, t) : (x, t)^T [[I, 0], [0, -1]] (x, t) ≤ 0, t ≥ 0}.

The second-order cone is also known by several other names. It is called the quadratic cone, since it is defined by a quadratic inequality. It is also called the Lorentz cone or ice-cream cone. Figure 2.10 shows the second-order cone in R^3.
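A minimal membership test for the second-order cone (the helper name is ours, not from the text):

```python
import math

# Membership in the second-order cone {(x, t) : ||x||_2 <= t}.
def in_second_order_cone(x, t):
    return math.hypot(*x) <= t

assert in_second_order_cone((3.0, 4.0), 5.0)       # ||(3, 4)||_2 = 5
assert not in_second_order_cone((3.0, 4.0), 4.9)
# Cone property: scaling a member by 2 keeps it in the cone.
assert in_second_order_cone((6.0, 8.0), 10.0)
```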
Rewriting constraints
Let’s rewrite
    C = {x = (x1, . . . , xn) : x1 ≥ ||(x2, . . . , xn)||_2}

as

    x1 ≥ 0,   x1^2 - x2^2 - · · · - xn^2 ≥ 0

This is a combination of a linear and a quadratic constraint.
Also: convex quadratic constraints can be expressed as second-order cone membership constraints.
Rewriting constraints
A convex quadratic constraint:

    x^T P x + 2 q^T x + γ ≤ 0

Assume, w.l.o.g., that P is positive definite, so the constraint is ...convex.

Also assume, for technical reasons, that q^T P^{-1} q - γ ≥ 0.
Goal: rewrite the above constraint as a combination of linear and second-order cone membership constraints.
Rewriting constraints
Because P is positive definite, it has a Cholesky decomposition:

    ∃ invertible R s.t. P = R R^T.

Rewrite the constraint as:

    (R^T x)^T (R^T x) + 2 q^T x + γ ≤ 0

Let

    y = (y1, . . . , yn)^T = R^T x + R^{-1} q

The above is a bijection between x and y.
We are going to rewrite the constraint as a constraint on y.
Rewriting constraints
The constraint:

    (R^T x)^T (R^T x) + 2 q^T x + γ ≤ 0

It holds that

    y^T y = (R^T x)^T (R^T x) + 2 q^T x + q^T P^{-1} q

Since there is a bijection between y and x, the constraint can be satisfied if and only if

    ∃ y s.t. y = R^T x + R^{-1} q,   y^T y ≤ q^T P^{-1} q - γ
Rewriting constraints

The constraint is equivalent to:

    ∃ y s.t. y = R^T x + R^{-1} q,   y^T y ≤ q^T P^{-1} q - γ

Let us denote with y0 the square root of the r.h.s. of the right inequality:

    y0 = sqrt(q^T P^{-1} q - γ) ∈ R_+

Consider the vector (y0, y1, . . . , yn). The right inequality then is

    y0^2 ≥ y^T y = Σ_{i=1}^n y_i^2

Taking the square root on both sides:

    y0 ≥ sqrt(Σ_{i=1}^n y_i^2) = ||y||_2

This is the membership constraint for the second-order cone in R^(n+1).

Rewriting constraints
We rewrite the convex quadratic constraint
    x^T P x + 2 q^T x + γ ≤ 0

as

    (y1, . . . , yn)^T = R^T x + R^{-1} q
    y0 = sqrt(q^T P^{-1} q - γ) ∈ R_+
    (y0, y1, . . . , yn) ∈ C

which is a combination of linear and second-order cone membership constraints.
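The rewriting can be checked numerically. The sketch below uses made-up data (any positive definite P with q^T P^{-1} q - γ ≥ 0 works) and verifies the identity y^T y - y0^2 = x^T P x + 2 q^T x + γ, which is exactly why the quadratic constraint holds iff ||y||_2 ≤ y0:

```python
import numpy as np

# Made-up problem data for the check.
rng = np.random.default_rng(1)
A = rng.standard_normal((3, 3))
P = A @ A.T + 3 * np.eye(3)   # positive definite
q = rng.standard_normal(3)
gamma = -1.0                  # ensures q^T P^{-1} q - gamma >= 0

R = np.linalg.cholesky(P)     # P = R R^T, R invertible (lower triangular)
x = rng.standard_normal(3)

quad = x @ P @ x + 2 * q @ x + gamma            # original constraint value
y = R.T @ x + np.linalg.solve(R, q)             # y = R^T x + R^{-1} q
y0 = np.sqrt(q @ np.linalg.solve(P, q) - gamma)

# y^T y - y0^2 equals the quadratic constraint value, term by term.
assert abs((y @ y - y0**2) - quad) < 1e-9
```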
Benchmark-relative portfolio optimization
Given a benchmark strategy xB (e.g., an index), develop a portfolio x that tracks xB, but adds value by beating it.
I.e., we want a portfolio x with positive expected excess return:
    µ^T (x - xB) ≥ 0

and specifically want to maximize the expected excess return.
Challenge: balance expected excess return with its variance.
Tracking error and volatility constraints
The (predicted) tracking error of the portfolio x is
    TE(x) = sqrt((x - xB)^T Σ (x - xB))

It measures the variability of excess returns.
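A direct transcription of the tracking-error formula (the two-asset covariance matrix and portfolio weights below are made up for illustration):

```python
import numpy as np

# Predicted tracking error TE(x) = sqrt((x - x_B)^T Sigma (x - x_B)).
def tracking_error(x, x_B, Sigma):
    d = x - x_B
    return np.sqrt(d @ Sigma @ d)

Sigma = np.array([[0.04, 0.01],
                  [0.01, 0.09]])   # made-up covariance matrix
x_B = np.array([0.5, 0.5])         # made-up benchmark weights

assert tracking_error(x_B, x_B, Sigma) == 0.0  # holding the benchmark
te = tracking_error(np.array([0.7, 0.3]), x_B, Sigma)
```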
In benchmark-relative portfolio optimization, we solve mean-variance optimization w.r.t. the expected excess return and tracking error:
    max  µ^T (x - xB)
    s.t. (x - xB)^T Σ (x - xB) ≤ T^2
         Ax = b
Comparison with mean-variance optimization
We have seen MVO as:

    min  (1/2) x^T Σ x              max  µ^T x - (δ/2) x^T Σ x
    s.t. µ^T x ≥ R           or     s.t. Ax = b
         Ax = b
How do they differ from

    max  µ^T (x - xB)
    s.t. (x - xB)^T Σ (x - xB) ≤ T^2
         Ax = b

The latter is not a standard quadratic program: it has a nonlinear constraint.
    max  µ^T (x - xB)
    s.t. (x - xB)^T Σ (x - xB) ≤ T^2
         Ax = b

The nonlinear constraint is ...convex quadratic.
We can rewrite it as a combination of linear and second-order cone membership, and solve the resulting convex conic problem.
Semidefinite programming (SDP)

The variables are the entries of a symmetric matrix in the cone of positive semidefinite matrices.
Figure 2.12: Boundary of the positive semidefinite cone in S^2: x ≥ 0, z ≥ 0, xz ≥ y^2.

The set S^n_+ is a convex cone: if θ1, θ2 ≥ 0 and A, B ∈ S^n_+, then θ1 A + θ2 B ∈ S^n_+. This can be seen directly from the definition of positive semidefiniteness: for any x ∈ R^n, we have

    x^T (θ1 A + θ2 B) x = θ1 x^T A x + θ2 x^T B x ≥ 0,

if A ⪰ 0, B ⪰ 0 and θ1, θ2 ≥ 0.

Example 2.6. Positive semidefinite cone in S^2. We have

    X = [x y; y z] ∈ S^2_+   ⟺   x ≥ 0, z ≥ 0, xz ≥ y^2.

The boundary of this cone is shown in figure 2.12, plotted in R^3 as (x, y, z).
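The characterization in Example 2.6 can be checked by random sampling (a sketch; samples too close to the boundary are skipped to avoid floating-point ties):

```python
import numpy as np

# Verify: [[x, y], [y, z]] is PSD  iff  x >= 0, z >= 0 and xz >= y^2.
rng = np.random.default_rng(0)
checked = 0
for _ in range(1000):
    x, y, z = rng.uniform(-1, 1, size=3)
    # Skip near-boundary samples, where eigenvalue roundoff could flip signs.
    if abs(x * z - y * y) < 1e-9 or abs(x) < 1e-9 or abs(z) < 1e-9:
        continue
    X = np.array([[x, y], [y, z]])
    psd = np.linalg.eigvalsh(X).min() >= 0
    assert psd == (x >= 0 and z >= 0 and x * z >= y * y)
    checked += 1
```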
Application: approximating covariance matrices
Portfolio Optimization almost always requires covariance matrices.
These are not directly available, but are estimated.
Estimation of covariance matrices is a very challenging task, mathematically and computationally, because the matrices must satisfy various properties (e.g., symmetry, positive semidefiniteness).
To be efficient, many estimation methods do not impose problem-dependent constraints.

Typically, one is interested in finding the smallest distortion of the original estimate that satisfies the desired constraints.
Application: approximating covariance matrices
• Let Σ̂ ∈ S^n be an estimate of a covariance matrix;
• Σ̂ is symmetric (∈ S^n) but not positive semidefinite.

Goal: find the positive semidefinite matrix that is closest to Σ̂ w.r.t. the Frobenius norm:

    d_F(Σ, Σ̂) = sqrt( sum_{i,j} (Σ_ij - Σ̂_ij)^2 )

Formally, the nearest covariance matrix problem:

    min_Σ  d_F(Σ, Σ̂)
    s.t.   Σ ∈ C^n_s

where C^n_s is the cone of n × n symmetric and positive semidefinite matrices.
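For this basic version of the problem (no side constraints), the optimum is the projection of Σ̂ onto the PSD cone, which has a well-known closed form: symmetrize, then clip negative eigenvalues to zero. A sketch (the function name is ours; the indefinite Σ̂ below is Higham's classic example):

```python
import numpy as np

# Frobenius-nearest PSD matrix: clip negative eigenvalues to zero.
def nearest_psd(S):
    S = (S + S.T) / 2                 # enforce symmetry
    w, V = np.linalg.eigh(S)          # spectral decomposition
    return V @ np.diag(np.clip(w, 0, None)) @ V.T

S_hat = np.array([[1.0, 1.0, 0.0],
                  [1.0, 1.0, 1.0],
                  [0.0, 1.0, 1.0]])   # eigenvalues 1 - sqrt(2), 1, 1 + sqrt(2)
Sigma = nearest_psd(S_hat)
```

Since Σ̂ is symmetric here, the distance d_F(Σ, Σ̂) equals the magnitude of the clipped eigenvalue, sqrt(2) - 1.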
Application: approximating covariance matrices

    min_Σ  d_F(Σ, Σ̂)
    s.t.   Σ ∈ C^n_s

Introduce a dummy variable t and rewrite the problem as

    min  t
    s.t. d_F(Σ, Σ̂) ≤ t
         Σ ∈ C^n_s

The first constraint can be written as a second-order cone constraint, so the problem is transformed into a conic optimization problem.
Application: approximating covariance matrices
Variation of the problem with additional linear constraints:

Let E ⊆ {(i, j) : 1 ≤ i, j ≤ n}.

Let (ℓ_ij, u_ij), for (i, j) ∈ E, be lower/upper bounds to impose on the entries.

We want to solve:

    min_Σ  d_F(Σ, Σ̂)
    s.t.   ℓ_ij ≤ Σ_ij ≤ u_ij,  ∀(i, j) ∈ E
           Σ ∈ C^n_s
Application: approximating covariance matrices
For example, let Σ̂ be an estimate of a correlation matrix.

Correlation matrices have all diagonal entries equal to 1.
We want to solve the nearest correlation matrix problem.
We choose
    E = {(i, i) : 1 ≤ i ≤ n},   ℓ_ii = 1 = u_ii,  1 ≤ i ≤ n
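This nearest correlation matrix problem can be sketched without an SDP solver using Higham's alternating-projections method with Dykstra's correction (the function name is ours; an SDP solver handles the general bounded-entries variant directly). On Higham's classic example, the off-diagonal entries converge to approximately 0.7607 and 0.1573:

```python
import numpy as np

# Nearest correlation matrix via Dykstra-corrected alternating projections:
# alternate between the PSD cone and the unit-diagonal affine set.
def nearest_correlation(A, iters=200):
    Y = A.copy()
    dS = np.zeros_like(A)             # Dykstra correction term
    for _ in range(iters):
        R = Y - dS
        w, V = np.linalg.eigh((R + R.T) / 2)
        X = V @ np.diag(np.clip(w, 0, None)) @ V.T  # project onto PSD cone
        dS = X - R
        Y = X.copy()
        np.fill_diagonal(Y, 1.0)                    # project onto unit diagonal
    return Y

A = np.array([[1.0, 1.0, 0.0],
              [1.0, 1.0, 1.0],
              [0.0, 1.0, 1.0]])       # indefinite "correlation" estimate
Sigma = nearest_correlation(A)
```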
Application: approximating covariance matrices
Many other variants are possible:
• Force some entries of Σ̂ to remain the same in Σ;
• Weight the changes to different entries differently, because we trust some more than others;
• Impose lower bounds on the minimum eigenvalue of Σ, to reduce instability.

All of these can be easily solved with SDP software.
A different point of view on SDP
An n × n matrix A is positive semidefinite if and only if there are vectors x1, . . . , xn such that A_ij = x_i^T x_j.

We can then write a semidefinite program as a program involving only linear combinations of the inner products of the vectors x1, . . . , xn:

    min  Σ_{i,j ∈ [n]} c_ij x_i^T x_j
    s.t. Σ_{i,j ∈ [n]} a_ijk x_i^T x_j ≤ b_k,  ∀k
This form is particularly useful to develop approximation algorithms.
The MaxCut problem
Given a graph G = (V,E), output a 2-partition of V so as to maximize the number of edges crossing from one side to the other.
Integer quadratic program:
    max  Σ_{(i,j)∈E} (1 - v_i v_j) / 2
    s.t. v_i ∈ {-1, 1},  1 ≤ i ≤ n
The decision version of the problem is NP-complete.
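The objective can be sanity-checked by brute force on a tiny graph: (1 - v_i v_j)/2 is 1 exactly when edge (i, j) is cut, and 0 otherwise. For a triangle, the best 2-partition cuts 2 edges:

```python
from itertools import product

# Triangle graph on vertices {0, 1, 2}.
edges = [(0, 1), (1, 2), (0, 2)]

# MaxCut objective: (1 - v_i v_j)/2 counts cut edges.
def cut_value(v):
    return sum((1 - v[i] * v[j]) / 2 for i, j in edges)

# Enumerate all 2^3 assignments of {-1, +1}.
best = max(cut_value(v) for v in product((-1, 1), repeat=3))
assert best == 2.0
```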
The MaxCut problem
Steps for an approximation algorithm for MaxCut:
1. Relax the original problem to an SDP;
2. Solve the SDP;
3. Round the SDP solution to obtain an integer solution to the original problem.
The MaxCut problem
Integer quadratic program:

    max  Σ_{(i,j)∈E} (1 - v_i v_j) / 2
    s.t. v_i ∈ {-1, 1},  1 ≤ i ≤ n

SDP relaxation:

    max  Σ_{(i,j)∈E} (1 - v_i^T v_j) / 2
    s.t. ||v_i||_2^2 ≤ 1,  1 ≤ i ≤ n
         v_i ∈ R^n

It is a relaxation: the optimal objective value is at least as large as that of the original problem.
The MaxCut problem

    max  Σ_{(i,j)∈E} (1 - v_i^T v_j) / 2
    s.t. ||v_i||_2^2 ≤ 1,  1 ≤ i ≤ n
         v_i ∈ R^n

The optimal solution is a set of unit vectors in R^n.

To obtain a solution for the original problem, we need to round this solution and assign each vector to one value in {-1, 1}.

Goemans and Williamson, 1995: choose a random hyperplane that goes through the origin, and split the vectors depending on the side of the hyperplane they fall on.

Approximation ratio: 0.87856 - ε (essentially optimal).
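A sketch of the rounding step. The triangle "embedding" below is made up for illustration (three unit vectors at 120 degrees, which is in fact how an optimal SDP solution for the triangle looks): any hyperplane through the origin splits these vectors 2-1, so the rounded cut has 2 edges.

```python
import numpy as np

# Unit vectors at 0, 120, 240 degrees: a made-up SDP solution for a triangle.
angles = np.array([0.0, 2 * np.pi / 3, 4 * np.pi / 3])
vectors = np.column_stack([np.cos(angles), np.sin(angles)])
edges = [(0, 1), (1, 2), (0, 2)]

# Goemans-Williamson rounding: pick a random hyperplane through the origin
# (its normal r), and assign each vertex the side its vector falls on.
rng = np.random.default_rng(0)
r = rng.standard_normal(2)
side = np.sign(vectors @ r)
cut = sum(1 for i, j in edges if side[i] != side[j])
```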