Introduction to MOSEK and Conic Optimization

Data-Driven Analytics and Optimization for Energy Systems, 21 June 2019

Michał Adamaszek

www.mosek.com

Optimization

“Classical” continuous problem:

$$\begin{array}{ll}
\text{minimize} & f(x) \\
\text{subj. to} & g_i(x) \le 0, \quad i = 1, \ldots, m, \\
 & h_i(x) = 0, \quad i = 1, \ldots, k.
\end{array}$$

What about convexity? For example: $t \ge x \log(1 + x/y)$, $x, y > 0$.
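The convexity question above can be probed numerically. The sketch below is only a spot check, not a proof: it samples random pairs of points with $x, y > 0$ and tests the midpoint inequality for $f(x, y) = x \log(1 + x/y)$. The function names, sampling range and tolerance are assumptions made for illustration.

```python
import math
import random


def f(x, y):
    """f(x, y) = x * log(1 + x / y), defined for x, y > 0."""
    return x * math.log(1.0 + x / y)


def midpoint_convex_on_samples(func, trials=10000, seed=0):
    """Return True if func satisfies f(midpoint) <= average on all sampled pairs.

    Passing this test is evidence of convexity on the sampled box, not a proof.
    """
    rng = random.Random(seed)
    for _ in range(trials):
        x1, y1 = rng.uniform(0.1, 10.0), rng.uniform(0.1, 10.0)
        x2, y2 = rng.uniform(0.1, 10.0), rng.uniform(0.1, 10.0)
        mid = func((x1 + x2) / 2.0, (y1 + y2) / 2.0)
        avg = (func(x1, y1) + func(x2, y2)) / 2.0
        if mid > avg + 1e-9:
            return False
    return True


if __name__ == "__main__":
    print("x*log(1+x/y) passes the midpoint test:", midpoint_convex_on_samples(f))
    print("-x^2 passes the midpoint test:", midpoint_convex_on_samples(lambda x, y: -x * x))
```

A sampling test like this can only refute convexity. Certifying it is exactly what a representation built from convex cones provides.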

Disciplined Convex Programming (DCP)

New formulation:

$$\min\ c^T x \quad \text{s.t.} \quad Ax + b \in K$$

where $K$ is a set which is convex by construction.

• Example: $x \ge 5 \equiv x - 5 \in \mathbb{R}_{\ge 0}$
• Conic optimization: $K$ is a product of cones.
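To make the form $Ax + b \in K$ concrete, here is a minimal pure-Python sketch of a membership check for a product of cones. It encodes the slide's example $x_1 \ge 5$ together with a second, assumed constraint $x_1 \ge |x_2|$, so that $K = \mathbb{R}_{\ge 0} \times \mathcal{Q}^2$. All helper names are hypothetical and the tolerance is an arbitrary choice. This is not MOSEK API code.

```python
import math


def in_nonneg(v, tol=1e-9):
    """Linear cone: every entry of v is nonnegative."""
    return all(vi >= -tol for vi in v)


def in_quadratic(v, tol=1e-9):
    """Quadratic cone: v[0] >= Euclidean norm of v[1:]."""
    return v[0] >= math.sqrt(sum(vi * vi for vi in v[1:])) - tol


def affine(A, b, x):
    """Compute A x + b for a list-of-rows matrix A."""
    return [sum(a * xj for a, xj in zip(row, x)) + bi for row, bi in zip(A, b)]


def in_product_cone(v, blocks):
    """blocks is a list of (membership_test, block_size) pairs covering v."""
    pos = 0
    for test, size in blocks:
        if not test(v[pos:pos + size]):
            return False
        pos += size
    return pos == len(v)


# Row 1 encodes x1 - 5 in R_{>=0}; rows 2-3 encode (x1, x2) in Q^2.
A = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
b = [-5.0, 0.0, 0.0]
K = [(in_nonneg, 1), (in_quadratic, 2)]


def feasible(x):
    """True if A x + b lies in the product cone K."""
    return in_product_cone(affine(A, b, x), K)


if __name__ == "__main__":
    for point in ([6.0, 3.0], [4.0, 1.0], [6.0, 7.0]):
        print(point, feasible(point))
```

The point of the design is that convexity never has to be verified for a particular model: every block of $K$ is a convex cone, and an affine preimage of a convex set is convex.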

Cones in MOSEK

• linear: $K = \mathbb{R}_{\ge 0}$
• quadratic: $K = \{x \in \mathbb{R}^n : x_1 \ge \sqrt{x_2^2 + \cdots + x_n^2}\}$
• semidefinite: $K = \{X \in \mathbb{R}^{n \times n} : X = FF^T\}$
• exponential cone: $K = \{x \in \mathbb{R}^3 : x_1 \ge x_2 \exp(x_3/x_2),\ x_2 > 0\}$
• power cone: $K = \{x \in \mathbb{R}^3 : x_1^{p-1} x_2 \ge |x_3|^p,\ x_1, x_2 \ge 0\}$, $p > 1$
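The definitions above translate directly into membership tests. The following sketch, with hypothetical helper names and an arbitrary tolerance, checks the exponential and power cones exactly as written on the slide and illustrates why a matrix of the form $X = FF^T$ is positive semidefinite: $v^T X v = \|F^T v\|_2^2 \ge 0$.

```python
import math


def in_exp_cone(x, tol=1e-9):
    """Exponential cone as stated on the slide: x1 >= x2*exp(x3/x2), x2 > 0."""
    x1, x2, x3 = x
    return x2 > 0 and x1 >= x2 * math.exp(x3 / x2) - tol


def in_power_cone(x, p, tol=1e-9):
    """Power cone as stated on the slide: x1^(p-1) * x2 >= |x3|^p, x1, x2 >= 0, p > 1."""
    x1, x2, x3 = x
    if p <= 1 or x1 < 0 or x2 < 0:
        return False
    return x1 ** (p - 1) * x2 >= abs(x3) ** p - tol


def gram(F):
    """Return X = F F^T for a list-of-rows matrix F."""
    return [[sum(a * b for a, b in zip(ri, rj)) for rj in F] for ri in F]


def quad_form(X, v):
    """Return v^T X v."""
    n = len(v)
    return sum(v[i] * X[i][j] * v[j] for i in range(n) for j in range(n))


if __name__ == "__main__":
    print(in_exp_cone((3.0, 1.0, 1.0)))         # 3 >= e
    print(in_power_cone((2.0, 2.0, 2.0), p=2))  # 2 * 2 >= 2^2
    X = gram([[1.0, 2.0], [3.0, 4.0]])
    print(quad_form(X, [1.0, -1.0]))            # equals ||F^T v||^2
```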

[Figure: the quadratic cone $x_1 \ge \sqrt{x_2^2 + x_3^2}$, the rotated quadratic cone $2x_1x_2 \ge x_3^2$, and the exponential cone $x_1 \ge x_2 \exp(x_3/x_2)$.]

Conic representability

Conic Modeling Cheatsheet

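Cones
• Quadratic cone $\mathcal{Q}^n$: $x_1 \ge \sqrt{x_2^2 + \cdots + x_n^2}$
• Rotated quadratic cone $\mathcal{Q}_r^n$: $2x_1x_2 \ge x_3^2 + \cdots + x_n^2$, $x_1, x_2 \ge 0$
• Power cone $\mathcal{P}_3^{\alpha,1-\alpha}$, $\alpha \in (0,1)$: $x_1^{\alpha} x_2^{1-\alpha} \ge |x_3|$, $x_1, x_2 \ge 0$
• Exponential cone $K_{\exp}$: $x_1 \ge x_2 e^{x_3/x_2}$, $x_2 \ge 0$

Simple bounds
• $t \ge x^2$: $(0.5, t, x) \in \mathcal{Q}_r^3$
• $|t| \le \sqrt{x}$: $(0.5, x, t) \in \mathcal{Q}_r^3$
• $t \ge |x|$: $(t, x) \in \mathcal{Q}^2$
• $t \ge 1/x$, $x > 0$: $(x, t, \sqrt{2}) \in \mathcal{Q}_r^3$
• $t \ge |x|^p$, $p > 1$: $(t, 1, x) \in \mathcal{P}_3^{1/p,1-1/p}$
• $t \ge 1/x^p$, $x > 0$, $p > 0$: $(t, x, 1) \in \mathcal{P}_3^{1/(1+p),p/(1+p)}$
• $|t| \le x^p$, $x > 0$, $p \in (0,1)$: $(x, 1, t) \in \mathcal{P}_3^{p,1-p}$
• $t \ge |x|^p / y^{p-1}$, $y \ge 0$, $p > 1$: $(t, y, x) \in \mathcal{P}_3^{1/p,1-1/p}$
• $t \ge x^T x / y$, $y \ge 0$: $(0.5t, y, x) \in \mathcal{Q}_r^{n+2}$
• $t \ge e^x$: $(t, 1, x) \in K_{\exp}$
• $t \le \log x$: $(x, 1, t) \in K_{\exp}$
• $t \ge 1/\log x$, $x > 1$: $(u, t, \sqrt{2}) \in \mathcal{Q}_r^3$, $(x, 1, u) \in K_{\exp}$
• $t \ge a_1^{x_1} \cdots a_n^{x_n}$, $a_i > 0$: $(t, 1, \sum_i x_i \log a_i) \in K_{\exp}$
• $t \ge x e^x$, $x \ge 0$: $(t, x, u) \in K_{\exp}$, $(0.5, u, x) \in \mathcal{Q}_r^3$
• $t \ge \log(1 + e^x)$: $u + v \le 1$, $(u, 1, x - t) \in K_{\exp}$, $(v, 1, -t) \in K_{\exp}$
• $t \ge |x|^{3/2}$: $(t, 1, x) \in \mathcal{P}_3^{2/3,1/3}$
• $t \ge x^{3/2}$, $x \ge 0$: $(s, t, x), (x, 1/8, s) \in \mathcal{Q}_r^3$
• $t \ge 1/x^3$, $x > 0$: $(t, x, 1) \in \mathcal{P}_3^{1/4,3/4}$
• $0 \le t \le x^{2/5}$, $x \ge 0$: $(x, 1, t) \in \mathcal{P}_3^{2/5,3/5}$, $t \ge 0$

Entropy
• $t \le -x \log x$: $(1, x, t) \in K_{\exp}$
• $t \ge x \log(x/y)$: $(y, x, -t) \in K_{\exp}$
• $t \ge \log(1 + 1/x)$, $x > 0$: $(x + 1, u, \sqrt{2}) \in \mathcal{Q}_r^3$, $(1 - u, 1, -t) \in K_{\exp}$
• $t \le \log(1 - 1/x)$, $x > 1$: $(x, u, \sqrt{2}) \in \mathcal{Q}_r^3$, $(1 - u, 1, t) \in K_{\exp}$
• $t \ge x \log(1 + x/y)$, $x, y > 0$: $(y, x + y, u) \in K_{\exp}$, $(x + y, y, v) \in K_{\exp}$, $t + u + v = 0$

Means and averaging
• Log-sum-exp, $t \ge \log(\sum_i e^{x_i})$: $(z_i, 1, x_i - t) \in K_{\exp}$, $i = 1, \ldots, n$, $\sum_i z_i \le 1$
• Harmonic mean, $0 \le t \le n(\sum_i x_i^{-1})^{-1}$, $x_i > 0$: $(z_i, x_i, t) \in \mathcal{Q}_r^3$, $i = 1, \ldots, n$, $\sum_i z_i = nt/2$
• Geometric mean, $|t| \le (x_1 \cdots x_n)^{1/n}$, $x_i > 0$: $(z_i, x_i, z_{i+1}) \in \mathcal{P}_3^{1-1/i,1/i}$, $i = 2, \ldots, n$, $z_2 = x_1$, $z_{n+1} = t$
• $|t| \le \sqrt{xy}$, $x, y > 0$: $(x, y, \sqrt{2}t) \in \mathcal{Q}_r^3$
• Weighted geom. mean, $|t| \le x_1^{\alpha_1} \cdots x_n^{\alpha_n}$, $x_i > 0$, $\alpha_i > 0$, $\sum_i \alpha_i = 1$: $(z_i, x_i, z_{i+1}) \in \mathcal{P}_3^{1-\beta_i,\beta_i}$, $\beta_i = \alpha_i/(\alpha_1 + \cdots + \alpha_i)$, $i = 2, \ldots, n$, $z_2 = x_1$, $z_{n+1} = t$
• $|t| \le x^{1/4} y^{5/12} z^{1/3}$, $x, y, z \ge 0$: $(s, z, t) \in \mathcal{P}_3^{2/3,1/3}$, $(x, y, s) \in \mathcal{P}_3^{3/8,5/8}$

Norms, $x \in \mathbb{R}^n$
• $\|\cdot\|_1$, $t \ge \sum_i |x_i|$: $(z_i, x_i) \in \mathcal{Q}^2$, $t = \sum_i z_i$
• $\|\cdot\|_2$, $t \ge (\sum_i x_i^2)^{1/2}$: $(t, x) \in \mathcal{Q}^{n+1}$
• $\|\cdot\|_p$, $p > 1$, $t \ge (\sum_i |x_i|^p)^{1/p}$: $(z_i, t, x_i) \in \mathcal{P}_3^{1/p,1-1/p}$, $i = 1, \ldots, n$, $\sum_i z_i = t$

Geometry
• Bounding ball, $\min_x \max_i \|x - x_i\|_2$: $\min r$, $(r, x - x_i) \in \mathcal{Q}^{n+1}$
• Geometric median, $\min_x \sum_i \|x - x_i\|_2$: $\min \sum_i t_i$, $(t_i, x - x_i) \in \mathcal{Q}^{n+1}$
• Analytic center, $\max_x \sum_i \log(b_i - a_i^T x)$: $\max \sum_i t_i$, $(b_i - a_i^T x, 1, t_i) \in K_{\exp}$

Regression and fitting
• Regularized least squares, $\min_w \|Xw - y\|_2^2 + \lambda \|w\|_2^2$: $\min t + \lambda r$, $(0.5, t, Xw - y) \in \mathcal{Q}_r^{m+2}$, $(0.5, r, w) \in \mathcal{Q}_r^{n+2}$
• Max likelihood, $\max_p p_1^{a_1} \cdots p_n^{a_n}$: $\max \sum_i a_i t_i$, $(p_i, 1, t_i) \in K_{\exp}$
• Logistic cost function, $t \ge -\log(1/(1 + e^{-\theta^T x}))$: $u + v \le 1$, $(u, 1, -\theta^T x - t) \in K_{\exp}$, $(v, 1, -t) \in K_{\exp}$

Risk-return
$\Sigma \in \mathbb{R}^{n \times n}$: covariance, $\Sigma = LL^T$, $L \in \mathbb{R}^{n \times k}$.
• $\max_x \alpha^T x$ s.t. $x^T \Sigma x \le \gamma$: $\max_x \alpha^T x$, $(\sqrt{\gamma}, L^T x) \in \mathcal{Q}^{k+1}$
• $\max_x \alpha^T x - \delta x^T \Sigma x$: $\max_x \alpha^T x - \delta r$, $(0.5, r, L^T x) \in \mathcal{Q}_r^{k+2}$
• Risk plus $x^{1.5}$ impact cost, $t \ge \delta x^T \Sigma x + \beta \sum_i |x_i|^{3/2}$: $t \ge \delta r + \beta \sum_i u_i$, $(0.5, r, L^T x) \in \mathcal{Q}_r^{k+2}$, $(u_i, 1, x_i) \in \mathcal{P}_3^{2/3,1/3}$
• Risk in factor model, $\gamma \ge x^T (D + FSF^T) x$: $\gamma \ge t + s$, $(0.5, t, \sqrt{D}x) \in \mathcal{Q}_r^{n+2}$, $(0.5, s, U^T F^T x) \in \mathcal{Q}_r^{k+2}$, where $D$ is the specific risk (diag.), $F \in \mathbb{R}^{n \times k}$ the factor loads, and $S = UU^T$ the factor cov.

Convex quadratic problems
Let $\Sigma \in \mathbb{R}^{n \times n}$, symmetric, p.s.d. Find $\Sigma = LL^T$, $L \in \mathbb{R}^{n \times k}$ (Cholesky factor). Then $x^T \Sigma x = \|L^T x\|_2^2$.
• $t \ge \frac{1}{2} x^T \Sigma x$: $(1, t, L^T x) \in \mathcal{Q}_r^{k+2}$
• $t \ge \sqrt{x^T \Sigma x}$: $(t, L^T x) \in \mathcal{Q}^{k+1}$
• $\frac{1}{2} x^T \Sigma x + p^T x + q \le 0$: $(1, -p^T x - q, L^T x) \in \mathcal{Q}_r^{k+2}$
• $\max_x c^T x - \frac{1}{2} x^T \Sigma x$: $\max c^T x - r$, $(1, r, L^T x) \in \mathcal{Q}_r^{k+2}$
• $c^T x + d \ge \|Ax + b\|_2$: $(c^T x + d, Ax + b) \in \mathcal{Q}^{m+1}$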
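Individual cheatsheet rows can be sanity-checked numerically by supplying a witness for the auxiliary variables. The sketch below does this for two rows: $t \ge x^{3/2}$, using the witness $s = \sqrt{x}/2$ (the largest $s$ the second cone allows), and log-sum-exp, using $z_i = e^{x_i - t}$. Function names, witnesses and the tolerance are illustrative assumptions, not part of any MOSEK API.

```python
import math


def in_rquad(v, tol=1e-9):
    """Rotated quadratic cone: 2*v[0]*v[1] >= v[2]^2 + ..., with v[0], v[1] >= 0."""
    return (v[0] >= -tol and v[1] >= -tol
            and 2.0 * v[0] * v[1] >= sum(vi * vi for vi in v[2:]) - tol)


def in_exp(v, tol=1e-9):
    """Exponential cone restricted to x2 > 0: x1 >= x2 * exp(x3 / x2)."""
    x1, x2, x3 = v
    return x2 > 0 and x1 >= x2 * math.exp(x3 / x2) - tol


def pow32_representable(t, x):
    """Check t >= x^(3/2), x >= 0 through (s, t, x), (x, 1/8, s) in Q_r^3."""
    if x < 0:
        return False
    s = math.sqrt(x) / 2.0  # witness: largest s allowed by the second cone
    return in_rquad((s, t, x)) and in_rquad((x, 0.125, s))


def logsumexp_representable(t, xs, tol=1e-9):
    """Check t >= log(sum_i exp(x_i)) through (z_i, 1, x_i - t) in K_exp, sum z_i <= 1."""
    z = [math.exp(xi - t) for xi in xs]  # witness: smallest admissible z_i
    return (sum(z) <= 1.0 + tol
            and all(in_exp((zi, 1.0, xi - t)) for zi, xi in zip(z, xs)))


if __name__ == "__main__":
    print(pow32_representable(8.0, 4.0), pow32_representable(7.9, 4.0))
    print(logsumexp_representable(math.log(2.0) + 1e-6, [0.0, 0.0]),
          logsumexp_representable(0.5, [0.0, 0.0]))
```

In an actual model the solver chooses the auxiliary variables itself; the fixed witnesses here only serve to test whether a given point lies in the represented set.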

More: https://docs.mosek.com/modeling-cookbook/index.html

Challenge

If you happen to know a
• natural,
• practical,
• important,
• convex
optimization problem, which cannot be expressed using the cones we have so far, let us know!

Conic ecosystem

Modeling languages
• CVX (Matlab) http://ask.cvxr.com
• CVXPY (Python) http://www.cvxpy.org/
• YALMIP (Matlab) https://yalmip.github.io/
• JuMP (Julia) https://github.com/JuliaOpt/JuMP.jl
• ···

Free solvers
• ECOS, SCS, SDPT3, Sedumi, ···

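Conic ecosystem: Mosek Fusion

$$\begin{array}{ll}
\min & \sum_i c_i P_{G_i} \\
\text{s.t.} & P_{G_i}^{\min} \le P_{G_i} \le P_{G_i}^{\max} \\
 & B \cdot \theta = P_G - P_D \\
 & (\theta_i - \theta_j)/x_{ij} \le P_{ij,\max}
\end{array}$$

    from mosek.fusion import *

    def dcopf(ng, nb, PGmin, PGmax, Pmax, B, x, PD, c):
        M = Model()
        # Generation within its limits; bus voltage angles are free
        PG = M.variable(ng, Domain.inRange(PGmin, PGmax))
        theta = M.variable(nb)

        # Power balance: B*theta - (PG - PD) = 0
        M.constraint(Expr.sub(Expr.mul(B, theta), Expr.sub(PG, PD)), Domain.equalsTo(0.0))
        # Line limits: theta_i - theta_j <= Pmax_ij * x_ij for all pairs (i, j)
        M.constraint(Expr.sub(Var.hrepeat(theta,nb), Var.vrepeat(theta.transpose(),nb)), Domain.lessThan(Pmax*x))

        # Minimize the total generation cost c^T PG
        M.objective(ObjectiveSense.Minimize, Expr.dot(c, PG))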
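The three constraint families of the DC-OPF model can be checked by hand on a small network before handing the data to a solver. The following pure-Python sketch does not use MOSEK. It assumes a hypothetical 3-bus triangle with line reactances of 0.1, a single generator at bus 0 and a single load at bus 2, and it represents the lines as a dictionary, which is an illustrative choice rather than the data layout of the `dcopf` function.

```python
def check_dispatch(B, lines, PG, PD, theta, PGmin, PGmax, tol=1e-9):
    """Check generator limits, nodal balance and line limits for a DC power flow.

    lines maps a bus pair (i, j) to (reactance x_ij, flow limit Pmax_ij).
    """
    n = len(theta)
    # Generator limits: PGmin <= PG <= PGmax
    limits_ok = all(lo - tol <= p <= hi + tol
                    for p, lo, hi in zip(PG, PGmin, PGmax))
    # Nodal balance: B * theta = PG - PD
    balance_ok = all(
        abs(sum(B[i][j] * theta[j] for j in range(n)) - (PG[i] - PD[i])) <= tol
        for i in range(n))
    # Line limits in both directions: |theta_i - theta_j| / x_ij <= Pmax_ij
    flows_ok = all(abs(theta[i] - theta[j]) / x_ij <= pmax + tol
                   for (i, j), (x_ij, pmax) in lines.items())
    return limits_ok and balance_ok and flows_ok


# Triangle network: every line has susceptance 1 / 0.1 = 10.
B = [[20.0, -10.0, -10.0],
     [-10.0, 20.0, -10.0],
     [-10.0, -10.0, 20.0]]
lines = {(0, 1): (0.1, 1.0), (1, 2): (0.1, 1.0), (0, 2): (0.1, 1.0)}
PG, PD = [1.5, 0.0, 0.0], [0.0, 0.0, 1.5]
PGmin, PGmax = [0.0, 0.0, 0.0], [2.0, 0.0, 0.0]
# Angles solving the balance equations with bus 2 as the reference (theta_2 = 0).
theta = [0.1, 0.05, 0.0]

if __name__ == "__main__":
    print(check_dispatch(B, lines, PG, PD, theta, PGmin, PGmax))
```

In this example the direct line from bus 0 to bus 2 carries a flow of 1.0, exactly at its limit, while each of the other two lines carries 0.5. Lowering that limit makes the dispatch infeasible.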

Optimizers in MOSEK

• primal/dual simplex, interior-point, MIP
• exploits sparsity
• LP, QP, SOCP, SDP, exponential, power cone
• Low-level optimization API
  • C, Python, Java, .NET, Matlab, R, Julia
• Object-oriented API Fusion
  • C++, Python, Java, .NET
• 3rd party
  • GAMS, AMPL, CVX, CVXOPT, CVXPY, YALMIP, Pyomo, GPkit, JuMP

Contact us

Licensing:
• Trial 30-day license
• Personal academic license
• Group academic license
• Commercial licenses

More:
• WWW www.mosek.com
• Demos github.com/MOSEK/Tutorials
• Blog themosekblog.blogspot.com/
• I found a bug! / MOSEK is too slow! [email protected]
• Twitter @mosektw
• Modeling Cookbook www.mosek.com/documentation
• Slides: www.mosek.com/resources/presentations