Semi-Riemannian Manifold Optimization


Tingran Gao
William H. Kruskal Instructor
Committee on Computational and Applied Mathematics (CCAM), Department of Statistics, The University of Chicago

2018 China-Korea International Conference on Matrix Theory with Applications
Shanghai University, Shanghai, China, December 19, 2018

Outline

- Motivation
  - Riemannian Manifold Optimization
  - Barrier Functions and Interior Point Methods
- Semi-Riemannian Manifold Optimization
  - Optimality Conditions
  - Descent Direction
  - Semi-Riemannian First Order Methods
  - Metric Independence of Second Order Methods
- Optimization on Semi-Riemannian Submanifolds

Joint work with Lek-Heng Lim (The University of Chicago) and Ke Ye (Chinese Academy of Sciences).

Riemannian Manifold Optimization

- Optimization on manifolds:

      min_{x ∈ M} f(x)

  where (M, g) is a (typically nonlinear, non-convex) Riemannian manifold and f : M → ℝ is a smooth function on M
- A difficult constrained optimization problem, treated as unconstrained optimization from the perspective of manifold optimization
- Key ingredients: tangent spaces, geodesics, exponential map, parallel transport, gradient ∇f, Hessian ∇²f
- Riemannian steepest descent:

      x_{k+1} = Exp_{x_k}(−t ∇f(x_k)),  t ≥ 0

- Riemannian Newton's method: solve ∇²f(x_k) η_k = −∇f(x_k) for η_k, then set

      x_{k+1} = Exp_{x_k}(t η_k),  t ≥ 0

- Riemannian trust region
- Riemannian conjugate gradient
- ......

Gabay (1982), Smith (1994), Edelman et al. (1998), Absil et al. (2008), Adler et al. (2002), Ring and Wirth (2012), Huang et al. (2015), Sato (2016), etc.

Applications: optimization problems on matrix manifolds

- Stiefel manifolds: V_k(ℝⁿ) = O(n)/O(n−k)
- Grassmannian manifolds: Gr(k, n) = O(n)/(O(k) × O(n−k))
- Flag manifolds: for n₁ + ⋯ + n_d = n, Flag_{n₁,…,n_d} = O(n)/(O(n₁) × ⋯ × O(n_d))
- Shape spaces: (ℝ^{k×n} \ {0})/O(n)
- ......
From Riemannian to Semi-Riemannian

Motivation 1: a question from a geometer: how does optimization depend on the choice of metric?

- Critical points {x ∈ M : ∇f(x) = 0} are metric-independent
- Gradients, Hessians, and geodesics are metric-dependent
- The index of the Hessian becomes metric-independent at a critical point (Morse theory)
- Optimization trajectories are obviously metric-dependent
- Newton's equation is metric-independent: for two metrics with gradients ∇f, ∇̃f and Hessians ∇²f, ∇̃²f,

      ∇²f(x_k) η_k = −∇f(x_k)  ⟺  ∇̃²f(x_k) η_k = −∇̃f(x_k)

- Does it still work if the metric structure is more general than Riemannian (e.g. semi-Riemannian, Finsler, etc.)?

Barrier Functions and Interior Point Methods

Let f : Q → ℝ be strictly convex, defined on an open subset Q ⊂ ℝ^d.

- ∇²f(x) ≻ 0 at every x ∈ Q
- ∇²f(x) defines a Riemannian metric on Q
- The gradient under this Riemannian metric is [∇²f(x)]^{−1} ∇f(x)
- Newton's method for f is exactly gradient descent under the Riemannian metric defined by ∇²f(x)!
- Nesterov and Todd (2002): the primal-dual central path is "close to" geodesics under this Riemannian metric

• Nesterov, Yurii E., and Michael J. Todd. On the Riemannian geometry defined by self-concordant barriers and interior-point methods. Foundations of Computational Mathematics 2, no. 4 (2002): 333–361.
• Duistermaat, Johannes J. On Hessian Riemannian structures. Asian Journal of Mathematics 5, no. 1 (2001): 79–91.

From Riemannian to Semi-Riemannian

Motivation 2: nonconvex objectives for interior point methods

- What if f : Q → ℝ is not strictly convex?
- What happens to Newton's method if the Hessian is not positive semi-definite? (Newton's method with modified Hessian)

Semi-Riemannian Geometry

- Scalar product: a symmetric bilinear form ⟨·,·⟩ : V × V → ℝ on a vector space V, not necessarily positive (semi-)definite
- Non-degeneracy: ⟨v, w⟩ = 0 for all w ∈ V implies v = 0
- A semi-Riemannian manifold is a differentiable manifold M with a scalar product ⟨·,·⟩_x defined on each T_x M, varying smoothly with x ∈ M
- A semi-Riemannian manifold is non-degenerate if ⟨·,·⟩_x is non-degenerate for all x ∈ M
- Non-degeneracy need not be inherited by subspaces!
- Semi-Riemannian submanifolds, geodesics, parallel transport, gradients, Hessians, ...... are all well-defined if the metric is non-degenerate
- Of great interest to general relativity as a spacetime model (Lorentzian geometry)

A Simplest Degenerate Subspace Example

- Consider ℝ² equipped with the Lorentzian metric of signature (−, +), i.e., for x = (x₁, x₂) ∈ ℝ² and y = (y₁, y₂) ∈ ℝ², define the scalar product ⟨x, y⟩ = −x₁y₁ + x₂y₂
- Let W = span{(1, 1)}, a degenerate subspace of ℝ²
- If we define W^⊥ := {v ∈ ℝ² : ⟨w, v⟩ = 0 for all w ∈ W}, then W = W^⊥!
- Take-home message: if W is a degenerate subspace of a scalar product space (V, ⟨·,·⟩), then in general W + W^⊥ ≠ V, and it may well be the case that W ∩ W^⊥ ≠ {0}!
- Nevertheless, if W is a non-degenerate subspace of V, then it is still true that V = W ⊕ W^⊥

Examples of Semi-Riemannian Manifolds

- Minkowski space ℝ^{p,q} (p ≥ 0, q ≥ 0): the vector space ℝ^{p+q} equipped with the scalar product ⟨u, v⟩ = uᵀ I_{p,q} v, where

      I_{p,q} = [ −I_p  0 ]
                [  0   I_q ]

- Square matrices ℝ^{n×n}, equipped with the scalar product ⟨A, B⟩ = Tr(Aᵀ I_{p,q} B) = vec(A)ᵀ (I_n ⊗ I_{p,q}) vec(B)
- Submanifolds of semi-Riemannian manifolds

Examples of Semi-Riemannian Manifolds: Applications

- Spacetime models: Lorentzian spaces, de Sitter spaces, anti-de Sitter spaces, ......
- Network models: social networks "behave like" spaces of negative Alexandrov curvature

• Carroll, Sean M. Spacetime and geometry: An introduction to general relativity. San Francisco, USA: Addison-Wesley, 2004.
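The degenerate-subspace phenomenon above is easy to verify numerically. A small NumPy check on the signature-(−, +) plane (the vectors chosen are illustrative):

```python
import numpy as np

# Lorentzian scalar product of signature (-, +) on R^2: <x, y> = -x1*y1 + x2*y2
G = np.diag([-1.0, 1.0])

def sp(x, y):
    return x @ G @ y

w = np.array([1.0, 1.0])   # spans W = span{(1, 1)}
print(sp(w, w))            # 0.0: w is a null vector, so W is a degenerate subspace

# W-perp = {v : <w, v> = 0} is cut out by the single equation -v1 + v2 = 0,
# i.e. W-perp = span{(1, 1)} = W, and W + W-perp is only one-dimensional.
v = np.array([2.0, 2.0])   # another element of W-perp
print(sp(w, v))            # 0.0
```

By contrast, for a non-degenerate subspace (e.g. span{(0, 1)}) the same computation recovers the familiar direct sum V = W ⊕ W^⊥.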
• Krioukov, Dmitri, Maksim Kitsak, Robert S. Sinkovits, David Rideout, David Meyer, and Marián Boguñá. Network cosmology. Scientific Reports 2 (2012): 793.

Semi-Riemannian Manifold Optimization

Optimality Conditions

Given a semi-Riemannian manifold with scalar product ⟨·,·⟩_x on T_x M, define the semi-Riemannian gradient of a smooth function f ∈ C²(M) by

    ⟨Df, X⟩_x = Xf(x),  for all X ∈ Γ(M, TM), x ∈ M

and define the semi-Riemannian Hessian of f by

    D²f(X, Y) = X(Yf) − (D_X Y)f,  for all X, Y ∈ Γ(M, TM)

where D_X Y is the covariant derivative of Y with respect to X.

Necessary conditions: a local optimum x ∈ M satisfies

- (First-order necessary condition) Df(x) = 0, if f : M → ℝ is first-order differentiable
- (Second-order necessary condition) Df(x) = 0 and D²f(x) ⪰ 0, if f : M → ℝ is second-order differentiable

Sufficient condition:

- (Second-order sufficient condition) If x ∈ M is an interior point of a semi-Riemannian manifold M with Df(x) = 0 and D²f(x) ≻ 0, then x is a strict relative minimum of f

Descent Direction

- In general, −Df(x) is not a descent direction at x ∈ M!
  The main issue is that ⟨Df(x), Df(x)⟩_x may be negative, so the first-order approximation f(x − Df(x)) ≈ f(x) − ⟨Df(x), Df(x)⟩_x need not be smaller than f(x)
- A similar issue arises in Newton's method: when the Hessian ∇²f(x) is not positive definite, the Newton direction −[∇²f(x)]^{−1} ∇f(x) may fail to be a descent direction
- Solution for Newton's method: modified Hessian methods
- For semi-Riemannian optimization: modify Df(x) by a similar idea

Descent Direction: Construction

Solution: modify −Df(x) to obtain a descent direction on the semi-Riemannian manifold.

- For each T_x M, use semi-Riemannian Gram-Schmidt to find an orthonormal basis e₁, …, e_n of T_x M
- Construct

      [Df(x)]⁺ = Σ_{i=1}^{n} ⟨Df(x), e_i⟩ e_i

- −[Df(x)]⁺ is a descent direction, since (whenever Df(x) ≠ 0)

      ⟨−[Df(x)]⁺, Df(x)⟩ = −Σ_{i=1}^{n} |⟨Df(x), e_i⟩|² < 0

Semi-Riemannian Steepest Descent

Semi-Riemannian Conjugate Gradient

Semi-Riemannian Second-Order Methods

- Newton's method: solve D²f(x) η = −Df(x) for a direction η ∈ T_x M at x ∈ M. Note that this is exactly the same Newton's equation as in Riemannian optimization!
- Trust region: minimize a local quadratic model near each x ∈ M. Again, the same as in Riemannian optimization!
- The equivalence between Riemannian and semi-Riemannian second-order methods is not just a coincidence: these methods build upon local polynomial surrogates, which are metric-independent in the sense of jets.

Optimization on Semi-Riemannian Submanifolds

Which manifolds admit semi-Riemannian structures?
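In flat coordinates the construction of [Df(x)]⁺ can be checked numerically. A minimal NumPy sketch on the Minkowski plane ℝ^{1,1} (the linear objective and the signature are illustrative assumptions; for a diagonal metric the standard basis is already semi-Riemannian orthonormal, so the Gram-Schmidt step is trivial here):

```python
import numpy as np

G = np.diag([-1.0, 1.0])        # flat metric of signature (-, +) on R^2

def sp(u, v):                   # semi-Riemannian scalar product
    return u @ G @ v

# Objective f(x) = x1, Euclidean gradient (1, 0).
# The semi-Riemannian gradient Df is defined by <Df, X> = df(X), i.e. G Df = egrad.
egrad = np.array([1.0, 0.0])
Df = np.linalg.solve(G, egrad)  # = (-1, 0)

print(sp(Df, Df))               # -1.0: <Df, Df> < 0, so -Df is NOT a descent direction
print(egrad @ (-Df))            # df(-Df) = +1 > 0: f increases along -Df

# Orthonormal basis with <e_i, e_j> = +/- delta_ij (here: the standard basis).
E = np.eye(2)
Df_plus = sum(sp(Df, E[i]) * E[i] for i in range(2))   # [Df]^+ = sum_i <Df, e_i> e_i

print(sp(-Df_plus, Df))         # -1.0 < 0, as the slide's identity predicts
print(egrad @ (-Df_plus))       # df(-[Df]^+) = -1 < 0: a genuine descent direction
```

For a non-diagonal metric one would first run semi-Riemannian Gram-Schmidt (normalizing by √|⟨v, v⟩| and skipping null vectors) to build the basis E; the construction of Df_plus is otherwise unchanged.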
- Euclidean spaces and all Riemannian manifolds are particular semi-Riemannian manifolds
- The tangent bundle of a manifold admitting a semi-Riemannian metric that is not Riemannian must admit a decomposition into two sub-bundles
- Any manifold with trivial tangent bundle admits semi-Riemannian metrics, e.g. ......
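The metric-independence of Newton's equation noted earlier can also be verified numerically. A small sketch with two flat metrics on ℝ² (flat, so all Christoffel symbols vanish and the metric Hessian operator is just G⁻¹H); the nonconvex quadratic objective is an illustrative assumption:

```python
import numpy as np

# f(x, y) = x^2 + x*y - y^2 (nonconvex), with Euclidean gradient and constant Hessian
def egrad(p):
    x, y = p
    return np.array([2*x + y, x - 2*y])

H = np.array([[2.0, 1.0],
              [1.0, -2.0]])

p = np.array([1.0, 2.0])
directions = []
for G in (np.eye(2), np.diag([-1.0, 1.0])):   # Riemannian vs. Lorentzian flat metric
    grad_G = np.linalg.solve(G, egrad(p))      # metric gradient: G^{-1} * Euclidean gradient
    hess_G = np.linalg.solve(G, H)             # metric Hessian operator (no Christoffel terms)
    eta = np.linalg.solve(hess_G, -grad_G)     # Newton direction
    directions.append(eta)
    print(eta)                                  # identical for both metrics
```

Algebraically, solving (G⁻¹H)η = −G⁻¹∇f gives η = −H⁻¹∇f, with G cancelling exactly; this is the coordinate version of the metric-independence of Newton's equation.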
