Kernel Distributionally Robust Optimization: Generalized Duality Theorem and Stochastic Approximation

Total Page:16

File Type:pdf, Size:1020Kb

Kernel Distributionally Robust Optimization: Generalized Duality Theorem and Stochastic Approximation Kernel Distributionally Robust Optimization: Generalized Duality Theorem and Stochastic Approximation Jia-Jie Zhu Wittawat Jitkrittum Empirical Inference Department Empirical Inference Department Max Planck Institute for Intelligent Systems Max Planck Institute for Intelligent Systems Tübingen, Germany Tübingen, Germany [email protected] Currently at Google Research, NYC, USA [email protected] Moritz Diehl Bernhard Schölkopf Department of Microsystems Engineering Empirical Inference Department & Department of Mathematics Max Planck Institute for Intelligent Systems University of Freiburg Tübingen, Germany Freiburg, Germany [email protected] [email protected] Abstract functional gradient, which apply to general optimization and machine learning tasks. We propose kernel distributionally robust op- timization (Kernel DRO) using insights from 1 INTRODUCTION the robust optimization theory and functional analysis. Our method uses reproducing kernel Imagine a hypothetical scenario in the illustrative figure Hilbert spaces (RKHS) to construct a wide where we want to arrive at a destination while avoiding range of convex ambiguity sets, which can be unknown obstacles. A worst-case robust optimization generalized to sets based on integral probabil- (RO) (Ben-Tal et al., 2009) approach is then to avoid ity metrics and finite-order moment bounds. the entire unsafe area (left, blue). Suppose we have This perspective unifies multiple existing ro- historical locations of the obstacles (right, dots). We bust and stochastic optimization methods. may choose to avoid only the convex polytope that We prove a theorem that generalizes the clas- contains all the samples (pink). This data-driven robust sical duality in the mathematical problem of decision-making idea improves efficiency while retaining moments. Enabled by this theorem, we re- robustness. formulate the maximization with respect to measures in DRO into the dual program that searches for RKHS functions. Using universal RKHSs, the theorem applies to a broad class of loss functions, lifting common limitations The concept of distributional ambiguity concerns the such as polynomial losses and knowledge of uncertainty of uncertainty — the underlying probability the Lipschitz constant. We then establish a measure is only partially known or subject to change. connection between DRO and stochastic op- This idea is by no means a new one. The classical timization with expectation constraints. Fi- moment problem concerns itself with estimating the R nally, we propose practical algorithms based worst-case risk expressed by maxP 2K l dP where l is on both batch convex solvers and stochastic some loss function. The constraint P 2 K describes the distribution ambiguity, i.e., P is only known to live Proceedings of the 24th International Conference on Artifi- within a subset K of probability measures. The solution cial Intelligence and Statistics (AISTATS) 2021, San Diego, to the moment problem gives the risk under some California, USA. PMLR: Volume 130. Copyright 2021 by worst-case distribution within K. To make decisions the author(s). that will minimize this worst-case risk is the idea of Kernel Distributionally Robust Optimization distributionally robust optimization (DRO) (Delage and P := P(X ) denotes the set of all Borel probability Ye, 2010; Scarf, 1958). measures on X . We use P^ to denote the empirical dis- ^ PN 1 tribution P = i=1 N δξi , where δ is a Dirac measure Many of today’s learning tasks suffer from various man- N and fξig are data samples. We refer to the function ifestations of distributional ambiguity — e.g., covariate i=1 δC(x) := 0 if x 2 C, 1 if x2 = C, as the indicator func- shift, adversarial attacks, simulation to reality trans- ∗ tion. δ (f) := sup hf; µiH is the support function of fer — phenomena that are caused by the discrepancy C µ2C C. SN denotes the N-dimensional simplex. ri(·) denotes between training and test distributions. Kernel meth- the relative interior of a set. A function f is upper semi- ods are known to possess robustness properties, e.g., continuous on X if lim sup f(x) ≤ f(x0); 8x0 2 X ; (Christmann and Steinwart, 2007; Xu et al., 2009). x!x0 it is proper if it is not identically −∞. When there is no However, this robustness only applies to kernelized ambiguity, we simplify the loss function notation l(θ; ·) models. This paper extends the robustness of kernel by using l to indicate that results hold for θ point-wise. methods using the robust counterpart formulation tech- niques (Ben-Tal et al., 2009) as well as the principled 2.1 Robust and distributionally robust conic duality theory (Shapiro, 2001). We term our optimization approach kernel distributionally robust optimization (Kernel DRO), which can robustify general optimiza- Robust optimization (RO) (Ben-Tal et al., 2009) stud- tion solutions not limited to kernelized models. ies mathematical decision-making under uncertainty. The main contributions of this paper are: It solves the min-max problem (omitting constraints) minθ sup l(θ; ξ); where l(θ; ξ) denotes a general loss 1. We rigorously prove the generalized duality theo- ξ2X function, θ is the decision variable, and ξ is a variable rem (Theorem 3.1) that reformulates general DRO representing the uncertainty. Intuitively, RO makes into a convex dual problem searching for RKHS the decision assuming an adversarial scenario, as re- functions, lifting common limitations of DRO on flected in taking supremum w.r.t. ξ. For this reason, the loss functions, such as the knowledge of Lip- it is often referred to as the worst-case RO. Recently, schitz constant. The theorem also constitutes a RO has been applied to the setting of adversarially generalization of the duality results from the liter- robust learning, e.g., in (Madry et al., 2019; Wong and ature of mathematical problem of moments. Kolter, 2018), which we will visit in this paper. In the 2. We use RKHSs to construct a wide range of convex optimization literature, a typical approach to solving ambiguity sets (in Table 1, 3), including sets based RO is via reformulating the min-max program using on integral probability metrics (IPM) and finite- duality to obtain a single minimization problem. In order moment bounds. This perspective unifies contrast, distributionally robust optimization (DRO) existing RO and DRO methods. minimizes the expected loss assuming the worst-case 3. We propose computational algorithms based on distribution: both convex solvers and stochastic approximation, which can be applied to robustify general optimiza- Z min sup l(θ; ξ) dP (ξ) ; (1) tion and machine learning models not limited to θ P 2K kernelized or known-Lipschitz-constant ones. 4. Finally, we establish an explicit connection be- where K ⊆ P, called the ambiguity set, is a subset tween DRO and stochastic optimization with of distributions, e.g., all distributions with the given expectation constraints. This leads to a novel mean and variance. Compared with RO, DRO only stochastic functional gradient DRO (SFG-DRO) robustifies the solution against a subset K of distribu- algorithm which can scale up to modern machine tions on X and is, therefore, less conservative (since R learning tasks. supP 2Kf l(θ; ξ) dP (ξ)g ≤ supξ2X l(θ; ξ)). In addition, we give complete self-contained proofs in The inner problem of (1), historically known as the the appendix that shed light on the connection be- problem of moments traced back at least to Thomas tween RKHSs, conic duality, and DRO. We also show Joannes Stieltjes, estimates the worst-case risk under that universal RKHSs are large enough for DRO from uncertainty in distributions. The modern approaches, the perspective of functional analysis through concrete pioneered by the work of (Isii, 1962) (see also (Lasserre, examples. 2002; Shapiro, 2001; Bertsimas and Popescu, 2005; Popescu, 2005; Vandenberghe et al., 2007; Van Parys et al., 2016; Zhu et al., 2020)), typically seek a sharp 2 BACKGROUND upper bound via duality. This duality, rigorously jus- tified in (Shapiro, 2001), is different from that in the Notation. X ⊂ Rd denotes the input domain, which Euclidean space because infinite-dimensional convex is assumed to be compact unless otherwise specified. sets can become pathological. Using that methodol- Jia-Jie Zhu, Wittawat Jitkrittum, Moritz Diehl, Bernhard Schölkopf ogy, we can reformulate DRO (1) into a single solvable 2007, Section 1.2). The reproducing property allows minimization problem. one to easily compute the expectation of any function f 2 H since Ex∼P [f(x)] = hf; µP iH. Embedding distri- Existing DRO approaches can be grouped into three butions into H also allows one to measure the distance main categories by the type of ambiguity sets used. between distributions in H. If k is universal, then the DRO with (finite-order) moment constraints has been mean map P 7! µP is injective on P (Gretton et al., studied in (Delage and Ye, 2010; Scarf, 1958; Zymler 2012). With a universal H, given two distributions P; Q, et al., 2013). The authors of (Ben-Tal et al., 2013; kµP − µQkH defines a metric. This quantity is known Iyengar, 2005; Nilim and El Ghaoui, 2005; Wang et al., as the maximum mean discrepancy (MMD) (Gretton 2016; Duchi et al., 2016) studied DRO using likelihood p et al., 2012). With kfkH := hf; fiH and the repro- bounds as well as φ−divergence. Wasserstein-distance- 2 ducing property, it can be shown that kµP − µQkH = based DRO has been studied by the authors of (Mo- 0 0 Ex;x0∼P k(x; x ) + Ey;y0∼Qk(y; y ) − 2Ex∼P;y∼Qk(x; y), hajerin Esfahani and Kuhn, 2018; Zhao and Guan, allowing the plug-in estimator to be used for esti- 2018; Gao and Kleywegt, 2016; Blanchet et al., 2019), mating the MMD from empirical data. The MMD and applied in a large body of literature. Many exist- is an instance of the class of integral probability ing approaches require either the assumptions such as metrics (IPMs), and can equivalently be written as quadratic loss functions or the knowledge of Lipschitz R kµP − µQkH = sup f d(P − Q), where the op- constant or RKHS norm of the loss l, which are often kfkH≤1 timum f ∗ is a witness function (Gretton et al., 2012; hard to obtain in practice; see (Virmaux and Scaman, Sriperumbudur et al., 2012).
Recommended publications
  • The Multivariate Moments Problem: a Practical High Dimensional Density Estimation Method
    National Institute for Applied Statistics Research Australia University of Wollongong, Australia Working Paper 11-19 The Multivariate Moments Problem: A Practical High Dimensional Density Estimation Method Bradley Wakefield, Yan-Xia Lin, Rathin Sarathy and Krishnamurty Muralidhar Copyright © 2019 by the National Institute for Applied Statistics Research Australia, UOW. Work in progress, no part of this paper may be reproduced without permission from the Institute. National Institute for Applied Statistics Research Australia, University of Wollongong, Wollongong NSW 2522, Australia Phone +61 2 4221 5435, Fax +61 2 42214998. Email: [email protected] The Multivariate Moments Problem: A Practical High Dimensional Density Estimation Method October 8, 2019 Bradley Wakefield National Institute for Applied Statistics Research Australia School of Mathematics and Applied Statistics University of Wollongong, NSW 2522, Australia Email: [email protected] Yan-Xia Lin National Institute for Applied Statistics Research Australia School of Mathematics and Applied Statistics University of Wollongong, NSW 2522, Australia Email: [email protected] Rathin Sarathy Department of Management Science & Information Systems Oklahoma State University, OK 74078, USA Email: [email protected] Krishnamurty Muralidhar Department of Marketing & Supply Chain Management, Research Director for the Center for Business of Healthcare University of Oklahoma, OK 73019, USA Email: [email protected] 1 Abstract The multivariate moments problem and it's application to the estimation of joint density functions is often considered highly impracticable in modern day analysis. Although many re- sults exist within the framework of this issue, it is often an undesirable method to be used in real world applications due to the computational complexity and intractability of higher order moments.
    [Show full text]
  • THE HAUSDORFF MOMENT PROBLEM REVISITED to What
    THE HAUSDORFF MOMENT PROBLEM REVISITED ENRIQUE MIRANDA, GERT DE COOMAN, AND ERIK QUAEGHEBEUR ABSTRACT. We investigate to what extent finitely additive probability measures on the unit interval are determined by their moment sequence. We do this by studying the lower envelope of all finitely additive probability measures with a given moment sequence. Our investigation leads to several elegant expressions for this lower envelope, and it allows us to conclude that the information provided by the moments is equivalent to the one given by the associated lower and upper distribution functions. 1. INTRODUCTION To what extent does a sequence of moments determine a probability measure? This problem has a well-known answer when we are talking about probability measures that are σ-additive. We believe the corresponding problem for probability measures that are only finitely additive has received much less attention. This paper tries to remedy that situation somewhat by studying the particular case of finitely additive probability measures on the real unit interval [0;1] (or equivalently, after an appropriate transformation, on any compact real interval). The question refers to the moment problem. Classically, there are three of these: the Hamburger moment problem for probability measures on R; the Stieltjes moment problem for probability measures on [0;+∞); and the Hausdorff moment problem for probability measures on compact real intervals, which is the one we consider here. Let us first look at the moment problem for σ-additive probability measures. Consider a sequence m of real numbers mk, k ¸ 0. It turns out (see, for instance, [12, Section VII.3] σ σ for an excellent exposition) that there then is a -additive probability measure Pm defined on the Borel sets of [0;1] such that Z k σ x dPm = mk; k ¸ 0; [0;1] if and only if m0 = 1 (normalisation) and the sequence m is completely monotone.
    [Show full text]
  • On the Combinatorics of Cumulants
    Journal of Combinatorial Theory, Series A 91, 283304 (2000) doi:10.1006Âjcta.1999.3017, available online at http:ÂÂwww.idealibrary.com on On the Combinatorics of Cumulants Gian-Carlo Rota Department of Mathematics, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139 and Jianhong Shen School of Mathematics, University of Minnesota, Minneapolis, Minnesota 55455 E-mail: jhshenÄmath.umn.edu View metadata, citation and similar papers at core.ac.uk brought to you by CORE Communicated by the Managing Editors provided by Elsevier - Publisher Connector Received July 26, 1999 dedicated to the memory of gian-carlo rota We study cumulants by Umbral Calculus. Various formulae expressing cumulants by umbral functions are established. Links to invariant theory, symmetric functions, and binomial sequences are made. 2000 Academic Press 1. INTRODUCTION Cumulants were first defined and studied by Danish scientist T. N. Thiele. He called them semi-invariants. The importance of cumulants comes from the observation that many properties of random variables can be better represented by cumulants than by moments. We refer to Brillinger [3] and Gnedenko and Kolmogorov [4] for further detailed probabilistic aspects on this topic. Given a random variable X with the moment generating function g(t), its nth cumulant Kn is defined as d n Kn= n log g(t). dt } t=0 283 0097-3165Â00 35.00 Copyright 2000 by Academic Press All rights of reproduction in any form reserved. 284 ROTA AND SHEN That is, m K : n tn= g(t)=exp : n tn ,(1) n0 n! \ n1 n! + where, mn is the nth moment of X. Generally, if _ denotes the standard deviation, then 2 2 K1=m1 , K2=m2&m1=_ .
    [Show full text]
  • The Truncated Stieltjes Moment Problem Solved by Using Kernel Density Functions
    View metadata, citation and similar papers at core.ac.uk brought to you by CORE provided by Elsevier - Publisher Connector Journal of Computational and Applied Mathematics 236 (2012) 4193–4213 Contents lists available at SciVerse ScienceDirect Journal of Computational and Applied Mathematics journal homepage: www.elsevier.com/locate/cam The truncated Stieltjes moment problem solved by using kernel density functions P.N. Gavriliadis ∗, G.A. Athanassoulis Department of Naval Architecture and Marine Engineering, National Technical University of Athens, Heroon Polytechniou 9, GR-157 73 Zografos, Athens, Greece article info a b s t r a c t Article history: In this work the problem of the approximate numerical determination of a semi-infinite Received 4 July 2010 supported, continuous probability density function (pdf) from a finite number of its Received in revised form 6 September 2011 moments is addressed. The target space is carefully defined and an approximation theorem is proved, establishing that the set of all convex superpositions of appropriate Kernel MSC: Density Functions (KDFs) is dense in this space. A solution algorithm is provided, based 65R32 on the established approximate representation of the target pdf and the exploitation 65F22 of some theoretical results concerning moment sequence asymptotics. The solution 45Q05 algorithm also permits us to recover the tail behavior of the target pdf and incorporate Keywords: this information in our solution. A parsimonious formulation of the proposed solution Truncated Stieltjes moment problem procedure, based on a novel sequentially adaptive scheme is developed, enabling a very Kernel density function efficient moment data inversion. The whole methodology is fully illustrated by numerical Probability density function examples.
    [Show full text]
  • Stieltjes Moment Sequences for Pattern-Avoiding Permutations Alin Bostan, Andrew Elvey-Price, Anthony Guttmann, Jean-Marie Maillard
    Stieltjes moment sequences for pattern-avoiding permutations Alin Bostan, Andrew Elvey-Price, Anthony Guttmann, Jean-Marie Maillard To cite this version: Alin Bostan, Andrew Elvey-Price, Anthony Guttmann, Jean-Marie Maillard. Stieltjes moment se- quences for pattern-avoiding permutations. The Electronic Journal of Combinatorics, Open Journal Systems, In press, pp.59. 10.37236/xxxx. hal-02425917v3 HAL Id: hal-02425917 https://hal.archives-ouvertes.fr/hal-02425917v3 Submitted on 17 Oct 2020 HAL is a multi-disciplinary open access L’archive ouverte pluridisciplinaire HAL, est archive for the deposit and dissemination of sci- destinée au dépôt et à la diffusion de documents entific research documents, whether they are pub- scientifiques de niveau recherche, publiés ou non, lished or not. The documents may come from émanant des établissements d’enseignement et de teaching and research institutions in France or recherche français ou étrangers, des laboratoires abroad, or from public or private research centers. publics ou privés. Distributed under a Creative Commons Attribution - NoDerivatives| 4.0 International License Stieltjes moment sequences for pattern-avoiding permutations Alin Bostan Andrew Elvey Price Inria and Universit´eParis-Saclay LaBRI, Universit´eBordeaux I 1 rue Honor´ed'Estienne d'Orves 351 cours de la Lib´eration 91120 Palaiseau, France 33405 Talence Cedex, France [email protected] [email protected] Anthony John Guttmann Jean-Marie Maillard School of Mathematics and Statistics LPTMC, CNRS, Sorbonne Universit´e The University of Melbourne 4 Place Jussieu, Tour 23, case 121 Vic. 3010, Australia 75252 Paris Cedex 05, France [email protected] [email protected] Submitted: Feb 28, 2020; Accepted: Sep 16, 2020; Published: TBD © The authors.
    [Show full text]
  • Moments, Cumulants and Diagram Formulae for Non-Linear Functionals of Random Measures Giovanni Peccati, Murad Taqqu
    Moments, cumulants and diagram formulae for non-linear functionals of random measures Giovanni Peccati, Murad Taqqu To cite this version: Giovanni Peccati, Murad Taqqu. Moments, cumulants and diagram formulae for non-linear functionals of random measures. 2008. hal-00338132 HAL Id: hal-00338132 https://hal.archives-ouvertes.fr/hal-00338132 Preprint submitted on 11 Nov 2008 HAL is a multi-disciplinary open access L’archive ouverte pluridisciplinaire HAL, est archive for the deposit and dissemination of sci- destinée au dépôt et à la diffusion de documents entific research documents, whether they are pub- scientifiques de niveau recherche, publiés ou non, lished or not. The documents may come from émanant des établissements d’enseignement et de teaching and research institutions in France or recherche français ou étrangers, des laboratoires abroad, or from public or private research centers. publics ou privés. Moments, cumulants and diagram formulae for non-linear functionals of random measures Giovanni PECCATI∗ and Murad S. TAQQU† November 11, 2008 Abstract This survey provides a unified discussion of multiple integrals, moments, cumulants and diagram formulae associated with functionals of completely random measures. Our approach is combinatorial, as it is based on the algebraic formalism of partition lattices and M¨obius functions. Gaussian and Poisson measures are treated in great detail. We also present several combinatorial interpretations of some recent CLTs involving sequences of random variables belonging to a fixed Wiener chaos. Key Words – Central Limit Theorems; Cumulants; Diagonal measures; Diagram for- mulae; Gaussian fields; M¨obius function; Moments; Multiple integrals; Partitions; Poisson measures; Random Measures. AMS Subject Classification – 05A17; 05A18; 60C05; 60F05; 60H05.
    [Show full text]
  • Ten Lectures on the Moment Problem
    Konrad Schmüdgen Ten Lectures on the Moment Problem August 31, 2020 arXiv:2008.12698v1 [math.FA] 28 Aug 2020 Preface If µ is a positive Borel measure on the line and k is a nonnegativeinteger, the number k sk sk(µ)= x dµ(x) (0.1) ≡ ZR is called the k-th moment of µ. For instance, if the measure µ comes from the dis- tribution function F of a random variable X, then the expectation value of X k is just the k-the moment, +∞ k k E[X ]= sk = x dF(x), Z ∞ − 2 2 2 2 and the variance of X is Var(X)= E[(X E[X]) ]= E[X ] E[X] = s2 s1 (of course, provided the corresponding numbers− are finite). − − The moment problem is the inverse problem of “finding" the measure when the moments are given. In order to apply functional-analytic methods it is convenient to rewrite the mo- ment problem in terms of linear functionals. For a real sequence s = (sn)n N0 let Ls ∈ n denote the linear functional on the polynomial algebra R[x] defined by Ls(x )= sn, n N0. Then, by the linearity of the integral, (0.1) holds for all k if and only if ∈ +∞ Ls(p)= p(x)dµ(x) for p R[x]. (0.2) Z ∞ ∈ − That is, the moment problem asks whether a linear functional on R[x] admits an integral representation (0.2) with some positive measure µ on R. This is the simplest version of the moment problem, the so-called Hamburger moment problem.
    [Show full text]
  • Recovery of Distributions Via Moments 4 5 5 6 Robert M
    IMS Collections Vol. 0 (2009) 251{265 c Institute of Mathematical Statistics, 2009 1 1 arXiv: math.PR/0000011 2 2 3 3 4 Recovery of Distributions via Moments 4 5 5 6 Robert M. Mnatsakanov1,∗ and Artak S. Hakobyan2,∗ 6 7 7 West Virginia University 8 8 9 Abstract: The problem of recovering a cumulative distribution function (cdf) 9 10 and corresponding density function from its moments is studied. This problem 10 is a special case of the classical moment problem. The results obtained within 11 11 the moment problem can be applied in many indirect models, e.g., those based 12 on convolutions, mixtures, multiplicative censoring, and right-censoring, where 12 13 the moments of unobserved distribution of actual interest can be easily esti- 13 14 mated from the transformed moments of the observed distributions. Nonpara- 14 metric estimation of a quantile function via moments of a target distribution 15 15 represents another very interesting area where the moment problem arises. In 16 all such models one can use the procedure, which recovers a function via its 16 17 moments. In this article some properties of the proposed constructions are de- 17 18 rived. The uniform rates of convergence of the approximants of cdf, its density 18 function, and quantile function are obtained as well. 19 19 20 20 21 Contents 21 22 22 23 1 Introduction................................... 251 23 24 2 Some Notations and Assumptions....................... 253 24 25 3 Asymptotic Properties of Fα,ν and fα,ν ................... 254 25 26 4 Some Applications and Examples....................... 258 26 27 Acknowledgments................................
    [Show full text]
  • The Moment Problem Associated with the Stieltjes–Wigert Polynomials
    J. Math. Anal. Appl. 277 (2003) 218–245 www.elsevier.com/locate/jmaa The moment problem associated with the Stieltjes–Wigert polynomials Jacob Stordal Christiansen Department of Mathematics, University of Copenhagen, Universitetsparken 5, 2100 København Ø, Denmark Received 17 May 2001 Submitted by T. Fokas Abstract We consider the indeterminate Stieltjes moment problem associated with the Stieltjes–Wigert polynomials. After a presentation of the well-known solutions, we study a transformation T of the set of solutions. All the classical solutions turn out to be fixed under this transformation but this is not the case for the so-called canonical solutions. Based on generating functions for the Stieltjes– Wigert polynomials, expressions for the entire functions A, B, C,andD from the Nevanlinna (n) parametrization are obtained. We describe T (µ) for n ∈ N when µ = µ0 is a particular N-extremal solution and explain in detail what happens when n →∞. 2002 Elsevier Science (USA). All rights reserved. Keywords: Indeterminate moment problems; Stieltjes–Wigert polynomials; Nevanlinna parametrization 1. Introduction T.J. Stieltjes was the first to give examples of indeterminate moment problems. In [18] he pointed out that if f is an odd function satisfying f(u+ 1/2) =±f(u),then ∞ − unu loguf(log u) du = 0 0 E-mail address: [email protected]. 0022-247X/02/$ – see front matter 2002 Elsevier Science (USA). All rights reserved. PII:S0022-247X(02)00534-6 J.S. Christiansen / J. Math. Anal. Appl. 277 (2003) 218–245 219 for all n ∈ Z. In particular, ∞ − unu logu sin(2π log u) du = 0,n∈ Z, 0 so independent of λ we have ∞ ∞ 1 − 1 − + 2 √ unu logu 1 + λ sin(2π logu) du= √ unu logu du= e(n 1) /4.
    [Show full text]
  • Moment Problems and Semide Nite Optimization
    Moment Problems and Semidenite Optimization y z Dimitris Bertsimas Ioana Pop escu Jay Sethuraman April Abstract Problems involving moments of random variables arise naturally in many areas of mathematics economics and op erations research Howdowe obtain optimal b ounds on the probability that a random variable b elongs in a set given some of its moments Howdowe price nancial derivatives without assuming any mo del for the underlying price dynamics given only moments of the price of the underlying asset Howdoweob tain stronger relaxations for sto chastic optimization problems exploiting the knowledge that the decision variables are moments of random variables Can we generate near optimal solutions for a discrete optimization problem from a semidenite relaxation by interpreting an optimal solution of the relaxation as a covariance matrix In this pap er we demonstrate that convex and in particular semidenite optimization metho ds lead to interesting and often unexp ected answers to these questions Dimitris Bertsimas Sloan Scho ol of Management and Op erations ResearchCenter Rm E MIT Cambridge MA db ertsimmitedu Research supp orted by NSF grant DMI and the Singap oreMIT alliance y Ioana Pop escu Decision Sciences INSEAD Fontainebleau Cedex FRANCE ioanap op escuin seadfr z Jay Sethuraman Columbia University DepartmentofIEORNewYork NY jayccolumbiaedu Intro duction Problems involving moments of random variables arise naturally in many areas of mathe matics economics and op erations research Let us give some examples that motivate the
    [Show full text]
  • The Classical Moment Problem
    The classical moment problem Sasha Sodin∗ March 5, 2019 These lecture notes were prepared for a 10-hour introductory mini-course at LTCC. A cap was imposed at 50 pages; even thus, I did not have time to cover the sections indicated by an asterisk. If you find mistakes or misprints, please let me know. Thank you very much! The classical moment problem studies the map S taking a (positive Borel) measure µ on R to its moment sequence (sk)k≥0, Z k sk[µ] = λ dµ(λ) . The map is defined on the set of measures with finite moments: −∞ −k µ(R \ [−R, R]) = O(R ) , i.e. ∀k µ(R \ [−R, R]) = O(R ) . The two basic questions are 1. existence: characterise the image of S, i.e. for which sequences (sk)k≥0 of real numbers can one find µ such that sk[µ] = sk for k = 0, 1, 2, ··· ? 2. uniqueness, or determinacy: which sequences in the image have a unique pre- image, i.e. which measures are characterised by their moments? In the case of non-uniqueness, one may wish to describe the set of all solutions. The classical moment problem originated in the 1880-s, and reached a definitive state by the end of the 1930-s. One of the original sources of motivation came from probability theory, where it is important to have verifiable sufficient conditions for determinacy. Determinacy is also closely related to several problems in classical analysis, particularly, to the study of the map taking a (germ of a) smooth function f to the sequence of (k) its Taylor coefficients (f (0))k≥0.
    [Show full text]
  • The Truncated Moment Problem
    The Truncated Moment Problem Philipp J. di Dio To my parents They gave me the freedom to worry only about mathematics. The Truncated Moment Problem Von der Fakult¨atf¨urMathematik und Informatik der Universit¨atLeipzig angenommene DISSERTATION zur Erlangung des akademischen Grades DOCTORRERUMNATURALIUM (Dr. rer. nat.) im Fachgebiet Mathematik vorgelegt von Dipl.-Math. B.Sc. Philipp J. di Dio geboren am 25.02.1986 in Lutherstadt Wittenberg. Einreichungsdatum: 10. Januar 2018. Die Annahme der Dissertation wurde empfohlen von: 1. Prof. Dr. Konrad Schm¨udgen,Universit¨atLeipzig 2. Prof. Dr. Claus Scheiderer, Universit¨atKonstanz Die Verleihung des akademischen Grades erfolgt mit Bestehen der Verteidigung am 28. Mai 2018 mit dem Gesamtpr¨adikat magna cum laude. Contents 1 Introduction 1 2 Preliminaries 5 2.1 Basic Definitions for the Moment Problem and Carath´eodory Numbers . .5 2.2 Three Theorems of Hans Richter . .9 2.3 Apolar Scalar Product and Real Waring Rank . 10 3 Carath´eodory Numbers CA and CA;± 11 3.1 The Moment Cone SA and its Connection to CA ........................ 11 3.2 Continuous Functions . 15 3.3 Differentiable Functions . 18 3.4 Upper Bound Estimates on CA;± and the Real Waring Rank . 27 3.5 One-Dimensional Monomial Case with Gaps . 28 3.6 Multidimensional Monomial Case . 38 4 Set of Atoms, Determinacy, and Core Variety 47 4.1 Set W(s) of Atoms . 47 4.2 Determinacy of Moment Sequences . 54 4.3 Core Variety . 57 5 Weyl-Circles 59 5.1 Preliminaries . 59 5.2 Main Tool . 61 5.3 Hamburger Moment Problem . 62 5.4 Stieltjes Moment Problem on [a; 1).............................
    [Show full text]