On Almost Sure Rates of Convergence for Sample Average Approximations
Dirk Banholzer1, Jörg Fliege1, and Ralf Werner2

1 Department of Mathematical Sciences, University of Southampton, Southampton, SO17 1BJ, UK, [email protected]
2 Institut für Mathematik, Universität Augsburg, 86159 Augsburg, Germany

Abstract. In this paper, we study the rates at which strongly consistent estimators in the sample average approximation (SAA) approach converge to their deterministic counterparts. To quantify the rates at which convergence occurs in the almost sure sense, we consider the law of the iterated logarithm in a Banach space setting and first establish, under relatively mild assumptions, convergence rates for the approximating objective functions. These rates can then be transferred to the estimators of optimal values and solutions of the approximated problem. Based on these results, we further investigate the asymptotic bias of the SAA approach. In particular, we show that under the same assumptions as before, the SAA estimator of the optimal value converges in mean to its deterministic counterpart, at a rate which essentially coincides with the one in the almost sure sense. Further, optimal solutions also converge in mean; for convex problems with the same rates, but with an as yet unknown rate for general problems. Finally, we address the notion of convergence in probability and conclude that in this case (weak) rates for the estimators can also be derived, without imposing strong conditions on exponential moments, as is typically done to obtain exponential rates. Results on convergence in probability then allow us to build universal confidence sets for the optimal value and optimal solutions under mild assumptions.
Key words: Stochastic programming, sample average approximation, almost sure rates of convergence, law of the iterated logarithm, L1-convergence

1 Introduction

Let $(\Omega, \mathcal{F}, \mathbb{P})$ be a complete probability space on which we consider the stochastic programming problem

$$\min_{x \in X} \; f(x) := \mathbb{E}_{\mathbb{P}}[h(x, \xi)], \tag{1}$$

where $X \subset \mathbb{R}^n$ denotes a nonempty compact set, $\xi$ is a random vector whose distribution is supported on a set $\Xi \subset \mathbb{R}^m$, and $h : X \times \Xi \to \mathbb{R}$ is a function depending on some parameter $x \in X$ and the random vector $\xi \in \Xi$. For $f$ to be well-defined, we assume for every $x \in X$ that $h(x, \cdot)$ is measurable with respect to the Borel $\sigma$-algebras $\mathcal{B}(\Xi)$ and $\mathcal{B}(\mathbb{R})$, and that it is $\mathbb{P}$-integrable.

The stochastic problem (1) may arise in various applications from a broad range of areas, such as finance and engineering, where deterministic approaches turn out to be unsuitable for formulating the actual problem. Quite frequently, the problem is encountered as a first-stage problem of a two-stage stochastic program, where $h(x, \xi)$ describes the optimal value of a subordinate second-stage problem, see, e.g., Shapiro et al. (2014). Naturally, problem (1) may also be viewed as a self-contained problem, in which $h$ directly results from modelling a stochastic quantity.

Unfortunately, in many situations, the distribution of the random function $h(\cdot, \xi)$ is not known exactly, so that the expected value in (1) cannot be evaluated readily and therefore needs to be approximated in some way. Using Monte Carlo simulation, a common approach (see, e.g., Homem-de-Mello and Bayraksan (2014) for a recent survey) consists of drawing a sample of i.i.d. random vectors $\xi_1, \ldots, \xi_N$, $N \in \mathbb{N}$, from the same distribution as $\xi$, and considering the sample average approximation (SAA) problem

$$\min_{x \in X} \; \hat{f}_N(x) := \frac{1}{N} \sum_{i=1}^{N} h(x, \xi_i) \tag{2}$$

as an approximation to the original stochastic programming problem (1).
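To make the SAA recipe in (2) concrete, the following Python sketch solves a toy instance by brute force over a discretised compact feasible set. The integrand $h(x, \xi) = (x - \xi)^2$ with $\xi \sim \mathcal{N}(0, 1)$ is our own illustrative choice, not taken from the paper; for it, $x^* = 0$ and $f^* = \mathrm{Var}(\xi) = 1$.

```python
import numpy as np

rng = np.random.default_rng(0)

def h(x, xi):
    # Illustrative integrand: h(x, xi) = (x - xi)^2, so that
    # f(x) = E[h(x, xi)] = x^2 + 1 is minimised at x* = 0 with f* = 1.
    return (x - xi) ** 2

def saa_estimate(N, grid):
    # Draw an i.i.d. sample xi_1, ..., xi_N and minimise the sample
    # average f_N(x) = (1/N) sum_i h(x, xi_i) over the finite grid X.
    xi = rng.standard_normal(N)
    f_N = np.array([h(x, xi).mean() for x in grid])
    i = f_N.argmin()
    return grid[i], f_N[i]   # SAA estimators (x_N*, f_N*)

X = np.linspace(-2.0, 2.0, 401)   # discretised compact feasible set
for N in (10**2, 10**4, 10**6):
    x_hat, f_hat = saa_estimate(N, X)
    print(N, x_hat, f_hat)        # both approach x* = 0 and f* = 1
```

As $N$ grows, the printed estimators drift towards the true optimum, which is exactly the consistency behaviour whose rate the paper quantifies.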
Since the SAA problem (2) depends on the set of random vectors $\xi_1, \ldots, \xi_N$, it essentially constitutes a random problem. Its optimal value $\hat{f}^*_N$ is thus an estimator of the optimal value $f^*$ of the original problem (1), and an optimal solution $\hat{x}^*_N$ is an estimator of an optimal solution $x^*$ of the original problem (1). However, for a particular realisation of the random sample, the approximating problem (2) represents a deterministic problem instance, which can then be solved by adequate optimisation algorithms. For this purpose, one usually assumes that the set $X$ is described by (deterministic) equality and inequality constraints.

An appealing feature of the SAA approach is its sound convergence properties, which have been discussed in a number of publications regarding different aspects. Under some relatively mild regularity assumptions on $h$, the strong consistency of the sequences of optimal SAA estimators was shown in various publications, see, e.g., Domowitz and White (1982) and Bates and White (1985) for an approach via uniform convergence, or Dupačová and Wets (1988) and Robinson (1996) for a general approach based on the concept of epi-convergence. Given consistency, it is meaningful to analyse the rate of convergence at which the SAA estimators approach their original counterparts as $N$ tends to infinity. In this regard, Shapiro (1989, 1991, 1993) and King and Rockafellar (1993), for instance, provided necessary and sufficient conditions for the asymptotic normality of the estimators, from which it immediately follows that $\{\hat{f}^*_N\}$ and $\{\hat{x}^*_N\}$ converge in distribution to their deterministic counterparts at a rate of $1/\sqrt{N}$.
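For orientation, in its simplest form (unique optimal solution $x^*$ and suitable regularity; see Shapiro (1991) for the precise conditions, which we do not restate here), the asymptotic normality of the optimal value estimator takes the schematic shape

```latex
\sqrt{N}\,\bigl(\hat{f}^{\,*}_N - f^*\bigr)
  \;\xrightarrow{\;d\;}\;
  \mathcal{N}\bigl(0, \sigma^2(x^*)\bigr),
\qquad
\sigma^2(x^*) = \operatorname{Var}\bigl(h(x^*, \xi)\bigr),
```

from which the $1/\sqrt{N}$ rate of convergence in distribution is immediate.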
Whereas the findings of the former author are essentially based on the central limit theorem in Banach spaces, to which the delta method with a first and second order expansion of the minimum value function is applied, the latter use a generalised implicit function theorem to achieve these results. Rates of convergence for stochastic optimisation problems of the type above were also studied in different settings by Kaniovski et al. (1995), Dai et al. (2000), Shapiro and Homem-de-Mello (2000) and Homem-de-Mello (2008). Specifically, they use large deviation theory to show that, under the rather strong but unavoidable assumption of an existing moment generating function with a finite value in a neighbourhood of zero, the probability of deviation of the SAA optimal solution $\hat{x}^*_N$ from the true solution $x^*$ converges exponentially fast to zero as $N \to \infty$. This, in turn, allows one to establish rates of convergence in probability and to derive conservative estimates for the sample size required to solve the original problem to a given accuracy with overwhelming probability, see, e.g., Shapiro (2003) and Shapiro et al. (2014), Section 5.3, for further details.

In view of these findings, it has to be noted that in the SAA context all rates of convergence established so far are for convergence in probability or convergence in distribution. To the best of our knowledge, rates of convergence that hold almost surely, and thus complement the strong consistency of optimal estimators with a corresponding rate, have not yet been considered in the framework of SAA, with very few exceptions using particular assumptions. Further, not many results are available on convergence in mean, nor have any convergence rates been established in this respect. This is an important issue, as convergence in mean is the basis for deriving reasonable statements on the average bias of estimators.
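Schematically, and in our own shorthand rendering rather than in the precise form of the cited works, the exponential rate obtained from large deviation theory reads

```latex
\mathbb{P}\bigl(\|\hat{x}^{\,*}_N - x^*\| \ge \varepsilon\bigr)
  \;\le\; C(\varepsilon)\, e^{-\beta(\varepsilon)\,N}
```

for constants $C(\varepsilon), \beta(\varepsilon) > 0$, valid under the aforementioned finiteness of the moment generating function in a neighbourhood of zero.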
The most notable previous work here is the paper by Homem-de-Mello (2003), where almost sure convergence rates for the objective function have been investigated, albeit in the slightly different setup of a variable SAA (VSAA), in the special case of a finite feasible set $X$. For this purpose, Homem-de-Mello (2003) applies a (pointwise) law of the iterated logarithm to derive the usual almost sure convergence rate for the objective function, due to the VSAA setup in the more general context of random triangular arrays. As Homem-de-Mello (2003) focuses on a finite (i.e. discrete) feasible set $X$, these results can be straightforwardly translated into rates for optimal values. However, it is not possible to directly generalise the use of the pointwise law of the iterated logarithm (LIL) to a general compact feasible set as considered here. For this reason, we resort to a uniform, i.e. Banach space, version of the LIL instead.

We aim at closing this gap in the existing literature on the SAA approach with the present paper, providing almost sure rates of convergence for the approximating objective functions $\{\hat{f}_N\}$ as well as for the optimal estimators $\{\hat{f}^*_N\}$ and $\{\hat{x}^*_N\}$. We further provide results on convergence in mean, with corresponding rates where possible. As is to be expected, our results can be derived by means of the law of the iterated logarithm in Banach spaces, which is similar to the technique that has already been applied, in the form of the functional central limit theorem, to obtain asymptotic distributions of optimal values and solutions, see, e.g., Shapiro (1991). The main difference between the law of the iterated logarithm and the central limit theorem is that the LIL provides almost sure convergence results (in contrast to convergence results in distribution), under quite similar assumptions. Further, as one advantage, we also directly obtain results on convergence in mean.
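For reference, the classical pointwise LIL underlying this discussion states that, for fixed $x$ with $\sigma^2(x) = \operatorname{Var}(h(x, \xi)) < \infty$,

```latex
\limsup_{N \to \infty}
  \frac{\sqrt{N}\,\bigl|\hat{f}_N(x) - f(x)\bigr|}{\sqrt{2 \log\log N}}
  \;=\; \sigma(x)
  \quad \text{a.s.},
```

so that $\hat{f}_N(x) - f(x) = O\bigl(\sqrt{\log\log N / N}\bigr)$ almost surely at each fixed point; the Banach space version employed in this paper makes such a statement uniform over the compact set $X$.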
While the estimator for the optimal value shares the same asymptotic rate of convergence, no rates could be obtained for the convergence in mean of the optimal solutions in a general setup. For convex problems, however, the same convergence rate again applies. We will show that these convergence rates in mean not only give us a quantification of the asymptotic bias, but also provide the basis for (weak) rates of convergence in probability via Markov's inequality.
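The step from rates in mean to (weak) rates in probability is elementary: Markov's inequality gives, for any $\varepsilon > 0$,

```latex
\mathbb{P}\bigl(|\hat{f}^{\,*}_N - f^*| \ge \varepsilon\bigr)
  \;\le\; \frac{\mathbb{E}\,|\hat{f}^{\,*}_N - f^*|}{\varepsilon},
```

so that any rate at which $\mathbb{E}\,|\hat{f}^{\,*}_N - f^*|$ vanishes transfers directly to a rate of convergence in probability, without any assumptions on exponential moments.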