Robust Optimization for Decision-making under Endogenous Uncertainty
Nikolaos H. Lappas, Chrysanthos E. Gounaris*
Department of Chemical Engineering, Carnegie Mellon University, Pittsburgh, PA, 15213
Abstract
This paper contemplates the use of robust optimization as a framework for addressing problems that involve endogenous uncertainty, i.e., uncertainty that is affected by the decision maker’s strategy. To that end, we extend generic polyhedral uncertainty sets typically considered in robust optimization into sets that depend on the actual decisions. We present the derivation of robust counterpart models in this setting, and we discuss relevant algorithmic considerations for solving these models to guaranteed optimality. Besides capturing the functional changes in parameter correlations that may be induced by given decisions, we show how the use of our decision-dependent uncertainty sets also allows us to eradicate conservatism effects from parameters that become irrelevant in view of the optimal decisions. We quantify these benefits via a number of case studies, demonstrating the versatility of our proposed framework across various applications.
Keywords: Robust Optimization, Endogenous Uncertainty, Decision-dependent Uncertainty Sets
1 Introduction
Uncertainty is inherent in virtually every system we wish to optimize. Parameters affected by various external forces such as market prices and demand, unexpected disruptive events such as equipment malfunctions and natural disasters, or simply incomplete information about the system under study may render solutions of deterministic optimization models suboptimal or even infeasible when parameter realizations deviate from their nominal values. To that end, multiple approaches have been proposed so as to account for uncertainty during the decision-making process.
* Author to whom all correspondence should be addressed ([email protected]).
The various alternative methodologies that have been developed serve different purposes, and selecting which one to adopt should be based on careful examination of the characteristics of the application in question. For example, when the number of uncertain parameters is relatively small, multi-parametric programming [1] can map the parameter space in order to offer closed-form solutions of optimization problems in terms of the parameters and to provide robustness estimates of any given solution. When detailed probabilistic information about the system parameters is available, often in the form of discrete scenario trees, stochastic programming [2] can be utilized to optimize system performance in expectation. Bounds on the variability of such performance can then be applied as explicit constraints on various statistically meaningful metrics [3]. In stochastic programming, the assumption of full recourse is commonly utilized and is often implemented by penalizing infeasibilities in the objective function. It should be mentioned, however, that this practice would not be suitable in applications where constraint violations cannot be tolerated (e.g., due to system safety concerns), are not meaningful (e.g., equipment physical limitations), or cannot be fairly “monetized.”
On the other hand, robust optimization (RO) [4–6] offers an attractive option for applications where distributional information about the uncertainty is limited and/or where solution feasibility is top priority. For such settings, RO seeks solutions that remain feasible for any possible uncertainty realization from within a postulated uncertainty set, which captures applicable correlations among uncertain parameters. Although one can derive such sets based on probabilistic information [7, 8], precise knowledge of probability distributions is not typically required to construct such sets.
Multiple types of uncertainty sets (e.g., polyhedral, ellipsoidal, cardinality-constrained budgets) can be used in this context, exploiting correlations among uncertain parameters as a mechanism to control the trade-off between robustness and performance.
A common characteristic across traditional uncertainty sets utilized in RO is their constant nature; that is, the range of parameter realizations they admit does not depend on the values we choose for the decision variables. This fact leads to an important limitation, namely that constant sets do not suffice to model settings where one’s decisions directly affect the underlying probability distribution from which parameter realizations draw (e.g., entering a market sooner provides access to a larger demand). Furthermore, a subtler manifestation of this limitation arises in cases where one’s decisions render a model parameter that is referenced in the uncertainty set physically meaningless (e.g., the yield of a process step that was not selected in the optimal flowsheet). Not only would it be unrealistic to postulate a correlation involving such a parameter, but doing so might also lead to overly conservative solutions as a result of an effective relaxation of the uncertainty set’s projection to the space of parameters that remain relevant in view of the optimal decisions.
In order to overcome the above challenges, we recently proposed the use of decision-dependent uncertainty sets (DDUS) in the context of robust optimization models for process scheduling applications [9]. In that work, the effects of task processing times were removed from the uncertainty set whenever the decision was taken to not execute such tasks, offering a more realistic representation of the uncertainty encountered in the actual application, as well as leading to less conservative solutions compared to robust solutions reported in prior literature. Expanding upon a previously published shorter version [10], the contributions of the current paper are four-fold:
• We classify decision-making contexts involving uncertain parameters of endogenous nature for which the use of DDUS is warranted.
• We propose DDUS of generic polyhedral form that feature decision-dependency in both left- and right-hand sides, and we discuss ways to instantiate such sets.
• We derive the robust counterpart model in the context of our proposed DDUS, and we discuss applicable algorithmic approaches to solve the former to guaranteed optimality.
• We assess the performance benefits as well as the computational burden associated with using our proposed DDUS over a series of case studies that are addressed with RO for the first time in the open literature.
The remainder of this paper is structured as follows. We begin by providing a brief overview of the static RO methodology in Section 2. In Section 3, we discuss the possibility for uncertain parameters to exhibit an endogenous nature, and we introduce our novel DDUS that allow us to model this setting. We then derive the robust counterpart under DDUS in Section 4, while in Section 5, we present a comprehensive computational study involving a number of case studies stemming from a diverse set of application contexts. We conclude the paper with some final remarks in Section 6.
Notation. Lowercase letters with regular typeface denote scalars (e.g., $a$), lowercase letters with bold typeface denote column vectors (e.g., $\mathbf{a}$), while uppercase letters in bold typeface denote matrices (e.g., $\mathbf{A}$). We use $|\mathbf{a}|$ to indicate the size of vector $\mathbf{a}$, and we use $a_i$ to denote its $i$-th element. We also use $\mathbf{e}$ to denote the vector of ones, $\mathbf{0}$ to denote the vector of zeros, $\mathbf{E}$ to denote the matrix of all ones, and $\mathbf{O}$ to denote the matrix of all zeros. Sizes for these are to be inferred from the expressions in which they participate. The $\circ$ operator corresponds to the element-wise product of two equally-sized vectors or matrices, while for a quantity $a$, we use $\underline{a}$ and $\overline{a}$ to represent this quantity’s applicable lower and upper bound, respectively. Finally, an equation involving vector or matrix terms (e.g., $\mathbf{a} \le \mathbf{b}$) is to be viewed as a set of equations referencing the terms element-wise (e.g., $a_i \le b_i \ \forall\, i$), while an implication or equivalence with a set of equations in its sides (e.g., $\{\mathbf{a} = \mathbf{b}\} \Leftrightarrow \{\mathbf{c} \le \mathbf{d}\}$) is also to be viewed element-wise (e.g., $\{a_i = b_i\} \Leftrightarrow \{c_i \le d_i\} \ \forall\, i$).
2 Static Robust Optimization
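Let us consider the mixed-integer linear optimization problem (1),¹ which features continuous decision variables x, discrete decision variables w, and inequality constraints indexed over the set M.
\[
\begin{array}{cl}
\min\limits_{\mathbf{x},\mathbf{w}} & x_1 \\
\text{s.t.} & \mathbf{x} \in \mathbb{R}^{|\mathbf{x}|},\ \mathbf{w} \in \{0,1\}^{|\mathbf{w}|} \\
& \mathbf{c}_m^\top \mathbf{x} + \mathbf{q}^\top \mathbf{C}_m \mathbf{x} + \mathbf{r}_m^\top \mathbf{w} + \mathbf{q}^\top \mathbf{R}_m \mathbf{w} \le a_m + \mathbf{q}^\top \mathbf{b}_m \quad \forall\, m \in M \\
& (\mathbf{x},\mathbf{w}) \in F
\end{array}
\tag{1}
\]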
For ease of exposition, the various expressions in the constraints have been partitioned into terms that do and terms that do not involve parameters q, which we shall consider to be uncertain and about to realize from within an uncertainty set $Q \subseteq \mathbb{R}^{|\mathbf{q}|}$. The quantities $\mathbf{c}_m$, $\mathbf{C}_m$, $\mathbf{r}_m$, $\mathbf{R}_m$, $a_m$, and $\mathbf{b}_m$ are therefore constants known with certainty, for all m ∈ M, while the set $F \subseteq \mathbb{R}^{|\mathbf{x}|} \times \{0,1\}^{|\mathbf{w}|}$ is introduced to represent the space that results from intersecting all linear constraints (both equalities and inequalities) that do not reference any of the uncertain parameters q. For clarity, we mention that we consider all constraints in the set M to reference at least one parameter from q; that is, at least one of the elements of $\mathbf{C}_m$, $\mathbf{R}_m$, or $\mathbf{b}_m$ must be non-zero, for each constraint m ∈ M. Furthermore, the objective function has been brought into the set of constraints using a standard epigraph reformulation (where, without loss of generality, the continuous variable $x_1$ is used to denote the epigraph variable). Consequently, any mixed-integer linear problem that does not reference uncertain parameters in equality constraints² can be equivalently transformed to the form (1). We also highlight that this formulation can accommodate uncertain parameters in both left- and right-hand sides as well as in objective function coefficients.
¹ We remark that, although in this paper we focus on problems that can be represented via mixed-integer linear formulations, the RO methodology can be extended to nonlinear models as well.
In this setting, all decisions are to be taken in a single stage, before the actual realizations of the uncertain parameters are observed. The fundamental idea behind single-stage (a.k.a., static)
RO is to guarantee the constraint satisfaction for any uncertain parameter realization from within a suitably chosen uncertainty set, Q, and to then seek the best feasible solution, as assessed against the worst possible such realization. Consequently, the RO formulation can be represented via problem (2).
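\[
\begin{array}{cl}
\min\limits_{\mathbf{x},\mathbf{w}} & x_1 \\
\text{s.t.} & \mathbf{x} \in \mathbb{R}^{|\mathbf{x}|},\ \mathbf{w} \in \{0,1\}^{|\mathbf{w}|} \\
& \mathbf{c}_m^\top \mathbf{x} + \mathbf{q}^\top \mathbf{C}_m \mathbf{x} + \mathbf{r}_m^\top \mathbf{w} + \mathbf{q}^\top \mathbf{R}_m \mathbf{w} \le a_m + \mathbf{q}^\top \mathbf{b}_m \quad \forall\, \mathbf{q} \in Q \ \ \forall\, m \in M \\
& (\mathbf{x},\mathbf{w}) \in F
\end{array}
\tag{2}
\]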
We remark that, under the reasonable assumption of a non-empty, non-singleton uncertainty set, formulation (2) typically involves infinitely many constraints, and in order to solve it, one typically applies standard reformulation techniques and numerical algorithms used in semi-infinite programming.
² This is due to the fact that a trivial or infeasible solution would arise if an equality is to be enforced for more than one distinct realization of the uncertain parameters [11]. In certain models, equalities that reference uncertain parameters can be eliminated by solving some suitably chosen state variable out of the model. If this is not possible, then the reader is referred to the methodology of Adjustable Robust Optimization [5], where the introduction of recourse decision variables can facilitate this case via coefficient matching, as has also been demonstrated in our previous work [9].
2.1 Solution Approaches
There are two main, methodologically distinct approaches for dealing with the semi-infinite nature of the RO model (2). The first one is based on reformulating the problem into an equivalent, but finite-sized monolithic model using duality arguments (e.g., linear, conic, or Fenchel, depending on the type of the uncertainty set). This approach is advantageous inasmuch as it provides an explicit closed-form robust counterpart, which can then be solved via off-the-shelf optimization software without additional programming effort. A comprehensive list of robust counterparts for various uncertainty sets and types of constraints (both linear and non-linear) can be found in Ben-Tal et al. [12], or Gorissen et al. [6].
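As a brief illustration of the duality-based route, consider a single constraint m of model (2) under a polyhedral uncertainty set written as $Q = \{\mathbf{q} : \mathbf{H}\mathbf{q} \le \mathbf{d}\}$. This is only a sketch under stated assumptions: any parameter bounds are taken to be folded into the rows of $\mathbf{H}$ and $\mathbf{d}$, the set is assumed non-empty and bounded so that linear programming strong duality applies, and the multiplier vector $\boldsymbol{\lambda}_m$ is a symbol introduced here purely for illustration.

```latex
% Sketch: robust counterpart of one constraint via LP duality.
% Assumes Q = {q : Hq <= d} is non-empty and bounded; lambda_m is illustrative.
\begin{align*}
& \mathbf{c}_m^\top \mathbf{x} + \mathbf{r}_m^\top \mathbf{w}
  + \max_{\mathbf{q}:\, \mathbf{H}\mathbf{q} \le \mathbf{d}}
    \mathbf{q}^\top \left( \mathbf{C}_m \mathbf{x} + \mathbf{R}_m \mathbf{w} - \mathbf{b}_m \right)
  \le a_m \\
\Longleftrightarrow\quad
& \exists\, \boldsymbol{\lambda}_m \ge \mathbf{0} :\quad
  \mathbf{c}_m^\top \mathbf{x} + \mathbf{r}_m^\top \mathbf{w} + \mathbf{d}^\top \boldsymbol{\lambda}_m \le a_m,
  \qquad
  \mathbf{H}^\top \boldsymbol{\lambda}_m = \mathbf{C}_m \mathbf{x} + \mathbf{R}_m \mathbf{w} - \mathbf{b}_m .
\end{align*}
```

The inner maximization is replaced by its dual minimization, and the minimization can then be dropped because it suffices for some feasible multiplier to satisfy the inequality. For a constant set, the resulting constraints are linear in $(\mathbf{x}, \mathbf{w}, \boldsymbol{\lambda}_m)$.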
The alternative approach constitutes an adversarial methodology inspired by the cutting plane algorithm, which was originally proposed by Kelley [13], and was later formalized for RO by Mutapcic and Boyd [14]. The algorithm relies on the sequential identification of violated scenarios, namely uncertain parameter realizations q∗ ∈ Q that cause infeasibility of one or more constraints in view of the current decision (x∗, w∗). These scenarios are subsequently enforced by adding to the master problem suitable deterministic-like constraints that guarantee the feasibility of the offending constraints for the identified scenarios. The algorithm exits successfully when a certificate is obtained that the current master problem solution is no longer violated by any scenario from within the uncertainty set. Conversely, when no feasible solution can satisfy the constraints already added to the master problem, the problem can be declared robust infeasible. The main advantage of this approach lies in the fact that it can accommodate a much wider variety of settings, such as more complicated uncertainty sets (e.g., sets referencing discrete random variables or non-convex sets for which duality gaps arise). On the other hand, this approach can provide certified robust feasible solutions only after successful termination of the iterative algorithm, meaning that a premature exit (due to a time limit, for instance) will not lead to a practically useful outcome. The performance of the aforementioned approaches has been assessed in the literature (see, e.g., the extensive computational work by Bertsimas et al. [15]).
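The adversarial loop described above can be sketched on a toy instance as follows. This is a minimal illustration under stated assumptions, not the implementation used in the cited works: the master problem is solved by brute-force enumeration over a finite list of candidate decisions (a stand-in for a mixed-integer solver), the separation step enumerates the vertices of a small budget set (valid here because the constraint is affine in q), and all function names and data (`solve_master`, `separate`, `cutting_plane`, `loads`) are hypothetical.

```python
from itertools import product


def solve_master(candidates, cost, constraint, scenarios):
    """Brute-force stand-in for the master problem: cheapest candidate
    that satisfies the constraint for every scenario collected so far."""
    feasible = [x for x in candidates
                if all(constraint(x, q) <= 0 for q in scenarios)]
    return min(feasible, key=cost) if feasible else None


def separate(x, constraint, vertices, tol=1e-9):
    """Adversarial step: most violated vertex of the uncertainty set,
    or None when x is feasible for every vertex."""
    worst = max(vertices, key=lambda q: constraint(x, q))
    return worst if constraint(x, worst) > tol else None


def cutting_plane(candidates, cost, constraint, vertices, nominal):
    """Alternate between master and separation until no violation remains."""
    scenarios = [nominal]
    while True:
        x = solve_master(candidates, cost, constraint, scenarios)
        if x is None:
            return None, scenarios      # robust infeasible
        q_star = separate(x, constraint, vertices)
        if q_star is None:
            return x, scenarios         # certified robust feasible
        scenarios.append(q_star)


# Toy data (hypothetical): budget set {q in [0,1]^3 : q1 + q2 + q3 <= 2},
# whose vertices are the 0/1 vectors with at most two ones.
vertices = [q for q in product((0, 1), repeat=3) if sum(q) <= 2]
loads = (3, 2, 1)


def constraint(x, q):
    """Robust constraint in the form g(x, q) <= 0: capacity x covers the load."""
    return sum(l * qi for l, qi in zip(loads, q)) - x


candidates = [0.5 * k for k in range(13)]   # capacities 0.0, 0.5, ..., 6.0
x_opt, used = cutting_plane(candidates, lambda x: x, constraint,
                            vertices, nominal=(0, 0, 0))
print(x_opt, used)
```

Note that the intermediate master solutions are not robust feasible in general, which mirrors the remark above about premature termination.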
2.2 Traditional Uncertainty Sets
As mentioned earlier, one of the most critical elements of applying the RO methodology is the selection of the uncertainty set against which robust feasibility is sought. The shape and size of the set can have a direct impact on the conservatism of the robust optimal solution as well as the tractability of the robust counterpart. Hence, extensive effort in the literature has focused on defining uncertainty sets that manage the trade-off between these two aspects while providing reasonably accurate descriptions of uncertainty in practical applications. Popular uncertainty sets that have been proposed in the literature include the box [16], ellipsoidal [17], cardinality-constrained [18], conic [19], and convex-constrained sets [20]. Furthermore, uncertainty sets describing the possible realizations of discrete uncertain parameters have also been proposed [21], enabling the modeling of scenario-based information.
In this work, we will focus on polyhedral uncertainty sets of the general form shown in (3), which generalize the cardinality-constrained sets. These sets are gaining increasing popularity in the literature due to their simplicity as well as their flexibility to accommodate all types of affine correlations, such as budgets or factor models [22]. Most importantly, polyhedral sets offer considerable numerical advantages by yielding robust formulations that belong to the same model class as the deterministic base model describing the problem of interest; for example, if the deterministic model is of mixed-integer linear form, then so will be the robust counterpart formulated in view of a polyhedral uncertainty set.
\[
Q = \left\{ \mathbf{q} \in \mathbb{R}^{|\mathbf{q}|} :
\begin{array}{l}
\mathbf{H}\mathbf{q} \le \mathbf{d} \\
\underline{\mathbf{q}} \le \mathbf{q} \le \overline{\mathbf{q}}
\end{array}
\right\}
\tag{3}
\]
Here, the vector d has size equal to the number of parameter correlations (other than simple parameter bounds) one wishes to model in this set, storing their right-hand side constants. Therefore, the matrix H, which stores the left-hand side coefficients multiplying the parameters in these correlations, has size |d| × |q|. The vectors $\underline{\mathbf{q}}$ and $\overline{\mathbf{q}}$ represent the applicable box bounds for the admissible realizations of the continuous uncertain parameters. Note that the constant nature of this set stems from the absence of decision variables in its definition. From an application point of view, this means that the uncertainty one seeks insurance against is not affected in any way by the decisions made. As we discuss later, this assumption can be quite conservative in many real-life settings.
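To make the ingredients of the constant polyhedral set (3) concrete, the following minimal sketch checks whether a given realization belongs to such a set. The function name `in_polyhedral_set`, the tolerance, and the numerical data (a budget of two over three parameters in [0, 1]) are assumptions chosen for illustration only.

```python
def in_polyhedral_set(q, H, d, q_lo, q_hi, tol=1e-9):
    """Return True if q satisfies H q <= d and q_lo <= q <= q_hi (element-wise)."""
    rows_ok = all(
        sum(h * qi for h, qi in zip(row, q)) <= rhs + tol
        for row, rhs in zip(H, d)
    )
    box_ok = all(lo - tol <= qi <= hi + tol
                 for qi, lo, hi in zip(q, q_lo, q_hi))
    return rows_ok and box_ok


# Hypothetical data: one correlation (|d| = 1) over three parameters (|q| = 3),
# so H has size 1 x 3. The row encodes the budget q1 + q2 + q3 <= 2.
H = [[1.0, 1.0, 1.0]]
d = [2.0]
q_lo = [0.0, 0.0, 0.0]
q_hi = [1.0, 1.0, 1.0]

print(in_polyhedral_set([1.0, 1.0, 0.0], H, d, q_lo, q_hi))  # inside the budget
print(in_polyhedral_set([1.0, 1.0, 1.0], H, d, q_lo, q_hi))  # violates the budget
```

Nothing in `H`, `d`, or the bounds depends on the decisions, which is precisely the constant nature of the set noted above.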
3 The Case of Endogenous Uncertainty
Uncertain parameters can be classified as exogenous, when they are not affected by one’s decisions (e.g., weather conditions), or as endogenous, when the decision maker can manipulate their realization or ability to be observed. Most optimization problems studied in the RO literature consider only the former type of uncertainty, which can be modeled using traditional, constant (i.e., not decision-dependent) uncertainty sets. In order to properly motivate the use of DDUS, we provide in the following some background on endogenous uncertainty.
Endogeneity arises for various reasons. In certain cases, a decision may render a parameter referenced in a model physically meaningless (e.g., the price of a product under development that did not hit the market). In other cases, a decision may affect the timing of a parameter realization (e.g., the time at which we observe the true magnitude of a well production rate depends on when we decide to drill the well), essentially dictating whether the parameter reveals itself before the second, third, or some other later stage. Finally, a decision may affect an uncertain parameter’s stochastic support or the distribution from which this parameter draws (e.g., the technology in which we choose to invest will affect the range of possible yields for the process). In the stochastic programming literature, endogenous uncertainty where the decisions can affect the underlying distribution is classified as Type-I endogenous uncertainty, while in the other cases it is classified as Type-II endogenous uncertainty [23, 24].
There are many important application settings where uncertainty is subject to the optimizer’s decisions (see, e.g., Jonsbråten et al. [25], Goel and Grossmann [23]). In a monopolistic market, for instance, a decision to increase the production will have a negative effect on product prices. Other examples arise in oilfield development planning [26], network capacity expansion [23], network interdiction problems [27], and the planning of clinical trials [28], to name but a few. These problems have traditionally been studied as two-stage or multi-stage stochastic programming problems, where the Cartesian product between the set of decision variables affecting the realization of uncertainty and the original set of discrete scenarios is considered, creating essentially a copy of the scenario tree for each possible combination of decisions. As a result, the computational effort required to address the above problems can be prohibitive, especially in the case of multi-stage settings where the enforcement of non-anticipativity constraints dominates the model size. More recently, significant efforts have been made towards better modeling and solution approaches that are based on reducing the size of the scenario tree and efficiently enforcing the non-anticipativity restrictions [24]. Alternative approaches that model these problems as Markov decision processes have also been proposed [29].
At the same time, the above application settings constitute examples where the use of traditional
(constant) uncertainty sets in RO can prove limited. Since constant uncertainty set coefficients do not allow functional changes in the applicable correlations they model (e.g., they cannot adapt to changes in the underlying distributions), one must make conservative approximations to be inclusive of all possibilities, admitting unrealistic worst-case realizations and leading to suboptimal solutions. To that end, we propose in this paper a more promising strategy that involves directly encoding in the uncertainty set itself the dependence of the parameters on the actual decisions.
3.1 Decision-dependent Uncertainty Sets
In order to address problems with endogenous uncertainty through RO in a generic enough fashion, we extend the constant polyhedral uncertainty set of the form (3) into a set that depends on the decision variables, as in (4).
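\[
Q(\mathbf{x},\mathbf{w}) = \left\{ \mathbf{q} \in \mathbb{R}^{|\mathbf{q}|} :
\begin{array}{l}
\mathbf{H}\mathbf{q} + \mathbf{H}' \left( \mathbf{v}(\mathbf{x},\mathbf{w}) \circ \mathbf{q} \right) \le \mathbf{G}\mathbf{w} + \mathbf{G}' \mathbf{v}(\mathbf{x},\mathbf{w}) + \mathbf{d}, \\
\left\{ \mathbf{H}^\top \mathbf{e} \ne \mathbf{0} \ \vee\ \mathbf{v}(\mathbf{x},\mathbf{w}) = \mathbf{e} \right\} \Rightarrow \left\{ \underline{\mathbf{q}} \le \mathbf{q} \le \overline{\mathbf{q}} \right\}
\end{array}
\right\}
\tag{4}
\]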
Here, $\mathbf{v}(\mathbf{x},\mathbf{w}) : \mathbb{R}^{|\mathbf{x}|} \times \{0,1\}^{|\mathbf{w}|} \mapsto \{0,1\}^{|\mathbf{q}|}$ is a vector of problem-specific, binary-valued functions of our decision variables that indicate the materialization of each uncertain parameter in the vector q (see below for how to select these functions), the matrix $\mathbf{H}'$ contributes left-hand side terms that must be removed when parameters do not materialize, while the matrices $\mathbf{G}$ and $\mathbf{G}'$ contribute respectively direct and materialization-related decision dependency to the right-hand sides. The quantities $\underline{\mathbf{q}}$ and $\overline{\mathbf{q}}$ retain their definitions as bounds for the admissible realizations of parameters q, but care should now be taken so that these bounds are wide enough to remain valid under all possible feasible decisions (x, w) for which the corresponding elements of v(x, w) attain the value of one. Conversely, if a parameter that does not participate in a constant left-hand side term also happens to not materialize (i.e., the corresponding materialization indicator attains the value of zero), leading to a situation where this parameter vanishes from the set in light of the specific decisions made, then there is no need to account for the parameter’s bounds at all. This is reflected by referencing the bound constraints inside the apodosis of the implication statement in the DDUS (4). We remark that our definition of a DDUS constitutes a generalization of the specially structured cardinality-constrained sets presented in the works by Poss [30] and Nohadani and Sharma [31] that only involve direct right-hand side decision dependency, covering only a subset of the modeling capabilities we introduce in this work. More specifically, our description encompasses their case by setting $\mathbf{v}(\mathbf{x},\mathbf{w}) = \mathbf{e}$ and $\mathbf{G}' = \mathbf{O}$. Finally, it is worth mentioning that DDUS of the form (4) retain the properties of their constant precursors (3) with regard to the model class of the resulting robust counterpart formulation. More specifically, for a mixed-integer linear deterministic model as the basis, the robust counterpart under DDUS will also be mixed-integer linear programming representable.
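The definition of the DDUS (4) can be illustrated with a small membership check, evaluated for fixed decisions w and an already evaluated indicator vector v = v(x, w). This is a sketch under stated assumptions: the function name `in_ddus`, the argument names `H_prime` and `G_prime` (standing for the primed matrices), the tolerance, and both numerical instances are hypothetical. The bound test follows the implication in (4) literally, i.e., bounds on the i-th parameter are enforced when the i-th element of the column-sum vector of H is non-zero or when the parameter materializes.

```python
def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))


def in_ddus(q, v, w, H, H_prime, G, G_prime, d, q_lo, q_hi, tol=1e-9):
    """Membership test for set (4), given decisions w and indicators v = v(x, w)."""
    vq = [vi * qi for vi, qi in zip(v, q)]           # element-wise product v o q
    for k in range(len(d)):
        lhs = dot(H[k], q) + dot(H_prime[k], vq)
        rhs = dot(G[k], w) + dot(G_prime[k], v) + d[k]
        if lhs > rhs + tol:
            return False
    for i, qi in enumerate(q):
        col_sum = sum(row[i] for row in H)           # i-th element of H^T e
        if col_sum != 0 or v[i] == 1:                # antecedent of the implication
            if not (q_lo[i] - tol <= qi <= q_hi[i] + tol):
                return False
    return True


# Instance A (hypothetical): direct right-hand side dependency,
# q1 + q2 + q3 <= w1 + w2 + w3, with all parameters materializing.
A = dict(H=[[1, 1, 1]], H_prime=[[0, 0, 0]], G=[[1, 1, 1]], G_prime=[[0, 0, 0]],
         d=[0], q_lo=[0, 0, 0], q_hi=[1, 1, 1])
print(in_ddus([1, 1, 0], [1, 1, 1], [1, 0, 0], **A))   # budget of 1 exceeded
print(in_ddus([1, 1, 0], [1, 1, 1], [1, 1, 0], **A))   # budget of 2 suffices

# Instance B (hypothetical): materialization-dependent left-hand side,
# v1*q1 + v2*q2 <= 1.5; a non-materialized q2 drops out, bounds included.
B = dict(H=[[0, 0]], H_prime=[[1, 1]], G=[[0]], G_prime=[[0, 0]],
         d=[1.5], q_lo=[0, 0], q_hi=[1, 1])
print(in_ddus([1, 5], [1, 0], [0], **B))   # q2 is irrelevant when v2 = 0
print(in_ddus([1, 5], [1, 1], [0], **B))   # q2 counts when v2 = 1
```

Setting `v` to all ones and `G_prime` to zeros, as in Instance A, corresponds to the special case of direct right-hand side dependency mentioned above.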
Evaluation of Materialization Indicators: An uncertain parameter is said to materialize if and only if our decisions do not cause it to vanish from the model, which would happen if all occurrences of the parameter in the formulation are multiplied by variables (or expressions of variables) for which a value of zero has been chosen. There are many reasons why a parameter may not materialize under certain decisions. Often, parameters serve merely as big-M coefficients in terms that do not activate. Other parameters may be associated with tasks that are controlled by a decision of whether to execute them or not, in which latter case these parameters lose their physical meaning (they become unobservable). For example, the efficiency of a compressor will only materialize if the decision is made to turn on that compressor. If the decision is made to leave the compressor at its “off” state, then the efficiency that the compressor would have otherwise attained becomes irrelevant. Consequently, a formal description of the materialization indicator functions v(x, w) can be obtained by declaring a set of state variables $\mathbf{v} \in \{0,1\}^{|\mathbf{q}|}$ and introducing the following constraints in the RO counterpart model.
\[
\{\mathbf{v} = \mathbf{0}\} \Leftrightarrow \bigwedge_{m \in M} \{\mathbf{C}_m \mathbf{x} + \mathbf{R}_m \mathbf{w} = \mathbf{b}_m\}
\tag{5}
\]
We remark that these constraints often lead to an intuitive solution, usually of the form vi = wk, which relates a specific uncertain parameter qi to a specific binary decision wk that governs the materialization of the former. In such cases when the mapping from variables w to variables v is known to the modeler a priori, the materialization indicators can be directly replaced by the native binary variables (or the binary-valued expressions of those) governing their materialization. This allows for the numerical efficiency of not having to introduce new variables v, alleviating also any possible tractability burden associated with introducing the conditional conjunctive constraints (5) in the final RO model.
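For a fixed decision, the element-wise equivalence (5) can be evaluated directly, as in the following minimal sketch. The function name `materialization_indicators`, the nested-list data layout (one |q| × |x| matrix, one |q| × |w| matrix, and one vector of size |q| per constraint m), the numerical tolerance, and the two-compressor data are all assumptions made for illustration.

```python
def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))


def materialization_indicators(x, w, C, R, b, tol=1e-9):
    """Evaluate v(x, w) per (5): v_i = 0 if and only if the coefficient of q_i,
    namely (C_m x + R_m w - b_m)_i, vanishes in every constraint m."""
    n_q = len(b[0])
    v = []
    for i in range(n_q):
        vanishes = all(
            abs(dot(C[m][i], x) + dot(R[m][i], w) - b[m][i]) <= tol
            for m in range(len(b))
        )
        v.append(0 if vanishes else 1)
    return v


# Hypothetical data: one uncertain constraint, two parameters (e.g., the
# efficiencies of two compressors), each multiplied only by its on/off binary.
C = [[[0.0], [0.0]]]             # C_1: no continuous variable multiplies q
R = [[[1.0, 0.0], [0.0, 1.0]]]   # R_1: q_i is multiplied by w_i
b = [[0.0, 0.0]]                 # b_1: no constant term multiplies q

print(materialization_indicators([3.0], [1, 0], C, R, b))   # only compressor 1 is on
print(materialization_indicators([3.0], [0, 0], C, R, b))   # both compressors are off
```

In this instance the evaluation reduces to the intuitive mapping in which each indicator equals the binary decision governing the corresponding parameter, so the indicators could be replaced by the native binary variables.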
3.2 Illustrative Modeling Capabilities
In general, non-materialized parameters do not affect the performance of our final solution.
Furthermore, non-materialized parameters that lose their physical meaning in light of the optimal decisions should also not be referenced in the uncertainty set, giving rise to our need to generalize the left-hand sides of sets (4) into constant and decision-dependent terms for full modeling flexibility.
We now briefly illustrate a number of modeling conveniences that our proposed DDUS of the form (4) afford us. Note that these are only a handful of possible examples that can be envisioned, given our sets’ quite general form.
1. Our DDUS allow for the introduction of decision-dependent distributional supports. For example, Eq. (6) can model decision-dependent bounds for a parameter’s realization.
\[
q \le \overline{q}|_{\{w=1\}}\, w + \overline{q}|_{\{w=0\}}\, (1 - w)
\tag{6}
\]
2. Our DDUS allow for decision dependency in correlations among uncertain parameters. Among other uses, this enables the modeling of uncertainty that arises from a set of possible discrete scenarios. For example, Eq. (7) can model a decision-dependent cardinality budget, where the maximum number of parameters that can attain their upper bound realization is limited based on (e.g., investment type) decisions. Figure 1 illustrates this concept.
\[
q_1 + q_2 + q_3 \le w_1 + w_2 + w_3
\tag{7}
\]
Figure 1: Cardinality budget DDUS for various decisions w1 + w2 + w3 = n, where n may vary between 0 (top left) and 3 (bottom right). The black dots signify the extremal admissible realizations in each case.
3. Our DDUS allow for eliminating the effect of parameters that did not materialize as a result of our decisions. For example, Eq. (8) removes the contribution of non-materialized parameters by dynamically projecting the set into the lower-dimensional space of only the materialized parameters. Figure 2 illustrates the difference in the shape of the applicable uncertainty set when parameter $q_2$ does and does not materialize, as governed by its corresponding materialization indicator binary variable, $v_2$. Here, the vector $\mathbf{q}^0$ holds the values for the nominal realizations.